core/filtermaps: hashdb safe delete range #31525
Conversation
Force-pushed from 81c5f88 to e41033b
core/rawdb/database.go
Outdated
```go
if stopCallback() {
	return ErrDeleteRangeInterrupted
}
start = append(bytes.Clone(it.Key()), 0)
```
It's a bit strange to append the `0` here. I think it's fine to construct the iterator using `it.Key`:
- if the entry is already deleted in the last round, the position of the new iterator will be shifted automatically
- if the entry is skipped for deletion, it's fine to reprocess it
`append(key, 0)` literally comes right next after `key` in a lexicographic ordering. Sure, it's not a big performance loss to start from `key` either, but why not move to the next possible key here? I feel it's just nicer and makes it more obvious what we are doing. I will add a comment for now, but if there are more downvotes then I will remove it :)
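As an illustration of the ordering argument, the sketch below (with a hypothetical helper name, not the actual geth code) shows that appending a zero byte yields the immediate lexicographic successor of a key:

```go
package main

import (
	"bytes"
	"fmt"
)

// nextKey returns append(key, 0), the smallest byte string that sorts
// strictly after key: no other key can fall between the two.
// Hypothetical helper for illustration only.
func nextKey(key []byte) []byte {
	return append(bytes.Clone(key), 0)
}

func main() {
	k := []byte("abc")
	n := nextKey(k)
	fmt.Println(bytes.Compare(k, n)) // -1: the successor sorts after the key
	fmt.Printf("%q\n", n)            // "abc\x00"
}
```

Starting the next iterator from this successor skips the just-processed key without skipping any other entry.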
core/rawdb/accessors_indexes.go
Outdated
```go
if err := db.DeleteRange(filterMapRowKey(mapRows.First(), false), filterMapRowKey(mapRows.AfterLast(), false)); err != nil {
	log.Crit("Failed to delete range of filter map rows", "err", err)
}
func DeleteFilterMapRows(db ethdb.KeyValueStore, mapRows common.Range[uint64], hashDbSafe bool, stopCallback func() bool) error {
```
Can we do the normal deletion rather than using the range deleter?
I will just copy my response from Discord here:

We could; I did consider it but rejected it in the end. The thing is that deleting everything by key takes a significant amount of time. Also, the rows are stored in a way that in most cases 32 adjacent rows are stored under a single key ("base layer" row data), while a few rows have a larger amount of data related to higher mapping layers, addressed under individual keys. So if we do not iterate and just delete all keys where there can be data, we have to delete 10x-30x more keys than actually exist. Right now the epoch delete takes 1-2 seconds; with blind delete it would take significantly more.

Furthermore, it cannot be done atomically anyway, both because of batch memory requirements and because it cannot block the indexer for very long. The whole process should be easily and quickly interruptible. We could do the "blind delete" approach in multiple batches, remembering where we aborted last time, but then we are at a more complex solution that is also slower.

Also, in the most typical case (pebble db and path-based state) we can use native `DeleteRange`, and I didn't want to lose this advantage, so it's another reason why it's easier to use the range delete approach in the fallback case too.
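To make the interruptibility argument concrete, here is a minimal sketch of an iterate-and-delete pass with a stop callback, using a plain in-memory map in place of the real database (all names are hypothetical, not geth's ethdb API):

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

var errInterrupted = errors.New("delete range interrupted")

// deleteRange removes keys in [start, end) from a map-backed store,
// checking stop() before each deletion so a long-running pass can be
// aborted quickly and resumed later. Sketch only.
func deleteRange(store map[string]bool, start, end string, stop func() bool) error {
	keys := make([]string, 0, len(store))
	for k := range store {
		if k >= start && k < end {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys) // process in key order, as a db iterator would
	for _, k := range keys {
		if stop() {
			return errInterrupted
		}
		delete(store, k)
	}
	return nil
}

func main() {
	store := map[string]bool{"a1": true, "a2": true, "b1": true}
	err := deleteRange(store, "a", "b", func() bool { return false })
	fmt.Println(err, len(store)) // <nil> 1: only "b1" survives
}
```

Because the callback is consulted between deletions rather than after a fixed entry count, a run that mostly skips entries is never forced to restart from scratch.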
core/rawdb/accessors_indexes.go
Outdated
```go
if err := db.DeleteRange(filterMapLastBlockKey(maps.First()), filterMapLastBlockKey(maps.AfterLast())); err != nil {
	log.Crit("Failed to delete range of filter map last block pointers", "err", err)
}
func DeleteFilterMapLastBlocks(db ethdb.KeyValueStore, maps common.Range[uint32], hashDbSafe bool, stopCallback func() bool) error {
```
Can we do the normal deletion rather than using the range deleter?
In the case of the map and block pointers, of course a normal delete loop would be fine too; it actually does not matter much performance-wise, since 99% of the data is the filter rows. But for the rows it does make a lot of sense to use `SafeDeleteRange`, and once we do that for the rows, it also makes sense not to have different logic for the pointers. If native `DeleteRange` can be used then it's faster; if the fallback has to be used then the same interrupt/clean-up-later logic works here too. It's just less extra logic.
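The shared dispatch the reply describes might be sketched as follows, with a toy store and hypothetical names rather than geth's actual interfaces:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// memStore is a toy stand-in for the key-value database.
type memStore struct{ data map[string]bool }

// DeleteRange mimics a native range delete over [start, end).
func (s *memStore) DeleteRange(start, end string) {
	for k := range s.data {
		if k >= start && k < end {
			delete(s.data, k)
		}
	}
}

// safeDeleteRange dispatches as described above: under the path-based
// scheme the native range delete is safe; under the hash-based scheme
// it falls back to iterating, skipping trie nodes and honoring stop().
func safeDeleteRange(s *memStore, start, end string, hashScheme bool, stop func() bool) {
	if !hashScheme {
		s.DeleteRange(start, end) // fast native path (e.g. pebble)
		return
	}
	keys := make([]string, 0, len(s.data))
	for k := range s.data {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		if k < start || k >= end {
			continue
		}
		if stop() {
			return // interrupted; caller cleans up later
		}
		if strings.HasPrefix(k, "trie-") {
			continue // never delete trie nodes under the hash scheme
		}
		delete(s.data, k)
	}
}

func main() {
	s := &memStore{data: map[string]bool{"f-1": true, "trie-1": true, "f-2": true}}
	safeDeleteRange(s, "f", "z", true, func() bool { return false })
	fmt.Println(len(s.data)) // 1: only the trie node is kept
}
```

Both code paths share the same interrupt/clean-up-later contract, which is the point of reusing one helper for rows and pointers alike.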
core/rawdb/accessors_indexes.go
Outdated
```go
if err := db.DeleteRange(filterMapBlockLVKey(blocks.First()), filterMapBlockLVKey(blocks.AfterLast())); err != nil {
	log.Crit("Failed to delete range of block log value pointers", "err", err)
}
func DeleteBlockLvPointers(db ethdb.KeyValueStore, blocks common.Range[uint64], hashDbSafe bool, stopCallback func() bool) error {
```
Can we do the normal deletion rather than using the range deleter?
See above.
core/filtermaps/filtermaps.go
Outdated
```go
	logPrinted bool
	lastLogPrinted = start
)
switch err := deleteFn(f.db, f.hashScheme, func() bool {
```
It's very weird to print logs if there is nothing to delete, e.g.

INFO [03-31|15:07:25.452] Removing old bloom bits database in progress... elapsed=17.430236ms
INFO [03-31|15:07:25.644] Removing old bloom bits database finished elapsed=209.842669ms

These two logs are printed even when the legacy bloom data does not exist.
Fixed
Moved
```go
	f.filterMapCache.Remove(mapIndex)
}
if err := rawdb.DeleteFilterMapLastBlocks(f.db, common.NewRange(firstMap, f.mapsPerEpoch-1), hashScheme, stopCb); err != nil { // keep last entry
	return err
```
Why is the counter `f.mapsPerEpoch-1`? Shouldn't it be `f.mapsPerEpoch`?
The comment says `// keep last entry`.
If this is intended, it might be easier to see if the delete range were assigned to a variable on its own line.
Yes, it is intentional: the last entry of each epoch should always be available. It is even added for past epochs based on the checkpoints.
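The range arithmetic can be checked with a tiny model of `common.NewRange` (the map index values here are illustrative):

```go
package main

import "fmt"

// rng models common.NewRange(first, count): a half-open interval
// [first, first+count). With count = mapsPerEpoch-1, the epoch's last
// map index is excluded, so its last-block entry is never deleted.
type rng struct{ first, count uint32 }

func (r rng) First() uint32     { return r.first }
func (r rng) AfterLast() uint32 { return r.first + r.count }

func main() {
	const mapsPerEpoch = 64
	firstMap := uint32(128) // first map of some epoch (illustrative)
	r := rng{firstMap, mapsPerEpoch - 1}
	fmt.Println(r.First(), r.AfterLast()) // 128 191: maps 128..190 deleted, 191 kept
}
```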
core/filtermaps/filtermaps.go
Outdated
```go
for mapIndex := firstMap; mapIndex < firstMap+f.mapsPerEpoch-1; mapIndex++ {
	f.lastBlockCache.Remove(mapIndex)
}
if err := rawdb.DeleteBlockLvPointers(f.db, common.NewRange(firstBlock, lastBlock-firstBlock), hashScheme, stopCb); err != nil { // keep last entry
```
Shouldn't the counter be `lastBlock-firstBlock+1`?
Again, intentional, keeping the last entry.
This PR adds `rawdb.SafeDeleteRange` and uses it for range deletion in `core/filtermaps`. This includes deleting the old bloombits database, resetting the log index database, and removing index data for unindexed tail epochs (which previously wasn't properly implemented for the fallback case).

`SafeDeleteRange` either calls `ethdb.DeleteRange` if the node uses the new path-based state scheme, or uses an iterator-based fallback method that safely skips trie nodes in the range if the old hash-based state scheme is used. Note that `ethdb.DeleteRange` also has its own iterator-based fallback implementation in `ethdb/leveldb`. If a path-based state scheme is used and the backing db is pebble (as it is on the majority of new nodes), then `rawdb.SafeDeleteRange` uses the fast native range delete.

Also note that `rawdb.SafeDeleteRange` has different semantics from `ethdb.DeleteRange`: it does not automatically return if the operation takes a long time. Instead it receives a `stopCallback` that can interrupt the process if necessary. This is because in safe mode potentially a lot of entries are iterated without being deleted (definitely the case when deleting the old bloombits database, which has a single-byte prefix), and therefore restarting the process every time a fixed number of entries have been iterated would result in a run time quadratic in the number of skipped entries.

When running in safe mode, unindexing an epoch takes about a second, removing bloombits takes around 10s, while resetting a full log index might take a few minutes. If a range delete operation takes a significant amount of time, then log messages are printed. Also, any range delete operation can be interrupted by shutdown (tail unindexing can also be interrupted by head indexing, similarly to how tail indexing works).
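The quadratic-runtime argument above can be illustrated with a small cost model (hypothetical numbers; `batch` is a fixed per-run entry limit, `skipped` is the number of entries iterated but not deleted):

```go
package main

import "fmt"

// restartCost models a strategy that aborts after iterating `batch`
// entries and then restarts from the range start: every restart re-walks
// the already-skipped prefix, so total work grows roughly as skipped^2/batch.
// With a stopCallback instead, a single pass costs only ~skipped entries.
func restartCost(skipped, batch int) int {
	total := 0
	for done := 0; done < skipped; done += batch {
		total += done + batch // re-scan the prefix, then advance one batch
	}
	return total
}

func main() {
	fmt.Println(restartCost(1000, 10))  // 50500 iterations for 1000 skipped entries
	fmt.Println(restartCost(1000, 100)) // 5500: larger batches help, still superlinear
}
```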
If the last unindexed epoch might have "dirty" index data left, then the indexed map range points to the first valid epoch and `cleanedEpochsBefore` points to the previous, potentially dirty one. At startup it is always assumed that the epoch before the first fully indexed one might be dirty. New tail maps are never rendered and no further maps are unindexed before the previous unindexing is properly cleaned up.

---------

Co-authored-by: Gary Rong <[email protected]>
Co-authored-by: Felix Lange <[email protected]>