Current path range should be respected when path to hash and path to KV indices are restored #18571

Closed
artemananiev opened this issue Mar 31, 2025 · 0 comments · Fixed by #18592
MerkleDb has a feature to restore path to hash and path to KV indices on load, if the indices are not available on disk. This is done by iterating over all data files, loading all data (hash or KV) records, and updating corresponding index entries.

There are two issues with this feature:

  1. This mode is turned on when the index is empty after it is loaded from disk:
        final boolean needRestorePathToDiskLocationLeafNodes = pathToDiskLocationLeafNodes.size() == 0;

In addition to this check, the first/last leaf paths should be checked, too. If they are -1 (which means the database is empty), there is nothing to restore.

  2. Data files may be old and may contain entries that are outside of the current path range. This leads to an assertion error in AbstractLongList.putimpl(), which is called from loading callbacks:
            leafRecordLoadedCallback = (dataLocation, leafData) -> {
                final VirtualLeafBytes leafBytes = VirtualLeafBytes.parseFrom(leafData);
                pathToDiskLocationLeafNodes.put(leafBytes.path(), dataLocation);
            };

There is no need to call put() if the record path is not within the current first/last leaf path range.
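
Both proposed checks can be sketched as follows. This is a minimal, self-contained illustration and not the actual MerkleDb code: the class `IndexRestoreSketch`, its methods, and the `INVALID_PATH` constant are hypothetical names, a plain `HashMap` stands in for the real index, and treating the database as empty when either leaf path is -1 is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative stand-in, not MerkleDb code: every name here is hypothetical.
final class IndexRestoreSketch {
    // The issue states that -1 means the database is empty.
    static final long INVALID_PATH = -1;

    // A HashMap stands in for the real path to KV index.
    private final Map<Long, Long> pathToDiskLocationLeafNodes = new HashMap<>();
    private final long firstLeafPath;
    private final long lastLeafPath;

    IndexRestoreSketch(final long firstLeafPath, final long lastLeafPath) {
        this.firstLeafPath = firstLeafPath;
        this.lastLeafPath = lastLeafPath;
    }

    // Check 1: restore only if the index is empty AND the database is not.
    // Assumption: either path being -1 is treated as "empty database".
    boolean needRestore() {
        final boolean databaseEmpty =
                firstLeafPath == INVALID_PATH || lastLeafPath == INVALID_PATH;
        return pathToDiskLocationLeafNodes.isEmpty() && !databaseEmpty;
    }

    // Check 2: skip records from old data files whose path falls outside
    // the current [firstLeafPath, lastLeafPath] range.
    // Returns true if the index entry was updated.
    boolean onLeafRecordLoaded(final long path, final long dataLocation) {
        if (path < firstLeafPath || path > lastLeafPath) {
            return false;
        }
        pathToDiskLocationLeafNodes.put(path, dataLocation);
        return true;
    }

    int indexSize() {
        return pathToDiskLocationLeafNodes.size();
    }
}
```

With a leaf range of 3 to 6, a record at path 7 loaded from a stale file would be skipped instead of reaching the index, and a database whose leaf paths are both -1 would not enter restore mode at all.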

Status: ✅ Done