-
Notifications
You must be signed in to change notification settings - Fork 13
Eliminate repeated linear searches in BlockManager #268
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
* For now, pattern detection is very simple: if the previous byte is already stored by a block, | ||
* then we conclude that the byte is requested by a sequential read. In the future we should | ||
* probably extend this to work with some kind of tolerance or radius (i.e., look back more than a | ||
* single byte). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we wanted to extend this class in future. I don't think we should remove it and take the logic to BlockManager.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a good optimisation,
It reduces BlockStore linear search lookups from 3 to 1, improving performance and we can still easily add future enhancements, like read tolerance
Something like this
long tolerance = configuration.getSequentialReadTolerance();
generation = blockStore.getBlock(pos - 1, tolerance)
.map(block -> block.getGeneration() + 1)
.orElse(0L);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rajdchak in its current form, it's introducing a performance bottleneck, which is why I opted to simplify it and move the logic into BlockManager. My thinking is that we can always reintroduce a well-designed, extensible version of the class later if and when we need it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
keeping SequentialPatternDetector around might be helpful. I can easily see how we are going to extend these patterns to cover nearly sequential reads (sequential read with wholes).
generation = 0; | ||
} else { | ||
// Check if it is sequential pattern | ||
generation = blockStore.getBlock(pos - 1).map(block -> block.getGeneration() + 1).orElse(0L); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good, think we're losing out on some test coverage though?
do we still have tests that will verify if a block with pos-1 exists, then it's generation will be increased?
Description of change
Currently, BlockManager performs three linear searches in BlockStore when handling a sync read that continues from a previous block:
i. To check if the read is sequential
ii. Again inside getGeneration to verify sequentiality
iii. To fetch the generation of the previous block
This PR reduces the overhead by collapsing these into a single lookup: it retrieves the previous block once and determines both sequentiality and generation based on its existence.
Relevant issues
N/A
Does this contribution introduce any breaking changes to the existing APIs or behaviors?
No
Does this contribution introduce any new public APIs or behaviors?
No
How was the contribution tested?
Executed unit tests
Does this contribution need a changelog entry?
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and I agree to the terms of the Developer Certificate of Origin (DCO).