Fix bugs in replication lag computation #18602

Open
wants to merge 1 commit into
base: main
from

mch2
Member

@mch2 mch2 commented Jun 24, 2025

Description

Fix replication lag computation
This change fixes a bug in the replication lag computation so that it correctly uses an epoch reference point, via Instant.now() and DateUtils. It also fixes the pruning logic so that the latest synced-to checkpoint is itself removed from tracking; previously we pruned only up to the latest, leaving it tracked. This ensures that when a new checkpoint is eventually received, lag is not incorrectly computed from the synced-to checkpoint.
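
To make the two fixes concrete, here is a minimal, self-contained sketch. It is not the code from this PR: the class `LagTrackerSketch`, its methods, and the `epochNanos` helper are hypothetical names, and the helper only stands in for the epoch conversion the description attributes to DateUtils. The sketch assumes checkpoints are tracked by version in a sorted map, that timestamps are nanoseconds since the epoch, and that lag is measured from the oldest checkpoint still tracked.

```java
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical tracker illustrating the behavior described above; not the PR's classes.
final class LagTrackerSketch {
    // checkpoint version -> creation time in nanoseconds since the epoch
    private final ConcurrentNavigableMap<Long, Long> pending = new ConcurrentSkipListMap<>();

    // Stand-in for an epoch-based conversion: both ends of the lag
    // subtraction must share this reference point.
    static long epochNanos(Instant t) {
        return Math.addExact(Math.multiplyExact(t.getEpochSecond(), 1_000_000_000L), t.getNano());
    }

    void onCheckpointReceived(long version, Instant createdAt) {
        pending.putIfAbsent(version, epochNanos(createdAt));
    }

    // Inclusive prune: the synced-to checkpoint itself is removed, so a later
    // checkpoint does not have its lag measured from the already-synced one.
    void onSyncedTo(long version) {
        pending.headMap(version, true).clear();
    }

    // Lag is the age of the oldest checkpoint not yet synced, or 0 if none is tracked.
    long lagMillis(Instant now) {
        Map.Entry<Long, Long> oldest = pending.firstEntry();
        if (oldest == null) {
            return 0L;
        }
        long lagNanos = epochNanos(now) - oldest.getValue();
        return Math.max(0L, lagNanos / 1_000_000L);
    }
}
```

With an exclusive prune (`headMap(version, false)` in this sketch), the synced-to entry would stay in the map, and the next lag reading would be measured from its creation time rather than from the newly received checkpoint.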

Related Issues

Resolves #18437 (comment)

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Marc Handalian <[email protected]>
@mch2 mch2 requested a review from a team as a code owner June 24, 2025 21:17
@github-actions github-actions bot added the labels bug (Something isn't working) and Indexing:Replication (Issues and PRs related to core replication framework eg segrep) Jun 24, 2025
Contributor

✅ Gradle check result for 7b51ccb: SUCCESS

codecov bot commented Jun 24, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 72.79%. Comparing base (6bf1a6d) to head (7b51ccb).
Report is 2 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #18602      +/-   ##
============================================
+ Coverage     72.75%   72.79%   +0.03%     
- Complexity    68258    68267       +9     
============================================
  Files          5549     5549              
  Lines        313737   313737              
  Branches      45506    45506              
============================================
+ Hits         228250   228373     +123     
+ Misses        66919    66806     -113     
+ Partials      18568    18558      -10     

Labels
bug (Something isn't working), Indexing:Replication (Issues and PRs related to core replication framework eg segrep)
Development

Successfully merging this pull request may close these issues.

[BUG] Segment replication lag metric seems to be incorrect
3 participants