Skip to content

Commit 396add1

Browse files
[segment replication] Avoid skewed segment replication lag metric (#17831)
* avoid skewed sr lag metric Signed-off-by: guojialiang <[email protected]> * add CHANGELOG Signed-off-by: guojialiang <[email protected]> * add test Signed-off-by: guojialiang <[email protected]> --------- Signed-off-by: guojialiang <[email protected]>
1 parent 7b6108b commit 396add1

File tree

3 files changed

+5
-1
lines changed

3 files changed

+5
-1
lines changed

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -35,6 +35,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
3535
- Unwrap singleton DocValues in global ordinal value source of composite histogram aggregation ([#17740](https://github.com/opensearch-project/OpenSearch/pull/17740))
3636
- Unwrap singleton DocValues in date histogram aggregation. ([#17643](https://github.com/opensearch-project/OpenSearch/pull/17643))
3737
- Introduce 512 byte limit to search and ingest pipeline IDs ([#17786](https://github.com/opensearch-project/OpenSearch/pull/17786))
38+
- Avoid skewed segment replication lag metric ([#17831](https://github.com/opensearch-project/OpenSearch/pull/17831))
3839

3940
### Dependencies
4041
- Bump `com.nimbusds:nimbus-jose-jwt` from 9.41.1 to 10.0.2 ([#17607](https://github.com/opensearch-project/OpenSearch/pull/17607), [#17669](https://github.com/opensearch-project/OpenSearch/pull/17669))

server/src/main/java/org/opensearch/index/seqno/ReplicationTracker.java

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1307,7 +1307,8 @@ public synchronized void startReplicationLagTimers(ReplicationCheckpoint checkpo
13071307
&& replicationGroup.getUnavailableInSyncShards().contains(allocationId) == false
13081308
&& shouldSkipReplicationTimer(e.getKey()) == false
13091309
&& latestReplicationCheckpoint.isAheadOf(cps.visibleReplicationCheckpoint)
1310-
&& cps.checkpointTimers.containsKey(latestReplicationCheckpoint)) {
1310+
&& cps.checkpointTimers.containsKey(latestReplicationCheckpoint)
1311+
&& cps.checkpointTimers.get(latestReplicationCheckpoint).startTime() == 0) {
13111312
cps.checkpointTimers.get(latestReplicationCheckpoint).start();
13121313
}
13131314
});

server/src/test/java/org/opensearch/index/seqno/ReplicationTrackerTests.java

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1870,6 +1870,8 @@ public void testSegmentReplicationCheckpointTracking() {
18701870

18711871
tracker.setLatestReplicationCheckpoint(initialCheckpoint);
18721872
tracker.startReplicationLagTimers(initialCheckpoint);
1873+
// retry start replication lag timers
1874+
tracker.startReplicationLagTimers(initialCheckpoint);
18731875
tracker.setLatestReplicationCheckpoint(secondCheckpoint);
18741876
tracker.startReplicationLagTimers(secondCheckpoint);
18751877
tracker.setLatestReplicationCheckpoint(thirdCheckpoint);

0 commit comments

Comments
 (0)