Apply WaitStabilityMinDuration when syncing blocks #5406


Closed

Conversation

@justinjung04 (Contributor) commented Jun 14, 2023

What this PR does:

Currently, WaitStabilityMinDuration and WaitStabilityMaxDuration are only used when a new store-gateway instance joins the ring, to make sure the ring info is propagated to all members of the ring before the initial sync.

However, this config was not applied to existing members of the ring. For example, when two new members join within a split second, an existing member may start syncing blocks with only one of them added to its view of the ring, since it simply polls for any ring change every 5 seconds without a stability check. This is not desired when scaling more than one store-gateway up or down at once, or during a rollout deployment.

This PR makes the periodic and ring-change block syncs wait for ring token and zone stability before kicking off a block sync.

Which issue(s) this PR fixes:
n/a

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@justinjung04 justinjung04 force-pushed the block-sync-improvement branch 4 times, most recently from a9c5f3c to 3f0f587 Compare June 15, 2023 23:48
@justinjung04 justinjung04 marked this pull request as ready for review June 16, 2023 00:09
@justinjung04 justinjung04 force-pushed the block-sync-improvement branch 2 times, most recently from b605588 to e234318 Compare June 18, 2023 12:13
@justinjung04 justinjung04 force-pushed the block-sync-improvement branch from e234318 to 8d283d7 Compare June 18, 2023 12:16
@yeya24 (Contributor) left a comment


Thanks, this LGTM!

@justinjung04 justinjung04 force-pushed the block-sync-improvement branch from ed87699 to aa909d7 Compare June 21, 2023 18:05
@justinjung04 (Contributor, Author) commented:

Please hold on, I'm verifying whether my change could leave no blocks available when the querier or ruler makes a call.

```diff
@@ -370,6 +372,10 @@ func (g *StoreGateway) waitRingStability(ctx context.Context, reason string) {
 	minWaiting := g.gatewayCfg.ShardingRing.WaitStabilityMinDuration
 	maxWaiting := g.gatewayCfg.ShardingRing.WaitStabilityMaxDuration
 
+	if !g.gatewayCfg.ShardingEnabled || minWaiting <= 0 {
```
@yeya24 (Contributor) commented on the diff:

Isn't this a duplicate of shouldWaitRingStability? I think we can keep just one.

@justinjung04 (Contributor, Author) replied:

Oops, no, that's supposed to be removed; I'll delete it.

@justinjung04 (Contributor, Author) commented Jun 21, 2023

Here's my concern:

  • a total of 10 replicas, SG1 through SG10; B1 used to be in SG2, SG5, SG10
  • at T0, SG11 and SG12 are added to the ring; B1 is now mapped to SG7, SG11, SG12
  • at T1, SG7, SG11 and SG12 are all waiting for the ring to be stable
  • at T2, the querier asks for B1, checks SG7, SG11, SG12, sees none of them have loaded B1 yet, and returns 503
  • at T3, SG7, SG11, SG12 start syncing the block
  • at T4, the querier asks for B1 again, checks SG7, SG11, SG12, sees none of them have loaded B1 yet, and returns 503
  • at T5, SG7, SG11, SG12 start syncing the block

I'll verify what SGs will be returned to the querier when it tries to read B1 at T2 and T4, and add a test case if possible.

Signed-off-by: Justin Jung <[email protected]>
@harry671003 (Contributor) commented Jun 22, 2023

> Here's my concern: […] I'll verify what SGs will be returned to the querier when it tries to read B1 at T2 and T4, and add a test case if possible.

It doesn't work like this. If only two new store-gateways are added to the ring, the block will still be assigned to at least one of the three store-gateways that previously held it, because of consistent hashing.

E.g. SG2, SG11, SG12

Because queriers retry three times, they will eventually be able to query a store-gateway that has the block.

This is why we scale store-gateways one by one, to keep ring disruptions to a minimum.

@justinjung04 (Contributor, Author) commented:
Closing this PR in favor of using -store-gateway.sharding-ring.keep-instance-in-the-ring-on-shutdown, which was introduced in #5421. Here is the configuration and the different scenarios:

```
-store-gateway.sharding-ring.keep-instance-in-the-ring-on-shutdown=true
-store-gateway.sharding-ring.tokens-file-path=/data/token_file
-store-gateway.sharding-ring.heartbeat-timeout=1m
-store-gateway.sharding-ring.replication-factor=3
```
  1. Rollout deployment: ring sync is never triggered, as the members of the ring and their tokens do not change
  2. Scale down: ring sync is triggered 1 minute after shutdown, as the instance state changes from LEAVING to unhealthy after the heartbeat timeout. To avoid query unavailability during that minute, we should only scale down one pod at a time
  3. Scale up: ring sync is triggered as soon as a new instance joins the ring
