Fix for alertmanagers not configuring users before becoming ACTIVE. #4110

stevesg · 2021-04-23T12:53:51Z

What this PR does:

(Only applies to sharding operation)

Currently, when the alertmanager joins the sharding ring, it tries to
load user configurations, and checks the ring to see which users it
should be servicing. Once finished, the alertmanager changes its state
to ACTIVE.

However, when performing this initial sync of user configurations, the
alertmanager is in the JOINING state, but checking if a user is owned
requires an instance to be in the ACTIVE state. Therefore, the initial
sync will never configure any users, but the instance will be
considered ACTIVE.

This change fixes the initial sync by considering a user to be owned by
the instance if it is in either the ACTIVE or JOINING state.
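To illustrate the idea, here is a minimal, self-contained sketch of the ownership check. It does not use the real Cortex ring package; the `InstanceState`, `Instance`, and `ownsUser` names are hypothetical and exist only for this example. It assumes ownership is decided by finding the local instance among a user's replicas and checking whether its state is in an allowed set.

```go
package main

import "fmt"

// InstanceState is a simplified stand-in for a ring instance state
// (hypothetical; not the Cortex ring type).
type InstanceState int

const (
	Joining InstanceState = iota
	Active
	Leaving
)

// Instance is a hypothetical ring member.
type Instance struct {
	Addr  string
	State InstanceState
}

// ownsUser reports whether the instance at selfAddr is one of the given
// replicas for a user AND its state is one of the allowed states.
func ownsUser(replicas []Instance, selfAddr string, allowed ...InstanceState) bool {
	for _, inst := range replicas {
		if inst.Addr != selfAddr {
			continue
		}
		for _, s := range allowed {
			if inst.State == s {
				return true
			}
		}
	}
	return false
}

func main() {
	// During the initial sync, the local instance (am-1) is still JOINING.
	replicas := []Instance{
		{Addr: "am-1", State: Joining},
		{Addr: "am-2", State: Active},
	}

	// Old behaviour: only ACTIVE counts, so the initial sync finds no users.
	fmt.Println(ownsUser(replicas, "am-1", Active)) // false

	// Fixed behaviour: ACTIVE or JOINING counts, so the user is configured.
	fmt.Println(ownsUser(replicas, "am-1", Active, Joining)) // true
}
```

With only ACTIVE allowed, a JOINING instance never sees itself as an owner, which matches the empty initial sync described above. Allowing JOINING as well lets the sync configure users before the state change.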

Note that this fix should address the sporadic integration test failures,
in which the distributor forwards requests to an instance which it believes
is configured, but the request then fails because the instance is not ready yet.

Checklist

~~Tests updated~~
~~Documentation added~~
~~CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]~~

pstibrany

Makes sense to me. Thanks!

pracucci

Makes sense! We do it in the store-gateway too. I left a question.

pkg/alertmanager/alertmanager_ring.go

pracucci

LGTM!

pull-request-size bot added the size/XS label Apr 23, 2021

pstibrany approved these changes Apr 23, 2021

Fix race condition in unit test.

e288417

Signed-off-by: Steve Simpson <[email protected]>

pull-request-size bot added size/S and removed size/XS labels Apr 23, 2021

pracucci reviewed Apr 23, 2021

pkg/alertmanager/alertmanager_ring.go (outdated, resolved)

pkg/alertmanager/alertmanager_ring.go (outdated, resolved)

Review comments.

e9a733d

Signed-off-by: Steve Simpson <[email protected]>

stevesg marked this pull request as ready for review April 23, 2021 14:13

pracucci approved these changes Apr 23, 2021

gotjosh approved these changes Apr 23, 2021

pstibrany merged commit 63703f5 into cortexproject:master Apr 23, 2021
