-
Notifications
You must be signed in to change notification settings - Fork 2.2k
Fix bugs in silences #3892
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix bugs in silences #3892
Conversation
@@ -518,14 +518,10 @@ func matchesEmpty(m *pb.Matcher) bool { | |||
} | |||
|
|||
func validateSilence(s *pb.Silence) error { | |||
if s.Id == "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have removed checks for the Id
and UpdatedAt
fields from validateSilence
. There are two reasons for this:
- I have moved validation of silences to the start of the function before the
Id
andUpdatedAt
fields are assigned. This is to fix the issue where an existing silence is expired but the new silence is not created. - These fields are supposed to be set in
silences.Set(*pb.Silence)
, not by the calling code. We assert that these are present within tests, not validation code that covers both internal and external fields.
s := *sil | ||
return &s | ||
// Expire the silence with the given ID immediately. | ||
func (s *Silences) Expire(id string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No changes here, just been moved up in the file.
msil, ok := s.st[id] | ||
if !ok { | ||
return nil, false | ||
// Set inserts a new silence or updates an existing silence. It inserts a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No changes to existing behavior here, this is just documentation.
// time extensions), the existing silence is expired and replaced with a | ||
// new silence with a different ID. | ||
// | ||
// When updating an existing silence, the existing silence must be cloned |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again no changes to existing behavior here, this is just documentation. The requirement that silences are cloned before being updated was always there (see silence_test.go for examples of this). What I have done here is documented the fact and added an explicit Clone()
function to deep copy a silence.
// When updating an existing silence, the existing silence must be cloned | ||
// using Clone() before making any modifications to the silence. This | ||
// prevents situations where the stored silence is modified because both | ||
// pointers reference the same memory. If Clone() is not used before |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clone()
is required otherwise functions like canUpdate
break. Again, no changes here, this is just documentation.
sil3 := cloneSilence(sil2) | ||
sil3.EndsAt = start3.Add(100 * time.Minute) | ||
|
||
// Can update silence without modifications. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added an extra test case here.
require.NoError(t, s.Set(sil3)) | ||
require.Equal(t, sil2.Id, sil3.Id) | ||
|
||
// Can update silence with new comment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added another test case here.
require.NoError(t, s.Set(sil4)) | ||
require.Equal(t, sil3.Id, sil4.Id) | ||
|
||
// Extend sil4 to expire at a later time. This should not expire the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are existing test cases, but the variable names have been updated since I added two new cases above.
@@ -479,23 +502,18 @@ func TestSilenceLimits(t *testing.T) { | |||
StartsAt: time.Now(), | |||
EndsAt: time.Now().Add(5 * time.Minute), | |||
} | |||
require.EqualError(t, s.Set(sil2), "exceeded maximum number of silences: 1 (limit: 1)") | |||
require.Equal(t, "", sil2.Id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't need to check the id here, removed.
require.NotEqual(t, "", sil3.Id) | ||
require.EqualError(t, s.Set(sil3), fmt.Sprintf("exceeds limits: maximum size: %d bytes (limit: 4096 bytes)", s.toMeshSilence(sil3).Size())) | ||
|
||
// Should be able to insert sil4 and then update it without modifications. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added additional test cases to show that the fixes work.
This commit fixes a number of bugs that can occur when creating and updating silences: 1. When creating a silence fails because the silence is invalid or would exceed limits, subsequent calls for the same silence fail even when the validation or limits are fixed. This happens because the silence is assigned an ID that does not exist. This commit fixes that issue as it assigns a new ID to silences that do not exist. 2. When updating an existing silence, and the update changes how alerts are matched or the time window in which the silence is active (excluding time extensions), then the existing silence must be expired and a new silence created. However, if the new silence is invalid or would exceed limits, the existing silence is expired, but the new silence is not created. This commit fixes that issue such that existing silences are not expired unless the replacing silence is both valid and within limits. Signed-off-by: George Robinson <[email protected]>
026afc4
to
1e18844
Compare
This commit updates the TestPostSilencesHandler acceptance tests to expect 200 when making a POST request for a silence that does not exist. In this case, a new silence is created with a different ID. Signed-off-by: George Robinson <[email protected]>
@@ -240,11 +240,13 @@ func TestPostSilencesHandler(t *testing.T) { | |||
expectedCode int | |||
}{ | |||
{ | |||
"with an non-existent silence ID - it returns 404", | |||
// This can happen when updating an expired silence that has been GC'd. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are new semantics for silences, so please review with care. If considered problematic then I will reconsider how silences.Set
works for missing IDs.
Signed-off-by: George Robinson <[email protected]>
51305d9
to
9f4a9a7
Compare
This commit fixes a number of bugs that can occur when creating and updating silences:
When creating a silence fails because the silence is invalid or would exceed limits, subsequent calls for the same silence fail even when the validation or limits are fixed. This happens because the silence is assigned an ID that does not exist. This commit fixes that issue as it assigns a new ID to silences that do not exist.
When updating an existing silence, and the update changes how alerts are matched or the time window in which the silence is active (excluding time extensions), then the existing silence must be expired and a new silence created. However, if the new silence is invalid or would exceed limits, the existing silence is expired, but the new silence is not created. This commit fixes that issue such that existing silences are not expired unless the replacing silence is both valid and within limits.
Fixes #3877