-
Notifications
You must be signed in to change notification settings - Fork 400
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Warn about duplicated events received from GitHub via Admin Monitor #388
Warn about duplicated events received from GitHub via Admin Monitor #388
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Made some suggestions.
src/main/java/org/jenkinsci/plugins/github/extension/GHSubscriberEvent.java
Outdated
Show resolved
Hide resolved
src/main/java/org/jenkinsci/plugins/github/webhook/subscriber/DuplicateEventsSubscriber.java
Outdated
Show resolved
Hide resolved
src/main/java/org/jenkinsci/plugins/github/webhook/subscriber/DuplicateEventsSubscriber.java
Outdated
Show resolved
Hide resolved
src/main/java/org/jenkinsci/plugins/github/webhook/subscriber/DuplicateEventsSubscriber.java
Outdated
Show resolved
Hide resolved
src/main/java/org/jenkinsci/plugins/github/webhook/subscriber/DuplicateEventsSubscriber.java
Outdated
Show resolved
Hide resolved
…DuplicateEventsSubscriber.java Co-authored-by: Vincent Latombe <[email protected]>
…admin monitor (checking if this is the best, if yes then will write a test for it)
I am fixing how the last duplicate event is presented to the user from, Edit: I made the payload open in a new browser tab since browsers like Firefox and Chrome format JSON nicely. With ~200 lines, it's easy to view without issues. The admin can still download it if needed or simply review it in the browser and proceed. |
src/main/java/org/jenkinsci/plugins/github/admin/GitHubDuplicateEventsMonitor.java
Fixed
Show resolved
Hide resolved
I think this PR is now ready 😄 I feel like covered all the features, and automated tests and manual tests. |
(the CI failure is infra issue in windows (linux already passed), I am currently thinking to push new commits are a PR review, that way the CI will retrigger) |
src/main/java/org/jenkinsci/plugins/github/extension/GHSubscriberEvent.java
Outdated
Show resolved
Hide resolved
src/main/java/org/jenkinsci/plugins/github/webhook/subscriber/DuplicateEventsSubscriber.java
Outdated
Show resolved
Hide resolved
src/main/java/org/jenkinsci/plugins/github/webhook/subscriber/DuplicateEventsSubscriber.java
Outdated
Show resolved
Hide resolved
src/test/java/org/jenkinsci/plugins/github/admin/GitHubDuplicateEventsMonitorTest.java
Outdated
Show resolved
Hide resolved
…rties Co-authored-by: Jesse Glick <[email protected]>
…teEventsMonitorTest.java Co-authored-by: Jesse Glick <[email protected]>
…teEventsMonitor.java Co-authored-by: Jesse Glick <[email protected]>
…teEventsMonitor.java Co-authored-by: Jesse Glick <[email protected]>
…berEvent.java Co-authored-by: Jesse Glick <[email protected]>
… duplicate monitor
All updates done after the code reviews for now. I have tested and made sure the same testing notes are still applicable and tested again as well. |
src/main/java/org/jenkinsci/plugins/github/admin/GitHubDuplicateEventsMonitor.java
Outdated
Show resolved
Hide resolved
@KostyaSha are you available to review? |
…teEventsMonitor.java Co-authored-by: Jesse Glick <[email protected]>
i trust you, can merge as is. |
imo, it should be squash merged |
|
The test failure above corresponds to Lines 83 to 86 in 596fbc3
This doesn't have any timing logic, so I am getting delayed to figure out what could have caused (even the timing logic shouldn't cause as problem, as a FakeTicker is used to mock time) I couldn't reproduce in local (terminal ( mvn test logs correctly printed the number of tests passed
|
did it happen all the time or was it flaky ? (mean time I am trying to think more and see if I can reproduce) |
I run twice during release process and it failed both times... |
Passing for me on Ubuntu with Java 21. |
released with 21, 17 failed before |
I tested with Java 17 but couldn't reproduce the issue, so it's likely not related to the Java version. Instead, it might be due to the static nature of the cache: github-plugin/src/main/java/org/jenkinsci/plugins/github/admin/GitHubDuplicateEventsMonitor.java Lines 117 to 121 in 54fb24f
If the two tests in GitHubDuplicateEventsMonitorUnitTest run in parallel, they may interfere with cache expiration by advancing the fakeTicker. However, still not sure why
Caffeine is already thread-safe, so it’s unlikely to be causing racing condition in itself during parallel executions. |
To prevent cache residues, I can add an Are the tests already running sequentially (in local they seem to run sequential), or are they parallelized? |
Ah. @gbhat618 github-plugin/src/main/java/org/jenkinsci/plugins/github/admin/GitHubDuplicateEventsMonitor.java Line 117 in 54fb24f
static . (It should be an instance field of the @Extension singleton.) Usages outside instance methods of the class should therefore use ExtensionList.lookupSingleton .
In general, this mistake can cause hard-to-track test failures when using |
noted about it. thank you. sorry for the mistake on this, |
Problem
If duplicated GitHub events get delivered to Jenkins controller, Jenkins need to act on events both times (they are not automatically de-duplicated). These events may be a branch scan, build trigger etc.
further discussion of problem:
A branch scan event running twice may not be a problem, however build triggering twice could be a problem. Even the builds doesn't get in triggered twice due to
quietPeriod: 5s
(unless the quietPeriod was set to 0, or duplicates come after quietPeriod: 5s, which case two builds do get triggered).Note: usually GH by itself won't automatically redeliver an event if failed, GH simply marks the event as failed (an event is marked failed if GH didn't receive a 200 OK response within 10s; reference)
The reason for such duplicated events is,
Although it is typically rare for this duplication issue to happen, if it happens, it could require quite some time to figure out what is going on. And in some case it may simply go unnoticed.
Logs showing build can trigger twice in case of `quietPeriod: 0`
Note: in the below logs, when the two events came
Started by GitHub push by gbhat618
twice (ref); however rest of logs showed build steps were executed only once; I verified by this by having a single executor.Proposed solution
An admin monitor to show duplicate events are being received - admin monitor goes away by itself after 24 hours if no more duplicates.
further discussion of solution:
Duplication of event is identified by a header
X-GitHub-Delivery
(reference, also linked in javadoc) which is a unique identifier for an event in GitHub.An admin monitor is shown to warn about duplicated events being received. If the duplicated events are to occur, they typically occur in a few seconds window itself, we track an event for duplicate checking for about
10 minutes
- so it should be enough to catch the issue.Event tracking cache:
Caffeine is used for caching via caffeine-api-plugin. The cache max size is attempted to be kept at ~1MB by setting max size, and ttl (further notes in javadoc).
The admin can find the last received duplicate payload via the
Click Here
button shown (see the screenshots in Testing section) -- it opens the payload JSON in the new browser tab next to current tab. (browsers like firefox and chrome are by default acting as pretty json viewer - due to content-typeapplication/json
)I have considered alternative ways of presenting the event payload as well,
Click Here
: most of the admins can take a decision directly from seeing the payload in the browser (downloading to laptop - is another problem for the admin to cleanup the file later)Testing done
Automated tests are submitted
Manual testing
quietPeriod: 0
).Screenshots
Submitter checklist