Kick vsock during VM resume #4796
Merged
661ae3b
to
99a51f1
Compare
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #4796 +/- ##
==========================================
- Coverage 84.32% 84.32% -0.01%
==========================================
Files 249 249
Lines 27519 27521 +2
==========================================
Hits 23206 23206
- Misses 4313 4315 +2
roypat
reviewed
Sep 12, 2024
c89c074
to
b0bd4cf
Compare
pb8o
reviewed
Sep 12, 2024
We need to kick the vsock queue during resume so that the VM processes the `TRANSPORT_RESET_EVENT` we sent during snapshot creation. Otherwise it will wait for it forever.
Signed-off-by: Egor Lazarchuk <[email protected]>
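The reasoning in this commit message can be illustrated with a toy model. This is only a sketch, not Firecracker code: `Guest`, `create_snapshot`, `resume`, and the connection names are hypothetical, and the assumption is simply that the guest driver inspects the event queue only when it is notified.

```python
from collections import deque

TRANSPORT_RESET_EVENT = "TRANSPORT_RESET_EVENT"


class Guest:
    """Toy guest driver: it only inspects the event queue when notified."""

    def __init__(self):
        # Hypothetical open vsock connections.
        self.connections = {"conn-1", "conn-2"}

    def on_notification(self, event_queue):
        # Drain the queue; a reset event makes the guest drop every connection.
        while event_queue:
            if event_queue.popleft() == TRANSPORT_RESET_EVENT:
                self.connections.clear()


def create_snapshot(event_queue):
    # The device queues the reset event while the snapshot is being created.
    event_queue.append(TRANSPORT_RESET_EVENT)


def resume(guest, event_queue, kick):
    # Without the kick the event stays queued and the guest never looks at it.
    if kick:
        guest.on_notification(event_queue)
```

In this model, resuming with `kick=False` leaves both the queued event and the stale connections in place, which corresponds to the guest waiting forever. Resuming with `kick=True` drains the queue and closes the connections.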
b0bd4cf
to
1ab4976
Compare
The test starts a `socat` server on the host and a `socat` client in the guest. It then validates that the `socat` client stops after a snapshot is taken.
Signed-off-by: Egor Lazarchuk <[email protected]>
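The "client stops" check described here can be sketched as waiting on a process handle with a timeout. This is an illustration only: `wait_for_exit` and `spawn_stand_in_client` are hypothetical helpers, and a short-lived Python child process stands in for the `socat` client, which the real test runs inside the guest.

```python
import subprocess
import sys


def spawn_stand_in_client(lifetime_s):
    # Stand-in for the guest-side client: a child process that exits by itself.
    code = f"import time; time.sleep({lifetime_s})"
    return subprocess.Popen([sys.executable, "-c", code])


def wait_for_exit(proc, timeout_s):
    """Return True if the process exited within timeout_s seconds."""
    try:
        proc.wait(timeout=timeout_s)
        return True
    except subprocess.TimeoutExpired:
        return False
```

A test built on this would fail when `wait_for_exit` returns `False`, which is the symptom of a client that never learns its connection was reset.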
1ab4976
to
454480a
Compare
roypat
previously approved these changes
Sep 12, 2024
kalyazin
reviewed
Sep 12, 2024
6514a21
to
1e851f6
Compare
kalyazin
previously approved these changes
Sep 12, 2024
roypat
previously approved these changes
Sep 12, 2024
Add an entry for the vsock fix to the CHANGELOG.
Signed-off-by: Egor Lazarchuk <[email protected]>
1e851f6
to
a51d09a
Compare
kalyazin
approved these changes
Sep 12, 2024
roypat
approved these changes
Sep 12, 2024
ShadowCurse
added a commit
to ShadowCurse/firecracker
that referenced
this pull request
Sep 18, 2024
This is a fix for the fix introduced in firecracker-microvm#4796. The issue was the vsock device hanging after snapshot restoration because the guest was not notified about the termination packet. But there was a bug in the fix: we saved the vsock state before the notification was sent, thus discarding all modifications made to send the notification. The original fix appeared to work because we were only testing with 1 iteration of snap/restore. Even though we lost synchronization with the guest in the event queue state, it worked fine once. But doing more iterations causes vsock to hang as before. This commit fixes the issue by storing the vsock state after the notification is sent, and modifies the vsock test to run multiple iterations of snap/restore.
Signed-off-by: Egor Lazarchuk <[email protected]>
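Why one snap/restore iteration hid the ordering bug can be shown with a small simulation. It is a sketch under stated assumptions, not the actual device code: the `Vm` fields and the `snapshot`, `resume`, and `run` helpers are hypothetical, and the model assumes that the device keeps a private used-index for the event queue, that the guest compares the shared index with the last one it saw, and that only the device-private part can be saved too early.

```python
from dataclasses import dataclass


@dataclass
class Vm:
    device_next_used: int = 0   # device-private event queue position (saved state)
    shared_used_idx: int = 0    # used index as visible in guest memory
    guest_last_seen: int = 0    # last used index the guest driver processed
    guest_conn_open: bool = True


def snapshot(vm, save_after_notify):
    # Buggy order: capture the device state before the reset event is queued.
    saved_device = vm.device_next_used
    # Queue the reset event: advance the device position and publish it.
    vm.device_next_used += 1
    vm.shared_used_idx = vm.device_next_used
    if save_after_notify:
        # Fixed order: capture the device state after the event is queued.
        saved_device = vm.device_next_used
    # Guest memory (shared index and guest driver state) is saved as it is now.
    return Vm(saved_device, vm.shared_used_idx, vm.guest_last_seen, vm.guest_conn_open)


def resume(vm):
    # Kick: the guest compares the shared index with what it has already seen.
    if vm.shared_used_idx != vm.guest_last_seen:
        vm.guest_last_seen = vm.shared_used_idx
        vm.guest_conn_open = False
    return not vm.guest_conn_open  # True when the guest noticed the reset


def run(iterations, save_after_notify):
    vm, noticed = Vm(), []
    for _ in range(iterations):
        vm.guest_conn_open = True  # guest opens a fresh connection
        vm = snapshot(vm, save_after_notify)
        noticed.append(resume(vm))
    return noticed
```

With the buggy order, the restored device reuses a stale position, so from the second iteration on it publishes an index the guest has already seen and the reset goes unnoticed. Saving after the notification keeps the two views in step across iterations.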
ShadowCurse
added a commit
that referenced
this pull request
Sep 18, 2024
RiverPhillips
pushed a commit
to RiverPhillips/firecracker
that referenced
this pull request
Sep 20, 2024
kalyazin
pushed a commit
to kalyazin/firecracker
that referenced
this pull request
Oct 8, 2024
bchalios
pushed a commit
that referenced
this pull request
Oct 8, 2024
ShadowCurse
added a commit
to ShadowCurse/firecracker
that referenced
this pull request
Nov 5, 2024
CompuIves
added a commit
to codesandbox/firecracker
that referenced
this pull request
Dec 3, 2024
Labels
Status: Awaiting review
Indicates that a pull request is ready to be reviewed
Type: Bug
Indicates an unexpected problem or unintended behavior
Type: Fix
Indicates a fix to existing code
Changes
Kick vsock when resuming the VM. This ensures that if the VM is restored from a snapshot, the guest gets a notification to read the queue with `TRANSPORT_RESET_EVENT` inside. This triggers the guest to close all vsock connections. This is expected behavior, as vsock devices do not support carrying connections across the snapshot boundary.
Reason
Fixes #4736
License Acceptance
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check `CONTRIBUTING.md`.
PR Checklist
PR.
`CHANGELOG.md`.
`TODO`s link to an issue.
contribution quality standards.
`rust-vmm`.