Retry disabled on "read: connection reset" #3971

Closed
3 tasks done
vcschapp opened this issue Jun 22, 2021 · 7 comments
Labels
feature-request A feature should be added or improved.

Comments

@vcschapp

Confirm by changing [ ] to [x] below to ensure that it's a bug:

Describe the bug

Retry strategy does not retry when remote host causes "read: connection reset".

Version of AWS SDK for Go?
Example: v1.20.2 .. v1.38.65 inclusive

Version of Go (go version)?
1.15+, but the version of Go isn't really relevant here

To Reproduce (observed behavior)

  1. Use any AWS service client within the SDK with MaxRetries > 0
  2. Get a TCP connection reset from the remote service
  3. It doesn't retry. Instead the SDK client immediately gives up with an error like: RequestError: send request failed\ncaused by: Post \"https://<service>.<region>.amazonaws.com/records\": read tcp 169.254.76.1:35798->52.94.227.177:443: read: connection reset by peer

Expected behavior

It should retry.

Additional context

  • The commit that disables the retry for "read: connection reset" is c3d27102, which references S3 multipart upload.
  • This makes me think that this change might have been put in to solve an issue with S3 multipart upload while breaking the more general behavior with other services.
  • In general, resets often happen when a remote service is started up and added to a load balancer, or shut down and removed from one, without a graceful transition. I don't understand why this wouldn't be a good opportunity to retry.
@vcschapp vcschapp added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Jun 22, 2021
@skmcgrail
Member

This has been raised before in several GitHub issues, so for simplicity I will link the reasoning for why the SDK does not retry read: connection reset by peer errors. See here.

If, in the context of your application, the operation in question is idempotent and safe to retry, then you may wish to implement a custom retryer that allows retrying for this specific error condition. See the aws/request#Retryer interface.

@skmcgrail skmcgrail added guidance Question that needs advice or information. response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. and removed bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Jun 24, 2021
@github-actions

github-actions bot commented Jul 2, 2021

This issue has not received a response in 1 week. If you want to keep this issue open, please just leave a comment below and auto-close will be canceled.

@github-actions github-actions bot added the closing-soon This issue will automatically close in 4 days unless further comments are made. label Jul 2, 2021
@vcschapp
Author

vcschapp commented Jul 2, 2021

Thanks Sean.
I appreciate the answer (and incidentally sorry for putting you through another iteration of the same bug report).

If I could suggest one action on the AWS side that might reduce the incidence of these bug reports and help other devs understand the idempotency point you're making, it would be this: could you add a comment in the code with a one-line explanation of why read: connection reset by peer isn't retried, and put a slightly deeper explanation (maybe with reference to this ticket and others) into the commit message? Having that would have helped me take other steps without bothering you with a ticket.

@github-actions github-actions bot removed closing-soon This issue will automatically close in 4 days unless further comments are made. response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. labels Jul 3, 2021
@KaibaLopez
Contributor

Sounds like a good idea, @vcschapp.
Would you like to try making a PR with the changes, or would you rather we handle it?

@KaibaLopez KaibaLopez added the feature-request A feature should be added or improved. label Jul 6, 2021
@vcschapp
Author

vcschapp commented Jul 6, 2021

My preference would be for you to handle it.

@vudh1 vudh1 removed the guidance Question that needs advice or information. label Apr 1, 2022
@Emil-G

Emil-G commented Jan 10, 2023

Has this been fixed?

@lucix-aws lucix-aws closed this as not planned Apr 1, 2024

github-actions bot commented Apr 1, 2024

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.
