-
Notifications
You must be signed in to change notification settings - Fork 838
Add proper categorization for client connection closing error #6844
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
||
// Check for gRPC connection closing error | ||
if strings.Contains(err.Error(), constants.GRPCConnectionClosingError) { | ||
logger.Warn(constants.GRPCConnectionClosingError, tag.Error(err)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
other failure types have a corresponding metric. let's define one for this to be able to keep track.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is normal, should we still have as a warn and not info?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though this error is not within our control, I think making it into a warn level can help during events like network outage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we can't use errors.Is
/As so we can handle wrapping?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I have change the code to use error.Is instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
errors.Is
expect same type of error. I'm not sure the actual error we receive will match with your inline errors.New(constants.GRPCConnectionClosingError)
. Can you try to repro the issue locally to make sure this part works (stop matching while there's a long poll request made by frontend)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after some discussion offline I agree it might be slightly fiddly to be able to match it. I think it might be possible, but is not worth blocking on. It's not 100% clear to me if the error bubbling up is a YARPC error or a GPRC error and so any attempt to catch the wrapped variant would need to handle this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From grpc manual, it's cancelled error. Yarpc must have a mapping of canceled error type. Thus it should be handled like
if err.Code() == yarpcerrors.Canceled
// Check for gRPC connection closing error | ||
if strings.Contains(err.Error(), constants.GRPCConnectionClosingError) { | ||
logger.Warn(constants.GRPCConnectionClosingError, tag.Error(err)) | ||
return err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the implication of returning err
as is vs returning frontendInternalServiceError
from this function? does that change retry behavior on the caller side?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
frontendInternalServiceError is simply a wrapper of fmt.Errorf and make sure the error not being a thrift error. It does not change any retry behavior on the caller side.
…ror type cannot be ascertained through gRPC transport
metricsClient := metrics.NewClient(testScope, metrics.Frontend) | ||
handler := &apiHandler{} | ||
|
||
err := handler.handleErr(errors.New(constants.GRPCConnectionClosingError), metricsClient.Scope(0), logger) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not covering the yarpc canceled error branch.
What changed?
Why?
Previously, gRPC client connection closing errors were being treated as uncategorized errors, which:
These errors are common during normal operation (e.g., when clients disconnect) and should be handled gracefully
How did you test it?
Unit tests
Potential risks
Release notes
Documentation Changes