-
Notifications
You must be signed in to change notification settings - Fork 1k
Redis Cluster Client Deadlock with custom SocketAddressResolver #3240
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
😲 This is - by far - the best issue report that I have ever read. Huge appreciation for all the work done to analyse and document the issue! |
Update - While concoting workarounds, same issue was observed outside of the custom logic, because MappingSocketAddressResolver does this: |
Managed to work around the issue by copying the I can open a PR during the weekend to fix and test this scenario if you want :) |
Sure, if you have the code we'd love to see it contributed! |
you rock @henry701 |
@tishun PR opened, cheers |
Bug Report
Reproduction Conditions
The deadlock occurs when attempting to connect to a Redis cluster when an exception is thrown from the
SocketAddressResolver
.I'm using Spring Data, but this error is reproducible without using it as well.
Environment
Logger Warning Stack Trace
The error that is silently caught but not properly handled:
Thread Dump Stack Trace
Reproduction
I've created a repository here with the repro of this bug: https://github.com/henry701/lettuce-bug-report-infiwait
Pretty straightforward, just
mvn compile exec:java
and the application will compile and hang.Investigation
After analyzing the code execution flow together with the stack trace and a thread dump, the following was determined:
getPartitions()
which is waiting for a promise returned byinitializePartitions()
loadPartitionsAsync()
is never completedloadViews()
, which is also never completedloadViews()
future depends on a future from theConnectionTracker
class passed to theopenConnections()
methodtracker.addConnection(redisURI, sync)
when errors occur in certain flowsSocketAddress socketAddress = clientResources.socketAddressResolver().resolve(redisURI)
when calling a custom implementation ofSocketAddressResolver
, which fails due to an invalid port format configuration in my case.The issue lies in the
openConnections
method, where exceptions during address resolution are caught and logged, but thesync
CompletableFuture
is not properly completed or added to the tracker in this error path.Possible Solution
Move the declaration of
CompletableFuture<StatefulRedisConnection<String, String>> sync = new CompletableFuture<>()
to the top of the loop in theopenConnections
method, and modify the catch block to contain:This ensures that even when there's a connection error, the future is properly completed exceptionally and added to the
ConnectionTracker
, completing its future exceptionally and preventing the deadlock.Additional Context
This bug causes applications to hang indefinitely when attempting to connect to a Redis cluster with invalid node configuration, which can occur in various scenarios such as:
The thread is stuck parking forever unless interrupted. As a result when ran from the main thread or from a thread which is waited upon such as a web thread, the application becomes unresponsive and requires continuous restarts to recover.
The text was updated successfully, but these errors were encountered: