Is your feature request related to a problem? Please describe.
My colleague chatted with you on Discord about this. Our main use case is disparate clusters, where each one is behind a load balancer. We want to use one of those clusters as the primary location, and only swap to the other cluster when the main cluster is completely down/unavailable. We think this could easily be achieved with a little more input from externally pluggable logic when setting up the ConnectionFactory.
Describe the solution you'd like
Ultimately, something like the RetryListener, but for Connections and not just for topology recovery. Alternatively, it could be done with lambdas such as Predicates, with access to the connection that failed. A connection retry count could also be passed in to help make judgments.
We envision setting cluster tags in our servers that inform the client about which cluster they're connected to, and perhaps additionally a cluster tag to indicate that the address used was behind a load balancer. So, we could check to see if the server tags indicate a load balancer address, combined with the reason the connection was shut down.
Maybe an easy way to plug this in today is an interface that returns an AddressResolver; the default implementation would return the current AddressResolver unconditionally, preserving current behavior.
Then we could send a non-shuffling list of [secondary, primary] when there's an unexpected issue, or if retry count goes higher than some tolerable level, otherwise ask the system to attempt [primary, secondary] in standard scenarios. Or even skip sending primary/secondary together and let the new implementation determine whether the primary or secondary should be tried by itself. I.e. try primary three times, then try secondary three times, then give up.
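The "try primary a few times, then fall back to secondary" ordering above can be sketched roughly as follows. Everything here is notional: the class name, the address strings, and the three-attempt threshold are illustrative, not existing client API.

```java
import java.util.List;

// Hypothetical sketch: choose the (non-shuffled) address order based on
// how many consecutive failures the primary cluster has seen.
public class FailoverOrdering {

    static final int MAX_PRIMARY_ATTEMPTS = 3;

    // Returns the candidate list, in order of preference, for the next attempt.
    static List<String> addressesFor(int primaryFailures) {
        if (primaryFailures >= MAX_PRIMARY_ATTEMPTS) {
            // Primary looks down: put the secondary cluster first.
            return List.of("secondary-lb:5672", "primary-lb:5672");
        }
        // Standard scenario: prefer the primary cluster.
        return List.of("primary-lb:5672", "secondary-lb:5672");
    }

    public static void main(String[] args) {
        System.out.println(addressesFor(0)); // [primary-lb:5672, secondary-lb:5672]
        System.out.println(addressesFor(3)); // [secondary-lb:5672, primary-lb:5672]
    }
}
```

A real implementation would also decide when to reset the failure counter and when to give up entirely, as described above.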
I have not yet looked at the downstream impacts of wiring this through the existing code. First we just want to hash out ideas on what you guys like / don't like. We're willing to do the legwork to contribute.
Describe alternatives you've considered
Currently we override AddressResolver to always return a fixed list and skip shuffling, which works mostly well but there are edge cases where a client may cascade to the more distant cluster when their primary is still up.
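The current workaround looks roughly like this. The Resolver interface below is a minimal stand-in so the example is self-contained; the client's real AddressResolver works with Address objects and checked exceptions, and the names here are illustrative.

```java
import java.util.List;

// Minimal stand-in for the client's resolver abstraction.
interface Resolver {
    List<String> getAddresses();
    // Skip shuffling: return the list in the caller's order.
    default List<String> maybeShuffle(List<String> input) { return input; }
}

// Workaround sketch: always return the same fixed, ordered list.
public class FixedListResolver implements Resolver {
    private final List<String> addresses;

    FixedListResolver(List<String> addresses) {
        this.addresses = List.copyOf(addresses); // preserve caller's order
    }

    @Override
    public List<String> getAddresses() {
        return addresses; // primary first, always
    }

    public static void main(String[] args) {
        Resolver r = new FixedListResolver(
                List.of("primary-lb:5672", "secondary-lb:5672"));
        System.out.println(r.maybeShuffle(r.getAddresses()));
    }
}
```

The edge case described above follows from this being stateless: the resolver cannot tell a transient primary hiccup from a real outage, so a client can cascade to the distant cluster too eagerly.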
Additional context
No response
Sounds like a problem for a stateful AddressResolver, plus an AddressResolver extension that would tell it the state/name/tag of the current connection.
This may require introducing a way to tag a Connection, either when a ConnectionFactory creates one or afterwards, to help the AddressResolver above identify the current destination.
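The stateful-resolver-plus-tagging idea could look something like this. All names (ClusterAwareResolver, connectionOpened, the tag values) are invented for illustration; none of this exists in the client.

```java
import java.util.List;

// Notional sketch: a stateful resolver fed by a connection-tagging hook.
public class ClusterAwareResolver {

    private volatile String currentClusterTag = "primary";

    // Hook the ConnectionFactory (or a recovery listener) would call once a
    // connection is established and the server's cluster tag is known.
    void connectionOpened(String clusterTag) {
        this.currentClusterTag = clusterTag;
    }

    // Offer the cluster we are currently attached to first.
    List<String> getAddresses() {
        if ("secondary".equals(currentClusterTag)) {
            return List.of("secondary-lb:5672", "primary-lb:5672");
        }
        return List.of("primary-lb:5672", "secondary-lb:5672");
    }

    public static void main(String[] args) {
        ClusterAwareResolver r = new ClusterAwareResolver();
        System.out.println(r.getAddresses());
        r.connectionOpened("secondary"); // e.g. after failing over
        System.out.println(r.getAddresses());
    }
}
```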
I would not modify the client beyond that for this exotic use case, where clusters disappear and clients should not even try to reconnect to the same cluster.
It could be indeed an extension to AddressResolver (stateful or not, a new method would have to provide the state for the latter), but a new hook is fine as well, especially if what you introduce does not exactly fit the AddressResolver. Whatever we end up with, having a default "passthrough" implementation to avoid any breaking changes is what we prefer.
In your ConnectionRetryListener example, I would use a single Context/State argument (it can be an inner interface in the main interface) that contains the information you need. This makes it easier to add new information in the future, instead of breaking the interface by adding a new parameter to the main method.
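The single-Context-argument shape could be sketched as below. The interface and accessor names are hypothetical; the point is only that new fields land on the inner Context, so implementors of the listener method never break.

```java
// Notional sketch of a listener whose only parameter is a context object.
public interface ConnectionRetryListener {

    // Everything the listener might want to inspect, in one place.
    interface Context {
        int retryCount();
        String failedAddress();
        Throwable failure();
        // Future information can be added here (ideally as default
        // methods) without changing shouldRetry's signature.
    }

    // Decide whether another connection attempt should be made.
    boolean shouldRetry(Context context);
}
```

Because the listener is a single-method interface, it can be supplied as a lambda, e.g. `ctx -> ctx.retryCount() < 3`.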
You can create a PR to start iterating. I'd be curious to learn more about what you need the failed connection for and how you would handle "cluster tags". Maybe you could add a test that simulates a simplified version of your use case, if that is not too much work (it is always good to validate the design).