Raft deadlock during member removal #1379
-
Hey, as long as I am not losing my state, everything is fine and stable. Node restarts, full cluster shutdowns and restarts, all working well. But what I am currently investigating with regard to disaster recovery is the situation where you lose a node for whatever reason, and it cannot come up again and needs to be replaced, e.g. because of a broken volume. Sometimes I just restart the broken node with an empty volume. I have tried different ways of doing this in an automated way for a few days now, but I always end up in an unstable situation where the whole cluster basically locks itself up, and I need to kill it and do a clean restart.
And the issue is, it gets stuck so badly that the whole application stops working. It seems like it's actually blocking the tokio runtime; is that possible? The weirdest part about it is that the other healthy Follower then stops working as well and gets into a locked state, which is the reason why the whole cluster just dies, after it logs:
Now my question: am I doing something obviously wrong, missing something, or is it actually possible for such a situation to come up, and if so, how can I avoid it? Edit: As long as I did not mess up somewhere else, I can confirm that it completely locks the whole tokio runtime on all nodes. I enabled …
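As an aside, one common way to confirm that the tokio runtime itself is blocked is tokio-console. The snippet below is only a minimal, hypothetical setup sketch (it assumes the `console-subscriber` crate and a build with `RUSTFLAGS="--cfg tokio_unstable"`), not the instrumentation actually used in this thread.

```rust
// Minimal, hypothetical tokio-console setup for spotting tasks/threads that
// never yield; not the setup used in this thread.
#[tokio::main]
async fn main() {
    // Starts the instrumentation layer that the `tokio-console` CLI attaches to.
    console_subscriber::init();

    // ... start the node / raft here as usual ...
    tokio::time::sleep(std::time::Duration::from_secs(60)).await;
}
```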
-
I tried digging a bit deeper into the cause, but unfortunately it's not that easy to see what is going on if you're unfamiliar with the internals. From knowing when this happens every time, it must be related to removing the "broken" and offline cluster member. The console in the screenshot is connected to the process of the current cluster Leader. Did I mess up badly, or is this a known issue (and perhaps already fixed)?
Edit: I also noticed that the Leader, after removing the node from the members and removing and re-starting all replication, also creates a new replication client for the just-removed node, which is something I would not expect after a removal. In the example logs below, for instance, I simulated a lost Raft state volume on Node 2, which is first removed from the cluster members, but then a new client stream for that node is opened anyway? The connection of course fails as expected, because the node is offline and should be removed from the cluster.
Edit 2: Just to make sure that my cluster leave logic works in general, I did the cluster leave automatically before shutting down the raft. This works just fine, as expected, and finishes basically immediately. When this node comes up again (with an in-memory logs store to simulate volume loss), it re-joins, syncs Snapshot + Logs, and everything is just fine. However, when I call …
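To make the ordering described in Edit 2 concrete, here is a rough sketch of a "leave, then shut down" flow. This is not hiqlite's actual leave logic; it assumes an openraft 0.9-style `Raft` handle, and the exact `ChangeMembers` variants, `retain` semantics, and error types may differ between versions.

```rust
use std::collections::BTreeSet;

use openraft::{ChangeMembers, Raft, RaftTypeConfig};

// Hypothetical helper: remove this node from the voters first, and only stop
// the local Raft once that membership change has gone through.
async fn leave_then_shutdown<C: RaftTypeConfig>(raft: Raft<C>, my_id: C::NodeId) {
    let mut ids = BTreeSet::new();
    ids.insert(my_id);

    // `retain = false`: do not keep the removed voter around as a learner.
    if let Err(err) = raft
        .change_membership(ChangeMembers::RemoveVoters(ids), false)
        .await
    {
        eprintln!("cluster leave failed: {err}");
        return;
    }

    // Shut the local Raft down only after the leave has been committed.
    let _ = raft.shutdown().await;
}
```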
-
It is the latest 0.9 release, right? The latest is 0.9.18. I'm going to have a look to see if it's possible to have such an issue.
-
First of all, I don't yet have an idea what kind of situation would lead to such an issue. I need your help to find out what went wrong. First, please enable debug-level logging. It will print a lot of the other actions that happened during execution. And please add some logging to your trait implementations, such as outputting a log just after entering them. And let me confirm with you how to reproduce this issue with the problematic node:
And I assume you were running a 3-node cluster with nodes 1, 2, 3, right?
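To illustrate the suggestion above, here is a minimal sketch of debug-level logging plus a log line on method entry. It uses plain `tracing`/`tracing-subscriber` (with the `env-filter` feature); `MyLogStore` and `append` are placeholder names, not the real trait implementations.

```rust
use tracing::debug;
use tracing_subscriber::EnvFilter;

// Placeholder standing in for a real storage/network trait implementation;
// the point is only the `debug!` line at the top of each method.
struct MyLogStore;

impl MyLogStore {
    async fn append(&mut self, entries: Vec<String>) {
        debug!(count = entries.len(), "entering append");
        // ... actual storage logic ...
    }
}

#[tokio::main]
async fn main() {
    // Equivalent to running with RUST_LOG=debug.
    tracing_subscriber::fmt()
        .with_env_filter(EnvFilter::new("debug"))
        .init();

    let mut store = MyLogStore;
    store.append(vec!["entry-1".into()]).await;
}
```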
-
Now I can run the cluster: with all 3 nodes online, I ctrl-c to shut down node-3, then restart it. What should I expect to see? It looks like node-1 and node-2 work fine on my laptop. They keep outputting logs on the terminal. Does that mean no deadlock happened?
-
I used your example to start all 3 nodes, but it still looks like there are two raft instances running inside each process? There are some logs with this text:
node-3 log:
And node-3 cannot connect to the cluster:
I'm going to test it on another computer tomorrow to see if it just does not work on my laptop 🤔
-
Finally, I can reproduce the deadlock issue.
-
I noticed there is a blocking send in the `drop` method: https://github.com/sebadob/hiqlite/blob/60337939de8bde26ab09c06a3f43122b735fdb80/hiqlite/src/network/raft_client.rs#L511-L518

It blocks when node-3 re-joins the cluster. The reason it blocks may be that the receiving end has already been destroyed, or something similar. Changing it to `try_send()`, a non-blocking send, lets node-3 re-join the cluster smoothly:

```rust
impl Drop for NetworkConnectionStreaming {
    fn drop(&mut self) {
        let _ = self.sender.try_send(RaftRequest::Shutdown);
        if let Some(task) = self.task.take() {
            task.abort();
        }
    }
}
```

I tried updating the channel buffer size from 1 to 100, but it does not change anything 🤔

```rust
impl RaftNetworkFactory<TypeConfigSqlite> for NetworkStreaming {
    type Network = NetworkConnectionStreaming;

    async fn new_client(&mut self, _target: NodeId, node: &Node) -> Self::Network {
        let (sender, rx) = flume::bounded(1); // may block if the receiving end does not consume the message
        // ...
    }
}
```
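For completeness, here is a minimal, self-contained sketch of the failure mode described above. It is not hiqlite code; the struct and channel payload are made up, and it only illustrates why a blocking `send` inside `Drop` can park a tokio worker thread while `try_send` cannot.

```rust
use std::time::Duration;

// Stand-in for a connection type whose Drop impl signals a background task.
struct Conn {
    sender: flume::Sender<&'static str>,
}

impl Drop for Conn {
    fn drop(&mut self) {
        // Blocking variant: would park this (tokio worker) thread until the
        // channel has capacity, which never happens in the scenario below.
        // let _ = self.sender.send("shutdown");

        // Non-blocking variant: returns Err(Full/Disconnected) immediately.
        let _ = self.sender.try_send("shutdown");
    }
}

#[tokio::main]
async fn main() {
    let (tx, rx) = flume::bounded(1);
    tx.send("fill the single slot").unwrap(); // channel is now full
    let _rx = rx; // receiver stays alive but never calls recv()

    drop(Conn { sender: tx }); // with try_send() this returns immediately

    tokio::time::sleep(Duration::from_millis(10)).await;
    println!("runtime still responsive");
}
```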