
By @Ayanda-D: new CLI health check that detects QQs without an elected reachable leader #13433 (backport #13487) (backport #13488) #13489

Merged
michaelklishin merged 2 commits into v4.0.x from mergify/bp/v4.0.x/pr-13488 on Mar 12, 2025

@mergify mergify bot commented Mar 12, 2025

This is #13433 by @Ayanda-D with several changes from me:

  1. The exit status on failure is 69 rather than 0, as expected of our health checks and of CLI tools of this kind in general
  2. --quiet and --silent are both handled correctly
  3. Default formatter is used for all output, just like with other
  4. The values passed around by run/2 and output/2 now follow the established pattern
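
To illustrate item 1, a monitoring script can tell a failing check from a passing one by its exit status alone. The following is a minimal sketch, not part of this PR: the helper names `classify_health_check_status` and `run_health_check` and the labels they print are hypothetical, treating every status other than 0 and 69 as a tooling error is an assumption, and the invocation in the trailing comment (the `--across-all-vhosts` flag followed by a queue name pattern) is inferred from the commit messages here rather than from confirmed CLI documentation.

```shell
#!/bin/sh
# Hypothetical helper: map a health check exit status to a label
# that a monitoring script could act on.
classify_health_check_status() {
  case "$1" in
    0)  echo "passed" ;;  # the check succeeded
    69) echo "failed" ;;  # failure status used by this check (item 1 above)
    *)  echo "error" ;;   # assumption: any other status is a usage or tooling problem
  esac
}

# Hypothetical wrapper: run any command with its output discarded,
# then classify its exit status.
run_health_check() {
  "$@" >/dev/null 2>&1
  classify_health_check_status "$?"
}

# Assumed invocation (flag and pattern argument are not confirmed by this page):
#   run_health_check rabbitmq-diagnostics --quiet \
#     check_for_quorum_queues_without_an_elected_leader --across-all-vhosts ".*"
```

Because the wrapper relies only on the exit status, it behaves the same whether or not `--quiet` or `--silent` suppresses the command's output.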

This is an automatic backport of pull request #13487 done by [Mergify](https://mergify.com).
This is an automatic backport of pull request #13488 done by [Mergify](https://mergify.com).

…d reachable leader #13433 (#13487)

* Implement rabbitmq-queues leader_health_check command for quorum queues

(cherry picked from commit c26edbe)

* Tests for rabbitmq-queues leader_health_check command

(cherry picked from commit 6cc03b0)

* Ensure calling ParentPID in leader health check execution and
reuse and extend formatting API, with amqqueue:to_printable/2

(cherry picked from commit 76d66a1)

* Extend core leader health check tests and update badrpc error handling in cli tests

(cherry picked from commit 857e2a7)

* Refactor leader_health_check command validators and ignore vhost arg

(cherry picked from commit 6cf9339)

* Update leader_health_check_command description and banner

(cherry picked from commit 96b8bce)

* Improve output formatting for healthy leaders and support
silent mode in rabbitmq-queues leader_health_check command

(cherry picked from commit 239a69b)

* Support global flag to run leader health check for
all queues in all vhosts on local node

(cherry picked from commit 48ba3e1)

* Return immediately for leader health checks on empty vhosts

(cherry picked from commit 7873737)

* Rename leader health check timeout refs

(cherry picked from commit b7dec89)

* Update banner message for global leader health check

(cherry picked from commit c7da4d5)

* QQ leader-health-check: check_process_limit_safety before spawning leader checks

(cherry picked from commit 1736845)

* Log leader health check result in broker logs (if any leaderless queues)

(cherry picked from commit 1084179)

* Ensure check_passed result for leader health internal calls

(cherry picked from commit 68739a6)

* Extend CLI format output to process check_passed payload

(cherry picked from commit 5f5e992)

* Format leader healthcheck result log and function exports

(cherry picked from commit ebffd7d)

* Change leader_health_check command scope from queues to diagnostics

(cherry picked from commit 663fc98)

* Update (c) line year

(cherry picked from commit df82f12)

* Rename command to check_for_quorum_queues_without_an_elected_leader
and use across_all_vhosts option for global checks

(cherry picked from commit b2acbae)

* Use rabbit_db_queue for qq leader health check lookups
and introduce rabbit_db_queue:get_all_by_type_and_vhost/2.
Update leader health check timeout to 5s and process limit
threshold to 20% of node's process_limit.

(cherry picked from commit 7a8e166)

* Update tests: quorum_queue_SUITE and rabbit_db_queue_SUITE

(cherry picked from commit 9bdb81f)

* Fix typo (cli test module)

(cherry picked from commit 6158568)

* Small refactor - simpler final leader health check result return on function head match

(cherry picked from commit ea07938)

* Clear dialyzer warning & fix type spec

(cherry picked from commit a45aa81)

* Ignore result without strict match to avoid dialyzer warning

(cherry picked from commit bb43c0b)

* 'rabbitmq-diagnostics check_for_quorum_queues_without_an_elected_leader' documentation edits

(cherry picked from commit 845230b)

* 'rabbitmq-diagnostics check_for_quorum_queues_without_an_elected_leader' output copywriting

(cherry picked from commit 235f43b)

* diagnostics check_for_quorum_queues_without_an_elected_leader: behave like a health check w.r.t. error reporting

(cherry picked from commit db73767)

* check_for_quorum_queues_without_an_elected_leader: handle --quiet and --silent

plus simplify function heads.

References #13433.

(cherry picked from commit 7b39231)

---------

Co-authored-by: Ayanda Dube <[email protected]>
(cherry picked from commit 09f1ab4)
(cherry picked from commit e1d7481)

# Conflicts:
#	deps/rabbit/test/quorum_queue_SUITE.erl

mergify bot commented Mar 12, 2025

Cherry-pick of e1d7481 has failed:

On branch mergify/bp/v4.0.x/pr-13488
Your branch is up to date with 'origin/v4.0.x'.

You are currently cherry-picking commit e1d748131.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	modified:   deps/rabbit/src/amqqueue.erl
	modified:   deps/rabbit/src/rabbit_db_queue.erl
	modified:   deps/rabbit/src/rabbit_quorum_queue.erl
	modified:   deps/rabbit/test/rabbit_db_queue_SUITE.erl
	modified:   deps/rabbitmq_cli/lib/rabbitmq/cli/core/output.ex
	new file:   deps/rabbitmq_cli/lib/rabbitmq/cli/diagnostics/commands/check_for_quorum_queues_without_an_elected_leader_command.ex
	new file:   deps/rabbitmq_cli/test/diagnostics/check_for_quorum_queues_without_an_elected_leader_command_test.exs

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   deps/rabbit/test/quorum_queue_SUITE.erl

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

@mergify mergify bot added the conflicts label Mar 12, 2025
@michaelklishin michaelklishin added this to the 4.0.8 milestone Mar 12, 2025
@michaelklishin michaelklishin merged commit 880ca8f into v4.0.x Mar 12, 2025
270 checks passed
@michaelklishin michaelklishin deleted the mergify/bp/v4.0.x/pr-13488 branch March 12, 2025 05:28