Search Replica Allocation and Recovery #17457
❌ Gradle check result for f9c54c5: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
❕ Gradle check result for 2d5b977: UNSTABLE Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #17457      +/-   ##
============================================
- Coverage     72.46%   72.34%    -0.12%
+ Complexity    65757    65717       -40
============================================
  Files          5311     5311
  Lines        305001   305052       +51
  Branches      44230    44237        +7
============================================
- Hits         221022   220696      -326
- Misses        65932    66280      +348
- Partials      18047    18076       +29
❌ Gradle check result for 1ecdbc3: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Signed-off-by: Vinay Krishna Pudyodu <[email protected]>
❌ Gradle check result for d197bd0: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
❕ Gradle check result for d197bd0: UNSTABLE
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.
...n/java/org/opensearch/cluster/routing/allocation/decider/SearchReplicaAllocationDecider.java
...r/src/main/java/org/opensearch/cluster/routing/allocation/allocator/LocalShardsBalancer.java
❕ Gradle check result for 79a8da2: UNSTABLE
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.
* Restrict Search Replicas to Allocate only to Search dedicated node
* fixed the javadoc
* fixed tests
* Treat Regular and Search Replicas Separately to Prevent Allocation Blocking
* Updated tests and some refactor
* Fixed SearchReplica recovery scenario for same node and new node
* Updated the logic for SearchReplica recovery scenario for new node
* Fixed nits after self review
* Modified the search replica allocation based on node attribute
* fixed PR comments
* Revert "Fixed SearchReplica recovery scenario for same node and new node" This reverts commit de1e719.
* Separated the recovery flow method for search replica
* Revert "fixed PR comments" This reverts commit 8fe8dcf.
* Added unit tests in IndexShardTests
* updated method name and minor refactor
* Removed search replica recovery logic from internalRecoverFromStore method
* Added integ test to cover search node restart scenario
* Applied search node role in tests and removed searchonly attribute
* Fixed failing test
* Removed unwanted comment
* Address PR comments
---------
Signed-off-by: Vinay Krishna Pudyodu <[email protected]>
Description
Modified the search replica allocation based on node attribute.
In this PR we restrict search replicas so that they are assigned only to nodes with the search role.
This PR also treats search replicas and regular replicas separately, so that failing to allocate one kind does not block allocation of the other: [RW Separation] Treat Regular and Search Replicas Separately to Prevent Allocation Blocking #17421
I also fixed search replica recovery in the node-left scenario, where the search shard is allocated to a new node after a node drops: [RW Separation] Search replica recovery flow breaks when search shard allocated to new node after node drop #17334
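To make the allocation rules above concrete, here is a minimal, self-contained sketch. It is not the code from this PR and does not use any OpenSearch classes: `ReplicaKind`, `Node`, `canAllocate`, and `allocate` are hypothetical names invented for illustration. It assumes that search replicas go only to nodes with the search role and, as the "Search dedicated node" wording in the commit list suggests, that regular replicas stay off those nodes. It also assumes that "blocking" means giving up on the remaining copies of a shard once one copy fails to allocate, so the sketch tracks failures per replica kind instead of per shard.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SearchReplicaAllocationSketch {

    /** Hypothetical replica kinds; the real routing model is richer than this. */
    enum ReplicaKind { REGULAR, SEARCH }

    /** Hypothetical node model: a name plus a flag for the search role. */
    static final class Node {
        final String name;
        final boolean hasSearchRole;

        Node(String name, boolean hasSearchRole) {
            this.name = name;
            this.hasSearchRole = hasSearchRole;
        }
    }

    /** Assumed rule: search replicas only on search nodes, regular replicas only elsewhere. */
    static boolean canAllocate(ReplicaKind kind, Node node) {
        return (kind == ReplicaKind.SEARCH) == node.hasSearchRole;
    }

    /**
     * Assigns the unassigned copies of ONE shard, at most one copy per node.
     * A failed kind is remembered so later copies of that kind are skipped,
     * but copies of the other kind are still attempted.
     */
    static List<String> allocate(List<ReplicaKind> copies, List<Node> nodes) {
        List<String> assignments = new ArrayList<>();
        Set<String> usedNodes = new HashSet<>();
        Set<ReplicaKind> failedKinds = EnumSet.noneOf(ReplicaKind.class);

        for (ReplicaKind kind : copies) {
            String target = null;
            if (!failedKinds.contains(kind)) {
                for (Node node : nodes) {
                    if (!usedNodes.contains(node.name) && canAllocate(kind, node)) {
                        target = node.name;
                        break;
                    }
                }
            }
            if (target == null) {
                failedKinds.add(kind); // blocks only this kind, not the other
                assignments.add("UNASSIGNED");
            } else {
                usedNodes.add(target);
                assignments.add(target);
            }
        }
        return assignments;
    }

    public static void main(String[] args) {
        List<Node> nodes = Arrays.asList(new Node("data-1", false), new Node("search-1", true));
        List<ReplicaKind> copies =
            Arrays.asList(ReplicaKind.REGULAR, ReplicaKind.REGULAR, ReplicaKind.SEARCH);
        System.out.println(allocate(copies, nodes)); // prints [data-1, UNASSIGNED, search-1]
    }
}
```

In this toy run the second regular replica has no eligible node and stays unassigned, yet the search replica is still placed on `search-1`. With a single per-shard failure flag instead of the per-kind set, the search replica would have been skipped as well.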
Related Issues
Resolves #17422
Resolves #17421
Resolves #17334
Related to #15306
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.