Skip to content

Apply boolean must_not rewrite to numeric match, term, and terms queries #18498

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

peteralfonsi
Copy link
Contributor

Description

Followup to #17655, where we rewrote range queries in boolean must_not clauses to instead be should clauses containing the complement of the original query. This PR extends the rewrite to match, term, and terms queries on numeric fields.

The speedups here seem larger than for range queries, plus I imagine must_not of numeric terms might be more common than on ranges. (Imagine excluding all documents with HTTP status 200 for example). So hopefully this PR will be more useful than the first one.

Some benchmark numbers from http_logs are below. These were run on 1-node clusters using tar installs in c5.2xl ec2 instances. "Originally written as" means whether the query was sent to OpenSearch with a must_not clause, and if so, whether it was a match or term query, or if the query was sent already rewritten with should clauses. Ideally, after the changes are applied, these p50s should be the same, because the must_nots are internally rewritten to be should of ranges. Note 200 is the most common value, and 404 and 500 are rarer.

Excluded status value(s) Originally written as p50 before changes (ms) p50 after changes (ms) Speedup as fraction of original
200 must_not of match 1021 20.72 49x
200 must_not of term 1011 18.83 54x
200 should of ranges 20.71 18.10 -
404 must_not of match 515.6 12.03 43x
404 must_not of term 513.0 12.01 43x
404 should of ranges 13.14 11.74 -
500 must_not of match 481.6 9.49 44x
500 must_not of term 487.3 9.53 51x
500 should of ranges 10.26 9.05 -
200, 500 must_not of terms 958.6 17.76 54x
200, 500 should of ranges 19.87 17.84 -
404, 500 must_not of terms 508.1 11.75 43x
404, 500 should of ranges 12.88 11.28 -

Related Issues

Part of #17586

Check List

  • Functionality includes testing.
  • [N/A] API changes companion pull request created, if applicable.
  • [N/A] Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Peter Alfonsi added 7 commits June 6, 2025 10:46
Signed-off-by: Peter Alfonsi <[email protected]>
Signed-off-by: Peter Alfonsi <[email protected]>
Signed-off-by: Peter Alfonsi <[email protected]>
Signed-off-by: Peter Alfonsi <[email protected]>
Copy link
Contributor

❌ Gradle check result for 9a036fa: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@peteralfonsi peteralfonsi marked this pull request as draft June 11, 2025 20:17
Copy link
Contributor

❌ Gradle check result for 841d01b: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Peter Alfonsi <[email protected]>
Copy link
Contributor

❌ Gradle check result for d1200d6: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Peter Alfonsi <[email protected]>
Copy link
Contributor

❌ Gradle check result for 9712c7b: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@peteralfonsi
Copy link
Contributor Author

Flaky test: #14509

Signed-off-by: Peter Alfonsi <[email protected]>
Copy link
Contributor

❌ Gradle check result for 4d9faab: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@peteralfonsi
Copy link
Contributor Author

Flaky tests: #15840, #18476

Signed-off-by: Peter Alfonsi <[email protected]>
Copy link
Contributor

✅ Gradle check result for 0dc1b5c: SUCCESS

@peteralfonsi peteralfonsi marked this pull request as ready for review June 12, 2025 22:39
@peteralfonsi
Copy link
Contributor Author

@msfroh any chance you could look at this when you have time? It's an extension to #17655 which you'd reviewed

Copy link
Contributor

✅ Gradle check result for 7d1b471: SUCCESS

Copy link

codecov bot commented Jun 19, 2025

Codecov Report

Attention: Patch coverage is 86.66667% with 10 lines in your changes missing coverage. Please review.

Project coverage is 72.81%. Comparing base (7c8f01b) to head (7d1b471).

Files with missing lines Patch % Lines
.../org/opensearch/index/query/RangeQueryBuilder.java 20.00% 4 Missing ⚠️
.../org/opensearch/index/query/TermsQueryBuilder.java 78.57% 0 Missing and 3 partials ⚠️
...a/org/opensearch/index/query/BoolQueryBuilder.java 81.81% 0 Missing and 2 partials ⚠️
.../opensearch/index/query/ComplementHelperUtils.java 97.22% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #18498      +/-   ##
============================================
+ Coverage     72.68%   72.81%   +0.12%     
- Complexity    68130    68198      +68     
============================================
  Files          5540     5541       +1     
  Lines        313379   313440      +61     
  Branches      45472    45484      +12     
============================================
+ Hits         227785   228229     +444     
+ Misses        67070    66647     -423     
- Partials      18524    18564      +40     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant