Skip to content

[CLN] Split get orchestrator #4673

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

Sicheng-Pan
Copy link
Contributor

@Sicheng-Pan Sicheng-Pan commented May 29, 2025

Description of changes

Summarize the changes made by this PR.

  • Improvements & Bug fixes
    • Renamed the KnnFilterOrchestrator to FilterOrchestrator
    • Seperated the KnnError type into error types for each individual orchestrators
    • Simplify GetOrchestrator to reuse the FilterOrchestrator
  • New functionality
    • N/A

Test plan

How are these changes tested?

  • Tests pass locally with pytest for python, yarn test for js, cargo test for rust

Documentation Changes

Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs section?

Copy link

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of a unexpectedly high quality (Readability, Modularity, Intuitiveness)

@Sicheng-Pan Sicheng-Pan mentioned this pull request May 29, 2025
1 task
Copy link
Contributor Author

Sicheng-Pan commented May 29, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.

@Sicheng-Pan Sicheng-Pan marked this pull request as ready for review May 29, 2025 21:10
Copy link
Contributor

propel-code-bot bot commented May 29, 2025

Refactor and Modularize Orchestration Pipeline: Split, Rename, and Decouple Error Types

This PR significantly refactors the Chroma orchestration subsystem to modularize and clarify the orchestration pipeline. The previously monolithic 'KnnFilterOrchestrator' is renamed to 'FilterOrchestrator', and new orchestrators are introduced to clarify separation of concerns between filtering, KNN (both HNSW and SPANN), and 'Get' operations. Shared error types like 'KnnError' are split into focused error enums for each orchestrator, improving error clarity, code modularity, and maintainability. The 'GetOrchestrator' logic is simplified by delegating the filtering stage to 'FilterOrchestrator', enabling both orchestration code and test/bench code to use unified, more composable primitives. This involved significant changes to Rust modules, orchestrator interfaces, test/bench wiring, and RPC endpoints.

Key Changes:
• Renamed 'KnnFilterOrchestrator' to 'FilterOrchestrator'; removed and split the 'KnnError' type by orchestrator (KnnHnswOrchestratorError, KnnSpannOrchestratorError, etc).
• Split orchestration logic for filtering and vector search: introduced 'FilterOrchestrator', 'KnnHnswOrchestrator', and 'KnnSpannOrchestrator' for separation of filtering, HNSW/Spann KNN search, and projection.
• Refactored 'GetOrchestrator' to depend only on the output of 'FilterOrchestrator', greatly simplifying its pipeline and reducing coupling.
• Propagated orchestrator changes to API/server call points, test code, and all benchmarks, updating calls and type dependencies.
• Removed legacy 'knn' and 'knn_filter' modules; added new 'filter', 'knn_hnsw', and 'knn_spann' modules with orchestrator and error type definitions.
• Updated Protobuf conversion to support possible empty projection results ('Default' impl for GetResult and ProjectionOutput).

Affected Areas:
• rust/worker/src/execution/orchestration (filter.rs, get.rs, knn_hnsw.rs, knn_spann.rs, mod.rs)
• rust/worker/src/server.rs
• rust/types/src/execution/operator.rs
• rust/worker/benches/get.rs
• rust/worker/benches/query.rs
• rust/worker/src/execution/orchestration/compact.rs

This summary was automatically generated by @propel-code-bot

@Sicheng-Pan Sicheng-Pan force-pushed the sicheng/05-29-_cln_split_get_orchestrator branch from a3bc9e8 to fd7eb63 Compare May 29, 2025 23:06
@Sicheng-Pan Sicheng-Pan force-pushed the sicheng/05-28-_cln_unify_operator_types branch from 70d88f2 to c7b932b Compare May 30, 2025 18:29
@Sicheng-Pan Sicheng-Pan force-pushed the sicheng/05-29-_cln_split_get_orchestrator branch 2 times, most recently from 061830b to 1ba91d7 Compare June 2, 2025 17:06
@Sicheng-Pan Sicheng-Pan force-pushed the sicheng/05-28-_cln_unify_operator_types branch from c7b932b to ddab3ad Compare June 2, 2025 17:06
@Sicheng-Pan Sicheng-Pan force-pushed the sicheng/05-29-_cln_split_get_orchestrator branch from 1ba91d7 to b1b2820 Compare June 5, 2025 21:52
@Sicheng-Pan Sicheng-Pan force-pushed the sicheng/05-28-_cln_unify_operator_types branch from ddab3ad to 50dd8cb Compare June 5, 2025 21:52
@Sicheng-Pan Sicheng-Pan force-pushed the sicheng/05-29-_cln_split_get_orchestrator branch from b1b2820 to 2115787 Compare June 9, 2025 17:24
@Sicheng-Pan Sicheng-Pan force-pushed the sicheng/05-28-_cln_unify_operator_types branch 2 times, most recently from 6cd365b to c643f08 Compare June 9, 2025 22:52
@Sicheng-Pan Sicheng-Pan force-pushed the sicheng/05-29-_cln_split_get_orchestrator branch 2 times, most recently from 6d93f14 to 64dd8e9 Compare June 9, 2025 23:59
@Sicheng-Pan Sicheng-Pan changed the base branch from sicheng/05-28-_cln_unify_operator_types to graphite-base/4673 June 10, 2025 00:33
@Sicheng-Pan Sicheng-Pan force-pushed the sicheng/05-29-_cln_split_get_orchestrator branch from 64dd8e9 to 1fca7f9 Compare June 10, 2025 00:33
@graphite-app graphite-app bot changed the base branch from graphite-base/4673 to main June 10, 2025 00:33
@Sicheng-Pan Sicheng-Pan force-pushed the sicheng/05-29-_cln_split_get_orchestrator branch from 1fca7f9 to d4baecd Compare June 10, 2025 00:33
@Sicheng-Pan Sicheng-Pan force-pushed the sicheng/05-29-_cln_split_get_orchestrator branch from d4baecd to 9af4da8 Compare June 16, 2025 23:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant