Milvus Vector - Unable to Set nprobe in doSimilaritySearch for Default IVF_FLAT When initializeSchema = true #2294

Open
waileong opened this issue Feb 23, 2025 · 4 comments · May be fixed by #2300

Comments

@waileong
Contributor

Description

We are encountering an issue in MilvusVectorStore where it is not possible to set nprobe when performing similarity searches (doSimilaritySearch) if the index type is IVF_FLAT.

By default, IVF_FLAT is selected when initializeSchema = true, but nprobe is not explicitly set, leading to poor recall or zero results in some cases.

Expected Behavior

  • Users should be able to set nprobe when performing similarity searches with IVF_FLAT.
  • A way to override nprobe in doSimilaritySearch() should be provided via either:
    • A configurable parameter in SearchRequest
    • A fallback to a reasonable default value (e.g., nprobe = 256)
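
The fallback option could look roughly like the sketch below. It is only an illustration: `NprobeResolver` and its methods are hypothetical names that exist in neither Spring AI nor the Milvus SDK, and the default of 256 is simply the example value suggested above, not a tuned recommendation. The JSON output assumes the store ultimately passes search parameters to Milvus as a JSON string.

```java
import java.util.Optional;

// Hypothetical helper for illustration; not part of Spring AI or the Milvus SDK.
final class NprobeResolver {

    // Assumed fallback, taken from the example value suggested in this issue.
    static final int DEFAULT_NPROBE = 256;

    private NprobeResolver() {
    }

    // Returns the caller-supplied nprobe if it is present and positive,
    // otherwise the fallback value.
    static int resolve(Integer requested) {
        return Optional.ofNullable(requested)
                .filter(n -> n > 0)
                .orElse(DEFAULT_NPROBE);
    }

    // Renders the resolved value as a JSON search-parameter string.
    static String toSearchParamsJson(Integer requested) {
        return "{\"nprobe\":" + resolve(requested) + "}";
    }

    public static void main(String[] args) {
        System.out.println(toSearchParamsJson(null)); // no override supplied
        System.out.println(toSearchParamsJson(32));   // explicit override
    }
}
```

With something like this in place, doSimilaritySearch() would always send an explicit nprobe, and a per-request override would only need to supply the `requested` value.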

Current Behavior

  • IVF_FLAT is used as the default index, but nprobe is not explicitly set in doSimilaritySearch().
  • This leads to low recall or 0 search results if nprobe defaults to a very small value (1).
  • There is no exposed method to override nprobe in SearchRequest.

Steps to Reproduce

  1. Initialize a MilvusVectorStore instance with initializeSchema = true.
  2. Add some vector data to the store.
  3. Perform a similarity search using doSimilaritySearch().
  4. Observe that nprobe is not explicitly set, leading to poor recall or zero results.

Code Example:

MilvusVectorStore vectorStore = MilvusVectorStore.builder(milvusClient, embeddingModel)
    .initializeSchema(true)  // Uses default IVF_FLAT
    .build();

SearchRequest searchRequest = SearchRequest.query("example query")
    .withTopK(5)
    .withSimilarityThreshold(0.7);  // No way to set nprobe

List<Document> results = vectorStore.similaritySearch(searchRequest);

// Expecting search results, but sometimes returns 0 records due to missing nprobe

@dev-jonghoonpark
Contributor

Related documentation: https://milvus.io/docs/ivf-flat.md

@dev-jonghoonpark
Contributor

To implement this feature request, we would need to add a field for custom parameters to the SearchRequest class.
However, that change would affect all vector stores.
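
One possible shape for such a field is sketched below. `SketchSearchRequest`, `nativeParam` and `nativeParams` are hypothetical names invented for this illustration; the real SearchRequest has no such field, and the default topK here is arbitrary. The map is empty unless the caller adds entries, so a store that does not understand a key could simply ignore it.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for SearchRequest, for illustration only.
final class SketchSearchRequest {

    private final String query;
    private final int topK;
    private final Map<String, Object> nativeParams;

    private SketchSearchRequest(String query, int topK, Map<String, Object> nativeParams) {
        this.query = query;
        this.topK = topK;
        // Defensive copy so the request stays immutable after build().
        this.nativeParams = Collections.unmodifiableMap(new LinkedHashMap<>(nativeParams));
    }

    static Builder builder(String query) {
        return new Builder(query);
    }

    String query() {
        return query;
    }

    int topK() {
        return topK;
    }

    // Store-specific parameters; empty when the caller sets none.
    Map<String, Object> nativeParams() {
        return nativeParams;
    }

    static final class Builder {
        private final String query;
        private int topK = 4; // arbitrary default for this sketch
        private final Map<String, Object> nativeParams = new LinkedHashMap<>();

        private Builder(String query) {
            this.query = query;
        }

        Builder topK(int topK) {
            this.topK = topK;
            return this;
        }

        // Adds one store-specific parameter, e.g. "nprobe" for Milvus IVF indexes.
        Builder nativeParam(String key, Object value) {
            this.nativeParams.put(key, value);
            return this;
        }

        SketchSearchRequest build() {
            return new SketchSearchRequest(query, topK, nativeParams);
        }
    }

    public static void main(String[] args) {
        SketchSearchRequest request = SketchSearchRequest.builder("example query")
                .topK(5)
                .nativeParam("nprobe", 64)
                .build();
        System.out.println(request.nativeParams());
    }
}
```

Because the field is optional, existing callers and other vector stores would keep their current behavior; only MilvusVectorStore would need to read the map.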

@waileong
Contributor Author

This is not just a feature request; it is necessary functionality, because most Milvus index types rely on index-specific search parameters for good recall and performance.

  • IVF-based indexes (e.g., IVF_FLAT, IVF_PQ) require nprobe to control the number of clusters searched.

    • Without explicitly setting nprobe, the recall is significantly impacted.
    • The default value (often nprobe=1) leads to poor search results, sometimes returning zero results.
  • HNSW indexes require ef for search expansion.

    • Just like nprobe, ef must be configurable for optimal performance.
  • The lack of a way to configure nprobe means users are stuck with suboptimal defaults, which defeats the purpose of using an optimized vector database like Milvus.

  • Since IVF_FLAT is the default index when initializeSchema = true, the inability to set nprobe directly in SearchRequest severely limits its usability.

  • A proper solution should allow users to specify index-specific parameters within SearchRequest, ensuring flexibility and control over Milvus indexing.

For more details on why these parameters are essential, refer to the Milvus documentation:
https://milvus.io/docs/index.md?tab=floating
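
To make the index-specific part concrete, here is a minimal sketch of how a search-parameter string could be chosen per index type. Everything in it is hypothetical: `IndexSearchParams` and its nested `IndexType` enum are illustration-only names (not the Milvus SDK's own types), and it covers only the three index types mentioned above.

```java
// Hypothetical helper for illustration; covers only the index types named in this thread.
final class IndexSearchParams {

    // Local enum for the sketch, not the Milvus SDK enum.
    enum IndexType { IVF_FLAT, IVF_PQ, HNSW }

    private IndexSearchParams() {
    }

    // Maps an index type to the name of its main search-time parameter.
    static String paramName(IndexType type) {
        switch (type) {
            case IVF_FLAT:
            case IVF_PQ:
                return "nprobe"; // number of clusters to search
            case HNSW:
                return "ef";     // search expansion factor
            default:
                throw new IllegalArgumentException("Unsupported index type: " + type);
        }
    }

    // Builds a JSON search-parameter string for the given index type and value.
    static String toJson(IndexType type, int value) {
        if (value <= 0) {
            throw new IllegalArgumentException("Search parameter must be positive: " + value);
        }
        return "{\"" + paramName(type) + "\":" + value + "}";
    }

    public static void main(String[] args) {
        System.out.println(toJson(IndexType.IVF_FLAT, 16));
        System.out.println(toJson(IndexType.HNSW, 64));
    }
}
```

A store implementation could combine a lookup like this with a per-request override, so the same SearchRequest option works whether the collection uses an IVF or an HNSW index.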

@waileong
Contributor Author

waileong commented Feb 27, 2025

@ilayaperumalg, I would be grateful if you could kindly look into this issue. Thank you very much.
