spike(v6): add factory methods to Vectorizer namespace #391

Draft: wants to merge 3 commits into base: v6

Conversation

bevzzz
Collaborator

@bevzzz bevzzz commented Jun 16, 2025

Allows vectorizer-first syntax like:

client.collections.create("VectorizedThings",
  c -> c.vectors(
    Vectorizer.none("supplied"),
    Vectorizer.text2vecWeaviate("custom", t2v -> t2v.dimensions(1))));

This PR is more of a POC I put together quickly: underneath, the library still has the hierarchy VectorIndex --has-> Vectorizer, not the other way around.

I haven't found a good way to add .vectorIndex to all vectorizer builders without rewriting that hierarchy.
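
To make the constraint concrete, here is a minimal, self-contained sketch of how a vectorizer-first factory can sit on top of an index-owns-vectorizer hierarchy. Every type below (`VectorizerConfig`, `VectorIndex`, `Text2VecWeaviateBuilder`) is a hypothetical stand-in, not the library's actual class, and the `"hnsw"` default index is an assumption of the sketch. The point is only that the factory is named after the vectorizer but still returns the index object.

```java
import java.util.function.Consumer;

public class VectorizerFirstSketch {

    // Hypothetical stand-in for a vectorizer configuration.
    static final class VectorizerConfig {
        final String kind;
        final Integer dimensions; // null when not set

        VectorizerConfig(String kind, Integer dimensions) {
            this.kind = kind;
            this.dimensions = dimensions;
        }
    }

    // Hypothetical stand-in for the existing hierarchy: the index owns the vectorizer.
    static final class VectorIndex {
        final String vectorName;
        final String indexType;
        final VectorizerConfig vectorizer;

        VectorIndex(String vectorName, String indexType, VectorizerConfig vectorizer) {
            this.vectorName = vectorName;
            this.indexType = indexType;
            this.vectorizer = vectorizer;
        }
    }

    // Hypothetical builder for one vectorizer kind.
    static final class Text2VecWeaviateBuilder {
        private Integer dimensions;

        Text2VecWeaviateBuilder dimensions(int dimensions) {
            this.dimensions = dimensions;
            return this;
        }

        VectorizerConfig build() {
            return new VectorizerConfig("text2vec-weaviate", dimensions);
        }
    }

    // Vectorizer-first namespace: reads like "build a vectorizer",
    // but each factory wraps the result in a default index (assumed "hnsw").
    static final class Vectorizer {
        private Vectorizer() {}

        static VectorIndex none(String vectorName) {
            return new VectorIndex(vectorName, "hnsw", new VectorizerConfig("none", null));
        }

        static VectorIndex text2vecWeaviate(String vectorName, Consumer<Text2VecWeaviateBuilder> fn) {
            Text2VecWeaviateBuilder builder = new Text2VecWeaviateBuilder();
            fn.accept(builder);
            return new VectorIndex(vectorName, "hnsw", builder.build());
        }
    }

    public static void main(String[] args) {
        VectorIndex custom = Vectorizer.text2vecWeaviate("custom", t2v -> t2v.dimensions(1));
        System.out.println(custom.vectorName + " " + custom.indexType + " "
            + custom.vectorizer.kind + " " + custom.vectorizer.dimensions);
    }
}
```

In this shape there is no natural place to hang a `.vectorIndex(...)` call on the vectorizer builder, because the builder produces only the inner object and the factory decides the outer one.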

@bevzzz bevzzz self-assigned this Jun 16, 2025

@orca-security-eu orca-security-eu bot left a comment

Orca Security Scan Summary

Status: Passed
Check: Secrets
Issues by priority: high 0, medium 0, low 0, info 0

@bevzzz bevzzz changed the title feat: add factory methods to Vectorizer namespace spike(v6): add factory methods to Vectorizer namespace Jun 16, 2025
Allows vectorizer-first syntax like Vectorizer.text2vecWeaviate("custom", t2v -> t2v.dimensions(1)).
@bevzzz bevzzz force-pushed the v6-vectorizer-first branch from 84d9dea to 615145e Compare June 17, 2025 07:52
bevzzz added 2 commits June 17, 2025 12:02
This is getting awfully hacky, because now we're pretending to build a Vectorizer
while actually building a VectorIndex object.
Vectorizer.<Hnsw, Hnsw.Builder>text2vecContextionary(
    Hnsw::of,
    t2v -> t2v.vectorizeCollectionName(false),
    hnsw -> hnsw.cleanupIntervalSeconds(300));
@bevzzz
Collaborator Author

bevzzz commented Jun 17, 2025

Here's another thing I had in mind while working on it:

// Using default vector index
Vectorizer.text2VecContextionary(t2v -> t2v.vectorizeCollectionName(false));

// Adding explicit vector index configuration
Vectorizer.<Hnsw.Builder, Hnsw>text2VecContextionary(
    t2v -> t2v.vectorizeCollectionName(false),
    Hnsw::of, hnsw -> hnsw.cleanupIntervalSeconds(30));

// Adding explicit vector index configuration (current syntax)
Hnsw.of(
    Text2VecContextionaryVectorizer.of(t2v -> t2v.vectorizeCollectionName(false)),
    hnsw -> hnsw.cleanupIntervalSeconds(30));
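
For reference, a rough sketch of how the generic overload above could be typed so that `Hnsw::of` fits as the index factory. All classes here are simplified, hypothetical stand-ins (the real `Hnsw`, its `Builder`, and the vectorizer types have more fields and may have different signatures), and the default of 300 seconds for `cleanupIntervalSeconds` is an assumption of the sketch. The type parameters follow the `<Hnsw.Builder, Hnsw>` order used above.

```java
import java.util.function.BiFunction;
import java.util.function.Consumer;

public class IndexFactorySketch {

    // Hypothetical vectorizer config and builder.
    static final class T2vConfig {
        final boolean vectorizeCollectionName;

        T2vConfig(boolean vectorizeCollectionName) {
            this.vectorizeCollectionName = vectorizeCollectionName;
        }
    }

    static final class T2vBuilder {
        private boolean vectorizeCollectionName = true;

        T2vBuilder vectorizeCollectionName(boolean enabled) {
            this.vectorizeCollectionName = enabled;
            return this;
        }

        T2vConfig build() {
            return new T2vConfig(vectorizeCollectionName);
        }
    }

    // Hypothetical index type mirroring the "current syntax": Hnsw.of(vectorizer, fn).
    static final class Hnsw {
        final T2vConfig vectorizer;
        final int cleanupIntervalSeconds;

        private Hnsw(T2vConfig vectorizer, int cleanupIntervalSeconds) {
            this.vectorizer = vectorizer;
            this.cleanupIntervalSeconds = cleanupIntervalSeconds;
        }

        static Hnsw of(T2vConfig vectorizer, Consumer<Builder> fn) {
            Builder builder = new Builder();
            fn.accept(builder);
            return new Hnsw(vectorizer, builder.cleanupIntervalSeconds);
        }

        static final class Builder {
            private int cleanupIntervalSeconds = 300; // assumed default

            Builder cleanupIntervalSeconds(int seconds) {
                this.cleanupIntervalSeconds = seconds;
                return this;
            }
        }
    }

    static final class Vectorizer {
        private Vectorizer() {}

        // Default vector index: delegates to the generic overload with Hnsw.
        static Hnsw text2VecContextionary(Consumer<T2vBuilder> fn) {
            return Vectorizer.<Hnsw.Builder, Hnsw>text2VecContextionary(fn, Hnsw::of, hnsw -> {});
        }

        // Explicit vector index: B is the index builder type, I the index type.
        static <B, I> I text2VecContextionary(
                Consumer<T2vBuilder> fn,
                BiFunction<T2vConfig, Consumer<B>, I> indexFactory,
                Consumer<B> indexFn) {
            T2vBuilder builder = new T2vBuilder();
            fn.accept(builder);
            return indexFactory.apply(builder.build(), indexFn);
        }
    }

    public static void main(String[] args) {
        Hnsw index = Vectorizer.<Hnsw.Builder, Hnsw>text2VecContextionary(
            t2v -> t2v.vectorizeCollectionName(false),
            Hnsw::of, hnsw -> hnsw.cleanupIntervalSeconds(30));
        System.out.println(index.cleanupIntervalSeconds);
    }
}
```

The explicit type witness is kept at the call site to match the snippets above; whether the compiler could infer it from `Hnsw::of` alone depends on how the real `of` methods are overloaded.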
