Skip to content

.Net: Change Sqlite connector to accept connection string instead of DbConnection #10972

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 14, 2025

Conversation

roji
Copy link
Member

@roji roji commented Mar 14, 2025

This is a resubmit of #10550, which was closed when its target branch (feature-vector-data-preb1) was deleted.

  • I believe that IVectorStore and IVectorStoreRecordCollection should have clear guarantees around thread-safety/concurrency, which should be clearly documented etc. For example, we shouldn't have one implementation/instance of IVectorStoreRecordCollection which is thread-safe, and another which isn't (in the way that SqliteVectorStoreRecordCollection is, by closing over SqliteConnection, which itself isn't thread-safe).
  • If we agree on the above, I believe the MEVD abstractions should be thread-safe; IVectorStore and IVectorStoreRecordCollection conceptually represent/reference database-side entities (the database itself, a collection within the database), and not e.g. a connection to the database (like DbConnection); so an instance doesn't represent something that's client-side stateful (again, like a connection). Also, in most databases, the client is thread-safe, representing simply a way to execute calls (e.g. via gRPC) on the database.
  • The proposed implementation is very similar to the PostgreSQL one - a (pooled) connection is retrieved and opened for each operation, and then closed. Note that this means that Data Source=:memory: cannot be used in most cases, since the databases only lives while the connection is open, and disappears immediately when closed. This isn't perfect, but supporting in-memory would mean no concurrent usage.
  • Unlike the PostgreSQL implementation, there's no SqliteDataSource like there is NpgsqlDataSource (but I hope there will be at some point - see #30991); there's a DbDataSource abstraction in ADO.NET which we could theoretically accept, but that's not accepted usage, and in any case, we require a SqliteConnection (not DbConnection) in order to e.g. call LoadExtension(), and so would need to down-cast DbConnections coming out of DbDataSource in any case. As a result, I'm proposing that for now, the Sqlite connector simply accept a connection string, which is the natural/common way for ADO.NET-related things to work. In the future, when a SqliteDataSource is introduced, we can add support for that like we have for NpgsqlDataSource.
  • I started out trying to preserve (but obsolete) the ability to accept SqliteConnection (to avoid the breaking change), but it turned out quite complicated, requiring lots of refactoring just for that, and a bit of risk of getting it wrong. Since this would be a temporary, one-month obsoletion only, and this is a SQLite-only break, I'm proposing to break this now rather than do all the effort and take on the risk for the two months we have. So I've obsoleted the relevant APIs with [Obsolete(error: true)], to have an informative error message for the user.

Closes #10454

/cc @westey-m @dmytrostruk @adamsitnik

@roji roji requested a review from a team as a code owner March 14, 2025 09:28
@roji roji changed the base branch from main to feature-vector-data-preb2 March 14, 2025 09:28
@markwallace-microsoft markwallace-microsoft added .NET Issue or Pull requests regarding .NET code kernel Issues or pull requests impacting the core kernel memory labels Mar 14, 2025
@github-actions github-actions bot changed the title Change Sqlite connector to accept connection string instead of DbConnection .Net: Change Sqlite connector to accept connection string instead of DbConnection Mar 14, 2025
Copy link
Member

@adamsitnik adamsitnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you @roji !

@roji roji merged commit ce572c8 into microsoft:feature-vector-data-preb2 Mar 14, 2025
10 of 11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kernel Issues or pull requests impacting the core kernel memory .NET Issue or Pull requests regarding .NET code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

.Net MEVD: Clarify thread-safety of the abstraction
4 participants