.Net: [MEVD] Sqlite filtering behavior is problematic/incorrect #11655
Labels
Build
Features planned for next Build conference
msft.ext.vectordata
Related to Microsoft.Extensions.VectorData
.NET
Issue or Pull requests regarding .NET code
In all of our providers, the (LINQ) filter is a pre-filter, taking effect before the vector similarity search. However, in SQLite it seems to be the opposite: if you do a filtered similarity search with top=1, it looks like this first gets the most similar record, and only then applies the filter (returning nothing if that single record doesn't match the filter).
I'm not sure if this is intentional/documented, but it certainly doesn't seem very useful - the point of filtering is usually to restrict which records get considered for similarity search (e.g. within a given category, tenant...). We should check if this behavior is because of our own connector or just the way sqlite-vec works, and consider what to do based on that.
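To make the difference concrete, here is a minimal sketch in plain Python. It is not the connector's code and does not use sqlite-vec; the records, the `pre_filter_search` and `post_filter_search` helpers, and the Euclidean distance are all made up for illustration. It only shows why top=1 can come back empty when the filter is applied after the similarity search.

```python
import math

# Hypothetical toy data: (key, category, embedding).
RECORDS = [
    ("a", "fruit", [1.0, 0.0]),
    ("b", "vehicle", [0.9, 0.1]),
    ("c", "vehicle", [0.0, 1.0]),
]


def pre_filter_search(query, predicate, top):
    """Filter first, then rank only the matching records by distance."""
    candidates = [r for r in RECORDS if predicate(r)]
    candidates.sort(key=lambda r: math.dist(r[2], query))
    return candidates[:top]


def post_filter_search(query, predicate, top):
    """Take the top nearest records first, then drop those failing the filter."""
    nearest = sorted(RECORDS, key=lambda r: math.dist(r[2], query))[:top]
    return [r for r in nearest if predicate(r)]


def is_vehicle(record):
    return record[1] == "vehicle"


query = [1.0, 0.0]

# Pre-filter: the closest record among vehicles is "b".
pre = pre_filter_search(query, is_vehicle, top=1)

# Post-filter: the single nearest record overall is "a" (a fruit),
# so the filter removes it and nothing is returned.
post = post_filter_search(query, is_vehicle, top=1)

print([r[0] for r in pre])
print([r[0] for r in post])
```

With pre-filtering the caller always gets up to `top` matching records; with post-filtering the result can shrink to zero even though matching records exist, which is the behavior described above.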
@dmytrostruk assigning to you since I think you wrote the SQLite connector and are the most familiar with it. This technically isn't blocking for Build, but we shouldn't GA the connector before we understand what's going on (and we have plans to use SQLite as our demo/getting started connector...).
Full repro