Commit 3881a31

alexbarev, awharrison-28, and dluc authored
Python: Adding USearch memory connector (#2358)
### Motivation and Context

This change integrates [USearch](https://github.com/unum-cloud/usearch) as a memory connector for Semantic Kernel (SK).

### Description

The USearch `Index` cannot natively store separate collections, and it stores only embeddings, not the other attributes of a `MemoryRecord`. The `USearchMemoryStore` class adds both capabilities. It uses the USearch `Index` to store a collection of embeddings under unique IDs, with original collection names mapped to those IDs. The other `MemoryRecord` attributes are stored in a `pyarrow.Table`, one table per collection.

Note the current behavior when a user removes a record or upserts a new one with an existing ID: the old row is not removed from the `pyarrow.Table`. This is done for performance reasons, but it means the table can keep growing in size.

By default, `USearchMemoryStore` operates as an in-memory store. To enable persistence, pass `__init__` the path of a directory for the persisted files. For each collection, two files are created: `{collection_name}.usearch` and `{collection_name}.parquet`.

Changes are written to disk only when `close_async` is called. Because of the interface provided by the base class `MemoryStoreBase`, this happens implicitly when the store is used as a context manager; it can also be called explicitly.

Since collection names are used as file names on disk, all names are converted to lowercase.

To ensure efficient use of memory, you should call `close_async`.

---------

Co-authored-by: Abby Harrison <[email protected]>
Co-authored-by: Abby Harrison <[email protected]>
Co-authored-by: Devis Lucato <[email protected]>
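The behavior described above (lowercased collection names, rows that are appended rather than replaced, and disk writes deferred until `close_async`) can be sketched with the standard library alone. This is a toy illustration under stated assumptions, not the connector's implementation: `ToyPersistentStore`, its method names, the `demo` helper, and the single `.json` file per collection are all hypothetical stand-ins for the real `USearchMemoryStore`, which uses `.usearch` and `.parquet` files.

```python
import asyncio
import json
import tempfile
from pathlib import Path
from typing import Optional


class ToyPersistentStore:
    """Hypothetical stand-in for the described behavior; not the real class."""

    def __init__(self, persist_directory: Optional[str] = None) -> None:
        # No directory means purely in-memory operation.
        self._dir = Path(persist_directory) if persist_directory else None
        self._rows = {}    # collection name -> append-only list of rows
        self._latest = {}  # collection name -> {record id: index of live row}

    async def __aenter__(self) -> "ToyPersistentStore":
        return self

    async def __aexit__(self, *exc_info) -> None:
        # Leaving the context manager closes the store implicitly.
        await self.close_async()

    def create_collection(self, name: str) -> None:
        name = name.lower()  # names become file names, so they are normalized
        self._rows.setdefault(name, [])
        self._latest.setdefault(name, {})

    def upsert(self, name: str, record_id: str, text: str) -> None:
        name = name.lower()
        # The superseded row stays in place; only the id -> row mapping moves.
        self._rows[name].append({"id": record_id, "text": text})
        self._latest[name][record_id] = len(self._rows[name]) - 1

    def remove(self, name: str, record_id: str) -> None:
        # Only the mapping entry is dropped; the row itself is kept.
        self._latest[name.lower()].pop(record_id, None)

    def get(self, name: str, record_id: str) -> Optional[str]:
        name = name.lower()
        idx = self._latest[name].get(record_id)
        return None if idx is None else self._rows[name][idx]["text"]

    def table_size(self, name: str) -> int:
        return len(self._rows[name.lower()])

    async def close_async(self) -> None:
        # Nothing is written to disk before this point.
        if self._dir is None:
            return
        self._dir.mkdir(parents=True, exist_ok=True)
        for name, rows in self._rows.items():
            payload = {"rows": rows, "latest": self._latest[name]}
            (self._dir / f"{name}.json").write_text(json.dumps(payload))


async def demo(directory: str) -> dict:
    """Exercise the toy store and report what was observed."""
    async with ToyPersistentStore(directory) as store:
        store.create_collection("Notes")
        store.upsert("Notes", "a", "first")
        store.upsert("notes", "a", "second")  # same id: a new row is appended
        result = {
            "text": store.get("NOTES", "a"),
            "rows": store.table_size("notes"),
            "files_before_close": sorted(p.name for p in Path(directory).iterdir()),
        }
    result["files_after_close"] = sorted(p.name for p in Path(directory).iterdir())
    return result


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(asyncio.run(demo(tmp)))
```

In this sketch the second upsert leaves two rows in the table while lookups return only the newer one, which is the growth trade-off the description mentions. The directory stays empty until the `async with` block exits.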
1 parent 4bc5ff7 commit 3881a31

5 files changed: +1153 -1 lines changed

Diff for: python/poetry.lock

+115 -1
Some generated files are not rendered by default.

Diff for: python/pyproject.toml

+4
@@ -61,6 +61,10 @@ azure-search-documents = {version = "11.4.0b8", allow-prereleases = true}
 azure-core = "^1.28.0"
 azure-identity = "^1.13.0"
 
+[tool.poetry.group.usearch.dependencies]
+usearch = "^1.1.1"
+pyarrow = "^12.0.1"
+
 [tool.isort]
 profile = "black"
 
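The `usearch` and `pyarrow` entries added here sit in their own Poetry dependency group, so an environment installed without that group will not have them. A minimal sketch of a guard that checks whether the modules can be imported before the connector is loaded; `missing_modules` is a hypothetical helper, not part of Semantic Kernel:

```python
from importlib.util import find_spec
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Module names taken from the dependency group added in this diff.
    missing = missing_modules(["usearch", "pyarrow"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("usearch and pyarrow are both importable")
```

The check uses `find_spec` rather than a bare `import`, so it does not load either module just to test for its presence.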

@@ -0,0 +1,7 @@
+# Copyright (c) Microsoft. All rights reserved.
+
+from semantic_kernel.connectors.memory.usearch.usearch_memory_store import (
+    USearchMemoryStore,
+)
+
+__all__ = ["USearchMemoryStore"]
