Commit d22ee5a

TaoChenOSU and moonbox3 authored
Python: Azure Model-as-a-Service Python connector (#6742)
Python: Azure Model-as-a-Service Python connector (#6742)

### Motivation and Context

Related to: #6693

Azure Model-as-a-Service allows users to deploy certain models from the Azure AI Studio model catalog as an API. This option also provides pay-as-you-go access to the hosted models. Below are some of the models that are supported:

- Microsoft Phi-3 family
- Meta Llama family (Llama 2 chat & Llama 3 instruct)
- Mistral-Small & Mistral-Large
- and more

We'd like to provide an AI connector for users of SK to use Azure Model-as-a-Service.

### Description

A new AI connector named `azure_ai_inference` is added to support Azure Model-as-a-Service. This connector takes a new dependency on the Python `azure.ai.inference` SDK.

### Contribution Checklist

- [X] The code builds clean without any errors or warnings
- [X] The PR follows the [SK Contribution Guidelines](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md) and the [pre-submission formatting script](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#development-scripts) raises no violations
- [X] All unit tests pass, and I have added new tests where possible
- [X] I didn't break anyone 😄

Co-authored-by: Evan Mattson <[email protected]>
1 parent ad8c819 commit d22ee5a

17 files changed: +1024 −30 lines

python/poetry.lock (+20 −4)

(Generated file; diff not rendered.)

python/pyproject.toml (+5 −2)

```diff
@@ -48,6 +48,7 @@ weaviate-client = { version = ">=3.18,<5.0", optional = true}
 pinecone-client = { version = ">=3.0.0", optional = true}
 psycopg = { version="^3.1.9", extras=["binary","pool"], optional = true}
 redis = { version = "^4.6.0", optional = true}
+azure-ai-inference = {version = "^1.0.0b1", allow-prereleases = true, optional = true}
 azure-search-documents = {version = "11.6.0b4", allow-prereleases = true, optional = true}
 azure-core = { version = "^1.28.0", optional = true}
 azure-identity = { version = "^1.13.0", optional = true}
@@ -73,6 +74,7 @@ optional = true

 [tool.poetry.group.unit-tests.dependencies]
 google-generativeai = { version = ">=0.1,<0.4" }
+azure-ai-inference = {version = "^1.0.0b1", allow-prereleases = true}
 azure-search-documents = {version = "11.6.0b4", allow-prereleases = true}
 azure-core = "^1.28.0"
 azure-cosmos = "^4.7.0"
@@ -97,6 +99,7 @@ weaviate-client = ">=3.18,<5.0"
 pinecone-client = ">=3.0.0"
 psycopg = { version="^3.1.9", extras=["binary","pool"]}
 redis = "^4.6.0"
+azure-ai-inference = {version = "^1.0.0b1", allow-prereleases = true}
 azure-search-documents = {version = "11.6.0b4", allow-prereleases = true}
 azure-core = "^1.28.0"
 azure-identity = "^1.13.0"
@@ -116,10 +119,10 @@ weaviate = ["weaviate-client"]
 pinecone = ["pinecone-client"]
 postgres = ["psycopg"]
 redis = ["redis"]
-azure = ["azure-search-documents", "azure-core", "azure-identity", "azure-cosmos", "msgraph-sdk"]
+azure = ["azure-ai-inference", "azure-search-documents", "azure-core", "azure-identity", "azure-cosmos", "msgraph-sdk"]
 usearch = ["usearch", "pyarrow"]
 notebooks = ["ipykernel"]
-all = ["google-generativeai", "grpcio-status", "transformers", "sentence-transformers", "torch", "qdrant-client", "chromadb", "pymilvus", "milvus", "weaviate-client", "pinecone-client", "psycopg", "redis", "azure-search-documents", "azure-core", "azure-identity", "azure-cosmos", "usearch", "pyarrow", "ipykernel"]
+all = ["google-generativeai", "grpcio-status", "transformers", "sentence-transformers", "torch", "qdrant-client", "chromadb", "pymilvus", "milvus", "weaviate-client", "pinecone-client", "psycopg", "redis", "azure-ai-inference", "azure-search-documents", "azure-core", "azure-identity", "azure-cosmos", "usearch", "pyarrow", "ipykernel"]

 [tool.ruff]
 line-length = 120
```
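Because `azure-ai-inference` ships in the optional `azure` extra rather than as a core dependency, downstream code generally probes for it before importing the connector. A minimal, hypothetical availability check (not part of this PR; the helper name is invented):

```python
import importlib.util


def azure_ai_inference_available() -> bool:
    """Return True if the optional azure-ai-inference SDK is installed."""
    try:
        # find_spec probes the import machinery without fully importing the
        # package; it raises ModuleNotFoundError if a parent package is absent.
        return importlib.util.find_spec("azure.ai.inference") is not None
    except ModuleNotFoundError:
        return False


if not azure_ai_inference_available():
    print("Install the optional extra, e.g. pip install semantic-kernel[azure]")
```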
New file (+21 lines):

```python
# Copyright (c) Microsoft. All rights reserved.

from semantic_kernel.connectors.ai.azure_ai_inference.azure_ai_inference_prompt_execution_settings import (
    AzureAIInferenceChatPromptExecutionSettings,
    AzureAIInferenceEmbeddingPromptExecutionSettings,
)
from semantic_kernel.connectors.ai.azure_ai_inference.azure_ai_inference_settings import AzureAIInferenceSettings
from semantic_kernel.connectors.ai.azure_ai_inference.services.azure_ai_inference_chat_completion import (
    AzureAIInferenceChatCompletion,
)
from semantic_kernel.connectors.ai.azure_ai_inference.services.azure_ai_inference_text_embedding import (
    AzureAIInferenceTextEmbedding,
)

__all__ = [
    "AzureAIInferenceChatCompletion",
    "AzureAIInferenceChatPromptExecutionSettings",
    "AzureAIInferenceEmbeddingPromptExecutionSettings",
    "AzureAIInferenceSettings",
    "AzureAIInferenceTextEmbedding",
]
```
New file (+45 lines):

```python
# Copyright (c) Microsoft. All rights reserved.

from typing import Literal

from pydantic import Field

from semantic_kernel.connectors.ai.prompt_execution_settings import PromptExecutionSettings
from semantic_kernel.utils.experimental_decorator import experimental_class


@experimental_class
class AzureAIInferencePromptExecutionSettings(PromptExecutionSettings):
    """Azure AI Inference Prompt Execution Settings.

    Note:
        `extra_parameters` is a dictionary to pass additional model-specific parameters to the model.
    """

    frequency_penalty: float | None = Field(None, ge=-2, le=2)
    max_tokens: int | None = Field(None, gt=0)
    presence_penalty: float | None = Field(None, ge=-2, le=2)
    seed: int | None = None
    stop: str | None = None
    temperature: float | None = Field(None, ge=0.0, le=1.0)
    top_p: float | None = Field(None, ge=0.0, le=1.0)
    extra_parameters: dict[str, str] | None = None


@experimental_class
class AzureAIInferenceChatPromptExecutionSettings(AzureAIInferencePromptExecutionSettings):
    """Azure AI Inference Chat Prompt Execution Settings."""


@experimental_class
class AzureAIInferenceEmbeddingPromptExecutionSettings(PromptExecutionSettings):
    """Azure AI Inference Embedding Prompt Execution Settings.

    Note:
        `extra_parameters` is a dictionary to pass additional model-specific parameters to the model.
    """

    dimensions: int | None = Field(None, gt=0)
    encoding_format: Literal["base64", "binary", "float", "int8", "ubinary", "uint8"] | None = None
    input_type: Literal["text", "query", "document"] | None = None
    extra_parameters: dict[str, str] | None = None
```
New file (+37 lines):

```python
# Copyright (c) Microsoft. All rights reserved.

from typing import ClassVar

from pydantic import SecretStr

from semantic_kernel.kernel_pydantic import HttpsUrl, KernelBaseSettings
from semantic_kernel.utils.experimental_decorator import experimental_class


@experimental_class
class AzureAIInferenceSettings(KernelBaseSettings):
    """Azure AI Inference settings.

    The settings are first loaded from environment variables with
    the prefix 'AZURE_AI_INFERENCE_'.
    If the environment variables are not found, the settings can
    be loaded from a .env file with the encoding 'utf-8'.
    If the settings are not found in the .env file, the settings
    are ignored; however, validation will fail alerting that the
    settings are missing.

    Required settings for prefix 'AZURE_AI_INFERENCE_' are:
    - endpoint: HttpsUrl - The endpoint of the Azure AI Inference service deployment.
        This value can be found in the Keys & Endpoint section when examining
        your resource from the Azure portal.
        (Env var AZURE_AI_INFERENCE_ENDPOINT)
    - api_key: SecretStr - The API key for the Azure AI Inference service deployment.
        This value can be found in the Keys & Endpoint section when examining
        your resource from the Azure portal. You can use either KEY1 or KEY2.
        (Env var AZURE_AI_INFERENCE_API_KEY)
    """

    env_prefix: ClassVar[str] = "AZURE_AI_INFERENCE_"

    endpoint: HttpsUrl
    api_key: SecretStr
```
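The `AZURE_AI_INFERENCE_` prefix means the two required values can be supplied entirely through the environment. A stdlib-only mimic of the prefix-based loading that `KernelBaseSettings` performs (the real class also reads `.env` files and wraps the key in `SecretStr`; this sketch is a deliberate simplification with an invented helper name):

```python
import os


def load_prefixed_settings(prefix: str, required: list[str]) -> dict[str, str]:
    """Collect required settings from environment variables sharing a prefix."""
    settings: dict[str, str] = {}
    missing: list[str] = []
    for name in required:
        value = os.environ.get(prefix + name.upper())
        if value is None:
            missing.append(prefix + name.upper())
        else:
            settings[name] = value
    if missing:
        # KernelBaseSettings raises a pydantic validation error in this case.
        raise ValueError(f"Missing required settings: {', '.join(missing)}")
    return settings


os.environ["AZURE_AI_INFERENCE_ENDPOINT"] = "https://example.inference.azure.com"
os.environ["AZURE_AI_INFERENCE_API_KEY"] = "fake-key"
print(load_prefixed_settings("AZURE_AI_INFERENCE_", ["endpoint", "api_key"]))
# {'endpoint': 'https://example.inference.azure.com', 'api_key': 'fake-key'}
```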
New file (+1 line):

```python
# Copyright (c) Microsoft. All rights reserved.
```
New file (+22 lines):

```python
# Copyright (c) Microsoft. All rights reserved.

import asyncio
import contextlib
from abc import ABC

from azure.ai.inference.aio import ChatCompletionsClient, EmbeddingsClient

from semantic_kernel.kernel_pydantic import KernelBaseModel
from semantic_kernel.utils.experimental_decorator import experimental_class


@experimental_class
class AzureAIInferenceBase(KernelBaseModel, ABC):
    """Azure AI Inference Chat Completion Service."""

    client: ChatCompletionsClient | EmbeddingsClient

    def __del__(self) -> None:
        """Close the client when the object is deleted."""
        with contextlib.suppress(Exception):
            asyncio.get_running_loop().create_task(self.client.close())
```
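The `__del__` above is a best-effort cleanup: if an event loop is still running when the object is garbage-collected, the async `close()` is scheduled on it; if not, `asyncio.get_running_loop()` raises and `contextlib.suppress(Exception)` swallows the error. A self-contained sketch of the same pattern (with a fake client standing in for the azure.ai.inference clients):

```python
import asyncio
import contextlib


class FakeClient:
    """Stand-in for the async azure.ai.inference client."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


class Service:
    def __init__(self) -> None:
        self.client = FakeClient()

    def __del__(self) -> None:
        # Schedule the async close if a loop is still running; otherwise
        # get_running_loop() raises and suppress() ignores the failure.
        with contextlib.suppress(Exception):
            asyncio.get_running_loop().create_task(self.client.close())


async def main() -> FakeClient:
    service = Service()
    client = service.client
    del service             # __del__ schedules client.close() on the running loop
    await asyncio.sleep(0)  # yield so the scheduled close task can run
    return client


client = asyncio.run(main())
print(client.closed)  # True
```

Outside a running loop the same `del` is simply a no-op, which is exactly the trade-off the suppress makes: no noisy destructor errors at interpreter shutdown, at the cost of possibly skipping the close.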
