Commit 9f7ec07

Clarify OpenAI endpoint prefixing
Signed-off-by: Sivanantham Chinnaiyan <[email protected]>
1 parent f1c8d2c commit 9f7ec07

1 file changed: +4 -1 lines changed
Diff for: docs/modelserving/v1beta1/llm/huggingface/README.md

@@ -20,10 +20,13 @@ For information on the models supported by the vLLM backend, please visit [vLLM'
 
 
 ## API Endpoints
-Both of the backends support serving generative models (text generation and text2text generation) using [OpenAI's Completion](https://platform.openai.com/docs/api-reference/completions) and [Chat Completion](https://platform.openai.com/docs/api-reference/chat) API.
+Both of the backends support serving generative models (text generation and text2text generation) using [OpenAI's Completion](https://platform.openai.com/docs/api-reference/completions), [Chat Completion](https://platform.openai.com/docs/api-reference/chat) and [Embeddings](https://platform.openai.com/docs/api-reference/embeddings) API.
 
 The other types of tasks like token classification, sequence classification, and fill mask are served using KServe's [Open Inference Protocol](../../../data_plane/v2_protocol.md) or [V1 API](../../../data_plane/v1_protocol.md).
 
+!!! Tip
+KServe prefixes OpenAI API endpoints with 'openai' to prevent confusion with the [V1 API](../../../data_plane/v1_protocol.md). For instance, the endpoint for text generation via OpenAI's Completion API becomes `openai/v1/completions`, while the Chat Completion API is accessed at `openai/v1/chat/completions`. To remove this prefix, set the `KSERVE_OPENAI_ROUTE_PREFIX` environment variable to an empty string ("").
+
 ## Examples
 The following examples demonstrate how to deploy and perform inference using the Hugging Face runtime with different ML tasks:
 
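
The prefixing rule described in the added tip can be illustrated from the client side. The following is a minimal sketch, not part of the commit: the helper name `openai_endpoint` and the base URL `http://localhost:8080` are hypothetical, and the `prefix` argument is assumed to mirror whatever value `KSERVE_OPENAI_ROUTE_PREFIX` has on the server (`openai` by default according to the tip, or an empty string when the prefix is removed).

```python
def openai_endpoint(base_url: str, path: str, prefix: str = "openai") -> str:
    """Build a request URL for an OpenAI-style route.

    `prefix` is assumed to match the server's KSERVE_OPENAI_ROUTE_PREFIX;
    pass "" when the server has the prefix disabled.
    """
    parts = [base_url.rstrip("/")]
    if prefix:
        # Only add the prefix segment when it is non-empty.
        parts.append(prefix.strip("/"))
    parts.append(path.lstrip("/"))
    return "/".join(parts)


if __name__ == "__main__":
    base = "http://localhost:8080"  # hypothetical address, for illustration only
    # Default prefix, as described in the tip.
    print(openai_endpoint(base, "v1/completions"))
    print(openai_endpoint(base, "v1/chat/completions"))
    # Server started with KSERVE_OPENAI_ROUTE_PREFIX="".
    print(openai_endpoint(base, "v1/chat/completions", prefix=""))
```

With the default prefix the first two calls yield paths ending in `openai/v1/completions` and `openai/v1/chat/completions`, matching the tip; with an empty prefix the last call yields a path ending in `v1/chat/completions`.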
