Research Integrations #16
If we can push the data into Prometheus and visualise it in Grafana, that may well simplify matters when it comes to ops after GA. OpenTelemetry is designed to do just that; raw data processing and custom callbacks would also allow it, but would require the instrumentation to be built up.
OpenTelemetry (OTEL) looks like it fits the use case pretty nicely: the collector can be added to the docker-compose easily, and it has pre-existing integration with Prometheus for metrics export. Traces are not what we are looking for at this point, but they may well prove useful in future; if so, it is worth looking at a tool like Jaeger (CNCF, open source) for visualisation. Some sharp edges: I'm still experimenting with adding the instrumentation to LiteLLM without having to go too deep into the config, but it may require a step in the setup of each model to add the callbacks, rather than applying globally to all LiteLLM models.
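A minimal sketch of that wiring, assuming an OTLP gRPC receiver and the collector's Prometheus exporter (ports and file names here are illustrative choices, not our actual setup):

```yaml
# otel-collector-config.yaml - receive OTLP, expose a Prometheus scrape endpoint
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
exporters:
  prometheus:
    endpoint: 0.0.0.0:8889   # Prometheus scrapes this
service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
```

The collector itself would then be one more service in the docker-compose (e.g. the `otel/opentelemetry-collector-contrib` image), with 4317 reachable from the proxy and 8889 added as a scrape target in Prometheus.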
For security/compliance purposes we may want to ensure we integrate with the PII masking hooks in the LiteLLM proxy.
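I haven't tested this yet, but from my reading of the LiteLLM docs the Presidio-based masking is wired up through the proxy's guardrails config, something like the following (shape assumed from the docs, not verified here):

```yaml
# proxy config.yaml fragment - guardrail shape assumed from LiteLLM docs, unverified
guardrails:
  - guardrail_name: "presidio-pii-masking"
    litellm_params:
      guardrail: presidio
      mode: "pre_call"   # mask PII before the request reaches the model
```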
Sometimes you look around a corner and realise that almost exactly what you want is behind a paywall: LiteLLM Enterprise includes built-in Prometheus metrics. So now I need to see what I can do about these callbacks - I may need to build something custom which then uses either OTEL or Prometheus scraping to turn the data into metrics.
I've gone a little way down the rabbit hole of setting up Jaeger, which is really for distributed tracing, so not quite solving the use case here. The purpose was mostly to confirm that data collection happens the way I believe it does, and that with the right combination of OTLP (OpenTelemetry Protocol) exporters and receivers the data can be extracted from the LiteLLM proxy.
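For reference, the verification setup was roughly this shape. The service names are mine, and the LiteLLM env var names are from my reading of their OTEL docs, so worth double-checking:

```yaml
# docker-compose fragment - Jaeger all-in-one receiving OTLP from the LiteLLM proxy
services:
  jaeger:
    image: jaegertracing/all-in-one:latest
    environment:
      COLLECTOR_OTLP_ENABLED: "true"
    ports:
      - "16686:16686"  # Jaeger UI
      - "4317:4317"    # OTLP gRPC ingest
  litellm:
    image: ghcr.io/berriai/litellm:main-latest
    environment:
      OTEL_EXPORTER: "otlp_grpc"          # names per LiteLLM's OTEL callback docs
      OTEL_ENDPOINT: "http://jaeger:4317"
```

This also needs `callbacks: ["otel"]` under `litellm_settings` in the proxy config to switch the instrumentation on.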
The more I look, the more I see the need to write a custom callback to hook into our LiteLLM proxy. The purpose of the callback would be to structure the messages in such a way that they can be treated as metrics and not just traces (a sketch follows the field list below).
Useful fields available from the logging that we may want to filter/sort on (all examples very arbitrary, running in a local sandbox):

Arbitrary injectable metadata:

```json
{
  "key": "metadata.requester_metadata",
  "type": "string",
  "value": "{'ID': '1234'}"
}
```

Key alias:

```json
{
  "key": "metadata.user_api_key_alias",
  "type": "string",
  "value": "[email protected] - PippaTest"
}
```

Gen AI fields:

```json
{
  "key": "gen_ai.request.model",
  "type": "string",
  "value": "llama3.2"
}
{
  "key": "gen_ai.response.model",
  "type": "string",
  "value": "ollama/llama3.2"
}
{
  "key": "gen_ai.system",
  "type": "string",
  "value": "ollama"
}
{
  "key": "gen_ai.usage.completion_tokens",
  "type": "int64",
  "value": 282
}
{
  "key": "gen_ai.usage.prompt_tokens",
  "type": "int64",
  "value": 33
}
```
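To make that concrete, here is a minimal sketch of the kind of callback I have in mind, turning the fields above into Prometheus counters via `prometheus_client`. The class shape follows LiteLLM's `CustomLogger` interface; the metric names, labels, and port are placeholder choices, nothing settled:

```python
# custom_callbacks.py - sketch only; metric names, labels, and port 9100 are placeholders
from litellm.integrations.custom_logger import CustomLogger
from prometheus_client import Counter, start_http_server

# Counters labelled with the gen_ai.* / metadata fields seen in the trace logs above
PROMPT_TOKENS = Counter(
    "llm_prompt_tokens_total", "Prompt tokens consumed", ["model", "key_alias"]
)
COMPLETION_TOKENS = Counter(
    "llm_completion_tokens_total", "Completion tokens produced", ["model", "key_alias"]
)


class PrometheusMetricsCallback(CustomLogger):
    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        # The same data that surfaces as gen_ai.usage.* span attributes
        usage = getattr(response_obj, "usage", None)
        model = kwargs.get("model", "unknown")
        # user_api_key_alias lives in the proxy-injected metadata
        metadata = (kwargs.get("litellm_params") or {}).get("metadata") or {}
        alias = metadata.get("user_api_key_alias") or "unknown"
        if usage is not None:
            PROMPT_TOKENS.labels(model, alias).inc(usage.prompt_tokens)
            COMPLETION_TOKENS.labels(model, alias).inc(usage.completion_tokens)


# Expose /metrics for Prometheus to scrape
start_http_server(9100)
proxy_handler_instance = PrometheusMetricsCallback()
```

Wiring it in would then be `litellm_settings: callbacks: custom_callbacks.proxy_handler_instance` in the proxy config, per the LiteLLM custom callback docs, though I haven't confirmed how that interacts with the enterprise-gated Prometheus integration.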
The trouble with many of the integrated services is that they rely on default cloud-based endpoints rather than being self-hosted (e.g. Helicone/Lago/OpenMeter). To get Prometheus metrics or remote custom callbacks we need an enterprise key. We have all the data we want in the trace logs, but those are not well formatted for the kind of filtering and searching needed here.
New Relic have done us a service by writing up a piece on connectors and transformations in OTEL ETLs, which is a great way of getting the precise data we want from traces into metrics.
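In collector terms, that ETL is a connector sitting between the traces and metrics pipelines. A sketch using the spanmetrics connector, with dimensions drawn from the gen_ai.* fields above (the exact attribute keys would need verifying against our spans):

```yaml
# otel-collector-config.yaml fragment - derive metrics from trace spans
connectors:
  spanmetrics:
    dimensions:
      - name: gen_ai.request.model
      - name: gen_ai.system
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [spanmetrics]   # connector consumes the trace stream...
    metrics:
      receivers: [spanmetrics]   # ...and emits derived metrics into this pipeline
      exporters: [prometheus]
```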
Some further thoughts on integrations.
That's a user error: you just have to define the environment variables correctly. If you don't, it silently uses the default - which is a very sensible default.
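Assuming this refers to the standard OTLP exporter variables (the thread doesn't name them, so this is my guess at the specifics), pointing the exporter at the collector explicitly looks like:

```yaml
# docker-compose fragment - override the silent default of http://localhost:4317
services:
  litellm:
    environment:
      OTEL_EXPORTER_OTLP_ENDPOINT: "http://otel-collector:4317"
```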
Not all the work for this has ended up in the amazee.ai repo, so here are some important links for future wanderers.
Next steps here will mostly involve updates in the K0rdent space so we can get a variety of metering/billing/usage-tracking systems in place.
We want to better understand the integrations and which will meet our needs:
https://docs.litellm.ai/docs/observability