Research Integrations #16
If we can push the data into Prometheus and visualise it in Grafana, that may well simplify matters when it comes to ops after GA. OpenTelemetry is designed to do just that; raw data processing and custom callbacks would also allow it, but would require the instrumentation to be built up.
OpenTelemetry (OTEL) looks like it fits the use case pretty nicely: the collector can be added to the docker-compose easily, and it has pre-existing integration with Prometheus for metrics export. Traces are not what we are looking for at this point, but they may well prove useful in future; if so, it is worth looking at a tool like Jaeger (CNCF, open source) for visualisation. Some sharp edges: I'm still experimenting with adding the instrumentation to LiteLLM without having to go too deep into the config, but it may require a step in the setup of each model to add the callbacks, rather than applying globally to all LiteLLM models.
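A minimal sketch of that wiring, assuming an OTLP gRPC receiver and the collector's Prometheus exporter (ports and file names here are illustrative choices, not our actual setup):

```yaml
# otel-collector-config.yaml - receive OTLP, expose a Prometheus scrape endpoint
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
exporters:
  prometheus:
    endpoint: 0.0.0.0:8889   # Prometheus scrapes this
service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
```

The collector itself would then be one more service in the docker-compose (e.g. the `otel/opentelemetry-collector-contrib` image), with 4317 reachable from the proxy and 8889 added as a scrape target in Prometheus.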
For security/compliance purposes we may want to ensure we integrate with the PII masking hooks in the LiteLLM proxy.
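I haven't tested this yet, but from my reading of the LiteLLM docs the Presidio-based masking is wired up through the proxy's guardrails config, something like the following (shape assumed from the docs, not verified here):

```yaml
# proxy config.yaml fragment - guardrail shape assumed from LiteLLM docs, unverified
guardrails:
  - guardrail_name: "presidio-pii-masking"
    litellm_params:
      guardrail: presidio
      mode: "pre_call"   # mask PII before the request reaches the model
```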
Sometimes you look around a corner and realise that almost exactly what you want is behind a paywall: LiteLLM Enterprise includes built-in Prometheus metrics. So now I need to see what I can do about these callbacks - I may need to build something custom which then uses either OTEL or Prometheus scraping to turn the data into metrics.
I've gone a little way down the rabbit hole of setting up Jaeger, which is really for distributed tracing, so not quite solving the use case here. The purpose was mostly to confirm that data collection happens the way I believe it does, and that with the right combination of OTLP (OpenTelemetry Protocol) exporters and receivers the data can be extracted from the LiteLLM proxy.
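For reference, the verification setup was roughly this shape. The service names are mine, and the LiteLLM env var names are from my reading of their OTEL docs, so worth double-checking:

```yaml
# docker-compose fragment - Jaeger all-in-one receiving OTLP from the LiteLLM proxy
services:
  jaeger:
    image: jaegertracing/all-in-one:latest
    environment:
      COLLECTOR_OTLP_ENABLED: "true"
    ports:
      - "16686:16686"  # Jaeger UI
      - "4317:4317"    # OTLP gRPC ingest
  litellm:
    image: ghcr.io/berriai/litellm:main-latest
    environment:
      OTEL_EXPORTER: "otlp_grpc"          # names per LiteLLM's OTEL callback docs
      OTEL_ENDPOINT: "http://jaeger:4317"
```

This also needs `callbacks: ["otel"]` under `litellm_settings` in the proxy config to switch the instrumentation on.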
The more I look, the more I see the need to write a custom callback to hook into our LiteLLM proxy. The purpose of the callback would be to structure the messages in such a way that they can be treated as metrics and not just traces (a sketch follows the field list below).
Useful fields available from the logging that we may want to filter/sort on (all examples very arbitrary, running in a local sandbox):

Arbitrary injectable metadata:

```json
{
  "key": "metadata.requester_metadata",
  "type": "string",
  "value": "{'ID': '1234'}"
}
```

Key alias:

```json
{
  "key": "metadata.user_api_key_alias",
  "type": "string",
  "value": "[email protected] - PippaTest"
}
```

Gen AI fields:

```json
{
  "key": "gen_ai.request.model",
  "type": "string",
  "value": "llama3.2"
}
{
  "key": "gen_ai.response.model",
  "type": "string",
  "value": "ollama/llama3.2"
}
{
  "key": "gen_ai.system",
  "type": "string",
  "value": "ollama"
}
{
  "key": "gen_ai.usage.completion_tokens",
  "type": "int64",
  "value": 282
}
{
  "key": "gen_ai.usage.prompt_tokens",
  "type": "int64",
  "value": 33
}
```
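To make that concrete, here is a minimal sketch of the kind of callback I have in mind, turning the fields above into Prometheus counters via `prometheus_client`. The class shape follows LiteLLM's `CustomLogger` interface; the metric names, labels, and port are placeholder choices, nothing settled:

```python
# custom_callbacks.py - sketch only; metric names, labels, and port 9100 are placeholders
from litellm.integrations.custom_logger import CustomLogger
from prometheus_client import Counter, start_http_server

# Counters labelled with the gen_ai.* / metadata fields seen in the trace logs above
PROMPT_TOKENS = Counter(
    "llm_prompt_tokens_total", "Prompt tokens consumed", ["model", "key_alias"]
)
COMPLETION_TOKENS = Counter(
    "llm_completion_tokens_total", "Completion tokens produced", ["model", "key_alias"]
)


class PrometheusMetricsCallback(CustomLogger):
    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        # The same data that surfaces as gen_ai.usage.* span attributes
        usage = getattr(response_obj, "usage", None)
        model = kwargs.get("model", "unknown")
        # user_api_key_alias lives in the proxy-injected metadata
        metadata = (kwargs.get("litellm_params") or {}).get("metadata") or {}
        alias = metadata.get("user_api_key_alias") or "unknown"
        if usage is not None:
            PROMPT_TOKENS.labels(model, alias).inc(usage.prompt_tokens)
            COMPLETION_TOKENS.labels(model, alias).inc(usage.completion_tokens)


# Expose /metrics for Prometheus to scrape
start_http_server(9100)
proxy_handler_instance = PrometheusMetricsCallback()
```

Wiring it in would then be `litellm_settings: callbacks: custom_callbacks.proxy_handler_instance` in the proxy config, per the LiteLLM custom callback docs, though I haven't confirmed how that interacts with the enterprise-gated Prometheus integration.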
The trouble with many of the integrated services is that they rely on default cloud-based endpoints rather than being self-hosted (e.g. Helicone/Lago/OpenMeter). To get Prometheus metrics or remote custom callbacks we need an enterprise key. We have all the data we want in the trace logs, but those are not well formatted for the kind of filtering and searching needed here.
New Relic have done us a service by writing up a piece on connectors and transformations in OTEL ETLs, which is a great way of getting the precise data we want from traces into metrics.
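In collector terms, that ETL is a connector sitting between the traces and metrics pipelines. A sketch using the spanmetrics connector, with dimensions drawn from the gen_ai.* fields above (the exact attribute keys would need verifying against our spans):

```yaml
# otel-collector-config.yaml fragment - derive metrics from trace spans
connectors:
  spanmetrics:
    dimensions:
      - name: gen_ai.request.model
      - name: gen_ai.system
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [spanmetrics]   # connector consumes the trace stream...
    metrics:
      receivers: [spanmetrics]   # ...and emits derived metrics into this pipeline
      exporters: [prometheus]
```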
Some further thoughts on integrations.
That's a user error: you just have to define the environment variables correctly. If you don't, it silently uses the default - which is a very sensible default.
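Assuming this refers to the standard OTLP exporter variables (the thread doesn't name them, so this is my guess at the specifics), pointing the exporter at the collector explicitly looks like:

```yaml
# docker-compose fragment - override the silent default of http://localhost:4317
services:
  litellm:
    environment:
      OTEL_EXPORTER_OTLP_ENDPOINT: "http://otel-collector:4317"
```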
Not all the work for this has ended up in the amazee.ai repo, so here are some important links for future wanderers.
Next steps here will mostly involve updates in the K0rdent space so we can get a variety of metering/billing/usage-tracking systems in place.
We want to better understand the integrations and which will meet our needs:
https://docs.litellm.ai/docs/observability