Research Integrations #16

Open
bonus414 opened this issue Mar 21, 2025 · 13 comments

@bonus414

We want to better understand the available integrations and which of them will meet our needs:
https://docs.litellm.ai/docs/observability

@bonus414 bonus414 moved this to Icebox in amazee + Afrolabs Mar 21, 2025
@bonus414 bonus414 added enhancement New feature or request question Further information is requested labels Mar 21, 2025
@bonus414 bonus414 assigned bonus414 and unassigned bonus414 Mar 21, 2025
@PhiRho
Collaborator

PhiRho commented Mar 24, 2025

If we can push the data into Prometheus and visualise it in Grafana, that may well simplify matters when it comes to ops after GA. OpenTelemetry is designed to do just that; raw data processing and custom callbacks would also allow it, but would require the instrumentation to be built up.
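
As a rough sketch of that OTEL route (assuming the standard opentelemetry-sdk and OTLP exporter packages, and a collector on localhost:4317 that exposes the data for Prometheus to scrape; the endpoint and metric names are illustrative, not taken from our setup):

# Sketch only: emit an OTEL counter over OTLP; a collector with a Prometheus
# exporter then makes it scrapeable, and Grafana reads it from Prometheus.
# Assumes opentelemetry-sdk and opentelemetry-exporter-otlp are installed.
from opentelemetry import metrics
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter

reader = PeriodicExportingMetricReader(
    OTLPMetricExporter(endpoint="localhost:4317", insecure=True)  # illustrative endpoint
)
metrics.set_meter_provider(MeterProvider(metric_readers=[reader]))

meter = metrics.get_meter("litellm.proxy")
requests = meter.create_counter("llm_requests", description="Chat completions seen by the proxy")
requests.add(1, {"model": "llama3.2"})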

@PhiRho
Collaborator

PhiRho commented Mar 25, 2025

OpenTelemetry (OTEL) looks like it fits pretty nicely with the use case: the collector can be added to the docker-compose easily, and it has pre-existing integration with Prometheus for metrics export. Traces are not what we are looking for at this point, but they may well prove useful in future. If so, it is worth looking at a tool like Jaeger (CNCF, open source) for the visualisation.

Some sharp edges: I'm still experimenting with adding the instrumentation to LiteLLM without having to go too deep into the config, but it may require a step in the setup of each model to add the callbacks, rather than applying them globally to all LiteLLM models.
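
For reference, this is roughly what the globally-applied path looks like in the LiteLLM Python SDK, using its documented CustomLogger base class; the class name and the print statement are placeholders, and the proxy's config-file equivalent should be checked against the LiteLLM docs:

# Sketch of registering a callback globally in the LiteLLM Python SDK.
# CustomLogger is LiteLLM's documented base class for user-defined callbacks.
import litellm
from litellm.integrations.custom_logger import CustomLogger

class UsageLogger(CustomLogger):
    def log_success_event(self, kwargs, response_obj, start_time, end_time):
        usage = getattr(response_obj, "usage", None)
        model = kwargs.get("model")
        if usage is not None:
            print(f"{model}: {usage.prompt_tokens} prompt / {usage.completion_tokens} completion tokens")

litellm.callbacks = [UsageLogger()]  # applies to every call made through this process

litellm.completion(
    model="ollama/llama3.2",  # illustrative; any configured model works
    messages=[{"role": "user", "content": "hello"}],
)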

@PhiRho
Collaborator

PhiRho commented Mar 25, 2025

For security/compliance purposes we may want to look into ensuring we have integration with the PII masking hooks from the LiteLLM proxy.

@PhiRho
Collaborator

PhiRho commented Mar 25, 2025

Sometimes you look around a corner and realise that almost exactly what you want is behind a paywall. LiteLLM Enterprise includes built-in Prometheus metrics.

So, now I need to see what I can do about these callbacks - I may need to do something custom which then uses either OTEL or Prometheus scraping to turn data into metrics.
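
A sketch of the "something custom" option: a CustomLogger that keeps its own prometheus_client counters and scrape endpoint, so we don't depend on the enterprise Prometheus integration. The port and metric names are placeholders:

# Sketch only: expose our own /metrics endpoint from a custom callback instead
# of using the enterprise Prometheus integration. Port and names are placeholders.
from prometheus_client import Counter, start_http_server
from litellm.integrations.custom_logger import CustomLogger

TOKENS = Counter("llm_tokens_total", "Tokens used per model", ["model", "kind"])

class PromLogger(CustomLogger):
    def log_success_event(self, kwargs, response_obj, start_time, end_time):
        model = kwargs.get("model", "unknown")
        usage = getattr(response_obj, "usage", None)
        if usage is not None:
            TOKENS.labels(model=model, kind="prompt").inc(usage.prompt_tokens)
            TOKENS.labels(model=model, kind="completion").inc(usage.completion_tokens)

start_http_server(9099)  # Prometheus scrapes http://<host>:9099/metrics
# register as in the earlier sketch: litellm.callbacks = [PromLogger()]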

@PhiRho
Collaborator

PhiRho commented Mar 26, 2025

I've done a little bit of a run down the rabbit hole of setting up Jaeger, which is really for distributed tracing, so it doesn't quite solve the use case here. The purpose of that was mostly to confirm that the data collection happens the way I believe it does, and that with the right combination of OTLP (OpenTelemetry Protocol) exporters and receivers the data can be extracted from the LiteLLM proxy.

[Image: Jaeger "flame graph" of a single LiteLLM chat-completion request.]

@PhiRho
Collaborator

PhiRho commented Mar 26, 2025

The more I look, the more I see the need to write a custom callback to hook into our LiteLLM proxy. The purpose of the callback would be to structure the messages in such a way that they can be treated as metrics and not just traces.

@PhiRho
Collaborator

PhiRho commented Mar 26, 2025

Useful fields we may want to filter/sort on, available from the logging (all example values are arbitrary, from a local sandbox):

Arbitrary injectable metadata

{
  "key": "metadata.requester_metadata",
  "type": "string",
  "value": "{'ID': '1234'}"
}

Key alias

{
  "key": "metadata.user_api_key_alias",
  "type": "string",
  "value": "[email protected] - PippaTest"
}

Gen AI fields

{
  "key": "gen_ai.request.model",
  "type": "string",
  "value": "llama3.2"
}
{
  "key": "gen_ai.response.model",
  "type": "string",
  "value": "ollama/llama3.2"
}
{
  "key": "gen_ai.system",
  "type": "string",
  "value": "ollama"
}
{
  "key": "gen_ai.usage.completion_tokens",
  "type": "int64",
  "value": 282
}
{
  "key": "gen_ai.usage.prompt_tokens",
  "type": "int64",
  "value": 33
}
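
As an illustration of where this could go, a span processor could lift those gen_ai.* attributes out of finished spans and record them as metrics; the metric names are placeholders, and a configured MeterProvider (as in the earlier sketch) is assumed:

# Sketch: turn the gen_ai.* span attributes listed above into metrics.
# Assumes a MeterProvider is already configured; otherwise the counters no-op.
from opentelemetry import metrics
from opentelemetry.sdk.trace import SpanProcessor

meter = metrics.get_meter("litellm.usage")
prompt_tokens = meter.create_counter("gen_ai_prompt_tokens")
completion_tokens = meter.create_counter("gen_ai_completion_tokens")

class UsageFromSpans(SpanProcessor):
    def on_end(self, span):
        attrs = span.attributes or {}
        labels = {"model": attrs.get("gen_ai.request.model", "unknown")}
        if "gen_ai.usage.prompt_tokens" in attrs:
            prompt_tokens.add(attrs["gen_ai.usage.prompt_tokens"], labels)
        if "gen_ai.usage.completion_tokens" in attrs:
            completion_tokens.add(attrs["gen_ai.usage.completion_tokens"], labels)

# registered with: tracer_provider.add_span_processor(UsageFromSpans())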

@PhiRho
Collaborator

PhiRho commented Mar 27, 2025

The trouble with many of the integrated services is that they rely on default cloud-based endpoints rather than being self-hosted (e.g. Helicone/Lago/OpenMeter). In order to get Prometheus metrics or remote custom callbacks, we need an enterprise key.

We have all the data we want in the trace logs, but those are not well formatted for the type of filtering and searching which is needed here.

@PhiRho
Collaborator

PhiRho commented Mar 28, 2025

New Relic have done us a service by writing up a piece on connectors and transformations in OTEL ETL pipelines, which is a great way of getting the precise data we want out of traces and into metrics.

@PhiRho PhiRho self-assigned this Apr 1, 2025
@PhiRho
Collaborator

PhiRho commented Apr 1, 2025

Some further thoughts on integrations.

  1. Connectors are great, but not good for pulling custom values into metrics (better suited to RED metrics).
  2. There are a handful of places where LiteLLM hardcodes an endpoint which then can't be overridden (e.g. OpenMeter), and they try to force you into the managed version of the other software. That might be the right answer, but it is a serious pain for testing.

@PhiRho
Collaborator

PhiRho commented Apr 2, 2025

> LiteLLM hardcodes an endpoint which then can't be overridden

That's a user error: you just have to define the environment variables correctly. If you don't, it silently falls back to the default, which is a very sensible default.
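
To make that concrete, something like the following; the variable names and the callback string are my reading of the LiteLLM OpenMeter integration and should be verified against the docs, and the URL is a placeholder for a self-hosted instance:

# Sketch only: point the OpenMeter callback at a self-hosted endpoint instead
# of the cloud default. Variable names are assumed; check the LiteLLM docs.
import os

os.environ["OPENMETER_API_ENDPOINT"] = "http://openmeter.internal:8888"  # assumed var name
os.environ["OPENMETER_API_KEY"] = "dummy-key-for-local-testing"

import litellm  # imported after the env is set so the callback picks it up

litellm.callbacks = ["openmeter"]  # exact registration string: see LiteLLM docs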

@PhiRho PhiRho moved this from Icebox to In Progress in amazee + Afrolabs Apr 7, 2025
@PhiRho PhiRho closed this as completed by moving to In Progress in amazee + Afrolabs Apr 7, 2025
@PhiRho PhiRho reopened this Apr 11, 2025
@PhiRho
Collaborator

PhiRho commented Apr 30, 2025

Not all the work for this has ended up in the amazee.ai repo, so here are some important links for future wanderers.

  • Fork of LiteLLM which removes the enterprise code and allows us to make changes as necessary without licensing issues
  • K0rdent catalog which contains a number of the Helm templates etc. for the LiteLLM deployments, as well as Grafana and Prometheus.
  • (not linked) Private K0rdent clusters which contain specific values needed to integrate with these other systems

Next steps here will mostly involve updates in the K0rdent space so we can get a variety of metering/billing/usage tracking systems in place.
