Skip to content

Drop some of the metrics exposed by prometheus-adapter #1409

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 30, 2021

Conversation

dgrisonnet
Copy link
Contributor

Description

Describe the big picture of your changes here to communicate to the maintainers why we should accept this pull request.
If it fixes a bug or resolves a feature request, be sure to link to that issue.

The current implementation of prometheus-adapter exposes a lot of
metrics about the health of its aggregated apiserver. The issue is that
the some of these metrics are not very useful in the context of
prometheus-adapter, and we currently can't avoid exposing them since
they are registered to the Kubernetes global Prometheus registry. Until
this is improved in upstream Kubernetes, we could benefit from dropping
some of the metrics that are not very useful.

Before this change, in a default kube-prometheus installation, we would
have 800+ series for prometheus-adapter against 400+, so we divided the
number of series by two will focusing on the most valuable metrics for
prometheus-adapter.

Type of change

What type of changes does your code introduce to the kube-prometheus? Put an x in the box that apply.

  • CHANGE (fix or feature that would cause existing functionality to not work as expected)
  • FEATURE (non-breaking change which adds functionality)
  • BUGFIX (non-breaking change which fixes an issue)
  • ENHANCEMENT (non-breaking change which improves existing functionality)
  • NONE (if none of the other choices apply. Example, tooling, build system, CI, docs, etc.)

Changelog entry

Please put a one-line changelog entry below. Later this will be copied to the changelog file.

Drop some of prometheus-adapter metrics that are inherited from the apiserver code but aren't useful in the context of prometheus-adapter.

The current implementation of prometheus-adapter exposes a lot of
metrics about the health of its aggregated apiserver. The issue is that
the some of these metrics are not very useful in the context of
prometheus-adapter, and we currently can't avoid exposing them since
they are registered to the Kubernetes global Prometheus registry. Until
this is improved in upstream Kubernetes, we could benefit from dropping
some of the metrics that are not very useful.

Before this change, in a default kube-prometheus installation, we would
have 800+ series for prometheus-adapter against 400+, so we divided the
number of series by two will focusing on the most valuable metrics for
prometheus-adapter.

Signed-off-by: Damien Grisonnet <[email protected]>
@dgrisonnet
Copy link
Contributor Author

@fpetkovski @prashbnair can you please have a look at these changes and let me know if they make sense? Also if you have any other idea of metrics that we could drop let me know.

@fpetkovski
Copy link
Contributor

I am not familiar with which metrics are exposed by prometheus-adapter, but would it make sense to make an allowlist instead of a denylist?

@dgrisonnet
Copy link
Contributor Author

I don't think so since it would mean that we would have to think about updating the allowlist whenever we will add new health metrics in prometheus-adapter.

@fpetkovski
Copy link
Contributor

Sounds good, then this lgtm

@prashbnair
Copy link
Contributor

Do we need the ones related to authentication?

@dgrisonnet
Copy link
Contributor Author

dgrisonnet commented Sep 29, 2021

I never used them personally, but I thought that they might be useful if somehow there is an issue with the authentication since the authentication process of aggregated APIs is quite complex and these are the only metrics that we have to investigate. As far as I can tell, we don't have any intel from the apiserver itself since the requests are proxied. That's why I would be reluctant to remove them even though they are responsible for a big part of the series.

Maybe @s-urbaniak can chime in here since he has knowledge on both prometheus-adapter and kubernetes authentication.

Essentially the metrics that I think are worth keeping are:

But I am in no way an expert on that topic, so I don't really know if these metrics really make sense for an aggregated API.

@prashbnair
Copy link
Contributor

prashbnair commented Sep 29, 2021

lgtm
lets merge this and we can remove the others if needed.

@s-urbaniak
Copy link
Contributor

/lgtm

@dgrisonnet
Copy link
Contributor Author

Thanks everyone for the reviews.

@dgrisonnet dgrisonnet merged commit 374413f into prometheus-operator:main Sep 30, 2021
@dgrisonnet dgrisonnet deleted the drop-pa-metrics branch September 30, 2021 15:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants