Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(docs): generate docs/monitoring/metrics.md file #5117

Open
wants to merge 18 commits into
base: master
Choose a base branch
from

Conversation

ivankatliarchuk
Copy link
Contributor

@ivankatliarchuk ivankatliarchuk commented Feb 23, 2025

Description

No issue. On one of the proposals, there were mentioning about metrics #5079 (comment). So I've tried to understand how metrics/observability is configured. Turns out, that there were no page about that, only FAQ with outdated data.

This pull request is very similar to #4983

It adds

  • generation of all available metrics to a document
  • monitoring documentation
  • added debug logs for metric instrumentation

This PR is massive, I'm could split it if required, just need a guidance how to slice it.

Checklist

  • Unit tests updated
  • End user documentation updated

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Feb 23, 2025
@ivankatliarchuk
Copy link
Contributor Author

/label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Feb 23, 2025
@ivankatliarchuk ivankatliarchuk changed the title chore(docs): generate docs/monitoring/metrics.md file WIP chore(docs): generate docs/monitoring/metrics.md file Feb 23, 2025
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 23, 2025
@ivankatliarchuk ivankatliarchuk marked this pull request as draft February 23, 2025 16:05
@ivankatliarchuk ivankatliarchuk changed the title WIP chore(docs): generate docs/monitoring/metrics.md file chore(docs): generate docs/monitoring/metrics.md file Feb 23, 2025
@ivankatliarchuk ivankatliarchuk marked this pull request as ready for review February 23, 2025 17:27
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 23, 2025
@mloiseleur
Copy link
Contributor

Overall PR LGTM.
Wdyt of moving the template in a separate file ?

It's the case for the chart and it's clearly easier to review & update.

@ivankatliarchuk
Copy link
Contributor Author

Changed to a template file.

Copy link
Contributor

@mloiseleur mloiseleur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you please add CI on this ? (like for flags)
On my system, it generates two additional metrics:

diff --git a/docs/monitoring/metrics.md b/docs/monitoring/metrics.md
index 6c7782af..effbd242 100644
--- a/docs/monitoring/metrics.md
+++ b/docs/monitoring/metrics.md
@@ -80,6 +80,8 @@ curl https://localhost:7979/metrics
 | http_request_duration_seconds |
 | process_cpu_seconds_total |
 | process_max_fds |
+| process_network_receive_bytes_total |
+| process_network_transmit_bytes_total |
 | process_open_fds |
 | process_resident_memory_bytes |
 | process_start_time_seconds |

@ivankatliarchuk
Copy link
Contributor Author

The Go runtime's behavior differs considerably across systems due to factors like Docker containerization, the host operating system, and architecture (ARM, AMD64, etc.). This inconsistency is why its tests are disabled. We can either add them manually or remove them entirely.

The following Go runtime metrics are available for scraping. Please note that they may change over time and they are OS dependent.

@ivankatliarchuk
Copy link
Contributor Author

ivankatliarchuk commented Feb 25, 2025

process_network_receive_bytes_total
+| process_network_receive_bytes_total |
+| process_network_transmit_bytes_total |

I'm unsure how to catch them from within a code, they are only available when promhttp.Handler() is activated.

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from mloiseleur. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ivankatliarchuk
Copy link
Contributor Author

I've added manually this two metrics. It's a best effort at the moment.

@ivankatliarchuk
Copy link
Contributor Author

Current behaviour for CI tests

  • Will fail if new custom metric is added to a code but not to a file
  • Will ingore if Go adds/removes runtime metrics. So no CI tests to fail in such case

@ivankatliarchuk
Copy link
Contributor Author

Need guidance here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants