secrets: migrate secrets to utilize opentelemetry #3547

Open
wants to merge 17 commits into
base: master

Conversation

pitabwire
Contributor

Micro PR to migrate the monolith: #3539

PR affects:

secrets

Contributor

@vangent vangent left a comment

Let's iterate on this one until it looks good, then you can update the others as needed and I'll look at those.

	otel.Handle(err)
}

completedCallsCounter, err = meter.Int64Counter(

Contributor

Why do we need this? Doesn't the latency histogram implicitly have a call count?

Contributor Author

@vangent I tried to maintain the existing functionality here. I have no strong opinion on it and can remove it if that is preferred.

Contributor

In the existing functionality, I only see one metric, a "latencyMeasure"....?

Contributor Author

The other metric, "completed_calls", was being added through a view. I have spent some time trying to replicate that functionality so that the latency measure can be used as it is being used now.

	metric.WithDescription("Latency distribution of method calls"),
)
if err != nil {
	otel.Handle(err)

Contributor

What does this do, and in what state does it leave the latencyHistogram?

Contributor Author

otel.Handle(err) is a global handler for errors, so for sure there will be an error. However, I now see that we could end up with a partially configured latencyHistogram. I am looking further into this.

Contributor

What does "there will be an error" mean though? Does the application crash? Or does it just log something and return? If the latter, what happens later on when we try to record metrics?

Contributor Author

@pitabwire pitabwire Apr 22, 2025

I have added a panic here. I still need to spend some time confirming that this error is only caused by things like invalid names, which can be caught during development, and not by something like being unable to provision the infrastructure for the meter.
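
The two policies discussed in this thread (report the error and continue, or panic) differ in what state the instrument is left in afterwards. The sketch below contrasts them using only the standard library. All names (`recorder`, `newHistogram`, `mustHistogram`, `histogramOrNoop`) are hypothetical, and the sketch does not claim to reproduce what the OpenTelemetry SDK returns alongside an error; that should be checked against its documentation.

```go
package main

import (
	"errors"
	"fmt"
)

// recorder is a hypothetical stand-in for a metrics instrument.
type recorder interface {
	Record(value float64)
}

// memRecorder keeps recorded values in memory.
type memRecorder struct{ values []float64 }

func (m *memRecorder) Record(v float64) { m.values = append(m.values, v) }

// noopRecorder drops every measurement; calling it is always safe.
type noopRecorder struct{}

func (noopRecorder) Record(float64) {}

// newHistogram is a hypothetical constructor that rejects an empty name,
// mimicking a configuration mistake that can be caught during development.
// On error it returns a nil recorder, so a caller that only logs the error
// and keeps the result would panic later when calling Record.
func newHistogram(name string) (recorder, error) {
	if name == "" {
		return nil, errors.New("instrument name must not be empty")
	}
	return &memRecorder{}, nil
}

// mustHistogram implements the "panic" policy: fail fast at setup time.
func mustHistogram(name string) recorder {
	r, err := newHistogram(name)
	if err != nil {
		panic(fmt.Errorf("creating histogram %q: %w", name, err))
	}
	return r
}

// histogramOrNoop implements the "report and continue" policy: the error is
// passed to report, and a no-op recorder is returned so later Record calls
// remain safe.
func histogramOrNoop(name string, report func(error)) recorder {
	r, err := newHistogram(name)
	if err != nil {
		report(err)
		return noopRecorder{}
	}
	return r
}

func main() {
	latency := histogramOrNoop("", func(err error) { fmt.Println("metrics disabled:", err) })
	latency.Record(12.5) // safe: the no-op recorder ignores the value
	mustHistogram("latency").Record(3.0)
}
```

The panic variant suits errors that can only come from programmer mistakes; the fallback variant suits errors that might occur in production, which is the distinction the comment above says still needs checking.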

defer func() {
	// Set status on span before ending
	if err != nil {
		span.SetAttributes(attribute.String("gocdk.status", "13")) // Internal error

Contributor

Where did you get 13? Isn't there a constant for it somewhere?

Contributor Author

This has been cleaned up.

defer func() { k.tracer.End(ctx, err) }()
start := time.Now()
ctx, span := k.tracer.Start(ctx, "Decrypt")
// Set span attributes for testing

Contributor

So, it's not great that we went from 2 lines of code to record metrics to about 30. I suspect this will happen in every package. It's repetitive even inside this package (lines 130-160, lines 178-209 are basically the same).

a) Why does OpenTelemetry need so much more? For example, you're already passing the method name to the OpenSpan, why do we need to provide it again as an attribute?

b) Can you add a wrapper Start/End in the otel/ package that could be called here so that secrets/blob/pubsub/etc. only have to call Start and defer End?

Contributor Author

This is fixed now

Contributor Author

@vangent I would like you to check the view in internal/otel/metrics.go; you might have a better idea.
