Skip to content

JMX metrics unit conversion #13448

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 22 commits into from
Mar 20, 2025

Conversation

robsunday
Copy link
Contributor

@robsunday robsunday commented Mar 4, 2025

Related to: #13369

Some of JMX enabled applications publish their metrics with units incompatible with semconv. Most frequently it is publishing elapsed time in milliseconds or nanoseconds, while semconv recommends seconds.

We can support semconv recommendations by implementing metric unit conversion mechanism that can be used with YAML JMX configs.

Solution description:
A new YAML property sourceUnit can now be specified to define original unit of mapped bean property value.
It should be used together with unit, which defines unit of metric value reported to the backend.

Example:

      totalProcessingTime:
        metric: processingTime
        type: counter
        sourceUnit: ms
        unit: s
        desc: Total time for processing all requests

This means that totalProcessingTime bean property is mapped to processingTime metric and value is converted from milliseconds to seconds.

If only unit is specified and there is no sourceUnit then no conversion is performed.
If only sourceUnit is specified and there is no unit then metric has no unit and no conversion is performed - it is just silently ignored.

Note
This PR will be followed by another one which will solve numeric precision issue mentioned below

@robsunday
Copy link
Contributor Author

To discuss - should we silently ignore situation when unit is missing and only sourceUnit is specified? Should we gereate parsing error, or maybe use sourceUnit as it would be unit?

@SylvainJuge
Copy link
Contributor

To discuss - should we silently ignore situation when unit is missing and only sourceUnit is specified? Should we gereate parsing error, or maybe use sourceUnit as it would be unit?

We must have an explicit unit, so unit is mandatory, and sourceUnit is optional. So I would suggest to throw an error rather than try to use sourceUnit because it's simpler to document and explain if those fields as always "mandatory" and "optional" and do not have their definitions depends on each other.

Copy link
Contributor

@SylvainJuge SylvainJuge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding documentation for sourceUnit will be required in https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/instrumentation/jmx-metrics/javaagent/README.md once we agree on the implementation.

@robsunday
Copy link
Contributor Author

According to mandatory unit - YAML is correctly processed if unit is not provided. In such a case the unit is and empty string.
We made it mandatory in our metric assertions framework (so we need to call .hasUnit("") in case unit was not provided)
I need to update the code in few places to handle unit = "" correctly.

@SylvainJuge
Copy link
Contributor

According to mandatory unit - YAML is correctly processed if unit is not provided. In such a case the unit is and empty string. We made it mandatory in our metric assertions framework (so we need to call .hasUnit("") in case unit was not provided) I need to update the code in few places to handle unit = "" correctly.

You are right, the io.opentelemetry.sdk.testing.assertj.MetricAssert#hasUnit metric assertion in sdk testing has a few examples with an empty string as unit, which tends to confirm that unit is not mandatory even if it's not very common in practice. I assume this might be needed to handle compatibility when metric is not available, which is not the case for the metrics that are part of semantic conventions.

Converters stored in a single flat map
Exception thrown when converter not found
@@ -27,21 +27,28 @@ public enum Type {
// How to report the metric using OpenTelemetry API
private final String metricName; // used as Instrument name
@Nullable private final String description;
@Nullable private final String unit;
@Nullable private final String sourceUnit;
private final String unit;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[for reviewer] unit is required by semconv, so it should not be nullable

@robsunday robsunday marked this pull request as ready for review March 7, 2025 07:21
@robsunday robsunday requested a review from a team as a code owner March 7, 2025 07:21
@robsunday robsunday requested a review from SylvainJuge March 7, 2025 08:07
@SylvainJuge
Copy link
Contributor

I think there are a few issues with the current approach we've taken so far, I'll try to elaborate a bit on what I think could be problematic with this approach.

Currently, we assume that an MBean attribute of type Float or Double will produce a Double OTel metric and a Long metric otherwise, this is done using AttributeInfo.usesDoubleValues(). (1)

When adding the unit conversion, we take in account that the conversion to seconds always produces a Double which is reflected by the UnitConverter.isConvertingToDouble(), and hence the choice of the callback method is appropriately made to deal with that.

However, in the implementation of the unit converter themselves, for example in ms to s, we read the values with Number#doubleValue even if it's a Long value, that implies a potential loss of precision. In the reverse case of the Mby to bytes conversion, it means such conversion would produce a Double value for which the decimal part is useless.

In short, I think using a unit conversion should:

  • ideally allow to not assume that the value is read as a double or long to prevent loss of precision when we can: this is quite easy to implement in the conversion function)
  • allow to specify the type of the resulting value: this is currently missing from the yaml

But the downside here is that it means for every converter, for example ms_to_s to create two variants ms_to_s_long and ms_to_s_double, also we have to either add another attribute in yaml or use a do-it-all attribute that directly refers to the converter key, for example valueConverter: 'ms_to_s_long' in yaml would convert milliseconds to seconds as Long value.

So, I think that we really miss here is the ability to know which numeric type should be used for a given metric. As said here (1) using the type of the attribute is a good default but we should provide an explicit option as well. When unit conversion is involved, we could then reuse this information to know how to perform the conversion appropriately.

Doing so could also maybe bring a few other use-cases, but I only have one so far:

  • JMX metric published as a double but captured as a long or the opposite, without changing the unit.

Added support for default sourceUnit defined on rule level
@SylvainJuge
Copy link
Contributor

As discussed today, we currently do not have the ability to have an explicit "numeric type" for the produced metric.

This is fine for now as we only need to convert to seconds for the JVM/tomcat metrics, which will always produce a double value, so we can deal with that improvement later in a separate PR.

As explained here in documentation choosing numeric types can have an impact with overflows, but this could also be something problematic for the consumers of those metrics if they rely on the numeric type (for example distinct storage for long and double values).

Converter registration is now package private.
}

@Test
void shouldSupportCustomConverter() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[for reviewer] this test is more for future-proofing the design of unit conversion, we don't expect for now to provide the ability to register custom converters.

@robsunday robsunday requested a review from jaydeluca March 13, 2025 11:46
@trask
Copy link
Member

trask commented Mar 13, 2025

cc @PeterF778 in case you have a chance to review

@trask trask merged commit 511a9c9 into open-telemetry:main Mar 20, 2025
86 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants