[processor/transform] Add skeleton for query language transform processor #7047

Merged: 10 commits, Jan 7, 2022
20 changes: 17 additions & 3 deletions processor/transformprocessor/README.md
@@ -28,16 +28,30 @@ Supported where operations:

Example configuration:
```yaml
receivers:
  otlp:
    protocols:
      grpc:

exporters:
  nop

processors:
  transform:
    queries:
      - set(status.code, 1) where attributes["http.path"] == "/health"
      - keep(resource.attributes, "service.name", "service.namespace", "cloud.region")
Contributor Author

Continuing discussion from #6985 (comment)

I don't remember the final decision about the design (one processor per signal, or one processor for all signals). If we have one processor for all signals, as the name "transform" suggests, then this will apply to any telemetry type, but the first line would apply only to traces, since status is a span concept?

I believe it would be hard to understand all these subtleties, so maybe we can have a per-signal processor and reuse as much code as possible?

I'm not sure we need the final decision now, but I got confused when reading these lines.

Originally posted by @bogdandrutu in #6985 (comment)

Contributor Author

@anuraaga, Jan 5, 2022

I see where you're coming from. By "blunt" I guess I meant that, IMO, it would be harder to grok having three separate components that are conceptually the same thing: running functions on query expressions. Would it be better to put the signals into the config?

transform:
  spans:
    queries:
  metrics:
    queries:
  all:
    queries:

It would also allow implementing `Validate`, which seems useful.

Member

@bogdandrutu, Jan 5, 2022

Let's go with this proposal (the latest one you had), since I don't think it is too hard to change if we get feedback that this is still confusing.

Contributor Author

@bogdandrutu Updated to use per-signal queries in config

      - set(name, attributes["http.route"])
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [transform]
      exporters: [nop]
```
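
The review thread settled on grouping queries per signal, partly because that layout lets the config implement `Validate`. Below is a minimal sketch of that idea in Go. The type and field names (`Config`, `SignalConfig`, `Spans`, `Metrics`, `All`) are hypothetical, and the rule that span-only paths such as `status.code` are rejected outside the spans section is an assumption made for illustration. None of it is taken from the PR's code.

```go
package main

import (
	"fmt"
	"strings"
)

// SignalConfig holds the queries for one signal section (hypothetical name).
type SignalConfig struct {
	Queries []string
}

// Config mirrors the per-signal layout proposed in the review thread.
// The field names are assumptions, not the processor's actual config.
type Config struct {
	Spans   SignalConfig
	Metrics SignalConfig
	All     SignalConfig
}

// spanOnlyPaths lists paths assumed to exist only on spans.
var spanOnlyPaths = []string{"status.code"}

// Validate rejects empty queries in every section and span-only paths
// in any section other than Spans.
func (c Config) Validate() error {
	sections := []struct {
		name        string
		queries     []string
		allowsSpans bool
	}{
		{"spans", c.Spans.Queries, true},
		{"metrics", c.Metrics.Queries, false},
		{"all", c.All.Queries, false},
	}
	for _, s := range sections {
		for i, q := range s.queries {
			if strings.TrimSpace(q) == "" {
				return fmt.Errorf("%s: query %d is empty", s.name, i)
			}
			if s.allowsSpans {
				continue
			}
			// Naive substring check; a real implementation would parse the query.
			for _, p := range spanOnlyPaths {
				if strings.Contains(q, p) {
					return fmt.Errorf("%s: query %d uses span-only path %q", s.name, i, p)
				}
			}
		}
	}
	return nil
}

func main() {
	cfg := Config{
		Metrics: SignalConfig{Queries: []string{`set(status.code, 1)`}},
	}
	// A span-only path in the metrics section is reported as an error.
	fmt.Println(cfg.Validate())
}
```

With a check like this, the confusion raised in the thread (a span concept used in a query that runs on other telemetry) could surface at config load time rather than at runtime.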

-This processor will perform the operations in order
+This processor will perform the operations in order for all spans

-1) Set status code to OK for all spans with a path `/health`
+1) Set status code to OK with a path `/health`
 2) Keep only `service.name`, `service.namespace`, `cloud.region` resource attributes
-3) Set `name` to the `http.route` attribute if it is set.
+3) Set `name` to the `http.route` attribute if it is set
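
The three steps above can be sketched in plain Go to show what applying the queries in order does to a single span. This is an illustration only: `Span`, `keep`, and `Transform` are hypothetical names, the struct is a simplified stand-in rather than the collector's data model, and the value `1` is assumed to mean status OK, as the description states.

```go
package main

import "fmt"

// Span is a simplified, hypothetical stand-in for a span and its resource.
type Span struct {
	Name               string
	StatusCode         int
	Attributes         map[string]string
	ResourceAttributes map[string]string
}

// StatusOK is assumed to match the literal 1 used in the example query.
const StatusOK = 1

// keep deletes every key from attrs that is not listed in keys.
func keep(attrs map[string]string, keys ...string) {
	allowed := make(map[string]struct{}, len(keys))
	for _, k := range keys {
		allowed[k] = struct{}{}
	}
	for k := range attrs {
		if _, ok := allowed[k]; !ok {
			delete(attrs, k)
		}
	}
}

// Transform applies the three example operations to one span, in order.
func Transform(s *Span) {
	// 1) set(status.code, 1) where attributes["http.path"] == "/health"
	if s.Attributes["http.path"] == "/health" {
		s.StatusCode = StatusOK
	}
	// 2) keep(resource.attributes, "service.name", "service.namespace", "cloud.region")
	keep(s.ResourceAttributes, "service.name", "service.namespace", "cloud.region")
	// 3) set(name, attributes["http.route"]), applied only if the attribute is set
	if route, ok := s.Attributes["http.route"]; ok {
		s.Name = route
	}
}

func main() {
	s := &Span{
		Name:               "GET",
		Attributes:         map[string]string{"http.path": "/health", "http.route": "/health"},
		ResourceAttributes: map[string]string{"service.name": "api", "host.name": "h1"},
	}
	Transform(s)
	fmt.Println(s.Name, s.StatusCode, s.ResourceAttributes)
}
```

Order matters in general because a later query sees the result of an earlier one. For example, a `where` clause that reads an attribute would behave differently if an earlier query had already removed it.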