add a negative cache to regexp decoder #527

Open
dqminh wants to merge 1 commit into master from regexp-cache

Conversation

dqminh (Contributor) commented Feb 26, 2025

Often, decoders such as regexp run repeatedly on the same inputs only to skip them
via the regexp filter. A common example is matching a cgroup path in a chain like so:

```
- name: cgroup
- name: regexp
  regexps:
  - ^.*(system.slice).*$
```

Anything that is not in the system.slice cgroup will be skipped. When only a small
subset of inputs matches, the overhead of regexp matching can be noticeable.

We add a skip cache that remembers inputs that would produce ErrSkipLabelSet and
bypasses regexp matching for them, reducing the matching work.
The cache size is customizable with the flag `config.skip-cache-size`.
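
For illustration, a minimal sketch of the negative-cache idea (the names regexpDecoder, skipCache, and errSkipLabelSet are assumptions made for this sketch, not the exporter's actual types):

```
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// errSkipLabelSet stands in for the exporter's ErrSkipLabelSet sentinel.
var errSkipLabelSet = errors.New("skip label set")

// regexpDecoder is a simplified stand-in for the real decoder: compiled
// regexps plus a negative ("skip") cache of inputs already known not to match.
type regexpDecoder struct {
	regexps   []*regexp.Regexp
	skipCache map[string]struct{}
}

func (d *regexpDecoder) decode(in string) (string, error) {
	// Negative cache hit: this input previously produced a skip,
	// so return immediately without running any regexp.
	if _, skip := d.skipCache[in]; skip {
		return "", errSkipLabelSet
	}

	for _, re := range d.regexps {
		if m := re.FindStringSubmatch(in); m != nil {
			if len(m) > 1 {
				return m[1], nil // first capture group, as in the cgroup example
			}
			return m[0], nil
		}
	}

	// No regexp matched: remember the input so future calls skip the matching.
	d.skipCache[in] = struct{}{}
	return "", errSkipLabelSet
}

func main() {
	d := &regexpDecoder{
		regexps:   []*regexp.Regexp{regexp.MustCompile(`^.*(system.slice).*$`)},
		skipCache: map[string]struct{}{},
	}
	fmt.Println(d.decode("/sys/fs/cgroup/system.slice/sshd.service"))  // matched
	fmt.Println(d.decode("/sys/fs/cgroup/user.slice/user-1000.slice")) // skipped and now cached
}
```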

dqminh force-pushed the regexp-cache branch 2 times, most recently from 786f90d to 67101f0 on February 26, 2025 at 16:15

```
if r.cache == nil {
	r.cache = map[string]*regexp.Regexp{}
}
if conf.LruCacheSize > 0 && r.outputCache == nil {
```
Contributor:
There's already a cache for label sets, shouldn't it cover this too and at a higher level?

```
cache, ok := s.cache[name]
if !ok {
	cache = map[string][]string{}
	s.cache[name] = cache
}
```

Contributor Author (dqminh):

@bobrik Yes, you are right. The LRU cache does not make sense. I think this should have been a negative cache instead, because the majority of the overhead comes from matching input that would have produced ErrSkipLabelSet. I've adjusted the implementation accordingly.

Contributor:

I don't think we need to limit the size here, as we don't do it for positive lookups. The most likely scenario is that you see the whole possible set of values on every scrape, so having an LRU that's one element too small will make it evict on every call and never hit anything. We don't have metrics, so it's easy to get this wrong and get somewhat higher CPU usage as an outcome.

Given that the decoding process is supposed to be deterministic (and we fully cache successes already), I think we need to cache errors as well and in the same place:

```
values, err := s.decodeLabels(in, labels)
if err != nil {
	return nil, err
}
```

This will be both faster (as we cache the whole label set instead of one decoder kind) and universal.
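
A hedged sketch of what caching errors alongside successes at that call site could look like (cacheEntry, labelCache, and decodeLabels here are illustrative stand-ins, not the exporter's code):

```
package main

import (
	"errors"
	"fmt"
)

var errSkipLabelSet = errors.New("skip label set")

// cacheEntry records the outcome of a decode, so successful label sets and
// ErrSkipLabelSet results are both served from the cache.
type cacheEntry struct {
	values []string
	err    error
}

// labelCache caches decode results by raw input, positive and negative alike.
type labelCache struct {
	entries map[string]cacheEntry
	decode  func(in string) ([]string, error) // the real, deterministic decoder chain
}

func (c *labelCache) decodeLabels(in string) ([]string, error) {
	if e, ok := c.entries[in]; ok {
		// Cache hit: return the remembered result, including a remembered error.
		return e.values, e.err
	}
	values, err := c.decode(in)
	// Decoding is deterministic, so remembering errors is as safe as
	// remembering successes.
	c.entries[in] = cacheEntry{values: values, err: err}
	return values, err
}

func main() {
	c := &labelCache{
		entries: map[string]cacheEntry{},
		decode: func(in string) ([]string, error) {
			if in == "skip-me" {
				return nil, errSkipLabelSet
			}
			return []string{in}, nil
		},
	}
	fmt.Println(c.decodeLabels("keep-me")) // decoded and cached
	fmt.Println(c.decodeLabels("skip-me")) // error decoded and cached
	fmt.Println(c.decodeLabels("skip-me")) // served from cache, decoder not called
}
```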

Contributor:

Thinking about it some more: we skip things we don't want in metrics, and they can in fact be of higher cardinality than the things we want to retain. Perhaps we do want an LRU cache after all, but it still makes sense to have it at the global level.

Not sure what to do about the sizing. Maybe log a message if too many errors are produced and the hit rate is bad?
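
One possible shape for a bounded global negative cache that warns on a poor hit rate; this is only a sketch under assumptions (the negativeCache type, the use of hashicorp/golang-lru, and the 10% threshold are not part of the PR):

```
package negcache

import (
	"log"

	lru "github.com/hashicorp/golang-lru"
)

// negativeCache is a bounded, global skip cache with basic hit/miss
// accounting so a poor hit rate can at least be logged.
// Note: the counters are not safe for concurrent use; a real implementation
// would need atomics or a mutex.
type negativeCache struct {
	lru          *lru.Cache
	hits, misses uint64
}

func newNegativeCache(size int) (*negativeCache, error) {
	c, err := lru.New(size)
	if err != nil {
		return nil, err
	}
	return &negativeCache{lru: c}, nil
}

// isSkipped reports whether the key was previously marked as skipped.
func (c *negativeCache) isSkipped(key string) bool {
	if _, ok := c.lru.Get(key); ok {
		c.hits++
		return true
	}
	c.misses++
	// Warn when the cache keeps missing: the configured size is probably too
	// small for the observed cardinality.
	if total := c.hits + c.misses; total%10000 == 0 && c.hits*10 < total {
		log.Printf("negative cache hit rate below 10%% after %d lookups; consider a larger size", total)
	}
	return false
}

// markSkipped remembers that decoding this key produced ErrSkipLabelSet.
func (c *negativeCache) markSkipped(key string) {
	c.lru.Add(key, struct{}{})
}
```

In the exporter this would sit in front of the regexp decoder: isSkipped gates the match, and markSkipped records inputs that produced ErrSkipLabelSet.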

Contributor Author (dqminh):

Yes, I think we can lift the cache to the global level and make this the default behavior with a configurable size. The question is whether we want multiple negative LRU caches or only a single one. In practice, I think only regexp currently skips labels with potentially high cardinality, so a single global LRU cache should be OK. It's simpler to configure too.

Regarding hit rates, I guess we can produce a Prometheus metric for this, and it's the user's responsibility to determine the desirable size by looking at the metric and the load of the exporter.
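
A sketch of the kind of metric that could expose the hit rate (the metric name, package, and helper below are assumptions for illustration, not metrics the exporter actually exports):

```
package metrics

import "github.com/prometheus/client_golang/prometheus"

// skipCacheLookups counts negative-cache lookups by result, so users can
// derive a hit rate from the exporter's own /metrics output and size the
// cache accordingly.
var skipCacheLookups = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "ebpf_exporter_skip_cache_lookups_total",
		Help: "Negative (skip) cache lookups by result (hit or miss).",
	},
	[]string{"result"},
)

func init() {
	prometheus.MustRegister(skipCacheLookups)
}

// recordLookup is called on every cache lookup from the decoder path.
func recordLookup(hit bool) {
	if hit {
		skipCacheLookups.WithLabelValues("hit").Inc()
		return
	}
	skipCacheLookups.WithLabelValues("miss").Inc()
}
```

The hit rate could then be computed in PromQL as the ratio of the "hit" series to the total.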

Contributor:

SGTM

dqminh changed the title from "add a lru cache to regexp decoder" to "add a negative cache to regexp decoder" on Mar 3, 2025
dqminh force-pushed the regexp-cache branch 2 times, most recently from 33fbe97 to 3fd6913 on March 3, 2025 at 17:03