perf(sampling): optimize matching and limit cache memory [APMSP-2948] by bantonsson · Pull Request #1977 · DataDog/libdatadog

bantonsson · 2026-05-12T10:36:59Z

What does this PR do?

Optimizes the glob matching and limits the memory taken up by the LRU match cache.
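
To illustrate what bounding the cache by memory rather than by entry count can look like, here is a minimal sketch. It is not the PR's implementation: `ByteBoundedCache`, its fields, and the byte accounting (key length only) are hypothetical, and for brevity it evicts in insertion order instead of true LRU order.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical byte-budgeted cache; names and policy are illustrative only.
struct ByteBoundedCache {
    max_bytes: usize,
    used_bytes: usize,
    entries: HashMap<Vec<u8>, bool>,
    // Keys in insertion order, oldest first. A real LRU would also reorder on reads.
    order: VecDeque<Vec<u8>>,
}

impl ByteBoundedCache {
    fn new(max_bytes: usize) -> Self {
        Self {
            max_bytes,
            used_bytes: 0,
            entries: HashMap::new(),
            order: VecDeque::new(),
        }
    }

    fn get(&self, key: &[u8]) -> Option<bool> {
        self.entries.get(key).copied()
    }

    fn put(&mut self, key: &[u8], value: bool) {
        // A key larger than the whole budget is never stored.
        if key.len() > self.max_bytes {
            return;
        }
        // Overwriting an existing key does not change the byte count.
        if let Some(slot) = self.entries.get_mut(key) {
            *slot = value;
            return;
        }
        // Evict the oldest entries until the new key fits in the budget.
        while self.used_bytes + key.len() > self.max_bytes {
            match self.order.pop_front() {
                Some(oldest) => {
                    self.entries.remove(&oldest);
                    self.used_bytes -= oldest.len();
                }
                None => break,
            }
        }
        self.used_bytes += key.len();
        self.order.push_back(key.to_vec());
        self.entries.insert(key.to_vec(), value);
    }
}
```

Under these assumptions, a stream of long or unique subjects cannot grow the cache past `max_bytes` of key data, whatever the number of entries.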

Motivation

A follow-up PR that addresses the larger performance questions raised in #1927 and bounds the memory used by the LRU cache, an issue raised in a security review.

How to test the change?

Unit tests and benchmarks are in the PR.

github-actions · 2026-05-12T10:39:53Z

Clippy Allow Annotation Report

Comparing clippy allow annotations between branches:

Base Branch: origin/main
PR Branch: origin/ban/optimize-matching

Summary by Rule

Rule	Base Branch	PR Branch	Change

Annotation Counts by File

File	Base Branch	PR Branch	Change

Annotation Stats by Crate

Crate	Base Branch	PR Branch	Change
`clippy-annotation-reporter`	5	5	No change (0%)
`datadog-ffe-ffi`	1	1	No change (0%)
`datadog-ipc`	21	21	No change (0%)
`datadog-live-debugger`	6	6	No change (0%)
`datadog-live-debugger-ffi`	10	10	No change (0%)
`datadog-profiling-replayer`	4	4	No change (0%)
`datadog-remote-config`	3	3	No change (0%)
`datadog-sidecar`	57	57	No change (0%)
`libdd-common`	13	13	No change (0%)
`libdd-common-ffi`	12	12	No change (0%)
`libdd-data-pipeline`	5	5	No change (0%)
`libdd-ddsketch`	2	2	No change (0%)
`libdd-dogstatsd-client`	1	1	No change (0%)
`libdd-profiling`	13	13	No change (0%)
`libdd-telemetry`	20	20	No change (0%)
`libdd-tinybytes`	4	4	No change (0%)
`libdd-trace-normalization`	2	2	No change (0%)
`libdd-trace-obfuscation`	8	8	No change (0%)
`libdd-trace-stats`	1	1	No change (0%)
`libdd-trace-utils`	15	15	No change (0%)
Total	203	203	No change (0%)

About This Report

This report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 364249df6a

chatgpt-codex-connector · 2026-05-12T10:41:05Z

+        let mut cache = self.cache.lock_or_panic();
+        if let Some(&result) = cache.get(subject) {
+            return result;
+        }
+        let result = glob_match_bytes::<true>(pattern, subject);


Avoid running glob matching while holding the cache lock

For wildcard rules with cache misses, the cache stays locked while glob_match_bytes runs. Under concurrent sampling, a single high-cardinality or adversarial subject can make every other thread using the same cloned GlobMatcher wait behind the full backtracking match, including threads that only need a cache hit. The previous implementation released the mutex before matching and re-locked only to insert, so this change can serialize rule matching under load. Consider dropping the guard before computing the match and re-locking for put.
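
The suggestion above can be sketched as follows. This is a minimal illustration, not the PR's code: `CachedMatcher`, `glob_match`, and the `HashMap`-backed cache are hypothetical stand-ins for `GlobMatcher`, `glob_match_bytes`, and the LRU cache, and it uses `lock().unwrap()` where the PR uses `lock_or_panic`.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical stand-in for the real matcher: `*` matches any run of bytes,
/// `?` matches exactly one byte.
fn glob_match(pattern: &[u8], subject: &[u8]) -> bool {
    let (mut p, mut s) = (0, 0);
    // Where to resume after the most recent `*`, if one has been seen.
    let mut backtrack: Option<(usize, usize)> = None;
    while s < subject.len() {
        if p < pattern.len() && pattern[p] == b'*' {
            backtrack = Some((p, s));
            p += 1;
        } else if p < pattern.len() && (pattern[p] == b'?' || pattern[p] == subject[s]) {
            p += 1;
            s += 1;
        } else if let Some((bp, bs)) = backtrack {
            // Let the last `*` absorb one more byte, then retry.
            backtrack = Some((bp, bs + 1));
            p = bp + 1;
            s = bs + 1;
        } else {
            return false;
        }
    }
    // Only trailing `*`s may remain in the pattern.
    pattern[p..].iter().all(|&b| b == b'*')
}

/// Hypothetical matcher with a shared result cache.
struct CachedMatcher {
    pattern: Vec<u8>,
    cache: Mutex<HashMap<Vec<u8>, bool>>,
}

impl CachedMatcher {
    fn new(pattern: &str) -> Self {
        Self {
            pattern: pattern.as_bytes().to_vec(),
            cache: Mutex::new(HashMap::new()),
        }
    }

    fn matches(&self, subject: &[u8]) -> bool {
        {
            let cache = self.cache.lock().unwrap();
            if let Some(&hit) = cache.get(subject) {
                return hit;
            }
        } // The guard is dropped here, before the potentially slow match.

        let result = glob_match(&self.pattern, subject);

        // Re-lock only for the insert.
        self.cache.lock().unwrap().insert(subject.to_vec(), result);
        result
    }
}
```

The trade-off is that two threads missing on the same subject may both compute the match; in exchange, threads that only need a cache hit are never blocked behind a long backtracking match.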

datadog-official · 2026-05-12T10:54:10Z

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

🎯 Code Coverage (details)
• Patch Coverage: 99.24%
• Overall Coverage: 72.70% (+0.06%)

This comment will be updated automatically if new data arrives.

🔗 Commit SHA: 894e089

codecov-commenter · 2026-05-12T11:00:58Z

Codecov Report

❌ Patch coverage is 99.23664% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.69%. Comparing base (ec2fd4e) to head (894e089).

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1977      +/-   ##
==========================================
+ Coverage   72.64%   72.69%   +0.05%     
==========================================
  Files         451      452       +1     
  Lines       74687    74888     +201     
==========================================
+ Hits        54253    54443     +190     
- Misses      20434    20445      +11

Components	Coverage Δ
libdd-crashtracker	`65.31% <ø> (-0.03%)`	⬇️
libdd-crashtracker-ffi	`37.68% <ø> (ø)`
libdd-alloc	`98.77% <ø> (ø)`
libdd-data-pipeline	`85.97% <ø> (ø)`
libdd-data-pipeline-ffi	`71.04% <ø> (ø)`
libdd-common	`79.81% <ø> (ø)`
libdd-common-ffi	`74.41% <ø> (ø)`
libdd-telemetry	`73.34% <ø> (ø)`
libdd-telemetry-ffi	`31.36% <ø> (ø)`
libdd-dogstatsd-client	`82.64% <ø> (ø)`
datadog-ipc	`76.17% <ø> (-0.05%)`	⬇️
libdd-profiling	`81.56% <ø> (-0.04%)`	⬇️
libdd-profiling-ffi	`64.51% <ø> (ø)`
libdd-sampling	`97.46% <99.23%> (+0.21%)`	⬆️
datadog-sidecar	`29.09% <ø> (ø)`
datdog-sidecar-ffi	`9.67% <ø> (ø)`
spawn-worker	`48.86% <ø> (ø)`
libdd-tinybytes	`93.16% <ø> (ø)`
libdd-trace-normalization	`81.71% <ø> (ø)`
libdd-trace-obfuscation	`87.39% <ø> (ø)`
libdd-trace-protobuf	`68.25% <ø> (ø)`
libdd-trace-utils	`89.59% <ø> (ø)`
libdd-tracer-flare	`86.88% <ø> (ø)`
libdd-log	`74.83% <ø> (ø)`

dd-octo-sts · 2026-05-12T11:40:16Z

Artifact Size Benchmark Report

aarch64-alpine-linux-musl

Artifact	Baseline	Commit	Change
/aarch64-alpine-linux-musl/lib/libdatadog_profiling.a	81.84 MB	81.84 MB	0% (0 B) 👌
/aarch64-alpine-linux-musl/lib/libdatadog_profiling.so	7.57 MB	7.57 MB	0% (0 B) 👌

aarch64-unknown-linux-gnu

Artifact	Baseline	Commit	Change
/aarch64-unknown-linux-gnu/lib/libdatadog_profiling.so	10.01 MB	10.01 MB	0% (0 B) 👌
/aarch64-unknown-linux-gnu/lib/libdatadog_profiling.a	98.03 MB	98.03 MB	0% (0 B) 👌

libdatadog-x64-windows

Artifact	Baseline	Commit	Change
/libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.dll	24.48 MB	24.48 MB	0% (0 B) 👌
/libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.lib	79.87 KB	79.87 KB	0% (0 B) 👌
/libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.pdb	180.23 MB	180.23 MB	+0% (+8.00 KB) 👌
/libdatadog-x64-windows/debug/static/datadog_profiling_ffi.lib	913.96 MB	913.96 MB	0% (0 B) 👌
/libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.dll	7.73 MB	7.73 MB	0% (0 B) 👌
/libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.lib	79.87 KB	79.87 KB	0% (0 B) 👌
/libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.pdb	23.17 MB	23.17 MB	0% (0 B) 👌
/libdatadog-x64-windows/release/static/datadog_profiling_ffi.lib	45.36 MB	45.36 MB	0% (0 B) 👌

libdatadog-x86-windows

Artifact	Baseline	Commit	Change
/libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.dll	21.09 MB	21.09 MB	0% (0 B) 👌
/libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.lib	81.11 KB	81.11 KB	0% (0 B) 👌
/libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.pdb	184.41 MB	184.44 MB	+.02% (+40.00 KB) 🔍
/libdatadog-x86-windows/debug/static/datadog_profiling_ffi.lib	900.44 MB	900.44 MB	0% (0 B) 👌
/libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.dll	5.99 MB	5.99 MB	0% (0 B) 👌
/libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.lib	81.11 KB	81.11 KB	0% (0 B) 👌
/libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.pdb	24.81 MB	24.81 MB	0% (0 B) 👌
/libdatadog-x86-windows/release/static/datadog_profiling_ffi.lib	42.87 MB	42.87 MB	0% (0 B) 👌

x86_64-alpine-linux-musl

Artifact	Baseline	Commit	Change
/x86_64-alpine-linux-musl/lib/libdatadog_profiling.a	72.93 MB	72.93 MB	0% (0 B) 👌
/x86_64-alpine-linux-musl/lib/libdatadog_profiling.so	8.42 MB	8.42 MB	0% (0 B) 👌

x86_64-unknown-linux-gnu

Artifact	Baseline	Commit	Change
/x86_64-unknown-linux-gnu/lib/libdatadog_profiling.a	90.70 MB	90.70 MB	0% (0 B) 👌
/x86_64-unknown-linux-gnu/lib/libdatadog_profiling.so	10.06 MB	10.06 MB	0% (0 B) 👌

iunanua · 2026-05-12T11:53:38Z

+        // A concurrent inserter racing us on the same key is harmless: `put` overwrites with
+        // the same boolean result.


But on a cache miss, N concurrent inserters would each call glob_match_bytes.
I guess that's the price we have to pay.

If we really need to worry about that in the future, there are some concurrent implementations of LRU caches (https://crates.io/crates/concurrent_lru for one)
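
To make the trade-off discussed here concrete, the sketch below races several threads on the same uncached key. It is only an illustration under stated assumptions: `expensive_match` and `race_on_one_key` are hypothetical names, and a plain `HashMap` behind a `Mutex` stands in for the LRU cache. Each racing thread may compute the match, but all of them compute the same boolean, so the overwrite is harmless.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Barrier, Mutex};
use std::thread;

/// Hypothetical deterministic predicate standing in for the glob match.
/// It counts how many times it is invoked.
fn expensive_match(subject: &[u8], calls: &AtomicUsize) -> bool {
    calls.fetch_add(1, Ordering::Relaxed);
    subject.starts_with(b"web-")
}

/// Looks up the same key from `threads` threads at once and returns
/// (result seen by each thread, number of match computations, cache entries).
fn race_on_one_key(threads: usize) -> (Vec<bool>, usize, usize) {
    let cache: Mutex<HashMap<Vec<u8>, bool>> = Mutex::new(HashMap::new());
    let calls = AtomicUsize::new(0);
    let barrier = Barrier::new(threads);
    let subject: &[u8] = b"web-api";

    let results = thread::scope(|scope| {
        let mut handles = Vec::new();
        for _ in 0..threads {
            handles.push(scope.spawn(|| {
                // Start all threads together to make a race likely.
                barrier.wait();
                // The temporary guard is released at the end of this `if let`.
                if let Some(&hit) = cache.lock().unwrap().get(subject) {
                    return hit;
                }
                // Cache miss: compute without holding the lock.
                let result = expensive_match(subject, &calls);
                // Racing inserters all write the same value for this key.
                cache.lock().unwrap().insert(subject.to_vec(), result);
                result
            }));
        }
        handles
            .into_iter()
            .map(|h| h.join().unwrap())
            .collect::<Vec<bool>>()
    });

    let computed = calls.load(Ordering::Relaxed);
    let entries = cache.lock().unwrap().len();
    (results, computed, entries)
}
```

With 8 threads, the number of computations can be anywhere from 1 to 8 depending on scheduling, while every thread observes the same result and the cache ends up with a single entry.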

yannham · 2026-05-12T15:24:45Z

+        // A concurrent inserter racing us on the same key is harmless: `put` overwrites with
+        // the same boolean result.


If we really need to worry about that in the future, there are some concurrent implementations of LRU caches (https://crates.io/crates/concurrent_lru for one)

bantonsson requested a review from a team as a code owner May 12, 2026 10:37

github-actions Bot added the sampling label May 12, 2026

chatgpt-codex-connector Bot reviewed May 12, 2026

bantonsson force-pushed the ban/optimize-matching branch 2 times, most recently from 66834c7 to 6f98b4c Compare May 12, 2026 10:47

bantonsson force-pushed the ban/optimize-matching branch from 6f98b4c to b272f60 Compare May 12, 2026 11:08

iunanua reviewed May 12, 2026

Comment thread libdd-sampling/src/bounded_byte_cache.rs Outdated

iunanua reviewed May 12, 2026

Comment thread libdd-sampling/src/bounded_byte_cache.rs

bantonsson force-pushed the ban/optimize-matching branch 2 times, most recently from 75730ae to 642e445 Compare May 12, 2026 13:20

yannham approved these changes May 12, 2026

paullegranddc approved these changes May 13, 2026

Comment thread libdd-sampling/src/bounded_byte_cache.rs Outdated

bantonsson added 3 commits May 15, 2026 10:51

perf(sampling): add glob matching benchmarks

56b15fa

perf(sampling): optimize matching and bound cache memory

b0c652c

chore(sampling): pr review feedback

894e089

bantonsson force-pushed the ban/optimize-matching branch from 642e445 to 894e089 Compare May 15, 2026 08:51

gh-worker-dd-devflow-36fce6 Bot added mergequeue-status: waiting, mergequeue-status: queued, mergequeue-status: in_progress and removed mergequeue-status: waiting, mergequeue-status: queued labels May 15, 2026

gh-worker-dd-mergequeue-cf854d Bot merged commit 3d1e6c1 into main May 15, 2026
110 checks passed

gh-worker-dd-mergequeue-cf854d Bot deleted the ban/optimize-matching branch May 15, 2026 10:08

gh-worker-dd-devflow-36fce6 Bot added mergequeue-status: done and removed mergequeue-status: in_progress labels May 15, 2026

bantonsson mentioned this pull request May 15, 2026

refactor(sampling): use sampling from libdatadog [APMSP-3021] DataDog/dd-trace-rs#154

Draft

5 participants
