perf(sampling): optimize matching and limit cache memory [APMSP-2948] #1977
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 364249df6a
    let mut cache = self.cache.lock_or_panic();
    if let Some(&result) = cache.get(subject) {
        return result;
    }
    let result = glob_match_bytes::<true>(pattern, subject);
Avoid running glob matching while holding the cache lock
For wildcard rules with cache misses, the cache mutex is still held while glob_match_bytes runs. In concurrent sampling, one high-cardinality or adversarial subject can make every other thread using the same cloned GlobMatcher wait behind the full backtracking match, including threads that only need a cache hit. The previous implementation released the mutex before matching and re-locked only to insert, so this change can serialize rule matching under load. Consider dropping the guard before computing and re-locking for put.
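The suggestion can be sketched as follows. This is a minimal illustration, not the PR's code: `CachedMatcher` and `toy_glob_match` are hypothetical stand-ins for `GlobMatcher` and `glob_match_bytes`, the cache is a plain unbounded `HashMap` rather than an LRU, and `lock().unwrap()` stands in for `lock_or_panic()`.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical stand-in for `glob_match_bytes`; supports only `*` and `?`.
fn toy_glob_match(pattern: &[u8], subject: &[u8]) -> bool {
    match (pattern.first(), subject.first()) {
        (None, None) => true,
        (Some(&b'*'), _) => {
            // `*` either matches nothing, or consumes one subject byte and retries.
            toy_glob_match(&pattern[1..], subject)
                || (!subject.is_empty() && toy_glob_match(pattern, &subject[1..]))
        }
        (Some(&b'?'), Some(_)) => toy_glob_match(&pattern[1..], &subject[1..]),
        (Some(p), Some(s)) if p == s => toy_glob_match(&pattern[1..], &subject[1..]),
        _ => false,
    }
}

/// Hypothetical matcher with a mutex-protected result cache.
struct CachedMatcher {
    pattern: Vec<u8>,
    cache: Mutex<HashMap<Vec<u8>, bool>>,
}

impl CachedMatcher {
    fn new(pattern: &str) -> Self {
        Self {
            pattern: pattern.as_bytes().to_vec(),
            cache: Mutex::new(HashMap::new()),
        }
    }

    fn matches(&self, subject: &[u8]) -> bool {
        // Look up under the lock; the guard is dropped at the end of this block.
        {
            let cache = self.cache.lock().unwrap();
            if let Some(&hit) = cache.get(subject) {
                return hit;
            }
        }
        // The potentially slow match runs without holding the mutex.
        let result = toy_glob_match(&self.pattern, subject);
        // Re-lock only for the insert.
        self.cache.lock().unwrap().insert(subject.to_vec(), result);
        result
    }
}
```

With this shape, a thread that only needs a cache hit waits for at most a lookup or an insert, never for another thread's backtracking match.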
66834c7 to 6f98b4c
🎉 All green! ❄️ No new flaky tests detected. 🔗 Commit SHA: 894e089
Codecov Report
@@            Coverage Diff             @@
##             main    #1977      +/-   ##
==========================================
+ Coverage   72.64%   72.69%   +0.05%
==========================================
  Files         451      452       +1
  Lines       74687    74888     +201
==========================================
+ Hits        54253    54443     +190
- Misses      20434    20445      +11
6f98b4c to b272f60
Artifact Size Benchmark Report
aarch64-alpine-linux-musl
aarch64-unknown-linux-gnu
libdatadog-x64-windows
libdatadog-x86-windows
x86_64-alpine-linux-musl
x86_64-unknown-linux-gnu
    // A concurrent inserter racing us on the same key is harmless: `put` overwrites with
    // the same boolean result.
But on a cache miss, N concurrent inserters would each call glob_match_bytes for the same key.
I guess that's the price we have to pay.
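The trade-off can be made concrete with a small sketch. Everything here is hypothetical and only illustrates the pattern: `expensive_match` stands in for `glob_match_bytes`, and a plain `HashMap` stands in for the LRU cache. Several threads look up the same uncached subject; each may run the match, but all of them compute and store the same boolean.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for an expensive, deterministic match; counts its calls.
fn expensive_match(subject: &[u8], calls: &AtomicUsize) -> bool {
    calls.fetch_add(1, Ordering::Relaxed);
    subject.starts_with(b"web-")
}

/// Runs `threads` concurrent lookups of one subject.
/// Returns every thread's result and the number of times the match actually ran.
fn race_on_miss(threads: usize) -> (Vec<bool>, usize) {
    let cache: Arc<Mutex<HashMap<Vec<u8>, bool>>> = Arc::new(Mutex::new(HashMap::new()));
    let calls = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let (cache, calls) = (Arc::clone(&cache), Arc::clone(&calls));
            thread::spawn(move || {
                let subject = b"web-frontend".to_vec();
                // Check under the lock; the guard is released at the end of this statement.
                let cached = cache.lock().unwrap().get(&subject).copied();
                if let Some(hit) = cached {
                    return hit;
                }
                // Several threads may reach this point for the same key.
                let result = expensive_match(&subject, &calls);
                // Racing inserts overwrite each other with the same value.
                cache.lock().unwrap().insert(subject, result);
                result
            })
        })
        .collect();
    let results = handles.into_iter().map(|h| h.join().unwrap()).collect();
    (results, calls.load(Ordering::Relaxed))
}
```

The match count returned by `race_on_miss(n)` varies from run to run between 1 and `n`, while every thread observes the same result. The cost of the race is redundant CPU work, not incorrect output.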
If we really need to worry about that in the future, there are some concurrent implementations of LRU caches (https://crates.io/crates/concurrent_lru for one)
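For comparison, one std-only way to reduce contention is to shard the cache by key hash so that unrelated keys rarely share a mutex. The sketch below is an assumption for illustration: it does not use or imitate the concurrent_lru API, `ShardedCache` is a hypothetical name, and each shard is an unbounded `HashMap` (a real version would need a bounded LRU per shard).

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::Mutex;

/// Hypothetical sharded cache: each key maps to exactly one independently locked shard.
struct ShardedCache {
    shards: Vec<Mutex<HashMap<Vec<u8>, bool>>>,
}

impl ShardedCache {
    fn new(shard_count: usize) -> Self {
        assert!(shard_count > 0, "at least one shard is required");
        let shards = (0..shard_count)
            .map(|_| Mutex::new(HashMap::new()))
            .collect();
        Self { shards }
    }

    /// Picks the shard for a key from its hash.
    fn shard_for(&self, key: &[u8]) -> &Mutex<HashMap<Vec<u8>, bool>> {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        &self.shards[(hasher.finish() as usize) % self.shards.len()]
    }

    fn get(&self, key: &[u8]) -> Option<bool> {
        self.shard_for(key).lock().unwrap().get(key).copied()
    }

    fn put(&self, key: &[u8], value: bool) {
        self.shard_for(key)
            .lock()
            .unwrap()
            .insert(key.to_vec(), value);
    }
}
```

Sharding only lowers the chance that two threads contend on the same lock; it does not stop two threads from computing the same missing key at once.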
75730ae to 642e445
642e445 to 894e089
What does this PR do?
Optimizes the glob matching and limits the memory taken up by the LRU match cache.
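As a rough illustration of what bounding cache memory can look like, the sketch below caps both the number of entries and the length of any cached key, so key storage is at most about `max_entries * max_key_len` bytes. This is an assumption about one possible approach, not the PR's implementation: `BoundedCache`, `max_entries`, and `max_key_len` are hypothetical names, and the linear-time recency update is chosen for brevity rather than speed.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical LRU cache bounded by entry count and key length.
struct BoundedCache {
    max_entries: usize,
    max_key_len: usize,
    map: HashMap<Vec<u8>, bool>,
    /// Keys in recency order; the front is the least recently used.
    order: VecDeque<Vec<u8>>,
}

impl BoundedCache {
    fn new(max_entries: usize, max_key_len: usize) -> Self {
        Self {
            max_entries,
            max_key_len,
            map: HashMap::new(),
            order: VecDeque::new(),
        }
    }

    fn get(&mut self, key: &[u8]) -> Option<bool> {
        let hit = self.map.get(key).copied()?;
        self.touch(key);
        Some(hit)
    }

    fn put(&mut self, key: &[u8], value: bool) {
        // Oversized keys are never cached, so one huge subject cannot bloat the cache.
        if key.len() > self.max_key_len || self.max_entries == 0 {
            return;
        }
        if self.map.insert(key.to_vec(), value).is_some() {
            // Existing key: refresh its recency, no eviction needed.
            self.touch(key);
            return;
        }
        self.order.push_back(key.to_vec());
        if self.map.len() > self.max_entries {
            // Evict the least recently used entry.
            if let Some(oldest) = self.order.pop_front() {
                self.map.remove(&oldest);
            }
        }
    }

    /// Moves `key` to the most recently used position (linear scan).
    fn touch(&mut self, key: &[u8]) {
        if let Some(pos) = self.order.iter().position(|k| k.as_slice() == key) {
            if let Some(k) = self.order.remove(pos) {
                self.order.push_back(k);
            }
        }
    }
}
```

Subjects that are too long to cache are still matched normally by the caller; they just pay the match cost on every lookup.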
Motivation
A follow-up to #1927 that addresses the larger performance questions raised there and bounds the memory used by the LRU cache, an issue raised in a security review.
How to test the change?
Unit tests and benchmarks are in the PR.