Metrics: add histogram bucket views per metric-name pattern #66805
Open
1fanwang wants to merge 1 commit into
Follow-up to apache#64207, which set ExponentialBucketHistogramAggregation as the instrument-type default for OTel histograms. Non-timer histogram families (*_count, *_duration, *_delay) span very different value ranges yet still inherit one bucket layout chosen at each call site, so their distributions are poorly resolved. A single value range cannot serve a millisecond latency and an hours-long delay equally well.

Closes apache#66801

Signed-off-by: 1fanwang <1fannnw@gmail.com>
OTel histogram metrics in Airflow share one bucket layout regardless of what they measure, so a metric whose useful range differs from the default gets uninformative tails: the default suits task duration, but scheduler-loop duration needs finer low-end resolution, and delays ranging from seconds to hours need coarser high-end buckets. #64207 set an exponential default per instrument type, but non-timer histogram families (*_count, *_duration, *_delay) still resolve their boundaries at each call site, so the same family can end up shaped differently depending on which module created the instrument.

This adds a declarative metric-name pattern → bucket aggregation map in shared/observability and layers the resulting OTel Views on top of the existing instrument-type baseline. Latency families get exponential buckets, counts get a small linear range, and delays get a wide range; the per-instrument-type default still applies, and the pattern views only refine it for matching names. Deployments needing a different layout pass an override dict.

Closes #66801
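To make the pattern map idea concrete, here is a minimal stdlib-only sketch of how a name-pattern map with an override dict might resolve a bucket layout. It is not the code in this PR: `BucketSpec`, `DEFAULT_PATTERNS`, `resolve_bucket_spec`, and every boundary value are hypothetical, and shell-style matching via `fnmatch` is an assumption about how patterns are compared.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class BucketSpec:
    """Hypothetical description of a bucket layout for one name pattern."""

    kind: str  # "exponential" or "explicit"
    boundaries: Tuple[float, ...] = ()  # only used when kind == "explicit"


# Hypothetical defaults; the boundary values are illustrative only.
DEFAULT_PATTERNS: Mapping[str, BucketSpec] = {
    "*_duration": BucketSpec("exponential"),
    "*_count": BucketSpec("explicit", (0, 5, 10, 15, 20, 25)),
    "*_delay": BucketSpec("explicit", (1, 60, 600, 3600, 21600, 86400)),
}


def resolve_bucket_spec(
    metric_name: str,
    overrides: Optional[Mapping[str, BucketSpec]] = None,
) -> Optional[BucketSpec]:
    """Return the spec of the first matching pattern, or None.

    Override entries replace defaults that share the same pattern key.
    None means "no pattern matched", so the caller keeps the
    instrument-type baseline.
    """
    patterns = {**DEFAULT_PATTERNS, **(overrides or {})}
    for pattern, spec in patterns.items():
        if fnmatchcase(metric_name, pattern):
            return spec
    return None
```

With these assumed defaults, `resolve_bucket_spec("task_duration")` yields the exponential spec, an unmatched name such as `"heartbeat"` yields `None`, and passing `{"*_delay": BucketSpec("explicit", (1, 10, 100))}` as `overrides` swaps only the delay layout.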
Tests
New test_histogram_buckets.py covers the default pattern map, per-pattern aggregation resolution, the custom-mapping override, and that one View is built per entry. test_otel_logger.py is updated to assert the layered shape (baseline view followed by the pattern views).

Before/after on the discriminating test

Reverting otel_logger.py to upstream/main (baseline view only, no pattern views):

With the change restored:

10 passed (full test_histogram_buckets.py plus the updated otel_logger assertion).

A standalone MeterProvider(views=build_views_for_patterns()) driving an InMemoryMetricReader confirms the SDK resolves per name end to end: task_duration collects as ExponentialHistogramDataPoint, while schedule_delay/retry_count collect as fixed-boundary HistogramDataPoint. Without the patch all three fall through to the exponential default regardless of name.