Span Metrics Connector support for extrapolated metrics from tracestate `ot.th` #45539

### Component(s)

connector/spanmetricsconnector

### Is your feature request related to a problem? Please describe.

[OTEP 235](https://github.com/open-telemetry/oteps/blob/main/text/trace/0235-sampling-threshold-in-trace-state.md) describes the new tracestate encoding for sampling probability. Several built-in samplers, such as TraceIdRatioBased, will support it.

An `adjusted_count` can be derived from the `ot.th` key in `tracestate`. Its value can be fractional rather than an integer. In the current spanmetricsconnector, each span counts as exactly one: its duration contributes one data point to the latency histogram, and each span event it carries also counts as one. (Span events might be deprecated, so we may not need to support them. However, that is a separate issue and beyond the scope here.)
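As an illustration, the derivation could look like the sketch below. It follows our reading of OTEP 235: `th` carries 1 to 14 lowercase hex digits with trailing zeros dropped, which denote a rejection threshold `T` out of `2^56`, so the sampling probability is `(2^56 - T) / 2^56` and `adjusted_count` is its reciprocal. The sketch is in Python for brevity rather than the connector's Go, and the function names are hypothetical, not an existing API.

```python
from typing import Optional

MAX_THRESHOLD = 1 << 56  # 14 hex digits = 56 bits of threshold precision
HEX_DIGITS = set("0123456789abcdef")


def th_from_tracestate(tracestate: str) -> Optional[str]:
    """Return the `th` sub-value of the `ot` tracestate member, if present."""
    for member in tracestate.split(","):
        key, _, value = member.strip().partition("=")
        if key != "ot":
            continue
        for field in value.split(";"):
            name, _, field_value = field.partition(":")
            if name == "th":
                return field_value
    return None


def adjusted_count_from_th(th: str) -> float:
    """Convert a `th` value into the number of spans one sampled span represents."""
    if not 1 <= len(th) <= 14 or not set(th) <= HEX_DIGITS:
        raise ValueError(f"invalid th value: {th!r}")
    # Restore the dropped trailing zeros, then read the rejection threshold.
    threshold = int(th.ljust(14, "0"), 16)
    return MAX_THRESHOLD / (MAX_THRESHOLD - threshold)
```

Under these assumptions, `th:8` (50% sampling) gives an `adjusted_count` of 2.0, `th:c` (25%) gives 4.0, `th:0` (100%) gives 1.0, and `th:4` (75%) gives the fractional value 4/3.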

The data structures currently in use also support only integral counts. To support fractional counts, we propose stochastic rounding of the `adjusted_count`: round up with probability equal to the fractional part, and round down otherwise. In addition, attributes will indicate whether the metrics are extrapolated. Note that stochastic rounding may introduce discrepancies when the sample size is small. However, it is statistically accurate when the data set is large enough (TODO: define "large enough"). In this regard, its statistical nature is no different from that of the metrics and histograms themselves.
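A minimal sketch of stochastic rounding, in Python for brevity rather than the connector's Go. The function name and the injectable `rng` parameter are assumptions made for illustration only.

```python
import math
import random


def stochastic_round(adjusted_count: float, rng=random) -> int:
    """Round to a neighbouring integer so that the expected value is unchanged.

    Rounds up with probability equal to the fractional part, down otherwise.
    """
    whole = math.floor(adjusted_count)
    fraction = adjusted_count - whole
    return whole + (1 if rng.random() < fraction else 0)
```

With an integral input such as 4.0 the fractional part is zero, so the result is always 4. With 4/3 the result is 1 about two thirds of the time and 2 about one third of the time, so the average over many spans approaches 4/3.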

We will also advise users to choose sampling rates that are the reciprocal of an integral `adjusted_count`. In that case, stochastic rounding reduces to a no-op. However, sampling rates whose reciprocal is fractional, such as `3/4` (an `adjusted_count` of `4/3`), will be supported and will be statistically correct.

Estimating the standard error is possible, but it is a secondary priority and comes at the expense of the extra floating-point operations that stochastic rounding aims to save. We expect users' sampling choices to result in statistically significant samples. This is also a basic requirement for using sampling, and metrics derived from samples, in the first place.

### Describe the solution you'd like

We want to use stochastic rounding in order to preserve efficient integer operations.
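To show how the integer paths could stay intact, here is a rough sketch in Python (not the connector's Go code). `IntHistogram` and its fields are hypothetical stand-ins for the connector's internal structures, and the bucket layout assumes explicit upper-inclusive bounds.

```python
import bisect
import math
import random
from dataclasses import dataclass, field
from typing import List


@dataclass
class IntHistogram:
    """Explicit-bounds histogram whose counts remain integers."""

    bounds: List[float]
    bucket_counts: List[int] = field(default_factory=list)
    count: int = 0
    total: float = 0.0

    def __post_init__(self) -> None:
        if not self.bucket_counts:
            self.bucket_counts = [0] * (len(self.bounds) + 1)

    def record(self, latency_ms: float, adjusted_count: float, rng=random) -> None:
        # Stochastic rounding: the only floating-point step on the count path.
        whole = math.floor(adjusted_count)
        weight = whole + (1 if rng.random() < adjusted_count - whole else 0)
        # Everything below is the usual integer update, scaled by `weight`.
        index = bisect.bisect_left(self.bounds, latency_ms)
        self.bucket_counts[index] += weight
        self.count += weight
        self.total += weight * latency_ms
```

A span with `adjusted_count` 4.0 and a latency of 5 ms adds 4 to the first bucket and to `count`, and 20.0 to `total`. A span with `adjusted_count` 4/3 adds either 1 or 2.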

### Describe alternatives you've considered

We considered using floating-point counts. However, this introduces extra overhead and contention for the FPU.

We also considered keeping the fractional `adjusted_count` alongside the integer metric counters (sum, count) and histograms. However, this requires extra complexity to support non-uniform or dynamically changing sampling rates.

### Additional context

_No response_

### Tip

<sub>[React](https://github.blog/news-insights/product-news/add-reactions-to-pull-requests-issues-and-comments/) with 👍 to help prioritize this issue. Please use comments to provide useful context, avoiding `+1` or `me too`, to help us triage it. Learn more [here](https://opentelemetry.io/community/end-user/issue-participation/).</sub>
