feat(sdk): implement exporter metrics #4976
anuraaga wants to merge 9 commits into open-telemetry:main
Conversation
```python
class ExporterMetrics:
    def __init__(
        self,
        component_type: str,
```
Maybe just make the component type the original enum and convert it to a string in the constructor.
Suggested change:
```diff
-        component_type: str,
+        component_type: OtelComponentTypeValues,
```
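A minimal sketch of the enum-in-constructor idea; the enum members below are made up for illustration and are not the actual semconv values:

```python
from enum import Enum


class OtelComponentTypeValues(Enum):
    # hypothetical members for illustration only
    OTLP_GRPC_SPAN_EXPORTER = "otlp_grpc_span_exporter"
    OTLP_HTTP_SPAN_EXPORTER = "otlp_http_span_exporter"


class ExporterMetrics:
    def __init__(self, component_type: OtelComponentTypeValues) -> None:
        # accept the enum at the boundary, store the string value internally
        self._component_type = component_type.value
```

Callers pass the enum, so arbitrary strings are rejected by type checkers, while the internal attribute stays a plain string.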
It seemed nice to keep the flexibility to use a custom name in tests like
But that could also just be a semconv value; it wouldn't match the actual component, though. Let me know if that seems worth it.
Seconded: using the enum as the type would limit unexpected usage.
Also, I think we already switched to using it in the last PR?
We switched in the public API, but this one is a private API (we didn't update the private API last time either). Do you think we should update them too?
```python
    def __init__(
        self,
        component_type: str,
        signal: Literal["span", "log", "metric_data_point"],
```
Can we make this more consistent with other places in the codebase? For example, in `opentelemetry-sdk/src/opentelemetry/sdk/_configuration/__init__.py` we use:
Suggested change:
```diff
-        signal: Literal["span", "log", "metric_data_point"],
+        signal: Literal["traces", "metrics", "logs"],
```
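One way to keep the literals consistent across modules is a shared alias; a sketch, where the alias and function names are hypothetical:

```python
from typing import Literal

# hypothetical shared alias so all call sites agree on the same signal names
SignalType = Literal["traces", "metrics", "logs"]


def signal_attribute(signal: SignalType) -> str:
    # illustrative only: derive an attribute value from the signal name
    return f"otel.sdk.exporter.{signal}"
```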
```python
        self._exported = create_exported(meter)
        self._duration = create_otel_sdk_exporter_operation_duration(meter)

    def start_export(
```
Could we instead just have three methods `on_start`, `on_end`, and `on_error` (similar to the logging implementation) and use an instance variable or `ContextVar` for tracking the duration?
I'm not sure that would simplify things, especially since exceptions are suppressed by the exporter implementations (otherwise a context manager would work great). `ContextVar` is good for black-box code but seems unnecessary when we are instrumenting the SDK itself. We could return a class, but it would be very similar to the `Callable`. I can do that if it seems better, though.
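For comparison, a minimal sketch of the `on_start`/`on_end` alternative with a `ContextVar` holding the start time; class and method names here are illustrative, not the SDK's API:

```python
import time
from contextvars import ContextVar

# the ContextVar carries the start time so the duration survives across the
# exporter call without threading it through return values
_start_time: ContextVar[float] = ContextVar("export_start_time")


class ExporterMetricsSketch:
    def __init__(self) -> None:
        self.durations: list[float] = []

    def on_start(self) -> None:
        _start_time.set(time.monotonic())

    def on_end(self) -> None:
        # record the elapsed time for the export that started in this context
        self.durations.append(time.monotonic() - _start_time.get())
```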
Does using a context manager as a mutable object work to record error info? We can use `finally` to do the metric work after `return` is called in `export`:

```python
with self._metrics.export_operation(self._count_data(data)) as result:
    ...
    result.error = error
    result.error_attrs = {RPC_RESPONSE_STATUS_CODE: error.code().name}
```

and then in `ExporterMetrics`:

```python
@contextmanager
def export_operation(self, num_items: int):
    result = ExportResult()
    try:
        yield result
    finally:
        ...
        duration_attrs = result.error_attrs if result.error_attrs else _standard_attrs
        self._duration.record(duration, duration_attrs)
```

`start_export` -> `finish_export` would probably work in practice, because the operations in finish are fine to leave dangling if we don't call them in every case (like if the exporter shuts down during an export), but design-wise it seems dangerous to me.
Thanks for the idea! I'll give that a try
```python
        elif endpoint.scheme == "http":
            port = 80

        component_type = (component_type or OtelComponentTypeValues("unknown_otlp_exporter")).value
```
@xrmx I realized one reason I had it as a string was to keep `component_type` an optional parameter for API-evolution purposes, even though in practice we always set it. But I guess this might be ok. I could also assert that it is set, but it didn't seem worth it.
Description
I am helping to implement the SDK internal metrics defined in
https://opentelemetry.io/docs/specs/semconv/otel/sdk-metrics/
This PR implements the exporter metrics.
Similar PR in JS: open-telemetry/opentelemetry-js#6480
/cc @xrmx to help with review. As there are many exporters, this is a pretty big one.
Type of change
How Has This Been Tested?
Does This PR Require a Contrib Repo Change?
Checklist: