Merged
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -36,6 +36,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
  ([#1809](https://github.com/open-telemetry/opentelemetry-python/pull/1809))
- Fixed sequence values in OTLP exporter not translating
([#1818](https://github.com/open-telemetry/opentelemetry-python/pull/1818))
- Update transient errors retry timeout and retryable status codes
([#1842](https://github.com/open-telemetry/opentelemetry-python/pull/1842))

### Removed
- Moved `opentelemetry-instrumentation` to contrib repository.
@@ -262,14 +262,11 @@ def _translate_data(
pass

def _export(self, data: TypingSequence[SDKDataT]) -> ExportResultT:

max_value = 64
Contributor

Would it be possible to make this configurable? For example, as an `export_timeout` (maximum duration for exporting spans) argument to the exporter?

My use case is CLI applications, where having them hang on exit for 64 seconds because a telemetry backend is down is unacceptable. For comparison, the Go implementation has two separate timeouts: `ExportTimeout` for the full operation and `BatchTimeout` for the direct publish.

For now I have a special subclass with a copy of this method to tweak the behavior, but it's a maintenance issue.
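One shape such a workaround can take (hypothetical sketch: `export_with_deadline` and its names are invented here, not part of the SDK) is to impose a hard deadline from outside the exporter so process exit is never blocked for the full retry window:

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def export_with_deadline(export_fn, batch, timeout_s=5.0):
    """Run export_fn(batch) but stop waiting after timeout_s seconds.

    Hypothetical illustration of capping shutdown latency: the retry
    loop may keep running in the worker thread, but the caller (e.g. a
    CLI on exit) stops blocking on it.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        future = pool.submit(export_fn, batch)
        try:
            return future.result(timeout=timeout_s)
        except TimeoutError:
            return "FAILURE"  # deadline hit: give up instead of hanging
    finally:
        pool.shutdown(wait=False)  # do not join the worker thread here
```

This only caps how long the caller waits; it does not cancel the in-flight RPC, which is the separate concern raised further down in this thread.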

Member Author

I will bring up the topic of supporting configurability in the next SIG meeting. The spec is not clear (at this point) in that regard, and anything we do might become a breaking change in the future. Speaking of `ExportTimeout` and `BatchTimeout`: Python also has them, as `schedule_delay_millis` and `export_timeout_millis`, though I don't know why we enforce the export timeout only during force flush.
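For reference, the two Python-side knobs mentioned are constructor arguments on the SDK's `BatchSpanProcessor`. A configuration sketch (values are illustrative, and `ConsoleSpanExporter` stands in for the OTLP exporter):

```python
# Configuration sketch only: schedule_delay_millis controls how often a
# batch is published; export_timeout_millis bounds a single export call.
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import (
    BatchSpanProcessor,
    ConsoleSpanExporter,
)

provider = TracerProvider()
provider.add_span_processor(
    BatchSpanProcessor(
        ConsoleSpanExporter(),
        schedule_delay_millis=5000,   # roughly Go's BatchTimeout
        export_timeout_millis=30000,  # roughly Go's ExportTimeout
    )
)
```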

Member Author


Talked about this in the SIG call. This is out of scope for this PR. The consensus was that this is not clear from the spec, so we might end up making a backwards-incompatible change by introducing support for configuring this. One thing we want to explore is making the retry async and non-blocking, and also looking at how the span processor timeout behaves. Please create another issue and we will track it there.

# expo returns a generator that yields delay values which grow
# exponentially. Once delay is greater than max_value, the yielded
# value will remain constant.
# max_value is set to 900 (900 seconds is 15 minutes) to use the same
# value as used in the Go implementation.

max_value = 900

for delay in expo(max_value=max_value):

if delay == max_value:
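The capped exponential back-off the comment describes can be sketched as a standalone generator (the real code imports `expo` from the `backoff` library; this re-implementation only illustrates the behavior and is not the library's code):

```python
import itertools


def expo(base=2, factor=1, max_value=None):
    """Yield factor * base**n, clamping at max_value once exceeded.

    Illustrative re-implementation of the backoff library's expo
    generator, for the shape of the delays only.
    """
    n = 0
    while True:
        value = factor * base ** n
        if max_value is None or value < max_value:
            yield value
            n += 1
        else:
            yield max_value


# With max_value=900 the delays double until they hit the cap:
print(list(itertools.islice(expo(max_value=900), 12)))
# → [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 900, 900]
```

Because the generator yields `max_value` itself once the cap is reached, the exporter's `if delay == max_value:` check is how it detects that retries are exhausted and gives up.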
@@ -289,8 +286,6 @@ def _export(self, data: TypingSequence[SDKDataT]) -> ExportResultT:
if error.code() in [
StatusCode.CANCELLED,
StatusCode.DEADLINE_EXCEEDED,
Contributor

Aren't cancelled and deadline-exceeded errors raised by the client itself? I'm not 100% sure about deadline exceeded, but I think cancelled should not be retried, as it means (if I understand correctly) that the client decided to cancel the request, for example because it received some sort of shutdown/exit signal.

Member Author


It is typically the client, but I believe this is allowed to cover the case when the server does it: https://grpc.io/docs/what-is-grpc/core-concepts/#cancelling-an-rpc.

Contributor
@owais owais May 13, 2021

I see. Not a blocker for me then, but I wonder if there is a simple way for the exporter to cancel an in-flight request on shutdown (with or without a timeout)?

Member Author


Ah, I understand your concern. Anyway, we have some work to do in making shutdown spec-compliant; we can address this there.

StatusCode.PERMISSION_DENIED,
StatusCode.UNAUTHENTICATED,
StatusCode.RESOURCE_EXHAUSTED,
StatusCode.ABORTED,
StatusCode.OUT_OF_RANGE,
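The retry decision in `_export` boils down to set membership: an RPC error is retried only if its status code is in a fixed set of codes treated as transient. A dependency-free sketch (plain strings stand in for `grpc.StatusCode` members, and only the codes visible in this hunk are listed — the diff view truncates the rest):

```python
# Codes visible in this hunk; the real list continues past the fold.
RETRYABLE_STATUS_CODES = frozenset({
    "CANCELLED",
    "DEADLINE_EXCEEDED",
    "PERMISSION_DENIED",
    "UNAUTHENTICATED",
    "RESOURCE_EXHAUSTED",
    "ABORTED",
    "OUT_OF_RANGE",
})


def should_retry(status_code_name: str) -> bool:
    """Return True if this gRPC status code name is treated as transient."""
    return status_code_name in RETRYABLE_STATUS_CODES


print(should_retry("ABORTED"), should_retry("INVALID_ARGUMENT"))
# → True False
```

Any code outside the set (for example `INVALID_ARGUMENT`) causes the exporter to fail the batch immediately instead of sleeping and retrying.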