Disable flushing metrics in thread#120
Merged
Merged
Conversation
ed017fa to
64dda84
Compare
nhinsch
reviewed
Mar 29, 2021
nhinsch
approved these changes
Mar 29, 2021
Contributor
nhinsch
left a comment
There was a problem hiding this comment.
Looks good, I would just suggest we de-emphasize the setting more, either be removing it completely, renaming it, and/or hiding it, to avoid confusion, since it is discouraged/potentially broken to flush metrics in the background.
64dda84 to
f558fe2
Compare
f558fe2 to
aaad49b
Compare
14 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What does this PR do?
Disable
flushing_in_threadofthreadstatsby default. Previously, for synchronous metric submission (without piggybacking on the forwarder and logs), thethreadstatsis configured to flush the metrics both at the end of the invocation (blocking) and every 10s in a background thread (non-blocking). This PR disables the periodic background flushing.This change is a soft breaking change, though it shouldn't result in any data loss, it does delay synchronous metric submission for long-running Lambda functions (duration > 10s). In the worst case, metrics submitted at the beginning of an invocation could be delayed for showing up in Datadog for 15 min (the max Lambda duration allowed).
For affected functions, the behavior could be re-enabled by setting env var
DD_FLUSH_IN_THREADtotrue. However, given the problematic nature of this flushing behavior, I would recommend submitting metrics through the new Datadog Lambda extension instead.Datadog Forwarder relies on the
datadog-lambdato flush metrics, so thedatadog-lambdaversion needs to be bumped to get the issue fixed for the Datadog Forwarder DataDog/datadog-serverless-functions#436Also fixed the broken and flaky integration tests.
Motivation
A bug has been recently identified (see sample logs below). If there happen to be a background flushing in progress, the end-of-invocation flushing will be skipped (because there is already an in-flight flushing). However, if the background flushing doesn't finish before the AWS freezing the execution environment, it will resume at the beginning of the next invocation. There are two problems with it. First, if AWS decides to destroy the current execution environment, this request will never complete. Second, if the next invocation only happens a few min later (e.g., a cronjob), the request will fail due to connection timeout.
In short, the background thread flushing isn't suitable for Lambda and should be disabled.
Testing Guidelines
Additional Notes
Types of Changes
Check all that apply