[SVLS-8979] Add CloudFormation template for Lambda Durable Function event forwarder #330

Merged
gh-worker-dd-mergequeue-cf854d[bot] merged 7 commits into master from yiming.luo/durable-event-forwarder
Jun 25, 2026

@lym953 commented Jun 23, 2026

Context

To capture TIMED_OUT and STOPPED status of Lambda durable function executions, we need to capture the status change events in EventBridge and forward them to Datadog. This involves three changes:

  1. AWS client-side integrations: an EventBridge rule to capture the events, and Firehose to batch them and send them to Datadog as logs. The CloudFormation template will be published to S3 so customers can install it with one click, the same way they install the Datadog Forwarder. (This PR)
  2. Logs pipeline on the Datadog side: transforms the logs (https://github.com/DataDog/integrations-internal-core/pull/3389)
  3. Datadog's Lambda UI: consumes the logs

This template was originally proposed in DataDog/datadog-serverless-functions#1149. Per team guidance, this repo (cloudformation-template) is the correct place to host the template, so the PR has been re-created here following this repo's conventions (a top-level aws_<name>/ directory with the template and README.md). Publishing the template to S3 will be handled separately in a different repo, so this PR contains no release script.

Architecture

[Image: architecture diagram]

This is Option 4.3 in the design doc. See the doc for why we need to capture the status change events.

Changes

  • Add aws_durable_function_event_forwarder/durable_function_event_forwarder.yaml — the CloudFormation template for the AWS-side resources
  • Add aws_durable_function_event_forwarder/README.md

Params of the CloudFormation template

  • Datadog API key, which can be supplied in one of three ways:
    • Plaintext
    • Secrets Manager secret ARN
    • SSM SecureString parameter name
  • Datadog site. Defaults to datadoghq.com.
  • Statuses to forward. Defaults to empty, which forwards all statuses: RUNNING,SUCCEEDED,FAILED,TIMED_OUT,STOPPED.
  • Function ARN filters. Can be either the unqualified ARN for a single function, e.g. arn:aws:lambda:us-east-2:425362996713:function:my-durable-function, or a wildcard pattern, e.g. arn:aws:lambda:us-east-2:425362996713:function:my-durable-*.
    • Up to 5 filters are supported. If there is demand, we can consider adding more. However, I expect most customers to leave them empty so that the stack covers all durable functions in the region.
  • Firehose buffer interval, defaults to 60 seconds. This is the interval at which events are sent to Datadog.
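
As a rough illustration of how the status and function ARN parameters could translate into an EventBridge event pattern, here is a minimal Python sketch. It is not taken from the template: the `source` and `detail-type` values, the `detail` field names (`status`, `functionArn`), and the helper name `build_event_pattern` are all assumptions made for the example. The only EventBridge features it relies on are exact-value matching and the `wildcard` operator.

```python
ALL_STATUSES = ["RUNNING", "SUCCEEDED", "FAILED", "TIMED_OUT", "STOPPED"]
MAX_ARN_FILTERS = 5  # the PR description mentions up to 5 filters


def build_event_pattern(statuses="", arn_filters=()):
    """Return an EventBridge event pattern as a dict.

    statuses: comma-separated status list; empty means "forward all".
    arn_filters: function ARNs or wildcard patterns; empty entries are ignored.
    """
    selected = [s.strip() for s in statuses.split(",") if s.strip()]
    unknown = sorted(set(selected) - set(ALL_STATUSES))
    if unknown:
        raise ValueError(f"unknown statuses: {unknown}")

    filters = [f for f in arn_filters if f]
    if len(filters) > MAX_ARN_FILTERS:
        raise ValueError(f"at most {MAX_ARN_FILTERS} ARN filters are supported")

    detail = {}
    if selected:
        # Assumed field name; the real event may name this differently.
        detail["status"] = selected
    if filters:
        # Patterns containing "*" use the wildcard operator; others match exactly.
        # "functionArn" is an assumed field name.
        detail["functionArn"] = [
            {"wildcard": f} if "*" in f else f for f in filters
        ]

    pattern = {
        # Both values below are assumptions, not confirmed by the PR.
        "source": ["aws.lambda"],
        "detail-type": ["Durable Execution Status Change"],
    }
    if detail:
        pattern["detail"] = detail
    return pattern
```

With empty inputs the sketch returns a pattern with no `detail` clause, which corresponds to forwarding every status for every durable function in the region.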

Next steps

  • Enable releasing to the Datadog Prod account

Test plan

Steps

  • Upload the CloudFormation template to S3 in Datadog Serverless Sandbox by running, from the aws_durable_function_event_forwarder/ directory:
  aws-vault exec sso-serverless-sandbox-account-admin -- \
    aws s3 cp durable_function_event_forwarder.yaml \
    s3://datadog-cloudformation-template-serverless-sandbox/aws/lambda-durable-function-event-forwarder/latest.yaml \
    --content-type text/yaml

Result

After a few minutes, the logs appeared in Datadog.

Raw logs, without the new logs pipeline on the Datadog side that transforms the durable execution event logs: (query)
[Image: raw logs in Datadog]

Processed logs (with the logs pipeline added in https://github.com/DataDog/integrations-internal-core/pull/3389): (query)
[Image: processed logs in Datadog]

…vent forwarder

Captures AWS Lambda Durable Function execution status change events from
EventBridge and forwards them to the Datadog HTTP intake via Amazon Data
Firehose. Records arrive at Datadog as the raw EventBridge envelope;
reshaping is handled by the Datadog-side logs pipeline.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@datadog-system-tests-org

📊 dbt CICD (full report)

Impact Lineage

⚠️ Analysis failed. An error occurred while computing the impact lineage for this PR.

Error details

Error fetching impact lineage results.

Drift Detection

⚠️ Drift detection failed. An error occurred while computing prod-vs-CI tests for this PR.

Error details

Error fetching drift detection results.

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 812c224

Keep the README user-facing: drop CloudFormation-internal notes
(NoEcho, {{resolve:...}} dynamic references, Rules.ApiKeyRequired,
Mappings.Constants) and the Firehose-URL aside from the parameter
and output tables.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@lym953 lym953 requested review from Copilot and jchrostek-dd June 24, 2026 17:45
@lym953 lym953 marked this pull request as ready for review June 24, 2026 17:45
@lym953 lym953 requested a review from a team as a code owner June 24, 2026 17:45

Copilot AI left a comment

Pull request overview

Adds a new, customer-installable CloudFormation template (plus documentation) to capture AWS Lambda Durable Function execution status change events from EventBridge and forward them to Datadog via Kinesis Data Firehose, with failed-record backup to S3.

Changes:

  • Introduces aws_durable_function_event_forwarder/durable_function_event_forwarder.yaml to provision EventBridge rule, Firehose delivery stream, IAM roles, and S3 backup bucket.
  • Adds aws_durable_function_event_forwarder/README.md explaining architecture, parameters, outputs, and the raw forwarded event shape.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| aws_durable_function_event_forwarder/durable_function_event_forwarder.yaml | New CloudFormation template provisioning EventBridge → Firehose → Datadog forwarding with S3 failed-record backup. |
| aws_durable_function_event_forwarder/README.md | Documentation for deploying/configuring the template and understanding the forwarded payload. |

Comment thread aws_durable_function_event_forwarder/durable_function_event_forwarder.yaml Outdated
Comment thread aws_durable_function_event_forwarder/README.md Outdated
- Description: drop "transforms into Datadog log documents" — the stack
  forwards raw EventBridge envelopes unchanged.
- Add Rules.ApiKeyExclusive so exactly one API key source is enforced
  (previously multiple could be set, with plaintext silently winning).
- Expire S3 backup records after 30 days instead of retaining forever.
- README: minor grammar fix.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
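
The exclusivity check described in the commit above amounts to an exactly-one-of test across the three API key parameters. The following minimal Python sketch illustrates that logic only; it is not the CloudFormation rule itself, and the function and argument names are hypothetical.

```python
def pick_api_key_source(plaintext="", secret_arn="", ssm_parameter_name=""):
    """Return which single API key source was provided.

    Raises ValueError when zero or more than one source is set, so that no
    source can silently take precedence over another.
    """
    sources = {
        "plaintext": plaintext,
        "secrets_manager": secret_arn,
        "ssm": ssm_parameter_name,
    }
    # A source counts as provided only if it is non-empty after trimming.
    provided = [name for name, value in sources.items() if value.strip()]
    if len(provided) != 1:
        raise ValueError(
            f"exactly one API key source must be set, got {len(provided)}: {provided}"
        )
    return provided[0]
```

Rejecting the ambiguous case outright, rather than choosing a winner, matches the intent stated in the commit: previously multiple sources could be set, with plaintext silently winning.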
Make serverless-aws and serverless-onboarding-enablement owners of
aws_durable_function_event_forwarder/.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@lym953 lym953 requested review from a team and TalUsvyatsky June 25, 2026 15:28
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Leave serverless-aws as the sole owner of
aws_durable_function_event_forwarder/.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot merged commit 3c86ac8 into master Jun 25, 2026
5 checks passed
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot deleted the yiming.luo/durable-event-forwarder branch June 25, 2026 15:59
4 participants