feat: per-alarm action and severity overrides #1880
Open
sai-ray wants to merge 15 commits into
Allow customers to override action wiring for specific alarms via the new `alarmOverrides` prop on `ConstructHub`. Each entry, keyed by the alarm's construct path, can set `severity` (route the alarm to a different bucket's action) and/or `actions` (supply custom actions that bypass the buckets entirely).
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Customer-facing change: override keys are now the alarm's CloudWatch display
name (e.g. 'Sources/NpmJs/Canary/NotRunningOrFailing') — the same string
visible in tickets and the CloudWatch console. Severity is now a plain string
('HIGH' | 'MEDIUM' | 'LOW') instead of an enum import.
Drops the AlarmPath enum; lookup reads `(alarm.node.defaultChild as CfnAlarm |
CfnCompositeAlarm).alarmName` and strips the ConstructHub prefix. Unknown
override keys are surfaced as synth-time validation errors.
Also wires up the dev-app to exercise all three override flavors so changes
can be verified end-to-end with `yarn dev:synth`.
The dev-app is the golden snapshot, so test scaffolding shouldn't ship in it. Manual verification was done locally, so there is no need to commit the topics and `alarmOverrides` example into the canonical dev deployment.
These three alarms were the only ones in the construct tree without an explicit `alarmName`, which meant CloudFormation generated opaque hash names (e.g. WebAppExpirationMonitorACMAlarm12ABC34D-X7K9P2QRSTUV) and they couldn't be targeted via `alarmOverrides`. Setting an explicit name gives them readable ticket titles and makes them override-able like the rest.
- AlarmOverride.severity uses the AlarmSeverity enum (consistent with the rest of the API; drops the string-literal union and severityFromString helper)
- align MonitoringProps and ConstructHubProps jsdoc on "CloudWatch display name" terminology
- throw at synth time when a registered alarm has no explicit alarmName, so a missing name doesn't silently make it un-overridable
- drop the `this.node.scope!` non-null assertion in favor of a safe fallback
- reword AlarmOverride.actions doc to handle the multi-action case
- add tests for default-path (alarm with name, no override) and missing-name (synth fails)
1e21b78 to
514a44e
- guard `actions: []` so an empty array falls back to the bucket action instead of silently muting the alarm
- collapse the redundant cast on `alarm.node.defaultChild` and document the literal-alarmName-only requirement
- only run the missing-alarmName validation when `alarmOverrides` is non-empty, so subclasses with anonymous alarms aren't rejected
- point the unmatched-key error at `ConstructHubProps.alarmOverrides`
- drop the extra trailing newline
- add a test for the `actions: []` fallback
- ConstructHubProps.alarmOverrides: severity is `AlarmSeverity.HIGH/MEDIUM/LOW`, not the string-literal union from an earlier draft
- AlarmOverride.actions: document that an empty array falls back to the bucket action rather than silently muting the alarm
Fixes #1599
This PR adds `alarmOverrides` on `ConstructHub` so customers can change what fires for a specific alarm at deploy time.

Today every alarm is locked into one of three severity buckets and each bucket has one action. Customers can't reclassify a single alarm or attach a custom action to one. The existing `AlarmSeverities` interface only covers two named alarms.

Example usage
This is the actual stack used to verify the feature against a real AWS account; every override flavor is exercised below.
The actual `aws cloudwatch describe-alarms` output after deploy:

| Alarm | Override |
| --- | --- |
| `Sources/NpmJs/Canary/NotRunningOrFailing` | `severity: LOW` |
| `PackageStats/Failures` | `severity: HIGH` |
| `Ingestion/DLQNotEmpty` | `actions: [custom]` |
| `Sources/NpmJs/Canary/StaleCanaryPackage` | `severity: HIGH, actions: [custom]` |
| `VersionTracker/NotRunning` | `actions: []` |
| `Ingestion/Failure` | (control) |
| `VersionTracker/Failures` | (control) |

The two control rows (no override) prove the override is what's changing the wiring, not a fixed rule.
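The rows above can be read as the outcome of a small resolution rule. The following is a minimal sketch of that rule, not the code in this PR: `resolveActions`, the string-valued `Action` type, the bucket action names, and the default severities passed in are all hypothetical stand-ins, and letting non-empty `actions` win when `severity` is also set is an assumption based on the statement that custom actions bypass the buckets.

```typescript
// Stand-ins for the PR's types. The real `AlarmSeverity` is an enum and real
// actions are CloudWatch alarm actions; plain strings keep this sketch runnable.
const AlarmSeverity = { HIGH: "HIGH", MEDIUM: "MEDIUM", LOW: "LOW" } as const;
type AlarmSeverity = (typeof AlarmSeverity)[keyof typeof AlarmSeverity];
type Action = string;

interface AlarmOverride {
  readonly severity?: AlarmSeverity;
  readonly actions?: readonly Action[];
}

// Hypothetical helper (not part of the PR): choose the actions for one alarm.
function resolveActions(
  key: string,
  defaultSeverity: AlarmSeverity,
  bucketActions: Readonly<Record<AlarmSeverity, Action>>,
  overrides: Readonly<Record<string, AlarmOverride>>,
): readonly Action[] {
  const override = Object.prototype.hasOwnProperty.call(overrides, key)
    ? overrides[key]
    : undefined;

  // Non-empty custom actions bypass the buckets entirely.
  if (override?.actions !== undefined && override.actions.length > 0) {
    return override.actions;
  }

  // An empty or missing `actions` falls back to a bucket: the overridden
  // severity if one is given, otherwise the alarm's default severity.
  const severity = override?.severity ?? defaultSeverity;
  return [bucketActions[severity]];
}

// Illustrative bucket actions; the names are made up.
const bucketActions = { HIGH: "page-oncall", MEDIUM: "open-ticket", LOW: "log-only" };

const overrides: Record<string, AlarmOverride> = {
  "PackageStats/Failures": { severity: AlarmSeverity.HIGH },
  "Ingestion/DLQNotEmpty": { actions: ["custom-action"] },
  "VersionTracker/NotRunning": { actions: [] },
};

// The default severities below are placeholders, not the alarms' real defaults.
const reclassified = resolveActions("PackageStats/Failures", AlarmSeverity.LOW, bucketActions, overrides);
const custom = resolveActions("Ingestion/DLQNotEmpty", AlarmSeverity.HIGH, bucketActions, overrides);
const emptyFallback = resolveActions("VersionTracker/NotRunning", AlarmSeverity.MEDIUM, bucketActions, overrides);
const control = resolveActions("Ingestion/Failure", AlarmSeverity.HIGH, bucketActions, overrides);
console.log({ reclassified, custom, emptyFallback, control });
```

In this sketch a control alarm (no entry in `overrides`) always lands on its default bucket's action, which is the behavior the two control rows are meant to demonstrate.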
Raw `aws cloudwatch describe-alarms` output

Each override entry can set:

- `severity`: wire this alarm to a different bucket's action (`AlarmSeverity.HIGH`/`MEDIUM`/`LOW`).
- `actions`: supply custom actions that bypass the buckets entirely.

Keys are the alarm's CloudWatch display name relative to the `ConstructHub` construct, i.e. the same string a customer sees in tickets and the CloudWatch console.

Note
The existing `AlarmActions` interface calls the lowest-tier slot `normalSeverity` (legacy). Everywhere else in the codebase (the `AlarmSeverity` enum, `add*SeverityAlarm` methods, and our new `severity` field) uses `LOW`. Renaming `AlarmActions.normalSeverity` → `lowSeverity` can be a separate PR.

Warning
Three previously-anonymous alarms now have explicit `alarmName`s

The lookup mechanism keys on the alarm's CloudWatch display name. Three alarms in the codebase predate the `alarmName: ${scope.node.path}/...` convention and had no explicit name set:

- `MonitoredCertificate/ACMAlarm` (45-day cert expiry)
- `MonitoredCertificate/EndpointAlarm` (45-day cert expiry)
- `Monitoring/WebCanary/Errors` (web canary error rate)

This PR adds explicit names to all three, which means CloudFormation will replace the existing alarms (delete + create) on next deploy. The replacement is acceptable because (a) CloudWatch alarms hold no state, (b) the gap is bounded by the deploy duration, and (c) none of the three are time-sensitive (cert expiry is a 45-day window; the web canary fires only on sustained errors).
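To make the naming convention concrete, here is a minimal sketch of how an override key could be derived from an alarm name that follows the `${scope.node.path}/...` convention. The helper name `overrideKeyFor`, the example path `Dev/ConstructHub`, and the choice to return `undefined` for names outside the prefix are assumptions for illustration; the PR's actual lookup may differ.

```typescript
// Hypothetical helper (not part of the PR): strip the ConstructHub path prefix
// from a literal alarm name to obtain the key used in `alarmOverrides`.
function overrideKeyFor(alarmName: string, hubPath: string): string | undefined {
  const prefix = `${hubPath}/`;
  // Names that do not start with the hub's path cannot be mapped to a key.
  return alarmName.startsWith(prefix) ? alarmName.slice(prefix.length) : undefined;
}

// `Dev/ConstructHub` is a made-up construct path used only for this example.
const key = overrideKeyFor(
  "Dev/ConstructHub/Sources/NpmJs/Canary/NotRunningOrFailing",
  "Dev/ConstructHub",
);
console.log(key);
```

An alarm with no explicit `alarmName` has nothing to feed into a function like this, which is why the three alarms listed above could not be targeted before they were named.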
Implementation
- `AlarmOverride` interface in `src/api.ts`.
- `Monitoring` reads `(alarm.node.defaultChild as CfnAlarm | CfnCompositeAlarm).alarmName`, strips the `ConstructHub` prefix, and looks up the override. If one is found, it fully wires the alarm itself; otherwise it falls through to the existing per-bucket logic. Each `add*SeverityAlarm` gains one early-return line; existing logic is untouched.
- `alarmOverrides` lives entirely on `ConstructHub` props, with no per-source plumbing.
- Unknown override keys, and alarms registered without an explicit `alarmName`, are surfaced as synth-time validation errors via `node.addValidation`. Validation only runs when `alarmOverrides` is non-empty, so existing customers and downstream forks are unaffected.
- An `actions: []` override falls back to the bucket action (rather than silently muting the alarm).

Coverage
Every alarm registered through `IMonitoring` is overridable. Three alarms that previously had no explicit `alarmName` (the two cert-expiry alarms and the web canary alarm) now have one, so they're covered too.

Tests
7 new tests in `monitoring.test.ts`: severity-only override, actions-only override, both, empty `actions: []` falls back to bucket, unknown override key (synth-time error), default path with explicit `alarmName`, alarm registered without `alarmName` when overrides are set (synth-time error).

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license
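For reviewers who want a picture of the two synth-time error cases listed under Tests, the sketch below shows one way such a check could be shaped. It is an illustration under stated assumptions: `RegisteredAlarm`, `overrideValidation`, and the error messages are hypothetical, and the local `IValidation` interface only mirrors the `validate(): string[]` shape that `node.addValidation` accepts so the example runs without the `constructs` package.

```typescript
// Local mirror of the validation shape accepted by `node.addValidation`.
interface IValidation {
  validate(): string[];
}

// Hypothetical record of an alarm seen by Monitoring; `overrideKey` is
// undefined when the alarm has no explicit alarmName.
interface RegisteredAlarm {
  readonly id: string;
  readonly overrideKey?: string;
}

// Hypothetical factory (not the PR's code) producing the synth-time check.
function overrideValidation(
  overrideKeys: readonly string[],
  alarms: readonly RegisteredAlarm[],
): IValidation {
  return {
    validate(): string[] {
      // Skip entirely when no overrides are configured, so stacks that do not
      // use the feature are never rejected for anonymous alarms.
      if (overrideKeys.length === 0) {
        return [];
      }
      const errors: string[] = [];
      const known = new Set<string>();
      for (const alarm of alarms) {
        if (alarm.overrideKey === undefined) {
          errors.push(`Alarm "${alarm.id}" has no explicit alarmName and cannot be overridden`);
        } else {
          known.add(alarm.overrideKey);
        }
      }
      for (const key of overrideKeys) {
        if (!known.has(key)) {
          errors.push(`ConstructHubProps.alarmOverrides: no alarm named "${key}"`);
        }
      }
      return errors;
    },
  };
}

const alarms: RegisteredAlarm[] = [
  { id: "IngestionFailure", overrideKey: "Ingestion/Failure" },
  { id: "Anonymous" }, // registered without an alarmName
];

console.log(overrideValidation([], alarms).validate());
console.log(overrideValidation(["Ingestion/Failure", "Does/Not/Exist"], alarms).validate());
```

Returning messages from `validate()` instead of throwing lets every problem be reported in one synth run, which matches the "surfaced as synth-time validation errors" wording used in this PR.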