fix: allow stackName to change#70
Merged
Merged
Conversation
Contributor
Author
|
Upon trial:
cdk/src/stacks/agent.ts — need these 3 changes:
cdk/src/constructs/agent-memory.ts — 2 changes:
cdk/test/constructs/agent-memory.test.ts — test updated:
|
Contributor
Author
|
Tested in AWS Account in one region for successful deployment in one region (functional testing not performed - just deployment to stacks with unique resources) |
This was referenced May 9, 2026
krokoko
reviewed
May 11, 2026
krokoko
reviewed
May 11, 2026
Signed-off-by: bgagent <scottschreckengaust@users.noreply.github.com>
Each resource with a name uniqueness constraint now truncates to its
own API limit, removing the need for a central stackName length check:
- Guardrail: ${stackName}-guardrail (.slice(0, 50))
- Memory: mem_${stackName} (.slice(0, 48))
- Runtime: sanitize + .slice(0, 48), ensure starts with letter
- Model invocation logging: remove onDelete (account-level singleton)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Slack verification test uses 'test-signing-secret-abc123' which triggers gitleaks generic-api-key rule. Not a real credential. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The default memory name changed from `mem_` to `bgagent_memory_` prefix. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 tasks
Remove explicit runtimeName and memoryName — the L2 constructs auto-generate unique names via Names.uniqueResourceName() from the construct path (which includes the stack name), eliminating name collisions entirely without manual .slice() guards. Only the guardrail name remains explicit (required by the L2 API). Log group names retain stackName for operational discoverability. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ac4efbb to
8d0a110
Compare
krokoko
approved these changes
May 11, 2026
isadeks
pushed a commit
to isadeks/sample-autonomous-cloud-coding-agents
that referenced
this pull request
May 19, 2026
Replaces the personal-API-key wizard with a 7-step OAuth flow that
authorizes a Linear workspace via AgentCore Identity:
1. Resolve stack outputs (CliWorkloadIdentityName, registry table,
user mapping table, webhook secret ARN). Errors loudly if any are
missing — typically means the stack predates 2.0b.
2. Read Cognito sub from cached id_token.
3. Mint workload access token via getWorkloadAccessTokenForUserId.
4. Initiate OAuth dance: getResourceOauth2Token returns an authorize
URL + sessionUri. customParameters: {actor: 'app'} propagates so
Linear surfaces the Agent install variant of the consent screen
(verified via 2.0b spike).
5. Start localhost HTTPS callback server, open browser to the auth
URL, await session_id from the callback.
6. Poll getResourceOauth2Token (5s/600s) until accessToken arrives;
translate sessionStatus=FAILED into a Linear-app-config remediation
hint.
7. Query Linear viewer + organization with the OAuth token, persist
the workspace registry row + admin user-mapping row, then prompt
for the webhook signing secret if not already configured.
Hard cutover from PAK: the new wizard is OAuth-only — there is no
--use-pak flag. The webhook signing secret prompt remains because
HMAC verification of inbound Linear webhooks is independent of how the
agent calls Linear outbound. Webhook prompt is skipped on subsequent
add-workspace runs by detecting the lin_wh_ prefix on the stored
secret; --rotate-webhook-secret forces a re-prompt.
Splits queryLinearIdentity out so both the legacy PAK auto-link helper
(authorization=`lin_api_…`) and the OAuth path (authorization=
`Bearer <token>`) reuse the same GraphQL query. The PAK helper stays
exported to support the legacy linkage path until LinearApiTokenSecret
is retired in aws-samples#70.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
isadeks
added a commit
to isadeks/sample-autonomous-cloud-coding-agents
that referenced
this pull request
May 26, 2026
… 2.0b) (aws-samples#160) * feat(linear): resolve API token via AgentCore Identity (Phase 2.0a) Migrates the agent runtime's Linear personal API token resolution from AWS Secrets Manager to AWS Bedrock AgentCore Identity. This is the "validate Identity SDK" step of the v2 plan; Phase 2.0b will swap the API key for OAuth and converge Linear MCP onto AgentCore Gateway in one cutover. Per Alain's guidance: "start by using api key, if it works, switch to oauth. you will setup an outbound auth for your server using agentcore identity. that identity can be (AC identity is like a wrapper around secrets manager) api key or oauth." Lambdas (orchestrator + processor) intentionally keep using Secrets Manager via the existing `LinearApiTokenSecret` for now. The Python `bedrock_agentcore` SDK has no Node.js equivalent — Lambda migration requires `@aws-sdk/client-bedrock-agentcore` raw API calls and folds into 2.0b's bigger refactor. End-state of 2.0a: agent reads from Identity, Lambdas read from Secrets Manager, both pointing at the same underlying token value (admin populates both). `agent/src/config.py::resolve_linear_api_token`: - Drops boto3 SecretsManager fetch + `LINEAR_API_TOKEN_SECRET_ARN` env. - Reads new env `LINEAR_API_KEY_PROVIDER_NAME` (provider name in Identity vault). - Calls `IdentityClient.get_api_key()` with the workload access token auto-injected into `BedrockAgentCoreContext` by AgentCore Runtime (verified by reading the SDK's `auth.py` decorator implementation — no manual workload-identity mint needed inside the runtime). - Caches the resolved token in `LINEAR_API_TOKEN` so downstream consumers stay unchanged: `channel_mcp.py`'s `${LINEAR_API_TOKEN}` placeholder in `.mcp.json` and `linear_reactions.py`'s GraphQL Authorization header. Preserves PR aws-samples#87's nice-to-have improvements: - `ImportError` graceful fallback (now for `bedrock_agentcore` instead of `boto3`) — degrade with WARN, don't crash the agent. - `AccessDeniedException` and `ResourceNotFoundException` logged at ERROR severity (persistent IAM/config bugs that should page). Other ClientErrors stay at WARN (transient throttle/network). `agent/pyproject.toml`: adds `bedrock-agentcore==1.9.1` dep. `cdk/src/stacks/agent.ts`: - On the AgentCore runtime: drops `linearIntegration.apiTokenSecret. grantRead(runtime)` and the `LINEAR_API_TOKEN_SECRET_ARN` env-var override. Adds `LINEAR_API_KEY_PROVIDER_NAME` env (hardcoded `'linear-api-key'` for now; can parametrize later via context if multi-environment naming is needed) and IAM permissions for `bedrock-agentcore:GetResourceApiKey` and `bedrock-agentcore:GetWorkloadAccessToken`. - Lambdas (orchestrator + processor) untouched — they still grant on the Linear secret and read from Secrets Manager. - Resource scope on the new IAM is `*` for now; AgentCore Identity ARN format isn't fully standardized in public docs as of 2026-05-15. Tighten in 2.0b when OAuth migration documents the canonical resource shape. `docs/guides/LINEAR_SETUP_GUIDE.md`: adds Step 4.5 documenting the one-time `agentcore add credential --type api-key --name linear-api-key` admin command users must run alongside the existing `bgagent linear setup` wizard. Notes that Lambdas keep Secrets Manager temporarily and 2.0b will retire the dual-store setup. Starlight mirror synced. `agent/tests/test_config.py::TestResolveLinearApiToken` — 10 tests covering: cached env var fast-path; missing provider name; missing region; workload token absent (outside runtime); happy path with env-var side-effect; botocore error swallowed with WARN; SDK returns None defensively; ImportError fallback; AccessDeniedException → ERROR severity; ResourceNotFoundException → ERROR severity. 542 agent / 1271 cdk / 196 cli, all green. Lint + typecheck clean. CDK synth clean. `bedrock_agentcore` SDK confirmed working in our runtime image (verified in `node_modules` post-install). The `BedrockAgentCoreContext` workload token auto-injection is documented behaviour for code running inside AgentCore Runtime — verified by reading the SDK's `@requires_api_key` decorator implementation, which uses the same context lookup we use here. Stacked on PR aws-samples#87 (`feat/linear-processor-feedback`). Will conflict on `config.py` and `test_config.py` if aws-samples#87 needs further rework before merge — happy to rebase. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(linear): use aws CLI for credential provider, not the agentcore command The setup guide referenced `agentcore add credential` which doesn't actually work end-to-end: - The Python `bedrock-agentcore-starter-toolkit` CLI (`agentcore`) only exposes agent-lifecycle commands; there is no `credential-provider` subcommand. Confirmed by reading the toolkit's CLI reference and by user trying `agentcore configure credential-provider --type api-key --name ...` and receiving `No such command 'credential-provider'`. - The new npm `@aws/agentcore` CLI does have `agentcore add credential` but uses a declarative project model — the credential lands in `agentcore.json` + `.env.local`, not the actual AgentCore Identity vault, until `agentcore deploy` runs against a project structured for that CLI. ABCA isn't structured that way. Switch the docs to the plain AWS CLI which works directly against the AgentCore Identity API: aws bedrock-agentcore-control create-api-key-credential-provider \ --name linear-api-key \ --api-key "<paste lin_api_… token here>" \ --region us-east-1 Plus the matching `list-api-key-credential-providers` for verification. Add a "Tooling note" at the bottom of the section explaining why the plain AWS CLI is the right path here vs. the two `agentcore` CLIs. Starlight mirror synced. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(linear): pass runtimeUserId so AgentCore injects WorkloadAccessToken Smoke on backgroundagent-dev caught a real bug in the Phase 2.0a migration: the agent's `resolve_linear_api_token()` was correctly calling `IdentityClient.get_api_key()` but failing earlier at `BedrockAgentCoreContext.get_workload_access_token()` returning None. The Linear MCP then loaded with an unresolved `${LINEAR_API_TOKEN}` placeholder and 👀 didn't post. Root cause (from reading bedrock-agentcore-sdk-python source): The `WorkloadAccessToken` request header (which the runtime container reads to populate `BedrockAgentCoreContext`) is only injected by AgentCore Identity when `InvokeAgentRuntimeCommand` is called with `runtimeUserId`. Per AWS docs at https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/runtime-oauth.html: "Agent Runtime exchanges this token for a Workload Access Token via bedrock-agentcore:GetWorkloadAccessTokenForJWT API and delivers it to your agent code via the payload header `WorkloadAccessToken`." Without `runtimeUserId`, AgentCore never derives a workload token and the header is absent. `app.py::_build_request_context` reads the header off the inbound request; the agent sees None. Fix: 1. Thread `userId` through the `ComputeStrategy.startSession` interface (compute-strategy.ts). 2. Pass `task.user_id` (the task's Cognito sub) at the call site in orchestrate-task.ts. 3. Set `runtimeUserId: input.userId` on `InvokeAgentRuntimeCommand` in agentcore-strategy.ts. Log it alongside session_id for traceability. 4. ECS strategy accepts the new parameter to satisfy the interface; doesn't use it (ECS doesn't go through AgentCore Identity). 5. Grant the orchestrator role `bedrock-agentcore:InvokeAgentRuntimeForUser` alongside `InvokeAgentRuntime` (task-orchestrator.ts). Without this, the new `runtimeUserId` parameter would 403. Tests updated: - `agentcore-strategy.test.ts`: pin that `runtimeUserId` flows from input into the SDK command; pass `userId: 'cognito-user-1'` in 4 call sites. - `ecs-strategy.test.ts`: pass `userId` (unused by ECS) on 3 call sites. - `start-session-composition.test.ts`: pass `userId: 'cognito-test'` on 3 call sites. - `task-orchestrator.test.ts`: assert the IAM action list includes `InvokeAgentRuntimeForUser` (2 assertions). 542 agent / 1273 cdk / 196 cli — all green. Lint clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(linear): two undocumented gotchas to make AgentCore Identity actually work End-to-end smoke on backgroundagent-dev surfaced two more silent failure modes after the runtimeUserId fix landed: `BedrockAgentCoreContext.get_workload_access_token()` returned None inside the pipeline thread even though the platform delivered the token on the inbound request. Cause: Python ContextVar storage is per-thread, not shared across `threading.Thread` boundaries. Our `_run_task_background` spawns a new thread for the pipeline, so any context-var the SDK's middleware sets in the request handler thread doesn't reach it. Compounding factor: the SDK's `_build_request_context` middleware only runs when using `BedrockAgentCoreApp` from `bedrock_agentcore.runtime`. Plain FastAPI apps like ours never get that bridge at all. Fix: read the workload token off the request in `_extract_invocation_params` (handling both observed header spellings — `WorkloadAccessToken` and `x-amzn-bedrock-agentcore-runtime-workload-accesstoken`), thread it through the kwargs of `_run_task_background`, and have the pipeline thread call `BedrockAgentCoreContext.set_workload_access_token` on entry. (cdk/src/stacks/agent.ts) After (1) was applied, `IdentityClient.get_api_key()` actually fired and got `AccessDeniedException: ... not authorized to perform: secretsmanager:GetSecretValue`. Cause: AgentCore Identity stores api-key credentials in Secrets Manager under reserved prefix `bedrock-agentcore-identity!*` (the actual ARN shape: `arn:aws:secretsmanager:REGION:ACCOUNT:secret:bedrock-agentcore- identity!default/apikey/<provider-name>-<hash>`). The `GetResourceApiKey` control-plane API surfaces the underlying secret to the caller, and AWS verifies the *caller* role (our runtime role) has `GetSecretValue` on the actual secret resource — not the SLR. Fix: grant the runtime role `secretsmanager:GetSecretValue` scoped to the `bedrock-agentcore-identity!*` prefix in the current account/region. Tightly scoped to Identity-managed secrets; doesn't leak read access to other Secrets Manager resources. - Runtime container reads workload token from request, propagates across thread boundary, calls IdentityClient successfully - 👀 reaction posts at +525ms after task pickup, no warnings - Linear MCP loads cleanly with the resolved token - No more `workload access token not in context` WARN - No more `AccessDeniedException` from `GetResourceApiKey` Three undocumented requirements total for Phase 2.0a (combining with the runtimeUserId fix from the prior commit): 1. Caller (orchestrator) sends `runtimeUserId` and has `InvokeAgentRuntimeForUser` IAM 2. Runtime container bridges the workload-token header into the ContextVar, with per-thread propagation if the pipeline runs in a spawned thread 3. Runtime role has `secretsmanager:GetSecretValue` on `bedrock-agentcore-identity!*` All three are silent failures on their own; missing any one returns None or AccessDenied without obvious "you forgot X" diagnostics. Will file an upstream issue against `aws/bedrock-agentcore-sdk-python` summarising all three so others don't burn the same cycles. Tests: 542 agent / 1273 cdk / 196 cli — all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(2.0b): foundation — workspace registry, admin invite-user, Linear app template Wave 1 of Phase 2.0b: prereq pieces for the Linear OAuth migration. - LinearWorkspaceRegistryTable: maps Linear org-id → AgentCore credential provider name, so webhook + orchestrator Lambdas can resolve the workspace's OAuth token without knowing about provider naming. - bgagent admin invite-user: wraps Cognito admin-create-user with the right defaults and prints a base64 bundle that --from-bundle decodes into ~/.bgagent/config.json. Replaces a four-flag dance with a single paste for joining teammates. - bgagent linear app-template: prints the Linear OAuth app form values captured from the 2.0b spike — GitHub username with [bot] suffix and Webhooks ON gate the actor=app flow; misleading "Invalid redirect_uri" error is the symptom when either is missing. - USER_GUIDE roles section + joining-an-existing-deployment flow: makes the four-role lifecycle explicit (stack admin / workspace admin / repo onboarder / teammate) so a teammate landing on the docs has a clear non-admin path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(linear): rewrite setup guide for OAuth (2.0b) Replace the personal-API-key flow with the Linear OAuth `actor=app` install path verified by the 2.0b spike. Major changes: - Step 1: AgentCore credential provider via `bgagent linear oauth-register-workspace`, capturing the AWS-hosted callback URL that Linear will actually see. - Step 2: Linear OAuth app creation via `bgagent linear app-template`, documenting the GitHub-username-with-[bot]-suffix and Webhooks-ON gates that produce Linear's misleading "Invalid redirect_uri" error when missing. - Step 4: OAuth dance via the rewritten `bgagent linear setup` — ephemeral localhost HTTPS callback; no own ALB/Lambda needed since AWS proxies the OAuth flow. - Step 7: clarify that the PAK-owner auto-link becomes the setup-runner auto-link; the manual DDB mapping path stays for now until self-service `@bgagent link` ships. - New "Adding additional Linear workspaces" section for multi-workspace deployments. - New "Migration from 2.0a (PAK) to 2.0b (OAuth)" runbook. - Troubleshooting expanded to cover the Invalid-redirect_uri and 401-from-Linear scenarios surfaced in the spike. Notes the docs reference commands shipping in Wave 2 (aws-samples#63 oauth-register-workspace, aws-samples#65 setup wizard, aws-samples#67 add-workspace) — the 2.0b branch is a coherent unit and aws-samples#62 must land before those flows are wired so the docs aren't a moving target during implementation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(cli): workload-token retrieval helper for AgentCore Identity (2.0b C1) The CLI runs OUTSIDE AgentCore Runtime, so the in-container ContextVar trick from 2.0a does not apply. This module gives every 2.0b OAuth-flow command a single way to obtain a workload access token: - getWorkloadAccessToken({region, workloadName, userId}) calls the data-plane GetWorkloadAccessTokenForUserId, scoping the resulting token to (workload, cognito_sub) so OAuth-token retrieval is per platform user. - decodeCognitoSub(idToken) extracts the sub claim from the cached id_token, parsing only — token validation is API Gateway's job. - DEFAULT_CLI_WORKLOAD_NAME is the deployment-time convention; the workload identity itself will be created by a follow-up CDK custom resource (aws-samples#61). Stack output 'CliWorkloadIdentityName' wires the CLI to whatever the deployed name actually is. Two SDK errors get translated into actionable remediation hints: - ValidationException: WorkloadIdentity is linked to a service — documented footgun from the spike, surfaces when the CLI is pointed at a runtime workload. - AccessDeniedException / ResourceNotFoundException — same surface treatment, with the bgagent-side checklist embedded in the message. Adds @aws-sdk/client-bedrock-agentcore + bedrock-agentcore-control as CLI deps. Pins all CLI AWS SDK clients to 3.1024.0 (matching) to keep the @smithy/core dependency graph deduplicated; mixed-version pins caused interface-collision typecheck errors. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(cli): bgagent linear oauth-register-workspace (2.0b B2) Registers a Linear workspace as an AgentCore OAuth2 credential provider. The command: - Validates the workspace slug shape ([a-zA-Z0-9_-]{4,50}) so the resulting provider name fits AgentCore's 64-char limit. - Prompts for clientId + clientSecret (interactive, not echoed). - Calls CreateOauth2CredentialProvider with credentialProviderVendor= 'CustomOauth2' and explicit authorizationServerMetadata for Linear's fixed endpoints (Linear has no .well-known/openid-configuration, so vendor-discovery cannot auto-resolve). - Prints the AWS-hosted callback URL the operator pastes into Linear's app form — the AWS-side proxy that Linear actually redirects to. - Idempotent: re-running with an existing provider name fetches the callbackUrl and reports "already exists — re-using it". Smoke test against dev account (2026-05-19) revealed AWS surfaces the duplicate-name case as ValidationException (NOT ConflictException as CFN/REST conventions would suggest). Detection is by message-substring match; tests cover both the duplicate path and the "ValidationException for a non-duplicate reason" path so we don't accidentally swallow input validation errors. AccessDeniedException gets a remediation hint pointing at the 'bedrock-agentcore:CreateOauth2CredentialProvider' permission, since the most common misconfiguration is running the command as a Cognito-authenticated CLI user (no permissions) rather than as an admin/stack-deploy IAM principal. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(2.0b): CLI workload identity + localhost OAuth callback server (A3) Two pieces that together let the CLI run the OAuth dance without any externally-facing infrastructure: CDK side (CliWorkloadIdentity construct, wired into the agent stack): - Creates a dedicated AgentCore Identity workload identity named `bgagent-cli`, distinct from the runtime workload (which is service- linked and cannot mint user-scoped tokens). - Allowlists `https://localhost:8443/oauth/callback` as a permitted resourceOauth2ReturnUrl. AgentCore validates browser-redirect URLs against this list, so the CLI cannot finish the OAuth dance without it. - Implementation: AwsCustomResource (no L2/L1 for AgentCore Identity in CDK as of May 2026). Idempotent — Create/Update/Delete lifecycle wired so re-deploys reconcile the allowlist and stack-deletes don't leak workload identities (50/account-region quota). - Stack outputs `CliWorkloadIdentityName` and `LinearWorkspaceRegistryTableName` so the CLI can discover them at runtime. CLI side (oauth-callback-server module): - Generates a fresh self-signed cert in /tmp via openssl on each invocation; cert is cleaned up when the server shuts down. - Starts an HTTPS listener on localhost:8443/oauth/callback, captures the first request's `session_id` query param, renders a success page, shuts down. Uses res.once('finish') to ensure the response body flushes before the listener closes — otherwise the browser hangs waiting for bytes that never arrive (caught by integration test). - Translates EADDRINUSE and timeout into actionable CliErrors. The CLI URL constant and the CDK default allowlist must agree on the exact URL string — drift would silently break the OAuth dance with "redirect_uri not allowlisted". A regression-locking test on the URL constant + matching CDK default flags the issue at unit-test time. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(cli): bgagent linear setup OAuth dance orchestration (2.0b C2/C3) Replaces the personal-API-key wizard with a 7-step OAuth flow that authorizes a Linear workspace via AgentCore Identity: 1. Resolve stack outputs (CliWorkloadIdentityName, registry table, user mapping table, webhook secret ARN). Errors loudly if any are missing — typically means the stack predates 2.0b. 2. Read Cognito sub from cached id_token. 3. Mint workload access token via getWorkloadAccessTokenForUserId. 4. Initiate OAuth dance: getResourceOauth2Token returns an authorize URL + sessionUri. customParameters: {actor: 'app'} propagates so Linear surfaces the Agent install variant of the consent screen (verified via 2.0b spike). 5. Start localhost HTTPS callback server, open browser to the auth URL, await session_id from the callback. 6. Poll getResourceOauth2Token (5s/600s) until accessToken arrives; translate sessionStatus=FAILED into a Linear-app-config remediation hint. 7. Query Linear viewer + organization with the OAuth token, persist the workspace registry row + admin user-mapping row, then prompt for the webhook signing secret if not already configured. Hard cutover from PAK: the new wizard is OAuth-only — there is no --use-pak flag. The webhook signing secret prompt remains because HMAC verification of inbound Linear webhooks is independent of how the agent calls Linear outbound. Webhook prompt is skipped on subsequent add-workspace runs by detecting the lin_wh_ prefix on the stored secret; --rotate-webhook-secret forces a re-prompt. Splits queryLinearIdentity out so both the legacy PAK auto-link helper (authorization=`lin_api_…`) and the OAuth path (authorization= `Bearer <token>`) reuse the same GraphQL query. The PAK helper stays exported to support the legacy linkage path until LinearApiTokenSecret is retired in aws-samples#70. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(cdk): use full SDK v3 package name for AgentCore Identity custom resource CDK's AwsCustomResource auto-derives the SDK package name from `service` by lowercasing and dropping hyphens — `'BedrockAgentCoreControl'` becomes `@aws-sdk/client-bedrockagentcorecontrol`, which doesn't exist. The actual package is `@aws-sdk/client-bedrock-agentcore-control` (hyphens). Verified by deploy: with the lowercased mapping the Lambda backing the CR fails with "Package @aws-sdk/client-bedrockagentcorecontrol does not exist" and the stack rolls back. Switching to the full v3 package name (supported per the AwsSdkCall.service jsdoc) routes the import correctly. Verified end-to-end: `bgagent-cli` workload identity created with `https://localhost:8443/oauth/callback` on the resourceOauth2ReturnUrls allowlist, stack outputs populated. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(2.0b): park diagnostic flags + tokenEndpointAuthMethods + force-reauth Smoke test against backgroundagent-dev (2026-05-19) hit a service-side bug in AgentCore Identity: USER_FEDERATION token-exchange against Linear's /oauth/token never completes. sessionStatus stays IN_PROGRESS indefinitely, no FAILED transition, no diagnostics on the wire. Verified via manual curl that Linear's token endpoint works perfectly with the same clientId/secret/scopes/code/actor=app — bug is on AWS's side. AgentCore Identity has zero token-injection APIs, so Option 3 (do OAuth ourselves + inject) is architecturally impossible. AWS support case + PAR-compatibility upstream issue aws/bedrock-agentcore-sdk-python#111 are the official fix paths. Parking the wizard work but committing the diagnostic flags we added during triage so they're available when this is unparked: - `tokenEndpointAuthMethods: ['client_secret_post']` on the provider metadata. Linear expects POST-body credentials; AgentCore defaults to Basic. Field name verified against the SDK type (`tokenEndpointAuthMethods`, not the `Supported` suffix the boto3 reference suggested). - `--verbose-poll` flag on `bgagent linear setup` — prints per-poll sessionStatus + response keys so the stuck state is visible. - `--force-reauth` flag — sets `forceAuthentication: true` on GetResourceOauth2Token to bypass cached tokens after a Linear-side revoke. - `CompleteResourceTokenAuth` call between callback capture and poll loop. Per AWS sample 09-Outbound_Auth_Self_Hosted, this is required to bind the captured session to a userId. Confirmed it's NOT what unblocks our specific bug, but is correct per spec for any USER_FEDERATION flow. Status of resume paths in memory/project_oauth_2_0b.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(2.0b-O2): direct Linear OAuth + per-workspace Secrets Manager Replaces the parked AgentCore Identity OAuth flow with a CLI-side direct OAuth dance against Linear's /oauth/token endpoint. The flow verified by the manual curl smoke test on 2026-05-19 returned a valid access_token in <100ms, so we know Linear's side works. AWS's USER_FEDERATION wrapper is broken specifically for Linear (or actor=app); see memory/project_oauth_2_0b.md for the parked-bug details and resume prompt. Architecture: - New module cli/src/linear-oauth.ts owns the OAuth helpers: generatePkce (S256), buildAuthorizationUrl (with actor=app), exchangeAuthorizationCode, refreshAccessToken, StoredLinearOauthToken JSON shape, computeExpiresAt, isAccessTokenExpiring (60s threshold), linearOauthSecretName. 19 hermetic tests (no network). - Per-workspace Secrets Manager secret bgagent-linear-oauth-<slug> holds the token JSON. CLI creates+updates at runtime via upsertOauthSecret (CreateSecret + ResourceExistsException → PutSecretValue fallback). - LinearWorkspaceRegistryTable row gains oauth_secret_arn. Lambdas resolve workspace → secret_arn → token JSON, with refresh-if-expiring. (Lambda migration is Wave C.) - bgagent linear setup is rewritten end-to-end: prompt-for-credentials → PKCE → open browser → callback captures ?code+?state → state verify (CSRF) → exchangeAuthorizationCode → query Linear viewer+org → write secret + registry row + user mapping → webhook secret prompt (unchanged from prior wizard). No AgentCore calls. No polling. No CompleteResourceTokenAuth. - Localhost callback server now exposes both AgentCore-style (session_id) and direct-Linear-style (code+state+error) shapes via a CallbackResult with nullable fields. Backward-compat with the parked AgentCore path's tests. Removals: - cli/src/agentcore-identity.ts + test (workload-token helper) - cdk/src/constructs/cli-workload-identity.ts + test (workload identity) - providerNameForWorkspace, buildLinearProviderInput, registerLinearWorkspace, initiateOauthDance, completeResourceTokenAuth, pollForOauthAccessToken, AgentCore SDK imports — all gone from linear.ts - bgagent linear oauth-register-workspace command (no AWS-side provider to register; folded into setup) - CliWorkloadIdentityName CfnOutput from agent.ts - 6 describe blocks of AgentCore-flavored tests in linear.test.ts Net change: -1100 lines, +700 lines of new direct-OAuth wiring. 286/286 CLI tests pass. 9/9 linear-integration CDK tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(2.0b-O2): space-separated OAuth scopes + --no-actor-app diagnostic End-to-end smoke test against backgroundagent-dev (2026-05-20): - The OAuth dance was failing with Linear's "Invalid redirect_uri" error even though the redirect_uri was correct. Root cause: scopes were comma-separated (`read,write,...`) instead of space-separated. RFC 6749 §3.3 mandates space; Linear surfaces the violation as the misleading "Invalid redirect_uri" error, the same misdirection we hit during the 2.0b spike. Fix: `.join(' ')` in buildAuthorizationUrl. - Adds `--no-actor-app` diagnostic flag on `bgagent linear setup`. Drops the `actor=app` query param so a stuck flow can be isolated to agent-install vs vanilla-OAuth without changing the Linear app config. Off by default; surfaces a warning when invoked. After the fix, full smoke test passed: - Browser opens to Linear consent - User authorizes, redirects to https://localhost:8443/oauth/callback - CLI captures code+state, exchanges for access_token + refresh_token - Token JSON persisted to bgagent-linear-oauth-maguireb in Secrets Manager - LinearWorkspaceRegistryTable row written with oauth_secret_arn - LinearUserMappingTable row written for the admin - Token verified against Linear's GraphQL viewer query (works) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(2.0b-O2): Wave C — migrate Lambdas + agent runtime to per-workspace OAuth Replaces every consumer of the legacy LinearApiTokenSecret PAK with the per-workspace Secrets Manager OAuth-token pattern from Waves A/B. Deploy of this commit will fully cut over the integration; the LinearApiTokenSecret construct is gone. CDK side: - New `cdk/src/handlers/shared/linear-oauth-resolver.ts` resolves workspace_id → registry row → oauth_secret_arn → token JSON → refresh-if-expiring → access_token. In-memory caches (1m TTL) on both registry rows and token JSON. Lazy refresh with PutSecretValue write-back so concurrent Lambdas see the rotated token. 11 unit tests. - linear-feedback.ts: postIssueComment / addIssueReaction / reportIssueFailure now take a {linearWorkspaceId, registryTableName} context instead of an apiTokenSecretArn. Auth header switches from bare PAK value to `Bearer ${accessToken}`. - linear-webhook-processor.ts: env vars LINEAR_WORKSPACE_REGISTRY_TABLE_NAME replace LINEAR_API_TOKEN_SECRET_ARN. safeReportIssueFailure threads the webhook payload's organizationId through to the resolver. Webhook processor now stamps `linear_oauth_secret_arn` + `linear_workspace_slug` into channel_metadata at task-creation time so the agent runtime can fetch the secret directly without a registry round-trip. - orchestrate-task.ts: notifyLinearOnConcurrencyCap reads LINEAR_WORKSPACE_REGISTRY_TABLE_NAME and the task's channel_metadata.linear_workspace_id. - LinearIntegration construct: drops apiTokenSecret + ApiTokenSecret Secrets Manager resource entirely. Webhook processor IAM now grants Get+Put on `bgagent-linear-oauth-*` Secrets Manager prefix. - Agent stack: orchestrator IAM mirrors the new prefix grant. Runtime IAM drops AgentCore Identity grants and gains Get+Put on `bgagent-linear-oauth-*`. LINEAR_API_KEY_PROVIDER_NAME env var, LINEAR_API_TOKEN_SECRET_ARN env var, and LinearApiTokenSecretArn CfnOutput all removed. Agent side (Python): - config.py::resolve_linear_api_token: rewritten to read the per-task channel_metadata.linear_oauth_secret_arn (or LINEAR_OAUTH_SECRET_ARN env fallback) via boto3.secretsmanager. Lazy refresh: if expires_at is within 60s, POST refresh_token grant to Linear /oauth/token using client_id/client_secret co-located in the secret JSON, write the rotated token back via put_secret_value, return the new access_token. - pipeline.py: passes config.channel_metadata into resolve_linear_api_token. - linear-oauth.ts (CLI): StoredLinearOauthToken schema gains client_id + client_secret fields so Lambda + agent refresh can run without per-Lambda OAuth env vars. Setup wizard writes them. Tests pruned of AgentCore Identity mocks; new tests cover the Secrets-Manager-direct path (CDK 11 + agent 6 new). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(orchestrator): bundle import.meta.url shim for durable-execution SDK `@aws/durable-execution-sdk-js@1.1.3`'s ESM build calls `fileURLToPath(import.meta.url)` at module load. esbuild's ESM→CJS bundling leaves `import.meta.url` undefined, crashing every invocation with `TypeError: path must be a string`. Define an identifier substitution + banner that materializes a valid file:// URL from `__filename` at runtime. Discovered while smoke-testing Wave C end-to-end on backgroundagent-dev. Refs: aws/aws-durable-execution-sdk-js#543 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(cli): switch OAuth callback to plain HTTP localhost Per RFC 8252 §7.3, OAuth providers (including Linear) treat http://localhost as a special case that doesn't need TLS — the connection never leaves the host. The previous self-signed-cert HTTPS approach forced testers through a "connection not private" warning that scared them off mid-setup. Drops the openssl shell-out + temp-cert plumbing (~60 lines) along with the user-facing warning copy in `bgagent linear setup`. Updates the callback constants to http://localhost:8080/oauth/callback and the test suite to plain http.GET. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(agent): ruff lint + format for OAuth refresh path Five lint errors surfaced when CI ran `ruff check --fix` against the Wave C agent changes: - F401 unused `timezone` import in `config.py` (replaced with `timedelta`, which is what's actually needed) - RUF034 useless if-else in the `expires_at` ternary — both branches returned identical strings before the recompute below; flatten into a single straightforward `if expires_in: ... else: ...` block - E501 three line-length violations in `config.py` and `test_config.py` — break the long expressions onto helper-named intermediates Confirmed locally: `ruff check .` clean, `ruff format --check` clean, `pytest tests/test_config.py` 15/15 pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(linear-feedback): rewrite tests against OAuth context signature Wave C migrated postIssueComment / addIssueReaction / reportIssueFailure from a (secretArn: string, ...) signature to a (ctx: LinearFeedbackContext, ...) signature, but the test file still passed bare strings — TypeScript caught it at compile time only when CI ran a full build. Three test suites failed to compile (the typecheck error blocked the whole suite, not just this file). - Mock `resolveLinearOauthToken` (the new resolver) instead of `getLinearSecret` (the old PAK fetcher). - Build a `LinearFeedbackContext` fixture with linearWorkspaceId + registryTableName, pass it everywhere SECRET_ARN was used. - Update the Authorization-header assertion to match the new `Bearer <token>` form (PAK was bare-token; OAuth is Bearer-prefixed). All 41 tests across linear-feedback, linear-webhook-processor, and orchestrate-task-feedback pass locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(cdk): align yarn.lock with upstream main + bump table count for OAuth registry Two CI failures came together because they share a root cause: this branch's yarn.lock had drifted from upstream main during interim re-resolves, leaving an inconsistent dep tree that broke ts-jest's module resolution for @aws-cdk/mixins-preview/aws-bedrockagentcore. Restoring upstream main's yarn.lock fixes the resolution; the agent.test.ts table-count assertion then needs to bump from 12 to 13 to account for the LinearWorkspaceRegistryTable added in Phase 2.0b Wave A4. Verified locally: agent.test.ts (44/44) and github-tags.test.ts (5/5) both pass after the changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * style: apply eslint --fix from CI's self-mutation guard CI runs `mise run build`, which invokes `eslint --fix` and then fails if the working tree changed (self-mutation guard). Three cosmetic lints needed applying: - Import-order: DynamoDBClient and CliError moved earlier in their files to satisfy alphabetic-by-package ordering - formatJson import added in alphabetic position in linear.ts - Three template literals with no interpolation converted to single-quoted strings in oauth-callback-server.ts and linear.ts (eslint quotes rule prefers single-quotes when no template variables are used) Pure mechanical fixes; no behavior change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(oauth): refresh-token race recovery + log gaps from review Addresses the blocker + critical items from PR review: - Refresh-token race (review blocker). Linear rotates refresh_tokens on every use; concurrent Lambdas/agents racing the same secret will all read the same expiring token and one's refresh will succeed while the others get `invalid_grant`. On `invalid_grant`, re-read the secret from Secrets Manager (bypassing cache). If the refresh_token has changed, another caller already rotated; use the freshly-read token (or retry refresh once if it's also expiring). If unchanged, the refresh_token is permanently rejected and the workspace needs re-onboarding. Implemented in both the TS resolver (linear-oauth-resolver.ts) and Python resolver (config.py). - Unguarded bedrock_agentcore import in agent/src/server.py (review critical). The bare `from bedrock_agentcore.runtime.context import BedrockAgentCoreContext` inside `_run_task_background` killed the entire pipeline thread with no diagnostic if the SDK was missing or its module structure changed. Wrap in try/except (ImportError, AttributeError) and log via _warn_cw — the Linear token resolver has its own SM fallback, so the agent can proceed without the workload-token bridge. - Cache invalidation on fetch-level refresh failure (review high). The TS resolver's `invalidateLinearOauthCache()` only ran in the `!resp.ok` branch; if `fetch()` itself threw (timeout, DNS), the catch returned null without invalidating, leaving the stale expiring token cached for 60s and hammering Linear's token endpoint. Move invalidate into the fetch-level catch too. - Malformed expires_at log (review medium). The Python `_is_expiring` caught `ValueError` and silently returned True, masking consistently-bad writes. Add a WARN log so operators see the bad data instead of just an unexplained refresh on every task. - Positive-path refresh log (review non-blocking aws-samples#5). Added INFO-level breadcrumb on successful refresh in both resolvers so operators diagnosing intermittent 401s have a trace of which workspace refreshed and to what expiry. 11/11 existing resolver unit tests still pass; will add tests for the new race-recovery branch in a followup commit. * fix(2.0b-O2): review-2 batch — error specificity, half-creates, runbook Addresses four PR review items focused on operator UX when things go sideways: - isWebhookSecretConfigured (review high). The bare `catch { return false }` swallowed AccessDeniedException and DecryptionFailureException, making setup re-prompt for a webhook secret when the real problem was IAM. Now: only ResourceNotFoundException returns false; everything else throws a CliError pointing the operator at the IAM gap. Test updated to assert both paths. - admin invite-user half-create (review medium). If AdminCreateUser succeeds but AdminSetUserPassword fails (stricter password policy than generator, partial IAM grant on the Set verb), the user was left in FORCE_CHANGE_PASSWORD with no diagnostic. Wrap the second call in try/catch and throw a CliError that names the user, explains the broken state, and gives both a delete-user CLI and a manual-fix path. - PAK migration runbook (review non-blocking #1). Expanded the "Migration from 2.0a (PAK) to 2.0b (OAuth)" section in LINEAR_SETUP_GUIDE.md with: a pre-deploy checklist, what survives the migration vs what doesn't, an explicit rollback note (fix forward; the original PAK secret is gone with the CFN resource), and the per-step difference between 2.0a-with-Identity (skipped) vs 2.0a-with-PAK (migrate) deploys. - Vestigial AgentCore Identity dep (review non-blocking #2). bedrock-agentcore==1.9.1 is kept in agent/pyproject.toml because the workload-token bridge in server.py still calls it (now wrapped in try/except per review batch 1). Add an inline comment explaining why it's pinned even though Phase 2.0b-O2 reads Secrets Manager directly — it's the seam for resuming the AgentCore Identity path in 2.0c. CLI tests: 13/13 pass. * fix(2.0b-O2): batch 3 — schema validation, CallbackResult union, parity contract Three remaining substantive review items from PR aws-samples#160: - Validate all 11 fields in getOauthSecret (review non-blocking aws-samples#7). Was checking only access_token / refresh_token / expires_at; missing client_id or client_secret only surfaced 24h later when the refresh call needed them and found undefined. Extracted the required-field list into a const next to the StoredOauthToken interface and check the full set at deserialization. Bad secrets fail fast at fetch time with a structured log line naming the missing fields. - CallbackResult discriminated union (review non-blocking aws-samples#6). Was `{ sessionId: string|null, code: string|null, state: string|null }` which let callers construct unreachable shapes. Split into `{ kind: 'agentcore', sessionId } | { kind: 'direct-oauth', code, state }`. Updated the resolver site (`oauth-callback-server.ts`), the consumer (`bgagent linear setup`), and the test file to use exhaustive type-narrowing. The setup wizard now errors clearly if it gets the agentcore shape (parked path) instead of silently passing nulls down. - Cross-language schema-parity contract test (review non-blocking #3). CLI's StoredLinearOauthToken and Lambda's StoredOauthToken define the same JSON-in-Secrets-Manager schema independently; drift between the two would be a silent bug (CLI writes one field name, Lambda reads another, refresh works, every Lambda invocation logs a missing-field error). New test in `cdk/test/contracts/stored-oauth-token-parity.test.ts` regex-parses both interface definitions out of source and asserts the field set is equal. Also asserts the new `STORED_OAUTH_TOKEN_REQUIRED_FIELDS` const matches the interface, so future field additions can't drift between the validator and the type. CLI tests 286/286 pass. CDK resolver + contract 13/13 pass. * style: apply eslint --fix indentation on CallbackResult union * fix(2.0b-O2): review-3 batch — defensive error handling, security hardening, refresh test coverage Bugs (B1-B3): - linear-oauth-resolver: try/catch around ddb.send() in getRegistryRow so transient DDB errors fail the resolver cleanly instead of crashing the Lambda thread - orchestrate-task: try/catch around reportIssueFailure in notifyLinearOnConcurrencyCap; Linear feedback failures must never block the rejection path - Python _fetch_token: guard json.loads + KeyError so a corrupted SM payload returns None (logged ERROR) rather than raising Tests (T1-T3): - Python: TestResolveLinearApiTokenRefreshPaths covering happy refresh, invalid_grant + concurrent rotation, invalid_grant + no rotation, malformed expires_at, network failure during refresh, and corrupted secret JSON - TS: concurrent-refresh recovery via re-read; permanent-rejection on same refresh_token; cache invalidation after network failure Security (S1-S3): - agent runtime IAM: drop secretsmanager:PutSecretValue on the Linear OAuth secret prefix. Untrusted repo code in the agent must not be able to overwrite tokens; Lambdas (trusted) handle persistence. The refreshed in-memory token still works for the current task; rotated refresh_token is lost on agent exit but Linear's grace window absorbs the rare race where the agent refreshes strictly before any Lambda - Python _try_refresh_once: narrow except Exception to (urllib.error.URLError, OSError) — programmer errors propagate with clean stack traces instead of being swallowed - linear-oauth-resolver: RegistryRowStatus is now a discriminated 'active' | 'revoked' literal; missing or unknown values fail closed to revoked rather than defaulting active * fix(2.0b-O2): use email.message.Message for HTTPError hdrs in tests ty's stricter typeshed flagged dict[Unknown, Unknown] passed where Message[str, str] is expected. Drop-in replacement that satisfies both ty and runtime. * fix(2.0b-O2): update feedback test for B2 — helper now swallows internally Round-3 review B2 moved the try/catch into notifyLinearOnConcurrencyCap. The pre-existing test asserted the old contract (rejection propagates, caller must catch). Flip the contract assertion to reflect the new behavior: the helper is now best-effort end-to-end and returns undefined on internal failure. --------- Co-authored-by: bgagent <bgagent@noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Alain Krok <alkrok@amazon.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes multi-stack name collisions by making each resource self-guard its own name length, rather than imposing a central
stackNamelength limit.Problem
Deploying a second named stack (e.g.
commit-fa647ca...) in the same account fails because several resources use hardcoded names with account-level uniqueness:task-input-guardrail(per-account per-region unique)bgagent_memory(per-account per-region unique)jean_cloude(per-account per-region unique)onDeletein one stack disables logging for all stacksFix
stackName→ sanitize non-alphanumeric to_, ensure starts with letter.slice(0, 48)${stackName}-guardrail.slice(0, 50)mem_${stackName}(sanitized).slice(0, 48)onDelete— account-level singleton shouldn't be torn down by one stackThe 40-character
stackNameinput validation is removed — each resource truncates to its own API limit independently.Also included
test-signing-secret-abc123in Slack verification test was triggering false positiveTest plan
mise //cdk:compilepassesmise //cdk:test— 1170 tests pass, 68 suites🤖 Generated with Claude Code