Skip to content

LPX-578: sync:fargate command for ECS + ALB adoption#25

Merged
stevethomas merged 7 commits into
mainfrom
steve/lpx-578-sync-command-ecs-cluster-alb-adoption
May 19, 2026
Merged

LPX-578: sync:fargate command for ECS + ALB adoption#25
stevethomas merged 7 commits into
mainfrom
steve/lpx-578-sync-command-ecs-cluster-alb-adoption

Conversation

@stevethomas
Copy link
Copy Markdown
Member

@stevethomas stevethomas commented May 19, 2026

What this ships

sync:compute — the full provisioning surface for Fargate-shaped YOLO apps.

Tracking: LPX-578.

TL;DR

yolo init
yolo sync production       # provisions VPC/storage/standalone + compute (this PR)
yolo build production      # PR #26
yolo deploy production     # PR #26

SyncCommand only enrols SyncComputeCommand when the manifest has tasks.web — non-Fargate consumers stay unchanged.

Steps in sync:compute (in order)

  1. SyncEcrRepositoryStep — creates ECR repo + attaches a lifecycle policy. Defaults: expire untagged after 7d, keep last 30 tagged. Overridable via tasks.web.image-retention.{keep-count,untagged-days}.
  2. SyncEcsClusterStep — Fargate + FARGATE_SPOT capacity providers, container insights enabled.
  3. SyncTaskSecurityGroupStep — task ENI security group with inbound from the ALB SG only.
  4. SyncLoadBalancerStep — internet-facing ALB on the public subnets, looked up by aws.alb or defaulted to the env-shared name yolo-{env}.
  5. SyncTargetGroupStep — IP-target group with configurable health check.
  6. SyncHttpListenerStep — HTTP:80 → HTTPS:443 permanent redirect.
  7. SyncHttpsListenerStep — HTTPS:443 with ACM cert attached. Returns SKIPPED cleanly when the cert isn't ISSUED yet (re-sync after DNS validation completes) or when no apex/domain is set.
  8. SyncListenerRuleStep — host-based routing rule with bounded priority assignment in the 1000-49999 range. Pure nextAvailablePriority() extracted and unit-tested for wrap, exhaustion, determinism, range.
  9. SyncTaskLogGroupStep — pre-creates the CloudWatch log group with configurable retention (tasks.web.log-retention, default 30d).
  10. SyncTaskDefinitionStep — Fargate-compatible task definition (awsvpc, configurable cpu/memory).
  11. SyncEcsServiceStep — ECS service with rolling deployment config (min 100% / max 200%). Drift detection covers desiredCount + healthCheckGracePeriodSeconds. Task-definition revision adoption is owned by yolo deploy, not sync.

Manifest shape

environments:
  production:
    domain: codinglabs.com.au
    aws:
      account-id: 848509375702
      region: ap-southeast-2
      # alb: my-shared-alb   # optional, defaults to per-env shared (yolo-{env})
    tasks:
      web:
        port: 8000
        cpu: '512'
        memory: '1024'
        desired-count: 1
        # all optional, defaults shown:
        # dockerfile: Dockerfile
        # platform: linux/amd64
        # execution-role: ecsTaskExecutionRole
        # task-role: ~                              # omit unless app needs SES/SQS/S3 etc.
        # log-retention: 30
        # image-retention:
        #   keep-count: 30
        #   untagged-days: 7
        # health-check:
        #   path: /health
        #   grace-period: 60
        # enable-execute-command: false

Manifest value validation

Three values flow into AWS API payloads through strict validators that throw IntegrityCheckException with a clear message on bad input, before any AWS round trip:

  • tasks.web.enable-execute-commandHelpers::validateStrictBool. The YAML-quoted string "false" parses as false (PHP's native (bool) cast would return true).
  • tasks.web.image-retention.{keep-count,untagged-days}Helpers::validatePositiveInt. Non-numeric strings rejected (not coerced to 0).
  • tasks.web.log-retentionHelpers::validateCloudWatchLogRetention. Restricted to CloudWatch's accepted enum (1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545, 731, 1827, 2192, 2557, 2922, 3288, 3653).

Other AWS plumbing

  • Aws::ecr() / Aws::ecs() SDK client registrations.
  • Aws::ecsTags() helper for the lowercase key/value tag shape ECS expects.
  • UsesEcr / UsesEcs concerns mixed into AwsResources.
  • publicSubnetIds() on UsesEc2.
  • New SecurityGroup::ECS_TASK_SECURITY_GROUP + SecurityGroupRule::ECS_TASK_LB_INGRESS_RULE enum cases.
  • UsesElasticLoadBalancingV2::targetGroup() uses per-app naming (yolo-{env}-{app}) so multiple apps can share one ALB with separate TGs.
  • UsesEc2::securityGroups() is intentionally un-memoised — steps frequently describe → catch-not-found → create → re-describe in a single sync and a stale cache silently breaks the re-read.
  • UsesEcs::ecsService() and ecsTaskDefinition() translate any AwsException (incl. ClusterNotFoundException on cold accounts) into the project's standard ResourceDoesNotExistException so the calling step's catch handles dry-run + create cleanly.

Known follow-ups (separate tickets)

  • LPX-596 — strip the remaining static caches from UsesEc2/UsesEcs/UsesEcr/UsesElasticLoadBalancingV2.
  • LPX-597 — ALB attribute polish (deletion protection, access logs, hardened defaults). Land before production cutover.
  • LPX-598 — deploy hardening polish (cross-env safety, revision pinning, log surfacing, circuit breaker, …). Surfaced by the PR LPX-576 + LPX-579: build + deploy commands for first Fargate POC #26 re-review; bundled there since most items touch the deploy path.
  • IAM provisioning for ecsTaskExecutionRole + per-app task roles.
  • destroy:compute teardown command.
  • Stub + README pass to reflect the Fargate manifest shape.

Reviewer notes

  • IAM is out-of-scope. tasks.web.execution-role defaults to the bare 'ecsTaskExecutionRole' role name; the AWS-managed role must exist in the target account or registerTaskDefinition fails.
  • ALB stays in provisioning for ~2-5 min after creation. AWS accepts listener creates against a provisioning ALB, so sync proceeds — but sync:compute returns before the ALB is actually serving traffic.
  • ECR repo is empty after first sync; the task-def's default :latest reference doesn't resolve until yolo build has pushed an image. PR LPX-576 + LPX-579: build + deploy commands for first Fargate POC #26 ships the build + deploy half.
  • 91 pest, 0 phpstan, pint clean.

Adds the sync:fargate command and steps for provisioning the Fargate
compute stack: ECR repo, ECS cluster, task SG, ALB (lookup-or-create
by aws.alb manifest key), target group, HTTP/HTTPS listeners, host
based listener rule, task definition, and ECS service.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Member Author

@stevethomas stevethomas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

Verdict: ❌ Changes Requested

Two blockers, six warnings worth landing fixes for, several notes. The blockers both hit on first run for a current dev-main consumer (CL marketing site is on dev-main right now), so they need to land before this merges.

Risk Assessment

Category Status
Data Integrity ❌ Blocker
Security ⚠️ Warnings
Breaking Changes ❌ Blocker
Operational ⚠️ Warnings
Test Coverage ⚠️ Warnings

Blockers

  1. src/Steps/Fargate/SyncTaskSecurityGroupStep.php:64 — first-run create path is broken by a stale-cache bug. UsesEc2::securityGroups() memoises into static::$securityGroups on the failed lookup at line 23. After createSecurityGroup succeeds, the re-fetch at line 64 hits the cached pre-create list and throws ResourceDoesNotExistException. Verified by tracing UsesEc2::securityGroupByName()UsesEc2::securityGroups() — no refresh between the two calls. Same pattern exists in the alpha-era SyncEc2SecurityGroupStep (out of scope for this PR but worth noting).

  2. src/Commands/SyncCommand.php:39SyncFargateCommand is enrolled in the top-level sync orchestration with no manifest gate. For any current dev-main consumer with a non-Fargate manifest (the CL marketing site yolo.yml is one), running yolo sync production now silently provisions a new ALB (~AU$24/mo), an ECS cluster (containerInsights enabled — ongoing CloudWatch cost), an ECR repo, and a task SG. Conflicts with Steve's standing rule that infra commands need explicit confirmation. Gate with Manifest::has('tasks.web') or have each Fargate step short-circuit to SKIPPED when tasks.web is absent.

Warnings (inline)

See inline comments on:

  • SyncListenerRuleStep.php:96 — priority collision walk escapes the 1000-49999 floor
  • SyncEcrRepositoryStep.php:25 — no lifecycle policy, images accumulate forever
  • SyncTaskDefinitionStep.php:60 — log group auto-created with no retention policy
  • SyncEcsServiceStep.php:38 — narrow drift detection; service never adopts new task def revisions
  • UsesElasticLoadBalancingV2.php:42 — TG name lookup change, ELBv2's 32-char Name cap unvalidated
  • SyncHttpsListenerStep.php:18Manifest::apex() TypeErrors on domain-less manifests (only caught by ResourceDoesNotExistException)

Notes (no inline comment)

  • Existence-only checks across the Fargate suiteSyncEcsClusterStep, SyncTargetGroupStep, SyncEcsServiceStep (mostly) all return SYNCED when a resource exists, regardless of manifest drift. Health-check path, capacity providers, network config, deployment config — all silently masked. Pattern issue, worth a follow-up that establishes a consistent drift convention.
  • Task definition revisions accumulateSyncTaskDefinitionStep re-registers a fresh revision on every sync with no payload diff. Combined with the service-not-adopting issue above, every sync churns a new revision that the service ignores.
  • Public subnet placement — Fargate tasks land in public subnets with assignPublicIp: ENABLED. Costs ~AU$5.40/mo per task in IPv4 fees and diverges from standard Fargate posture. Acceptable for the alpha (no private subnets in sync:network yet) but call it out in docs.
  • ECR imageTagMutability: MUTABLE — fine for :latest rolling but precludes the digest-pin / hardened-deploy path; flip to IMMUTABLE when the deploy command lands if it's digest-pinned.
  • Cert status fall-throughSyncHttpsListenerStep:23 skips on any status !== 'ISSUED'. Correct behaviour but EXPIRED/REVOKED/FAILED get the same SKIPPED signal as PENDING_VALIDATION, masking real cert problems during sync.
  • No teardown path — no destroy:fargate (or equivalent) ships with this PR. Half-failed syncs leave an awful manual-cleanup checklist (cluster → service → task def revisions → listener rule → listeners → TG → ALB → task SG → ECR repo → log group). File the follow-up.
  • Stub + README still alpha-shapedstubs/yolo.yml.stub is EC2/ASG-shaped with no aws.alb, aws.ecs.security-group, ecs.cluster, or tasks.web.* representation. README's one-liner promises sync works end-to-end but the stub doesn't reflect it. Either update the stub or add a docs/manifest.md.

Test coverage gaps

  • SyncListenerRuleStep::priority() + findRuleForHosts() — zero coverage. Highest-value missing tests given the multi-tenant priority-collision risk.
  • SyncEcsServiceStep update-vs-create branching — four behavioural paths, zero coverage. healthCheckGracePeriodSeconds ?? 0 coalesce will silently churn updateService calls on every sync against an existing service.
  • SyncTaskSecurityGroupStep tag-key drift — read filter and write tag are coupled by string match across the same file; a small test pinning the tag key in both places would catch any future drift.
  • Dry-run honesty across all 10 Fargate steps — no test confirms --dry-run actually avoids SDK writes. Trivial to add (bind mock client that throws on writes).

Naming

Separately raising with @stevethomas: sync:fargate ties the command name to the implementation when Fargate IS the implementation. sync:compute slots next to sync:network / sync:storage / sync:standalone and reclaims the v1-alpha vocabulary. Step namespace (src/Steps/Fargate/) stays Fargate-specific. Rename incoming as a follow-up commit on this branch.


🤖 Reviewed by Claude Code

],
]);

$securityGroup = AwsResources::ecsTaskSecurityGroup();
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finding: First-run create path will always throw ResourceDoesNotExistException here. UsesEc2::securityGroups() memoises into static::$securityGroups on the failed lookup at line 23 (the call inside the try block populates the cache with the pre-create list, then throws). After createSecurityGroup succeeds on line 52, the re-fetch at line 64 calls securityGroupByName()securityGroups() again, hits the cached pre-create list, doesn't find the new SG, throws. The same pattern exists in the alpha-era SyncEc2SecurityGroupStep:110 (out of scope for this PR but worth a separate ticket).
Severity: BLOCKER

Suggested Fix

Use the GroupId returned by createSecurityGroup directly instead of re-fetching via the cached lookup:

$result = Aws::ec2()->createSecurityGroup([...]);
Aws::ec2()->authorizeSecurityGroupIngress(
    static::loadBalancerIngressRule(['GroupId' => $result['GroupId']], $containerPort)
);

Or force a refresh by calling AwsResources::securityGroups(refresh: true) before the re-fetch. The SDK-response-direct approach is one fewer round trip.

Prompt
Review the code at src/Steps/Fargate/SyncTaskSecurityGroupStep.php:64 alongside src/Concerns/UsesEc2.php:100-110. A potential issue has been flagged: UsesEc2::securityGroups() memoises into a static property; the first call inside this step (line 23 via ecsTaskSecurityGroup → securityGroupByName) populates the cache with the pre-create SG list, then throws. After createSecurityGroup succeeds at line 52, the re-fetch at line 64 hits the stale cache and throws ResourceDoesNotExistException, breaking first-run sync. Verify whether this is real — if so, propose a fix using the createSecurityGroup response GroupId directly; if not, explain why.

Comment thread src/Commands/SyncCommand.php Outdated
: [
SyncStandaloneCommand::class,
],
SyncFargateCommand::class,
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finding: SyncFargateCommand is enrolled in the top-level sync orchestration unconditionally, with no manifest guard. For any current dev-main consumer with a non-Fargate manifest (the CL marketing site yolo.yml matches this shape today), running yolo sync production will silently provision an ALB (~AU$24/mo), an ECS cluster with containerInsights enabled (CloudWatch cost), an ECR repo, and a task SG that they didn't ask for. Conflicts with Steve's standing rule that infra commands need explicit consent — currently it'd run on the first sync after upgrading their dev-main pin without any opt-in. Also: SyncHttpsListenerStep:18 and SyncListenerRuleStep:19 both call Manifest::apex() (return-typed : string) inside a try/catch that only catches ResourceDoesNotExistException, so a domain-less manifest crashes the whole sync run with an unhandled TypeError instead of cleanly skipping.
Severity: BLOCKER

Suggested Fix

Gate the enrolment behind a manifest opt-in:

collect([
    SyncNetworkCommand::class,
    SyncStorageCommand::class,
    ...Manifest::isMultitenanted() ? [...] : [SyncStandaloneCommand::class],
    ...Manifest::has('tasks.web') ? [SyncFargateCommand::class] : [],
    SyncIamCommand::class,
    SyncLoggingCommand::class,
])->each(...);

OR have each Fargate step return StepResult::SKIPPED when tasks.web is absent. The first form is cheaper and matches the existing multitenancy gating shape just above. Pairs naturally with the sync:compute rename — same gate, new command name.

Prompt
Review the code at src/Commands/SyncCommand.php:39. A potential issue has been flagged: SyncFargateCommand runs unconditionally as part of the top-level sync, so any current dev-main consumer with a non-Fargate manifest (e.g. CL marketing site) will silently get a new ALB, ECS cluster, ECR repo, and task SG provisioned without consent. Also crashes on domain-less manifests because SyncHttpsListenerStep/SyncListenerRuleStep call the return-typed Manifest::apex() inside a try/catch that only catches ResourceDoesNotExistException. Verify and propose either a manifest gate on enrolment or per-step SKIPPED short-circuits.

while (in_array($base, $usedPriorities, true)) {
$base = ($base % 50000) + 1;
}

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finding: The collision walk $base = ($base % 50000) + 1 can produce priorities below 1000 (the intended floor) and walk into priority slots typically reserved for manual high-priority overrides — base=49999 collides, next is 1, then 2, then 3, etc. Also: no bounded termination, so if all 50000 slots are taken the loop spins forever. Additionally describeRules is called twice per invocation (once in findRuleForHosts(), again here) — minor API pressure, but in multi-tenant fan-out it doubles ELB describe TPS pressure (10 TPS cap per region).
Severity: WARNING

Suggested Fix

Stay bounded within the intended range and bail on exhaustion:

$base = (abs(crc32(Helpers::keyedResourceName(exclusive: true))) % 49000) + 1000;
$attempts = 0;

while (in_array($base, $usedPriorities, true)) {
    if (++$attempts > 49000) {
        throw new IntegrityCheckException('ALB listener rule priority space exhausted');
    }
    $base = $base >= 49999 ? 1000 : $base + 1;
}

Add a unit test pinning: priority always within 1000..49999, deterministic for a given app name when no collision, walks correctly when collisions occur.

Prompt
Review the code at src/Steps/Fargate/SyncListenerRuleStep.php:82-99. A potential issue has been flagged: (a) the collision walk `$base = ($base % 50000) + 1` escapes the intended 1000-49999 floor and can land on priority 1, conflicting with manually-set rules; (b) no bounded termination — exhausted space loops forever; (c) describeRules is called twice per invocation. Verify and propose a bounded walk + a unit test for priority() and findRuleForHosts() using a Container-bound mock elasticLoadBalancingV2 client.

return StepResult::WOULD_CREATE;
}

Aws::ecr()->createRepository([
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finding: ECR repository is created with no lifecycle policy. Every build will push a new tag (PR #26 pushes :{app-version} + :latest), and ECR keeps everything forever by default. Default per-repo image limit is 10k images (raisable but a hard ceiling), and storage is billed at US$0.10/GB-mo per image layer. For LP's multi-tenant footprint this gets noticeable fast.
Severity: WARNING

Suggested Fix

Follow up the createRepository call with a putLifecyclePolicy setting a sensible default (configurable via manifest):

Aws::ecr()->putLifecyclePolicy([
    'repositoryName' => AwsResources::ecrRepositoryName(),
    'lifecyclePolicyText' => json_encode([
        'rules' => [
            [
                'rulePriority' => 1,
                'description' => 'Expire untagged images after 7 days',
                'selection' => ['tagStatus' => 'untagged', 'countType' => 'sinceImagePushed', 'countUnit' => 'days', 'countNumber' => 7],
                'action' => ['type' => 'expire'],
            ],
            [
                'rulePriority' => 2,
                'description' => 'Keep last 30 tagged images',
                'selection' => ['tagStatus' => 'any', 'countType' => 'imageCountMoreThan', 'countNumber' => 30],
                'action' => ['type' => 'expire'],
            ],
        ],
    ]),
]);
Prompt
Review the code at src/Steps/Fargate/SyncEcrRepositoryStep.php:19-30. A potential issue has been flagged: ECR repository is created with no lifecycle policy — images accumulate forever and eventually hit the soft cap of 10k per repo plus unbounded storage cost. Propose a default putLifecyclePolicy (expire untagged after 7d, keep last 30 tagged) overridable via manifest, e.g. tasks.web.image-retention.

'awslogs-stream-prefix' => 'web',
'awslogs-create-group' => 'true',
],
],
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finding: awslogs-create-group: 'true' lets the awslogs driver auto-create the log group on first task launch. The auto-created group has no retention policy (logs never expire), which means log storage costs grow linearly forever. Worse, the group ends up tagged/owned outside YOLO's normal sync convention — there's no AwsResources::logGroup() lookup pointing at it for future drift detection.
Severity: WARNING

Suggested Fix

Pre-create the log group explicitly via a new SyncTaskLogGroupStep (added to SyncFargateCommand::$steps before SyncTaskDefinitionStep), with a configurable retention from tasks.web.log-retention (default e.g. 30 days). Then flip awslogs-create-group: 'false' here so the driver doesn't shadow-own the group.

Prompt
Review the code at src/Steps/Fargate/SyncTaskDefinitionStep.php:52-62. A potential issue has been flagged: awslogs-create-group: 'true' lets the awslogs driver auto-create the CloudWatch log group with infinite retention, growing log storage costs forever. Propose a new SyncTaskLogGroupStep that pre-creates the group with a configurable retention (default 30 days) and disables awslogs-create-group here.

'cluster' => AwsResources::ecsClusterName(),
'service' => AwsResources::ecsServiceName(),
'desiredCount' => $desiredCount,
'healthCheckGracePeriodSeconds' => $gracePeriod,
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finding: Drift detection is too narrow. Only desiredCount and healthCheckGracePeriodSeconds are diffed; everything else (taskDefinition revision, network config, capacity provider strategy, deployment config, execute-command toggle) is silently masked as SYNCED. Combined with SyncTaskDefinitionStep always registering a new revision per sync, this means: every sync churns a new task def revision that the service never adopts. Service stays pinned to whichever revision it was originally created with, accumulating an orphan trail of registered-but-unused revisions.

Also: the healthCheckGracePeriodSeconds ?? 0 coalesce will report drift on every sync when AWS returns a service without that field set (older create lineage) against a manifest default of 60 — silently churning updateService calls.
Severity: WARNING

Suggested Fix

Decide whether sync or deploy owns task-def revision adoption. If deploy (matches PR #26), add an explicit comment here: // task definition revision is managed by deploy command, not sync. Either way, widen the drift detection — at minimum check taskDefinition ARN against the latest revision, and consider extracting a payloadHasDifferences helper similar to Helpers::payloadHasDifferences() already used in SyncEc2SecurityGroupStep.

For the coalesce churn: pull healthCheckGracePeriodSeconds default from the manifest only once and compare against ?? of the same default, not 0.

Prompt
Review the code at src/Steps/Fargate/SyncEcsServiceStep.php:20-41 alongside SyncTaskDefinitionStep.php. A potential issue has been flagged: drift detection only covers desiredCount + healthCheckGracePeriodSeconds. The service never adopts new task definition revisions registered by SyncTaskDefinitionStep, accumulating orphan revisions. The `healthCheckGracePeriodSeconds ?? 0` coalesce also reports drift on every sync against older services. Verify intent (does deploy own revision adoption?) and either widen drift detection or document the create-only contract.

}

$targetGroups = Aws::elasticLoadBalancingV2()->describeTargetGroups();
$name = Helpers::keyedResourceName(exclusive: true);
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finding: Target group name lookup switched from Helpers::keyedResourceName(exclusive: false) (= yolo-{env}, shared across apps) to exclusive: true (= yolo-{env}-{app}, per-app). PR body confirms no current consumer relies on the old name. Two gotchas worth pinning:

  1. ELBv2 Name cap is 32 chars. yolo-production- is 16 chars; that leaves 16 chars for the app name. Anything ≥17 chars hits AWS validation at create time with no upfront manifest precheck. Add validation in Manifest or in this concern.
  2. No regression test pins the new name. A 5-line test asserting the Names => [...] arg shape passed to describeTargetGroups would document the intent and catch any future drift back to the old shared name.
    Severity: WARNING
Suggested Fix

Add a manifest precheck (somewhere in Command::execute() or as a new Steps/Ensures/EnsureResourceNameLengths step):

if (strlen(Helpers::keyedResourceName(exclusive: true)) > 32) {
    throw new IntegrityCheckException(sprintf(
        "Target group name 'yolo-%s-%s' exceeds ELB's 32-char limit. Shorten the app name to %d chars or less.",
        Helpers::environment(),
        Manifest::name(),
        32 - strlen("yolo-" . Helpers::environment() . "-")
    ));
}

And a regression test under tests/Unit/Concerns/UsesElasticLoadBalancingV2Test.php pinning the lookup name shape.

Prompt
Review the code at src/Concerns/UsesElasticLoadBalancingV2.php:36-63. A potential issue has been flagged: the lookup naming changed from exclusive:false to exclusive:true. ELBv2's 32-char Name cap means app names ≥17 chars will fail at create time with no upfront precheck. Add a manifest validation pass (early-fail with a clear message) plus a regression test pinning the Names arg shape passed to describeTargetGroups.

public function __invoke(array $options): StepResult
{
try {
$certificate = AwsResources::certificate(Manifest::apex());
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finding: Manifest::apex() has a : string return type and will TypeError (not throw ResourceDoesNotExistException) when neither apex nor domain is set in the manifest. The try/catch only catches ResourceDoesNotExistException, so a domain-less Fargate-shaped manifest blows up the whole sync run with an unhandled TypeError instead of cleanly skipping. Same shape in SyncListenerRuleStep.php:19.
Severity: WARNING

Suggested Fix

Guard with a manifest presence check that returns SKIPPED:

if (! Manifest::has('apex') && ! Manifest::has('domain')) {
    return StepResult::SKIPPED;
}

OR change Manifest::apex() to throw a catchable IntegrityCheckException (or similar) when both keys are absent — but that's a wider blast radius affecting alpha-era code, so the per-step guard is the safer fix for this PR.

Prompt
Review the code at src/Steps/Fargate/SyncHttpsListenerStep.php:17-21 and src/Steps/Fargate/SyncListenerRuleStep.php:19-23. A potential issue has been flagged: Manifest::apex() return type is `: string` and will TypeError on domain-less manifests, but the try/catch only catches ResourceDoesNotExistException. Verify and propose either a Manifest::has() guard returning SKIPPED, or a wider change making Manifest::apex() throw a catchable exception.

stevethomas and others added 2 commits May 19, 2026 21:09
`sync:fargate` ties the command name to the implementation when Fargate
*is* the implementation. `sync:compute` slots cleanly next to
`sync:network` / `sync:storage` / `sync:standalone` / `sync:iam` /
`sync:logging` and reclaims the v1-alpha vocabulary so anyone reading
old docs lands in the right place.

Public surface is the command name — durable. Step namespace
(`src/Steps/Fargate/`) stays Fargate-specific because the steps
*are* concretely Fargate (SyncEcsClusterStep, SyncTargetGroupStep).
If a second compute backend ever materialises, add `src/Steps/X/` and
the command picks the step set by manifest shape; the user-facing
command stays put.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Blockers:

- SyncTaskSecurityGroupStep: use createSecurityGroup response GroupId
  directly instead of re-fetching through UsesEc2::securityGroups(),
  which memoises pre-create state into a static property and would
  throw `Could not find Security Group matching name yolo-...-ecs-task-
  security-group` on every first-run sync.

- SyncCommand: gate SyncComputeCommand enrolment behind
  `Manifest::has('tasks.web')` so existing non-Fargate consumers don't
  silently get a brand-new ALB / ECS cluster / ECR repo provisioned on
  their next `yolo sync`.

Warnings:

- SyncListenerRuleStep::nextAvailablePriority(): extract the pure walk
  out of priority() and bound it to 1000-49999, wrap floor↔ceiling on
  collision rather than escaping into the reserved <1000 range, and
  bail with IntegrityCheckException after a full sweep instead of
  spinning forever. 6 new unit tests pin the contract.

- SyncEcrRepositoryStep: putLifecyclePolicy after createRepository
  with sensible defaults (expire untagged after 7d, keep last 30
  tagged), overridable via `tasks.web.image-retention.{keep-count,
  untagged-days}`.

- SyncTaskLogGroupStep (new): pre-create the CloudWatch log group
  before the task definition is registered, with a configurable
  retention (`tasks.web.log-retention`, default 30 days). Drops the
  `awslogs-create-group: true` driver flag from the task definition so
  the awslogs driver doesn't shadow-own the group.

- SyncEcsServiceStep: replace `?? 0` with `?? $gracePeriod` so older
  services without `healthCheckGracePeriodSeconds` set don't churn an
  updateService call on every sync. Comment makes the
  sync-vs-deploy ownership of task-def revisions explicit.

- EnsureResourceNameLengthsStep (new): manifest precheck for AWS's
  32-char Name cap on target groups + ECS services. Fails fast with
  a clear message naming the offending key and the allowable length.

- SyncHttpsListenerStep + SyncListenerRuleStep: short-circuit to
  SKIPPED when neither `apex` nor `domain` is set, instead of
  TypeErroring on Manifest::apex()'s `: string` return type.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
stevethomas added a commit that referenced this pull request May 19, 2026
Stacked on PR #25. Closes the loop from `sync:fargate` plumbing to a
working end-to-end deploy:

- `build` appends three Docker steps when the manifest sets `tasks.web`:
  ECR login (SDK token → `docker login --password-stdin`), `docker build`
  on the prepared `.yolo/build/` context, then push both `:{app-version}`
  and `:latest`. `tasks.web.dockerfile` overrides the default `Dockerfile`
  path; `tasks.web.platform` defaults to `linux/amd64` so Apple Silicon
  builds land on Fargate without `--platform` ceremony.

- `deploy` registers a new task definition revision pinning
  `:{app-version}` (DRY-reuses `SyncTaskDefinitionStep::payload(?string)`
  with a new optional tag arg), then `updateService` with
  `forceNewDeployment: true` so ECS rolls the running tasks. `--watch`
  blocks on the `ServicesStable` waiter. Falls back to reading
  `.yolo/build/APP_VERSION` when `--app-version` isn't passed, so
  `yolo build production && yolo deploy production` Just Works.

- Reuses existing `SyncStandaloneRecordSetStep` /
  `SyncMultitenancyRecordSetStep` for the ALB → Route 53 alias.
  `ChecksIfCommandsShouldBeRunning` already gates them on tenancy mode.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Member Author

@stevethomas stevethomas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-Review Summary (post-fix)

Verdict: ❌ Changes Requested

The first-round blockers + 6 warnings are addressed cleanly, but the re-review surfaced a new blocker that the original SG-cache fix didn't reach, plus a cluster of validation gaps that ride the same Manifest::get(..., default) | (int)/(bool) pattern.

Risk Assessment

Category Status
Data Integrity ❌ Blocker (downstream stale-cache hit)
Security ⚠️ Warnings (cast validation gaps)
Breaking Changes ⚠️ Warnings (DNS step TypeError, stale stub)
Operational ⚠️ Warnings (retention enum / int validation)
Test Coverage ⚠️ Warnings (new code shipped without tests)

Blockers

  1. src/Steps/Fargate/SyncEcsServiceStep.php:63 — the original SG stale-cache fix only resolved the same-step exposure in SyncTaskSecurityGroupStep. The downstream consumer SyncEcsServiceStep's create branch calls AwsResources::ecsTaskSecurityGroup()['GroupId'] for the awsvpcConfiguration. static::$ecsTaskSecurityGroup is unset (the catch in the SG step never seeded it), so it falls through to securityGroupByName()securityGroups() which still returns the pre-create cached list. Throws ResourceDoesNotExistException, aborts sync mid-create. First-run sync:compute still fails — just one step later.

Warnings (inline)

  • SyncEcsServiceStep.php:69(bool) Manifest::get('tasks.web.enable-execute-command', false): PHP's (bool) cast on the YAML-quoted string "false" evaluates to true, silently enabling ECS Exec (shell into running tasks). Security-sensitive footgun.
  • SyncEcrRepositoryStep.php:42(int) casts on keep-count / untagged-days: garbage strings coerce to 0 (e.g. keep-count: thirty); floats truncate silently (keep-count: 1.71). AWS rejects countNumber: 0 server-side so it fails loud, but the failure surface is a deep AWS error rather than a clean preflight. Validate >= 1 up front.
  • SyncTaskLogGroupStep.php:18tasks.web.log-retention cast to int with no validation, but CloudWatch's putRetentionPolicy only accepts a fixed enum (1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545, 731, 1827, 2192, 2557, 2922, 3288, 3653). Reasonable-looking values like 45 or 0 throw InvalidParameterException at sync time.
  • src/Steps/Deploy/SyncStandaloneRecordSetStep.php:19Manifest::apex() TypeError exposure re-opens through DeployCommand (PR #26 stack base) for Fargate-only domain-less manifests. Same fix pattern as the SyncHttpsListenerStep / SyncListenerRuleStep guard already applied.
  • stubs/yolo.yml.stub:1 — original review WARNING #4 unaddressed. Still EC2/ASG-shaped with aws.autoscaling/aws.ec2 blocks and no tasks.web. A fresh yolo init produces a manifest where sync skips Fargate provisioning entirely.
  • tests/Unit/Steps/Fargate/SyncListenerRuleStepTest.php:30 — the "wraps from the ceiling back to the floor" test asserts < 49000 && >= 1000 but doesn't prove the floor-jump branch executed. A name whose CRC-derived base lands below 49000 already passes via the unrelated baseline. Tighten by computing or seeding a name that forces base==49999, then assert result == 1000 exactly.

Coverage gaps shipped this PR

  • SyncTaskLogGroupStep (new) — three branches (existing+match, existing+drift, missing→create), zero tests. At minimum a payload-pinning test on logGroupName().
  • SyncEcrRepositoryStep::lifecyclePolicy() (new) — manifest-driven JSON shape, zero coverage. Mirror the SyncTaskDefinitionStepTest pattern.
  • EnsureResourceNameLengthsStep (new) — boundary (32 chars exactly) not pinned. 3-case test would catch a future >>= flip.
  • SyncEcsServiceStep update-vs-create (carry-over) — still zero coverage on the diff predicate + create payload shape.

Notes (no inline)

  • describeLogGroups no paginationfindLogGroup() returns the first page only (default 50). Low-probability collision today but ergonomically annoying when it happens. Mirror onto UsesCloudWatchLogs::logGroup() which has the same gap.
  • SyncComputeCommand direct invocationyolo sync:compute production against a non-tasks.web manifest bypasses the SyncCommand gate and provisions Fargate infra with all defaults. Either guard the sub-command or accept as power-user drift.
  • EnsureResourceNameLengthsStep format check — covers 32-char length, not AWS Name format rules (alphanumeric + hyphens, no leading/trailing/consecutive hyphens). Out-of-format names still surface as deep AWS errors.
  • Aws::tags(..., 'tags', associative: true)['tags'] round-trip in SyncTaskLogGroupStep — wraps then immediately unwraps. Cleaner with a dedicated Aws::associativeTags() helper.
  • First-pass notes still applicableassignPublicIp: ENABLED, executionRoleArn bare default, imageTagMutability: MUTABLE, HTTPS cert Status !== 'ISSUED' fall-through — all unchanged, all tracked, no new exposure.
  • Carry-over coverage debtSyncTaskSecurityGroupStep tag-key drift coupling, Aws::ecsTags() no direct test, dry-run honesty across the suite. None blocking; worth a tracking issue.

🤖 Reviewed by Claude Code

'networkConfiguration' => [
'awsvpcConfiguration' => [
'subnets' => AwsResources::publicSubnetIds(),
'securityGroups' => [AwsResources::ecsTaskSecurityGroup()['GroupId']],
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finding: The original blocker fix in SyncTaskSecurityGroupStep:64 (use createSecurityGroup response GroupId directly) only dodged the stale-cache exposure inside that step. This line calls AwsResources::ecsTaskSecurityGroup() to populate the service's awsvpcConfiguration. static::$ecsTaskSecurityGroup is never set in the catch path of the SG step, so the lookup falls through to securityGroupByName()UsesEc2::securityGroups() which still returns the pre-create cached list (populated by the initial failed lookup on SyncTaskSecurityGroupStep:23). The newly-created SG isn't in that list, so this throws ResourceDoesNotExistException and aborts sync mid-create. First-run sync:compute still fails — just one step later.
Severity: BLOCKER

Suggested Fix

Seed the typed static cache inline after createSecurityGroup succeeds. Add a public setter to UsesEc2 (or a slim setEcsTaskSecurityGroup(array $sg) helper), then in SyncTaskSecurityGroupStep after the create + authorize:

$created = Aws::ec2()->createSecurityGroup([...]);
Aws::ec2()->authorizeSecurityGroupIngress(
    static::loadBalancerIngressRule(['GroupId' => $created['GroupId']], $containerPort)
);
AwsResources::setEcsTaskSecurityGroup(['GroupId' => $created['GroupId']]);

Downstream consumers only read ['GroupId'] so the minimal stub is sufficient. Belt-and-braces alternative is AwsResources::securityGroups(refresh: true) after create — costs an extra describe call but invalidates the whole pre-create cache. The targeted setter is cheaper.

Broader fix (kill the static caches entirely) is tracked in LPX-596.

Prompt
Review the code at src/Steps/Fargate/SyncEcsServiceStep.php#L63 alongside src/Concerns/UsesEc2.php#L158-L172 and src/Steps/Fargate/SyncTaskSecurityGroupStep.php#L45-L70. A potential issue has been flagged: the first-round SG cache fix dodged the stale-cache exposure inside SyncTaskSecurityGroupStep, but SyncEcsServiceStep's create branch re-resolves the same SG via AwsResources::ecsTaskSecurityGroup() which falls through to the still-stale UsesEc2::securityGroups() cache. Verify whether first-run sync:compute against a fresh environment will throw ResourceDoesNotExistException here. If real, propose seeding static::$ecsTaskSecurityGroup with the createSecurityGroup response GroupId via a setter on the trait.

],
'loadBalancers' => [
[
'targetGroupArn' => AwsResources::targetGroup()['TargetGroupArn'],
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finding: (bool) Manifest::get('tasks.web.enable-execute-command', false) — PHP's (bool) cast on the YAML-quoted string "false" evaluates to true. YAML's loose-typing rules mean a manifest line like enable-execute-command: "false" (quoted by accident or by editor auto-formatting) silently enables ECS Exec, granting shell access into running containers to anyone holding ecs:ExecuteCommand. Security-sensitive enough to warrant a strict parse.
Severity: WARNING

Suggested Fix

Use strict boolean parsing:

'enableExecuteCommand' => filter_var(
    Manifest::get('tasks.web.enable-execute-command', false),
    FILTER_VALIDATE_BOOLEAN,
    FILTER_NULL_ON_FAILURE,
) === true,

Or require === true so only the YAML boolean true enables it.

Prompt
Review src/Steps/Fargate/SyncEcsServiceStep.php#L69. A potential issue has been flagged: PHP's (bool) cast on the string "false" returns true, so a YAML-quoted enable-execute-command: "false" would silently enable ECS Exec. Verify whether the YAML loader prevents the misquote case; if not, propose strict boolean parsing.

}
}

protected static function lifecyclePolicy(): string
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finding: (int) casts on tasks.web.image-retention.keep-count and untagged-days silently coerce garbage. keep-count: thirty0 (string → int). keep-count: 1.71 (float truncation). AWS server-side rejects countNumber: 0, so a typo fails noisily — but the failure surfaces as a deep AWS API error at sync time, not a clean preflight. Worth a lower-bound + numeric validation up front.
Severity: WARNING

Suggested Fix

Validate via filter_var with FILTER_VALIDATE_INT and a min_range option, throwing IntegrityCheckException on invalid input:

$keepCount = static::validatePositiveInt(
    Manifest::get('tasks.web.image-retention.keep-count', 30),
    'tasks.web.image-retention.keep-count',
);
$untaggedDays = static::validatePositiveInt(
    Manifest::get('tasks.web.image-retention.untagged-days', 7),
    'tasks.web.image-retention.untagged-days',
);

Pair with a unit test pinning the validation contract (test coverage warning above).

Prompt
Review src/Steps/Fargate/SyncEcrRepositoryStep.php#L42. A potential issue has been flagged: (int) casts on keep-count and untagged-days accept non-numeric strings (→ 0), floats, negatives silently. Propose a positive-int validator that throws IntegrityCheckException with a clear message before the AWS round trip.

public function __invoke(array $options): StepResult
{
$name = static::logGroupName();
$retention = (int) Manifest::get('tasks.web.log-retention', 30);
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finding: tasks.web.log-retention is cast to int and passed straight to putRetentionPolicy, but the API only accepts a fixed enum: 1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545, 731, 1827, 2192, 2557, 2922, 3288, 3653. Reasonable-looking values like 45, 28, or 0 throw InvalidParameterException at sync time with a generic AWS error rather than a friendly manifest-validation message. Same gap exists in the alpha-era SyncIvsCloudWatchLogGroupStep so a shared helper makes sense.
Severity: WARNING

Suggested Fix

Validate against the enum up front:

static $validRetentionDays = [1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545, 731, 1827, 2192, 2557, 2922, 3288, 3653];

$retention = (int) Manifest::get('tasks.web.log-retention', 30);

if (! in_array($retention, $validRetentionDays, true)) {
    throw new IntegrityCheckException(sprintf(
        'tasks.web.log-retention must be one of: %s (got %d)',
        implode(', ', $validRetentionDays),
        $retention,
    ));
}

Extract to Helpers::validateCloudWatchLogRetention($value, $key) and call from both this step and SyncIvsCloudWatchLogGroupStep.

Prompt
Review src/Steps/Fargate/SyncTaskLogGroupStep.php#L18. A potential issue has been flagged: tasks.web.log-retention isn't validated against CloudWatch's fixed enum of accepted retention values, so values like 45 or 0 throw a generic InvalidParameterException at sync time. Propose a shared Helpers::validateCloudWatchLogRetention helper used here and in SyncIvsCloudWatchLogGroupStep.

The targeted dodge in SyncTaskSecurityGroupStep (use createSecurityGroup
response GroupId directly) only fixed the in-step exposure. Downstream
consumers — specifically SyncEcsServiceStep:63 calling
AwsResources::ecsTaskSecurityGroup() for the awsvpcConfiguration —
re-resolve the SG via securityGroupByName() → securityGroups(), which
returned the pre-create cached list and threw ResourceDoesNotExistException.
First-run sync:compute against a fresh environment still failed,
just one step later.

Strips static::\$securityGroups + the unused \$refresh parameter. The
extra describeSecurityGroups call is ~50-100ms and only fires when a
typed cache (loadBalancerSecurityGroup / ec2SecurityGroup /
ecsTaskSecurityGroup / rdsSecurityGroup) misses, so the cost is
bounded.

Progressive strip per LPX-596 — the rest of the static caches stay
until the broader cleanup, but \$securityGroups is the one currently
biting first-run sync so it goes first.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
stevethomas added a commit that referenced this pull request May 19, 2026
Stacked on PR #25. Closes the loop from `sync:fargate` plumbing to a
working end-to-end deploy:

- `build` appends three Docker steps when the manifest sets `tasks.web`:
  ECR login (SDK token → `docker login --password-stdin`), `docker build`
  on the prepared `.yolo/build/` context, then push both `:{app-version}`
  and `:latest`. `tasks.web.dockerfile` overrides the default `Dockerfile`
  path; `tasks.web.platform` defaults to `linux/amd64` so Apple Silicon
  builds land on Fargate without `--platform` ceremony.

- `deploy` registers a new task definition revision pinning
  `:{app-version}` (DRY-reuses `SyncTaskDefinitionStep::payload(?string)`
  with a new optional tag arg), then `updateService` with
  `forceNewDeployment: true` so ECS rolls the running tasks. `--watch`
  blocks on the `ServicesStable` waiter. Falls back to reading
  `.yolo/build/APP_VERSION` when `--app-version` isn't passed, so
  `yolo build production && yolo deploy production` Just Works.

- Reuses existing `SyncStandaloneRecordSetStep` /
  `SyncMultitenancyRecordSetStep` for the ALB → Route 53 alias.
  `ChecksIfCommandsShouldBeRunning` already gates them on tenancy mode.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three Manifest::get(...) values flow into AWS API payloads with
unchecked PHP casts that silently coerce garbage:

* (bool) Manifest::get('tasks.web.enable-execute-command') — PHP's
  cast on the YAML-quoted string "false" returns true, silently
  enabling ECS Exec (shell into running containers).
* (int) Manifest::get('tasks.web.image-retention.{keep-count,
  untagged-days}') — non-numeric strings coerce to 0, floats truncate.
  ECR rejects countNumber: 0 server-side so the failure is loud but
  deep (deferred AWS API error rather than preflight).
* (int) Manifest::get('tasks.web.log-retention') — CloudWatch only
  accepts a fixed enum (1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180,
  365, 400, 545, 731, 1827, 2192, 2557, 2922, 3288, 3653). Values
  like 45 or 28 throw InvalidParameterException at sync time.

Adds three helpers on Codinglabs\Yolo\Helpers:

* validatePositiveInt(value, key) — filter_var INT with min_range=1,
  throws IntegrityCheckException naming the offending key + value.
* validateCloudWatchLogRetention(value, key) — enum check, throws
  with the valid set listed.
* validateStrictBool(value, key) — filter_var VALIDATE_BOOLEAN with
  NULL_ON_FAILURE; "false" string evaluates to false (not true),
  garbage strings throw.

Used in SyncEcsServiceStep (enable-execute-command),
SyncEcrRepositoryStep (image-retention), SyncTaskLogGroupStep
(log-retention). lifecyclePolicy() promoted to public static so the
ECR payload shape is now also unit-testable (separate follow-up).

20 new tests pin the validators' behaviour incl. the "false"-string
trap the strict-bool helper exists to close.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
stevethomas added a commit that referenced this pull request May 19, 2026
Stacked on PR #25. Closes the loop from `sync:fargate` plumbing to a
working end-to-end deploy:

- `build` appends three Docker steps when the manifest sets `tasks.web`:
  ECR login (SDK token → `docker login --password-stdin`), `docker build`
  on the prepared `.yolo/build/` context, then push both `:{app-version}`
  and `:latest`. `tasks.web.dockerfile` overrides the default `Dockerfile`
  path; `tasks.web.platform` defaults to `linux/amd64` so Apple Silicon
  builds land on Fargate without `--platform` ceremony.

- `deploy` registers a new task definition revision pinning
  `:{app-version}` (DRY-reuses `SyncTaskDefinitionStep::payload(?string)`
  with a new optional tag arg), then `updateService` with
  `forceNewDeployment: true` so ECS rolls the running tasks. `--watch`
  blocks on the `ServicesStable` waiter. Falls back to reading
  `.yolo/build/APP_VERSION` when `--app-version` isn't passed, so
  `yolo build production && yolo deploy production` Just Works.

- Reuses existing `SyncStandaloneRecordSetStep` /
  `SyncMultitenancyRecordSetStep` for the ALB → Route 53 alias.
  `ChecksIfCommandsShouldBeRunning` already gates them on tenancy mode.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
stevethomas added a commit that referenced this pull request May 19, 2026
DeployCommand unconditionally enrols this step, but Manifest::apex()
has a `: string` return type and TypeErrors when neither `apex` nor
`domain` is set in the manifest — a valid v1 configuration for a
Fargate POC sitting behind the default ALB DNS.

Mirrors the same SKIPPED early-return guard already applied to
SyncHttpsListenerStep and SyncListenerRuleStep on PR #25.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…account lookups

Repro: `php vendor/bin/yolo sync:compute production --dry-run` against
a fresh AWS account (no ECS cluster yet) ran the first 9 steps cleanly
(ECR, cluster, task SG, ALB, target group, listeners, task def — all
WOULD_CREATE) then aborted at step 10 with an uncaught Guzzle 400:

    {"__type":"ClusterNotFoundException","message":"Cluster not found."}

`UsesEcs::ecsService()` called `describeServices` with no try/catch,
so the SDK's `ClusterNotFoundException` (Cluster doesn't exist yet
on a cold account) escaped past the calling step's
`catch (ResourceDoesNotExistException)` branch — which is where the
`if (dry-run) return WOULD_CREATE` lives.

Mirrors the existing pattern in `ecsTaskDefinition()` at
src/Concerns/UsesEcs.php — translate any AwsException to
ResourceDoesNotExistException so the caller's dry-run + create branches
work.

Audited siblings under src/Concerns/Uses*.php:
- ecsTaskDefinition, ecrRepository, targetGroup, getParameter — already
  wrapped, no change.
- loadBalancer, dbSubnetGroup, hostedZones, certificates, securityGroups,
  vpc, internetGateway, routeTable, subnets, logGroup — enumerate then
  fall through to ResourceDoesNotExistException naturally; no parent-
  identifier reference that could 404.
- loadBalancerListenerOnPort, listenerCertificate — call sites guarantee
  the parent (ALB / listener) exists before invocation; the protective
  ResourceDoesNotExistException fires upstream.

Also drops `EnsureResourceNameLengthsStep` — AWS's native
`InvalidParameterValue: Target group name cannot exceed 32 characters`
error is already actionable. The step was a busywork pre-flight that
added noise to sync output without meaningful ergonomic gain.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
stevethomas added a commit that referenced this pull request May 19, 2026
Stacked on PR #25. Closes the loop from `sync:fargate` plumbing to a
working end-to-end deploy:

- `build` appends three Docker steps when the manifest sets `tasks.web`:
  ECR login (SDK token → `docker login --password-stdin`), `docker build`
  on the prepared `.yolo/build/` context, then push both `:{app-version}`
  and `:latest`. `tasks.web.dockerfile` overrides the default `Dockerfile`
  path; `tasks.web.platform` defaults to `linux/amd64` so Apple Silicon
  builds land on Fargate without `--platform` ceremony.

- `deploy` registers a new task definition revision pinning
  `:{app-version}` (DRY-reuses `SyncTaskDefinitionStep::payload(?string)`
  with a new optional tag arg), then `updateService` with
  `forceNewDeployment: true` so ECS rolls the running tasks. `--watch`
  blocks on the `ServicesStable` waiter. Falls back to reading
  `.yolo/build/APP_VERSION` when `--app-version` isn't passed, so
  `yolo build production && yolo deploy production` Just Works.

- Reuses existing `SyncStandaloneRecordSetStep` /
  `SyncMultitenancyRecordSetStep` for the ALB → Route 53 alias.
  `ChecksIfCommandsShouldBeRunning` already gates them on tenancy mode.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
stevethomas added a commit that referenced this pull request May 19, 2026
DeployCommand unconditionally enrols this step, but Manifest::apex()
has a `: string` return type and TypeErrors when neither `apex` nor
`domain` is set in the manifest — a valid v1 configuration for a
Fargate POC sitting behind the default ALB DNS.

Mirrors the same SKIPPED early-return guard already applied to
SyncHttpsListenerStep and SyncListenerRuleStep on PR #25.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
stevethomas added a commit that referenced this pull request May 19, 2026
Mirrors PR #25's SyncCommand gate. Without it, a non-Fargate consumer
running `yolo deploy production` would unconditionally execute the
six Fargate-flavoured steps — registering a stray ECS task definition
revision, then crashing at UpdateEcsServiceStep with
ServiceNotFoundException. Partial state + opaque error.

The gate matches the existing entry-message pattern (early `error()`
+ FAILURE return) so the failure surface is operator-friendly: clear
message, no AWS-side side effects.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`/healthy` was a carry-over from the alpha lineage with no convention
behind it. `/health` is the generic HTTP health-endpoint convention
that doesn't tie YOLO's default to a specific Laravel version. (Laravel
11+ ships `/up` as its native default; consumers can override via
`tasks.web.health-check.path`.)

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
stevethomas added a commit that referenced this pull request May 19, 2026
Stacked on PR #25. Closes the loop from `sync:fargate` plumbing to a
working end-to-end deploy:

- `build` appends three Docker steps when the manifest sets `tasks.web`:
  ECR login (SDK token → `docker login --password-stdin`), `docker build`
  on the prepared `.yolo/build/` context, then push both `:{app-version}`
  and `:latest`. `tasks.web.dockerfile` overrides the default `Dockerfile`
  path; `tasks.web.platform` defaults to `linux/amd64` so Apple Silicon
  builds land on Fargate without `--platform` ceremony.

- `deploy` registers a new task definition revision pinning
  `:{app-version}` (DRY-reuses `SyncTaskDefinitionStep::payload(?string)`
  with a new optional tag arg), then `updateService` with
  `forceNewDeployment: true` so ECS rolls the running tasks. `--watch`
  blocks on the `ServicesStable` waiter. Falls back to reading
  `.yolo/build/APP_VERSION` when `--app-version` isn't passed, so
  `yolo build production && yolo deploy production` Just Works.

- Reuses existing `SyncStandaloneRecordSetStep` /
  `SyncMultitenancyRecordSetStep` for the ALB → Route 53 alias.
  `ChecksIfCommandsShouldBeRunning` already gates them on tenancy mode.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
stevethomas added a commit that referenced this pull request May 19, 2026
DeployCommand unconditionally enrols this step, but Manifest::apex()
has a `: string` return type and TypeErrors when neither `apex` nor
`domain` is set in the manifest — a valid v1 configuration for a
Fargate POC sitting behind the default ALB DNS.

Mirrors the same SKIPPED early-return guard already applied to
SyncHttpsListenerStep and SyncListenerRuleStep on PR #25.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
stevethomas added a commit that referenced this pull request May 19, 2026
Mirrors PR #25's SyncCommand gate. Without it, a non-Fargate consumer
running `yolo deploy production` would unconditionally execute the
six Fargate-flavoured steps — registering a stray ECS task definition
revision, then crashing at UpdateEcsServiceStep with
ServiceNotFoundException. Partial state + opaque error.

The gate matches the existing entry-message pattern (early `error()`
+ FAILURE return) so the failure surface is operator-friendly: clear
message, no AWS-side side effects.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@stevethomas stevethomas merged commit b9f85ee into main May 19, 2026
5 checks passed
@stevethomas stevethomas deleted the steve/lpx-578-sync-command-ecs-cluster-alb-adoption branch May 19, 2026 12:14
stevethomas added a commit that referenced this pull request May 19, 2026
Stacked on PR #25. Closes the loop from `sync:fargate` plumbing to a
working end-to-end deploy:

- `build` appends three Docker steps when the manifest sets `tasks.web`:
  ECR login (SDK token → `docker login --password-stdin`), `docker build`
  on the prepared `.yolo/build/` context, then push both `:{app-version}`
  and `:latest`. `tasks.web.dockerfile` overrides the default `Dockerfile`
  path; `tasks.web.platform` defaults to `linux/amd64` so Apple Silicon
  builds land on Fargate without `--platform` ceremony.

- `deploy` registers a new task definition revision pinning
  `:{app-version}` (DRY-reuses `SyncTaskDefinitionStep::payload(?string)`
  with a new optional tag arg), then `updateService` with
  `forceNewDeployment: true` so ECS rolls the running tasks. `--watch`
  blocks on the `ServicesStable` waiter. Falls back to reading
  `.yolo/build/APP_VERSION` when `--app-version` isn't passed, so
  `yolo build production && yolo deploy production` Just Works.

- Reuses existing `SyncStandaloneRecordSetStep` /
  `SyncMultitenancyRecordSetStep` for the ALB → Route 53 alias.
  `ChecksIfCommandsShouldBeRunning` already gates them on tenancy mode.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
stevethomas added a commit that referenced this pull request May 19, 2026
DeployCommand unconditionally enrols this step, but Manifest::apex()
has a `: string` return type and TypeErrors when neither `apex` nor
`domain` is set in the manifest — a valid v1 configuration for a
Fargate POC sitting behind the default ALB DNS.

Mirrors the same SKIPPED early-return guard already applied to
SyncHttpsListenerStep and SyncListenerRuleStep on PR #25.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
stevethomas added a commit that referenced this pull request May 19, 2026
Mirrors PR #25's SyncCommand gate. Without it, a non-Fargate consumer
running `yolo deploy production` would unconditionally execute the
six Fargate-flavoured steps — registering a stray ECS task definition
revision, then crashing at UpdateEcsServiceStep with
ServiceNotFoundException. Partial state + opaque error.

The gate matches the existing entry-message pattern (early `error()`
+ FAILURE return) so the failure surface is operator-friendly: clear
message, no AWS-side side effects.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
stevethomas added a commit that referenced this pull request May 19, 2026
* feat(LPX-576, LPX-579): build + deploy commands for first Fargate POC

Stacked on PR #25. Closes the loop from `sync:fargate` plumbing to a
working end-to-end deploy:

- `build` appends three Docker steps when the manifest sets `tasks.web`:
  ECR login (SDK token → `docker login --password-stdin`), `docker build`
  on the prepared `.yolo/build/` context, then push both `:{app-version}`
  and `:latest`. `tasks.web.dockerfile` overrides the default `Dockerfile`
  path; `tasks.web.platform` defaults to `linux/amd64` so Apple Silicon
  builds land on Fargate without `--platform` ceremony.

- `deploy` registers a new task definition revision pinning
  `:{app-version}` (DRY-reuses `SyncTaskDefinitionStep::payload(?string)`
  with a new optional tag arg), then `updateService` with
  `forceNewDeployment: true` so ECS rolls the running tasks. `--watch`
  blocks on the `ServicesStable` waiter. Falls back to reading
  `.yolo/build/APP_VERSION` when `--app-version` isn't passed, so
  `yolo build production && yolo deploy production` Just Works.

- Reuses existing `SyncStandaloneRecordSetStep` /
  `SyncMultitenancyRecordSetStep` for the ALB → Route 53 alias.
  `ChecksIfCommandsShouldBeRunning` already gates them on tenancy mode.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: preserve yolo.yml deploy + deploy-all blocks for Fargate

PR #26's first cut silently dropped manifest deploy: / deploy-all:
execution — the alpha-era StartCommand ran them server-side via
CodeDeploy hooks, and the new DeployCommand had no equivalent.

- ExecuteDeployStepsStep (new) runs the `deploy:` block as a one-off
  ECS RunTask on the just-registered task definition revision, before
  the service is rolled. The container CMD is overridden to `sh -c
  "set -e; <commands>"` so a failed migration aborts the deploy with
  the service still on the previous revision (no traffic shift,
  rollback is a no-op).

- GenerateEntrypointScriptStep (new) writes
  `.yolo/build/.yolo-entrypoint.sh` during build. The script runs each
  `deploy-all:` command then `exec "$@"`. The consumer's Dockerfile
  wires it as `ENTRYPOINT ["/app/.yolo-entrypoint.sh"]` so per-container
  hooks (artisan optimize etc.) fire on every task start without YOLO
  reaching into the container at runtime.

deploy: → one-off RunTask (runs once per deploy, before traffic switch)
deploy-all: → entrypoint wrapper (runs on every container start)

Manifest abstraction preserved end-to-end.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: guard SyncStandaloneRecordSetStep against domain-less manifests

DeployCommand unconditionally enrols this step, but Manifest::apex()
has a `: string` return type and TypeErrors when neither `apex` nor
`domain` is set in the manifest — a valid v1 configuration for a
Fargate POC sitting behind the default ALB DNS.

Mirrors the same SKIPPED early-return guard already applied to
SyncHttpsListenerStep and SyncListenerRuleStep on PR #25.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: gate `yolo deploy` behind Manifest::has('tasks.web')

Mirrors PR #25's SyncCommand gate. Without it, a non-Fargate consumer
running `yolo deploy production` would unconditionally execute the
six Fargate-flavoured steps — registering a stray ECS task definition
revision, then crashing at UpdateEcsServiceStep with
ServiceNotFoundException. Partial state + opaque error.

The gate matches the existing entry-message pattern (early `error()`
+ FAILURE return) so the failure surface is operator-friendly: clear
message, no AWS-side side effects.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: gate BuildCommand's Docker steps on Manifest::has('tasks'), not 'tasks.web'

Docker build + ECR push are task-type-agnostic — the same Laravel-in-a-
container image serves whatever task type the deploy targets. The
runtime difference (octane:start vs queue:work vs schedule:run) lives
in the task definition's `command` override, not in the image.

The original `Manifest::has('tasks.web')` gate meant queue- or
scheduler-only apps got the base build pipeline (rsync, composer
install, npm) but no image to deploy. Broadening to `Manifest::has(
'tasks')` future-proofs without changing today's behaviour (web is
still the only task type in the schema).

DeployCommand's gate stays at `tasks.web` deliberately — today's
DeployCommand is web-specific (it updates the ECS web service). Queue
and scheduler deploys will branch or get their own commands when
those task types land (LPX-580+).

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
stevethomas added a commit that referenced this pull request May 19, 2026
…out a public face (#29)

* feat(LPX-604): headless mode — skip ALB/listeners/Route53 when no public face

Detect headless apps from manifest shape (no `domain`, no `apex`, no
tenant declares either) and skip every step that touches a public-
facing AWS resource: ALB, target group, HTTP/HTTPS listeners, listener
rule, hosted zone, SSL cert, Route 53 record set. The ECS service
still runs without a load balancer association.

Use cases: queue worker pools, scheduler-only apps, internal services
with no public surface.

Surface:

- `Manifest::isHeadless()` — true when neither `domain`/`apex` is set
  on the solo manifest, and (for multitenant) every tenant entry lacks
  both keys.
- `Contracts/ExecutesWebStep` — new marker for ALB-side steps. Applied
  to SyncLoadBalancerStep, SyncTargetGroupStep, SyncHttpListenerStep,
  SyncHttpsListenerStep, SyncListenerRuleStep.
- `ExecutesDomainStep` finally wired into the gate (was an unused
  marker, flagged in the PR #26 review). Applied to
  SyncSoloRecordSetStep alongside its existing ExecutesSoloStep.
- `ChecksIfCommandsShouldBeRunning` skips both markers under headless.
- `SyncEcsServiceStep` create/update payloads drop `loadBalancers`
  and `healthCheckGracePeriodSeconds` when headless. Extracted
  `needsUpdate()` / `createPayload()` / `updatePayload()` as pure
  helpers — also closes the test-coverage gap the PR #25 review
  flagged for this step.
- `Manifest::tenants()` no longer TypeErrors on a headless tenant
  entry — apex normalisation gracefully resolves to null instead of
  bombing on `str_starts_with(null, ...)`.

16 new unit tests pin the headless detector matrix, the
SyncEcsServiceStep drift predicate (incl. carry-over coverage gap from
PR #25 review), the headless conditional payload shape, and the
tenants() normaliser's headless tolerance.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(LPX-604): unify shouldBeRunning gate into one compound skip block

Three step-skip conditions (solo-in-multitenant, multitenancy-in-solo,
public-step-in-headless) sat as two separate blocks. Folding into one
compound `if` matches the shape of the check (all are "skip when
manifest mode disagrees with the marker") and makes the skip surface
trivially extensible for the next marker we add.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(LPX-604): collapse ExecutesDomainStep into ExecutesWebStep

Two markers for "stuff that only matters with a public face" was YAGNI
— no real step is web-but-not-domain or domain-but-not-web. Headless
mode skips the lot uniformly.

* ExecutesWebStep is now the sole public-facing marker.
* ExecutesDomainStep contract deleted.
* SyncHostedZoneStep, SyncSslCertificateStep, SyncSoloRecordSetStep
  re-tagged with ExecutesWebStep.
* ChecksIfCommandsShouldBeRunning skip-clause halves in size.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(LPX-604): split headless gate from the tenancy-mode gate

The previous compound-skip block lumped three different concerns into
one OR-chain. Tenancy-mode mismatch (solo step in mt app, mt step in
solo app) and "is this a public-facing step in a headless app" are
orthogonal — readability wins from separation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: guard AWS account mismatch in Command::execute

Before any AWS-touching command runs handle(), verify the manifest's
aws.account-id matches sts:GetCallerIdentity for the resolved profile.
Catches the foot-gun where YOLO_<ENV>_AWS_PROFILE points at the wrong
account and the manifest's ID gets used silently for ARN interpolation.

One STS call per command run (~50ms). Skips when the manifest doesn't
declare an account (legacy / test fixtures).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor: throw IntegrityCheckException from the account-mismatch guard

`ensureManifestAccountMatchesProfile()` returned bool + called error()
itself, while every other manifest-validation failure in the codebase
(Manifest::apex, Helpers::validate*, EnsureEnvIsConfiguredCorrectlyStep,
SyncListenerRuleStep::nextAvailablePriority, the SG-rule integrity
guards) throws IntegrityCheckException. Aligning with the prevailing
pattern: void return, throw on either STS lookup failure or account
mismatch, Symfony's exception renderer handles the styled output for
free.

DX wins: single responsibility (guard either succeeds or throws), the
caller drops the `if (! ...) return 1` shim, stack trace available with
-vvv for debugging.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor: revert account guard to bool+error() pattern, add profileAccountId helper

Two pieces of feedback:

1. Throwing IntegrityCheckException from Command::execute() broke the
   prevailing pattern at that layer — every other early gate (missing
   manifest, wrong environment, missing AWS_PROFILE) uses
   `error() + return 1`. The IntegrityCheckException throws all live
   deeper, in Steps and Manifest helpers. Reverted to bool return so
   the four early gates look identical.

2. Was reaching past Aws::accountId() to call Manifest::get() directly.
   Plus the STS side had no helper. Added Aws::profileAccountId()
   (named for the .env profile rather than 'caller', since the value
   is whatever YOLO_<ENV>_AWS_PROFILE resolves to) so both sides go
   through Aws::*.

Presence check now Manifest::has('aws.account-id') rather than the
truthy-check on Aws::accountId() — the latter has a `: string` return
type that TypeErrors when the key is absent, and widening to `?string`
would cascade into every existing caller using sprintf/concat.

InitCommand carve-out at Command::execute() short-circuits before
registerAwsServices(), so new apps don't hit the guard before they've
written a manifest.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: require aws.account-id in the manifest

The "skip when not declared" early-return was a silent foot-gun — same
one the whole guard exists to prevent. A manifest missing
aws.account-id would happily run against whatever YOLO_<ENV>_AWS_PROFILE
resolves to, with no safety net.

Now bails with a clear "must declare aws.account-id under
environments.<env>" message. InitCommand short-circuits before this
guard, so new apps run `yolo init` without tripping it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: require name + aws.region in manifest integrity check

Split the manifest-vs-AWS reconciliation into two early gates in
Command::execute():

* ensureManifestIntegrity() runs *before* registerAwsServices(). Bails
  when any of `name`, `aws.region`, `aws.account-id` is missing.
  `aws.region` is the load-bearing one — RegistersAws feeds it into
  every SDK client at construction time, so a null region collapses
  into a deep AWS SDK error. `name` drives every keyedResourceName()
  and silently produces invalid AWS resource names (trailing hyphen)
  if absent. account-id presence is now also in this group.

* ensureAccountMatchesProfile() runs *after* registerAwsServices()
  and STS-verifies the manifest's account-id against
  YOLO_<ENV>_AWS_PROFILE. Renamed from ensureManifestAccountMatches-
  Profile (the manifest part is implied now that integrity runs
  upstream).

Test split: CommandManifestIntegrityTest (no AWS mocks needed) and
CommandAccountGuardTest (STS mock).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor: extract ensureNameDeclared + ensureManifestKeyDeclared

ensureManifestIntegrity() reads as a series of named gates short-
circuited via && rather than a foreach loop with implicit semantics.
Each extracted method tells you what it's checking — the loop was
clever but obscured the per-key checks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor: collapse ensureNameDeclared into ensureManifestKeyDeclared

Manifest::has() now falls back to top-level when an env-scoped lookup
misses. The dedicated ensureNameDeclared() method + its bespoke error
message were the only special-case for top-level keys; collapsing both
down keeps the integrity check a flat list of `Manifest::has` lookups.

Error message dropped the "under environments.<env>" qualifier — works
uniformly for top-level `name` and env-scoped `aws.*` keys.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor: revert Manifest::has widening, keep ensureNameDeclared separate

Widening Manifest::has() to fall back to top-level was solving one
caller's problem by relaxing the contract of a widely-used helper —
opens subtle ambiguity if a future env-scoped key ever shares a name
with a top-level one. Reverted; ensureNameDeclared() handles the
top-level `name` lookup directly via Manifest::current().

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: drop comments that restate code

routedHosts() comment block restated the ternary, and the tenants
normaliser comment took three lines to caption a one-line null
coalesce. Both code paths read for themselves.

Kept comments elsewhere that explain non-obvious *why* — un-memoised
SG cache (UsesEc2), AwsException translation (UsesEcs), narrow
service drift detection (SyncEcsServiceStep), headless tenant raw
read (Manifest::isHeadless).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: drop Manifest comments that restated code

* `// prefer the apex key when specified` — restated what `get(key, default)` already does.
* `// apex resolves to null for headless tenants` — the `?? ($config['domain'] ?? null)` chain already shows it.

Kept the one-line `// Read raw — tenants() normaliser TypeErrors on a
headless tenant.` in isHeadless() — that's a real why-not-use-tenants()
signal that disappears without it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant