Skip to content

fix: enforce receipts and block leases#241

Closed
richardfogaca wants to merge 1 commit into
compozy:mainfrom
richardfogaca:fix/task-block-receipts
Closed

fix: enforce receipts and block leases#241
richardfogaca wants to merge 1 commit into
compozy:mainfrom
richardfogaca:fix/task-block-receipts

Conversation

@richardfogaca
Copy link
Copy Markdown
Contributor

@richardfogaca richardfogaca commented May 30, 2026

What this changes

This PR adds two task-safety primitives needed for reliable autonomous task routing: explicit run blocking and completion receipt enforcement.

Why

A worker should not be treated as done when required receipt artifacts are missing, and a human-blocked run should not keep a confusing claimed lease. These are the core pieces needed for AGH to park blocked work clearly and fail/retry receipt gaps deterministically.

Implementation notes

  • Adds agh task block and the matching agent UDS/API contract.
  • Adds native tool support for agh__task_run_block so agents can park their current run with the claim token.
  • Implements BlockRunLease across task service interfaces, GlobalDB, daemon bindings, CLI client, coordinator allowlist, and tests.
  • Clears lease ownership when a run is parked in needs_attention, preventing it from looking actively claimed.
  • Enforces completion_contract metadata before a run can complete, requiring declared artifacts to exist.
  • Updates derived task status handling for needs_attention.
  • Regenerates API/tool metadata and includes the existing MDX formatting cleanup required by repo verification.

Reviewer focus

  • Lease/token semantics: only the current claimed run should be blockable.
  • Completion contract behavior: missing required artifacts should produce a clear validation failure.
  • Status derivation: blocked runs should not mask newer terminal work.

Validation

  • PATH="$HOME/.bun/bin:$PATH" make verify

Copilot AI review requested due to automatic review settings May 30, 2026 01:11
@vercel
Copy link
Copy Markdown

vercel Bot commented May 30, 2026

@richardfogaca is attempting to deploy a commit to the Compozy Team on Vercel.

A member of the Team first needs to authorize it.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 30, 2026

Walkthrough

This PR adds a new task-run blocking operation that parks claimed task runs into a "needs_attention" state pending human review, along with a completion-contract validation framework. The feature spans API contracts, HTTP handlers, CLI commands, daemon autonomy tools, storage, service orchestration, and comprehensive test coverage.

Changes

Block Run Lease & Completion Contracts

Layer / File(s) Summary
Tool ID definition and authorization
internal/tools/builtin_ids.go, internal/tools/builtin/autonomy.go, internal/tools/builtin/toolsets.go, internal/tools/builtin/builtin_test.go, internal/coordinator/coordinator.go
ToolIDTaskRunBlock constant declared, added to autonomy toolset, validated as RiskMutating with read/destructive/open-world flags, and authorized in coordinator session allowlist.
Domain types, interfaces, and lease contracts
internal/task/lease.go, internal/task/types.go, internal/task/interfaces.go
LeaseBlock request type with JSON fields and Normalize/Validate methods; BlockRunLeaseMutation payload type; BlockRunLease methods added to Manager and RunStore interfaces.
Completion contract metadata extraction and artifact validation
internal/task/completion_contract.go
New module validates required artifacts from task completion-contract metadata, enforces missing_policy, resolves and verifies artifact paths within workspace root, validates claim-token hashes and lease expiration.
Storage-layer BlockRunLease implementation
internal/store/globaldb/global_db_task_claim.go, internal/store/globaldb/global_db_task_claim_test.go
GlobalDB.BlockRunLease verifies claim token, transitions run to needs_attention, clears lease/claim/session fields, writes block reason to error column; integration test verifies parks runs unclaimably.
Task manager service with BlockRunLease and completion contract integration
internal/task/manager.go, internal/task/lease_manager.go, internal/task/lease_test.go
Service adds BlockRunLease method with authority check, normalization, load/store/reconcile/emit flow; extends CompleteRunLease with lease and contract precondition validation; introduces CompletionContractRootResolver option; refactors task status derivation via snapshot summary; tests verify blocking parks runs and completion enforces required artifacts.
Workspace root resolution for completion contracts
internal/daemon/task_runtime.go
Detects taskStore implementations providing GetWorkspace, registers completion-contract root resolver that validates workspace_id and returns RootDir for artifact path resolution.
HTTP handler, contract type, and route registration
internal/api/contract/agents.go, internal/api/core/agent_tasks.go, internal/api/udsapi/routes.go, internal/api/udsapi/handlers_test.go
AgentTaskBlockRequest contract type; AgentTaskBlock HTTP handler decodes request, retrieves claim token, calls service, returns updated lease; POST /api/agent/tasks/:run_id/block route registered with coverage assertions.
CLI client interface, implementation, and task block command
internal/cli/client.go, internal/cli/task.go, internal/cli/task_test.go
DaemonClient.AgentTaskBlock interface method; unixSocketClient implementation delegates to agentTaskLeaseAction with "block" action; agh task block <run-id> Cobra command with --reason validation, agent identity requirement, and lease output; tests verify command payload and input validation.
Daemon native tools handler for autonomy block operation
internal/daemon/native_tools.go, internal/daemon/native_tools_test.go
daemonNativeTools.autonomyBlock routes ToolIDTaskRunBlock calls, decodes input, derives autonomy context, looks up active lease, calls service, returns structured response with internal tokens redacted; test verifies token redaction and block state transitions.
Test doubles, stubs, and test coverage updates
internal/api/testutil/task_stub.go, internal/daemon/daemon_test.go, internal/cli/helpers_test.go, internal/task/manager_test.go, internal/task/interfaces_integration_test.go
StubTaskManager.BlockRunLeaseFn callback and method; inMemoryManagerStore.BlockRunLease normalizes, validates, transitions to NeedsAttention, clears fields; stubClient.AgentTaskBlock test hook; native tools test mock updates with block tracking; fakeStore interface stub; route/handler coverage tests.

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 11.43% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately captures both main changes: adding task run blocking capability and enforcing completion contract receipts/artifacts.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds a new “block task run” lifecycle action (parking an active lease in needs_attention) and introduces completion-contract validation that requires specified artifacts to exist before a run can be completed.

Changes:

  • Add agh__task_run_block tool across tool IDs, toolsets, native catalog, daemon bindings, coordinator allowlist, CLI, and UDS API.
  • Implement BlockRunLease end-to-end (task service, store interfaces, in-memory store, GlobalDB store logic, and tests).
  • Add completion-contract artifact validation on completion with a pluggable workspace-root resolver.

Reviewed changes

Copilot reviewed 31 out of 31 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
packages/site/content/runtime/core/configuration/config-toml.mdx Adjusts coordinator config table formatting in docs.
internal/tools/builtin_ids.go Adds new builtin tool ID constant for blocking a run lease.
internal/tools/builtin/toolsets.go Includes the new tool in the autonomy toolset list.
internal/tools/builtin/testdata/native-tool-catalog.json Registers the new tool in the native tool catalog testdata.
internal/tools/builtin/builtin_test.go Extends builtin descriptor/toolset tests for the new tool.
internal/tools/builtin/autonomy.go Adds the new autonomy descriptor and its input schema.
internal/task/types.go Introduces mutation type for blocking a run lease.
internal/task/manager_test.go Implements in-memory store behavior for BlockRunLease in tests.
internal/task/manager.go Validates completion contracts on completion; refactors task status derivation to consider needs_attention.
internal/task/lease_test.go Adds tests for blocking a lease and for completion-contract required artifacts.
internal/task/lease_manager.go Adds BlockRunLease service method; adds completion-contract validation to lease completion path.
internal/task/lease.go Adds LeaseBlock with normalization/validation.
internal/task/interfaces_integration_test.go Updates fake store to satisfy new interface method.
internal/task/interfaces.go Extends manager/store interfaces with BlockRunLease.
internal/task/completion_contract.go Adds completion-contract parsing + artifact existence validation + path resolution.
internal/store/globaldb/global_db_task_claim_test.go Adds integration test for GlobalDB BlockRunLease.
internal/store/globaldb/global_db_task_claim.go Implements GlobalDB BlockRunLease and SQL update helper.
internal/daemon/task_runtime.go Injects completion-contract root resolver via workspace store when available.
internal/daemon/native_tools_test.go Extends native tool tests/stubs for task_run_block.
internal/daemon/native_tools.go Adds daemon autonomy binding + handler for task_run_block.
internal/daemon/daemon_test.go Updates recording registry stub for the new store method.
internal/coordinator/coordinator.go Adds task_run_block to coordinator tool allowlist.
internal/cli/task_test.go Adds CLI tests for task block mapping and local validation.
internal/cli/task.go Adds agh task block command implementation.
internal/cli/helpers_test.go Extends stub client to support AgentTaskBlock.
internal/cli/client.go Extends daemon client interface + unix socket client with AgentTaskBlock.
internal/api/udsapi/routes.go Registers the new agent task block endpoint.
internal/api/udsapi/handlers_test.go Updates route coverage/binding tests for the new endpoint.
internal/api/testutil/task_stub.go Extends stub task manager with BlockRunLease.
internal/api/core/agent_tasks.go Implements AgentTaskBlock handler.
internal/api/contract/agents.go Adds API contract type for block request.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/task/manager.go
Comment thread internal/task/completion_contract.go
Comment thread internal/task/lease_manager.go
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (4)
internal/store/globaldb/global_db_task_claim_test.go (1)

841-898: ⚡ Quick win

Wrap this in a t.Run("Should ...") subtest and mark it parallel.

The new case skips the required subtest/parallel pattern used elsewhere in this file.

As per coding guidelines, **/*_test.go: "Structure tests with t.Run("Should ...") subtests and mark them with t.Parallel() by default unless there is a specific reason to disable parallelization."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/store/globaldb/global_db_task_claim_test.go` around lines 841 - 898,
The TestGlobalDBBlockRunLeaseParksNeedsAttention test must be converted into a
parallel subtest: wrap its existing body in a t.Run("Should park blocked run and
mark needs attention", func(t *testing.T) { t.Parallel(); /* existing test code
*/ }) so the test follows the file's subtest/parallel pattern; ensure the
top-level TestGlobalDBBlockRunLeaseParksNeedsAttention function only calls that
t.Run and returns.
internal/daemon/task_runtime.go (1)

498-512: ⚡ Quick win

Validate that workspaceRecord.RootDir is non-empty before returning.

The completion contract root resolver validates that workspace_id is non-empty but doesn't verify that the fetched workspace has a non-empty RootDir. If a workspace record exists with an empty root directory, returning it could cause artifact validation to fail silently or produce misleading errors when resolving artifact paths.

🛡️ Proposed validation for RootDir
 		options = append(options, taskpkg.WithCompletionContractRootResolver(
 			func(ctx context.Context, taskRecord taskpkg.Task, _ taskpkg.Run) (string, error) {
 				workspaceID := strings.TrimSpace(taskRecord.WorkspaceID)
 				if workspaceID == "" {
 					return "", fmt.Errorf("workspace_id is required")
 				}
 				workspaceRecord, err := workspaceStore.GetWorkspace(ctx, workspaceID)
 				if err != nil {
 					return "", err
 				}
+				rootDir := strings.TrimSpace(workspaceRecord.RootDir)
+				if rootDir == "" {
+					return "", fmt.Errorf("workspace %q has empty root_dir", workspaceID)
+				}
-				return workspaceRecord.RootDir, nil
+				return rootDir, nil
 			},
 		))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/daemon/task_runtime.go` around lines 498 - 512, The completion
contract root resolver registered via taskpkg.WithCompletionContractRootResolver
checks workspaceRecord.WorkspaceID but does not validate
workspaceRecord.RootDir; update the anonymous resolver (the func passed to
WithCompletionContractRootResolver) to verify workspaceRecord.RootDir is
non-empty after calling workspaceStore.GetWorkspace(ctx, workspaceID) and return
a clear error (e.g., "workspace root directory is empty") instead of returning
an empty string so callers of taskWorkspaceGetter receive a proper failure when
RootDir is missing.
internal/task/lease_test.go (2)

540-540: 💤 Low value

Consider using fixed time for consistency.

While time.Now() works here since the test doesn't assert on absolute values, other tests in this file use fixed time (e.g., line 352: time.Date(2026, 4, 26, 12, 0, 0, 0, time.UTC)). Using a fixed timestamp improves test determinism and consistency.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/task/lease_test.go` at line 540, Replace the non-deterministic
time.Now() usage for the test-local variable now with a fixed UTC timestamp to
make the test deterministic; locate the assignment to now (now :=
time.Now().UTC()) in internal/task/lease_test.go and set it to a fixed
time.Date(...) value consistent with other tests (e.g., time.Date(2026, 4, 26,
12, 0, 0, 0, time.UTC)).

621-621: 💤 Low value

Consider using fixed time for consistency.

Same suggestion as line 540 - while time.Now() is functional, fixed time would be more consistent with other tests in this file (e.g., line 352).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/task/lease_test.go` at line 621, Replace the dynamic time usage that
sets now := time.Now().UTC() with a fixed, deterministic timestamp (e.g.,
construct via time.Date(..., time.UTC)) so the test uses a constant "now" like
other tests in this file; update any assertions that depend on this value if
needed and ensure the variable name remains now to keep references (e.g., in
lease_test.go tests) consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/api/core/agent_tasks.go`:
- Around line 220-223: The AgentTaskBlock handler uses the non-canonical action
name "task.block" which breaks consistency with the rest of the file; update the
action string passed to agentTaskLeaseMutationSetup in AgentTaskBlock to use the
canonical "agent.task.block" (find the call to agentTaskLeaseMutationSetup in
the AgentTaskBlock function and replace the literal), and run a quick search for
any other solitary "task.block" occurrences to ensure auth/audit policy names
remain consistent with the agent.task.* namespace.

In `@internal/coordinator/coordinator.go`:
- Line 60: PromptOverlay() currently advertises only
"next|heartbeat|complete|fail|release" but ToolIDTaskRunBlock is now
allowlisted; update the prompt text or enum returned by PromptOverlay() to
include "block" so the coordinator guidance matches the new allowed action.
Locate the PromptOverlay() function and add "block" to the list/string of
allowed actions (the same place where "next", "heartbeat", "complete", "fail",
"release" are defined) so the coordinator is explicitly instructed to use
ToolIDTaskRunBlock.

In `@internal/daemon/native_tools_test.go`:
- Around line 6416-6418: The test struct adds blockCalls but never includes it
in totalCalls(), so update the totalCalls() implementation to include blockCalls
in the summed return value (alongside any existing fields like lastBlock,
blockErr, etc.) so that assertions expecting zero task-manager invocations
correctly account for blockCalls (look for the totalCalls() method and the
fields blockCalls, lastBlock, blockErr to modify).

In `@internal/daemon/native_tools.go`:
- Around line 2770-2801: Reject blank/whitespace-only "reason" early in
autonomyBlock before calling lookupAutonomyLease: validate input.Reason (trim
and ensure non-empty) right after decoding input and before the call to
n.lookupAutonomyLease, returning an invalid_input error (same style as
requiredNativeString) when blank; update autonomyBlock (and apply the identical
change to the other handler that also calls n.lookupAutonomyLease) so lease
lookup never runs for empty human-facing reasons.

In `@internal/store/globaldb/global_db_task_claim_test.go`:
- Around line 863-898: The test currently only checks the read model's
ClaimTokenHash after calling globalDB.BlockRunLease but must also assert the raw
persisted claim_token was removed from storage; update the test around
BlockRunLease/ClaimNextRun (the BlockRunLease call and the later ClaimNextRun
assertion) to query the underlying task_runs.claim_token column for claim.Run.ID
(using existing globalDB/sql DB handle or a helper) and assert that the stored
claim_token is NULL/empty, failing the test if the secret still exists; keep the
existing ClaimTokenHash assertion but add this direct DB-level assertion to
ensure redaction is enforced.

In `@internal/task/manager.go`:
- Around line 2884-2897: The task status computation currently checks
snapshot.hasQueuedOrClaimed before snapshot.hasNeedsAttention, causing tasks
with needs_attention to be marked ready when queued siblings exist; update the
logic in summarizeTaskRunPolicySnapshot usage (in the block that inspects
snapshot.hasInProgress, snapshot.hasQueuedOrClaimed, snapshot.hasNeedsAttention
and runnableBlocked) to test snapshot.hasNeedsAttention before checking
snapshot.hasQueuedOrClaimed so that when snapshot.hasNeedsAttention is true the
function returns TaskStatusBlocked (regardless of queued/claimed runs), ensuring
ClaimNextRun and lease blocking honor the blocked state.

---

Nitpick comments:
In `@internal/daemon/task_runtime.go`:
- Around line 498-512: The completion contract root resolver registered via
taskpkg.WithCompletionContractRootResolver checks workspaceRecord.WorkspaceID
but does not validate workspaceRecord.RootDir; update the anonymous resolver
(the func passed to WithCompletionContractRootResolver) to verify
workspaceRecord.RootDir is non-empty after calling
workspaceStore.GetWorkspace(ctx, workspaceID) and return a clear error (e.g.,
"workspace root directory is empty") instead of returning an empty string so
callers of taskWorkspaceGetter receive a proper failure when RootDir is missing.

In `@internal/store/globaldb/global_db_task_claim_test.go`:
- Around line 841-898: The TestGlobalDBBlockRunLeaseParksNeedsAttention test
must be converted into a parallel subtest: wrap its existing body in a
t.Run("Should park blocked run and mark needs attention", func(t *testing.T) {
t.Parallel(); /* existing test code */ }) so the test follows the file's
subtest/parallel pattern; ensure the top-level
TestGlobalDBBlockRunLeaseParksNeedsAttention function only calls that t.Run and
returns.

In `@internal/task/lease_test.go`:
- Line 540: Replace the non-deterministic time.Now() usage for the test-local
variable now with a fixed UTC timestamp to make the test deterministic; locate
the assignment to now (now := time.Now().UTC()) in internal/task/lease_test.go
and set it to a fixed time.Date(...) value consistent with other tests (e.g.,
time.Date(2026, 4, 26, 12, 0, 0, 0, time.UTC)).
- Line 621: Replace the dynamic time usage that sets now := time.Now().UTC()
with a fixed, deterministic timestamp (e.g., construct via time.Date(...,
time.UTC)) so the test uses a constant "now" like other tests in this file;
update any assertions that depend on this value if needed and ensure the
variable name remains now to keep references (e.g., in lease_test.go tests)
consistent.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 49ea97f1-30a9-416c-af70-d1d8fef9c6fd

📥 Commits

Reviewing files that changed from the base of the PR and between cd4b155 and a5e7bf6.

⛔ Files ignored due to path filters (2)
  • internal/tools/builtin/testdata/native-tool-catalog.json is excluded by !**/*.json
  • packages/site/content/runtime/core/configuration/config-toml.mdx is excluded by !**/*.mdx
📒 Files selected for processing (29)
  • internal/api/contract/agents.go
  • internal/api/core/agent_tasks.go
  • internal/api/testutil/task_stub.go
  • internal/api/udsapi/handlers_test.go
  • internal/api/udsapi/routes.go
  • internal/cli/client.go
  • internal/cli/helpers_test.go
  • internal/cli/task.go
  • internal/cli/task_test.go
  • internal/coordinator/coordinator.go
  • internal/daemon/daemon_test.go
  • internal/daemon/native_tools.go
  • internal/daemon/native_tools_test.go
  • internal/daemon/task_runtime.go
  • internal/store/globaldb/global_db_task_claim.go
  • internal/store/globaldb/global_db_task_claim_test.go
  • internal/task/completion_contract.go
  • internal/task/interfaces.go
  • internal/task/interfaces_integration_test.go
  • internal/task/lease.go
  • internal/task/lease_manager.go
  • internal/task/lease_test.go
  • internal/task/manager.go
  • internal/task/manager_test.go
  • internal/task/types.go
  • internal/tools/builtin/autonomy.go
  • internal/tools/builtin/builtin_test.go
  • internal/tools/builtin/toolsets.go
  • internal/tools/builtin_ids.go

Comment thread internal/api/core/agent_tasks.go
Comment thread internal/coordinator/coordinator.go
Comment thread internal/daemon/native_tools_test.go
Comment thread internal/daemon/native_tools.go
Comment thread internal/store/globaldb/global_db_task_claim_test.go
Comment thread internal/task/manager.go Outdated
@richardfogaca richardfogaca force-pushed the fix/task-block-receipts branch from a5e7bf6 to 9e68909 Compare May 30, 2026 01:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants