Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
99 changes: 78 additions & 21 deletions .gitignore

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 6 additions & 0 deletions LEGAL_DISCLAIMER.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
71 changes: 70 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -69,7 +69,76 @@ See the full [ROADMAP](./docs/guides/ROADMAP.md) for details on each iteration.

## Getting started

### Installation and deployment
### Claude Code plugin (recommended)

This repository ships a [Claude Code plugin](https://docs.anthropic.com/en/docs/claude-code/plugins) that provides guided workflows for setup, deployment, task submission, and troubleshooting.

#### Installing the plugin

```bash
git clone https://github.com/aws-samples/sample-autonomous-cloud-coding-agents.git
cd sample-autonomous-cloud-coding-agents
claude --plugin-dir docs/abca-plugin
```

The `--plugin-dir` flag tells Claude Code to load the local plugin from the `docs/abca-plugin/` directory. The plugin's skills, commands, agents, and hooks will be available immediately.

> **Tip:** If you use Claude Code via VS Code or JetBrains, you can add `--plugin-dir docs/abca-plugin` to the extension's CLI arguments setting.

#### What the plugin provides

**Skills** (guided multi-step workflows — Claude activates these automatically based on your request):

| Skill | Triggers on | What it does |
|-------|------------|--------------|
| `setup` | "get started", "install", "first time setup" | Full guided setup: prerequisites, toolchain, deploy, smoke test |
| `deploy` | "deploy", "cdk diff", "destroy" | Deploy, diff, or destroy the CDK stack with pre-checks |
| `onboard-repo` | "add a repo", "onboard", 422 errors | Add a new GitHub repository via Blueprint construct |
| `submit-task` | "submit task", "run agent", "review PR", "quick submit" | Submit a coding task with prompt quality coaching (supports quick mode) |
| `troubleshoot` | "debug", "error", "not working", "failed" | Diagnose deployment, auth, or task execution issues |
| `status` | "status", "health check", "is ABCA running" | Platform health check: stack status, running tasks, build health |

**Agents** (specialized subagents, spawned automatically or via the Agent tool):

| Agent | When it's used |
|-------|---------------|
| `cdk-expert` | CDK architecture, construct design, handler implementation, stack modifications |
| `agent-debugger` | Task failure investigation, CloudWatch log analysis, agent runtime debugging |

**Hook** (runs automatically):

A `SessionStart` hook advertises available skills and agents so Claude can proactively suggest them when your request matches.

#### Local plugin development

If you're modifying the plugin itself, here's the file layout:

```
docs/abca-plugin/
plugin.json # Plugin manifest (name, version, description)
skills/
setup/SKILL.md # First-time setup workflow
deploy/SKILL.md # CDK deployment workflow
onboard-repo/SKILL.md # Repository onboarding workflow
submit-task/SKILL.md # Task submission (guided + quick mode)
troubleshoot/SKILL.md # Diagnostic workflow
status/SKILL.md # Platform health check
agents/
cdk-expert.md # CDK infrastructure specialist
agent-debugger.md # Task failure debugger
hooks/
hooks.json # SessionStart capability advertisement
```

**Key conventions:**
- The plugin lives under `docs/` to keep documentation and plugin content colocated
- Skills live in subdirectories with a `SKILL.md` file (not flat `.md` files)
- Agents are flat `.md` files with YAML frontmatter
- The hook advertises plugin capabilities only (no project-specific content)

**After editing plugin files**, restart Claude Code with `claude --plugin-dir docs/abca-plugin` to pick up changes.

### Manual installation and deployment

Install [mise](https://mise.jdx.dev/getting-started.html) if you want to use repo tasks (`mise run install`, `mise run build`). For monorepo-prefixed tasks (`mise //cdk:build`, etc.), set **`MISE_EXPERIMENTAL=1`** — see [CONTRIBUTING.md](./CONTRIBUTING.md).

Expand Down
43 changes: 38 additions & 5 deletions agent/src/memory.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,18 +7,27 @@
ERROR level to surface bugs quickly.
"""

import hashlib
import os
import re
import time

from sanitization import sanitize_external_content

_client = None

# Validates "owner/repo" format — must match the TypeScript-side isValidRepo pattern.
_REPO_PATTERN = re.compile(r"^[a-zA-Z0-9._-]+/[a-zA-Z0-9._-]+$")

# Current event schema version — used to distinguish records written under
# different namespace schemes (v1 = repos/ prefix, v2 = namespace templates).
_SCHEMA_VERSION = "2"
# Current event schema version:
# v1 = repos/ prefix
# v2 = namespace templates (/{actorId}/...)
# v3 = adds source_type provenance + content_sha256 integrity hash
_SCHEMA_VERSION = "3"

# Valid source_type values for provenance tracking (schema v3).
# Must stay in sync with MemorySourceType in cdk/src/handlers/shared/memory.ts.
MEMORY_SOURCE_TYPES = frozenset({"agent_episode", "agent_learning", "orchestrator_fallback"})


def _get_client():
Expand Down Expand Up @@ -50,7 +59,8 @@ def _log_error(func_name: str, err: Exception, memory_id: str, task_id: str) ->
level = "ERROR" if is_programming_error else "WARN"
label = "unexpected error" if is_programming_error else "infra failure"
print(
f"[memory] [{level}] {func_name} {label}: {type(err).__name__}",
f"[memory] [{level}] {func_name} {label}: {type(err).__name__}: {err}"
f" (memory_id={memory_id}, task_id={task_id})",
flush=True,
)

Expand All @@ -75,6 +85,9 @@ def write_task_episode(
namespace templates (/{actorId}/episodes/{sessionId}/) place records
into the correct per-repo, per-task namespace.

Metadata includes source_type='agent_episode' for provenance tracking
and content_sha256 for integrity auditing on read (schema v3).

Returns True on success, False on failure (fail-open).
"""
try:
Expand All @@ -94,10 +107,16 @@ def write_task_episode(
parts.append(f"Agent notes: {self_feedback}")

episode_text = " ".join(parts)
# Hash the sanitized form; store the original. The read path re-sanitizes
# and checks against this hash: sanitize(original) at write == sanitize(stored) at read.
sanitized_text = sanitize_external_content(episode_text)
content_hash = hashlib.sha256(sanitized_text.encode("utf-8")).hexdigest()

metadata = {
"task_id": {"stringValue": task_id},
"type": {"stringValue": "task_episode"},
"source_type": {"stringValue": "agent_episode"},
"content_sha256": {"stringValue": content_hash},
"schema_version": {"stringValue": _SCHEMA_VERSION},
}
if pr_url:
Expand Down Expand Up @@ -142,12 +161,24 @@ def write_repo_learnings(
namespace templates (/{actorId}/knowledge/) place records into
the correct per-repo namespace.

Metadata includes source_type='agent_learning' for provenance tracking
and content_sha256 for integrity auditing on read (schema v3).
Note: hash auditing only happens on the TS orchestrator read path
(loadMemoryContext in memory.ts) where mismatches are logged but
records are kept — the Python side does not independently check hashes.

Returns True on success, False on failure (fail-open).
"""
try:
_validate_repo(repo)
client = _get_client()

learnings_text = f"Repository learnings: {learnings}"
# Hash the sanitized form; store the original. The read path re-sanitizes
# and checks against this hash: sanitize(original) at write == sanitize(stored) at read.
sanitized_text = sanitize_external_content(learnings_text)
content_hash = hashlib.sha256(sanitized_text.encode("utf-8")).hexdigest()

client.create_event(
memoryId=memory_id,
actorId=repo,
Expand All @@ -156,14 +187,16 @@ def write_repo_learnings(
payload=[
{
"conversational": {
"content": {"text": f"Repository learnings: {learnings}"},
"content": {"text": learnings_text},
"role": "OTHER",
}
}
],
metadata={
"task_id": {"stringValue": task_id},
"type": {"stringValue": "repo_learnings"},
"source_type": {"stringValue": "agent_learning"},
"content_sha256": {"stringValue": content_hash},
"schema_version": {"stringValue": _SCHEMA_VERSION},
},
)
Expand Down
8 changes: 7 additions & 1 deletion agent/src/models.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
from __future__ import annotations

from enum import StrEnum
from typing import Self
from typing import Literal, Self

from pydantic import BaseModel, ConfigDict, Field, model_validator

Expand Down Expand Up @@ -52,6 +52,11 @@ class MemoryContext(BaseModel):
past_episodes: list[str] = Field(default_factory=list)


# Trust classification for content sources — mirrors ContentTrustLevel in context-hydration.ts.
# 'trusted': user-supplied input, 'untrusted-external': GitHub-sourced content,
# 'memory': memory records.
ContentTrustLevel = Literal["trusted", "untrusted-external", "memory"]

# Bump when this agent supports a new orchestrator HydratedContext shape
# (see cdk/src/handlers/shared/context-hydration.ts).
SUPPORTED_HYDRATED_CONTEXT_VERSION = 1
Expand All @@ -73,6 +78,7 @@ class HydratedContext(BaseModel):
guardrail_blocked: str | None = None
resolved_branch_name: str | None = None
resolved_base_branch: str | None = None
content_trust: dict[str, ContentTrustLevel] | None = None

@model_validator(mode="after")
def version_supported(self) -> Self:
Expand Down
5 changes: 3 additions & 2 deletions agent/src/prompt_builder.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@

from config import AGENT_WORKSPACE
from prompts import get_system_prompt
from sanitization import sanitize_external_content as sanitize_memory_content
from shell import log
from system_prompt import SYSTEM_PROMPT

Expand Down Expand Up @@ -49,11 +50,11 @@ def build_system_prompt(
if mc.repo_knowledge:
mc_parts.append("**Repository knowledge:**")
for item in mc.repo_knowledge:
mc_parts.append(f"- {item}")
mc_parts.append(f"- {sanitize_memory_content(item)}")
if mc.past_episodes:
mc_parts.append("\n**Past task episodes:**")
for item in mc.past_episodes:
mc_parts.append(f"- {item}")
mc_parts.append(f"- {sanitize_memory_content(item)}")
if mc_parts:
memory_context_text = "\n".join(mc_parts)
system_prompt = system_prompt.replace("{memory_context}", memory_context_text)
Expand Down
Loading