devils-council

Get a staff-level LGTM on every plan, PR, and RFC — in 48 seconds.

A Staff Engineer, SRE, Product Manager, and Devil's Advocate independently review your work before anyone else sees it. Every finding cites your own words back at you. Ship with confidence.

See it in action (30 seconds, no setup)

# Install
echo '{"plugin":["devils-council-opencode"]}' > .opencode/opencode.json

# Run the built-in demo against a deliberately flawed plan
/devils-council-demo

The demo reviews a notification service plan with over-engineering, circular risk mitigations, and missing operational details. Watch each persona catch different issues.

What it catches (real examples)

Catch	Personas	Impact
Credential migration with no verification gate	SRE + Devil's Advocate	Prevented 173-workspace CI/CD outage
CI trigger premise disproven from both directions	Devil's Advocate + SRE	Killed a design with simultaneous false positives AND false negatives
4 plans build tooling, none prove it works	Product Manager	Caught "plumbing without water" before 2 weeks of wasted effort
Threat model claims control exists, code doesn't	Devil's Advocate + SRE	Avoided false SOC2 attestation
3,291 lines for a problem standard tooling solves	All three (convergent)	Deleted 3,291 lines, replaced with 40

See all case studies →

How it works

Four always-on core personas critique your artifact in parallel. Additional bench personas auto-trigger on structural signals — Helm values changes wake Dual-Deploy; AWS SDK imports wake FinOps; auth/crypto code wakes Security; code diffs wake Junior Engineer.

A Council Chair synthesizes contradictions by name ("PM says ship, SRE says block because...") without collapsing dissent into a scalar verdict. Every finding cites a verbatim quote from the artifact; every finding has a stable ID so dismissals persist across re-runs.

Status: v1.6.0 — Dual-runtime plugin (OpenCode + Claude Code). MIT licensed.

Requirements

OpenCode v1.14+ (primary runtime — npm plugin, no extra install)
Claude Code v2.1.63+ (secondary runtime — marketplace plugin)
jq (macOS: brew install jq; Ubuntu: apt install jq)
python3 + PyYAML (pip install pyyaml) — required for HTML report generation
OpenAI Codex CLI — required for Security + Dual-Deploy deep scans (see Codex Setup)
Node 18+ only if installing Codex via npm (brew install --cask codex avoids this)

Install

OpenCode (recommended)

Add to your project's opencode.json:

{
  "plugin": ["devils-council-opencode"]
}

OpenCode auto-installs npm plugins at startup — no npm install needed. The plugin ships 10 agents:

@staff-engineer, @sre, @product-manager, @devils-advocate — standalone persona reviews
@council-chair — synthesizes findings from multiple persona scorecards
@council-review — full council (4 core + dynamic bench activation + Chair synthesis) in one invocation
@security-reviewer, @finops-auditor, @air-gap-reviewer, @performance-reviewer — bench personas activated by signal detection

Claude Code

From the GitHub marketplace:

/plugin marketplace add astrowicked/devils-council
/plugin install devils-council@devils-council

Verify:

claude plugin list --json | jq '.[] | select(.name=="devils-council") | .version'
# Expected: "1.6.0"

After upgrade (see Troubleshooting #1 if new commands don't appear):

/plugin uninstall devils-council@devils-council
/plugin install devils-council@devils-council

Quickstart

Review a plan

# Full council review (core personas + triggered bench + Chair synthesis)
/devils-council:review path/to/PLAN.md

# GSD integration: review every plan in a phase
/devils-council:on-plan 7

Review code

# Review a diff file
/devils-council:review path/to/diff.patch --type=code-diff

# Review your current branch against main
git diff main...HEAD > /tmp/d.patch && /devils-council:review /tmp/d.patch --type=code-diff

# GSD integration: review the committed diff for a phase
/devils-council:on-code 7                 # auto-discovers phase-start commit
/devils-council:on-code 7 --from HEAD~10  # explicit base when .planning/ is gitignored

Code diffs auto-trigger the Junior Engineer persona (D-12) in addition to the four core personas. Security signals in the diff wake the Security Reviewer; Helm changes wake Dual-Deploy.

Dig into a finding

# Ask a follow-up scoped to ONE persona's scorecard
/devils-council:dig staff-engineer latest "Why is the ConfigLoader a blocker?"
/devils-council:dig security-reviewer latest "justify blocker severity"

Output renders synthesis-first: top-3 blockers with persona attribution, contradictions called out by name, per-persona scorecards collapsed into one-liners by default (pass --show-nits to expand). Raw scorecards live under .council/<run>/<persona>.md.

HTML Reports

After every review, if python3 is available, devils-council automatically generates a self-contained HTML report:

.council/<run-id>/REPORT.html

Open it in any browser — no server needed. The report includes the synthesis, all persona scorecards, finding IDs, and severity badges. Useful for sharing with teammates who don't have the plugin installed.

Generate or regenerate a report manually:

# Latest run
bin/dc-render-html.py latest

# Specific run directory
bin/dc-render-html.py .council/2026-05-16T14-32-00/

The HTML is fully self-contained (inline CSS, no external dependencies) so it survives being attached to a ticket or emailed.

GitHub Action (CI integration)

Teammates see findings on every PR. No local install required for reviewers.

Zero-config (free, no API key)

name: Code Review
on: [pull_request]

jobs:
  council-review:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write
      models: read
    steps:
      - uses: actions/checkout@v4
      - uses: astrowicked/devils-council-action@v2

That's it. No secrets, no signup. Uses GitHub Models (gpt-4.1-mini) for a quick scan from Staff Engineer + SRE personas.

Full review (add API key)

      - uses: astrowicked/devils-council-action@v2
        with:
          anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}

Unlocks all 4 core personas + signal-triggered bench (Security, FinOps, Air-Gap, Performance) + Council Chair synthesis + inline line-level comments.

Review plans on PR

      - uses: astrowicked/devils-council-action@v2
        with:
          artifact: '.planning/**/PLAN.md'
          type: plan
          anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}

The action fetches the PR diff (or specified artifact), runs the council, and posts findings. Blockers appear as REQUEST_CHANGES; majors and minors as COMMENT.

See devils-council-action for full docs and configuration options.

Telemetry (optional)

Devils Council can emit anonymous, opt-in usage events so the maintainer can answer adoption + quality questions like: which personas actually fire, where reviews fail, which prompts produce false positives, which personas can be retired. Default: OFF. No event is sent unless you explicitly opt in AND DO_NOT_TRACK is not set on the runner environment.

Note on "contract": Throughout this section, "contract" means an API contract — a wire-format commitment about field names, types, and additive evolution. It is NOT a legal contract, license, or warranty. Devils Council makes no legal promises to operators or users beyond what the open-source license already grants.

What is NEVER sent — by deliberate design. No source code. No file paths. No line numbers. No finding text. No model output. No SYNTHESIS.md content. No persona reasoning. No PR titles, branch names, or comment bodies. No review verdicts in determinative form. No repository names (only the opaque server-side HMAC of owner/repo). The fields enumerated below are the complete v1 surface; adding any field that could carry intellectual property, source content, or model-generated text would be a privacy regression and is rejected on principle.

Why opt in

Devils Council is a small project. The only way we know which personas actually fire, which prompts produce noise, and where reviews fail is the events you send. With a handful of opted-in operators we can retire personas no one ever invokes, tune prompts that produce false positives, and fix error classes that cluster. With zero opted-in operators we're guessing. We send no code, no finding text, no repo names — just counts, persona names, and error classes (full schema below).

Opt in — GitHub Action

- uses: astrowicked/devils-council-action@v2
  with:
    anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}
    telemetry: true

See the action's Telemetry docs for the full input list, fire-and-forget guarantees, and the workflow feedback-targets artifact.

Opt in — OpenCode / Claude Code plugins

Both plugin runtimes (OpenCode and Claude Code) emit a single review.completed event per /devils-council:review run when opted in. Emission is env-var only: set DC_TELEMETRY=true in the shell that runs OpenCode or Claude Code, and the next slash-command invocation will fire the POST after rendering completes.

export DC_TELEMETRY=true
# then run OpenCode or Claude Code as normal; /devils-council:review will emit

Env var	Purpose	Default
`DC_TELEMETRY`	Set to literal string `true` to enable emission.	unset (off)
`DC_TELEMETRY_ENDPOINT`	Override the default endpoint (must be `https://`).	`https://dc-telemetry.pub.andyjwoodard.net/v1/events`
`DO_NOT_TRACK`	Set to `1` to hard-disable telemetry regardless of any other signal.	unset

DC_TELEMETRY requires the exact string "true" — 1, yes, on are NOT accepted. This is a deliberate "explicit-only" gate to avoid drift across the runtimes' env-handling quirks.

The plugins only emit from github repos — bin/dc-telemetry.sh inspects git remote get-url origin; non-github remotes (gitlab, bitbucket, self-hosted Gitea, no remote at all) silently skip the POST without erroring. Reason: the dc-telemetry wire schema requires owner_repo in owner/repo form; non-github URLs cannot be parsed unambiguously.

In v1 only review.completed is emitted from the plugins. review.failed emission is deferred — see PLUG-05 in the dc-telemetry backlog.

Manual invocation

Telemetry fires automatically at the end of /devils-council:review. To manually emit for a prior council run (or to inspect the payload without POSTing), invoke the script directly:

# preview only — no POST
DC_TELEMETRY=true bin/dc-telemetry.sh .council/<run-dir> --runtime=claude-code --dry-run
# or --runtime=opencode

# real emit
DC_TELEMETRY=true bin/dc-telemetry.sh .council/<run-dir> --runtime=claude-code

<run-dir> is the timestamped directory produced by a prior /devils-council:review invocation (look under .council/).

`DO_NOT_TRACK` precedence (hard override)

Precedence chain, highest wins:

DO_NOT_TRACK=1 environment variable on the runner / dev shell — HARD OFF.
Client config flag (action input telemetry: false, or plugin DC_TELEMETRY not equal to "true").
Opt-in input (telemetry: true on the action; DC_TELEMETRY=true for plugins).

If DO_NOT_TRACK=1 is set, NO event is sent regardless of any other config — even if telemetry: true is also set on the action or DC_TELEMETRY=true is set in the plugin shell. The action logs ::notice::telemetry: DO_NOT_TRACK honored, no event sent; the plugins exit silently (no logging surface — fire-and-forget by design).

Event envelope

Every event uses the v1 envelope:

Field	Type	Required	Notes
`event_type`	string enum	yes	One of `review.completed` \| `review.failed`.
`owner_repo`	string `owner/repo`	yes	Server HMACs this into `repo_hash` and drops the plaintext. Clients never send a hash.
`payload`	JSON object	yes	Per-event shape; see tables below.

Every payload also carries schema_version: 1 so future additions are additive without breaking existing consumers.

`review.completed` payload fields

Field	Type	Required	Example	Notes
`schema_version`	integer	yes	`1`	Always `1` in v1; additive-only API contract going forward.
`ts`	string (ISO)	yes	`2026-05-19T19:03:56.060Z`	Client-set UTC ISO8601.
`source`	enum string	yes	`"action"`	One of `"action"` \| `"plugin"`.
`runtime`	enum string	yes	`"action-shell"`	One of `"action-shell"` \| `"opencode"` \| `"claude-code"`.
`client_version`	string	yes	`"v2.3.0"`	The action version (or plugin version) emitting the event.
`provider`	enum string	no	`"anthropic"`	One of `"anthropic"` \| `"bedrock"` \| `"github-models"`. Nullable.
`model`	string	no	`"claude-sonnet-4-20250514"`	The model ID used. Nullable.
`blocker_count`	integer	yes	`0`	Count of BLOCKER-severity findings.
`major_count`	integer	yes	`2`	Count of MAJOR-severity findings.
`minor_count`	integer	yes	`1`	Count of MINOR-severity findings.
`nit_count`	integer	yes	`4`	Count of NIT-severity findings.
`findings_total`	integer	yes	`7`	Total findings produced (pre-threshold filter).
`findings_dropped`	integer	yes	`0`	Findings filtered by `severity-threshold` or `max-findings`.
`triggered_bench`	array	yes	`["staff-engineer", "security-reviewer"]`	Persona names whose scorecards had non-empty findings for this run.
`duration_s`	number	yes	`48`	End-to-end wall-clock for the council review, in seconds.

No verdict field is sent. Downstream dashboards derive a verdict-like signal from the counts (blocker_count > 0 → BLOCK, major_count > 0 → WARN, else PASS). Keeping the determination out of the wire payload removes a future leak vector for the underlying reasoning.

`review.failed` payload fields

Field	Type	Required	Example	Notes
`schema_version`	integer	yes	`1`	Same as above.
`ts`	string (ISO)	yes	`2026-05-19T19:03:56.060Z`	Client-set UTC ISO8601.
`source`	enum string	yes	`"action"`	One of `"action"` \| `"plugin"`.
`runtime`	enum string	yes	`"action-shell"`	Same enum as `review.completed`.
`client_version`	string	yes	`"v2.3.0"`	Same as `review.completed`.
`error_class`	enum string	yes	`"synthesis-failed"`	Failure classification. Full v1 enum: `synthesis-failed`, `prep-failed`, `no-api-key`, `timeout`, `model-error`, `validation-failed`. Different client versions emit different subsets — `devils-council-action@v2.3.0` emits `synthesis-failed` + `prep-failed`; the rest are reserved for future clients with finer classification.
`stage`	enum string	yes	`"synthesis"`	One of `"prep"` \| `"persona-run"` \| `"validation"` \| `"synthesis"`.

Consumers MUST treat any unknown error_class value as the catch-all "failure of an unspecified subclass" rather than rejecting the event.

No error message strings, no stack traces, no log lines are sent. Only the classified error_class + stage enums.

Fire-and-forget guarantees

3-second total time budget on the POST (curl --max-time 3 --connect-timeout 2).
POST failure NEVER blocks the action or the plugin — the emitter always exits 0 / continues normally.
Successful POSTs log nothing; failures log a single ::notice:: (action) or equivalent (plugin) annotation.

Self-host

The reference endpoint is one collector, not a required dependency. Run your own (see dc-telemetry) and point clients at it.

For the GitHub Action, set telemetry-endpoint::

- uses: astrowicked/devils-council-action@v2
  with:
    anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}
    telemetry: true
    telemetry-endpoint: https://telemetry.yourcompany.example/v1/events

For the OpenCode / Claude Code plugins, export DC_TELEMETRY_ENDPOINT in the same shell:

export DC_TELEMETRY=true
export DC_TELEMETRY_ENDPOINT=https://telemetry.yourcompany.example/v1/events

The endpoint must be https:// — bin/dc-telemetry.sh silently rejects any other scheme (SSRF mitigation). The collector requires an HMAC secret to boot; without it the service refuses to start.

Persona Roster

Sixteen personas ship in v1.6.0. Core tier always runs; bench tier auto-triggers on artifact signals.

Persona	Tier	Primary Concern	Trigger / always-on
Staff Engineer	core	Simplicity, YAGNI, right-sized design	always-on
SRE / On-call	core	Operational reality, failure modes	always-on
Product Manager	core	Business alignment, user value	always-on
Devil's Advocate	core	Red-team, premise attack	always-on
Security Reviewer	bench	Authn/z, crypto, input handling, dependencies	auth/crypto code, secret handling, dep update
FinOps Auditor	bench	Cloud cost, storage/compute efficiency	AWS SDK, new cloud resource, autoscaling, storage class
Air-Gap Reviewer	bench	Self-hosted, no-egress, pinned deps	network egress, external image pull, unpinned dep, license phone-home
Dual-Deploy Reviewer	bench	SaaS + self-hosted parity, Helm/KOTS surface	Helm values change, Chart.yaml, KOTS config, SaaS-only assumption
Compliance Reviewer	bench	Regulatory control citations (GDPR, HIPAA, SOC2, PCI)	compliance_marker signal
Performance Reviewer	bench	Hot-path characterization, worst-case latency	perf_sensitive signal
Test Lead	bench	Circular tests, flaky patterns, coverage gaps	test_imbalance signal
Executive Sponsor	bench	Quantified business impact (dollars, weeks, customers)	exec_keyword signal (plan/rfc only)
Competing Team Lead	bench	Shared-infra consumer impact	shared_infra_change signal
Junior Engineer	bench	First-person comprehension failure	always-invokable on code-diff (D-12)
Council Chair	chair	Contradiction synthesis, top-3 blockers	always-on; runs sequentially after core + bench
Artifact Classifier	classifier	Ambiguous artifact type routing	Haiku fallback when structural signals are zero

Each persona has a distinct value-system anchor, characteristic-objection list, and banned-phrase list; a blinded-reader test on a sample artifact can attribute scorecards to personas without filenames (Phase 4 CORE-05 criterion; verified 2026-04-23).

Trigger Rules

Bench personas auto-join the review when the classifier detects structural signals in the artifact. Signal detection is pure-function filename + AST + Helm-key + Chart.yaml + AWS-SDK-import matching — NOT keyword matching (per BNCH-01). Ambiguous artifacts fall back to a Haiku-classifier subagent.

21 signals (expand)

Source of truth: lib/signals.json.

Signal ID	Description	Target personas
`auth_code_change`	Auth/session/token code (auth, session, jwt*, bcrypt/jose imports, /login endpoints)	security-reviewer
`crypto_import`	Imports of cryptographic primitives or RNG	security-reviewer
`secret_handling`	Env-var reads, secret-manager fetches, KMS calls	security-reviewer
`dependency_update`	Lockfile / package.json / requirements.txt / go.sum / Cargo.lock diffs	security-reviewer, air-gap-reviewer
`aws_sdk_import`	New use of AWS SDK client (`@aws-sdk/client-*`, `aws-sdk`, `boto3`)	finops-auditor
`new_cloud_resource`	New Terraform / CDK / CloudFormation / Pulumi resource	finops-auditor, dual-deploy-reviewer
`autoscaling_change`	HPA, ASG, replica count, batch concurrency	finops-auditor
`storage_class_change`	S3 StorageClass, GCS storageClass, PV storageClassName, lifecycle policies	finops-auditor
`network_egress`	Outbound calls to non-loopback hostnames, NetworkPolicy egress	air-gap-reviewer
`external_image_pull`	Dockerfile FROM / Pod image / values image.repository outside org registry	air-gap-reviewer, dual-deploy-reviewer
`unpinned_dependency`	Ranges instead of pins, `:latest` tags, unversioned deps	air-gap-reviewer
`license_phone_home`	Sentry/Datadog/Mixpanel/Amplitude/license-server SDKs	air-gap-reviewer
`helm_values_change`	`values.yaml` / `values.schema.json` / templates referencing `.Values.*`	dual-deploy-reviewer
`chart_yaml_present`	Artifact contains or modifies `Chart.yaml`	dual-deploy-reviewer
`kots_config_change`	`kots-*.yaml` / `kind: Config` (kots.io)	dual-deploy-reviewer
`saas_only_assumption`	Single-tenant, cloud-only architectural assumptions	dual-deploy-reviewer
`compliance_marker`	GDPR/HIPAA/SOC2/PCI references, data-residency requirements	compliance-reviewer
`perf_sensitive`	Hot-path annotations, latency SLOs, throughput targets	performance-reviewer
`test_imbalance`	Test-to-implementation ratio anomalies, mock-heavy patterns	test-lead
`exec_keyword`	Business-impact language, OKR references, revenue/cost framing	executive-sponsor
`shared_infra_change`	Shared platform components, cross-team API surface changes	competing-team-lead

Override the auto-selection with flags:

/devils-council:review path/to/diff --only=security-reviewer,sre
/devils-council:review path/to/diff --exclude=devils-advocate
/devils-council:review path/to/diff --cap-usd=0.25

Trigger reasons appear in MANIFEST.trigger_reasons{} for every bench persona that joined a run.

Commands

Command	Purpose	Example
`/devils-council:review <artifact>`	Primary entry point: runs core + triggered bench + Chair synthesis on a file / stdin	`/devils-council:review plan.md`
`/devils-council:on-plan <phase>`	Review every `.planning/phases/<NN>-/<NN>--PLAN.md` for a GSD phase	`/devils-council:on-plan 7`
`/devils-council:on-code <phase>`	Review the committed diff for a GSD phase; `--from <ref>` fallback for commit_docs=false projects	`/devils-council:on-code 7 --from HEAD~10`
`/devils-council:dig <persona> <run-id\|latest> [question]`	Ask a follow-up scoped to ONE persona's scorecard; single-turn, ephemeral	`/devils-council:dig security-reviewer latest "justify blocker severity"`
`/devils-council:create-persona [slug]`	Interactive wizard to scaffold a custom persona with voice-kit coaching	`/devils-council:create-persona cost-hawk`

Flags on review:

--only=<persona,persona> — force-include (ignores triggers)
--exclude=<persona,persona> — force-exclude
--cap-usd=<N> — override budget cap
--type=code-diff|plan|rfc — override auto-detected artifact type
--show-nits — expand collapsed nits inline (default: nits hidden, one-line summary)

Configuration

Budget cap (config.json)

{
  "budget": {
    "cap_usd": 0.50,
    "per_persona_estimate_usd": 0.05,
    "bench_priority_order": ["security-reviewer", "compliance-reviewer", "dual-deploy-reviewer", "performance-reviewer", "finops-auditor", "air-gap-reviewer", "test-lead", "executive-sponsor", "competing-team-lead"]
  }
}

Default cap is $0.50 / 30s per invocation. When exceeded, further bench fan-out is halted and skipped personas appear in MANIFEST.personas_skipped[]. Override per-invocation with --cap-usd=<N>.

GSD integration

Off by default. When enabled, a one-line pointer appears after gsd-plan-checker or gsd-code-reviewer runs, directing you to /devils-council:review <path>.

For OpenCode, set in your project's opencode.json:

{
  "plugin": ["devils-council-opencode"],
  "devils-council": {
    "gsd_integration": true
  }
}

For Claude Code, toggle via:

/plugin config devils-council
# Toggle: gsd_integration → true

The hook is a no-op when GSD is not installed.

Custom Personas

Create a custom persona interactively:

/devils-council:create-persona

Or provide a slug directly:

/devils-council:create-persona cost-hawk

The scaffolder walks you through every voice-kit field:

Name (kebab-case slug)
Tier (core = always-on, bench = signal-triggered)
Primary concern (one-sentence value-system anchor ending with ?)
Blind spots (what this persona does NOT care about)
Characteristic objections (3+ verbatim phrases)
Banned phrases (5+ phrases the persona must never use)
Worked examples (2 good findings + 1 bad finding)

The scaffolder:

Suggests available signal IDs for bench triggers
Warns if your banned phrases overlap >30% with a shipped persona
Flags objections that contain your own banned phrases
Previews the full file before writing
Validates against scripts/validate-personas.sh before declaring success

Output lands in:

${DC_ROOT}/create-persona-workspace/<slug>/
  agents/<slug>.md
  persona-metadata/<slug>.yml

To install into your plugin:

cp "${DC_ROOT}/create-persona-workspace/<slug>/agents/<slug>.md" agents/
cp "${DC_ROOT}/create-persona-workspace/<slug>/persona-metadata/<slug>.yml" persona-metadata/
./scripts/validate-personas.sh

Codex Setup

Security + Dual-Deploy personas delegate deep code scans to Codex via codex exec --json --sandbox read-only.

1. Install

macOS:

brew install --cask codex

Linux / as fallback:

npm install -g @openai/codex

Verify: codex --version (expected: 0.122.0 or newer).

2. Authenticate (OAuth — recommended)

codex login

Browser opens for ChatGPT OAuth. Credentials land in ~/.codex/ (gitignored).

Verify: codex login status exits 0.

3. Authenticate (API key — CI / headless)

export OPENAI_API_KEY="sk-..."
echo "$OPENAI_API_KEY" | codex login --with-api-key
codex login status

4. Smoke test

./scripts/smoke-codex.sh
# Expected: "smoke-codex: OK" and exit 0
# First run ~7s on a fast network.

Sandbox

--sandbox read-only is fixed in v1 (no writes, no network beyond Codex's own API). Widening is a post-v1 decision requiring a dedicated threat-model review.

Responses Workflow

After a review, annotate findings by editing .council/responses.md:

---
version: 1
responses:
  - finding_id: security-reviewer-a3f2c1d8
    status: dismissed
    reason: legacy migration path, tracked in adr-014
    date: 2026-04-24
  - finding_id: sre-b9e401f2
    status: accepted
    reason: will fix in follow-up PR
    date: 2026-04-24
---
# Notes (free-form body)

status enum: accepted | dismissed | deferred. finding_id format is <persona-slug>-<8hex> (stable across re-runs of the same artifact with the same persona and target+claim per Phase 5 D-38).

On re-run: dismissed findings are suppressed from Chair synthesis; a one-line Suppressed N findings per .council/responses.md (dismissed: N1, deferred: N2) note renders above top-3.

See Troubleshooting #4 for a known limitation.

Troubleshooting

1. Plugin cache staleness after version bump

Symptom: new commands (on-plan, on-code, dig) don't appear in the picker after /plugin install on an already-installed older version. claude plugin list --json still reports the old version.

Fix: version bumps require uninstall + reinstall — this is documented Claude Code behavior, not a devils-council bug:

/plugin uninstall devils-council@devils-council
/plugin install devils-council@devils-council

2. Install picks up old version after tag bump

Symptom: After a new tag ships, /plugin install devils-council@devils-council still installs the old version because the marketplace descriptor is cached locally.

Fix: refresh the marketplace descriptor first, THEN reinstall:

/plugin marketplace update devils-council
/plugin uninstall devils-council@devils-council
/plugin install devils-council@devils-council

Confirm the new version is loaded:

claude plugin list --json | jq '.[] | select(.name=="devils-council") | .version'

This is Claude Code marketplace-caching behavior, not a devils-council bug.

3. Codex unavailable

Symptom: Security or Dual-Deploy scorecard includes a finding with category: delegation_failed; [devils-council: Codex unavailable, persona proceeded without deep scan] in command output.

Fix: codex login status (exit non-zero = not authed); re-run codex login. If Codex CLI isn't installed, see Codex Setup. If delegation_failed persists with Codex healthy, run ./scripts/smoke-codex.sh and inspect stderr.

Per D-51 the plugin is fail-loud by design — it does NOT silently degrade.

4. Dismissals not suppressing on re-run (RESP-03 LLM variance)

Symptom: you dismissed a finding in .council/responses.md, but it re-appears on re-run with a slightly different claim and a different finding ID.

Cause: finding IDs hash persona + target + claim. Personas may produce slightly different claim text across re-runs. Same concern, different hash.

Workaround: dismiss multiple variants proactively. Or use --exclude=<persona> on re-run to skip the persona while you investigate.

Fix tracked: normalize claim before hashing (lowercase + stopword-strip + whitespace-collapse).

5. GSD hook integration not firing

Symptom: after gsd-plan-checker runs, no [devils-council: ...] pointer appears.

Checks:

Opt-in on? For OpenCode: gsd_integration: true in opencode.json. For Claude Code: jq '.plugins.devils-council.userConfig.gsd_integration' ~/.claude/settings.json (expected: true)
GSD installed? ls ~/.claude/agents/gsd-plan-checker.md ~/.claude/agents/gsd-code-reviewer.md — at least one must exist
Hook registered? Check bin/dc-gsd-wrap.sh extraction — the tool_input.prompt field might not contain an extractable PLAN.md path. Run the guard test locally: ./scripts/test-hooks-gsd-guard.sh.

6. Bench persona not spawning despite artifact match

Symptom: reviewing a Helm values diff, but dual-deploy-reviewer didn't join.

Checks:

cat .council/<run>/MANIFEST.json | jq '.classifier' — did the classifier detect helm_values_change?
Does the persona's triggers: list in agents/dual-deploy-reviewer.md include the signal ID?
Was the budget cap reached? jq '.personas_skipped' .council/<run>/MANIFEST.json lists personas dropped by the cap.

7. Codex sandbox violation in delegation

Symptom: MANIFEST.personas_run[].delegation.error_code == "codex_sandbox_violation" — Codex rejected a delegation.

Cause: persona requested a write or network-access operation. Expected behavior per D-51 fail-loud; the persona emits a category: delegation_failed finding and the scorecard still ships.

Fix: v1 is read-only sandbox by design. Widening is deferred to post-v1 threat-model review.

8. Budget cap too low — important personas skipped

Symptom: jq '.personas_skipped' .council/<run>/MANIFEST.json shows a non-empty array and you wanted those personas to run.

Fix: /devils-council:review <artifact> --cap-usd=1.00 (or edit config.json budget.cap_usd for a persistent change). Priority order is controllable via config.json bench_priority_order.

9. Terminal render unreadable — too much output

Symptom: synthesis + 4-8 scorecards fills the terminal.

Fix: default render collapses nits. If still too much, read raw scorecards directly:

ls -t .council/ | head -1 | xargs -I{} cat ".council/{}/staff-engineer.md"

Or open the HTML report instead:

open .council/$(ls -t .council/ | head -1)/REPORT.html

Or use --show-nits only when you want everything inline (default: top-3 blockers + major/minor one-liners + nits-collapsed summary).

Uninstall

OpenCode: remove "devils-council-opencode" from your opencode.json plugin array.

Claude Code:

/plugin uninstall devils-council@devils-council
/plugin marketplace remove astrowicked/devils-council

Runtime artifacts under .council/<run-id>/ (review outputs) and ~/.codex/ (Codex credentials) are NOT removed by uninstall — delete manually if desired.

Contributing

PRs welcome. See the phase-artifact trail under .planning/ (in-repo, gitignored for public users — fork to see planning docs) for the design rationale behind every shipped feature. Personas are markdown files under agents/ with YAML frontmatter; add a new one and scripts/validate-personas.sh will accept it if the schema holds.

Tests: bash scripts/validate-personas.sh for a quick check; full suite lives in .github/workflows/ci.yml.

Publishing (OpenCode npm package)

The OpenCode plugin is published to npm as devils-council-opencode via GitHub Actions on tag push.

Release workflow:

# 1. Bump version in both manifests
#    .claude-plugin/plugin.json → "version": "X.Y.Z"
#    .opencode/package.json    → "version": "X.Y.Z"
# 2. Update CHANGELOG.md
# 3. Commit & tag
git add -A && git commit -m "chore: bump version to X.Y.Z"
git tag vX.Y.Z
git push origin main --tags

The publish-opencode.yml workflow automatically builds and publishes to npm when a v* tag is pushed.

Setup (one-time):

Create an npm access token at https://www.npmjs.com/settings/tokens (type: Automation)
Add it as a repository secret: Settings → Secrets → Actions → NPM_TOKEN
First publish requires the npm account to own the devils-council-opencode package name

Local build/test:

cd .opencode
npm install --legacy-peer-deps
bash build.sh          # transforms agents + compiles TypeScript
npm test               # runs signal + speckit-hook tests
npm pack --dry-run     # verify tarball contents

License

MIT — see LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 345 Commits
.claude-plugin		.claude-plugin
.github/workflows		.github/workflows
.opencode		.opencode
.planning		.planning
.specify		.specify
agents		agents
benchmarks		benchmarks
bin		bin
blog		blog
commands		commands
examples		examples
hooks		hooks
lib		lib
persona-metadata		persona-metadata
scripts		scripts
skills		skills
templates		templates
tests		tests
.gitignore		.gitignore
.markdownlint-cli2.jsonc		.markdownlint-cli2.jsonc
.markdownlint.jsonc		.markdownlint.jsonc
CHANGELOG.md		CHANGELOG.md
LICENSE		LICENSE
README.md		README.md
config.json		config.json
opencode.json		opencode.json

Folders and files

Latest commit

History

Repository files navigation

devils-council

See it in action (30 seconds, no setup)

What it catches (real examples)

How it works

Requirements

Install

OpenCode (recommended)

Claude Code

Quickstart

Review a plan

Review code

Dig into a finding

HTML Reports

GitHub Action (CI integration)

Zero-config (free, no API key)

Full review (add API key)

Review plans on PR

Telemetry (optional)

Why opt in

Opt in — GitHub Action

Opt in — OpenCode / Claude Code plugins

Manual invocation

DO_NOT_TRACK precedence (hard override)

Event envelope

review.completed payload fields

review.failed payload fields

Fire-and-forget guarantees

Self-host

Persona Roster

Trigger Rules

Commands

Configuration

Budget cap (config.json)

GSD integration

Custom Personas

Codex Setup

1. Install

2. Authenticate (OAuth — recommended)

3. Authenticate (API key — CI / headless)

4. Smoke test

Sandbox

Responses Workflow

Troubleshooting

1. Plugin cache staleness after version bump

2. Install picks up old version after tag bump

3. Codex unavailable

4. Dismissals not suppressing on re-run (RESP-03 LLM variance)

5. GSD hook integration not firing

6. Bench persona not spawning despite artifact match

7. Codex sandbox violation in delegation

8. Budget cap too low — important personas skipped

9. Terminal render unreadable — too much output

Uninstall

Contributing

Publishing (OpenCode npm package)

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 11

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

`DO_NOT_TRACK` precedence (hard override)

`review.completed` payload fields

`review.failed` payload fields

Packages