rust — Coverage-Guided Semantic Fuzzer

Hand the agent a Rust binary built with coverage instrumentation. Its job: find inputs that reach branches the corpus hasn't covered yet, and trip crashes byte-wise fuzzers would never reach. The agent reads the symbol table, reasons about likely format invariants, proposes mutated inputs, executes them through a sandboxed runner, observes the coverage bitmap, and updates its hypotheses — all coordinated by ARCP.

See PROMPT.md for the full design narrative.

What "done" looks like

  • make up brings four containers green: ollama, arcp-runtime, harness-runner, arcp-client.
  • Within ~10s the dashboard shows status: analyzing_binary → seeding_corpus → mutating.
  • The coverage gauge climbs as the agent finds inputs that reach new edges.
  • A thought event appears with the model's hypothesis ("Symbol decode_array_len reads 4 bytes — try varying that prefix.").
  • The first crash on the bundled target_cbor demo typically arrives within 5–15 minutes on a laptop. An artifact_ref event fires, and the crash byte stream appears in the work-crashes volume.
  • Stop with make down. make up resumes from the saved corpus + coverage map.
  • Let it run to exhaustion — Status::budget_exhausted fires and the campaign closes cleanly.

Architecture

                       ┌───────────────────────┐
                       │      ollama           │
                       └───────────▲───────────┘
                                   │ http
   ┌────────────────┐               │
   │  arcp-client   │               │
   │  (TUI)         │               │
   └───────┬────────┘               │
           │ ws/arcp                │
           ▼                        │
   ┌──────────────────────────────────────────┐
   │             arcp-runtime                  │
   │  hosts `fuzz.explore` agent               │
   │  persists corpus + coverage bitmap        │
   └───────────────────┬──────────────────────┘
                       │ tool.call: harness.run
                       ▼
   ┌──────────────────────────────────────────┐
   │           harness-runner                  │
   │   POST /run executes the target binary    │
   │   returns coverage bitmap + stderr        │
   └──────────────────────────────────────────┘
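
Read top to bottom, the diagram describes one loop iteration: the agent proposes an input, the runtime sends it to the runner, and the returned bitmap decides whether the input joins the corpus. The following is a minimal sketch of that iteration, not the runtime's actual code. `Campaign`, `RunResult`, and `step` are hypothetical names, the 16-word bitmap assumes the 1024-bit size mentioned under "Compromises", and the `propose` closure stands in for the model call.

```rust
/// What one `harness.run` call returns (hypothetical shape; the real
/// wire format is defined in PROMPT.md).
#[derive(Debug, Clone)]
pub struct RunResult {
    pub bitmap: [u64; 16], // 16 * 64 = 1024 coverage bits
    pub stderr: String,
    pub crashed: bool,
}

/// State the runtime persists between iterations (hypothetical).
#[derive(Debug, Default)]
pub struct Campaign {
    pub corpus: Vec<Vec<u8>>,
    pub coverage: [u64; 16],
    pub crashes: Vec<Vec<u8>>,
}

impl Campaign {
    /// Run one propose -> execute -> observe iteration.
    /// Returns how many coverage bits the input set for the first time.
    pub fn step(
        &mut self,
        mut propose: impl FnMut(&[Vec<u8>]) -> Vec<u8>,
        mut run: impl FnMut(&[u8]) -> RunResult,
    ) -> u32 {
        let input = propose(self.corpus.as_slice());
        let result = run(input.as_slice());

        // Merge the run's bitmap into the global one, counting new bits.
        let mut new_edges = 0;
        for (seen, got) in self.coverage.iter_mut().zip(result.bitmap.iter()) {
            new_edges += (*got & !*seen).count_ones();
            *seen |= *got;
        }

        if result.crashed {
            self.crashes.push(input.clone());
        }
        // Only inputs that reach something new are worth keeping.
        if new_edges > 0 {
            self.corpus.push(input);
        }
        new_edges
    }
}
```

Taking both collaborators as closures keeps the loop testable without Ollama or a sandbox, which matches the hermetic test suite described in this README.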

Quickstart

cp .env.example .env
make up

Workspace layout

rust/
├── crates/
│   ├── fuzzcommon/   shared types: Bitmap, FuzzState, RunArgs, Persona, ...
│   ├── arcp-stubs/   thin SDK surface — replace with the real `arcp` crate
│   ├── runtime/      hosts the `fuzz.explore` agent (PROMPT §5)
│   ├── runner/       POST /run HTTP sandbox (PROMPT §6)
│   └── client/       ratatui dashboard (PROMPT §7)
└── examples/
    └── target_cbor/  deliberately-buggy CBOR-ish decoder used as the target

Tests

The workspace ships a fast, hermetic test suite — no docker, no Ollama, no network:

make test
# or
cargo test --workspace --no-fail-fast

The suite has 17 tests covering Bitmap::absorb, FuzzState::seed_from, crash dedup canonicalisation, strategy-prompt construction, idempotency-key derivation, and the target_cbor length-confusion bug at depth 6 (a should_panic test).
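
For orientation, here is a sketch of what an absorb operation on a coverage bitmap usually does: OR a per-run bitmap into the global one and report how many bits were new. This illustrates the idea only. The real `fuzzcommon::Bitmap` may use a different layout and signature; the `set` and `count` helpers and the 1024-bit size are assumptions.

```rust
/// Hypothetical 1024-bit coverage bitmap.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Bitmap {
    words: [u64; 16], // 16 * 64 = 1024 bits
}

impl Bitmap {
    pub const BITS: usize = 1024;

    pub fn new() -> Self {
        Bitmap { words: [0; 16] }
    }

    /// Set one bit; out-of-range indices wrap modulo BITS.
    pub fn set(&mut self, index: usize) {
        let i = index % Self::BITS;
        self.words[i / 64] |= 1u64 << (i % 64);
    }

    /// Number of bits currently set.
    pub fn count(&self) -> u32 {
        self.words.iter().map(|w| w.count_ones()).sum()
    }

    /// OR `other` into `self` and return how many bits were newly set.
    /// A non-zero return means the run reached something not seen before.
    pub fn absorb(&mut self, other: &Bitmap) -> u32 {
        let mut new_bits = 0;
        for (mine, theirs) in self.words.iter_mut().zip(other.words.iter()) {
            new_bits += (*theirs & !*mine).count_ones();
            *mine |= *theirs;
        }
        new_bits
    }
}
```

Absorbing the same run twice returns 0 the second time, which is the property a test of this kind would typically pin down.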

Configure

| Variable | Default | Effect |
| --- | --- | --- |
| FUZZ_RUN_TIMEOUT_MS | 250 | Per-execution wall-clock cap. |
| FUZZ_BUDGET_USD | 1.00 | Inference cost cap. Fuzzer halts when reached. |
| FUZZ_BUDGET_EXECS | 50000 | Total harness executions. Not money; see below. |
| FUZZ_WALLTIME_HOURS | 2 | Outer wall-clock kill switch. |
| RUNNER_SECCOMP | strict | Seccomp profile (currently unused; see "Compromises"). |
| RUNTIME_STORE | /data/fuzz.db | Persistent corpus + bitmap. Lets make down && make up resume. |
| OLLAMA_MODEL | qwen2.5:1.5b-instruct | See top-level README for alternatives. |
| ARCP_SDK_VERSION | latest | Pin a specific crates.io release for reproducible builds. |
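
As an illustration of how these variables and defaults could be read at startup, here is a small sketch using only the standard library. `FuzzConfig`, `from_lookup`, and `parse_or` are hypothetical names, not types from the workspace. Falling back to the default when a value fails to parse is an assumption made here; the real runtime may reject bad values instead.

```rust
use std::str::FromStr;
use std::time::Duration;

/// Hypothetical typed view of the variables in the table above.
#[derive(Debug, Clone, PartialEq)]
pub struct FuzzConfig {
    pub run_timeout: Duration,
    pub budget_usd: f64,
    pub budget_execs: u64,
    pub walltime: Duration,
    pub runtime_store: String,
    pub ollama_model: String,
}

/// Parse `raw` if present and well-formed, otherwise use `default`.
fn parse_or<T: FromStr>(raw: Option<String>, default: T) -> T {
    raw.and_then(|s| s.trim().parse().ok()).unwrap_or(default)
}

impl FuzzConfig {
    /// Build a config from any key lookup, so tests need no real environment.
    pub fn from_lookup(get: impl Fn(&str) -> Option<String>) -> Self {
        let hours: u64 = parse_or(get("FUZZ_WALLTIME_HOURS"), 2);
        FuzzConfig {
            run_timeout: Duration::from_millis(parse_or(get("FUZZ_RUN_TIMEOUT_MS"), 250)),
            budget_usd: parse_or(get("FUZZ_BUDGET_USD"), 1.00),
            budget_execs: parse_or(get("FUZZ_BUDGET_EXECS"), 50_000),
            walltime: Duration::from_secs(hours.saturating_mul(3600)),
            runtime_store: get("RUNTIME_STORE").unwrap_or_else(|| "/data/fuzz.db".to_string()),
            ollama_model: get("OLLAMA_MODEL")
                .unwrap_or_else(|| "qwen2.5:1.5b-instruct".to_string()),
        }
    }

    /// Read the process environment.
    pub fn from_env() -> Self {
        Self::from_lookup(|key| std::env::var(key).ok())
    }
}
```

Calling `FuzzConfig::from_lookup(|_| None)` yields the defaults listed in the table.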

credits is a runtime-defined currency (PROMPT §8) that counts harness executions, not USD; FUZZ_BUDGET_EXECS sets the starting balance. Each harness.run call decrements it. When either the USD budget or the credits balance reaches zero, the fuzzer halts cleanly with BUDGET_EXHAUSTED.
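
The two-budget rule can be sketched as a small tracker that refuses further work once either balance is zero. This is an assumption-laden illustration: `Budget`, `Halt`, `charge_run`, and `charge_inference` are hypothetical names, and storing dollars as integer millionths is a choice made here to avoid floating-point drift, not something the source specifies.

```rust
/// Why a campaign stopped (hypothetical; the source names only BUDGET_EXHAUSTED).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Halt {
    BudgetExhausted,
}

/// Remaining inference money and harness-execution credits.
#[derive(Debug, Clone)]
pub struct Budget {
    usd_micros: u64, // millionths of a dollar
    credits: u64,    // harness executions left
}

impl Budget {
    pub fn new(usd: f64, credits: u64) -> Self {
        // Float-to-int `as` casts saturate, so negative or NaN input becomes 0.
        let usd_micros = (usd * 1_000_000.0).round() as u64;
        Budget { usd_micros, credits }
    }

    /// Err once either balance has reached zero.
    fn check(&self) -> Result<(), Halt> {
        if self.usd_micros == 0 || self.credits == 0 {
            Err(Halt::BudgetExhausted)
        } else {
            Ok(())
        }
    }

    /// Account for one `harness.run` call.
    pub fn charge_run(&mut self) -> Result<(), Halt> {
        self.check()?;
        self.credits -= 1;
        Ok(())
    }

    /// Account for one model call costing `usd` dollars.
    pub fn charge_inference(&mut self, usd: f64) -> Result<(), Halt> {
        self.check()?;
        let cost = (usd * 1_000_000.0).round() as u64;
        self.usd_micros = self.usd_micros.saturating_sub(cost);
        Ok(())
    }
}
```

With `Budget::new(1.00, 2)`, two `charge_run` calls succeed and the third returns `Err(Halt::BudgetExhausted)`, even though money remains.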

Compromises in this build

These deltas from PROMPT.md are intentional so cargo test --workspace stays fast and CI-friendly:

  1. arcp-stubs crate. The PROMPT uses ergonomic types (Runtime::builder, JobContext::tool_call, etc.) that aren't published on crates.io today. We provide a thin local stub with the same shape. Search for // TODO: replace with real SDK API when published.
  2. No seccomp. seccompiler is Linux-only and CI-flaky. The runner currently uses tokio::process + wall-clock timeout. Real seccomp is a one-file diff.
  3. No libFuzzer counters. Re-instrumenting the target with -Cinstrument-coverage and mmap-sharing the counter region is real-libFuzzer territory. We use a stub coverage proxy that hashes each distinct stderr line into a 1024-bit bitmap — correlates well enough with reaching new code paths to drive the agent loop against target_cbor.

The JSON wire shape between runtime and runner is unchanged, so swapping in a real sandbox + real coverage is mechanical.
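
To make the stub coverage proxy from item 3 concrete, the sketch below hashes each non-empty stderr line to one bit of a 1024-bit bitmap. The README does not say which hash the runner uses; FNV-1a is swapped in here because it is short and gives the same values across builds, which matters if the bitmap is persisted. `stderr_to_bitmap` and `popcount` are hypothetical names.

```rust
pub const BITS: usize = 1024;

/// 64-bit FNV-1a over a byte slice (an assumed stand-in for the runner's hash).
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for b in bytes {
        hash ^= u64::from(*b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Map each distinct non-empty stderr line to one bit.
/// Repeated lines hit the same bit, so only new kinds of output add coverage.
pub fn stderr_to_bitmap(stderr: &str) -> [u64; BITS / 64] {
    let mut words = [0u64; BITS / 64];
    for line in stderr.lines().map(str::trim).filter(|l| !l.is_empty()) {
        let bit = (fnv1a(line.as_bytes()) % BITS as u64) as usize;
        words[bit / 64] |= 1u64 << (bit % 64);
    }
    words
}

/// Count set bits in a bitmap.
pub fn popcount(words: &[u64]) -> u32 {
    words.iter().map(|w| w.count_ones()).sum()
}
```

Two distinct lines can collide on the same bit, so the proxy undercounts slightly; with 1024 bits and a small target that loss is usually tolerable.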

Targets

  • make up — full stack.
  • make submit — kick a one-shot job inside the runtime container.
  • make report — list crashes + corpus volumes.
  • make test — run the workspace test suite.
  • make verify — docker build only.
  • make down — stop everything.