Point the ARCP runtime at a directory of audio recordings (hydrophone clips, ultrasonic bat recordings, dawn-chorus forest mic). The agent walks the directory, runs each clip through a CoreML-bundled bioacoustic classifier, and — for ambiguous frames — asks the LLM to disambiguate using the classifier's top-k candidates. Output: a Raven Pro-compatible selection table, a per-recording markdown summary, and spectrogram PNGs for the unusual clips.
This is the canonical Swift demo because it puts ARCP on Apple's home turf: the entire inference pipeline runs locally via CoreML/Accelerate/AVFoundation while the model still drives the high-level reasoning. The Mac becomes a self-contained classification node.
Showcases: provisioned credentials (§9.8), per-clip progress, sensor
telemetry metrics, custom-currency budgets (compute-ms), artifact uploads
(Raven .txt + PNG + markdown), and a tools-first agent loop where 95% of
frames never touch the LLM.
make test # fast smoke tests (no docker, ~0.1s)
make seed # synthesise the bundled WAV corpus
make run-host-runtime # run the runtime natively (CoreML mode)
# in another terminal:
make annotate IN=./Samples/whalesOutputs land under ./out/<job_id>/:
out/<job_id>/
├── selections.txt # Raven Pro 1.4 tab-separated table
├── summary.md # one-paragraph "what's notable"
└── spectrograms/ # PNG per "unusual" frame
cp .env.example .env
make upOn macOS, make up brings ollama + arcp-client into containers and prints
Detected macOS — start the Swift runtime natively for CoreML:
make run-host-runtimein another terminal
On Linux, the runtime container starts too and uses the ONNX fallback (actually the deterministic stub classifier — see "Honest split" below).
The PROMPT calls for an idiomatic @ArcpAgent(name: "bioacoustic.annotate")
macro surface and a fully-distributed runtime ↔ client over WebSocket.
Neither is in the published swift-sdk yet, so this example ships:
| Layer | Status | Notes |
|---|---|---|
@ArcpAgent macro |
stubbed | Replaced by manual registration. See Sources/Runtime/Stubs/ArcpStub.swift. |
JobContext / emit |
stubbed | Records events in-process; tests assert on them. |
| Provisioned credentials | stubbed (§9.8) | JobContext.openProvisionedCredential resolves to a path under MODELS_DIR. |
| Audio spectrogram (FFT) | real | Pure-Swift Hann-windowed DFT. Tests confirm peak-bin accuracy on a 440Hz sine. |
| Audio framing | real | Sliding-window indexer over the spectrogram tensor. |
| CoreML classify (Mac) | partial (Mac) | Loads an .mlpackage via CoreML.MLModel when present; falls through to deterministic stub for prediction shape stability. Guarded by #if canImport(CoreML). |
| ONNX classify (Linux) | stub | Returns deterministic top-3 keyed by (modelName, frameId) hash. |
| PNG renderer | real | Hand-rolled PNG (zlib stored). No external image deps. |
| Selection table writer | real | Raven Pro 1.4 tab-separated format. |
| Markdown summary | real | Species inventory + "what's notable" paragraph. |
| Disambiguation schema | real | JSON validator + smoke-test verified. |
| Ollama / LLM inference | stubbed shim | InferenceInjector swaps in a real HTTP client when wired. |
Every stub is grep-able for TODO: replace with real SDK API when published.
Swapping Sources/Runtime/Stubs/ for the real SDK is the migration path.
- macOS host-runtime mode (default): the runtime binary runs natively.
When you drop a real
*.mlpackageunderMODELS_DIR(matchingDEFAULT_MODEL),CoreMLBridge.tryClassifywill compile and load it; the prediction path itself currently returns the deterministic stub output (real models have model-specific input layouts that the demo doesn't try to second-guess). Swap inmodel.prediction(from:)inCoreMLBridge.swiftto wire up a specific model. - Linux container mode: CoreML is absent. The same agent code runs and the classifier returns the deterministic stub. ONNX Runtime via the Swift bindings is the production path; the stub keeps the demo end-to-end testable without the dependency.
Six fast XCTest cases under Tests/SmokeTests/. Together they take ~0.1s,
need no docker, no network, no real CoreML model:
swift testThe cases:
audio.spectrogrampeak bin on a 440Hz sine wave (±35Hz at fftN=256).audio.framingon a 3-second clip with 3s window / 1s stride → 1 frame.coreml.classifydeterministic top-3 for a fixed input frame.- Provisioned-credential handle resolves to
MODELS_DIR/<file>.mlpackage. - Selection-table writer emits valid Raven Pro 1.4 header + row count.
DisambiguationSchemavalidates a known-good JSON response.
| Variable | Default | Effect |
|---|---|---|
RECORDINGS_DIR |
/data/recordings |
Directory the agent walks. |
MODELS_DIR |
/data/models |
Provisioned-credential resolution root. |
ANNOTATIONS_DIR |
/data/annotations |
Where selection tables / summaries / PNGs land. |
DEFAULT_MODEL |
BirdNET-v2.4 |
Logical credential name → <name>.mlpackage. |
COREML_BACKEND |
coreml |
coreml (Mac) or onnx (Linux). |
CONFIDENCE_FLOOR |
0.85 |
Top-1 confidence threshold for direct selection-table emission. |
ENTROPY_TRIGGER |
1.2 |
Above this entropy the agent disambiguates via the LLM. |
PER_RECORDING_BUDGET_USD |
0.20 |
Per-recording cost ceiling for LLM disambiguation. |
RUNTIME_HOST_MODE |
auto |
auto / host / container. |
OLLAMA_MODEL |
qwen2.5:1.5b-instruct |
See top-level README for alternatives. |
ARCP_SDK_VERSION |
latest |
Reserved — the example currently uses a vendored stub SDK surface. |
ARCP's budget capability isn't just about money — it's about any quota you
can name. The interesting framing for the bioacoustic example is compute,
not USD: register an additional compute.budget currency (custom — e.g.
"compute_ms:60000") for the CoreML calls. Each coreml.classify
decrements wall-clock-ms. That gives operators a real-time cap on a
self-hosted, possibly battery-powered node.
Sources/Runtime/Agents/BioacousticAnnotate.swift— the agent loop.Sources/Runtime/Tools/—audio.spectrogram,audio.framing,coreml.classify,spectrogram.render_png.Sources/Runtime/CoreMLBridge/CoreMLBridge.swift— real CoreML model loading (Mac only).Sources/Runtime/Models/— selection table, markdown summary, disambiguation schema.Sources/Client/Bioacoustic.swift— thebioacousticCLI / TUI.Sources/Runtime/Stubs/ArcpStub.swift— the stub SDK surface to delete once the real macro / server APIs land upstream.
Package.swift currently has the published swift-sdk git dependency
commented out. The vendored ArcpStub target supplies the surface the
PROMPT calls for. To re-enable the registry path once the macro lands:
.package(
url: "https://github.com/agentruntimecontrolprotocol/swift-sdk.git",
/* ARCP_VERSION */ from: "1.0.0"
)and drop Sources/Runtime/Stubs/ from the target sources.