A study collection of agent harnesses, frameworks, and runtimes: submodules pointing at the source repos, structured analysis cards comparing them on the same axes, cross-cutting synthesis docs, and a weekly digest of what changed across the collection.
This collection is curated by gathercommunity, who are also building their own harness at cura-crew. We read these repos to learn from them. The cards stay descriptive (pros, cons, distinctive choices); we don't publish "what we'd adopt" (that's an internal decision and lives elsewhere).
- A curated set of git submodules under examples/, pointing at real harness implementations.
- A growing set of structured analysis cards under analysis/cards/, one per example, written to the same schema so they compare apples-to-apples.
- A small number of synthesis documents under analysis/synthesis/: cross-cutting writeups (routing, persistence, decomposition, etc.) that sit on top of the cards once we have enough of them.
- A weekly digest under weekly-reviews/. Every Thursday, the collection's submodules are pulled to upstream tip, the diffs read, and a short narrative is published: what moved, what was added, and what patterns are visible across the collection that week.
- An "agent framework roundup" or ranking. We are not telling people which harness to pick.
- An endorsement. Inclusion means "interesting enough to study," not "good."
- Vendor-neutral. Every card is written by practitioners building their own harness, and that bias inevitably shapes what we find interesting. The bias is acknowledged but not weaponized: cards stay descriptive.
- A place where adoption decisions are recorded. Adoption is an internal process for the curators (or for any reader); only the raw observations live here.
When reading a harness, we ask the same questions every time, so the cards stay comparable:
- Primitives: what nouns does this thing expose? (agent, tool, skill, workflow, graph node, memory, etc.)
- Routing and dispatch: how does it pick the next step? LLM, graph, recipe DSL, hand-written switch?
- Persistence model: where does long-running state live? Files, DB, vector store, in-memory only? Who is allowed to write?
- Multi-agent shape: single agent with tools, hierarchical supervisor + workers, peer swarm, fixed pipeline, none?
- Tool execution model: direct calls, MCP, sandboxes, code interpreters?
- Determinism and replay: can you reconstruct what happened? Replay it?
- Observability: what can you see, and how?
- Opinionatedness: light scaffold vs. strong framework vs. platform.
Things we mostly don't try to answer here:
- Benchmark scores (AgentBench, SWE-bench, etc.).
- Which model provider is "best supported."
- Surface ergonomics for one-off scripts.
```
.
├── README.md                  # this file
├── examples/                  # submodules, one per harness
│   ├── amplifier/             # microsoft/amplifier
│   ├── superpowers/           # obra/superpowers
│   ├── claude-mpm/            # bobmatnyc/claude-mpm
│   └── ...
├── analysis/
│   ├── _template.md           # the analysis-card schema
│   ├── cards/                 # one card per example
│   │   ├── README.md          # tracker
│   │   ├── amplifier.md
│   │   └── ...
│   └── synthesis/             # cross-cutting writeups
│       ├── README.md
│       └── ...
├── weekly-reviews/            # Thursday digests
│   ├── README.md              # cadence + template
│   └── YYYY-MM-DD.md          # one per week
└── scripts/
    └── update-examples.sh     # refresh submodules to upstream tip
```
analysis/cards/ filenames mirror the examples/ directory names (so examples/amplifier ↔ analysis/cards/amplifier.md). The two attractor repos are disambiguated by owner prefix in both places.
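That mirroring rule is easy to machine-check. Below is a sketch of a consistency check (the `check_cards` helper and the fixture tree are hypothetical, not part of `scripts/`): it flags any `examples/` entry without a matching card.

```shell
# Hypothetical helper (not part of scripts/): flag every examples/<name>
# directory that lacks a matching analysis/cards/<name>.md.
check_cards() {
  root="$1"
  missing=0
  for dir in "$root"/examples/*/; do
    name=$(basename "$dir")
    if [ ! -f "$root/analysis/cards/$name.md" ]; then
      echo "missing card: analysis/cards/$name.md"
      missing=1
    fi
  done
  return "$missing"
}

# Demo against a throwaway fixture tree (stands in for the real repo root).
root=$(mktemp -d)
mkdir -p "$root/examples/amplifier" "$root/examples/superpowers" "$root/analysis/cards"
touch "$root/analysis/cards/amplifier.md"
check_cards "$root" || true   # reports: missing card: analysis/cards/superpowers.md
```

The non-zero exit status makes the same check usable as a CI gate.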
Every example gets one card written to the schema in analysis/_template.md. Cards are deliberately uniform so a reader can scan twenty of them quickly and compare across the same axes. Cards are descriptive, not prescriptive: pros, cons, distinctive design choices, observable behavior. Editorial positions about industry patterns live in analysis/synthesis/, written once we have enough cards to draw real signal.
The schema captures, for each harness:
- One-line summary
- Stage and shape
- Primitives exposed
- Routing and dispatch model
- Persistence model
- Multi-agent shape
- Tool execution model
- Determinism story
- Observability story
- Opinionatedness
- Pros (neutral observations of strengths)
- Cons (neutral observations of tradeoffs and weaknesses)
- Distinctive design choices
- Quick stats
- Notes (optional)
Every Thursday, the curator:
- Runs scripts/update-examples.sh, which pulls every submodule to upstream tip.
- Reviews the diff for each moved submodule, focusing on docs, READMEs, examples, and recently merged PRs.
- Writes a short narrative for the week into weekly-reviews/YYYY-MM-DD.md: which submodules moved, what kind of changes they shipped, any new harnesses added to the collection that week, and any patterns visible across the diffs ("three of this week's changes were memory-related," "two harnesses adopted MCP in the same week," etc.).
The weekly review is itself descriptive. It reports trends, it doesn't pick winners. See weekly-reviews/README.md for the cadence and template.
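As a rough illustration, scaffolding a new week's digest file could look like the following. This is a hypothetical sketch: the section names are invented here, and the actual template lives in weekly-reviews/README.md.

```shell
# Hypothetical scaffold for a new digest file; the real template lives in
# weekly-reviews/README.md, and these section names are invented.
new_digest() {
  cat <<EOF
# Weekly review: $1

## What moved
## New examples
## Patterns this week
EOF
}

new_digest 2025-01-02   # example date; in practice: new_digest "$(date +%F)"
```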
Clone with submodules:

```
git clone --recurse-submodules https://github.com/gathercommunity/harnesses.git
```

Or, if already cloned:

```
git submodule update --init --recursive
```

Refresh every submodule to its upstream tip and see what moved:

```
scripts/update-examples.sh
```

Refresh just one (substring match on the submodule path):

```
scripts/update-examples.sh amplifier
```

The updater script does not commit; it leaves the parent repo dirty so a human (or agent) can eyeball the diff and decide whether to land it.
Add the submodule:

```
git submodule add https://github.com/<owner>/<repo>.git examples/<short-name>
```

- Copy analysis/_template.md to analysis/cards/<short-name>.md and fill it in. (Or merge later, in a follow-up PR. The submodule landing and the card landing don't have to ship together.)
- Open a PR.
- The next weekly review notes the addition.
Pick <short-name> to match the upstream repo name unless that name collides with an existing example, in which case prefix with the owner (e.g. strongdm-attractor).
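The naming rule can be sketched as a small helper (hypothetical; `short_name` is not part of scripts/): given the owner, the repo name, and the existing example names, it returns the repo name, or the owner-prefixed name on collision.

```shell
# Hypothetical sketch of the short-name rule: use the repo name unless it
# collides with an existing example, in which case prefix the owner.
short_name() {
  owner="$1"; repo="$2"; shift 2
  existing=" $* "                 # existing example names, space-delimited
  case "$existing" in
    *" $repo "*) printf '%s-%s\n' "$owner" "$repo" ;;
    *)           printf '%s\n' "$repo" ;;
  esac
}

short_name strongdm attractor amplifier superpowers attractor   # -> strongdm-attractor
short_name obra superpowers amplifier claude-mpm                # -> superpowers
```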
See examples/ and analysis/cards/README.md for the tracker. The seed batch was selected as a high-signal cross-section of the current landscape (commercial frameworks, research projects, single-author starter kits, claude-native harnesses, large vendor offerings). New examples are added when something interesting shows up; the cadence is irregular.