Skip to content

feat: per-user API keys with authenticated provenance#24

Merged
bkrabach merged 1 commit into
mainfrom
feat/per-user-api-keys
Jun 22, 2026
Merged

feat: per-user API keys with authenticated provenance#24
bkrabach merged 1 commit into
mainfrom
feat/per-user-api-keys

Conversation

@bkrabach

Copy link
Copy Markdown
Collaborator

Per-User API Keys with Authenticated Provenance

What: Replaces the single shared API key model with a keystore-based system that assigns authenticated, spoof-proof contributor attribution to all graph writes.

Why: The previous single-key model left no way to distinguish which peer contributed which data — a data-integrity gap the council flagged (workspace is client-supplied, unchecked). This feature solves it by:

  • Issuing per-peer API keys
  • Server-authenticating each request and stamping the identity into the event
  • Writing that identity write-once into the graph via ON CREATE SET
  • Overwriting any client-supplied 'created_by' (anti-spoofing)

Design: Council-reviewed spec (docs/designs/per-user-api-keys.md, v2) with all must-fixes resolved: D1–D10 decisions locked, identity persisted through the async queue, fail-closed on auth, write-once graph semantics, and zero client change.

Proof: Verified end-to-end on an isolated test stack (never touching the live deployment or Marc's data):

  • Gate 1 (anti-spoof): Peer posts lying created_by, server overwrites it ✓
  • Gate 2 (write-once): Owner re-writes the same node, created_by stays unchanged ✓
  • Gate 3 (fail-closed): Bogus token → 401, nothing written ✓

All 1353 unit tests pass; ruff/pyright clean.

Impact: Server-only change; zero client change. Historical nodes retain null created_by. Single legacy api_key still works (maps to id 'owner') for back-compat. Auth disabled only when no keys configured (dev mode preserved).

Next: A CLA bot comment is expected and fine. Please review and let me know if anything needs adjustment before merge.

🤖 Generated with Amplifier

Replaces the single shared API key with a keystore: 'api_keys' block in
the credentials YAML mapping sha256-hex(token) -> {id: <contributor>}.
The legacy single 'api_key' still works (folds to id "owner") for
back-compat. Auth disabled only when no keys at all are configured
(dev mode preserved). Malformed keystore fails closed (server refuses
to start).

The auth middleware resolves the bearer token to a contributor id
(sha256 + constant lookup; never returns "unknown") and injects it into
the ASGI scope.

POST /events stamps the authenticated id into the event as 'created_by',
OVERWRITING any client-supplied value (anti-spoofing), persisted into
the durable queue line.

The graph write stamps 'created_by' WRITE-ONCE via ON CREATE SET on both
node-write paths (Session inline MERGE and the universal _NODE_MERGE_CYPHER),
never as part of the MERGE key — node identity/uniqueness constraints
unchanged. Adds a non-unique idx_session_created_by index.

Authenticate-and-stamp only (no roles in v1); workspace remains a client
label. Zero client change. Historical nodes keep null created_by.

Verified end-to-end against an isolated instance: a peer's lying
'created_by' is overwritten by the server (anti-spoof), created_by is
immutable across a second writer (write-once), and a bogus token gets 401
with nothing written. 1353 unit tests pass; ruff/pyright clean.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
@colombod

colombod commented Jun 21, 2026

Copy link
Copy Markdown
Collaborator

Review — per-user API keys

Solid core. The security-critical pieces check out: the anti-spoof overwrite on POST /events, Cypher parameterization, fail-closed on bad/missing tokens, malformed-keystore aborting before workers fork, and single-worker enforcement. Notes below are scoped to what's necessary.

Correction to my earlier note on attribution: I had flagged "first-poster-wins" provenance as a blocker. Given the system's design invariant that each session_id is owned by exactly one contributor (only one contributor ever produces events for a session), that concern goes away — worker-level created_by is correct, and per-event threading would be needless complexity. The only ask there is to make that invariant explicit (see #3). Leaving the rest of the review below.

Blocking

  1. Empty api_keys: {} silently disables auth (fail-open). The validator accepts an empty dict, build_keystore() returns {}, and the middleware treats an empty keystore as "no auth" (auth.py, not self.keystore). An operator clearing or migrating keys can switch auth off with a clean startup and no error. Please reject an empty api_keys in _validate_api_keys — only an omitted/null keystore should mean dev-mode. (Test: api_keys={}ValidationError.)

  2. Write-once isn't structural (independent of the attribution model). ON CREATE SET n.created_by = $created_by is followed by an unconditional SET n += row.props. If created_by ever lands in row.props, += overwrites the stamp — including on existing nodes. Today only a comment guards this, with no test, and this PR does add created_by to the event body. Either keep created_by out of props by construction, or make it explicit (SET n.created_by = CASE WHEN n.created_by IS NULL THEN $created_by ELSE n.created_by END), plus a test that smuggling created_by into props can't change the stamp.

Small / cheap

  1. Make the session-ownership invariant explicit. The whole provenance model rests on "one contributor per session_id." That's correct by design, but it's load-bearing and currently implicit. A short assertion/comment near get_or_create (and a line in the design doc) makes it safe against future changes that might let a session be shared.

  2. Uppercase digest → silent permanent 401. _is_hex accepts uppercase, but hexdigest() is lowercase and the lookup is case-sensitive, so an uppercase digest in YAML authenticates nothing while the server starts clean. One-line lowercase-normalize at parse time.

  3. Crash-recovery test (T15) re-implements the lifespan parser inline instead of calling the real recovery code, so the real path (which now carries created_by) isn't exercised. Worth pointing it at the actual code.

init command — proposing removal

Credential generation still emits the legacy single api_key (init_command.py), and the docs point operators at init for key generation — so the documented setup path produces a config that does not enable this feature.

Rather than grow init to support the nested api_keys format, I'd like to remove the init subcommand entirely — its code, tests, and doc references — in favor of clear deployment/configuration documentation that a human or an agent can follow (setup here is Amplifier/agent-driven). Net: less surface to maintain, one source of truth for setup. We'll provide the replacement docs (generate token → sha256(token) → write the api_keys entry → hand the raw token to the peer) and update the repo's AGENTS.md so agents configure it correctly.

One doc fix is needed regardless: docs/remote-access-sharing.md lists "per-peer keys (server change)" as a future item — that's this PR, so it currently reads as "the feature doesn't exist yet."

One open question for you

Edge provenance — v1 or fast-follow? Edges (and the bare placeholder endpoint nodes created by edge MERGE) carry no created_by today. Given the one-contributor-per-session invariant, stamping r.created_by on relationships is cheap (the worker-level scalar is already in scope) and would make "who asserted this relationship" queryable. Not a blocker — but worth a deliberate call: include relationship attribution in v1, or defer it as a fast-follow?


We'll take 1, 2, 4, 5 + the init removal and docs through a proper design→plan→implementation cycle on a branch off this one and bring it back. 3 and the edge question are yours to confirm.

@colombod

Copy link
Copy Markdown
Collaborator

Heads-up: the completion work for this feature is now in #25, opened against main. That PR carries this commit (7854776) plus the fixes and changes required to merge to main and ship the new server (v5.0.0) — fail-closed auth (empty api_keys now errors), structural write-once (created_by excluded from props), edge/relationship provenance, an observable session-ownership invariant, removal of the init subcommand in favour of an api_keys-emitting Docker bootstrap + doc-driven setup, full operator docs + AGENTS.md, and a version bump to 5.0.0. It also fixes a pre-existing test-spy arity break (the created_by param added to _write_batch here wasn't propagated to two spies in tests/neo4j/test_flush_lock_ordering.py, which fail under Docker-backed CI). Verified: 1409 non-Docker tests + 79 isolated-Neo4j tests green, ruff/pyright clean. Please review and land via that PR (merge preserving history — not squash — so your authorship on 7854776 is retained). Suggest closing this PR in favour of it once you've reviewed.

@bkrabach bkrabach merged commit 7854776 into main Jun 22, 2026
3 checks passed
@bkrabach

Copy link
Copy Markdown
Collaborator Author

Superseded by #25, now merged to main. #25 carries commit 7854776 (this work) plus the completion fixes from review — fail-closed empty keystore, structural write-once created_by, edge/relationship provenance, init removal in favor of doc-driven setup — and ships v5.0.0. Thanks for the thorough review. Closing in favor of #25.

@colombod

Copy link
Copy Markdown
Collaborator

All changes from this PR are now in main: it was completed and superseded by #25, which has been merged (rebased for linear history). main now contains your foundation commit 7854776 plus the completion work required to ship the new server (v5.0.0) — fail-closed auth, structural write-once, edge provenance, observable session-ownership invariant, init removal + api_keys Docker bootstrap, full operator docs, and the version bump. Authorship is preserved in the rebase. GitHub already shows this PR as merged since its commit is in main — nothing here is lost. Closing out. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants