fix(upload-cli): default to replay=true, add --no-replay opt-out (supersedes #22) #47

Merged: colombod merged 1 commit into main from fix/upload-replay-default-v2 on Jun 29, 2026

@colombod

Collaborator

What & why

The upload CLI never passed ?replay=true, so the server's 7-day in-memory idempotency cache silently dropped session:start events on re-upload — leaving Session nodes without started_at. This makes replay=true the default (bypassing that cache; Neo4j idempotency still holds via MERGE + SET n += row.props) and adds --no-replay to restore the old dedup behaviour for live, in-progress sessions.
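
To illustrate why bypassing the cache is safe, here is a minimal sketch of the MERGE + SET n += row.props semantics on a dict-backed stand-in for the graph. The Cypher string and the `merge_and_set` helper are assumptions for illustration only, not the server's actual query or code; null-valued properties are not modelled.

```python
# Hypothetical query shape; only "MERGE + SET n += row.props" comes from the PR text.
ASSUMED_CYPHER = (
    "UNWIND $rows AS row "
    "MERGE (n:Session {session_id: row.session_id}) "
    "SET n += row.props"
)


def merge_and_set(graph: dict, session_id: str, props: dict) -> None:
    """Mimic MERGE on session_id followed by SET n += props (illustrative only)."""
    # MERGE: reuse the node if it exists, otherwise create it.
    node = graph.setdefault(session_id, {"session_id": session_id})
    # SET n += props: add or overwrite the listed keys, leave other keys alone.
    node.update(props)


graph: dict = {}
start_props = {"started_at": "2026-06-29T19:00:00Z"}

merge_and_set(graph, "replay-e2e-sess-001", start_props)
after_first = {sid: dict(node) for sid, node in graph.items()}

# Re-applying the same event leaves the graph unchanged.
merge_and_set(graph, "replay-e2e-sess-001", start_props)
after_second = {sid: dict(node) for sid, node in graph.items()}
```

Because the write is an upsert keyed on the session, reprocessing the same session:start a second time leaves one node with the same properties rather than creating a duplicate.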

Supersedes #22 (do not merge #22)

#22 carried this exact fix but is CONFLICTING / DIRTY and ~5 weeks stale: main has since rewritten both cli.py and uploader.py with the dual-auth feature. Rather than merge a conflicting stale branch over the auth work, the fix is re-applied cleanly on current main (the auth-aware run_upload(auth_strategy=...) / build_auth_strategy(...) code). Verified the fix is still absent from main before porting.

Changes

  • uploader.py: run_upload() gains replay: bool = True and POSTs with params={"replay": "true"} when replay is true (params=None under --no-replay).
  • cli.py: adds a --no-replay flag (default off); usage + IDEMPOTENCY help rewritten.
  • tests: red-green coverage for three cases: default sends ?replay=true, --no-replay sends no params, and the flag wiring.
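
The flag-to-param wiring described above can be sketched roughly as follows. This is not the code from cli.py or uploader.py: `build_replay_params`, `build_parser`, `params_from_argv`, and the `upload` prog name are hypothetical, and the real run_upload() also takes an auth_strategy that is omitted here.

```python
import argparse
from typing import Optional


def build_replay_params(replay: bool = True) -> Optional[dict]:
    """Return query params for the upload POST (hypothetical helper)."""
    # replay=True asks the server to bypass its idempotency cache.
    return {"replay": "true"} if replay else None


def build_parser() -> argparse.ArgumentParser:
    # The prog name is a placeholder, not the real CLI entry point.
    parser = argparse.ArgumentParser(prog="upload")
    parser.add_argument(
        "--no-replay",
        action="store_true",  # default off, so replay stays on unless requested
        help="Keep server-side dedup (for live, in-progress sessions).",
    )
    return parser


def params_from_argv(argv: list[str]) -> Optional[dict]:
    """Map CLI arguments to the params the uploader would send."""
    args = build_parser().parse_args(argv)
    return build_replay_params(replay=not args.no_replay)


default_params = params_from_argv([])
opt_out_params = params_from_argv(["--no-replay"])
```

Modelling the opt-out as a store_true flag keeps the safe behaviour (replay on) as the no-argument default.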

Proof (unit-only — the fix is "client sends ?replay=true")

  • uv run pytest -q: 182 passed; 8 replay tests green (red-green confirmed: assert None == {'replay': 'true'} before the change).
  • ruff check + ruff format --check clean; pyright 0 errors.
  • No DTU/Neo4j needed — the server-side cache-bypass behaviour is unchanged; this PR only ensures the client requests it.
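
A unit test of this kind might look like the sketch below, which records the outgoing POST instead of sending it. `RecordingClient` and `run_upload_sketch` are hypothetical stand-ins, not the project's test fixtures or the real run_upload(); the URL is a placeholder.

```python
from typing import Any, Optional


class RecordingClient:
    """Stand-in HTTP client that records each POST instead of sending it."""

    def __init__(self) -> None:
        self.calls: list[dict] = []

    def post(self, url: str, json: Any = None, params: Optional[dict] = None) -> dict:
        self.calls.append({"url": url, "params": params})
        return {"status": "completed"}


def run_upload_sketch(
    client: RecordingClient, url: str, events: list, replay: bool = True
) -> dict:
    # Hypothetical stand-in for run_upload(); auth handling is omitted.
    params = {"replay": "true"} if replay else None
    return client.post(url, json=events, params=params)


def test_default_sends_replay_true() -> None:
    client = RecordingClient()
    run_upload_sketch(client, "http://example.invalid/events", [{"event": "session:start"}])
    # Before the fix this comparison would have been None == {'replay': 'true'}.
    assert client.calls[-1]["params"] == {"replay": "true"}


def test_no_replay_sends_no_params() -> None:
    client = RecordingClient()
    run_upload_sketch(
        client, "http://example.invalid/events", [{"event": "session:start"}], replay=False
    )
    assert client.calls[-1]["params"] is None
```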

The upload CLI never passed ?replay=true, so the server's 7-day in-memory
idempotency cache silently dropped session:start events on re-upload,
leaving Session nodes without started_at. Default replay=true bypasses that
cache (Neo4j idempotency still holds via MERGE + SET n += row.props);
--no-replay restores the old dedup behaviour for live in-progress sessions.

- uploader.py: run_upload() gains replay: bool = True; POSTs with
  params={"replay":"true"} when set (None when --no-replay).
- cli.py: --no-replay flag (default off); usage + IDEMPOTENCY help rewritten.
- tests: red-green coverage for the query param and the flag.

Re-applied on current main (auth-aware cli.py/uploader.py) — supersedes the
stale, conflicting #22. Unit-proven: 182 passed, ruff + pyright clean.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
@colombod

Collaborator Author

End-to-end proof (ROB's gate): started_at verified against a real server + Neo4j

Per review feedback, the unit tests only proved "the client emits ?replay=true." This adds the missing end-to-end layer: a real context-intelligence server (local pr-29 wheel) + Neo4j, in an isolated Digital Twin (DTU) instance, static-key auth, clean DB, with the upload CLI from this PR branch (fix/upload-replay-default-v2) driving every step.

Environment: isolated DTU ci-server-replay @ http://localhost:38100, main:asgi_app (auth-wrapped, static bearer), neo4j:5.26-community (empty). Server boot verified: POST /events no bearer → 401; POST /cypher w/ bearer → 200.

Fixture: one session replay-e2e-sess-001, a single session:start event carrying data.timestamp=2026-06-29T19:00:00Z, workspace replay-e2e-ws. Read-back via the server's /cypher endpoint: MATCH (s:Session) RETURN s.session_id, s.started_at, s.status.

Scenario 1 — the fix (default replay=true): upload twice, started_at present after the re-upload

[upload #1, default]   {"status":"completed","sessions_uploaded":1,"events_uploaded":1}
  read -> {"sid":"replay-e2e-sess-001","started_at":"2026-06-29T19:00:00.000000000+00:00","status":"running"}
[upload #2, default]   {"status":"completed","sessions_uploaded":1,"events_uploaded":1}   <- the RE-upload
  read -> {"sid":"replay-e2e-sess-001","started_at":"2026-06-29T19:00:00.000000000+00:00","status":"running"}  ✅ still present

Scenario 2 — the bug (--no-replay) vs the repair (default), driven through the CLI

The decisive contrast. Same fixture, same server, same cache state; the only variable is the --no-replay flag. To make the durable-store/in-memory-cache divergence observable, Neo4j is wiped between the seed and the re-upload (a stand-in for a graph rebuild / data loss / a failed first write — the condition under which the 7-day cache's silent dedup becomes user-visible as missing started_at).

[B2] CLI upload --no-replay (first time; seeds the 7-day idempotency cache, event IS processed)
  read -> started_at = 2026-06-29T19:00:00...   (present)
[B3] wipe Neo4j  (durable graph gone; in-memory cache PERSISTS)   -> node count 0
[B4] CLI RE-upload --no-replay  (OLD behaviour)
  read -> {"results": []}                       ❌ started_at ABSENT  (session:start silently deduped, node never recreated)  <- THE BUG
  node count -> 0
[B5] CLI RE-upload (DEFAULT, replay=true)  (THE FIX)
  read -> started_at = 2026-06-29T19:00:00...   ✅ RESTORED  (cache bypassed, session:start reprocessed)  <- THE FIX

Conclusion: B4 → B5 is the smoking gun — identical inputs, the lone difference being --no-replay vs the new default. Under the old behaviour a re-upload cannot repair a missing started_at (the start event is dropped by the cache); under this PR's default it does. The fix is real end-to-end, not just at the request layer.

Mechanism confirmed in server source on pr-29: main.py:342-351 (if request.idempotency_key and not replay: → returns {"status":"duplicate"} without queuing), idempotency.py:9-44 (7-day in-memory cache), handlers/data_layer_2/session.py:153-173 (session:start → started_at upsert).
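
The B2 to B5 sequence can be reproduced in miniature with the sketch below. It is a toy model under stated assumptions, not the server's code: `IdempotencyCache`, `FakeServer`, the key `k1`, and the "processed" status are hypothetical; only the `idempotency_key and not replay` guard, the {"status": "duplicate"} response, and the 7-day window come from the source.

```python
import time
from typing import Optional

SEVEN_DAYS_SECONDS = 7 * 24 * 60 * 60


class IdempotencyCache:
    """In-memory key -> expiry map (toy TTL dedup cache)."""

    def __init__(self, ttl_seconds: float = SEVEN_DAYS_SECONDS) -> None:
        self._ttl = ttl_seconds
        self._expiry: dict[str, float] = {}

    def seen(self, key: str) -> bool:
        expires_at = self._expiry.get(key)
        return expires_at is not None and expires_at > time.time()

    def remember(self, key: str) -> None:
        self._expiry[key] = time.time() + self._ttl


class FakeServer:
    def __init__(self) -> None:
        self.cache = IdempotencyCache()
        self.graph: dict[str, dict] = {}  # stands in for Neo4j

    def post_event(
        self, idempotency_key: Optional[str], session_id: str,
        started_at: str, replay: bool = False,
    ) -> dict:
        # Dedup applies only when a key is present and replay is off.
        if idempotency_key and not replay and self.cache.seen(idempotency_key):
            return {"status": "duplicate"}
        if idempotency_key:
            self.cache.remember(idempotency_key)
        self.graph[session_id] = {"started_at": started_at}  # simplified upsert
        return {"status": "processed"}  # placeholder status, not from the source


server = FakeServer()
sid, ts = "replay-e2e-sess-001", "2026-06-29T19:00:00Z"

b2 = server.post_event("k1", sid, ts)               # seeds the cache
server.graph.clear()                                # B3: wipe the durable store
b4 = server.post_event("k1", sid, ts)               # old behaviour: deduped
graph_after_b4 = dict(server.graph)
b5 = server.post_event("k1", sid, ts, replay=True)  # the fix: cache bypassed
```

In this model the graph stays empty after B4 because the cache outlives the wipe, and only the replay=True call in B5 recreates the node.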

🤖 Generated with Amplifier

@colombod colombod merged commit 72800db into main Jun 29, 2026
8 checks passed
@colombod colombod deleted the fix/upload-replay-default-v2 branch June 29, 2026 19:51