fix: chunk Neo4j flush into bounded sub-transactions to prevent transaction-memory OOM by bkrabach · Pull Request #14 · microsoft/amplifier-context-intelligence

bkrabach · 2026-06-18T05:31:49Z

Problem

Under sustained ingest backpressure (e.g. draining a large per-session backlog after a restart), the Neo4j write flush builds a single unbounded transaction and OOMs:

neo4j.exceptions.TransientError: MemoryPoolOutOfMemoryError
"...would use more than the limit 20.6 GiB. Currently using 20.6 GiB.
 dbms.memory.transaction.total.max threshold reached"

The cause is an unbounded transaction, not undersized memory. Observed in the field: the "currently using" figure tracks whatever ceiling is configured (it climbed in lock-step when the ceiling was raised 20.6 GiB → 40 GiB) while the Neo4j process RSS stayed ~4.7 GiB. Raising the memory limit only moves the wall.

Root cause

Neo4jGraphStore._flush_body snapshots the entire _node_buffer + _edge_buffer + _label_patches and writes them in one execute_write(_write_batch, ...) transaction. On failure the finally block restores the whole snapshot back into the buffers, so under continued load each failed flush makes the next one larger → a self-amplifying grow-spiral that never commits. The per-session queue offset therefore never advances, and the data is stuck (it stays safe on disk in the JSONL queue, but never reaches the graph). Both ingest (registry._flush_barrier) and session finalization (registry._finalize_session) funnel through the same flush(), so both paths exhibit it.

The existing _DRAIN_MAX_BATCH = 100 read cap does not help — the buffer/flush, not the read batch, is the unbounded unit.
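The grow-spiral described above can be illustrated with a toy model. This is only a sketch: the function name, the item counts, and the idea of expressing the transaction limit as an item count are assumptions made for illustration, not values or code from the store.

```python
def simulate_restore_whole_snapshot(rounds, arrivals_per_round, tx_limit):
    """Return the buffer size seen at the start of each flush attempt.

    Toy model (hypothetical): a flush commits only if the whole snapshot
    fits under tx_limit; otherwise the whole snapshot is restored.
    """
    buffer_size = tx_limit + 1  # backlog is already too large for one transaction
    sizes = []
    for _ in range(rounds):
        sizes.append(buffer_size)
        snapshot = buffer_size
        buffer_size = 0  # the snapshot has been taken out of the buffer
        if snapshot > tx_limit:
            buffer_size += snapshot  # failed flush: everything goes back
        buffer_size += arrivals_per_round  # writes that arrived during the flush
    return sizes


sizes = simulate_restore_whole_snapshot(rounds=4, arrivals_per_round=100, tx_limit=1000)
print(sizes)  # [1001, 1101, 1201, 1301]
```

Once the snapshot exceeds what one transaction can hold, every retry starts from a strictly larger buffer, so no attempt can ever commit.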

Fix

Write the buffered snapshot in bounded sub-transactions of at most neo4j_flush_chunk_size items each:

_flush_body now writes in three ordered phases — nodes → edges → label-patches (ordering preserved for referential integrity: edges/patches MATCH nodes that must already exist). Each chunk is its own execute_write, so per-transaction memory is bounded regardless of backlog size.
_write_batch is unchanged — still a pure function of its args; each phase passes only its category populated.
Failure restores only the un-committed remainder (from the failing chunk onward), merged with any concurrent new writes. Already-committed chunks are durable and idempotent (MERGE on uniqueness constraints) and are not restored — so the buffer can only shrink across retries. This removes the grow-spiral.
The all-or-raise caller contract is preserved: a failure re-raises, the offset isn't committed, and the drainer re-runs the batch idempotently.
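
The behaviour listed above can be sketched roughly as follows. This is not the code from the PR: ChunkedFlusher, PHASES, and write_chunk are hypothetical stand-ins (write_chunk plays the role of one execute_write transaction), and the sketch is single-threaded, so it leaves out whatever locking the real store uses.

```python
PHASES = ("nodes", "edges", "patches")  # order matters: edges/patches reference nodes


class ChunkedFlusher:
    """Hypothetical stand-in for the store's buffers and flush logic."""

    def __init__(self, write_chunk, chunk_size=200):
        self.write_chunk = write_chunk  # callable(phase, items): one transaction
        self.chunk_size = chunk_size
        self.buffers = {phase: [] for phase in PHASES}

    def flush(self):
        # Snapshot and clear, so new writes can keep landing in the buffers.
        snapshot = {phase: list(self.buffers[phase]) for phase in PHASES}
        for phase in PHASES:
            self.buffers[phase].clear()
        # Everything is un-committed until its chunk has been written.
        remainder = {phase: list(snapshot[phase]) for phase in PHASES}
        try:
            for phase in PHASES:
                items = snapshot[phase]
                for start in range(0, len(items), self.chunk_size):
                    end = start + self.chunk_size
                    self.write_chunk(phase, items[start:end])
                    remainder[phase] = items[end:]  # this chunk is now durable
        except Exception:
            # Restore only what was not committed, ahead of any new writes.
            for phase in PHASES:
                self.buffers[phase][:0] = remainder[phase]
            raise  # all-or-raise: the caller must not advance its offset


# Usage: inject one failure on the third node chunk, then retry.
committed = []
fail_once = {"armed": True}


def write_chunk(phase, chunk):
    if fail_once["armed"] and phase == "nodes" and len(committed) == 2:
        fail_once["armed"] = False
        raise RuntimeError("injected failure")
    committed.append((phase, len(chunk)))


flusher = ChunkedFlusher(write_chunk, chunk_size=2)
flusher.buffers["nodes"].extend(["n1", "n2", "n3", "n4", "n5"])
flusher.buffers["edges"].append("e1")
try:
    flusher.flush()
except RuntimeError:
    pass
remaining_after_failure = {phase: list(flusher.buffers[phase]) for phase in PHASES}
flusher.flush()  # the retry only has the remainder left to write
```

After the injected failure the buffers hold only n5 and e1; the four nodes already written are not queued again.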

New tunable

neo4j_flush_chunk_size (default 200) — added to Settings, reachable via AMPLIFIER_CONTEXT_INTELLIGENCE_SERVER_NEO4J_FLUSH_CHUNK_SIZE or neo4j_flush_chunk_size: in server-config.yaml, matching the existing write_concurrency pattern.
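
As a rough illustration of how such a tunable might be resolved, the sketch below reads the environment variable, falls back to a YAML-derived mapping, then to the default. The helper name, the precedence order (environment over YAML), and the validation are assumptions; the PR does not show how Settings actually resolves the value.

```python
import os

ENV_VAR = "AMPLIFIER_CONTEXT_INTELLIGENCE_SERVER_NEO4J_FLUSH_CHUNK_SIZE"
DEFAULT_CHUNK_SIZE = 200


def resolve_flush_chunk_size(env=None, yaml_config=None):
    """Hypothetical helper: env var first, then YAML key, then the default."""
    env = os.environ if env is None else env
    yaml_config = yaml_config or {}  # e.g. a parsed server-config.yaml mapping
    raw = env.get(ENV_VAR, yaml_config.get("neo4j_flush_chunk_size", DEFAULT_CHUNK_SIZE))
    value = int(raw)
    if value < 1:
        raise ValueError("neo4j_flush_chunk_size must be a positive integer")
    return value
```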

Tests / proof

New, in tests/neo4j/test_concurrent_flush.py (@pytest.mark.neo4j, real Neo4j container):

test_chunked_flush_writes_all_in_bounded_chunks — buffers 500 nodes + 200 edges at chunk=50; asserts all rows land and no transaction exceeded 50 items (instruments _write_batch per-call sizes).
test_chunked_flush_partial_failure_restores_only_remainder — injects a mid-flush failure; asserts already-committed chunks are durable, the buffer holds only the remainder (not the whole snapshot), and a retry completes.

Results on this branch:

tests/neo4j/test_concurrent_flush.py — 6 passed (4 pre-existing flush/drain/poison tests = no regression, plus the 2 above), real neo4j:5.26.22-community.
Unit suite pytest -m "not neo4j and not integration" — 1273 passed.
ruff + pyright — clean.

Notes

Default neo4j_flush_chunk_size=200 is conservative; operators can tune.
Tracking/discussion: microsoft-amplifier/amplifier-support#278 (filed there because Issues are disabled on this repo — re-enabling Issues would help external triage).
Based on main @ 750de9d.

…action-memory OOM

Previously, _flush_body wrote the entire node+edge+patch buffer in a single execute_write transaction. On failure, the finally block would restore and re-merge the whole snapshot, causing a self-amplifying grow-spiral under continued load: each failed flush became larger than the last, leading to unbounded transaction memory growth and MemoryPoolOutOfMemoryError.

This fix rewrites the flush to:

- Write in bounded sub-transactions (nodes→edges→patches)
- Respect neo4j_flush_chunk_size (default 200)
- Restore only the un-committed remainder on failure (killing the grow-spiral)
- Maintain strict ordering for referential integrity

Adds neo4j_flush_chunk_size tunable (AMPLIFIER_CONTEXT_INTELLIGENCE_SERVER_NEO4J_FLUSH_CHUNK_SIZE).

Tests: 6/6 real-neo4j flush tests pass (4 existing = no regression, 2 new bounding + failure-restore tests), 1273 unit tests pass, ruff+pyright clean.

Fixes: microsoft-amplifier/amplifier-support#278

Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

colombod · 2026-06-18T10:25:51Z

Thanks @bkrabach — we independently arrived at the same core fix (chunked bounded sub-transactions in _flush_body, _write_batch untouched, both drain + finalization paths), which is good validation of the approach.

#15 is a functional superset of this PR, so I'd propose consolidating onto it and closing this one:

Same OOM fix, plus a serialized-byte co-bound alongside the row cap (a 200-row chunk can still OOM on a few fat tool_input/messages rows; the byte bound closes that).
Phase 2 failure-visibility (which #278 also asked for): last_successful_flush + orphaned_sessions() surfaced on /status, so a wedged finalization orphan is no longer silent.
Stronger proof: a live test that reproduces a real MemoryPoolOutOfMemoryError and shows the chunked flush drains the same buffer, on both paths, with a kill/restart "frozen-across-restarts" arc; plus a dangling-node reader audit; a db.memory.transaction.max deployment cap; and the 4.0.1 bump.
One deliberate divergence: #15 (fix: chunked Neo4j flush eliminates OOM-induced ingest stall + failure-visibility signal (v4.0.1)) restores the whole snapshot and re-raises rather than tracking the uncommitted remainder. That is correctness-equivalent under idempotent MERGE, and your remainder optimization is easy to port if preferred.
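
For readers comparing the two PRs, the row-plus-byte co-bound mentioned above could look something like the sketch below. It is an assumption about the general technique, not #15's implementation: the function name, the use of JSON length as the size estimate, and the limits are all illustrative.

```python
import json


def chunk_by_rows_and_bytes(rows, max_rows, max_bytes):
    """Yield chunks that respect both a row cap and a serialized-byte cap.

    Hypothetical sketch: a single row larger than max_bytes is emitted
    alone in its own chunk rather than being dropped or split.
    """
    chunk, chunk_bytes = [], 0
    for row in rows:
        row_bytes = len(json.dumps(row).encode("utf-8"))
        too_many_rows = len(chunk) >= max_rows
        too_many_bytes = chunk_bytes + row_bytes > max_bytes
        if chunk and (too_many_rows or too_many_bytes):
            yield chunk
            chunk, chunk_bytes = [], 0
        chunk.append(row)
        chunk_bytes += row_bytes
    if chunk:
        yield chunk


small = {"v": "x"}        # serializes to 10 bytes
fat = {"v": "x" * 991}    # serializes to 1000 bytes
rows = [small] * 5 + [fat] + [small] * 2
sizes = [len(c) for c in chunk_by_rows_and_bytes(rows, max_rows=3, max_bytes=500)]
print(sizes)  # [3, 2, 1, 2]
```

With a row cap alone the fat row would share a transaction with its neighbours; the byte cap isolates it.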

Proposing we close this in favor of #15. Thanks for the independent confirmation of the root cause and fix.

colombod closed this Jun 18, 2026

bkrabach wants to merge 1 commit into main from fix/chunked-flush-bounded-transaction
