Make SDK ingestion docs agent-friendly by elenagaljak-db · Pull Request #387 · databricks/zerobus-sdk

elenagaljak-db · 2026-06-18T16:02:09Z

What

Reworks the ingestion documentation across all five SDKs (Rust, Python, TypeScript, Java, Go) plus the root docs so that clients — including AI coding assistants — write performant code by default.

Why

Users (and agents) were producing slow clients that call the wait-for-acknowledgment method (wait_for_offset / waitForOffset / WaitForOffset, or .join() on a per-record future) after every ingest. Because ingestion is asynchronous and pipelined, waiting per record forces a full server round-trip before the next record is sent, limiting throughput to ~one record per round-trip. The docs were partly to blame: every SDK led with the per-record-wait example, and the high-throughput pattern was buried far below.
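
To make the cost concrete, here is a rough back-of-the-envelope model, not a measurement of any SDK. It assumes a fixed per-record send cost and a fixed round-trip time; the function and parameter names (`per_record_wait_seconds`, `pipelined_seconds`, `send_s`, `round_trip_s`) are hypothetical and exist only for this sketch.

```python
def per_record_wait_seconds(n_records: int, send_s: float, round_trip_s: float) -> float:
    """Each record is sent, then the client blocks for one full round-trip."""
    return n_records * (send_s + round_trip_s)


def pipelined_seconds(n_records: int, send_s: float, round_trip_s: float) -> float:
    """Records are sent back to back; only the final flush pays a round-trip."""
    if n_records == 0:
        return 0.0
    return n_records * send_s + round_trip_s


if __name__ == "__main__":
    # Illustrative numbers only (assumed, not measured).
    n, send_s, rtt_s = 10_000, 0.00001, 0.005
    slow = per_record_wait_seconds(n, send_s, rtt_s)
    fast = pipelined_seconds(n, send_s, rtt_s)
    print(f"per-record wait: {slow:.2f}s, pipelined: {fast:.3f}s")
```

Under these assumptions the per-record variant scales with the number of round-trips, while the pipelined variant pays the round-trip once, which is the gap the reworked docs are meant to close.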

Changes

Lead with the idiomatic flow — ingest in a loop, then flush() once — as the first example in every README. Per-record waiting is presented as a legitimate tool for low-volume "confirm this specific record" cases, not the default.
Positive framing: a new "Acknowledgments and throughput" section explains how acks work, positioned after the API descriptions (JSON / Protobuf / Arrow) rather than wedged before them.
API doc comments updated where consumers actually see them: Rust ///, Python docstrings (sync + async), TypeScript JSDoc, Java Javadoc, Go godoc — on ingest_*, wait_for_offset/flush and equivalents.
Examples: single-record examples now ingest-then-flush() once.
Java QuickStart rewritten off the deprecated ingestRecord().join()-in-a-loop onto ingestRecordOffset() + wait-once.
CLAUDE.md (root + per-SDK) gain a "client code patterns" / performance rule for in-repo agents.
NEXT_CHANGELOG.md updated in each SDK under Documentation.

Notes

The "wait on the last offset confirms all prior" guidance is verified against the Rust core: the ack watermark is monotonic (wait_for_offset_internal waits for last_received_offset >= target).
Ack callbacks exist only in Rust/Python/Java; the TS and Go docs use the flush()/last-offset approach (no invented APIs).
Verified: Rust doctests pass, Rust examples compile, Go is gofmt-clean, Python compiles. Docs/prose only — no library logic changed.
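
The watermark argument in the first note can be illustrated with a toy model. This is a minimal sketch, not the Rust core's code: the `AckWatermark` class and its `on_ack` method are hypothetical, and it assumes acknowledgments are cumulative, so an ack for offset N covers every offset up to N.

```python
import threading
from typing import Optional


class AckWatermark:
    """Toy model of a monotonic acknowledgment watermark (not the SDK's code)."""

    def __init__(self) -> None:
        self._last_received_offset = -1  # nothing acknowledged yet
        self._cond = threading.Condition()

    def on_ack(self, offset: int) -> None:
        """Record an ack; the watermark never moves backwards."""
        with self._cond:
            if offset > self._last_received_offset:
                self._last_received_offset = offset
                self._cond.notify_all()

    def wait_for_offset(self, target: int, timeout: Optional[float] = None) -> bool:
        """Block until last_received_offset >= target; return False on timeout."""
        with self._cond:
            return self._cond.wait_for(
                lambda: self._last_received_offset >= target, timeout
            )


if __name__ == "__main__":
    watermark = AckWatermark()
    watermark.on_ack(5)
    # Waiting on an earlier offset returns at once: offset 5 already covers it.
    print(watermark.wait_for_offset(3, timeout=0.1))
```

Because the comparison is `>=` against a value that only grows, a wait on the last offset cannot return before all earlier offsets are covered, which is why one wait at the end is enough in this model.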

This pull request and its description were written by Isaac.

Co-authored-by: Isaac
Signed-off-by: elenagaljak-db <elena.galjak@databricks.com>

teodordelibasic-db

The public doc examples in record_types.rs and arrow_stream.rs still follow each ingest_* call with an immediate stream.wait_for_offset(offset).await?;.

Per location, the line to change and the fix:

rust/sdk/src/record_types.rs:59 (ProtoBytes example) — we can drop the trailing
stream.wait_for_offset(offset).await?; in favor of stream.flush().await?;, or add a one-line note that per-record waiting is for low-volume confirmation only.
rust/sdk/src/record_types.rs:83 (JsonString example) — same change.
rust/sdk/src/record_types.rs:108 (ProtoMessage example) — same change.
rust/sdk/src/record_types.rs:141 (JsonValue example) — same change.
rust/sdk/src/arrow_stream.rs:190 (ZerobusArrowStream type-level example) — the example ingests one batch, then waits on its offset; lead with the loop-then-flush() pattern instead, matching lib.rs, and keep the single-batch wait only as the labeled low-volume case.
rust/sdk/src/arrow_stream.rs:1221 (ingest_batch method example) — same: replace the immediate
wait_for_offset(offset) with flush() once at the end, or label it as the confirm-a-specific-batch case.

teodordelibasic-db · 2026-06-22T09:18:08Z

+6. **Enable Recovery** - Always set `Recovery: true` in production environments
+7. **Use Batch Ingestion** - For high throughput, ingest many records before calling `Flush()`
+8. **Monitor Errors** - Log and alert on non-retryable errors
+9. **Use Protocol Buffers for Production** - More efficient than JSON for high-volume scenarios


The list below now has a duplicate number 9; the items that follow this line should be renumbered 10/11/12.

teodordelibasic-db · 2026-06-22T09:19:08Z

 }
 log.Printf("Batch queued with offset: %d", batchOffset)
+// ... ingest more batches ...
+stream.Flush() // wait for everything at the end


Other Flush() and WaitForOffset() calls in this README check for errors; this lone one does not. We can change to something like:

if err := stream.Flush(); err != nil { log.Fatal(err) }

teodordelibasic-db · 2026-06-22T09:20:56Z

+**Idiomatic flow:**
+
+```python
+for record in records:


The Quick Start above this is async (await ... ; asyncio.run(main())), so maybe we should change this section to use async as well.

for record in records:
    await stream.ingest_record_offset(record)
await stream.flush()

teodordelibasic-db · 2026-06-22T09:21:32Z

+**Confirming a specific record** (waiting on the last offset confirms all prior records):
+
+```python
+for record in records:


ditto:

for record in records:
    offset = await stream.ingest_record_offset(record)
await stream.wait_for_offset(offset)  # confirm the run before continuing

teodordelibasic-db · 2026-06-22T09:23:18Z

-    /// // Wait for both to be acknowledged
-    /// await stream.waitForOffset(offset2);
+    /// // High-throughput pattern: ingest in a loop, wait once at the end.
+    /// let lastOffset;


If records is empty, lastOffset is undefined and waitForOffset(undefined) runs instead of being a no-op. This is inconsistent with the neighboring ingestRecordsOffset() example (line 486), which correctly guards with a null sentinel. Either prefer await stream.flush(); here, or guard:

let lastOffset: bigint | null = null;
for (const record of records) lastOffset = await stream.ingestRecordOffset(record);
if (lastOffset !== null) await stream.waitForOffset(lastOffset);
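
The same empty-input hazard applies in any language. Below is a minimal Python sketch of the sentinel guard. `FakeStream` and `ingest_all` are hypothetical stand-ins written for this illustration, not SDK classes; only the method names `ingest_record_offset` and `wait_for_offset` are borrowed from the Python examples in this thread.

```python
import asyncio
from typing import Iterable, List, Optional


class FakeStream:
    """Hypothetical in-memory stand-in for a stream; not the SDK's class."""

    def __init__(self) -> None:
        self._next_offset = 0
        self.waited_on: List[int] = []  # records every offset we waited for

    async def ingest_record_offset(self, record: bytes) -> int:
        offset = self._next_offset
        self._next_offset += 1
        return offset

    async def wait_for_offset(self, offset: int) -> None:
        self.waited_on.append(offset)


async def ingest_all(stream: FakeStream, records: Iterable[bytes]) -> Optional[int]:
    last_offset: Optional[int] = None  # sentinel: nothing ingested yet
    for record in records:
        last_offset = await stream.ingest_record_offset(record)
    if last_offset is not None:  # skip the wait entirely for empty input
        await stream.wait_for_offset(last_offset)
    return last_offset


if __name__ == "__main__":
    stream = FakeStream()
    asyncio.run(ingest_all(stream, []))
    print(stream.waited_on)  # no wait is issued for an empty input
```

With the guard in place, an empty input never reaches the wait call, so the wait is a true no-op rather than a call with a missing offset.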

teodordelibasic-db · 2026-06-22T09:24:08Z

    ///
    /// ```typescript
-    /// const offsets = [];
+    /// let lastOffset;


Same as typescript/src/lib.rs:423.

teodordelibasic-db · 2026-06-22T09:25:00Z

-    stream.wait_for_offset(offset).await?;
-}
+// Returns Some(offset) for non-empty batches, None for empty batches.
+// Queue many batches this way; flush() once when done.


Since the comment says "flush() once when done", we can add a trailing stream.flush().await?; so the fragment is self-contained.

teodordelibasic-db · 2026-06-22T09:26:20Z

 ```typescript
-// High-throughput pattern: send many, wait once
+// Idiomatic flow: ingest in a loop, then flush once
 const offset1 = await stream.ingestRecordOffset(record1);  // Resolves immediately


offset1 is dead. We can use a small loop with a single lastOffset (matching the JSDoc in src/lib.rs) or
drop the first binding.

teodordelibasic-db · 2026-06-22T09:28:02Z

+## Client code patterns (performance)
+
+When writing or reviewing client/example code, follow the idiomatic async flow.
+`IngestRecordOffset()` (and `IngestRecordsOffset()` / `IngestBatch()`) return as


IngestBatch is a method on ZerobusArrowStream (go/arrow_stream.go:168), not on ZerobusStream where IngestRecordOffset/IngestRecordsOffset live. This reads as if all three are on the same type. We should maybe drop IngestBatch() from the list or note it belongs to the Arrow stream. Flagged by LLM so probably useful.

teodordelibasic-db · 2026-06-22T09:29:39Z

 5. **Use Protocol Buffers for production**: Protocol Buffers (the default) provides better performance and schema validation. Use JSON only when you need schema flexibility or for quick prototyping.
 6. **Store credentials securely**: Use environment variables, never hardcode credentials
 7. **Use batch ingestion**: For high-throughput scenarios, use `ingestRecordsOffset()` instead of individual `ingestRecordOffset()` calls
+8. **Ingest in a loop, then `flush()`**: `ingestRecordOffset()` / `ingestRecordsOffset()` resolve as soon as the record is queued; the SDK sends and tracks acknowledgment in the background. Confirm durability with a single `flush()` (once for a bounded batch, or periodically for a long-running stream). Each ingest returns an offset, and `waitForOffset(offset)` confirms a specific record when you need it (acks are ordered, so the last offset confirms the whole run). Just avoid calling `waitForOffset()` after every record in a tight loop, since that limits throughput to one record per round-trip.


This repeats the same explanation as the "Acknowledgments and throughput" blockquote at line 185. We can shorten #8 to a cross-reference ("See Acknowledgments and throughput above"). Separately, other SDKs give this its own ### Acknowledgments and throughput heading (e.g. python/README.md), whereas TS uses a long > blockquote - promoting it to a heading would match the rest of the set.

danilonajkov-db · 2026-06-25T16:17:23Z

One thing I have noticed in the READMEs is that a JSON example is always given first. We should put proto first and add a disclaimer about throughput to the JSON example.

elenagaljak-db · 2026-06-25T16:46:43Z

I don't agree that it should be first, since JSON is easier to set up for a quick test and it's always labeled as a quick start. We can emphasize more that it's better to use proto/arrow for production.

teodordelibasic-db · 2026-06-26T12:13:26Z

> I don't agree that it should be first, since JSON is easier to set up for a quick test and it's always labeled as a quick start. We can emphasize more that it's better to use proto/arrow for production.

Ideally, once we add support for dynamic protobuf etc., it should be as easy to set up as JSON.

Improve docs

5f0058a

elenagaljak-db force-pushed the readme-fix branch from 96faf6f to 5f0058a Compare June 18, 2026 16:13

elenagaljak-db requested a review from davidtosovic-db June 18, 2026 16:16

[Rust] Fix rustfmt in single-record examples

9c6c895

Co-authored-by: Isaac
Signed-off-by: elenagaljak-db <elena.galjak@databricks.com>

elenagaljak-db requested a review from teodordelibasic-db June 22, 2026 09:06

elenagaljak-db self-assigned this Jun 22, 2026

teodordelibasic-db reviewed Jun 22, 2026
