[C++] Core SDK implementation and build (split 3/6 from #415) by zlata-stefanovic-db · Pull Request #420 · databricks/zerobus-sdk

zlata-stefanovic-db · 2026-06-24T13:45:33Z

Summary

The core of the C++ SDK: a C++17, header + static-library wrapper over the
Zerobus C FFI (rust/ffi), which in turn wraps the Rust core. It gives C++
callers the same gRPC streaming / OAuth / recovery / ingestion engine as the
other SDKs, behind an idiomatic, exception-based, RAII API. This PR contains the
public API, its implementation, the build, and CI - but not the tests or
examples (those are stacked on top; see Merge order). Arrow Flight ingestion is
not part of this PR; it is peeled into its own PR in the split.

What's in this PR

Public API - cpp/include/zerobus/ (the surface; opaque, forward-declared
FFI handles only, so zerobus.h never leaks into consumers):

Sdk / SdkBuilder + TableProperties - connection factory and stream creation.
Stream - proto/JSON ingestion: single, batched, and fire-and-forget (*_nowait),
plus flush, wait_for_offset, get_unacked_records, close.
ProtoSchema - build a descriptor + encode records straight from Unity Catalog
table metadata (no .proto/protoc).
HeadersProvider - custom auth headers.
ZerobusException (is_retryable()), StreamOptions, UnackedRecord,
version(), and the zerobus.hpp umbrella header.

Implementation - cpp/src/ (the only place that includes zerobus.h):

sdk.cpp, stream.cpp, proto_schema.cpp forward to the C FFI;
headers_callback.cpp is the extern "C" trampoline.
src/detail/ internals: ResultGuard (CResult -> ZerobusException, always
freeing the C error string), the StreamOptions -> C config conversion, and the
trampoline declaration.
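
To illustrate the ResultGuard idea described above, here is a minimal sketch. The `CResult` layout, `fake_free_error_message`, and `fake_ffi_call_that_fails` are hypothetical stand-ins, not the real zerobus.h declarations. The sketch only shows the pattern: copy the error text, free the C string on every path, then throw.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the FFI result struct; the real zerobus.h layout
// may differ.
struct CResult {
  bool success;
  char* error_message;  // owned by the C side until explicitly freed
  bool is_retryable;
};

static int g_freed_messages = 0;  // demo hook: counts freed error strings

// Hypothetical stand-in for the FFI function that frees an error string.
static void fake_free_error_message(char* message) {
  if (message != nullptr) {
    std::free(message);
    ++g_freed_messages;
  }
}

class ZerobusException : public std::runtime_error {
 public:
  ZerobusException(const std::string& what, bool retryable)
      : std::runtime_error(what), retryable_(retryable) {}
  bool is_retryable() const noexcept { return retryable_; }

 private:
  bool retryable_;
};

// Owns one CResult for the duration of a single FFI call.
class ResultGuard {
 public:
  ResultGuard() : result_{true, nullptr, false} {}
  ~ResultGuard() { release(); }  // frees the string even if nobody checked
  ResultGuard(const ResultGuard&) = delete;
  ResultGuard& operator=(const ResultGuard&) = delete;

  CResult* ptr() noexcept { return &result_; }

  void throw_if_error() {
    if (result_.success) return;
    // Copy the message before freeing it, then throw the C++ exception.
    const std::string message = result_.error_message != nullptr
                                    ? result_.error_message
                                    : "unknown error";
    const bool retryable = result_.is_retryable;
    release();
    throw ZerobusException(message, retryable);
  }

 private:
  void release() noexcept {
    fake_free_error_message(result_.error_message);
    result_.error_message = nullptr;  // makes a second release a no-op
  }
  CResult result_;
};

// Hypothetical failing C entry point, used only to exercise the guard.
static void fake_ffi_call_that_fails(CResult* out) {
  assert(out != nullptr);
  const char* text = "stream closed";
  char* copy = static_cast<char*>(std::malloc(std::strlen(text) + 1));
  if (copy != nullptr) std::strcpy(copy, text);
  out->success = false;
  out->error_message = copy;
  out->is_retryable = true;
}
```

A call site would create a guard, pass `guard.ptr()` to the C function, and call `guard.throw_if_error()` afterwards. The destructor covers the case where the caller returns early.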

Build & CI:

CMakeLists.txt - library target (zerobus::zerobus), the install/
find_package(zerobus) export with the FFI archive bundled, and the
ZEROBUS_SANITIZE option (off by default).
cmake/BuildRustFfi.cmake builds libzerobus_ffi from local Rust source by
default, or links a prebuilt lib via -DZEROBUS_FFI_LIBRARY=.
Makefile (build/test/lint/fmt, SANITIZE= pass-through), .clang-format.
ci-cpp.yml (fmt + test + AddressSanitizer) and push.yml's path filter.

Design notes

Memory ownership: Rust owns every handle; wrapper classes are move-only and
free their handle exactly once (the source is nulled on move). Errors are thrown,
never returned.
No gRPC/Protobuf C++ dependency: all marshalling crosses the C FFI as byte
buffers / pointer arrays; the batch helpers build only the small pointer/length
arrays the C entry points need.
Distribution (separate PRs): CMake + GitHub Releases, no package manager.
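
The memory-ownership note above (move-only wrappers that free their handle exactly once) can be sketched roughly as follows. `FakeHandle`, `fake_handle_new`, `fake_handle_free`, and `OwnedHandle` are hypothetical names for illustration; the actual wrapper classes and FFI free functions are not shown here.

```cpp
#include <cassert>
#include <utility>

// Hypothetical opaque handle plus create/free functions standing in for the FFI.
struct FakeHandle {
  int id;
};

static int g_live_handles = 0;  // demo hook: handles created but not yet freed

static FakeHandle* fake_handle_new() {
  ++g_live_handles;
  return new FakeHandle{1};
}

static void fake_handle_free(FakeHandle* handle) {
  if (handle == nullptr) return;
  assert(g_live_handles > 0);
  delete handle;
  --g_live_handles;
}

// Move-only owner: copying is disabled, and a moved-from object holds nullptr,
// so the handle is freed exactly once.
class OwnedHandle {
 public:
  explicit OwnedHandle(FakeHandle* handle) noexcept : handle_(handle) {}
  ~OwnedHandle() { reset(); }

  OwnedHandle(const OwnedHandle&) = delete;
  OwnedHandle& operator=(const OwnedHandle&) = delete;

  OwnedHandle(OwnedHandle&& other) noexcept
      : handle_(std::exchange(other.handle_, nullptr)) {}

  OwnedHandle& operator=(OwnedHandle&& other) noexcept {
    if (this != &other) {
      reset();  // free whatever this object owned before taking over
      handle_ = std::exchange(other.handle_, nullptr);
    }
    return *this;
  }

  bool valid() const noexcept { return handle_ != nullptr; }

 private:
  void reset() noexcept {
    fake_handle_free(handle_);
    handle_ = nullptr;
  }
  FakeHandle* handle_;
};
```

Nulling the source on move is what keeps the destructor of the moved-from object from freeing the handle a second time.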

Merge order

Off main. Tests and examples are stacked on this PR and merge after it. The
add_subdirectory(tests)/(examples) wiring is intentionally omitted here and
arrives with those PRs, so this branch configures and builds the library cleanly
on its own.

Part of the #415 split.

Test plan

make build - configures and builds the library (FFI from source).
make lint - clang-format check + -Wall -Wextra.
Verified a separate project can find_package(zerobus) and link
zerobus::zerobus from a cmake --install tree.
Unit tests and the AddressSanitizer run land with the stacked tests PR.

## Summary

Core C++ SDK: the public headers (`include/`), implementation (`src/`), the Rust C FFI build glue (`cmake/`), the CMake build (library target, install / `find_package` export, sanitizer option), the `Makefile`, `.clang-format`, and the C++ CI (`ci-cpp.yml` + `push.yml` path filter). Builds the library. Part of the #415 split (4/6).

### Merge order

Off `main`. **Tests (5/6) and examples (6/6) are stacked on this PR** and merge after it. The `add_subdirectory(tests)`/`(examples)` wiring is intentionally not here yet; it arrives with those PRs. Draft until the stack is reviewed. Split from #415.

Signed-off-by: Zlata Stefanovic <zlata.stefanovic@databricks.com>

Warn against combining the fire-and-forget _nowait ingest APIs with a custom HeadersProvider: detached background tasks are not drained by close() or the destructor, so they can call back into the provider after the Stream releases it. Re-enable LeakSanitizer in the ASan CI job (was detect_leaks=0, which hid all leaks) with a narrow suppression file covering only the intentional once_cell/tokio runtime globals, so real wrapper leaks stay visible. Signed-off-by: Zlata Stefanovic <zlata.stefanovic@databricks.com>

- close(): keep the handle alive on a failed close so get_unacked_records() and retry still work; free only on success
- ingest_*_records: reject empty batches instead of returning the FFI -2 sentinel as an offset; nowait batch variants no-op on empty
- headers callback: signal a non-null error on OOM instead of failing open; reject keys/values containing embedded NUL bytes

Signed-off-by: Zlata Stefanovic <zlata.stefanovic@databricks.com>
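
As an illustration of the headers-callback hardening, the sketch below shows an extern "C" trampoline that fails closed. The callback signature, the status codes, and `StaticProvider` are assumptions for this example. The real FFI callback passes C arrays rather than a `std::map`, which is simplified away here.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical provider interface; the real HeadersProvider may differ.
class HeadersProvider {
 public:
  virtual ~HeadersProvider() = default;
  virtual std::map<std::string, std::string> get_headers() = 0;
};

// Hypothetical provider that returns a fixed set of headers.
class StaticProvider : public HeadersProvider {
 public:
  explicit StaticProvider(std::map<std::string, std::string> headers)
      : headers_(std::move(headers)) {}
  std::map<std::string, std::string> get_headers() override {
    return headers_;
  }

 private:
  std::map<std::string, std::string> headers_;
};

// Hypothetical status codes returned to the C caller.
enum FakeCallbackStatus { kOk = 0, kError = 1 };

// C strings cannot carry an embedded NUL, so such values are rejected.
static bool has_embedded_nul(const std::string& text) {
  return text.find('\0') != std::string::npos;
}

// The C side only sees this function and an opaque user_data pointer.
// No exception may cross the C boundary, so everything is caught here.
extern "C" int fake_headers_trampoline(void* user_data, void* out) {
  auto* provider = static_cast<HeadersProvider*>(user_data);
  auto* sink = static_cast<std::map<std::string, std::string>*>(out);
  if (provider == nullptr || sink == nullptr) return kError;
  try {
    std::map<std::string, std::string> headers = provider->get_headers();
    for (const auto& entry : headers) {
      if (has_embedded_nul(entry.first) || has_embedded_nul(entry.second)) {
        return kError;  // fail closed instead of silently truncating
      }
    }
    *sink = std::move(headers);
    return kOk;
  } catch (...) {
    return kError;  // includes std::bad_alloc: report it, never fail open
  }
}
```

The caller is expected to pass a `HeadersProvider*` as `user_data`, so the `static_cast` back from `void*` recovers the same pointer type that was stored.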

- Sdk::create(): route through the builder so the user-agent reports zerobus-sdk-cpp instead of the Rust default
- callback_max_wait_time_ms: leave the FFI default in place on nullopt instead of forcing None
- SdkBuilder: type the handle as CZerobusSdkBuilder* instead of void*
- add missing <cstddef>/<utility> includes to public headers

Signed-off-by: Zlata Stefanovic <zlata.stefanovic@databricks.com>

- installed CMake config recreates the zerobus_ffi target the export references, fixing external link failures
- FFI custom command depends on Rust sources so edits trigger a rebuild
- gate tests/examples options on existing subdirs; fail configure on version.hpp drift
- narrow LSan suppressions to lazy-init/runtime construction
- drop the redundant use_local_sdk patch from C++ CI

Signed-off-by: Zlata Stefanovic <zlata.stefanovic@databricks.com>

irinatomic-db · 2026-06-26T12:49:12Z

+  void ingest_json_records_nowait(const std::vector<std::string>& records);
+
+  /// Block until the record at `offset` has been acknowledged by the server.
+  void wait_for_offset(std::int64_t offset);


Not a blocker for this PR, just something to think about and see if it makes sense and is worth doing as a follow-up.

The C++ SDK has the pull-based ack model (ingest → offset → wait_for_offset / flush) but no push-based AckCallback like Python and Java have.

For Go this gap is fine - goroutines are userspace-scheduled by the Go runtime, multiplexed onto a small number of OS threads. Blocking a goroutine on wait_for_offset just suspends it and yields its OS thread to another goroutine; no kernel involvement. You can have thousands of goroutines blocked on ack tracking with negligible cost.

C++ std::thread maps 1:1 to an OS thread: kernel-scheduled. Blocking one on wait_for_offset parks that OS thread doing nothing until the server ack arrives. Scaling this across many concurrent streams might get expensive fast.

The idiomatic C++ solution for "notify me when something happens" is a callback, not a blocked thread. With AckCallback, the Rust tokio runtime fires the callback from its own thread pool when the ack arrives - the application thread never blocks waiting for acks and no extra OS threads are needed.

For now the pull model is functional. I suggest looking into this as a follow-up to see whether it makes sense to implement (by adding AckCallback to the C FFI). Some implementation issues to think about:

- Language boundaries
- Callback object lifetime
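
For discussion, a push-based ack interface could look roughly like the sketch below. `AckCallback`, `FakeAckSource`, `set_ack_callback`, and `deliver_ack` are all hypothetical; no such API exists in the C FFI or the C++ SDK today. In a real integration the runtime would invoke the callback from its own thread, whereas here it is called directly.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical callback type: invoked with the offset that was acknowledged.
using AckCallback = std::function<void(std::int64_t)>;

// Hypothetical stand-in for the component that learns about server acks.
class FakeAckSource {
 public:
  void set_ack_callback(AckCallback callback) {
    callback_ = std::move(callback);
  }

  // Simulates the runtime reporting an ack; the application thread never
  // blocks waiting for it.
  void deliver_ack(std::int64_t offset) {
    if (callback_) callback_(offset);
  }

 private:
  AckCallback callback_;
};
```

Anything the callback captures would have to outlive the source and be safe to touch from the runtime's thread, which is the callback-lifetime concern raised above.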

irinatomic-db · 2026-06-26T14:25:25Z

+  /// Ingest a batch of JSON records. Returns the offset of the last record.
+  /// Throws `ZerobusException` if `records` is empty.
+  std::int64_t ingest_json_records(const std::vector<std::string>& records);


What do other SDKs do when records is empty? Do they also throw an exception?

Most of them don't; changing this for consistency is a good idea.

Just made this change, thank you! @irinatomic-db

`ingest_proto_records` / `ingest_json_records` threw `ZerobusException` on an empty batch. No other SDK treats an empty batch as an error: the Rust core returns `Ok(None)`, the FFI returns its `-2` sentinel with a success result, and the Go wrapper maps that to `-1` with no error. Return `-1` (a no-op) for an empty batch instead of throwing, bringing C++ in line with the other SDKs. `-1` is unambiguous since real offsets are non-negative. Update the header docs accordingly. The `_nowait` batch variants already no-op on empty, so they are unchanged. Signed-off-by: Zlata Stefanovic <zlata.stefanovic@databricks.com>
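
A rough sketch of the empty-batch behaviour described in this commit, together with the pointer/length arrays a batch helper builds for the C entry point. `fake_ffi_ingest_batch` and `ingest_json_records_sketch` are hypothetical names, and the fake entry point just hands out increasing offsets.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

static std::int64_t g_next_offset = 0;  // demo state for the fake FFI
static int g_ffi_calls = 0;             // demo hook: counts FFI invocations

// Hypothetical stand-in for a batch entry point taking parallel arrays of
// record pointers and lengths; returns the offset of the last record.
static std::int64_t fake_ffi_ingest_batch(const std::uint8_t* const* records,
                                          const std::size_t* lengths,
                                          std::size_t count) {
  (void)records;
  (void)lengths;
  ++g_ffi_calls;
  g_next_offset += static_cast<std::int64_t>(count);
  return g_next_offset - 1;
}

// Returns the offset of the last record, or -1 (a no-op) for an empty batch.
static std::int64_t ingest_json_records_sketch(
    const std::vector<std::string>& records) {
  if (records.empty()) return -1;  // never reaches the FFI

  // Build only the small pointer/length arrays; record bytes are not copied.
  std::vector<const std::uint8_t*> pointers;
  std::vector<std::size_t> lengths;
  pointers.reserve(records.size());
  lengths.reserve(records.size());
  for (const std::string& record : records) {
    pointers.push_back(reinterpret_cast<const std::uint8_t*>(record.data()));
    lengths.push_back(record.size());
  }
  assert(pointers.size() == lengths.size());
  return fake_ffi_ingest_batch(pointers.data(), lengths.data(),
                               pointers.size());
}
```

Because real offsets are non-negative, a caller can treat `-1` as "nothing was ingested" and skip any later wait on that offset.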

The empty-batch no-op comments in stream.cpp/stream.hpp exceeded the 80-column limit, failing the clang-format CI check. Reflow them; no code change. Signed-off-by: Zlata Stefanovic <zlata.stefanovic@databricks.com>

irinatomic-db · 2026-06-26T15:35:06Z

+  /// argument-validation errors are reported (as exceptions). Ingestion errors
+  /// are silently dropped. The stream must outlive the background work.
+  ///
+  /// WARNING — do not combine the `_nowait` APIs with a custom
+  /// `HeadersProvider`. A fire-and-forget task is detached: neither `close()`
+  /// nor the destructor drains it, and a task that still needs fresh headers
+  /// may call back into the provider after the `Stream` (and the `shared_ptr`
+  /// keeping the provider alive) is destroyed — a use-after-free. The FFI
+  /// exposes no way to drain these tasks, so there is no safe ordering. With a
+  /// `HeadersProvider`, use only the blocking ingest variants, which complete
+  /// before they return.
+  void ingest_proto_record_nowait(const std::uint8_t* data, std::size_t len);


The doc warning is correct but insufficient - it relies on the caller reading and following the warning. The Rust background task is detached; neither close() nor the destructor drains it, so a task that needs fresh headers can call back through a raw pointer after provider_ is destroyed. This should be enforced in code, not just documented.

Maybe something like:

    void Stream::ingest_proto_record_nowait(...) {
      if (provider_ != nullptr)
        throw ZerobusException(
            "_nowait APIs cannot be used with a custom HeadersProvider", false);
      ...
    }

Agree with this. I discussed it with Danilo already — it’s a real issue, will do this C++ mitigation now.
The reason I didn’t do the full fix in this PR is that the root issue is at the C FFI contract, not just in C++ wrapper logic. For this beta PR we kept the warning-only mitigation to avoid cross-SDK FFI changes right now, but we should enforce it in code.
I’ll file a follow-up for _nowait + custom HeadersProvider enforcement (starting with C++ guard, then proper FFI-safe lifecycle fix).

Made the mitigation now
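
A self-contained sketch of what such a guard might look like. `StreamSketch` and the minimal `HeadersProvider` and `ZerobusException` definitions are stand-ins written for this example, not the SDK's real classes, and the FFI call is replaced by a counter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

class ZerobusException : public std::runtime_error {
 public:
  ZerobusException(const std::string& what, bool retryable)
      : std::runtime_error(what), retryable_(retryable) {}
  bool is_retryable() const noexcept { return retryable_; }

 private:
  bool retryable_;
};

// Hypothetical minimal provider type; only its presence matters here.
class HeadersProvider {
 public:
  virtual ~HeadersProvider() = default;
};

class StreamSketch {
 public:
  explicit StreamSketch(std::shared_ptr<HeadersProvider> provider = nullptr)
      : provider_(std::move(provider)) {}

  void ingest_proto_record_nowait(const std::uint8_t* data, std::size_t len) {
    // A detached background task could call back into the provider after
    // this object is destroyed, so the combination is refused up front.
    if (provider_ != nullptr) {
      throw ZerobusException(
          "_nowait APIs cannot be used with a custom HeadersProvider", false);
    }
    assert(data != nullptr || len == 0);
    ++queued_;  // a real wrapper would forward (data, len) to the FFI here
  }

  int queued() const noexcept { return queued_; }

 private:
  std::shared_ptr<HeadersProvider> provider_;
  int queued_ = 0;
};
```

The exception is marked non-retryable because retrying the same call on the same stream would fail the same way.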

irinatomic-db · 2026-06-26T15:41:48Z

+std::int64_t Stream::ingest_proto_record(const std::uint8_t* data,
+                                         std::size_t len) {
+  detail::ResultGuard guard;
+  std::int64_t offset =
+      zerobus_stream_ingest_proto_record(handle_, data, len, guard.ptr());


The FFI returns -2 for an empty batch and -1 for errors, alongside CResult.success. C++ relies entirely on throw_if_error() and hands the raw offset straight to the caller. If the FFI ever returned a negative sentinel with success == true, the caller would get -2 as a real offset and pass it to wait_for_offset(-2). Go defends against this explicitly (if offset == -2 { return -1, nil } then if offset < 0 { ... error }). Worth adding a post-call guard:

    guard.throw_if_error();
    if (offset < 0)
      throw ZerobusException("unexpected negative offset from FFI", false);
    return offset;

Addressing this as well, thank you!

Throw when _nowait APIs are used with a custom HeadersProvider to prevent callback lifetime UAF risk. Signed-off-by: Zlata Stefanovic <zlata.stefanovic@databricks.com>

The blocking ingest_* methods returned the FFI's raw int64 straight to the caller after throw_if_error(). The FFI overloads that return value with negative sentinels (-1 error, -2 empty batch) separately from CResult.success, so a negative value arriving with success set would be handed back as a real offset and could reach wait_for_offset(-2). Add a checked_offset() helper that throws ZerobusException on a negative offset, applied to ingest_proto_record, ingest_json_record, ingest_proto_records, and ingest_json_records. The batch methods still short-circuit the empty case to -1 before the FFI call, so that path never hits the guard. Mirrors the explicit offset < 0 check Go already performs. Signed-off-by: Zlata Stefanovic <zlata.stefanovic@databricks.com>
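
A minimal sketch of a `checked_offset()` helper along the lines this commit describes. The body and the local `ZerobusException` definition are assumptions for illustration; only the helper's name and purpose come from the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

class ZerobusException : public std::runtime_error {
 public:
  ZerobusException(const std::string& what, bool retryable)
      : std::runtime_error(what), retryable_(retryable) {}
  bool is_retryable() const noexcept { return retryable_; }

 private:
  bool retryable_;
};

// Passes a real (non-negative) offset through and rejects negative sentinels
// that arrived alongside a successful result.
static std::int64_t checked_offset(std::int64_t offset) {
  if (offset < 0) {
    throw ZerobusException("unexpected negative offset from FFI", false);
  }
  return offset;
}
```

A blocking ingest method would call its error check first and then return `checked_offset(offset)`, so a value such as `-2` can never reach `wait_for_offset`.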

zlata-stefanovic-db self-assigned this Jun 24, 2026

zlata-stefanovic-db marked this pull request as ready for review June 24, 2026 13:47

zlata-stefanovic-db requested review from elenagaljak-db and teodordelibasic-db June 24, 2026 13:47

zlata-stefanovic-db force-pushed the split/415/core branch from 2b0af0d to 121ffab Compare June 24, 2026 13:57

zlata-stefanovic-db requested review from danilonajkov-db, davidtosovic-db and irinatomic-db June 24, 2026 14:01

zlata-stefanovic-db added the feature-request Net-new capability requested by customers label Jun 24, 2026

zlata-stefanovic-db changed the title ~~[C++] Core SDK implementation and build (split 4/6 from #415)~~ [C++] Core SDK implementation and build (split 3/6 from #415) Jun 24, 2026

zlata-stefanovic-db requested a review from a team June 24, 2026 15:55

zlata-stefanovic-db force-pushed the split/415/core branch from 121ffab to 45886f6 Compare June 25, 2026 09:57

zlata-stefanovic-db added 2 commits June 25, 2026 16:01

zlata-stefanovic-db force-pushed the split/415/core branch from 44aebdb to fd39202 Compare June 25, 2026 16:02

zlata-stefanovic-db linked an issue Jun 25, 2026 that may be closed by this pull request

[C++] Add release workflow and cut the 0.1.0 release #416

Open

5 tasks

zlata-stefanovic-db added 4 commits June 25, 2026 18:12

Merge branch 'main' into split/415/core

5322eeb

irinatomic-db reviewed Jun 26, 2026

zlata-stefanovic-db added 3 commits June 26, 2026 14:59

Merge branch 'main' into split/415/core

19d7be5

[C++] Wrap empty-batch comments to clang-format limit

0633827

zlata-stefanovic-db requested a review from irinatomic-db June 26, 2026 15:14

irinatomic-db reviewed Jun 26, 2026

zlata-stefanovic-db added 2 commits June 26, 2026 16:08

[C++] Guard nowait with headers provider

2098452

zlata-stefanovic-db wants to merge 11 commits into main from split/415/core