feat: multi-language IPC codegen (TS/C++/Rust/Zig) with cross-language wire compat tests #22654

Closed
charlielye wants to merge 152 commits into cl/wsdb_cdb from cl/more_ipc_v2
@charlielye

Contributor

Continuation of #21990 (CI3 workflow stopped triggering on that PR).

Same code, fresh PR to unblock CI.

Extract per-service code generation into a shared ServiceConfig-based
system. Each service (bb, wsdb, cdb, avm) declares its binary, language
targets, and output paths. The unified generate.ts orchestrates all
services, while per-service generate.ts files become thin wrappers.

This eliminates ~220 lines of duplicated schema-fetch/compile/write
logic and makes it trivial to add new language targets (Rust, Zig) to
any service.

Document the previously tribal-knowledge JSON schema format as a formal
spec. Covers type encodings (primitives, containers, structs, NamedUnion),
wire protocol (framing, request/response format), and the contract between
C++ schema export and language code generators.

Compute SHA-256 of raw schema JSON and emit it as a constant in all
generated code (SCHEMA_HASH in TS/Rust/C++). Clients can check this
at connection time to detect incompatible schema changes between
service binary and generated bindings.
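
A minimal sketch of the hashing step described above, assuming the codegen runs on Node.js. The function names (`computeSchemaHash`, `emitSchemaHashConstant`, `assertSchemaCompatible`) are hypothetical and not taken from the PR; only the `SCHEMA_HASH` constant name comes from the description.

```typescript
import { createHash } from "node:crypto";

// Hash the raw schema text exactly as read, so any byte-level change
// (including whitespace) produces a different hash.
function computeSchemaHash(rawSchemaJson: string): string {
  return createHash("sha256").update(rawSchemaJson, "utf8").digest("hex");
}

// Hypothetical emitter for the TypeScript target: one exported constant.
function emitSchemaHashConstant(rawSchemaJson: string): string {
  return `export const SCHEMA_HASH = "${computeSchemaHash(rawSchemaJson)}";\n`;
}

// Hypothetical connect-time check: compare the constant baked into the
// bindings with the hash reported by the running service.
function assertSchemaCompatible(bindingsHash: string, serviceHash: string): void {
  if (bindingsHash !== serviceHash) {
    throw new Error(`Schema mismatch: bindings=${bindingsHash} service=${serviceHash}`);
  }
}
```

How the service reports its hash at connection time is not specified here, so `assertSchemaCompatible` only illustrates the comparison itself.
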
C++ tests that verify the msgpack wire format of WSDB IPC commands and
responses. Tests validate:
- Request structure: [[command_name, {fields}]] (tuple + NamedUnion)
- Response structure: [response_name, {fields}] (NamedUnion)
- Round-trip: serialize → deserialize → compare original values
- Error response format
- Multiple command types: GetTreeInfo, CreateFork, GetLeafValue

These tests serve as the reference implementation for cross-language
wire compatibility. Other languages (Rust, Zig, TypeScript) should
produce identical msgpack bytes for the same command values.
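
To make the request shape `[[command_name, {fields}]]` concrete, here is a sketch of a deliberately tiny msgpack encoder in TypeScript. It handles only fixarray, fixmap, fixstr, and positive fixint, which is enough to show the byte layout. The field name `treeId` is an assumption for illustration; the real GetTreeInfo fields are defined by the schema.

```typescript
// Values this sketch can encode. Real bindings would use a full msgpack library.
type Value = number | string | Value[] | { [key: string]: Value };

function encode(value: Value, out: number[] = []): number[] {
  if (typeof value === "number") {
    if (!Number.isInteger(value) || value < 0 || value > 0x7f) {
      throw new Error("sketch only supports positive fixint (0..127)");
    }
    out.push(value); // positive fixint: the byte is the value itself
  } else if (typeof value === "string") {
    const bytes = Array.from(new TextEncoder().encode(value));
    if (bytes.length > 31) throw new Error("sketch only supports fixstr (<= 31 bytes)");
    out.push(0xa0 | bytes.length, ...bytes); // fixstr: 101xxxxx
  } else if (Array.isArray(value)) {
    if (value.length > 15) throw new Error("sketch only supports fixarray (<= 15 items)");
    out.push(0x90 | value.length); // fixarray: 1001xxxx
    for (const item of value) encode(item, out);
  } else {
    const entries = Object.entries(value);
    if (entries.length > 15) throw new Error("sketch only supports fixmap (<= 15 entries)");
    out.push(0x80 | entries.length); // fixmap: 1000xxxx
    for (const [key, item] of entries) {
      encode(key, out);
      encode(item, out);
    }
  }
  return out;
}

// A request is a 1-tuple wrapping a NamedUnion: [[command_name, {fields}]].
function encodeRequest(commandName: string, fields: { [key: string]: Value }): Uint8Array {
  return Uint8Array.from(encode([[commandName, fields]]));
}

// Hypothetical field name; the leading bytes are 0x91 (outer tuple),
// 0x92 (NamedUnion pair), then the fixstr command name.
const requestBytes = encodeRequest("GetTreeInfo", { treeId: 1 });
```

A cross-language check would compare `requestBytes` against the bytes produced by the C++ reference tests for the same command value.
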
- Extend RustCodegen with RustCodegenOptions (prefix, apiStructName,
  configurable imports) so it can generate per-service types and APIs
  (e.g. WsdbApi, CdbApi, AvmApi instead of BarretenbergApi)
- Create aztec-ipc Rust crate with:
  - Backend trait for IPC transport abstraction
  - UdsBackend: connects to Unix Domain Socket, 4-byte LE framing
  - IpcError type with Serialization/Deserialization/Backend/IO variants
  - Per-service modules (wsdb, cdb, avm) with placeholder generated files
- Wire Rust targets into wsdb/cdb/avm service configs in service_codegen.ts
- Add aztec-ipc to workspace Cargo.toml

Create zig_codegen.ts that generates from CompiledSchema:
- Zig struct definitions for all command/response types
- Generic packValue/readValue helpers for msgpack serialization
- Command and Response tagged unions with schema name dispatch
- Per-service IPC client structs with UDS connect, length-prefix
  framing, and typed methods per command

Wire Zig targets into wsdb/cdb/avm service configurations
(WsdbClient, CdbClient, AvmClient).

Create barretenberg/zig/aztec-ipc/ project:
- build.zig.zon with zig-msgpack 0.0.14 dependency (Zig 0.15 compat)
- build.zig with library module and test step
- ipc_framing.zig: 4-byte LE length-prefix send/receive with tests
- Per-service placeholder modules (wsdb, cdb, avm)
- Builds and tests pass with `zig build test`
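
The 4-byte little-endian length-prefix framing used by the Rust and Zig transports can be sketched in TypeScript as follows. `frame` and `FrameDecoder` are hypothetical names for illustration; the sketch assumes the prefix counts payload bytes only and is not the PR's actual implementation.

```typescript
// Prefix a payload with its length as a 4-byte little-endian unsigned integer.
function frame(payload: Uint8Array): Uint8Array {
  const framed = new Uint8Array(4 + payload.length);
  new DataView(framed.buffer).setUint32(0, payload.length, true);
  framed.set(payload, 4);
  return framed;
}

// Incremental decoder: socket reads may split or merge frames, so buffer
// incoming chunks and emit only complete payloads.
class FrameDecoder {
  private buffered = new Uint8Array(0);

  push(chunk: Uint8Array): Uint8Array[] {
    const merged = new Uint8Array(this.buffered.length + chunk.length);
    merged.set(this.buffered, 0);
    merged.set(chunk, this.buffered.length);

    const frames: Uint8Array[] = [];
    let offset = 0;
    while (merged.length - offset >= 4) {
      const length = new DataView(merged.buffer, offset, 4).getUint32(0, true);
      if (merged.length - offset - 4 < length) break; // payload not complete yet
      frames.push(merged.slice(offset + 4, offset + 4 + length));
      offset += 4 + length;
    }
    this.buffered = merged.slice(offset); // keep the unconsumed tail
    return frames;
  }
}
```

Feeding each chunk from a socket `data` event into `push` yields zero or more complete msgpack payloads per call.
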
Generate server dispatch boilerplate for C++, TypeScript, Rust, and Zig:

C++ (cpp_codegen.ts):
- generateServerHeader/generateServerImpl: creates make_xxx_handler()
  function returning an ipc::IpcServer::Handler lambda that handles
  msgpack deser → dispatch → ser, shutdown detection, error wrapping
- Wired into wsdb and cdb services

TypeScript (typescript_codegen.ts):
- generateServerApi: creates Handler interface + dispatch() function
  that maps [commandName, payload] → handler method → [responseName, result]

Rust (rust_codegen.ts):
- generateServer: creates Handler trait with one method per command
  + dispatch() function that matches Command enum to trait methods

Zig (zig_codegen.ts):
- generateServer: creates handler vtable struct with function pointers
  + dispatch() method

All server outputs wired into per-service configs in service_codegen.ts.
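
As an illustration of the TypeScript server shape described above (a Handler interface plus a `dispatch()` that maps `[commandName, payload]` to `[responseName, result]`), here is a hand-written sketch. The command set, the response names, and the `ErrorResponse` payload shape are assumptions; the generated code may differ, for example by being async or by using a schema-specific error type name.

```typescript
// Hypothetical command and response payloads.
interface EchoBytes { data: Uint8Array }
interface EchoBytesResponse { data: Uint8Array }
interface EchoShutdown {}
interface EchoShutdownResponse {}

// One method per command, as a generated Handler interface might look.
interface Handler {
  echoBytes(command: EchoBytes): EchoBytesResponse;
  echoShutdown(command: EchoShutdown): EchoShutdownResponse;
}

// A NamedUnion on the wire: [typeName, payload].
type NamedUnion = [name: string, payload: unknown];

function dispatch(handler: Handler, [name, payload]: NamedUnion): NamedUnion {
  try {
    switch (name) {
      case "EchoBytes":
        return ["EchoBytesResponse", handler.echoBytes(payload as EchoBytes)];
      case "EchoShutdown":
        return ["EchoShutdownResponse", handler.echoShutdown(payload as EchoShutdown)];
      default:
        return ["ErrorResponse", { message: `unknown command: ${name}` }];
    }
  } catch (err) {
    // Wrap handler failures so the client always receives a well-formed response.
    return ["ErrorResponse", { message: err instanceof Error ? err.message : String(err) }];
  }
}

// Usage: an echo handler that returns its input unchanged.
const echoHandler: Handler = {
  echoBytes: (command) => ({ data: command.data }),
  echoShutdown: () => ({}),
};
```

The payload casts stand in for the deserialization that generated code would perform before calling the handler.
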
Replace single-line README with comprehensive documentation covering
the architecture diagram, file index, service matrix, usage examples,
and guides for adding new commands and new languages.

Create test/wire_compat/ infrastructure for cross-language IPC testing:

- schema.json: hand-written echo service schema (EchoBytes, EchoFields,
  EchoNested, EchoShutdown) covering bytes, integers, strings, vectors,
  optionals, and nested structs
- generate.ts: runs codegen on schema.json, producing bindings in TS,
  Rust, and Zig
- Rust echo_server + echo_client: full IPC round-trip over UDS with
  length-prefix framing. All 3 echo commands pass.

Bugfixes discovered by the test:
- rust_codegen: Command enum was including all structs (including nested
  inline structs like EchoInner), not just top-level commands
- rust_codegen: __typename field used skip_serializing but not
  skip_deserializing, causing deser failures. Changed to skip+default.
- schema.json: vector<vector<unsigned char>> needs double-wrapped args
  per the schema spec: ["vector", [["vector", ["unsigned char"]]]]

TypeScript echo server and client using msgpackr over UDS with the
same length-prefix framing protocol as Rust. All 4 language pair
combinations pass (Rust↔Rust, Rust↔TS, TS↔Rust, TS↔TS).

Cross-language test orchestrator (run_cross_language_tests.sh) runs
the full server/client matrix and reports results.

Wire compat fix: use u32-range values for u64 fields in tests to
avoid msgpackr's float64 encoding for large JS numbers.
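
A small sketch of why u32-range test values are the safe choice on the JS side. It shows only the underlying number-precision limit and does not reproduce msgpackr's encoding decisions; the helper names are hypothetical.

```typescript
const U32_MAX = 0xffffffff;

// True when a u64 value survives a round trip through a JS number unchanged.
function isExactAsNumber(value: bigint): boolean {
  return BigInt(Number(value)) === value;
}

// Hypothetical guard for test fixtures: keep u64 fields within u32 range so
// every language in the matrix sees the same integer on the wire.
function assertU32Range(value: number): number {
  if (!Number.isInteger(value) || value < 0 || value > U32_MAX) {
    throw new Error(`test value ${value} is outside u32 range`);
  }
  return value;
}

// 2^53 + 1 fits in a u64 but cannot be represented exactly as a JS number.
const lossyExample = 2n ** 53n + 1n;
```
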
Golden file infrastructure:
- Rust generate_golden binary creates reference .msgpack files for all
  echo commands and responses
- TypeScript golden_test.ts deserializes golden files and validates
  field values (proves TS can read Rust's msgpack output)
- Rust golden_test binary does the same (baseline self-check)

Full test suite (run_cross_language_tests.sh) now runs:
- Level 1: Golden file deserialization (Rust + TS) — 2 tests
- Level 2+3: IPC round-trip matrix (Rust↔Rust, Rust↔TS, TS↔Rust,
  TS↔TS) — 4 tests
- Total: 6/6 passing

Standalone C++ echo server and client using raw msgpack-c + UDS.
No barretenberg library dependencies — just needs the msgpack-c
headers from the cmake build.

Cross-language matrix now tests all 9 combinations:
  Rust↔Rust, Rust↔TS, Rust↔C++
  TS↔Rust, TS↔TS, TS↔C++
  C++↔Rust, C++↔TS, C++↔C++

Total: 11/11 passing (2 golden file + 9 IPC round-trip).

Standalone Zig echo server and client using raw msgpack encoding over
UDS. Both manually encode/decode msgpack bytes (fixarray, fixmap, str,
bin) matching the wire protocol exactly.

Cross-language matrix now tests all 16 combinations (4x4):
  C++ ↔ C++, C++ ↔ TS, C++ ↔ Rust, C++ ↔ Zig
  TS  ↔ C++, TS  ↔ TS, TS  ↔ Rust, TS  ↔ Zig
  Rust↔ C++, Rust↔ TS, Rust↔ Rust, Rust↔ Zig
  Zig ↔ C++, Zig ↔ TS, Zig ↔ Rust, Zig ↔ Zig

Total: 18/18 passing (2 golden file + 16 IPC round-trip).

In CI, barretenberg/ts bootstrap runs before C++ binaries are built.
The unified generate.ts was failing because it tried to run all 4
service binaries (bb, wsdb, cdb, avm) unconditionally.

Fix: check if the binary exists before invoking it. If not found,
print a skip message and continue. This matches the old behavior
where per-service scripts were only called when binaries existed.
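
The skip-if-missing behavior could look roughly like this. It is a sketch under assumptions: the `ServiceBinary` shape and the function name are hypothetical, and the existence check is injectable only so the logic can be exercised without real binaries.

```typescript
import { existsSync } from "node:fs";

// Hypothetical description of a service and where its binary is expected.
interface ServiceBinary {
  name: string;
  binaryPath: string;
}

// Split services into those whose binary exists and those to skip,
// printing a skip message instead of failing the whole run.
function partitionByAvailability(
  services: ServiceBinary[],
  exists: (path: string) => boolean = existsSync,
): { available: ServiceBinary[]; skipped: ServiceBinary[] } {
  const available: ServiceBinary[] = [];
  const skipped: ServiceBinary[] = [];
  for (const service of services) {
    if (exists(service.binaryPath)) {
      available.push(service);
    } else {
      console.log(`Skipping ${service.name}: binary not found at ${service.binaryPath}`);
      skipped.push(service);
    }
  }
  return { available, skipped };
}
```

Codegen would then run only for the `available` entries and exit successfully even when some services were skipped.
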
The yarn clean script deleted generated/ directories, but yarn generate
can only recreate them when C++ binaries are available. With
ci-full-no-test-cache (no build cache), this causes build failures
because clean runs before generate, and generate skips when binaries
are missing.

Fix: remove generated dirs from the clean target. The generate script
overwrites them when it runs; if it skips (no binaries), the existing
generated files from cache or a previous build persist.

The build had a circular dependency:
  yarn clean → deletes generated files
  yarn generate → needs C++ binaries (not built yet in TS bootstrap)
  yarn build:esm → needs generated files → FAILS

Fix: commit the generated files to git as a baseline. They are always
available even without C++ binaries or build cache. When binaries ARE
available, yarn generate overwrites them with fresh output.

Changes:
- Un-gitignore src/*/generated/ directories
- Commit existing generated files (api_types.ts, async.ts, sync.ts,
  curve_constants.ts for bb/wsdb/cdb)
- yarn clean no longer deletes generated dirs (from previous commit)
- yarn generate skips services whose binaries are missing (from
  previous commit)

This eliminates the need for build cache to bootstrap bb.js, making
ci-full-no-test-cache work correctly.

barretenberg-rs/src/api.rs and generated_types.rs were gitignored but
are needed for cargo build. Without cache, the Rust bootstrap fails
because codegen is skipped (no C++ binaries available).

Fix: un-gitignore and commit as baseline, same as the TS generated
files in the previous commit.

Design for moving codegen out of bb.js into barretenberg/codegen/:
- Standalone package with own bootstrap.sh and cache hash
- Build order: bb-codegen → bb-cpp-native → bb-generate → bb-ts/bb-rs
- Two-level hashing: tool hash + generation hash
- Generated files not committed, produced at build time
- bb-rs decoupled from bb-ts (only needs generated files + C++ libs)

Move all code generation files out of barretenberg/ts/src/cbind/ into
barretenberg/codegen/src/ with its own package.json and tsconfig.json.

The codegen package has zero dependency on bb.js — it only needs
Node.js + msgpackr + tsx. It writes output to ts/, rust/, cpp/, zig/
directories via relative paths from codegen/src/.

This is Phase 1 of the restructure: the codegen tool exists as an
independent project. Next steps: update Makefile targets, bootstrap
scripts, and remove codegen from bb.js build pipeline.

bootstrap.sh provides:
- build: npm install with cache (fast, no C++ dependency)
- generate: runs codegen against C++ binaries, caches all output
- hash: tool hash from .rebuild_patterns
- generate_hash: hash(tool_hash, cpp_hash) for cache invalidation

Generation hash changes when either codegen source OR C++ schemas
change, ensuring consumers always get fresh bindings.
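
The two-level hashing idea, `generate_hash = hash(tool_hash, cpp_hash)`, can be sketched as below. The real bootstrap.sh is a shell script and its hashing helper is not shown in this PR description, so SHA-256 and the newline separator here are assumptions chosen for illustration.

```typescript
import { createHash } from "node:crypto";

// Combine several input hashes into one. A separator after each part keeps
// ("ab", "c") and ("a", "bc") from colliding.
function combineHashes(...parts: string[]): string {
  const hasher = createHash("sha256");
  for (const part of parts) {
    hasher.update(part);
    hasher.update("\n");
  }
  return hasher.digest("hex");
}

// The generation hash changes when either the codegen tool or the C++ schemas change.
function generateHash(toolHash: string, cppHash: string): string {
  return combineHashes(toolHash, cppHash);
}
```
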
Makefile:
- Add bb-codegen target (npm install for codegen tool)
- Add bb-generate target (depends on bb-codegen + bb-cpp-native)
- bb-ts now depends on bb-generate (instead of running yarn generate)
- bb-rs now depends on bb-generate (instead of bb-ts)

barretenberg/ts/bootstrap.sh:
- Hash now depends on codegen generate_hash (not cpp hash directly)
- Removed yarn generate from build function

barretenberg/rust/bootstrap.sh:
- Hash now depends on codegen generate_hash (not ts hash)
- Removed yarn generate invocation from build function
- bb-rs no longer transitively depends on bb-ts

barretenberg/ts/package.json:
- Removed yarn generate from build script (handled by bb-generate)

Generated files are no longer committed to git. They are produced at
build time by barretenberg/codegen/bootstrap.sh generate (the
bb-generate Makefile target) which runs before bb-ts and bb-rs.

Restored gitignore entries for:
- barretenberg/ts/src/*/generated/ (TS types, async/sync APIs)
- barretenberg/rust/barretenberg-rs/src/{api,generated_types}.rs

Export schema JSON from each C++ binary and commit as the static
source of truth for codegen:
- schemas/bb_schema.json (55 commands, 14KB)
- schemas/wsdb_schema.json (30 commands, 8KB)
- schemas/cdb_schema.json (13 commands, 4KB)
- schemas/avm_schema.json (3 commands, 1KB)

Codegen will read these files instead of executing C++ binaries,
breaking the circular dependency between C++ build and codegen.
CI validates schemas stay in sync with C++ code after each build.

Replace binary invocation with static file reads:
- service_codegen.ts: ServiceConfig now has schemaFile instead of
  binaryEnvVar/defaultBinaryPath. loadAndCompileSchema() reads JSON.
- generate.ts: curve constants read from committed msgpack file.

The codegen tool now has ZERO runtime dependency on C++ binaries.

update_schemas.sh: dev script to re-export schemas from C++ binaries.
validate_schemas.sh: CI script that validates committed schemas match
C++ binary output. Fails with instructions if they diverge.

Since codegen reads static JSON files, the dependency chain simplifies:
- bb-generate depends only on bb-codegen (not bb-cpp-native)
- generate_hash is gone; codegen hash includes schemas via .rebuild_patterns
- Consumer hashes (bb-ts, bb-rs) use codegen hash directly

Build order is now fully acyclic:
  bb-codegen (npm install) → bb-generate (read JSON, produce code)
  bb-cpp-native (C++ build, independent)
  bb-ts (needs bb-generate + bb-cpp-native)
  bb-rs (needs bb-generate + bb-cpp-native)

Schema validation runs after C++ build as a separate check.

bn254G1FromCompressed silently returned the zero point for out-of-range
x-coordinates. It now throws an error instead, masking bit 255 (y-parity)
before checking the x-coordinate against the field modulus.
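
A sketch of the range check described above, not the actual bn254G1FromCompressed implementation. It assumes a 32-byte big-endian encoding in which the most significant bit (bit 255) carries the y-parity flag; the function name `parseCompressedX` is hypothetical.

```typescript
// BN254 base field modulus (the field in which point coordinates live).
const BN254_FQ_MODULUS = 0x30644e72e131a029b85045b68181585d97816a916871ca8d3c208c16d87cfd47n;

// Assumption: 32 bytes, big-endian, with bit 255 used as the y-parity flag.
function parseCompressedX(compressed: Uint8Array): { x: bigint; yParity: boolean } {
  if (compressed.length !== 32) {
    throw new Error(`expected 32 bytes, got ${compressed.length}`);
  }
  let value = 0n;
  for (const byte of compressed) {
    value = (value << 8n) | BigInt(byte);
  }
  const parityMask = 1n << 255n;
  const yParity = (value & parityMask) !== 0n;
  const x = value & (parityMask - 1n); // clear bit 255 before the range check
  if (x >= BN254_FQ_MODULUS) {
    // Reject instead of silently mapping to a default point.
    throw new Error("x-coordinate is not less than the field modulus");
  }
  return { x, yParity };
}
```

Masking first matters because the parity bit would otherwise push every flagged encoding above the modulus and make valid points fail the check.
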
…t name

- Replace BBAPI_ERROR with throw_or_abort in ECC handlers so errors
  propagate through the generated dispatch as BbErrorResponse
- Fix hardcoded 'ErrorResponse' in sync client to use the schema's
  actual error type name (e.g. 'BbErrorResponse')
… in Docker)

In Docker containers, Node.js runs as PID 1. The spawned bb/wsdb/avm
binaries see getppid()==1 and immediately shut down, thinking the
parent died. This caused "Native backend process exited unexpectedly"
in the playground test.

prctl(PR_SET_PDEATHSIG) already handles the race correctly — the
kernel delivers SIGTERM retroactively if the parent exited before
prctl was called.

Resolved conflicts:
- avm-transpiler/Cargo.lock: take base branch (latest noir deps)
- bbapi.test.cpp: keep both include sets
- bbapi_chonk.cpp: use circuit_name variable with Bb prefix
- types.rs: keep deletion (replaced by generated code)

BBApiRequest -> BbRequest, AesEncrypt -> BbAesEncrypt,
AesDecrypt -> BbAesDecrypt in tests added by base branch merge.
…N_FIELDS

Zig's cross-compile clang for macOS doesn't support C++20 abbreviated
function templates (auto params). Use explicit template parameter.