feat: multi-language IPC codegen (TS/C++/Rust/Zig) with cross-language wire compat tests #22654
Closed
charlielye wants to merge 152 commits into
Extract per-service code generation into a shared ServiceConfig-based system. Each service (bb, wsdb, cdb, avm) declares its binary, language targets, and output paths. The unified generate.ts orchestrates all services, while per-service generate.ts files become thin wrappers. This eliminates ~220 lines of duplicated schema-fetch/compile/write logic and makes it trivial to add new language targets (Rust, Zig) to any service.
Document the previously tribal-knowledge JSON schema format as a formal spec. Covers type encodings (primitives, containers, structs, NamedUnion), wire protocol (framing, request/response format), and the contract between C++ schema export and language code generators.
Compute SHA-256 of raw schema JSON and emit it as a constant in all generated code (SCHEMA_HASH in TS/Rust/C++). Clients can check this at connection time to detect incompatible schema changes between service binary and generated bindings.
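A minimal sketch of the schema-hash idea for the TypeScript target, assuming the hash is taken over the schema file's raw text and rendered as lowercase hex. Only the `SCHEMA_HASH` constant name comes from the commit message; the function names `schemaHash`, `emitSchemaHashConstant` and `assertSchemaCompatible` are hypothetical, and the real generator may differ.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: hash the schema text exactly as read from disk, so that
// any change to the schema (even whitespace) produces a different hash.
export function schemaHash(rawSchemaJson: string): string {
  return createHash("sha256").update(rawSchemaJson, "utf8").digest("hex");
}

// Hypothetical emitter: the line a TypeScript code generator could write into
// its output so that bindings carry the hash of the schema they came from.
export function emitSchemaHashConstant(rawSchemaJson: string): string {
  return `export const SCHEMA_HASH = "${schemaHash(rawSchemaJson)}";\n`;
}

// Hypothetical connection-time check: compare the hash baked into the bindings
// with the hash reported by the service binary.
export function assertSchemaCompatible(bindingsHash: string, serviceHash: string): void {
  if (bindingsHash !== serviceHash) {
    throw new Error(
      `schema mismatch: bindings ${bindingsHash.slice(0, 8)}, service ${serviceHash.slice(0, 8)}`,
    );
  }
}
```

How the service reports its hash to the client is not described in the commit, so the check above takes both hashes as plain arguments.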
C++ tests that verify the msgpack wire format of WSDB IPC commands and
responses. Tests validate:
- Request structure: [[command_name, {fields}]] (tuple + NamedUnion)
- Response structure: [response_name, {fields}] (NamedUnion)
- Round-trip: serialize → deserialize → compare original values
- Error response format
- Multiple command types: GetTreeInfo, CreateFork, GetLeafValue
These tests serve as the reference implementation for cross-language
wire compatibility. Other languages (Rust, Zig, TypeScript) should
produce identical msgpack bytes for the same command values.
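To illustrate the request shape `[[command_name, {fields}]]` at the byte level, here is a hand-rolled msgpack encoder sketch. It covers only the small formats (positive fixint, fixstr, bin8, fixarray, fixmap) and is not the encoder used in the PR. The field name `treeId` is a hypothetical placeholder; the real GetTreeInfo fields come from the WSDB schema.

```typescript
// Values this sketch can encode; the real wire format supports more types.
type Value = number | string | Uint8Array | Value[] | { [key: string]: Value };

export function encodeValue(value: Value): Uint8Array {
  const out: number[] = [];
  writeValue(value, out);
  return Uint8Array.from(out);
}

function writeValue(value: Value, out: number[]): void {
  if (typeof value === "number") {
    // positive fixint: a single byte 0x00..0x7f
    if (!Number.isInteger(value) || value < 0 || value > 0x7f) {
      throw new Error("sketch only encodes integers in 0..127");
    }
    out.push(value);
  } else if (typeof value === "string") {
    // fixstr: 0xa0 | length, followed by UTF-8 bytes (length <= 31)
    const bytes = new TextEncoder().encode(value);
    if (bytes.length > 31) throw new Error("sketch only encodes strings up to 31 bytes");
    out.push(0xa0 | bytes.length, ...Array.from(bytes));
  } else if (value instanceof Uint8Array) {
    // bin8: 0xc4, one length byte, then the raw bytes (length <= 255)
    if (value.length > 0xff) throw new Error("sketch only encodes bin up to 255 bytes");
    out.push(0xc4, value.length, ...Array.from(value));
  } else if (Array.isArray(value)) {
    // fixarray: 0x90 | element count (count <= 15)
    if (value.length > 15) throw new Error("sketch only encodes arrays up to 15 items");
    out.push(0x90 | value.length);
    for (const item of value) writeValue(item, out);
  } else {
    // fixmap: 0x80 | entry count (count <= 15), then key/value pairs
    const entries = Object.entries(value);
    if (entries.length > 15) throw new Error("sketch only encodes maps up to 15 entries");
    out.push(0x80 | entries.length);
    for (const [key, item] of entries) {
      writeValue(key, out);
      writeValue(item, out);
    }
  }
}

// Request: a 1-element tuple wrapping the NamedUnion [name, {fields}].
// "treeId" is a hypothetical field name used only for illustration.
export const exampleRequest = encodeValue([["GetTreeInfo", { treeId: 1 }]]);
```

The encoded request starts with `0x91` (outer tuple), `0x92` (NamedUnion pair), then `0xab` for the 11-byte command name. Comparing such bytes across languages is the sort of check the wire compatibility tests describe.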
- Extend RustCodegen with RustCodegenOptions (prefix, apiStructName, configurable imports) so it can generate per-service types and APIs (e.g. WsdbApi, CdbApi, AvmApi instead of BarretenbergApi)
- Create aztec-ipc Rust crate with:
  - Backend trait for IPC transport abstraction
  - UdsBackend: connects to Unix Domain Socket, 4-byte LE framing
  - IpcError type with Serialization/Deserialization/Backend/IO variants
  - Per-service modules (wsdb, cdb, avm) with placeholder generated files
- Wire Rust targets into wsdb/cdb/avm service configs in service_codegen.ts
- Add aztec-ipc to workspace Cargo.toml
Create zig_codegen.ts that generates from CompiledSchema:
- Zig struct definitions for all command/response types
- Generic packValue/readValue helpers for msgpack serialization
- Command and Response tagged unions with schema name dispatch
- Per-service IPC client structs with UDS connect, length-prefix framing, and typed methods per command
Wire Zig targets into wsdb/cdb/avm service configurations (WsdbClient, CdbClient, AvmClient).
Create barretenberg/zig/aztec-ipc/ project:
- build.zig.zon with zig-msgpack 0.0.14 dependency (Zig 0.15 compat)
- build.zig with library module and test step
- ipc_framing.zig: 4-byte LE length-prefix send/receive with tests
- Per-service placeholder modules (wsdb, cdb, avm)
- Builds and tests pass with `zig build test`
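The 4-byte little-endian length-prefix framing shared by the Rust, Zig, and TypeScript transports can be sketched as follows. This assumes the prefix counts only the payload bytes, not the 4 header bytes; the function names `frameMessage` and `unframeMessage` are hypothetical and are not taken from ipc_framing.zig.

```typescript
// Prepend a 4-byte little-endian length to a payload.
export function frameMessage(payload: Uint8Array): Uint8Array {
  const framed = new Uint8Array(4 + payload.length);
  new DataView(framed.buffer).setUint32(0, payload.length, true); // true = little-endian
  framed.set(payload, 4);
  return framed;
}

// Try to extract one message from the front of a receive buffer.
// Returns null when the buffer does not yet hold a complete frame, so the
// caller can keep accumulating bytes from the socket.
export function unframeMessage(
  buffer: Uint8Array,
): { payload: Uint8Array; rest: Uint8Array } | null {
  if (buffer.length < 4) return null;
  const view = new DataView(buffer.buffer, buffer.byteOffset, buffer.byteLength);
  const length = view.getUint32(0, true);
  if (buffer.length < 4 + length) return null;
  return {
    payload: buffer.subarray(4, 4 + length),
    rest: buffer.subarray(4 + length),
  };
}
```

Returning the unread remainder lets a stream reader handle several frames arriving in one socket read, as well as a frame split across reads.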
Generate server dispatch boilerplate for C++, TypeScript, Rust, and Zig:
C++ (cpp_codegen.ts):
- generateServerHeader/generateServerImpl: creates make_xxx_handler() function returning an ipc::IpcServer::Handler lambda that handles msgpack deser → dispatch → ser, shutdown detection, error wrapping
- Wired into wsdb and cdb services
TypeScript (typescript_codegen.ts):
- generateServerApi: creates Handler interface + dispatch() function that maps [commandName, payload] → handler method → [responseName, result]
Rust (rust_codegen.ts):
- generateServer: creates Handler trait with one method per command + dispatch() function that matches Command enum to trait methods
Zig (zig_codegen.ts):
- generateServer: creates handler vtable struct with function pointers + dispatch() method
All server outputs wired into per-service configs in service_codegen.ts.
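As a rough illustration of the TypeScript shape described above (`[commandName, payload]` → handler method → `[responseName, result]`), here is a hand-written sketch for a single command. The names `EchoHandler`, `dispatchCommand`, `EchoBytesResponse` and `ErrorResponse` are assumptions made for this example; the generated code derives its names from the schema and may be asynchronous.

```typescript
// A NamedUnion on the wire: [typeName, fields].
type NamedUnion = [string, Record<string, unknown>];

type EchoBytes = { data: Uint8Array };
type EchoBytesResult = { data: Uint8Array };

// One method per command; a generator would emit one entry per schema command.
export interface EchoHandler {
  echoBytes(request: EchoBytes): EchoBytesResult;
}

// Map an incoming command to the matching handler method and wrap the result
// in a response NamedUnion. Handler errors become an error response instead
// of escaping to the transport layer.
export function dispatchCommand(handler: EchoHandler, command: NamedUnion): NamedUnion {
  const [name, payload] = command;
  try {
    switch (name) {
      case "EchoBytes":
        return ["EchoBytesResponse", handler.echoBytes(payload as EchoBytes)];
      default:
        return ["ErrorResponse", { message: `unknown command: ${name}` }];
    }
  } catch (err) {
    return ["ErrorResponse", { message: String(err) }];
  }
}
```

A server loop would decode a framed msgpack request, pass the inner NamedUnion to `dispatchCommand`, then encode and frame the returned pair.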
Replace single-line README with comprehensive documentation covering the architecture diagram, file index, service matrix, usage examples, and guides for adding new commands and new languages.
Create test/wire_compat/ infrastructure for cross-language IPC testing:
- schema.json: hand-written echo service schema (EchoBytes, EchoFields, EchoNested, EchoShutdown) covering bytes, integers, strings, vectors, optionals, and nested structs
- generate.ts: runs codegen on schema.json, producing bindings in TS, Rust, and Zig
- Rust echo_server + echo_client: full IPC round-trip over UDS with length-prefix framing. All 3 echo commands pass.
Bugfixes discovered by the test:
- rust_codegen: Command enum was including all structs (including nested inline structs like EchoInner), not just top-level commands
- rust_codegen: __typename field used skip_serializing but not skip_deserializing, causing deser failures. Changed to skip+default.
- schema.json: vector<vector<unsigned char>> needs double-wrapped args per the schema spec: ["vector", [["vector", ["unsigned char"]]]]
TypeScript echo server and client using msgpackr over UDS with the same length-prefix framing protocol as Rust. All 4 language pair combinations pass (Rust↔Rust, Rust↔TS, TS↔Rust, TS↔TS). Cross-language test orchestrator (run_cross_language_tests.sh) runs the full server/client matrix and reports results. Wire compat fix: use u32-range values for u64 fields in tests to avoid msgpackr's float64 encoding for large JS numbers.
Golden file infrastructure:
- Rust generate_golden binary creates reference .msgpack files for all echo commands and responses
- TypeScript golden_test.ts deserializes golden files and validates field values (proves TS can read Rust's msgpack output)
- Rust golden_test binary does the same (baseline self-check)
Full test suite (run_cross_language_tests.sh) now runs:
- Level 1: Golden file deserialization (Rust + TS), 2 tests
- Level 2+3: IPC round-trip matrix (Rust↔Rust, Rust↔TS, TS↔Rust, TS↔TS), 4 tests
- Total: 6/6 passing
Standalone C++ echo server and client using raw msgpack-c + UDS. No barretenberg library dependencies: it only needs the msgpack-c headers from the cmake build.
Cross-language matrix now tests all 9 combinations:
  Rust↔Rust, Rust↔TS, Rust↔C++
  TS↔Rust, TS↔TS, TS↔C++
  C++↔Rust, C++↔TS, C++↔C++
Total: 11/11 passing (2 golden file + 9 IPC round-trip).
Standalone Zig echo server and client using raw msgpack encoding over UDS. Both manually encode/decode msgpack bytes (fixarray, fixmap, str, bin) matching the wire protocol exactly.
Cross-language matrix now tests all 16 combinations (4x4):
  C++↔C++, C++↔TS, C++↔Rust, C++↔Zig
  TS↔C++, TS↔TS, TS↔Rust, TS↔Zig
  Rust↔C++, Rust↔TS, Rust↔Rust, Rust↔Zig
  Zig↔C++, Zig↔TS, Zig↔Rust, Zig↔Zig
Total: 18/18 passing (2 golden file + 16 IPC round-trip).
In CI, barretenberg/ts bootstrap runs before C++ binaries are built. The unified generate.ts was failing because it tried to run all 4 service binaries (bb, wsdb, cdb, avm) unconditionally. Fix: check if the binary exists before invoking it. If not found, print a skip message and continue. This matches the old behavior where per-service scripts were only called when binaries existed.
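The skip-if-missing behaviour could look roughly like this. It is a sketch only: the `ServiceBinary` shape and the `selectRunnableServices` name are hypothetical, and the `exists` parameter is injectable purely so the logic can be exercised without real binaries on disk.

```typescript
import { existsSync } from "node:fs";

// Hypothetical minimal view of a service: a name and the path to its binary.
interface ServiceBinary {
  name: string;
  binaryPath: string;
}

// Split services into those whose binary is present and those to skip.
// Missing binaries produce a log line rather than an error, so codegen can
// continue for the remaining services.
export function selectRunnableServices(
  services: ServiceBinary[],
  exists: (path: string) => boolean = existsSync,
): { runnable: ServiceBinary[]; skipped: ServiceBinary[] } {
  const runnable: ServiceBinary[] = [];
  const skipped: ServiceBinary[] = [];
  for (const service of services) {
    if (exists(service.binaryPath)) {
      runnable.push(service);
    } else {
      console.log(`Skipping ${service.name}: binary not found at ${service.binaryPath}`);
      skipped.push(service);
    }
  }
  return { runnable, skipped };
}
```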
The yarn clean script deleted generated/ directories, but yarn generate can only recreate them when C++ binaries are available. With ci-full-no-test-cache (no build cache), this causes build failures because clean runs before generate, and generate skips when binaries are missing. Fix: remove generated dirs from the clean target. The generate script overwrites them when it runs; if it skips (no binaries), the existing generated files from cache or a previous build persist.
The build had a circular dependency:
  yarn clean → deletes generated files
  yarn generate → needs C++ binaries (not built yet in TS bootstrap)
  yarn build:esm → needs generated files → FAILS
Fix: commit the generated files to git as a baseline. They are always available even without C++ binaries or build cache. When binaries ARE available, yarn generate overwrites them with fresh output.
Changes:
- Un-gitignore src/*/generated/ directories
- Commit existing generated files (api_types.ts, async.ts, sync.ts, curve_constants.ts for bb/wsdb/cdb)
- yarn clean no longer deletes generated dirs (from previous commit)
- yarn generate skips services whose binaries are missing (from previous commit)
This eliminates the need for build cache to bootstrap bb.js, making ci-full-no-test-cache work correctly.
barretenberg-rs/src/api.rs and generated_types.rs were gitignored but are needed for cargo build. Without cache, the Rust bootstrap fails because codegen is skipped (no C++ binaries available). Fix: un-gitignore and commit as baseline, same as the TS generated files in the previous commit.
Design for moving codegen out of bb.js into barretenberg/codegen/:
- Standalone package with own bootstrap.sh and cache hash
- Build order: bb-codegen → bb-cpp-native → bb-generate → bb-ts/bb-rs
- Two-level hashing: tool hash + generation hash
- Generated files not committed, produced at build time
- bb-rs decoupled from bb-ts (only needs generated files + C++ libs)
Move all code generation files out of barretenberg/ts/src/cbind/ into barretenberg/codegen/src/ with its own package.json and tsconfig.json. The codegen package has zero dependency on bb.js — it only needs Node.js + msgpackr + tsx. It writes output to ts/, rust/, cpp/, zig/ directories via relative paths from codegen/src/. This is Phase 1 of the restructure: the codegen tool exists as an independent project. Next steps: update Makefile targets, bootstrap scripts, and remove codegen from bb.js build pipeline.
bootstrap.sh provides:
- build: npm install with cache (fast, no C++ dependency)
- generate: runs codegen against C++ binaries, caches all output
- hash: tool hash from .rebuild_patterns
- generate_hash: hash(tool_hash, cpp_hash) for cache invalidation
Generation hash changes when either codegen source OR C++ schemas change, ensuring consumers always get fresh bindings.
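The two-level hash `hash(tool_hash, cpp_hash)` can be pictured with the sketch below. The real implementation lives in a shell script whose hashing helper is not shown here, so the use of SHA-256, the newline separator, and the `combineHashes` name are all assumptions made for illustration.

```typescript
import { createHash } from "node:crypto";

// Combine several input hashes into one cache key. Each part is followed by a
// newline so that ("ab", "c") and ("a", "bc") do not collide.
export function combineHashes(...parts: string[]): string {
  const hasher = createHash("sha256");
  for (const part of parts) {
    hasher.update(part, "utf8");
    hasher.update("\n", "utf8");
  }
  return hasher.digest("hex");
}
```

Because either input changing yields a new combined value, a cache keyed on it is invalidated when the codegen tool or the C++ side changes.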
Makefile:
- Add bb-codegen target (npm install for codegen tool)
- Add bb-generate target (depends on bb-codegen + bb-cpp-native)
- bb-ts now depends on bb-generate (instead of running yarn generate)
- bb-rs now depends on bb-generate (instead of bb-ts)
barretenberg/ts/bootstrap.sh:
- Hash now depends on codegen generate_hash (not cpp hash directly)
- Removed yarn generate from build function
barretenberg/rust/bootstrap.sh:
- Hash now depends on codegen generate_hash (not ts hash)
- Removed yarn generate invocation from build function
- bb-rs no longer transitively depends on bb-ts
barretenberg/ts/package.json:
- Removed yarn generate from build script (handled by bb-generate)
Generated files are no longer committed to git. They are produced at
build time by barretenberg/codegen/bootstrap.sh generate (the
bb-generate Makefile target) which runs before bb-ts and bb-rs.
Restored gitignore entries for:
- barretenberg/ts/src/*/generated/ (TS types, async/sync APIs)
- barretenberg/rust/barretenberg-rs/src/{api,generated_types}.rs
Export schema JSON from each C++ binary and commit as the static source of truth for codegen:
- schemas/bb_schema.json (55 commands, 14KB)
- schemas/wsdb_schema.json (30 commands, 8KB)
- schemas/cdb_schema.json (13 commands, 4KB)
- schemas/avm_schema.json (3 commands, 1KB)
Codegen will read these files instead of executing C++ binaries, breaking the circular dependency between C++ build and codegen. CI validates schemas stay in sync with C++ code after each build.
Replace binary invocation with static file reads:
- service_codegen.ts: ServiceConfig now has schemaFile instead of binaryEnvVar/defaultBinaryPath. loadAndCompileSchema() reads JSON.
- generate.ts: curve constants read from committed msgpack file.
The codegen tool now has ZERO runtime dependency on C++ binaries.
update_schemas.sh: dev script to re-export schemas from C++ binaries. validate_schemas.sh: CI script that validates committed schemas match C++ binary output. Fails with instructions if they diverge.
Since codegen reads static JSON files, the dependency chain simplifies:
- bb-generate depends only on bb-codegen (not bb-cpp-native)
- generate_hash is gone; codegen hash includes schemas via .rebuild_patterns
- Consumer hashes (bb-ts, bb-rs) use codegen hash directly
Build order is now fully acyclic:
  bb-codegen (npm install) → bb-generate (read JSON, produce code)
  bb-cpp-native (C++ build, independent)
  bb-ts (needs bb-generate + bb-cpp-native)
  bb-rs (needs bb-generate + bb-cpp-native)
Schema validation runs after C++ build as a separate check.
bn254G1FromCompressed silently returned zero point for out-of-range x-coordinates. Now throws an error. Masks bit 255 (y-parity) before checking against the field modulus.
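The range check described here can be sketched as below. It assumes a 32-byte big-endian encoding in which bit 255 is the top bit of the first byte, and it assumes the modulus in question is the BN254 base field prime. `parseCompressedX` is a hypothetical name and covers only the x-coordinate check, not full point decompression.

```typescript
// BN254 base field modulus (the field that G1 coordinates live in).
const BN254_FQ_MODULUS =
  21888242871839275222246405745257275088696311157297823662689037894645226208583n;

// Split a compressed point into its y-parity flag and x-coordinate, rejecting
// x values that are not canonical field elements.
export function parseCompressedX(bytes: Uint8Array): { x: bigint; yParity: boolean } {
  if (bytes.length !== 32) {
    throw new Error(`expected 32 bytes, got ${bytes.length}`);
  }
  // Assumption: bit 255 (top bit of byte 0, big-endian) carries the y-parity.
  const yParity = (bytes[0] & 0x80) !== 0;
  let x = 0n;
  for (let i = 0; i < 32; i++) {
    // Mask the parity bit out of the first byte before accumulating.
    const byte = i === 0 ? bytes[0] & 0x7f : bytes[i];
    x = (x << 8n) | BigInt(byte);
  }
  // Throw rather than silently mapping an out-of-range x to some default point.
  if (x >= BN254_FQ_MODULUS) {
    throw new Error("x-coordinate is not below the field modulus");
  }
  return { x, yParity };
}
```

Masking first matters because the parity bit would otherwise push every flagged encoding above the modulus and cause valid points to be rejected.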
…t name
- Replace BBAPI_ERROR with throw_or_abort in ECC handlers so errors propagate through the generated dispatch as BbErrorResponse
- Fix hardcoded 'ErrorResponse' in sync client to use the schema's actual error type name (e.g. 'BbErrorResponse')
… in Docker) In Docker containers, Node.js runs as PID 1. The spawned bb/wsdb/avm binaries see getppid()==1 and immediately shut down, thinking the parent died. This caused "Native backend process exited unexpectedly" in the playground test. prctl(PR_SET_PDEATHSIG) already handles the race correctly — the kernel delivers SIGTERM retroactively if the parent exited before prctl was called.
Resolved conflicts:
- avm-transpiler/Cargo.lock: take base branch (latest noir deps)
- bbapi.test.cpp: keep both include sets
- bbapi_chonk.cpp: use circuit_name variable with Bb prefix
- types.rs: keep deletion (replaced by generated code)
BBApiRequest -> BbRequest, AesEncrypt -> BbAesEncrypt, AesDecrypt -> BbAesDecrypt in tests added by base branch merge.
…N_FIELDS Zig's cross-compile clang for macOS doesn't support C++20 abbreviated function templates (auto params). Use explicit template parameter.
fdc31e3 to 8c7f1a6
Continuation of #21990 (CI3 workflow stopped triggering on that PR).
Same code, fresh PR to unblock CI.