feat(cts): add spec-driven contract test suite for Python, Java, and Rust clients #343
ACTION NEEDED: The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification. For details on the error, please inspect the "PR Title Check" action.
…ython, Java, and Rust clients
Build an end-to-end Contract Testing Suite (CTS) driven entirely by code
generators on top of docs/src/spec.yaml. Per-client test files and WireMock
stub mappings are produced from the spec, then replayed against WireMock
standalone to verify HTTP serialization/deserialization across all generated
language clients — without ever modifying the source spec.
Highlights
----------
* Single source of truth: contract tests, WireMock stubs, and examples
overlays are all generated from the OpenAPI spec. The spec file itself
stays untouched; the generators read it and emit derived artifacts under
generator-owned trees only.
* Four client surfaces covered:
- Python : python/lance_namespace_urllib3_client/tests/test_contract.py
- Java sync : java/lance-namespace-apache-client/.../cts/WireMockContractIT.java
- Java async: java/lance-namespace-async-client/.../cts/WireMockContractIT.java
- Rust : rust/lance-namespace-reqwest-client/tests/contract.rs
* CTS scripts under ci/cts/:
- gen_client_tests.py — emit per-language contract tests
- gen_wiremock_mappings.py — emit WireMock stub mappings from spec
- gen_examples_overlay.py — derive request/response examples overlay
- apply_overlay.py — apply overlay to the spec at build time
* Make targets to wire the whole flow (Makefile, java/Makefile):
- merge-spec / clean / gen-cts / test-cts
- Per-language gen/build/test entry points kept generator-owned, so
`make clean && make gen` always reproduces the same artifacts.
* CI workflow .github/workflows/contract-tests.yml runs:
1. spec — Spectral lint + breaking-change check (strict)
2. client-conformance — matrix over java / python / rust, generates
WireMock stubs from the merged spec and
replays them against each generated client
(strict).
The originally planned server-conformance job (Schemathesis vs Spring Boot
reference server) is intentionally NOT added in this PR: the
lance-namespace-springboot-server module currently only ships generated
API interfaces — there is no @SpringBootApplication, no controller
implementations, and no application.yml, so the server cannot actually
start. The job will be re-added in a follow-up PR once a runnable
reference server lands.
* Schemathesis configuration (ci/schemathesis.toml) is added and prepared
for that future server-conformance job. It already disables the three
Arrow IPC endpoints whose request bodies use
application/vnd.apache.arrow.stream, which Schemathesis cannot
auto-serialize:
- CreateTable POST /v1/table/{id}/create
- InsertIntoTable POST /v1/table/{id}/insert
- MergeInsertIntoTable POST /v1/table/{id}/merge_insert
* Spec lint rules tightened: ci/spectral.yaml adds project-specific
ruleset; .github/workflows/spec.yml runs Spectral on every change to
docs/src/spec.yaml.
* Generator-owned files isolated from hand-written CTS edits:
- ci/patch_apache_pom.py and java/async-client-pom.xml keep
OpenAPI-generated Maven POMs reproducible while letting CTS
add WireMock/JUnit5 dependencies via a post-gen patch.
- .gitignore updated so build/, merged spec, and generator output
stay out of the tree.
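The Arrow IPC endpoints listed above can in principle be discovered from the spec rather than maintained by hand. The following is a minimal sketch of that lookup, not the code in this PR: the function name `arrow_operations`, the toy spec fragment, and the `DescribeTable` entry are illustrative assumptions, and `$ref`-style request bodies are not resolved.

```python
ARROW_MEDIA_TYPE = "application/vnd.apache.arrow.stream"
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options"}


def arrow_operations(spec: dict) -> list[tuple[str, str, str]]:
    """Return (operationId, METHOD, path) for operations whose request body is Arrow IPC."""
    found = []
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            if method not in HTTP_METHODS:
                continue  # skip path-level keys such as 'parameters' or 'summary'
            # Inline request bodies only; a '$ref' body would need resolving first.
            content = operation.get("requestBody", {}).get("content", {})
            if ARROW_MEDIA_TYPE in content:
                found.append((operation.get("operationId", ""), method.upper(), path))
    return sorted(found)


# Toy fragment (hypothetical, shaped like an OpenAPI 3 document).
toy_spec = {
    "paths": {
        "/v1/table/{id}/create": {
            "post": {
                "operationId": "CreateTable",
                "requestBody": {"content": {ARROW_MEDIA_TYPE: {}}},
            }
        },
        "/v1/table/{id}/describe": {
            "post": {
                "operationId": "DescribeTable",
                "requestBody": {"content": {"application/json": {}}},
            }
        },
    }
}

for operation_id, method, path in arrow_operations(toy_spec):
    print(operation_id, method, path)
```

Deriving the list this way means a newly added Arrow operation would be excluded from Schemathesis without a manual config edit.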
Outcome
-------
`make clean && make gen && make test-cts` regenerates everything from
spec.yaml and runs the full contract test matrix locally. CI mirrors the
same flow: spec lint must pass, then java/python/rust contract tests must
all pass against WireMock stubs derived from the spec. Any future spec
change is reflected in the generated tests on the next `make gen`, so
client/spec drift fails the contract tests instead of merging silently.
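To make the spec-to-stub step concrete, here is a rough sketch of turning one operation into a WireMock stub mapping. It is an illustration under stated assumptions, not the contents of gen_wiremock_mappings.py: the helper names and the example response body are hypothetical, and only the mapping keys (`request`, `response`, `urlPathPattern`, `jsonBody`) follow WireMock's JSON stub format.

```python
import json
import re


def path_to_regex(path: str) -> str:
    """Replace each OpenAPI path parameter ({id}) with a one-segment regex."""
    return re.sub(r"\{[^/}]+\}", "[^/]+", path)


def stub_mapping(method: str, path: str, status: int, body: dict) -> dict:
    """Build one WireMock stub mapping that answers `method path` with a JSON body."""
    return {
        "request": {
            "method": method.upper(),
            "urlPathPattern": path_to_regex(path),
        },
        "response": {
            "status": status,
            "headers": {"Content-Type": "application/json"},
            "jsonBody": body,
        },
    }


# Hypothetical example payload; a real generator would take it from the
# examples overlay rather than hard-coding it.
mapping = stub_mapping("post", "/v1/table/{id}/create", 200, {"version": 1})
print(json.dumps(mapping, indent=2))
```

Each mapping would be written as a JSON file into the mappings directory that WireMock standalone loads at startup. Paths containing regex metacharacters would need escaping, which this sketch skips.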
- ci/patch_reqwest_arrow_content_type.py (new) post-processes the Rust
reqwest client to inject the Arrow Content-Type header on every
operation declared with application/vnd.apache.arrow.stream in the
spec. The stock OpenAPI Generator reqwest template emits
req_builder.body(p_body) with no Content-Type, while Java and Python
templates honour the spec's 'consumes'. The header is emitted in
rustfmt's expected multi-line layout so the resulting files do not
increase the existing cargo fmt --check diff. The patch is idempotent
and derives the operation set from the spec, so future Arrow ops are
covered automatically. Wired into rust/Makefile after
gen-reqwest-client.
- ci/cts/gen_client_tests.py and ci/cts/gen_wiremock_mappings.py updated
to keep the generated WireMock mappings and contract tests in sync with
the spec (Arrow ops, request/response bodies and content types).
- Makefile: test-cts now depends on build-cts so that running
'make test-cts' from a clean state regenerates contract test artifacts
(e.g. tests/contract.rs) before invoking cargo test. Previously
test-clients only depended on gen-wiremock, which produced
'no test target named contract' when the rust client had been
re-cleaned.
- rust/Makefile: invoke patch_reqwest_arrow_content_type.py at the end
of gen-reqwest-client.
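The idempotency property of such a post-generation patch can be sketched as follows. This is a simplified illustration, not patch_reqwest_arrow_content_type.py itself: it assumes the generated body call appears on a single line exactly as `BODY_CALL`, emits the header on one line instead of rustfmt's multi-line layout, and patches every body call rather than only the Arrow operations derived from the spec.

```python
# Assumed shapes of the generated Rust lines (hypothetical, for illustration).
BODY_CALL = "req_builder = req_builder.body(p_body);"
HEADER_CALL = (
    'req_builder = req_builder.header('
    '"Content-Type", "application/vnd.apache.arrow.stream");'
)


def inject_content_type(source: str) -> str:
    """Insert HEADER_CALL before each BODY_CALL unless it is already there."""
    lines = source.splitlines()
    patched = []
    for index, line in enumerate(lines):
        if line.strip() == BODY_CALL:
            previous = lines[index - 1].strip() if index > 0 else ""
            if previous != HEADER_CALL:  # already patched: leave untouched
                indent = line[: len(line) - len(line.lstrip())]
                patched.append(indent + HEADER_CALL)
        patched.append(line)
    trailing_newline = "\n" if source.endswith("\n") else ""
    return "\n".join(patched) + trailing_newline


generated = "    let p_body = body;\n    req_builder = req_builder.body(p_body);\n"
once = inject_content_type(generated)
twice = inject_content_type(once)
print(once)
print("idempotent:", once == twice)
```

Because the second pass detects the header it inserted on the first pass, the patch can run on every `make gen` without accumulating duplicate headers.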
Regenerated artifacts produced by 'make build-cts' after the changes in
the previous commit:
- rust/lance-namespace-reqwest-client/src/apis/data_api.rs and
table_api.rs now carry the Arrow Content-Type header injected by
ci/patch_reqwest_arrow_content_type.py for every operation declared
with application/vnd.apache.arrow.stream in the spec
(insert_into_table, merge_insert_into_table, query_table, ...).
- rust/lance-namespace-reqwest-client/tests/contract.rs regenerated by
ci/cts/gen_client_tests.py.
- java/lance-namespace-apache-client/.../WireMockContractIT.java and
java/lance-namespace-async-client/.../WireMockContractIT.java
regenerated by ci/cts/gen_client_tests.py.
- python/lance_namespace_urllib3_client/tests/test_contract.py
regenerated by ci/cts/gen_client_tests.py.
Verification:
- make test-cts: 49 (java apache) + 49 (java async) + 50 (python) +
49 (rust) all green.
- cargo fmt --check on rust/: 766 diffs, identical to baseline
(pre-existing, unrelated to this change).
…atize generators
- Move ci/patch_apache_pom.py, ci/patch_reqwest_arrow_content_type.py,
ci/schemathesis.toml into ci/cts/
- Refactor gen_client_tests.py to render Java/Python/Rust contract
harnesses via Mustache templates
- Add ci/cts/render.py and ci/cts/templates/ (apache/async java, python,
rust, shared partials)
- Update Makefile, java/Makefile, rust/Makefile, pyproject.toml, uv.lock
to reflect new paths and deps
Thanks for putting this together! The infrastructure work here (WireMock lifecycle, CI matrix, Makefile targets) is solid and could be reusable. That said, I think we can get more value from a CTS by focusing on behavioral contract testing rather than serialization/deserialization. The generated clients are produced by OpenAPI codegen, so serde correctness is largely the code generator's responsibility — what we really need to validate is that implementations conform to the spec's behavioral contracts. For example, for
The spec already declares the expected error types for every operation in errors.md. For example, Tests could also be parameterized by capability flags ( I'd suggest pivoting toward:
Happy to discuss further if you'd like to align on the direction before investing more time.
Introduce a second contract test suite for the Lance namespace REST API
that exercises operation *semantics* in-process, complementing — not
replacing — the existing wire-level WireMock suite. The two suites now
have distinct, non-overlapping concerns and a clean directory layout.
What's new
----------
* Authoritative spec under docs/src/cts-contracts/ split by domain:
main.yaml, namespace.yaml, table.yaml, data.yaml, index.yaml,
tag.yaml, transaction.yaml. Each case declares pre-conditions
(`given`), the request (`when`) and the expected outcome (`then`:
success / error_code / 4xx / 409 …) plus required capabilities.
* JSON Schema (ci/cts/cts-contracts.schema.json) and strict linter
(ci/cts/lint_contracts.py) enforcing single-file ownership per
operation, capability validity and shape correctness.
* Capability model: ci/cts/capabilities.py plus the per-impl manifest
ci/cts/capabilities.directory.txt; cases requiring an unsupported
capability are skipped at runtime instead of failing.
* ci/cts/contract_loader.py parses & validates contracts;
ci/cts/gen_contract_tests.py renders one Rust test module per
operation under rust/lance-namespace-cts/tests/contracts/ (43
modules), post-processed with `rustfmt --edition 2024` so
`cargo fmt --check` is a fixed point and `--check` mode catches
drift in CI.
* New workspace member rust/lance-namespace-cts/ hosting the
in-process harness — Fixtures, ContractCaller, Capabilities,
assert_contract_{ok,error} — driving DirectoryNamespace from the
sibling `lance` repo via a cargo path dependency. No cdylib, no
network, runs as a normal `cargo test` target.
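The capability gating described above (skip instead of fail when an implementation lacks a required capability) can be sketched in a few lines. This is only an illustration of the idea: the `ContractCase` shape, the `plan` helper, and the capability names are hypothetical and do not mirror ci/cts/capabilities.py or the Rust harness.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ContractCase:
    """One behavioural case: which operation it targets and what it requires."""
    name: str
    operation: str
    requires: frozenset = field(default_factory=frozenset)


def plan(cases: list[ContractCase], supported: set[str]) -> tuple[list[str], list[str]]:
    """Split case names into (run, skipped) based on the implementation's capabilities."""
    run, skipped = [], []
    for case in cases:
        missing = case.requires - supported
        (skipped if missing else run).append(case.name)
    return run, skipped


# Hypothetical capability names, for illustration only.
cases = [
    ContractCase("create_table_ok", "CreateTable"),
    ContractCase("create_tag_ok", "CreateTableTag", frozenset({"tags"})),
    ContractCase("commit_conflict", "AlterTransaction", frozenset({"transactions"})),
]
run, skipped = plan(cases, supported={"tags"})
print("run:", run)
print("skipped:", skipped)
```

A manifest such as ci/cts/capabilities.directory.txt would supply the `supported` set for a given implementation.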
Disambiguation renames (the name "contract" now refers to the new
behavioural suite; the wire-level suite is consistently "wiremock"):
* ci/cts/gen_client_tests.py -> gen_wiremock_tests.py
* templates/{rust,python,java_*}_contract.mustache
-> *_wiremock.mustache
* rust tests/contract.rs -> tests/wiremock.rs
* python tests/test_contract.py -> tests/test_wiremock.py
* java WireMockContractIT.java -> WireMockIT.java
Build / CI surface
------------------
* Make:
- `make gen-cts-behavior` / `make test-cts-behavior` drive the
new suite (codegen + cargo test, no JVM).
- `make gen-cts-wiremock` / `make build-cts-wiremock` /
`make test-cts-wiremock` keep the historical multi-language
WireMock pipeline (renamed from the old gen-cts/build-cts/
test-cts targets).
- `make test-cts` is the umbrella target and runs the full matrix:
test-spec-lint + test-cts-behavior + test-cts-wiremock.
* GitHub Actions (.github/workflows/contract-tests.yml):
- `behavior-conformance` runs `make test-cts-behavior` on every
push and pull_request — fast, no JVM, PR-blocking.
- The WireMock matrix (java / python / rust) is opt-in: it only
fires on push to long-lived branches and on manual
workflow_dispatch with `run_wiremock=true`, keeping PR latency
low.
* CONTRIBUTING.md gains a "Contract Tests (CTS)" section documenting
the two suites, how to author a behavioural case, and the
lint / codegen / run loop.
Quality gates (all green locally)
---------------------------------
cargo fmt --check OK
cargo clippy -D warnings OK
gen_contract_tests --check OK
lint_contracts --strict OK
make test-cts-behavior 170 passed
make test-cts-wiremock 49 (rust) + 50 (python) + 49 (java)
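The `--check` gate relies on the generator being a fixed point: rendering again must reproduce what is already on disk. A minimal sketch of that comparison follows; the function name `sync_generated` and the file names are assumptions, and the real gen_contract_tests.py may also handle stale files and rustfmt, which this sketch does not.

```python
import tempfile
from pathlib import Path


def sync_generated(rendered: dict[str, str], out_dir: Path, check: bool) -> list[str]:
    """Write rendered files, or in check mode only report the ones that differ."""
    drifted = []
    for relative_path, text in sorted(rendered.items()):
        target = out_dir / relative_path
        current = target.read_text(encoding="utf-8") if target.exists() else None
        if current == text:
            continue  # on-disk file already matches the generator output
        drifted.append(relative_path)
        if not check:
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(text, encoding="utf-8")
    return drifted


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    rendered = {"contracts/create_table.rs": "// generated\n"}
    print(sync_generated(rendered, out, check=True))   # reports drift, writes nothing
    print(sync_generated(rendered, out, check=False))  # writes the file
    print(sync_generated(rendered, out, check=True))   # nothing left to report
```

In CI, a non-empty result in check mode would translate into a non-zero exit code, so a hand edit or a forgotten regeneration blocks the merge.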
This PR introduces a two-layer contract test suite (CTS) for the Lance Namespace REST API:
- … `urllib3`, Java `apache`, Java `async`, Rust `reqwest`) against a `WireMock` standalone server, with both the per-client tests and the stub mappings code-generated from the OpenAPI spec.
- … `lance-namespace-cts`) directly against `DirectoryNamespace`, replaying YAML-authored contracts from `docs/src/cts-contracts/`.
Both layers are produced by code generators and never mutate the source spec.
Summary
End-to-end contract testing pipeline driven entirely by code generators on top of the OpenAPI spec and an authoritative behavioural-contract YAML set:
- … `cts-contracts/*.yaml` and run in-process against `DirectoryNamespace` (no JVM, no WireMock, no Spring Boot), capability-gated so impls advertise what they actually support.
Motivation
Multiple generated clients (Python `urllib3`, Java `apache`, Java `async`, Rust `reqwest`) and a Spring Boot server are all derived from a single OpenAPI spec. Until now there was no automated mechanism to ensure that:
This PR makes both kinds of regression mechanically detectable: adding a new operation in the spec automatically yields new wire-level tests in every client on the next `make gen-cts`, and authoring a new entry in `docs/src/cts-contracts/*.yaml` automatically yields a new behavioural test module on the next `make gen-cts-behavior`.
Architecture
The source spec is never mutated — example payloads are layered via an overlay file and applied at generation time only. The behavioural-contract YAML is the authoritative spec for everything beyond wire shape.
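One way to layer examples without touching the source spec is to deep-merge the overlay into a copy at generation time. The sketch below assumes a plain nested-dict merge; it is not the OpenAPI Overlay Specification format and not necessarily how ci/cts/apply_overlay.py works, and the spec fragment and example payload are hypothetical.

```python
import copy


def _merge_into(base: dict, extra: dict) -> None:
    """Recursively merge `extra` into `base` in place; `extra` wins on conflicts."""
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            _merge_into(base[key], value)
        else:
            base[key] = copy.deepcopy(value)


def apply_overlay(spec: dict, overlay: dict) -> dict:
    """Return a merged copy of `spec`; the input dict is left untouched."""
    merged = copy.deepcopy(spec)
    _merge_into(merged, overlay)
    return merged


# Hypothetical fragment: add a response example to one schema.
spec = {"components": {"schemas": {"CreateTableResponse": {"type": "object"}}}}
overlay = {
    "components": {"schemas": {"CreateTableResponse": {"example": {"version": 1}}}}
}

merged = apply_overlay(spec, overlay)
print(merged["components"]["schemas"]["CreateTableResponse"])
print("source untouched:", "example" not in spec["components"]["schemas"]["CreateTableResponse"])
```

Writing the merged result to a build directory, rather than back to docs/src/spec.yaml, is what keeps the source spec fenced off.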
Generators (the heart of the pipeline)
- `ci/cts/gen_wiremock_tests.py`: renders the wire-level tests for the Python `urllib3`, Java `apache`, Java `async`, and Rust `reqwest` clients from Mustache templates
- `ci/cts/gen_wiremock_mappings.py`: emits WireMock stub mappings from the spec
- `ci/cts/gen_examples_overlay.py`: derives the request/response examples overlay
- `ci/cts/apply_overlay.py`: applies the overlay to the spec at build time
- `ci/cts/gen_contract_tests.py`: renders one Rust test module per operation from `cts-contracts/*.yaml`, post-processed with `rustfmt --edition 2024`
- `ci/cts/contract_loader.py`: parses and validates `cts-contracts/*.yaml` against the JSON Schema
- `ci/cts/lint_contracts.py`: strict linter (single-file ownership per operation, capability validity, shape correctness)
- `ci/cts/capabilities.py`: capability model
CI and Build Tooling
- … `gen-cts` / `build-cts` / `test-cts-wiremock`), behavioural layer (`gen-cts-behavior` / `test-cts-behavior`), default `test-cts` runs `test-spec-lint` + behavioural, plus `verify-spec-untouched` to fence the source spec.
- `ci/spectral.yaml`: Spectral lint rules for the OpenAPI spec.
- `ci/cts/schemathesis.toml`: Schemathesis config for property-based contract testing.
- `ci/cts/cts-contracts.schema.json`: JSON Schema for the behavioural-contract YAML.
- `.github/workflows/contract-tests.yml`: CI workflow running the full CTS pipeline (253 lines) with PR-blocking limited to the behavioural job; the WireMock job runs alongside as a wire-level signal.
Generated Wire-Level Client Tests
Auto-generated; do not hand-edit. Re-run `make gen-cts` instead.
- `urllib3`: `python/lance_namespace_urllib3_client/tests/test_wiremock.py` (`pytest` + WireMock, dynamic port)
- `apache`: `java/lance-namespace-apache-client/.../WireMockIT.java`
- `async`: `java/lance-namespace-async-client/.../WireMockIT.java`
- `reqwest`: `rust/lance-namespace-reqwest-client/tests/wiremock.rs` (`tokio` async + WireMock process lifecycle)
Each suite exercises every Namespace / Table / Transaction / Tag / Data API operation end-to-end. (Rename note: these files used to be `test_contract.py` / `WireMockContractIT.java` / `contract.rs`; they were renamed to `*wiremock*` so that the word "contract" can be reserved for the new behavioural layer.)
Generated Behavioural-Contract Tests
Auto-generated; do not hand-edit. Re-run `make gen-cts-behavior` instead.
- `docs/src/cts-contracts/{main,index,namespace,table,tag,transaction,data}.yaml`: one entry per operation describing pre-conditions, request/response shape, expected outcomes (success, 4xx, 409, …) and required capabilities.
- `rust/lance-namespace-cts/`: `Fixtures`, `ContractCaller`, `Capabilities`, `assert_contract_{ok,error}`. Uses `DirectoryNamespace` via cargo `path` on the sibling `lance` repository's `lance-namespace-impls` crate.
- `rust/lance-namespace-cts/tests/contracts/*.rs`: 43 per-operation modules + a `mod.rs` wired into `tests/cts.rs`.
- `make test-cts-behavior` → 170 passed.
Dependencies
- `pyproject.toml` / `uv.lock`: test deps.
- `pom.xml`, `apache-client/pom.xml`, `async-client/pom.xml`: WireMock + JUnit 5.
- `rust/Cargo.toml`, `rust/lance-namespace-reqwest-client/Cargo.toml` (async test deps for the WireMock layer), `rust/lance-namespace-cts/Cargo.toml` (in-process harness, depends on the sibling `lance` repo via `path`).
Documentation
- `AGENTS.md`: top-level agent-role overview.
- `CONTRIBUTING.md`: refreshed contributor guide covering both CTS layers.
- `README.md` files for `apache-client`, `async-client`, `springboot-server`, `urllib3-client`, and `reqwest-client`.
- `docs/src/cts-contracts/` is wired into the docs site so the behavioural contracts ship as published reference material.
How to Run Locally
Quality Gates
All green locally on the tip of this branch:
- `cargo fmt --check` ✅
- `cargo clippy -D warnings` ✅
- `gen_contract_tests.py --check` (generator output is a fixed point) ✅
- `lint_contracts.py --strict` ✅
- `make test-cts` → 170 passed ✅
- `make test-cts-wiremock` → 49 (Rust) + 50 (Python) + 49 (Java) ✅