test: Phase 2 — multi-process cluster harness (txnode + scripted host manager + TestCluster)#507
Conversation
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…able NodeConfig defaults is_candidate=false (a voter); a non-candidate cannot be made leader (cc_node.cpp rejects a non-candidate leader target). In this 1-member-per-NG foundation each member is its NG's sole leader, so mark it a candidate, matching how production's ParseNgConfig marks NG primaries. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Implement TestCluster::Start() (HM up -> spawn nodes -> await workload servers -> drive OnLeaderStart per NG + SetLeader -> NotifyNewLeaderStart to all nodes -> WaitReady) and add a WaitReady RPC to the txnode workload service that finishes the native-NG recovery (skip_wal leaves no log service to promote candidate->leader) and drives WaitClusterReady. Add ClusterCrossNg-Test: a real 2-node cluster, write 20 keys via node 0 (some remote to ng1) and read them back via node 1 (some remote to ng0). Cluster bring-up and cross-NG writes work; the cross-NG READ is currently blocked by a pre-existing engine issue in the bucket-meta lock path on a freshly-promoted skip_kv leader (ReadOperation::Forward lock_range_bucket_result_->IsFinished() / the RangeBucket ReadLocal never completing under skip_kv). See the STATUS comment in ClusterCrossNg-Test. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ine skip_kv gap); harden node cleanup; non-gating CI Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
|
Warning Review limit reached
More reviews will be available in 40 minutes. Learn how PR review limits work. Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file). ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (9)
WalkthroughImplements Phase 2 multi-process cluster test harness: workload proto and port utilities, txnode bring‑up fixtures, scripted HostManager, TestCluster orchestration to spawn txnode subprocesses, txnode binary exposing WorkloadService, Catch2 cross‑NG integration test, and CMake/CI wiring. ChangesPhase 2 Multi-Process Cluster Test Harness
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
tx_service/tests/cluster/txnode_main.cpp (1)
164-168: 💤 Low valueConsider validating port range before casting to uint16_t.
std::stoulsucceeds for values > 65535 (e.g., "70000"), butstatic_cast<uint16_t>silently truncates them. While TCP ports are inherently 16-bit and the test harness controls inputs, explicit validation would surface configuration errors early.♻️ Proposed validation
try { uint32_t ng_id = static_cast<uint32_t>(std::stoul(ng_str)); uint32_t node_id = static_cast<uint32_t>(std::stoul(node_str)); - uint16_t port = static_cast<uint16_t>(std::stoul(port_str)); + unsigned long port_ul = std::stoul(port_str); + if (port_ul > 65535) + { + throw std::runtime_error("port out of range"); + } + uint16_t port = static_cast<uint16_t>(port_ul); ng_members[ng_id].emplace_back(node_id, host, port); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tx_service/tests/cluster/txnode_main.cpp` around lines 164 - 168, The code converts port_str via std::stoul then static_cast<uint16_t>, which silently truncates values > 65535; change the port parsing in the try block to first parse into an unsigned long (or uint32_t) (use the current std::stoul result), validate that the value is <= 65535 (UINT16_MAX) before casting, and if out of range throw or log an error and fail the parse (so port is not silently truncated). Update the parsing for port (the variable currently named port and the std::stoul(port_str) call) to perform this range check and handle the error path consistently with the existing ng_id/node_id parse error handling.tx_service/tests/ClusterCrossNg-Test.cpp (1)
61-114: 💤 Low valueConsider extracting the hardcoded RPC timeout to a named constant.
The 5000ms timeout appears in all four helper functions (lines 64, 78, 92, 107). Extracting it to a
constexpr int kRpcTimeoutMs = 5000;at namespace scope would make future adjustments easier and document the timeout policy.♻️ Proposed refactor
namespace { +constexpr int kRpcTimeoutMs = 5000; + // --- Workload-RPC helper wrappers over a WorkloadService_Stub. Each opens a // fresh brpc::Controller (a Controller is single-use) and asserts the RPC did // not fail at the transport level before inspecting the reply. --- int64_t LeaderTerm(WorkloadStub &stub, uint32_t ng_id) { brpc::Controller cntl; - cntl.set_timeout_ms(5000); + cntl.set_timeout_ms(kRpcTimeoutMs); ... } uint64_t Begin(WorkloadStub &stub) { brpc::Controller cntl; - cntl.set_timeout_ms(5000); + cntl.set_timeout_ms(kRpcTimeoutMs); ... } bool Upsert(WorkloadStub &stub, uint64_t handle, int key, int value) { brpc::Controller cntl; - cntl.set_timeout_ms(5000); + cntl.set_timeout_ms(kRpcTimeoutMs); ... } bool Commit(WorkloadStub &stub, uint64_t handle) { brpc::Controller cntl; - cntl.set_timeout_ms(5000); + cntl.set_timeout_ms(kRpcTimeoutMs); ... }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tx_service/tests/ClusterCrossNg-Test.cpp` around lines 61 - 114, The four helper functions LeaderTerm, Begin, Upsert, and Commit duplicate the hardcoded RPC timeout (cntl.set_timeout_ms(5000)); introduce a single namespace-scope constexpr int kRpcTimeoutMs = 5000 and replace each cntl.set_timeout_ms(5000) call with cntl.set_timeout_ms(kRpcTimeoutMs) so the timeout policy is documented and easy to change in one place; keep the new constant near the top of the test file in the anonymous or test namespace scope so it is visible to all four functions.tx_service/tests/cluster/txnode_workload.proto (1)
2-2: 💤 Low valueConsider aligning package directory with proto convention.
The Buf linter flags that package
txnode_workloadshould reside in atxnode_workload/directory relative to the repository root, but the file is intx_service/tests/cluster/. While this works functionally and may be intentional for test organization, aligning with protobuf directory conventions improves tooling compatibility and consistency.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tx_service/tests/cluster/txnode_workload.proto` at line 2, The proto package declaration package txnode_workload does not match protobuf directory conventions; either move the .proto into a matching txnode_workload/ directory (so the file path aligns with package txnode_workload) or change the package declaration to reflect the current test directory layout; update any import or build references that depend on this package name accordingly so the linter and tooling see a consistent package-to-directory mapping.Source: Linters/SAST tools
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tx_service/tests/cluster/txnode_bringup.cpp`:
- Around line 63-90: Remove the duplicate BindEphemeralPort implementation and
instead include the shared helper from harness/port_util.h and add a using
declaration (e.g., using ::BindEphemeralPort;) so this file reuses the common
function (as test_node.cpp does); also fix the misleading comment that says
"loopback"—either remove that phrase or change it to reflect that the bind uses
INADDR_ANY (0.0.0.0).
---
Nitpick comments:
In `@tx_service/tests/cluster/txnode_main.cpp`:
- Around line 164-168: The code converts port_str via std::stoul then
static_cast<uint16_t>, which silently truncates values > 65535; change the port
parsing in the try block to first parse into an unsigned long (or uint32_t) (use
the current std::stoul result), validate that the value is <= 65535 (UINT16_MAX)
before casting, and if out of range throw or log an error and fail the parse (so
port is not silently truncated). Update the parsing for port (the variable
currently named port and the std::stoul(port_str) call) to perform this range
check and handle the error path consistently with the existing ng_id/node_id
parse error handling.
In `@tx_service/tests/cluster/txnode_workload.proto`:
- Line 2: The proto package declaration package txnode_workload does not match
protobuf directory conventions; either move the .proto into a matching
txnode_workload/ directory (so the file path aligns with package
txnode_workload) or change the package declaration to reflect the current test
directory layout; update any import or build references that depend on this
package name accordingly so the linter and tooling see a consistent
package-to-directory mapping.
In `@tx_service/tests/ClusterCrossNg-Test.cpp`:
- Around line 61-114: The four helper functions LeaderTerm, Begin, Upsert, and
Commit duplicate the hardcoded RPC timeout (cntl.set_timeout_ms(5000));
introduce a single namespace-scope constexpr int kRpcTimeoutMs = 5000 and
replace each cntl.set_timeout_ms(5000) call with
cntl.set_timeout_ms(kRpcTimeoutMs) so the timeout policy is documented and easy
to change in one place; keep the new constant near the top of the test file in
the anonymous or test namespace scope so it is visible to all four functions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 415700e9-30bf-49cc-bd47-843fb46ff92d
📒 Files selected for processing (13)
.github/workflows/unit-tests.ymltx_service/tests/CMakeLists.txttx_service/tests/ClusterCrossNg-Test.cpptx_service/tests/cluster/scripted_host_manager.cpptx_service/tests/cluster/scripted_host_manager.htx_service/tests/cluster/test_cluster.cpptx_service/tests/cluster/test_cluster.htx_service/tests/cluster/txnode_bringup.cpptx_service/tests/cluster/txnode_bringup.htx_service/tests/cluster/txnode_main.cpptx_service/tests/cluster/txnode_workload.prototx_service/tests/harness/port_util.htx_service/tests/harness/test_node.cpp
Apply fixes from a local multi-agent review of PR #507: Diagnosability / robustness (test_cluster.cpp): - AwaitWorkloadServers: detect a txnode that died during bring-up via waitpid(WNOHANG) and fail immediately with its exit/signal status, instead of burning the full 30s on connection-refused RPCs. Give each node its own timeout budget so a slow first node cannot starve later ones. - SpawnNode: check the posix_spawn_file_actions_* / posix_spawnattr_* setup return values; a silently-failed setpgroup would leave the child in the driver's group and make the dtor's group-kill backstop target the wrong group. - KillNode: retry waitpid on EINTR so a signal mid-wait cannot leave the child unreaped. - BuildClient: set max_retry=0 -- the workload RPCs are non-idempotent, so a transport-level retry could leak a tx handle or re-commit. - DriveLeader: treat resp.retry() as a retry condition (matches the documented contract; keeps the loop correct if the engine ever returns retry without error). - AwaitClusterReady: on a non-failed WaitReady reply with ready=false (a driver-ordering bug), fail loudly and immediately rather than spinning to the deadline. txnode_main.cpp: - WaitReady: single-flight the WaitClusterReady handshake so a retried WaitReady RPC (after a client-side timeout) cannot launch a second concurrent WaitClusterReady in the same process. Test (ClusterCrossNg-Test.cpp): - Assert NodeId(c0)==0 / NodeId(c1)==1 so a topology/port mix-up cannot silently degenerate the cross-NG test into a single-node test. Comment accuracy: - Note log-group binds base+2 in the 4-wide port-window comments (txnode_main.cpp, txnode_bringup.cpp, test_cluster.cpp). - DriveLeader doc: it sends only the OnLeaderStart RPC; NotifyLeader and the recovery handshake are driven separately. - WaitReady: name the real promoter (FinishLogGroupReplay). - Clarify that Upsert buffers locally and the cross-NG work is at Commit; describe the node-1 case as the coordinator/remote-role mirror. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- txnode_bringup.cpp: drop the duplicate BindEphemeralPort and reuse the shared txservice::test::BindEphemeralPort from harness/port_util.h (same behavior; the file is already in namespace txservice::test, so the call resolves without a using-declaration). Removes the now-unused <netinet/in.h>/<sys/socket.h> includes. - txnode_main.cpp ParseTopology: reject ports > 65535 explicitly instead of silently truncating std::stoul's result to uint16_t. - ClusterCrossNg-Test.cpp: hoist the duplicated 5000ms RPC timeout into a single constexpr kRpcTimeoutMs. Skipped CodeRabbit's proto-package-directory nitpick: txnode_workload.proto is a test-only definition deliberately colocated with the cluster harness it serves and referenced there by the CMake proto-gen; moving it for the Buf directory convention has no functional benefit. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
|
Addressed the CodeRabbit review in 12a07e4:
Skipped: the Verified: |
Quality cleanup from a /simplify pass (no behavior change): - test_cluster.cpp DriveLeader/NotifyLeader: build the CcRpcService stub and the request once before the retry loop instead of per iteration (a stub is a stateless channel wrapper; the request fields are constant across retries). Only the Controller/response are per-attempt. - test_cluster.cpp NotifyLeader: drop the dead leader-NodeProc lookup -- it was found but never dereferenced; the request needs only leader_node_id (already known-valid from the caller's iteration). - test_cluster.cpp: resolve the txnode binary path once in Start() and pass it to SpawnNode, instead of calling LocateTxnodeBinary (readlink + stat) per node. Matches the existing BuildTopologyString hoist. - txnode_bringup.cpp: collapse the two identical CatalogFactory*[3] arrays into one shared by both the DataStoreServiceClient and TxService ctors. - Hoist the duplicated cluster_config_version literal into a single shared constant kClusterConfigVersion in txnode_bringup.h, used by both the node bring-up and the driver's OnLeaderStart (the two must agree; the coupling was previously only documented in a comment). Skipped (noted for follow-up): parallelizing AwaitWorkloadServers and reusing cc-node channels across phases (future N-node scaling, adds concurrency/state not worth it at the 2-node foundation); factoring the shared conf-map / gflags / DSS bring-up blocks out of the merged Phase 1 test_node.cpp (a larger refactor of already-merged code, outside this diff); generalizing ClusterOptions to multi-member NGs and a dedicated Promote(ng,term) RPC (deliberate YAGNI until the failover/standby increment). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Two distinct bugs were chained under the quarantined LargeObjLRU-Test (the FastMetaDataMutex tls_shard_idx spin was fixed earlier in this branch): 1. Engine bug (template_cc_map.h, FindEmplace, LO_LRU policy): when the large-object page is the LAST page in the map and a key greater than it is inserted, the `target_it == ccmp_.end()` branch creates the new next-page but never repoints `target_page` to it (the parallel `else` branch does). The key is then Emplaced back into the large-object page, and TryUpdatePageKey calls FirstKey() on the still-empty new page -> assert(!keys_.empty()) (TC-FE-01). In NDEBUG this is worse: FirstKey() reads keys_.front() on an empty vector (UB) and the large-object "alone on its page" invariant is violated. Fix: add the missing `target_page = target_it->second.get();`, mirroring the else branch. 2. Test-setup bug (LargeObjLRU-Test.cpp, 4 sites): tests create partial- commit dirty entries and compensated dirty_data_key_count_ via f.shard.AdjustDataKeyStats(...) -- the SHARD counter only. OnCommittedUpdate normally bumps both the shard and the MAP counters, so the map's dirty_data_key_count_ was left at 0 while the entries are dirty, and Terminate()'s decrement underflowed it (assert in TemplateCcMap::AdjustDataKeyStats). This surfaced only once bug #1 stopped killing the suite earlier. Fix: drive the real clean->dirty API (cc_map.OnCommittedUpdate) so both counters stay in step. (This is unrelated to the production dirty-count underflow paths in #465.) With both fixes the full LargeObjLRU-Test (23 cases, 3701 assertions) passes deterministically, so it is un-gated in CI (only ClusterCrossNg-Test remains non-gating). Full ctest suite: 40/40 green. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Eliminate the port reserve-then-bind TOCTOU that made ClusterCrossNg-Test a gating risk, then gate it. - ScriptedHostManager: bind the HM brpc server on a brpc PortRange instead of a pre-picked port, and expose Port(). brpc atomically finds+holds a free port, so there is no reserve/bind window at all. The driver reads the bound port back (Start() -> hm_.Port()); the ctor no longer reserves an HM port. - TestCluster workload port: AwaitWorkloadServers now detects a node that died during bring-up (waitpid WNOHANG) and re-spawns it on a fresh workload port (up to a few times) instead of failing the cluster -- self-healing the workload-port TOCTOU. - TestCluster tx-port window: the tx window is shared across the topology (every node must agree on every peer's ports), so it cannot be re-picked for one node. Start() now wraps the whole spawn+await in a bounded retry that, on a bring-up failure, kills all nodes, re-picks every node's ports, rebuilds the topology, and re-spawns -- covering tx-port collisions under heavy port contention. Stress: 30/30 sequential, 36/36 at 6x parallel, 50/50 at 10x parallel, 0 orphaned processes. Before the whole-cluster retry, 6x parallel showed a ~1/18 SIGABRT during bring-up (a tx-port collision); it is gone now. With this ClusterCrossNg-Test is stable enough to gate, so the CI carve-out is removed and the full suite gates. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
|
@CodeRabbit review |
✅ Action performedReview finished.
|
Summary
Phase 2 foundation of the data_substrate test framework (Tier 2): a multi-process cluster harness that spawns real node processes and drives transactions across them. Builds on the merged Phase 1 (
TestNode/MemDataStore).Because the engine's singletons (
Sharder,TxStartTsCollector, …) are process-global (one node per process), each node runs as its owntxnodeOS process. The test driver runs an in-process scripted host manager and deterministically scripts leadership by calling each node'sCcRpcService.OnLeaderStart— exactly what the real raft host manager does — instead of relying on raft elections. This makes multi-node behavior testable without timing nondeterminism.What's included
txnodebinary (tx_service/tests/cluster/txnode_main.cpp,txnode_bringup.{h,cpp}): brings up a real multi-nodeTxService(mock catalog + in-processMemDataStore,skip_wal/skip_kv, external host manager,fork_host_manager=false) and hosts a small workload brpc service (txnode_workload.proto: BeginTx/Upsert/Read/Commit/Abort/NodeInfo/WaitReady/Shutdown) the driver uses to drive and inspect the node.ScriptedHostManager(scripted_host_manager.{h,cpp}): minimalHostMangerService— nodes register viaStartNode, learn NG leaders viaGetLeaderfrom a driver-set map.TestCluster(test_cluster.{h,cpp}): spawns/killstxnodeprocesses (own process group, robust reap — no orphans even on test failure), reserves port windows, runs the scripted HM, drives leadership (OnLeaderStart+NotifyNewLeaderStart+ concurrentWaitReady), and exposesClient(node).ClusterCrossNg-Test: brings up a 2-node / 2-NG cluster and asserts both NGs are led and that cross-NG write transactions commit from both nodes (writing keys that span both NGs exercises remote write-lock acquisition + 2PC across nodes). 3× green, zero orphan processes.port_util.h: factored the port-reservation helpers out of Phase 1'stest_node.cpp(shared by both); Phase 1 suite still passes.Known limitation (documented, tracked — NOT a harness defect)
Cross-NG read-back is intentionally not asserted. Under
skip_kv, a freshly-promoted leader'sInitRangeBucketscopies bucket info (enough for write routing) but does not seed the RangeBucket CcMap entries that a read pins (ReadOperation→lock_bucket_op_) — this is flagged// TODO: HARDCORE SEEDincc_node.cpp. So cross-NG/cross-bucket reads hang on the bucket-meta lock while writes commit. This is an engine gap (sameskip_kvfamily that deferred storage-backed Phase 1 work to e2e); resolving it needs coretx_execution/InitRangeBucketschanges and is a follow-up. The cluster test is non-gating in CI (likeLargeObjLRU-Test) until then.CI
ClusterCrossNg-Testruns in the non-gating step (ctest -L), excluded from the gating run (-LE "LargeObjLRU-Test|ClusterCrossNg-Test"), since it spawns processes and is heavier; promote to gating once proven stable across CI runs.Follow-up Phase 2 increments (own PRs)
standby replication + failover; crash-recovery with an embedded LogServer; message-level fault injection; simplified scale in/out; and the cross-NG read once the RangeBucket seed gap is fixed.
🤖 Generated with Claude Code
Summary by CodeRabbit