Simulator "Kill Node" Operation - Design Discussion #3017
Replies: 3 comments
---
I agree with the idea of using the message bus as just an outbox (this makes sense from the point of view of how a real TCP-based MessageBus actually works), and simulating it this way is definitely the way to go. The thing that I am puzzled about is how to do it in a way where we can use exactly the same primitive …
---
Here is a short markdown with some more details:

Core idea

Keep … The difference should be only: …

Runtime split

Do not make … Use this boundary:

Production

In the real server / cluster binary: …

So production still looks like: …

Simulator

In the simulator: …

So the simulator does: …

Important rule

Inbound should not go through …

Most important implementation points

…
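The runtime split described above can be sketched as a single trait with two backends. This is only a sketch under assumptions: `MessageBus` and `send_to_replica` appear in the discussion, but `OutboxBus`, `ReplicaId`, and `Message` are illustrative stand-ins, not the actual Iggy types.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;

type ReplicaId = usize;
type Message = Vec<u8>;

// Consensus code talks only to this trait, in both runtimes.
trait MessageBus {
    fn send_to_replica(&self, to: ReplicaId, msg: Message);
}

// Simulator backend: sending only *stages* the message; the simulated
// network decides later if and when it is delivered. Since the trait
// takes `&self`, the outbox needs interior mutability, but it stays
// confined to this single-threaded type.
struct OutboxBus {
    staged: RefCell<VecDeque<(ReplicaId, Message)>>,
}

impl MessageBus for OutboxBus {
    fn send_to_replica(&self, to: ReplicaId, msg: Message) {
        self.staged.borrow_mut().push_back((to, msg));
    }
}

// In production the same trait would instead be backed by real TCP
// writes (send = write to socket), with inbound traffic handled by the
// server's accept loop rather than going back through the bus.
```

The key property is that consensus never learns which backend it is running against; only the delivery policy differs.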
---
i went through the simulator code in detail after reading this proposal. the outbox-only model is the right call - it matches how a real TCP-based message bus actually works (or will work when I implement it, haha... send = stage outbound, network = deliver), and it solves the borrow problem.

the crash semantics table is mostly correct:

- "FROM crashed, in network -> delivered" is right (already on the wire, sender can't recall).
- "FROM crashed, in outbox -> discarded" is right (never left the process).
- "TO crashed, in network -> dropped" is right.

the part that needs revision is row 4: "TO restarted node -> delivered, consensus rejects stale via view/op checks." this doesn't hold because consensus recovery isn't implemented yet. what actually happens to a restarted replica depends on whether the cluster has changed views since it went down, but both paths are broken:

- if the cluster advanced views (say view=3 while the replica comes up at view=0): …
- if the cluster is still in the same view (view=0, no view changes occurred, just ops advanced to 51): this is the more direct failure. prepare(view=0, op=52, commit=50) passes the view check (0 > 0 = false), …

either way, a restarted replica with an empty journal cannot safely participate. this isn't a flaw in your proposal - it's a known gap in our consensus layer.

i think the right approach is phased PRs:

- PR 1: wire Network into Simulator, per-replica outboxes replacing the shared …
- PR 2: add …

after that, the durability prerequisites and recovery stub are internal work we need to do in the consensus layer before …

on the "Paused" state - i'd defer it. it's functionally equivalent to a full network partition (…).

PR 1 is a great starting point. key files: …
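the same-view failure path can be shown in a few lines. this is a sketch only: `LocalState`, `rejects_as_stale`, and `missing_ops` are illustrative names, not the actual consensus code; the numbers (view=0, op=52) come from the example above.

```rust
// why the view check alone doesn't protect a restarted replica
// that came back with an empty journal.

struct LocalState {
    view: u64,
    op: u64, // highest op present in the journal
}

// stale-message rule: reject only messages from an *older* view.
fn rejects_as_stale(local: &LocalState, msg_view: u64) -> bool {
    msg_view < local.view
}

// how many ops the replica would have to fetch before it could
// apply a prepare at `msg_op` - this is the gap recovery must fill.
fn missing_ops(local: &LocalState, msg_op: u64) -> u64 {
    msg_op.saturating_sub(local.op + 1)
}
```

a restarted replica at view=0/op=0 does not reject prepare(view=0, op=52): the view comparison is 0 > 0 = false, and nothing else catches the 51-op hole in its journal.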
---
Problem
The Iggy simulator currently has no mechanism to kill or crash a node. All replicas are created at startup and live for the entire simulation. To test consensus correctness under node failures (leader crashes, minority/majority failures, recovery), we need a `replica_crash` / `replica_restart` operation.

The simulator currently uses a shared MemBus (`Arc<MemBus>`) as a simple FIFO queue (`VecDeque`) for all inter-node communication. There is no concept of a node being "up" or "down": `send_to_replica()` enqueues regardless, and `step()` dispatches to any replica. To implement kill/restart, we need message filtering for dead nodes.
The PacketSimulator requires `&mut self` to submit and step packets, but outbound messages originate deep inside consensus: `send_to_replica` is called on `&self` (the `MessageBus` trait takes `&self`). This creates a problem: the `Simulator` owns both the `Network` and the `Replicas`, and dispatching a message to a replica triggers outbound sends that need `&mut Network` while we're already borrowing `&self.replicas[id]`.

Proposal
MemBus as outbox-only:

MemBus becomes a per-replica outbox. Consensus still calls `send_to_replica()` / `send_to_client()`, but instead of being the delivery queue, the messages are staged. The Simulator drains each outbox after dispatching to a replica and feeds the messages into `network.submit()`. Each phase borrows either the replicas or the network, never both simultaneously. The borrow checker is satisfied without `Arc<Mutex<>>`, `RefCell`, or dynamic dispatch.

Behavior of `replica_crash`

…

Please let me know your thoughts about this proposal.
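The phased tick loop can be sketched as follows. This is a minimal sketch under assumptions: `Packet`, `Replica`, `Network`, and `Simulator` here are simplified stand-ins for the real simulator types, and `on_packet` fakes consensus traffic by staging an ack. It shows how the phases keep the borrows of `replicas` and `network` disjoint, and where the crash rules land.

```rust
use std::collections::VecDeque;

#[derive(Debug, Clone)]
struct Packet {
    to: usize,
    payload: u64,
}

// Handling a packet stages any outbound messages into the replica's
// own outbox instead of touching the network directly.
struct Replica {
    alive: bool,
    outbox: VecDeque<Packet>,
    received: Vec<u64>,
}

impl Replica {
    fn on_packet(&mut self, p: Packet) {
        self.received.push(p.payload);
        // Stand-in for consensus traffic: stage an ack to replica 0.
        self.outbox.push_back(Packet { to: 0, payload: p.payload + 100 });
    }
}

// The network owns all in-flight packets; it is borrowed mutably
// only in its own phase of the tick.
struct Network {
    in_flight: VecDeque<Packet>,
}

impl Network {
    fn submit(&mut self, p: Packet) {
        self.in_flight.push_back(p);
    }
    fn step(&mut self) -> Option<Packet> {
        self.in_flight.pop_front()
    }
}

struct Simulator {
    replicas: Vec<Replica>,
    network: Network,
}

impl Simulator {
    fn tick(&mut self) {
        // Phase 1: deliver in-flight packets. Each `step()` borrow of
        // the network ends before a replica is borrowed.
        while let Some(p) = self.network.step() {
            let r = &mut self.replicas[p.to];
            if r.alive {
                r.on_packet(p);
            } // packets TO a crashed node are dropped here
        }
        // Phase 2: drain every outbox into the network. `self.replicas`
        // and `self.network` are disjoint fields, so both borrows
        // coexist without Arc<Mutex<>> or RefCell.
        for r in &mut self.replicas {
            while let Some(out) = r.outbox.pop_front() {
                if r.alive {
                    self.network.submit(out);
                } // a crashed sender's staged messages are discarded
            }
        }
    }
}
```

Crashing a node then reduces to flipping `alive = false`: in-network packets addressed to it are dropped at delivery time, and anything still sitting in its outbox is discarded instead of submitted.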