Engineering

Designing a Deterministic Matching Engine in Rust

A matching engine is the single most performance- and correctness-critical component of any exchange. Here is how we approach determinism, sequencing, and crash recovery so an engine can be rebuilt byte-for-byte from its log.

Adrian Vance

Founder & Managing Partner

November 12, 2025 7 min read

The matching engine is the heart of an exchange, and it is the one component where "mostly correct" is not a category that exists. A fill that should not have happened, an order matched out of price-time priority, or a book that diverges between two replicas are not bugs you patch in the next sprint. They are reconciliation breaks, regulatory questions, and lost trust. Over several exchange engagements we have converged on a small set of principles that make a Rust matching core both fast and provably consistent.

Determinism is the whole game

The single most important property of a matching engine is that it is a pure function of its input. Given the same ordered sequence of commands (new order, cancel, modify) the engine must produce exactly the same sequence of fills and book states, every time, on every machine. Once you have that, almost everything else becomes tractable. You can run two engines side by side and reconcile them byte-for-byte, rebuild a crashed replica by replaying its log, and test against an oracle by feeding both the same commands.

Determinism is easy to break by accident. Iterating a HashMap whose order depends on hashing, reading the wall clock inside the matching loop, or pulling randomness for tie-breaks will all silently desynchronise replicas. We push every non-deterministic input to the edge: timestamps, sequence numbers and identifiers are stamped onto the command before it enters the log, and the core is built as if the outside world does not exist.

Sequence first, match second

We separate sequencing from matching. An append-only command log assigns every order a global, monotonic position before the matching core ever sees it. The core consumes that log in order and is, by construction, a deterministic replay of it. This buys you horizontal recovery for free: any replica can be reconstructed from a snapshot plus the tail of the log, and a new replica can be warmed up by replaying history at many times real-time speed.

Data structures that respect the cache

Price-time priority maps naturally onto a sorted structure of price levels, each holding a FIFO queue of resting orders. The temptation is to reach for a balanced tree of pointers, but pointer-chasing is where latency goes to die. We favour flat, index-based layouts: arrays of levels addressed by a price-to-index mapping, with an intrusive doubly linked list threading the orders at each level, so the hot path touches contiguous memory. The next-active-price lookup that a crossing order needs becomes a bitmap scan rather than a tree walk.

Recovery measured in seconds, not minutes

Snapshots are not optional. We checkpoint the full book state every N commands and persist it alongside the log position it corresponds to. Cold start then means: load the latest snapshot, replay the handful of commands since, and you are live. On one engagement this took crash recovery from roughly forty minutes of hand-driven log replay to about four seconds of automated snapshot-plus-tail. That difference is the difference between a non-event and a public incident.

Test like correctness is the product

We lean heavily on property-based testing and fuzzing. Randomised sequences of orders and cancels are thrown at the engine and checked against invariants that must always hold: the book is always sorted, total resting quantity only changes by exactly the matched amount, self-trades never occur when prevention is enabled, and price-time priority is never violated. A separate oracle implementation — slow but obviously correct — runs the same sequences, and any divergence is a hard failure. This is how you ship an execution core you are willing to put real volume through on day one.

None of this is exotic. It is the disciplined application of a few ideas: determinism, sequencing, cache-aware layout, snapshotting and adversarial testing, applied consistently. The payoff is an engine that is fast because it does less work on the hot path, and trustworthy because its behaviour is reproducible and checkable.

Tagged #Rust #Matching Engine #Low Latency

Adrian Vance

Founder & Managing Partner

Founder of Web3Software. Twelve years building distributed systems and capital-markets infrastructure, the last six dedicated to blockchain, on-chain settlement, and quantitative trading platforms for institutional clients.

All articles

Get the next deep-dive in your inbox.

Occasional, substantive engineering write-ups from the team. No spam, unsubscribe anytime.