Rebuild, don't restore

The seek problem

Build an interactive timeline — a stepped explanation, a simulation behind a scrubber, an undo stack with a slider — and you quickly meet a deceptively simple requirement. The reader can drag to any position, in any order, and the display must be correct every time. Not approximately correct, not correct on the way forward but stale on the way back: exactly the state that position implies, regardless of how the reader arrived there.

The obvious approach is to remember the answer. Compute the state at each position once, store it, and on a seek hand back the matching snapshot. This is fast to read and easy to reason about for a while. It is also two things at once: storage-heavy, because a snapshot is the whole world at a moment and the number of moments only grows; and, more corrosively, a fork in your sources of truth. The snapshot is a frozen copy of what the reconstruction logic produced on the day it ran. When that logic changes — a bug fixed, a rule revised — the old snapshots do not change with it. They go on encoding the old behavior, silently, until someone seeks to an early position and sees a world the current code would never produce. The drift is invisible precisely because nothing errors. The data is internally consistent; it is just consistent with code that no longer exists.
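The drift is easier to believe once you have watched it happen. The sketch below is a toy, not a real system: the log of amounts and the two rule versions (`apply_v1`, `apply_v2`) are invented for illustration. It takes snapshots under the old rule, revises the rule, and compares a stored snapshot with a fresh rebuild.

```python
from functools import reduce
from itertools import accumulate

# Hypothetical log: amounts credited to a running total.
log = [1.4, 2.6, 3.7]

def apply_v1(total, amount):
    # Old rule, later judged to be a bug: truncates each amount.
    return total + int(amount)

def apply_v2(total, amount):
    # Revised rule: rounds each amount to the nearest integer.
    return total + round(amount)

# Snapshot approach: compute every position once, with the rule of the day.
snapshots = list(accumulate(log, apply_v1, initial=0))

def rebuild(t):
    # Rebuild approach: fold the first t operations with the current rule.
    return reduce(apply_v2, log[:t], 0)

stale = snapshots[3]  # what the old code produced when the snapshot was taken
fresh = rebuild(3)    # what the current code produces from the same log

print(stale, fresh)   # the two disagree, and nothing raised an error
```

Nothing in the snapshot list is malformed, which is the trouble: the only way to notice the disagreement is to recompute, and recomputing is the thing the snapshots were meant to avoid.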

The alternative

Keep less. Instead of a snapshot per position, keep an ordered list of the operations that change state, plus one clock that says how far along the list we are. The state at time t is then not a thing you stored but a thing you compute: fold(operations up to t), the left fold of the first t operations in the list, starting from a known empty state. Seek becomes “choose a t and fold to it.” Reset is the degenerate case, “fold to 0,” which applies no operations and so is just the empty state. Undo is “fold to t minus one.” Redo is “fold further,” advancing the clock and applying the next operations.
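A minimal sketch of that fold, assuming t counts how many operations have been applied and that state is an immutable tuple. The names `state_at`, `add`, and `remove_last` are invented for the example; any pure operations would do.

```python
from functools import reduce
from typing import Callable, List, Tuple

# Hypothetical state: an immutable tuple of items. Operations are pure
# functions from one state to the next.
State = Tuple[str, ...]
Op = Callable[[State], State]
EMPTY: State = ()

def add(item: str) -> Op:
    return lambda state: state + (item,)

def remove_last() -> Op:
    return lambda state: state[:-1]

def state_at(log: List[Op], t: int) -> State:
    """Left fold of the first t operations, starting from the empty state."""
    if not 0 <= t <= len(log):
        raise ValueError(f"t must be between 0 and {len(log)}, got {t}")
    return reduce(lambda state, op: op(state), log[:t], EMPTY)

log = [add("a"), add("b"), remove_last(), add("c")]

# Seek in any order; each answer depends only on the log prefix.
for t in (4, 1, 3, 0, 2):
    print(t, state_at(log, t))
```

There is no separate code path for going backward. Seeking from 4 to 1 and seeking from 0 to 1 are the same call, which is why they cannot disagree.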

The shift is small to write and large in consequence. State stops being a value you maintain and becomes a pure function of a log prefix. Two positions that should agree now agree by construction, because they are the same function over the same data. There is exactly one source of truth — the ordered log — and the display is a view derived from it. Ordering is the whole game, and a single operation log gives it to you for free: the list is totally ordered by construction, so the question of how to order events never arises. The deeper discipline — reasoning about state by the order of events rather than by any wall clock — is old. Lamport showed in 1978 that a consistent happened-before relation across a distributed system can be built from the ordering of events alone, without trusting any physical clock (Lamport, 1978); the linear log is the trivial case of that idea, the one where the order is handed to you.

Event sourcing, writ small — and where it is not free

None of this is new; that is the point. Persisting the operations rather than the derived state, and rebuilding state by replaying them, is event sourcing, and its property that you can reconstruct any past state by folding the log from the beginning is exactly the one the seek problem needs (Fowler, Event Sourcing). The same shape appears in record-and-replay debuggers, which capture a program’s execution as an ordered log and re-execute it to reproduce the run exactly (O’Callahan et al., rr, USENIX ATC 2017). And it rests on a more general observation: that an append-only, immutable log is easier to reason about and to distribute than mutable state, because the past never changes underneath you (Helland, Immutability Changes Everything).

This is the place to be honest, because the technique has two real costs that the snapshot approach hides. The first is nondeterminism. A fold is a pure function only if the operations are pure; the moment one of them reads a wall clock, draws a random number, or pulls a byte from the network, two replays of the same log can diverge. The discipline a replay system enforces is to capture those nondeterministic inputs into the log itself, so that replay feeds back the same clock reading and the same bytes rather than fetching them anew; this is the core of what record-and-replay tools do to stay deterministic (O’Callahan et al., 2017). If you skip that capture, your fold is not a function and your seek is not reliable.

The second cost is recomputation. Folding from the beginning on every seek is fine for a short log and untenable for a long one. The fix is a periodic checkpoint, a stored state at some position, that lets a fold start partway through instead of from zero. But a checkpoint is a cache, not a source: it is a fold someone already computed, and it can be discarded and rebuilt from the log at any time. Treat it as authoritative and you have quietly reinvented the snapshot, drift and all.
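Both costs can be shown in one small sketch. Everything named here is hypothetical: the `("roll", value)` operation format, the `record_roll` helper, and the `Checkpoints` class are illustrations, not the design of any particular tool. The random draw happens once, at record time, and its result goes into the log; the checkpoints are held in a dictionary that can be emptied without losing anything.

```python
import random
from functools import reduce

def record_roll(log, rng):
    # Capture the nondeterministic input at record time. Replay reads the
    # stored value and never consults the generator again.
    log.append(("roll", rng.randint(1, 6)))

def apply(state, op):
    kind, value = op
    if kind == "roll":
        return state + value
    raise ValueError(f"unknown operation: {kind!r}")

def fold(log, t):
    """Reference answer: fold the first t operations from the empty state (0)."""
    return reduce(apply, log[:t], 0)

class Checkpoints:
    """A cache of folds already computed. Safe to discard at any time."""

    def __init__(self, every):
        self.every = every
        self.cache = {0: 0}  # position -> state; position 0 is the empty state

    def state_at(self, log, t):
        if not 0 <= t <= len(log):
            raise ValueError(f"t must be between 0 and {len(log)}, got {t}")
        # Start from the nearest checkpoint at or before t, not from zero.
        base = max(p for p in self.cache if p <= t)
        state = self.cache[base]
        for pos in range(base, t):
            state = apply(state, log[pos])
            if (pos + 1) % self.every == 0:
                self.cache[pos + 1] = state
        return state

    def invalidate(self):
        # Call whenever apply() changes; the log rebuilds everything.
        self.cache = {0: 0}

log = []
rng = random.Random()  # deliberately unseeded: the log, not the seed, is the record
for _ in range(10):
    record_roll(log, rng)

checkpoints = Checkpoints(every=4)
for t in (10, 3, 7, 0, 5):
    assert checkpoints.state_at(log, t) == fold(log, t)
```

The assertion in the loop is the contract: a checkpointed seek must return what a fold from zero would return. If a change to `apply` ever makes that false, the remedy is to call `invalidate`, never to edit the log.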

The discipline it forces

What this buys, beyond seek, is that every state-changing action becomes data. An operation is no longer a side effect that happened and vanished; it is a row you can inspect, diff against the one before it, and store. A bug is no longer a story about what the user might have done — it is a log you can replay until it recurs, which is the property a record-and-replay debugger exists to give you (O’Callahan et al., 2017). The state is auditable because the thing that produced it is sitting there in order.
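To make "operations as data" concrete, here is a minimal sketch in which each operation is a plain dictionary, so the log can be serialized, stored, reloaded, and replayed. The `set` and `delete` operation types and the `replay` helper are assumptions made for the example, not a prescribed schema.

```python
import json

def apply(state, op):
    # Return a new dict instead of mutating, so earlier states stay valid.
    if op["type"] == "set":
        return {**state, op["key"]: op["value"]}
    if op["type"] == "delete":
        return {k: v for k, v in state.items() if k != op["key"]}
    raise ValueError(f"unknown operation: {op!r}")

def replay(log):
    state = {}
    for op in log:
        state = apply(state, op)
    return state

log = [
    {"type": "set", "key": "title", "value": "Draft"},
    {"type": "set", "key": "title", "value": "Final"},
    {"type": "delete", "key": "title"},
]

# The log is ordinary data: it survives a round trip through storage.
stored = json.dumps(log)
restored = json.loads(stored)
assert replay(restored) == replay(log)

# An audit trail is the state after every prefix, each next to the row
# that produced it.
history = [replay(log[:t]) for t in range(len(log) + 1)]
for t, state in enumerate(history):
    print(t, state)
```

A bug report in this world is a file. Whoever receives it loads the rows, replays them, and steps through `history` until the state goes wrong.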

The tradeoff is plain, and worth stating without decoration. You pay in recomputation, which checkpoints bound, and in the standing discipline of capturing every source of nondeterminism into the log. In return, correctness across arbitrary seeks stops being something you maintain and becomes something you cannot help but have, and replay comes for free. For anything whose argument is fundamentally about a sequence in time, that is the better trade.

Sources