Howdy! 👾 If you’re reading this email, you signed up – possibly a while ago – to get occasional updates from Antithesis.
This is the 0th issue of the revamped Bug Bash newsletter, a collection of pieces about reliability in software and the physical world. We want this to be the best of what you’ve missed: worthwhile reads that didn’t go viral, older pieces that still resonate today, and the occasional non-software tidbit. Enjoy!
Virtual time
Author: David Jefferson
A 1985 paper introducing the idea of virtual time and making the case for rollback as a first-class reliability primitive. Deterministic simulation testing, optimistic concurrency control, event sourcing, and many more ideas descend from this intellectual lineage. “Virtual Time is to Virtual Memory as Time Warp is to paging.”
Raft consensus with a minority of nodes
Author: Rohan Padhye
A thought-provoking Raft variant from a CMU professor. Instead of majorities, this variant uses finite projective planes (the same combinatorics behind the card game Spot It!) to define valid “blocs” of nodes for consensus. Any two blocs share exactly one node, preserving Raft’s safety invariant while allowing progress when most of the cluster is down.
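To make the “any two blocs share exactly one node” property concrete, here is a minimal sketch using the smallest finite projective plane, the Fano plane, on a hypothetical 7-node cluster. The node numbering and the names `BLOCS`, `pairwise_overlap_sizes`, and `find_live_bloc` are our own illustration, not taken from the post, and this shows only the combinatorics, not the Raft variant itself.

```python
from itertools import combinations

# The 7 lines of the Fano plane (projective plane of order 2) over nodes 0..6.
# Each line is treated as one candidate "bloc"; the labeling is arbitrary
# and purely illustrative.
BLOCS = [
    frozenset(b)
    for b in (
        (0, 1, 2), (0, 3, 4), (0, 5, 6),
        (1, 3, 5), (1, 4, 6), (2, 3, 6), (2, 4, 5),
    )
]


def pairwise_overlap_sizes(blocs):
    """Return the set of intersection sizes over all pairs of distinct blocs."""
    return {len(a & b) for a, b in combinations(blocs, 2)}


def find_live_bloc(alive, blocs=BLOCS):
    """Return the first bloc whose nodes are all alive, or None if there is none."""
    alive = set(alive)
    for bloc in blocs:
        if bloc <= alive:
            return bloc
    return None


if __name__ == "__main__":
    # Every pair of blocs overlaps in exactly one node.
    print(pairwise_overlap_sizes(BLOCS))
    # With nodes 3..6 down, only 3 of 7 nodes remain (a minority),
    # yet they still form a complete bloc.
    print(find_live_bloc({0, 1, 2}))
    # A different set of 3 survivors may not contain any bloc.
    print(find_live_bloc({0, 1, 3}))
```

In this toy setup a bloc has 3 nodes, fewer than the 4 a majority of 7 would require, which is the sense in which a minority can still make progress. Note that not every set of 3 survivors contains a bloc.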
Aurora DSQL: the adjudicator
Author: Marc Bowes
An AWS engineer’s deep dive into Aurora DSQL’s Adjudicator, explaining how a serverless, distributed database handles write serialization and failure recovery. Both “future stamping” and daisy-chained two-phase commit are interesting, transferable designs worth keeping in your toolbox.
Using algebra and LLMs to verify a flight-plan bug fix in Lean
Author: James Haydon
A personal blog post that takes the 2023 UK air traffic control meltdown bug, restates it algebraically, then uses LLM coding agents to grind through the Lean proofs. The LLMs lie about whether implementations match specs, but they are great at formal verification grunt work.
Nobody ever gets credit for fixing problems that never happened
Authors: Nelson Repenning & John D. Sterman
A formal systems-dynamics model explaining why the harder you work on reliability, the less management believes reliability was ever a problem. A genuine classic that some readers may find familiar, and others may find outright traumatizing.