Deterministic simulation testing - how it works and when to use it

Deterministic simulation testing (DST) is an advanced approach to software testing that enables developers to find and reliably reproduce complex bugs in distributed systems, e.g. bugs caused by concurrency, multi-threading, and timing issues. These types of bugs are notoriously difficult to detect and fix using conventional example-based tests. Of course, DST catches simpler bugs as well!

This article explores how deterministic simulation testing works, how to implement it effectively, and which kinds of systems benefit most from it.

What is deterministic simulation testing?

Deterministic simulation testing (DST) involves placing software under test in a simulated, deterministic environment.

Simulation testing simulates some or all of a distributed system under test, rather than running the test on real hardware, networks, operating systems, etc. This allows the test harness to control phenomena on the simulated layers, such as when faults occur. Simulation testing almost always involves running tests multiple times with different seeds.

In DST, some or all layers of the testing stack are made deterministic. This means controlling the usual sources of non-determinism, such as clocks, thread interleaving, and system-provided sources of randomness (among others). As a result, a bug found in one run can be reproduced exactly, which makes debugging much easier.
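As a rough illustration of what "made deterministic" means, the sketch below derives both simulated time and random choices from a single seed. The function name `simulated_run` and the event names are hypothetical and chosen only for this example; real DST frameworks control many more sources of non-determinism.

```python
import random

def simulated_run(seed, steps=5):
    """Produce a trace of (virtual time in ms, event) pairs from one seed."""
    rng = random.Random(seed)   # every random choice derives from the seed
    now_ms = 0                  # virtual clock: never reads the wall clock
    events = []
    for _ in range(steps):
        now_ms += rng.randint(1, 100)   # simulated delay instead of sleeping
        events.append((now_ms, rng.choice(["read", "write", "crash"])))
    return events

# Same seed, same trace; a different seed usually gives a different trace.
assert simulated_run(42) == simulated_run(42)
```

Because nothing in the run depends on real time or on the operating system's randomness, a trace that exposes a bug can be regenerated at will from its seed.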

DST is often paired with property-based testing/fuzzing and fault injection.

Practical adoption of this approach was pioneered at FoundationDB and Amazon Web Services around 2010. This seems to have been a case of simultaneous invention, or rather simultaneous implementation, since the idea itself predates both. One of the earliest recorded discussions of how to implement DST is Will Wilson’s talk at the Strange Loop conference in 2014.

FoundationDB built a simulation-first testing framework to validate the correctness of their distributed database – the first to be consistent, highly-available, and partition-tolerant. That framework became the backbone of the technology at Antithesis.

At Amazon Web Services, Al Vermeulen introduced the approach to test early implementations of AWS’ internal lock service.

How does deterministic simulation testing work?

Deterministic simulation testing relies on:

  • Running the system in an entirely virtual environment to allow deterministic execution and replay for debugging purposes.
  • Carefully feeding the system entropy such that the system and workload can still appear to have random behavior, while at the same time being perfectly reproducible…
  • Exploring the state space of the system – a wide range of inputs and system faults should be simulated to ensure a vast number of possible states are encountered.
  • Checking system behaviors and invariants – to determine if the system behaves as expected.
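The four ingredients above can be sketched in a few lines. This is a toy model under stated assumptions, not how any particular DST framework is built: the "system" is a shared counter updated by deliberately buggy clients, the "virtual environment" is a scheduler that picks the next client step from a seeded RNG, and the invariant is that no increment is lost. All names (`run_simulation`, `find_failing_seed`) are hypothetical.

```python
import random

def run_simulation(seed, n_clients=3, increments=4):
    """Run one simulated execution; return (final counter, expected value)."""
    rng = random.Random(seed)          # the only source of entropy
    counter = {"value": 0}             # shared state of the toy system

    def client():
        # Deliberately buggy read-modify-write with a scheduling point
        # between the read and the write.
        for _ in range(increments):
            seen = counter["value"]
            yield                      # another client may run here
            counter["value"] = seen + 1
            yield

    runnable = [client() for _ in range(n_clients)]
    while runnable:
        task = rng.choice(runnable)    # the simulated scheduler picks a client
        try:
            next(task)
        except StopIteration:
            runnable.remove(task)
    return counter["value"], n_clients * increments

def find_failing_seed(seeds):
    """Explore many seeds and return the first one that violates the invariant."""
    for seed in seeds:
        actual, expected = run_simulation(seed)
        if actual != expected:         # invariant: no increment is lost
            return seed
    return None

seed = find_failing_seed(range(100))
# Replaying the failing seed reproduces exactly the same lost updates.
assert run_simulation(seed) == run_simulation(seed)
```

Each seed corresponds to one interleaving of the clients, so sweeping seeds explores the state space, and the failing seed is all that is needed to replay the bug.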

How do I implement deterministic simulation testing?

One approach, popularized by FoundationDB, is to design the system under test so that all nondeterministic components are pluggable.

Since this requires the system and all its dependencies to be built with deterministic simulation testing in mind, this approach is generally impractical for systems already in production.
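To make the pluggable approach concrete, here is a minimal sketch in which the clock is an injected dependency. The `Clock`, `RealClock`, `SimClock`, and `Lease` names are assumptions made for this example and are not taken from FoundationDB or any other codebase.

```python
import time
from typing import Protocol

class Clock(Protocol):
    """Interface the system depends on instead of calling time functions directly."""
    def now(self) -> float: ...

class RealClock:
    """Production implementation backed by the operating system."""
    def now(self) -> float:
        return time.monotonic()

class SimClock:
    """Simulation implementation: time moves only when the test advances it."""
    def __init__(self) -> None:
        self._now = 0.0
    def now(self) -> float:
        return self._now
    def advance(self, seconds: float) -> None:
        self._now += seconds

class Lease:
    """Toy component under test: valid until `ttl` seconds after creation."""
    def __init__(self, clock: Clock, ttl: float) -> None:
        self._clock = clock
        self._expires_at = clock.now() + ttl
    def is_valid(self) -> bool:
        return self._clock.now() < self._expires_at

# In a test, expiry is exercised instantly and deterministically.
clock = SimClock()
lease = Lease(clock, ttl=10.0)
clock.advance(9.0)
assert lease.is_valid()
clock.advance(1.0)
assert not lease.is_valid()
```

The same pattern would have to be applied to every other nondeterministic dependency (network, disk, randomness, scheduling), which is why retrofitting it onto an existing system is usually expensive.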

Another approach is to run regular non-deterministic software inside a deterministic hypervisor, using a system like Antithesis.

While building a fully deterministic system is often viewed as the main technical challenge, achieving thorough and efficient exploration of the state space – which in most software systems is extremely large – is a complex undertaking as well. This is why DST is often paired with property-based testing / fuzzing and fault injection.
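The pairing with fault injection and property checking might look like the following sketch. It assumes a hypothetical simulated network that drops or duplicates messages according to a seeded RNG, a sender that retries until something arrives, and the property "every message is applied exactly once". The drop and duplication probabilities are arbitrary values picked for illustration.

```python
import random

class FaultyNetwork:
    """Hypothetical simulated network with seeded drop/duplicate faults."""
    def __init__(self, rng, drop_p=0.3, dup_p=0.2):
        self.rng, self.drop_p, self.dup_p = rng, drop_p, dup_p

    def deliver(self, msg):
        """Return the copies of msg that actually arrive (0, 1, or 2)."""
        if self.rng.random() < self.drop_p:
            return []
        if self.rng.random() < self.dup_p:
            return [msg, msg]
        return [msg]

class Receiver:
    def __init__(self, dedupe):
        self.dedupe = dedupe
        self.seen = set()
        self.total = 0

    def handle(self, msg_id, amount):
        if self.dedupe and msg_id in self.seen:
            return                       # ignore duplicate deliveries
        self.seen.add(msg_id)
        self.total += amount

def simulate(seed, dedupe, n_msgs=20, max_retries=50):
    """Send n_msgs messages of amount 1 through the faulty network."""
    rng = random.Random(seed)
    net = FaultyNetwork(rng)
    rx = Receiver(dedupe)
    for msg_id in range(n_msgs):
        for _ in range(max_retries):     # sender retries until a copy arrives
            copies = net.deliver((msg_id, 1))
            for mid, amt in copies:
                rx.handle(mid, amt)
            if copies:
                break
    return rx.total

def failing_seeds(dedupe, seeds=range(100)):
    """Seeds for which the exactly-once property (total == 20) is violated."""
    return [s for s in seeds if simulate(s, dedupe) != 20]
```

In this toy model, calling `failing_seeds(False)` surfaces seeds where a duplicated message is applied twice, while the deduplicating receiver is expected to satisfy the property across the sweep. Any reported seed can be passed back to `simulate` to replay the exact sequence of faults.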

What are the strengths and limitations of deterministic simulation testing?

Deterministic simulation testing saves developer time and increases engineering productivity, because:

  • Bugs found via DST are a lot easier to debug, as execution can be rolled back and inspected at multiple points in time. Compare this to the same test running outside a DST environment: we may see an error, have no information on how to reproduce it, and never see it again.
  • DST prevents production outages, war rooms, and emergency triage. Compare the above scenario – being able to fix a rare bug – to one where the bug is observed but allowed to enter production because it’s impossible to reproduce. A stitch in time saves nine.
  • Even if DST is not paired with a full property-based testing approach, DST can find bugs that software developers don’t anticipate. An example-based test in a normal testing environment executes a single code path, but that same test, running in a simulation environment with different seeds, may explore different paths.

However, DST can be challenging to implement.

  • Setting up a deterministic simulation environment is a complex, resource-intensive undertaking.
  • Not every system can be designed in a way that enables DST to be built around it.
  • DST platforms like Antithesis enable most types of software to be tested using DST, but still require external dependencies to be mocked or otherwise plugged to ensure determinism.

What kinds of systems benefit most from DST?

DST is applicable to any kind of software, but particularly excels at testing complex distributed systems where concurrency, state, and coordination matter. These include:

  • Distributed databases (e.g. FoundationDB, MongoDB, and TigerBeetle)
  • Financial transaction engines (check out our case study with Formance)
  • Distributed systems infrastructure (e.g. WarpStream, Resonate, and RisingWave)
  • Blockchains and consensus protocols (like Sui, developed by Mysten Labs)
  • Microservice applications
  • Asynchronous workflows
  • Any complex business system built on distributed infrastructure

Bugs in such systems tend to be difficult to detect and replicate with manually written, example-based tests.

Conclusion

Deterministic simulation testing isn’t about writing more tests. In fact, a few simple tests run under simulation may end up covering more execution paths than many narrow traditional tests. It’s about building a testing system that makes even the rarest bugs observable, reproducible, and fixable.

While implementing this approach can be challenging, the return on investment is significant in terms of uptime, peace of mind, and engineering productivity.


Want to see what deterministic simulation testing could do for your stack? You can try Antithesis today.
