The Antithesis multiverse

Antithesis simulates your software in order to explore its behavior. During the simulation, Antithesis interacts with your system by injecting faults and manipulating random choices. For example, we may decide at any point to bisect the network or to cause network latency. Each such interaction kicks off a new timeline, creating a tree of branching timelines.

The image above is a very simple tree. In actuality, Antithesis frequently returns to interesting decision points to make new choices, and explores multiple timelines in parallel. This effectively allows us to reach a fork in the road and take both paths.

Core concepts

The resulting tree of timelines is the basis by which we interact with test results. To talk about this tree, we use four terms:

Event: An event is an occurrence within the simulation.
Common examples include log messages generated by your system or by SDK assertions.
Every event occurs at a unique moment.
Moment: A moment is an immutable instant in the multiverse. Moments are specified by two parameters: one of them picks out a particular timeline, and the other indicates the time on that timeline.
Moments are very granular in their segmentation of time. We believe they are more granular than users will ever need, but strictly speaking they’re not “dense” in the mathematical sense.
Every instant in a timeline is represented by a moment, including locations in the multiverse where no event occurred.
Branch: A branch is a mutable segment of a timeline. A branch starts at a moment and has a mutable pointer to an “end” moment.
The “end” pointer can be moved forward by campaigns, growing a single timeline.
Campaign: A campaign extends a branch by executing actions in the simulation.
These include things like running a bash command, playing the simulation forward a few seconds, or using other Antithesis built-in commands.
Campaigns are the exciting part. By running a campaign with an associated branch, you create new timelines in the multiverse.
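To see how these four terms fit together, here is a minimal toy model in plain JavaScript. It is only a sketch: the Moment and Branch classes and the runCampaign helper are hypothetical stand-ins written for this illustration, not the Antithesis API, and the toy picks out a timeline by following parent pointers rather than by an explicit timeline parameter.

```javascript
// Toy model only: these classes are hypothetical and are NOT the Antithesis API.
// They illustrate how immutable moments, mutable branches, and campaigns relate.

class Moment {
  constructor(parent, time, event) {
    this.parent = parent;   // previous moment on this timeline (null at the root)
    this.time = time;       // seconds since simulation start
    this.event = event;     // what happened here, or null if nothing did
    Object.freeze(this);    // moments never change once created
  }
  branch() {
    return new Branch(this);
  }
}

class Branch {
  constructor(start) {
    this.start = start;     // fixed starting moment
    this.end = start;       // mutable pointer that campaigns move forward
  }
  branch() {
    return this.end.branch(); // fork from wherever this branch currently ends
  }
}

// A toy "campaign": extend a branch by appending a new moment after its end.
function runCampaign(branch, duration, event) {
  branch.end = new Moment(branch.end, branch.end.time + duration, event);
  return branch.end;
}

const root = new Moment(null, 0, "simulation start");
const original = root.branch();
runCampaign(original, 5, "service ready");

const fork = original.branch();          // starts at original.end
runCampaign(fork, 1, "ran netstat");     // only the fork grows

console.log(original.end.time);                // 5
console.log(fork.end.time);                    // 6
console.log(fork.end.parent === original.end); // true
```

The design point the sketch tries to capture is that moments are frozen values while a branch is just a movable pointer, so extending the fork leaves the original branch's end where it was.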

An example

Let’s make this concrete: You’ve requested an interactive debugging session for a particular bug moment. Now you’d like to see the network traffic two seconds before the bug moment. To do this you take the following steps in the Antithesis Notebook:

[Illustration: the steps of this example, with the new branch shown in green]

You’re presented with the bug moment when the Notebook opens.
You reference the moment two seconds back in its history using Moment.rewind and spawn a new branch (represented in green).
You issue a bash campaign to run the bash command netstat on this branch, creating a new timeline.

To do this in Antithesis you’d write something like this in the Notebook you received via our webhook workflow:

debug_moment = moment.rewind(Time.seconds(2))
debug_branch = debug_moment.branch()
bash`netstat`.run({branch: debug_branch, container: container_name})

See Frequently used functions below for more detail on these functions.

Notice a few things about this example:

First, this is awesome. We’re able to pull arbitrary information at arbitrary points in the history, or future, of a bug moment, all without disturbing our ability to gather more information. This is just scratching the surface; see our cookbook for more advanced use cases.

Second, we’re creating new timelines in the multiverse. This is powerful, but comes with a catch.

The catch is that running any campaign spawns a new timeline that will immediately begin to diverge from the original one. Since this new timeline starts before your original bug moment, that moment will not, strictly speaking, occur in the new timeline. You may still see the same or very similar events in the new timeline, but this will depend on how quickly the timelines diverge and how susceptible the events of interest are to the butterfly effect.

But if you expect that the bug is not vulnerable to small changes in initial conditions, then you can ask powerful hypotheticals like ‘if I kill this node, does the bug still happen?’
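The divergence described above can be illustrated with a toy deterministic simulation. This is a sketch under stated assumptions: the step function is a small linear congruential generator chosen for the example, and "running a command" is modeled as nudging the state by one. Neither reflects how Antithesis works internally.

```javascript
// Toy deterministic simulation: hypothetical, not Antithesis internals.
// Each state maps to exactly one next state, and distinct states never collide.
const step = (state) => (25173 * state + 13849) % 65536;

// Play a timeline forward for `ticks` steps, returning every state visited.
function play(state, ticks) {
  const trace = [];
  for (let i = 0; i < ticks; i++) {
    state = step(state);
    trace.push(state);
  }
  return trace;
}

const earlier = 42;                        // toy state at the rewound moment
const originalTimeline = play(earlier, 5); // the timeline that led to the "bug"

// Replaying without touching anything reproduces the same timeline exactly.
const replay = play(earlier, 5);

// Modeling a campaign as a tiny perturbation: the new timeline diverges.
const perturbed = (earlier + 1) % 65536;
const newTimeline = play(perturbed, 5);

const sameAsReplay = originalTimeline.every((s, i) => s === replay[i]);
const sharedStates = originalTimeline.filter((s, i) => s === newTimeline[i]);

console.log(sameAsReplay);        // true
console.log(sharedStates.length); // 0
```

In this toy, no exact state is ever shared between the two timelines after the perturbation. Whether the coarse behavior you care about (such as the bug itself) survives is a separate question, which is the hypothetical the paragraph above invites you to ask.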

Frequently used functions

The following gives you a working knowledge of the basic functions for moving around the multiverse. See our guide to The Environment and its utilities for documentation of capabilities like starting and stopping containers, running backgrounded bash commands, extracting files, and more.

For powerful compositions of these tools check out our cookbook.

Go back in time

An instance method on moment. It accepts a TimeInterval (see help(Time)) and returns the moment corresponding to a specified duration before the original moment.

// Two seconds before the bug moment
new_moment = bug_moment.rewind(Time.seconds(2))

For example, you may want to reference the point in the tree a couple of seconds before your system crashed.

Go back to a specific time

An instance method on moment. It returns the moment corresponding to a specified time along the path to the original moment.

// Twenty seconds since simulation start
new_moment = bug_moment.rewind_to(20)

This may be more ergonomic than specifying a duration to walk backwards, especially if you want to reference many moments.
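The relationship between the relative and absolute forms can be sketched with a toy moment. Everything here is hypothetical: makeMoment, the camelCase method names, the plain-number times, and the range check are assumptions made for the illustration, not the behavior of the real Moment type.

```javascript
// Hypothetical stand-in for a moment: a timeline id plus a time in seconds.
// This is NOT the Antithesis Moment type, only an illustration of the arithmetic.
function makeMoment(timeline, time) {
  return Object.freeze({
    timeline,
    time,
    // Absolute: jump to a given time on the path leading to this moment.
    rewindTo(target) {
      if (target < 0 || target > time) {
        throw new RangeError("target must lie between 0 and this moment's time");
      }
      return makeMoment(timeline, target);
    },
    // Relative: step back by a duration from this moment.
    rewind(duration) {
      return this.rewindTo(time - duration);
    },
  });
}

const bugMoment = makeMoment("t1", 23.5);
console.log(bugMoment.rewind(2).time);     // 21.5
console.log(bugMoment.rewindTo(20).time);  // 20

// Absolute times make it easy to sample many moments along the same path.
const samples = [5, 10, 15, 20].map((t) => bugMoment.rewindTo(t));
console.log(samples.map((m) => m.time));   // [ 5, 10, 15, 20 ]
```

With the relative form you would have to compute each offset from the bug moment yourself, which is why the absolute form tends to read better when you sample several points.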

The following actions extend a branch. They allow you to ask powerful hypothetical questions, but come with a catch: they create new timelines whose futures are independent of other branches.

Run a bash command

An instance method on bash strings. It uses the provided branch to run the bash command inside the provided container and advances the branch until the command exits. The new end moment of the branch will be the moment at which the command exited.

bash`cat logfile.txt`.run({branch: future_branch, container: client})

For example, you may want to see a logfile from a branch positioned just after a bug occurs.

For more information on running bash commands and enumerating containers, or this function’s return value, see The Environment and its utilities or the built-in help().

Run a bash command in background

An instance method on bash strings. It spawns a new process on the provided branch that runs the bash command in the background in the provided container. The new end moment of the branch will be the moment at which the command is delivered (it may or may not have started running).

bash`sleep 10`.run_in_background({branch, container: client})

For more information on running backgrounded bash commands, communicating with the new process, or waiting for it to exit, see The Environment and its utilities or the built-in help().

Wait a duration of time

An instance method on branch. It runs a campaign that extends the branch, running the simulation for some duration. It modifies the passed branch’s end moment by reference. It returns undefined.

branch.wait(Time.seconds(5))

This is especially useful for creating moments or branches in the future of some moment.

Wait until a specific event

An instance method on branch. It runs a campaign that extends the branch. It modifies the passed branch’s end moment by reference. It runs the simulation until some event occurs or until the optional timeout is reached.

It returns the first matching event or, in the case of a timeout, it returns undefined.

// bash_process.stderr in this example is an *EventSet* representing all messages coming from a bash command's stderr
branch.wait_until(bash_process.stderr) 
branch.wait_until({timeout: Time.seconds(10), until: bash_process.exits})

To do more powerful things with event sets, like waiting until your system produces a FATAL log, see our documentation on querying with event sets.
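The "first matching event, or undefined on timeout" contract can be sketched in plain JavaScript. This toy is an assumption-laden illustration: the waitUntil function, the shape of the branch object, and the choice to leave the branch at the deadline after a timeout are all hypothetical, and unlike the real call the toy requires a timeout.

```javascript
// Hypothetical sketch of "wait until an event or a timeout"; not the Antithesis implementation.
// `branch.pending` lists the toy branch's future events, sorted by time (in seconds).
function waitUntil(branch, { until, timeout }) {
  const deadline = branch.end + timeout;
  for (const event of branch.pending) {
    if (event.time <= branch.end) continue;   // already in the branch's past
    if (event.time > deadline) break;         // would exceed the timeout
    if (until(event)) {
      branch.end = event.time;                // branch now ends at the match
      return event;
    }
  }
  branch.end = deadline;                      // timed out: time still advanced
  return undefined;
}

const branch = {
  end: 0,
  pending: [
    { time: 1, text: "[INFO] started" },
    { time: 4, text: "[FATAL] out of memory" },
    { time: 9, text: "[INFO] restarted" },
  ],
};
const isFatal = (e) => e.text.includes("[FATAL]");

const hit = waitUntil(branch, { until: isFatal, timeout: 10 });
console.log(hit.time, branch.end);   // 4 4

const miss = waitUntil(branch, { until: isFatal, timeout: 3 });
console.log(miss, branch.end);       // undefined 7
```

Note that both calls move the branch's end forward, matching the description above that the branch is modified by reference whether or not a match is found.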

Conversion functions

These functions are basic utilities for setting up queries or campaigns. They do not themselves alter the tree of timelines.

Moment to branch

An instance method on moment. Creates a new branch beginning at a moment, which will have its own independent future. You might want to create a new branch in order to observe the state of your software without disturbing the original timeline of execution.

interactive_branch = bug_moment.branch()

For example, you could create a new branch to get a core dump without affecting your original bug reproduction.

Convert a branch to a moment

A property on branch. It represents the location in the tree that a new campaign on this branch would start from.

final_moment = branch.end

New branch from an existing one

An instance method on branch. Creates a branch off of the end moment of another branch.

new_branch = branch.branch() // Equivalent to branch.end.branch()

For example, you could fork a branch in order to try running a command without perturbing the original branch.

Querying the tree

To analyze the behavior of your system, you’ll need to see events coming out of the simulation.

Events are things like log files written to our output sink, messages to stdout or stderr, and even system-level events like journald’s output. Available system events are documented in The Environment and its utilities.

If you have a branch with a bug, you can see all the events leading up to it by writing:

// environment.events is our primary entrypoint into viewing events
// it includes stdout and stderr, plus several other event sources 
print(environment.events.up_to(bug_branch))
// or to grab only part of a branch's history
print(environment.events.up_to({begin: start_moment, end: bug_branch}))
// or to grab only particular kinds of events
print(environment.events.filter(event => event.output_text.includes("[WARN]")).up_to(bug_branch))

When up_to is run on an EventSet it evaluates the set on a particular branch and produces an EventSequence. You can think of EventSets as abstract queries and EventSequences as their concrete results.
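The query-versus-result distinction can be illustrated with a small toy. The ToyEventSet class and its upTo method are hypothetical names invented for this sketch, and plain arrays stand in for branch histories; the real EventSet and EventSequence types are documented in help(EventSet).

```javascript
// Hypothetical model of "abstract query vs concrete result"; not the Antithesis classes.
class ToyEventSet {
  constructor(predicates = []) {
    this.predicates = predicates;         // filters to apply later
  }
  filter(predicate) {
    // Returns a new query; nothing is evaluated yet.
    return new ToyEventSet([...this.predicates, predicate]);
  }
  upTo(history) {
    // Evaluate the query against one concrete history of events.
    return history.filter((e) => this.predicates.every((p) => p(e)));
  }
}

// One abstract query...
const warnings = new ToyEventSet().filter((e) => e.output_text.includes("[WARN]"));

// ...evaluated against two different toy histories.
const branchA = [
  { output_text: "[INFO] boot" },
  { output_text: "[WARN] slow disk" },
];
const branchB = [
  { output_text: "[WARN] retrying" },
  { output_text: "[WARN] retrying" },
  { output_text: "[ERROR] gave up" },
];

console.log(warnings.upTo(branchA).length); // 1
console.log(warnings.upTo(branchB).length); // 2
```

The same query object yields different concrete sequences depending on which history it is evaluated against, which is the sense in which a set is a reusable query and a sequence is its result.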

Composing these gets pretty powerful. To learn more, check out our documentation on querying with event sets, or see our in-product help by running help(EventSet).

Printing an EventSequence results in a log viewer UI, documented here.

We provide sugar for producing EventSequences where possible. For example, print(bash`echo hello`.run(...)) actually uses up_to under the hood to perform a query that selects precisely the output of your command.