Overview

Antithesis runs your system inside a deterministic hypervisor and continuously disrupts the environment in which it’s running. We call those disruptions faults: a process gets killed, the network between two services stops carrying packets, the clock jumps forward by thirty seconds.

Types of faults

Type      Examples
Network   Baseline latency, partitions, clogs, restore
Node      Node hang, node kill / stop, throttling
Clock     Forward/backward clock jumps
Other     Thread pausing, CPU modulation, custom faults

Understanding what faults occurred

Information about faults appears in three places after a run completes:

  • In the Triage report, as fault-injection events, next to your application’s logs and any assertion outcomes.
  • In the Logs Explorer, filterable by the fault injector category.
  • In the API responses that return logs, as events whose source.name is fault_injector.

Fault events in logs explains the logging format for faults.
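
As a rough illustration of the last bullet, the sketch below picks fault events out of a list of log events that has already been parsed from JSON. Only the source.name value fault_injector comes from this page; the message field and the overall shape of the sample payload are assumptions made for the example, so check the logging format reference before relying on them.

```python
import json


def fault_events(events):
    """Return only the events whose source.name is "fault_injector"."""
    return [
        e for e in events
        if isinstance(e.get("source"), dict)
        and e["source"].get("name") == "fault_injector"
    ]


# Hypothetical sample payload: every field other than source.name is an
# assumption and may not match the real log format.
sample = json.loads("""
[
  {"source": {"name": "fault_injector"}, "message": "partition started"},
  {"source": {"name": "app"}, "message": "request served"},
  {"source": {"name": "fault_injector"}, "message": "partition restored"}
]
""")

for event in fault_events(sample):
    print(event["message"])
```

Events without a source object are ignored rather than raising, which keeps the filter usable on mixed log streams.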

Standard fault settings

Antithesis’ basic_test runs with all network faults enabled. Thread pausing can be enabled by instrumenting your code. To enable node faults, clock jitter, or custom faults, talk to your forward-deployed engineer.

Pausing faults

By default, faults are injected throughout the test, interleaved randomly with your workload.

There are two ways to pause fault injection:

  1. Your workload can request a temporary quiet period via the ANTITHESIS_STOP_FAULTS API. Antithesis then pauses fault injection and restores any killed containers or pods. No faults are injected for the requested duration, and faults resume once it has elapsed.
  2. The test commands eventually_ and finally_ create a terminal pause at the end of an execution, giving the system under test time to recover before final validation checks.

Pausing faults has more details.
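
The first option might look something like the sketch below from inside a Python workload. It assumes that ANTITHESIS_STOP_FAULTS is exposed as an environment variable holding the path to an executable that takes the quiet-period length in seconds as its only argument. This page does not confirm that calling convention, and the helper names stop_faults_command and request_quiet_period are hypothetical.

```python
import os
import subprocess


def stop_faults_command(seconds, env=None):
    """Build the command for a quiet period, or None if it is unavailable.

    Assumption: ANTITHESIS_STOP_FAULTS names an executable that accepts a
    duration in whole seconds. Verify this against the Pausing faults page.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    env = os.environ if env is None else env
    path = env.get("ANTITHESIS_STOP_FAULTS")
    if not path:
        return None  # not running under Antithesis, or the API is not exposed
    return [path, str(int(seconds))]


def request_quiet_period(seconds, env=None):
    """Ask for a fault-free window; return True if a request was issued."""
    cmd = stop_faults_command(seconds, env)
    if cmd is None:
        return False
    subprocess.run(cmd, check=True)
    return True
```

Returning False when the variable is missing lets the same workload run unchanged outside the test environment, for example `request_quiet_period(30)` on a developer machine is simply a no-op.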

Overlapping faults

Faults are scheduled independently and may overlap in time. The overlap behavior differs by fault type:

  • Network faults can overlap on the same target. When two network faults affect the same link at the same time, the more aggressive one wins for the duration of the overlap. For example, a fault that drops packets on a link supersedes a slowdown on the same link. Once the more aggressive fault ends, the other resumes if its window has not yet expired. Overlapping network fault events are emitted independently in the log; Antithesis does not collapse them into a single combined event.
  • Node faults do not overlap on the same target. Only one node fault can be active on a given container at a time. If subsequent node faults are scheduled against the same target, they’re skipped while one is in progress. The skipped fault isn’t logged.
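
The two rules above can be modeled with a small simulation. This is only a toy model for reasoning about log output, not how Antithesis schedules faults: the Fault class, the numeric severity ranking, and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fault:
    kind: str          # hypothetical label, e.g. "drop" or "slow"
    start: float       # window start (inclusive)
    end: float         # window end (exclusive)
    severity: int = 0  # hypothetical ranking: higher means more aggressive


def effective_network_fault(faults, t):
    """Return the most aggressive network fault active on a link at time t."""
    active = [f for f in faults if f.start <= t < f.end]
    return max(active, key=lambda f: f.severity, default=None)


def schedule_node_faults(requested):
    """Keep node faults that start while no earlier accepted fault is running."""
    accepted = []
    busy_until = float("-inf")
    for f in sorted(requested, key=lambda f: f.start):
        if f.start < busy_until:
            continue  # skipped: another node fault is in progress on this target
        accepted.append(f)
        busy_until = f.end
    return accepted


# Network: the packet drop supersedes the slowdown, which then resumes.
slow = Fault("slow", 0, 10, severity=1)
drop = Fault("drop", 3, 6, severity=2)
for t in (1, 4, 7):
    print(t, effective_network_fault([slow, drop], t).kind)

# Node: the hang is skipped because the kill is still in progress.
kill = Fault("kill", 0, 5)
hang = Fault("hang", 2, 4)
throttle = Fault("throttle", 5, 8)
print([f.kind for f in schedule_node_faults([kill, hang, throttle])])
```

In this model both network faults would still appear in the log as separate events, while the skipped node fault would leave no trace, which matches the logging behavior described above.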