> ## Fault events in logs and reports


---

Most faults Antithesis injects are recorded in the test run's logs. This page describes the shape of those events, what each field means, and how to query them.

## Accessing fault logs

- In the [triage report](/docs/product/reports/), fault events appear in the log viewer panel under "Fault injection events", next to your application's events and any assertion outcomes.
- In the [Logs Explorer](/docs/product/logs_explorer/), fault events are surfaced by the `fault injector` category (also reachable by filtering `general.source = fault_injector`). The temporal queries "preceded by" and "followed by" let you correlate failures with fault events.
- In API responses that return logs, events whose `source.name` is `fault_injector` and which carry a `fault` field are fault events.
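
The last rule can be sketched in Python. This is a minimal illustration that assumes the API response has already been parsed into a list of dictionaries; the helper names `is_fault_event` and `fault_events` are hypothetical and not part of any Antithesis client.

```python
from typing import Any, Iterable, Iterator


def is_fault_event(event: dict[str, Any]) -> bool:
    """Return True if the event comes from the fault injector and carries a `fault` field."""
    source = event.get("source")
    if not isinstance(source, dict):
        return False
    return source.get("name") == "fault_injector" and "fault" in event


def fault_events(events: Iterable[dict[str, Any]]) -> Iterator[dict[str, Any]]:
    """Yield only the fault events from a stream of parsed log events."""
    return (event for event in events if is_fault_event(event))
```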

## Fault event schema

Every fault event in the log has at least `source`, `moment`, and `fault` fields:

```json
{
  "source": {
    "name": "fault_injector"
  },
  "moment": {
    "_vtime_ticks": 51539607552,
    "input_hash": "...",
    "session_id": "..."
  },
  "fault": {
    "name": "partition",
    "type": "network",
    "affected_nodes": ["ALL"],
    "max_duration": 10,
    "details": {
      "disruption_type": "Stopped",
      "asymmetric": false,
      "partitions": [["A", "B"], ["C"]]
    }
  }
}
```

## vtime

Events in the logs are globally ordered by a simulated deterministic virtual time, called `vtime`.

`vtime` is expressed in two ways: `vtime_ticks`, a 64-bit integer, and `vtime_seconds`, a floating-point number. In a log event, `moment._vtime_ticks` holds the integer form. Use this virtual time as the source of truth for ordering. Application-emitted timestamps can be out of order under clock jitter or thread pausing.

To convert ticks to seconds:

```
vtime_seconds = vtime_ticks / 4294967296
```
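
The divisor is 2^32, so the conversion is a single division. A small Python sketch follows; the function names are illustrative, and the sort helper assumes every event carries `moment._vtime_ticks` as in the schema example.

```python
TICKS_PER_SECOND = 1 << 32  # 4294967296


def ticks_to_seconds(vtime_ticks: int) -> float:
    """Convert integer vtime ticks to floating-point vtime seconds."""
    return vtime_ticks / TICKS_PER_SECOND


def sort_by_vtime(events: list[dict]) -> list[dict]:
    """Order events by virtual time rather than by application timestamps."""
    return sorted(events, key=lambda event: event["moment"]["_vtime_ticks"])


# The tick value from the schema example corresponds to 12 seconds of vtime.
print(ticks_to_seconds(51539607552))  # 12.0
```

Sorting on the integer ticks rather than the converted seconds avoids any floating-point rounding when two events are very close together.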

## The `fault` object

The `fault` object contains all information about the injected fault, its duration, and the nodes it affects.

| Field | Type | Description |
|---|---|---|
| `name` | string | One of `partition`, `clog`, `restore`, `kill`, `stop`, `pause`, `throttle`, `skip`. |
| `type` | string | One of `network`, `node`, `clock`. |
| `affected_nodes` | array of string | Nodes targeted by the fault, or `["ALL"]`. If the array is empty, the fault has no effect. |
| `max_duration` | number | Number of seconds the fault remains active. Omitted on some events, such as `restore`. |
| `details` | object | Fault-specific payload, such as `disruption_type`, `partitions`, `asymmetric`, or `offset`. |

## Fault names in logs

The fault types listed [here](/docs/concepts/fault_injection/fault_types/) have slightly different names in the logs, which provide fine-grained information about each fault event.

| Fault                 | `fault.name` | `fault.type` |
|-----------------------|--------------|--------------|
| Network partition     | `partition`  | `network`    |
| Network clog          | `clog`       | `network`    |
| Network restore       | `restore`    | `network`    |
| Node hang             | `pause`      | `node`       |
| Node throttling       | `throttle`   | `node`       |
| Node termination      | `kill`, `stop` | `node`     |
| Clock jitter          | `skip`       | `clock`      |

Thread pausing and CPU modulation happen very frequently during a test run. To prevent excessive logging, they do not produce fault events in the log.
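
The name and type pairs in the table above can be encoded as a lookup, which is handy for labeling events or catching malformed input. This is a sketch only: the dictionaries copy the table, and `describe_fault` is a hypothetical helper.

```python
# Mapping copied from the table above: fault.name -> fault.type.
FAULT_TYPE_BY_NAME = {
    "partition": "network",
    "clog": "network",
    "restore": "network",
    "pause": "node",
    "throttle": "node",
    "kill": "node",
    "stop": "node",
    "skip": "clock",
}

# Mapping copied from the table above: fault.name -> documented fault.
FAULT_LABEL_BY_NAME = {
    "partition": "Network partition",
    "clog": "Network clog",
    "restore": "Network restore",
    "pause": "Node hang",
    "throttle": "Node throttling",
    "kill": "Node termination",
    "stop": "Node termination",
    "skip": "Clock jitter",
}


def describe_fault(fault: dict) -> str:
    """Return the documented fault label, checking that name and type agree."""
    name = fault.get("name")
    expected_type = FAULT_TYPE_BY_NAME.get(name)
    if expected_type is None:
        raise ValueError(f"unknown fault name: {name!r}")
    if fault.get("type") != expected_type:
        raise ValueError(f"fault {name!r} should have type {expected_type!r}")
    return FAULT_LABEL_BY_NAME[name]
```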

## Examples by fault type

### Network partition

```json
{
  "fault": {
    "affected_nodes": [
      "ALL"
    ],
    "details": {
      "asymmetric": true,
      "disruption_type": "Slowed",
      "drop_rate": 0,
      "latency": {
        "deviation": 1597.9999999999998,
        "mean": 1492.601977
      },
      "partitions": [["client-1", "client-2"], ["server", "client-3"]]
    },
    "max_duration": 0.183884736,
    "name": "partition",
    "type": "network"
  }
}
```

Nodes are split into groups by `details.partitions`. Network links between different groups experience the `details.disruption_type`. Network links within the same group are not affected by this event (though they may be affected by an overlapping fault).

Disruption types:

- Stopped - packets are dropped entirely.
- Slowed - packets are delayed with latency.
- Jammed - packets are "piled up" in a queue until a future delivery time.
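
The grouping rule can be checked with a few lines of Python. This sketch covers only this one event: it ignores the `asymmetric` flag and any overlapping faults, and it assumes that a node missing from every group is unaffected, which the log format does not state. The function name is illustrative.

```python
def link_is_disrupted(partitions: list[list[str]], node_a: str, node_b: str) -> bool:
    """Return True if the two nodes sit in different groups of `details.partitions`."""
    group_of = {
        node: index
        for index, group in enumerate(partitions)
        for node in group
    }
    group_a = group_of.get(node_a)
    group_b = group_of.get(node_b)
    if group_a is None or group_b is None:
        # Assumption: nodes not listed in any group are not affected by this event.
        return False
    return group_a != group_b


partitions = [["client-1", "client-2"], ["server", "client-3"]]
print(link_is_disrupted(partitions, "client-1", "server"))    # True
print(link_is_disrupted(partitions, "client-1", "client-2"))  # False
```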

### Network clog

```json
{
  "fault": {
    "affected_nodes": ["server", "client-2"],
    "details": {
      "disruption_type": "Stopped"
    },
    "max_duration": 4.515860336,
    "name": "clog",
    "type": "network"
  }
}
```

Any connection to a node listed in `affected_nodes` experiences the `details.disruption_type` for `max_duration`.

Disruption types:

- Stopped - packets are dropped entirely.
- Slowed - packets are delayed with latency.
- Jammed - packets are "piled up" in a queue until a future delivery time.
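
Unlike a partition, a clog is decided per node rather than per group. A minimal Python sketch of that rule, with a hypothetical function name; it treats `["ALL"]` as matching every node and an empty `affected_nodes` array as matching none.

```python
def connection_is_clogged(fault: dict, node_a: str, node_b: str) -> bool:
    """Return True if a clog event disrupts the connection between two nodes."""
    if fault.get("name") != "clog":
        return False
    affected = fault.get("affected_nodes", [])
    if "ALL" in affected:
        return True
    # A connection is affected when either endpoint is a clogged node.
    return node_a in affected or node_b in affected


clog = {
    "affected_nodes": ["server", "client-2"],
    "details": {"disruption_type": "Stopped"},
    "max_duration": 4.515860336,
    "name": "clog",
    "type": "network",
}
print(connection_is_clogged(clog, "client-1", "server"))    # True
print(connection_is_clogged(clog, "client-1", "client-3"))  # False
```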

### Network restore

```json
{
    "fault":{
        "affected_nodes":["ALL"],
        "name":"restore",
        "type":"network"
    }
}
```

All ongoing network faults are stopped. No network fault is active again until a new one is scheduled.
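
When replaying a log, a `restore` event therefore acts as a reset for network faults. The Python sketch below estimates which network faults may still be active at a given vtime. It rests on assumptions that are not confirmed by the log format: each fault starts at its event's `moment._vtime_ticks`, lasts at most `max_duration` seconds, and a missing `max_duration` is treated as zero. The function name is hypothetical.

```python
TICKS_PER_SECOND = 1 << 32


def possibly_active_network_faults(events: list[dict], at_seconds: float) -> list[dict]:
    """Return network faults that may still be active at `at_seconds` of vtime."""
    candidates: list[tuple[float, dict]] = []
    ordered = sorted(events, key=lambda event: event["moment"]["_vtime_ticks"])
    for event in ordered:
        start = event["moment"]["_vtime_ticks"] / TICKS_PER_SECOND
        if start > at_seconds:
            break  # Later events cannot affect the requested moment.
        fault = event.get("fault", {})
        if fault.get("type") != "network":
            continue
        if fault.get("name") == "restore":
            candidates.clear()  # A restore stops every ongoing network fault.
            continue
        # Assumption: a fault without max_duration has no lasting window.
        end = start + fault.get("max_duration", 0.0)
        candidates.append((end, fault))
    return [fault for end, fault in candidates if end > at_seconds]
```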

## Node faults

The `node` type covers three distinct faults that share the same structure (`affected_nodes`, `max_duration`) but differ in semantics.

### Node termination (kill / stop)

```json
{
    "fault":{
        "affected_nodes":["server-3"],
        "max_duration":1.7741677258234223,
        "name":"kill",
        "type":"node"
    }
}
```

The affected nodes are killed (`name: "kill"`) or stopped (`name: "stop"`) for `max_duration` seconds, then restarted. If the node is a pod, its lifecycle is fully managed by Kubernetes and Antithesis cannot restart it. In that case, `max_duration` is `0` because Antithesis doesn't control the restart.

If a restart policy is defined, docker-compose may restart the container immediately. This nullifies the Antithesis fault event, so we recommend not defining a restart policy.

### Node hang (pause)

```json
{
    "fault":{
        "affected_nodes":["server-3"],
        "max_duration":1.5223575492507213,
        "name":"pause",
        "type":"node"
    }
}
```

The affected nodes are frozen in place for `max_duration` seconds. The container remains on the network but cannot process anything, so other containers will see timeouts when trying to communicate with it.

### Node throttling

```json
{
    "fault":{
        "affected_nodes":["server-3"],
        "max_duration":1.3029411842394,
        "name":"throttle",
        "type":"node"
    }
}
```

The CPU of each affected node is constrained for `max_duration` seconds.
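
Because the three node faults share one structure, a single pass over the log can collect the fault windows for each node. The following Python sketch is illustrative: it assumes each window runs from the event's vtime to that time plus `max_duration`, keeps the literal `"ALL"` as its own key instead of expanding it, and uses a hypothetical function name.

```python
from collections import defaultdict

TICKS_PER_SECOND = 1 << 32


def node_fault_windows(events: list[dict]) -> dict[str, list[tuple[float, float, str]]]:
    """Map each affected node to (start_seconds, end_seconds, fault_name) windows."""
    windows: dict[str, list[tuple[float, float, str]]] = defaultdict(list)
    for event in events:
        fault = event.get("fault", {})
        if fault.get("type") != "node":
            continue
        start = event["moment"]["_vtime_ticks"] / TICKS_PER_SECOND
        # max_duration is 0 for pods, which yields an empty window.
        end = start + fault.get("max_duration", 0.0)
        for node in fault.get("affected_nodes", []):
            windows[node].append((start, end, fault["name"]))
    return dict(windows)
```

A window with equal start and end marks a kill or stop whose restart is outside Antithesis's control, as described for pods above.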

## Clock jitter

System-level clock jitter moves the time forward or backward by an `offset`. The jump can be temporary or permanent: if the fault event contains a `max_duration` field, Antithesis reverses the offset after that duration; if `max_duration` is missing, the offset is permanent. Clock offsets are cumulative: each new `skip` event shifts the clock from wherever it already is.

```json
{
    "fault":{
        "affected_nodes":["ALL"],
        "details":{
            "offset":-0.11456344671067203
        },
        "max_duration":0.15177661326674713,
        "name":"skip",
        "type":"clock"
    }
}
```
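
The cumulative rule can be replayed from the log to estimate the net clock shift at a given vtime. This Python sketch makes several assumptions that the event format does not confirm: `offset` is in seconds, a temporary offset is reversed exactly at the event's vtime plus `max_duration`, and `affected_nodes` is ignored so every `skip` is treated as applying to the clock of interest. The function name is hypothetical.

```python
TICKS_PER_SECOND = 1 << 32


def clock_offset_at(events: list[dict], at_seconds: float) -> float:
    """Sum the offsets of all `skip` events still in effect at `at_seconds` of vtime."""
    total = 0.0
    for event in events:
        fault = event.get("fault", {})
        if fault.get("name") != "skip":
            continue
        start = event["moment"]["_vtime_ticks"] / TICKS_PER_SECOND
        if start > at_seconds:
            continue  # The jump has not happened yet.
        duration = fault.get("max_duration")
        if duration is not None and start + duration <= at_seconds:
            continue  # Temporary offset, already reversed.
        # Permanent offsets (no max_duration) always count once they have started.
        total += fault["details"]["offset"]
    return total
```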
