Bug report
The bug report contains detailed information about a particular issue that Antithesis found while testing your software. This is in contrast to the triage report which summarizes all of the results of a test run.
In an ideal world all bugs would be found immediately after they are introduced and the commit that created them could simply be reverted. In this world, debugging and root-cause analysis would be unnecessary. Antithesis tries to move us closer to that world – by running regular autonomous tests and searching for the unknown unknowns that your conventional tests don’t find. However, sometimes a bug is longstanding or especially rare, and the triage capabilities that Antithesis provides are insufficient for pinpointing it. In these cases, Antithesis provides the bug report to assist with intensive debugging.
Traditional debugging is painful due to the imperfect reproducibility of bugs. If you find a bug, you might not be able to reproduce it; if applying a patch makes the bug go away, you cannot be certain whether it is fixed or whether it has just moved somewhere else. Antithesis deterministically simulates programs and even entire distributed systems to make all bugs perfectly reproducible. This makes time travel debugging possible, and even multiverse debugging. In multiverse debugging, we take a particular history of the bug and look at “nearby” worlds where the bug either did or did not happen. We then gather statistical information about what the worlds where the bug occurred have in common.
The rest of this document is a detailed walkthrough of an example bug report. We encourage you to follow along with a live interactive example here.
Bug context & bug instance details
Antithesis is an autonomous testing system: you reason about the high-level properties of your system, and Antithesis autonomously generates the test cases. In particular, you write declarative properties about your software that should always hold – for example, no crashes, or no running out of memory. Antithesis then explores your software by interacting with its API and varying its environment to search for cases where the property fails to hold.
In a test run, Antithesis searches for violations of all the properties you have defined, and summarizes the results in a triage report. The inputs to a bug report are from a previously conducted test run, and one particular property that you want investigated in detail. Since the previous test run may have found many ways to violate the property in question, to generate the bug report Antithesis will choose a representative sample of property counterexamples that showcase the same bug appearing in different situations and contexts.
The Bug Context section at the top of the report restates the inputs that were used to generate this report: it shows which specific property failure on which specific test run provided the data for the report.
The Bug Instance Details section lists the sample of property violations that Antithesis (or you) chose as the starting point for time travel and multiverse debugging. It restates the name of the property that was violated, a log excerpt at the moment that the bug was detected, and some information necessary to deterministically reproduce the issue. Clicking on an instance of the bug in this section will change what the entire rest of the report refers to. For example, clicking on the first bug means the (later) logs section will show the logs for this first example.
This report shows a test of the messaging client Kafka. In this report, we see that we tested the property “No null pointer exceptions.” This property failed, meaning that Antithesis found a bug where there are indeed null pointer exceptions in Kafka. When a bug report was requested, Antithesis chose two instances of this bug for time travel debugging. We also see the input hash and time needed to recreate this bug.
Bug likelihood over time
The Bug Likelihood Over Time section graphs bug probability over time – it helps find bugs that take time to surface while your software is running. Some particularly painful bugs can be triggered early during execution, but only cause errors or crash the software later. For example, consider a system that doesn’t perform input validation for one of its APIs, and accepts arbitrarily long inputs. It might receive a very long input and store it successfully, but crash later when a different request attempts to query that data again. In a given execution history of this software, whether a test case or a real production scenario, a crash is baked in the moment the long input was stored, but the actual crash happens potentially much later. This is painful to debug because the logs at the moment of the crash might not reveal the origin of the problem – the later request to query the data was perfectly valid. Antithesis uses multiverse debugging to generate probabilities about when the bug was baked in during the execution history. This guides you to the section of the logs where the bug became inevitable.
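The missing-validation scenario above can be pictured with a small toy. This is only an illustrative sketch: the `RecordStore` class, the `MAX_QUERY_BUFFER` limit, and the record contents are all invented here and do not come from any real system.

```python
MAX_QUERY_BUFFER = 64  # hypothetical limit that only the read path enforces


class RecordStore:
    """Toy store that accepts any input length on write but not on read."""

    def __init__(self):
        self._names = {}

    def insert(self, key, name):
        # Missing input validation: arbitrarily long names are stored successfully.
        self._names[key] = name

    def query(self, key):
        name = self._names[key]
        # The read path assumes a bounded length, so the failure only shows up here.
        if len(name) > MAX_QUERY_BUFFER:
            raise RuntimeError(f"crash while reading {key!r}: name too long")
        return name


store = RecordStore()
store.insert("a", "x" * 1000)  # the bug is baked in here, yet nothing fails
store.insert("b", "Alice")     # unrelated, valid traffic continues
try:
    store.query("a")           # a perfectly valid request surfaces the crash
except RuntimeError as err:
    print(err)
```

The error is raised inside `query`, so a log line written at that moment points at the read request, even though the write made the failure inevitable.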
This graph takes a single test case, and graphs over time the probability of the bug being baked in. It does not address the question of when a bug was committed to the program’s source code. The property history in the triage report helps identify which commit to version control introduced a bug. The bug likelihood over time in the bug report takes a single bug instance and tries to figure out when it became baked in during that test case. In short, the property history tells you which version control commit to examine, while the bug likelihood over time tells you which section of the logs to examine.
Consider the example below:
This graph plots the likelihood over time of two different instances of the same bug. We see that in both instances, the probability of the bug dramatically increases in the time period from 20–30 seconds before the bug was detected. At ten seconds before bug detection, the bug likelihood is 60%: if we rewind to ten seconds before the bug and use multiverse debugging to try 100 alternate histories from that point onward, in 60 of them the bug happens. At 45 seconds before bug detection, the bug probability is only 5%: if we rewind to start at that point instead and try 100 alternate histories, only 5 of them ultimately see the bug. We thus conclude that in the original history of the bug, something very important happened 20 to 30 seconds before it ultimately surfaced – somewhere around here would probably be a valuable section of the logs to examine.
Since this debugging capability is unusual and somewhat counterintuitive, let’s walk through how this happens concretely, using the example above of the system with a bug that causes it not to validate input lengths.
The events leading up to the bug being detected are as follows:
- An API request inserts an excessively long name (at 0 seconds).
- A different request inserts a normal record at 5 seconds into testing.
- A third request creates a new field in a form at 10 seconds into testing.
- A final request queries the data inserted by the first request, and the system crashes at 15 seconds into testing.
The bug is baked in from the very first request, but only surfaces at the very end of the test case. Antithesis can rewind time and try variations on the above sequence of events to estimate the likelihood of the bug at various points in that test case.
- First, Antithesis rewinds to just before the crash and tries different queries. The crash will still always happen. The bug likelihood at 0 seconds before bug detection is 100%.
- Then, Antithesis rewinds further and tries many different requests. The crash still always eventually happens. The bug likelihood at 5 seconds before bug detection – or ten seconds into testing – is 100%.
- Then, Antithesis rewinds still further and tries another large set of possibilities. The crash is already baked in, so it always eventually surfaces. The bug likelihood at 10 seconds before bug detection – or 5 seconds into testing – is 100%.
- But, when Antithesis rewinds to the beginning and enters a new (possibly shorter) name and then runs many requests, the crash will not necessarily happen. Now the bug likelihood at 15 seconds before bug detection – the beginning – is lower, possibly 20%.
Thus, we infer that the most important event for understanding the bug happened at the beginning of the test case. Scrolling the logs back to that point shows the excessively long name being inserted.
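The rewinding procedure in this walkthrough can be imitated with a small Monte Carlo sketch. It is not how Antithesis is implemented: the toy system, the `LIMIT` value, the checkpoint contents, and the 20% chance of a too-long name are assumptions chosen to mirror the numbers in the example.

```python
import random

LIMIT = 64              # hypothetical longest name the read path can handle
LONG_NAME = "x" * 1000  # the excessively long name from the first request

# Names already stored at each rewind point, keyed by seconds before detection.
CHECKPOINTS = {
    0: [LONG_NAME, "Alice"],  # just before the final query
    5: [LONG_NAME, "Alice"],  # 10 s into testing, after the normal record
    10: [LONG_NAME],          # 5 s into testing, after the first request only
}


def crashes(names, rng, steps=5):
    """Play random follow-up requests, then report whether a final query crashes."""
    names = list(names)
    for _ in range(steps):
        if rng.random() < 0.5:
            names.append("n" * rng.randint(1, 20))  # another normal-length record
    # Toy assumption: the closing query reads every stored name.
    return any(len(name) > LIMIT for name in names)


def bug_likelihood(seconds_before, trials=1000, seed=0):
    """Fraction of alternate histories from this rewind point that hit the bug."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if seconds_before in CHECKPOINTS:
            names = CHECKPOINTS[seconds_before]
        else:
            # Rewound to the very beginning: the first name itself is varied,
            # and only about 20% of the variations are too long.
            too_long = rng.random() < 0.2
            names = [LONG_NAME if too_long else "x" * rng.randint(1, 20)]
        hits += crashes(names, rng)
    return hits / trials


for seconds_before in (0, 5, 10, 15):
    print(seconds_before, bug_likelihood(seconds_before))
```

For the three rewind points where the long name is already stored, every alternate history crashes, so the estimate is 1.0. Only the rewind to the beginning gives a lower value, close to the assumed 20%, which is the drop that points at the first request.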
The above example discusses time travel simulation of a single original instance of a bug. In the bug report, Antithesis selects a sample of several instances of a bug for time travel debugging. In the linked report, there are two instances with two separate graphs showing when the bug was probably baked in. Clicking on a different bug instance will highlight the other graph, while clicking on a point on the graph will scroll the logs to the point corresponding to that time. Here you should click at 20 or 30 seconds before bug detection to examine the logs – it appears the bug was baked in somewhere around there.
Logs
The Logs section shows the logs generated over the course of running the selected bug instance. This includes application logs, system journal messages, and other information you may need to understand the bug in question. You may customize what is included by default in the logs, either on a per-property basis or for all of the properties in your test run.
The log viewer allows searching and filtering for particular log messages, using either substrings or regular expressions. You can also select which services or software components should have their logs included, using the button on the top left. An excerpt from the logs is below:
Remember that you can click at a point on the likelihood graph to go to the corresponding time in the logs.
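The kind of filtering the log viewer offers (by substring, by regular expression, and by service) can be pictured with the sketch below. The `filter_logs` helper, the service names, and the log messages are all made up for illustration; they are not output from Antithesis or from any real system.

```python
import re

# Made-up log entries as (source, message) pairs.
LOGS = [
    ("service-a", "INFO request 17 completed"),
    ("service-b", "WARN retrying connection, attempt 2"),
    ("service-a", "ERROR null pointer while handling request 18"),
]


def filter_logs(entries, pattern, use_regex=False, sources=None):
    """Keep entries from the chosen sources whose message matches the pattern."""
    if use_regex:
        matches = re.compile(pattern).search
    else:
        matches = lambda message: pattern in message
    return [
        (source, message)
        for source, message in entries
        if (sources is None or source in sources) and matches(message)
    ]


# Substring search across every source.
print(filter_logs(LOGS, "ERROR"))
# Regular-expression search restricted to one service.
print(filter_logs(LOGS, r"request \d+", use_regex=True, sources={"service-a"}))
```

Restricting the sources first and then narrowing with a pattern is usually the quickest way to cut a long log down to the few lines around the moment of interest.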
Statistical debug information
The Statistical Debug Information section estimates how strongly associated each area of the code is with the bug. Recall that Antithesis starts with a sample of instances of a bug and then generates a universe of many thousands of test cases by varying conditions a little bit. For every one of these new cases where the bug is hit, Antithesis records which lines of code in your software ran at some point. For every case where the bug is not hit, Antithesis records the same information. This is useful for generating hypotheses about where the bug is in your code: you can look at the lines of code that were most highly correlated with the bug occurring. Conversely, this is also useful for ruling out culprits: you might think that the bug is in one section of code, but Antithesis might find a test case where the bug is detected but that section of code is never called. The bug must therefore be somewhere else.
This section answers two related questions:
- Given that the bug is detected, what is the likelihood that a particular line of code was executed?
- Given that a line of code is executed, what is the likelihood that the bug eventually surfaces?
The table in this section displays every line of code that ran in any of the test cases, and allows you to sort by these two distinct conditional probabilities.
The table also allows filtering by file, function, or class name in case you are only interested in seeing the statistical information for particular parts of your system.
It is normal that code associated with printing the error message or with the process of crashing will have a very high association with the bug. You may need to scroll down a little bit in the table to get past these trivial associations.
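The two conditional probabilities in this section can be computed from per-run coverage records roughly as follows. This is a sketch under assumptions: the record format (a set of executed lines plus a flag for whether the bug was hit), the `line_statistics` helper, and the file names are hypothetical, and the four sample runs are invented.

```python
from collections import Counter


def line_statistics(runs):
    """Compute P(line executed | bug) and P(bug | line executed) for each line."""
    bug_runs = sum(1 for _, bug_hit in runs if bug_hit)
    executed = Counter()          # runs in which the line ran
    executed_and_bug = Counter()  # runs in which the line ran and the bug was hit
    for lines, bug_hit in runs:
        for line in lines:
            executed[line] += 1
            if bug_hit:
                executed_and_bug[line] += 1
    stats = {}
    for line, count in executed.items():
        both = executed_and_bug[line]
        stats[line] = {
            "p_line_given_bug": both / bug_runs if bug_runs else 0.0,
            "p_bug_given_line": both / count,
        }
    return stats


# Invented sample: each run is (set of executed lines, whether the bug was hit).
RUNS = [
    ({"store.py:10", "store.py:42", "report.py:7"}, True),
    ({"store.py:10", "store.py:42", "report.py:7"}, True),
    ({"store.py:10", "store.py:42"}, False),
    ({"store.py:10"}, False),
]

stats = line_statistics(RUNS)
for line in sorted(stats, key=lambda l: stats[l]["p_bug_given_line"], reverse=True):
    print(line, stats[line])
```

In this sample, `report.py:7` stands in for error-reporting code: it runs only when the bug is hit, so it ranks first on both measures. That is the trivial association described above, and the more informative candidate is `store.py:42`, which ranks next.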