Senior Software Engineer

Antithesis skills for agents



Hi, I’m Carl.

I had the dubious pleasure of being the first human being outside the company to attempt to use Antithesis without the assistance of their extraordinarily nice, extraordinarily skilled forward-deployed engineers. That was exactly a year ago, in March 2025.

The world was pretty different back then. Claude had just learned to think, GPT 4.5 had just started doing “Deep Research”, and I was working on a project called Graft – an open-source transactional storage engine optimized for lazy, partial, and strongly consistent replication. It’s designed for edge, offline-first, and distributed applications. I knew Graft would have to be extraordinarily reliable, and I knew that to achieve that, I’d need to test it extraordinarily well. Someone connected me with Antithesis, who at the time were looking for a willing guinea pig. And I said, “sure, why not?”

A few weeks later, they published my review: “He’s described the process of unassisted onboarding as ‘rough’, ‘brutal’, ‘like crawling over broken glass’, and various other flattering descriptions…”

But what if you could just have the good bits of this? The benefits without the broken glass? I work at Antithesis now, and the first thing I’m shipping here is a set of Antithesis skills for agents (repo) that give you just that.

First up is a trio of skills for Claude (or your agent of choice) that help bootstrap your system onto Antithesis. You can use them together or (somewhat) separately. Together, they can generally one-shot a system to the end of setup, provide you with a set of basic but valuable test properties, and a similarly basic but valuable workload with which to validate them. Once you get to that point, iterating on your Antithesis testing becomes much easier – for me, it was positively enjoyable (you can read about the good bits here).

Do some research

To show these skills in action, we forked rqlite, and used the skills to bootstrap the system into Antithesis. Here’s the repo (make sure you’re on the antithesis-skills branch).

antithesis-research helps an agent analyze your codebase and figure out how it should be tested with Antithesis. It produces three markdown files that become part of the context for the other skills.

  • One captures your system’s architecture, state and concurrency behaviors, and looks for failure-prone areas of functionality.

  • A second provides a prioritized list of testable properties that a system with this architecture should probably have (created using the context in our extensive reliability resources).

  • The third describes the minimal useful container topology for an Antithesis deployment: the smallest set of containers that still allows the properties in the property catalog to be tested.

The names may change, because we’re iterating so fast, but this structure should stay constant.

You can and should review these artifacts thoroughly before proceeding – context is everything. The ones our agent generated are at /antithesis/scratchbook/.
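For a rough sense of what a prioritized property catalog carries, here is a minimal Python sketch. The `Property` fields, the three entries, and the rendering format are all hypothetical illustrations for a replicated SQL store; they are not taken from the files the agent generated, which are plain markdown and may be organized quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Property:
    name: str       # short identifier for the property
    kind: str       # "safety" (must never be violated) or "liveness" (must eventually happen)
    priority: int   # 1 = test first
    statement: str  # plain-language description of the property


# Hypothetical entries, deliberately listed out of order.
CATALOG = [
    Property("cluster-recovers", "liveness", 3,
             "After faults stop, the cluster eventually accepts writes again."),
    Property("no-lost-acked-writes", "safety", 1,
             "A write acknowledged to a client is never lost."),
    Property("single-leader-per-term", "safety", 2,
             "At most one node acts as leader in any term."),
]


def by_priority(catalog):
    """Return the properties ordered from most to least urgent."""
    return sorted(catalog, key=lambda p: p.priority)


def render_markdown(catalog):
    """Render the catalog as a markdown bullet list, highest priority first."""
    return "\n".join(
        f"- P{p.priority} [{p.kind}] {p.name}: {p.statement}"
        for p in by_priority(catalog)
    )


if __name__ == "__main__":
    print(render_markdown(CATALOG))
```

Splitting properties into safety and liveness is one common way to organize such a list, since the two kinds are usually checked differently: safety properties with always-style assertions, liveness properties with checks that something eventually happens.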

Do my setup

antithesis-setup takes care of packaging your software and deploying it to Antithesis with the deployment topology suggested by the research step. It navigates around all the sharp corners in our 8-page setup guide, and delivers a working deployment of your system in Antithesis, with the necessary lifecycle hooks, and at least one testable property along a simple execution path, complete with relevant assertions. At this point, you should literally be able to log in to Antithesis and launch a test, with no further effort required.

The artifacts you’ll get are /antithesis/Dockerfile and /antithesis/config/docker-compose.yaml.
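To make “lifecycle hooks” a little more concrete, here is a minimal Python sketch of one general pattern: poll each node until it reports healthy, then signal exactly once that setup is complete. Everything named here (`wait_until_ready`, the probe callables, the `on_ready` callback, the node names) is hypothetical. In a real deployment `on_ready` would presumably be whatever setup-complete call the Antithesis SDK for your language provides, and the code the skill generates may look quite different.

```python
import time
from typing import Callable, Dict


def _healthy(probe: Callable[[], bool]) -> bool:
    """Treat a probe that raises (e.g. connection refused) as 'not ready yet'."""
    try:
        return bool(probe())
    except Exception:
        return False


def wait_until_ready(
    probes: Dict[str, Callable[[], bool]],
    on_ready: Callable[[Dict[str, int]], None],
    max_attempts: int = 30,
    delay_s: float = 1.0,
) -> bool:
    """Poll every probe until all pass, then fire on_ready exactly once.

    probes maps a (hypothetical) node name to a zero-argument health check.
    on_ready stands in for the platform's "setup complete" hook.
    Returns True if the system became ready, False if attempts ran out.
    """
    for attempt in range(1, max_attempts + 1):
        pending = [name for name, probe in probes.items() if not _healthy(probe)]
        if not pending:
            on_ready({"attempts": attempt, "nodes": len(probes)})
            return True
        time.sleep(delay_s)
    return False


if __name__ == "__main__":
    # Demo: one node is healthy immediately, the other only on its third check.
    answers = iter([False, False, True])
    wait_until_ready(
        {"node-1": lambda: True, "node-2": lambda: next(answers)},
        on_ready=print,
        delay_s=0.0,
    )
```

Signalling readiness only after every node passes its check matters because a test that starts injecting faults before the cluster has formed tends to report startup noise rather than real bugs.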

Write me some tests

antithesis-workload creates tests for the properties outlined in the property catalog. It adds the necessary assertions to the codebase and creates a test directory (think: a client) that allows Antithesis to exercise the relevant portions of your system’s functionality.

This is what our agent came up with – the test directory at /antithesis/test/v1 is basically a pointer to the collection of client scripts in /antithesis/workload/ [1].

  1. The agent decided to build the test command binaries in a slightly strange way (line 88 in the [dockerfile](https://github.com/antithesishq/rqlite-demo/blob/antithesis-skills/antithesis/Dockerfile)). This is not how I’d choose to do it, but it works.
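As a rough picture of what a workload client does, the Python sketch below issues writes against a system, remembers which ones were acknowledged, and then asserts that every acknowledged write is still readable. It is an illustration under stated assumptions, not the generated rqlite workload: `FakeStore` is an in-memory stand-in for the system under test, and the local `always` helper stands in for an SDK assertion so the example runs on its own.

```python
import random


class FakeStore:
    """In-memory stand-in for the system under test (purely illustrative)."""

    def __init__(self, rng, failure_rate=0.2):
        self._data = {}
        self._rng = rng
        self._failure_rate = failure_rate

    def put(self, key, value):
        """Return True if the write was acknowledged, False if it was rejected."""
        if self._rng.random() < self._failure_rate:
            return False  # simulated fault: write rejected, nothing stored
        self._data[key] = value
        return True

    def get(self, key):
        return self._data.get(key)


def run_workload(store, rng, steps=100):
    """Issue random writes, then check every acknowledged write is readable.

    Returns (number of keys with an acknowledged write, list of violations).
    """
    violations = []

    def always(condition, message, details):
        # Local stand-in for an SDK assertion: record the failure with context.
        if not condition:
            violations.append((message, details))

    acked = {}  # model of what the store has promised to hold
    for i in range(steps):
        key = f"k{rng.randrange(10)}"
        if store.put(key, i):
            acked[key] = i  # only acknowledged writes enter the model

    for key, expected in acked.items():
        actual = store.get(key)
        always(actual == expected, "acknowledged write is readable",
               {"key": key, "expected": expected, "actual": actual})
    return len(acked), violations


if __name__ == "__main__":
    acked, violations = run_workload(FakeStore(random.Random(1)), random.Random(2))
    print(f"{acked} keys acknowledged, {len(violations)} violations")
```

The workload only tracks writes the store acknowledged, so a rejected write is never counted as a bug. Only a promise the system made and then broke shows up as a violation.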

It’s available now

These skills can help your coding agent handle large portions – often the most painful portions – of the work of getting started with Antithesis. We’ve proven internally that they work with a good range of systems, but because what these skills produce depends on your LLM, we can’t provide a hard guarantee that they’ll work. At the very least, they’re a helpful starting point, and illustrate what agents can do with Antithesis.

These are open and collaborative side projects, and we encourage others to share recommendations for changes and improvements. We’ve been iterating quickly, and will continue to do so, hopefully with your feedback. And if you’re already an Antithesis user, know that there’s more to come, including skills to parse your triage reports, and iterate on test properties and workloads based on failures found.

I want to end with a couple of thoughts about this process. At this point, LLMs are able to quickly get from knowing nothing about a topic to a reasonable level of understanding about how it works, especially if “it” is a piece of software. The main obstacle to using Antithesis has always been the specialized, tacit knowledge involved, and not just knowledge about Antithesis, or even about builds and containerization, but about distributed systems in general. We’re no longer surprised when customers tell us “this sounds great, but what properties should my system have?”

As AI-assisted coding becomes more and more common, engineers are having to reason about system specs, and consequently system properties, in ways they never used to. These skills don’t just bootstrap the system into place, they also bootstrap your reasoning about how the system should behave.

We think you’ll find them useful – and you might even learn a thing or two!