Dr. Thomas Pani

Find the Bugs Your Tests Miss.

Before Your On-Call Rotation Does.

I’m Dr. Thomas Pani. I help teams building complex software find the failures their tests were not designed to catch.

Model-Based Adversarial Testing · AI-Generated Code Testing · Software Reliability Consulting

Who This Is For

  1. You have fuzzers, property-based tests, or deterministic simulation in place. But you are still not confident your oracle is catching the right failures.

  2. You are shipping AI-generated infrastructure or protocol code. And your test suite was not built to tell you whether it is actually correct.

  3. You looked at TLA+ or formal methods and decided the learning curve was too steep. But you still want something more rigorous.

Typical Teams

Distributed systems engineers, SREs, QA architects, and infrastructure teams that already care about reliability, but need a tighter link between their system and expected behavior.

What I Do

The answer is not always "do more testing". Usually it is "test against the right thing".

Model-Based Adversarial Testing captures intended behavior in executable specs, then uses them to generate tougher tests and more meaningful checks.

Grounded in Practice

This is the same general pattern used inside AWS, Oracle, and Microsoft: capture intended behavior in a spec, then use it as a reference model for tests and conformance checks.

Works With Existing Testing

This does not replace fuzzers, chaos testing, or simulation. It makes those tools more useful by giving them a stronger oracle.

New Pressure Point

AI-Generated Code Testing

Executable specs act as precise oracles for LLM-generated code during generation or review. They catch correctness failures humans and agents routinely miss.

Use when: protocol logic is shipping faster than human review can keep up.

Core Offer

Model-Based Adversarial Testing

I write an executable spec of your protocol or system, generate adversarial scenarios from it, and test the implementation against that model.

Deliverables: executable spec, test harness, and a prioritized report of failures found.

High Assurance

Protocol Specification and Review

I write or review specs for distributed protocols, cloud infrastructure, and payment systems, with invariants, simulation, or model checking where needed.

Useful as: executable design model that doubles as a test asset, or a precondition to deeper verification.

Team Enablement

In-House Workshop

A hands-on session on executable specs, guided coding, and an adversarial testing workflow your team can keep using after the workshop.

Format: remote or on-site, scoped to your system and failure modes.

Why Executable Specs?

Integration tests and property-based testing get you most of the way. They stop short of failure modes that require reasoning about concurrent state, message ordering, and partial failures.

Executable specs close that gap.

This is where executable specs sit: between chaos testing and classical formal methods, turning intended behavior into runnable code and usable test oracles.

  1. Runnable intent

     Express intended behavior, including non-determinism, as runnable code.

  2. Concrete scenarios

     Replace informal sequence diagrams with spec executions you can inspect and test.

  3. Adversarial generation

     Generate happy-path and adversarial scenarios via random and symbolic execution.

  4. Spec as oracle

     Drive differential and conformance tests directly from the specification.

  5. Stronger guarantees

     Establish safety and liveness properties through model checking when testing alone is not enough.
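The "adversarial generation" and "spec as oracle" steps above can be sketched in a few lines of Python. This is an illustrative toy, not client code: the spec is a trivially correct reference model of a bounded FIFO queue, the implementation under test is a ring buffer, and a differential test replays random operation sequences against both, checking that every result agrees.

```python
import random

class QueueSpec:
    """Executable spec: an obviously-correct reference model."""
    def __init__(self, capacity):
        self.items, self.capacity = [], capacity

    def enqueue(self, x):
        if len(self.items) >= self.capacity:
            return "full"
        self.items.append(x)
        return "ok"

    def dequeue(self):
        return self.items.pop(0) if self.items else "empty"

class QueueImpl:
    """Implementation under test: a ring buffer."""
    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.head = self.size = 0
        self.capacity = capacity

    def enqueue(self, x):
        if self.size == self.capacity:
            return "full"
        self.buf[(self.head + self.size) % self.capacity] = x
        self.size += 1
        return "ok"

    def dequeue(self):
        if self.size == 0:
            return "empty"
        x = self.buf[self.head]
        self.head = (self.head + 1) % self.capacity
        self.size -= 1
        return x

def differential_test(seed, steps=1000, capacity=3):
    """Generate a random trace and use the spec as the oracle at every step."""
    rng = random.Random(seed)
    spec, impl = QueueSpec(capacity), QueueImpl(capacity)
    for i in range(steps):
        if rng.random() < 0.5:
            x = rng.randint(0, 9)
            assert impl.enqueue(x) == spec.enqueue(x), f"enqueue diverged at step {i}"
        else:
            assert impl.dequeue() == spec.dequeue(), f"dequeue diverged at step {i}"

differential_test(seed=42)
```

Real engagements replace the random trace generator with symbolic execution and the hand-written model with a spec in a language such as Quint, but the shape stays the same: one obviously-correct model, many adversarial traces, a conformance check per step.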

A Note on AI-Generated Code

Fast

LLMs produce plausible-looking code at speed.

Looks right, can still fail

They routinely miss interleavings, race conditions, atomicity assumptions, and protocol invariants.

Executable specs fix it

Executable specs give AI-generated code a tight oracle during generation and review.

How This Fits in the Landscape

Executable specs give the rest of the stack something it often lacks: an explicit model of correct behavior. They are most useful where other techniques still struggle to define what correct behavior should be.

Jepsen

Finds real failures in running systems. A spec tells you what should have happened.

Antithesis

Improves reproducibility. Specs provide the missing correctness oracle.

Property-Based Testing

Great low-friction starting point. Harder to scale across distributed behavior and failures.

TLA+ and Classical Formal Methods

Highest expressiveness, highest barrier. I help teams use that toolchain without becoming formal methods specialists.

FizzBee and Quint

Both are viable spec-first approaches. The right choice depends on toolchain and rigor needs.

In-House Simulation

If you already simulate, you are ahead of most teams. A spec-based oracle usually increases signal immediately.

Selected Work

Recent examples of executable specs, model checking, and reliability work.

Aztec Governance Protocol: Formal Verification

2025

Massively parallel symbolic verification of 125 invariants across a multi-contract governance system.

GitHub | PDF report | writeup (recommended)
Aztec Solidity Quint Apalache Formal Verification

Apalache: Symbolic Model Checker for TLA+ and Quint

2022–Now

Core team member of Apalache, which is used to verify Ethereum and Tendermint consensus, L2 governance protocols, and Cosmos SDK/IBC.

GitHub
TLA+ Quint Formal Verification Cosmos Ethereum

Quint: Modern Language Tooling for TLA+

2022–2024

Co-developed Quint from its early stages at Informal Systems, bringing a more readable TypeScript-like syntax to TLA+ style specifications.

GitHub
Quint TLA+ Simulation Language Tooling

Verified Accountability in Ethereum 3SF

2024

Formal modeling of Ethereum 3-slot finality (3SF) consensus, with accountable safety exhaustively verified.

Tech report (arXiv) | GitHub
Ethereum Formal Verification TLA+ Alloy SMT Python

Solarkraft: Model-Based Runtime Monitoring

2024

Developed a low-latency runtime monitor for Soroban smart contracts on the Stellar blockchain, applying model-based monitoring to production contract execution.

GitHub
Stellar Soroban Rust TypeScript Go

Independent Security Audits

2022–Now

Independent security reviews on Cantina, Code4rena, and Sherlock, including verification contests with Certora Prover.

Cairo / StarkNet Soroban / Stellar Solidity / Ethereum

Get in Touch (Limited Availability)

If you need stronger evidence that a complex system behaves correctly, reach out.

  • Consulting engagements and workshops, remote or on-site.
  • I typically respond within one business day.
  • A first call is 30 minutes, no preparation needed, and there is no commitment.