Dr. Thomas Pani

Find Failures Your Tests Weren’t Designed to Catch.

Before customers and your on-call rotation do.

I’m Dr. Thomas Pani. I help platform, infrastructure, and protocol teams build practical reliability workflows around executable specifications.

Spec-Driven Testing & Test Oracles AI-Generated Code Testing Protocol Specification & Review

Who This Is For

  1. 1

    You own a stateful or distributed system where failures depend on ordering. Retries, timeouts, reconciliation, partial failure, and stale observations create paths your test suite does not enumerate.

  2. 2

    You have fuzzers, property-based tests, or deterministic simulation in place. But you are still not confident your oracle is checking the behavior that actually matters.

  3. 3

    You are shipping AI-generated infrastructure or protocol code. And your test suite was not built to tell humans or agents whether it is actually correct.

  4. 4

    You are evaluating TLA+ or formal-methods consulting. But you want a practical path from specification to generated tests, conformance checks, and tools your team can keep using.

What I Do

The answer is not always “write more tests”. Usually it is “test against the right model”.

I build spec-driven testing workflows around executable specifications: property-based tests, fuzzers, simulation, adversarial scenario generation, and conformance testing.

Grounded in Practice

This is the same general pattern used inside AWS, Oracle, and Microsoft: capture intended behavior in a spec, then use it as a reference model for tests, simulation, and conformance checks.

Works With Existing Testing

This does not replace fuzzers, property-based tests, chaos testing, or simulation. It makes those tools more useful by giving them a stronger oracle.

Core Offer

Spec-Driven Testing & Test Oracles

Executable specs used with property-based tests, fuzzers, simulation, adversarial exploration, and conformance checks.

Use when: you already test seriously, but your oracle is too weak to catch behavioral failures.

New Pressure Point

AI-Generated Code Testing

Precise, agent-invokable specs for testing LLM-generated protocol, infrastructure, and state-machine code during generation or review.

Use when: protocol logic is shipping faster than human review can keep up.

High Assurance

Protocol Specification & Review

Executable design models and invariant review for distributed protocols, cloud-native systems, and payment infrastructure.

Useful as: design review artifact that doubles as a test asset, or a precondition to deeper verification.

Team Enablement

Strategy & Training

Hands-on workshops and reliability workflow design for teams building controllers, microservices, schedulers, distributed systems, and stateful platforms.

Format: remote or on-site, scoped to your system and failure modes.

Why Executable Specs?

Production failures often happen on paths nobody wrote tests for: interleavings of messages, retries, timeouts, and partial failures.

Executable specs turn intended behavior into a runnable model that generates those paths and acts as a test oracle.

One executable model, four uses

  1. 01

    Exploration

    Randomized, exhaustive, or custom search reveals counterexamples in the model itself.

  2. 02

    Test generation

    Generate scenarios and expected outcomes for the implementation, then run them as conformance tests.

  3. 03

    Review artifact

    Make design assumptions, invariants, and ambiguous edge cases explicit during review.

  4. 04

    Living documentation

    Replace stale sequence diagrams with runnable behavior that evolves with the system.

A Note on AI-Generated Code

Fast

LLMs produce plausible-looking code at speed.

Looks right, can still fail

They routinely miss interleavings, race conditions, atomicity assumptions, and protocol invariants.

Executable specs fix it

Executable specs give AI-generated code a tight, agent-invokable oracle during generation and review.

How This Fits in the Landscape Show comparison

Executable specs give the rest of the stack something it often lacks: an explicit model of correct behavior. They are most useful where other techniques still struggle to define what correct behavior should be.

Jepsen

Finds real failures in running systems. A spec tells you what should have happened.

Antithesis

Improves reproducibility. Specs provide the missing correctness oracle.

Property-Based Testing

Great low-friction starting point. Harder to scale across distributed behavior and failures.

TLA+ and Classical Formal Methods

Highest expressiveness, highest barrier. I help teams use that toolchain without becoming formal methods specialists.

FizzBee and Quint

Both are viable spec-first approaches. The right choice depends on toolchain and rigor needs.

In-House Simulation

If you already simulate, you are ahead of most teams. A spec-based oracle usually increases signal immediately.

Customers / Recent Work

Recent examples of executable specs, model checking, protocol fuzzing, and reliability work.

Aztec Labs: Governance Protocol

2025

Formal specification and symbolic verification of 125 invariants across a multi-contract governance system.

→ write-up | GitHub | PDF report
Aztec Solidity Quint Apalache Formal Verification

Protocol Fuzzing Workshop @ Protocol Berg v2

2025

Hands-on workshop on building a Solidity protocol fuzzer in 25 minutes.

→ recording + repo
Solidity Protocol Fuzzing Workshop

Ethereum Foundation: 3-Slot Finality

2024

Formal modeling and verification of accountability in Ethereum 3SF consensus.

GitHub | Tech report
Ethereum Consensus TLA+ Alloy SMT

Solarkraft

2024

Model-based runtime monitoring for Soroban/Stellar smart contracts.

GitHub
Stellar Soroban Rust Runtime Monitoring

Open Source

Apalache: Symbolic Model Checker for TLA+ and Quint

2022–Now

Core team member; used on Aztec governance, Ethereum 3SF, Tendermint/CometBFT, and Matter Labs/ZKsync protocols.

GitHub
TLA+ Quint Model Checking Ethereum

Quint: Modern Language Tooling for TLA+

2022–2024

Co-developed Quint from its early stages at Informal Systems, bringing TypeScript-like syntax to TLA+-style specs.

GitHub
Quint TLA+ Simulation Language Tooling

Training & Workshops

Protocol Berg workshop in Berlin Protocol Berg v2 25-Minute Solidity Fuzzer Hands-on protocol fuzzing workshop
DevConf 2026 Find Distributed-Systems Bugs Before Production Executable models for test generation

Get in Touch Limited Availability

Want to try this on a real system?

I take on focused pilot projects for teams with concrete distributed-system testing pain.

Discuss a Pilot Project
  • Best fit: control planes, schedulers, Kubernetes controllers, payment systems, consensus protocols, databases, and stateful cloud infrastructure.
  • A first call is 30 minutes, no preparation needed, and there is no commitment.
  • I typically respond within one business day.