Find Failures Your Tests Weren’t Designed to Catch.
Before customers and your on-call rotation do.
I’m Dr. Thomas Pani. I help platform, infrastructure, and protocol teams build practical reliability workflows around executable specifications.
Find Failures Your Tests Weren’t Designed to Catch.
Before customers and your on-call rotation do.
I’m Dr. Thomas Pani. I help platform, infrastructure, and protocol teams build practical reliability workflows around executable specifications.
Who This Is For
-
1
You own a stateful or distributed system where failures depend on ordering. Retries, timeouts, reconciliation, partial failure, and stale observations create paths your test suite does not enumerate.
-
2
You have fuzzers, property-based tests, or deterministic simulation in place. But you are still not confident your oracle is checking the behavior that actually matters.
-
3
You are shipping AI-generated infrastructure or protocol code. And your test suite was not built to tell humans or agents whether it is actually correct.
-
4
You are evaluating TLA+ or formal-methods consulting. But you want a practical path from specification to generated tests, conformance checks, and tools your team can keep using.
What I Do
The answer is not always “write more tests”. Usually it is “test against the right model”.
I build spec-driven testing workflows around executable specifications: property-based tests, fuzzers, simulation, adversarial scenario generation, and conformance testing.
Grounded in Practice
This is the same general pattern used inside AWS, Oracle, and Microsoft: capture intended behavior in a spec, then use it as a reference model for tests, simulation, and conformance checks.
Works With Existing Testing
This does not replace fuzzers, property-based tests, chaos testing, or simulation. It makes those tools more useful by giving them a stronger oracle.
Spec-Driven Testing & Test Oracles
Executable specs used with property-based tests, fuzzers, simulation, adversarial exploration, and conformance checks.
Use when: you already test seriously, but your oracle is too weak to catch behavioral failures.
AI-Generated Code Testing
Precise, agent-invokable specs for testing LLM-generated protocol, infrastructure, and state-machine code during generation or review.
Use when: protocol logic is shipping faster than human review can keep up.
Protocol Specification & Review
Executable design models and invariant review for distributed protocols, cloud-native systems, and payment infrastructure.
Useful as: design review artifact that doubles as a test asset, or a precondition to deeper verification.
Strategy & Training
Hands-on workshops and reliability workflow design for teams building controllers, microservices, schedulers, distributed systems, and stateful platforms.
Format: remote or on-site, scoped to your system and failure modes.
Why Executable Specs?
Production failures often happen on paths nobody wrote tests for: interleavings of messages, retries, timeouts, and partial failures.
Executable specs turn intended behavior into a runnable model that generates those paths and acts as a test oracle.
One executable model, four uses
-
01
Exploration
Randomized, exhaustive, or custom search reveals counterexamples in the model itself.
-
02
Test generation
Generate scenarios and expected outcomes for the implementation, then run them as conformance tests.
-
03
Review artifact
Make design assumptions, invariants, and ambiguous edge cases explicit during review.
-
04
Living documentation
Replace stale sequence diagrams with runnable behavior that evolves with the system.
A Note on AI-Generated Code
Fast
LLMs produce plausible-looking code at speed.
Looks right, can still fail
They routinely miss interleavings, race conditions, atomicity assumptions, and protocol invariants.
Executable specs fix it
Executable specs give AI-generated code a tight, agent-invokable oracle during generation and review.
How This Fits in the Landscape Show comparison
Executable specs give the rest of the stack something it often lacks: an explicit model of correct behavior. They are most useful where other techniques still struggle to define what correct behavior should be.
Jepsen
Finds real failures in running systems. A spec tells you what should have happened.
Antithesis
Improves reproducibility. Specs provide the missing correctness oracle.
Property-Based Testing
Great low-friction starting point. Harder to scale across distributed behavior and failures.
TLA+ and Classical Formal Methods
Highest expressiveness, highest barrier. I help teams use that toolchain without becoming formal methods specialists.
FizzBee and Quint
Both are viable spec-first approaches. The right choice depends on toolchain and rigor needs.
In-House Simulation
If you already simulate, you are ahead of most teams. A spec-based oracle usually increases signal immediately.
Customers / Recent Work
Recent examples of executable specs, model checking, protocol fuzzing, and reliability work.
Aztec Labs: Governance Protocol
2025Formal specification and symbolic verification of 125 invariants across a multi-contract governance system.
Protocol Fuzzing Workshop @ Protocol Berg v2
2025Hands-on workshop on building a Solidity protocol fuzzer in 25 minutes.
Ethereum Foundation: 3-Slot Finality
2024Formal modeling and verification of accountability in Ethereum 3SF consensus.
Solarkraft
2024Model-based runtime monitoring for Soroban/Stellar smart contracts.
Open Source
Apalache: Symbolic Model Checker for TLA+ and Quint
2022–NowCore team member; used on Aztec governance, Ethereum 3SF, Tendermint/CometBFT, and Matter Labs/ZKsync protocols.
Quint: Modern Language Tooling for TLA+
2022–2024Co-developed Quint from its early stages at Informal Systems, bringing TypeScript-like syntax to TLA+-style specs.
Training & Workshops
Get in Touch Limited Availability
Want to try this on a real system?
I take on focused pilot projects for teams with concrete distributed-system testing pain.
Discuss a Pilot Project- Best fit: control planes, schedulers, Kubernetes controllers, payment systems, consensus protocols, databases, and stateful cloud infrastructure.
- A first call is 30 minutes, no preparation needed, and there is no commitment.
- I typically respond within one business day.