Trustworthy Experiments
Helps design, run, and interpret controlled experiments correctly. Based on Ronny Kohavi's framework from "Trustworthy Online Controlled Experiments".
trustworthy-experiments
Use when asked to "run an A/B test", "design an experiment", "check statistical significance", "trust our results", "avoid false positives", or "experiment guardrails".
What It Is
Trustworthy Experiments is a framework for running controlled experiments (A/B tests) that produce reliable, actionable results. The core insight: most experiments fail, and many "successful" results are actually false positives.
The key shift: Move from "Did the experiment show a positive result?" to "Can I trust this result enough to act on it?"
Ronny Kohavi, who built experimentation platforms at Microsoft, Amazon, and Airbnb, found that:
- 66-92% of experiments fail to improve the target metric
- 8% of experiments have invalid results due to sample ratio mismatch alone
- When only 8% of ideas actually improve the metric, a statistically significant result at p < 0.05 still carries roughly a 26% false-positive risk
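That last figure follows from Bayes' rule applied to significance testing: when real wins are rare, even a "significant" result is often a fluke. The sketch below is a rough check of the arithmetic; the 80% power and the one-sided 0.025 threshold (half of a two-sided 0.05) are illustrative assumptions, not values stated above.

```python
# Rough check of the false-positive-risk claim above.
# Assumptions (illustrative, not from the text): 80% power, two-sided alpha = 0.05,
# so 0.025 of null experiments land in the "improvement" direction by chance.
def false_positive_risk(prior_success=0.08, alpha_one_sided=0.025, power=0.80):
    """P(no real effect | statistically significant positive result), via Bayes' rule."""
    false_positives = alpha_one_sided * (1 - prior_success)  # nulls that look like wins
    true_positives = power * prior_success                   # real wins we detect
    return false_positives / (false_positives + true_positives)

print(f"{false_positive_risk():.0%}")  # ~26%
```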
When to Use It
Use Trustworthy Experiments when you need to:
- Design an A/B test that will produce valid, actionable results
- Determine sample size and runtime for statistical power (a quick sizing sketch follows this list)
- Validate experiment results before making ship/no-ship decisions
- Build an experimentation culture at your company
- Choose metrics (OEC) that balance short-term gains with long-term value
- Diagnose why results look suspicious (Twyman's Law)
- Speed up experimentation without sacrificing validity
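For the sample-size point above, Kohavi's book gives a handy rule of thumb: roughly 16σ²/δ² users per variant for about 80% power at a two-sided α of 0.05, where δ is the minimum absolute difference you want to detect and σ² is the metric's variance. The sketch below applies it to a conversion-rate metric; the baseline rate, target lift, and traffic numbers are made-up examples, not figures from this document.

```python
# Rule-of-thumb sample size per variant: n ~= 16 * sigma^2 / delta^2
# (about 80% power at two-sided alpha = 0.05). The baseline rate, minimum
# detectable effect, and daily traffic below are illustrative assumptions.
import math

def users_per_variant(baseline_rate: float, min_detectable_abs_lift: float) -> int:
    variance = baseline_rate * (1 - baseline_rate)  # Bernoulli variance of a conversion metric
    return math.ceil(16 * variance / min_detectable_abs_lift ** 2)

n = users_per_variant(baseline_rate=0.05, min_detectable_abs_lift=0.005)  # detect 5% -> 5.5%
days = math.ceil(n / 10_000)  # assuming ~10k users per variant per day
print(f"{n:,} users per variant, roughly {days} days of runtime")
```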
When Not to Use It
Don't use controlled experiments when:
- You don't have enough users: controlled experiments need tens of thousands of users at minimum
- The decision is one-time: you can't A/B test a merger or acquisition
- There's no real user choice: e.g., employer-mandated software
- You need immediate decisions: experiments need time to run
- The metric can't be measured: there is no experiment without observable outcomes
Resources
Book:
- Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing by Ronny Kohavi, Diane Tang, and Ya Xu
Quick Install
Add this skill to your AI assistant in 3 simple steps. No coding required!
Step 1: Create the skill file
Run this command to create the directory and SKILL.md file:
mkdir -p .claude/skills/trustworthy-experiments && touch .claude/skills/trustworthy-experiments/SKILL.md
This creates the directory and an empty SKILL.md file.
Step 2: Open the skill file
Open the SKILL.md file in your favorite editor:
nano .claude/skills/trustworthy-experiments/SKILL.md
Or open it in VS Code: code .claude/skills/trustworthy-experiments/SKILL.md
Step 3: Add the content
Copy the skill content and paste it into the SKILL.md file, then save.
Now you can invoke the skill by typing /trustworthy-experiments in your AI assistant, or let the assistant apply it automatically when relevant.
Using a different AI assistant? Place the skill file under the directory your assistant loads skills from, for example:
- .claude/skills/ (Claude Code)
- .opencode/skills/ (OpenCode)
Related Skills
- Stakeholder Update Generator: Create compelling progress updates and release notes
- A/B Test Designer: Design robust A/B test experiments
- PMF Survey (Product-Market Fit Survey): Helps quantify product-market fit and systematically improve it