Why Test Plans Matter (Even If They Can Be a Pain)
Writing test plans is important, but also kinda sucks.
Good growth PMs and experimenters need them to keep everyone aligned, set clear goals, and nail down the key stats: sample size, test duration, minimum detectable effect (MDE), and conversion rate (CVR).
So I spent the past couple of months building a tool to make writing test plans less tedious, vibe-coding with Bolt.new and the OpenAI API.
It took a few iterations and a new feature rollout from Bolt, but I eventually got it working (mostly).
Try it here:
https://planner.experimentationlabs.com/
Introducing the AI-Powered Test Plan Generator
The tool guides you through setting up your experiment: context, hypothesis, expected impact, experiment setup, success metrics, and variants.
Each section has an "Enhance with AI" button that taps GPT-4o to improve your draft, making your plan clearer and more professional. It's like having an experimentation expert edit your inputs and flesh them out.
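Under the hood, each enhance action boils down to a single chat completion call. Here's a minimal sketch of what that looks like with the openai Node SDK; the helper name and prompt wording are illustrative, not the tool's actual code:

```typescript
import OpenAI from "openai";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Hypothetical helper: rewrite one test plan section with GPT-4o.
async function enhanceSection(section: string, draft: string): Promise<string> {
  const completion = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [
      {
        role: "system",
        content:
          "You are an experimentation expert. Rewrite the user's test plan section so it is clear and professional, without changing its intent.",
      },
      { role: "user", content: `Section: ${section}\n\nDraft:\n${draft}` },
    ],
  });
  return completion.choices[0].message.content ?? draft;
}
```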
Highlights and Favorite Features:
Test duration calculator: Enter your traffic and statistical targets to instantly see how long your test should run, which cuts out most of the guesswork. All you really need is traffic data for your channel/source, your current CVR, and your target stats (confidence, MDE, power, etc.); there's a sketch of the underlying math right after this list.
AI Enhance feature: Fleshes out inputs like context and hypothesis, making it easier to sound like you know what you're talking about. It also occasionally brings in something I hadn't considered, like a good reason to run the test in the first place.
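To give a feel for the duration math, here's a sketch of the standard two-proportion normal approximation in TypeScript. This is the textbook formula, not necessarily the exact implementation in the tool, and the function name and defaults (95% confidence, 80% power) are my own:

```typescript
// Standard normal quantiles for common settings.
const Z_ALPHA = 1.96; // two-sided alpha = 0.05 (95% confidence)
const Z_BETA = 0.84;  // power = 0.80

function estimateDurationDays(
  dailyVisitors: number, // traffic entering the test per day
  baselineCvr: number,   // e.g. 0.04 for a 4% CVR
  mdeRelative: number,   // e.g. 0.10 for a 10% relative lift
  variants: number       // total variants, including control
): number {
  const p1 = baselineCvr;
  const p2 = baselineCvr * (1 + mdeRelative);
  // Required sample size per variant (normal approximation).
  const nPerVariant =
    ((Z_ALPHA + Z_BETA) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2))) /
    (p1 - p2) ** 2;
  // Traffic is split evenly across variants.
  return Math.ceil((nPerVariant * variants) / dailyVisitors);
}

// Example: 2,000 daily visitors, 4% CVR, 10% relative MDE, A/B test.
console.log(estimateDurationDays(2000, 0.04, 0.1, 2)); // ≈ 40 days
```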
Stuff I Learned Along The Way
I'm still pretty new to databases. Connecting everything together proved tricky, but working through the problems gave me a much better understanding of how they work. I'm still not an expert, but I get it a lot more now, thanks Supabase.
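For context, most of the persistence ends up being simple supabase-js calls like the one below. The table and column names here are made up for illustration, not the tool's real schema:

```typescript
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_ANON_KEY!
);

// Hypothetical helper: persist a plan as a row in a "test_plans" table.
async function savePlan(title: string, sections: Record<string, string>) {
  const { data, error } = await supabase
    .from("test_plans")
    .insert({ title, sections })
    .select()
    .single();
  if (error) throw error;
  return data;
}
```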
I ran into some initial problems integrating the OpenAI API. A timely Bolt update simplified the deployment process significantly. It's probably not bulletproof from a security standpoint yet, but good enough for personal and internal use.
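For what it's worth, the usual way to harden this kind of setup is to keep the OpenAI key out of the browser entirely and route requests through a small server-side function; Supabase Edge Functions (which run Deno) are one option. A rough sketch, with illustrative names and request shape, not what the tool does today:

```typescript
// Server-side proxy so the OpenAI key never ships to the client.
Deno.serve(async (req) => {
  const { section, draft } = await req.json();
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${Deno.env.get("OPENAI_API_KEY")}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "gpt-4o",
      messages: [
        { role: "system", content: "Improve this test plan section." },
        { role: "user", content: `${section}:\n${draft}` },
      ],
    }),
  });
  const json = await res.json();
  return new Response(
    JSON.stringify({ text: json.choices[0].message.content }),
    { headers: { "Content-Type": "application/json" } }
  );
});
```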
Throughout this project, I realized the value of clearly phrasing questions to GPT models. Asking for explanations in simple terms was key to solving the unexpected issues that popped up.
The tool is built with React, TypeScript, and Tailwind (the Bolt stack) and lets you export completed test plans directly to PDF or PowerPoint, which makes sharing and presenting straightforward. There are some things I'd like to improve about the export feature, but I'm honestly not sure what's even possible there yet, TBD.
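For the curious, the PowerPoint side of an export like this can be handled by a library such as pptxgenjs. This sketch uses made-up plan fields and is not the tool's real export code:

```typescript
import pptxgen from "pptxgenjs";

// Hypothetical export: one title slide, then one slide per plan section.
function exportPlanToPptx(title: string, sections: Record<string, string>) {
  const pres = new pptxgen();
  const titleSlide = pres.addSlide();
  titleSlide.addText(title, { x: 0.5, y: 2.5, fontSize: 32, bold: true });
  for (const [name, body] of Object.entries(sections)) {
    const slide = pres.addSlide();
    slide.addText(name, { x: 0.5, y: 0.4, fontSize: 24, bold: true });
    slide.addText(body, { x: 0.5, y: 1.2, w: 9, h: 4, fontSize: 14 });
  }
  return pres.writeFile({ fileName: `${title}.pptx` });
}
```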
Results in Real Life
I've used this for several recent test plans. Here are a few examples of the AI-enhanced output:
Context
We want to test which type of imagery works best for paid landing pages targeting the buyer audience. We plan to test 3 different versions, but in an unorthodox way. We will roll out a new version as a control and then launch 2 different versions, so control vs variant a vs variant b.
Notes
This was actually a pretty good output. I basically brain-dumped an unorthodox test idea into the input, and it was able to take it and make sense of it.
Context
We're running an A/B test on the homepage hero section, comparing the current header with the wave design against the same copy paired with product imagery inspired by the new design direction. The goal is to assess which version drives better engagement and conversions.
Notes
The input here was pretty loose; in fact, I just copied and pasted a request from a stakeholder, and the tool turned it into something usable. I was pretty happy with that.
Next Steps (And a Question for You)
I plan to write more about this process soon, but I'd really like your input:
What frustrates you most about test planning?
Are you currently using AI tools in your workflow?
What insights would you find valuable about building similar AI-driven tools?
Shoot me a message, I'd love to hear from you.
Dan