The Complete Guide to Split Testing with AI
Learn how to run experiments that actually move the needle. Avoid common pitfalls, understand your results, and let AI do the heavy lifting.
What Makes abee.pro Different
Traditional A/B testing tools make you do all the work: come up with ideas, write copy, analyze results, decide what to test next. abee.pro flips the script.
AI-Generated Hypotheses
Our AI analyzes your goals and generates dozens of test hypotheses based on proven conversion principles. No more staring at a blank page.
Continuous Optimization
Winners are automatically promoted. Losers are retired. New challengers are generated. Your experiments run 24/7 without babysitting.
Statistical Rigor
Bayesian statistics give you early indicators while frequentist tests confirm significance. Know when you have a real winner, not a fluke.
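To make the distinction concrete, here is a minimal sketch (in TypeScript, not abee.pro's internal code) of both checks on the same data: a Bayesian estimate of the probability that a variation beats control, using a normal approximation to the Beta posteriors, and a frequentist two-proportion z-test that supplies the confirming p-value.

```typescript
// A minimal sketch of the two kinds of checks, not abee.pro's internal code.

// Standard normal CDF via the Abramowitz-Stegun erf approximation.
function normCdf(z: number): number {
  const x = Math.abs(z) / Math.SQRT2;
  const t = 1 / (1 + 0.3275911 * x);
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) * t +
      0.254829592) * t;
  const erf = 1 - poly * Math.exp(-x * x);
  return z >= 0 ? 0.5 * (1 + erf) : 0.5 * (1 - erf);
}

interface Arm { views: number; conversions: number; }

// Bayesian early indicator: P(variation beats control), using a normal
// approximation to the Beta(1 + conversions, 1 + non-conversions) posteriors.
function probToBeatControl(control: Arm, variation: Arm): number {
  const posterior = ({ views, conversions }: Arm) => {
    const a = 1 + conversions;
    const b = 1 + views - conversions;
    return { mean: a / (a + b), variance: (a * b) / ((a + b) ** 2 * (a + b + 1)) };
  };
  const c = posterior(control);
  const v = posterior(variation);
  return normCdf((v.mean - c.mean) / Math.sqrt(c.variance + v.variance));
}

// Frequentist confirmation: two-sided p-value from a two-proportion z-test.
function twoProportionPValue(control: Arm, variation: Arm): number {
  const p1 = control.conversions / control.views;
  const p2 = variation.conversions / variation.views;
  const pooled =
    (control.conversions + variation.conversions) / (control.views + variation.views);
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / control.views + 1 / variation.views));
  return 2 * (1 - normCdf(Math.abs((p2 - p1) / se)));
}

// Example: 4.0% vs 5.0% conversion rate after 2,000 visitors per arm.
const control = { views: 2000, conversions: 80 };
const challenger = { views: 2000, conversions: 100 };
console.log(probToBeatControl(control, challenger).toFixed(3));  // ~0.94: promising
console.log(twoProportionPValue(control, challenger).toFixed(3)); // ~0.13: not yet significant
```

In this example the Bayesian indicator already favors the challenger while the p-value has not cleared 0.05; that gap is exactly the difference between the "Early Indicators" readout and the "Significant Winner" badge.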
Learns From History
Every test teaches the AI something new. Patterns that work for your audience get reinforced. Dead ends get avoided.
Under the Hood: How the AI Works
abee.pro uses a sophisticated multi-agent system to generate, evaluate, and refine test ideas. Here's a peek behind the curtain.
The Strategist
Analyzes your test history and decides the optimal balance between exploration (trying new themes) and exploitation (doubling down on what works). Early on, it explores widely. As patterns emerge, it focuses on winners.
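The Strategist's exact algorithm isn't published; the sketch below only illustrates the exploration/exploitation idea with a simple decaying epsilon-greedy rule over themes. The theme records and decay schedule are assumptions made for the example.

```typescript
// Illustrative only: a decaying epsilon-greedy rule over hypothesis themes.
interface ThemeRecord { theme: string; wins: number; trials: number; }

function pickNextTheme(history: ThemeRecord[]): string {
  const totalTrials = history.reduce((sum, t) => sum + t.trials, 0);
  // Explore widely early on; exploit more as the test history grows.
  const exploreRate = Math.max(0.1, 1 / Math.sqrt(1 + totalTrials));
  if (Math.random() < exploreRate) {
    // Exploration: try the least-tested theme.
    return history.reduce((a, b) => (a.trials <= b.trials ? a : b)).theme;
  }
  // Exploitation: double down on the best win rate so far.
  const winRate = (t: ThemeRecord) => t.wins / Math.max(1, t.trials);
  return history.reduce((a, b) => (winRate(a) >= winRate(b) ? a : b)).theme;
}

const history: ThemeRecord[] = [
  { theme: "urgency", wins: 3, trials: 5 },
  { theme: "social proof", wins: 1, trials: 4 },
  { theme: "clarity", wins: 0, trials: 1 },
];
console.log(pickNextTheme(history)); // usually "urgency", occasionally a less-tested theme
```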
The Hypothesis Generator
Creates test ideas based on your prompt, conversion psychology principles, and what's worked before. Each hypothesis targets a specific theme: urgency, social proof, clarity, benefit-focus, and more.
The Critic
Scores every hypothesis on relevance, novelty, testability, and predicted impact. Only the strongest ideas make it through. Weak hypotheses get sent back for refinement or discarded entirely.
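As a rough mental model only (the actual criteria and weights aren't published), you can think of the Critic as a weighted score with a gate:

```typescript
// Hypothetical illustration of the gating idea; criteria, weights, and
// thresholds here are assumptions, not abee.pro's published scoring.
interface HypothesisScores {
  relevance: number;   // 0-1: does it fit this page and audience?
  novelty: number;     // 0-1: is it meaningfully different from past tests?
  testability: number; // 0-1: can a single experiment confirm or refute it?
  impact: number;      // 0-1: predicted effect on the goal metric
}

type Verdict = "test" | "refine" | "discard";

function critique(s: HypothesisScores): Verdict {
  const score =
    0.3 * s.relevance + 0.2 * s.novelty + 0.2 * s.testability + 0.3 * s.impact;
  if (score >= 0.7) return "test";   // strong ideas go through
  if (score >= 0.4) return "refine"; // weak ideas are sent back
  return "discard";
}
```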
The Devil's Advocate
Challenges every hypothesis with hard questions: “Will this actually change behavior?” “Is this different enough to matter?” Ideas that can't withstand scrutiny don't get tested.
The Editor
Polishes the final copy for clarity, impact, and brand voice. Makes sure your variations are professional and ready for real users.
The Retrospective Analyst
After each test, extracts learnings and updates the knowledge base. What worked? What didn't? Why? These insights inform future tests.
The result? A tireless optimization team that generates better ideas than most humans, tests them rigorously, and gets smarter with every experiment.
Best Practices for Split Testing
Even with AI doing the heavy lifting, following these principles will dramatically improve your results.
One Test Per Page
If you're testing your homepage headline AND your pricing page CTA, you won't know which change drove results. Run one experiment per page at a time.
Unique Goals Per Experiment
If your homepage test and signup page test both use "completed_signup" as the goal, attribution becomes impossible: which test caused the signup? Create a separate goal for each experiment so every conversion is attributed unambiguously.
Give Tests Time to Run
Statistical significance requires sufficient data. Don't call a winner after 50 visitors. The “Early Indicators” feature shows Bayesian probabilities, but wait for the “Significant Winner” badge before making permanent changes.
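For a sense of what "sufficient data" means in practice, here is a standard sample-size estimate for comparing two conversion rates at 95% confidence and 80% power. It is a planning heuristic, not the exact rule behind the badge.

```typescript
// Rough sample size per variation for a two-sided test (95% confidence, 80% power).
function sampleSizePerVariation(baselineCR: number, relativeLift: number): number {
  const p1 = baselineCR;
  const p2 = baselineCR * (1 + relativeLift);
  const zAlpha = 1.96; // 95% confidence, two-sided
  const zBeta = 0.84;  // 80% power
  const variance = p1 * (1 - p1) + p2 * (1 - p2);
  return Math.ceil(((zAlpha + zBeta) ** 2 * variance) / (p2 - p1) ** 2);
}

// Detecting a 20% relative lift on a 3% baseline needs ~14,000 visitors per variation.
console.log(sampleSizePerVariation(0.03, 0.2)); // 13895
```

That is why 50 visitors tells you essentially nothing.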
Describe Your Context, Not Your Strategy
The prompt builder is conversational—it asks questions to understand your situation. Your job is to describe what you're testing and who it's for. The AI decides how to optimize it. Don't try to direct the testing strategy; let the AI explore angles you might not have considered.
Trust the Process
Not every test will be a winner. That's not failure—that's learning. A test that shows no difference tells you that element isn't the bottleneck. The AI uses this information to focus on more promising areas.
Creating Your First Experiment
Define Your Goal
What action do you want users to take? Click a button? Sign up? Add to cart? Create a goal in abee.pro and add the tracking code to fire when that action happens.
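abee.pro generates the actual tracking snippet for you in the dashboard; the browser sketch below only illustrates the shape of "fire the goal when the action happens". The endpoint URL and payload are assumptions, not the real API.

```typescript
// Hypothetical goal-tracking sketch; use the snippet abee.pro generates for your goal.
function getVisitorId(): string {
  // Your own visitor identifier; here a random id persisted in localStorage.
  const existing = localStorage.getItem("visitor_id");
  if (existing) return existing;
  const id = crypto.randomUUID();
  localStorage.setItem("visitor_id", id);
  return id;
}

async function trackGoal(goalKey: string): Promise<void> {
  await fetch("https://api.abee.pro/v1/goals/track", { // illustrative URL, not the real endpoint
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ goal: goalKey, visitor: getVisitorId() }),
  });
}

// Fire the goal when the tracked action happens, e.g. a completed signup form.
document.querySelector("#signup-form")?.addEventListener("submit", () => {
  void trackGoal("completed_signup");
});
```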
Create an Experiment
Give it a descriptive name and key (e.g., “homepage_hero”). Select your primary goal. Enable Auto-Optimize to let the AI generate and manage variations.
Write Your Prompt
Tell the AI what you're testing and why. Include context about your product, audience, and what you want to achieve. The AI will ask clarifying questions if needed.
Integrate the Code
Use the provided API endpoints to fetch the assigned variation and render it on your page. The integration modal shows copy-paste code for both client-side and server-side implementations.
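As a hypothetical client-side example (copy the real snippet from the integration modal; the endpoint and response fields below are assumptions), fetching and rendering the assigned variation might look like this:

```typescript
// Hypothetical integration sketch; the endpoint and response shape are assumptions.
interface VariationResponse {
  variation: string; // assumed field: which variation was assigned
  content: string;   // assumed field: the copy to render
}

async function renderAssignedHeadline(experimentKey: string): Promise<void> {
  const visitorId = localStorage.getItem("visitor_id") ?? "anonymous";
  const res = await fetch(
    `https://api.abee.pro/v1/experiments/${experimentKey}/assignment?visitor=${visitorId}` // illustrative URL
  );
  const data: VariationResponse = await res.json();
  const headline = document.querySelector("#hero-headline");
  if (headline) headline.textContent = data.content; // render the assigned copy
}

void renderAssignedHeadline("homepage_hero");
```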
Start the Experiment
Click the Play button to activate your experiment. Traffic will be split between your control and the AI-generated challenger. Watch the results roll in.
Understanding Your Results
The experiment page gives you everything you need to understand how your test is performing. Here's what each section tells you.
Quick Stats
At the top, you'll see lifetime totals: Total Views (how many visitors entered the experiment), Conversions (how many completed your goal), and Conversion Rate (conversions / views).
Winner Callout
When a variation achieves statistical significance, a prominent callout appears showing the winner, its lift over control, and the p-value. You can keep the experiment running to find even bigger wins, or end it and update your site whenever you're satisfied.
Variation Cards
Each variation gets a card showing its performance:
- Views: How many visitors saw this variation
- Conversions: How many completed the goal
- CR (Conversion Rate): The percentage that converted
- Uplift: How much better or worse than control, relative to control's rate (e.g., +15.2%); see the worked example after this list
- Early Indicators: Bayesian probability of beating control, shown before significance is reached
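Here's a small worked example of how those numbers relate (uplift is relative to the control's conversion rate, not a difference in percentage points):

```typescript
// Worked example: how the card metrics are derived from raw counts.
const controlCard = { views: 1000, conversions: 40 }; // CR = 4.0%
const variantCard = { views: 1000, conversions: 46 }; // CR = 4.6%

const controlCR = controlCard.conversions / controlCard.views;
const variantCR = variantCard.conversions / variantCard.views;
const uplift = (variantCR - controlCR) / controlCR; // relative to control

console.log(`CR: ${(variantCR * 100).toFixed(1)}%, uplift: +${(uplift * 100).toFixed(1)}%`);
// "CR: 4.6%, uplift: +15.0%"
```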
Goal Performance Chart
Track conversion trends over time. See how each variation performs day-by-day or hour-by-hour. Iteration markers show when new challengers were introduced.
Activity Log
For AI-optimized experiments, the Activity Log shows what the AI is thinking:
- Hypothesis: The reasoning behind the current test
- Theme: The psychological lever being tested (urgency, social proof, etc.)
- Summary: What was learned from completed iterations
- Learnings: Specific insights extracted for future tests
Hypotheses Explorer
Dive deep into the AI's hypothesis registry. See which ideas have been tested, which are queued, and how each performed. Track win/loss records by theme to understand what resonates with your audience.
Common Mistakes to Avoid
🚫 Peeking and Stopping Early
Checking results hourly and stopping when you see a “winner” leads to false positives. Statistical significance exists for a reason. Let tests reach the required sample size.
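To see why, the simulation below runs A/A tests (two identical variations, so any "winner" is a false positive) and stops the first time a naive z-test crosses the 95% threshold. It is an illustration of the statistics, not abee.pro code: checked once at the end, a test is "significant" about 5% of the time; checked every 100 visitors, the rate climbs far higher.

```typescript
// Simulate an A/A test with repeated peeking: returns true if any peek
// declares a "winner" even though both arms are identical.
function peekedAATest(baselineCR: number, visitorsPerArm: number, peekEvery: number): boolean {
  let convA = 0;
  let convB = 0;
  for (let n = 1; n <= visitorsPerArm; n++) {
    if (Math.random() < baselineCR) convA++;
    if (Math.random() < baselineCR) convB++;
    if (n % peekEvery === 0) {
      const pooled = (convA + convB) / (2 * n);
      const se = Math.sqrt(pooled * (1 - pooled) * (2 / n));
      const z = se > 0 ? Math.abs(convB / n - convA / n) / se : 0;
      if (z > 1.96) return true; // stopped early on a phantom winner
    }
  }
  return false;
}

let falsePositives = 0;
const simulations = 2000;
for (let i = 0; i < simulations; i++) {
  if (peekedAATest(0.05, 5000, 100)) falsePositives++;
}
// Far above the ~5% you'd get from a single look at the end.
console.log(`${((falsePositives / simulations) * 100).toFixed(1)}% false positives`);
```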
🚫 Testing Too Many Things at Once
If you change the headline, button color, and image simultaneously, you'll never know which change mattered. Test one element at a time, or use proper multivariate testing (coming soon).
🚫 Ignoring Segment Differences
A variation might lose overall but win for mobile users. Or vice versa. Consider whether your audience segments might respond differently.
🚫 Testing Low-Traffic Pages
If a page gets 10 visitors per week, it'll take months to reach significance. Focus experiments on high-traffic pages where you can learn quickly.
🚫 Not Acting on Winners
Finding a winner is only valuable if you implement it. Don't let successful tests sit in limbo. Update your production copy and start the next experiment.
Start Optimizing Today
Join thousands of teams using AI to optimize their conversion rates. Your first experiment is free.