OTW vs. CI vs. Stat Sig
What do these mean, and when should you use them?
To ensure you make the right decisions, we recommend looking at a combination of three metrics: Odds to Win, Confidence Interval, and Statistical Significance.
1. The Probability (Odds to Win)
- What it is: The likelihood that a variation is performing better than the Control at this exact moment.
- How it works: We calculate this from the very first session. It answers: "Based on the data we have right now, who is in the lead?"
- The Nuance: It has no "minimum sample size." You can have 100% Odds to Win with only 5 visitors, which is why you shouldn't stop a test based on this number alone.
- The Math: We use Binomial Bayesian models for Conversion Rates and Gaussian models for AOV.
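To make the idea concrete, here is a minimal sketch of how a Binomial Bayesian "probability to beat Control" can be estimated for conversion rates. It is an illustration only, not our production calculation: the function name `odds_to_win`, the uniform Beta(1, 1) prior, and the number of random draws are all assumptions made for this example.

```python
import random


def odds_to_win(control_conv, control_n, variant_conv, variant_n,
                draws=20000, seed=0):
    """Estimate P(variant rate > control rate) by sampling Beta posteriors.

    Assumption: a uniform Beta(1, 1) prior on each conversion rate, so the
    posterior is Beta(1 + conversions, 1 + non-conversions).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    wins = 0
    for _ in range(draws):
        c = rng.betavariate(1 + control_conv, 1 + control_n - control_conv)
        v = rng.betavariate(1 + variant_conv, 1 + variant_n - variant_conv)
        if v > c:
            wins += 1
    return wins / draws


# Only 5 visitors per side: Control converts 0 of 5, the variation 5 of 5.
tiny_test = odds_to_win(0, 5, 5, 5)

# Identical results on both sides: neither side is in the lead.
even_test = odds_to_win(50, 1000, 50, 1000)
```

In this sketch, `tiny_test` comes out very close to 100% even though each side has only 5 visitors, which is exactly why Odds to Win should not be used on its own to stop a test. `even_test` lands near 50%.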
More information can be found in our Odds to Win Article.
2. The Range (Confidence Interval)
- What it is: The "Margin of Error." It tells you the likely boundaries of the improvement if you were to roll the change out to everyone.
- The Center (Uplift): The percentage shown (e.g., +3.3%) is our best estimate of the improvement.
- The Error Bounds: We calculate a 95% range around that uplift.
  - Narrow Range: The result is stable.
  - Wide Range: The data is "noisy" or the sample size is low.
- The "Zero" Threshold: If the range includes 0%, the result is inconclusive. If the entire range stays above 0%, we treat it as a "No Risk" win: at the 95% level, even the low end of the estimate is an improvement.
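The relationship between sample size, range width, and the "Zero" Threshold can be sketched in a few lines. This is a hedged example, not the exact method used in our reports: it assumes a standard normal approximation on the log of the ratio of the two conversion rates, and the function name `uplift_interval` and the visitor counts are hypothetical.

```python
import math


def uplift_interval(control_conv, control_n, variant_conv, variant_n, z=1.96):
    """Return (uplift, low, high) for the relative uplift in conversion rate.

    Assumption: a normal approximation on log(variant_rate / control_rate),
    with z = 1.96 for a 95% range. Both sides need at least one conversion.
    """
    pc = control_conv / control_n
    pv = variant_conv / variant_n
    log_ratio = math.log(pv / pc)
    # Standard error of the log ratio of two independent proportions.
    se = math.sqrt((1 - pv) / variant_conv + (1 - pc) / control_conv)
    low = math.exp(log_ratio - z * se) - 1
    high = math.exp(log_ratio + z * se) - 1
    return pv / pc - 1, low, high


# Same +3.3% uplift, two very different sample sizes (hypothetical numbers).
small = uplift_interval(300, 10_000, 310, 10_000)
large = uplift_interval(30_000, 1_000_000, 31_000, 1_000_000)

for label, (uplift, low, high) in (("small", small), ("large", large)):
    verdict = "inconclusive" if low <= 0 <= high else "range excludes 0%"
    print(f"{label}: uplift {uplift:+.1%}, range {low:+.1%} to {high:+.1%} ({verdict})")
```

Both tests show the same center, but the smaller one produces a wide range that includes 0%, while the larger one produces a narrow range that sits entirely above 0%.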
More information can be found in our Confidence Interval Article.
3. The Confirmation (Statistical Significance)
- What it is: The final seal of approval. It tells you that the result is very likely a real effect and not a random fluke.
- The Green Strip: In our reports, we only show the green strip when a variation has reached statistical significance.
- The Requirement: To reach this, two things must happen:
  - The Confidence Interval must be entirely above 0.
  - The Minimum Sample Size (calculated at 80% power) must be met.
- Why it matters: It guards against "false positives," where a variation looks like a winner early on but eventually levels out or drops.
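The two-part requirement can be expressed as a simple check. The sketch below is illustrative only: it assumes the textbook two-proportion sample size formula, a two-sided 5% significance level, and a hypothetical baseline rate and minimum detectable uplift. The inputs we actually use for the Minimum Sample Size are covered in the Statistical Significance Article, and the function names here are made up for the example.

```python
import math
from statistics import NormalDist


def min_sample_size(baseline_rate, relative_mde, alpha=0.05, power=0.80):
    """Visitors needed per variation to detect a given relative uplift.

    Assumption: the standard two-proportion formula with a two-sided test.
    `relative_mde` is the smallest relative uplift worth detecting.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def is_significant(ci_low, visitors_per_variation, required_n):
    """Both conditions must hold: range above 0 AND enough visitors."""
    return ci_low > 0 and visitors_per_variation >= required_n


# Hypothetical test: 3% baseline conversion rate, 10% relative uplift target.
required = min_sample_size(0.03, 0.10)

early_winner = is_significant(ci_low=0.01, visitors_per_variation=1_000,
                              required_n=required)
confirmed = is_significant(ci_low=0.01, visitors_per_variation=60_000,
                           required_n=required)
```

Here `early_winner` is `False` even though its range is above 0, because the sample is still too small, while `confirmed` is `True` because both conditions are met.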
More information can be found in our Statistical Significance Article.