
Incrementality Testing Checklist
Learn how incrementality testing can accurately measure the impact of your advertising, helping you optimize ad spend and campaign strategies.
Incrementality testing helps you measure the true impact of your advertising by comparing results from test and control groups. It goes beyond tracking clicks and conversions to reveal how much of your business growth is directly driven by ads versus what would happen naturally. This method is essential for making informed decisions about ad spend and campaign strategies.
Key Steps for Effective Incrementality Testing:
Set Clear Goals and Metrics: Define specific questions and success metrics like Incremental Return on Ad Spend (iROAS) or Cost Per Incremental Acquisition (iCPA). Ensure metrics align with your business objectives and are measurable.
Create Balanced Groups: Divide your audience into statistically similar test and control groups using randomization methods. Maintain consistency in group composition throughout the test.
Run the Test Carefully: Control variables, manage external factors, and track conversions accurately. Use proper timelines and ensure statistical significance in results.
Analyze Results: Calculate incremental lift, validate statistical accuracy, and document findings to guide future campaigns.
Apply Insights: Update campaigns, refine attribution models, and establish a regular testing schedule to continuously improve performance.
This structured approach ensures reliable insights, helping you allocate budgets effectively and focus on strategies that drive measurable results.
Set Clear Goals and Metrics
Establishing clear objectives and success metrics is a must for any effective incrementality test. Without these, even a well-executed test can leave you with data that’s more confusing than actionable, making it tough to drive decisions that truly impact your business.
The key to meaningful testing lies in linking your test goals to your broader marketing strategy. This connection shapes everything, from how you design your test to which metrics you choose. When your tests align with your strategy, the insights you gain can directly inform future campaigns and budget decisions.
Define Test Goals
Your goals should focus on answering specific, business-critical questions that influence revenue and performance. Start by identifying what you need to learn from your current advertising efforts. For instance, are you trying to determine if your display campaigns are actually driving incremental sales? Or are you curious if increasing Connected TV ad frequency improves brand recall? Each test should focus on one primary question to keep your results clear and actionable.
Some common goals include measuring sales lift, sign-ups, or brand recall. For subscription-based businesses, the focus might shift to incremental sign-ups or trial conversions. If you're looking at brand awareness, you’ll likely measure metrics like aided or unaided brand recall, purchase intent, or brand favorability. While these metrics are harder to quantify than direct sales, they’re essential for understanding the impact of upper-funnel advertising strategies.
Timing is another critical factor when setting your goals. Some advertising effects show up quickly, while others take weeks or even months to materialize. For example, a test measuring the impact of a promotional campaign may focus on immediate sales lift, whereas a brand awareness test might track changes over 30 to 60 days. Aligning your test timeframe with your overall strategy ensures you’re gathering insights that matter.
Choose Key Performance Indicators
Once your goals are clear, select metrics that will accurately measure your campaigns’ impact. These metrics should directly inform your advertising decisions. One of the most important metrics for many advertisers is Incremental Return on Ad Spend (iROAS), which ties your ad spend directly to incremental revenue. For example, an iROAS of 3.0 means every $1 spent on ads generates $3 in incremental revenue.
Another critical metric is Cost Per Incremental Acquisition (iCPA), especially for businesses focused on customer acquisition. This metric calculates the true cost of acquiring new customers through ads by dividing your total ad spend by the number of incremental conversions. It’s a great way to understand how much you’re spending to drive results beyond organic efforts.
You can also look at conversion rate lift, which measures the percentage point difference in conversion rates between your test and control groups. For instance, if your control group converts at 2.5% and your test group converts at 3.2%, you’re seeing a 0.7 percentage point lift, translating to a 28% relative improvement.
For campaigns with a longer-term focus, consider Incremental Customer Lifetime Value (iCLV). This metric evaluates the long-term revenue generated per customer, offering a broader view of your campaign’s sustained impact. While it can be complex to calculate, iCLV provides valuable insights into the lasting effects of your advertising efforts.
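To make these metrics concrete, here's a minimal Python sketch that calculates iROAS, iCPA, and conversion rate lift from aggregate campaign results. All of the input figures (spend, conversions, order value, group size) are hypothetical examples, not benchmarks.

```python
# Illustrative KPI calculations for an incrementality test.
# All input figures are hypothetical examples.

ad_spend = 15_000.00             # total test-group ad spend ($)
avg_order_value = 62.50          # average revenue per conversion ($)
group_size = 100_000             # users per group (equal-sized groups assumed)
test_conversions = 3_200         # conversions in the exposed (test) group
control_conversions = 2_500      # conversions in the holdout (control) group

# Incremental conversions: what the ads drove beyond the organic baseline.
incremental_conversions = test_conversions - control_conversions

# iROAS: incremental revenue generated per dollar of ad spend.
iroas = (incremental_conversions * avg_order_value) / ad_spend

# iCPA: ad spend required for each incremental conversion.
icpa = ad_spend / incremental_conversions

# Conversion rate lift: percentage-point and relative improvement.
test_rate = test_conversions / group_size          # 3.2%
control_rate = control_conversions / group_size    # 2.5%
lift_pp = (test_rate - control_rate) * 100
lift_relative = (test_rate - control_rate) / control_rate * 100

print(f"iROAS: {iroas:.2f} | iCPA: ${icpa:.2f}")
print(f"Lift: {lift_pp:.1f} percentage points ({lift_relative:.0f}% relative)")
```

With these example inputs the relative lift works out to the same 28% improvement described above; swap in your own spend, conversions, and order value to see your actual figures.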
Make sure your results reach statistical significance at a 95% confidence level so they can be trusted. In some cases, like smaller sample sizes or preliminary tests, a 90% threshold might be acceptable. Achieving statistical significance often requires running tests for a longer period or involving larger audience segments.
Secondary metrics, such as incremental email sign-ups, app downloads, or store visits, can add valuable context to your primary KPIs. These metrics help you see the broader impact of your advertising efforts, beyond just the final conversion.
When choosing metrics, ensure they align with your attribution model and measurement capabilities. Avoid picking KPIs that are difficult to track or irrelevant to your business goals. Focus on metrics that will genuinely guide your future advertising strategies and budget decisions.
Create Test and Control Groups
Setting up balanced test and control groups is essential to accurately measure the impact of your ads. These groups need to be statistically similar so that any performance differences can be confidently attributed to your advertising efforts. If the groups aren’t properly designed, your results could be misleading, potentially invalidating the entire test.
The reliability of your insights depends heavily on how well these groups are constructed. Even small imbalances can distort results, leading to incorrect conclusions about your campaign’s effectiveness. But when done right, this process provides the clarity and confidence needed to make informed decisions about your budget and strategy. To help you get started, here’s a step-by-step guide to segmenting, assigning, and sizing your groups effectively.
Divide Your Audience
Begin by identifying the audience you want to test, then segment them based on factors that align with your business goals.
Geographic segmentation: For campaigns like Connected TV or Digital Out-of-Home advertising, divide your audience by DMAs, zip codes, or households, depending on the scope of your campaign.
Demographic segmentation: Use characteristics like age, gender, income, or household composition. For instance, if your campaign targets parents, ensure both groups have similar distributions of household sizes and children’s ages. This approach works well for retail and consumer goods campaigns.
Behavioral segmentation: Focus on past actions, such as purchase history, website visits, or app usage patterns. This method is particularly useful for e-commerce and subscription-based businesses, where previous behaviors often predict future actions. Balance both groups with similar distributions of high-value customers, recent buyers, and engagement levels.
Lookalike segmentation: Create groups based on similarities to your current customers. This ensures that both groups have comparable conversion potential. This method is often used in acquisition campaigns to maintain audience quality across test and control groups.
Keep your attribution window in mind when dividing audiences. For example, if you’re measuring results over 30 days, make sure your segments stay consistent throughout that period. Avoid including users who might drop out or change behavior during the test timeframe.
Assign Groups Randomly
Use true randomization methods, like random number generators or systematic sampling, to give every individual an equal chance of being placed in either group. If needed, apply techniques like stratified or matched pair randomization to maintain balance across key variables.
Stratified randomization: This method ensures specific characteristics are evenly distributed. For instance, if geographic location is a key factor, assign users randomly within each region instead of across the entire audience. This maintains randomization while controlling for important variables.
Matched pair randomization: Pair similar users and assign one to each group. This approach is especially helpful for smaller audiences where pure randomization might result in imbalanced groups. Match users based on metrics like lifetime value, purchase frequency, or engagement scores.
Avoid shortcuts like convenience sampling or cherry-picking, as these can introduce bias. Similarly, don’t use opt-in control groups, as people who volunteer to avoid ads often behave differently from the general audience.
Document your randomization process carefully. This allows you to replicate successful methods in future tests and ensures transparency. Make a detailed record of how the groups were created, including any exclusions or adjustments. Once the groups are set, determine the best split to balance statistical strength with business impact.
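If you handle group assignment yourself rather than through a platform, stratified randomization can be as simple as shuffling users within each stratum and splitting each stratum by your chosen ratio. Below is a minimal sketch; the field names and user records are hypothetical.

```python
import random
from collections import defaultdict

def stratified_assignment(users, stratum_key, test_share=0.5, seed=42):
    """Randomly assign users to 'test' or 'control' within each stratum.

    users: list of dicts, e.g. {"id": "u1", "region": "Midwest"}
    stratum_key: field used to stratify, e.g. "region"
    test_share: fraction of each stratum placed in the test group
    """
    rng = random.Random(seed)  # fixed seed so the split can be reproduced and documented
    strata = defaultdict(list)
    for user in users:
        strata[user[stratum_key]].append(user)

    assignments = {}
    for stratum, members in strata.items():
        rng.shuffle(members)  # randomize order within the stratum
        cutoff = round(len(members) * test_share)
        for i, user in enumerate(members):
            assignments[user["id"]] = "test" if i < cutoff else "control"
    return assignments

# Hypothetical usage
users = [
    {"id": "u1", "region": "Northeast"},
    {"id": "u2", "region": "Northeast"},
    {"id": "u3", "region": "Midwest"},
    {"id": "u4", "region": "Midwest"},
]
print(stratified_assignment(users, "region"))
```

Logging the seed and the resulting per-group counts is a convenient way to document the randomization for later replication.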
Choose Group Size Split
Deciding on the right split between test and control groups depends on factors like the minimum detectable effect, budget, audience size, and test duration.
A 50/50 split maximizes statistical power, making it ideal when you need precise results.
80/20 or 70/30 splits can balance test coverage with reduced business risk but may require longer test durations to maintain statistical confidence.
When choosing a split, consider the following:
Minimum detectable effect: If you need to measure small changes (under 5%), larger sample sizes and balanced splits are essential. For larger effects (15% or more), uneven splits may still provide reliable insights.
Budget constraints: Testing with 80% of your audience costs more than a 50/50 split. Factor these costs into your budget to ensure the test can run long enough to produce meaningful results.
Seasonal factors: During peak sales periods, smaller control groups can minimize revenue impact. In slower periods, larger control groups are more acceptable since the business impact is lower.
Audience size: Smaller audiences often require 50/50 splits to achieve sufficient sample sizes, while larger audiences allow for more flexibility with uneven splits.
Test duration: Uneven splits usually need longer test periods to achieve the same level of confidence as balanced splits - the sketch after this list shows the trade-off. Make sure your test duration aligns with the conditions you're measuring.
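To see how the split affects sample requirements, you can run a quick power calculation before committing to a ratio. This sketch uses statsmodels and assumes an illustrative 2.5% baseline conversion rate and a 15% minimum detectable relative lift; swap in your own figures.

```python
# Compare required audience sizes for balanced vs. uneven splits.
# Baseline rate and expected lift are illustrative assumptions.
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline_rate = 0.025    # assumed control conversion rate (2.5%)
expected_lift = 0.15     # minimum detectable effect: 15% relative lift
effect = proportion_effectsize(baseline_rate * (1 + expected_lift), baseline_rate)

power = NormalIndPower()
for label, ratio in [("50/50 split", 1.0), ("80/20 split", 4.0)]:
    # ratio = test group size / control group size; solve_power returns the control size
    n_control = power.solve_power(effect_size=effect, alpha=0.05, power=0.8, ratio=ratio)
    n_test = n_control * ratio
    print(f"{label}: control {n_control:,.0f}, test {n_test:,.0f}, "
          f"total {n_control + n_test:,.0f}")
```

Running this shows the uneven split demanding a noticeably larger total audience (or, equivalently, a longer test) to reach the same confidence as the balanced split.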
Run the Test
Once you've set your goals and organized your test groups, it's time to dive into the execution phase. This step is all about precision - controlling variables and tracking results with care. Even minor oversights here can skew your results, so attention to detail is key. The challenge lies in keeping everything tightly controlled while juggling a lot of moving parts.
Running a successful incrementality test means carefully managing variables, minimizing outside influences, and ensuring your data stays clean and reliable. Companies that establish clear KPIs during this phase are three times more likely to hit their targets compared to those with vague benchmarks [1][5].
Set Test Variables and Timeline
Your test variables are the backbone of your experiment. Choosing the right ones directly affects how accurate and useful your insights will be [1]. Before you begin, make sure you’ve defined a clear hypothesis.
Key variables might include ad spend levels, creative designs, audience targeting, or your channel mix. For Connected TV campaigns, you could test frequency caps or dayparting strategies. With Display and Native campaigns, you might focus on creative formats, bidding tactics, or placement choices. The specifics depend on your goals and what you're aiming to refine.
Timelines matter too. Align the duration of your test with your customers' buying cycles and your business rhythms. Most tests run for 2 to 6 weeks - long enough to gather meaningful data but short enough to avoid interference from things like seasonal trends or overlapping campaigns [1][3]. For example, e-commerce brands with fast purchase cycles might get solid results in 2-3 weeks, while B2B companies or products with longer consideration periods may need 4-6 weeks.
Sample size is equally important. For instance, to detect a 10% lift with 95% confidence and 80% statistical power, you’ll need about 6,200 users per group [1]. If you're expecting smaller lifts, plan for even larger sample sizes, and adjust your timeline accordingly.
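The per-group requirement depends heavily on your baseline conversion rate, which is why figures like the one above are best treated as reference points. Here's a rough sketch of the standard two-proportion sample size calculation; the baseline rates are assumptions chosen for illustration, and the cited ~6,200-per-group figure is consistent with a fairly high baseline.

```python
# Rough per-group sample size for detecting a relative lift in conversion rate.
# Baseline rates below are illustrative assumptions; lower baselines need far more users.
from scipy.stats import norm

def users_per_group(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = norm.ppf(1 - alpha / 2)   # 1.96 for 95% confidence
    z_beta = norm.ppf(power)            # 0.84 for 80% power
    # Standard two-proportion approximation, assuming equal group sizes.
    return (z_alpha + z_beta) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p2 - p1) ** 2

print(round(users_per_group(0.20, 0.10)))   # ~6,500 per group at a 20% baseline
print(round(users_per_group(0.05, 0.10)))   # ~31,000 per group at a 5% baseline
```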
Document everything - start and end dates, budgets, creative assets, and targeting criteria. Having these details locked in helps you manage external variables more effectively.
Manage Outside Factors
External influences can muddy your test results, making it hard to pinpoint whether your variable caused the changes you’re seeing. The goal is to create a controlled environment where only the test variable drives the outcomes.
Coordinate your campaigns by pausing or accounting for other marketing efforts during your test. For instance, if you're testing the effectiveness of Connected TV ads, avoid ramping up email marketing or launching new Display campaigns targeting the same audience segments. Keep your baseline activities steady across all groups.
Timing is critical. Avoid running tests during major events like Black Friday or back-to-school sales unless you can factor these influences into your analysis [1][3]. If testing during a promotional period is unavoidable, ensure both groups are exposed to the same promotions.
Geographic and demographic controls help maintain the integrity of your test. Use tools like geo-fencing to limit ad delivery to specific areas and prevent overlap between test and control groups [1][3]. For household-based testing, make sure shared devices or locations don’t cause cross-contamination between groups.
Here’s an example: In May 2025, Shinola ran a geo-based test and achieved a 14.3% lift in conversions [1].
Once external factors are under control, the next focus is ensuring accurate conversion tracking.
Track Conversions Properly
Conversion tracking is the cornerstone of any incrementality test. Errors here can undermine your entire effort. Unlike attribution-based models, incrementality testing doesn’t rely on third-party cookies or user-level tracking, which makes it especially relevant in today’s privacy-conscious world [1][2][4][6].
Tracking setup should cover all relevant conversion events for both test and control groups. Use platforms like Google Analytics, affiliate link tracking, or custom conversion pixels to ensure consistent tracking across groups [3].
Privacy regulations like CCPA and GDPR must be respected. Incrementality testing naturally aligns with these requirements by focusing on aggregate group performance rather than individual user data. This approach delivers actionable insights without breaching privacy rules or requiring extensive user consent.
Validate your data regularly throughout the test. Check that conversion numbers align with historical trends, confirm that both groups have similar baseline performance, and ensure all relevant touchpoints are being tracked. Set up alerts to flag any unusual data patterns.
Attribution windows should match your customers’ buying journeys. Short windows work for impulse buys, while longer ones are better for high-consideration purchases. Clearly document your measurement window, as it significantly affects how you interpret the results.
Finally, aim for statistical significance. Reaching 95% confidence means there is only a small chance the measured lift is due to random variation [1][4]. A p-value below 0.05 typically indicates statistical significance [1][4]. However, remember that statistical significance doesn't always translate to business impact - evaluate your results in the context of your goals and costs.
Review and Understand Results
After setting up your controlled test and tracking methods, it’s time to turn the raw data into actionable insights. This step is where you evaluate whether the changes you observed were genuinely driven by your test variable - and if those changes are worth the investment. Missteps in this phase can lead to poor strategy decisions, so precision is key.
Effective analysis requires a structured approach. This means crunching the numbers, validating findings, and keeping detailed records for future use. A systematic evaluation process makes your conclusions far more reliable than rushing through this critical step.
Calculate Incremental Lift
Incremental lift measures the value your advertising delivered beyond natural outcomes. It compares your test group’s performance to your control group’s, isolating the true impact of your campaigns.
Here’s the basic lift formula:
Incremental Lift = (Test Group Conversions - Control Group Conversions) / Control Group Conversions × 100
For example, if your test group had 1,200 conversions and your control group had 1,000, the lift would be:
(1,200 – 1,000) / 1,000 × 100 = 20%.
If your campaigns span multiple channels, break down the results by channel to identify which ones deliver the most value. This analysis can help fine-tune budget allocation and uncover trends tied to specific platforms, timing, or audience segments.
Don’t forget to account for baseline performance. For instance, if your control group’s conversion rate was already declining, a flat performance in the test group might still represent a win. On the other hand, during growth phases, an apparent lift might not be as impressive as it seems.
To gauge efficiency, calculate the cost per incremental conversion by dividing your total ad spend by the number of incremental conversions. This metric allows for direct comparisons across campaigns and channels.
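A per-channel breakdown can be done with a few lines of arithmetic once conversions and spend are aggregated by channel. The channels, counts, and spend figures below are hypothetical, and equal-sized groups are assumed for simplicity.

```python
# Per-channel incremental lift and cost per incremental conversion.
# All figures are hypothetical; test and control groups are assumed equal-sized per channel.
channels = {
    # channel: (test conversions, control conversions, test-group ad spend $)
    "Connected TV": (1_200, 1_000, 40_000),
    "Display":      (850, 800, 15_000),
    "Native":       (640, 600, 9_000),
}

for channel, (test_conv, control_conv, spend) in channels.items():
    incremental = test_conv - control_conv
    lift_pct = incremental / control_conv * 100
    cost_per_incremental = spend / incremental if incremental > 0 else float("inf")
    print(f"{channel}: lift {lift_pct:.1f}%, "
          f"cost per incremental conversion ${cost_per_incremental:,.2f}")
```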
Finally, validate your results using statistical tests to confirm their reliability.
Check Statistical Accuracy
Statistical validation ensures your findings are based on real patterns, not random fluctuations. Without it, you risk making decisions based on coincidence rather than meaningful data.
To confirm statistical significance, use tools like a two-sample t-test or chi-square test, depending on your data. Aim for a p-value below 0.05, which indicates 95% confidence. For example, if your test shows a lift but the p-value is higher, the improvement might be due to chance. In such cases, consider extending the test duration or increasing the sample size.
Confidence intervals are another essential tool. They provide a range around your lift estimate, showing how precise your results are. Narrow intervals indicate reliability, while wide ones suggest the need for more data or stricter controls.
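If you want to run these checks yourself, a chi-square test on the 2x2 conversion table plus a normal-approximation confidence interval for the lift covers both steps. The conversion counts and group sizes below are illustrative.

```python
# Chi-square significance test and a confidence interval for the lift.
# Conversion counts and group sizes are illustrative.
from scipy.stats import chi2_contingency, norm

test_conv, test_n = 1_200, 50_000          # conversions, users in the test group
control_conv, control_n = 1_000, 50_000    # conversions, users in the control group

# 2x2 contingency table: [converted, did not convert] for each group
table = [
    [test_conv, test_n - test_conv],
    [control_conv, control_n - control_conv],
]
chi2, p_value, dof, expected = chi2_contingency(table)
print(f"p-value: {p_value:.4f} (significant at 95% confidence if below 0.05)")

# 95% confidence interval for the difference in conversion rates (normal approximation)
p_test, p_control = test_conv / test_n, control_conv / control_n
diff = p_test - p_control
se = (p_test * (1 - p_test) / test_n + p_control * (1 - p_control) / control_n) ** 0.5
z = norm.ppf(0.975)
low, high = diff - z * se, diff + z * se
print(f"Lift: {diff:.2%} (95% CI {low:.2%} to {high:.2%})")
```

A confidence interval that excludes zero tells the same story as a p-value below 0.05, and its width shows how precise the lift estimate really is.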
Look for external factors that might have influenced your results. A sudden spike in conversions, for example, could be tied to an external event - like a viral post or unexpected news coverage - that impacted your test and control groups differently. Check for outliers or inconsistencies across time, regions, or demographics that don’t align with broader trends.
Ensure your sample is representative, just as in the test setup phase. Also, consider seasonality by comparing your test period to the same timeframe in previous years. Seasonal trends can either exaggerate or mask lift, so it’s essential to account for them.
Finally, use power analysis to verify your sample size was sufficient. An underpowered test might miss significant effects or overstate minor ones, making this step critical for planning stronger future experiments.
Record Key Findings
Once you’ve confirmed your results are statistically sound, it’s time to document everything. This step builds institutional knowledge and serves as a roadmap for future campaigns. Create a detailed summary that includes your hypothesis, methodology, results, and interpretations. Highlight which variables had the most impact and which didn’t contribute much. For example, if a frequency capping strategy outperformed an uncapped approach, note the specific settings and audience segments involved.
Don’t overlook unexpected findings, challenges during implementation, or statistical limitations. These insights can guide your next round of testing.
Quantify the broader business impact by calculating metrics like total revenue impact, customer acquisition costs, and effects on customer lifetime value. This context is essential for shaping strategic decisions.
Finally, translate your findings into actionable recommendations. Instead of simply noting that one creative performed better, detail the specific elements - like the headline, imagery, or call-to-action - that drove its success.
Store all this information in a centralized location for easy access. Include raw data, analysis files, and notes on any external factors that influenced the test. Comprehensive documentation ensures your team has a solid foundation for future testing and optimization efforts.
Apply Results and Plan Next Tests
Once you've gathered solid insights from your tests, the next step is to use that data to fine-tune your campaigns and map out future experiments. These results should not only lead to immediate improvements but also help you build a testing system that adapts as your business evolves.
Make sure to update campaigns, refine how you measure success, and create a regular testing schedule to keep improving over time.
Update Campaigns Based on Data
Begin by focusing on the changes that your test results highlight as having the biggest impact. Here are some areas to prioritize:
Budget reallocation: Shift funds toward the segments or strategies showing the highest incremental lift to maximize returns.
Creative optimization: Adjust your content to emphasize elements that perform well while removing or revising underperforming pieces.
Frequency capping: If your tests show better results with moderate ad exposures, tighten frequency controls to avoid oversaturating your audience.
Geographic targeting: Increase investment in regions with stronger performance while scaling back in areas with weaker results.
Scheduling adjustments: Focus your efforts on time periods that your data identifies as peak performance windows.
These updates will not only optimize your campaigns but also prepare the groundwork for refining how you measure and assign credit across different channels.
Update Attribution Models
Your test results can also guide improvements to how you track and assign credit for conversions. Traditional last-click attribution might not fully capture the impact of awareness-building channels.
Adjust attribution windows: If your data shows that conversions happen over a longer period than expected, extend your attribution window to capture the full impact of your campaigns.
Reassign credit: Use your findings to ensure each channel receives credit based on its incremental contribution. For example, if a specific channel consistently drives lift, create custom attribution models to reflect its role.
Incorporate incremental metrics: Update your reporting tools to include metrics like incremental cost per acquisition (iCPA) and incremental return on ad spend (iROAS) for a clearer picture of your campaign's effectiveness.
Cross-channel modeling: If your data shows that combining channels boosts performance, adjust your models to account for these interaction effects.
Leverage data-driven models: Consider models that automatically distribute credit based on actual conversion paths, integrating incrementality insights more effectively than rule-based approaches.
These attribution updates ensure your measurement strategy aligns with the realities of your campaign performance, setting you up for consistent, data-driven decisions.
Plan Regular Testing Schedule
To maintain momentum, establish a structured testing schedule that balances optimization with your team’s workload and audience segmentation. Regular testing helps you stay ahead and refine strategies over time.
Test campaign elements systematically: Rotate tests across creatives, audience segments, channels, and timing to ensure balanced improvements without overloading your team or audience.
Define clear hypotheses: Avoid vague plans like "test new audiences." Instead, specify the audience segments and expected outcomes to improve both the design and interpretation of your tests.
Time tests strategically: Align testing with major launches, seasonal promotions, or stable periods to avoid skewed results from external disruptions.
Allow buffer time: Leave enough time between tests to implement findings and let your campaigns stabilize. You can also use ongoing holdout tests to monitor incremental trends without the complexity of larger-scale testing.
How OTHERSIDE Supports Testing
For incrementality testing to succeed, you need the right tools: solid infrastructure, reliable data, and compelling creative assets. OTHERSIDE’s platform is designed to provide all of these, making your testing process not only possible but effective. Let’s dive into how OTHERSIDE supports each critical piece of incrementality testing.
Cross-Channel Retargeting and Real-Time Optimization
Incrementality testing thrives on accurate audience segmentation and smooth execution across multiple platforms. OTHERSIDE’s cross-channel retargeting ensures your test and control groups remain distinct while tracking user interactions across platforms like Connected TV, Display & Native, Mobile Apps, and Digital Out-of-Home & Audio.
The Nexus Engine™ performance optimization algorithm works in real time, adapting campaigns to market changes. This means your test variables stay consistent, even as external conditions shift.
When running creative or audience tests, OTHERSIDE’s cross-channel setup keeps control groups isolated and messaging coordinated across platforms. This ensures you can measure true incremental lift without interference.
But it’s not just about infrastructure. OTHERSIDE also delivers when it comes to reporting and creative support.
Clear Reporting and Creative Services
Accurate data is the foundation of actionable insights, and OTHERSIDE excels in conversion tracking and creative testing. One of the toughest parts of incrementality testing is pinpointing which creative elements drive results. OTHERSIDE tackles this head-on with detailed reporting and integrated creative development.
"Our pixels track every touchpoint, giving you full-funnel attribution and daily reporting that shows exactly what's working and where your dollars go." [7]
This full-funnel attribution gives you a complete picture of how your test and control groups perform at every stage of the customer journey, making it easier to calculate incremental lift.
On top of that, OTHERSIDE offers in-house creative services at no additional cost. This means you can quickly develop and test multiple creative variations without worrying about extra expenses or delays. By aligning creative testing with audience and channel testing, OTHERSIDE creates a cycle of continuous improvement that builds on your findings.
Targeted Reach and Scalable Campaigns
Precise audience segmentation is key, but scaling your campaigns effectively is just as important. OTHERSIDE makes it easy to move from testing to full-scale execution. With access to over 400 data partners and a flat monthly retainer of $3,750, you get precise targeting and scalable campaigns without breaking the bank. Plus, unlimited scalability means you can expand successful campaigns right away.
You can run multiple tests at once or quickly iterate based on early results. A dedicated account manager and Slack support ensure you maintain momentum throughout the process.
Backing this up, a third-party Marketing Mix Modeling study found that OTHERSIDE delivered a 2.2x return on investment from top-of-funnel programmatic spend [7]. That’s the kind of measurable impact that makes testing worthwhile.
Conclusion
Incrementality testing takes the guesswork out of advertising strategies, providing a clear path to making decisions grounded in data. It's a reliable way to drive growth in your campaigns.
But here's the thing - this process demands precision at every stage. From designing the test and randomizing assignments to tracking results and validating them statistically, every step matters to ensure the insights are accurate and actionable.
Beyond the basics, execution plays a huge role. You’ll need the right infrastructure to maintain controlled environments, tools to track cross-channel effects, and creative teams ready to test variations quickly. Success hinges on scaling what works without delay while continuing to tweak what doesn’t.
OTHERSIDE’s Nexus Engine™ takes this to the next level. By combining cross-channel capabilities with real-time optimization, it simplifies sophisticated incrementality testing. Plus, with a flat $3,750 monthly retainer and unlimited scalability, you can roll out successful tests immediately, making it easier to turn insights into action.
The key to staying ahead? Consistent testing, learning, and improving. Start by addressing your biggest questions, stick to the checklist, and let the data shape your strategy. Each insight brings you closer to a smarter, more refined approach.
