There’s an old saying in politics that you stand where you sit.
The same could be said for many voices championing experimentation and CRO today.
Most thought leaders in the A/B testing space are seasoned veterans at agencies and consultancies. Many are thoughtful and generous with their expertise and opinions. But because of where they sit, the picture many of us get of experimentation is skewed.
This is especially true now that Google is sunsetting Google Optimize, a free tool many web experimentation agencies favor, in September 2023. Opinions on experimentation and how to choose an A/B testing tool are running rampant on LinkedIn. This means it’s more important than ever to understand the bias affecting a lot of experimentation talk today.
You shall not pass
While a company may hire an agency to help build its experimentation program, engineers rarely let any third party touch their backend code. As a result, most discussion about experimentation concerns web experimentation, where UX and messaging “on the front end” are optimized to move one or more KPIs.
Look closely at the niche engineering blogs by the companies many of us in the experimentation industry envy: Netflix, Amazon, Uber, Noom, etc. You’ll see nary a “growth hacker.” Instead, they’re chock full of engineers, PMs, and data wonks. In this land, Feature and Full Stack Experimentation rule. And the success of that practice, coupled with challenges to Web Experimentation, means you should know about it too.
It’s the KPIs, silly!
Feature Experimentation, or “product experimentation,” is largely identical to web experimentation except for one significant detail: it fits into the existing workflow of how most companies release features on the backend today.
Instead of making an A/B test, product and engineering teams build and release features. Crucially, the feature is already being built; now, the PM or engineer can create variables and variations of the feature. There is no experiment to build; there is a feature to experiment with.
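To make “variables and variations of the feature” concrete, here is a minimal sketch of how a feature with two variations might be served. Everything in it is hypothetical: the feature key `new_checkout`, the variable names, the traffic weights, and the hash-based bucketing are illustrative assumptions, not the API of any particular tool.

```python
import hashlib
from typing import Dict

# Hypothetical feature: each variation is a set of values for the
# feature's variables. The names and values are made up for illustration.
VARIATIONS: Dict[str, Dict[str, object]] = {
    "control": {"button_label": "Buy now", "show_express_pay": False},
    "variant_a": {"button_label": "Checkout", "show_express_pay": True},
}

# Assumed traffic split; the weights should sum to 1.0.
WEIGHTS: Dict[str, float] = {"control": 0.5, "variant_a": 0.5}


def assign_variation(user_id: str, feature_key: str,
                     weights: Dict[str, float]) -> str:
    """Deterministically map a user to a variation name.

    Hashing the feature key together with the user ID gives each user
    a stable bucket in [0, 1), so they see the same variation on every
    request without any stored state.
    """
    digest = hashlib.sha256(f"{feature_key}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000
    cumulative = 0.0
    for name, weight in weights.items():
        cumulative += weight
        if bucket < cumulative:
            return name
    # Fallback in case floating-point rounding leaves a tiny gap.
    return list(weights)[-1]


def feature_variables(user_id: str, feature_key: str = "new_checkout") -> Dict[str, object]:
    """Return the variable values the feature code should use for this user."""
    return VARIATIONS[assign_variation(user_id, feature_key, WEIGHTS)]
```

The feature code itself only reads the variables, for example `feature_variables("user-42")["button_label"]`, so changing the split or adding a variation does not require building a separate test.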
This sort of “invisible testing” is productive because building tests is not the goal. The goal is to “release with confidence” digital experiences and products that, per the KPIs, appear to meet customer needs better.
When testing gets out of the way, ironically, experimentation flourishes.
Show me the money
Since feature experiments are built in the backend by product and engineering teams, they cannot be easily outsourced to a third party like a CRO agency. This is also the case for experimentation programs at companies that only build server-side tests, which are also built on the backend.
Many agencies (and some tools!) rely on web, or frontend, A/B test development as an important and predictable revenue stream. Because they cannot quickly build a feature or server-side experiment, their motive to sell CRO services to these teams is weak. As a result, we don’t hear about feature experimentation and server-side testing nearly as often as we do about web experimentation.
There may be “no money” to be made building a feature or server-side test, but there is money to be made building web experiments, which is why we hear so much about them instead.
Get off my lawn!
One of the most respected experimentation experts in the industry, Ronny Kohavi, is fond of saying, “getting numbers is easy; getting numbers you can trust is hard.” As usual, he’s right. Where things get murky, however, is in who proves that trust, and how.
A/B testing experts charge for their services and are incentivized to tell us they are best at finding the truth in the experiment data. All tools, they will have you believe, are untrustworthy. In reality, it depends. Professional tools make their statistical methods publicly available for peer review. Few agencies and experts rise to this best practice.
Professional A/B testing tools are highly advanced today, and the pace of innovation in these tools is accelerating. For years, sample ratio mismatch (SRM) was a little-known pitfall that only experts in the field knew to look for and account for. Today, in-app SRM notifications are readily available in these professional A/B testing tools. In other cases, experienced experimentation experts ignore data accuracy challenges because they lack the understanding or technology to address them. Apple’s Intelligent Tracking Prevention (ITP) has a far greater impact on test trustworthiness but rarely appears on our social feeds. Why? Because it’s not something that an experimentation consultant can easily account for, manage, and explain to their stakeholders.
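For readers who have not met SRM before, the check behind such a notification can be sketched in a few lines. This is a minimal illustration, assuming a two-arm test, a chi-square goodness-of-fit test, and a significance threshold of 0.001; the function names and the threshold are assumptions made here, not a description of how any specific tool implements it.

```python
import math


def srm_p_value(control_users: int, variant_users: int,
                expected_control_share: float = 0.5) -> float:
    """P-value of a chi-square test that the observed split matches the plan.

    With two arms the test has one degree of freedom, so the p-value
    can be computed with the complementary error function alone.
    """
    total = control_users + variant_users
    expected_control = total * expected_control_share
    expected_variant = total - expected_control
    chi2 = ((control_users - expected_control) ** 2 / expected_control
            + (variant_users - expected_variant) ** 2 / expected_variant)
    return math.erfc(math.sqrt(chi2 / 2))


def has_srm(control_users: int, variant_users: int,
            expected_control_share: float = 0.5,
            alpha: float = 0.001) -> bool:
    """Flag a sample ratio mismatch when the split is very unlikely by chance.

    alpha=0.001 is an assumed, deliberately strict threshold.
    """
    return srm_p_value(control_users, variant_users, expected_control_share) < alpha
```

With a planned 50/50 split, `has_srm(5000, 5100)` is not flagged, while `has_srm(5000, 5500)` is. A flagged test means the traffic allocation itself is suspect, so its results should not be trusted until the cause is found.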
Human vs. robot
When forming opinions on experimentation, be aware of the strengths and weaknesses of humans and technology.
- Tools are not very good at coding digital experiences and tests. Humans (currently) dominate this skill. Sure, tech can engineer and code, but only if you tell it WHAT to build. The “what” is all the value.
- Relative to humans, tools are very good at analyzing data. Again, some are better than others. What to do with the data is best left to humans.
- Tools are horrible at strategy. Humans have the upper hand regarding critical thought, creativity, and innovation.
Can’t we all just get along?
What’s frustrating is that we all would benefit if web experimentation experts and tools came together more often with feature and full-stack experimentation experts and tools. The former are highly skilled at building trustworthy and impactful experiments and experimentation strategies. They also generally lead in supplying the customers that product teams require. The latter are best equipped to build digital products that satisfy customers’ needs and desires.
Have your cake and eat it too
Fortunately, more and more companies realize that they need not choose between marketing-led growth powered by web experimentation and product-led growth powered by feature experimentation. They can and should have both. To unify the two, these companies are doing the following:
- They understand which KPIs are most indicative of success for each team, and how those KPIs relate to each other, and they agree on a single source of truth. With a clear understanding of each other’s conversion funnel(s), they can see how changes in marketing affect product adoption and vice versa.
- They rely on a unified experimentation solution that offers one interface with one data model: feature management and flagging for engineers, full-stack testing for advanced CRO teams, web experimentation for non-technical growth marketers, and feature experimentation for product managers.
- They do not force any team to experiment like the other. Marketers can keep the WYSIWYG editor as long as the PM gets a JSON. Nor do they force teams to give up their preferred analytics or other favored martech solutions. However, they steadily ensure all customer data is accessible and actionable from a data warehouse and/or customer data platform.
There is no better-proven method of delivering steady and sustained growth than experimentation. Harness its full potential by ensuring you understand how marketing and product-led growth teams use experimentation to achieve their goals. It’s not an either/or. You won’t be disappointed.
Broaden the feed
Tip: make a point of also following experimentation thought leaders at companies that regularly build server-side and product experiments. Balance them with experts from agencies that mainly work on web experimentation. Here are a few examples:
- Bhavik Patel – founder of CRAP Talks and Causl
- Rommil Santiago – Growth Product Director, Loblaw, and Founder of ExperimentNation
- Franciska Dethlefsen – Head of Growth Marketing, Amplitude
- Simon Elsworth – Global Head of Experimentation and Digital Analytics, Whirlpool
Senior Director @ Kameleoon
Collin is a Senior Director at Kameleoon, North America. Kameleoon features hybrid experimentation, which helps experimentation leaders and product-led growth teams increase their test velocity and leverage their tech stacks.