We're big advocates of testing. But to us, testing is only one tool in a marketer's toolbox. The bigger mission: find great customers, engage and impress them, start a good relationship, and do it efficiently with a strong return on marketing investment (ROMI).

That's why when it comes to landing pages -- particularly multi-page landing experiences -- we favor A/B split testing over multivariate testing (MVT). It helps keep the big picture in focus.

I'll explain.

I know, MVT is "in" these days. It promises a kind of magical box: drop dozens or hundreds of elements into the box, pour in your respondents, the box shakes around trying thousands and thousands of combinations, and voila!, out pops THE ANSWER. Only a few of us math geeks understand exactly how it reaches that answer, but for everyone else, that's okay, the only thing that really matters is the answer. Right?

Well, sacred cows make great hamburger.

There are a number of caveats to MVT that tend to get lost in the fine print.

For instance, fractional factorial MVT designs -- popularized by some vendors because they promise to give you the answer without having to test all of those thousands of combinations directly -- are usually based on the assumption that there are no "interaction effects" between the elements, say between a headline on the page and an image on the page. But do you as a marketer believe that a headline and its accompanying image have no interplay?

I've described other concerns with MVT -- optimizing to the average user instead of finding the best answer for each segment independently, difficulty in running apples-to-oranges experiments, etc. -- in a paper I wrote a couple of months ago.

But I'd like to bring attention to a very basic problem with the MVT approach that is often overlooked: the Russian roulette caveat.

Let's say you're testing a landing page with a structure of 5 testable elements: headline, subhead, image, body copy, and call-to-action button. The MVT approach encourages you to load up many variations of each, say 10 headlines, 5 subheads, 5 images, 2 body copy blocks, and 3 different call-to-action buttons.

That's 10 x 5 x 5 x 2 x 3 = 1,500 possible combinations.
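As a quick sanity check on that arithmetic, here is a minimal Python sketch that counts the combinations two ways. The element names are just labels for the example above, not anything a particular MVT tool requires.

```python
from itertools import product
from math import prod

# Variation counts from the example above; the keys are illustrative labels.
variations = {
    "headline": 10,
    "subhead": 5,
    "image": 5,
    "body_copy": 2,
    "cta_button": 3,
}

# Multiply the counts together.
total = prod(variations.values())
print(total)

# Enumerate every combination explicitly and confirm the count matches.
combos = list(product(*(range(n) for n in variations.values())))
print(len(combos) == total)
```

Adding one more variation to any element multiplies the total rather than adding to it, which is why the list gets too long to review by eye so quickly.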

A supposed benefit of MVT is that you as a marketer don't have to visualize each of these combinations. The good news is that somewhere in there is hopefully one combination that is better than all the others, your star, and that's what you're searching for. The bad news, however, is that there are probably a fair number of dogs in there too. Maybe even some really bad ones.

What is a bad combination? (Seth Godin's imagery of a meatball sundae comes to mind.) It's when two or more elements clash so as to confuse or mislead the respondent, give them the wrong impression, or simply come across as a disjointed presentation that reflects poorly on your brand.

For example, say you're marketing your Caribbean resort, and you have two headlines: "Enjoy our delicious food!" and "Relax in our luxurious health spa!". You also have two images: a photo of a big buffet with rich, sinful desserts and another photo of someone being wrapped with cucumber slices over their eyes. Four possible combinations, two of which are fine. However, the other two are not: one is a head scratcher, and the other is downright frightening.

Talk about headline/image interaction effects. Yow.
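To see why this matters for designs that assume no interaction effects, consider a rough sketch with made-up numbers. The conversion rates below are purely hypothetical, chosen so that the matched pairs do well and the mismatched pairs do badly. A model that only looks at each element's average effect (a "main effects" model) then predicts the same rate for all four pages.

```python
from itertools import product
from statistics import mean

headlines = ["food", "spa"]
images = ["buffet", "cucumber"]

# Hypothetical conversion rates in percent (invented for illustration only).
rates = {
    ("food", "buffet"): 6,
    ("food", "cucumber"): 2,
    ("spa", "buffet"): 2,
    ("spa", "cucumber"): 6,
}

# Main-effects model: grand mean plus each element's average deviation.
grand = mean(rates.values())
headline_effect = {h: mean([rates[h, i] for i in images]) - grand for h in headlines}
image_effect = {i: mean([rates[h, i] for h in headlines]) - grand for i in images}

predicted = {
    (h, i): grand + headline_effect[h] + image_effect[i]
    for h, i in product(headlines, images)
}

for combo, actual in rates.items():
    print(combo, "actual:", actual, "main-effects prediction:", predicted[combo])
```

With these invented numbers the headline and image each look neutral on average, so the additive model cannot tell the good pages from the bad ones. The entire difference lives in the interaction.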

While this example is obviously contrived, it's not hard to imagine how even well-intentioned MVT trials can lead to unexpected -- and undesired -- combinations. If you were presented with one of these combinations outright, you would immediately veto it. Unfortunately, since it's 1 of 1,500 combinations, you don't get the chance to picture it on its own. It's lost in the math of possibilities.

Now, of course, these bad combinations aren't going to win out in your trials. After you run your MVT experiment long enough to achieve statistical significance, these bad combinations will go away.

But the Russian roulette caveat is that by then a number of legitimate respondents have already been exposed to them.

Any one respondent doesn't know that they're participating in a test of 1,500 different combinations. They don't say, "Heh, heh, sorry guys, this one's wacky, give me an alternative instead." All they know is what they see. To quote The Truman Show, "We accept the reality of the world with which we're presented." If their landing page is disjointed, weird, wacky, misleading, confusing, etc., they simply have a negative experience of your brand.

What percentage of MVT-generated combinations deliver these bad experiences? It's hard to predict. After the fact, you can look at the list of worst performers and see that 10% were awful. But for the sake of argument, let's say it's low, maybe 5%. In which case, 75 of those 1,500 combinations are unfit to represent your brand.

Is your landing page optimization mission worth the risk that 5% of your would-be customers have a bad experience? Particularly as their first impression of you?

The risk here is increased by the fact that many companies are subject to the Pareto principle: 80% of their business comes from 20% of their customers. So on average, if 5% of your landing page combinations reflect badly on your brand and 20% of your customers become superstar customers, then 20% x 5% = 1% of the page views you serve to would-be customers result in a superstar prospect being mishandled on their first touch.
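Here is that back-of-the-envelope math as a small Python sketch. It assumes, as the paragraph above implicitly does, that traffic is spread evenly across combinations and that superstar prospects are no more or less likely than anyone else to land on a bad one. The 10,000-visitor figure is an arbitrary number picked for illustration.

```python
from fractions import Fraction

total_combinations = 1500
bad_share = Fraction(5, 100)         # share of combinations assumed to be bad
superstar_share = Fraction(20, 100)  # Pareto: share of customers who are superstars

# Number of combinations that are unfit to represent the brand.
bad_combinations = total_combinations * bad_share

# Share of visitors who are superstar prospects AND see a bad combination,
# assuming the two are independent.
mishandled_superstar_share = bad_share * superstar_share

visitors = 10_000  # arbitrary test size, for illustration only
print(int(bad_combinations))                       # 75
print(mishandled_superstar_share)                  # 1/100
print(int(visitors * bad_share))                   # 500
print(int(visitors * mishandled_superstar_share))  # 100
```

Exact fractions are used instead of floats simply to keep the percentages from picking up rounding noise.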

Statistics are easy to accept impersonally. But when they translate into real people, with high stakes, it's hard to be as cavalier. One out of every 100 people who walk through your door is a deep-pocketed customer, and you throw a water balloon at them.

The thing is: you don't have to do it that way.

With A/B testing, you as the marketer get to explicitly visualize each combination before it's in production. Yes, that means a smaller number of combinations are tested, but in exchange you can be guaranteed 100% that no respondent will ever receive a BAD experience.

Every respondent will receive a good experience, and you're simply testing to find the BEST experience.

Narrowing the number of combinations you're trying isn't necessarily a trade-off in your results either. The history of direct marketing is rich with examples -- online and offline -- of A/B tests that have produced results as significant as those advertised by MVT vendors. More combinations does not mean better combinations.

With a smaller number of tests, where you're actually able to visualize each test, you don't waste time with junk combinations or redundant variations. You focus on real hypotheses and clear ideas. This is the sort of thinking that leads to breakthrough experiences that change the game and double or triple your conversion rate.

A/B tests are easy to set up. They're easy to understand. You can visualize exactly what is being tested. And there is no Russian roulette.

-- Scott Brinker

Comments

# re: Playing Russian roulette with your landing pages?

Scott -- thanks for the very interesting post, and you bring up some crucial points regarding MVT. However, the good news is that there are tools out there that address these very issues and allow folks to take advantage of the significant improvement that MVT would have over A/B testing.

To address your three main points:

Fractional Factorial -- you're absolutely right that traditional fractional factorial designs do not take into account interaction effects among variables. However, there are methodologies that do. You can use something called Optimal Design, which essentially trades off perfect orthogonality of design to measure 2nd level (and often 3rd level) interaction effects. That way you would know if a variable performs better in the presence (or absence) of other variables, etc. You can also use a Full Factorial design, which tests every interaction but often produces too many combinations to feasibly test.

Optimizing to the average user -- most marketers these days are combining MVT with targeting, so that you can find the best permutation of a web page (from the thousands or millions of possibilities) uniquely for different audience segments. That way, you effectively optimize and personalize (to a degree) at the same time.

Russian Roulette problem -- again, I agree that traditional approaches don't take into account business rules or constraints -- certain text doesn't make sense with certain pictures, etc. But there are testing vendors out there that let you bake in those rules to prevent any "meatball sundae" examples from ever being displayed. You can't do this with traditional fractional factorial designs (which have fixed design criteria), but it definitely can be done -- and is done a lot.

Unfortunately, much of the world's knowledge about MVT is based on MVT work that was done in offline applications (manufacturing, agriculture, etc.) -- that was the origin of Taguchi designs -- the best known "fractional factorial" design. Fortunately, with modern computing power and the ability of the Internet to personalize, bake in business rules, etc., there is a new generation of MVT that adapts to what the Internet is good for -- finding the best combination from a limitless set of possibilities that work for every visitor.

4/18/2008 6:16 PM | Seth Rosenblatt

# re: Playing Russian roulette with your landing pages?

Hi, Seth -- thanks for your comment! I know you guys do a lot of MVT, so it's great to have your perspective.

Sounds like we agree that many fractional factorial methods have dangerous flaws that some vendors have papered over. We also agree that full factorial will give you access to analyzing interaction effects -- although analyzing second-level and third-level effects starts to be a little complicated to visualize -- but they require large traffic flows to achieve statistical significance on tests of any real size.

I've not seen a mathematical model of your optimal design technique, so I can't comment in depth, other than to note that it's hard to get anything for free in math. There's an inherent relationship between the number of things being tested, the number of trials run, and the depth/accuracy of the conclusions that can be drawn from the results. There have been some brilliant inventions in experimental design over the past century, but most of their genius is in their ability to let you pick your trade-offs at a finer level, rather than a way to circumvent that triangle. The question is whether a marketer clearly understands the particular trade-offs (e.g., "perfect orthogonality") being made on their behalf.

Fair point that a good MVT system should let you add business rules to override certain combinations, such as "don't show headline A with image D". However, my point about the Russian roulette caveat is that when you're trying thousands -- or millions, as you say -- of possible combinations, it's impossible to visualize more than a tiny, tiny (random?) subset of them in advance.

It doesn't occur, a priori, that headline B with image C with body text F and offer M is "bad", because it's never explicitly considered. It's a needle in a haystack, unfortunately found by letting real prospects roll around until some of them yell "ouch". Shifting responsibility to the marketer -- i.e., if you can find that needle, you can remove it -- stacks the odds against the person running the test while giving the system a free pass.
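A toy sketch of how such business rules might work, and of what they leave untouched. The letters, the rule format, and the `allowed` helper are all hypothetical; this is not how any particular vendor implements it.

```python
from itertools import product

# Hypothetical variations for four page elements.
elements = {
    "headline": ["A", "B"],
    "image": ["C", "D"],
    "body": ["E", "F"],
    "offer": ["M", "N"],
}

# Each rule is a set of (element, variation) pairs that must never appear together.
exclusion_rules = [
    {("headline", "A"), ("image", "D")},  # "don't show headline A with image D"
]

def allowed(combo):
    """Return True if the combination violates none of the exclusion rules."""
    pairs = set(combo.items())
    return not any(rule <= pairs for rule in exclusion_rules)

names = list(elements)
all_combos = [dict(zip(names, values)) for values in product(*elements.values())]
served = [combo for combo in all_combos if allowed(combo)]

print(len(all_combos), len(served))  # 16 12

# A combination nobody wrote a rule for is still served.
unreviewed = {"headline": "B", "image": "C", "body": "F", "offer": "M"}
print(unreviewed in served)  # True
```

The rule removes exactly the combinations someone thought to forbid. Everything else, including headline B with image C, body text F, and offer M, still goes out to real respondents.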

As for optimizing to the average user, I think it's great that audience segmentation is being incorporated in more MVT packages. Certainly the popular examples of MVT -- e.g., Google Website Optimizer -- have not taken this into consideration. However, in doing so, do you not geometrically expand the number of respondents you need to test to achieve statistical significance? Essentially, "segment" becomes an additional independent dimension of your test space. (By the way, we think it's by far the most important dimension.)

In addition to multiplying traffic requirements, this also gets tricky when you're trying to exclude certain subsets of combinations based on particular segmentation choices, or dynamically react to multi-step "paths" (instead of stand-alone, single-page optimization). I've seen it done in an MVT context, but the configuration was quite complex and seemed prone to errors both in setup and analysis.
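To put rough numbers on the traffic question, here is a sketch using the standard normal-approximation sample-size formula for comparing two conversion rates. The 5% baseline, 6% target, and three segments are invented inputs. Treating every combination as its own test cell is the full-factorial worst case, so designs that pool traffic across cells will need less than this.

```python
from math import ceil
from statistics import NormalDist

def visitors_per_cell(p_base, p_variant, alpha=0.05, power=0.80):
    """Approximate visitors needed per cell to detect p_base vs. p_variant
    with a two-sided test (normal approximation, no multiple-test correction)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    variance = p_base * (1 - p_base) + p_variant * (1 - p_variant)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_base - p_variant) ** 2)

def total_visitors(cells, segments, per_cell):
    """Every (cell, segment) pair needs its own sample in this simple model."""
    return cells * segments * per_cell

per_cell = visitors_per_cell(0.05, 0.06)  # hypothetical 5% -> 6% lift

print("A/B, no segments:       ", total_visitors(2, 1, per_cell))
print("A/B, 3 segments:        ", total_visitors(2, 3, per_cell))
print("1,500 cells, no segments:", total_visitors(1500, 1, per_cell))
print("1,500 cells, 3 segments: ", total_visitors(1500, 3, per_cell))
```

In this model, adding a segmentation dimension multiplies the traffic requirement by the number of segments, on top of whatever the number of combinations already demands.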

4/21/2008 10:46 AM | Scott Brinker

# re: Playing Russian roulette with your landing pages?

Overall, this is my beef with MVT: it's a rather complex piece of machinery to operate correctly. While that makes sense for certain scenarios -- we've actually recommended you to many e-commerce web sites with significant traffic on key site-wide pages or global check-out processes -- where the investment in a complex test structure, which tends to be consulting-intensive, can pay off, it doesn't make sense for others.

When it comes to The Long Tail philosophy of many different landing pages matched to many specific ads and email campaigns, we think A/B testing is more practical for reasons such as:

* you can set up a new test in a matter of minutes;
* you're guaranteed that no one will get a "bad" experience;
* you don't need a lot of traffic to have statistical significance;
* you can add a segmentation dimension with less overhead;
* you can easily test multi-step paths, not just single pages;
* business rule "special cases" are easier to visualize;
* the analysis is very straightforward to understand.
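As an illustration of that last point, the whole analysis of a simple A/B test can fit in a few lines. This sketch uses a pooled two-proportion z-test, which is one common choice rather than the only one, and the visitor and conversion counts are made up.

```python
from math import sqrt
from statistics import NormalDist

def ab_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for the difference between two conversion rates,
    using a pooled two-proportion z-test (normal approximation)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical results: page A converts 50 of 1,000 visitors, page B 80 of 1,000.
p = ab_p_value(50, 1000, 80, 1000)
print("significant at the 5% level" if p < 0.05 else "not significant yet")
```

Two pages, two conversion counts, one p-value: there is nothing else to interpret.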

In our experience, testing more and more combinations of elements on a single page isn't nearly as helpful to Long Tail marketers as the capacity to manage more and more distinct landing pages matched to their different traffic sources. This, of course, is a trade-off as well, but one for which the pros and cons can be plainly identified.

4/21/2008 10:48 AM | Scott Brinker

# re: Playing Russian roulette with your landing pages?

I've been obsessed with post click marketing lately - and we have finally been doing some A/B testing. I'm not a trained analytic thinker - or scientist. My background is high school chemistry, where we had a control group. With MVT - that control group is so far removed from your hundreds of possible combinations - it's hard to see what you changed, how it worked, and why.

For many "do-it-yourselfers" such as myself - using A/B testing is a much simpler and more controlled way to test how things work.

Great article - and conversation after the fact - thanks.

~Carrie

4/21/2008 12:20 PM | Carrie Hill

# re: Playing Russian roulette with your landing pages?

Hi, Carrie. Love your article on post-click marketing in a traffic focused world. I particularly liked your discussion of segmentation in the context of travel-related sites. Great stuff!

I think you put your finger on one of the big "pros" of A/B testing -- simplicity. In a world where online marketing is moving faster and faster, the ability to set up a quick test and prove/disprove a clear hypothesis is immensely valuable.

Just like you would do with an AdWords creative, you can do with your landing page. Be fast and nimble, and keep the brain power focused on your business rather than the intricacies of advanced experimental design.

When we were designing our landing page management software, LiveBall, we thought long and hard about incorporating MVT methods. I still think about it, largely because it's such a buzzword in this space. But every time we go through the process of mapping out how marketers could leverage it day-to-day, the complexities and trade-offs weigh it down. It's not a math problem, it's a usability problem.

Our focus is to make it easier and easier for marketers to manage lots and lots of parallel landing pages for Long Tail campaigns -- and simplicity and clarity are crucial. Otherwise, if it's too much work, people will run fewer tests and miss the entire benefit of creating tightly matched pre-click/post-click experiences.

As I replied to Seth, that complex overhead makes sense in some scenarios -- say, for instance, the home page of The Discovery Channel Store. That's a great opportunity for MVT-based optimization.

But for dozens or hundreds of micro-campaign landing pages, MVT is overkill, overhead, and a distraction from the real business of Long Tail marketing: learning who your different audiences are and speaking to them with genuine specificity.

It's all about the right tool for the job. A/B testing and MVT are both good tools, but they're not interchangeable, and there are trade-offs in both directions.

4/21/2008 2:34 PM | Scott Brinker

# re: Playing Russian roulette with your landing pages?

Hi Scott- Thanks for the kind words - you might check out the article I wrote on post click marketing over at searchenginewatch.com - http://searchenginewatch.com/showPage.html?page=3629127

4/21/2008 3:55 PM | Carrie Hill

# A search marketer, a designer, and an IT engineer walk into a bar...

6/4/2008 2:26 PM | Post-Click Marketing Blog
