Showing the complexity of Google's search results, and how little we generally understand about what works and what doesn't - meaning we need to take a more scientific approach.
Finally - a set of lessons and data from split tests we have run
7. I was thinking about it like it was a
math puzzle and if I just thought
really hard it would all make sense.
-- Kevin Lacker (@lacker)
8. Hey why don't you take the square
root?
-- Amit Singhal according to Kevin Lacker (@lacker)
9. oh... am I allowed to write code that
doesn't make any sense?
-- Kevin Lacker (@lacker)
10. Multiply by 2 if it helps, add 5,
whatever, just make things work
and we can make it make sense
later.
-- Amit Singhal according to Kevin Lacker (@lacker)
37. Instead of directly comparing the performance of the control pages with the variant pages, we build a
forecast of what’s called the counterfactual, which is an estimate of what would have happened if we hadn’t
made the change. We use the control group to make a counterfactual forecast that takes into account
seasonality and site-wide changes.
The black line on the chart above is the actual organic traffic to the variant pages. The blue line is the
counterfactual.
More: Distilled blog post and free forecasting tool
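As a rough illustration of the idea (not Distilled's actual model, which is a full time-series forecast - see the blog post and tool above), a counterfactual can be sketched by scaling the control group's traffic by the pre-test relationship between the two groups. All numbers and function names here are hypothetical:

```python
# Minimal sketch of a control-based counterfactual, assuming the
# pre-test ratio between variant and control captures seasonality
# and site-wide changes. The real analysis uses a richer forecast.

def counterfactual_forecast(variant_pre, control_pre, control_post):
    """Estimate variant traffic as if the change had not been made.

    variant_pre / control_pre: daily organic sessions before the test.
    control_post: control sessions after the test started.
    """
    ratio = sum(variant_pre) / sum(control_pre)  # pre-period relationship
    return [sessions * ratio for sessions in control_post]

# Hypothetical daily organic sessions:
variant_pre = [100, 110, 105, 120]
control_pre = [200, 220, 210, 240]
control_post = [230, 250, 260]

print(counterfactual_forecast(variant_pre, control_pre, control_post))
# -> [115.0, 125.0, 130.0]
```

Because both groups see the same seasonality and site-wide effects, the scaled control line is a fair stand-in for the missing "no change" world.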
38. It’s easiest to analyse the results by looking at the cumulative difference over time between the actual
organic traffic and the counterfactual.
The pale blue area is the 95% confidence interval.
We can see a (statistically) zero effect for an initial period while Google crawls and indexes the test,
followed by steady growth. A couple of weeks in, the confidence interval moves above zero and we have a
winning test.
More: Distilled blog
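The cumulative-difference view can be sketched as follows, assuming a simple normal approximation for the confidence band (the real analysis derives its interval from the forecast model itself); `sigma` and the traffic figures are hypothetical:

```python
# Sketch: cumulative difference between actual variant traffic and
# the counterfactual, with a normal-approximation 95% band.
# Assumption: sigma is the forecast's daily standard error and the
# daily errors are independent, so their variances add up over time.
import math

def cumulative_difference(actual, counterfactual, sigma):
    """Return (cumulative diff, lower 95% bound, upper 95% bound) per day."""
    results, total, var = [], 0.0, 0.0
    for a, f in zip(actual, counterfactual):
        total += a - f
        var += sigma ** 2                 # independent daily errors accumulate
        half_width = 1.96 * math.sqrt(var)
        results.append((total, total - half_width, total + half_width))
    return results

actual = [118, 130, 142]                  # hypothetical variant sessions
counterfactual = [115.0, 125.0, 130.0]    # hypothetical forecast

# The band starts wide; once its lower edge clears zero, the test is a winner.
for total, lo, hi in cumulative_difference(actual, counterfactual, sigma=5):
    status = "winning" if lo > 0 else "not yet significant"
    print(f"{total:+.1f}  [{lo:+.1f}, {hi:+.1f}]  {status}")
```

This mirrors the chart: the cumulative line hovers near zero at first, then the lower edge of the band crosses above zero once the effect outgrows the forecast uncertainty.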
39. Hashtag winning
40. Further reading for those interested:
● Predicting the present with Bayesian structural time series [PDF]
● Inferring causal impact using Bayesian structural time series [PDF]
● CausalImpact R package
● Finding the ROI of title tag changes
More: Distilled blog
41. 1. Adding structured data
2. Adding ALT attributes
3. Setting exact match title tags
4. Using JS to show content
5. Removing SEO category text
42. Credit to my colleague Dom who
runs our split-testing projects
@dom_woodman
51. What happens when you match title tags to the highest-volume search query?
Title tag before: Which TV should I buy? - Argos
Title tag after: Which TV to buy? - Argos
62. This is why we have been investing so much in split-testing
Check out odn.distilled.net if you haven’t already. The team will be happy to
demo for you.
We served ~5 billion requests last quarter and recently published
everything from response times to our +£100k / month split test.
63. But I’m also seeing more subtle impacts on my recommendations:
● You can recommend small tweaks and see the benefits compound
● You can test wild hypotheses with unknown upsides
● You can try things that might have a downside (more focused targeting, less copy, etc.)
And that’s even before you get the benefits of testing clickthrough rate, or the pretty charts you can
show the boss to highlight the impact of your work!
More: blog post