10 Shortcuts to Testing Success
@OptimiseOrDie
1. Get an Analytics Health Check
2. Test in the Right Place
3. Understand Cross-Device
4. Do Your Research
5. Prioritise Your Testing
6. Perform Pre-Flight Checks
7. Know How Long to Test
8. Have a Good Reason to Test
9. Learn from Your Tests
10. Burn Down the Silos
• Nearly 100 sites in 3 years
• 95% were broken, often badly
• Trust in data was missing
• Management made bad calls
• Nobody checked the tills
• Calibrate from the basics up!
@OptimiseOrDie
• What sales do we capture?
• What categories?
• What about refunds, lunch money, gift certificates?
• How do we monitor fraud?
• Do we check it adds up?
• Where does this data go?
1. What about MY clients?
@OptimiseOrDie
• A review takes 1-3 days
• Prioritise the issues
• Fix issues directly with developers
• Integrate analytics with the testing tool
1. Get an Analytics Health Check
2. Let’s do Random Testing
“Let’s try the homepage!” “I’ve got targets to hit!”
“I hate this job.” “Let’s test button colours!”
• Has lots of opinions but no data
• Spends too much time on Twitter
• Driven by ego and competitors
• Wishes he cared about testing
@OptimiseOrDie
“STOP copying your competitors.
They may not know what the
f*** they are doing either.”
Peep Laja, ConversionXL
2. Let’s do Random Testing
Best Practice Testing?
• Your customers are not the same
• Your site is not the same
• Your advertising and traffic are not the same
• Your UX is not the same
• Your cross-device mix is not the same
• You have no idea what their data showed
• Use best practices to inform or suggest approaches
• Use them for ideas
• Do not use them as a playbook
• It will make you very unhappy
@OptimiseOrDie
@OptimiseOrDie
• Do some analytics modelling
• Understand the shedding of layers
• Narrow your focus and scope
• Bank better gains earlier
2. Test in the Right Places
@OptimiseOrDie
1. Motorola Hardware Menu Button
2. MS Word Bullet Button
3. Android Holo Composition Icon
4. Android Context Action Bar Overflow (top right on Android devices)
@OptimiseOrDie
• Do you really know your device mix?
• Most people undercount Android!
• Which iPhone models visit?
• How big is tablet traffic?
• What screen sizes do they have?
• Find out BEFORE you design tests
• Check BEFORE you launch tests
• Use Google Analytics to find out
• 3 reports to rule them all:
https://www.google.com/analytics/web/template?uid=lpVf8LveSqyd3mdsHjdfzQ
https://www.google.com/analytics/web/template?uid=fmUzp_gzRIy7LnvZJjCDOQ
https://www.google.com/analytics/web/template?uid=y7sYIXDhQrmswHAiNo8iLA
3. Our customers use iPhones, right?
@OptimiseOrDie
3. What iPhone models do we see?
Screen resolution:
320 x 480 = iPhone 4/4S
320 x 568 = iPhone 5/5S
375 x 667 = iPhone 6
414 x 736 = iPhone 6+
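The resolution-to-model table above can be turned into a small lookup, e.g. for tagging a GA screen-resolution export. A minimal sketch; the function name and the unknown-fallback string are my own, and the mapping only covers the models named on the slide.

```python
# Sketch: map GA screen-resolution strings to likely iPhone models.
# Resolutions are the CSS-point values Google Analytics reports; only
# the models from the slide are covered.
IPHONE_BY_RESOLUTION = {
    "320x480": "iPhone 4/4S",
    "320x568": "iPhone 5/5S",
    "375x667": "iPhone 6",
    "414x736": "iPhone 6+",
}

def classify(resolution: str) -> str:
    """Return the likely iPhone model for a GA resolution string."""
    key = resolution.lower().replace(" ", "")
    return IPHONE_BY_RESOLUTION.get(key, "unknown / not an iPhone")

print(classify("375 x 667"))  # iPhone 6
```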
@OptimiseOrDie
• Desktop browsers & versions
• Tablet models
• Mobile device models
• Screen resolutions
3. Figure Out the Device Mix for Testing
Is there anything holding you back from doing conversion research?
1. Time
2. Client/Company buy-in
3. Budget
4. Don’t know where to start
4. You don’t do any research before testing?
@ContentVerve
@OptimiseOrDie
4. If you have 4 hours PLUS
• Snap interviews (Sales, Customer Services, Tech Support)
• Run a quick poll or survey (see my tools slides)
Less bullshit!
@OptimiseOrDie
4. 1-Hour Page Analytics
• Entry points: landing pages, device mix, customer mix, traffic mix, flow, intent, marketing -> site flow
• Influence pages: page or process, next steps, abandonment, exits, mix of abandonment, flow
@OptimiseOrDie
4. 1-Hour Landing Page Analytics
• How old are the visitors?
https://www.google.com/analytics/web/template?uid=hab8Ta93SCCffUpjefjtNQ
• What are the key metrics like (e.g. bounce rate, conversion)?
https://www.google.com/analytics/web/template?uid=hab8Ta93SCCffUpjefjtNQ
• What is the goal or ecommerce conversion through this page?
https://www.google.com/analytics/web/template?uid=hab8Ta93SCCffUpjefjtNQ
• What channel traffic comes to the page?
https://www.google.com/analytics/web/template?uid=Kjb9q8M4QN-fsPe8dOGaig
• What is the mix of tablet / mobile / desktop to the page?
https://www.google.com/analytics/web/template?uid=wLMUWs8eTIa3_mmQHOtPkw
• What are the resolutions of devices?
https://www.google.com/analytics/web/template?uid=wLMUWs8eTIa3_mmQHOtPkw
• How slow are the landing pages?
https://www.google.com/analytics/web/template?uid=AavFsgMoRkucYYKnxlB76Q
• What are the pages right after the landing page?
(Use a landing page report and choose the ‘Entrance Paths’ to show next pages.)
• What is the flow like from this page?
(Use the Behaviour Flow Report)
• What does it look like on the top devices?
(Use real devices + Appthwack.com, Crossbrowsertesting.com or Deviceanywhere.com)
@OptimiseOrDie
4. If you have 2 hours
• Form analytics data
• Scroll or click maps
• Session recording videos (Hotjar, Decibel Insight, Yandex)
• Make a horizontal funnel from the landing page
• Check the marketing creatives / SERP fully
• Look at landing page ZERO!
@OptimiseOrDie
4. If you have 4 hours
• Set up a poll or survey (See my tools slides)
• Set up an exit (bail) survey
• Friends, Family, New Employee user testing
• Guerrilla user testing
• Snap Interviews – 5-10 minutes:
• Customer Services, Sales team (if applicable) then Customers
• 5 Second Test
• Article is here : bit.ly/conversiondeadline
@OptimiseOrDie
5. You don’t Prioritise your Tests
Scoring can be cost, time to market, resource, risk, political complexity
• Cost 1-10 Higher is cheaper
• Time 1-10 Higher is shorter
• Opportunity 1-10 Higher is greater
Score = Cost * Time * Opportunity
• For financial institutions, risk should be a factor
• Want to build your own? – ask me!
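The scoring rule above takes only a few lines to implement. In this sketch the candidate tests and their 1-10 scores are invented for illustration; the formula itself is the one from the slide.

```python
# Sketch of the slide's prioritisation scoring: each candidate test is
# scored 1-10 on cost (higher = cheaper), time (higher = shorter) and
# opportunity (higher = greater), then ranked by the product.

def score(cost: int, time: int, opportunity: int) -> int:
    """Score = Cost * Time * Opportunity; higher is better."""
    for value in (cost, time, opportunity):
        if not 1 <= value <= 10:
            raise ValueError("each score must be between 1 and 10")
    return cost * time * opportunity

# Hypothetical candidates, scores are made up:
candidates = [
    ("Checkout form fix", score(cost=8, time=9, opportunity=7)),
    ("Homepage redesign", score(cost=2, time=3, opportunity=9)),
    ("Product page copy", score(cost=9, time=8, opportunity=5)),
]

for name, s in sorted(candidates, key=lambda c: c[1], reverse=True):
    print(f"{name}: {s}")
```

Cheap, quick, high-opportunity work floats to the top; an expensive, slow redesign sinks even with a big opportunity score.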
@OptimiseOrDie
5. Make a Money Model

Test               | Description          | Metric                     | 2% lift | 5% lift   | 10% lift  | Estimate
Product page       | Simplification       | Basket adds                | 200,000 | 500,000   | 1,000,000 | 500,000
Register new       | Improve onboarding   | New register funnel ratio  | 25,000  | 62,500    | 125,000   | 250,000
IE8 bugs in cart   | Fix key broken stuff | IE8 conversion             | 80,000  | 200,000   | 400,000   | 200,000
Category list page | Get product higher   | User Category -> Product   | 500,000 | 1,250,000 | 2,500,000 | 1,250,000
Payment page       | New card handling    | User Payment -> Thank you  | 60,000  | 150,000   | 300,000   | 300,000
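A minimal sketch of such a money model: pick a baseline annual value for each metric and compute the 2%, 5% and 10% lift columns from it. The baselines below are assumed values, chosen so the output reproduces the first two rows of the table.

```python
# Sketch of a money model: estimate the revenue impact of 2%, 5% and
# 10% lifts on a metric, given its baseline annual value.
# Baselines here are illustrative assumptions, not figures from the talk.

def money_model(name: str, baseline: float):
    """Return (name, (impact at 2% lift, 5% lift, 10% lift))."""
    return name, tuple(round(baseline * lift) for lift in (0.02, 0.05, 0.10))

tests = [
    money_model("Product page simplification", baseline=10_000_000),
    money_model("Improve onboarding", baseline=1_250_000),
]
for name, (low, mid, high) in tests:
    print(f"{name}: 2%={low:,} 5%={mid:,} 10%={high:,}")
```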
@OptimiseOrDie
• Score all test targets
• Use cost vs. opportunity as a minimum
• Check it works!
• Make a money model
5. Prioritise your Testing Targets
@OptimiseOrDie
6. You Don’t Test Before Launch
• Dirty secret of AB testing?
• People break their tests all the time!
• Most people don’t notice
Why?
• Because developers can break them very easily
• What if your AB test was broken on iPhones?
• If you didn’t know, would your results be valid?
• About 40% of my tests fail basic QA
The 95% Stopping Problem
@OptimiseOrDie
• Many people use 95% or 99% ‘confidence’ as the signal to stop
• This value is unreliable and moves around
• Nearly all my tests reach significance before they are actually ready
• You can hit 95% early in a test (18 minutes!)
• If you stop then, it could be a false result
• Read this Nature article: bit.ly/1dwk0if
• Optimizely and VWO have updated their tools
• This 95% figure must be LAST on your stop list
The 95% Stopping Problem

                         Scenario 1     Scenario 2     Scenario 3     Scenario 4
After 200 observations   Insignificant  Insignificant  Significant!   Significant!
After 500 observations   Insignificant  Significant!   Insignificant  Significant!
End of experiment        Insignificant  Significant!   Insignificant  Significant!

“You should know that stopping a test once it’s significant is deadly sin
number 1 in A/B testing land. 77% of A/A tests (testing the same thing
as A and B) will reach significance at a certain point.”
Ton Wesseling, Online Dialogue
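The A/A claim is easy to reproduce by simulation. This sketch (my own, not Wesseling's method; rates and sample sizes are illustrative) gives both arms an identical 5% conversion rate, peeks at a two-proportion z-test every 100 visitors, and stops at the first p < 0.05. The share of A/A tests called "significant" comes out far above the nominal 5%.

```python
# Sketch: the peeking problem on A/A tests. Both arms convert at the
# same rate, yet stopping at the first p < 0.05 peek produces false
# positives far more than 5% of the time.
import math
import random

def z_test_p(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (conv_a / n_a - conv_b / n_b) / se
    return math.erfc(abs(z) / math.sqrt(2))

def aa_test_with_peeking(rate=0.05, visitors=2000, peek_every=100):
    """Run one A/A test, stopping at the first 'significant' peek."""
    conv_a = conv_b = 0
    for i in range(1, visitors + 1):
        conv_a += random.random() < rate
        conv_b += random.random() < rate
        if i % peek_every == 0 and z_test_p(conv_a, i, conv_b, i) < 0.05:
            return True  # stopped early on a false positive
    return False

random.seed(42)
runs = 300
false_positives = sum(aa_test_with_peeking() for _ in range(runs))
print(f"A/A tests called 'significant' when peeking: {false_positives / runs:.0%}")
```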
• TWO BUSINESS CYCLES minimum (week/month)
• 1 PURCHASE CYCLE minimum (or most of one)
• 250 CONVERSIONS minimum per creative
• 350, 500 or more if creative response is similar
• FULL WEEKS/CYCLES, never part of one
• KNOW what marketing, competitors and cycles are doing
• RUN a test length calculator - bit.ly/XqCxuu
• SET your test run time, RUN IT, STOP IT, ANALYSE IT
• ONLY RUN LONGER if the sample is smaller than expected
• DON’T RUN LONGER just because the test isn’t giving the result you want!
@OptimiseOrDie
7. Know How Long to Test for…
@OptimiseOrDie
• Most critical mistake
• Use a test calculator
• Full business cycles, 2 minimum
• Don’t waste time hoping
7. Know How Long to Test for
8. So you think you have a Hypothesis?
Insight inputs (#FAIL): competitor copying, guessing, dice rolling, an
article the CEO read, competitor change, panic, ego, opinion, cherished
notions, marketing whims, cosmic rays, not ‘on brand’ enough, IT
inflexibility, internal company needs, some dumbass consultant, shiny
feature blindness, knee-jerk reactions
@OptimiseOrDie
8. So you think you have a Hypothesis?
Insight inputs: segmentation, surveys, sales and call centre, session
replay, social analytics, customer contact, eye tracking, usability
testing, forms analytics, search analytics, voice of customer, market
research, A/B and MVT testing, big & unstructured data, web analytics,
competitor evals, customer services
@OptimiseOrDie
@OptimiseOrDie
1. Because we saw (data/feedback)
2. We expect that (change) will cause (impact)
3. We’ll measure this using (data metric)
bit.ly/hyp_kit
8. Use this to deflect stupid testing!
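The kit is just a fill-in-the-blanks template, which makes it trivial to render mechanically. A toy sketch; the helper function and the example hypothesis are mine, not part of the kit itself.

```python
# Sketch: the three-part hypothesis template as a tiny helper, so every
# proposed test has to state its evidence, change, impact and metric.

def hypothesis(evidence: str, change: str, impact: str, metric: str) -> str:
    return (f"Because we saw {evidence}, "
            f"we expect that {change} will cause {impact}. "
            f"We'll measure this using {metric}.")

# Hypothetical example:
print(hypothesis(
    evidence="40% abandonment on the payment step",
    change="removing optional fields from the payment form",
    impact="more completed checkouts",
    metric="payment -> thank-you page conversion rate",
))
```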
@OptimiseOrDie
1. Because we saw (an angry email from the CEO)
2. We expect that (changing button colours) will
cause (the office to cool down for a day)
3. We’ll measure this using (some metric we pluck
out of the air – whatever, man)
bit.ly/hyp_kit
8. Let’s try a real one
@OptimiseOrDie
• Don’t do ego-driven testing
• Use the Hypothesis Kit!
8. Get a Proper Hypothesis Going
bit.ly/hyp_kit
@OptimiseOrDie
9. Our Testing teaches us Nothing!
• Either your research or hypothesis is weak
• Work back from the outcome!
What if A won – what would that tell us?
What if A failed – what would that tell us?
• What is the value to the business in finding out the answer?
• Is the finding actionable widely and deeply?
• Testing isn’t about lifts – it’s about learning
@OptimiseOrDie
9. Our Testing teaches us Nothing!
“You are trying to run a bundle
of tests, whose expected
additional information will give
you the highest return.”
Matt Gershoff, CEO, Conductrics.com
@OptimiseOrDie
• Do your research
• Form a solid hypothesis
• Work back from the outcomes
• Learning useful stuff = huge lifts
9. Design Tests for Maximum Learning
@OptimiseOrDie
10. Burn Down the Silos
• Non agile, non iterative design
• Silos work on product separately
• No ‘One Team’ per product/theme
• Large teams, unwieldy coordination
• Pass the product around
• More PMs and BAs than a conference
• Endless sucking signoff
• AB testing done the same way!
@OptimiseOrDie
10. FT Example
• Small teams (6-15) with direct access to publish
• Ability to set and get metrics data directly
• Tools, Autonomy, Lack of interference
• No Project Managers or Business Analysts
• Business defines ‘outcomes’ – teams deliver
• No long signoff chain
• No pesky meddling fools
• 18 Month projects over budget?
@OptimiseOrDie
10. FT Example
• 100s of releases a day!
• MVP approach
• Launch as alpha, beta, pilot,
phased rollout
• Like getting in a shower
• Read more at labs.ft.com
@OptimiseOrDie
10. Positive Attributes
• Rapid, Iterative, User Centred & Agile
Design. No Silos.
• Small empowered autonomous teams
• Polymaths and Overlap
• Toolkit & Analytics investment
• Persuasive copywriting & Psychology
• Great Testing & Optimisation Tools
@OptimiseOrDie
• Agile, lean, iterative cross-silo teams
• Ability to get and set metrics
• Autonomy, control, velocity
• Iterative MVP approach
• Work on outcomes, not features
10. Burn Down the Silos!
“If you think of technology as something that’s
spreading like a sort of fractal stain, almost every
point on the edge represents an interesting problem.”
Paul Graham
Rumsfeldian Space
• What if we changed our prices?
• What if we gave away less for free?
• What if we took this away?
• What about 3 packages, not 5?
• What are these potential futures I can take?
• How can I know before I spend money?
• McDonald’s Hipster Test Store: bit.ly/1TiURi7
@OptimiseOrDie
Congratulations!
Today you’re the lucky
winner of our random
awards programme.
You get all these extra
features for free, on us.
Enjoy!
Innovation Testing
@OptimiseOrDie
The AB Testing Hype Cycle
1. Tool installed
2. Stupid testing
3. Scaled-up stupidity
4. Peak of Stupidity
5. ROI questioned
6. Statistics debunked
7. Faith crisis
8. The Trough of Testing
9. Where, how, why
10. Data science
11. Testing to learn
12. Innovation Testing
@OptimiseOrDie
Thank You!
Email me sullivac@gmail.com
Slides http://bit.ly/em2015
Linkedin linkd.in/pvrg14
Editor’s notes
I’ve been working on this presentation and thinking about it for over two months now. And this is one of the first graphs I wanted to include, because it represents the hype cycle in AB testing.
This is what a lot of companies go through with new technology adoption, so I wanted to show you what the AB testing hype cycle would look like [CLICK]
So if this is what ramping up your AB testing feels like at your company, welcome to the club! You are not alone!
I know what’s going wrong because I’ve made all these mistakes myself. Today is about giving you shortcuts away from stupidity and getting you from burning cash to productive testing.
My pain is your gain!
And this crappy AB testing is basically the equivalent of funny cat videos
People taking videos of themselves playing video games
And like, wow, there are 6.9 million Gangnam Style videos. Just incredible.
But hidden in those big numbers, YouTube will always have a tiny percentage of really great stuff, very little good stuff and a long tail of absolute bollocks.
And the same is true of split testing - there's some really well run stuff, getting very good results and there's a lot of air guitar going on.
And here are the top 10 reasons – there are about 70 odd ways people manage to break their AB testing but these are the most common mistakes, particularly for companies just starting or scaling up.
Let’s run through each one – I’ve included a summary slide of each one, so you’ve got a nice handy checklist to take back to the office.
So where do you start testing? Where do you focus your efforts?
Over here? This bit over here?
It has taken me a long time to find out where all the bear traps are hidden. Mainly from screwing up tests and figuring out what was wrong, through lots of testing time.
And most companies and teams are stepping on these bear traps without even realising. And they wonder why the test results aren’t replicated in the bank account results. Hah.
I have a list now of about 60 ways to easily break, skew, bias or screw up your tests completely. But here are some real biggies to watch for:
And now a bit about something I call Rumsfeldian Space – exploring the unknowns. This is vital if you want to make your testing bold enough to get great results.
And this was the state of my head in 2004. The inability to understand what you can and can’t be confident about – but nobody wants to admit they’re fucking guessing a lot of the time.
And it took me a long time to figure out I didn’t know anything really – it was all assumptions and cherished notions. It was pretty crushing to test my way to this realisation, but I’m MUCH happier now.
Now I think I know this much - but I might know a wee bit more than I think I do – but I’m erring on the side of caution.
That’s because I'm always questioning everything I do through the lens of that consumer insight and testing.
Without customers and data driven insights, you can’t shape revenue and delight. They’ll give you the very psychological insights you need to apply levers to influence them, if you only ask questions. Everything else is just a fucking guess.
Even with tests, if the only inputs you’ve got are ego and opinion, they’re going to be lousy guesses and you’re wasting your experiments.
I once explained it to my daughter – you know how adults look really in control, making decisions and appearing not to suffer from indecision? Don’t believe it for a minute – we’re just better at winging it because we’re older.
And this is the huge hole that’s gnawing at the heart of many digital operations. The inability to understand what you can and can’t be confident about – but nobody wants to admit they’re guessing a lot of the time.
Hopefully now you’ve joined Guessaholics Anonymous from today, you can move on to a data driven and more productive future. Enjoy less wastage in your products, greater efficiency in marketing spend and grow faster than you imagined. All it takes is a bit of love of your customers, your business and some solid ground rules.
I hope you enjoyed my talk as much as I did writing it.
All my details are here including the slides, for you to download.
Go forth and Optimise!