1

- Hi, I'm Corey, from Etsy (@coreyloose)
- Marketplace where people around the world connect to buy and sell unique goods (not all that different from the art fair going on right now)
- We like to run a lot of A/B tests
  
2

- This talk is 201, but here's the quick 101
  
3

- Have a theory about something that will make your product better
- Show it to a random subset of visitors, but keep it consistent for each visitor ("bucketing")
- Try both for a bit and see which one does better
- Not only does this test whether your idea is good, it also tests your implementation and all sorts of complex interactions
- Would this one cause an increased error rate in variation selection?
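The "random but consistent" requirement can be sketched with a deterministic hash. This is a minimal illustration, not Etsy's implementation: the talk says their tooling uses a cookie, and hashing is swapped in here as a common alternative. The function name `assign_variant`, the experiment name, and the weights are all assumptions made up for the example.

```python
import hashlib
from typing import Mapping


def assign_variant(user_id: str, experiment: str, weights: Mapping[str, float]) -> str:
    """Deterministically map a user to a variant according to the weights."""
    # Hash the experiment name together with the user id so the same user
    # can land in different buckets for different experiments.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    # Turn the first 8 hex digits into a number in [0, 1).
    point = int(digest[:8], 16) / 0x100000000
    total = sum(weights.values())
    cumulative = 0.0
    for variant, weight in weights.items():
        cumulative += weight / total
        if point < cumulative:
            return variant
    return variant  # guard against floating-point rounding at the upper edge


# Hypothetical experiment with an even split.
weights = {"control": 0.5, "new_design": 0.5}
first = assign_variant("user-42", "bigger_photos", weights)
second = assign_variant("user-42", "bigger_photos", weights)
print(first == second)  # the same user always gets the same variant
```

Because the assignment depends only on the user id and the experiment name, no state has to be stored to keep a visitor's experience consistent.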
  
4

- As I just explained it, A/B testing sounds simple + awesome
- And it is, but as always the devil is in the details
- I'm going to tell a bunch of stories about stuff that we did wrong, not to be negative, but because it's more interesting than spraying champagne around
- Let's start with a really common no-no
  
5

- Trying one thing for a week, then trying another
  
6

- Alluring because it doesn't require you to have rich metric gathering or bucketing
- You're going to need some tooling
- We built Feature and Catapult
  
7

- (only code in the presentation)
- Plenty of other options out there, but we're happy with this
- Open source
- Easy enough that PMs can change experiment weights
- Uses a cookie to ensure the user experience stays consistent
- You'll need your own logging to do analysis
  
8

- Internal tool that analyzes A/B tests based on data processed from feature event logs
- For this experiment: more page views but fewer add-to-carts
- No statistical significance for conversion rate
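As a rough illustration of what "no statistical significance for conversion rate" means, here is a minimal two-proportion z-test using only the standard library. It is a sketch of one common test, not a description of what the internal tool does; the function name and the visitor counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical counts: 10,000 visitors per group, 2.0% vs 2.2% conversion.
p = two_proportion_p_value(200, 10_000, 220, 10_000)
print(f"p-value: {p:.3f}")
```

With these made-up numbers the p-value is well above the conventional 0.05 threshold, so a 10% relative difference would still be reported as not significant at this traffic level.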
  
9

- A bit sobering, but you gotta have a lot of traffic, or make a big change, to do this
  
10

- Written by an Etsy alum
- To detect a small change you need a lot of time
  
11

- The good news is if you can make a bigger effect, it gets much easier to detect (1% => 5%)
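The trade-off between effect size and required traffic can be sketched with the standard sample-size approximation for comparing two proportions. The 2% baseline conversion rate, the 5% significance level, and the 80% power are assumptions chosen for illustration, and reading "1% => 5%" as relative lifts is also an assumption; this is not the calculator shown in the slides.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(baseline: float, relative_lift: float,
                          alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate visitors needed per group to detect the given relative lift."""
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


for lift in (0.01, 0.05):
    n = sample_size_per_group(baseline=0.02, relative_lift=lift)
    print(f"{lift:.0%} relative lift: {n:,} visitors per group")
```

The required sample size scales roughly with one over the square of the effect, so a five times larger effect needs on the order of twenty-five times less traffic.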
  
12

- Have a hypothesis going in, no fishing ("let's just pump some people full of this new chemical")
- Let's get into some more interesting failures
  
13

- Going to tell a few stories about a first type of failure
- Mechanical
  
14

- All users get bucketed, but only Australian users are eligible for an experiment
  
15

- This is what really happens, since the rest of the world isn't eligible
- Going to under-represent any effects
  
16

- Need to exclude the rest
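A small worked example shows why leaving ineligible users in the buckets under-represents the effect. It assumes, for simplicity, that eligible and ineligible users convert at the same baseline rate and that ineligible users see no change; the 3% eligible share and the 10% lift are made-up numbers.

```python
def observed_lift(eligible_share: float, true_lift: float) -> float:
    """Relative lift measured when ineligible users stay in both buckets.

    With baseline rate b, the control group converts at b and the treatment
    group at share * b * (1 + lift) + (1 - share) * b = b * (1 + share * lift),
    so the measured relative lift is share * lift.
    """
    return eligible_share * true_lift


# Hypothetical: 3% of bucketed traffic is eligible, true lift is 10%.
diluted = observed_lift(eligible_share=0.03, true_lift=0.10)
print(f"measured lift across all bucketed users: {diluted:.2%}")
```

Restricting the analysis to eligible users restores the full effect size, which in turn shrinks the traffic needed to detect it.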
  
17

- If your experiment causes the page to be a lot bigger, weirdness can happen
- Page loads slower
  
18

- This ensures the user actually saw the page + we have access to more information
  
19

- Slow network speed on mobile
- The combo led to experiments being under-reported
- Noticed because the experiment group would appear to have far fewer people in it
- Lesson: watch page weight
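A lopsided group size like this can be caught automatically by comparing the observed split with the configured one. The sketch below uses a chi-square goodness-of-fit test with one degree of freedom, written with the standard library only; the function name and the counts are hypothetical, and the talk does not say this is how the problem was detected.

```python
from math import erfc, sqrt


def sample_ratio_p_value(n_control: int, n_treatment: int,
                         expected_treatment_share: float = 0.5) -> float:
    """Chi-square (1 degree of freedom) test of observed vs. configured split."""
    total = n_control + n_treatment
    expected_treatment = total * expected_treatment_share
    expected_control = total - expected_treatment
    chi2 = ((n_control - expected_control) ** 2 / expected_control
            + (n_treatment - expected_treatment) ** 2 / expected_treatment)
    # For one degree of freedom the chi-square tail probability is erfc(sqrt(x / 2)).
    return erfc(sqrt(chi2 / 2))


# Hypothetical counts for a 50/50 test where the heavier page loses visitors.
print(sample_ratio_p_value(50_000, 47_500))
```

A very small p-value means the split is unlikely to be chance, so the bucketing or logging should be investigated before trusting any metric from the test.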
  
20

- We don't support IE7
- We once ran an experiment that looked like this in IE7
- There was still enough IE7 traffic to tank the experiment
- Lesson: slice by user groups in the analysis
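Slicing by user group can be as simple as computing the conversion rate per (segment, variant) pair. This is a minimal sketch over hypothetical log rows; the tuple layout and the browser labels are assumptions for the example.

```python
from collections import defaultdict


def conversion_by_segment(events):
    """events: iterable of (segment, variant, converted) tuples."""
    # (segment, variant) -> [visitors, conversions]
    counts = defaultdict(lambda: [0, 0])
    for segment, variant, converted in events:
        counts[(segment, variant)][0] += 1
        counts[(segment, variant)][1] += int(converted)
    return {key: conversions / visitors
            for key, (visitors, conversions) in counts.items()}


# Hypothetical log rows: (browser, variant, did the visitor convert?)
events = [
    ("chrome", "control", True), ("chrome", "control", False),
    ("chrome", "treatment", True), ("chrome", "treatment", True),
    ("ie7", "control", True), ("ie7", "control", False),
    ("ie7", "treatment", False), ("ie7", "treatment", False),
]
for (segment, variant), rate in sorted(conversion_by_segment(events).items()):
    print(f"{segment:8} {variant:10} {rate:.0%}")
```

A segment where the treatment rate collapses while other segments improve points to a rendering or compatibility bug rather than a bad idea.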
  
21

- (HAL 9000)
- Ran an experiment on our activity feed, on a small % of traffic
- All the metrics tanked
- Turned out a bot we use to monitor page times had been bucketed in
- Lesson: make your A/B tooling ignore your bots
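One simple way to keep automated traffic out of experiments is to check the user agent before bucketing. The pattern list below is hypothetical and deliberately short; a real list would come from your own monitoring tools and known crawlers.

```python
import re
from typing import Optional

# Hypothetical patterns for illustration only.
BOT_PATTERN = re.compile(r"bot|crawler|spider|monitor|pingdom", re.IGNORECASE)


def should_bucket(user_agent: Optional[str]) -> bool:
    """Return False for requests that should never enter an experiment."""
    if not user_agent:
        return False  # a missing user agent is treated as automated traffic
    return BOT_PATTERN.search(user_agent) is None


print(should_bucket("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))
print(should_bucket("InternalPageTimer-monitor/1.0"))
```

Filtering before assignment, rather than during analysis, keeps bots out of both the group counts and the metrics.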
  
22

- Previous stories were mechanical, but the real power of A/B testing is seeing how your idea interacts with the world
  
23

- Implemented as a monolithic release
- A/B test kept as a hurdle at the end
  
24

- Go check out Dan McKinley's talk
  
25

- It failed terribly: purchases down over 20%
- Since we built it all at once, we had nothing to pin it on
- What if we had tested something simple first: are more items better? (40 vs. 80 items on a page)
- Lesson: test ideas in isolation
  
26

- Here's a story about an A/B test telling us something our product intuition didn't
- Seems like an obvious, simple win
- Logins are way down
- Turns out average users use way worse passwords than employees
- Ended up being a no-go for other reasons
- Lesson: unintended consequences
  
27

- You can't measure everything that matters
- Can iron out the mechanical issues
- Can run tightly scoped tests that allow you to make confident decisions
- What if you asked half of the people you met for the rest of the day for $1?
- You'd end up with more money
  
28

- That's what you're doing with this
- If you A/B test it, you'll get more signups + probably better time-on-page
- Maybe a few more bounces
- But goodwill & brand impression are hard to measure
  
29	
  
30	
  

Madison+ UX 2014: A/B Testing - The Good, The Bad, and The Ugly
