N=10^9: Automated Experimentation at Scale
Wojciech Galuba
Decision Tools Lead, Facebook
@wgaluba
N=10^9:
Automated Experimentation at Scale
Wojtek Galuba (wgaluba@fb)
Decision Tools Team Lead
Data Science Infrastructure
Facebook
History of Data Science Infra at FB
•  Founded April 2012
•  A group of data scientists and software engineers
•  Experienced first hand the need for better infrastructure
•  Need continues to grow
•  Team doubled over the past year
•  Expect continued rapid growth this year
Why do we experiment?
Experimentation
Product
changes
Experiment
to study this
Metrics
Experiment to:
Catch problems before they arise
Experiment to:
Choose between multiple options
Experiment to:
Challenge intuitions about product
Experiment to:
Not only evaluate ideas
but generate new ones
Challenges
Many experiments
• Experiments running in parallel
• Modifying many different aspects of the product
• Overlaps are possible and may conflict
Many metric dimensions
• Different contexts of user actions
• Thousands of device types
• Geography
• Demographics
• Time
• Enormous space of possible questions
Many teams
• Many ways to run an experiment
• Diverse audience for results
• Huge set of results from every experiment
• Many ways to interpret results
Experimentation at Facebook
An experiment
QuickExperiment
Divide people randomly
• color: blue, size: medium
• color: blue, size: big
• color: green, size: medium
QuickExperiment
• Centralized experiment management
• Purely config-level: no code pushes to iterate
• Automatic exposure logging
PlanOut
• Open sourced: http://facebook.github.io/planout/
• Flexible experimental design
• Full, programmatic control over param values
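Dividing people randomly among conditions can be sketched with a deterministic hash, so the same person always sees the same condition without any stored state. This is a minimal illustration and not the PlanOut or QuickExperiment API; the function name `assign_condition`, the experiment name `button_exp`, and the salt format are assumptions made for the example.

```python
import hashlib

# The three conditions shown on the "An experiment" slide.
CONDITIONS = [
    {"color": "blue", "size": "medium"},
    {"color": "blue", "size": "big"},
    {"color": "green", "size": "medium"},
]

def assign_condition(experiment, user_id, conditions=CONDITIONS):
    """Deterministically map a user to one condition (hypothetical helper).

    Hashing the experiment name together with the user id gives each
    experiment its own independent, reproducible split of the population.
    """
    key = f"{experiment}.{user_id}".encode("utf-8")
    bucket = int(hashlib.sha1(key).hexdigest()[:15], 16)
    return conditions[bucket % len(conditions)]

if __name__ == "__main__":
    for uid in ("1001", "1002", "1003"):
        print(uid, assign_condition("button_exp", uid))
```

Because the assignment depends only on the experiment name and the user id, changing parameter values is a config change rather than a code push, which matches the iteration model described above.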
Experiment evaluation
Exposures
Metrics
[Figure: % change from control to test for the "posts" metric, on an axis from -3 to 3, with confidence intervals at 95%, 99%, and 99.9%]
Assess decision risk
Confidence: 95%, 99%, 99.9%
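One way to produce a "% change from control to test" estimate with intervals at several confidence levels is sketched below. It assumes independent groups and uses a normal approximation with a delta-method standard error for the ratio of means; the slides do not say which estimator Facebook's tools use, so treat `percent_change_ci` and the sample numbers as illustrative only.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def percent_change_ci(control, test, confidence=0.95):
    """Return (change, low, high) in percent, relative to the control mean."""
    m_c, m_t = mean(control), mean(test)
    # Variance of each sample mean.
    v_c = variance(control) / len(control)
    v_t = variance(test) / len(test)
    # Delta-method standard error for the ratio m_t / m_c,
    # assuming the two groups are independent.
    se = sqrt(v_t / m_c ** 2 + (m_t ** 2) * v_c / m_c ** 4)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    change = (m_t / m_c - 1) * 100
    half_width = z * se * 100
    return change, change - half_width, change + half_width

if __name__ == "__main__":
    # Made-up per-user post counts, for illustration only.
    control = [10, 12, 11, 9, 10, 11, 12, 8]
    test = [11, 13, 12, 10, 11, 12, 13, 9]
    for level in (0.95, 0.99, 0.999):
        change, low, high = percent_change_ci(control, test, level)
        print(f"{level:.1%}: {change:+.2f}% [{low:+.2f}%, {high:+.2f}%]")
```

Reporting the same estimate at several levels lets a reader see how much risk a decision carries: an interval that excludes zero at 95% but not at 99.9% is weaker evidence than one that excludes zero at all three.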
Lessons learned
Computing answers to an exponential
number of possible questions
Pre-compute
• low specificity
• low dimensionality
• long-term
Compute on-the-fly
• high specificity
• high dimensionality
• short-term
A balancing act
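The trade-off can be sketched as a small rollup that answers low-dimensional questions from pre-computed totals and falls back to scanning raw rows for more specific ones. The row layout, field names, and functions below are hypothetical and only illustrate the split; they do not describe the actual pipeline.

```python
from collections import defaultdict

# Hypothetical per-user rows; field names are illustrative only.
ROWS = [
    {"group": "test", "country": "US", "device": "ios", "posts": 3},
    {"group": "test", "country": "US", "device": "android", "posts": 1},
    {"group": "test", "country": "PL", "device": "ios", "posts": 2},
    {"group": "control", "country": "US", "device": "ios", "posts": 2},
    {"group": "control", "country": "PL", "device": "android", "posts": 4},
]

def precompute(rows, dims):
    """Pre-compute totals for every single-dimension cut."""
    rollup = defaultdict(int)
    for row in rows:
        for dim in dims:
            rollup[(row["group"], dim, row[dim])] += row["posts"]
    return dict(rollup)

def scan(rows, group, filters):
    """Compute on the fly by scanning raw rows; handles any filter set."""
    return sum(
        row["posts"]
        for row in rows
        if row["group"] == group
        and all(row[k] == v for k, v in filters.items())
    )

def total_posts(rows, rollup, group, filters):
    """Use the rollup for one-dimensional questions, otherwise scan."""
    if len(filters) == 1:
        (dim, value), = filters.items()
        key = (group, dim, value)
        if key in rollup:
            return rollup[key]
    return scan(rows, group, filters)

ROLLUP = precompute(ROWS, dims=("country", "device"))
```

The rollup grows with the number of dimensions rather than with their combinations, which is why only the low-dimensional cuts are materialized ahead of time in this sketch.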
Tackling many dimensions
Two sets of tools
• For exploration
• For extraction
Automated exploration
Enforce a lifecycle;
In particular:
clear experiment end dates
Why lifecycle policy?
• Unifies methodology across teams
• Prevents tech debt buildup
• Minimizes negative impact on the product
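A lifecycle policy with mandatory end dates could be enforced at the config level, roughly as follows. `ExperimentConfig` and its fields are hypothetical names for this sketch, not part of any tool named in the slides.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class ExperimentConfig:
    """Hypothetical experiment record that cannot exist without an end date."""
    name: str
    start: date
    end: date

    def __post_init__(self):
        # Reject open-ended or inverted date ranges up front.
        if self.end <= self.start:
            raise ValueError("experiment must end after it starts")

    def is_active(self, today):
        """True while the experiment is inside its [start, end) window."""
        return self.start <= today < self.end

if __name__ == "__main__":
    exp = ExperimentConfig("button_exp", date(2014, 6, 1), date(2014, 6, 15))
    print(exp.is_active(date(2014, 6, 10)))
```

Making the end date a required field means an expired experiment stops serving test conditions on its own, instead of lingering as tech debt until someone remembers to turn it off.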
Ease of rapid iteration;
Safe and scientifically valid iteration
Fast, but not too fast
• Novelty effect vs. top engaged users bump
• Understand if waiting helps
Ensure mutual exclusion;
Across platforms,
features and infra
Why mutual exclusion?
• Fewer experiment conflicts
• Lower metrics variance
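Mutual exclusion can be sketched by hashing every user into a fixed number of segments and giving each experiment its own disjoint set of segments. The `Namespace` class, the segment count, and the allocation order below are assumptions for illustration; the slides do not describe the actual mechanism in this detail.

```python
import hashlib

NUM_SEGMENTS = 1000  # illustrative value

def segment_of(user_id, namespace):
    """Hash a user into one of NUM_SEGMENTS segments for this namespace."""
    key = f"{namespace}.{user_id}".encode("utf-8")
    return int(hashlib.sha1(key).hexdigest()[:15], 16) % NUM_SEGMENTS

class Namespace:
    """Hypothetical allocator: each segment belongs to at most one experiment."""

    def __init__(self, name):
        self.name = name
        self.free = list(range(NUM_SEGMENTS))
        self.owner = {}  # segment -> experiment name

    def add_experiment(self, experiment, num_segments):
        if num_segments > len(self.free):
            raise ValueError("not enough free segments")
        for _ in range(num_segments):
            self.owner[self.free.pop()] = experiment

    def remove_experiment(self, experiment):
        released = [s for s, e in self.owner.items() if e == experiment]
        for segment in released:
            del self.owner[segment]
            self.free.append(segment)

    def experiment_for(self, user_id):
        """Return the single experiment this user is in, or None."""
        return self.owner.get(segment_of(user_id, self.name))
```

Since a user lands in exactly one segment per namespace, two experiments that touch the same feature can never both apply to the same person, and released segments return to the pool when an experiment ends.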
Exposure log everything
• Measure effects on the exposed only
• Condition analyses on the time since last exposure
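Measuring effects on the exposed only amounts to joining metrics against the exposure log and ignoring everyone who never saw the experiment. The data shapes and the function `exposed_only_means` below are hypothetical, and the numbers are made up for illustration.

```python
def exposed_only_means(exposures, metrics):
    """Average a metric per group over users who were actually exposed.

    exposures: iterable of (user_id, group) log entries, possibly repeated.
    metrics:   dict mapping user_id to a metric value.
    """
    totals, counts, seen = {}, {}, set()
    for user_id, group in exposures:
        if user_id in seen:
            continue  # count each exposed user once
        seen.add(user_id)
        totals[group] = totals.get(group, 0.0) + metrics.get(user_id, 0.0)
        counts[group] = counts.get(group, 0) + 1
    return {group: totals[group] / counts[group] for group in totals}

if __name__ == "__main__":
    exposures = [("u1", "control"), ("u2", "test"), ("u2", "test"), ("u3", "test")]
    metrics = {"u1": 2, "u2": 4, "u3": 6, "u4": 100}  # u4 was never exposed
    print(exposed_only_means(exposures, metrics))
```

Users who were assigned but never exposed behave identically in both groups, so leaving them out removes noise without biasing the comparison.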
The culture
Experimentation gives focus;
But watch out for tunnel vision!
The culture
Cultivate sound practices;
Safe and low-impact experimentation
The culture
Educate on data interpretation;
Uniform decision-making
across teams
Understanding uncertainty
“Robust misinterpretation of confidence intervals”
Rink Hoekstra et al.
Psychonomic Bulletin & Review
• Only 3% of scientists got all 6 answers right...
• How do we educate the users of the tools?
The three stages of
experimentation
infrastructure
Stage 1: Artisanal
Photo credit: Abhisek
Stage 2: Power tools
Stage 3: Industrialized
Photo credit: Steve Jurvetson
Conclusions
Empower, but don’t overwhelm
Conclusions
Filter and automate,
but maintain broad focus
Conclusions
Clean data and powerful tools are great, but
building the right experimentation
culture is equally important