Weapons of Math Instruction:
Evolving from Data-Driven to Science-Driven
Donal McMahon
Director of Data Science, Indeed
Convince you to use the scientific method, using data.
Then, I’ll teach you how.
Indeed is data-driven
We’ve hosted 6 eng tech talks on this topic!
Many other industries are also now becoming data-driven
Why data-driven?
1 Democratize decision-making
2 Better decisions
3 Increase decision velocity
4 Improve collaboration via ego removal
Is data-driven an accurate descriptor?
No, we’re science-driven
Why do you need to be science-driven?
A cautionary tale
Dreaming big and about to change the world
Donal PM
disclaimer: sadly not real childhood photos
Idea
Modernize our mobile site to improve job seeker experience
[Screenshots for each change: Control vs. Treatment]
Change 1: Increased spacing between jobs
Change 2: Replaced orange text with buttons for:
● New
● Apply with your Indeed Resume
Change 3: Removed sponsored jobs
Change 4: Other minor UI tweaks
● Salary range
● Home button
● Fonts
Completely aligned
Donal PM
Convinced software developers to implement
We ran an A/B test and generated lots of data
Analysed it separately
We drew contradictory conclusions
What is job seeker experience?
What’s a job?
One that’s anywhere on the page, or one that’s viewed?
What’s an acceptable metric trade-off?
Resolution strategy: be more data-driven
So, we threw more data at each other
different hypotheses + different data + different metrics
∴ different conclusions
We learn geology the morning after the earthquake.
Ralph Waldo Emerson
A better solution does exist
The Scientific Method
Observation Question Hypothesis Experiment Analysis Conclusion
Remainder of this talk
1 What did we do?
2 Why was it wrong?
3 How can you do it better?
Stage 1: Observation
What did we do?
Nothing
Why was it wrong?
1 Didn’t establish baseline for job seeker experience, or measures
2 When we failed, we had no knowledge backlog for future work
How can you do it better?
Nano: study real job seeker sessions
Micro: partner with experts (UX) to gather qualitative data
Macro: large scale data analysis and observation via experimentation
Nano: study real job seeker sessions
Example session: Query 1, click on Job A, click on Job C, Query 2, click on Job D, Apply
Not only is the universe stranger than we imagine, it is stranger than we can imagine.
Sir Arthur Eddington
How can you do it better?
Micro: partner with experts (UX) to gather qualitative data
A shameless plug: medium.com/indeed-data-science
1 Real-life observation
2 Interviews
3 Content analysis (surveys)
How can you do it better?
Macro: large scale data analysis and observation via experimentation
Common question: What’s a worthwhile/launchable metric trade-off?
Reality: You’re making trade-offs implicitly already
Learn your implicit local trade-off function
Run multiple simple perturbation experiments, all the time
Observation via experimentation
[Plot: Applies against JobAlert sign-ups, with the current state marked]
Learn your current implicit trade-offs via experimentation
Expt 1: bold Apply with your Indeed Resume
Expt 2: add pixel whitespace to JobAlert UI
Compare your current state to all Pareto efficient alternatives
For each Pareto efficient alternative you have a trade-off: ΔApplies / ΔJobAlerts
Implicit trade-off: each JobAlert sign-up is worth 1.7 Applies
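As a rough illustration of how such an exchange rate might be read off, the sketch below keeps only the Pareto efficient outcomes from a set of perturbation experiments and reports how many Applies each one exchanges per JobAlert sign-up. The `Outcome` type, the function names and all of the numbers are hypothetical; none of them come from the talk.

```python
from typing import List, NamedTuple


class Outcome(NamedTuple):
    name: str
    d_applies: float      # change in Applies relative to the current state
    d_job_alerts: float   # change in JobAlert sign-ups relative to the current state


def pareto_efficient(outcomes: List[Outcome]) -> List[Outcome]:
    """Keep outcomes that are not dominated: no other outcome is at least
    as good on both metrics and strictly better on one."""
    def dominates(p: Outcome, o: Outcome) -> bool:
        no_worse = p.d_applies >= o.d_applies and p.d_job_alerts >= o.d_job_alerts
        better = p.d_applies > o.d_applies or p.d_job_alerts > o.d_job_alerts
        return no_worse and better

    return [o for o in outcomes if not any(dominates(p, o) for p in outcomes)]


def applies_per_job_alert(o: Outcome) -> float:
    """Applies exchanged for one JobAlert sign-up when moving to outcome `o`."""
    return -o.d_applies / o.d_job_alerts


# Made-up experiment results, for illustration only.
outcomes = [
    Outcome("current", 0.0, 0.0),
    Outcome("expt_1", 30.0, -20.0),   # more Applies, fewer sign-ups
    Outcome("expt_2", -8.0, 4.0),     # fewer Applies, more sign-ups
    Outcome("expt_3", -20.0, -5.0),   # worse on both: dominated by "current"
]

frontier = pareto_efficient(outcomes)
for o in frontier:
    if o.d_job_alerts != 0:
        print(o.name, applies_per_job_alert(o))
```

A real analysis would also need an uncertainty estimate around each delta before treating any ratio as a trade-off.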
Stage 2: Question
What did we do?
Nothing
Why was it wrong?
1 We never prioritized the most important question(s)
2 By bundling questions, we couldn’t answer any of them, learn, or improve
How can you do it better?
Research Question | Potential Impact | Complexity | Time To Learn
What are good measures for job seeker experience? | ? | ? | ?
How can we help job seekers navigate to their desired job more quickly? | ? | ? | ?
How can we clearly denote sponsored content? | ? | ? | ?
… | ... | ... | ...
Stage 3: Hypothesis
What did we do?
Modernize the mobile interface to improve job seeker experience
Why was it wrong?
1 Hypothesis was ill-defined and vague
2 No established metrics
3 No clear success/failure criteria
How can you do it better?
1 Determine one or more hypotheses
“Does extra whitespace between job cards help job seekers to navigate quicker?”
2 Agree on the data, metrics and acceptable trade-offs up front
Suggested metrics: (i) time to click, (ii) click rate, (iii) time to hire
Important Question #1
How many metrics?
Spoiler: 3
Your product is a high dimensional hypercube
[Figure: 2D, 3D, 4D, 5D, 6D and 7D hypercubes]
We need a low-dimensional representation that preserves almost all of the signal
How many metrics?
Singular value decomposition (SVD)
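One way the SVD step could look in practice is sketched below: centre a sessions-by-metrics matrix, take its singular values, and count how many directions are needed to retain a chosen share of the variance. The function name, the 95% threshold and the synthetic data (built from three hidden factors on purpose) are assumptions for illustration, not details from the talk.

```python
import numpy as np


def n_metrics_needed(X: np.ndarray, signal: float = 0.95) -> int:
    """Smallest number of singular directions whose share of variance reaches `signal`."""
    Xc = X - X.mean(axis=0)                      # centre each metric (column)
    s = np.linalg.svd(Xc, compute_uv=False)      # singular values, largest first
    explained = np.cumsum(s**2) / np.sum(s**2)   # cumulative share of variance
    return min(int(np.searchsorted(explained, signal)) + 1, len(s))


# Synthetic data: 9 observed metrics driven by 3 hidden factors plus a little noise.
rng = np.random.default_rng(0)
factors = rng.normal(size=(500, 3))
loadings = np.kron(np.eye(3), np.ones((1, 3)))   # each factor drives 3 metrics
X = factors @ loadings + 0.01 * rng.normal(size=(500, 9))

print(n_metrics_needed(X))
```

The singular directions are combinations of metrics rather than individual metrics, so this only suggests how many dimensions matter; choosing which concrete metrics to track is a separate step.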
Important Question #2
How do you choose great metrics?
This is a full academic discipline
Some dedicated their 20s to this!
Choosing metrics
You need to decide on a target, termed the estimand in statistics (θ)
Choose how you’ll aim for the target: the estimator and estimate (θ̂)
Mathematical criteria for metric evaluation
1 Bias
2 Variance
3 System complexity
Bias
It can be easy to miss bias
Hidden bias in our example
Estimate “time to hire” for job seekers
Job seeker | First action | Still active | Hired
1 | 01/01/2016 | Yes | No
2 | 01/22/2016 | No | 01/25/2016
3 | 02/04/2016 | No | 02/23/2016
4 | 02/17/2016 | No | No
... | ... | ... | ...
n | 04/23/2016 | Yes | No
Initial Metric Proposal
Average time to hire for job seekers who were hired
But there’s a flaw: job seekers who have not (yet) been hired are left out of the average
Solution
Estimate typical time to hire using the Kaplan-Meier estimator
[Plot: estimated time to hire against time (t)]
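A minimal sketch of the Kaplan-Meier idea follows, assuming each job seeker is reduced to a duration in days plus a flag saying whether that duration ended in a hire or simply in the end of observation. The function names and the seven data points are made up for illustration; a production analysis would more likely use an established survival-analysis library.

```python
from typing import List, Optional, Tuple


def kaplan_meier(durations: List[float], hired: List[bool]) -> List[Tuple[float, float]]:
    """Return [(t, S(t))], where S(t) estimates the share of job seekers not yet hired after t days."""
    curve = []
    surviving = 1.0
    for t in sorted({d for d, h in zip(durations, hired) if h}):
        at_risk = sum(1 for d in durations if d >= t)   # still being followed at t
        events = sum(1 for d, h in zip(durations, hired) if h and d == t)
        surviving *= 1 - events / at_risk
        curve.append((t, surviving))
    return curve


def median_time_to_hire(curve: List[Tuple[float, float]]) -> Optional[float]:
    """First time at which at most half of job seekers remain unhired, if ever reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Made-up data: days from first action to hire, or to the last day we observed them.
durations = [3, 10, 19, 25, 40, 60, 60]
hired = [True, True, True, True, False, False, False]

curve = kaplan_meier(durations, hired)
naive_mean = sum(d for d, h in zip(durations, hired) if h) / sum(hired)
print(naive_mean, median_time_to_hire(curve))
```

On this toy data the hired-only average comes out well below the Kaplan-Meier median, because the three job seekers who were never observed being hired still count as "at risk" in the estimator instead of being dropped.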
Variance - a measure of data spread
[Figure: low variance vs. high variance]
Variance is fundamental for valid statistical inference
Science assumes “innocent until proven guilty”
We often term this our null hypothesis (H0)
Proof required beyond reasonable doubt in order to reject the null hypothesis
Variance is your estimate of uncertainty, i.e. doubt
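To make the link between variance and "reasonable doubt" concrete, here is a small sketch of a two-sided two-proportion z-test, one common way to compare click rates between a control and a treatment group. The talk does not say which test was used, so treat the choice of test, the function name and the counts as assumptions.

```python
import math


def two_proportion_z_test(clicks_a: int, n_a: int, clicks_b: int, n_b: int):
    """Return (z, two-sided p-value) for H0: both groups share the same click rate."""
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    # Standard error of the difference under H0; this is where variance enters.
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Same observed lift (10% -> 12%), very different amounts of data (made-up counts).
_, p_small = two_proportion_z_test(100, 1_000, 120, 1_000)
_, p_large = two_proportion_z_test(10_000, 100_000, 12_000, 100_000)
print(p_small, p_large)
```

The observed difference is identical in both calls; only the standard error changes, and with it whether the null hypothesis can be rejected at a conventional 5% level.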
Note
We often choose the Minimum Variance Unbiased Estimator (MVUE)
Not always MVUE: occasionally you might trade bias for variance, e.g. machine learning
[Figure: grid of low/high bias against low/high variance]
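A worked example of trading bias for variance, using the standard decomposition MSE = bias² + variance. It considers a deliberately shrunken estimator, c times the sample mean, which is an illustrative choice and not something the talk proposes; the parameter values are also made up.

```python
def mse_of_scaled_mean(c: float, mu: float, sigma2: float, n: int) -> float:
    """Mean squared error of the estimator c * sample_mean for a true mean `mu`,
    given population variance `sigma2` and sample size `n`."""
    bias = (c - 1.0) * mu           # zero when c == 1 (the unbiased sample mean)
    variance = c**2 * sigma2 / n    # shrinking towards zero reduces variance
    return bias**2 + variance


unbiased = mse_of_scaled_mean(1.0, mu=1.0, sigma2=4.0, n=4)
shrunk = mse_of_scaled_mean(0.8, mu=1.0, sigma2=4.0, n=4)
print(unbiased, shrunk)
```

With these numbers the biased estimator has the lower mean squared error, which is the kind of trade regularized machine learning models make; with a larger sample or a larger true mean the unbiased estimator wins again.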
System complexity
Product development isn’t linear
Sometimes there are multiple potential targets
Or the target is partially blocked
Or it keeps moving
It can become stressful
There is no catch-all mathematical formula to measure and account for system complexity
But that doesn’t mean you shouldn’t try to estimate it and factor it into decisions
[Figure: job seeker journey from “I need a job” through Search, Tap, Apply, Interview and Offer to Hire]
Covered extensively in Ketan’s talk
A (strange) American staple
Which also involves prediction brackets
You predict a winner for each game and are awarded points if correct
[Figure: tournament bracket comparing my prediction with the actual result]
If you predict an upset early, success/failure compounds
System complexity factors
● Downstream compounded loss
● Number of bracket participants
● Points awarded at each stage
How to win your NCAA pool
Simulate the downstream effect of all potential decisions
Check whether it increases/decreases your win probability
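The idea of evaluating downstream effects can be sketched on a toy four-team bracket. This version enumerates every possible tournament outcome and computes the expected points of each pick; it is a simplification, because it ignores the other pool participants and therefore does not compute a true win probability. Team ratings, scoring rules and all names are hypothetical.

```python
from itertools import product

# Hypothetical team ratings and scoring rules, for illustration only.
STRENGTH = {"A": 8.0, "B": 1.0, "C": 3.0, "D": 2.0}
SEMI_POINTS, FINAL_POINTS = 1, 2
OPPONENT = {"A": "B", "B": "A", "C": "D", "D": "C"}   # semi-final pairings


def p_win(x: str, y: str) -> float:
    """Assumed probability that team x beats team y."""
    return STRENGTH[x] / (STRENGTH[x] + STRENGTH[y])


def brackets():
    """Every (semi-final 1 winner, semi-final 2 winner, champion) combination."""
    for s1, s2 in product("AB", "CD"):
        for champion in (s1, s2):
            yield (s1, s2, champion)


def probability(outcome) -> float:
    """Probability that the tournament actually unfolds as `outcome`."""
    s1, s2, champion = outcome
    runner_up = s2 if champion == s1 else s1
    return p_win(s1, OPPONENT[s1]) * p_win(s2, OPPONENT[s2]) * p_win(champion, runner_up)


def score(pick, outcome) -> int:
    """Points earned by `pick` if the tournament ends as `outcome`."""
    semis = sum(p == o for p, o in zip(pick[:2], outcome[:2]))
    return SEMI_POINTS * semis + FINAL_POINTS * (pick[2] == outcome[2])


def expected_score(pick) -> float:
    return sum(probability(o) * score(pick, o) for o in brackets())


best = max(brackets(), key=expected_score)
upset_pick = ("B", "C", "B")   # backs the weakest team all the way
print(best, expected_score(best), expected_score(upset_pick))
```

Picking the early upset costs more than one semi-final point, because the champion points that depend on it are usually lost as well; that is the compounding the slide describes.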
Reminder - How can you do it better?
1 Determine one or more hypotheses
“Does extra whitespace between job cards help job seekers to navigate quicker?”
2 Agree on the data, metrics and acceptable trade-offs up front
Metrics: (i) time to click, (ii) click rate, (iii) time to hire
Stage 4: Experiment
What did we do?
Ran a single treatment experiment where we simultaneously changed four components
Why was it wrong?
Couldn’t disentangle the effects of the 4 different treatments
How can you do it better?
Run a full factorial experiment
Suggestion
A: Whitespace
B: Orange text
C: Salary range
Increased statistical power, and simultaneous testing of interaction effects
i.e. you’ll learn more and learn quicker
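A minimal sketch of a 2×2×2 full factorial design for the three suggested factors follows. It generates the eight test cells, then estimates main effects and an interaction as plus/minus contrasts of the cell responses. The response function is invented and noise-free purely to show the arithmetic; the function names are assumptions.

```python
from itertools import product

FACTORS = ("whitespace", "orange_text", "salary_range")   # A, B, C from the slide


def full_factorial(factors=FACTORS):
    """All 2**k on/off combinations; each one is a test cell."""
    return [dict(zip(factors, levels)) for levels in product((0, 1), repeat=len(factors))]


def effect(cells, responses, *names):
    """Main effect (one name) or interaction (several names) as a +/- contrast of cell responses."""
    total = 0.0
    for cell, y in zip(cells, responses):
        sign = 1
        for name in names:
            sign *= 1 if cell[name] else -1
        total += sign * y
    return total / (len(cells) / 2)


def made_up_response(cell):
    """Hypothetical metric per cell: two main effects plus one interaction, no noise."""
    return (10 + 2 * cell["whitespace"] + 1 * cell["orange_text"]
            + 1.5 * cell["whitespace"] * cell["orange_text"])


cells = full_factorial()
responses = [made_up_response(c) for c in cells]
print(effect(cells, responses, "whitespace"))
print(effect(cells, responses, "whitespace", "orange_text"))
print(effect(cells, responses, "salary_range"))
```

Every cell contributes to every estimate, which is where the gain in statistical power over separate one-factor tests comes from, and the interaction contrast is something a single bundled treatment cannot provide at all.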
Stage 5: Analysis
What did we do?
1 Cobbled data together from different sources
2 Defined different metrics
3 Invested a lot of time analysing tests
To consult the statistician after an experiment is finished is often merely to ask her to conduct a post mortem examination. She can perhaps say what the experiment died of.
R.A. Fisher
Why was it wrong?
Opinion-driven, time sink, unsatisfying for all involved
How can you do it better?
With correct setup, this should be trivial
 | Existing metric | New metric
Existing product | Uninteresting | Metric Innovation
New product | Product Innovation | Uninformative
Never use new data or metrics to validate new products!
Stage 6: Conclusion
What did we do?
Drew two different conclusions
Why was it wrong?
1 Didn’t learn anything
2 Lost team trust
How can you do it better?
Should follow directly from analysis
The Goldilocks syndrome
A/B test outcome | Conclusion
[-∞, -5%] | too cold
(-5%, -1%] | too cold
(-1%, 1%] | too cold
(1%, 5%] | just right, declare victory
(5%, ∞] | too hot
Retain healthy skepticism
Always look for bugs
Check for repeatability via holdbacks
The Complete Scientific Method
Observation: nano, micro, macro
Question: prioritize, implicit trade-offs
Hypothesis: bias & variance, 3 metrics
Experiment: full factorial design
Analysis: trivial, no data innovation
Conclusion: Goldilocks syndrome, repeatability
Parting Thoughts
Data-driven can be disorientating in a world of abundant data
Be science-driven, i.e. use the scientific method to add necessary structure
Invest in the observation, question and hypothesis stages
~ finn ~

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 

Weapons of Math Instruction: Evolving from Data-Driven to Science-Driven

  • 1. Weapons of Math Instruction: Evolving from Data-Driven to Science-Driven. Donal McMahon, Director of Data Science, Indeed
  • 2. Convince you to use the scientific method. Then, I’ll teach you how.
  • 3.
  • 6. We’ve hosted 6 eng tech talks on this topic!
  • 7. Many other industries are also now becoming data-driven
  • 8. Why data-driven? (1) Democratize decision-making; (2) Better decisions; (3) Increase decision velocity; (4) Improve collaboration via ego removal
  • 9. Is data-driven an accurate descriptor?
  • 11. Why do you need to be science-driven? A cautionary tale
  • 12. Dreaming big and about to change the world [Photos: Donal, PM] (disclaimer: sadly not real childhood photos)
  • 13. Idea: Modernize our mobile site to improve job seeker experience
  • 15. Change 1: Increased spacing between jobs [Screenshots: Control vs. Treatment]
  • 16. Change 2: Replaced orange text with buttons for "New" and "Apply with your Indeed Resume" [Screenshots: Control vs. Treatment]
  • 17. Change 3: Removed sponsored jobs [Screenshots: Control vs. Treatment]
  • 18. Change 4: Other minor UI tweaks (salary range, home button, fonts) [Screenshots: Control vs. Treatment]
  • 21. We ran an A/B test and generated lots of data
  • 23. We drew contradictory conclusions
  • 24. What is job seeker experience?
  • 25. What’s a job? One that’s anywhere on the page, or one that’s viewed?
  • 26. What’s an acceptable metric trade-off?
  • 27. Resolution strategy: be more data-driven
  • 28. So, we threw more data at each other
  • 29.
  • 30.
  • 31.
  • 32.
  • 33. different hypotheses + different data + different metrics ∴ different conclusions
  • 34. We learn geology the morning after the earthquake. Ralph Waldo Emerson.
  • 35. A better solution does exist
  • 36. The Scientific Method: Observation → Question → Hypothesis → Experiment → Analysis → Conclusion
  • 37. Remainder of this talk: (1) What did we do? (2) Why was it wrong? (3) How can you do it better?
  • 39. What did we do? Nothing
  • 40. Why was it wrong? (1) Didn’t establish a baseline for job seeker experience, or measures; (2) When we failed, we had no knowledge backlog for future work
  • 41. How can you do it better? Nano: study real job seeker sessions. Micro: partner with experts (UX) to gather qualitative data. Macro: large scale data analysis and observation via experimentation
  • 42. How can you do it better? Nano: study real job seeker sessions. Example session: Query 1 → Click on Job A → Click on Job C → Query 2 → Click on Job D → Apply
  • 43. Not only is the universe stranger than we imagine, it is stranger than we can imagine. Sir Arthur Eddington
  • 44. How can you do it better? Micro: partner with experts (UX) to gather qualitative data. A shameless plug: medium.com/indeed-data-science
  • 45. How can you do it better? Micro: partner with experts (UX) to gather qualitative data: (1) Real-life observation; (2) Interviews; (3) Content analysis (surveys)
  • 46. How can you do it better? Macro: large scale data analysis and observation via experimentation. Common question: What’s a worthwhile/launchable metric trade-off?
  • 47. How can you do it better? Macro: large scale data analysis and observation via experimentation. Reality: You’re making trade-offs implicitly already
  • 48. How can you do it better? Macro: large scale data analysis and observation via experimentation. Learn your implicit local trade-off function: run multiple simple perturbation experiments, all the time
  • 52. Learn your current implicit trade-offs via experimentation [Plot: Applies vs. JobAlert Signups]. Expt 1: bold "Apply with your Indeed Resume"
  • 53. Learn your current implicit trade-offs via experimentation [Plot: Applies vs. JobAlert Signups]. Expt 1: bold "Apply with your Indeed Resume". Expt 2: add pixel whitespace to JobAlert UI
  • 54. Learn your current implicit trade-offs via experimentation [Plot: Applies vs. JobAlert Signups]
  • 55. Compare your current state to all Pareto efficient alternatives [Plot: Applies vs. JobAlert Signups]
  • 56. For each Pareto efficient alternative you have a trade-off: ΔApplies vs. ΔJobAlerts [Plot: Applies vs. JobAlert Sign-ups]
  • 57. How can you do it better? Macro: large scale data analysis and observation via experimentation. Implicit trade-off: each JobAlert sign-up is worth 1.7 Applies
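
The idea on slides 52 to 57 (run many small experiments, keep the Pareto efficient outcomes, read off a ΔApplies/ΔJobAlerts ratio for each) can be sketched roughly as follows. This is an illustrative sketch, not the talk's actual analysis: the function names and every number in `outcomes` are hypothetical, chosen only so that one alternative reproduces a 1.7 ratio.

```python
from typing import List, Tuple

# (delta_applies, delta_signups) of each experiment relative to the current state.
Point = Tuple[float, float]

def pareto_efficient(points: List[Point]) -> List[Point]:
    """Return the points that no other point matches or beats on both metrics."""
    frontier = []
    for p in points:
        dominated = any(
            q != p and q[0] >= p[0] and q[1] >= p[1] for q in points
        )
        if not dominated:
            frontier.append(p)
    return frontier

def applies_per_signup(point: Point) -> float:
    """Applies given up per extra JobAlert sign-up for one alternative."""
    delta_applies, delta_signups = point
    return -delta_applies / delta_signups

# Hypothetical experiment outcomes (made-up numbers); (0, 0) is the current state.
outcomes = [(0.0, 0.0), (-17.0, 10.0), (-40.0, 15.0), (5.0, -8.0), (-30.0, 5.0)]

frontier = pareto_efficient(outcomes)
# Only alternatives that gain sign-ups have a meaningful "applies per sign-up" ratio.
rates = {p: applies_per_signup(p) for p in frontier if p[1] > 0}

print(frontier)
print(rates)
```

With these made-up numbers, the (-30, +5) experiment is dropped because (-17, +10) beats it on both metrics, and the nearest remaining alternative costs 1.7 applies per extra sign-up, the kind of ratio slide 57 reports.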
  • 59. What did we do? Nothing
  • 60. Why was it wrong? (1) We never prioritized the most important question(s); (2) By bundling questions, we couldn’t answer any, learn and improve
  • 61.
  • 62. How can you do it better?
      Research Question | Potential Impact | Complexity | Time To Learn
      What are good measures for job seeker experience? | ? | ? | ?
      How can we help job seekers navigate to their desired job more quickly? | ? | ? | ?
      How can we clearly denote sponsored content? | ? | ? | ?
      … | ... | ... | ...
  • 64. What did we do? Modernize the mobile interface to improve job seeker experience
  • 65.
  • 66. Why was it wrong? (1) Hypothesis was ill-defined and vague; (2) No established metrics; (3) No clear success/failure criteria
  • 67. How can you do it better? (1) Determine one or more hypotheses, e.g. “Does extra whitespace between job cards help job seekers to navigate quicker?”; (2) Agree on the data, metrics and acceptable trade-offs up front. Suggested metrics: (i) time to click, (ii) click rate, (iii) time to hire
  • 68. Important Question #1: How many metrics?
  • 69. Spoiler: 3
  • 70. Your product is a high dimensional hypercube
  • 71-74. [Figure: 2D, 3D, 4D, 5D, 6D and 7D hypercubes]
  • 75. How many metrics? We need a low-dimensional representation that preserves almost all of the signal
  • 76. How many metrics? Singular value decomposition (SVD)
  • 77.
  • 78-79. How many metrics, using SVD
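
One way to read slides 75 to 79 is: stack your candidate metrics into a matrix, take the SVD, and count how many singular directions are needed to keep almost all of the signal. The sketch below assumes that reading. The data is synthetic (9 metrics generated from 3 hidden factors, so the answer is 3 by construction), and the 95% threshold, the function name and the standardization step are assumptions for illustration, not details from the talk.

```python
import numpy as np

def effective_metric_count(X: np.ndarray, keep: float = 0.95) -> int:
    """Smallest number of singular directions whose squared singular values
    account for at least `keep` of the total."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)    # standardize each metric column
    s = np.linalg.svd(Z, compute_uv=False)       # singular values, descending
    explained = np.cumsum(s**2) / np.sum(s**2)   # cumulative share of variance
    return int(np.searchsorted(explained, keep) + 1)

# Synthetic data (an assumption): 9 observed metrics driven by 3 hidden factors.
rng = np.random.default_rng(0)
factors = rng.normal(size=(500, 3))
loadings = np.kron(np.eye(3), np.ones((1, 3)))   # each factor drives 3 metrics
X = factors @ loadings + 0.01 * rng.normal(size=(500, 9))

print(effective_metric_count(X))
```

On real product data the rows would be sessions, days or experiments and the columns your candidate metrics; the count is only as meaningful as the threshold you pick.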
  • 80. Important Question #2: How do you choose great metrics?
  • 81. This is a full academic discipline. Some dedicated their 20s to this!
  • 82. Choosing metrics: You need to decide on a target (θ)
  • 83. Termed the estimand in statistics (θ)
  • 84. Choosing metrics: Choose how you’ll aim for the target
  • 85. Estimator and Estimate (θ)
  • 86. Mathematical criteria for metric evaluation: (1) Bias; (2) Variance; (3) System complexity
  • 87. Mathematical criteria: (1) Bias; (2) Variance; (3) System complexity
  • 88. Bias
  • 89.
  • 90. It can be easy to miss bias
  • 91. Hidden bias in our example: Estimate “time to hire” for job seekers
  • 92.
      Job seeker | First action | Still active | Hired
      1 | 01/01/2016 | Yes | No
      2 | 01/22/2016 | No | 01/25/2016
      3 | 02/04/2016 | No | 02/23/2016
      4 | 02/17/2016 | No | No
      ... | ... | ... | ...
      n | 04/23/2016 | Yes | No
  • 93. Initial metric proposal: Average time to hire for job seekers who were hired
  • 95-97. [Same table as slide 92]
  • 98. Solution: Estimate typical time to hire using the Kaplan-Meier estimate
  • 99.
  • 100. [Plot: Estimated time to hire vs. Time (t)]
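
To see why averaging only over hired job seekers (slide 93) is biased, and what the Kaplan-Meier estimate of slide 98 does instead, here is a minimal pure-Python sketch. The durations 3 and 19 days match job seekers 2 and 3 in the table on slide 92; every other value, including when the non-hired job seekers were last observed, is a hypothetical assumption.

```python
from typing import List, Optional, Tuple

def kaplan_meier(durations: List[float], hired: List[bool]) -> List[Tuple[float, float]]:
    """Return [(t, S(t))] at each hire time, where S(t) is the estimated
    probability of still not being hired after t days."""
    data = sorted(zip(durations, hired))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0    # hires observed at time t
        leaving = 0   # everyone whose observation ends at time t
        while i < len(data) and data[i][0] == t:
            events += data[i][1]
            leaving += 1
            i += 1
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

def median_time(curve: List[Tuple[float, float]]) -> Optional[float]:
    """First time at which at most half are still unhired (None if never reached)."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# Days from first action to hire, or to last observation if not hired (censored).
durations = [3, 19, 40, 60, 75, 110]            # hypothetical except 3 and 19
hired = [True, True, False, True, False, False]

curve = kaplan_meier(durations, hired)
naive_mean = sum(d for d, h in zip(durations, hired) if h) / sum(hired)

print(curve)
print(median_time(curve), naive_mean)
```

The naive mean ignores everyone who has not been hired yet, so it comes out well below the Kaplan-Meier median for this made-up sample. A maintained library implementation would normally be preferable to this hand-rolled version.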
  • 101. Mathematical criteria for metric evaluation: (1) Bias; (2) Variance; (3) System complexity
  • 102. Variance: a measure of data spread [Figure: low variance vs. high variance]
  • 103.
  • 104. Variance is fundamental for valid statistical inference
  • 105. Science assumes “innocent until proven guilty”. We often term this our null hypothesis (H0)
  • 106. Proof is required beyond reasonable doubt in order to reject the null hypothesis
  • 107. Variance is your estimate of uncertainty, i.e. doubt
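
Slides 104 to 107 tie variance to the null hypothesis: the variance of the estimate sets how much doubt remains. A common textbook instance for A/B tests is the pooled two-proportion z-test, sketched below with the standard library only. The talk does not name a specific test, so the choice of test and the example counts are assumptions.

```python
import math
from typing import Tuple

def two_proportion_z_test(success_a: int, n_a: int,
                          success_b: int, n_b: int) -> Tuple[float, float]:
    """Pooled z-test for H0: both groups share the same underlying rate.
    Returns (z, two-sided p-value)."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    # Standard error of the difference under H0: this is where variance enters.
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))   # equals 2 * (1 - Phi(|z|))
    return z, p_value

# Hypothetical counts: control 1000 clicks / 10000, treatment 1100 / 10000.
z, p_value = two_proportion_z_test(1000, 10_000, 1100, 10_000)
print(round(z, 2), round(p_value, 3))
```

The same one-point lift on a tenth of the traffic gives a much larger standard error and a p-value far from any usual threshold, which is the sense in which variance is "your estimate of doubt".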
  • 108. Note: We often choose the Minimum Variance Unbiased Estimator (MVUE)
  • 109. Not always MVUE: Occasionally you might trade bias for variance, e.g. machine learning [Figure: low/high variance by low/high bias]
  • 110. Mathematical criteria: (1) Bias; (2) Variance; (3) System complexity
  • 112. Product development isn’t linear
  • 113. Sometimes there are multiple potential targets
  • 114. Or the target is partially blocked
  • 115. Or it keeps moving
  • 116. It can become stressful
  • 117. There is no catch-all mathematical formula to measure and account for system complexity
  • 118. But that doesn’t mean you shouldn’t try to estimate it and factor it into decisions
  • 119. “I need a job” → Search → Tap → Apply → Interview → Offer → Hire. Covered extensively in Ketan’s talk
  • 120.
  • 122. Which also involves prediction brackets
  • 123. You predict a winner for each game and are awarded points if correct [Bracket figure: my prediction vs. actual result]
  • 124. If you predict an upset early, success/failure compounds [Bracket figure: my prediction vs. actual result]
  • 125. System complexity factors: downstream compounded loss; number of bracket participants; points awarded at each stage
  • 126. How to win your NCAA pool: Simulate the downstream effect of all potential decisions. Check whether it increases/decreases your win probability
  • 127. Reminder: How can you do it better? (1) Determine one or more hypotheses, e.g. “Does extra whitespace between job cards help job seekers to navigate quicker?”; (2) Agree on the data, metrics and acceptable trade-offs up front. Metrics: (i) time to click, (ii) click rate, (iii) time to hire
  • 129. What did we do? Ran a single treatment experiment where we simultaneously changed four components
  • 130.
  • 131. Why was it wrong? Couldn’t disentangle the effects of the 4 different treatments
  • 132. How can you do it better? Run a full factorial experiment
  • 133. Full factorial experiment suggestion: A: Whitespace; B: Orange text; C: Salary range
  • 134. Full factorial experiment: Increased statistical power, and simultaneous testing of interaction effects
  • 135. Full factorial experiment, i.e. you’ll learn more and learn quicker
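
A full factorial design for the three factors on slide 133 assigns traffic to every on/off combination (2³ = 8 cells), which is what lets main effects and interactions be separated. The sketch below enumerates the cells and computes both from per-cell outcomes. The click-rate model, its coefficients and all function names are hypothetical, invented only to have something to estimate.

```python
from itertools import product
from typing import Dict, List

FACTORS = ["whitespace", "orange_text", "salary_range"]  # A, B, C from slide 133

def full_factorial(factors: List[str]) -> List[Dict[str, bool]]:
    """Every on/off combination of the factors: 2**k cells for k factors."""
    return [dict(zip(factors, levels))
            for levels in product([False, True], repeat=len(factors))]

def main_effect(cells, outcomes, factor: str) -> float:
    """Mean outcome with the factor on minus mean outcome with it off."""
    on = [y for c, y in zip(cells, outcomes) if c[factor]]
    off = [y for c, y in zip(cells, outcomes) if not c[factor]]
    return sum(on) / len(on) - sum(off) / len(off)

def interaction(cells, outcomes, a: str, b: str) -> float:
    """Effect of factor `a` when `b` is on, minus its effect when `b` is off."""
    pairs = list(zip(cells, outcomes))

    def effect(b_level: bool) -> float:
        sub = [(c, y) for c, y in pairs if c[b] == b_level]
        return main_effect([c for c, _ in sub], [y for _, y in sub], a)

    return effect(True) - effect(False)

def hypothetical_click_rate(cell: Dict[str, bool]) -> float:
    """Made-up ground truth, including a whitespace x salary_range interaction."""
    rate = 0.10
    rate += 0.02 * cell["whitespace"]
    rate -= 0.005 * cell["orange_text"]
    rate += 0.01 * cell["salary_range"]
    rate += 0.01 * (cell["whitespace"] and cell["salary_range"])
    return rate

cells = full_factorial(FACTORS)
outcomes = [hypothetical_click_rate(c) for c in cells]

print(len(cells))
print(main_effect(cells, outcomes, "whitespace"))
print(interaction(cells, outcomes, "whitespace", "salary_range"))
```

Every cell contributes to every main-effect estimate, which is where the extra statistical power on slide 134 comes from compared with running three separate A/B tests. A bundled single treatment only ever observes the all-off and all-on cells.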
  • 137. What did we do? (1) Cobbled data together from different sources; (2) Defined different metrics; (3) Invested a lot of time analysing tests
  • 138. To consult the statistician after an experiment is finished is often merely to ask her to conduct a post mortem examination. She can perhaps say what the experiment died of. R.A. Fisher
  • 139. Why was it wrong? Opinion-driven, time sink, unsatisfying for all involved
  • 140. How can you do it better? With correct setup, this should be trivial
  • 141-145.
      | Existing metric | New metric
      Existing product | Uninteresting | Metric Innovation
      New product | Product Innovation | Uninformative
  • 146. Never use new data or metrics to validate new products!
  • 148. What did we do? Drew two different conclusions
  • 149. Why was it wrong? (1) Didn’t learn anything; (2) Lost team trust
  • 150. How can you do it better? Should follow directly from analysis
  • 151. The Goldilocks syndrome
      A/B test outcome | Conclusion
      [-∞, -5%] | too cold
      (-5%, -1%] | too cold
      (-1%, 1%] | too cold
      (1%, 5%] | Just right, declare victory
      (5%, ∞] | too hot
  • 152. Retain healthy skepticism. Always look for bugs. Check for repeatability via holdbacks
  • 153-154. The Complete Scientific Method. Observation: nano, micro, macro. Question: prioritize, implicit trade-offs. Hypothesis: bias & variance, 3 metrics. Experiment: full factorial design. Analysis: trivial, no data innovation. Conclusion: Goldilocks syndrome, repeatability
  • 155. Parting thoughts: Data-driven can be disorientating in a world of abundant data. Be science-driven, i.e. use the scientific method to add necessary structure. Invest in the observation, question and hypothesis stages