Thinking with Data
Max Shron
@mshron
Big picture
• Data is too much fun, too easy to rabbit-hole
• Specialized knowledge is hard to communicate
• Not all of statistics is well adapted to the real world
• We need techniques to handle that
Big picture
• Design — UX, consulting, etc.
• Humanities — philosophy, law, etc.
• Social science — sociology, psychology, etc.
Scoping
• First set of techniques: scoping.
• The world gives us vague requests.
• We should have things clear before we start, or
we end up with uninteresting questions.
• Write things down or say them out loud.
Scoping
• Imagine we are working with a company with a
subscription business. The CEO asks us for a
churn model.
• Bad scope: “We will use R to create a logistic
regression to predict who will quit using the
product.”
• Not actionable, irrelevant detail.
Scoping
• CoNVO:
• Context
• Need
• Vision
• Outcome
• Iterative process — start simple, refine, refine, refine.
Scoping
• Context
• Who are we working with? What are the big
picture, long term goals?
• “The company has a subscription model. The CEO’s
goal is to improve profitability.”
Scoping
• Need
• What is the particular knowledge we are
missing?
• “We want to understand who drops off early
enough so that we can intervene.”
Scoping
• Vision
• What would it look like to solve the problem?
• “We will build a predictive model using
behavioral data to predict who will drop off —
early enough to be useful.”
• Sources of data: important. Kinds of offers:
important. Kind of experimentation: important.
Kind of model: unimportant.
Scoping
• Outcome
• Who will be responsible for next steps? How will we
know if we are correct?
• “The tech team will implement the model in a batch
process to run daily, automatically sending out email
offers. We will calculate success metrics (precision
and recall) on held out users, and send a weekly
email of stats to stay on top.”
• We need a control group!
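
The success metrics named on this slide can be sketched as follows. This is a minimal illustration, not the speaker's implementation: the function name `precision_recall` and the user IDs are hypothetical, and it assumes that for a held-out group of users we know both the set the model flagged and the set that actually dropped off.

```python
def precision_recall(flagged, churned):
    """Return (precision, recall) for two sets of held-out user IDs.

    flagged: users the model predicted would drop off (hypothetical input)
    churned: users who actually dropped off (hypothetical input)
    """
    true_positives = len(flagged & churned)
    # Precision: of the users we flagged, what share really dropped off?
    precision = true_positives / len(flagged) if flagged else 0.0
    # Recall: of the users who dropped off, what share did we flag?
    recall = true_positives / len(churned) if churned else 0.0
    return precision, recall


# Made-up held-out users, for illustration only.
flagged = {"u1", "u2", "u3", "u4"}
churned = {"u2", "u3", "u5"}
precision, recall = precision_recall(flagged, churned)
```

These numbers describe only how well the model predicts. Whether the email offers change anything is a separate question, which is why the slide asks for a control group.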
Scoping
• How do we develop a CoNVO?
• interviews
• kitchen sink interrogation
• roleplaying
• story-telling
• mockups
Scoping
• Clearer vision with mockups
• “John Smith is 36 years old, he has seen 40 different
pages over the past two weeks, and he has a 20%
chance to convert.”
Scoping
• Context: We are hired to work in a hospital system with
250K patients over 20 years. We report to the CEO, who is
interested in building a tool for reducing medical issues.
• Need: After talking to some doctors, there is some belief
that antibiotics are overused, but overuse is hard to detect.
• Vision: A pilot investigation. If we find a signal, a repeatable
flagging tool.
• Outcome: The CMO will decide if the pilot is valuable based on
a report. The automated tool would be run by the CMO on demand.
Arguments
• Data is not a ray gun!
• People need to be convinced, including you.
• The world is not deductive logic; we need a
theory that accounts for people having minds.
• Trusting a tool, making a point with a graph,
coming to terms on a definition, convincing
someone to act differently, etc.
Arguments
• The general model is semi-deductive. We move from
what is known or agreed upon towards
what is not yet known.
• Patterns of reasoning help us make stronger cases
with less time and effort. Take advantage of two
thousand years of research.
Arguments
• Example: Predicting Δ poverty from satellite data
• It takes 5-10 years to get small scale poverty
estimates in poor countries.
• The vision: predict whether the poverty estimates
will go up or down ahead of time, using cheap
satellite data.
• The outcome: use to informally guide policy
decisions, keeping track of interventions.
Arguments
• Claim - Your audience does not believe it yet but
you think you can make a case for it.
• “Poverty can be modeled effectively with satellite
data.”
• Prior knowledge - Things your audience already
believes before the case is started.
Arguments
• Evidence - Where data enters an argument. We
transform data into evidence. Counts, models, graphs,
etc. make up the evidence.
• Justification - The reasoning why the evidence should
cause us to believe the claim.
• “These graphs indicate that the residuals for our
model are as we had anticipated.”
• Rebuttal - Any of the reasons why the justification might
not hold in this particular case. Usually smart to know.
Arguments
• Patterns!
• Causal analysis
• Convincing takes more than math
• Categories of dispute
Arguments
• Disputes of fact — getting the details straight
• “The F1 for this model is .7”
• Two stock issues:
• What is a reasonable truth condition?
• Is it satisfied?
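
One way to make the two stock issues concrete for the F1 claim is to fix a truth condition in advance and then check it. The sketch below assumes the truth condition is "F1 on held-out data, rounded to one decimal place, equals 0.7". That tolerance, the helper `claim_holds`, and the precision and recall values are assumptions for illustration, not figures from the talk.

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def claim_holds(measured, claimed, decimals=1):
    """Assumed truth condition: the values agree after rounding."""
    return round(measured, decimals) == round(claimed, decimals)


# Illustrative numbers only.
f1 = f1_score(precision=0.75, recall=0.65)
satisfied = claim_holds(f1, claimed=0.7)
```

Agreeing on the rounding rule first keeps the later dispute about whether the claim is satisfied short.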
Arguments
• Disputes of definition — relating words to math
• “Poverty is defined as FGT, α = 2”
• Three stock issues:
• Does this definition make a useful distinction?
• How consistent is this definition with prior ideas?
• What, if any, are the reasonable alternatives?
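
For readers unfamiliar with the definition quoted on this slide, FGT is the Foster-Greer-Thorbecke family of poverty indices. With poverty line z, incomes y_i, and population size N, the index is (1/N) times the sum of ((z - y_i) / z) ** α over everyone below the line. The sketch below is a minimal illustration with made-up incomes; the function name `fgt_index` is hypothetical, and counting only incomes strictly below the line as poor is an assumption.

```python
def fgt_index(incomes, poverty_line, alpha=2):
    """Foster-Greer-Thorbecke index.

    Averages ((z - y) / z) ** alpha over the whole population, where
    people at or above the poverty line z contribute zero.
    """
    if not incomes:
        raise ValueError("incomes must be non-empty")
    total = 0.0
    for income in incomes:
        # Assumption: "poor" means strictly below the poverty line.
        if income < poverty_line:
            gap = (poverty_line - income) / poverty_line
            total += gap ** alpha
    return total / len(incomes)


# Made-up incomes and poverty line, for illustration only.
incomes = [50, 100, 150, 200]
severity = fgt_index(incomes, poverty_line=100, alpha=2)
headcount = fgt_index(incomes, poverty_line=100, alpha=0)
```

Setting α = 0 gives the share of people below the line, while α = 2 weights the poorest more heavily. That difference is the kind of distinction the first stock issue asks about.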
Arguments
• Disputes of value — making the right trade-offs
• “Our model is simple enough.”
• Two stock issues:
• How do our goals determine which values are
most important?
• Have the values been properly applied here?
Arguments
• Disputes of policy — the right course of action
• “We should use this model to informally guide our decisions
between official estimates.”
• Four stock issues:
• Is there a problem? (ill)
• Where is credit or blame due? (blame)
• Will the proposal solve it? (cure)
• Will it be better on balance? (cost)
Summary
• Take half the math and tools and twice the listening
to what people actually need.
• This is the tip of the iceberg. In general, we have a
lot to learn from others.
• Let’s talk! @mshron

Max Shron, Thinking with Data at the NYC Data Science Meetup
