User-centric design for data scientists
Annie Darmofal and Katie Malone, Civis Analytics
Quick poll: who’s in the room today?
Hi, I’m Katie. I’m a big nerd, and I do data science.
This is me at my previous job, as a physics graduate student working at a particle collider.
It was fun, and I got good at science, but there was a lot I had to learn at my first data science job.
When physicists play designer
Introducing Rocky
A scrappy data science project from my early days
The next few slides are literal training slides from my internal roadshow, introducing Rocky to its users.
I’m not proud of what I’m about to show you.
Please be gentle.
Civis Analytics | Proprietary and Confidential
Inspiration and Use Cases
“Can we put Civis Research together with modeling? Like, build and score models as part of the standard Civis Research workflow”
“It would be dope if, when the omnibus comes in each week, we could just automatically build models of all the questions”
“I want to build a model for every variable in GFK”
Step 1: Build DVSets
Names of your dvsets will be printed to the logs
*note: you may also see a “credential_id” parameter. This should be kept at its default value, 5263
Backend: Setting up a dvset
kiwi tables are basically our config files
kiwi.tables
- table id
- primary key
kiwi.depvars
- depvar id
- column name
- table id
kiwi.dvsets
- model type
- dvset name
- depvar id
insert into kiwi.tables (table_id, primary_key)
values (2541992, 'voterbase_id');
-- makes an API call to get the names of all the columns
insert into kiwi.depvars (column_name, table_id)
values ('romance', 2541992),
       ('comedy', 2541992),
       ('horror', 2541992);
-- returns a list of auto-incremented depvar ids
insert into kiwi.dvsets (depvar_id, dvset_name, model_type)
values (dv_id_1, 'movies_dvset_GBT', 'gradient boosting classifier'),
       (dv_id_2, 'movies_dvset_GBT', 'gradient boosting classifier'),
       (dv_id_3, 'movies_dvset_GBT', 'gradient boosting classifier');
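The config-table setup on this slide can be sketched end to end in a few lines. This is only an illustration, not Rocky's actual code: SQLite stands in for the real database, and the table names, column names, and the `register_dvset` helper are assumptions based on the field lists above.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real warehouse,
# and the table/column names are guesses based on the slide's field lists.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE kiwi_tables  (table_id INTEGER PRIMARY KEY, primary_key TEXT);
    CREATE TABLE kiwi_depvars (depvar_id INTEGER PRIMARY KEY AUTOINCREMENT,
                               column_name TEXT, table_id INTEGER);
    CREATE TABLE kiwi_dvsets  (depvar_id INTEGER, dvset_name TEXT, model_type TEXT);
""")


def register_dvset(conn, table_id, primary_key, columns, dvset_name, model_type):
    """Register a source table, its dependent variables, and one dvset (hypothetical helper)."""
    conn.execute("INSERT INTO kiwi_tables VALUES (?, ?)", (table_id, primary_key))
    depvar_ids = []
    for column in columns:
        cur = conn.execute(
            "INSERT INTO kiwi_depvars (column_name, table_id) VALUES (?, ?)",
            (column, table_id),
        )
        depvar_ids.append(cur.lastrowid)  # the auto-incremented depvar id
    # One dvset row per dependent variable, all sharing a name and model type.
    conn.executemany(
        "INSERT INTO kiwi_dvsets VALUES (?, ?, ?)",
        [(dv_id, dvset_name, model_type) for dv_id in depvar_ids],
    )
    return depvar_ids


depvar_ids = register_dvset(
    conn, 2541992, "voterbase_id",
    ["romance", "comedy", "horror"],
    "movies_dvset_GBT", "gradient boosting classifier",
)
```

Each dependent variable gets its own row in the dvset table, which is why one dvset name appears three times in the insert above.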
Step 2: Run a DVSet
- dvset name here
- training table here
- make sure dvset table and training table are on the same cluster!
- put in your username (for finding your S3 credential)
*note: you may also see a “credential_id” parameter. This should be kept at its default value, 5263
Backend: running the dvset
kiwi tables are basically our config files
kiwi.tables
- table id
- primary key
kiwi.depvars
- depvar id
- column name
- table id
kiwi.dvsets
- model type
- dvset name
- depvar id
-- kiwi.dvsets → dependent variables → depvar table
-- auto-generated SQL:
create view public.rocky_train as
select depvar_table.comedy, depvar_table.romance, depvar_table.horror,
       basefile.*
from ts.modeling_commercial basefile
join depvar_table
  on basefile.voterbase_id = depvar_table.voterbase_id;

# export the view, then train one model per dependent variable
dvs = ("comedy", "romance", "horror")
file_id = export_redshift_to_S3("public.rocky_train")
for dv in dvs:
    mp = civis_model.ModelPipeline(
        depvar=dv,
        workflow="gradient boosting classifier",
        excluded_cols=[other for other in dvs if other != dv])  # all other dvs
    mp.train(file_id=file_id)
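The two generated pieces on this slide, the training view and the per-model settings, can be sketched as plain functions. This is a minimal illustration under stated assumptions: `build_training_view_sql` and `training_plan` are hypothetical names, and no real modeling library is called.

```python
def build_training_view_sql(depvars, depvar_table, basefile, key, view):
    """Return a CREATE VIEW statement joining the dependent variables to the basefile."""
    dv_cols = ", ".join(f"dv.{name}" for name in depvars)
    return (
        f"create view {view} as "
        f"select {dv_cols}, basefile.* "
        f"from {basefile} basefile "
        f"join {depvar_table} dv on basefile.{key} = dv.{key}"
    )


def training_plan(depvars, workflow):
    """One entry per model: its target, its workflow, and the other targets to exclude."""
    return [
        {
            "depvar": dv,
            "workflow": workflow,
            "excluded_cols": [other for other in depvars if other != dv],
        }
        for dv in depvars
    ]


depvars = ["comedy", "romance", "horror"]
sql = build_training_view_sql(
    depvars, "depvar_table", "ts.modeling_commercial",
    "voterbase_id", "public.rocky_train",
)
plan = training_plan(depvars, "gradient boosting classifier")
```

Excluding the other dependent variables presumably keeps one target from being used as a feature when modeling another, since all targets live in the same training view.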
What’s the right way to parallelize model-building: “map” step
[Diagram: tables keyed on voterbase_id, with columns freq_theatergoer, genre_comedy, genre_scifi, genre_romantic, and "the usual basefile stuff"]
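One possible answer to the slide's question is to treat each dependent variable as an independent task and map a training function over them. The sketch below is an assumption, not what Rocky actually did: `train_one` is a hypothetical stand-in that returns a label instead of training anything, and the column names are taken from the diagram labels.

```python
from concurrent.futures import ThreadPoolExecutor


def train_one(depvar):
    """Hypothetical stand-in for training one model; returns a label instead."""
    return f"model for {depvar}"


def train_all(depvars, max_workers=3):
    """Map step: run one independent training task per dependent variable."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as its input.
        return dict(zip(depvars, pool.map(train_one, depvars)))


models = train_all(["genre_comedy", "genre_scifi", "genre_romantic"])
```

Threads suit tasks that mostly wait on a remote service; CPU-bound local training would more likely call for processes or separate jobs.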
Step 5: Take a look at your models
Did you get all that?
Of course not. Those slides were terrible.
Even worse, nobody used it.
Hi, I’m Annie. I’m a designer. I ask a lot of questions.
User-Centered Design
Empathize → Define → Ideate → Prototype → Test
Principles
1. The design is based upon an explicit understanding of users, tasks and environments.
2. Users are involved throughout design and development.
3. The design addresses the whole user experience.
4. The design is driven and refined by user-centered evaluation.
5. The process is iterative.
1 / 5 Understand your users, tasks and environments
2 / 5 Keep users involved throughout the process
3 / 5 Consider the whole user experience
4 / 5 Evaluate
- Visibility of system status
- Match between system and the real world
- Consistency and standards
- Error prevention
- Aesthetic and minimalist design
- Help users recognize, diagnose, and recover from errors
5 / 5 Iterate
Empathize → Define → Ideate → Prototype → Test
Rocky II
The rematch
Principle 1: The design is based upon an explicit understanding of users, tasks, and environments.
Our user: a data scientist who is creating models for a business user to use
Their task: the business user wants to cut lists of people based on modeled predictions
Can we build a tool to help with the list-cutting?
- models are already built
- grouped by topic
- easy-to-use
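To make "list-cutting" concrete: given scores from models that are already built and grouped by topic, cutting a list means picking the people who clear a score threshold or who rank in the top N. The sketch below is illustrative only; the `cut_list` helper, the IDs, and the scores are all made up.

```python
def cut_list(scores, threshold=None, top_n=None):
    """Return person IDs ranked by score, filtered by threshold and/or capped at top_n."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if threshold is not None:
        ranked = [(pid, score) for pid, score in ranked if score >= threshold]
    if top_n is not None:
        ranked = ranked[:top_n]
    return [pid for pid, _ in ranked]


# Hypothetical scores from pre-built models, grouped by topic.
scores_by_topic = {
    "movies": {
        "comedy": {"a1": 0.91, "b2": 0.40, "c3": 0.75},
        "horror": {"a1": 0.10, "b2": 0.88, "c3": 0.52},
    }
}

comedy_fans = cut_list(scores_by_topic["movies"]["comedy"], threshold=0.7)
top_horror = cut_list(scores_by_topic["movies"]["horror"], top_n=1)
```

The business user only chooses a topic, a model, and a cutoff; nothing about model-building is exposed.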
For our data scientists, prioritize models over model-creation tools.
[Figure: before and after]
Insight #2: Build models, not model-creation tools.
[Figure: before and after]
In conclusion...
If you’re a data scientist, you should care about people using the things you build, and you will build things that people use if you’re user-centered in your mindset.
If you’re a business user, bring problems, not solutions, and have a little empathy in the other direction: be engaged, be patient, give thoughtful feedback, have fun.
You own the outcome together!
Thanks!
We would be delighted to take your questions
2019 WIA - User-centric Design for Data Scientists
