Building a Data-Driven World™
Open Data Science Conference
A Hybrid Approach to Data Science Project Management
Elaine Lee
elee@civisanalytics.com
@elaineklee
Open Data Science Conference #ODSC
A Common Problem With Many Faces
Organizations want to be data-driven, but many obstacles stand in their way:
• Communication not trickling up to executives and key decision makers
• Silos between departments, making it difficult to share and collaborate on analysis
• Data ingestion (ETL, or Extract-Transform-Load) is difficult and time-consuming
• Lack of meaningful yet customizable visual reporting
• Inability to flexibly scale technological needs up or down at a reasonable cost
• Inadequate or overwhelming learning resources about data science
Mapping the Uninsured in America
Where should Enroll America direct its insurance signup efforts?
Civis Analytics | Proprietary and Confidential
We ran the first individualized presidential campaign
As a company, Civis traces its origins to the 2012 Obama for America analytics team.
We built a scientific understanding of each voter.
Our data science influenced every strategy and tactic: voter targeting, messaging, media buys, and fundraising.
This meant the campaign could allocate resources where impact would be greatest.
Today, we leverage data science to help our clients in politics, non-profits, and the corporate world.
Introducing Civis
An easy-to-use, end-to-end, incredibly extendable data science platform in the cloud for teams who want to make great data-driven decisions to drive their organizations forward.
The Civis Approach
Product, Consulting, R&D
Applied Data Science
• Tackles the toughest data science problems we can find
Data Science R&D
• Generalizes and automates the solution for many scenarios
Software Engineering
• Integrates solutions into user-empowering software
• Highly collaborative departments
• All departments contribute to both our services arm and product development
The Civis Approach
Our unique team structure allows us to solve your biggest problems with custom solutions and the technology to scale them.
Data Science R&D
Strategies and philosophies
• Teams based on Civis’s product and consulting needs:
• “Built around code”
• Semi-annual, day-long departmental off-sites to plan upcoming R&D initiatives
• Academia-influenced: evidence-based approaches to finding and reporting the best solutions
• Software development-influenced: standups, code review
• Favorite tools:
R&D teams (from the slide's diagram): Modeling Methodology, Unstructured Data, Engineering
Data Science R&D
Tools
• Share and discuss data science news
• Receive feedback from colleagues using our tools
• Discuss implementation
• Lower communication costs compared to email
Data Science R&D
Tools
• Prototype new workflows
• Used like a log book to record and present results
• Share preliminary results with members of other departments
Data Science R&D
Tools
• Department heads set milestones, check progress, and make project staffing decisions
• Collaboratively plan development on new functionality or organizational processes (e.g. recruiting)
R&D <-> ADS
Tools
Strategies
• Designate a “tag team” on R&D as the default R&D resource for client engagements
• This is the Modeling Methodology team
• Other R&D teams’ members may be staffed on engagements depending on the expertise required
• An R&D team member always serves as the Consulted party in the RACI model
• Transparency about challenges is paramount
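The RACI rule from the R&D <-> ADS strategies (an R&D team member is always the Consulted party) can be written down as a simple staffing check. This is a minimal sketch, not Civis's tooling: the `Assignment` record, the `has_rd_consultant` helper, and the example roster are all hypothetical, and who is Responsible or Accountable is an assumption made only for illustration. RACI stands for Responsible, Accountable, Consulted, Informed.

```python
from dataclasses import dataclass
from enum import Enum


class Raci(Enum):
    RESPONSIBLE = "R"
    ACCOUNTABLE = "A"
    CONSULTED = "C"
    INFORMED = "I"


@dataclass(frozen=True)
class Assignment:
    person: str       # hypothetical staff label, not a real name
    department: str   # e.g. "R&D" or "ADS"
    role: Raci


def has_rd_consultant(assignments):
    """Return True if at least one R&D member holds the Consulted role."""
    return any(
        a.department == "R&D" and a.role is Raci.CONSULTED
        for a in assignments
    )


# Hypothetical roster for one client engagement (roles are assumptions).
engagement = [
    Assignment("Applied data scientist", "ADS", Raci.RESPONSIBLE),
    Assignment("ADS manager", "ADS", Raci.ACCOUNTABLE),
    Assignment("Modeling Methodology member", "R&D", Raci.CONSULTED),
    Assignment("Client contact", "Client", Raci.INFORMED),
]

print(has_rd_consultant(engagement))  # prints True
```

A check like this could run whenever an engagement is staffed, so a roster without an R&D consultant is flagged before work starts.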
R&D <-> ADS: A Case Study
Mapping the Uninsured in America
1. Assemble a project team of R&D data scientists and Applied Data Scientists.
2. Work with Enroll America to refine requirements and come up with a plan of analysis, ultimately resulting in the design and execution of a phone survey on a sample of individuals, followed by building a predictive model for the rest of the country.
3. The Applied Data Science Manager has weekly calls with Enroll America and status meetings with the project team.
4. The project team delivers the predictions and analysis to Enroll America.
The project team completes a postmortem and determines that the following activity could be automated: model building.
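The survey-then-model step in item 2 of the case study could be sketched as follows. The slides do not say which model or features Civis used, so everything here is an assumption for illustration: a one-feature logistic regression fit by gradient descent, a synthetic "income" feature scaled to [0, 1], and a made-up rule for the survey responses. The point is only the shape of the workflow: collect labels for a sample, fit a model, then score everyone who was not surveyed.

```python
import math


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit bias and slope by batch gradient descent on the mean log loss."""
    bias, slope = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_b = grad_s = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(bias + slope * x) - y
            grad_b += err
            grad_s += err * x
        bias -= lr * grad_b / n
        slope -= lr * grad_s / n
    return bias, slope


# Hypothetical population: one scaled feature per person (e.g. income in [0, 1]).
population = [i / 99 for i in range(100)]

# Hypothetical survey: every tenth person is phoned and reports insurance status.
# Synthetic rule for illustration only: low-income respondents are uninsured (1).
surveyed = population[::10]
responses = [1 if x < 0.5 else 0 for x in surveyed]

bias, slope = fit_logistic(surveyed, responses)

# Score everyone who was not surveyed.
unsurveyed = [x for i, x in enumerate(population) if i % 10 != 0]
scores = [sigmoid(bias + slope * x) for x in unsurveyed]

print(f"lowest-income score:  {scores[0]:.2f}")
print(f"highest-income score: {scores[-1]:.2f}")
```

In a real engagement the survey sample would need to be representative and the model validated on held-out responses; this sketch skips both for brevity.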
R&D <-> Tech
Tools
Strategies
• Designate teams at the interface to triage issues and plan new development:
• R&D: “Engineering” team
• Tech: “Modeling” team
• Use module- or project-specific chatrooms to get answers to ad hoc questions quickly
• Identify opportunities to form cross-functional teams, e.g.:
• Developing apps using the Platform’s API
• Knowledge sharing on best practices
R&D <-> Tech: A Case Study
Mapping the Uninsured in America
1. After the postmortem for the Enroll America engagement, R&D begins prototyping automated modeling functionality and discussing its implementation with the Tech department.
2. R&D’s Engineering team finishes the prototype and works with Tech’s Modeling team to integrate it as a new feature in the Platform.
3. During integration, ad hoc discussions occur on GitHub and HipChat to address usability questions, e.g. resource usage and input/output specifications.
The integration team successfully builds and integrates the Build Model module in the Platform.
Conclusion
Our approach to data science consulting and product development is enriched by the valuable perspectives of our employees, who come from a wide array of backgrounds, making our project management strategies a hybrid of more conventional techniques.

A Hybrid Approach to Data Science Project Management

  • 1. Building a Data-Driven WorldTM Open Data Science Conference A Hybrid Approach to Data Science Project Management Elaine Lee elee@civisanalytics.com @elaineklee
  • 2. 2Open Data Science Conference#ODSC Organizations want to be data-driven but many obstacles stand in their way: • Communication not trickling up to executives and key decision makers • Silos between departments, making it difficult to share and collaborate on analysis • Data ingestion (ETL or Extract-Transform-Load) is difficult and time-consuming • Lack of meaningful, yet customizable visual reporting • Inability to flexibly scale up or down technological needs at a reasonable cost • Inadequate or overwhelming learning resources about data science A Common Problem With Many Faces
  • 3. 3Open Data Science Conference#ODSC Where should Enroll America direct its insurance signup efforts? Mapping the Uninsured in America
  • 4. We ran the first individualized presidential campaign. As a company, Civis traces its origins to the 2012 Obama for America analytics team. We built a scientific understanding of each voter. Our data science influenced every strategy and tactic: voter targeting, messaging, media buys, and fundraising. This meant the campaign could allocate resources where impact would be greatest. Civis Analytics | Proprietary and Confidential
  • 5. Today, we leverage data science to help our clients in politics, non-profits, and the corporate world.
  • 6. Introducing Civis. An easy-to-use, end-to-end, incredibly extendable data science platform in the cloud for teams who want to make great data-driven decisions to drive their organizations forward.
  • 7. The Civis Approach (Consulting, R&D, Product). Applied Data Science: • Tackles the toughest data science problems we can find. Data Science R&D: • Generalizes and automates the solution for many scenarios. Software Engineering: • Integrates solutions into user-empowering software. • Highly collaborative departments • All departments contribute to both our services arm and product development
  • 8. The Civis Approach. Our unique team structure allows us to solve your biggest problems with custom solutions and the technology to scale them.
  • 9. Data Science R&D: Strategies and philosophies • Teams based on Civis’s product and consulting needs: Modeling Methodology, Unstructured Data, Engineering • “Built around code” • Semi-annual departmental day-long off-sites to plan upcoming R&D initiatives • Academia-influenced: evidence-based approaches to finding and reporting best solutions • Software development-influenced: standups, code review • Favorite tools: Hipchat, GitHub, Jupyter, Google Drive, Asana
  • 10. Data Science R&D: Tools (Hipchat and GitHub) • Share and discuss data science news • Receive feedback from colleagues using our tools • Discuss implementation • Lower communication costs compared to email
  • 11. Data Science R&D: Tools (Jupyter and Google Drive) • Prototype new workflows • Used like a log book to record and present results • Share preliminary results with members of other departments
  • 12. Data Science R&D: Tools (Google Drive and Asana) • Department heads set milestones, check progress, and make project staffing decisions • Collaboratively plan development on new functionality or organizational processes (e.g. recruiting)
  • 13. R&D <-> ADS: Tools and Strategies • Designate “tag team” on R&D as default R&D resources for client engagements • This is the Modeling Methodology team • Other R&D teams’ members may be staffed on engagements depending on expertise required • R&D team member always serves as the Consulted in the RACI model • Transparency about challenges is paramount
  • 14. R&D <-> ADS: A Case Study (Mapping the Uninsured in America). 1. Assemble a project team of R&D data scientists and Applied Data Scientists. 2. Work with Enroll America to refine requirements and come up with a plan of analysis, ultimately resulting in the design and execution of a phone survey on a sample of individuals, followed by building a predictive model for the rest of the country. 3. The Applied Data Science Manager has weekly calls with Enroll America and status meetings with the project team. 4. The project team delivers the predictions and analysis to Enroll America. The project team completes a postmortem and determines these activities could be automated: model building
  • 15. R&D <-> Tech: Tools and Strategies • Designate teams at the interface to triage issues and plan new development: • R&D: “Engineering” team • Tech: “Modeling” team • Use module or project-specific chatrooms to get answers to ad-hoc questions quickly • Identify opportunities to form cross-functional teams, e.g.: • Developing apps using the Platform’s API • Knowledge sharing on best practices
  • 16. R&D <-> Tech: A Case Study (Mapping the Uninsured in America). 1. After the postmortem for the Enroll America engagement, R&D begins prototyping automated modeling functionality and discussing its implementation with the Tech department. 2. R&D’s Engineering team finishes the prototype and works with Tech’s Modeling team to integrate it as a new feature in the Platform. 3. During integration, ad hoc discussions occur on GitHub and Hipchat to address usability questions, e.g. resource usage and input/output specifications. The integration team successfully builds and integrates the Build Model module in the Platform.
  • 17. Conclusion. Our approach to data science consulting and product development is enriched by valuable perspectives of our employees, who come from a wide array of backgrounds, making our project management strategies a hybrid of more conventional techniques.

Editor's Notes

  1. Hi everyone, it’s great to be here. My name is Elaine Lee. I am a Data Scientist in the R&D department at Civis Analytics. Civis is a Chicago-based data science consulting and software startup, and I’m excited to tell you a little bit about our company and the work that we do. In particular, I’ll be talking about how the R&D department juggles concurrent development of both our consulting services and our cloud-based data science platform. I’ll be emphasizing approaches borrowed from other, more established industries as they pertain to department projects as well as interdepartmental collaborations.
  2. Many of you are already familiar with data science and the potential it has to change the way things are done. However, data science has a high barrier to entry for some teams, from both a technical and an organizational standpoint. It can be difficult to wrap your head around the technical needs and quantitative concepts that go into data science. In addition, it can be hard to assemble the right team to do data science and to keep the work organized. Picture a team of data scientists working on the same project. Some of them have written R or Python scripts to process the data, do feature engineering, and build models on it. Some of them have taken the results of the models and produced charts and visualizations in Excel, Tableau, or D3. All the work is being kept in a few different places: Dropbox, Google Drive, GitHub, MySQL, … It is difficult for this hypothetical team to figure out what exactly has been done, and even worse, what efforts have been duplicated. It is also incredibly difficult to validate the analysis. Does this sound familiar to anyone? Many of us at Civis Analytics have faced these challenges in our previous work, but fortunately we’ve made those challenges a thing of the past! It didn’t happen overnight, but we were constantly coming up with new ideas to improve the data science workflow by, well, working on a variety of consulting projects and researching new methods. Today I will talk about what some of these ideas are. In addition, I will tell the story of how one client engagement gave us a valuable exercise in collaboration and in data science best practices that we’ve since internalized.
  3. Throughout my talk today, I will be using our project with Enroll America to illustrate a lot of my concepts. Enroll America was one of our first clients in 2013. They wanted our help identifying Americans without health insurance so they knew where to direct their outreach. This was a challenging problem because of its large scope (they wanted to do outreach throughout the country!) and because it wasn’t obvious what’s predictive of being uninsured. Why did Enroll America specifically seek us out to solve this problem?
  4. Let’s talk a little about what expertise Civis has for tackling problems like Enroll America’s. The founding members of Civis Analytics were part of Obama For America’s analytics team in his 2012 re-election campaign. There, we developed the beginnings of a framework for doing person-level analytics (which is highly relevant for Enroll America). With scientific levels of rigor, we built models to understand all sorts of relevant vote-related behaviors in order to better identify and persuade supporters, which translated to optimizing how the campaign’s resources were used. The campaign spanned many months, and during that time lots of models were being built and refined; their results were constantly being sent to those in the field to act upon. Developing an organized and repeatable workflow was crucial for minimizing costs, time spent (especially since the staff was small), and inadvertent human error, particularly with models being built at such a large scale.
  5. After the campaign ended in 2012, we re-examined the strategies we employed and the problems they solved. We realized that if we generalized them, we could solve similar problems for clients in the political, non-profit, and corporate worlds. Which is exactly what Civis did. What you see here is a sample of clients, in addition to Enroll America, that we have helped to better target their advertising dollars, identify potential customers for greener sources of electricity, and determine public awareness of and sentiment toward their brand or cause. In the past year, we took it a step further and formed a partnership with Discovery Communications to inform more sophisticated audience targeting approaches, ratings forecasting, and marketing spend. We anticipate making more partnerships like this in the future. The examples I gave are all problems with a similar flavor to what Civis successfully solved in 2012: identifying and reaching the people you care about most.
  6. Our diverse client portfolio, innovative approaches, and proven track record have made Civis Analytics’ consulting services highly sought after in the predictive analytics space. However, we’re equally passionate about removing obstacles to doing data science. Our steady client pipeline enables us to formalize our approach in the form of a cloud-based data science application. Our software, Civis, or “the Platform”, supports the entire workflow of a typical data science project, from data warehousing to data processing to predictive modeling to reporting. This enables organizations to easily take control of their own data and unlock their insights.
  7. This is how we turn our client work experiences into software. We select novel problems brought forth by our clients and work with them to deliver a solution. This is primarily addressed by our Applied Data Science department. At the same time, we conduct research and experiment with different methods to solve the problem, with one eye towards determining how to generalize the solution. This is primarily done by the Data Science R&D department. Finally, solutions are integrated into our software platform by the Software Engineering, or Tech, department. Users of our software platform, both clients and our own Applied Data Scientists, provide us valuable feedback, which is continuously incorporated. This unique, synergistic cycle enables us to deliver high quality results to our customers.
  8. In our day-to-day work, all departments pitch in on both lines of business, ensuring fluency in all the company’s offerings and thus better decision making. We also collaborate across departments on all projects, big or small. Today I will be focusing on how my department, the Data Science R&D department, manages its workload and how it works with the Applied Data Science and Tech departments.
  9. The R&D department is the only department that is intimately aligned with both lines of business. We’re split into 3 different teams. Modeling Methodology focuses on developing new modeling workflows. Unstructured Data specializes in data that can’t neatly be summarized by a flat file, like text data. Engineering is responsible for managing our production codebases of new features for our software product. Our department is “built around code”: “We're trying to build up knowledge and best practices, and being built around code lowers our communication costs, errors, redundancy, and facilitates us making software.” To roadmap what we build, based on what we’ve learned from recent client engagements, we have day-long semi-annual department off-sites. When developing new methodologies, we use an academia-influenced approach: empirical and thorough, so that our recommended solution covers all the edge cases. When building out workflows, we follow guidelines common to most software development projects, including some ideas from the Agile methodology: we have daily standups to make sure everyone’s on the same page about the status of the codebases, and we do code reviews before any changes are shipped. Our standups are on a per-repository basis, so they don’t waste anyone’s time. To do our work, these are our favorite tools. Let’s take a look at how we use them.
  10. Hipchat and GitHub form the backbone of our communications. For those not familiar with these tools, Hipchat is an instant messaging tool for organizations. GitHub is a web interface, built on top of the version control system Git, for teams to collaborate on a codebase. These tools are crucial to our philosophy of being built around code. They enable members across the company to participate by asking questions and generally weighing in. Department members use them to discuss implementation. These tools are much faster than email because they make it easier to ask questions and get answers: anyone who knows the answer can see the request and respond.
  11. When developing new methods, we like to use Jupyter and Google Drive. We use Jupyter for its IPython Notebook capabilities. It allows us to run Python code, especially modules from our codebase, interactively, so we can chain components together to make new workflows. Jupyter also has presentation functionality, so we also use it as a log book to record and present results in internal meetings. Sometimes we also use Google Drive to record and share results with members of other departments, such as Applied Data Scientists, who have a vested interest in the project but don’t require all the details.
  12. Finally, to take the pulse of the R&D department as a whole, department heads use Google Drive and Asana for big picture planning. Asana is a project management tool which gives department heads a bird’s-eye view of what each team member is working on and how each project is progressing. Google Drive tools are used to collaborate on planning documents, be it planning new functionality to build or revising organizational processes, such as rewriting our hiring exam.
  13. That was how we, the R&D department, work together. How do we work with the Applied Data Scientists, the data scientists in our consulting arm? To make project staffing seamless, we designate a tag team to serve as the first point of contact for client engagements. This is the Modeling Methodology team. However, other R&D data scientists may be staffed on a project depending on expertise required. The R&D data scientist always serves as the Consulted in the RACI model. The RACI model is a popular project management model used in consulting. It emphasizes explicit roles for each team member to ensure accountability. R is for Responsible, a role held by the Applied Data Scientists. A is for Accountable; this is the Applied Data Science Manager or project manager. C is for Consulted. And I is for Informed (the client). Lastly, we are open with Applied Data Scientists about R&D challenges in order to avoid schedule slips on the client engagement. The project plan is often tracked in Trello, a popular bulletin board app, with bulletin boards for each milestone’s requirements.
  14. Let’s revisit our client story, Mapping the Uninsured in America, to illustrate concretely how we work together. After Enroll America shared their problem with us, we assembled a project team of R&D data scientists and Applied Data Scientists to solve it. We worked with Enroll to refine the problem statement into a set of requirements, ultimately resulting in the design and execution of a phone survey on a sample of individuals, followed by building a model to capture the rest of the country. The project got under way. Throughout the project, the Applied Data Science Manager had weekly status calls with Enroll and with the project team to make sure we were on schedule. Occasionally we staffed a couple of extra data scientists on the project to make sure we delivered results on time when there was risk of a schedule slip. For example, we brought in an extra data scientist towards the end of the project to help produce graphs and visualizations of the results. Finally, we finished our analysis and presented our predictions to Enroll America. Afterwards, we did a postmortem and realized that automated model building would’ve made us more efficient. This is because we conducted our experiment in waves and built similar models as the results came in, with the only difference being the input data. Also, the analysts were each working on individual components of the analysis, writing their own R scripts which had a lot of overlap (such as the data processing steps), which meant a lot of time was wasted.
  15. So that’s how we work with the Applied Data Scientists on consulting projects. How do we work with the Tech department? Much like how we work with the Applied Data Science department, we’ve designated a team to interface with the Tech department, and they have as well. That would be the Engineering team on our side and the Modeling team on their side. The Engineering team in Data Science R&D are data scientists who speak software development, and the Modeling team in the Tech department are software engineers who speak data science. Most of our communications are done using module or project-specific chatrooms and GitHub issue tickets, which get answers quickly. To promote really inspired product development, we identify opportunities to form cross-functional teams, such as using the Platform’s API to develop new apps and teaching each other best practices for software development via brownbag sessions.
  16. Let’s revisit the Enroll America project for an example of how the R&D data scientists work with the software engineers. After the postmortem for the Enroll engagement, we began prototyping automated modeling functionality, communicating to the Tech department the motivation for it and including them in discussions about implementation and feasibility. Once we finished the prototype, ensuring that it passed all the tests and code review, the Engineering team in R&D worked with the Modeling team in Tech to integrate it as a new feature in the Platform. We used GitHub and Hipchat to discuss questions that came up, such as resource usage, input/output specifications, and data visualizations we wanted to provide to the end user. Together, the R&D department and the Tech department successfully built and integrated the Build Model module that exists today in the Platform.
  17. In summary, a lot of our approaches have a common theme, which is minimizing communication costs within the R&D department and with other departments. This is evidenced by our embrace of some free or open-source tools for collaboration and our general belief in transparency about challenges. We also emphasize collaborative opportunities between departments to strengthen our cohesiveness as a team, be it working on client engagements together or learning best practices in a seminar format. A lot of our ideas come from the valuable perspectives of our employees, who come from a wide array of backgrounds. Thus, our project management strategies are a hybrid of techniques seen in more established industries such as software engineering, consulting, and academia. I hope the tips presented in my talk today have made doing data science more manageable for your team. Thank you for your time.
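
As a rough illustration of the RACI assignment described in note 13, the roles on a client engagement could be written down as a small data structure and checked for gaps. This is only a sketch: the names `Raci`, `Assignment`, and `check_roles` are hypothetical and not part of any Civis tooling, and the rule that exactly one party is Accountable is a common RACI convention assumed here, not something stated in the talk.

```python
from dataclasses import dataclass
from enum import Enum


class Raci(Enum):
    RESPONSIBLE = "R"
    ACCOUNTABLE = "A"
    CONSULTED = "C"
    INFORMED = "I"


@dataclass(frozen=True)
class Assignment:
    party: str
    role: Raci


def check_roles(assignments):
    """Return a list of problems; an empty list means every RACI role is
    filled and at most one party is Accountable (assumed convention)."""
    problems = []
    for role in Raci:
        holders = [a.party for a in assignments if a.role is role]
        if not holders:
            problems.append(f"no party is {role.name.title()}")
        elif role is Raci.ACCOUNTABLE and len(holders) > 1:
            problems.append("more than one party is Accountable")
    return problems


# Role assignment as described in the speaker notes for a client engagement.
engagement = [
    Assignment("Applied Data Scientists", Raci.RESPONSIBLE),
    Assignment("Applied Data Science Manager", Raci.ACCOUNTABLE),
    Assignment("R&D data scientist", Raci.CONSULTED),
    Assignment("Client", Raci.INFORMED),
]

print(check_roles(engagement))  # prints []
```

Keeping the assignment explicit like this makes a missing role visible before the project starts, which is the accountability point the RACI model is meant to serve.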