2. Patterns I came across
“We are on the path to transforming ourselves into a tech company; we must explore some data science PoCs.”
“We need to make better use of our data; there must be a good case for data science.”
“We can automate many tasks using machine learning; let’s do a PoC.”
“Let’s build a cool machine learning model and take it to the business.”
6. The Proof of Concept (PoC) Mode
● Data science projects often start as PoCs
● Works great to mitigate the hype
● Low cost
7. Limited Value with PoC
● Not always focused on business value
● Suffers from ad hoc prioritization
● Expectation mismatch
● Often an unclear roadmap or vision
8. From PoC to MVP
● Business-first approach
● Empower the data team with a product mindset
● Focus on reusability (code, infrastructure)
● Poly-skilled team
● Avoids standalone tabletop solutions
● Iterate with a vision
● Enables building a platform to support multiple solutions
10. DSLC: The Data Science Life Cycle
● Data science projects are not requirement driven (well, for the most part)
● Data science projects are not always test driven (it is still very important to write tests)
● Data science projects are always hypothesis driven
[Diagram: Data Exploration, Feature Analysis, Feature Engineering, Build/Evaluate Model; Domain Knowledge]
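To make "hypothesis driven" concrete, here is a minimal Python sketch that is not taken from the slides: the hypothesis and its success threshold are written down before the experiment, then checked against a baseline. The `Hypothesis` class, the `visits` feature, the toy rows, and the 0.10 threshold are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, int]  # hypothetical record: feature values plus a "label" key


@dataclass
class Hypothesis:
    """A falsifiable claim: the candidate beats the baseline by min_lift."""
    statement: str
    min_lift: float  # required absolute accuracy improvement


def accuracy(predict: Callable[[Row], int], rows: List[Row]) -> float:
    """Fraction of rows whose prediction equals the label."""
    return sum(predict(r) == r["label"] for r in rows) / len(rows)


def evaluate(hypothesis: Hypothesis,
             baseline: Callable[[Row], int],
             candidate: Callable[[Row], int],
             rows: List[Row]) -> Dict[str, object]:
    """Compare candidate and baseline, and report whether the claim holds."""
    base = accuracy(baseline, rows)
    cand = accuracy(candidate, rows)
    lift = cand - base
    return {
        "baseline": base,
        "candidate": cand,
        "lift": lift,
        "supported": lift >= hypothesis.min_lift,
    }


# Made-up toy data; a real experiment would score on held-out data.
VISITS = [1, 2, 3, 2, 8, 9, 7, 6]
LABELS = [0, 0, 0, 0, 1, 1, 1, 0]
ROWS = [{"visits": v, "label": y} for v, y in zip(VISITS, LABELS)]

hypothesis = Hypothesis(
    statement="A rule on 'visits' beats always predicting the majority class",
    min_lift=0.10,
)
result = evaluate(
    hypothesis,
    baseline=lambda row: 0,                          # majority class in ROWS
    candidate=lambda row: int(row["visits"] >= 5),   # assumed threshold rule
    rows=ROWS,
)
print(hypothesis.statement, "->", result)
```

Fixing `min_lift` up front is the point of the sketch: the experiment can fail, which is what separates a hypothesis from a demo.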
16. Agile DPLC in action
Getting data, data quality checks, begin to build the pipeline
Exploration, experimentation, and model building
Now that we have data and certain transformations, it is time to start on the underlying computation framework
Feature engineering, experimentation, and model building and evaluation
Build CD, model management, integrate
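The "data quality checks" step at the start of the pipeline could look something like the sketch below. It is only an illustration using the Python standard library; the function name `check_quality`, the column names, and the valid range for `age` are assumptions, not part of the original deck.

```python
from typing import Any, Dict, List, Optional, Tuple

Record = Dict[str, Optional[Any]]


def check_quality(rows: List[Record],
                  required: List[str],
                  ranges: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return a list of problems found; an empty list means the batch passed."""
    problems: List[str] = []
    for i, row in enumerate(rows):
        # Completeness: every required column must be present and non-null.
        for col in required:
            if row.get(col) is None:
                problems.append(f"row {i}: missing {col}")
        # Validity: numeric columns must fall inside their allowed range.
        for col, (lo, hi) in ranges.items():
            value = row.get(col)
            if value is not None and not (lo <= value <= hi):
                problems.append(f"row {i}: {col}={value} outside [{lo}, {hi}]")
    return problems


# Hypothetical batch with one missing value and one implausible value.
batch = [
    {"age": 34, "income": 52000},
    {"age": None, "income": 48000},
    {"age": 212, "income": 61000},
]
problems = check_quality(batch, required=["age", "income"],
                         ranges={"age": (0, 120)})
for p in problems:
    print(p)
```

Returning the problems as data, rather than raising on the first one, lets the pipeline decide whether to stop, quarantine the batch, or just log.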
17. Accelerate with Platform
Product Thinking + Platform Approach
● A machine learning platform allows rapid experimentation
● Allows feature sharing between teams
● Model management and versioning
● Faster path to production
● A collaborative and shareable infrastructure
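As a rough illustration of "model management and versioning", the sketch below keeps an in-memory registry where each registered model gets an incrementing version and one version per model can be promoted to production. `ModelRegistry`, `ModelVersion`, the `churn` model name, and the metric values are hypothetical; a real platform would persist artifacts and metadata in shared storage.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class ModelVersion:
    name: str
    version: int
    metrics: Dict[str, float]
    artifact: Any  # stand-in for serialized weights or a storage path


class ModelRegistry:
    """In-memory registry: versions are append-only, promotion is a pointer."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[ModelVersion]] = {}
        self._production: Dict[str, int] = {}

    def register(self, name: str, artifact: Any,
                 metrics: Dict[str, float]) -> ModelVersion:
        """Store a new version; version numbers start at 1 per model name."""
        versions = self._versions.setdefault(name, [])
        entry = ModelVersion(name, len(versions) + 1, dict(metrics), artifact)
        versions.append(entry)
        return entry

    def get(self, name: str, version: int) -> ModelVersion:
        for entry in self._versions.get(name, []):
            if entry.version == version:
                return entry
        raise KeyError(f"{name} v{version} is not registered")

    def promote(self, name: str, version: int) -> None:
        """Mark a registered version as the one serving production."""
        self.get(name, version)  # raises KeyError for unknown versions
        self._production[name] = version

    def production(self, name: str) -> ModelVersion:
        return self.get(name, self._production[name])


registry = ModelRegistry()
v1 = registry.register("churn", artifact="weights-a", metrics={"auc": 0.71})
v2 = registry.register("churn", artifact="weights-b", metrics={"auc": 0.76})
registry.promote("churn", v2.version)
print(registry.production("churn"))
```

Because promotion only moves a pointer, rolling back is a second `promote` call on an earlier version, which is one way a platform shortens the path to production.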
18. Pay attention to the underlying math
“I would rather have questions that can't be answered than answers that can't be questioned.” - Richard P. Feynman, physicist