3. We have to understand…
Data Scientist is a useless job title. It is a 'catch-all' term for many roles related to data.
Organizations are at different stages of building a data-driven culture and hence require different skill sets to move up the data value chain.
4. Data Science Process
Define problem and how to validate solution
Collect raw data. Clean data.
Explore and analyse
Build ideas and model
Validate model. Build and validate solution.
Create Insights. Visualize and explain.
Operationalize. Inspire Decision.
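As a rough illustration of the process steps above, the following minimal sketch walks through them on synthetic data. Everything in it is an assumption made for illustration: the data is generated, the "problem" is predicting y from x, the validation rule is "beat a mean baseline on held-out data", and the function names (`collect_raw_data`, `clean`, `fit_line`, `run_process`) are hypothetical.

```python
import random
from statistics import mean


def collect_raw_data(seed=0, n=200):
    """Collect: synthetic (x, y) records plus two deliberately broken rows."""
    rng = random.Random(seed)
    rows = [(x, 3.0 * x + 2.0 + rng.gauss(0, 1))
            for x in (rng.uniform(0, 10) for _ in range(n))]
    rows.append((None, 5.0))  # missing x, to be removed by clean()
    rows.append((4.0, None))  # missing y, to be removed by clean()
    return rows


def clean(rows):
    """Clean: drop records with missing values."""
    return [(x, y) for x, y in rows if x is not None and y is not None]


def fit_line(rows):
    """Model: ordinary least-squares fit of y = slope * x + intercept."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in rows)
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


def mean_abs_error(rows, predict):
    """Validation metric: mean absolute error of predict() on rows."""
    return mean(abs(y - predict(x)) for x, y in rows)


def run_process():
    # Define problem and validation: predict y from x; the model is only
    # worth operationalizing if it beats a naive baseline on held-out data.
    rows = clean(collect_raw_data())
    split = int(0.8 * len(rows))
    train, test = rows[:split], rows[split:]

    # Explore: the training mean doubles as the naive baseline prediction.
    baseline = mean(y for _, y in train)

    # Build and validate the model.
    slope, intercept = fit_line(train)
    model_mae = mean_abs_error(test, lambda x: slope * x + intercept)
    baseline_mae = mean_abs_error(test, lambda x: baseline)

    # Create insights / inspire decision: a summary a stakeholder can act on.
    return {
        "rows": len(rows),
        "slope": slope,
        "model_mae": model_mae,
        "baseline_mae": baseline_mae,
        "ship": model_mae < baseline_mae,
    }


if __name__ == "__main__":
    print(run_process())
```

The point of the sketch is the ordering: the validation rule is fixed before any model is built, so "Validate model" has something concrete to check against.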
5. Organization's data maturity
Ad-hoc data collection and analysis.
Controllable data collection.
Controllable and repeatable experiments.
Ability to make predictions through models.
Ability to compare models.
Automated insights. Faster data-driven decisions.
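One way to make the maturity stages above concrete is to treat them as an ordered scale. The sketch below is illustrative only: the enum names are shorthand invented here, and the rule that an organization sits at the highest stage for which every earlier stage is also in place is an assumption, not something the slide states.

```python
from enum import IntEnum


class Maturity(IntEnum):
    """The six stages, in the order they appear on the slide."""
    AD_HOC = 1                   # Ad-hoc data collection and analysis
    CONTROLLED_COLLECTION = 2    # Controllable data collection
    REPEATABLE_EXPERIMENTS = 3   # Controllable and repeatable experiments
    PREDICTIVE_MODELS = 4        # Ability to make predictions through models
    MODEL_COMPARISON = 5         # Ability to compare models
    AUTOMATED_INSIGHTS = 6       # Automated insights, faster decisions


def assess(capabilities):
    """Return the highest stage reached without skipping an earlier one.

    `capabilities` is a set of Maturity members the organization has in
    place. AD_HOC is the floor (assumed: everyone starts there).
    """
    stage = Maturity.AD_HOC
    for candidate in list(Maturity)[1:]:
        if candidate not in capabilities:
            break  # a missing stage caps the assessment
        stage = candidate
    return stage


if __name__ == "__main__":
    # Has models but no repeatable experiments: still assessed at stage 2.
    org = {Maturity.CONTROLLED_COLLECTION, Maturity.PREDICTIVE_MODELS}
    print(assess(org).name)
```

Under this assumption, an organization that builds models without repeatable experiments is still assessed at the data collection stage, which is the kind of gap a role description should target.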
9. Conclusion
1) Understand what phase the organization is in and what problem it is trying to solve.
2) Create the right role description.
3) Look for candidates with a combination of skills. Prefer Pi (π-shaped) profiles.
4) Dig deeper into candidates' skills beyond keywords.
10. Org State and role mapping
Stages: Collect, Store, Explore, Operationalize, Automate, Predict/Understand, Optimize
Insights to Actions

Analytics types:
Descriptive Analytics [What happened?]
Predictive Analytics [What can happen?]
Prescriptive Analytics [What should we do when it happens?]

Capabilities and tools:
Ad-hoc Analysis
Big Data (Management of 4 Vs)
Data Architecture and Management: Flume, Hadoop, DW, S3
Data Pipeline / DW: Java, Python, Spark, Apache Beam, GCP Dataflow
Data Analysis: R, Python Visualization, D3.js, Hive, BigQuery
Operation Dashboard, Real-time Analytics
Controlled Repeatable Experiments, Fast Insights
Machine Learning: scikit-learn, TensorFlow, H2O
Decision workflows, Business Process Integrations
Researchers, Deep Learning, AI Business integration, Automated Insight Discovery
Data Driven AI Organization
AI driven Actions, Cutting Edge Innovation

Role groups:
Data Engineer, Cloud Big Data Engineer, Distributed Systems Engineer, DevOps, Big Data Administrator
Data Analyst, Database Developer, NoSQL Developers, Data Admin, Big Data Developers, Data Visualization, Data Hackers, Data Insights Engineer, Newbie ML Engineer, Statistician
Machine Learning Engineer, Data Scientist, Analytics Manager, Applied AI Engineer, NLP Engineer, Optimization Expert, Statistician, Quantitative Modelling Expert
Chief Data Officer, Data Risk Analyst, Expert Analytics, Business AI Integration Expert, Data Governance Expert, Consultants, AI Researcher, PhDs, Highly Experienced Expert Engineers
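
The three analytics types named on this slide (descriptive, predictive, prescriptive) can be told apart with a toy example. The sketch below is only an illustration under stated assumptions: the weekly order counts are made up, the forecast is a plain least-squares linear trend, and the capacity threshold and function names are hypothetical.

```python
from statistics import mean

# Made-up weekly order counts, for illustration only.
weekly_orders = [100, 110, 120, 130, 140]


def describe(history):
    """Descriptive: what happened? Summarize the past."""
    return {"mean": mean(history), "last": history[-1]}


def predict_next(history):
    """Predictive: what can happen? Extrapolate a linear trend one step."""
    n = len(history)
    xs = list(range(n))
    mx, my = mean(xs), mean(history)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, history))
             / sum((x - mx) ** 2 for x in xs))
    return my + slope * (n - mx)


def prescribe(forecast, capacity):
    """Prescriptive: what should we do when it happens? A decision rule."""
    return "add capacity" if forecast > capacity else "hold"


if __name__ == "__main__":
    forecast = predict_next(weekly_orders)
    print(describe(weekly_orders), forecast, prescribe(forecast, capacity=140))
```

Each function depends on the previous one's kind of output: a summary needs only stored data, a forecast needs a model, and a recommendation needs both a forecast and a business rule to compare it against.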