The path to be a Data Scientist
Poo Kuan Hoong, Ph.D
Senior Manager Data Science,
Nielsen Malaysia
Disclaimer: The views and opinions expressed in these slides are those of
the author and do not necessarily reflect the official policy or position
of Nielsen Malaysia. The analyses shown in these slides are examples
only. They should not be used in real-world analytic products, as they
are based on very limited and dated open-source information.
Assumptions made within the analyses do not reflect the position of
Nielsen Malaysia.
Agenda
• What is a data scientist?
• What kinds of companies employ data scientists?
• What are the key functions of a data scientist?
• What type of work does a data scientist do?
• General aptitude to be a data scientist
• What skillsets are needed to be a data scientist?
• What is data science?
• Where do I begin?
• MDEC National Big App Challenge 3.0 Knowledge Sharing
Self Introduction
Poo Kuan Hoong, http://www.linkedin.com/in/kuanhoong
• Senior Manager Data Science
• Senior Lecturer
• Chairperson Data Science Institute
• Coursera Facilitator
• Consultant
• Funding mentor
• Founder
• Speaker/Trainer
https://www.meetup.com/MY-RUserGroup/
https://www.facebook.com/rusergroupmalaysia/
What is a Data Scientist?
Data Scientist
The term "data scientist" has been around for years, and the various
advanced analytics specialties that fall under it are even older.
With the recent explosion of data, however, the term has come to
describe a convergence of these disciplines, which has driven its
soaring popularity.
What are the job titles?
• Data Scientist
• Data Engineer
• Big Data Engineer
• Machine Learning Scientist
• Business Analytics Specialist
• Data Visualization Developer
• BI Solutions Architect/ BI Specialist
• Operations Research Analyst
• Analytics Manager
• Machine Learning Engineer
• Statistician
• Business Intelligence (BI) Engineer
Why the Global Need?
• Abundance of data
• Availability of affordable compute resources
• Internet of Things (IoT) sensor data
950 Data Analyst (India)
8,411 Data Scientist (US)
808 Data Analyst (UK)
1,188 Data Manager (US)
81 Data Analyst (Australia)
80 in April 2015 1,500 by 2020
The Star, Friday, 24 April 2015
“Malaysia needs 1,500 data scientists by 2020”
What kinds of companies employ data scientists?
MNC
Government
BANKS
What are the key functions of a data scientist?
Key functions of a data scientist
• Devising business strategies from the insights
• Descriptive and predictive analytics
• Data mining and analysis design
• Understanding the business problem
Scenario 1: Customer Churn Analytics
Churn analytics
• Predicting who will switch mobile operator
Customer churn: why do customers change operators?
• The top 3 reasons why subscribers change providers:
• They want a new handset
• They believe they pay too much for calls/data
• Providers do not offer additional loyalty benefits
Machine Learning Framework
• Initialization step: data collection, data preprocessing, attribute selection (Attribute 1, Attribute 2, Attribute 3)
• Learn step: algorithm, training model
• Apply step: apply data/test data, score model, predicting output
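The framework above can be sketched end to end in a few lines of Python. This is only an illustration under stated assumptions, not the analysis behind these slides: the customer data is synthetic, the attribute names (handset age, monthly bill, a noise column) and the 0.3 correlation cut-off are hypothetical, and a simple nearest-centroid classifier stands in for whichever algorithm is actually used.

```python
import random
import statistics

# Hypothetical toy data: every name and number here is an illustrative assumption.
rng = random.Random(0)

def make_customer(churned):
    # Churners are simulated with older handsets and higher bills.
    handset_age = rng.gauss(30 if churned else 10, 3)
    monthly_bill = rng.gauss(90 if churned else 60, 10)
    noise = rng.gauss(0, 1)  # attribute unrelated to churn
    return [handset_age, monthly_bill, noise], int(churned)

# Initialization step 1: data collection (here: simulation) and train/test split
data = [make_customer(i % 2 == 0) for i in range(200)]
rng.shuffle(data)
train, test = data[:140], data[140:]

def column(rows, j):
    return [features[j] for features, _ in rows]

# Initialization step 2: preprocessing (standardise with training statistics only)
n_features = 3
means = [statistics.fmean(column(train, j)) for j in range(n_features)]
stdevs = [statistics.pstdev(column(train, j)) for j in range(n_features)]

def scale(features):
    return [(x - m) / s for x, m, s in zip(features, means, stdevs)]

# Initialization step 3: attribute selection by correlation with the label
def pearson(xs, ys):
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

labels = [label for _, label in train]
selected = [j for j in range(n_features)
            if abs(pearson(column(train, j), labels)) > 0.3]

# Learn step: train a nearest-centroid model on the selected attributes
def centroid(rows):
    scaled = [scale(f) for f, _ in rows]
    return [statistics.fmean(r[j] for r in scaled) for j in selected]

centroids = {c: centroid([r for r in train if r[1] == c]) for c in (0, 1)}

# Apply step: score the held-out test data and predict the output
def predict(features):
    x = scale(features)
    def dist(c):
        return sum((x[j] - v) ** 2 for j, v in zip(selected, centroids[c]))
    return min(centroids, key=dist)

predictions = [predict(f) for f, _ in test]
accuracy = sum(p == y for p, (_, y) in zip(predictions, test)) / len(test)
print(f"selected attributes: {selected}, test accuracy: {accuracy:.2f}")
```

Because the two simulated groups are well separated, the accuracy here says nothing about real churn data; the point is only the order of the steps.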
Correlation Matrix
Feature selection
Models comparison
• A receiver operating characteristic (ROC) curve illustrates the
performance of a binary classifier system as its discrimination
threshold is varied.
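As a rough sketch of how such a curve is traced, the snippet below sweeps the threshold from the highest score to the lowest and records the false positive rate and true positive rate at each step, then estimates the area under the curve (AUC) with the trapezoidal rule. The labels and scores are made-up values, and the code assumes all scores are distinct (ties are not handled).

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs as the threshold sweeps from high to low."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), reverse=True)
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, label in ranked:
        if label == 1:
            tp += 1  # lowering the threshold admits one more true positive
        else:
            fp += 1  # lowering the threshold admits one more false positive
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical classifier output: 1 = churned, 0 = stayed
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
print(points)
print(f"AUC = {auc(points):.3f}")  # 8/9, about 0.889
```

When comparing models, a curve closer to the top-left corner (AUC nearer 1) indicates better separation of the two classes, while the diagonal (AUC 0.5) corresponds to random guessing.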
Scenario 2: Market Basket Analysis
Market Basket Analysis
Where should detergents be placed in the
store to maximize sales?
Are bleach products purchased when
detergents and orange juice are bought
together?
Is cola typically purchased with bananas?
Does the brand of cola make a difference?
How are the demographics of the
neighbourhood affecting what customers
are buying?
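Questions like these are commonly answered with association-rule measures: support, confidence and lift. The sketch below computes them for one rule over a handful of invented baskets; the transactions are hypothetical and far too few to draw conclusions from.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def rule_metrics(antecedent, consequent, transactions):
    """Support, confidence and lift for the rule antecedent -> consequent."""
    both = support(antecedent | consequent, transactions)
    confidence = both / support(antecedent, transactions)
    lift = confidence / support(consequent, transactions)
    return both, confidence, lift

# Hypothetical baskets, one set of items per transaction
transactions = [
    {"detergent", "bleach", "orange juice"},
    {"detergent", "orange juice"},
    {"cola", "bananas"},
    {"detergent", "bleach"},
    {"cola", "orange juice"},
]

# Are bleach products purchased when detergents and orange juice are bought together?
sup, conf, lift = rule_metrics({"detergent", "orange juice"}, {"bleach"}, transactions)
print(f"support={sup:.2f} confidence={conf:.2f} lift={lift:.2f}")
# support=0.20 confidence=0.50 lift=1.25
```

A lift above 1 suggests the items appear together more often than they would if purchased independently; a lift near 1 suggests no association.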
What type of work does a data
scientist do?
http://www.forbes.com/sites/gilpress/2016/03/23/data-preparation-most-time-consuming-least-enjoyable-data-science-task-survey-says/#f37c7f758459
General aptitude to be a data scientist
Data Scientist
• Common sense
• Curious mind
• Clear and simple thinking
• Love of solving puzzles
• Good listening, writing and communication skills
• Maths & stats
• Business sense
I have 4 red, 18 black and 8 brown socks in my sock drawer. If it is
completely dark and I cannot see the colour of the socks that I am
picking, how many socks do I need to take from the drawer to be sure
that I have at least one pair of socks that are the same colour?
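One way to reason about the sock puzzle is the pigeonhole principle: in the worst case you draw one sock of every colour before any colour repeats, so one more sock guarantees a pair. The small helper below (a hypothetical function written for this illustration) encodes that worst case.

```python
def socks_needed(counts, pair_size=2):
    """Smallest number of draws that guarantees pair_size socks of one colour."""
    # Worst case: draw pair_size - 1 socks of each colour (or all of a colour
    # if fewer exist) without completing a set, then one more sock completes it.
    worst_case_without_pair = sum(min(n, pair_size - 1) for n in counts.values())
    return worst_case_without_pair + 1

drawer = {"red": 4, "black": 18, "brown": 8}
print(socks_needed(drawer))  # 4
```

The number of socks of each colour does not matter here, only the number of colours: three colours means four socks are enough.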
What is the hidden number under the car?
What skillsets are needed to be a data scientist?
Data scientist skillsets
• Data Mining
• Machine Learning
• R/Python
• Data Analysis
• Statistics
• SQL
• Java
• Algorithms
Image Source: http://imgur.com/hoyFT4t
What is the average salary?
Average salary: Data Scientist
What is data science?
Data Science
• Data science is an evolutionary step in interdisciplinary fields like
business analysis that incorporate computer science, modeling, statistics,
analytics, and mathematics.
• At its core, data science involves using automated methods to analyze
massive amounts of data and to extract knowledge from them.
• Drawing insight from a piece of data involves understanding how it fits
into the larger picture of an organization.
Where do I begin?
Massive Open Online Course (MOOC)
• MSC Malaysia MyProCert (SRI) – Data Science Massive Open Online
Courses (MOOC)
• The Center of Applied Data Science (MDEC & HRDF)
• Johns Hopkins University – Data Science Specialization
• University of Washington - Data Science at Scale Specialization
• Data Analyst Nanodegree - Udacity
• CSCI E-109 Data Science (Harvard Extension School)
• Machine Learning - Stanford University
BDA Undergraduate & Postgraduate
Programme
Undergraduate
• Multimedia University – Bachelor of Computer Science (Data Science
Specialization)
• Sunway University - BSc (Hons) Information Systems (Business
Analytics)
• Universiti Teknologi Malaysia (UTM), International Islamic University
Malaysia, Monash University, Universiti Teknologi MARA (UiTM) and
Universiti Teknologi PETRONAS (UTP).
Postgraduate
• Big Data Analytics Post Graduate Programme
Kaggle
• Real data sets and real problems, in unprocessed form.
• Going through past competitions is recommended.
• Read through the forums for particular competitions to find
discussions and tips/hints that will be useful for solving
future problems.
• https://www.kaggle.com/
UC Irvine Machine Learning Repository
• 360 data sets as a service to the machine learning community
http://archive.ics.uci.edu/ml/
Open data
• Open data from various countries
• Malaysia - http://www.data.gov.my/
• Singapore - https://data.gov.sg/
MDEC National Big App Challenge 3.0
• June 4th – June 5th 2016, Berjaya Times Square
• The themes for AHKL2016 were as follows:
1. Big Data Analytics, powered by MDEC. Access to 65 million
rows of real datasets sponsored by iProperty.com Malaysia
2. O2O Commerce, powered by MOLWallet MOLPay
3. Smart Living, powered by TIME Internet
National MDEC Big App Challenge 3.0
PropertySenze
• B2B business model
• Provide machine learning and AI
services to customers
• Visual Search
• Personalized customer experience
BUSINESS MODEL
Big Data becomes Smart Data
1. PropertySenze contracts with property sites and property developers to generate analytics and visual search
2. PropertySenze's machine learning algorithm lets users search for and buy properties similar to those they see on the sites, in user-generated photos and in user-uploaded images
3. Enhanced search experience and personalized results for users
4. Improved platform that recognizes properties for retrieval purposes or instant purchases
5. Analytics at the fingertips for both buyers and sellers
6. Improved user experience that leads to more engagement and sale transactions
7. PropertySenze verifies all transactions and charges commission fees every month
PropertySenze
Hackathon: Tips
• Have a well-rounded team with not more than one
server-side developer with relevant experience,
one good designer and one amazing storyteller
• Understand the expected outcomes of the
hackathon
• Develop something whose benefits everyone
can see
• Have an impressive aim or objective
• Start promoting your product during the
hackathon
• Get the demo 100% right. The pitch is where the
product should shine
Thanks!
Questions?
@kuanhoong
https://www.linkedin.com/in/kuanhoong
kuanhoong@gmail.com

Weitere ähnliche Inhalte

Was ist angesagt?

Barga Data Science lecture 2
Barga Data Science lecture 2Barga Data Science lecture 2
Barga Data Science lecture 2Roger Barga
 
CRISP-DM - Agile Approach To Data Mining Projects
CRISP-DM - Agile Approach To Data Mining ProjectsCRISP-DM - Agile Approach To Data Mining Projects
CRISP-DM - Agile Approach To Data Mining ProjectsMichał Łopuszyński
 
Data Science Tutorial | Introduction To Data Science | Data Science Training ...
Data Science Tutorial | Introduction To Data Science | Data Science Training ...Data Science Tutorial | Introduction To Data Science | Data Science Training ...
Data Science Tutorial | Introduction To Data Science | Data Science Training ...Edureka!
 
Barga Galvanize Sept 2015
Barga Galvanize Sept 2015Barga Galvanize Sept 2015
Barga Galvanize Sept 2015Roger Barga
 
Machine Learning 101
Machine Learning 101Machine Learning 101
Machine Learning 101Setu Chokshi
 
Barga Data Science lecture 1
Barga Data Science lecture 1Barga Data Science lecture 1
Barga Data Science lecture 1Roger Barga
 
Introduction to machine learning
Introduction to machine learningIntroduction to machine learning
Introduction to machine learningPruet Boonma
 
Introduction to Machine learning
Introduction to Machine learningIntroduction to Machine learning
Introduction to Machine learningKnoldus Inc.
 
How to Identify, Train or Become a Data Scientist
How to Identify, Train or Become a Data ScientistHow to Identify, Train or Become a Data Scientist
How to Identify, Train or Become a Data ScientistInside Analysis
 
Intro to machine learning
Intro to machine learningIntro to machine learning
Intro to machine learningTamir Taha
 
GTU GeekDay Data Science and Applications
GTU GeekDay Data Science and ApplicationsGTU GeekDay Data Science and Applications
GTU GeekDay Data Science and ApplicationsKürşat İNCE
 
Barga DIDC'14 Invited Talk
Barga DIDC'14 Invited TalkBarga DIDC'14 Invited Talk
Barga DIDC'14 Invited TalkRoger Barga
 
Barga ACM DEBS 2013 Keynote
Barga ACM DEBS 2013 KeynoteBarga ACM DEBS 2013 Keynote
Barga ACM DEBS 2013 KeynoteRoger Barga
 
Azure Machine Learning
Azure Machine LearningAzure Machine Learning
Azure Machine LearningMostafa
 
Machine Learning Engineer Salary, Roles And Responsibilities, Skills and Resu...
Machine Learning Engineer Salary, Roles And Responsibilities, Skills and Resu...Machine Learning Engineer Salary, Roles And Responsibilities, Skills and Resu...
Machine Learning Engineer Salary, Roles And Responsibilities, Skills and Resu...Simplilearn
 
Data Science Training | Data Science For Beginners | Data Science With Python...
Data Science Training | Data Science For Beginners | Data Science With Python...Data Science Training | Data Science For Beginners | Data Science With Python...
Data Science Training | Data Science For Beginners | Data Science With Python...Simplilearn
 
Data Driven Engineering 2014
Data Driven Engineering 2014Data Driven Engineering 2014
Data Driven Engineering 2014Roger Barga
 
How to become a data scientist
How to become a data scientist How to become a data scientist
How to become a data scientist Manjunath Sindagi
 

Was ist angesagt? (20)

Barga Data Science lecture 2
Barga Data Science lecture 2Barga Data Science lecture 2
Barga Data Science lecture 2
 
CRISP-DM - Agile Approach To Data Mining Projects
CRISP-DM - Agile Approach To Data Mining ProjectsCRISP-DM - Agile Approach To Data Mining Projects
CRISP-DM - Agile Approach To Data Mining Projects
 
Managing machine learning
Managing machine learningManaging machine learning
Managing machine learning
 
Machine Learning Algorithms (Part 1)
Machine Learning Algorithms (Part 1)Machine Learning Algorithms (Part 1)
Machine Learning Algorithms (Part 1)
 
Data Science Tutorial | Introduction To Data Science | Data Science Training ...
Data Science Tutorial | Introduction To Data Science | Data Science Training ...Data Science Tutorial | Introduction To Data Science | Data Science Training ...
Data Science Tutorial | Introduction To Data Science | Data Science Training ...
 
Barga Galvanize Sept 2015
Barga Galvanize Sept 2015Barga Galvanize Sept 2015
Barga Galvanize Sept 2015
 
Machine Learning 101
Machine Learning 101Machine Learning 101
Machine Learning 101
 
Barga Data Science lecture 1
Barga Data Science lecture 1Barga Data Science lecture 1
Barga Data Science lecture 1
 
Introduction to machine learning
Introduction to machine learningIntroduction to machine learning
Introduction to machine learning
 
Introduction to Machine learning
Introduction to Machine learningIntroduction to Machine learning
Introduction to Machine learning
 
How to Identify, Train or Become a Data Scientist
How to Identify, Train or Become a Data ScientistHow to Identify, Train or Become a Data Scientist
How to Identify, Train or Become a Data Scientist
 
Intro to machine learning
Intro to machine learningIntro to machine learning
Intro to machine learning
 
GTU GeekDay Data Science and Applications
GTU GeekDay Data Science and ApplicationsGTU GeekDay Data Science and Applications
GTU GeekDay Data Science and Applications
 
Barga DIDC'14 Invited Talk
Barga DIDC'14 Invited TalkBarga DIDC'14 Invited Talk
Barga DIDC'14 Invited Talk
 
Barga ACM DEBS 2013 Keynote
Barga ACM DEBS 2013 KeynoteBarga ACM DEBS 2013 Keynote
Barga ACM DEBS 2013 Keynote
 
Azure Machine Learning
Azure Machine LearningAzure Machine Learning
Azure Machine Learning
 
Machine Learning Engineer Salary, Roles And Responsibilities, Skills and Resu...
Machine Learning Engineer Salary, Roles And Responsibilities, Skills and Resu...Machine Learning Engineer Salary, Roles And Responsibilities, Skills and Resu...
Machine Learning Engineer Salary, Roles And Responsibilities, Skills and Resu...
 
Data Science Training | Data Science For Beginners | Data Science With Python...
Data Science Training | Data Science For Beginners | Data Science With Python...Data Science Training | Data Science For Beginners | Data Science With Python...
Data Science Training | Data Science For Beginners | Data Science With Python...
 
Data Driven Engineering 2014
Data Driven Engineering 2014Data Driven Engineering 2014
Data Driven Engineering 2014
 
How to become a data scientist
How to become a data scientist How to become a data scientist
How to become a data scientist
 

Andere mochten auch

Be a Data Scientist in 8 steps!
Be a Data Scientist in 8 steps! Be a Data Scientist in 8 steps!
Be a Data Scientist in 8 steps! PromptCloud
 
Customer Churn Analytics using Microsoft R Open
Customer Churn Analytics using Microsoft R OpenCustomer Churn Analytics using Microsoft R Open
Customer Churn Analytics using Microsoft R OpenPoo Kuan Hoong
 
How to become a data scientist in 6 months
How to become a data scientist in 6 monthsHow to become a data scientist in 6 months
How to become a data scientist in 6 monthsTetiana Ivanova
 
Big Data [sorry] & Data Science: What Does a Data Scientist Do?
Big Data [sorry] & Data Science: What Does a Data Scientist Do?Big Data [sorry] & Data Science: What Does a Data Scientist Do?
Big Data [sorry] & Data Science: What Does a Data Scientist Do?Data Science London
 
Customer Analytics & Segmentation
Customer Analytics & SegmentationCustomer Analytics & Segmentation
Customer Analytics & SegmentationGeorge Krasadakis
 
Data Scientist: The Sexiest Job in the 21st Century
Data Scientist: The Sexiest Job in the 21st CenturyData Scientist: The Sexiest Job in the 21st Century
Data Scientist: The Sexiest Job in the 21st CenturyLyn Fenex
 
A Comparison of People Counting Techniques via Video Scene Analysis
A Comparison of People Counting Techniques viaVideo Scene AnalysisA Comparison of People Counting Techniques viaVideo Scene Analysis
A Comparison of People Counting Techniques via Video Scene AnalysisPoo Kuan Hoong
 
Learn Data Science
Learn Data ScienceLearn Data Science
Learn Data ScienceRyan
 
What is a Data Scientist
What is a Data Scientist What is a Data Scientist
What is a Data Scientist Experian_US
 
Basic data analysis using R.
Basic data analysis using R.Basic data analysis using R.
Basic data analysis using R.C. Tobin Magle
 
Jigsaw Mortgage Dex Data Analysis Competition Winner Presentation - Parinds...
 Jigsaw Mortgage Dex Data Analysis Competition Winner Presentation  - Parinds... Jigsaw Mortgage Dex Data Analysis Competition Winner Presentation  - Parinds...
Jigsaw Mortgage Dex Data Analysis Competition Winner Presentation - Parinds...Jigsaw Academy
 
Machine Learning and Deep Learning with R
Machine Learning and Deep Learning with RMachine Learning and Deep Learning with R
Machine Learning and Deep Learning with RPoo Kuan Hoong
 
Class ppt overview of analytics
Class ppt overview of analyticsClass ppt overview of analytics
Class ppt overview of analyticsJigsawAcademy2014
 
Standardizing +113 million Merchant Names in Financial Services with Greenplu...
Standardizing +113 million Merchant Names in Financial Services with Greenplu...Standardizing +113 million Merchant Names in Financial Services with Greenplu...
Standardizing +113 million Merchant Names in Financial Services with Greenplu...Data Science London
 
Building a scalable data science platform with R
Building a scalable data science platform with RBuilding a scalable data science platform with R
Building a scalable data science platform with RRevolution Analytics
 
Predictive analytics in action real-world examples and advice
Predictive analytics in action real-world examples and advicePredictive analytics in action real-world examples and advice
Predictive analytics in action real-world examples and adviceThe Marketing Distillery
 
Univariate, bivariate analysis, hypothesis testing, chi square
Univariate, bivariate analysis, hypothesis testing, chi squareUnivariate, bivariate analysis, hypothesis testing, chi square
Univariate, bivariate analysis, hypothesis testing, chi squarekongara
 

Andere mochten auch (20)

Be a Data Scientist in 8 steps!
Be a Data Scientist in 8 steps! Be a Data Scientist in 8 steps!
Be a Data Scientist in 8 steps!
 
Customer Churn Analytics using Microsoft R Open
Customer Churn Analytics using Microsoft R OpenCustomer Churn Analytics using Microsoft R Open
Customer Churn Analytics using Microsoft R Open
 
How to become a data scientist in 6 months
How to become a data scientist in 6 monthsHow to become a data scientist in 6 months
How to become a data scientist in 6 months
 
Big Data [sorry] & Data Science: What Does a Data Scientist Do?
Big Data [sorry] & Data Science: What Does a Data Scientist Do?Big Data [sorry] & Data Science: What Does a Data Scientist Do?
Big Data [sorry] & Data Science: What Does a Data Scientist Do?
 
Customer Analytics & Segmentation
Customer Analytics & SegmentationCustomer Analytics & Segmentation
Customer Analytics & Segmentation
 
Data Scientist: The Sexiest Job in the 21st Century
Data Scientist: The Sexiest Job in the 21st CenturyData Scientist: The Sexiest Job in the 21st Century
Data Scientist: The Sexiest Job in the 21st Century
 
A Comparison of People Counting Techniques via Video Scene Analysis
A Comparison of People Counting Techniques viaVideo Scene AnalysisA Comparison of People Counting Techniques viaVideo Scene Analysis
A Comparison of People Counting Techniques via Video Scene Analysis
 
Learn Data Science
Learn Data ScienceLearn Data Science
Learn Data Science
 
Class ppt intro to-sas
Class ppt   intro to-sasClass ppt   intro to-sas
Class ppt intro to-sas
 
What is a Data Scientist
What is a Data Scientist What is a Data Scientist
What is a Data Scientist
 
Basic data analysis using R.
Basic data analysis using R.Basic data analysis using R.
Basic data analysis using R.
 
Jigsaw Mortgage Dex Data Analysis Competition Winner Presentation - Parinds...
 Jigsaw Mortgage Dex Data Analysis Competition Winner Presentation  - Parinds... Jigsaw Mortgage Dex Data Analysis Competition Winner Presentation  - Parinds...
Jigsaw Mortgage Dex Data Analysis Competition Winner Presentation - Parinds...
 
Facebook data analysis using r
Facebook data analysis using rFacebook data analysis using r
Facebook data analysis using r
 
Machine Learning and Deep Learning with R
Machine Learning and Deep Learning with RMachine Learning and Deep Learning with R
Machine Learning and Deep Learning with R
 
Data Science Thailand Meetup#11
Data Science Thailand Meetup#11Data Science Thailand Meetup#11
Data Science Thailand Meetup#11
 
Class ppt overview of analytics
Class ppt overview of analyticsClass ppt overview of analytics
Class ppt overview of analytics
 
Standardizing +113 million Merchant Names in Financial Services with Greenplu...
Standardizing +113 million Merchant Names in Financial Services with Greenplu...Standardizing +113 million Merchant Names in Financial Services with Greenplu...
Standardizing +113 million Merchant Names in Financial Services with Greenplu...
 
Building a scalable data science platform with R
Building a scalable data science platform with RBuilding a scalable data science platform with R
Building a scalable data science platform with R
 
Predictive analytics in action real-world examples and advice
Predictive analytics in action real-world examples and advicePredictive analytics in action real-world examples and advice
Predictive analytics in action real-world examples and advice
 
Univariate, bivariate analysis, hypothesis testing, chi square
Univariate, bivariate analysis, hypothesis testing, chi squareUnivariate, bivariate analysis, hypothesis testing, chi square
Univariate, bivariate analysis, hypothesis testing, chi square
 

Ähnlich wie The path to be a data scientist

Data Analytics Course In Surat.pdf
Data Analytics Course In Surat.pdfData Analytics Course In Surat.pdf
Data Analytics Course In Surat.pdfSujata Gupta
 
Machine learning and big data
Machine learning and big dataMachine learning and big data
Machine learning and big dataPoo Kuan Hoong
 
Intro to Artificial Intelligence w/ Target's Director of PM
 Intro to Artificial Intelligence w/ Target's Director of PM Intro to Artificial Intelligence w/ Target's Director of PM
Intro to Artificial Intelligence w/ Target's Director of PMProduct School
 
Introductory of Information Technology
Introductory of Information TechnologyIntroductory of Information Technology
Introductory of Information Technologyturkiyeizmir2020
 
AIIA_DataAnalytics_Project_External_20160721
AIIA_DataAnalytics_Project_External_20160721AIIA_DataAnalytics_Project_External_20160721
AIIA_DataAnalytics_Project_External_20160721Graeme Wood
 
How to Build an AI/ML Product and Sell it by SalesChoice CPO
How to Build an AI/ML Product and Sell it by SalesChoice CPOHow to Build an AI/ML Product and Sell it by SalesChoice CPO
How to Build an AI/ML Product and Sell it by SalesChoice CPOProduct School
 
Machine Learning Adoption: Crossing the chasm for banking and insurance sector
Machine Learning Adoption: Crossing the chasm for banking and insurance sectorMachine Learning Adoption: Crossing the chasm for banking and insurance sector
Machine Learning Adoption: Crossing the chasm for banking and insurance sectorRudradeb Mitra
 
Executive Briefing: Why managing machines is harder than you think
Executive Briefing: Why managing machines is harder than you thinkExecutive Briefing: Why managing machines is harder than you think
Executive Briefing: Why managing machines is harder than you thinkPeter Skomoroch
 
Analytics in Action - Introduction
Analytics in Action - IntroductionAnalytics in Action - Introduction
Analytics in Action - IntroductionLee Schlenker
 
The New Self-Service Analytics - Going Beyond the Tools
The New Self-Service Analytics - Going Beyond the ToolsThe New Self-Service Analytics - Going Beyond the Tools
The New Self-Service Analytics - Going Beyond the ToolsKatherine Gabriel
 
Which institute is best for data science?
Which institute is best for data science?Which institute is best for data science?
Which institute is best for data science?DIGITALSAI1
 
Best Selenium certification course
Best Selenium certification courseBest Selenium certification course
Best Selenium certification courseKumarNaik21
 
Data science training in hyd ppt (1)
Data science training in hyd ppt (1)Data science training in hyd ppt (1)
Data science training in hyd ppt (1)SayyedYusufali
 

Ähnlich wie The path to be a data scientist (20)

Data Analytics Course In Surat.pdf
Data Analytics Course In Surat.pdfData Analytics Course In Surat.pdf
Data Analytics Course In Surat.pdf
 
Machine learning and big data
Machine learning and big dataMachine learning and big data
Machine learning and big data
 
Intro to Artificial Intelligence w/ Target's Director of PM
 Intro to Artificial Intelligence w/ Target's Director of PM Intro to Artificial Intelligence w/ Target's Director of PM
Intro to Artificial Intelligence w/ Target's Director of PM
 
Trends in data analytics
Trends in data analyticsTrends in data analytics
Trends in data analytics
 
Introductory of Information Technology
Introductory of Information TechnologyIntroductory of Information Technology
Introductory of Information Technology
 
AIIA_DataAnalytics_Project_External_20160721
AIIA_DataAnalytics_Project_External_20160721AIIA_DataAnalytics_Project_External_20160721
AIIA_DataAnalytics_Project_External_20160721
 
Get your data analytics strategy right!
Get your data analytics strategy right!Get your data analytics strategy right!
Get your data analytics strategy right!
 
Data Science and Analytics
Data Science and Analytics Data Science and Analytics
Data Science and Analytics
 
How to Build an AI/ML Product and Sell it by SalesChoice CPO
How to Build an AI/ML Product and Sell it by SalesChoice CPOHow to Build an AI/ML Product and Sell it by SalesChoice CPO
How to Build an AI/ML Product and Sell it by SalesChoice CPO
 
Machine Learning Adoption: Crossing the chasm for banking and insurance sector
Machine Learning Adoption: Crossing the chasm for banking and insurance sectorMachine Learning Adoption: Crossing the chasm for banking and insurance sector
Machine Learning Adoption: Crossing the chasm for banking and insurance sector
 
Executive Briefing: Why managing machines is harder than you think
Executive Briefing: Why managing machines is harder than you thinkExecutive Briefing: Why managing machines is harder than you think
Executive Briefing: Why managing machines is harder than you think
 
Analytics in Action - Introduction
Analytics in Action - IntroductionAnalytics in Action - Introduction
Analytics in Action - Introduction
 
The New Self-Service Analytics - Going Beyond the Tools
The New Self-Service Analytics - Going Beyond the ToolsThe New Self-Service Analytics - Going Beyond the Tools
The New Self-Service Analytics - Going Beyond the Tools
 
lec1.pdf
lec1.pdflec1.pdf
lec1.pdf
 
CS-IS 027
CS-IS 027CS-IS 027
CS-IS 027
 
SAS Institute: Big data and smarter analytics
SAS Institute: Big data and smarter analyticsSAS Institute: Big data and smarter analytics
SAS Institute: Big data and smarter analytics
 
Data Analytics: From Basic Skills to Executive Decision-Making
Data Analytics: From Basic Skills to Executive Decision-MakingData Analytics: From Basic Skills to Executive Decision-Making
Data Analytics: From Basic Skills to Executive Decision-Making
 
Which institute is best for data science?
Which institute is best for data science?Which institute is best for data science?
Which institute is best for data science?
 
Best Selenium certification course
Best Selenium certification courseBest Selenium certification course
Best Selenium certification course
 
Data science training in hyd ppt (1)
Data science training in hyd ppt (1)Data science training in hyd ppt (1)
Data science training in hyd ppt (1)
 

Mehr von Poo Kuan Hoong

Build an efficient Machine Learning model with LightGBM
Build an efficient Machine Learning model with LightGBMBuild an efficient Machine Learning model with LightGBM
Build an efficient Machine Learning model with LightGBMPoo Kuan Hoong
 
Tensor flow 2.0 what's new
Tensor flow 2.0  what's newTensor flow 2.0  what's new
Tensor flow 2.0 what's newPoo Kuan Hoong
 
The future outlook and the path to be Data Scientist
The future outlook and the path to be Data ScientistThe future outlook and the path to be Data Scientist
The future outlook and the path to be Data ScientistPoo Kuan Hoong
 
Data Driven Organization and Data Commercialization
Data Driven Organization and Data CommercializationData Driven Organization and Data Commercialization
Data Driven Organization and Data CommercializationPoo Kuan Hoong
 
TensorFlow and Keras: An Overview
TensorFlow and Keras: An OverviewTensorFlow and Keras: An Overview
TensorFlow and Keras: An OverviewPoo Kuan Hoong
 
Explore and Have Fun with TensorFlow: Transfer Learning
Explore and Have Fun with TensorFlow: Transfer LearningExplore and Have Fun with TensorFlow: Transfer Learning
Explore and Have Fun with TensorFlow: Transfer LearningPoo Kuan Hoong
 
Explore and have fun with TensorFlow: An introductory to TensorFlow
Explore and have fun with TensorFlow: An introductory	to TensorFlowExplore and have fun with TensorFlow: An introductory	to TensorFlow
Explore and have fun with TensorFlow: An introductory to TensorFlowPoo Kuan Hoong
 
The path to be a Data Scientist
The path to be a Data ScientistThe path to be a Data Scientist
The path to be a Data ScientistPoo Kuan Hoong
 
Deep Learning with Microsoft R Open
The path to be a data scientist

  • 1. The path to be a Data Scientist Poo Kuan Hoong, Ph.D Senior Manager Data Science, Nielsen Malaysia
  • 2. Disclaimer: The views and opinions expressed in these slides are those of the author and do not necessarily reflect the official policy or position of Nielsen Malaysia. Examples of analysis performed within these slides are only examples. They should not be utilized in real-world analytic products as they are based only on very limited and dated open source information. Assumptions made within the analysis are not reflective of the position of Nielsen Malaysia.
  • 3. Agenda • What is a data scientist? • What kinds of companies employ data scientists? • What are the key functions of a data scientist? • What type of work does a data scientist do? • General aptitude to be a data scientist • What skillsets are needed to be a data scientist? • What is data science? • Where do I begin? • MDEC National Big App Challenge 3.0 Knowledge Sharing
  • 4. Self Introduction Poo Kuan Hoong, http://www.linkedin.com/in/kuanhoong • Senior Manager Data Science • Senior Lecturer • Chairperson Data Science Institute • Coursera Facilitator • Consultant • Funding mentor • Founder • Speaker/Trainer
  • 7. What is a Data Scientist?
  • 9. Data Scientist The term "data scientist" has been around for years, and the various advanced analytics specialties that fall under it are even older. However, with the recent explosion of data, the term has come to describe a convergence of disciplines, which has driven its soaring popularity.
  • 10. What are the job titles? • Data Scientist • Data Engineer • Big Data Engineer • Machine Learning Scientist • Business Analytics Specialist • Data Visualization Developer • BI Solutions Architect/ BI Specialist • Operations Research Analyst • Analytics Manager • Machine Learning Engineer • Statistician • Business Intelligence (BI) Engineer
  • 11. Why the Global Need? • Abundance of data • Availability of affordable compute resources • Internet of Things (IoT) sensor data
  • 12. • 950 Data Analyst (India) • 8,411 Data Scientist (US) • 808 Data Analyst (UK) • 1,188 Data Manager (US) • 81 Data Analyst (Australia)
  • 13. “Malaysia needs 1,500 data scientists by 2020” (The Star, Friday, 24 April 2015) • 80 in April 2015 • 1,500 by 2020
  • 14. What kinds of companies employ data scientists?
  • 16. What are the key functions of a data scientist?
  • 17. Key functions of a data scientist • Devising business strategies from the insights • Descriptive and predictive analytics • Data mining and analysis design • Understanding the business problem
  • 18. Scenario 1: Customer Churn Analytics
  • 19. Churn analytics • Predicting who will switch mobile operator
  • 20. Customer churn - why do customers change operators? • The top 3 reasons why subscribers change providers: • They want a new handset • They believe they pay too much for calls/data • Providers do not offer additional loyalty benefits
  • 21. Machine Learning Framework • Initialization step: data collection, data preprocessing, attribute selection (Attribute 1, Attribute 2, Attribute 3) • Learn step: algorithm, training model, score model • Apply step: apply data / test data, predicting output
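
The three steps on this slide can be sketched end to end in plain Python. This is only an illustrative sketch, not the model used in the talk: the subscriber records are synthetic, the two attributes (`handset_age` and `bill`) are hypothetical stand-ins for the churn reasons listed earlier, and the hand-written logistic regression is one assumed choice of algorithm among many.

```python
import math
import random

# --- Initialization step: collect data, preprocess, select attributes ---
random.seed(0)

def make_record():
    """Synthetic subscriber: ([handset_age_months, monthly_bill], churned)."""
    handset_age = random.uniform(0, 36)
    bill = random.uniform(20, 200)
    # Assumed rule for the toy data: old handsets and high bills drive churn.
    signal = handset_age / 36 + bill / 200 + random.gauss(0, 0.2)
    return [handset_age, bill], 1 if signal > 1.0 else 0

data = [make_record() for _ in range(400)]
train, test = data[:300], data[300:]

def fit_scaler(rows):
    """Per-attribute mean and standard deviation, learned from training rows."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
            for c, m in zip(cols, means)]
    return means, stds

def scale(row, means, stds):
    return [(v - m) / s for v, m, s in zip(row, means, stds)]

means, stds = fit_scaler([x for x, _ in train])

# --- Learn step: train a model and score it ---
def predict_proba(weights, bias, x):
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(rows, labels, epochs=200, lr=0.1):
    """Logistic regression fitted by plain stochastic gradient descent."""
    weights, bias = [0.0] * len(rows[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            error = predict_proba(weights, bias, x) - y
            weights = [w - lr * error * v for w, v in zip(weights, x)]
            bias -= lr * error
    return weights, bias

X_train = [scale(x, means, stds) for x, _ in train]
y_train = [y for _, y in train]
weights, bias = train_logistic(X_train, y_train)

# --- Apply step: predict the output for unseen test data ---
X_test = [scale(x, means, stds) for x, _ in test]
y_test = [y for _, y in test]
predictions = [1 if predict_proba(weights, bias, x) >= 0.5 else 0 for x in X_test]
accuracy = sum(p == y for p, y in zip(predictions, y_test)) / len(y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

The scaler is fitted on the training rows only and then reused on the test rows, so the apply step sees nothing it would not see in production.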
  • 24. Model comparison • The receiver operating characteristic (ROC) curve illustrates the performance of a binary classifier as its discrimination threshold is varied.
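
To make the threshold sweep concrete, here is a minimal sketch that builds ROC points by hand and computes the area under the curve. The labels and scores are made-up illustration values, and the helper names `roc_points` and `auc` are hypothetical rather than taken from any library.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points from sweeping the threshold over every score."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    # Lower the threshold one distinct score at a time, strictest first.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Made-up example: 1 = churned, 0 = stayed; scores are predicted churn probabilities.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
print(points)
print(round(auc(points), 3))
```

When comparing models, a curve that sits closer to the top-left corner (and so has a larger area under it) separates the two classes better across thresholds.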
  • 25. Scenario 2: Market Basket Analysis
  • 26. Market Basket Analysis • Where should detergents be placed in the store to maximize sales? • Are bleach products purchased when detergents and orange juice are bought together? • Is cola typically purchased with bananas? • Does the brand of cola make a difference? • How are the demographics of the neighbourhood affecting what customers are buying?
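
Questions like these are commonly answered with association-rule measures: support, confidence and lift. The sketch below computes them for one rule over a handful of invented baskets; the baskets and the helper names `support` and `rule_metrics` are assumptions made for illustration, not data or code from the talk.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)

def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift for the rule antecedent -> consequent."""
    both = support(transactions, set(antecedent) | set(consequent))
    confidence = both / support(transactions, antecedent)
    lift = confidence / support(transactions, consequent)
    return both, confidence, lift

# Invented baskets, one set of products per checkout.
baskets = [
    {"detergent", "orange juice", "bleach"},
    {"detergent", "orange juice"},
    {"detergent", "bleach"},
    {"cola", "bananas"},
    {"cola", "orange juice"},
]

both, confidence, lift = rule_metrics(
    baskets, {"detergent", "orange juice"}, {"bleach"}
)
print(f"support={both:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 suggests the consequent is bought more often alongside the antecedent than it is overall; a lift near 1 suggests no association.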
  • 27. What type of work does a data scientist do?
  • 31. General aptitude to be a Data Scientist
  • 32. Data Scientist • Common sense • Curious mind • Clear and simple thinking • Love to solve puzzles • Good listening, writing and communication skills • Maths & Stats • Business sense
  • 33. I have 4 red, 18 black and 8 brown socks in my sock drawer. If it is completely dark and I cannot see the colour of the socks that I am picking, how many socks do I need to take from the drawer to be sure that I have at least one pair of socks that are the same colour?
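
One way to reason about the sock puzzle is the pigeonhole principle: in the worst case the first few socks are all different colours, so one more than the number of colours forces a match. The short sketch below encodes that reasoning; the function name is hypothetical, and it assumes the drawer holds more socks than it has colours.

```python
def socks_needed_for_pair(counts):
    """Worst case: one sock of each colour present, then one more forces a pair.

    Assumes the drawer holds more socks in total than it has colours.
    """
    colours_present = [colour for colour, n in counts.items() if n > 0]
    return len(colours_present) + 1

drawer = {"red": 4, "black": 18, "brown": 8}
print(socks_needed_for_pair(drawer))
```

Note that the quantities of each colour do not matter here, only how many distinct colours there are.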
  • 34. What is the hidden number under the car?
  • 35. What skillsets are needed to be a data scientist?
  • 36. Data scientist skillsets • Data Mining • Machine Learning • R/Python • Data Analysis • Statistics • SQL • Java • Algorithms Image Source: http://imgur.com/hoyFT4t
  • 37. What is the average salary?
  • 38. Average salary: Data Scientist
  • 39. What is data science?
  • 40. Data Science • Data science is an evolutionary step in interdisciplinary fields like business analysis that incorporate computer science, modeling, statistics, analytics, and mathematics. • At its core, data science involves using automated methods to analyze massive amounts of data and to extract knowledge from them. • Drawing insight from a piece of data involves understanding how it fits into the larger picture of an organization.
  • 42. Where do I begin?
  • 43. Massive Open Online Course (MOOC) • MSC Malaysia MyProCert (SRI) – Data Science Massive Open Online Courses (MOOC) • The Center of Applied Data Science (MDEC & HRDF) • Johns Hopkins University – Data Science Specialization • University of Washington - Data Science at Scale Specialization • Data Analyst Nanodegree - Udacity • CSCI E-109 Data Science (Harvard Extension School) • Machine Learning - Stanford University
  • 44. BDA Undergraduate & Postgraduate Programme Undergraduate: • Multimedia University – Bachelor of Computer Science (Data Science Specialization) • Sunway University - BSc (Hons) Information Systems (Business Analytics) • Universiti Teknologi Malaysia (UTM), International Islamic University Malaysia, Monash University, Universiti Teknologi MARA (UiTM) & Universiti Teknologi PETRONAS (UTP). Postgraduate: • Big Data Analytics Post Graduate Programme
  • 45. Kaggle • Data sets and real problems, in unprocessed form. • Going through past competitions is recommended. • Read the forums for particular competitions to find useful discussions, tips and hints that will help with solving future problems. • https://www.kaggle.com/
  • 46. UC Irvine Machine Learning Repository • 360 data sets as a service to the machine learning community http://archive.ics.uci.edu/ml/
  • 47. Open data • Open data from various countries • Malaysia - http://www.data.gov.my/ • Singapore - https://data.gov.sg/
  • 49. MDEC National Big App Challenge 3.0
  • 50. • June 4th – June 5th 2016, Berjaya Times Square • The themes for AHKL2016 were as follows: 1. Big Data Analytics, powered by MDEC, with access to 65 million rows of real datasets sponsored by iProperty.com Malaysia 2. O2O Commerce, powered by MOLWallet MOLPay 3. Smart Living, powered by TIME Internet
  • 53. National MDEC Big App Challenge 3.0
  • 58. PropertySenze • B2B business model • Provide machine learning and AI services to customers • Visual Search • Personalized customer experience
  • 59. BUSINESS MODEL Big Data becomes Smart Data 1. PropertySenze contracts with property sites and property developers to generate analytics and visual search 2. PropertySenze’s machine learning algorithm lets users search for and buy properties similar to those they see on the sites, from user-generated photos and from user-uploaded images 3. Enhanced search experience and personalized results for users 4. Improved platform that recognizes properties for retrieval purposes or instant purchases 5. Analytics at the fingertips for both buyers and sellers 6. Improved user experience that leads to more engagement and sale transactions 7. PropertySenze verifies all transactions and charges commission fees every month
  • 64. Hackathon: Tips • Have a well-balanced team with no more than one server-side developer with relevant experience, one good designer and one amazing storyteller • Understand the expected outcomes of the hackathon • Develop something whose benefits everyone can see • Have an impressive aim or objective • Start promoting your product during the hackathon • Hit the demo 100%. The pitch is for the product to shine