Content of this document is under Creative Commons BY-NC-SA

Data Scientist Enablement
DSE 400 - Fast Track to Data Science
Week 1 Roadmap

Advanced Center of Excellence
Modern Renaissance Corporation
In Collaboration with SONO team and others
Agenda
You can always find the latest version of this document at bit.ly/1hC5wAV

Welcome
Mission and Objectives
DSE Roadmap
DSE 400 at a glance
Week 1 at a glance
Discussions
Learning
Practice
Assignments and Submission
Looking ahead
References
Acknowledgement
Welcome
Welcome to the DSE 2014 Track. You are on one of the most
exciting programs to disseminate knowledge, diffuse
advancements and stimulate adoption of Data/Decision
Sciences, Big Data Analytics and what we call Evidence-Oriented
Systems Engineering. The content and the courses
are designed to be easy, engaging and engendering.
We hope this program will also be most
rewarding for you from intellectual, pragmatic and
professional development perspectives.
Mission and Objectives
The mission of our program is to provide free, open and world-class
enablement of Data Scientists and help advance the
profession of Data Science and allied disciplines.
We aim to prepare the participants with analytical and
practical skills emphasizing breadth and depth in a range of
relevant disciplines and capabilities in Data/Decision
Sciences, Big Data Analytics, Architecture and Systems
Engineering.
Data Scientist Enablement Roadmap - 2014

Ramping up Machine Learning with R

Advanced Techniques in
Big Data Analytics

Fast track to
Data Science
Modern Data Platforms
“A Data Scientist is someone who knows how to extract meaning from and interpret data, which
requires both tools and methods from statistics and machine learning, as well as being human.”
- Rachel Schutt and Cathy O’Neil, Doing Data Science
DSE 2014 with tentative timeline

Mar 30 - May 10

July 20 - Aug 30

Ramping up Machine Learning with R

Fast track to
Data Science
Jan 19 - Mar 15
Modern Data Platforms
May 25 - July 5

Advanced Techniques in
Big Data Analytics
DSE 400 at a glance
Introductory course with no prerequisites. It employs a
socialized learning paradigm involving individual effort,
teamwork, discussions and collaboration on the SONO (Social
Knowledge) platform.
Topics include Algorithms, Statistical
Inference, Data Analysis, Hadoop, R,
Data Engineering, Machine Learning,
Visualization, Applications, Case Studies,
employing a variety of tools and techniques.
DSE 400 - Week 1 at a glance
Discussions (on SONO):
Welcome, Introductions, Programming and Analytics background etc.

Reading plan:
Read Chapters 1-3 from An Introduction to Data Science by Jeffrey Stanton and Big Data
[sorry] & Data Science: What Does a Data Scientist Do?

Activities:
Installing R and R-Studio; Fun with Math; Playing with ML Datasets, Research on Data
Visualization tools etc.

Assignment 1:
Download the Housing dataset from the UCI Machine Learning Repository to your local machine or
cloud drive. Import the dataset into your R environment and display it.
Social Engagement on SONO
Log in to the SONO Community. Visit our Jump Pad (or
Knowledge Domain) called DSE 400. Go to DSE 2014
Global, then join the right participant group based on the first letter
of your last name. Also feel free to explore other
knowledge-rich communities on SONO.
http://getsokno.com/redvinef/controllers/cell.php?user_knocell=992
Social Engagement on SONO - Week 1
Discussion 1: Welcome to DSE program.
Discussion 2: What programming languages are you
familiar with? What languages do you use on a day-to-day
basis? Do you have any experience using the R language?
What kind of analytics tools, if any, have you used before?
<Optional> Discussion 3: Q&A. General questions as well
as questions specific to Week 1 are welcome.
To participate in these discussions visit DSE 400 Week 1 at
http://getsokno.com/redvinef/controllers/cell.php?user_knocell=1001
Week 1 Reading Plan
DSE 400 is designed to be a broad introduction to Data
Science, Analytics Architecture and Visualization from both
learning and pragmatic perspectives. The following plan
is recommended for Week 1 to kickstart the program.
Read Chapters 1-3 from An Introduction to Data Science
by Jeffrey Stanton.
Read Big Data [sorry] & Data Science: What Does a Data
Scientist Do?
Activities
<Required> Visit http://www.rstudio.com/ and follow the instructions to
download and install R and R-Studio. For specific advice on your system and its
configuration, several how-to videos on installing R and R-Studio can be found
on YouTube. Skip this activity if you already have R and R-Studio.
<Collaborative Research> <Required> Create a presentation on Data
Visualization Tools - A Comparative Study. Incorporate your unique ideas,
research and collective insights to arrive at the right evaluation methodology,
explain your thought process and justify your choices. Note: You will build this
presentation over 4 weeks. You and your team will present it during the 5th week.
Activities - contd
<Practice> Math is Fun. Create a bar chart quickly with 10 random values
using the Data Graphs widget on the Math is Fun website. Change the graph to a pie chart.
Display percentages only, not the original values.
<Practice> Visit the UCI Machine Learning Repository. Familiarize yourself
with the various datasets on this site. Feel free to download any dataset you like. We
will be using this repository extensively in the DSE program. For Week 1 our focus is
on just the “Housing” dataset.
Assignment 1 - Submission Required
Download R-Studio, in case you have not already done so.
Download the Housing dataset from the UCI Machine Learning
Repository to your local machine or cloud drive. Import the
dataset into your R environment and display it.
Show a screenshot of your environment.
(See the sample image in the next slide.)
http://archive.ics.uci.edu/ml/datasets.html
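One possible way to carry out the import step is sketched below. It is only a sketch under stated assumptions: the data file is assumed to have been saved as housing.data in the R working directory, and read_housing is a helper name made up for this illustration; neither is prescribed by the assignment. The file is whitespace-delimited with no header row, so the column names are supplied by hand, following the dataset's accompanying description file.

```r
# Column names as listed in the dataset's description file (housing.names).
housing_cols <- c("CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
                  "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT", "MEDV")

# read_housing() is a hypothetical helper; the default path is an assumption
# about where you saved the downloaded file.
read_housing <- function(path = "housing.data") {
  # The file has no header row and uses runs of spaces as separators,
  # which is what read.table() expects by default (sep = "").
  read.table(path, header = FALSE, col.names = housing_cols)
}

if (file.exists("housing.data")) {
  housing <- read_housing("housing.data")
  str(housing)          # column types and dimensions
  print(head(housing))  # first six rows in the console
  # In R-Studio, View(housing) opens the same data in a spreadsheet-style tab.
} else {
  message("housing.data not found in ", getwd(),
          "; download it first or pass the full path to read_housing().")
}
```

With the data loaded, the Environment pane in R-Studio should list housing, and a screenshot of that pane together with the console output is one way to show the dataset for the submission.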
Assignment 1 - Example screenshot
Submissions
Deadline: Saturday, Jan 25, 11:59 PM your local time.
Email to datascience400@gmail.com the
screenshots of your R workspace (on your
machine/laptop/desktop) showing the Housing dataset.
You can either paste the image into the body of the email or
create a document in PDF format and send it as an
attachment. No links please.
Fun@Work
DSE Participant Distribution Pattern
Fun@Work
Tagcloud of professional backgrounds of DSE Participants
DSE 400 - Weeks 2-8 ahead
Week 2: Basic Statistics, Hypothesis Testing, Regression, Playing with Spreadsheets, Visualization with
R. If you are new to Statistics or need a refresher, read ahead in Think Stats: Probability and Statistics for
Programmers or watch the Statistics Playlist by Khan Academy.

Week 3-4: Intro to Machine Learning (ML): Classification, Clustering, Prediction, Naive Bayes,
Recommendations and Boosting algorithms.
Week 5: Visualizations. Present your research on Data Visualization Tools - A Comparative Study.
Week 6-7: Processing large data sets. Hadoop Ecosystem. Stream Computing etc.
Week 8: Ethics, Privacy and Building Data Products.
References and Additional Reading
An Introduction to Data Science by Jeffrey Stanton. This
is a good introduction to Data Science for non-technical
readers. This book is available under Creative Commons
Licence.
Learning R - Video Tutorial Lessons on Youtube
R for Machine Learning by Allison Chung
The Value of Big Data Isn't the Data HBR Article
[MIT OCW] Prediction, Machine Learning and Statistics
Citation
Housing Data Set Information: Concerns housing values in suburbs of Boston.
Origin: This dataset was taken from the StatLib library which is maintained at
Carnegie Mellon University. Creator : Harrison, D. and Rubinfeld, D.L.
'Hedonic prices and the demand for clean air', J. Environ. Economics &
Management, vol.5, 81-102, 1978.

Only content that appears as is in this document is under Creative Commons
BY-NC-SA. This license may not apply to material referenced here.
For More Information
The DSE 2014 stream is all set to commence on Jan 19, 2014.
For more details, visit the DSE 400 Announcement Page at bit.ly/18zPE1j
Visit DSE 2014 Global to participate in DSE and to get to know the DSE Core
Team and participants. Week 1 discussions can be found at DSE 400 Week 1.
We welcome questions, thoughts and suggestions. Post these on SONO in the
right forum/discussion or write to us at datascience400@gmail.com
You can always find the latest version of this document at bit.ly/1hC5wAV
Acknowledgement
We thank our community of committed and passionate
volunteers, experts, educators, innovators, benefactors,
advisers, advocates, mentors and supporters.
We are also grateful for the outstanding support and
encouragement from the SONO team as well as other
organizations such as R-Project, Open Courseware
Consortium, MIT, IBM, HortonWorks, Stanford University,
Caltech and Data Science Central.
Thank You

Weitere ähnliche Inhalte

Andere mochten auch

Advanced Analytics - Frameworks, Platforms and Metholodologies v 1.0
Advanced Analytics - Frameworks, Platforms and Metholodologies v 1.0Advanced Analytics - Frameworks, Platforms and Metholodologies v 1.0
Advanced Analytics - Frameworks, Platforms and Metholodologies v 1.0Dr. Mohan K. Bavirisetty
 
Hadoop - Past, Present and Future - v2.0
Hadoop - Past, Present and Future - v2.0Hadoop - Past, Present and Future - v2.0
Hadoop - Past, Present and Future - v2.0Big Data Joe™ Rossi
 
Data scientist enablement dse 400 week 7 roadmap
Data scientist enablement   dse 400   week 7 roadmapData scientist enablement   dse 400   week 7 roadmap
Data scientist enablement dse 400 week 7 roadmapDr. Mohan K. Bavirisetty
 
Dr Mohan K Bavirisetty - 8 Disciplines of Enterprise Modernization - Final
Dr  Mohan K  Bavirisetty - 8 Disciplines of Enterprise Modernization - FinalDr  Mohan K  Bavirisetty - 8 Disciplines of Enterprise Modernization - Final
Dr Mohan K Bavirisetty - 8 Disciplines of Enterprise Modernization - FinalDr. Mohan K. Bavirisetty
 
Polyglot Processing - An Introduction 1.0
Polyglot Processing - An Introduction 1.0 Polyglot Processing - An Introduction 1.0
Polyglot Processing - An Introduction 1.0 Dr. Mohan K. Bavirisetty
 
Business Analytics Competency centre: A strategic Differentiator
Business Analytics Competency centre: A strategic Differentiator Business Analytics Competency centre: A strategic Differentiator
Business Analytics Competency centre: A strategic Differentiator BSGAfrica
 
Building enterprise advance analytics platform
Building enterprise advance analytics platformBuilding enterprise advance analytics platform
Building enterprise advance analytics platformHaoran Du
 
BICC - A key element to your BI strategy
BICC - A key element to your BI strategyBICC - A key element to your BI strategy
BICC - A key element to your BI strategyGuyVanderSande
 
Center of Excellence Building Blocks
Center of Excellence Building BlocksCenter of Excellence Building Blocks
Center of Excellence Building BlocksArup Dutta
 
The Road to Becoming a Center of Excellence
The Road to Becoming a Center of ExcellenceThe Road to Becoming a Center of Excellence
The Road to Becoming a Center of ExcellenceLisa D'Adamo-Weinstein
 
sparklyr - Jeff Allen
sparklyr - Jeff Allensparklyr - Jeff Allen
sparklyr - Jeff AllenSri Ambati
 
Creating a Business Intelligence Competency Center
Creating a Business Intelligence Competency CenterCreating a Business Intelligence Competency Center
Creating a Business Intelligence Competency CenterTommy Tavenner
 
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr
Cloudera, Inc.
 
Business intelligence competency centre strategy and road map
Business intelligence competency centre strategy and road mapBusiness intelligence competency centre strategy and road map
Business intelligence competency centre strategy and road mapOmar Khan
 
Building Big Data Analytics Center Of Excellence
Building Big Data Analytics Center Of Excellence Building Big Data Analytics Center Of Excellence
Building Big Data Analytics Center Of Excellence Dr. Mohan K. Bavirisetty
 
Successfully establishing a SOA Center of Excellence
Successfully establishing a SOA Center of ExcellenceSuccessfully establishing a SOA Center of Excellence
Successfully establishing a SOA Center of ExcellenceKelly Emo
 

Andere mochten auch (17)

Advanced Analytics - Frameworks, Platforms and Metholodologies v 1.0
Advanced Analytics - Frameworks, Platforms and Metholodologies v 1.0Advanced Analytics - Frameworks, Platforms and Metholodologies v 1.0
Advanced Analytics - Frameworks, Platforms and Metholodologies v 1.0
 
Hadoop - Past, Present and Future - v2.0
Hadoop - Past, Present and Future - v2.0Hadoop - Past, Present and Future - v2.0
Hadoop - Past, Present and Future - v2.0
 
Data scientist enablement dse 400 week 7 roadmap
Data scientist enablement   dse 400   week 7 roadmapData scientist enablement   dse 400   week 7 roadmap
Data scientist enablement dse 400 week 7 roadmap
 
Dr Mohan K Bavirisetty - 8 Disciplines of Enterprise Modernization - Final
Dr  Mohan K  Bavirisetty - 8 Disciplines of Enterprise Modernization - FinalDr  Mohan K  Bavirisetty - 8 Disciplines of Enterprise Modernization - Final
Dr Mohan K Bavirisetty - 8 Disciplines of Enterprise Modernization - Final
 
Data Scientist Enablement roadmap 1.0
Data Scientist Enablement roadmap 1.0Data Scientist Enablement roadmap 1.0
Data Scientist Enablement roadmap 1.0
 
Polyglot Processing - An Introduction 1.0
Polyglot Processing - An Introduction 1.0 Polyglot Processing - An Introduction 1.0
Polyglot Processing - An Introduction 1.0
 
Business Analytics Competency centre: A strategic Differentiator
Business Analytics Competency centre: A strategic Differentiator Business Analytics Competency centre: A strategic Differentiator
Business Analytics Competency centre: A strategic Differentiator
 
Building enterprise advance analytics platform
Building enterprise advance analytics platformBuilding enterprise advance analytics platform
Building enterprise advance analytics platform
 
BICC - A key element to your BI strategy
BICC - A key element to your BI strategyBICC - A key element to your BI strategy
BICC - A key element to your BI strategy
 
Center of Excellence Building Blocks
Center of Excellence Building BlocksCenter of Excellence Building Blocks
Center of Excellence Building Blocks
 
The Road to Becoming a Center of Excellence
The Road to Becoming a Center of ExcellenceThe Road to Becoming a Center of Excellence
The Road to Becoming a Center of Excellence
 
sparklyr - Jeff Allen
sparklyr - Jeff Allensparklyr - Jeff Allen
sparklyr - Jeff Allen
 
Creating a Business Intelligence Competency Center
Creating a Business Intelligence Competency CenterCreating a Business Intelligence Competency Center
Creating a Business Intelligence Competency Center
 
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr

 
Business intelligence competency centre strategy and road map
Business intelligence competency centre strategy and road mapBusiness intelligence competency centre strategy and road map
Business intelligence competency centre strategy and road map
 
Building Big Data Analytics Center Of Excellence
Building Big Data Analytics Center Of Excellence Building Big Data Analytics Center Of Excellence
Building Big Data Analytics Center Of Excellence
 
Successfully establishing a SOA Center of Excellence
Successfully establishing a SOA Center of ExcellenceSuccessfully establishing a SOA Center of Excellence
Successfully establishing a SOA Center of Excellence
 

Ähnlich wie Data scientist enablement dse 400 - week 1

Data scientist enablement dse 400 week 2 roadmap
Data scientist enablement   dse 400   week 2 roadmapData scientist enablement   dse 400   week 2 roadmap
Data scientist enablement dse 400 week 2 roadmapDr. Mohan K. Bavirisetty
 
Data scientist enablement dse 400 week 4 roadmap
Data scientist enablement   dse 400   week 4 roadmap Data scientist enablement   dse 400   week 4 roadmap
Data scientist enablement dse 400 week 4 roadmap Dr. Mohan K. Bavirisetty
 
Data scientist enablement dse 400 week 3 roadmap
Data scientist enablement   dse 400   week 3 roadmapData scientist enablement   dse 400   week 3 roadmap
Data scientist enablement dse 400 week 3 roadmapDr. Mohan K. Bavirisetty
 
Hithai Shree.J and Varsha.R.pptx
Hithai Shree.J and Varsha.R.pptxHithai Shree.J and Varsha.R.pptx
Hithai Shree.J and Varsha.R.pptxssuser22b2ec
 
Horton+Pruim+Kaplan_MOSAIC-StudentGuide.pdf Nicholas J. .docx
Horton+Pruim+Kaplan_MOSAIC-StudentGuide.pdf Nicholas J. .docxHorton+Pruim+Kaplan_MOSAIC-StudentGuide.pdf Nicholas J. .docx
Horton+Pruim+Kaplan_MOSAIC-StudentGuide.pdf Nicholas J. .docxwellesleyterresa
 
Cse443 Project Report - LPU (Modern Big Data Analysis with SQL Specialization)
Cse443 Project Report - LPU (Modern Big Data Analysis with SQL Specialization)Cse443 Project Report - LPU (Modern Big Data Analysis with SQL Specialization)
Cse443 Project Report - LPU (Modern Big Data Analysis with SQL Specialization)Qazi Maaz Arshad
 
1 IDS 403 Final Project Part Two Guidelines and Rubric
1 IDS 403 Final Project Part Two Guidelines and Rubric 1 IDS 403 Final Project Part Two Guidelines and Rubric
1 IDS 403 Final Project Part Two Guidelines and Rubric MartineMccracken314
 
1 IDS 403 Final Project Part Two Guidelines and Rubric
1 IDS 403 Final Project Part Two Guidelines and Rubric 1 IDS 403 Final Project Part Two Guidelines and Rubric
1 IDS 403 Final Project Part Two Guidelines and Rubric AbbyWhyte974
 
Data carpentry ndic-2015-05-05
Data carpentry ndic-2015-05-05Data carpentry ndic-2015-05-05
Data carpentry ndic-2015-05-05tracykteal
 
Guia 2-examen-de-ingles
Guia 2-examen-de-inglesGuia 2-examen-de-ingles
Guia 2-examen-de-inglesLiz Castro B
 
Empowerment Tech-Mod8_Developing and Constructing the ICT Project.pdf
Empowerment Tech-Mod8_Developing and Constructing the ICT Project.pdfEmpowerment Tech-Mod8_Developing and Constructing the ICT Project.pdf
Empowerment Tech-Mod8_Developing and Constructing the ICT Project.pdfChris selebio
 
IRJET- Youtube Data Sensitivity and Analysis using Hadoop Framework
IRJET-  	  Youtube Data Sensitivity and Analysis using Hadoop FrameworkIRJET-  	  Youtube Data Sensitivity and Analysis using Hadoop Framework
IRJET- Youtube Data Sensitivity and Analysis using Hadoop FrameworkIRJET Journal
 
Learn data science with r programming
Learn data science with r programmingLearn data science with r programming
Learn data science with r programmingRonikSharma1
 
Learn data science with r programming
Learn data science with r programmingLearn data science with r programming
Learn data science with r programmingNikhilsharma1159
 
Learn data science with r programming (1)
Learn data science with r programming (1)Learn data science with r programming (1)
Learn data science with r programming (1)Sagag55
 
Learn data science with r programming
Learn data science with r programmingLearn data science with r programming
Learn data science with r programmingKeshavSain2
 
data-science-pdf-16588.pdf
data-science-pdf-16588.pdfdata-science-pdf-16588.pdf
data-science-pdf-16588.pdfvkharish18
 
A Comprehensive Learning Path to Become a Data Science 2021.pptx
A Comprehensive Learning Path to Become a Data Science 2021.pptxA Comprehensive Learning Path to Become a Data Science 2021.pptx
A Comprehensive Learning Path to Become a Data Science 2021.pptxRajSingh512965
 

Ähnlich wie Data scientist enablement dse 400 - week 1 (20)

Data scientist enablement dse 400 week 2 roadmap
Data scientist enablement   dse 400   week 2 roadmapData scientist enablement   dse 400   week 2 roadmap
Data scientist enablement dse 400 week 2 roadmap
 
Data scientist enablement dse 400 week 4 roadmap
Data scientist enablement   dse 400   week 4 roadmap Data scientist enablement   dse 400   week 4 roadmap
Data scientist enablement dse 400 week 4 roadmap
 
Data scientist enablement dse 400 week 3 roadmap
Data scientist enablement   dse 400   week 3 roadmapData scientist enablement   dse 400   week 3 roadmap
Data scientist enablement dse 400 week 3 roadmap
 
Hithai Shree.J and Varsha.R.pptx
Hithai Shree.J and Varsha.R.pptxHithai Shree.J and Varsha.R.pptx
Hithai Shree.J and Varsha.R.pptx
 
Horton+Pruim+Kaplan_MOSAIC-StudentGuide.pdf Nicholas J. .docx
Horton+Pruim+Kaplan_MOSAIC-StudentGuide.pdf Nicholas J. .docxHorton+Pruim+Kaplan_MOSAIC-StudentGuide.pdf Nicholas J. .docx
Horton+Pruim+Kaplan_MOSAIC-StudentGuide.pdf Nicholas J. .docx
 
Cse443 Project Report - LPU (Modern Big Data Analysis with SQL Specialization)
Cse443 Project Report - LPU (Modern Big Data Analysis with SQL Specialization)Cse443 Project Report - LPU (Modern Big Data Analysis with SQL Specialization)
Cse443 Project Report - LPU (Modern Big Data Analysis with SQL Specialization)
 
1 IDS 403 Final Project Part Two Guidelines and Rubric
1 IDS 403 Final Project Part Two Guidelines and Rubric 1 IDS 403 Final Project Part Two Guidelines and Rubric
1 IDS 403 Final Project Part Two Guidelines and Rubric
 
1 IDS 403 Final Project Part Two Guidelines and Rubric
1 IDS 403 Final Project Part Two Guidelines and Rubric 1 IDS 403 Final Project Part Two Guidelines and Rubric
1 IDS 403 Final Project Part Two Guidelines and Rubric
 
Data carpentry ndic-2015-05-05
Data carpentry ndic-2015-05-05Data carpentry ndic-2015-05-05
Data carpentry ndic-2015-05-05
 
Guia 2-examen-de-ingles
Guia 2-examen-de-inglesGuia 2-examen-de-ingles
Guia 2-examen-de-ingles
 
Empowerment Tech-Mod8_Developing and Constructing the ICT Project.pdf
Empowerment Tech-Mod8_Developing and Constructing the ICT Project.pdfEmpowerment Tech-Mod8_Developing and Constructing the ICT Project.pdf
Empowerment Tech-Mod8_Developing and Constructing the ICT Project.pdf
 
IRJET- Youtube Data Sensitivity and Analysis using Hadoop Framework
IRJET-  	  Youtube Data Sensitivity and Analysis using Hadoop FrameworkIRJET-  	  Youtube Data Sensitivity and Analysis using Hadoop Framework
IRJET- Youtube Data Sensitivity and Analysis using Hadoop Framework
 
Learn data science with r programming
Learn data science with r programmingLearn data science with r programming
Learn data science with r programming
 
Learn data science with r programming
Learn data science with r programmingLearn data science with r programming
Learn data science with r programming
 
Learn data science with r programming (1)
Learn data science with r programming (1)Learn data science with r programming (1)
Learn data science with r programming (1)
 
Learn data science with r programming
Learn data science with r programmingLearn data science with r programming
Learn data science with r programming
 
data-science-pdf-16588.pdf
data-science-pdf-16588.pdfdata-science-pdf-16588.pdf
data-science-pdf-16588.pdf
 
Sanjay CV
Sanjay CVSanjay CV
Sanjay CV
 
Sanjay cv
Sanjay cvSanjay cv
Sanjay cv
 
A Comprehensive Learning Path to Become a Data Science 2021.pptx
A Comprehensive Learning Path to Become a Data Science 2021.pptxA Comprehensive Learning Path to Become a Data Science 2021.pptx
A Comprehensive Learning Path to Become a Data Science 2021.pptx
 

Kürzlich hochgeladen

Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Mark Reed
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management systemChristalin Nelson
 
ROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxVanesaIglesias10
 
Dust Of Snow By Robert Frost Class-X English CBSE
Dust Of Snow By Robert Frost Class-X English CBSEDust Of Snow By Robert Frost Class-X English CBSE
Dust Of Snow By Robert Frost Class-X English CBSEaurabinda banchhor
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptxmary850239
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Celine George
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfTechSoup
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptxmary850239
 
Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4JOYLYNSAMANIEGO
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Celine George
 
Millenials and Fillennials (Ethical Challenge and Responses).pptx
Millenials and Fillennials (Ethical Challenge and Responses).pptxMillenials and Fillennials (Ethical Challenge and Responses).pptx
Millenials and Fillennials (Ethical Challenge and Responses).pptxJanEmmanBrigoli
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17Celine George
 
Active Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfActive Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfPatidar M
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designMIPLM
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfJemuel Francisco
 
Textual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSTextual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSMae Pangan
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfVanessa Camilleri
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPCeline George
 

Kürzlich hochgeladen (20)

FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptxFINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptxYOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
 
ROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptx
 
Dust Of Snow By Robert Frost Class-X English CBSE
Dust Of Snow By Robert Frost Class-X English CBSEDust Of Snow By Robert Frost Class-X English CBSE
Dust Of Snow By Robert Frost Class-X English CBSE
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx
 
Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17
 
Millenials and Fillennials (Ethical Challenge and Responses).pptx
Millenials and Fillennials (Ethical Challenge and Responses).pptxMillenials and Fillennials (Ethical Challenge and Responses).pptx
Millenials and Fillennials (Ethical Challenge and Responses).pptx
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17
 
Active Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfActive Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdf
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
 
Textual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSTextual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHS
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdf
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 

Data scientist enablement dse 400 - week 1

  • 1. Content of this document is under Creative Commons BY-NC-SA Data Scientist Enablement DSE 400 - Fast Track to Data Science Week 1 Roadmap Advanced Center of Excellence Modern Renaissance Corporation In Collaboration with SONO team and others
  • 2. Agenda You can always find the latest version of this document at bit.ly/1hC5wAV Welcome Mission and Objectives DSE Roadmap DSE 400 at a glance Week 1 at a glance Discussions Learning Practice Assignments and Submission Looking ahead References Acknowledgement
  • 3. Welcome Welcome to DSE 2014 Track. You are on one of the most exciting programs to disseminate knowledge, diffuse advancements and stimulate adoption of Data/Decision Sciences, Big Data Analytics and what we call Evidence-Oriented Systems Engineering. The content and the courses are designed to be easy, engaging and engendering. Consequently, we hope this program will also be most rewarding for you from intellectual, pragmatic and professional development perspectives.
  • 4. Mission and Objectives The mission of our program is to provide free, open and world-class enablement of Data Scientists and to help advance the profession of Data Science and allied disciplines. We aim to prepare the participants with analytical and practical skills emphasizing breadth and depth in a range of relevant disciplines and capabilities in Data/Decision Sciences, Big Data Analytics, Architecture and Systems Engineering.
  • 5. Data Scientist Enablement Roadmap - 2014: Ramping up Machine Learning with R; Advanced Techniques in Big Data Analytics; Fast track to Data Science; Modern Data Platforms. “A Data Scientist is someone who knows how to extract meaning from and interpret data, which requires both tools and methods from statistics and machine learning, as well as being human.” - Rachel Schutt and Cathy O’Neil, Doing Data Science
  • 6. DSE 2014 with tentative timeline Mar 30 - May 10 July 20 - Aug 30 Ramping up Machine Learning with R Fast track to Data Science Jan 19 - Mar 15 Modern Data Platforms May 25 - July 5 Advanced Techniques in Big Data Analytics
  • 7. DSE 400 at a glance Introductory course with no prerequisites. It employs a socialized learning paradigm involving individual effort, team work, discussions and collaboration on the SONO (Social Knowledge) platform. Topics include Algorithms, Statistical Inference, Data Analysis, Hadoop, R, Data Engineering, Machine Learning, Visualization, Applications and Case Studies, employing a variety of tools and techniques.
  • 8. DSE 400 - Week 1 at a glance Discussions (on SONO): Welcome, Introductions, Programming and Analytics background, etc. Reading plan: Read Chapters 1-3 from An Introduction to Data Science by Jeffrey Stanton and Big Data [sorry] & Data Science: What Does a Data Scientist Do? Activities: Installing R and R-Studio; Fun with Math; Playing with ML Datasets; Research on Data Visualization tools, etc. Assignment 1: Download the Housing dataset from the UCI Machine Learning Repository to your local machine or cloud drive. Import this dataset into your R environment and display it.
  • 9. Social Engagement on SONO Log in to the SONO Community. Visit our Jump Pad (or Knowledge Domain) called DSE 400. Go to DSE 2014 Global, then join the right participant group based on the first letter of your last name. Also feel free to explore other knowledge-rich communities on SONO. http://getsokno.com/redvinef/controllers/cell.php?user_knocell=992
  • 10. Social Engagement on SONO - Week 1 Discussion 1: Welcome to the DSE program. Discussion 2: What programming languages are you familiar with? What languages do you use on a day-to-day basis? Do you have any experience using the R language? What kind of analytics tools, if any, have you used before? <Optional> Discussion 3: Q&A. General questions as well as questions specific to Week 1 are welcome. To participate in these discussions, visit DSE 400 Week 1 at http://getsokno.com/redvinef/controllers/cell.php?user_knocell=1001
  • 11. Week 1 Reading Plan DSE 400 is designed to be a broad introduction to Data Science, Analytics Architecture and Visualization from both learning and pragmatic perspectives. The following plan is recommended for Week 1 to kickstart the program. Read Chapters 1-3 from An Introduction to Data Science by Jeffrey Stanton. Read Big Data [sorry] & Data Science: What Does a Data Scientist Do?
  • 12. Activities <Required> Visit http://www.rstudio.com/ and follow the instructions to download and install R and R-Studio. For specific advice on your system and its configuration, several how-to videos on installing R and R-Studio can be found on YouTube. Skip this activity if you already have R and R-Studio. <Collaborative Research> <Required> Create a presentation on Data Visualization Tools - A Comparative Study. Incorporate your unique ideas, research and collective insights to arrive at the right evaluation methodology, explain your thought process and justify your choices. Note: You will build this presentation over 4 weeks. You and your team will present it during the 5th week.
  • 13. Activities - contd <Practice> Math is Fun. Create a bar chart quickly with 10 random values using the Data Graphs widget at the Math is Fun website. Change the graph to a Pie Chart. Display percentages only, not the original values. <Practice> Visit the UCI Machine Learning Repository. Familiarize yourself with the various datasets at this site. Feel free to download any dataset you like. We will be using this repository extensively in the DSE program. For Week 1 our focus is on just the “Housing” dataset.
  • 14. Assignment 1 - Submission Required Download R-Studio, in case you have not already done so. Download the Housing dataset from the UCI Machine Learning Repository to your local machine or cloud drive. Import this dataset into your R environment and display it. Show a screenshot of your environment. (See the sample image in the next slide.) http://archive.ics.uci.edu/ml/datasets.html
  • 15. Assignment 1 - Example screenshot
  • 16. Submissions Deadline: Saturday, Jan 25, 11:59 PM your local time. Email to datascience400@gmail.com the screenshots of your R workspace (on your machine/laptop/desktop) showing the Housing dataset. You can either paste the image into the body of the email or create a document in PDF format and send it as an attachment. No links please.
  • 18. Fun@Work Tagcloud of professional backgrounds of DSE Participants
  • 19. DSE 400 - Weeks 2-8 ahead Week 2: Basic Statistics, Hypothesis Testing, Regression, Playing with Spreadsheets, Visualization with R. If you are new to Statistics or need a refresher, read ahead in Think Stats: Probability and Statistics for Programmers or watch the Statistics Playlist by Khan Academy. Weeks 3-4: Intro to Machine Learning (ML) - Classification, Clustering, Prediction, Naive Bayes, Recommendations and Boosting algorithms. Week 5: Visualizations. Present your research, Data Visualization Tools - A Comparative Study. Weeks 6-7: Processing large data sets. Hadoop Ecosystem. Stream Computing, etc. Week 8: Ethics, Privacy and Building Data Products.
  • 20. References and Additional Reading An Introduction to Data Science by Jeffrey Stanton. This is a good introduction to Data Science for non-technical readers. This book is available under a Creative Commons Licence. Learning R - Video Tutorial Lessons on YouTube. R for Machine Learning by Allison Chung. The Value of Big Data Isn't the Data (HBR article). [MIT OCW] Prediction, Machine Learning and Statistics.
  • 21. Citation Housing Data Set Information: Concerns housing values in suburbs of Boston. Origin: This dataset was taken from the StatLib library, which is maintained at Carnegie Mellon University. Creator: Harrison, D. and Rubinfeld, D.L. 'Hedonic prices and the demand for clean air', J. Environ. Economics & Management, vol.5, 81-102, 1978. Only content that appears as is in this document is under Creative Commons BY-NC-SA. This license may not apply to material referenced here.
  • 22. For More Information The DSE 2014 stream is all set to commence on Jan 19, 2014. For more details, visit the DSE 400 Announcement Page bit.ly/18zPE1j Visit DSE 2014 Global to participate in DSE and to get to know the DSE Core Team and participants. Week 1 discussions can be found at DSE 400 Week 1. We welcome questions, thoughts and suggestions. Post these on SONO in the right forum/discussion or write to us at <datascience400@gmail.com> You can always find the latest version of this document at bit.ly/1hC5wAV
  • 23. Acknowledgement We thank our community of committed and passionate volunteers, experts, educators, innovators, benefactors, advisers, advocates, mentors and supporters. We are also grateful for the outstanding support and encouragement from the SONO team as well as other organizations like R-Project, Open Courseware Consortium, MIT, IBM, HortonWorks, Stanford University, Caltech and Data Science Central, etc.
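
For Assignment 1 (slide 14), the import-and-display step might look roughly like the R sketch below. It is only an illustration under stated assumptions, not the course's official solution: it assumes the downloaded file is saved as housing.data in your R working directory, that it is whitespace-separated with one record per line and no header row, and that the 14 column names follow the dataset's accompanying description file. The helper name read_housing is hypothetical, introduced here for illustration.

```r
# Column names as given in the dataset's description file (housing.names).
housing_cols <- c("CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
                  "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT", "MEDV")

# Hypothetical helper: read a whitespace-separated file with no header row
# and attach the column names above.
read_housing <- function(path) {
  read.table(path, header = FALSE, col.names = housing_cols)
}

# Assumed location of the downloaded file; change it to wherever you saved it.
housing_path <- "housing.data"

if (file.exists(housing_path)) {
  housing <- read_housing(housing_path)
  str(housing)          # dimensions and column types
  print(head(housing))  # first six rows
  # View(housing)       # in R-Studio, opens the spreadsheet-style viewer
} else {
  message("File not found: ", housing_path)
}
```

After running this in R-Studio, the housing object should appear in the Environment pane, which is the kind of view the assignment asks you to capture in a screenshot.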