2. Acknowledgement
We thank our community of committed and passionate
volunteers, experts, educators, innovators, benefactors,
advisers, advocates and supporters.
We are also grateful for the outstanding support and
encouragement from the SONO team as well as other
organizations such as the OpenCourseWare Consortium, MIT,
IBM, Hortonworks, Stanford University, and Caltech.
3. Principles
Philanthropy through Free and Open Education, Knowledge
Dissemination and Social Innovation
Synthesize Data Science, Big Data Architecture, Technology
Platforms and Systems Engineering for Decision Making
Collaboration, Crowdsourcing and Innovation Diffusion
Emphasis on Knowledge, Skills and Abilities (KSAs)
over Abstract Mathematics or Theoretical Profundity
Principles without programs are platitudes. - George Bernard Shaw
4. Motivation
Industry needs Data Scientists with a
versatile background in Machine
Learning, Statistics, Big Data
Architecture, Advanced Analytics, and
Evidence-Oriented Systems Engineering.
Aspiring Data Scientists and Big Data
Engineers also need a well-rounded
education, mentorship from experts,
and practical skills.
5. Goals
Prepare students and practitioners with a set of
T-shaped practical skills, emphasizing depth and
breadth across a range of relevant disciplines and
capabilities in Data/Decision Sciences and Big Data
Architecture/Engineering.
Make the course delivery
easy, engaging and
engendering.
6. Data Science Enablement Roadmap 2014
1 + 3 courses earn you a Master’s-level certificate
Ramping up Machine Learning with R
Fast track to
Data Science
Modern Data Platforms
Advanced Techniques in
Big Data Analytics
7. Data Science Enablement Roadmap - Future
Possible extensions in the future
Data Mining Process
Methodologies and Tools
Advanced Techniques in
Big Data Analytics
Ramping up with R
Fast track to
Data Science
Modern Data Platforms
Machine Learning/AI
Data Visualization
8. Fast track to Data Science (DSE 400)
Introductory course with NO prerequisites.
Topics include Algorithms, Statistical Inference, Data
Analysis, Model Building, Validation, Calibration,
Data at rest and in motion, Causality, Meaning of Data,
Data Engineering, Hadoop, R,
Machine Learning,
Data Mining, Visualization,
Applications, Case Studies,
and a variety of tools and techniques.
9. Ramping up with R (DSE 501)
Prerequisite: DSE 400
Applied Statistics,
Machine Learning,
Data Mining,
Graphing,
Analytics and Visualization
Use cases, Industry Applications
10. Modern Data Platforms (DSE 502)
Prerequisite: DSE 400
Employ Hadoop and the Hadoop ecosystem to
enable enterprises to handle the data explosion and derive
actionable analytics.
MapReduce, Pig, Hive, NoSQL, ZooKeeper etc.
Also introduces stream computing with Storm and Kafka
Case Studies: Fraud Prevention, Product Recommendation,
Epidemic Prediction etc.
12. Machine Learning and AI (DSE 503)
Prerequisite: DSE 400
Explore and implement Machine
Learning Algorithms such as
Classification, Clustering, Ranking,
Recommendation, Neural Networks,
Adaptive Learning.
Also includes Knowledge
Engineering, Expert Systems,
Ontologies, NLP and Reasoning.
13. Data Mining Process and Methodologies (DSE 504)
Prerequisite: DSE 400
Decision Trees
Regression
Classification
Clustering
Association Rules etc.
14. Data Visualization (DSE 505)
Prerequisite: DSE 400
Story Telling/Data Journalism
Data Visualization Methodology
Tools and Techniques
HTML5, d3.js, BIRT,
Prefuse
15. Advanced Techniques for Big Data Analytics (DSE 600)
Prerequisite: DSE 400
On Demand Data Integration
Data Virtualization
Enterprise Data Hub
Analytics Dashboards
Big Data Appliances
Analytics as a Service
OpenStack and Savanna
Privacy and Security
17. Next Steps
The DSE 2014 stream is set to commence on Jan 19, 2014
For more details, visit DSE 400 Announcement Page <http://bit.ly/18zPE1j>
To enroll in DSE 400, visit the Enrollment Page
<http://soknocommunity.com/xoops/modules/xforms/?form_id=2>
This presentation can also be accessed at
Data Scientist Enrollment Roadmap 1.0 <http://bit.ly/1c2wC2P>
We welcome thoughts and suggestions. Write to us at
<datascience400@gmail.com>
18. References
Data Jujitsu - The Art of Turning Data into a Product by DJ Patil
Data Science eBook by Dr. Vincent Granville
Data Scientist: The Sexiest Job of the 21st Century (HBR)
Doing Data Science by Rachel Schutt
Data Visualization: a successful design process by Andy Kirk
Disruptive Possibilities: How Big Data Changes Everything by Jeffrey Needham
How to Process, Analyze and Visualize Data (MIT OCW)
Knowledge-based Systems (MIT OCW)
Learning from Data (Caltech)
Statistical Thinking and Data Analysis (MIT OCW)
The Complete Guide to Business Analytics (Collection) By: Thomas H. Davenport
Think Bayes; Think Python; and Think Stats by Allen Downey
Learn from the masters - Johann Wolfgang von Goethe
19. References (contd …)
Google White papers on GFS, MapReduce, Bigtable
Real-time Analytics with Big Data - Facebook Case Study
Interactive Data Visualization by Scott Murray
Data, Models and Decisions (MIT OCW)
Communicating Data (MIT Sloan School of Management OCW)
Unleashing the Power of Hadoop - DBTA Thought Leadership Series
How Educators Can Narrow Big Data Skills Gap - Jeff Bertolucci
Data Visualization with D3.js Cookbook
What is Data Science?
Agile Data Science