Musings on Data Science and
Students Experiencing Data Analytics
New England SENCER Center for Innovation
Prof. Randy Paffenroth
Data Science Program
Department of Mathematical Sciences
Worcester Polytechnic Institute
rcpaffenroth@wpi.edu
2014
My Research
"Internet Connectivity Access layer" by User:Ludovic.ferre -
Internet_Connectivity_Overview2_Access.svg. Licensed under Creative
Commons Attribution-Share Alike 3.0 via Wikimedia Commons -
http://commons.wikimedia.org/wiki/File:Internet_Connectivity_Access_layer.svg#mediaviewer/File:Internet_Connectivity_Access_layer.svg
This is a panel, so I want to be
provocative!
Provocative
Adjective
1. tending or serving to provoke; inciting,
stimulating, irritating, or vexing.
So, I will be a little sad if I don’t end up irritating
anyone :-)
The first war: Terminology
• Analyzing data has a long history!
• There have been many terms that have been
used to describe such endeavors:
• Statistics
• Artificial Intelligence
• Machine learning
• Data analytics
• Since I happen to work in a “Data Science”
program perhaps I may be allowed the
indulgence of using that terminology…
Whatever we call it, what makes
things different now?
Experiments, observations, and numerical simulations in many
areas of science and business are currently generating terabytes
of data, and in some cases are on the verge of generating
petabytes and beyond. Analyses of the information contained in
these data sets have already led to major breakthroughs in fields
ranging from genomics to astronomy and high-energy physics and
to the development of new information-based industries.
- Frontiers in Massive Data Analysis, National Research Council of the National Academies
Given a large mass of data, we can by judicious selection
construct perfectly plausible unassailable theories—all of
which, some of which, or none of which may be right.
- Paul Arnold Srere
The ability to take data—to be able to understand it, to process it, to
extract value from it, to visualize it, to communicate it—that’s going to
be a hugely important skill in the next decades, not only at the
professional level but even at the educational level for elementary
school kids, for high school kids, for college kids. Because now we
really do have essentially free and ubiquitous data. So the
complementary scarce factor is the ability to understand that data and
extract value from it.
-Hal Varian, Google's Chief Economist, http://www.mckinsey.com/insights/innovation/hal_varian_on_how_the_web_challenges_managers
My personal goal: Getting students to be able to
think critically about data.
What is Big Data?
• There are many examples of "data", but what makes some of
it “big”? The classic definition revolves around the three
Vs: volume, velocity, and variety.
• Volume: There is just a lot of it being generated all
the time. Things get interesting, and “big”, when you
can’t fit it all on one computer anymore. Why? Because
you then need ideas such as MapReduce and Hadoop,
which all revolve around being able to process data that
grows from terabytes, to petabytes, to exabytes.
• Velocity: Data is being generated very quickly. Can
you even store it all? If not, then what do you get rid of
and what do you keep?
• Variety: The data types you mention all take different
shapes. What does it mean to store them so that you
can play with or compare them?
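To make the MapReduce idea behind the Volume point concrete, here is a minimal single-machine sketch in plain Python. It only imitates the map, shuffle, and reduce phases on two tiny text chunks. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample chunks are illustrative assumptions, not part of Hadoop or any real framework, where each chunk would live on a different machine.

```python
from collections import defaultdict
from itertools import chain


def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values seen for one key."""
    return key, sum(values)


def word_count(chunks):
    """Run map, shuffle, and reduce over an iterable of text chunks."""
    mapped = chain.from_iterable(map_phase(chunk) for chunk in chunks)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical "chunks": in a real cluster each would sit on a different node.
chunks = ["big data is big", "data science is not big data"]
counts = word_count(chunks)
print(counts)
```

The point of the pattern is that `map_phase` and `reduce_phase` never need to see the whole data set at once, which is what lets the work be spread over many computers.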
http://pl.wikipedia.org/wiki/Green_Giant#mediaviewer/Plik:Jolly_green_giant.jpg
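One possible answer to the Velocity question ("what do you get rid of and what do you keep?") is to keep a uniform random sample of the stream. The sketch below uses reservoir sampling (Algorithm R), a standard technique that the slides do not name. The function name `reservoir_sample`, the sample size, and the integer stream are assumptions chosen for illustration.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of k items from a stream of unknown length.

    Only k items are ever held in memory, however long the stream is.
    """
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive at both ends
            if j < k:
                reservoir[j] = item
    return reservoir


# Hypothetical stream: 100,000 readings that we pretend are too many to store.
sample = reservoir_sample(range(100_000), k=5, rng=random.Random(0))
print(sample)
```

Every item in the stream ends up in the sample with the same probability, so summaries computed from the sample are unbiased estimates for the whole stream, within sampling error.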
Is Big Data the same as Data Science?
• Are Big Data and Data Science the same thing?
• I wouldn't say so...
• Data Science can be done on small data sets.
• And not everything done using Big Data would
necessarily be called Data Science.
• But there certainly is a substantial overlap!
[Figure: Venn diagram of two overlapping circles labeled "Big Data" and "Data Science"]
Can you even be certain?
• For real-world problems, I
claim that you will never be
certain of any inferences from
data.
• I mean, what happens to your
carefully thought-out marketing
plan for some rocking slacks
when the Martians land?
• What is unacceptable is when
the data you actually have
does not support the
conclusion you report.
Public domain image
It can be easy to fool yourself!
Human beings are really
good at pattern
detection...
http://en.wikipedia.org/wiki/Cydonia_(region_of_Mars)
Perhaps a bit too good!
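The same trap exists in numbers as in pictures. The sketch below generates pure noise, then searches through 1000 random "features" for the one that best correlates with a random "target". The sample size, feature count, and seed are arbitrary assumptions. With so many candidates and so few samples, a strong-looking correlation will almost certainly turn up even though nothing is related to anything.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(42)
n_samples, n_features = 20, 1000

# Everything here is independent Gaussian noise: no real relationship exists.
target = [rng.gauss(0, 1) for _ in range(n_samples)]
features = [[rng.gauss(0, 1) for _ in range(n_samples)]
            for _ in range(n_features)]

# "Judicious selection": report only the feature that looks best.
correlations = [pearson(feature, target) for feature in features]
best_r = max(correlations, key=abs)
print(f"Strongest correlation found in pure noise: r = {best_r:.2f}")
```

Reporting only `best_r`, without mentioning the 999 features that were discarded, is exactly the kind of conclusion the data does not support.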
Skills for Data Science
http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram
Which is most important?
http://en.wikipedia.org/wiki/View_of_the_World_from_9th_Avenue
http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram
WPI Data Science Program:
A Collaboration
• Business School
• Computer Science Department
• Mathematical Sciences Department
M.S. in Data Science Program
• INTEGRATIVE DATA SCIENCE (3 CREDITS)
• MATHEMATICAL ANALYTICS (3 CREDITS)
• DATA ACCESS & MANAGEMENT (3 CREDITS)
• DATA ANALYTICS & MINING (3 CREDITS)
• BUSINESS INTELLIGENCE & CASE STUDIES (3 CREDITS)
• CONCENTRATION AND ELECTIVES (9 TO 15 CREDITS)
• GRADUATE QUALIFYING PROJECT OR MS THESIS (3 TO 9 CREDITS)
Data Science Core
INTEGRATIVE DATA SCIENCE:
• DS 501 INTRODUCTION TO DATA SCIENCE (NEW COURSE)
MATHEMATICAL ANALYTICS (SELECT ONE):
• MA 543/DS 502 STATISTICAL METHODS FOR DATA SCIENCE (NEW COURSE)
• MA 542 REGRESSION ANALYSIS
• MA 554 APPLIED MULTIVARIATE ANALYSIS
DATA ACCESS AND MANAGEMENT (SELECT ONE):
• CS 542 DATABASE MANAGEMENT SYSTEMS
• MIS 571 DATABASE APPLICATIONS DEVELOPMENT
• CS 561 ADVANCED TOPICS IN DATABASE SYSTEMS
• CS 585/DS 503 BIG DATA MANAGEMENT (NEW COURSE)
DATA ANALYTICS AND MINING (SELECT ONE):
• CS 548 KNOWLEDGE DISCOVERY AND DATA MINING
• CS 539 MACHINE LEARNING
• CS 586/DS 504 BIG DATA ANALYTICS (NEW COURSE)
BUSINESS INTELLIGENCE AND CASE STUDIES (SELECT ONE):
• MIS 584 BUSINESS INTELLIGENCE
• MKT 568 DATA MINING BUSINESS APPLICATIONS
Data Science Certificate Program (18 credits):
• 15 CREDIT DATA SCIENCE CORE
plus
• 3 CREDIT ELECTIVE
2014 Data Science Cohort
NATIONALITY
CAMBODIA, INDIA, CHINA, PAKISTAN, TAIWAN, IRAN, U.S.A., BRAZIL,
NEPAL, AFGHANISTAN, INDONESIA
EDUCATIONAL FOUNDATION
• QUANTITATIVE/COMPUTATIONAL BACKGROUNDS
• PROGRAMMING WITH DATA STRUCTURES AND ALGORITHMS FOR COMPUTATIONAL SKILLS
• QUANTITATIVE SKILLS: CALCULUS, LINEAR ALGEBRA AND STATISTICS
EMPLOYMENT HISTORIES
• SENIOR RESEARCH ANALYST
• SENIOR BUSINESS ANALYST
• PATIENT FINANCIAL SERVICES
• DATA BASE ANALYST-ARCHITECT
• DECISION SCIENTIST
• MINISTRY OF FINANCE
• LAHEY HEALTH
• TECHNICAL PROGRAM MANAGEMENT
• U.S. DEPARTMENT OF STATE
GENDER
• 66.7% MALE
• 33.3% FEMALE
FULBRIGHT SCHOLARS
• 10%
2014 Data Science Cohort
FALL 2014
• Total Applicants: 126
• Total Acceptances: 33
• Fulbright Scholars: 3
• Brazil Science Mobility Student: 1
• Countries Represented: 9
• Domestic Students: 5
• International Students: 28
Many hold more than one earned Bachelor’s Degree
US Universities include Columbia, UNH and WPI
Dean Oates gave two Awards of $5K to outstanding
students.
These awards help attract top students.
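As a small exercise in thinking critically about data, the cohort counts above can be turned into rates. The sketch below uses only the Fall 2014 figures stated on this slide; the variable names are illustrative. The Fulbright share comes out near 9% of acceptances, slightly below the 10% shown on the other cohort slide, so that figure may use a different base (for example enrolled rather than accepted students); the slides do not say.

```python
# Counts as stated on the Fall 2014 cohort slide.
applicants = 126
acceptances = 33
domestic = 5
international = 28
fulbright = 3

# Sanity check: the domestic and international counts add up to the acceptances.
assert domestic + international == acceptances

acceptance_rate = acceptances / applicants
international_share = international / acceptances
fulbright_share = fulbright / acceptances

print(f"Acceptance rate:     {acceptance_rate:.1%}")
print(f"International share: {international_share:.1%}")
print(f"Fulbright share:     {fulbright_share:.1%}")
```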
Skills Acquired by Our Students
Fundamental/Technical:
• SQL / Data Modeling / Cleaning
• Data Integration / Warehousing
• Statistical Learning / Machine Learning
• Distributed Computing
• Big Data Management
• Classification / Regression / Decision Trees
• Business Intelligence
• Distributed Mining Algorithms
Professional Skills:
• Business Use Cases / Entrepreneurship
• Interdisciplinary Teams / Leadership
• Story Telling / Visualization
• Presentations / Reports
Tools:
• Oracle / MySQL / DB2 / SQL Server
• R / SAS / scikit-learn
• Weka / RapidMiner / MATLAB
• IBM Cognos / SPSS Modeler
• Hadoop / Mahout / Cassandra
• Python / Java / Cloud Computing
• Storm / Spark / InfoSphere Streams
• Spotfire / Tableau
Data Science Tools for Students: Free!
Software:
• Python: http://www.python.org/
  • IPython: http://ipython.org/
  • NumPy: http://www.numpy.org/
  • pandas: http://pandas.pydata.org/
  • Matplotlib: http://matplotlib.org/
  • Mayavi: http://mayavi.sourceforge.net/
  • scikit-learn: http://scikit-learn.org/stable/
Data:
• UCI Machine Learning Repository: http://archive.ics.uci.edu/ml/
• Kaggle: https://www.kaggle.com/
• U.S. Government: https://www.data.gov/

  • 1. Musings on Data Science and Students Experiencing Data Analytics New England SENCER Center for Innovation Prof. Randy Paffenroth Data Science Program Department of Mathematical Sciences Worcester Polytechnic Institute rcpaffenroth@wpi.edu 2014
  • 2. My Research "Internet Connectivity Access layer" by User:Ludovic.ferre - Internet_Connectivity_Overview2_Access.svg. Licensed under Creative Commons Attribution-Share Alike 3.0 via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Internet_Connectivity_Access_layer .svg#mediaviewer/File:Internet_Connectivity_Access_layer.svg
  • 3. This is a panel, so I want to be provocative! Provocative Adjective 1. tending or serving to provoke; inciting, stimulating, irritating, or vexing. So, I will be a little sad if I don’t end up irritating anyone 
  • 4. The first war: Terminology • Analyzing data has a long history! • There have been many terms that have been used to describe such endeavors: • Statistics • Artificial Intelligence • Machine learning • Data analytics • Since I happen to work in a “Data Science” program perhaps I may be allowed the indulgence of using that terminology…
  • 5. Whatever we call it, what makes things different now?
  • 6. Experiments, observations, and numerical simulations in many areas of science and business are currently generating terabytes of data, and in some cases are on the verge of generating petabytes and beyond. Analyses of the information contained in these data sets have already led to major breakthroughs in fields ranging from genomics to astronomy and high-energy physics and to the development of new information-based industries. - Frontiers in Massive Data Analysis, National Research Council of the National Academies Given a large mass of data, we can by judicious selection construct perfectly plausible unassailable theories—all of which, some of which, or none of which may be right. - Paul Arnold Srere
  • 7. The ability to take data—to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it—that’s going to be a hugely important skill in the next decades, not only at the professional level but even at the educational level for elementary school kids, for high school kids, for college kids. Because now we really do have essentially free and ubiquitous data. So the complimentary scarce factor is the ability to understand that data and extract value from it. -Hal Varian, Google's Chief Economist, http://www.mckinsey.com/insights/innovation/hal_varian_on_how_the_web_challenges_managers My personal goal: Getting students to be able to think critically about data.
  • 8. What is Big Data? • There are many examples of "data", but what makes some of it “big”? The classic definition revolves around the three Vs: volume, velocity, and variety. • Volume: There is just a lot of it being generated all the time. Things get interesting, and “big”, when you can’t fit it all on one computer anymore. Why? There are many ideas here, such as MapReduce and Hadoop, that all revolve around being able to process data that grows from terabytes, to petabytes, to exabytes. • Velocity: Data is being generated very quickly. Can you even store it all? If not, then what do you get rid of and what do you keep? • Variety: The data types you mention all take different shapes. What does it mean to store them so that you can play with or compare them? http://pl.wikipedia.org/wiki/Green_Giant#mediaviewer/Plik:Jolly_green_giant.jpg
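The Volume point above names MapReduce as one way to process data that no longer fits on one computer. What follows is only a minimal single-machine sketch of the idea, not Hadoop itself: each chunk of text is counted independently (the "map" step), and the partial counts are then merged (the "reduce" step). The function names `map_chunk` and `reduce_counts` and the sample chunks are hypothetical and chosen purely for illustration.

```python
from collections import Counter
from functools import reduce


def map_chunk(chunk):
    """Map step: count the words of one chunk, independently of all others."""
    return Counter(chunk.lower().split())


def reduce_counts(left, right):
    """Reduce step: merge two partial word counts into one."""
    return left + right


# Hypothetical chunks; in a real cluster each would live on a different machine.
chunks = ["big data is big", "data science is not big data"]

# Each map call only sees its own chunk, so the calls could run in parallel.
partials = [map_chunk(chunk) for chunk in chunks]

# Merging is associative, so partial results can be combined in any grouping.
total = reduce(reduce_counts, partials, Counter())

print(total)
```

Because no map call needs to see another chunk, the same pattern keeps working when the chunks are spread across many machines; only the small partial counts have to travel over the network.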
  • 9. Is Big Data the same as Data Science? • Are Big Data and Data Science the same thing? • I wouldn't say so... • Data Science can be done on small data sets. • And not everything done using Big Data would necessarily be called Data Science. (Diagram labels: Big Data, Data Science)
  • 10. Is Big Data the same as Data Science? • Are Big Data and Data Science the same thing? • I wouldn't say so... • Data Science can be done on small data sets. • And not everything done using Big Data would necessarily be called Data Science. • But there certainly is a substantial overlap! (Diagram labels: Big Data, Data Science)
  • 11. Can you even be certain? • For real-world problems, I claim that you will never be certain of any inferences from data. • I mean, what happens to your carefully thought-out marketing plan for some rocking slacks when the Martians land? • What is unacceptable is when the data you actually have does not support the conclusion you report. Public domain image
  • 12. It can be easy to fool yourself! Human beings are really good at pattern detection... http://en.wikipedia.org/wiki/Cydonia_(region_of_Mars) Perhaps a bit too good!
  • 13. It can be easy to fool yourself! http://en.wikipedia.org/wiki/Cydonia_(region_of_Mars)
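The point of the two slides above, that patterns are easy to find even where none exist, can be illustrated with a small experiment. This is only a sketch under assumed settings: the sample size, the number of features, and the seed are arbitrary choices, and the helper names `pearson` and `best_spurious_correlation` are hypothetical. Every column is pure random noise, yet searching across enough columns turns up one that appears to correlate with the target.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def best_spurious_correlation(n_samples=20, n_features=500, seed=0):
    """Return the largest |correlation| between a noise target and noise features."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_features):
        # Each feature is independent noise; any correlation is accidental.
        feature = [rng.gauss(0, 1) for _ in range(n_samples)]
        best = max(best, abs(pearson(feature, target)))
    return best


best = best_spurious_correlation()
print(f"strongest |correlation| among 500 noise features: {best:.2f}")
```

With few samples and many candidate features, the strongest correlation found is typically far from zero even though nothing real connects the columns, which is one reason to test any pattern on data that was not used to find it.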
  • 14. Skills for Data Science http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram
  • 15. Which is most important? http://en.wikipedia.org/wiki/View_of_the_World_from_9th_Avenue http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram
  • 16. WPI Data Science Program: A Collaboration • Business School • Computer Science Department • Mathematical Sciences Department
  • 17. M.S. in Data Science Program INTEGRATIVE DATA SCIENCE (3 CREDITS) GRADUATE QUALIFYING PROJECT OR MS THESIS (3 TO 9 CREDITS) MATHEMATICAL ANALYTICS (3 CREDITS) DATA ACCESS & MANAGEMENT (3 CREDITS) DATA ANALYTICS & MINING (3 CREDITS) BUSINESS INTELLIGENCE & CASE STUDIES (3 CREDITS) CONCENTRATION AND ELECTIVES (9 TO 15 CREDITS)
  • 18. Data Science Core • Integrative Data Science: DS 501 Introduction to Data Science (new course) • Mathematical Analytics (select one): MA 543/DS 502 Statistical Methods for Data Science (new course); MA 542 Regression Analysis; MA 554 Applied Multivariate Analysis • Data Access and Management (select one): CS 542 Database Management Systems; MIS 571 Database Applications Development; CS 561 Advanced Topics in Database Systems; CS 585/DS 503 Big Data Management (new course) • Data Analytics and Mining (select one): CS 548 Knowledge Discovery and Data Mining; CS 539 Machine Learning; CS 586/DS 504 Big Data Analytics (new course) • Business Intelligence and Case Studies (select one): MIS 584 Business Intelligence; MKT 568 Data Mining Business Applications. Data Science Certificate Program (18 credits): • 15-credit Data Science Core plus • 3-credit elective
  • 19. 2014 Data Science Cohort • Nationality: Cambodia, India, China, Pakistan, Taiwan, Iran, U.S.A., Brazil, Nepal, Afghanistan, Indonesia • Educational foundation: quantitative/computational backgrounds. Computational skills: programming with data structures and algorithms. Quantitative skills: calculus, linear algebra, and statistics • Employment histories: Senior Research Analyst, Senior Business Analyst, Patient Financial Services, Database Analyst-Architect, Decision Scientist, Ministry of Finance, Lahey Health, Technical Program Management, U.S. Department of State • Gender: 66.7% male, 33.3% female • 10% Fulbright Scholars
  • 20. 2014 Data Science Cohort, Fall 2014 • Total applicants: 126 • Total acceptances: 33 • Fulbright Scholars: 3 • Brazil Science Mobility Student: 1 • Countries represented: 9 • Domestic students: 5 • International students: 28 • Many hold more than one earned Bachelor’s degree • US universities include Columbia, UNH, and WPI • Dean Oates gave two awards of $5K to outstanding students. These awards help attract top students.
  • 21. Skills Acquired by Our Students • Fundamental/Technical: SQL / Data Modeling / Cleaning; Data Integration / Warehousing; Statistical Learning / Machine Learning; Distributed Computing; Big Data Management; Classification / Regression / Decision Trees; Business Intelligence; Distributed Mining Algorithms • Professional Skills: Business Use Cases / Entrepreneurship; Interdisciplinary Teams / Leadership; Story Telling / Visualization; Presentations / Reports • Tools: Oracle / MySQL / DB2 / SQL Server; R / SAS / SciKit; Weka / RapidMiner / MATLAB; IBM Cognos / SPSS Modeler; Hadoop / Mahout / Cassandra; Python / Java / Cloud Computing; Storm / Spark / InfoSphere Streams; Spotfire / Tableau
  • 22. Data Science Tools for Students: Free! Software: • Python: http://www.python.org/ • IPython: http://ipython.org/ • NumPy: http://www.numpy.org/ • Pandas: http://pandas.pydata.org/ • Matplotlib: http://matplotlib.org/ • Mayavi: http://mayavi.sourceforge.net/ • Scikit-learn: http://scikit-learn.org/stable/ Data: • UCI Machine Learning Repository: http://archive.ics.uci.edu/ml/ • Kaggle: https://www.kaggle.com/ • U.S. Government: https://www.data.gov/