2. 2
Outcome: Understand a bit of data science academic
history, current educational programs and what the
future may hold
3. 3
Data Science, brief History
Series of Academic Pebbles:
1960, Peter Naur used the term “data science” as a substitute
for computer science in survey research
John W. Tukey, 1962, “The Future of Data Analysis”:
“For a long time I thought I was a statistician, interested in inferences
from the particular to the general. But as I have watched mathematical
statistics evolve, I have had cause to wonder and doubt… I have come to
feel that my central interest is in data analysis… Data analysis, and the
parts of statistics which adhere to it, must…take on the characteristics of
science rather than those of mathematics… data analysis is intrinsically
an empirical science… How vital and how important… is the rise of
the stored-program electronic computer? In many instances the
answer may surprise many by being ‘important but not vital,’ although in
others there is no doubt but what the computer has been ‘vital.’”
4. 4
Data Science, brief History
Series of Academic Pebbles:
1974, Peter Naur publishes book: Concise Survey of
Computer Methods, provides a definition for Data Science:
“The science of dealing with data, once they have been established, while
the relation of the data to what they represent is delegated to other fields
and sciences.”
1977, The International Association for Statistical
Computing is established as a Section of the ISI
“It is the mission of the IASC to link traditional
statistical methodology, modern computer technology,
and the knowledge of domain experts in order to
convert data into information and knowledge.”
One of the first instances when we see the three cornerstones of
modern day data science being articulated
5. 5
Data Science, brief History
Series of Academic Pebbles:
1989, Gregory Piatetsky-Shapiro establishes first
Knowledge Discovery in Databases (KDD) workshop
1996, International Federation of Classification Societies
meets in Kobe, Japan, and for the first time “data science” is
included in the title of the conference
2002, Data Science Journal is launched
2003, Journal of Data Science is launched
6. 6
Data Science, brief History
Google Weighs in…
January 2009 Hal Varian, Google’s Chief Economist, says:
“I keep saying the sexy job in the next ten years will be
statisticians. People think I’m joking, but who would’ve
guessed that computer engineers would’ve been the sexy
job of the 1990s? The ability to take data—to be able to
understand it, to process it, to extract value from it, to
visualize it, to communicate it—that’s going to be a hugely
important skill in the next decades… Because now we
really do have essentially free and ubiquitous data.”
7. 7
Data Science, brief History
Pressure on Academy to change curriculums:
2010 Kirk Borne (teaches for GW) and other astrophysicists
submit to the Astro2010 Decadal Survey a paper titled “The
Revolution in Astronomy Education: Data Science for the
Masses”
“Training the next generation in the fine art of deriving intelligent
understanding from data is needed for the success of sciences,
communities, projects, agencies, businesses, and economies. This
is true for both specialists (scientists) and non-specialists
(everyone else: the public, educators and students, workforce).
Non-specialists require information literacy skills as productive
members of the 21st century workforce, integrating foundational
skills for lifelong learning in a world increasingly dominated by
data.”
8. 8
Data Science, brief History
Data Scientist emerges:
2012 Harvard Business Review article “Data Scientist: The
Sexiest Job of the 21st Century”
DJ Patil claims to have coined this term in 2008 with
Jeff Hammerbacher to define their jobs at LinkedIn
and Facebook.
10. 10
Academic Programs
Data Scientist emerges:
Degree Programs with the phrase “Data Science” started
popping up around this same time, 2008ish (N.C. State,
College of Charleston, Stanford)
Nomenclature started out as Data Analytics; the market is now
moving to Data Science as the normative name
Harvard this year launched a Master’s in Data Science
Really an extension of “Business Analytics”…at first…
Growing computing power enabled machine learning
techniques that required a more intensive focus on software
coding skills to fully leverage predictive power
The field is now, loosely, separated into three very high-level
areas of focus
11. 11
Academic Programs
Three loosely defined educational paradigms:
Paradigm | Business Analytics (Business School) | Data Science (Arts and Sciences) | Data Engineer (CS or Engineering)
Educational Focus | Knowledge on how to leverage data outcomes for business decisions | Knowledge on creation and interpretation of data products | Knowledge on data infrastructure and system creation and maintenance
Job Title Analogy | Business Analyst | Data Scientist (largest demand) | Data Architect
Job Duties | Analysis applied to operational elements of organization | Creating monetizable commodities or information | Maintain systems/software used for “big data” and analysis
12. 12
Method:
Identified top U.S. institutions
Identified those offering graduate level “Data Science”
degrees
Gathered enrollment data from the National Center for
Education Statistics, where available
Gathered curriculum data by viewing individual websites
Categorized the results based on topic areas
Mapped the various institutions based on curriculum using
qualitative clustering techniques
GWU Data Science Program Overview
Research on Data Science Master’s Programs
13.
14. 14
Method:
Through numerous interviews with other data science
program directors and private sector companies
Participation on standard development committees, BHEF,
ASA and NVTC
Experience in developing the program at GW
Best Practices in DS Education
15. 15
Best Practices in DS Education
Practice | Deployed | Result | Notes
Diversity in Computing Languages | Select language dependent on content being delivered | Students more able to adapt to multiple working environments | Python: ML; R: Stats; JavaScript: Vis; Hive: HPC
Limit theory, focus on applied knowledge | 30-minute lectures coupled with in-class work | Students leave with an ability to contribute immediately |
Unique Data Science Courses | Develop courses organically, don’t leverage current courses | Courses are designed specifically for Data Science sector needs |
Dedicated program HPC/hardware/cloud | Students have access to Big Data platform throughout program | Able to understand the unique challenges associated with large datasets |
16. 16
Best Practices in DS Education
Practice | Deployed | Result | Notes
Connection with Industry | Corporate board and partnerships | Students can work on real-world projects |
Portfolio Development Approach | Students use GitHub to advertise skills and can share with employers | Students have practical knowledge and get hired at higher rates |
Student-led project teams | Encourage students to create teams and complete projects outside of class | More experience and deeper subject area expertise is developed |
17. 17
Four High-Level Educational Options
Data Science Industry Education at Large
Option | Secondary Education | Immersion Programs | Online | Boot Strapping
Example | Undergraduate, Graduate, Certificates, 2-year schools | Springboard, General Assembly, Data Society | Coursera, Udacity, DataCamp, etc. | MOOCs, books, free courses
Goal | Industry-recognized validation of skills | Gain new skills at a low cost, rapidly | Enhance current skills or gain awareness of field | Gain or enhance skills at personal pace
Investment (Time and Money) | High/High | Low/Medium | Med/Low | High/Low
19. Knowledge Economy?
A system of consumption and production that is
predicated on intellectual capital
Intellectual Capital?
The value of a company or organization's employee
knowledge, business training and any proprietary information
that may provide the company with a competitive advantage.
What is driving this economic reality?
20. Adam Smith in 1776 prognosticated in Wealth of Nations that
Division of Labor would be an economic driver for years to
come, and he was right, resulting in…
Hyper-Specialization?
Occurs as an economy becomes more and more advanced,
requiring ever-increasing specialized skills.
21. Knowledge Economy
Intellectual Capital
Hyper-Specialization
Division of Labor
What Does this Mean for us?
It means that a combination of technical proficiency and subject
area expertise will be essential for success and that in-demand
skills in cutting-edge technology areas will continue to evolve as
they have done for decades.
22. 22
Collaborations between the options
Online platforms offered as supplemental content to a
secondary program
GW working with Data Society
Market failures for higher education programs that do not
demonstrate value to companies
Drive standards toward what we are seeing now already in the top
schools
Increased specialization: Master’s in Machine Learning (Adam
Smith – ever-increasing knowledge economy)
Consolidation of the online or immersion programs
Increased collaborations between private sector and higher
education institutions
Trends in Data Science Education
23. 23
Based on a report by the Business-Higher Education Forum and
PwC, “Investing in America’s data science and analytics talent,” April 2017
Industry Notes
24. 24
New job postings for data and analytics professions expected to
reach 2.72 million in 2020, in three general categories
Industry Notes
27. 27
Future Research
We are currently using NLP to cluster data science skills
listed in job postings
The results will then be compared to the curriculum being
offered by these top universities to determine if gaps are
present
Continue to monitor industry over time and track what
the market is demanding, with the hopes of
adjusting our curriculum as necessary
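The approach described on this slide (cluster skills found in job postings, then compare them to curriculum coverage) can be sketched roughly as follows. This is a minimal illustration only, not the method actually used in the research: the skill vocabulary `SKILLS`, the function names, the Jaccard co-occurrence measure, and the 0.5 threshold are all assumptions introduced here for the example.

```python
import re

# Hypothetical skill vocabulary; a real study would derive this from the postings.
SKILLS = ["python", "r", "sql", "hadoop", "spark",
          "statistics", "machine learning", "visualization"]


def extract_skills(text, vocabulary=SKILLS):
    """Return the vocabulary skills mentioned in a posting (case-insensitive)."""
    lowered = text.lower()
    found = set()
    for skill in vocabulary:
        # Require non-letter boundaries so that "r" does not match inside words.
        pattern = r"(?<![a-z])" + re.escape(skill) + r"(?![a-z])"
        if re.search(pattern, lowered):
            found.add(skill)
    return found


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_skills(postings, threshold=0.5):
    """Group skills that tend to appear in the same postings.

    Two skills are linked when the Jaccard similarity of the sets of
    postings mentioning them meets the (assumed) threshold; linked skills
    are merged transitively, i.e. single-link clustering.
    """
    occurrences = {}
    for idx, text in enumerate(postings):
        for skill in extract_skills(text):
            occurrences.setdefault(skill, set()).add(idx)

    parent = {skill: skill for skill in occurrences}

    def find(skill):
        # Follow parent links up to the cluster representative.
        while parent[skill] != skill:
            skill = parent[skill]
        return skill

    skills = sorted(occurrences)
    for i, a in enumerate(skills):
        for b in skills[i + 1:]:
            if jaccard(occurrences[a], occurrences[b]) >= threshold:
                parent[find(b)] = find(a)

    clusters = {}
    for skill in skills:
        clusters.setdefault(find(skill), []).append(skill)
    return sorted(sorted(group) for group in clusters.values())


def curriculum_gaps(postings, curriculum_skills):
    """List (skill, posting count) pairs demanded by postings but not taught."""
    demand = {}
    for text in postings:
        for skill in extract_skills(text):
            demand[skill] = demand.get(skill, 0) + 1
    # Sort by descending demand, then alphabetically, for a stable order.
    ranked = sorted(demand.items(), key=lambda item: (-item[1], item[0]))
    return [(skill, n) for skill, n in ranked if skill not in curriculum_skills]


if __name__ == "__main__":
    sample_postings = [
        "Seeking data scientist with Python, machine learning and SQL experience.",
        "Data engineer: Hadoop, Spark, SQL.",
    ]
    print(cluster_skills(sample_postings))
    print(curriculum_gaps(sample_postings, {"python", "sql"}))
```

With real postings, a keyword vocabulary like this would likely be replaced by a richer NLP step; the co-occurrence grouping is used here only because it keeps the sketch self-contained.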
28. 28
Future Research
Based on 200 “Data Scientist” jobs nationwide, expanding the
number to include thousands of jobs targeting the DC area, NYC,
and Silicon Valley
29. Thomas Friedman in The World is Flat (2005): “Markets will
continue to grow to form a global competitive landscape
defined by economic powers composed of knowledge
workers where critical thinking and idea creation will drive
demand.” (Golden Arches Theory)
Einstein – “The true sign of intelligence is not knowledge but
imagination”
Nelson Mandela – “Education is the most powerful weapon
which you can use to change the world”
Booz Allen Story and then highlight DS programs in general
Before clicking on the blog, review the data science Venn diagram
What does knowledge economy mean to you guys? How about intellectual capital? So if we believe we are moving towards a knowledge economy predicated on intellectual capital, what does that mean for the working individual and what is driving this reality?
Division of labor occurs as an economy becomes more and more advanced, requiring ever-increasing specialized skills.