2. SDS and NASA
• The state of play in academic data science
• UVA response
• School formation
• Mission
• Our data science framework
• Examples of research
• School capabilities
• Opportunities for NASA/SDS Collaboration
3. Increased Demand over the Past Five Years
74%: Artificial Intelligence specialists
Top industries hiring this talent: Computer software, internet, information technology and services, higher education, consumer electronics
37%: Data Scientist
Top industries hiring this talent: Information technology and services, computer software, internet, financial services, higher education
33%: Data Engineer
Top industries hiring this talent: Information technology and services, internet, computer software, financial services, hospital and healthcare
4. The Rising Demand for Data Scientists
UVA School of Data Science Graduate Job Placement*
2019 2018 2017 2016 2015
100% 100% 100% 98% 97%
*for graduates seeking employment
Roles
Machine Learning Engineer, Director of Data
Science, Deep Learning Research Scientist,
Senior Data Analyst, Data Science Developer,
Consultant, Product Data Analyst, Financial
Engineer, Engagement Manager & more
Industries
● Finance
● Government
● Healthcare & Medicine
● Professional Sports
● Commerce
● Media
● Higher Ed
● Technology
6. A New School for a New Century
A School Without Walls
Mission
To be a national and international leader in responsible data science
emphasizing interdisciplinary collaboration which results in
furthering discovery, sharing knowledge, and societal benefit
7. One Representation of Data Science
The 4+1 Model
The model is based on the core insight that all definitions of data science assume a pipeline, and that this pipeline forms a parallel process.
[From Raf Alvarado]
8. One Representation of Data Science
The 4+1 Model
• Value – assuring societal benefit
• Design – communication of the value of data
• Systems – the means to communicate and convey benefit
• Analytics – models and methods
• Practice – where everything happens
[From Raf Alvarado]
9. The 4+1 Model Interplay
[From Raf Alvarado]
• Value + Design = openness, responsibility
• Value + Analytics = human-centered AI, algorithmic bias
• Value + Systems = sustainability, access, environmental impact
• Design + Analytics = literate programming, visualization
• Design + Systems = dashboards, engineering design
• Analytics + Systems = ML engineering
10. A New School for a New Century
Where we are Today
Foundation
● Residential & Online Master's in Data Science
● Presidential Fellows
● Undergrad minor Spring 2021
● PhD programs submitted in the fall
● Hiring & recruiting leading faculty
● Research & community projects underway
● New building plan – occupy 2023
11. Use case – Data Integration
Researcher and Assistant Professor of Medicine Dr. Thomas Hartka, also a current online Master's in Data Science student, is combining two disparate data sets (electronic health records and DMV crash data) to save lives after motor vehicle crashes.
“I enrolled in the MSDS program
to expand my research on
automotive safety. I have already
used techniques from classes in
my work. I hope to expand my
research to real-time analytics to
improve emergency room care.”
— Dr. Thomas Hartka, UVA
School of Medicine
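As a rough illustration of what combining two disparate data sets can involve, the sketch below links crash records to emergency department visits. It is not Dr. Hartka's method: the field names (`crash_id`, `visit_id`, `time`, `arrival`, `age`, `sex`), the matching rule, and the six-hour window are all assumptions made for the example, and the records are invented.

```python
from datetime import datetime, timedelta


def link_records(crashes, visits, max_gap_hours=6):
    """Pair each crash with visits that match on age and sex and
    arrive within max_gap_hours after the crash (hypothetical rule)."""
    window = timedelta(hours=max_gap_hours)
    links = []
    for crash in crashes:
        for visit in visits:
            gap = visit["arrival"] - crash["time"]
            same_person = (crash["age"], crash["sex"]) == (visit["age"], visit["sex"])
            if same_person and timedelta(0) <= gap <= window:
                links.append((crash["crash_id"], visit["visit_id"]))
    return links


# Invented records, for illustration only.
crashes = [
    {"crash_id": "C1", "time": datetime(2020, 1, 5, 14, 0), "age": 34, "sex": "F"},
    {"crash_id": "C2", "time": datetime(2020, 1, 6, 9, 30), "age": 61, "sex": "M"},
]
visits = [
    {"visit_id": "V1", "arrival": datetime(2020, 1, 5, 15, 10), "age": 34, "sex": "F"},
    {"visit_id": "V2", "arrival": datetime(2020, 1, 6, 20, 0), "age": 61, "sex": "M"},
]

matches = link_records(crashes, visits)
print(matches)
```

Real linkage work would typically need probabilistic matching and privacy safeguards; the exact rule here only shows the shape of the problem.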
12. Use Cases
Machine Learning powered insights
Don Brown
Project 1 - DARPA
• Monitor data from host computer logs, authentication attempts, and network traffic from multiple enterprises, and subject this data to optimized ML techniques capable of detecting anomalies that signal an intrusion
• Develop deep neural network learning methods that do not require enterprises to send their data to a global repository
• Preserve privacy
Project 2 - DOD
• Exploit massive amounts of contextual data and use other aspects of the dynamic environment
• Leverage signal processing, data fusion, visualization, human factors, and cybersecurity
• Fusion processes developed in the Predictive Technology Laboratory took data from multiple sources and combined them using hierarchical models.
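One common way to learn from several enterprises without sending their data to a global repository is federated averaging. The sketch below assumes that technique; the slides do not say which method the project uses. A one-parameter linear model stands in for a deep network, and the function names and data are hypothetical.

```python
def local_update(w, data, lr=0.1, steps=50):
    """Gradient descent on squared error for y = w * x, using one site's data only."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(sites, rounds=5, w=0.0):
    """Each round, every site trains locally; only the weights are shared and averaged."""
    total = sum(len(data) for data in sites)
    for _ in range(rounds):
        local_ws = [local_update(w, data) for data in sites]
        # Weight each site's model by its number of examples.
        w = sum(lw * len(data) for lw, data in zip(local_ws, sites)) / total
    return w


# Invented data: two "enterprises" whose private samples follow y = 2x.
sites = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0), (1.5, 3.0), (0.5, 1.0)],
]

w = federated_average(sites)
print(round(w, 6))
```

Only model weights cross site boundaries in this sketch; the raw `(x, y)` pairs never leave the list that owns them.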
13. Use Case Presidential Fellowship with NASA
Environmental Data
Jake Malcomb and Linnea Saby
• Analyze a massive geospatial data set collected over a two-year period from the International Space Station, and then parsed by an “extreme machine learning” tool that aims to mimic the human brain.
• Tree core samples provide temporal information about long-term tree growth and physiology
• ML taps geospatial data to understand forest ecosystems
• NASA ECOSTRESS and GEDI provide the extraordinarily large geospatial dataset
14. Furthering Discovery to Build a Better World
RESEARCH
Cybersecurity
Detecting broad-spectrum cyber
threats almost immediately after
they are launched through a $7.6
million Defense Advanced
Research Projects Agency
(DARPA) grant.
Environment
Using NASA data collected aboard the
International Space Station to examine
climate change in the Shenandoah
National Forest and beyond, and find
solutions
Health & Medicine
Securing high-performance computing equipment and personnel to allow collaboration across the university on brain science research, including autism, Alzheimer’s, mental health disorders, traumatic brain injuries and more.
Business
Discovering what makes a job
interview successful for the
candidate and the recruiter, and
how to mitigate bias in the
recruiting process
Democracy
Investigating how terrorist groups recruit
women through propaganda and
examining risk and threat assessment for
extremist violence perpetrated by women.
Education
Helping economically disadvantaged,
underrepresented populations pursue
tailored educational workforce pathways
that have a higher probability of leading
them to success.
15. Applying Data Science Across Industries
“To tackle challenges in science and medicine.”
— Elizabeth Driskell, MSDS ‘20
“To inform public policy and government.”
— Bradley Katcher, MSDS ‘20
“I want to use data science to find a new way of
thinking.” — Alex Gromadzki, MSDS ‘21
“I want to use data science to solve complex business
problems.” — Ruslan Askerov, MSDS ‘21
“To address poverty and income inequality.”
— Arti Patel, MSDS ‘20
16. SDS Faculty Research
Data Science faculty member or affiliated faculty | Website | Research interests
Nada Basit | https://engineering.virginia.edu/faculty/nada-basit | Machine Learning, Bioinformatics, Data Mining, Pattern Recognition
Phil Bourne | https://engineering.virginia.edu/faculty/philip-e-bourne | Multiscale Modeling Using Data Science Techniques; Early Stage Drug Discovery and Drug Repurposing; Early Stage Drug Methods and Tools for Macromolecular
Don Brown | https://engineering.virginia.edu/faculty/donald-e-brown-phd | Data Fusion, Knowledge Discovery, and Simulation Optimization
Sallie Keller | https://biocomplexity.virginia.edu/sallie-keller | social and decision informatics, statistical underpinnings of data science, and data access and confidentiality
Daniel Mietchen | https://tools.wmflabs.org/scholia/author/Q20895785 | Computational Biology, Biodiversity integrating research workflows with the World Wide Web through open licensing, open standards, and open collaboration via
Rafael Alvarado | http://transducer.ontoligent.com/ | Cultural Analytics and Machine Learning, Digital Humanities, Text Analysis
Heman Shakeri | https://www.hemanshakeri.com/ | structure and function of interconnected networks, often expressed via graphs that comprise a set of nodes and a set of connections between them
Jonathan Kropko | https://facultydirectory.virginia.edu/faculty/jk8sd | methods to examine historical data, to test theories of voting in U.S. presidential elections, and to handle nonresponse in surveys
Michael Porter | https://engineering.virginia.edu/faculty/michael-d-porter | event prediction, pattern and anomaly detection, and data linkage; applications for Criminology, Transportation, Terrorism, Defense, Security, Forensics, Business
Mohammad Fallahi-Sichani | new hire | designing and building new experimental and computational tools to enable the analysis, interpretation and rational modulation of multi-scale processes that
Jack Van Horn | https://scholar.google.com/citations?user=i9bGqbgAAAAJ&hl=en | Psychology and Data Science, Cognitive Neuroscience
Pete Alonzi | https://github.com/alonzi
Vicente Ordonez | https://engineering.virginia.edu/faculty/vicente-ordonez-roman | Computer Vision, Natural Language Processing and Machine Learning
Tim Clark | https://scholar.google.com/citations?user=k-iwlCUAAAAJ&hl=en | next generation approaches for biomedical communications and data integration, including semantically integrated data repositories, claims and
Gerard Learmonth | https://www.researchgate.net/profile/Gerard_Learmonth | Generation and testing of pseudorandom number generators; Abstract database design; Strategic applications of information systems and technology
Hongning Wang | http://www.cs.virginia.edu/~hw5x/ | data mining, machine learning, and information retrieval, with a special emphasis on computational user behavior modeling
Stephen Adams | http://www.nsfcvdi.org/wordpress/cvdi_personnel/steven-adams-ph-d/ | Adaptive Decision Systems Lab at UVA and his research is applied to several domains including activity recognition, prognostics and health management for manufacturing
Aidong Zhang | https://engineering.virginia.edu/faculty/aidong-zhang | ML, Data mining, bioinformatics
Jundong Li | http://people.virginia.edu/~jl6qk/ | Data Mining, Machine Learning, Social Computing, and Deep Learning
Brian Wright | https://www.linkedin.com/in/brian-wright-ph-d-90063027/
17. 2020 Capstone Projects
Org sponsor | Capstone project
Markel Corporation | Machine Learning Based Approaches to Predict Customer Churn for an Insurance Company
UVA Health System | Analyzing the Composition of Diabetes Patients and Impact of Seasonal and Climate Trends on Emergency Room Utilization in Central Virginia
Met Museum | Exploring Themes and Bias in Art using Machine Learning Image Analysis
Raytheon | Machine Learning for Real-Time Vehicle Detection in All-Electronic Tolling System
Capital One | Evaluating and Improving Attrition Models for the Retail Banking Industry
Babylon Farms | A Digital Green Thumb: Neural Networks to Monitor Hydroponic Plant Growth
S&P Global | An Exploration and Characterization of Financial Performance of Standard and Poor’s 500 Index Constituents Led By Female CEOs
UVA School of Medicine/McManus | Geographic Access to HIV Care
Corning | Natural Language Processing for Company Financial Communication Style
Westrock | Enhancing Promotion Decisions using Classification and Network-based Methods
Capital One - dual degree | Retailer’s Dilemma: Personalized Product Marketing to Maximize Revenue
LMI | Document Retrieval Using Deep Learning
X Mode Social | Applying Mobile Location Data to Improve Hurricane Evacuation Plans
Smart C-ville | The Deployment of a LoRaWAN-Based IoT Air Quality Sensor Network for Public Good
Fortive | Modeling Client Churn for Small Business-to-Business Firms
Politics | Lessons Learned: A Case Study in Creating a Data Pipeline using Twitter’s API
School of Architecture | Analyzing Pre-Trained Neural Network Behavior with Layer Activation Optimization
Biomedical Engineering | Deep Learning of Protein Structural Classes: Any Evidence for an ‘Urfold’?
Clarabridge | A Comparative Study of the Performance of Unsupervised Text Segmentation Techniques on Dialogue Transcripts
18. Growing the School
[Timeline figure. Milestones shown: M.S. in Data Science (Residential & Online); undergraduate courses (increase to 18 courses per AY); Ph.D. program; undergraduate major; Exec. Ed.; building occupied. Dates shown: 2020, 2020-2023, 2021, 2023. Team size (FTEs): 5, 40, 60, 80, 120.]
19. Why Responsible Data Science?
• A defining feature
• A partnership between STEM, social sciences and the humanities
• Where UVA has strength
20. SDS and NASA
• Course or short course, including NASA content
• Funded and collaborative research
• Faculty, Capstone, Presidential Fellowship
• Armed Forces Admits – MSDS, PhD
• Cybersecurity Joint Hire – Faculty
• Diversity partnership
• Secure facility