AI from the Perspective of a School of Data Science
Philip E. Bourne PhD
peb6a@virginia.edu
https://www.slideshare.net/pebourne
October 27, 2022 ORNL AI Workshop
My Perspective aka Biases
• AI User
• Practical Science: Long-standing computational biomedical researcher
• Open Access: Co-Founder and Founding Editor in Chief, PLOS Computational Biology
• Open Knowledge: First President of FORCE11
• Data are Value: Involved in FAIR
• Translation: First Associate Vice Chancellor for Innovation and Industrial Alliances, UCSD
• Funders as Lever: First Associate Director for Data Science, NIH – preprints, data sharing, BD2K, etc.
• Change Higher Ed: Founding Dean, School of Data Science, UVA
AI Solved What Some Have Called the Holy Grail of
Molecular Biology
https://medium.com/proteinqure/welcome-into-the-fold-bbd3f3b19fdd
1-D 3-D
Downstream Implications
Defining Success
Google’s DeepMind’s AlphaFold2 makes gigantic leap in solving
protein structures
AlphaFold2
Numerical optimization – differentiable programming
Overall gradient descent trained to win CASP
Jumper et al., 2021. Nature, 596 (7873),
pp. 583-589
Transformer models using attention
Geometry invariant to
translation/rotation
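The last point on this slide can be illustrated with a small sketch. This is not AlphaFold2's implementation; it only shows the underlying property, under the assumption that a structure is described by its internal pairwise distances. The coordinates, the rotation angle, the shift, and the helper names `rigid_transform` and `pairwise_distances` are all made up for illustration.

```python
import math
from itertools import combinations


def rigid_transform(points, angle, shift):
    """Rotate points about the z-axis by `angle` radians, then translate by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
        for x, y, z in points
    ]


def pairwise_distances(points):
    """Return the distance between every pair of points, in a fixed order."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


# Hypothetical 3-D coordinates standing in for four atoms of a chain.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.0, 1.4, 0.3), (3.1, 1.9, -0.8)]

# Move the whole "molecule": the raw coordinates change completely.
moved = rigid_transform(coords, angle=0.7, shift=(10.0, -4.0, 2.5))

# The internal geometry (pairwise distances) does not change.
before = pairwise_distances(coords)
after = pairwise_distances(moved)
unchanged = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(before, after))
print("coordinates changed:", moved != coords)
print("pairwise distances unchanged:", unchanged)
```

A model that reasons about quantities like these distances, rather than raw coordinates, gives the same answer however the molecule happens to be positioned or oriented.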
Reasons Behind the Win
● Nothing fundamentally new from an AI perspective
● Data integration
● Collaboration not competition
● Engineering challenge beyond most labs
● Compute power beyond most labs
● Team size beyond most labs
● Worked with protein structure specialists
While a victory for AI
there are implications
that require a closer
look …
https://www.dreamstime.com/
Reasons Behind the Win – Lessons for Data Science
● Nothing fundamentally new from an AI perspective
● Data integration – data science vs data engineering
● Collaboration not competition – team building is critical
● Engineering challenge beyond most labs – systems that scale up
● Compute power beyond most labs – systems that scale up
● Team size beyond most labs – human systems that scale up
● Worked with protein structure specialists – domains rule
Implications for Science
The fourth paradigm changes how we think of the
long-standing need to perform an act of
reductionism.
In the context of protein structure prediction, I
refer to this as the Curse of the Ribbon.
Let me explain…
[Forthcoming: Bourne, Draizen, Mura, PLOS Biology 2022]
Protein Fold Space – Human Reductionism
There are ~20^300 possible proteins
>>>> all the atoms in the Universe
~189M protein sequences from
292K organisms (source UniProt)
Classified into ~1500 folds (source SCOP)
https://doi.org/10.1073/pnas.2628030100
It has become apparent that fold space is more continuous
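To put the scale on this slide in numbers, here is a rough sketch in Python. It assumes the slide's estimate means 20 amino acids at each of roughly 300 positions (20^300), and it uses the commonly quoted order-of-magnitude figure of about 10^80 atoms in the observable universe. Both constants are assumptions for illustration, not values taken from the talk. The sequence and fold counts are the ones quoted above.

```python
import math

AMINO_ACIDS = 20               # standard amino-acid alphabet
CHAIN_LENGTH = 300             # assumed typical protein length, in residues
LOG10_ATOMS_IN_UNIVERSE = 80   # assumption: commonly quoted order of magnitude

KNOWN_SEQUENCES = 189_000_000  # ~189M sequences (UniProt figure from the slide)
KNOWN_FOLDS = 1500             # ~1500 folds (SCOP figure from the slide)

# Work in log10 so the numbers stay readable.
log10_sequences = CHAIN_LENGTH * math.log10(AMINO_ACIDS)
log10_excess = log10_sequences - LOG10_ATOMS_IN_UNIVERSE

# How many known sequences land in each human-defined fold, on average.
sequences_per_fold = KNOWN_SEQUENCES / KNOWN_FOLDS

print(f"possible sequences: about 10^{log10_sequences:.0f}")
print(f"that exceeds the atom count by a factor of about 10^{log10_excess:.0f}")
print(f"known sequences per fold: about {sequences_per_fold:,.0f}")
```

Under these assumptions the possible sequence space is near 10^390, and the classification compresses on the order of a hundred thousand known sequences into each fold, which is the reduction the slide is pointing at.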
Curse of the Ribbon
[From Cam Mura]
The human desire to bin, classify, reduce and simplify how we view
data (the ribbon diagram), while useful, masks (we argue) aspects of the data
that algorithms can see
Downstream Implications for Data Science
• Cooperation rather than competition
• Public-private partnership
• Translational possibilities are endless
• Made possible by curated open data
• Appreciate engineering
As one example of the future success of AI, how
does a school of data science think of itself?
It starts with a simple foundational model of
data science that all in the school agree upon
The 4+1 Model of Data Science
• Value – assuring societal
benefit
• Design – communication
of the value of data
• Systems – the means to
communicate and
convey benefit
• Analytics – models and
methods
• Practice – where
everything happens
From: Alvarado & Bourne, in AI for Science, Eds. Choudhary, Fox & Hey, 2023
The Data Science Interplay
• Value + Design = Openness,
responsibility
• Value + Analytics = Human
centered AI, algorithmic bias
• Value + Systems =
sustainability, access,
environmental impact
• Design + Analytics = literate
programming, visualization
• Design + Systems =
dashboards, engineering
design
• Analytics + Systems = ML
engineering
[From Raf Alvarado]
Thinking of data as a science unto itself is novel and controversial
We see AI in a more holistic framework…
Databases
organize data
around a project.
Data warehouses
organize the data
for an organization
Data commons
organize the data
for a scientific
discipline or field
Data
Warehouse
Data Ecosystems
Example – We Consider the
Evolving Systems that Support
AI
Challenges
Fixed level of funding
Opportunities
data commons
Data commons co-locate data
with cloud computing
infrastructure and commonly
used software services, tools &
apps for managing, analyzing and
sharing data to create an
interoperable resource for the
research community.*
*Robert L. Grossman, Allison Heath, Mark Murphy, Maria Patterson and Walt Wells, A Case for Data Commons: Toward Data Science as a Service,
Computing in Science & Engineering, 2016. Source of image: The CDIS, GDC, & OCC data commons infrastructure at a University of Chicago data center.
Bonazzi VR, Bourne PE (2017) Should biomedical research be like Airbnb? PLoS Biol 15(4): e2001818.
Systems
[Adapted from Bob Grossman]
But wait, the picture is more complicated…
A Data Science Poster Child
Researcher and Assistant Professor of
Medicine Dr. Thomas Hartka, also a
current online Master’s in Data Science
student, is combining two disparate
data sets—electronic health records
and DMV crash data—to save lives
after motor vehicle crashes.
“I enrolled in the MSDS program to
expand my research on automotive
safety. I have already used
techniques from classes in my work.
I hope to expand my research to
real-time analytics to improve
emergency room care.”
— Dr. Thomas Hartka, UVA School
of Medicine
Questions?
Back Pocket
Research ethics
committees (RECs) review
the ethical acceptability
of research involving
human participants.
Historically, the principal
emphases of RECs have
been to protect
participants from physical
harms and to provide
assurance as to
participants’ interests and
welfare.*
[The Framework] is
guided by, Article 27
of the 1948 Universal
Declaration of Human
Rights. Article 27
guarantees the rights
of every individual in
the world "to share in
scientific
advancement and its
benefits" (including to
freely engage in
responsible scientific
inquiry)…*
Protect human
subject data
The right of human
subjects to benefit
from research.
*GA4GH Framework for Responsible Sharing of Genomic and Health-Related Data, see goo.gl/CTavQR
Data sharing with protections provides the evidence
so patients can benefit from advances in research.
Balance protecting human subject data
with open research that benefits
patients
[Adapted from Bob Grossman]
Value
Why Responsible Data Science?
• A defining feature
• A partnership between STEM, social
sciences and the humanities
• Where UVA has strength
https://en.wikipedia.org/wiki/Jim_Gray_(computer_scientist)
https://www.microsoft.com/en-us/research/wp-content/uploads/2009/10/Fourth_Paradigm.pdf
https://twitter.com/aip_publishing/status/856825353645559808
Yet another way of thinking
about it – the fifth paradigm..
Daily Challenges
• Deciding what not to do
• Competition for the best team members (faculty and staff)
• Establishing a diverse team
• Lack of a comprehensive enterprise-wide data infrastructure
• It’s easier to conform
During my 5-year interview as dean I was asked,
“Will we need a school of data science in 10 years?
Won’t it be ubiquitous throughout the university?”
My response:
“Will we need a university in ten years? Won’t it be
one big school of data science?”
https://pebourne.wordpress.com/2022/06/29/deans-blog-data-science-ten-years-from-now/
Questions I Leave You With ….
• Have I overstated the case for data science?
• Are we currently doing the best by our students?
• Are the models we propose the right ones?
• Where do we go from here?
Punchline – in 45+ Years in Academia I Have
Never Seen Anything Like It
• It is a response to the digital transformation of
society
• It is touching every discipline (aka vertical)
• We can’t keep the students out of our classes
• Cause – large amounts of digital data
• Effect – interdisciplinarity, openness, translation,
search for responsibility and more
In summary, it is disruptive and higher ed had better pay attention


Editor's notes

  1. I will introduce the concept of data science with a story that illustrates citizen engagement, the merging of unexpected data, and societal benefit.