Creating and Sustaining a FAIR
Biomedical Data Ecosystem
Susan Gregurick, Ph.D.
Associate Director for Data Science and
Director, Office of Data Science Strategy
October 9, 2020
Making Data FAIR
Findable: data must have unique identifiers, effectively labeling it within searchable resources.
Accessible: data must be easily retrievable via open systems and effective and secure authentication and authorization procedures.
Interoperable: data should “use and speak the same language” via use of standardized vocabularies.
Reusable: data must be adequately described to a new user, have clear information about data-usage licenses, and have a traceable “owner’s manual,” or provenance.
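The four properties can be made concrete as a minimal metadata check. The sketch below is illustrative only: the `DatasetRecord` field names, the example identifier, URL, and vocabulary term are assumptions made for this example, not an NIH schema.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical dataset metadata record (field names are assumptions)."""
    identifier: str = ""          # unique identifier, e.g. a DOI
    access_url: str = ""          # where the data can be retrieved
    vocabulary_terms: list = field(default_factory=list)  # standardized terms
    license: str = ""             # data-usage license
    provenance: str = ""          # how the data was produced


def fair_gaps(record):
    """Return the FAIR properties for which the record lacks basic metadata."""
    checks = {
        "Findable": bool(record.identifier),
        "Accessible": bool(record.access_url),
        "Interoperable": bool(record.vocabulary_terms),
        "Reusable": bool(record.license) and bool(record.provenance),
    }
    return [name for name, ok in checks.items() if not ok]


# Made-up example record: it has a license but no provenance statement.
example = DatasetRecord(
    identifier="doi:10.0000/example",
    access_url="https://example.org/data/1",
    vocabulary_terms=["EXAMPLE:0001"],
    license="CC-BY-4.0",
)
print(fair_gaps(example))  # → ['Reusable']
```

A check like this only confirms that metadata is present; whether the metadata is accurate still needs human review.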
Is this what
FAIR data
looks like…
Or is this FAIR Data…
NIH supports many different biomedical
research communities with diverse sets
of data
The Rime of the Ancient Mariner,
Samuel Taylor Coleridge
(excerpted)
Day after day, day after day,
We stuck, nor breath nor motion;
As idle as a painted ship
Upon a painted ocean.
Water, water, every where,
And all the boards did shrink;
Water, water, every where,
Nor any drop to drink.
This proliferation of data, and the
accompanying computing resources
and new algorithms, brings new
opportunities for discovery, as well as
new challenges
Journal articles could link
to repository data sets
Metadata were computable so that a
search for similar datasets was possible
Analysis tools were linked to datasets,
via Github, Bioconductor, Galaxy or
other….
NIDDK
The mission of the National Institute of Diabetes and Digestive and Kidney
Diseases (NIDDK) is to conduct and support research on diabetes and other
endocrine and metabolic diseases; digestive diseases, nutritional disorders,
and obesity; and kidney, urologic, and hematologic diseases, to improve health
and quality of life.
NIDDK supports research
studies across a wide variety of
disease areas and in turn
supports a variety of platforms
to house and manage the data
they each generate.
These studies utilize a spectrum
of modern experimental
techniques, generating different
modalities of data about the
patient and their disease state.
Collecting, integrating, and
working with all this data
presents a variety of
challenges.
Challenges
New consortia would like to
share or reuse existing data
platforms rather than having
to create them from scratch
Integrating data from the
same patient across different
studies currently requires
significant manual effort
Supporting analysis and
visualization tools for
imaging data being produced
by various projects
Image from https://www.niddk.nih.gov/health-information/kidney-disease
Integration of GUDMAP expression data with GTEx eQTLs
Core Motivations
● GUDMAP contains gene
expression data across
various parts of the kidney and
urogenital system.
● The GTEx database contains
expression QTL (eQTL) data
correlating gene expression
with specific genomic variants
● Integrating GUDMAP data
with GTEx may lead to
insights into gene regulation in
kidney development and renal
disease
Potential Data Sources
● NIDDK GUDMAP
Genitourinary data
repository
● Common Fund GTEx gene
expression database
Research Scientist
Icon made by Roundicons from www.flaticon.com
As a renal disease
researcher, I want to
combine gene data from
GUDMAP with eQTL data
from the Common Fund
GTEx resource in order to
investigate variants
involved in regulating
renal gene expression
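The integration described in this use case amounts to a join on a shared gene identifier. The sketch below assumes both resources can be exported as simple rows keyed by a common `gene_id`; the row layout, field names, and placeholder values are hypothetical and do not reflect the actual GUDMAP or GTEx formats.

```python
from collections import defaultdict


def join_expression_with_eqtls(expression_rows, eqtl_rows):
    """Attach the eQTL variants known for each gene to its expression rows."""
    eqtls_by_gene = defaultdict(list)
    for row in eqtl_rows:
        eqtls_by_gene[row["gene_id"]].append(row["variant_id"])

    joined = []
    for row in expression_rows:
        # Genes with no eQTL entry get an empty list rather than being dropped.
        variants = eqtls_by_gene.get(row["gene_id"], [])
        joined.append({**row, "eqtl_variants": variants})
    return joined


# Placeholder rows standing in for exports from the two resources.
expression = [
    {"gene_id": "GENE_A", "tissue": "kidney", "expression": 12.5},
    {"gene_id": "GENE_B", "tissue": "kidney", "expression": 3.1},
]
eqtls = [
    {"gene_id": "GENE_A", "variant_id": "variant_1"},
    {"gene_id": "GENE_A", "variant_id": "variant_2"},
]
result = join_expression_with_eqtls(expression, eqtls)
print(result[0]["eqtl_variants"])  # → ['variant_1', 'variant_2']
```

In practice the hard part is agreeing on the shared identifier and vocabulary, which is the interoperability problem the FAIR principles target.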
Data integration within the TEDDY T1D platform
Core Motivations
● Data submitted to TEDDY at
different times and locations
are independent data releases
with different subject
identifiers per release
● The same subject will likely
have data spread across
multiple data releases
● Recombining this data is a
very manual process; an
integrated data environment
would simplify it
significantly
Potential Data Sources
● Genomics
● Epigenomics
● Transcriptomics
● Proteomics
● Metabolomics
Research Scientist
Icon made by Roundicons from www.flaticon.com
As a T1 diabetes
researcher, I want to
combine data across
TEDDY releases in order
to bring together all the
different modalities of
data collected from the
same subject
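Recombining releases that use different subject identifiers can be sketched as a lookup through a crosswalk table. The example below assumes such a crosswalk exists, mapping each (release, local subject ID) pair to one stable ID; the record layout and all names are hypothetical and not taken from the TEDDY platform.

```python
from collections import defaultdict


def merge_releases(releases, crosswalk):
    """Group per-release records under one stable subject ID.

    releases:  {release_name: [{"subject_id", "modality", "value"}, ...]}
    crosswalk: {(release_name, subject_id): stable_id}
    Returns (merged, unmapped), where unmapped lists keys with no crosswalk entry.
    """
    merged = defaultdict(dict)
    unmapped = []
    for release_name, records in releases.items():
        for rec in records:
            key = (release_name, rec["subject_id"])
            stable_id = crosswalk.get(key)
            if stable_id is None:
                unmapped.append(key)  # surface gaps instead of guessing
                continue
            merged[stable_id][rec["modality"]] = rec["value"]
    return dict(merged), unmapped


releases = {
    "release_1": [{"subject_id": "A17", "modality": "genomics", "value": "g.dat"}],
    "release_2": [
        {"subject_id": "Z03", "modality": "proteomics", "value": "p.dat"},
        {"subject_id": "Z99", "modality": "metabolomics", "value": "m.dat"},
    ],
}
crosswalk = {("release_1", "A17"): "subject-001", ("release_2", "Z03"): "subject-001"}
merged, unmapped = merge_releases(releases, crosswalk)
print(merged)    # → {'subject-001': {'genomics': 'g.dat', 'proteomics': 'p.dat'}}
print(unmapped)  # → [('release_2', 'Z99')]
```

Reporting unmapped records separately keeps the manual effort focused on the cases that cannot be resolved automatically.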
This is the promise
of the NIH
Strategic Plan for
Data Science
…and here’s how we will get there.
[Figure: Percentage of NIH-supported PMC publications with a data availability statement, by year, 2010 to 2019]
NIH Data Management and Sharing
Policy Development
• Researchers with NIH-funded or conducted research projects resulting in
the generation of scientific data will be required to submit a Plan
• Plans should explain how scientific data generated by a research study
will be managed and which of these scientific data will be shared
Community
Input
Solicited
• 189
submissions
from national
and
international
stakeholders
Identified
need for
appropriate
infrastructure
• policy and
implementation
to go
‘hand-in-hand’
Develop
draft policy
for data
management
and sharing
and related
guidance
Released
draft for
community
input
Release final
policy (2020)
Options of scaled implementation for sharing datasets
• PMC stores publication-related
supplemental materials and datasets
directly associated with publications. Up
to 2 GB.
• Generate Unique Identifiers for the
stored supplementary materials and
datasets.
Use of commercial and non-profit
repositories STRIDES Cloud Partners
• Store and manage large scale, high
priority NIH datasets. (Partnership with
STRIDES)
• Assign Unique Identifiers, implement
authentication, authorization and
access control.
Datasets up to 2 gigabytes Datasets up to 20*gigabytes High Priority Datasets petabytes
PubMed Central
• Assign Unique Identifiers to
datasets associated with
publications and link to PubMed.
• Store and manage datasets
associated with publication, up to
20* GB.
NIH strongly encourages
open access Data Sharing Repositories
as a first choice.
https://www.nlm.nih.gov/NIHbmic/nih_data_sharing_repositories.html
NIH supports many repositories for
biomedical data sharing
AphasiaBank
How to Find Data Repositories
• BMIC Data Repository Listing
https://www.nlm.nih.gov/NIHbmic/nih_data_sharing_repositories.html
• SciCrunch/dkNET
• Organized by repository type and scientific
area.
https://dknet.org/about/Suggested-data-repositories
• FAIRsharing
https://fairsharing.org/
• DataMed
https://datamed.org/
Optimized Funding for NIH Data
Repositories and Knowledgebases
• Data resources are important
research tools
• Historically funded through
research grants
• Funding mechanism should
be optimal for type of
resource
• End goal: researcher
confident in data and
information integrity
• Solution: New Funding
Announcement for data
repositories and knowledgebases
• Resource plan requirement
• Scientific Impact
• Community Engagement
• Quality of Data and Services and Efficiency of Operations
• Governance
Funding Opportunities
• NIH released two funding
opportunities on Jan. 17 to
support biomedical data
repositories and
knowledgebases:
• Biomedical Data Repository
(PAR-20-089)
• Biomedical Knowledgebase
(PAR-20-097)
Piloting a FAIR Generalist
Repository Using Figshare
https://nih.figshare.com
Existing Figshare features Pilot-specific features
Repository contains data funded by 21
different NIH ICOs
NCATS
NCCIH
• Generalist repositories are growing – more researchers are
depositing data and more publications are linking to generalist
repositories.
• Researchers need more education and guidance – where to
publish data and how to describe datasets in metadata fields
effectively.
• Metadata enhancement enables greater discoverability –
metrics indicate greater access but need longer time scale to
observe data reuse.
NIH Figshare Pilot – Key Takeaways
Guiding researchers on better metadata to
enhance data discoverability
Graphic credit: Ontotext
Harnessing the
power of the cloud
NIH is Harnessing the Power of the
Cloud for Biomedical Research
• Cloud computing offers multiple opportunities NIH can
leverage to advance biomedical research, including:
• Computation on biomedical data at an unprecedented scale
• Broad access to cutting-edge cloud technology with, for example,
industry-leading security tools
• Storage of large, diverse data in a way that enables easier sharing,
access, and reuse of data with other researchers
• A community-driven approach to data science that breaks down
disciplinary silos
• Adopt and develop cloud-based tools from industry or academia for
biomedical research
Turning Research Data Into
Knowledge and Discovery
The Science and Technology Research Infrastructure for
Discovery, Experimentation, and Sustainability (STRIDES)
Initiative
• State-of-the-art data storage and computational capabilities
• Training and education for researchers
• Innovative technologies such as artificial intelligence and
machine learning
Partnerships with and other commercial providers
STRIDES by the numbers*
• 17 NIH ICs
• 37 extramural institutions
• 279 programs/projects
• >2400 people trained
• $9M cost savings to participating ICs
• $51.5M obligated by NIH / $18M expended to date
• 30M compute hours
• 80 petabytes stored
*as of 8/31
Moving Data to the Cloud for Large-Scale
Analysis
36.4 PB of public and
controlled-access
Sequence Read Archive
data in two clouds
(GCP & AWS)
"We can now do this in 3-4 days instead of 12+
months directly as a result of the SRA data being
available in the cloud. This means we can share
this data with the CoV researchers today, when it
can make a difference, not a year from now. This is
important for COVID-19 now, and will be
important in response to the next pandemic."
– Artem Babaian, Lead Developer at Serratus and corresponding author for publication,
“Petabase-scale sequence alignment catalyses viral discovery”
https://www.biorxiv.org/content/10.1101/2020.08.07.241729v1.full
Benefits of the Cloud for Large-Scale
Analysis
Enhancing Software Tools for Open
Science
Supplements to Enhance Software
Tools for Open Science
• New collaborations between biomedical researchers and
software engineers
Enhance software engineering of valuable
scientific tools
• Working with STRIDES Initiative is encouraged but not
required
Make research tools “cloud-ready”
NOT-OD-20-073
Topics Funded Across 12 Institutes
and Centers
FHIR Clinical Cloud Commons
Biomolecular
Simulation
Biophysics
Genomics Imaging Neuroimaging
Advancing
Artificial
Intelligence (AI)
EMRs/EHRs
Extract medical information from
text in EMRs/EHRs
Interpret genomic sequence
data to understand impact of
mutations on protein function
Read medical images
and help diagnose
diseases like
pneumonia and cancer
Monitor sleep and
vitals to send
information about
health at home to
doctors
Determine which calls to child
welfare systems warrant
deployment of family support and
prevention resources to protect at-
risk children
Examples from Katabi, Ng, Putnam-Hornstein, Troyanskaya, and others
AI in Biomedicine: Opportunities
NIH NVIDIA COVID CT-AI Classification
Preprocessing: DICOMs converted to NIfTI with 1x1x1 resampling
Segmentation: AH-Net architecture produces a lung segmentation mask
Apply mask
Image classification: 3D DenseNet-121
Classification: likely COVID vs. unlikely COVID
Baris Turkbey, Sheng Xu, Tom Sanford, Stephanie Harmon, Mona Flores, Daguang Xu, Xiasong Wang, Ziyue Xu, Holger Roth, Dong Yang, Evrim Turkbey, Mike Kassin, Maxime Blain, Brad Wood
CT images have been
used in Asia to detect
COVID-19 virus in
patients
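The "1x1x1 resampling" preprocessing step can be illustrated with simple arithmetic. The sketch below assumes that 1x1x1 means 1 mm isotropic voxels (the slide does not state units) and only computes the size of the resampled grid; the function name and the example scan dimensions are made up, and the interpolation itself is out of scope here.

```python
def isotropic_shape(shape, spacing_mm, target_mm=1.0):
    """Voxel grid size after resampling to target_mm spacing on every axis.

    Each axis covers n * spacing millimetres, so at target_mm per voxel it
    needs about n * spacing / target_mm voxels.
    """
    return tuple(
        max(1, round(n * spacing / target_mm))
        for n, spacing in zip(shape, spacing_mm)
    )


# Hypothetical CT volume: 512 x 512 pixels in-plane at 0.7 mm, 120 slices at 2.5 mm.
print(isotropic_shape((512, 512, 120), (0.7, 0.7, 2.5)))  # → (358, 358, 300)
```

Resampling every scan to the same spacing gives downstream segmentation and classification networks inputs on a common physical scale.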
New Common Fund Initiative:
Artificial Intelligence for BiomedicaL
Excellence (AIBLE)
May 15, 2020 - NIH Council of Councils
• https://dpcpsi.nih.gov/council/may-15-2020-agenda
• AI Concept Clearance (start at 1:25min)
https://videocast.nih.gov/watch=36031
• NIH Artificial Intelligence Working Group Final Report
Data: collection, analysis, reuse
People: attract, train, convene
Ethics (ELSI): accountable, informed, representative
R1: flagship data generation efforts
R2: criteria for ML-friendly datasets
R3: “datasheets” and “model cards”
R4: consent and data access standards
R5: ethical principles for ML in biomedicine
R6: curricula for ML-BioMed experts
R7: ML-focused trainees and fellows
R8: convene cross-disciplinary collaborators
Recommendations
Support flagship efforts that generate large-scale
experimental data, with billions of data points
designed to:
i. be well-suited for ML analysis and inference
ii. address key biomedical challenges
iii. stimulate new approaches in machine learning
And that implement processes designed to:
i. develop improved criteria and technical
mechanisms for data access
ii. strengthen ethical criteria for dataset use
(consent, privacy, accountability, ...)
Support flagship data generation efforts to propel progress
by the scientific community.
Projects should:
▪ address key biomedical
challenges using ML methods
▪ advance ML methods for future
use in biomedicine
▪ produce transformative data
sets, designed with ML in mind
▪ propel new ways to gather
massive data in biomedicine
▪ involve strong engagement
from leading ML
researchers
Project review should:
▪ incorporate expertise in ML as
well as traditional biomedical
domains
Publish criteria for evaluating datasets based on their value for ML-based analysis.
▪ what makes a dataset most useful for ML-based analysis?
▪ what attributes are and aren’t addressed by existing datasets?
▪ start as guidelines; within two years recommend a subset as requirements
Develop and publish criteria for ML-friendly datasets.
Examples of potential criteria:
▪ clear provenance: as much metadata as possible, to detect & correct for batch effects
▪ well-described data: what does each variable mean? what’s the distribution of
values?
▪ accessible data: flexible data access policy, reasonable data access process
▪ large sample size: to allow training (and evaluation) without overfitting
▪ multimodal data: to study complex systems from multiple perspectives
▪ perturbation data: includes outcomes (“outputs”) as well as measurements (“inputs”)
▪ longitudinal data: to allow modeling and prediction of progression
▪ active learning: data grows over time, incorporates new data-gathering techniques,
and uses ML-based analysis of existing data to inform future data generation
Design and apply “datasheets” and “model cards” for
biomedical ML.
Potential datasheet best practices:
• demographics and UBR
characteristics
• privacy, consent, and copyright
issues
• known blind spots, which could
otherwise create hidden biases
Potential model card best practices:
• what training data was used
• how training and validation were
done
• known limitations on applicability
• intended use, and potential
harms of inappropriate use
• Develop and publish best practices for:
• “datasheets” that describe & evaluate training
datasets
• “model cards” that do the same for generated
models
• Test the best practices in the real world:
• build after-the-fact examples for existing datasets
• apply to new datasets, and update the best
practices
• Once best practices have been updated:
• require datasheets and model cards for all NIH
extramural grant applications and NIH intramural
projects that involve ML research
• encourage journals to do the same for paper
submission and publication
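The model card practices listed above can be sketched as a structured record with a completeness check. The field names below paraphrase the slide's bullet points; the `ModelCard` class and the example values are assumptions for illustration, not a published NIH template.

```python
from dataclasses import dataclass, fields


@dataclass
class ModelCard:
    """Hypothetical model card covering the practices listed on the slide."""
    training_data: str = ""            # what training data was used
    training_and_validation: str = ""  # how training and validation were done
    known_limitations: str = ""        # known limitations on applicability
    intended_use: str = ""             # intended use
    potential_harms: str = ""          # potential harms of inappropriate use


def missing_sections(card):
    """Names of sections that are still empty or whitespace-only."""
    return [f.name for f in fields(card) if not getattr(card, f.name).strip()]


draft = ModelCard(
    training_data="Example imaging cohort (placeholder description)",
    intended_use="Research use only (placeholder)",
)
print(missing_sections(draft))
# → ['training_and_validation', 'known_limitations', 'potential_harms']
```

A check like this could run at submission time, so that an incomplete card is flagged before review rather than after.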
Draft initiative map (5/15/2020), FY21 to FY26:
1. Data design centers
2. Tools
3. Data enhancement supplements to existing awards
4. Gold data: cohort 1 and cohort 2
5. Assess
New Partnerships
in Data Science
and AI
Smart and Connected Health (SCH)
Accelerate innovations
in computer and
information science
and engineering to
support the
transformation of
health and medicine
Smart Health & Data Science Research Areas
Information Infrastructure
• Tools for interoperable, distributed, federated, & scalable digital infrastructure
• Novel ontological systems and knowledge representation approaches
• Methods for data integrity, provenance, security, privacy and reliability
Transformative Data Science
• Computational tools for fusion and analysis of multi-level and -scale data
• Knowledge representations, visualizations and reasoning algorithms
• Approaches for combining AI learning with mechanistic modeling
• Unstructured data interpretation
Novel Multimodal Sensor System Hardware
• Design & fabrication of novel multimodal sensor systems
• Synthesis of new biorecognition elements
Effective Usability
• New approaches to support individuals to effectively participate in their own health
• User-tailored and context-aware interfaces to reduce burden and increase autonomy
• Develop new methods for context-dependent selection, presentation and use of data
Automating Health
• Closed-loop or human-in-the-loop systems
• Technology platforms for optimizing delivery of health interventions
• Simulation and modeling methods and software tools
Medical Data Interpretation
• Modeling on-visual context information and perception of complex images.
• Methods to exploit experts’ implicit knowledge to improve perceptual decision making
• Develop models of how experts respond to changes in cognitive factors
Let’s create a bright future
Coding it Forward
• Student-led non-profit places tech-
savvy students in federal agencies
• 16 students for summer placed in
admin or funding offices across 11
host institutes, centers, offices
(ICOs) for 10-week summer
program
• 2 students extended until the start of
school, 1 hired as contractor
• 24 students will start a fall fellowship
across 14 host ICOs
https://www.codingitforward.com/
NIH Data and Technology Advancement
(DATA) National Service Scholar Program
https://datascience.nih.gov/data-scholars
8 Scholars will…
Catalyze neuroscience research
Unravel the Alzheimer’s Disease Genome
Support cancer knowledge extraction
Accelerate the clinical adoption of machine intelligence applications in
medical imaging
Harness data science for health discovery and innovation in Africa
Expand theories of brain circuits
Integrate NIH cloud-based platforms for genomics research
Architect search across petabyte-scale data
…in 2021
Strategic Plan for Data Science:
Goals and Objectives
Data Infrastructure
• Optimize data storage and security
• Connect NIH data systems
Modernized Data Ecosystem
• Modernize data repository ecosystems
• Support storage and sharing of individual datasets
• Better integrate clinical and observational data into biomedical data science
Data Management, Analytics, and Tools
• Support useful, generalizable, and accessible tools
• Broaden utility of, and access to, specialized tools
• Improve discovery and cataloging resources
Workforce Development
• Enhance the NIH data science workforce
• Expand the national research workforce
• Engage a broader community
Stewardship and Sustainability
• Develop policies for a FAIR data ecosystem
• Enhance stewardship
https://datascience.nih.gov
Data Science to
Address COVID-19
We’re putting COVID-19 data into
repositories and platforms so the data
will be USED by researchers!
What could researchers do with these
data?
Better understand
transmission and
infectivity
Evaluate Treatments
& Interventions
Predict Long-term
Sequelae
Link Social
Determinants of
Health with COVID-19
related data and
exposures
Examine the impact
on Child & Maternal
Health
Resolve Technical &
Implementation
issues
COVID Clinical Platforms
• Increasing the amount and quality of EHR data related
to individuals with COVID-19
• Pilot a new enrollment partner model to efficiently target recruitment in
expanded regions of the country and collect EHR data from proven partners
• Rapidly collect EHR-derived clinical, lab, and imaging data from hospitals and
health plans at the peak of the pandemic and as it evolves
• Develop a robust, flexible collaborative analytics infrastructure to enable a
high frequency response to COVID-19 and the next emerging threats
• Include data from underserved populations, roughly 9.3M unique patients
• PETAL’s ORCHID Trial & PETAL’s CORAL registry
o RED CORAL: observational study of retrospective review of data collected on
hospitalized patients with COVID-19
o BLUE CORAL, a multicenter prospective observational study designed to
collect comprehensive data on hospitalized patients with COVID-19. This
study will gather imaging, biospecimens, and long-term outcomes.
Honest
Broker
POLICY RESOURCES
WORKBENCHES / TOOLS
Federated
Data Platforms
INFORMATION SYSTEMS
DATA DISCOVERY
API API API API
TBD*
TBD*
TBD*
TBD* MIDRC, RADx, NICHD, NIA, etc.
Research Authentication
System
Hash
Diagram Elements
CDE Standards for Interoperability
Data Discovery across Platforms
examples include GA4GH FASP,
PIC-SURE
Research Authentication
System
Interoperable Elements
Data Linkage Across Systems
Honest Broker
FHIR to map and move data
Interoperability Across Clinical COVID
Serving Data Platforms
Researcher Workflows Before Researcher Authentication Services (RAS)
Platform 1
Cloud-based
Analysis Tool
LOGIN (5)
SEARCH/
SELECT
ACCESS
COMPUTE SHARE
SEARCH/
SELECT
ACCESS
Platform 2
Researchers login and/or give consent at least 5 times for each workflow in the Phase 1 interoperability use
cases
AuthN (ID Token): who you are
AuthZ (Passport and Visa): which dbGaP studies/consent groups you are authorized to access, and your role
LOGIN (1), SEARCH/SELECT, ACCESS, COMPUTE, SHARE
Before provisioning data, the platform validates the
passport/visa by calling RAS, so access information is
always up to date within the last 30 minutes
Researcher Workflows After RAS August Deploy
Authentication and Authorization provided by a central NIH service. Auth tokens move with the user as
they navigate to any of the four Phase 1 Data Platforms so that the researcher only logs in one time to RAS
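One possible reading of the 30-minute freshness guarantee is that a platform re-validates a passport with the central service whenever its last validation is 30 minutes old or more. The sketch below illustrates that reading only: `PassportCache` and the `validate` callable are hypothetical stand-ins and do not represent the actual RAS API.

```python
import time

MAX_AGE_SECONDS = 30 * 60  # access information no older than 30 minutes


class PassportCache:
    """Re-validate a passport once its cached validation is too old."""

    def __init__(self, validate, clock=time.time):
        # validate: hypothetical callable, passport -> list of authorized studies
        self._validate = validate
        self._clock = clock
        self._entries = {}  # passport -> (validated_at, studies)

    def authorized_studies(self, passport):
        now = self._clock()
        entry = self._entries.get(passport)
        if entry is None or now - entry[0] >= MAX_AGE_SECONDS:
            entry = (now, self._validate(passport))  # call the central service
            self._entries[passport] = entry
        return entry[1]


# Usage with a fake validator and a controllable clock.
calls = []
fake_now = [0.0]


def fake_validate(passport):
    calls.append(passport)
    return ["study-A"]


cache = PassportCache(fake_validate, clock=lambda: fake_now[0])
cache.authorized_studies("passport-1")   # first use: validates
fake_now[0] = 29 * 60
cache.authorized_studies("passport-1")   # still fresh: no new call
fake_now[0] = 30 * 60
cache.authorized_studies("passport-1")   # 30 minutes old: validates again
print(len(calls))  # → 2
```

Because the login and the authorization check are separated, the researcher signs in once while each platform still decides access from recently validated information.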
Privacy-Preserving Tokens
N3C
Sites
N3C
Sites
Output de-id
tokens
Patient 123
Tokenize
NIH Clinical
Studies
Senior
Living EHR
Tokenize
Tokenize
Output de-id
tokens
Patient 456
Output de-id
tokens
Patient 789
John Smith
03/27/1945
Male
John Smith
• Admitted to N3C Hospital
• Participates in Clinical
Studies
• Lives in a Senior Living
Facility
N3C Linkage Honest
Broker
Patient 123
Patient 456
Patient 789
De-identified ‘Rosetta
Stone’ process that unifies
records
007
Match &
De-duplicate
Patient Care Tokenization
De-Duplication and
Linkages
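The tokenize-then-match flow in this diagram can be sketched with a keyed hash. The slides do not describe the actual tokenization method, so the HMAC-SHA256 approach, the normalization rule, the function names, and the demo key below are all assumptions chosen for illustration.

```python
import hashlib
import hmac
from collections import defaultdict


def tokenize(name, birth_date, sex, secret_key):
    """Derive a de-identified token from normalized demographics."""
    normalized = "|".join(part.strip().lower() for part in (name, birth_date, sex))
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def link_records(site_records):
    """Group (site, local_id, token) tuples by token, as an honest broker might."""
    linked = defaultdict(list)
    for site, local_id, token in site_records:
        linked[token].append((site, local_id))
    return dict(linked)


KEY = b"demo-key-not-for-production"  # hypothetical shared secret

# The same person appears at three sites under different local IDs.
token = tokenize("John Smith", "03/27/1945", "Male", KEY)
records = [
    ("N3C site", "Patient 123", token),
    ("NIH clinical study", "Patient 456", tokenize(" john smith", "03/27/1945", "male", KEY)),
    ("Senior living EHR", "Patient 789", tokenize("John Smith", "03/27/1945", "Male", KEY)),
]
linked = link_records(records)
print(len(linked))         # → 1
print(len(linked[token]))  # → 3
```

The broker only ever sees tokens and local IDs, so records can be matched and de-duplicated without exchanging names or birth dates. Exact-match hashing like this is brittle to typos in the source demographics, which is one reason real systems tend to use more elaborate schemes.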
a modernized,
integrated, FAIR
biomedical data
ecosystem
VISION
NIH staff who deserve all the credit
• STRIDES: Andrea Norris, Nick Weber and NMDS team, and Fenglou Mao
• Connecting NIH Data Resources: Regina Bures, Ishwar Chandramouliswaran, Tanja Davidsen, Valentina Di
Francesco, Jeff Erickson, Tram Huyen, Rebecca Rosen, Steve Sherry, Alastair Thomson, Greg Farber, Dylan
Klomparens, Charles Schmitt, Susan Wright, Ken Wiley, Kristofor Langlais, James Coulombe, Lora Kutkat, Nick
Weber, Allen Dearry
• Data Repository and Knowledgebase Resources: Kim Pruitt, Valerie Florance, Valentina Di Francesco, Ajay
Pillai, Qi Duan, Dawei Lin, Christine Colvis, Jennie Larkin, Ravi Ravichandran, and James Coulombe
• FHIR Pilots: Teresa Zayas-Caban, Denise Warzel, Kerry Goetz, Ken Wiley, Alison Cernick, Kenneth Wilkins,
Carolina Mendoza-Puccini, Matt McAuliffe, and Belinda Seto
• Criteria for Open Access Data Sharing Repositories: Mike Huerta, Dawei Lin, Maryam Zaringhalam, Lisa
Federer and BMIC Team
• Pilot for Scaled Implementation for Sharing Datasets: Ishwar Chandramouliswaran, Lisa Federer, Maryam
Zaringhalam, and Jennie Larkin
• Software Sustainability: Heidi Sofia, Ishwar Chandramouliswaran, Mike Conway, Tony Kirilusha, Xujing Wang,
Andrew Weitz, Todd Merchak, Allissa Dillman and Jess Mazerik
• Smart and Connected Health: Haluk Resat, Dana Wolff-Hughes, Partha Bhattacharyya, Fenglou Mao
• Coding-it-Forward Fellows Summer Program & DATA Scholars Program: Jess Mazerik, Wynn Meyer
Office of Data
Science Strategy
www.datascience.nih.gov
A modernized, integrated, FAIR
biomedical data ecosystem
@NIHDataScience /NIH.DataScience datascience@nih.gov

Ratan "Are we there yet?  Keeping the promise of open science"Ratan "Are we there yet?  Keeping the promise of open science"
Ratan "Are we there yet? Keeping the promise of open science"
 
Open FAIR Data and Open Science: Developing Partnerships, Strategies, Policie...
Open FAIR Data and Open Science: Developing Partnerships, Strategies, Policie...Open FAIR Data and Open Science: Developing Partnerships, Strategies, Policie...
Open FAIR Data and Open Science: Developing Partnerships, Strategies, Policie...
 

Mehr von dkNET

dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...
dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...
dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...
dkNET
 
dkNET Webinar: Tabula Sapiens 03/22/2024
dkNET Webinar: Tabula Sapiens 03/22/2024dkNET Webinar: Tabula Sapiens 03/22/2024
dkNET Webinar: Tabula Sapiens 03/22/2024
dkNET
 
dkNET Webinar "The Multi-Omic Response to Exercise Training Across Rat Tissue...
dkNET Webinar "The Multi-Omic Response to Exercise Training Across Rat Tissue...dkNET Webinar "The Multi-Omic Response to Exercise Training Across Rat Tissue...
dkNET Webinar "The Multi-Omic Response to Exercise Training Across Rat Tissue...
dkNET
 
dkNET Webinar: The Collaborative Microbial Metabolite Center – Democratizing ...
dkNET Webinar: The Collaborative Microbial Metabolite Center – Democratizing ...dkNET Webinar: The Collaborative Microbial Metabolite Center – Democratizing ...
dkNET Webinar: The Collaborative Microbial Metabolite Center – Democratizing ...
dkNET
 
dkNET Webinar: An Encyclopedia of the Adipose Tissue Secretome to Identify Me...
dkNET Webinar: An Encyclopedia of the Adipose Tissue Secretome to Identify Me...dkNET Webinar: An Encyclopedia of the Adipose Tissue Secretome to Identify Me...
dkNET Webinar: An Encyclopedia of the Adipose Tissue Secretome to Identify Me...
dkNET
 
dkNET Webinar: A Single Cell Atlas of Human and Mouse White Adipose Tissue 11...
dkNET Webinar: A Single Cell Atlas of Human and Mouse White Adipose Tissue 11...dkNET Webinar: A Single Cell Atlas of Human and Mouse White Adipose Tissue 11...
dkNET Webinar: A Single Cell Atlas of Human and Mouse White Adipose Tissue 11...
dkNET
 
dkNET Webinar "The National Sleep Research Resource (NSRR) - Opportunities fo...
dkNET Webinar "The National Sleep Research Resource (NSRR) - Opportunities fo...dkNET Webinar "The National Sleep Research Resource (NSRR) - Opportunities fo...
dkNET Webinar "The National Sleep Research Resource (NSRR) - Opportunities fo...
dkNET
 
dkNET Webinar: Discover the Latest from dkNET - Biomed Resource Watch 06/02/2023
dkNET Webinar: Discover the Latest from dkNET - Biomed Resource Watch 06/02/2023dkNET Webinar: Discover the Latest from dkNET - Biomed Resource Watch 06/02/2023
dkNET Webinar: Discover the Latest from dkNET - Biomed Resource Watch 06/02/2023
dkNET
 
dkNET Webinar: Leveraging Computational Strategies to Identify Type 1 Diabete...
dkNET Webinar: Leveraging Computational Strategies to Identify Type 1 Diabete...dkNET Webinar: Leveraging Computational Strategies to Identify Type 1 Diabete...
dkNET Webinar: Leveraging Computational Strategies to Identify Type 1 Diabete...
dkNET
 
dkNET Webinar: Estimating Relative Beta-Cell Function During Continuous Gluco...
dkNET Webinar: Estimating Relative Beta-Cell Function During Continuous Gluco...dkNET Webinar: Estimating Relative Beta-Cell Function During Continuous Gluco...
dkNET Webinar: Estimating Relative Beta-Cell Function During Continuous Gluco...
dkNET
 
dkNET Webinar: Postpartum Glucose Screening Among Homeless Women with Gestati...
dkNET Webinar: Postpartum Glucose Screening Among Homeless Women with Gestati...dkNET Webinar: Postpartum Glucose Screening Among Homeless Women with Gestati...
dkNET Webinar: Postpartum Glucose Screening Among Homeless Women with Gestati...
dkNET
 
dkNET Webinar: Choosing Sample Sizes for Multilevel and Longitudinal Studies ...
dkNET Webinar: Choosing Sample Sizes for Multilevel and Longitudinal Studies ...dkNET Webinar: Choosing Sample Sizes for Multilevel and Longitudinal Studies ...
dkNET Webinar: Choosing Sample Sizes for Multilevel and Longitudinal Studies ...
dkNET
 
dkNET Webinar: : FAIR Data Curation of Antibody/B-cell and T-cell Receptor Se...
dkNET Webinar: : FAIR Data Curation of Antibody/B-cell and T-cell Receptor Se...dkNET Webinar: : FAIR Data Curation of Antibody/B-cell and T-cell Receptor Se...
dkNET Webinar: : FAIR Data Curation of Antibody/B-cell and T-cell Receptor Se...
dkNET
 
dkNET Webinar "The Mission and Progress of the(sugar)science: Helping Scienti...
dkNET Webinar "The Mission and Progress of the(sugar)science: Helping Scienti...dkNET Webinar "The Mission and Progress of the(sugar)science: Helping Scienti...
dkNET Webinar "The Mission and Progress of the(sugar)science: Helping Scienti...
dkNET
 
dkNET Webinar: Discovering and Evaluating Antibodies, Cell Lines, Software To...
dkNET Webinar: Discovering and Evaluating Antibodies, Cell Lines, Software To...dkNET Webinar: Discovering and Evaluating Antibodies, Cell Lines, Software To...
dkNET Webinar: Discovering and Evaluating Antibodies, Cell Lines, Software To...
dkNET
 
dkNET Webinar: The Human BioMolecular Atlas Program (HuBMAP) 10/14/2022
dkNET Webinar: The Human BioMolecular Atlas Program (HuBMAP) 10/14/2022dkNET Webinar: The Human BioMolecular Atlas Program (HuBMAP) 10/14/2022
dkNET Webinar: The Human BioMolecular Atlas Program (HuBMAP) 10/14/2022
dkNET
 

Mehr von dkNET (20)

dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...
dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...
dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...
 
dkNET Webinar: Unlocking the Power of FAIR Data Sharing with ImmPort 04/12/2024
dkNET Webinar: Unlocking the Power of FAIR Data Sharing with ImmPort 04/12/2024dkNET Webinar: Unlocking the Power of FAIR Data Sharing with ImmPort 04/12/2024
dkNET Webinar: Unlocking the Power of FAIR Data Sharing with ImmPort 04/12/2024
 
dkNET Webinar: Tabula Sapiens 03/22/2024
dkNET Webinar: Tabula Sapiens 03/22/2024dkNET Webinar: Tabula Sapiens 03/22/2024
dkNET Webinar: Tabula Sapiens 03/22/2024
 
dkNET Webinar "The Multi-Omic Response to Exercise Training Across Rat Tissue...
dkNET Webinar "The Multi-Omic Response to Exercise Training Across Rat Tissue...dkNET Webinar "The Multi-Omic Response to Exercise Training Across Rat Tissue...
dkNET Webinar "The Multi-Omic Response to Exercise Training Across Rat Tissue...
 
dkNET Webinar: The Collaborative Microbial Metabolite Center – Democratizing ...
dkNET Webinar: The Collaborative Microbial Metabolite Center – Democratizing ...dkNET Webinar: The Collaborative Microbial Metabolite Center – Democratizing ...
dkNET Webinar: The Collaborative Microbial Metabolite Center – Democratizing ...
 
dkNET Webinar: An Encyclopedia of the Adipose Tissue Secretome to Identify Me...
dkNET Webinar: An Encyclopedia of the Adipose Tissue Secretome to Identify Me...dkNET Webinar: An Encyclopedia of the Adipose Tissue Secretome to Identify Me...
dkNET Webinar: An Encyclopedia of the Adipose Tissue Secretome to Identify Me...
 
dkNET Webinar: A Single Cell Atlas of Human and Mouse White Adipose Tissue 11...
dkNET Webinar: A Single Cell Atlas of Human and Mouse White Adipose Tissue 11...dkNET Webinar: A Single Cell Atlas of Human and Mouse White Adipose Tissue 11...
dkNET Webinar: A Single Cell Atlas of Human and Mouse White Adipose Tissue 11...
 
dkNET Webinar "The National Sleep Research Resource (NSRR) - Opportunities fo...
dkNET Webinar "The National Sleep Research Resource (NSRR) - Opportunities fo...dkNET Webinar "The National Sleep Research Resource (NSRR) - Opportunities fo...
dkNET Webinar "The National Sleep Research Resource (NSRR) - Opportunities fo...
 
dkNET Office Hours - "Are You Ready for 2023: New NIH Data Management and Sha...
dkNET Office Hours - "Are You Ready for 2023: New NIH Data Management and Sha...dkNET Office Hours - "Are You Ready for 2023: New NIH Data Management and Sha...
dkNET Office Hours - "Are You Ready for 2023: New NIH Data Management and Sha...
 
dkNET Webinar: Discover the Latest from dkNET - Biomed Resource Watch 06/02/2023
dkNET Webinar: Discover the Latest from dkNET - Biomed Resource Watch 06/02/2023dkNET Webinar: Discover the Latest from dkNET - Biomed Resource Watch 06/02/2023
dkNET Webinar: Discover the Latest from dkNET - Biomed Resource Watch 06/02/2023
 
dkNET Webinar: Leveraging Computational Strategies to Identify Type 1 Diabete...
dkNET Webinar: Leveraging Computational Strategies to Identify Type 1 Diabete...dkNET Webinar: Leveraging Computational Strategies to Identify Type 1 Diabete...
dkNET Webinar: Leveraging Computational Strategies to Identify Type 1 Diabete...
 
dkNET Webinar: Estimating Relative Beta-Cell Function During Continuous Gluco...
dkNET Webinar: Estimating Relative Beta-Cell Function During Continuous Gluco...dkNET Webinar: Estimating Relative Beta-Cell Function During Continuous Gluco...
dkNET Webinar: Estimating Relative Beta-Cell Function During Continuous Gluco...
 
dkNET Webinar: Postpartum Glucose Screening Among Homeless Women with Gestati...
dkNET Webinar: Postpartum Glucose Screening Among Homeless Women with Gestati...dkNET Webinar: Postpartum Glucose Screening Among Homeless Women with Gestati...
dkNET Webinar: Postpartum Glucose Screening Among Homeless Women with Gestati...
 
dkNET Webinar: Choosing Sample Sizes for Multilevel and Longitudinal Studies ...
dkNET Webinar: Choosing Sample Sizes for Multilevel and Longitudinal Studies ...dkNET Webinar: Choosing Sample Sizes for Multilevel and Longitudinal Studies ...
dkNET Webinar: Choosing Sample Sizes for Multilevel and Longitudinal Studies ...
 
dkNET Webinar: : FAIR Data Curation of Antibody/B-cell and T-cell Receptor Se...
dkNET Webinar: : FAIR Data Curation of Antibody/B-cell and T-cell Receptor Se...dkNET Webinar: : FAIR Data Curation of Antibody/B-cell and T-cell Receptor Se...
dkNET Webinar: : FAIR Data Curation of Antibody/B-cell and T-cell Receptor Se...
 
dkNET Office Hours - "Are You Ready for 2023? New NIH Data Management and Sha...
dkNET Office Hours - "Are You Ready for 2023? New NIH Data Management and Sha...dkNET Office Hours - "Are You Ready for 2023? New NIH Data Management and Sha...
dkNET Office Hours - "Are You Ready for 2023? New NIH Data Management and Sha...
 
dkNET Webinar "The Mission and Progress of the(sugar)science: Helping Scienti...
dkNET Webinar "The Mission and Progress of the(sugar)science: Helping Scienti...dkNET Webinar "The Mission and Progress of the(sugar)science: Helping Scienti...
dkNET Webinar "The Mission and Progress of the(sugar)science: Helping Scienti...
 
dkNET Webinar: Discovering and Evaluating Antibodies, Cell Lines, Software To...
dkNET Webinar: Discovering and Evaluating Antibodies, Cell Lines, Software To...dkNET Webinar: Discovering and Evaluating Antibodies, Cell Lines, Software To...
dkNET Webinar: Discovering and Evaluating Antibodies, Cell Lines, Software To...
 
dkNET Webinar: The Human BioMolecular Atlas Program (HuBMAP) 10/14/2022
dkNET Webinar: The Human BioMolecular Atlas Program (HuBMAP) 10/14/2022dkNET Webinar: The Human BioMolecular Atlas Program (HuBMAP) 10/14/2022
dkNET Webinar: The Human BioMolecular Atlas Program (HuBMAP) 10/14/2022
 
dkNET Webinar "Visualizing Organelle and Cell Longevity In Situ" 05/20/22
dkNET Webinar "Visualizing Organelle and Cell Longevity In Situ" 05/20/22dkNET Webinar "Visualizing Organelle and Cell Longevity In Situ" 05/20/22
dkNET Webinar "Visualizing Organelle and Cell Longevity In Situ" 05/20/22
 

Kürzlich hochgeladen

Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Lokesh Kothari
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
PirithiRaju
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
PirithiRaju
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Sérgio Sacani
 
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
anilsa9823
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
Sérgio Sacani
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Sérgio Sacani
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
University of Hertfordshire
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
PirithiRaju
 

Kürzlich hochgeladen (20)

9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
 
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdf
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
 
fundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomologyfundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomology
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
 

dkNET Webinar: Creating and Sustaining a FAIR Biomedical Data Ecosystem 10/09/2020

  • 1. Creating and Sustaining a FAIR Biomedical Data Ecosystem. Susan Gregurick, Ph.D., Associate Director for Data Science and Director, Office of Data Science Strategy. October 9, 2020
  • 2. Making Data FAIR. Findable: data must have unique identifiers, effectively labeling it within searchable resources. Accessible: data must be easily retrievable via open systems and effective and secure authentication and authorization procedures. Interoperable: data should “use and speak the same language” via use of standardized vocabularies. Reusable: data must be adequately described to a new user, have clear information about data-usage licenses, and have a traceable “owner’s manual,” or provenance.
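The four properties on this slide can be read as a checklist over a dataset's metadata record. The following is a minimal sketch of that reading, not an NIH tool: the field names (`identifier`, `access_url`, `vocabulary`, `description`, `license`, `provenance`) and the example record are assumptions chosen for illustration, since the slide does not define a metadata schema.

```python
# Hypothetical mapping from each FAIR principle to metadata fields.
# The field names are illustrative assumptions, not a published schema.
FAIR_CHECKS = {
    "findable": ["identifier"],
    "accessible": ["access_url"],
    "interoperable": ["vocabulary"],
    "reusable": ["description", "license", "provenance"],
}


def fair_gaps(record):
    """Return, per FAIR principle, the fields that are missing or empty."""
    gaps = {}
    for principle, fields in FAIR_CHECKS.items():
        missing = [field for field in fields if not record.get(field)]
        if missing:
            gaps[principle] = missing
    return gaps


# Made-up record: it has an identifier, an access URL, a description and a
# license, but no vocabulary and no provenance.
example = {
    "identifier": "doi:10.0000/example",  # placeholder, not a real DOI
    "access_url": "https://repository.example.org/datasets/1",
    "vocabulary": "",
    "description": "Kidney expression measurements",
    "license": "CC-BY-4.0",
}

print(fair_gaps(example))
```

A record that passes such a check is not necessarily FAIR in practice; the sketch only shows how each principle translates into something a repository could test for.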
  • 3. Is this what FAIR data looks like…
  • 4. Or is this FAIR Data…
  • 5. NIH supports many different biomedical research communities with diverse sets of data
  • 6. The Rime of the Ancient Mariner, Samuel Taylor Coleridge (excerpted): Day after day, day after day, / We stuck, nor breath nor motion; / As idle as a painted ship / Upon a painted ocean. / Water, water, every where, / And all the boards did shrink; / Water, water, every where, / Nor any drop to drink.
  • 7. This proliferation of data, and the accompanying computing resources and new algorithms, brings new opportunities for discovery, as well as new challenges
  • 8. Journal articles could link to repository data sets. Metadata were computable so that a search for similar datasets was possible. Analysis tools were linked to datasets, via GitHub, Bioconductor, Galaxy or other….
  • 9. NIDDK. The mission of the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) is to conduct and support research on diabetes and other endocrine and metabolic diseases; digestive diseases, nutritional disorders, and obesity; and kidney, urologic, and hematologic diseases, to improve health and quality of life. NIDDK supports research studies across a wide variety of disease areas and in turn supports a variety of platforms to house and manage the data they each generate. These studies utilize a spectrum of modern experimental techniques, generating different modalities of data about the patient and their disease state. Collecting, integrating and working with all this data presents a variety of challenges. Challenges: (1) new consortia would like to share or reuse existing data platforms rather than having to create them from scratch; (2) integrating data from the same patient across different studies currently requires significant manual effort; (3) supporting analysis and visualization tools for imaging data being produced by various projects. Image from https://www.niddk.nih.gov/health-information/kidney-disease
  • 10. Integration of GUDMAP expression data with GTEx eQTLs. User story: As a renal disease researcher, I want to combine gene data from GUDMAP with eQTL data from the Common Fund GTEx resource in order to investigate variants involved in regulating renal gene expression. Core Motivations ● GUDMAP contains gene expression data across various parts of the kidney and urogenital system. ● The GTEx database contains expression QTL (eQTL) data correlating gene expression with specific genomic variants. ● Integrating GUDMAP data with GTEx may lead to insights into gene regulation in kidney development and renal disease. Potential Data Sources ● NIDDK GUDMAP Genitourinary data repository ● Common Fund GTEx gene expression database. ResearchScientist Icon made by Roundicons from www.flaticon.com
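The integration described on this slide amounts to joining two tables on a shared gene identifier. The sketch below illustrates that join with the standard library only. Every row, gene ID, variant ID, and column name is made up for illustration; real GUDMAP and GTEx exports have their own schemas and would first need their gene identifiers harmonized.

```python
# Made-up rows standing in for a GUDMAP expression export (assumed columns).
gudmap_expression = [
    {"gene_id": "GENE_A", "tissue": "kidney cortex", "expression": 12.5},
    {"gene_id": "GENE_B", "tissue": "ureter", "expression": 3.1},
]

# Made-up rows standing in for GTEx eQTL results (assumed columns).
gtex_eqtls = [
    {"gene_id": "GENE_A", "variant_id": "var_1", "effect_size": 0.42},
    {"gene_id": "GENE_A", "variant_id": "var_2", "effect_size": -0.17},
    {"gene_id": "GENE_C", "variant_id": "var_3", "effect_size": 0.30},
]


def join_on_gene(expression_rows, eqtl_rows):
    """Inner-join expression rows with eQTL rows that share a gene_id."""
    # Index the eQTL rows by gene so each expression row is matched in O(1).
    eqtls_by_gene = {}
    for row in eqtl_rows:
        eqtls_by_gene.setdefault(row["gene_id"], []).append(row)

    joined = []
    for expr in expression_rows:
        for eqtl in eqtls_by_gene.get(expr["gene_id"], []):
            # Merge the two rows; gene_id is identical in both.
            joined.append({**expr, **eqtl})
    return joined


for row in join_on_gene(gudmap_expression, gtex_eqtls):
    print(row["gene_id"], row["tissue"], row["variant_id"], row["effect_size"])
```

Genes present in only one source (here `GENE_B` and `GENE_C`) drop out of an inner join; whether to keep them is a design choice that depends on the research question.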
  • 11. Data integration within the TEDDY T1D platform. User story: As a T1 diabetes researcher, I want to combine data across TEDDY releases in order to bring together all the different modalities of data collected from the same subject. Core Motivations ● Data submitted to TEDDY at different times and locations are independent data releases with different subject identifiers per release. ● The same subject will likely have data spread across multiple data releases. ● Recombining this data is a very manual process; an integrated data environment would simplify it significantly. Potential Data Sources ● Genomics ● Epigenomics ● Transcriptomics ● Proteomics ● Metabolomics. ResearchScientist Icon made by Roundicons from www.flaticon.com
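Because each release uses its own subject identifiers, recombining a subject's data needs a crosswalk from (release, release-specific ID) to one stable key. The following is a minimal sketch of that idea. The crosswalk, release names, subject IDs, and record layout are all hypothetical; the slide does not describe how TEDDY actually links identifiers.

```python
from collections import defaultdict

# Hypothetical crosswalk: (release, release-specific subject ID) -> stable key.
crosswalk = {
    ("release_1", "S001"): "subject_A",
    ("release_2", "X9"): "subject_A",
    ("release_2", "X7"): "subject_B",
}

# Made-up records, one per data modality delivered in a release.
records = [
    {"release": "release_1", "subject_id": "S001", "modality": "genomics"},
    {"release": "release_2", "subject_id": "X9", "modality": "metabolomics"},
    {"release": "release_2", "subject_id": "X7", "modality": "proteomics"},
    {"release": "release_3", "subject_id": "Q1", "modality": "transcriptomics"},
]


def group_by_subject(records, crosswalk):
    """Group modalities under a stable subject key; collect unmapped records."""
    by_subject = defaultdict(list)
    unmapped = []
    for rec in records:
        key = crosswalk.get((rec["release"], rec["subject_id"]))
        if key is None:
            # Keep records with no crosswalk entry visible instead of dropping them.
            unmapped.append(rec)
        else:
            by_subject[key].append(rec["modality"])
    return dict(by_subject), unmapped


grouped, unmapped = group_by_subject(records, crosswalk)
print(grouped)
print(len(unmapped), "record(s) without a crosswalk entry")
```

Returning the unmapped records separately keeps the manual work that remains explicit, which matters when the crosswalk itself is incomplete.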
  • 12. This is the promise of the NIH Strategic Plan for Data Science …and here’s how we will get there.
  • 13. [Chart: Percentage of NIH-supported PMC publications with a data availability statement, by year, 2010 to 2019; vertical axis 0% to 75%]
  • 14. NIH Data Management and Sharing Policy Development. • Researchers with NIH-funded or conducted research projects resulting in the generation of scientific data will be required to submit a Plan. • Plans should explain how scientific data generated by a research study will be managed and which of these scientific data will be shared. Community input solicited: 189 submissions from national and international stakeholders. Identified need for appropriate infrastructure: policy and implementation to go ‘hand-in-hand’. Develop draft policy for data management and sharing and related guidance. Released draft for community input. Release final policy (2020).
  • 15. Overview of Sharing Publication and Related Data: options of scaled implementation for sharing datasets. PubMed Central (datasets up to 2 gigabytes): • PMC stores publication-related supplemental materials and datasets directly associated with publications, up to 2 GB. • Generate Unique Identifiers for the stored supplementary materials and datasets. Use of commercial and non-profit repositories (datasets up to 20* gigabytes): • Assign Unique Identifiers to datasets associated with publications and link to PubMed. • Store and manage datasets associated with publication, up to 20* GB. STRIDES Cloud Partners (high priority datasets, petabytes): • Store and manage large scale, high priority NIH datasets (partnership with STRIDES). • Assign Unique Identifiers, implement authentication, authorization and access control. NIH strongly encourages open access Data Sharing Repositories as a first choice. https://www.nlm.nih.gov/NIHbmic/nih_data_sharing_repositories.html
  • 16. NIH supports many repositories for biomedical data sharing (shown: AphasiaBank). PubMed Central: • PMC stores publication-related supplemental materials and datasets directly associated with publications, up to 2 GB. • Generate Unique Identifiers for the stored supplementary materials and datasets. Use of commercial and non-profit repositories: • Assign Unique Identifiers to datasets associated with publications and link to PubMed. • Store and manage datasets associated with publication, up to 20* GB. STRIDES Cloud Partners: • Store and manage large scale, high priority NIH datasets (partnership with STRIDES). • Assign Unique Identifiers, implement authentication, authorization and access control.
  • 17. How to Find Data Repositories • BMIC Data Repository Listing https://www.nlm.nih.gov/NIHbmic/nih_data_sharing_repositories.html • SciCrunch/dkNET: organized by repository type and scientific area. https://dknet.org/about/Suggested-data-repositories • FAIRsharing https://fairsharing.org/ • DataMed https://datamed.org/
  • 18. Optimized Funding for NIH Data Repositories and Knowledgebases • Data resources are important research tools • Historically funded through research grants • Funding mechanism should be optimal for type of resource • End goal: researcher confident in data and information integrity • Solution: New Funding Announcement for data repositories and knowledgebases • Resource plan requirement. Diagram labels: Scientific Impact; Community Engagement; Quality of Data and Services and Efficiency of Operations; Governance.
  • 19. Optimized Funding for NIH Data Repositories and Knowledgebases: Funding Opportunities • NIH released two funding opportunities on Jan. 17 to support biomedical data repositories and knowledgebases: • Biomedical Data Repository (PAR-20-089) • Biomedical Knowledgebase (PAR-20-097). Diagram labels: Scientific Impact; Community Engagement; Quality of Data and Services and Efficiency of Operations; Governance.
  • 20. Piloting a FAIR Generalist Repository Using Figshare https://nih.figshare.com Existing Figshare features Pilot-specific features
  • 21. Repository contains data funded by 21 different NIH ICOs NCATS NCCIH
  • 22. NIH Figshare Pilot: Key Takeaways • Generalist repositories are growing: more researchers are depositing data and more publications are linking to generalist repositories. • Researchers need more education and guidance on where to publish data and how to describe datasets in metadata fields effectively. • Metadata enhancement enables greater discoverability: metrics indicate greater access, but a longer time scale is needed to observe data reuse.
  • 23. Guiding researchers on better metadata to enhance data discoverability graphic credit: Ontotext
  • 25. NIH is Harnessing the Power of the Cloud for Biomedical Research • Cloud computing offers multiple opportunities NIH can leverage to advance biomedical research, including: • Computation on biomedical data at an unprecedented scale • Broad access to cutting-edge cloud technology with, for example, industry-leading security tools • Storage of large, diverse data in a way that enables easier sharing, access, and reuse of data with other researchers • A community-driven approach to data science that breaks down disciplinary silos • Adopt and develop cloud-based tools from industry or academia for biomedical research
  • 26. Turning Research Data Into Knowledge and Discovery. The Science and Technology Research Infrastructure for Discovery, Experimentation, and Sustainability (STRIDES) Initiative • State-of-the-art data storage and computational capabilities • Training and education for researchers • Innovative technologies such as artificial intelligence and machine learning. Partnerships with [provider logos not captured] and other commercial providers
  • 27. STRIDES by the numbers (as of 8/31): 17 NIH ICs; 37 extramural institutions; 279 programs/projects; >2400 people trained; $9M cost savings to participating ICs; $51.5M obligated by NIH / $18M expended to date; 30M compute hours; 80 petabytes stored
  • 29. Moving Data to the Cloud for Large-Scale Analysis 36.4 PB of public and controlled-access Sequence Read Archive data in two clouds (GCP & AWS)
  • 30. Benefits of the Cloud for Large-Scale Analysis. "We can now do this in 3-4 days instead of 12+ months directly as a result of the SRA data being available in the cloud. This means we can share this data with the CoV researchers today, when it can make a difference, not a year from now. This is important for COVID-19 now, and will be important in response to the next pandemic." – Artem Babaian, Lead Developer at Serratus and corresponding author for publication, “Petabase-scale sequence alignment catalyses viral discovery” https://www.biorxiv.org/content/10.1101/2020.08.07.241729v1.full
  • 31. Enhancing Software Tools for Open Science
  • 32. Supplements to Enhance Software Tools for Open Science (NOT-OD-20-073) • Enhance software engineering of valuable scientific tools: new collaborations between biomedical researchers and software engineers • Make research tools “cloud-ready”: working with the STRIDES Initiative is encouraged but not required
  • 33. Topics Funded Across 12 Institutes and Centers FHIR Clinical Cloud Commons Biomolecular Simulation Biophysics Genomics Imaging Neuroimaging
  • 35. AI in Biomedicine: Opportunities • Extract medical information from text in EMRs/EHRs • Interpret genomic sequence data to understand impact of mutations on protein function • Read medical images and help diagnose diseases like pneumonia and cancer • Monitor sleep and vitals to send information about health at home to doctors • Determine which calls to child welfare systems warrant deployment of family support and prevention resources to protect at-risk children • Examples from Katabi, Ng, Putnam-Hornstein, Troyanskaya, and others
  • 36. NIH NVIDIA COVID CT-AI • Preprocessing: conversion of DICOMs to NIfTI with 1x1x1 resampling • Segmentation: AH-Net architecture produces a lung segmentation mask • Apply mask • Image classification: 3D-DenseNet-121 classifies Likely COVID vs. Unlikely COVID • Baris Turkbey, Sheng Xu, Tom Sanford, Stephanie Harmon, Mona Flores, Daguang Xu, Xiasong Wang, Ziyue Xu, Holger Roth, Dong Yang, Evrim Turkbey, Mike Kassin, Maxime Blain, Brad Wood • CT images have been used in Asia to detect COVID-19 virus in patients
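The two non-learned steps named on this slide, resampling to a 1x1x1 grid and applying the lung mask, can be sketched in a few lines. This is only an illustration under assumptions: the slide does not give units (millimetres are assumed here), the function names `resampled_shape` and `apply_mask` are hypothetical, no interpolation is performed, and the AH-Net and DenseNet models themselves are not shown.

```python
from typing import List, Tuple

Volume = List[List[List[float]]]  # indexed as [z][y][x]


def resampled_shape(shape: Tuple[int, int, int],
                    spacing: Tuple[float, float, float],
                    target: float = 1.0) -> Tuple[int, ...]:
    """Grid size after resampling to isotropic `target` spacing.

    `spacing` is the original voxel size per axis (assumed mm).
    Only the output geometry is computed; a real pipeline would also
    interpolate intensities onto the new grid.
    """
    return tuple(max(1, round(n * s / target)) for n, s in zip(shape, spacing))


def apply_mask(volume: Volume, mask: List[List[List[int]]],
               fill: float = 0.0) -> Volume:
    """Keep voxels where the mask is nonzero; replace the rest with `fill`."""
    return [[[v if m else fill for v, m in zip(vrow, mrow)]
             for vrow, mrow in zip(vplane, mplane)]
            for vplane, mplane in zip(volume, mask)]


# A scan of 100 slices at 2.5 mm with 512x512 pixels at 0.7 mm (made-up values).
new_shape = resampled_shape((100, 512, 512), (2.5, 0.7, 0.7))

# A tiny 1x2x2 volume with a mask that keeps two of its four voxels.
masked = apply_mask([[[1.0, 2.0], [3.0, 4.0]]], [[[1, 0], [0, 1]]])
```

Masking before classification restricts the classifier's input to lung tissue, which is presumably why the slide places "Apply Mask" between segmentation and classification.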
  • 37. New Common Fund Initiative: Artificial Intelligence for BiomedicaL Excellence (AIBLE) May 15, 2020 - NIH Council of Councils • https://dpcpsi.nih.gov/council/may-15-2020-agenda • AI Concept Clearance (start at 1:25min) https://videocast.nih.gov/watch=36031 • NIH Artificial Intelligence Working Group Final Report
  • 38. Recommendations • Data: collection, analysis, reuse • People: attract, train, convene • Ethics (ELSI): accountable, informed, representative • R1: flagship data generation efforts • R2: criteria for ML-friendly datasets • R3: “datasheets” and “model cards” • R4: consent and data access standards • R5: ethical principles for ML in biomedicine • R6: curricula for ML-BioMed experts • R7: ML-focused trainees and fellows • R8: convene cross-disciplinary collaborators
  • 39. Support flagship data generation efforts to propel progress by the scientific community. • Support flagship efforts that generate large-scale experimental data, with billions of data points, designed to: i. be well-suited for ML analysis and inference ii. address key biomedical challenges iii. stimulate new approaches in machine learning • And that implement processes designed to: i. develop improved criteria and technical mechanisms for data access ii. strengthen ethical criteria for dataset use (consent, privacy, accountability, ...) • Projects should: ▪ address key biomedical challenges using ML methods ▪ advance ML methods for future use in biomedicine ▪ produce transformative data sets, designed with ML in mind ▪ propel new ways to gather massive data in biomedicine ▪ involve strong engagement from leading ML researchers • Project review should: ▪ incorporate expertise in ML as well as traditional biomedical domains
  • 40. Develop and publish criteria for ML-friendly datasets. • Publish criteria for evaluating datasets based on their value for ML-based analysis. ▪ what makes a dataset most useful for ML-based analysis? ▪ what attributes are and aren’t addressed by existing datasets? ▪ start as guidelines; within two years recommend a subset as requirements • Examples of potential criteria: ▪ clear provenance: as much metadata as possible, to detect & correct for batch effects ▪ well-described data: what does each variable mean? what’s the distribution of values? ▪ accessible data: flexible data access policy, reasonable data access process ▪ large sample size: to allow training (and evaluation) without overfitting ▪ multimodal data: to study complex systems from multiple perspectives ▪ perturbation data: includes outcomes (“outputs”) as well as measurements (“inputs”) ▪ longitudinal data: to allow modeling and prediction of progression ▪ active learning: data grows over time, incorporates new data-gathering techniques, and uses ML-based analysis of existing data to inform future data generation
  • 41. Design and apply “datasheets” and “model cards” for biomedical ML. • Develop and publish best practices for: ▪ “datasheets” that describe & evaluate training datasets ▪ “model cards” that do the same for generated models • Test the best practices in the real world: ▪ build after-the-fact examples for existing datasets ▪ apply to new datasets, and update the best practices • Once best practices have been updated: ▪ require datasheets and model cards for all NIH extramural grant applications and NIH intramural projects that involve ML research ▪ encourage journals to do the same for paper submission and publication • Potential datasheet best practices: ▪ demographics and UBR characteristics ▪ privacy, consent, and copyright issues ▪ known blind spots, which could otherwise create hidden biases • Potential model card best practices: ▪ what training data was used ▪ how training and validation were done ▪ known limitations on applicability ▪ intended use, and potential harms of inappropriate use
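One way to picture the "potential best practices" on this slide is as structured records that can be checked for completeness. The sketch below is an assumption-laden illustration, not a format the slide defines: the class names, field names, and the `missing_fields` helper are all hypothetical, and the fields simply mirror the slide's bullet items.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class Datasheet:
    """Hypothetical record mirroring the slide's datasheet bullets."""
    demographics_and_ubr: str = ""
    privacy_consent_copyright: str = ""
    known_blind_spots: List[str] = field(default_factory=list)


@dataclass
class ModelCard:
    """Hypothetical record mirroring the slide's model card bullets."""
    training_data: str = ""
    training_and_validation: str = ""
    known_limitations: List[str] = field(default_factory=list)
    intended_use: str = ""
    potential_harms: List[str] = field(default_factory=list)


def missing_fields(record) -> List[str]:
    """Return the names of fields left empty, in declaration order."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# A partially completed card (placeholder text, not real project data).
card = ModelCard(training_data="placeholder description of the training set",
                 intended_use="placeholder description of intended use")
gaps = missing_fields(card)
```

A reviewer or submission system could use a check like `missing_fields` to flag incomplete documentation before an application or manuscript moves forward, which is in the spirit of the "require" and "encourage" bullets above.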
  • 42. Draft initiative map (5/15/2020), FY21 through FY26 • 1: Data design centers • 2: Tools • 3: Data enhancement supplements to existing awards • 4: Gold data: cohort 1 • 4: Gold data: cohort 2 • 5: Assess
  • 43. New Partnerships in Data Science and AI
  • 44. Smart and Connected Health (SCH) Accelerate innovations in computer and information science and engineering to support the transformation of health and medicine
  • 45. Smart Health & Data Science Research Areas • Information Infrastructure: ▪ Tools for interoperable, distributed, federated, & scalable digital infrastructure ▪ Novel ontological systems and knowledge representation approaches ▪ Methods for data integrity, provenance, security, privacy and reliability • Transformative Data Science: ▪ Computational tools for fusion and analysis of multi-level and -scale data ▪ Knowledge representations, visualizations and reasoning algorithms ▪ Approaches for combining AI learning with mechanistic modeling ▪ Unstructured data interpretation • Novel Multimodal Sensor System Hardware: ▪ Design & fabrication of novel multimodal sensor systems ▪ Synthesis of new biorecognition elements • Effective Usability: ▪ New approaches to support individuals to effectively participate in their own health ▪ User-tailored and context-aware interfaces to reduce burden and increase autonomy ▪ Develop new methods for context-dependent selection, presentation and use of data • Automating Health: ▪ Closed-loop or human-in-the-loop systems ▪ Technology platforms for optimizing delivery of health interventions ▪ Simulation and modeling methods and software tools • Medical Data Interpretation: ▪ Modeling on-visual context information and perception of complex images ▪ Methods to exploit experts’ implicit knowledge to improve perceptual decision making ▪ Develop models of how experts respond to changes in cognitive factors
  • 46. Let’s create a bright future
  • 47. Coding it Forward • Student-led non-profit that places tech-savvy students in federal agencies • 16 students placed in admin or funding offices across 11 host institutes, centers, and offices (ICOs) for a 10-week summer program • 2 students extended until the start of school, 1 hired as contractor • 24 students will start a fall fellowship across 14 host ICOs • https://www.codingitforward.com/
  • 48. NIH Data and Technology Advancement (DATA) National Service Scholar Program https://datascience.nih.gov/data-scholars • In 2021, 8 Scholars will: ▪ Catalyze neuroscience research ▪ Unravel the Alzheimer’s Disease Genome ▪ Support cancer knowledge extraction ▪ Accelerate the clinical adoption of machine intelligence applications in medical imaging ▪ Harness data science for health discovery and innovation in Africa ▪ Expand theories of brain circuits ▪ Integrate NIH cloud-based platforms for genomics research ▪ Architect search across petabyte-scale data
  • 49. Strategic Plan for Data Science: Goals and Objectives • Data Infrastructure: ▪ Optimize data storage and security ▪ Connect NIH data systems • Modernized Data Ecosystem: ▪ Modernize data repository ecosystems ▪ Support storage and sharing of individual datasets ▪ Better integrate clinical and observational data into biomedical data science • Data Management, Analytics, and Tools: ▪ Support useful, generalizable, and accessible tools ▪ Broaden utility of, and access to, specialized tools ▪ Improve discovery and cataloging resources • Workforce Development: ▪ Enhance the NIH data science workforce ▪ Expand the national research workforce ▪ Engage a broader community • Stewardship and Sustainability: ▪ Develop policies for a FAIR data ecosystem ▪ Enhance stewardship • https://datascience.nih.gov
  • 51. We’re putting COVID-19 data into repositories and platforms so the data will be USED by researchers!
  • 52. What could researchers do with these data? • Better understand transmission and infectivity • Evaluate treatments & interventions • Predict long-term sequelae • Link social determinants of health with COVID-19 related data and exposures • Examine the impact on child & maternal health • Resolve technical & implementation issues
  • 53. COVID Clinical Platforms • Increasing the amount and quality of EHR data related to individuals with COVID-19 • Pilot a new enrollment partner model to efficiently target recruitment in expanded regions of the country and collect EHR data from proven partners • Rapidly collect EHR-derived clinical, lab, and imaging data from hospitals and health plans at the peak of the pandemic and as it evolves • Develop a robust, flexible collaborative analytics infrastructure to enable a high frequency response to COVID-19 and the next emerging threats • Include data from underserved populations, roughly 9.3M unique patients • PETAL’s ORCHID Trial & PETAL’s CORAL registry ▪ RED CORAL: an observational study based on retrospective review of data collected on hospitalized patients with COVID-19 ▪ BLUE CORAL: a multicenter prospective observational study designed to collect comprehensive data on hospitalized patients with COVID-19. This study will gather imaging, biospecimens, and long-term outcomes.
  • 54. Interoperability Across Clinical COVID Serving Data Platforms • Diagram layers: Policy, Resources, Workbenches/Tools, Information Systems, Data Discovery • Federated Data Platforms, each with an API: MIDRC, RADx, NICHD, NIA, etc. (TBD*) • Interoperable elements: ▪ Honest Broker: data linkage across systems ▪ Research Authentication System ▪ Hash ▪ CDE: standards for interoperability ▪ Data discovery across platforms: examples include GA4GH FASP, PIC-SURE ▪ FHIR to map and move data
  • 55. Researcher Workflows Before Researcher Authentication Services (RAS) • Workflow steps spread across Platform 1, Platform 2, and a cloud-based analysis tool: search/select, access, compute, share • LOGIN (5): researchers log in and/or give consent at least 5 times for each workflow in the Phase 1 interoperability use cases
  • 56. Researcher Workflows After RAS August Deploy • LOGIN (1), followed by search/select, access, compute, share • AuthN: ID token (who you are) • AuthZ: passport and visa (which dbGaP studies/consent groups you are authorized to access, and your role) • Before provisioning data, the platform validates the passport/visa by calling RAS, so access information is always up to date within the last 30 minutes • Authentication and authorization are provided by a central NIH service. Auth tokens move with the user as they navigate to any of the four Phase 1 Data Platforms, so the researcher logs in to RAS only one time
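The "up to date within the last 30 minutes" rule can be read as a freshness check that a platform runs before provisioning data. The sketch below shows one such reading and nothing more: `CachedVisa`, `can_provision`, and the `revalidate` callback are hypothetical names, the callback stands in for a call to RAS whose real API is not described on the slide, and the study identifier is made up.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, FrozenSet

MAX_AGE = timedelta(minutes=30)  # freshness window quoted on the slide


@dataclass(frozen=True)
class CachedVisa:
    """Hypothetical cached result of validating a passport/visa."""
    user_id: str
    authorized_studies: FrozenSet[str]
    validated_at: datetime


def can_provision(visa: CachedVisa, study: str, now: datetime,
                  revalidate: Callable[[str], CachedVisa]) -> bool:
    """Decide whether `study` may be provisioned for this user.

    If the cached validation is older than MAX_AGE, ask the central
    service again (via the assumed `revalidate` callback) before deciding.
    """
    if now - visa.validated_at > MAX_AGE:
        visa = revalidate(visa.user_id)
    return study in visa.authorized_studies


# Demo with a stand-in for the central service: access has since been revoked.
t0 = datetime(2020, 10, 9, 12, 0, tzinfo=timezone.utc)
visa = CachedVisa("researcher-1", frozenset({"study-A"}), t0)


def fake_revalidate(user_id: str) -> CachedVisa:
    return CachedVisa(user_id, frozenset(), t0 + timedelta(minutes=31))


fresh_ok = can_provision(visa, "study-A", t0 + timedelta(minutes=10), fake_revalidate)
stale_ok = can_provision(visa, "study-A", t0 + timedelta(minutes=31), fake_revalidate)
```

With a window like this, a revoked authorization stops being honored after at most 30 minutes, while the researcher still logs in only once.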
  • 57. Privacy-Preserving Tokens: Patient Care Tokenization, De-Duplication and Linkages • Example patient: John Smith, 03/27/1945, male ▪ Admitted to N3C Hospital ▪ Participates in Clinical Studies ▪ Lives in a Senior Living Facility • Each source (N3C sites, NIH clinical studies, senior living EHR) tokenizes its records and outputs de-identified tokens: Patient 123, Patient 456, Patient 789 • N3C Linkage Honest Broker: a de-identified ‘Rosetta Stone’ process that unifies records ▪ Patient 123, Patient 456, and Patient 789 are matched and de-duplicated to 007
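The tokenize, match, and de-duplicate flow on this slide can be illustrated with a keyed hash. This is a stand-in technique chosen for the sketch: the slide does not say how tokens are actually computed, and production privacy-preserving record linkage involves much more (key management, handling of typos and name variants). The function names and the demo key are hypothetical.

```python
import hashlib
import hmac
from collections import defaultdict
from typing import Dict, List, Tuple


def tokenize(name: str, dob: str, sex: str, key: bytes) -> str:
    """Derive a de-identified token from normalized identifiers.

    HMAC-SHA256 is used here as an illustrative keyed hash: the same
    person yields the same token at every site that shares `key`, and
    the token does not reveal the inputs to anyone without the key.
    """
    normalized = "|".join(part.strip().lower() for part in (name, dob, sex))
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def link_records(records: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group site-local record IDs that share a token (the honest broker's view)."""
    linked: Dict[str, List[str]] = defaultdict(list)
    for local_id, token in records:
        linked[token].append(local_id)
    return dict(linked)


# Each source tokenizes locally; only (local ID, token) pairs leave the site.
key = b"demo-key-not-for-real-use"
submissions = [
    ("Patient 123", tokenize("John Smith", "03/27/1945", "Male", key)),
    ("Patient 456", tokenize("john smith ", "03/27/1945", "male", key)),
    ("Patient 789", tokenize("John Smith", "03/27/1945", "Male", key)),
]
groups = link_records(submissions)
```

In this sketch the broker never sees the name or birth date; it sees only that three local IDs carry the same token and can therefore be unified into one record, as in the slide's example.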
  • 59. NIH staff who deserve all the credit • STRIDES: Andrea Norris, Nick Weber and NMDS team, and Fenglou Mao • Connecting NIH Data Resources: Regina Bures, Ishwar Chandramouliswaran, Tanja Davidsen, Valentina Di Francesco, Jeff Erickson, Tram Huyen, Rebecca Rosen, Steve Sherry, Alastair Thomson, Greg Farber, Dylan Klomparens, Charles Schmitt, Susan Wright, Ken Wiley, Kristofor Langlais, James Coulombe, Lora Kutkat, Nick Weber, Allen Dearry • Data Repository and Knowledgebase Resources: Kim Pruitt, Valerie Florance, Valentina Di Francesco, Ajay Pillai, Qi Duan, Dawei Lin, Christine Colvis, Jennie Larkin, Ravi Ravichandran, and James Coulombe • FHIR Pilots: Teresa Zayas-Caban, Denise Warzel, Kerry Goetz, Ken Wiley, Alison Cernick, Kenneth Wilkins, Carolina Mendoza-Puccini, Matt McAuliffe, and Belinda Seto • Criteria for Open Access Data Sharing Repositories: Mike Huerta, Dawei Lin, Maryam Zaringhalam, Lisa Federer and BMIC Team • Pilot for Scaled Implementation for Sharing Datasets: Ishwar Chandramouliswaran, Lisa Federer, Maryam Zaringhalam, and Jennie Larkin • Software Sustainability: Heidi Sofia, Ishwar Chandramouliswaran, Mike Conway, Tony Kirilusha, Xujing Wang, Andrew Weitz, Todd Merchak, Allissa Dillman and Jess Mazerik • Smart and Connected Health: Haluk Resat, Dana Wolff-Hughes, Partha Bhattacharyya, Fenglou Mao • Coding-it-Forward Fellows Summer Program & DATA Scholars Program: Jess Mazerik, Wynn Meyer
  • 60. Office of Data Science Strategy • A modernized, integrated, FAIR biomedical data ecosystem • www.datascience.nih.gov • @NIHDataScience • /NIH.DataScience • datascience@nih.gov