Chris Evelo
From reproducibility to reusability: capturing metadata at the source
From reproducibility to reusability
DTL Focus meeting: Metadata for data reusability: eNotebook standards.
31 October 2019. Holland Heart House, Utrecht
Announcement: https://www.dtls.nl/events/dtl-focus-meeting-metadata-for-data-reusability-enotebook-standards/
Failing to reuse is expensive
Elon Musk: suppose you had to throw away a 747 every
time you fly to the US (well, two if you want to come back).
Suppose you had to redo the experiments every time you
want to use the data.
The difference
Reproducibility:
Can I answer the same question again and get the same result?
Reuse:
Can I, in addition, answer different questions using the same data?
Reuse is essential
More value for money
It is a core argument for funders to stimulate:
• Better data management
• FAIR data
• Research data infrastructure to allow that
• Compute infrastructure
It is typically argued that a substantial share of the total research budget (5-10%)
should be allocated to this.
Integrative Systems Biology
[Figure: workflow diagram. Recoverable labels: study capturing (ISA); internal & external data repositories (e.g. dbNP, Sage, Atlas); study data processing, statistics, storage (e.g. arrayanalysis.org); ontologies; mapping (BridgeDb); extraction, SPARQLing, conversion; knowledge resources & (semantic web) integration (e.g. Open PHACTS, WikiPathways); curation, annotation & provenance; modeling & data integration, network biology (extension), supervised statistics, simulation, models; research applications.]
Reuse typically needs more metadata
• Hard to predict which data will be needed
• Too much work
• No clear incentive to put data in repositories (not needed to publish)
• Even standards aim for the minimum (e.g. MIAME, MIAPPE)
• Solution? Can we:
- Facilitate collection of richer metadata
- Keep everything collected by default
- Connect data resources for that purpose
Example 1
Multiple human studies look at the effect of high-fat vs low-fat diets.
Typically these studies compare two groups of individuals
(often in a cross-over design).
Typically the groups are described as "otherwise the same",
e.g. same average age and comparable age range.
Can you reuse and combine these studies to study age effects?
Only when individual ages are stored.
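A minimal sketch of why the individual ages matter. All study names, field names and values below are made up for illustration and do not come from any real study. With individual-level records, rows from two studies can be pooled and regrouped by age band; with only a group mean age per study, this regrouping would not be possible.

```python
from statistics import mean

# Hypothetical individual-level records: (study, diet, age, response).
# In a cross-over design the same individual appears under both diets.
records = [
    ("study_A", "high_fat", 25, 1.0),
    ("study_A", "low_fat", 25, 0.5),
    ("study_A", "high_fat", 60, 2.0),
    ("study_A", "low_fat", 60, 1.0),
    ("study_B", "high_fat", 30, 1.5),
    ("study_B", "low_fat", 30, 0.5),
    ("study_B", "high_fat", 65, 2.5),
    ("study_B", "low_fat", 65, 1.5),
]


def diet_effect_by_age_band(rows, cutoff=45):
    """Mean high-fat minus low-fat response, split at an age cutoff."""
    bands = {
        "younger": lambda age: age < cutoff,
        "older": lambda age: age >= cutoff,
    }
    effect = {}
    for band, in_band in bands.items():
        high = [r for _, d, a, r in rows if d == "high_fat" and in_band(a)]
        low = [r for _, d, a, r in rows if d == "low_fat" and in_band(a)]
        effect[band] = mean(high) - mean(low)
    return effect


# Pooling both studies is only possible because each row carries an age.
effects = diet_effect_by_age_band(records)
print(effects)
```

The cutoff of 45 is an arbitrary choice for the sketch; a real analysis would model age properly and account for study as a factor.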
Example 2
Some of these studies mention the Vitamin E content of the diet.
Can you study (or exclude) effects of added (vs naturally present)
Vitamin E?
Only if:
• It is clear whether the Vitamin E was added or natural
• The fat source is clear, so the natural content can be estimated
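The two conditions above can be read as a metadata completeness check. The sketch below is one hypothetical way to express it; the field names (`vitamin_e_origin`, `fat_source`) and the example studies are assumptions for illustration, not an existing standard or dataset.

```python
# Hypothetical field names a study record would need for the Vitamin E question.
REQUIRED_FOR_VITAMIN_E_REUSE = ("vitamin_e_origin", "fat_source")


def missing_fields(study_metadata, required=REQUIRED_FOR_VITAMIN_E_REUSE):
    """Return the required fields that are absent or empty in a metadata dict."""
    return [name for name in required if not study_metadata.get(name)]


# Made-up study metadata: study_B reports a Vitamin E amount but not its origin.
studies = {
    "study_A": {
        "vitamin_e_mg_per_day": 12,
        "vitamin_e_origin": "added",
        "fat_source": "sunflower oil",
    },
    "study_B": {"vitamin_e_mg_per_day": 9},
}

# Only studies with every required field can be reused for this question.
reusable = [name for name, meta in studies.items() if not missing_fields(meta)]
print(reusable)
```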
Data moves through a funnel
• Collected on paper or not at all (and often lost)
• Collected in eNotebooks
• Uploaded to study databases like dbNP and Molgenis
• Uploaded to data repositories
eNotebooks
• Not often mentioned as essential by the ELIXIR community
• Should follow the ISA principle (facilitate collection of study design,
assays performed, sample descriptions)
• Export/import standards would facilitate data transfer:
- Between different eNotebook types
- To study databases (for combination and advanced analysis)
- To repositories directly
• As for study databases, (shared) templates could help to harmonise
data
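As an illustration of what an export/import standard buys, here is a minimal sketch of a study record organised along the three ISA elements named above (study design, assays performed, sample descriptions) and round-tripped through JSON. The class and field names are hypothetical and much simpler than the actual ISA model or its serialisation formats; the point is only that a shared, lossless format lets one tool read what another wrote.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Sample:
    name: str
    characteristics: dict = field(default_factory=dict)


@dataclass
class Assay:
    measurement: str
    technology: str
    sample_names: list = field(default_factory=list)


@dataclass
class Study:
    title: str
    design: str
    samples: list = field(default_factory=list)
    assays: list = field(default_factory=list)


def export_study(study):
    """Serialise a study to JSON text (the hypothetical exchange format)."""
    return json.dumps(asdict(study), indent=2, sort_keys=True)


def import_study(text):
    """Rebuild a Study from JSON text written by export_study."""
    raw = json.loads(text)
    return Study(
        title=raw["title"],
        design=raw["design"],
        samples=[Sample(**s) for s in raw["samples"]],
        assays=[Assay(**a) for a in raw["assays"]],
    )


# Made-up example record, as it might be captured in an eNotebook.
study = Study(
    title="High fat vs low fat diet",
    design="cross-over",
    samples=[Sample("subject_01_plasma", {"age": 34, "diet": "high_fat"})],
    assays=[Assay("transcription profiling", "microarray", ["subject_01_plasma"])],
)

round_tripped = import_study(export_study(study))
print(round_tripped == study)
```

Note that the individual age is stored as a sample characteristic, so nothing is lost on the way from the notebook to a study database or repository.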
Study capture databases
• More visibility in the ELIXIR community
• Typically do follow ISA principles (facilitate collection of study design,
assays performed, sample descriptions)
• Export/import standards would facilitate data transfer:
- To other study databases
- To repositories directly (often implemented)
• Development of shared templates has started
• Modular software design would facilitate reuse of components
(ELIXIR core) data repositories
• Very visible in the ELIXIR community
• Almost always follow ISA principles
• Typically facilitate upload of raw data in native format
• Support for upload from eNotebooks and study capture databases
varies
• Most are technology-specific (transcriptomics, metabolomics, etc.)
• BioSamples can de facto be the place to describe studies (samples)
• BioStudies became the place where "other" studies are captured
