ICG-11: Genomic Data Projects around the World
- How to find data for your research
Fiona Nielsen – November 4th 2016
We are always looking for data
Genetics, cancer, and rare disease research
We need access to the right data at the right time
DNA interpretation requires lots of data
How much data do you need to publish a paper?
2001: 1 human genome
2012: 1000 Genomes (1,092 genomes, since increased to ~2,500)
2015: UK10K; Icelandic population (2,636 + 100k imputed); The Cancer Genome Atlas (~11,000 genomes)
2016: ExAC consortium (65,000 exomes); gnomAD (~126,000 exomes)
2020: ?
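As a rough illustration of how quickly the bar has risen, the sketch below computes the fold increase between the headline cohort sizes quoted on this slide. Picking one figure per year, and mixing whole genomes with exomes, is a simplifying assumption made only for this example.

```python
# Headline cohort sizes quoted on the slide: (year, number of genomes or exomes).
# Choosing a single figure per year is a simplification for illustration only.
milestones = [
    (2001, 1),       # first human genome
    (2012, 1092),    # 1000 Genomes
    (2015, 11000),   # The Cancer Genome Atlas, approximate
    (2016, 126000),  # gnomAD exomes, approximate
]


def fold_increase(earlier, later):
    """Return how many times larger the later cohort is than the earlier one."""
    return later[1] / earlier[1]


# Print the fold increase between each consecutive pair of milestones.
for prev, curr in zip(milestones, milestones[1:]):
    print(f"{prev[0]} -> {curr[0]}: x{fold_increase(prev, curr):.1f}")
```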
Data is not easy to find and access
FRAGMENTED: Poor visibility of available genomic data
ADMIN BURDEN: Huge overhead to manage data access
BAD CULTURE: Lack of data sharing habits in research culture
Finding and accessing data can take months
Time spent data scouting per project:
• < 1 week: 40%
• 1-3 months: 48%
• 6+ months: 11%
Why the barrier?
Barriers
• Difficult to find data, let alone find the RIGHT data
• Time-consuming and difficult to apply for access to data
• Complicated and laborious to submit data to public repositories
http://blog.repositive.io/tag/data-access/
http://blog.repositive.io/tag/data-sharing/
But where in the world is the data?
?
DATA is fragmented
How to make data easy to discover?
We have identified hundreds of data sources
• Universities: universities themselves, or repositories affiliated with a university.
• Projects/Consortia: have a specific purpose or aim, often focussed on a specific research question or disease.
• Public repositories: allow download and upload of data from multiple institutions.
• Companies: for-profit organisations making data available for free or as a service.
• Biobanks: many have sequence data for their biological samples.
Researchers know on average 4-5 data sources
More data sources appear every day; to date we have identified 350+
And indexed them on the Repositive platform
Platform launched in Sept 2016
• Discover and access
• Efficient search, see related results
• Find colleagues & their data interests
• Co-annotate data & community feedback
• Simpler workflow for data access
1 Million+ human genomic datasets indexed
• 177k whole exomes
• 213k whole genomes
• 2400 23andMe samples
61+ countries and 426+ research organisations using Repositive
PDX Consortium with AstraZeneca
Free to use: http://discover.repositive.io
[Figure: bar charts of data sources per European country (GB, FI, NL, FR, DE, CH, EE, BE, DK, ES, SI, IE, SE) and per US state (CA, MD, MA, WA, NY, TX, AZ, DC, NJ, NC, PA, UT, TN, CO, IN, FL, LA, VA, IL, ME, OH, MO, MI, SC, OR)]
Data sources across the globe
Geolocation of 278 data sources analysed, found by tracking the IP address of each source.
These include:
• Public repositories
• Universities
• Companies
• Biobanks
• Research consortia
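The per-country tally described on this slide could be sketched roughly as follows. This is a minimal illustration and not the method actually used for the 278 sources. The `ip_to_country` lookup is hypothetical and would in practice be backed by an IP geolocation database. An IP address also locates the hosting server, which is not necessarily where the data was generated.

```python
import socket
from collections import Counter
from urllib.parse import urlparse


def host_of(url):
    """Return the hostname part of a data-source URL."""
    return urlparse(url).hostname


def tally_countries(urls, ip_to_country, resolve=socket.gethostbyname):
    """Count data sources per country.

    ip_to_country is a hypothetical callable mapping an IP string to a
    country code. resolve maps a hostname to an IP address and defaults
    to a DNS lookup.
    """
    counts = Counter()
    for url in urls:
        ip = resolve(host_of(url))
        counts[ip_to_country(ip)] += 1
    return counts
```

Passing the resolver and the country lookup in as parameters keeps the tally logic separate from the network and from any particular geolocation provider, so it can be exercised with fixed mappings.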
Data source content
Assay Types
Dedicated to…
Sequenced ethnicities
Aboriginals
African Americans
Africans
Australians
Chinese
Malays
Indians
Danish
Dutch
Estonian
Russian
European Ancestry
Finnish
Icelandic
Japanese
Korean
Latin Americans
Saudi
Swedish
Machines & Data sources
[Figure: machines and data sources by region, with 23 labelled "International"]
Interesting site to look at:
http://omicsmaps.com/stats
• Repositive is supporting the whole research workflow
• Faster, more efficient data discovery
• Streamlining data access applications
• Developing technology for efficient data access
• Setting up pre-competitive data sharing agreements
• Running workshops and training programmes
More efficient data access
• Read about our pre-competitive PDX data resource in collaboration with AstraZeneca: http://repositive.io/pdx
Building upon best practices
MAKE DATA DISCOVERABLE
SIMPLIFY WORKFLOWS
CONTRIBUTE TO COMMUNITY
DNAdigest and Repositive – Connecting the world of genomic data
First 30 data sources listed here: http://www.tinyurl.com/plos-biology-repositive
Connecting the world of genomic data
Visit us at: http://repositive.io
Or tweet us @repositiveio
Free to use: http://discover.repositive.io
Fiona Nielsen, CEO
Email us: info@repositive.io

ICG-11 - genomic data projects around the world - nov 5 2016

Editor's notes

  1. Our mission is to speed up research and diagnostics for genetic diseases by enabling efficient and ethical access to genomic research data
  2. Because interpretation requires LOTS of data. And although data exists around the world, it is siloed, and even where it is available, it is not accessible. This is Jenn, a genetic researcher (our target customer), seeking to interpret data from genetic diseases and cancer. She needs data from other patients to compare with and interpret Mabel's DNA. She also has data available in her own lab, but she cannot share it because of concerns about how to handle secure access to sensitive data and vetting of users.
  3. Data is fragmented in unconnected silos, which makes it very difficult to discover. Tracking data and working with data access requests is a time-consuming and bureaucratic exercise. It is difficult to build a user community without best practices and tools/platforms where users can share their data experience and findings.
  4. Because interpretation requires LOTS of data. And although data exists around the world, it is siloed, and even where it is available, it is not accessible. This is Jenn, a genetic researcher (our target customer), seeking to interpret data from genetic diseases and cancer. She needs data from other patients to compare with and interpret Mabel's DNA. She also has data available in her own lab, but she cannot share it because of concerns about how to handle secure access to sensitive data and data governance, e.g. vetting of users.
  5. Further confounded by the data being highly fragmented. Siloed in repositories and institutions around the world.
  6. There are many public repositories, but it can be hugely confusing to know where to look for the right kind of data.
  7-10. The Repositive platform is an online community and marketplace connecting data consumers with data providers. On Repositive, Jenn has: easy, interactive search; a faster data access workflow; easy access to new data collaborators; and feedback on data from the community and colleagues to help assess data quality and utility. The Repositive platform and technology will remove barriers to data sharing and will incentivise users to explore, contribute and collaborate in alignment with best practices.
  11. FAIR data: https://www.force11.org/group/fairgroup/fairprinciples
  12. DNA.land, OpenSNP, Personal Genome Project; direct-to-consumer genetic tests & microbiome