European Life Sciences Infrastructure for Biological Information
www.elixir-europe.org
The ELIXIR Proteomics Community
Dr. Juan Antonio Vizcaíno
European Bioinformatics Institute (EMBL-EBI)
juan@ebi.ac.uk
Juan A. Vizcaíno
juan@ebi.ac.uk
12th CeBiTec Symposium: “Big data in Medicine & Biotechnology”
Bielefeld, 20 March 2018
Outline
• One slide intro to proteomics
• The ELIXIR Proteomics Community
• Plans for the near future
One slide intro to Mass Spectrometry proteomics
Hein et al., Handbook of Systems Biology, 2012
Proteins -> most drug targets
ELIXIR nodes supporting the new Community
• 11 ELIXIR nodes supported the application:
  • Germany (co-lead) (O. Kohlbacher)
  • Belgium (co-lead) (L. Martens)
  • Czech Republic
  • Denmark
  • Ireland
  • France
  • Netherlands
  • Spain
  • Sweden
  • United Kingdom
  • EMBL-EBI (co-lead) (Juan A. Vizcaíno)
Overall objectives
• The goal of the ELIXIR proteomics community is to develop and maintain sustainable proteomics tools and data resources.
• An essential part of the development will also be the ‘FAIRification’ of the resources (i.e. making the resources Findable, Accessible, Interoperable and Reusable).
• Integrate proteomics bioinformatics activities in ELIXIR.
White paper as the basis for this Community
Vizcaíno et al., F1000Research, 2017
Highlighting already existing resources and initiatives
Tools: Services and connectors to drive access and exploitation
Data: Sustaining Europe’s life science data infrastructure
Interoperability: Integration of data and services
Compute: Access, exchange and storage
Training: Professional skills for managing and exploiting data
European leadership: the world-leading PRIDE database
• PRIDE stores mass spectrometry (MS)-based proteomics data:
  • Peptide and protein expression data (identification and quantification)
  • Post-translational modifications
  • Mass spectra (raw data and peak lists)
  • Technical and biological metadata
  • Any other related information
• Full support for tandem MS approaches
• Any type of data can be stored
• Leading ProteomeXchange
• From July 2017, an ELIXIR core resource
http://www.ebi.ac.uk/pride/archive
Martens et al., Proteomics, 2005
Vizcaíno et al., NAR, 2016
ProteomeXchange: a global, distributed proteomics database
• Framework to allow standard data submission and dissemination pipelines between the main existing proteomics repositories.
• Repositories shown in the diagram: PRIDE (MS/MS data), MassIVE (MS/MS data), jPOST (MS/MS data), iProX (MS/MS data), PASSEL (SRM data)
• Data types shown in the diagram: Raw, ID/Q, Meta
• Mandatory data deposition
http://www.proteomexchange.org
Vizcaíno et al., Nat Biotechnol, 2014
Deutsch et al., NAR, 2017
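Datasets shared through such a framework are usually cited by accession. The snippet below is a minimal sketch, assuming the widely seen accession form of `PXD` followed by six digits (the slides do not state the identifier format). The helper names `is_pxd_accession` and `find_pxd_accessions` are hypothetical and not part of any ProteomeXchange software.

```python
from __future__ import annotations

import re

# Assumption: dataset accessions look like "PXD" plus exactly six digits.
PXD_PATTERN = re.compile(r"PXD\d{6}")


def is_pxd_accession(text: str) -> bool:
    """Return True if the whole string looks like a PXD-style accession."""
    return PXD_PATTERN.fullmatch(text.strip()) is not None


def find_pxd_accessions(text: str) -> list[str]:
    """Collect unique PXD-style accessions from free text, in order of appearance."""
    found: list[str] = []
    for match in re.findall(r"\bPXD\d{6}\b", text):
        if match not in found:
            found.append(match)
    return found


if __name__ == "__main__":
    sentence = "Data are available via PXD000001 and PXD012345."
    print(find_pxd_accessions(sentence))
```

A check like this could be used, for example, to pull dataset accessions out of the data availability statement of a manuscript before looking them up in a repository.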
PRIDE data submissions and data growth
> 2,400 datasets submitted in 2017
September, November and December 2017 were the record months in terms of submitted datasets
[Figures: datasets submitted per month; datasets submitted per year]
Stats: Data growth in EMBL-EBI resources
[Figure: data growth for sequence data, microarray, metabolomics and proteomics]
Data re-use in proteomics is increasing
Data download volume for PRIDE Archive in 2017: 295 TB
[Figure: PRIDE Archive downloads in TB per year, 2013 to 2017]
Do you want to learn more?
Martens & Vizcaíno, Trends Biochem Sci, 2017
Vaudel et al., Proteomics, 2016
European leadership: HUPO Proteomics Standards Initiative
• Develops open data standards for proteomics.
• Both data representation and annotation standards.
• Involves data producers, database providers, software producers, publishers, everyone who wants to be involved…
• Active Workgroups: MI, MS, PI, Mod and the new QC.
• Inter-group activities: MIAPE and Controlled Vocabularies.
• Started in 2002, so some experience already…
• One annual meeting in March-April, regular phone calls.
• Closer interaction with the metabolomics community (MSI).
http://www.psidev.info
Activities of the Proteomics Standards Initiative
Deutsch et al., J Proteome Res, 2017
Moving to “Proteoform” centric approaches
Smith et al., Nat Methods, 2013
Across-omics -> Proteogenomics approaches
• Proteomics data is combined with genomics and/or transcriptomics information, typically by using sequence databases generated from DNA sequencing efforts, RNA-Seq experiments, Ribo-Seq approaches, and long non-coding RNAs.
• Increasingly important in personalised medicine studies.
Nesvizhskii, Nat Methods, 2014
Data standards for proteogenomics: proBed and proBAM
• Same overall objective: to map identified peptides to genome coordinates. Different levels of detail:
  • proBed is tab-delimited and simpler, based on the original BED format. Less detailed.
  • proBAM is based on the original SAM/BAM formats, widely used in genomics. Much more detailed.
• They can be used as “Track Hubs” (e.g. integration with genome browsers)
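To illustrate what "mapping a peptide to genome coordinates" in a BED-style file looks like, here is a minimal sketch that reads only the first six standard BED columns (chrom, chromStart, chromEnd, name, score, strand). The example line, the use of the name column as a peptide label, and the `BedFeature` and `parse_bed_line` names are invented for illustration; the proBed specification defines further columns that this sketch ignores.

```python
from dataclasses import dataclass


@dataclass
class BedFeature:
    chrom: str
    start: int   # 0-based, inclusive (BED convention)
    end: int     # 0-based, exclusive (BED convention)
    name: str
    score: int
    strand: str

    @property
    def length(self) -> int:
        """Number of genomic bases spanned by the feature."""
        return self.end - self.start


def parse_bed_line(line: str) -> BedFeature:
    """Parse the first six tab-separated BED columns of one line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 6:
        raise ValueError(f"expected at least 6 columns, got {len(fields)}")
    chrom, start, end, name, score, strand = fields[:6]
    if strand not in {"+", "-", "."}:
        raise ValueError(f"unexpected strand value: {strand!r}")
    return BedFeature(chrom, int(start), int(end), name, int(score), strand)


# Hypothetical line: a 10-residue peptide spanning 30 bases on chr1.
example = "chr1\t1000\t1030\tpeptide_1\t900\t+"
feature = parse_bed_line(example)
print(feature.chrom, feature.start, feature.end, feature.length)
```

Because the first columns follow plain BED conventions, files of this kind can be loaded by generic genome browser tooling, which is what makes the Track Hub use case possible.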
ELIXIR Implementation Study (Feb 2017-June 2018)
• Title: “Mining the proteome: Enabling automated processing and analysis of large-scale proteomics data”.
• Goal: Development of open, reproducible and modular pipelines based on OpenMS for DDA (Data Dependent Acquisition) approaches.
• Deployment in the EMBL-EBI “Embassy Cloud”, with the goal that, in the future, they can be deployed in other cloud infrastructures and be reused by anyone in the community (e.g. hospitals).
• Connected to PRIDE, bringing the tools closer to the data.
• Who is involved?
  • EMBL-EBI (Vizcaíno & Newhouse).
  • ELIXIR-DE (Kohlbacher, EKUT; Eisenacher, RUB).
Software used
We opted for the OpenMS framework
Features:
• Tool modularisation
• Solutions for data handover between tools with standardised (PSI) formats
• Adapters for integrating third-party software (search engines, LuciPHOr, Fido, Percolator, etc.)
• Serves as a basis for integration into various workflow systems
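The ideas of tool modularisation and format-based handover listed above can be sketched in a few lines. This is an illustrative toy, not the OpenMS API: the `Step` class, the step names and the placeholder run functions are hypothetical, and the only real-world names used are the PSI format names mzML, mzIdentML and mzTab.

```python
from dataclasses import dataclass
from pathlib import PurePath
from typing import Callable, List


@dataclass(frozen=True)
class Step:
    name: str
    consumes: str                  # format the step reads
    produces: str                  # format the step writes
    run: Callable[[str], str]      # takes an input path, returns an output path


def placeholder(extension: str) -> Callable[[str], str]:
    """Stand-in for a real tool: only swaps the file extension."""
    def run(path: str) -> str:
        return str(PurePath(path).with_suffix(extension))
    return run


def validate_chain(steps: List[Step]) -> None:
    """Check that each step produces the format the next step consumes."""
    for previous, following in zip(steps, steps[1:]):
        if previous.produces != following.consumes:
            raise ValueError(
                f"{previous.name} produces {previous.produces}, "
                f"but {following.name} expects {following.consumes}"
            )


def run_pipeline(steps: List[Step], input_path: str) -> str:
    """Validate the chain, then pass each output path to the next step."""
    validate_chain(steps)
    path = input_path
    for step in steps:
        path = step.run(path)
    return path


# Hypothetical three-step identification pipeline.
STEPS = [
    Step("peak_picking", "mzML", "mzML", placeholder(".mzML")),
    Step("database_search", "mzML", "mzIdentML", placeholder(".mzid")),
    Step("export_results", "mzIdentML", "mzTab", placeholder(".mzTab")),
]

print(run_pipeline(STEPS, "sample.mzML"))
```

Declaring the input and output format of every step is what allows a third-party tool to be swapped in behind an adapter: as long as the adapter reads and writes the agreed formats, the rest of the chain does not need to change.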
Summary figure of the developed infrastructure
Thanks to
Just approved Implementation Study (2018-2019)
• Follow-up of the implementation study just mentioned.
• Title: “Extending open proteomics data analysis pipelines in the cloud: Additional tools and focus on scalability, supporting the dramatic growth of public proteomics data”
• It will start in August 2018 (1 year):
  • Led by ELIXIR-Belgium (Martens).
  • Participation of EMBL-EBI (Vizcaíno, Newhouse), ELIXIR-Germany (Kohlbacher), ELIXIR-France (Bouyssie), ELIXIR-Spain (Sabidó)
• It will include other tools and additional pipelines (CompOmics tools, QCloud, ProFI tools, etc).
Just approved Implementation Study (2018-2019)
• Assigned to the Community (10 ELIXIR nodes involved). It will start in June 2018 (1 year).
• Title: “Crowd-sourcing the annotation of public proteomics datasets to improve data reusability”.
• Apply software developed in the different nodes to improve automatic annotation pipelines linked to PRIDE (and QC assessment).
• Improve re-usability of public data.
Management of clinical proteomics data
Is proteomics data patient identifiable?
A couple of papers on this topic in 2016
Clear guidelines, policy and resources need to be developed
Summary
• Proteomics bioinformatics activities in Europe are very prominent world-wide
• Plans for the future:
  • Focus on ‘proteoform’-centric approaches.
  • Data integration approaches with other ‘omics’ technologies (e.g. genomics, metabolomics, etc).
  • Development of open, reproducible and scalable data analysis (and QC) workflows
  • Improve data management practices (metadata annotation, management of clinical data, …)
Acknowledgements
https://www.elixir-europe.org/communities/proteomics
Juan A. Vizcaíno
juan@ebi.ac.uk
12th CeBiTec Symposium: “Big data in Medicine & Biotechnology”
Bielefeld, 20 March 2018

More Related Content

What's hot

Data Strategies: Metadata, Open Data, Linked Data
Data Strategies: Metadata, Open Data, Linked DataData Strategies: Metadata, Open Data, Linked Data
Data Strategies: Metadata, Open Data, Linked DataSemantic Web Company
 
VIVO 2010 2010 Paper
VIVO 2010 2010 PaperVIVO 2010 2010 Paper
VIVO 2010 2010 PaperWilliam Gunn
 
Prototype SDX Bioinformatics Exchange: Demonstrating an Essential Use-Case fo...
Prototype SDX Bioinformatics Exchange: Demonstrating an Essential Use-Case fo...Prototype SDX Bioinformatics Exchange: Demonstrating an Essential Use-Case fo...
Prototype SDX Bioinformatics Exchange: Demonstrating an Essential Use-Case fo...US-Ignite
 
Bio Data World - The promise of FAIR data lakes - The Hyve - 20191204
Bio Data World - The promise of FAIR data lakes - The Hyve - 20191204Bio Data World - The promise of FAIR data lakes - The Hyve - 20191204
Bio Data World - The promise of FAIR data lakes - The Hyve - 20191204Kees van Bochove
 
EDF2014: Stefan Wrobel, Institute Director, Fraunhofer IAIS / Member of the b...
EDF2014: Stefan Wrobel, Institute Director, Fraunhofer IAIS / Member of the b...EDF2014: Stefan Wrobel, Institute Director, Fraunhofer IAIS / Member of the b...
EDF2014: Stefan Wrobel, Institute Director, Fraunhofer IAIS / Member of the b...European Data Forum
 

What's hot (6)

Hahn "Wikidata as a hub to library linked data re-use"
Hahn "Wikidata as a hub to library linked data re-use"Hahn "Wikidata as a hub to library linked data re-use"
Hahn "Wikidata as a hub to library linked data re-use"
 
Data Strategies: Metadata, Open Data, Linked Data
Data Strategies: Metadata, Open Data, Linked DataData Strategies: Metadata, Open Data, Linked Data
Data Strategies: Metadata, Open Data, Linked Data
 
VIVO 2010 2010 Paper
VIVO 2010 2010 PaperVIVO 2010 2010 Paper
VIVO 2010 2010 Paper
 
Prototype SDX Bioinformatics Exchange: Demonstrating an Essential Use-Case fo...
Prototype SDX Bioinformatics Exchange: Demonstrating an Essential Use-Case fo...Prototype SDX Bioinformatics Exchange: Demonstrating an Essential Use-Case fo...
Prototype SDX Bioinformatics Exchange: Demonstrating an Essential Use-Case fo...
 
Bio Data World - The promise of FAIR data lakes - The Hyve - 20191204
Bio Data World - The promise of FAIR data lakes - The Hyve - 20191204Bio Data World - The promise of FAIR data lakes - The Hyve - 20191204
Bio Data World - The promise of FAIR data lakes - The Hyve - 20191204
 
EDF2014: Stefan Wrobel, Institute Director, Fraunhofer IAIS / Member of the b...
EDF2014: Stefan Wrobel, Institute Director, Fraunhofer IAIS / Member of the b...EDF2014: Stefan Wrobel, Institute Director, Fraunhofer IAIS / Member of the b...
EDF2014: Stefan Wrobel, Institute Director, Fraunhofer IAIS / Member of the b...
 

Similar to ELIXIR Proteomics Community Drives Standards and Tools

Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...
Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...
Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...Juan Antonio Vizcaino
 
FAIR Data Experiences - Kees van Bochove - The Hyve
FAIR Data Experiences - Kees van Bochove - The HyveFAIR Data Experiences - Kees van Bochove - The Hyve
FAIR Data Experiences - Kees van Bochove - The HyveKees van Bochove
 
Public proteomics data: a (mostly unexploited) gold mine for computational re...
Public proteomics data: a (mostly unexploited) gold mine for computational re...Public proteomics data: a (mostly unexploited) gold mine for computational re...
Public proteomics data: a (mostly unexploited) gold mine for computational re...Juan Antonio Vizcaino
 
BioVis Meetup @ IEEE VIS 2015
BioVis Meetup @ IEEE VIS 2015BioVis Meetup @ IEEE VIS 2015
BioVis Meetup @ IEEE VIS 2015Nils Gehlenborg
 
apidays LIVE Australia 2021 - APIs enable global collaborations and accelerat...
apidays LIVE Australia 2021 - APIs enable global collaborations and accelerat...apidays LIVE Australia 2021 - APIs enable global collaborations and accelerat...
apidays LIVE Australia 2021 - APIs enable global collaborations and accelerat...apidays
 
Proteomics public data resources: enabling "big data" analysis in proteomics
Proteomics public data resources: enabling "big data" analysis in proteomicsProteomics public data resources: enabling "big data" analysis in proteomics
Proteomics public data resources: enabling "big data" analysis in proteomicsJuan Antonio Vizcaino
 
How to run and maintain a popular biological data repository?
How to run and maintain a popular biological data repository?How to run and maintain a popular biological data repository?
How to run and maintain a popular biological data repository?Juan Antonio Vizcaino
 
Introduction to EBI for Proteomics in ELIXIR
Introduction to EBI for Proteomics in ELIXIRIntroduction to EBI for Proteomics in ELIXIR
Introduction to EBI for Proteomics in ELIXIRJuan Antonio Vizcaino
 
EuroBioForum2014_speaker_bilbao
EuroBioForum2014_speaker_bilbaoEuroBioForum2014_speaker_bilbao
EuroBioForum2014_speaker_bilbaoEuroBioForum
 
Enabling automated processing and analysis of large-scale proteomics data
Enabling automated processing and analysis of large-scale proteomics dataEnabling automated processing and analysis of large-scale proteomics data
Enabling automated processing and analysis of large-scale proteomics dataJuan Antonio Vizcaino
 
PRIDE and ProteomeXchange: A golden age for working with public proteomics data
PRIDE and ProteomeXchange: A golden age for working with public proteomics dataPRIDE and ProteomeXchange: A golden age for working with public proteomics data
PRIDE and ProteomeXchange: A golden age for working with public proteomics dataJuan Antonio Vizcaino
 
Ophthalmology & Optometry 2.0
Ophthalmology & Optometry 2.0Ophthalmology & Optometry 2.0
Ophthalmology & Optometry 2.0PetteriTeikariPhD
 
Digital pathology and biobanks
Digital pathology and biobanksDigital pathology and biobanks
Digital pathology and biobanksYves Sucaet
 
Final APEC ERW 25 Aug 2022.pdf
Final APEC ERW 25 Aug 2022.pdfFinal APEC ERW 25 Aug 2022.pdf
Final APEC ERW 25 Aug 2022.pdfpantapong
 
International perspective for sharing publicly funded medical research data
International perspective for sharing publicly funded medical research dataInternational perspective for sharing publicly funded medical research data
International perspective for sharing publicly funded medical research dataARDC
 
The ProteomeXchange Consoritum: 2017 update
The ProteomeXchange Consoritum: 2017 updateThe ProteomeXchange Consoritum: 2017 update
The ProteomeXchange Consoritum: 2017 updateJuan Antonio Vizcaino
 
A proteomics data “gold mine” at your disposal: Now that the data is there, w...
A proteomics data “gold mine” at your disposal: Now that the data is there, w...A proteomics data “gold mine” at your disposal: Now that the data is there, w...
A proteomics data “gold mine” at your disposal: Now that the data is there, w...Juan Antonio Vizcaino
 

Similar to ELIXIR Proteomics Community Drives Standards and Tools (20)

Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...
Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...
Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...
 
FAIR Data Experiences - Kees van Bochove - The Hyve
FAIR Data Experiences - Kees van Bochove - The HyveFAIR Data Experiences - Kees van Bochove - The Hyve
FAIR Data Experiences - Kees van Bochove - The Hyve
 
ProteomeXchange update
ProteomeXchange updateProteomeXchange update
ProteomeXchange update
 
Public proteomics data: a (mostly unexploited) gold mine for computational re...
Public proteomics data: a (mostly unexploited) gold mine for computational re...Public proteomics data: a (mostly unexploited) gold mine for computational re...
Public proteomics data: a (mostly unexploited) gold mine for computational re...
 
BioVis Meetup @ IEEE VIS 2015
BioVis Meetup @ IEEE VIS 2015BioVis Meetup @ IEEE VIS 2015
BioVis Meetup @ IEEE VIS 2015
 
apidays LIVE Australia 2021 - APIs enable global collaborations and accelerat...
apidays LIVE Australia 2021 - APIs enable global collaborations and accelerat...apidays LIVE Australia 2021 - APIs enable global collaborations and accelerat...
apidays LIVE Australia 2021 - APIs enable global collaborations and accelerat...
 
Proteomics public data resources: enabling "big data" analysis in proteomics
Proteomics public data resources: enabling "big data" analysis in proteomicsProteomics public data resources: enabling "big data" analysis in proteomics
Proteomics public data resources: enabling "big data" analysis in proteomics
 
How to run and maintain a popular biological data repository?
How to run and maintain a popular biological data repository?How to run and maintain a popular biological data repository?
How to run and maintain a popular biological data repository?
 
Kurt Zatloukal
Kurt ZatloukalKurt Zatloukal
Kurt Zatloukal
 
Introduction to EBI for Proteomics in ELIXIR
Introduction to EBI for Proteomics in ELIXIRIntroduction to EBI for Proteomics in ELIXIR
Introduction to EBI for Proteomics in ELIXIR
 
EuroBioForum2014_speaker_bilbao
EuroBioForum2014_speaker_bilbaoEuroBioForum2014_speaker_bilbao
EuroBioForum2014_speaker_bilbao
 
Enabling automated processing and analysis of large-scale proteomics data
Enabling automated processing and analysis of large-scale proteomics dataEnabling automated processing and analysis of large-scale proteomics data
Enabling automated processing and analysis of large-scale proteomics data
 
PRIDE and ProteomeXchange: A golden age for working with public proteomics data
PRIDE and ProteomeXchange: A golden age for working with public proteomics dataPRIDE and ProteomeXchange: A golden age for working with public proteomics data
PRIDE and ProteomeXchange: A golden age for working with public proteomics data
 
Ophthalmology & Optometry 2.0
Ophthalmology & Optometry 2.0Ophthalmology & Optometry 2.0
Ophthalmology & Optometry 2.0
 
Digital pathology and biobanks
Digital pathology and biobanksDigital pathology and biobanks
Digital pathology and biobanks
 
Final APEC ERW 25 Aug 2022.pdf
Final APEC ERW 25 Aug 2022.pdfFinal APEC ERW 25 Aug 2022.pdf
Final APEC ERW 25 Aug 2022.pdf
 
Reuse of public proteomics data
Reuse of public proteomics dataReuse of public proteomics data
Reuse of public proteomics data
 
International perspective for sharing publicly funded medical research data
International perspective for sharing publicly funded medical research dataInternational perspective for sharing publicly funded medical research data
International perspective for sharing publicly funded medical research data
 
The ProteomeXchange Consoritum: 2017 update
The ProteomeXchange Consoritum: 2017 updateThe ProteomeXchange Consoritum: 2017 update
The ProteomeXchange Consoritum: 2017 update
 
A proteomics data “gold mine” at your disposal: Now that the data is there, w...
A proteomics data “gold mine” at your disposal: Now that the data is there, w...A proteomics data “gold mine” at your disposal: Now that the data is there, w...
A proteomics data “gold mine” at your disposal: Now that the data is there, w...
 

More from Juan Antonio Vizcaino

Reusing and integrating public proteomics data to improve our knowledge of th...
Reusing and integrating public proteomics data to improve our knowledge of th...Reusing and integrating public proteomics data to improve our knowledge of th...
Reusing and integrating public proteomics data to improve our knowledge of th...Juan Antonio Vizcaino
 
Introduction to the PSI standard data formats
Introduction to the PSI standard data formatsIntroduction to the PSI standard data formats
Introduction to the PSI standard data formatsJuan Antonio Vizcaino
 
Introduction to the Proteomics Bioinformatics Course 2018
Introduction to the Proteomics Bioinformatics Course 2018Introduction to the Proteomics Bioinformatics Course 2018
Introduction to the Proteomics Bioinformatics Course 2018Juan Antonio Vizcaino
 
ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...
ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...
ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...Juan Antonio Vizcaino
 
Introduction to the Proteomics Bioinformatics Course 2017
Introduction to the Proteomics Bioinformatics Course 2017Introduction to the Proteomics Bioinformatics Course 2017
Introduction to the Proteomics Bioinformatics Course 2017Juan Antonio Vizcaino
 
Is it feasible to identify novel biomarkers by mining public proteomics data?
Is it feasible to identify novel biomarkers by mining public proteomics data?Is it feasible to identify novel biomarkers by mining public proteomics data?
Is it feasible to identify novel biomarkers by mining public proteomics data?Juan Antonio Vizcaino
 
The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...
The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...
The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...Juan Antonio Vizcaino
 
The Proteomics Standards Initiative (PSI)
The Proteomics Standards Initiative (PSI)The Proteomics Standards Initiative (PSI)
The Proteomics Standards Initiative (PSI)Juan Antonio Vizcaino
 
Introduction to the Proteomics Bioinformatics Course 2016
Introduction to the Proteomics Bioinformatics Course 2016Introduction to the Proteomics Bioinformatics Course 2016
Introduction to the Proteomics Bioinformatics Course 2016Juan Antonio Vizcaino
 

More from Juan Antonio Vizcaino (20)

Reusing and integrating public proteomics data to improve our knowledge of th...
Reusing and integrating public proteomics data to improve our knowledge of th...Reusing and integrating public proteomics data to improve our knowledge of th...
Reusing and integrating public proteomics data to improve our knowledge of th...
 
Introduction to the PSI standard data formats
Introduction to the PSI standard data formatsIntroduction to the PSI standard data formats
Introduction to the PSI standard data formats
 
PRIDE resources and ProteomeXchange
PRIDE resources and ProteomeXchangePRIDE resources and ProteomeXchange
PRIDE resources and ProteomeXchange
 
Proteomics repositories
Proteomics repositoriesProteomics repositories
Proteomics repositories
 
Introduction to the Proteomics Bioinformatics Course 2018
Introduction to the Proteomics Bioinformatics Course 2018Introduction to the Proteomics Bioinformatics Course 2018
Introduction to the Proteomics Bioinformatics Course 2018
 
ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...
ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...
ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...
 
PSI-Proteome Informatics update
PSI-Proteome Informatics updatePSI-Proteome Informatics update
PSI-Proteome Informatics update
 
The ELIXIR Proteomics Community
The ELIXIR Proteomics CommunityThe ELIXIR Proteomics Community
The ELIXIR Proteomics Community
 
Reuse of public proteomics data
Reuse of public proteomics dataReuse of public proteomics data
Reuse of public proteomics data
 
PRIDE and ProteomeXchange
PRIDE and ProteomeXchangePRIDE and ProteomeXchange
PRIDE and ProteomeXchange
 
Proteomics repositories
Proteomics repositoriesProteomics repositories
Proteomics repositories
 
Proteomics data standards
Proteomics data standardsProteomics data standards
Proteomics data standards
 
Introduction to the Proteomics Bioinformatics Course 2017
Introduction to the Proteomics Bioinformatics Course 2017Introduction to the Proteomics Bioinformatics Course 2017
Introduction to the Proteomics Bioinformatics Course 2017
 
Is it feasible to identify novel biomarkers by mining public proteomics data?
Is it feasible to identify novel biomarkers by mining public proteomics data?Is it feasible to identify novel biomarkers by mining public proteomics data?
Is it feasible to identify novel biomarkers by mining public proteomics data?
 
The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...
The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...
The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...
 
ProteomeXchange update 2017
ProteomeXchange update 2017ProteomeXchange update 2017
ProteomeXchange update 2017
 
The Proteomics Standards Initiative (PSI)
The Proteomics Standards Initiative (PSI)The Proteomics Standards Initiative (PSI)
The Proteomics Standards Initiative (PSI)
 
Introduction to the Proteomics Bioinformatics Course 2016
Introduction to the Proteomics Bioinformatics Course 2016Introduction to the Proteomics Bioinformatics Course 2016
Introduction to the Proteomics Bioinformatics Course 2016
 
Reuse of public data in proteomics
Reuse of public data in proteomicsReuse of public data in proteomics
Reuse of public data in proteomics
 
Pride and ProteomeXchange
Pride and ProteomeXchangePride and ProteomeXchange
Pride and ProteomeXchange
 


ELIXIR Proteomics Community Drives Standards and Tools

  • 1. The ELIXIR Proteomics Community
      Dr. Juan Antonio Vizcaíno, European Bioinformatics Institute (EMBL-EBI), juan@ebi.ac.uk
      European Life Sciences Infrastructure for Biological Information, www.elixir-europe.org
  • 2. Outline
      • One slide intro to proteomics
      • The ELIXIR Proteomics Community
      • Plans for the near future
      Juan A. Vizcaíno, juan@ebi.ac.uk. 12th CeBiTec Symposium: “Big data in Medicine & Biotechnology”, Bielefeld, 20 March 2018
  • 3. One slide intro to Mass Spectrometry proteomics
      • Proteins -> most drug targets
      Hein et al., Handbook of Systems Biology, 2012
  • 4. Outline
      • One slide intro to proteomics
      • The ELIXIR Proteomics Community
      • Plans for the near future
  • 5. ELIXIR nodes supporting the new Community
      • 11 ELIXIR nodes supported the application:
          • Germany (co-lead) (O. Kohlbacher)
          • Belgium (co-lead) (L. Martens)
          • Czech Republic
          • Denmark
          • Ireland
          • France
          • Netherlands
          • Spain
          • Sweden
          • United Kingdom
          • EMBL-EBI (co-lead) (Juan A. Vizcaíno)
  • 6. Overall objectives
      • The goal of the ELIXIR proteomics community is to develop and maintain sustainable proteomics tools and data resources
      • An essential part of the development will also be the ‘FAIRification’ of the resources (i.e. making the resources FAIR)
      • Integrate proteomics bioinformatics activities in ELIXIR
  • 7. White paper as the basis for this Community
      Vizcaíno et al., F1000Research, 2017
  • 8. Highlighting already existing resources and initiatives
      • Tools: Services and connectors to drive access and exploitation
      • Data: Sustaining Europe’s life science data infrastructure
      • Interoperability: Integration of data and services
      • Compute: Access, exchange and storage
      • Training: Professional skills for managing and exploiting data
  • 9. Highlighting already existing resources and initiatives
      • Tools: Services and connectors to drive access and exploitation
      • Data: Sustaining Europe’s life science data infrastructure
      • Interoperability: Integration of data and services
      • Compute: Access, exchange and storage
      • Training: Professional skills for managing and exploiting data
  • 10. European leadership: the world-leading PRIDE database
      • PRIDE stores mass spectrometry (MS)-based proteomics data:
          • Peptide and protein expression data (identification and quantification)
          • Post-translational modifications
          • Mass spectra (raw data and peak lists)
          • Technical and biological metadata
          • Any other related information
      • Full support for tandem MS approaches
      • Any type of data can be stored
      • Leading ProteomeXchange
      • From July 2017, an ELIXIR core resource
      http://www.ebi.ac.uk/pride/archive
      Martens et al., Proteomics, 2005; Vizcaíno et al., NAR, 2016
  • 11. ProteomeXchange: A Global, distributed proteomics database
      • Framework to allow standard data submission and dissemination pipelines between the main existing proteomics repositories.
      • Repositories: PASSEL (SRM data), PRIDE (MS/MS data), MassIVE (MS/MS data), jPOST (MS/MS data), iProX (MS/MS data)
      • Mandatory data deposition
      • Diagram labels: Raw, ID/Q, Meta
      http://www.proteomexchange.org
      Vizcaíno et al., Nat Biotechnol, 2014; Deutsch et al., NAR, 2017
  • 12. PRIDE data submissions and data growth
      • > 2,400 datasets submitted in 2017
      • September, November and December 2017 were the record months in terms of submitted datasets
      [Charts: datasets submitted per month; datasets submitted per year]
  • 13. Stats: Data growth in EMBL-EBI resources
      [Chart series: Sequence data, Micro-array, Metabolomics, Proteomics]
  • 14. Data re-use in proteomics is increasing
      • Data download volume for PRIDE Archive in 2017: 295 TB
      [Chart: downloads in TB per year, 2013 to 2017]
  • 15. Do you want to learn more?
      Martens & Vizcaíno, Trends Biochem Sci, 2017; Vaudel et al., Proteomics, 2016
  • 16. Highlighting already existing resources and initiatives
      • Tools: Services and connectors to drive access and exploitation
      • Data: Sustaining Europe’s life science data infrastructure
      • Interoperability: Integration of data and services
      • Compute: Access, exchange and storage
      • Training: Professional skills for managing and exploiting data
  • 17. European leadership: HUPO Proteomics Standards Initiative
      • Develops open data standards for proteomics.
      • Both data representation and annotation standards.
      • Involves data producers, database providers, software producers, publishers, everyone who wants to be involved…
      • Active Workgroups: MI, MS, PI, Mod and the new QC.
      • Inter-group activities: MIAPE and Controlled Vocabularies.
      • Started in 2002, so some experience already…
      • One annual meeting in March-April, regular phone calls.
      • Closer interaction with the metabolomics community (MSI).
      http://www.psidev.info
  • 18. Activities of the Proteomics Standards Initiative
      Deutsch et al., J Proteome Res, 2017
  • 19. Outline
      • One slide intro to proteomics
      • The ELIXIR Proteomics Community
      • Plans for the near future
  • 21. Moving to “Proteoform” centric approaches
      Smith et al., Nat Methods, 2013
  • 23. Across-omics -> Proteogenomics approaches
      • Proteomics data is combined with genomics and/or transcriptomics information, typically by using sequence databases generated from DNA sequencing efforts, RNA-Seq experiments, Ribo-Seq approaches, and long non-coding RNAs.
      • Increasingly important in personalised medicine studies.
      Nesvizhskii, Nat Methods, 2014
  • 24. Data standards for proteogenomics: proBed and proBAM
      • Same overall objective: to map identified peptides to genome coordinates. They differ in level of detail:
          • proBed is tab-delimited and simpler, based on the original BED format. Less detail.
          • proBAM is based on the original SAM/BAM formats, widely used in genomics. Much more detail.
      • They can be used as “Track Hubs” (e.g. integration with genome browsers)
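The idea of mapping an identified peptide to genome coordinates in a BED-style line can be sketched as follows. This is a minimal illustration, not the proBed specification: it writes only the 12 standard BED columns that proBed builds on and omits the proteomics-specific columns that proBed adds. The `PeptideMapping` class, the `to_bed12` function and the example coordinates are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PeptideMapping:
    """A peptide mapped onto one chromosome (hypothetical helper type)."""
    chrom: str
    # Genomic intervals covered by the peptide, 0-based and half-open,
    # sorted by start. More than one interval means the peptide spans
    # a splice junction.
    blocks: List[Tuple[int, int]]
    strand: str
    name: str
    score: int = 0


def to_bed12(mapping: PeptideMapping) -> str:
    """Format a mapping as one tab-delimited line with the 12 standard BED columns."""
    if not mapping.blocks:
        raise ValueError("a mapping needs at least one genomic block")
    chrom_start = mapping.blocks[0][0]
    chrom_end = mapping.blocks[-1][1]
    block_sizes = ",".join(str(end - start) for start, end in mapping.blocks)
    # BED block starts are relative to chromStart, not absolute coordinates.
    block_starts = ",".join(str(start - chrom_start) for start, _ in mapping.blocks)
    fields = [
        mapping.chrom, chrom_start, chrom_end, mapping.name, mapping.score,
        mapping.strand,
        chrom_start, chrom_end,   # thickStart / thickEnd: whole feature drawn thick
        "0",                      # itemRgb: default colour
        len(mapping.blocks), block_sizes, block_starts,
    ]
    return "\t".join(str(field) for field in fields)


# A 9-residue peptide (27 nucleotides) split across two exons.
example = PeptideMapping("chr1", [(1000, 1012), (2000, 2015)], "+", "peptide_1")
line = to_bed12(example)
```

Because the line is ordinary BED, tools that read BED can display it; the extra detail that distinguishes proBed and proBAM (peptide sequence, scores, modifications) would be carried in additional fields not shown here.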
  • 25. Data, Tools, Compute, Interoperability, Training
  • 27. ELIXIR Implementation Study (Feb 2017-June 2018)
      • Title: “Mining the proteome: Enabling automated processing and analysis of large-scale proteomics data”.
      • Goal: Development of open, reproducible and modular pipelines based on OpenMS for DDA (Data Dependent Acquisition) approaches.
      • Deployment in the EMBL-EBI “Embassy Cloud”, with the goal that in the future the pipelines can be deployed in other cloud infrastructures and reused by anyone in the community (e.g. hospitals).
      • Connected to PRIDE, bringing the tools closer to the data.
      • Who is involved?
          • EMBL-EBI (Vizcaíno & Newhouse).
          • ELIXIR-DE (Kohlbacher, EKUT; Eisenacher, RUB)
  • 28. Software used
      We opted for the OpenMS framework. Features:
      • Tool modularisation
      • Solutions for data handover between tools with standardised (PSI) formats
      • Adapters for integrating third-party software (search engines, LuciPHOr, Fido, Percolator, etc.)
      • Integration into various workflow systems as a basis
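The combination of tool modularisation and data handover through standardised formats can be illustrated with a small sketch. It does not use the framework's actual API: `Step`, `validate`, `run_pipeline` and the three toy steps are hypothetical names, and the step bodies are placeholders. Only the format labels (mzML, mzIdentML, mzTab) are real PSI format names. The point shown is that each step declares what it consumes and produces, so a chain can be checked before anything runs.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Payload = Dict[str, Any]


@dataclass(frozen=True)
class Step:
    """One modular tool: declares the format it reads and the format it writes."""
    name: str
    consumes: str
    produces: str
    run: Callable[[Payload], Payload]


def validate(steps: List[Step]) -> None:
    """Reject a chain where a step's output format is not its successor's input format."""
    for upstream, downstream in zip(steps, steps[1:]):
        if upstream.produces != downstream.consumes:
            raise ValueError(
                f"{upstream.name} produces {upstream.produces}, "
                f"but {downstream.name} expects {downstream.consumes}"
            )


def run_pipeline(steps: List[Step], payload: Payload) -> Payload:
    """Validate the chain, then hand each step's output to the next step."""
    validate(steps)
    for step in steps:
        payload = step.run(payload)
    return payload


# Toy stand-ins for real tools (assumptions, for illustration only).
def toy_search(payload: Payload) -> Payload:
    # Pretend each spectrum yields one match with a made-up score.
    return {"psms": [(scan, 1.0 / (i + 1)) for i, scan in enumerate(payload["spectra"])]}


def toy_filter(payload: Payload) -> Payload:
    # Keep matches at or above an arbitrary score threshold.
    return {"psms": [psm for psm in payload["psms"] if psm[1] >= 0.5]}


def toy_export(payload: Payload) -> Payload:
    return {"rows": len(payload["psms"])}


PIPELINE = [
    Step("search", "mzML", "mzIdentML", toy_search),
    Step("filter", "mzIdentML", "mzIdentML", toy_filter),
    Step("export", "mzIdentML", "mzTab", toy_export),
]

result = run_pipeline(PIPELINE, {"spectra": ["scan=1", "scan=2", "scan=3"]})
```

Declaring formats on each step is what lets a third-party tool be swapped in behind an adapter: as long as the adapter consumes and produces the agreed formats, the rest of the chain is unchanged.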
  • 29. Summary figure of the developed infrastructure
      [Figure: summary of the developed infrastructure, with a “Thanks to” acknowledgement panel]
  • 30. Just approved Implementation Study (2018-2019)
      • Follow-up of the implementation study just mentioned.
      • Title: “Extending open proteomics data analysis pipelines in the cloud: Additional tools and focus on scalability, supporting the dramatic growth of public proteomics data”
      • It will start in August 2018 (1 year):
          • Led by ELIXIR-Belgium (Martens).
          • Participation of EMBL-EBI (Vizcaíno, Newhouse), ELIXIR-Germany (Kohlbacher), ELIXIR-France (Bouyssie), ELIXIR-Spain (Sabidó)
      • It will include other tools and additional pipelines (Compomics tools, QCloud, PROFI tools, etc).
  • 31. Data, Tools, Compute, Interoperability, Training
  • 33. Just approved Implementation Study (2018-2019)
      • Assigned to the Community (10 ELIXIR nodes involved). It will start in June 2018 (1 year).
      • Title: “Crowd-sourcing the annotation of public proteomics datasets to improve data reusability”.
      • Apply software developed in the different nodes to improve automatic annotation pipelines linked to PRIDE (and QC assessment).
      • Improve re-usability of public data.
  • 34. Management of clinical proteomics data
      • Is proteomics data patient identifiable?
      • A couple of papers on this topic in 2016
      • Clear guidelines, policy and resources need to be developed
  • 35. Data, Tools, Compute, Interoperability, Training
  • 37. Summary
      • Proteomics bioinformatics activities in Europe are very prominent world-wide
      • Plans for the future:
          • Focus on proteoform-centric approaches.
          • Data integration approaches with other ‘omics’ technologies (e.g. genomics, metabolomics, etc).
          • Development of open, reproducible and scalable data analysis (and QC) workflows
          • Improve data management practices (metadata annotation, management of clinical data, …)
  • 38. Acknowledgements
      https://www.elixir-europe.org/communities/proteomics