Experiences to learn from the mass
spectrometry proteomics field
Dr. Juan Antonio Vizcaíno
Proteomics Team Leader
EMBL-EBI
Hinxton, Cambridge, UK
Juan A. Vizcaíno
juan@ebi.ac.uk
12th Conference of the Metabolomics Society
Dublin, 27 June 2016
•Develops data format standards for proteomics.
•Both data representation and annotation standards.
•Involves data producers, database providers, software producers,
publishers, …
•Active Workgroups: MI, MS, PI and now a new QC group.
•Inter-group activities: MIAPE and Controlled Vocabularies.
•Started in 2002, so some experience already…
•One annual meeting in March-April, regular phone calls.
•Peer Review for standards: PSI document process.
http://www.psidev.info
HUPO Proteomics Standards Initiative
Current PSI Proteomics Standard File Formats for
Mass Spectrometry
• mzML: MS data
• mzIdentML: Identification
• mzQuantML: Quantitation
• mzTab: Final Results
• TraML: SRM
• mzML is already actively used to store MS data
(very flexible format).
• mzTab is a tab-delimited format that is being
extended to better support MS metabolomics data.
It can be used for both identification
and quantification results.
• Meeting next week in Liverpool organised by A. Jones.
• mzQuantML and TraML could be used with small
molecule data, but this has not been tested.
Reuse of data standards in metabolomics
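Because mzTab is tab-delimited, it can be read with very little code. The following is a minimal sketch, assuming the mzTab 1.0 line-prefix convention (MTD for metadata, SMH/SML for the small molecule header and rows, COM for comments). The function name `read_mztab` and the toy file content are hypothetical, and the toy header is abbreviated: a real mzTab file carries more mandatory columns.

```python
import csv
import io
from collections import defaultdict

# Header-line prefix -> data-line prefix (mzTab 1.0 line-prefix convention).
HEADER_TO_ROW = {"PRH": "PRT", "PEH": "PEP", "PSH": "PSM", "SMH": "SML"}


def read_mztab(text):
    """Split mzTab text into a metadata dict and per-section lists of row dicts."""
    metadata = {}
    columns = {}                   # row prefix -> column names from the header line
    sections = defaultdict(list)   # row prefix -> list of rows as dicts
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        if not fields or fields[0] == "COM":
            continue  # skip blank lines and comments
        prefix, rest = fields[0], fields[1:]
        if prefix == "MTD" and len(rest) >= 2:
            metadata[rest[0]] = rest[1]
        elif prefix in HEADER_TO_ROW:
            columns[HEADER_TO_ROW[prefix]] = rest
        elif prefix in columns:
            sections[prefix].append(dict(zip(columns[prefix], rest)))
    return metadata, dict(sections)


# Toy content for illustration only (hypothetical identifiers, abbreviated columns).
EXAMPLE = "\n".join([
    "COM\tToy example, not real data",
    "MTD\tmzTab-version\t1.0.0",
    "MTD\tdescription\tHypothetical small molecule result",
    "",
    "SMH\tidentifier\tchemical_formula\tsmallmolecule_abundance_assay[1]",
    "SML\tmol_1\tC6H12O6\t1500.5",
    "SML\tmol_2\tH2O\t820.0",
])

metadata, sections = read_mztab(EXAMPLE)
for row in sections["SML"]:
    print(row["identifier"], row["chemical_formula"])
```

The same loop picks up the protein (PRH/PRT), peptide (PEH/PEP) and PSM (PSH/PSM) sections, which is one reason a single reader can serve both proteomics and metabolomics results.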
Current Standard File Formats that are or could be
used in metabolomics
• mzML: MS data
• mzIdentML: Identification
• mzQuantML*: Quantitation
• mzTab: Final Results
• TraML*: SRM
Current vision for data exchange standards in MS
Neumann (IPB-Halle), Proteomics and HUPO-PSI community
imzML: data standard for mass imaging data
http://www.imzml.org
Not a PSI format: Based on mzML
qcML files to be generated after submission
• XML format that captures output from QC pipelines
• Don’t reinvent the wheel! There is no need…
• Software libraries (APIs) to handle the standards.
• Data converters.
• Data visualisation tools.
• Data analysis tools and workflows.
• A big proportion of the available software is open
source.
Opportunity to reuse and extend existing software
mzML: more software available
The most popular search engines support mzML
Many parser libraries available
Conversion from raw files into mzML
http://www.psidev.info/mzml_1_0_0
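To illustrate why parser libraries for mzML are straightforward to write, here is a minimal sketch using only the Python standard library. It assumes the usual mzML layout (spectrum elements holding binaryDataArray elements annotated with PSI-MS controlled vocabulary terms) and handles only uncompressed 64-bit little-endian arrays. The helper names and the toy document are hypothetical; a production parser would also cover zlib compression, 32-bit floats and indexed files.

```python
import base64
import struct
import xml.etree.ElementTree as ET

NS = {"mz": "http://psi.hupo.org/ms/mzml"}
MZ_ARRAY = "MS:1000514"   # PSI-MS CV accession for "m/z array"
FLOAT64 = "MS:1000523"    # PSI-MS CV accession for "64-bit float"


def encode_doubles(values):
    """Pack floats as little-endian doubles and base64-encode them."""
    packed = struct.pack("<%dd" % len(values), *values)
    return base64.b64encode(packed).decode("ascii")


# Toy single-spectrum document for illustration only (not a complete, valid mzML file).
TOY_MZML = """<mzML xmlns="http://psi.hupo.org/ms/mzml">
  <run id="toy_run">
    <spectrumList count="1">
      <spectrum id="scan=1" index="0" defaultArrayLength="3">
        <cvParam cvRef="MS" accession="MS:1000511" name="ms level" value="1"/>
        <binaryDataArrayList count="1">
          <binaryDataArray>
            <cvParam cvRef="MS" accession="MS:1000523" name="64-bit float"/>
            <cvParam cvRef="MS" accession="MS:1000576" name="no compression"/>
            <cvParam cvRef="MS" accession="MS:1000514" name="m/z array"/>
            <binary>%s</binary>
          </binaryDataArray>
        </binaryDataArrayList>
      </spectrum>
    </spectrumList>
  </run>
</mzML>""" % encode_doubles([100.0, 200.5, 300.25])


def read_mz_arrays(xml_text):
    """Return {spectrum id: list of m/z values} for uncompressed 64-bit arrays."""
    root = ET.fromstring(xml_text)
    result = {}
    for spectrum in root.iterfind(".//mz:spectrum", NS):
        for array in spectrum.iterfind(".//mz:binaryDataArray", NS):
            accessions = {p.get("accession") for p in array.iterfind("mz:cvParam", NS)}
            if MZ_ARRAY in accessions and FLOAT64 in accessions:
                raw = base64.b64decode(array.find("mz:binary", NS).text)
                values = struct.unpack("<%dd" % (len(raw) // 8), raw)
                result[spectrum.get("id")] = list(values)
    return result


print(read_mz_arrays(TOY_MZML))
```

The controlled vocabulary accessions, rather than element names, tell the reader what each array contains, which is what makes the format flexible enough to reuse outside proteomics.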
Data visualisation: PRIDE Inspector Toolsuite
Wang et al., Nat. Biotechnology, 2012
Perez-Riverol et al., MCP, 2016
PRIDE Inspector Toolsuite
PRIDE Inspector Toolsuite supports:
- PRIDE XML
- mzIdentML
- mzML & all types of spectra files
- mzTab identification and quantification
https://github.com/PRIDE-Toolsuite/
OpenMS/TOPP
• OpenMS – an open-source C++ framework for computational mass
spectrometry
• Jointly developed at ETH Zürich, FU Berlin, University of Tübingen
• Open source: BSD 3-clause license
• Portable: available on Windows, OSX, and Linux
• TOPP – The OpenMS Proteomics Pipeline
• Building blocks: one application for each analysis step
• All applications share identical user interfaces
• Uses PSI standard formats and integrates seamlessly with other applications supporting
these formats
• Can be integrated in various workflow systems
• TOPPAS – TOPP Pipeline Assistant
• Galaxy
• WS-PGRADE
• KNIME
Kohlbacher et al., Bioinformatics (2007), 23:e191
ProteomeXchange Consortium
• Goal: Development of a framework to allow
standard data submission and dissemination
pipelines between the main existing proteomics
repositories.
• Includes PeptideAtlas (ISB, Seattle), PRIDE
(Cambridge, UK) and MassIVE (UCSD, San Diego);
jPOST (Japan) will be integrated in July 2016.
• EU FP7 CA (01/2011 to 06/2014).
• Common identifier space (PXD identifiers)
• Two supported data workflows: MS/MS and SRM.
• Main objective: Make life easier for researchers
http://www.proteomexchange.org
Vizcaíno et al., Nat Biotechnol, 2014
PRIDE Archive submitted datasets up until 1st April, 2016
• In the last complete year: on average, >150 submitted datasets per
month
• Size of PRIDE Archive: ~ 220TB
Vendor support for mzIdentML has grown in
parallel with the number of submitted datasets
[Diagram: search engines → mzIdentML; search engine results + MS files]
- Mascot
- MSGF+
- Myrimatch and related tools from D. Tabb’s lab
- OpenMS
- PEAKS
- PeptideShaker (several open source tools)
- ProCon (ProteomeDiscoverer, Sequest)
- Scaffold
- TPP via the idConvert tool (ProteoWizard)
- ProteinPilot (from version 5.0)
- X!Tandem (from PILEDRIVER version)
- Others: library for X!Tandem conversion, lab
internal pipelines, …
- Crux
An increasing number of tools support export to mzIdentML 1.1
Updated list: http://www.psidev.info/tools-implementing-
•Develop tools in parallel with the data standards.
•Don’t reinvent the wheel! Many ideas and software already
there.
•Ideally, get vendors involved as soon as possible.
•Data repositories and data standards are a perfect match.
Conclusions
Acknowledgements and further reading…
http://www.psidev.info
Poster P18
Questions?

Weitere ähnliche Inhalte

Was ist angesagt?

An overview of the PRIDE ecosystem of resources and computational tools for m...
An overview of the PRIDE ecosystem of resources and computational tools for m...An overview of the PRIDE ecosystem of resources and computational tools for m...
An overview of the PRIDE ecosystem of resources and computational tools for m...Juan Antonio Vizcaino
 
Mining the hidden proteome using hundreds of public proteomics datasets
Mining the hidden proteome using hundreds of public proteomics datasetsMining the hidden proteome using hundreds of public proteomics datasets
Mining the hidden proteome using hundreds of public proteomics datasetsJuan Antonio Vizcaino
 
How to run and maintain a popular biological data repository?
How to run and maintain a popular biological data repository?How to run and maintain a popular biological data repository?
How to run and maintain a popular biological data repository?Juan Antonio Vizcaino
 
TIB's action for research data managament as a national library's strategy in...
TIB's action for research data managament as a national library's strategy in...TIB's action for research data managament as a national library's strategy in...
TIB's action for research data managament as a national library's strategy in...Peter Löwe
 
What is Reproducibility? The R* brouhaha (and how Research Objects can help)
What is Reproducibility? The R* brouhaha (and how Research Objects can help)What is Reproducibility? The R* brouhaha (and how Research Objects can help)
What is Reproducibility? The R* brouhaha (and how Research Objects can help)Carole Goble
 
OpenAIRE in 8 minutes - Introduction to European einfrastructures session at ...
OpenAIRE in 8 minutes - Introduction to European einfrastructures session at ...OpenAIRE in 8 minutes - Introduction to European einfrastructures session at ...
OpenAIRE in 8 minutes - Introduction to European einfrastructures session at ...OpenAIRE
 
FAIR Data, Operations and Model management for Systems Biology and Systems Me...
FAIR Data, Operations and Model management for Systems Biology and Systems Me...FAIR Data, Operations and Model management for Systems Biology and Systems Me...
FAIR Data, Operations and Model management for Systems Biology and Systems Me...Carole Goble
 
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...Carole Goble
 
ICIC 2013 Conference Proceedings Uwe Rosemann TIB
ICIC 2013 Conference Proceedings Uwe Rosemann TIBICIC 2013 Conference Proceedings Uwe Rosemann TIB
ICIC 2013 Conference Proceedings Uwe Rosemann TIBDr. Haxel Consult
 
FAIR data and model management for systems biology.
FAIR data and model management for systems biology.FAIR data and model management for systems biology.
FAIR data and model management for systems biology.FAIRDOM
 
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...Carole Goble
 
Reflections on a (slightly unusual) multi-disciplinary academic career
Reflections on a (slightly unusual) multi-disciplinary academic careerReflections on a (slightly unusual) multi-disciplinary academic career
Reflections on a (slightly unusual) multi-disciplinary academic careerCarole Goble
 
Reproducible Research: how could Research Objects help
Reproducible Research: how could Research Objects helpReproducible Research: how could Research Objects help
Reproducible Research: how could Research Objects helpCarole Goble
 
Reproducibility, Research Objects and Reality, Leiden 2016
Reproducibility, Research Objects and Reality, Leiden 2016Reproducibility, Research Objects and Reality, Leiden 2016
Reproducibility, Research Objects and Reality, Leiden 2016Carole Goble
 
Introduction to FAIRDOM
Introduction to FAIRDOMIntroduction to FAIRDOM
Introduction to FAIRDOMCarole Goble
 
Improving the Management of Computational Models -- Invited talk at the EBI
Improving the Management of Computational Models -- Invited talk at the EBIImproving the Management of Computational Models -- Invited talk at the EBI
Improving the Management of Computational Models -- Invited talk at the EBIMartin Scharm
 
Research Objects, SEEK and FAIRDOM
Research Objects, SEEK and FAIRDOMResearch Objects, SEEK and FAIRDOM
Research Objects, SEEK and FAIRDOMCarole Goble
 
Capturing the context: one small(ish step for modellers, one giant leap for m...
Capturing the context: one small(ish step for modellers, one giant leap for m...Capturing the context: one small(ish step for modellers, one giant leap for m...
Capturing the context: one small(ish step for modellers, one giant leap for m...FAIRDOM
 
Research Shared: researchobject.org
Research Shared: researchobject.orgResearch Shared: researchobject.org
Research Shared: researchobject.orgNorman Morrison
 

Was ist angesagt? (20)

Pride cluster presentation
Pride cluster presentation Pride cluster presentation
Pride cluster presentation
 
An overview of the PRIDE ecosystem of resources and computational tools for m...
An overview of the PRIDE ecosystem of resources and computational tools for m...An overview of the PRIDE ecosystem of resources and computational tools for m...
An overview of the PRIDE ecosystem of resources and computational tools for m...
 
Mining the hidden proteome using hundreds of public proteomics datasets
Mining the hidden proteome using hundreds of public proteomics datasetsMining the hidden proteome using hundreds of public proteomics datasets
Mining the hidden proteome using hundreds of public proteomics datasets
 
How to run and maintain a popular biological data repository?
How to run and maintain a popular biological data repository?How to run and maintain a popular biological data repository?
How to run and maintain a popular biological data repository?
 
TIB's action for research data managament as a national library's strategy in...
TIB's action for research data managament as a national library's strategy in...TIB's action for research data managament as a national library's strategy in...
TIB's action for research data managament as a national library's strategy in...
 
What is Reproducibility? The R* brouhaha (and how Research Objects can help)
What is Reproducibility? The R* brouhaha (and how Research Objects can help)What is Reproducibility? The R* brouhaha (and how Research Objects can help)
What is Reproducibility? The R* brouhaha (and how Research Objects can help)
 
OpenAIRE in 8 minutes - Introduction to European einfrastructures session at ...
OpenAIRE in 8 minutes - Introduction to European einfrastructures session at ...OpenAIRE in 8 minutes - Introduction to European einfrastructures session at ...
OpenAIRE in 8 minutes - Introduction to European einfrastructures session at ...
 
FAIR Data, Operations and Model management for Systems Biology and Systems Me...
FAIR Data, Operations and Model management for Systems Biology and Systems Me...FAIR Data, Operations and Model management for Systems Biology and Systems Me...
FAIR Data, Operations and Model management for Systems Biology and Systems Me...
 
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
 
ICIC 2013 Conference Proceedings Uwe Rosemann TIB
ICIC 2013 Conference Proceedings Uwe Rosemann TIBICIC 2013 Conference Proceedings Uwe Rosemann TIB
ICIC 2013 Conference Proceedings Uwe Rosemann TIB
 
FAIR data and model management for systems biology.
FAIR data and model management for systems biology.FAIR data and model management for systems biology.
FAIR data and model management for systems biology.
 
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthe...
 
Reflections on a (slightly unusual) multi-disciplinary academic career
Reflections on a (slightly unusual) multi-disciplinary academic careerReflections on a (slightly unusual) multi-disciplinary academic career
Reflections on a (slightly unusual) multi-disciplinary academic career
 
Reproducible Research: how could Research Objects help
Reproducible Research: how could Research Objects helpReproducible Research: how could Research Objects help
Reproducible Research: how could Research Objects help
 
Reproducibility, Research Objects and Reality, Leiden 2016
Reproducibility, Research Objects and Reality, Leiden 2016Reproducibility, Research Objects and Reality, Leiden 2016
Reproducibility, Research Objects and Reality, Leiden 2016
 
Introduction to FAIRDOM
Introduction to FAIRDOMIntroduction to FAIRDOM
Introduction to FAIRDOM
 
Improving the Management of Computational Models -- Invited talk at the EBI
Improving the Management of Computational Models -- Invited talk at the EBIImproving the Management of Computational Models -- Invited talk at the EBI
Improving the Management of Computational Models -- Invited talk at the EBI
 
Research Objects, SEEK and FAIRDOM
Research Objects, SEEK and FAIRDOMResearch Objects, SEEK and FAIRDOM
Research Objects, SEEK and FAIRDOM
 
Capturing the context: one small(ish step for modellers, one giant leap for m...
Capturing the context: one small(ish step for modellers, one giant leap for m...Capturing the context: one small(ish step for modellers, one giant leap for m...
Capturing the context: one small(ish step for modellers, one giant leap for m...
 
Research Shared: researchobject.org
Research Shared: researchobject.orgResearch Shared: researchobject.org
Research Shared: researchobject.org
 

Ähnlich wie Experiences to learn from the MS proteomics field

PRIDE and ProteomeXchange: supporting the cultural change in proteomics publi...
PRIDE and ProteomeXchange: supporting the cultural change in proteomics publi...PRIDE and ProteomeXchange: supporting the cultural change in proteomics publi...
PRIDE and ProteomeXchange: supporting the cultural change in proteomics publi...Juan Antonio Vizcaino
 
Developing open data analysis pipelines in the cloud: Enabling the ‘big data’...
Developing open data analysis pipelines in the cloud: Enabling the ‘big data’...Developing open data analysis pipelines in the cloud: Enabling the ‘big data’...
Developing open data analysis pipelines in the cloud: Enabling the ‘big data’...Juan Antonio Vizcaino
 
PRIDE and ProteomeXchange: A golden age for working with public proteomics data
PRIDE and ProteomeXchange: A golden age for working with public proteomics dataPRIDE and ProteomeXchange: A golden age for working with public proteomics data
PRIDE and ProteomeXchange: A golden age for working with public proteomics dataJuan Antonio Vizcaino
 
Mass Spectrometry Informatics formats in progress
Mass Spectrometry Informatics formats in progressMass Spectrometry Informatics formats in progress
Mass Spectrometry Informatics formats in progressJuan Antonio Vizcaino
 
Introduction to the PSI standard data formats
Introduction to the PSI standard data formatsIntroduction to the PSI standard data formats
Introduction to the PSI standard data formatsJuan Antonio Vizcaino
 
EMBL-EBI Proteomics data resources and services
EMBL-EBI Proteomics data resources and servicesEMBL-EBI Proteomics data resources and services
EMBL-EBI Proteomics data resources and servicesRafael C. Jimenez
 
Introduction to EBI for Proteomics in ELIXIR
Introduction to EBI for Proteomics in ELIXIRIntroduction to EBI for Proteomics in ELIXIR
Introduction to EBI for Proteomics in ELIXIRJuan Antonio Vizcaino
 
ProteomeXchange_and_PRIDE_Semmeting_2015
ProteomeXchange_and_PRIDE_Semmeting_2015ProteomeXchange_and_PRIDE_Semmeting_2015
ProteomeXchange_and_PRIDE_Semmeting_2015Juan Antonio Vizcaino
 
The ProteomeXchange Consoritum: 2017 update
The ProteomeXchange Consoritum: 2017 updateThe ProteomeXchange Consoritum: 2017 update
The ProteomeXchange Consoritum: 2017 updateJuan Antonio Vizcaino
 
Data volumes in proteomics data resources: PRIDE and ProteomeXchange
Data volumes in proteomics data resources: PRIDE and ProteomeXchangeData volumes in proteomics data resources: PRIDE and ProteomeXchange
Data volumes in proteomics data resources: PRIDE and ProteomeXchangeJuan Antonio Vizcaino
 
PRIDE and ProteomeXchange: Training webinar
PRIDE and ProteomeXchange: Training webinarPRIDE and ProteomeXchange: Training webinar
PRIDE and ProteomeXchange: Training webinarJuan Antonio Vizcaino
 

Ähnlich wie Experiences to learn from the MS proteomics field (20)

Proteomics data standards
Proteomics data standardsProteomics data standards
Proteomics data standards
 
Proteomics data standards
Proteomics data standardsProteomics data standards
Proteomics data standards
 
Proteomics data standards
Proteomics data standardsProteomics data standards
Proteomics data standards
 
ProteomeXchange update HUPO 2016
ProteomeXchange update HUPO 2016ProteomeXchange update HUPO 2016
ProteomeXchange update HUPO 2016
 
PRIDE and ProteomeXchange: supporting the cultural change in proteomics publi...
PRIDE and ProteomeXchange: supporting the cultural change in proteomics publi...PRIDE and ProteomeXchange: supporting the cultural change in proteomics publi...
PRIDE and ProteomeXchange: supporting the cultural change in proteomics publi...
 
Developing open data analysis pipelines in the cloud: Enabling the ‘big data’...
Developing open data analysis pipelines in the cloud: Enabling the ‘big data’...Developing open data analysis pipelines in the cloud: Enabling the ‘big data’...
Developing open data analysis pipelines in the cloud: Enabling the ‘big data’...
 
PRIDE and ProteomeXchange: A golden age for working with public proteomics data
PRIDE and ProteomeXchange: A golden age for working with public proteomics dataPRIDE and ProteomeXchange: A golden age for working with public proteomics data
PRIDE and ProteomeXchange: A golden age for working with public proteomics data
 
Mass Spectrometry Informatics formats in progress
Mass Spectrometry Informatics formats in progressMass Spectrometry Informatics formats in progress
Mass Spectrometry Informatics formats in progress
 
Euro lipids 2014_graz
Euro lipids 2014_grazEuro lipids 2014_graz
Euro lipids 2014_graz
 
Pride and ProteomeXchange
Pride and ProteomeXchangePride and ProteomeXchange
Pride and ProteomeXchange
 
Introduction to the PSI standard data formats
Introduction to the PSI standard data formatsIntroduction to the PSI standard data formats
Introduction to the PSI standard data formats
 
PRIDE and ProteomeXchange
PRIDE and ProteomeXchangePRIDE and ProteomeXchange
PRIDE and ProteomeXchange
 
EMBL-EBI Proteomics data resources and services
EMBL-EBI Proteomics data resources and servicesEMBL-EBI Proteomics data resources and services
EMBL-EBI Proteomics data resources and services
 
Proteomics repositories
Proteomics repositoriesProteomics repositories
Proteomics repositories
 
Introduction to EBI for Proteomics in ELIXIR
Introduction to EBI for Proteomics in ELIXIRIntroduction to EBI for Proteomics in ELIXIR
Introduction to EBI for Proteomics in ELIXIR
 
ProteomeXchange_and_PRIDE_Semmeting_2015
ProteomeXchange_and_PRIDE_Semmeting_2015ProteomeXchange_and_PRIDE_Semmeting_2015
ProteomeXchange_and_PRIDE_Semmeting_2015
 
The ProteomeXchange Consoritum: 2017 update
The ProteomeXchange Consoritum: 2017 updateThe ProteomeXchange Consoritum: 2017 update
The ProteomeXchange Consoritum: 2017 update
 
Data volumes in proteomics data resources: PRIDE and ProteomeXchange
Data volumes in proteomics data resources: PRIDE and ProteomeXchangeData volumes in proteomics data resources: PRIDE and ProteomeXchange
Data volumes in proteomics data resources: PRIDE and ProteomeXchange
 
PRIDE and ProteomeXchange: Training webinar
PRIDE and ProteomeXchange: Training webinarPRIDE and ProteomeXchange: Training webinar
PRIDE and ProteomeXchange: Training webinar
 
Proteomics repositories
Proteomics repositoriesProteomics repositories
Proteomics repositories
 

Mehr von Juan Antonio Vizcaino

Reusing and integrating public proteomics data to improve our knowledge of th...
Reusing and integrating public proteomics data to improve our knowledge of th...Reusing and integrating public proteomics data to improve our knowledge of th...
Reusing and integrating public proteomics data to improve our knowledge of th...Juan Antonio Vizcaino
 
Introduction to the Proteomics Bioinformatics Course 2018
Introduction to the Proteomics Bioinformatics Course 2018Introduction to the Proteomics Bioinformatics Course 2018
Introduction to the Proteomics Bioinformatics Course 2018Juan Antonio Vizcaino
 
ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...
ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...
ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...Juan Antonio Vizcaino
 
A proteomics data “gold mine” at your disposal: Now that the data is there, w...
A proteomics data “gold mine” at your disposal: Now that the data is there, w...A proteomics data “gold mine” at your disposal: Now that the data is there, w...
A proteomics data “gold mine” at your disposal: Now that the data is there, w...Juan Antonio Vizcaino
 
Introduction to the Proteomics Bioinformatics Course 2017
Introduction to the Proteomics Bioinformatics Course 2017Introduction to the Proteomics Bioinformatics Course 2017
Introduction to the Proteomics Bioinformatics Course 2017Juan Antonio Vizcaino
 
Is it feasible to identify novel biomarkers by mining public proteomics data?
Is it feasible to identify novel biomarkers by mining public proteomics data?Is it feasible to identify novel biomarkers by mining public proteomics data?
Is it feasible to identify novel biomarkers by mining public proteomics data?Juan Antonio Vizcaino
 
The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...
The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...
The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...Juan Antonio Vizcaino
 
Enabling automated processing and analysis of large-scale proteomics data
Enabling automated processing and analysis of large-scale proteomics dataEnabling automated processing and analysis of large-scale proteomics data
Enabling automated processing and analysis of large-scale proteomics dataJuan Antonio Vizcaino
 
The Proteomics Standards Initiative (PSI)
The Proteomics Standards Initiative (PSI)The Proteomics Standards Initiative (PSI)
The Proteomics Standards Initiative (PSI)Juan Antonio Vizcaino
 
Introduction to the Proteomics Bioinformatics Course 2016
Introduction to the Proteomics Bioinformatics Course 2016Introduction to the Proteomics Bioinformatics Course 2016
Introduction to the Proteomics Bioinformatics Course 2016Juan Antonio Vizcaino
 

Mehr von Juan Antonio Vizcaino (20)

Reusing and integrating public proteomics data to improve our knowledge of th...
Reusing and integrating public proteomics data to improve our knowledge of th...Reusing and integrating public proteomics data to improve our knowledge of th...
Reusing and integrating public proteomics data to improve our knowledge of th...
 
Reuse of public proteomics data
Reuse of public proteomics dataReuse of public proteomics data
Reuse of public proteomics data
 
PRIDE resources and ProteomeXchange
PRIDE resources and ProteomeXchangePRIDE resources and ProteomeXchange
PRIDE resources and ProteomeXchange
 
Proteomics repositories
Proteomics repositoriesProteomics repositories
Proteomics repositories
 
Introduction to the Proteomics Bioinformatics Course 2018
Introduction to the Proteomics Bioinformatics Course 2018Introduction to the Proteomics Bioinformatics Course 2018
Introduction to the Proteomics Bioinformatics Course 2018
 
ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...
ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...
ELIXIR Implementation Study: “Mining the Proteome: Enabling Automated Process...
 
PSI-Proteome Informatics update
PSI-Proteome Informatics updatePSI-Proteome Informatics update
PSI-Proteome Informatics update
 
ProteomeXchange update
ProteomeXchange updateProteomeXchange update
ProteomeXchange update
 
The ELIXIR Proteomics community
The ELIXIR Proteomics community The ELIXIR Proteomics community
The ELIXIR Proteomics community
 
The ELIXIR Proteomics Community
The ELIXIR Proteomics CommunityThe ELIXIR Proteomics Community
The ELIXIR Proteomics Community
 
A proteomics data “gold mine” at your disposal: Now that the data is there, w...
A proteomics data “gold mine” at your disposal: Now that the data is there, w...A proteomics data “gold mine” at your disposal: Now that the data is there, w...
A proteomics data “gold mine” at your disposal: Now that the data is there, w...
 
Reuse of public proteomics data
Reuse of public proteomics dataReuse of public proteomics data
Reuse of public proteomics data
 
Introduction to the Proteomics Bioinformatics Course 2017
Introduction to the Proteomics Bioinformatics Course 2017Introduction to the Proteomics Bioinformatics Course 2017
Introduction to the Proteomics Bioinformatics Course 2017
 
Is it feasible to identify novel biomarkers by mining public proteomics data?
Is it feasible to identify novel biomarkers by mining public proteomics data?Is it feasible to identify novel biomarkers by mining public proteomics data?
Is it feasible to identify novel biomarkers by mining public proteomics data?
 
The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...
The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...
The spectra-cluster toolsuite: Enhancing proteomics analysis through spectrum...
 
ProteomeXchange update 2017
ProteomeXchange update 2017ProteomeXchange update 2017
ProteomeXchange update 2017
 
Enabling automated processing and analysis of large-scale proteomics data
Enabling automated processing and analysis of large-scale proteomics dataEnabling automated processing and analysis of large-scale proteomics data
Enabling automated processing and analysis of large-scale proteomics data
 
The Proteomics Standards Initiative (PSI)
The Proteomics Standards Initiative (PSI)The Proteomics Standards Initiative (PSI)
The Proteomics Standards Initiative (PSI)
 
Introduction to the Proteomics Bioinformatics Course 2016
Introduction to the Proteomics Bioinformatics Course 2016Introduction to the Proteomics Bioinformatics Course 2016
Introduction to the Proteomics Bioinformatics Course 2016
 
Reuse of public data in proteomics
Reuse of public data in proteomicsReuse of public data in proteomics
Reuse of public data in proteomics
 

Experiences to learn from the MS proteomics field

  • 1. Experiences to learn from the mass spectrometry proteomics field. Dr. Juan Antonio Vizcaíno, Proteomics Team Leader, EMBL-EBI, Hinxton, Cambridge, UK.
  • 2. HUPO Proteomics Standards Initiative (http://www.psidev.info). Juan A. Vizcaíno, juan@ebi.ac.uk. 12th Conference of the Metabolomics Society, Dublin, 27 June 2016.
      • Develops data format standards for proteomics.
      • Covers both data representation and annotation standards.
      • Involves data producers, database providers, software producers, publishers, …
      • Active workgroups: MI, MS, PI and now a new QC group.
      • Inter-group activities: MIAPE and Controlled Vocabularies.
      • Started in 2002, so some experience already…
      • One annual meeting in March-April, plus regular phone calls.
      • Peer review for standards: the PSI document process.
  • 3. Current PSI Proteomics Standard File Formats for Mass Spectrometry
      • mzML: MS data
      • mzIdentML: Identification
      • mzQuantML: Quantitation
      • mzTab: Final Results
      • TraML: SRM
  • 4. Reuse of data standards in metabolomics
      • mzML is already actively used to store MS data (very flexible format).
      • mzTab is a tab-delimited format that is being extended to better support MS metabolomics data. It can be used for both identification and quantification results.
      • mzQuantML and TraML could be used with small molecule data, but this has not been tested.
  • 5. Reuse of data standards in metabolomics
      • mzML is already actively used to store MS data (very flexible format).
      • mzTab is a tab-delimited format that is being extended to better support MS metabolomics data. It can be used for both identification and quantification results.
      • Meeting next week in Liverpool organised by A. Jones.
      • mzQuantML and TraML could be used with small molecule data, but this has not been tested.
  • 6. Current Standard File Formats that are or could be used in metabolomics
      • mzML: MS data
      • mzIdentML: Identification
      • mzQuantML*: Quantitation
      • mzTab: Final Results
      • TraML*: SRM
  • 7. Current vision for data exchange standards in MS. Neumann (IPB-Halle), Proteomics and HUPO-PSI community.
  • 8. imzML: data standard for mass imaging data (http://www.imzml.org). Not a PSI format; based on mzML.
  • 9. qcML files to be generated after submission
      • XML format that captures output from QC pipelines.
  • 10. Opportunity to reuse and extend existing software
      • Don’t reinvent the wheel! There is no need…
      • Software libraries (APIs) to handle the standards.
      • Data converters.
      • Data visualisation tools.
      • Data analysis tools and workflows.
      • A large proportion of the available software is open source.
  • 11. mzML: more software available (http://www.psidev.info/mzml_1_0_0)
      • The most popular search engines support mzML.
      • Many parser libraries available.
      • Conversion from raw files into mzML.
  • 12. Data visualisation: PRIDE Inspector Toolsuite (https://github.com/PRIDE-Toolsuite/). Wang et al., Nat. Biotechnology, 2012; Perez-Riverol et al., MCP, 2016. PRIDE Inspector Toolsuite supports:
      • PRIDE XML
      • mzIdentML
      • mzML and all types of spectra files
      • mzTab identification and quantification
  • 13. OpenMS/TOPP. Kohlbacher et al., Bioinformatics (2007), 23:e191.
      • OpenMS: an open-source C++ framework for computational mass spectrometry.
      • Jointly developed at ETH Zürich, FU Berlin and the University of Tübingen.
      • Open source: BSD 3-clause license.
      • Portable: available on Windows, OSX and Linux.
      • TOPP: The OpenMS Proteomics Pipeline.
      • Building blocks: one application for each analysis step.
      • All applications share identical user interfaces.
      • Uses PSI standard formats and integrates seamlessly with other applications supporting these formats.
      • Can be integrated in various workflow systems:
          • TOPPAS (TOPP Pipeline Assistant)
          • Galaxy
          • WS-PGRADE
          • KNIME
  • 14. ProteomeXchange Consortium (http://www.proteomexchange.org). Vizcaíno et al., Nat Biotechnol, 2014.
      • Goal: development of a framework to allow standard data submission and dissemination pipelines between the main existing proteomics repositories.
      • Includes PeptideAtlas (ISB, Seattle), PRIDE (Cambridge, UK) and MassIVE (UCSD, San Diego); jPOST (Japan) will be integrated in July 2016.
      • EU FP7 CA (01/2011 to 06/2014).
      • Common identifier space (PXD identifiers).
      • Two supported data workflows: MS/MS and SRM.
      • Main objective: make life easier for researchers.
  • 15. PRIDE Archive submitted datasets up until 1st April, 2016
      • In the last complete year: on average, >150 submitted datasets per month.
      • Size of PRIDE Archive: ~220 TB.
  • 16. Vendor support for mzIdentML has grown in parallel with the number of submitted datasets. Diagram labels: search engine results + MS files; search engines; mzIdentML. An increasing number of tools support export to mzIdentML 1.1. Updated list: http://www.psidev.info/tools-implementing-
      • Mascot
      • MSGF+
      • Myrimatch and related tools from D. Tabb’s lab
      • OpenMS
      • PEAKS
      • PeptideShaker (several open source tools)
      • ProCon (ProteomeDiscoverer, Sequest)
      • Scaffold
      • TPP via the idConvert tool (ProteoWizard)
      • ProteinPilot (from version 5.0)
      • X!Tandem (from PILEDRIVER version)
      • Crux
      • Others: library for X!Tandem conversion, lab internal pipelines, …
  • 17. Conclusions
      • Develop tools in parallel with the data standards.
      • Don’t reinvent the wheel! Many ideas and software are already there.
      • Ideally, get vendors involved as soon as possible.
      • Data repositories and data standards are a perfect match.
  • 18. Acknowledgements and further reading… http://www.psidev.info. Poster P18.
  • 19. Questions?
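
Slides 4 and 5 describe mzTab as a tab-delimited format that can carry identification and quantification results. As a rough illustration of what "tab-delimited" means in practice, the sketch below groups the lines of an mzTab-style text by their leading three-letter prefix (MTD for metadata, SMH and SML for the small molecule header and rows, COM for comments) using only the Python standard library. The sample fragment, its reduced set of columns and the helper names `read_sections` and `small_molecule_table` are assumptions made for this example; it is not a validating parser and is not taken from the talk.

```python
import csv
import io
from collections import defaultdict

# Hypothetical, hand-written mzTab-style fragment for illustration only.
# The column set is deliberately simplified and is not a complete mzTab file.
SAMPLE = "\n".join([
    "COM\tillustrative fragment, not real data",
    "MTD\tmzTab-version\t1.0.0",
    "MTD\tdescription\texample small molecule results",
    "SMH\tidentifier\tdescription\tsmallmolecule_abundance_assay[1]",
    "SML\tCHEBI:17234\tglucose\t1234.5",
    "SML\tCHEBI:15377\twater\t987.0",
])


def read_sections(text):
    """Group tab-delimited rows by their leading three-letter line prefix."""
    sections = defaultdict(list)
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip blank lines and comment lines.
        if not row or row[0] == "COM":
            continue
        sections[row[0]].append(row[1:])
    return sections


def small_molecule_table(text):
    """Pair each SML row with the SMH header to get one dict per molecule."""
    sections = read_sections(text)
    header = sections["SMH"][0]
    return [dict(zip(header, row)) for row in sections["SML"]]


rows = small_molecule_table(SAMPLE)
for row in rows:
    print(row["identifier"], row["description"])
```

Because every line is plain tab-separated text, the same file can also be opened in a spreadsheet or read with general-purpose table tools, which is part of what makes the format attractive for final results.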