Yasset Perez-Riverol Ph.D
github: github.com/ypriverol
twitter: @ypriverol
OpenMS: Quantitative proteomics at
large scale
Proteomics Bioinformatics
EMBL-EBI, December 2016
Outline
• Introduction to OpenMS
  Modularity & workflows
  Visualization
  Integration with other tools
• Two example workflows
  Protein identification
  Label-free quantification
Modularity is the degree to which a system's components may
be separated and recombined.
Modularity
tools for identification
DecoyDatabase
MascotAdapter
XTandemAdapter
MSGFPlusAdapter
PeptideIndexer
FalseDiscoveryRate
IDPosteriorErrorProbability
ConsensusID
LuciphorAdapter
HighResPrecursorMassCorrector
FidoAdapter
tools for quantification
PeakPickerHiRes
FeatureFinderMultiplex
FeatureFinderCentroided
SpectraMerger
NoiseFilterSGolay
ITRAQAnalyzer
IDMapper
IDConflictResolver
MapAlignerPoseClustering
MapRTTransformer
FeatureLinkerUnlabeledQT
ProteinQuantifier
tools for file handling
FileConverter
FileMerger
FileFilter
IDFileConverter
IDMerger
IDFilter
MzTabExporter
FileInfo
OpenMS ⇨ collection of 180 software tools
≈ 30 tools sufficient for standard workflows
OpenMS
OpenMS – an open-source framework for computational mass spectrometry
Portable: available on Windows, OSX, Linux
OpenMS TOPP tools – The OpenMS Proteomics Pipeline tools
• > 180 Building blocks: One application for each analysis step
• Vendor independent: Uses PSI standard formats
Can be integrated in various workflow systems
• TOPPAS – TOPP Pipeline Assistant
• Galaxy
• KNIME
KNIME and TOPPView
KNIME – KoNstanz Information MinEr
• Enables building customized workflows from OpenMS
components.
TOPPView: an OpenMS data viewer.
• Based on standard file formats.
• Displays MS/MS information,
peptides/proteins, and
quantitative information.
KNIME – Workflow System
KNIME – KoNstanz Information MinEr
Industrial-strength general-purpose workflow system
Convenient and easy-to-use graphical user interface
Available for Windows, OSX, Linux at http://KNIME.org
[Figure: KNIME workbench screenshot, KNIME (CC BY-SA 4.0), with labeled areas: Workflows, Plots, Tables, Console, Nodes]
Workflow Builder: Data Flow
KNIME-OpenMS workflows consist of distinct nodes
that are assembled into workflows
Either tables or files are exchanged between nodes
along the edges of the workflow
Configuration dialogs are used to set node
parameters
Loops allow iterating sequentially over lists of data
Switches allow executing nodes or subworkflows
depending on a condition
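The data-flow ideas on this slide (nodes, edges, loops, switches) can be illustrated outside KNIME with a minimal Python sketch. The functions `pick_peaks` and `search` are hypothetical stand-ins for workflow nodes and only rename files; they are not OpenMS or KNIME APIs.

```python
from typing import Iterable, List

# Hypothetical stand-ins for workflow nodes: each takes a file name and returns one.
def pick_peaks(path: str) -> str:
    return path.replace(".mzML", ".picked.mzML")

def search(path: str) -> str:
    return path.replace(".mzML", ".idXML")

def run_workflow(inputs: Iterable[str], centroided: bool) -> List[str]:
    results = []
    for path in inputs:                # loop: iterate sequentially over a list of data
        if not centroided:             # switch: run a node only if a condition holds
            path = pick_peaks(path)
        results.append(search(path))   # edge: the output of one node feeds the next
    return results

outputs = run_workflow(["a.mzML", "b.mzML"], centroided=False)
print(outputs)
```

In a real workflow each function would wrap a tool call, and the files exchanged along the edges would be actual mzML and idXML files.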
Scripting
KNIME permits the embedding of R code for advanced statistics
Embedding of R scripts using the R Snippet node
All plotting capabilities of R can be used as well
Peptide/Protein Identification
Task: Identify peptides in multiple samples
Mass spectra enter workflow on the left
Loop nodes permit execution of parts of the workflow
Identified proteins end up in result files (right side)
TOPPView: Visualization of the results
mzML idXML
Workflow – Plug-In System
Task: Identify peptides in multiple samples
Combination of X!Tandem + OMSSA
Definition of QC parameters such as FDR, q-values, and p-values.
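One simple way to combine results from two search engines is to keep, for each peptide, the best score reported by any engine. The sketch below assumes the scores have already been made comparable (here: posterior error probabilities, lower is better). The peptide sequences and values are invented for illustration, and the function is not the ConsensusID implementation.

```python
from typing import Dict, List, Tuple

def best_score_consensus(
    engine_results: Dict[str, List[Tuple[str, float]]]
) -> Dict[str, float]:
    """Keep, per peptide, the lowest posterior error probability seen in any engine."""
    consensus: Dict[str, float] = {}
    for _engine, psms in engine_results.items():
        for peptide, pep in psms:
            if peptide not in consensus or pep < consensus[peptide]:
                consensus[peptide] = pep
    return consensus

# Invented example data: (peptide, posterior error probability) per engine.
results = {
    "XTandem": [("PEPTIDEK", 0.02), ("LSVDGR", 0.40)],
    "OMSSA": [("PEPTIDEK", 0.10), ("AGFTYR", 0.05)],
}
consensus = best_score_consensus(results)
print(consensus)
```

Peptides found by only one engine are kept with that engine's score; a stricter scheme could require agreement between engines instead.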
Complex and customized Workflows
Algorithm        X!Tandem          Mascot            MS-GF+            Merged
PIA              1214  64 (5.3%)   1442  74 (5.1%)   1631  93 (5.7%)   1615 101 (6.2%)
Fido              996  67 (6.7%)   1439  80 (5.6%)   1679  96 (5.7%)   1619 105 (6.5%)
ProteinLP         989  64 (6.5%)   1229  77 (2.3%)   1651  93 (5.6%)   1295 104 (8.0%)
MSBayesPro        749  24 (3.2%)    958  26 (2.7%)   1303  31 (2.4%)    963  36 (3.7%)
ProteinProphet   1027  64 (6.2%)   1282  73 (5.7%)   1629  91 (5.6%)   1629  99 (6.7%)
Audain E. & Uszkoreit J. et al, Journal of Proteomics, 2017
Best protein inference algorithm:
• 3 datasets
• 4 search engines
• 5 protein inference algorithms
• > 140 combinations
Some of the Identification nodes
IDPosteriorErrorProbability
Compute the posterior error probability for each PSM
Generate a new file with the corresponding values.
ConsensusID
Combine PSM identifications from multiple search
engines.
Generate a Combined PosteriorErrorProbability for
each PSM.
For each peptide ID, use the best score of any
search engine as the consensus score.
FalseDiscoveryRate
Estimate the false discovery rate at the peptide and
protein level from target-decoy search results.
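The target-decoy idea behind FDR and q-value estimation can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions (higher score is better, FDR estimated as decoys divided by targets), not the algorithm used by the FalseDiscoveryRate node; the scores are invented.

```python
from typing import List, Tuple

def q_values(psms: List[Tuple[float, bool]]) -> List[Tuple[float, bool, float]]:
    """psms: (score, is_decoy), higher score is better. Returns (score, is_decoy, q)."""
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    fdrs = []
    targets = decoys = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR when accepting everything down to this score.
        fdrs.append(decoys / max(targets, 1))
    # q-value: the smallest FDR at which this PSM would still be accepted.
    q = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q.append(running_min)
    q.reverse()
    return [(s, d, qv) for (s, d), qv in zip(ranked, q)]

# Invented example: four target hits and one decoy hit.
psms = [(9.0, False), (8.0, False), (7.0, True), (6.0, False), (5.0, False)]
scored = q_values(psms)
for score, is_decoy, qv in scored:
    print(score, is_decoy, qv)
```

Filtering at a given q-value threshold then keeps every PSM whose q-value is at or below that threshold.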
Adapters and Complementary Nodes
FileMerger
This node takes two files (or file lists) as input and
outputs a merged list of both inputs. The order
corresponds to the order of the input lists and ports.
IDMerger
Merges several protein/peptide identification files
into one file.
PeptideIndexer
Refreshes the protein references for all peptide hits.
IDFilter
Filters results from protein or peptide identification
engines based on different criteria.
Quantitative Proteomics
Quantitative Proteomics
Relative Quantification
Labeled
In vivo
14N/15N SILAC
In vitro
iTRAQ TMT 16O/18O
Label-Free
Spectral Counting MRM Feature-Based
Absolute Quantification
AQUA SISCAPA
And many more…
Label-Free Quantification (LFQ)
Label-free quantification is probably the most natural way of
quantifying
• No labeling required, removing further sources of error, cheap
• Different samples acquired in different measurements – higher
reproducibility needed
• Manual analysis difficult
• Scales very well with the number of samples, basically no limit,
no difference in the analysis between 2 or 100 samples
Feature-based LFQ - LC-MS Maps
Spectra are acquired with rates up to dozens per second
Stacking the spectra yields peak maps
Resolution:
• Up to millions of points per spectrum
• Tens of thousands of spectra per LC run
Huge 2D datasets of up to hundreds of GB per sample
Quantification
(3x over-expressed, …)
Feature
(eluting peptide)
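"Stacking the spectra" can be pictured as filling a two-dimensional grid indexed by retention time and m/z. The sketch below bins peaks into such a grid. The `Spectrum` layout, the bin width, and the numbers are assumptions made for illustration and do not reflect how OpenMS stores peak maps.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (retention time, [(m/z, intensity), ...])
Spectrum = Tuple[float, List[Tuple[float, float]]]

def build_peak_map(
    spectra: List[Spectrum], mz_bin: float = 1.0
) -> Dict[Tuple[float, int], float]:
    """Sum peak intensities into (retention time, m/z bin) cells."""
    peak_map: Dict[Tuple[float, int], float] = defaultdict(float)
    for rt, peaks in spectra:
        for mz, intensity in peaks:
            peak_map[(rt, int(mz // mz_bin))] += intensity
    return dict(peak_map)

# Invented example: two spectra acquired at RT 10.0 s and 10.5 s.
spectra = [
    (10.0, [(500.2, 100.0), (500.7, 50.0)]),
    (10.5, [(500.3, 200.0), (622.1, 30.0)]),
]
peak_map = build_peak_map(spectra)
print(peak_map)
```

A feature (an eluting peptide) then shows up as a cluster of neighbouring cells with elevated intensity across consecutive retention times.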
Feature-based LFQ
1. Find features in all maps
2. Align maps
3. Link corresponding features
4. Identify features
5. Quantify features
6. Quantify proteins based on
their peptides
Protein        Sample 1 : Sample 2 : Sample 3
NPC2_HUMAN     1.0 : 5.2 : 0.3
CD177_HUMAN    1.0 : 0.2 : 0.4
…
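The last step, turning peptide-level intensities into protein ratios across samples, can be sketched as follows. Summing peptide intensities is only one possible aggregation and is an assumption here, as are the protein names and intensities; the sketch also assumes a non-zero total in sample 1.

```python
from typing import Dict, List, Tuple

def protein_ratios(
    peptide_intensities: List[Tuple[str, List[float]]]
) -> Dict[str, List[float]]:
    """Sum peptide intensities per protein and sample, then scale to sample 1."""
    totals: Dict[str, List[float]] = {}
    for protein, intensities in peptide_intensities:
        if protein not in totals:
            totals[protein] = [0.0] * len(intensities)
        totals[protein] = [t + i for t, i in zip(totals[protein], intensities)]
    # Express every sample relative to the first one (assumed to be non-zero).
    return {p: [round(v / t[0], 2) for v in t] for p, t in totals.items()}

# Invented example: (protein, [intensity in sample 1, sample 2, sample 3]) per peptide.
peptides = [
    ("PROT_A", [100.0, 500.0, 40.0]),
    ("PROT_A", [100.0, 540.0, 20.0]),
    ("PROT_B", [50.0, 10.0, 20.0]),
]
ratios = protein_ratios(peptides)
for protein, r in ratios.items():
    print(protein, " : ".join(str(x) for x in r))
```

The output has the same shape as the ratio table on the slide: one row per protein, with sample 1 fixed at 1.0.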
Label-Free Workflow
Several algorithms have been proposed by the OpenMS community for
label-free quantification:
• Weisser H., Journal of Proteome Research (2013)
• Zhang B., Molecular & Cellular Proteomics (2016)
• Veit J., Journal of Proteome Research (2016)
• Ranninger C., Analytica Chimica Acta (2016)
DeMix-Q Algorithm and Workflow
Bo Zhang, Lukas Käll & Roman A. Zubarev, MCP (2016)
Reliable and reproducible Quantitation
LFQ Relevant nodes
FeatureFinderCentroided
Detects two-dimensional features in LC-MS data.
MapAlignerPoseClustering
Corrects retention time distortions between maps
using a pose clustering approach.
FeatureLinkerUnlabeledQT
Groups corresponding features from multiple maps.
ConsensusMapNormalizer
Normalizes maps of one consensusXML file
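As an illustration of what map normalization does, the sketch below rescales each map so that its median feature intensity matches that of the first map. Median scaling is used here only as a simple, assumed example; it is not claimed to be the default behaviour of ConsensusMapNormalizer, and the intensities are invented.

```python
from statistics import median
from typing import List

def median_normalize(maps: List[List[float]]) -> List[List[float]]:
    """Scale each map so its median feature intensity matches the first map's."""
    reference = median(maps[0])
    normalized = []
    for intensities in maps:
        factor = reference / median(intensities)
        normalized.append([x * factor for x in intensities])
    return normalized

# Invented example: the second run was measured at twice the overall intensity.
maps = [[10.0, 20.0, 30.0], [20.0, 40.0, 60.0]]
normalized = median_normalize(maps)
print(normalized)
```

After scaling, systematic intensity differences between runs no longer masquerade as differences in abundance.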
OpenMS at Large Scale
Galaxy
WS-PGRADE/gUSE
KNIME
Each individual tool can be run from the command line, which makes it
possible to distribute the analysis across large HPC environments.
$> FileFilter -in myinfile.mzML -levels 2 -rt 100:1500 -out myoutfile.mzML
$> OpenSwathDecoyGenerator.exe -in OpenSWATH_SGS_AssayLibrary.TraML -out OpenSWATH_SGS_AssayLibrary_with_Decoys.TraML -method shuffle -append exclude_similar -remove_unannotated
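Because every tool is a plain command-line program, distributing work over many files mostly means generating one command per input. The Python sketch below builds such commands for the FileFilter call shown above. The flags are copied from the slide and may differ between OpenMS versions; the input names and the output directory are hypothetical, and nothing is executed.

```python
import shlex
from pathlib import Path
from typing import List

def file_filter_command(infile: str, outdir: str) -> List[str]:
    """Build the FileFilter call from the slide for one input file."""
    out = Path(outdir) / (Path(infile).stem + ".filtered.mzML")
    # Flags copied from the slide; check them against the installed OpenMS version.
    return ["FileFilter", "-in", infile, "-levels", "2",
            "-rt", "100:1500", "-out", out.as_posix()]

# Hypothetical input files; on a cluster each command could become one job.
inputs = ["run1.mzML", "run2.mzML"]
commands = [file_filter_command(f, "out") for f in inputs]
for cmd in commands:
    print(shlex.join(cmd))
```

The printed lines can be written to a job script or passed one by one to a scheduler; passing the list form to `subprocess.run` would execute them locally.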
Conclusions
• OpenMS is a modular workflow system
• Standard workflows:
SILAC, iTRAQ/TMT, label-free, SWATH, quality
control
• Strong collaboration with other projects:
ProteoWizard, Thermo PD, KNIME, Fido,
Percolator, search engines, HUPO-PSI formats
How to run OpenMS workflows
• OpenMS, local installation
(Windows, OS X, Linux)
http://bit.ly/1J6lz6h
http://openms.de/workflows
• OpenMS in Proteome Discoverer
(LFQProfiler and RNPxl for PD 2.1)
http://openms.de/PD
• OpenMS in Galaxy
http://galaxy.uni-freiburg.de
• OpenMS in KNIME
https://tech.knime.org/community/bioinf/openms

Weitere ähnliche Inhalte

Was ist angesagt?

Bioinformatics (Exam point of view)
Bioinformatics (Exam point of view)Bioinformatics (Exam point of view)
Bioinformatics (Exam point of view)Sijo A
 
Bioc strucvariant seattle_11_09
Bioc strucvariant seattle_11_09Bioc strucvariant seattle_11_09
Bioc strucvariant seattle_11_09Sean Davis
 
BITS: Basics of sequence analysis
BITS: Basics of sequence analysisBITS: Basics of sequence analysis
BITS: Basics of sequence analysisBITS
 
Overview Radboudumc Center for Proteomics, Glycomics and Metabolomics april 2015
Overview Radboudumc Center for Proteomics, Glycomics and Metabolomics april 2015Overview Radboudumc Center for Proteomics, Glycomics and Metabolomics april 2015
Overview Radboudumc Center for Proteomics, Glycomics and Metabolomics april 2015Alain van Gool
 
Peptide Mass Fingerprinting
Peptide Mass FingerprintingPeptide Mass Fingerprinting
Peptide Mass FingerprintingRida Khalid
 
UniProt-GOA
UniProt-GOAUniProt-GOA
UniProt-GOAEBI
 
Proteomics and protein-protein interaction
Proteomics  and protein-protein interactionProteomics  and protein-protein interaction
Proteomics and protein-protein interactionSenthilkumarV25
 
Quantitative Proteomics: From Instrument To Browser
Quantitative Proteomics: From Instrument To BrowserQuantitative Proteomics: From Instrument To Browser
Quantitative Proteomics: From Instrument To BrowserNeil Swainston
 
Mass Spectrometry-Based Proteomics Quantification: iTRAQ
Mass Spectrometry-Based Proteomics Quantification: iTRAQ Mass Spectrometry-Based Proteomics Quantification: iTRAQ
Mass Spectrometry-Based Proteomics Quantification: iTRAQ Creative Proteomics
 
Proteomics & Metabolomics
Proteomics & MetabolomicsProteomics & Metabolomics
Proteomics & Metabolomicsgumccomm
 
Techniques used for separation in proteomics
Techniques used for separation in proteomicsTechniques used for separation in proteomics
Techniques used for separation in proteomicsNilesh Chandra
 
BioInformatics Tools -Genomics , Proteomics and metablomics
BioInformatics Tools -Genomics , Proteomics and metablomicsBioInformatics Tools -Genomics , Proteomics and metablomics
BioInformatics Tools -Genomics , Proteomics and metablomicsAyeshaYousaf20
 

Was ist angesagt? (20)

Bioinformatics (Exam point of view)
Bioinformatics (Exam point of view)Bioinformatics (Exam point of view)
Bioinformatics (Exam point of view)
 
Bioc strucvariant seattle_11_09
Bioc strucvariant seattle_11_09Bioc strucvariant seattle_11_09
Bioc strucvariant seattle_11_09
 
BITS: Basics of sequence analysis
BITS: Basics of sequence analysisBITS: Basics of sequence analysis
BITS: Basics of sequence analysis
 
Overview Radboudumc Center for Proteomics, Glycomics and Metabolomics april 2015
Overview Radboudumc Center for Proteomics, Glycomics and Metabolomics april 2015Overview Radboudumc Center for Proteomics, Glycomics and Metabolomics april 2015
Overview Radboudumc Center for Proteomics, Glycomics and Metabolomics april 2015
 
Peptide Mass Fingerprinting
Peptide Mass FingerprintingPeptide Mass Fingerprinting
Peptide Mass Fingerprinting
 
MASCOT
MASCOTMASCOT
MASCOT
 
Protein identication characterization
Protein identication characterizationProtein identication characterization
Protein identication characterization
 
Proteomics
ProteomicsProteomics
Proteomics
 
Mascot database
Mascot databaseMascot database
Mascot database
 
UniProt-GOA
UniProt-GOAUniProt-GOA
UniProt-GOA
 
Kishor Presentation
Kishor PresentationKishor Presentation
Kishor Presentation
 
Proteomics and protein-protein interaction
Proteomics  and protein-protein interactionProteomics  and protein-protein interaction
Proteomics and protein-protein interaction
 
Quantitative Proteomics: From Instrument To Browser
Quantitative Proteomics: From Instrument To BrowserQuantitative Proteomics: From Instrument To Browser
Quantitative Proteomics: From Instrument To Browser
 
Mass Spectrometry-Based Proteomics Quantification: iTRAQ
Mass Spectrometry-Based Proteomics Quantification: iTRAQ Mass Spectrometry-Based Proteomics Quantification: iTRAQ
Mass Spectrometry-Based Proteomics Quantification: iTRAQ
 
Proteomics & Metabolomics
Proteomics & MetabolomicsProteomics & Metabolomics
Proteomics & Metabolomics
 
Salisha ppt (1) (1)
Salisha ppt (1) (1)Salisha ppt (1) (1)
Salisha ppt (1) (1)
 
Techniques used for separation in proteomics
Techniques used for separation in proteomicsTechniques used for separation in proteomics
Techniques used for separation in proteomics
 
BioInformatics Tools -Genomics , Proteomics and metablomics
BioInformatics Tools -Genomics , Proteomics and metablomicsBioInformatics Tools -Genomics , Proteomics and metablomics
BioInformatics Tools -Genomics , Proteomics and metablomics
 
Metabolomics Data Analysis
Metabolomics Data AnalysisMetabolomics Data Analysis
Metabolomics Data Analysis
 
proteomics
 proteomics proteomics
proteomics
 

Ähnlich wie OpenMS: Quantitative Proteomics Tools and Workflows

Bioinformatic, and tools by kk sahu
Bioinformatic, and tools by kk sahuBioinformatic, and tools by kk sahu
Bioinformatic, and tools by kk sahuKAUSHAL SAHU
 
SooryaKiran Bioinformatics
SooryaKiran BioinformaticsSooryaKiran Bioinformatics
SooryaKiran Bioinformaticscontactsoorya
 
Systematic integration of millions of peptidoform evidences into Ensembl and ...
Systematic integration of millions of peptidoform evidences into Ensembl and ...Systematic integration of millions of peptidoform evidences into Ensembl and ...
Systematic integration of millions of peptidoform evidences into Ensembl and ...Yasset Perez-Riverol
 
Biomarker Strategies
Biomarker StrategiesBiomarker Strategies
Biomarker StrategiesTom Plasterer
 
Informatics In The Manchester Centre For Integrative Systems Biology
Informatics In The Manchester Centre For Integrative Systems BiologyInformatics In The Manchester Centre For Integrative Systems Biology
Informatics In The Manchester Centre For Integrative Systems BiologyNeil Swainston
 
Internship Report
Internship ReportInternship Report
Internship ReportNeha Gupta
 
Informal presentation on bioinformatics
Informal presentation on bioinformaticsInformal presentation on bioinformatics
Informal presentation on bioinformaticsAtai Rabby
 
Microarray data and pathway analysis: example from the bench
Microarray data and pathway analysis: example from the benchMicroarray data and pathway analysis: example from the bench
Microarray data and pathway analysis: example from the benchMaté Ongenaert
 
A systematic review of network analyst - Pubrica
A systematic review of network analyst - PubricaA systematic review of network analyst - Pubrica
A systematic review of network analyst - PubricaPubrica
 
T-BioInfo Methods and Approaches
T-BioInfo Methods and ApproachesT-BioInfo Methods and Approaches
T-BioInfo Methods and ApproachesElia Brodsky
 

Ähnlich wie OpenMS: Quantitative Proteomics Tools and Workflows (20)

Bioinformatic, and tools by kk sahu
Bioinformatic, and tools by kk sahuBioinformatic, and tools by kk sahu
Bioinformatic, and tools by kk sahu
 
Molecular Biology Software Links
Molecular Biology Software LinksMolecular Biology Software Links
Molecular Biology Software Links
 
SooryaKiran Bioinformatics
SooryaKiran BioinformaticsSooryaKiran Bioinformatics
SooryaKiran Bioinformatics
 
Systematic integration of millions of peptidoform evidences into Ensembl and ...
Systematic integration of millions of peptidoform evidences into Ensembl and ...Systematic integration of millions of peptidoform evidences into Ensembl and ...
Systematic integration of millions of peptidoform evidences into Ensembl and ...
 
Biomarker Strategies
Biomarker StrategiesBiomarker Strategies
Biomarker Strategies
 
C044041723
C044041723C044041723
C044041723
 
Intro to databases
Intro to databasesIntro to databases
Intro to databases
 
Folker Meyer: Metagenomic Data Annotation
Folker Meyer: Metagenomic Data AnnotationFolker Meyer: Metagenomic Data Annotation
Folker Meyer: Metagenomic Data Annotation
 
Informatics In The Manchester Centre For Integrative Systems Biology
Informatics In The Manchester Centre For Integrative Systems BiologyInformatics In The Manchester Centre For Integrative Systems Biology
Informatics In The Manchester Centre For Integrative Systems Biology
 
Cytoscape Talk 2010
Cytoscape Talk 2010Cytoscape Talk 2010
Cytoscape Talk 2010
 
User manual
User manualUser manual
User manual
 
Internship Report
Internship ReportInternship Report
Internship Report
 
Informal presentation on bioinformatics
Informal presentation on bioinformaticsInformal presentation on bioinformatics
Informal presentation on bioinformatics
 
Microarray data and pathway analysis: example from the bench
Microarray data and pathway analysis: example from the benchMicroarray data and pathway analysis: example from the bench
Microarray data and pathway analysis: example from the bench
 
A systematic review of network analyst - Pubrica
A systematic review of network analyst - PubricaA systematic review of network analyst - Pubrica
A systematic review of network analyst - Pubrica
 
Path2 ppi
Path2 ppiPath2 ppi
Path2 ppi
 
Variant analysis and whole exome sequencing
Variant analysis and whole exome sequencingVariant analysis and whole exome sequencing
Variant analysis and whole exome sequencing
 
iOMICS Research
iOMICS ResearchiOMICS Research
iOMICS Research
 
T-BioInfo Methods and Approaches
T-BioInfo Methods and ApproachesT-BioInfo Methods and Approaches
T-BioInfo Methods and Approaches
 
T-bioinfo overview
T-bioinfo overviewT-bioinfo overview
T-bioinfo overview
 

Mehr von Yasset Perez-Riverol

Biocontainers 2019: Presentation for the ELIXIR All Hands
Biocontainers 2019: Presentation for the ELIXIR All HandsBiocontainers 2019: Presentation for the ELIXIR All Hands
Biocontainers 2019: Presentation for the ELIXIR All HandsYasset Perez-Riverol
 
Mapping millions of peptidoforms to Genome Coordinates
Mapping millions of peptidoforms to Genome CoordinatesMapping millions of peptidoforms to Genome Coordinates
Mapping millions of peptidoforms to Genome CoordinatesYasset Perez-Riverol
 
Biocontainers Hackathon Introduction
Biocontainers Hackathon IntroductionBiocontainers Hackathon Introduction
Biocontainers Hackathon IntroductionYasset Perez-Riverol
 
BioContainers on ELIXIR All Hands 2017
BioContainers on ELIXIR All Hands 2017BioContainers on ELIXIR All Hands 2017
BioContainers on ELIXIR All Hands 2017Yasset Perez-Riverol
 
Do we need to make public our proteomics data?
Do we need to make public our proteomics data?Do we need to make public our proteomics data?
Do we need to make public our proteomics data?Yasset Perez-Riverol
 
Design of an hexapeptide database for proteomics studies
Design of an hexapeptide database for proteomics studiesDesign of an hexapeptide database for proteomics studies
Design of an hexapeptide database for proteomics studiesYasset Perez-Riverol
 
Parallel conformational search of small molecules
Parallel conformational search of small moleculesParallel conformational search of small molecules
Parallel conformational search of small moleculesYasset Perez-Riverol
 
PRIDE and ProteomeXchange – Making proteomics data accessible and reusable
PRIDE and ProteomeXchange – Making proteomics data accessible and reusablePRIDE and ProteomeXchange – Making proteomics data accessible and reusable
PRIDE and ProteomeXchange – Making proteomics data accessible and reusable Yasset Perez-Riverol
 
SintCompound: A Small Compound Database for Virtual Screening
SintCompound: A Small Compound Database for Virtual ScreeningSintCompound: A Small Compound Database for Virtual Screening
SintCompound: A Small Compound Database for Virtual ScreeningYasset Perez-Riverol
 

Mehr von Yasset Perez-Riverol (12)

Biocontainers 2019: Presentation for the ELIXIR All Hands
Biocontainers 2019: Presentation for the ELIXIR All HandsBiocontainers 2019: Presentation for the ELIXIR All Hands
Biocontainers 2019: Presentation for the ELIXIR All Hands
 
Mapping millions of peptidoforms to Genome Coordinates
Mapping millions of peptidoforms to Genome CoordinatesMapping millions of peptidoforms to Genome Coordinates
Mapping millions of peptidoforms to Genome Coordinates
 
Biocontainers Hackathon Introduction
Biocontainers Hackathon IntroductionBiocontainers Hackathon Introduction
Biocontainers Hackathon Introduction
 
BioContainers on ELIXIR All Hands 2017
BioContainers on ELIXIR All Hands 2017BioContainers on ELIXIR All Hands 2017
BioContainers on ELIXIR All Hands 2017
 
Do we need to make public our proteomics data?
Do we need to make public our proteomics data?Do we need to make public our proteomics data?
Do we need to make public our proteomics data?
 
Design of an hexapeptide database for proteomics studies
Design of an hexapeptide database for proteomics studiesDesign of an hexapeptide database for proteomics studies
Design of an hexapeptide database for proteomics studies
 
Parallel conformational search of small molecules
Parallel conformational search of small moleculesParallel conformational search of small molecules
Parallel conformational search of small molecules
 
PBS Web (Spanish)
PBS Web (Spanish)PBS Web (Spanish)
PBS Web (Spanish)
 
PRIDE and ProteomeXchange – Making proteomics data accessible and reusable
PRIDE and ProteomeXchange – Making proteomics data accessible and reusablePRIDE and ProteomeXchange – Making proteomics data accessible and reusable
PRIDE and ProteomeXchange – Making proteomics data accessible and reusable
 
Yasset perezriverol csi2011
Yasset perezriverol csi2011Yasset perezriverol csi2011
Yasset perezriverol csi2011
 
Yasset iso point-cigb-2012
Yasset iso point-cigb-2012Yasset iso point-cigb-2012
Yasset iso point-cigb-2012
 
SintCompound: A Small Compound Database for Virtual Screening
SintCompound: A Small Compound Database for Virtual ScreeningSintCompound: A Small Compound Database for Virtual Screening
SintCompound: A Small Compound Database for Virtual Screening
 

Kürzlich hochgeladen

OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024innovationoecd
 
GenAI talk for Young at Wageningen University & Research (WUR) March 2024
GenAI talk for Young at Wageningen University & Research (WUR) March 2024GenAI talk for Young at Wageningen University & Research (WUR) March 2024
GenAI talk for Young at Wageningen University & Research (WUR) March 2024Jene van der Heide
 
FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naJASISJULIANOELYNV
 
Base editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editingBase editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editingNetHelix
 
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...D. B. S. College Kanpur
 
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxMicrophone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxpriyankatabhane
 
《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》rnrncn29
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxmalonesandreagweneth
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxpriyankatabhane
 
User Guide: Orion™ Weather Station (Columbia Weather Systems)
User Guide: Orion™ Weather Station (Columbia Weather Systems)User Guide: Orion™ Weather Station (Columbia Weather Systems)
User Guide: Orion™ Weather Station (Columbia Weather Systems)Columbia Weather Systems
 
CHROMATOGRAPHY PALLAVI RAWAT.pptx
CHROMATOGRAPHY  PALLAVI RAWAT.pptxCHROMATOGRAPHY  PALLAVI RAWAT.pptx
CHROMATOGRAPHY PALLAVI RAWAT.pptxpallavirawat456
 
Observational constraints on mergers creating magnetism in massive stars
Observational constraints on mergers creating magnetism in massive starsObservational constraints on mergers creating magnetism in massive stars
Observational constraints on mergers creating magnetism in massive starsSérgio Sacani
 
Topic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptxTopic 9- General Principles of International Law.pptx
Topic 9- General Principles of International Law.pptxJorenAcuavera1
 
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxSTOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxMurugaveni B
 
ECG Graph Monitoring with AD8232 ECG Sensor & Arduino.pptx
ECG Graph Monitoring with AD8232 ECG Sensor & Arduino.pptxECG Graph Monitoring with AD8232 ECG Sensor & Arduino.pptx
ECG Graph Monitoring with AD8232 ECG Sensor & Arduino.pptxmaryFF1
 
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 GenuineCall Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuine
Call Girls in Majnu Ka Tilla Delhi 🔝9711014705🔝 Genuinethapagita
 
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...Universidade Federal de Sergipe - UFS
 
Thermodynamics ,types of system,formulae ,gibbs free energy .pptx
Thermodynamics ,types of system,formulae ,gibbs free energy .pptxThermodynamics ,types of system,formulae ,gibbs free energy .pptx
Thermodynamics ,types of system,formulae ,gibbs free energy .pptxuniversity
 
Citronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyayCitronella presentation SlideShare mani upadhyay
Citronella presentation SlideShare mani upadhyayupadhyaymani499
 

Kürzlich hochgeladen (20)

OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024OECD bibliometric indicators: Selected highlights, April 2024
OECD bibliometric indicators: Selected highlights, April 2024
 
GenAI talk for Young at Wageningen University & Research (WUR) March 2024
GenAI talk for Young at Wageningen University & Research (WUR) March 2024GenAI talk for Young at Wageningen University & Research (WUR) March 2024
GenAI talk for Young at Wageningen University & Research (WUR) March 2024
 
FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by na
 
Base editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editingBase editing, prime editing, Cas13 & RNA editing and organelle base editing
Base editing, prime editing, Cas13 & RNA editing and organelle base editing
 
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
Fertilization: Sperm and the egg—collectively called the gametes—fuse togethe...
 
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxMicrophone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
 
《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》《Queensland毕业文凭-昆士兰大学毕业证成绩单》
《Queensland毕业文凭-昆士兰大学毕业证成绩单》
 
Let’s Say Someone Did Drop the Bomb. Then What?
Let’s Say Someone Did Drop the Bomb. Then What?Let’s Say Someone Did Drop the Bomb. Then What?
Let’s Say Someone Did Drop the Bomb. Then What?
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx

OpenMS: Quantitative Proteomics Tools and Workflows

  • 1. Yasset Perez-Riverol Ph.D github: github.com/ypriverol twitter: @ypriverol OpenMS: Quantitative proteomics at large scale
  • 2. Proteomics Bioinformatics EMBL-EBI, December 2016 Outline • Introduction to OpenMS Modularity & Workflows Visualization. Integration with other tools. • Two example workflows Protein identification Label-free quantification
  • 3. Modularity is the degree to which a system's components may be separated and recombined.
  • 5. Modularity. Tools for identification: DecoyDatabase, MascotAdapter, XTandemAdapter, MSGFPlusAdapter, PeptideIndexer, FalseDiscoveryRate, IDPosteriorErrorProbability, ConsensusID, LuciphorAdapter, HighResPrecursorMassCorrector, FidoAdapter. Tools for quantification: PeakPickerHiRes, FeatureFinderMultiplex, FeatureFinderCentroided, SpectraMerger, NoiseFilterSGolay, ITRAQAnalyzer, IDMapper, IDConflictResolver, MapAlignerPoseClustering, MapRTTransformer, FeatureLinkerUnlabeledQT, ProteinQuantifier. Tools for file handling: FileConverter, FileMerger, FileFilter, IDFileConverter, IDMerger, IDFilter, MzTabExporter, FileInfo. OpenMS ⇨ collection of 180 software tools; ≈ 30 tools are sufficient for standard workflows.
  • 6. OpenMS – an open-source framework for computational mass spectrometry. Portable: available on Windows, OSX, Linux. OpenMS TOPP tools – The OpenMS Proteomics Pipeline tools • > 180 building blocks: one application for each analysis step • Vendor independent: uses PSI standard formats. Can be integrated in various workflow systems • TOPPAS – TOPP Pipeline Assistant • Galaxy • KNIME
  • 7. KNIME and TOPPView. KNIME – KoNstanz Information MinEr • Enables building customized workflows from OpenMS components. TOPPView: an OpenMS data viewer. • Based on standard file formats. • Shows MS/MS information, peptides/proteins, quantitative information.
  • 8. KNIME – Workflow System. KNIME – KoNstanz Information MinEr. Industrial-strength, general-purpose workflow system. Convenient and easy-to-use graphical user interface. Available for Windows, OSX, Linux at http://KNIME.org. [Screenshot: KNIME (CC BY-SA 4.0), panels labelled Workflows, Plots, Tables, Console, Nodes]
  • 9. Workflow Builder: Data Flow. KNIME-OpenMS workflows consist of distinct nodes that are assembled into workflows. Either tables or files are exchanged between nodes along the edges of the workflow. Configuration dialogs are used to set node parameters. Loops allow iterating sequentially over lists of data. Switches allow executing nodes or subworkflows depending on a condition.
  • 10. Scripting. KNIME permits the embedding of R code for advanced statistics. R scripts are embedded using the R Snippet node. All plotting capabilities of R can be used as well.
  • 11. Peptide/Protein Identification. Task: identify peptides in multiple samples. Mass spectra enter the workflow on the left. Loop nodes permit execution of parts of the workflow. Identified proteins end up in result files (right side).
  • 12. TOPPView: Visualization of the results (mzML, idXML)
  • 13. Workflow – Plug-In System. Task: identify peptides in multiple samples. Mass spectra enter the workflow on the left. Loop nodes permit execution of parts of the workflow. Identified proteins end up in result files (right side).
  • 14. Workflow – Plug-In System. Task: identify peptides in multiple samples. Combination of X!Tandem + OMSSA. Definition of QC parameters such as FDR, q-values, p-values.
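The FDR and q-value parameters mentioned on this slide can be illustrated with a small target-decoy calculation. This is a minimal sketch, not the code of any OpenMS node: the function name `q_values`, the `(score, is_decoy)` input layout, the "higher score is better" convention, and the simple decoys/targets FDR estimate are all assumptions made for illustration.

```python
def q_values(psms):
    """Return (score, is_decoy, q_value) tuples, best score first.

    psms: list of (score, is_decoy) pairs; a higher score is assumed better.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    # FDR estimate at each score threshold: decoys seen / targets seen so far.
    fdrs = []
    targets = decoys = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the smallest FDR at this threshold or any more permissive one.
    q = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q.append(running_min)
    q.reverse()

    return [(s, d, qv) for (s, d), qv in zip(ranked, q)]


if __name__ == "__main__":
    example = [(10, False), (9, False), (8, True), (7, False), (6, False)]
    for score, is_decoy, qv in q_values(example):
        print(score, "decoy" if is_decoy else "target", round(qv, 2))
```

Filtering at, say, q ≤ 0.01 would then keep only the PSMs above the first decoy in this toy list.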
  • 15. Complex and customized Workflows. Best protein inference algorithm: 3 datasets, 4 search engines, 5 protein inference algorithms, > 140 combinations. Audain E. & Uszkoreit J. et al., Journal of Proteomics, 2017.
      Algorithm      | X!Tandem       | Mascot         | MS-GF+         | Merged
      PIA            | 1214 64 (5.3%) | 1442 74 (5.1%) | 1631 93 (5.7%) | 1615 101 (6.2%)
      Fido           | 996 67 (6.7%)  | 1439 80 (5.6%) | 1679 96 (5.7%) | 1619 105 (6.5%)
      ProteinLP      | 989 64 (6.5%)  | 1229 77 (2.3%) | 1651 93 (5.6%) | 1295 104 (8.0%)
      MSBayesPro     | 749 24 (3.2%)  | 958 26 (2.7%)  | 1303 31 (2.4%) | 963 36 (3.7%)
      ProteinProphet | 1027 64 (6.2%) | 1282 73 (5.7%) | 1629 91 (5.6%) | 1629 99 (6.7%)
  • 16. Some of the Identification nodes. IDPosteriorErrorProbability: computes the posterior error probability for each PSM and generates a new file with the corresponding values. ConsensusID: combines PSM identifications from multiple search engines and generates a combined posterior error probability for each PSM; for each peptide ID, the best score of any search engine is used as the consensus score. FalseDiscoveryRate: estimates the false discovery rate at peptide and protein level using decoy searches.
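The "best score of any search engine" rule described for ConsensusID can be sketched as follows. This is an illustration only and not the ConsensusID implementation: the function name `best_score_consensus`, the nested-dictionary input, and the assumption that scores are posterior error probabilities (lower is better) are all hypothetical choices.

```python
def best_score_consensus(engine_hits):
    """Combine per-engine PSM scores by keeping the best (lowest) PEP.

    engine_hits: dict mapping engine name -> dict mapping
                 (spectrum_id, peptide) -> posterior error probability.
    Returns a dict mapping (spectrum_id, peptide) -> consensus PEP.
    """
    consensus = {}
    for hits in engine_hits.values():
        for key, pep in hits.items():
            # Keep the lowest PEP reported by any engine for this PSM.
            if key not in consensus or pep < consensus[key]:
                consensus[key] = pep
    return consensus


if __name__ == "__main__":
    hits = {
        "engine_a": {("s1", "PEPTIDEK"): 0.02, ("s2", "ELVISK"): 0.40},
        "engine_b": {("s1", "PEPTIDEK"): 0.01, ("s3", "LIVESK"): 0.20},
    }
    for psm, pep in sorted(best_score_consensus(hits).items()):
        print(psm, pep)
```

Scores from different engines are only comparable once they are on a common scale, which is why the sketch assumes posterior error probabilities have already been computed for every engine.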
  • 17. Adapters and Complementary Nodes. FileMerger: this node takes two files (or file lists) as input and outputs a merged list of both inputs; the order corresponds to the order of the input lists and ports. IDMerger: merges several protein/peptide identification files into one file. PeptideIndexer: refreshes the protein references for all peptide hits. IDFilter: filters results from protein or peptide identification engines based on different criteria.
  • 18. Quantitative Proteomics. Relative quantification: labeled (in vivo: 14N/15N, SILAC; in vitro: iTRAQ, TMT, 16O/18O) or label-free (spectral counting, MRM, feature-based). Absolute quantification: AQUA, SISCAPA. And many more…
  • 19. Label-Free Quantification (LFQ). Label-free quantification is probably the most natural way of quantifying • No labeling required, which removes further sources of error; cheap • Different samples are acquired in different measurements, so higher reproducibility is needed • Manual analysis is difficult • Scales very well with the number of samples: basically no limit, and no difference in the analysis between 2 or 100 samples
  • 20. Feature-based LFQ - LC-MS Maps. Spectra are acquired at rates of up to dozens per second. Stacking the spectra yields peak maps. Resolution: • Up to millions of points per spectrum • Tens of thousands of spectra per LC run. Huge 2D datasets of up to hundreds of GB per sample. [Figure labels: Feature (eluting peptide); Quantification (3x over-expressed, …)]
  • 21. Feature-based LFQ 1. Find features in all maps 2. Align maps 3. Link corresponding features 4. Identify features 5. Quantify features 6. Quantify proteins based on their peptides. Example output (Sample 1 : Sample 2 : Sample 3): NPC2_HUMAN 1.0 : 5.2 : 0.3; CD177_HUMAN 1.0 : 0.2 : 0.4
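Step 6 (quantifying proteins from their peptides) and the ratio notation on this slide can be sketched like this. It is a simplified illustration and not the ProteinQuantifier algorithm: summing peptide intensities per protein, using the first sample as the reference, and all intensity values are assumptions; only the protein name is taken from the slide.

```python
def protein_ratios(peptide_intensities, protein_of):
    """Sum peptide intensities per protein and express them relative to sample 1.

    peptide_intensities: dict peptide -> list of intensities, one per sample.
    protein_of: dict peptide -> protein accession.
    Returns dict protein -> list of ratios (first sample is the reference).
    """
    totals = {}
    for peptide, intensities in peptide_intensities.items():
        protein = protein_of[peptide]
        sums = totals.setdefault(protein, [0.0] * len(intensities))
        for i, value in enumerate(intensities):
            sums[i] += value

    ratios = {}
    for protein, sums in totals.items():
        if sums[0] == 0:
            continue  # no reference signal, so a ratio cannot be formed
        ratios[protein] = [round(s / sums[0], 2) for s in sums]
    return ratios


if __name__ == "__main__":
    # Made-up intensities for two hypothetical peptides of one protein.
    intensities = {"AAK": [100.0, 500.0, 20.0], "CCR": [100.0, 540.0, 40.0]}
    mapping = {"AAK": "NPC2_HUMAN", "CCR": "NPC2_HUMAN"}
    for protein, r in protein_ratios(intensities, mapping).items():
        print(protein, " : ".join(str(x) for x in r))
```

With these invented numbers the sketch prints `NPC2_HUMAN 1.0 : 5.2 : 0.3`, mirroring the notation used on the slide.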
  • 22. Label-Free Workflow. Different algorithms have been proposed by the OpenMS community for label-free quantification: • Weisser H., Journal of Proteome Research (2013) • Bo Zhang, Molecular & Cellular Proteomics (2016) • Veit J., Journal of Proteome Research (2016) • Ranninger C., Analytica Chimica Acta (2016)
  • 23. DeMix-Q Algorithm and Workflow. Bo Zhang, Lukas Käll & Roman A. Zubarev, MCP (2016)
  • 24. Reliable and reproducible Quantitation
  • 25. LFQ Relevant nodes. FeatureFinderCentroided: detects two-dimensional features in LC-MS data. MapAlignerPoseClustering: corrects retention time distortions between maps using a pose clustering approach. FeatureLinkerUnlabeledQT: groups corresponding features from multiple maps. ConsensusMapNormalizer: normalizes maps of one consensusXML file.
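To make the idea of normalizing maps concrete, here is a minimal median-scaling sketch. It is one simple normalization scheme chosen for illustration and is not claimed to be what ConsensusMapNormalizer does internally; the function name `median_normalize`, the list-of-lists input, and the choice of the first map as reference are assumptions.

```python
from statistics import median


def median_normalize(maps):
    """Scale every map so that its median intensity matches the first map's.

    maps: list of lists of feature intensities, one inner list per LC-MS map.
    Returns a new list of lists with rescaled intensities.
    """
    medians = [median(m) for m in maps]
    reference = medians[0]
    return [
        [value * reference / med for value in m]
        for m, med in zip(maps, medians)
    ]


if __name__ == "__main__":
    # The second map was measured at roughly twice the overall intensity.
    runs = [[10.0, 20.0, 30.0], [20.0, 40.0, 60.0]]
    print(median_normalize(runs))
```

The sketch assumes every map is non-empty and has a non-zero median; real data would need both cases handled.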
  • 26. OpenMS at Large Scale. Galaxy, WS-PGRADE/gUSE, KNIME. Each individual tool can be run from the command line, which makes it possible to distribute it in large HPC environments.
      $> FileFilter -in myinfile.mzML -levels 2 -rt 100:1500 -out myoutfile.mzML
      $> OpenSwathDecoyGenerator.exe -in OpenSWATH_SGS_AssayLibrary.TraML -out OpenSWATH_SGS_AssayLibrary_with_Decoys.TraML -method shuffle -append exclude_similar -remove_unannotated
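Because each tool is an ordinary command-line program, distributing work over many files mostly means generating one command per input. The sketch below builds `FileFilter` argument lists using only the flags shown on the slide (`-in`, `-levels`, `-rt`, `-out`); the function name `filefilter_commands`, the `_filtered` output naming, and the file names are hypothetical, and nothing is executed.

```python
from pathlib import Path


def filefilter_commands(in_files, out_dir, levels="2", rt="100:1500"):
    """Build one FileFilter argument list per input mzML file (no execution)."""
    commands = []
    for name in in_files:
        src = Path(name)
        # Hypothetical naming scheme: <input stem>_filtered.mzML in out_dir.
        dst = Path(out_dir) / f"{src.stem}_filtered.mzML"
        commands.append([
            "FileFilter",
            "-in", str(src),
            "-levels", levels,
            "-rt", rt,
            "-out", str(dst),
        ])
    return commands


if __name__ == "__main__":
    for cmd in filefilter_commands(["run1.mzML", "run2.mzML"], "filtered"):
        print(" ".join(cmd))
```

Each argument list could then be passed to `subprocess.run` locally or written into the job script of a cluster scheduler, one job per file.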
  • 27. Conclusions • OpenMS: modular workflow system • Standard workflows: SILAC, iTRAQ/TMT, label-free, SWATH, quality control • Strong collaboration with other projects: ProteoWizard, Thermo PD, KNIME, Fido, Percolator, search engines, HUPO-PSI formats
  • 28. How to run OpenMS workflows • OpenMS, local installation (Windows, OS X, Linux) http://bit.ly/1J6lz6h http://openms.de/workflows • OpenMS in Proteome Discoverer (LFQProfiler and RNPxl for PD 2.1) http://openms.de/PD • OpenMS in Galaxy http://galaxy.uni-freiburg.de • OpenMS in KNIME https://tech.knime.org/community/bioinf/openms