
Metabolite ID mapping and WikiPathways

Slides for the Metabolomics Winter School, Wittenberg/Germany, 2018-03-08.

Published in: Science

  1. Metabolite ID mapping and WikiPathways
     Egon Willighagen, http://chem-bla-ics.blogspot.com/, @egonwillighagen, ORCID:0000-0001-7542-0286
     Metabolomics Winter School, #metawinterschool, Wittenberg, 2018-03-08
     https://goo.gl/Afca8Y
     CC-BY 4.0 Int. (except slides with )
  2. Acknowledgements
     ● WikiPathways and PathVisio projects
       – Prof. Alex Pico's team, UCSF
       – Current/past BiGCaT (Prof. Chris Evelo): Marloes Poort, Denise Slenter, Jacob Windsor, Garima Thakur
       – Pathway providers: Pieter Giesbertz (TUM), Kozo Nishida (RIKEN)
     ● Maastricht University
       – Toxicology: Rianne Fijten, Agnieszka Smolinska
       – Maastricht MultiModal Molecular Imaging Institute (M4I): Prof. Ron Heeren, Benjamin Balluff
       – MaCSBio team and other departments
     ● Open PHACTS
       – Manchester University: Prof. Carole Goble, Christian Brenninkmeijer, Stian Soiland-Reyes
       – Heriot-Watt University: Alasdair Gray
       – Royal Society of Chemistry: Colin Batchelor
     ● Others
       – Bioclipse: Ola Spjuth (Uppsala University)
       – MetaboLights collaboration: Reza Salek and Chandu Venkata
       – ChEBI collaboration: Prof. Christoph Steinbeck, Gareth Owen
       – PubChem collaboration: Evan Bolton, Gang Fu
       – HMDB, Wikidata teams: Andra Waagmeester (Micelio)
  3. Asthma: Detecting and Understanding
     Smolinska et al. PLOS ONE. 2014 9:e105447 doi:10.1371/journal.pone.0105447
  4. Systems Biology: pathways
     Andón FT, Fadeel B. "Programmed Cell Death: Molecular Mechanisms and Implications for Safety Assessment of Nanomaterials." Acc Chem Res, 2012
  5. Dopamine metabolism (Marloes Poort)
  6. The effect of troglitazone on heme biosynthesis
  7. PathVisio: pathway enrichment (etc)
     Van Iersel, M.P., et al. "Presenting and exploring biological pathways with PathVisio." BMC Bioinformatics 9.1 (2008): 399.
     http://pathvisio.org/ → Martina Kutmon
  8. We see a lot? But what is it?
     ● Current techniques can see up to 1000 metabolites in one analysis
       – Only part of all 40k metabolites
     ● We can identify only 30%
       – The other 70% is unknown
  9. Databases & identifiers
     ● HMDB: Human Metabolome Database
     ● ChEBI: Chemical Entities of Biological Interest
     ● ChemSpider, PubChem
     ● CAS: Chemical Abstracts Service
     ● InChI: International Chemical Identifier
  10. Acid/Base conjugates
      CHEBI:15361 (Pyruvate) → Ce:CHEBI:32816 (conjugate) → Ck:C00022 → [WP2456 HIF1A and PPARG regulation of glycolysis, WP2453 TCA Cycle and PDHc]
      CHEBI:15361, CHEBI:32816
  11. Switching identities: Glucose
  12. Switching identities: Warfarin
      Porter, W. (2010). Warfarin: history, tautomerism and activity. Journal of Computer-Aided Molecular Design, 24 (6-7), 553-573. DOI: 10.1007/s10822-010-9335-7
  13. Bridging: identifiers
  14. So, what IDs are used in WikiPathways?
      Curated subset: 2012, 2015, 2017 + Reactome
  15. BridgeDb
      Van Iersel, M.P., et al. "The BridgeDb framework: standardized access to gene, protein and metabolite identifier mapping services." BMC Bioinformatics 11.1 (2010): 5.
      New tools
      ● Open PHACTS' Identifier Mapping Service
      ● R package
      ● Bioclipse
  16. One ID in the pathway, many in the popup (MetaboLights)
  17. Metabolite ID Mapping database
      ● HMDB, ChEBI, Wikidata
  18. BridgeDb: scientific lenses
      ● Gene
        – gene-protein
        – gene-probe
      ● Metabolite
        – Tautomers
        – Compound class
        – Charge (acid/ate)
      Brenninkmeijer, CYA, et al. "Scientific Lenses over Linked Data: An approach to support task specific views of the data. A vision." Proceedings of 2nd International Workshop on Linked Science. 2012.
  19. #1: The breath data set
      CAS numbers: 1843
      CAS numbers (unique): 1733
      CAS numbers with mappings: 718
      CAS numbers matches: 54
      Pathways found: 76
      Matches via CAS: 9
      Matches via mapping: 29
      Matches via ChEBI super class: 35
      Matches via ChEBI charged species: 3
      Matches via ChEBI tautomers: 0
      CAS: 544-63-8 (myristic acid) → Ce:28875 → Ce:15904 (long-chain fatty acid) → [WP368 Mitochondrial LC-Fatty Acid Beta-Oxidation, WP357 Fatty Acid Biosynthesis]
  20. What if we add more CAS ID mappings? (e.g. from Wikidata)
  21. Wikidata
      Mietchen, D. et al. Enabling open science: Wikidata for research (Wiki4R). Research Ideas and Outcomes 1, e7573+ (2015)
  22. Wikidata: identifiers
  23. Wikidata: metabolites
      Spjuth, O. et al., 2007, BMC Bioinformatics
  24. What if we add more CAS ID mappings? (e.g. from Wikidata)
      INFO: Number of ids in Ch (HMDB): 41514 (changed +0.0%)
      INFO: Number of ids in Ce (ChEBI): 64222 (changed +0.0%)
      INFO: Number of ids in Kd (KEGG Drug): 2406 (changed +23960.0%)
      INFO: Number of ids in Ca (CAS): 38621 (changed +30.5%)
      INFO: Number of ids in Wi (Wikipedia): 3991 (changed +0.0%)
      INFO: Number of ids in Ck (KEGG Compound): 15896 (changed +0.0%)
      INFO: Number of ids in Cpc (PubChem-compound): 29170 (changed +72.5%)
      INFO: Number of ids in Wd: 18237
      INFO: Number of ids in Cs (Chemspider): 23981 (changed +49.4%)
      - 30% more CAS numbers (294 unique IDs in WikiPathways)
      - 73% more PubChem compound identifiers (217 unique IDs in WP)
      - 50% more Chemspider identifiers (157 unique IDs in WP)
      - a lot more KEGG Drug identifiers
  25. #1: The breath data set
      CAS numbers: 1843
      CAS numbers (unique): 1733
      CAS numbers with mappings: 978
      CAS numbers matches: 116
      Pathways found: 158 (unique: 62)
      Matches via CAS: 9
      Matches via mapping: 28
      Matches via ChEBI super class: 108
      Matches via ChEBI charged species: 9
      Matches via ChEBI tautomers: 0
      Matches via ChEBI roles: 4
      CAS: 544-63-8 (myristic acid) → Ce:28875 → Ce:15904 (long-chain fatty acid) → [WP368 Mitochondrial LC-Fatty Acid Beta-Oxidation, WP357 Fatty Acid Biosynthesis]
  26. #1: The breath data set
      CAS numbers: 1843
      CAS numbers (unique): 1733
      CAS numbers with mappings: 978
      CAS numbers matches: 116
      Pathways found: 158 (unique: 62)
      Matches via CAS: 9
      Matches via mapping: 28
      Matches via ChEBI super class: 108
      Matches via ChEBI charged species: 9
      Matches via ChEBI tautomers: 0
      Matches via ChEBI roles: 4
  27. Wikidata: Scholia
      doi:10.1007/978-3-319-70407-4_36
  28. Secondary identifiers: ChEBI, HMDB, …
  29. Application Programming Interfaces
  30. Application Programming Interfaces
  31. Biology is a living thing: goo.gl/WJxjTv (Jacob Windsor, @jcbwndsr)
  32. Biology is a living thing: goo.gl/zDrRpH (Jacob Windsor, @jcbwndsr)
  33. Conclusions
      ● Updated metabolite ID database
        – HMDB: still a major workhorse
        – ChEBI: charged species, compound classes
        – Wikidata: CAS numbers, other missing
      ● Pathway Analysis
        – Mapping with Bioclipse and PathVisio
        – Scientific lenses improve mappings
        – Better annotation
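
The mapping chains shown on the "Acid/Base conjugates" and "breath data set" slides can be illustrated with a small Python sketch. It is not the BridgeDb API: the tables are hard-coded toy data taken from the two examples in the slides, the names `DIRECT`, `LENSES`, `PATHWAYS`, `expand` and `find_pathways` are hypothetical, and attaching each pathway to the last identifier of its chain is an assumption made for illustration. The point is only to show how switching on a scientific lens (compound class, charge) makes pathways reachable that plain identifier mapping misses.

```python
from collections import deque

# Toy tables copied from the slide examples; a real analysis would query a
# BridgeDb metabolite mapping database instead (not shown here).
# Prefixes follow the system codes used on the slides: Ca = CAS, Ce = ChEBI,
# Ck = KEGG Compound.
DIRECT = {
    "Ca:544-63-8": {"Ce:28875"},   # myristic acid: CAS -> ChEBI
    "Ce:32816": {"Ck:C00022"},     # ChEBI -> KEGG Compound
}

# Each lens adds extra, task-specific links between identifiers.
LENSES = {
    "super_class": {"Ce:28875": {"Ce:15904"}},  # -> long-chain fatty acid
    "charge": {"Ce:15361": {"Ce:32816"}},       # pyruvate -> conjugate
}

# Assumption: pathways are keyed by the last identifier in each slide chain.
PATHWAYS = {
    "Ce:15904": ["WP368", "WP357"],
    "Ck:C00022": ["WP2456", "WP2453"],
}


def expand(identifier, lenses=()):
    """Return every identifier reachable via direct mappings plus the chosen lenses."""
    seen = {identifier}
    queue = deque([identifier])
    while queue:
        current = queue.popleft()
        neighbours = set(DIRECT.get(current, ()))
        for lens in lenses:
            neighbours |= LENSES[lens].get(current, set())
        for neighbour in neighbours - seen:
            seen.add(neighbour)
            queue.append(neighbour)
    return seen


def find_pathways(identifier, lenses=()):
    """Return the sorted pathway IDs that mention any identifier in the expansion."""
    hits = set()
    for mapped in expand(identifier, lenses):
        hits.update(PATHWAYS.get(mapped, ()))
    return sorted(hits)


if __name__ == "__main__":
    print(find_pathways("Ca:544-63-8"))                    # no lens
    print(find_pathways("Ca:544-63-8", ("super_class",)))  # compound-class lens
    print(find_pathways("Ce:15361", ("charge",)))          # charge lens
```

Without a lens, the myristic acid CAS number maps to its ChEBI entry but reaches no pathway in these toy tables; with the compound-class lens it reaches the two fatty acid pathways, which mirrors the "Matches via ChEBI super class" counts reported for the breath data set.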
