This work has been supported by the BBSRC/EPSRC grant: the Manchester Centre for Integrative Systems Biology



Subliminal: exploiting semantic annotations in the reconstruction of metabolic networks
Neil Swainston
Manchester Centre for Integrative Systems Biology, University of Manchester, Manchester M1 7ND, UK

Introduction
The number of metabolic network reconstructions has grown in recent years. Reconstructions now cover a range of organisms and have been applied
to research topics including metabolic engineering, genome annotation, evolutionary studies, network property analysis and the
interpretation of omics datasets [1].

The process of developing such reconstructions is now well defined and is recognised as being time-consuming [2]. While many of the steps in
generating a high-quality reconstruction require manual curation, some are amenable to automation, making it possible to generate a draft
reconstruction automatically for use in subsequent manual curation [3].

The importance of standard representations such as SBML [4] and the MIRIAM standard [5] has been recognised [6]: reconstructions in which all
components are semantically annotated with unambiguous database identifiers are far easier for third parties to use.

To date, however, the use of semantic annotations has focused on the usability of the reconstruction after publication. Subliminal is a
toolbox that exploits semantic annotations during the reconstruction process. It uses libAnnotationSBML [7] and web service interfaces to
external databases such as ChEBI [8] and UniProt [9] to retrieve chemical and protein data, which are then used to automate chemical
protonation state determination, reaction mass / charge balancing and enzyme (and reaction) localisation.


Initial pre-draft pathways: KEGG2SBML and other sources


Initial pre-draft pathways for a given organism are generated with the existing KEGG2SBML [10] tool. KEGG2SBML generates SBML files
representing individual metabolic pathways, which are then enhanced with semantic annotations: ChEBI identifiers for metabolites,
UniProt identifiers for enzymes, and EC terms.
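As an illustration of what such annotations look like to a program, the following minimal Python sketch reads MIRIAM-style URNs from an annotated SBML species element. It is not the libAnnotationSBML API: the XML fragment, the function name miriam_ids and the use of the standard library parser are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import unquote

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"

# Hypothetical species element annotated in the MIRIAM URN style.
SPECIES_XML = """
<species id="s1" name="ATP">
  <annotation>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:bqbiol="http://biomodels.net/biology-qualifiers/">
      <rdf:Description rdf:about="#s1">
        <bqbiol:is>
          <rdf:Bag>
            <rdf:li rdf:resource="urn:miriam:obo.chebi:CHEBI%3A15422"/>
          </rdf:Bag>
        </bqbiol:is>
      </rdf:Description>
    </rdf:RDF>
  </annotation>
</species>
"""


def miriam_ids(element):
    """Return (collection, identifier) pairs for every MIRIAM URN below element."""
    pairs = []
    for li in element.iter(RDF + "li"):
        urn = li.get(RDF + "resource", "")
        if urn.startswith("urn:miriam:"):
            # Split "obo.chebi:CHEBI%3A15422" at the first colon only.
            collection, _, identifier = urn[len("urn:miriam:"):].partition(":")
            pairs.append((collection, unquote(identifier)))
    return pairs


species = ET.fromstring(SPECIES_XML.strip())
print(miriam_ids(species))  # [('obo.chebi', 'CHEBI:15422')]
```

The same function would return UniProt or EC identifiers from enzyme annotations, since only the collection name in the URN differs.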

Subsequent work will focus on generating additional pathways from MetaCyc [11] and from genome sequences.


Model merging: pre-draft reconstruction


Because each initial pre-draft pathway, irrespective of its source, is semantically annotated with comparable terms, the pathways
can be merged automatically to generate a pre-draft reconstruction from which duplicate metabolites, enzymes and
reactions are removed.
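The merging idea can be sketched in a few lines of Python. This is an assumption-laden illustration, not Subliminal's implementation: pathways are modelled as plain dictionaries, metabolites are keyed by ChEBI identifier, and two reactions count as duplicates when they have the same sets of reactants and products. The identifiers are illustrative and should be checked against ChEBI.

```python
def reaction_key(reaction):
    """Two reactions are treated as duplicates if their participants match."""
    return (frozenset(reaction["reactants"]), frozenset(reaction["products"]))


def merge_pathways(pathways):
    """Merge pathways, keeping one copy of each annotated metabolite and reaction."""
    species = {}
    reactions = {}
    for pathway in pathways:
        for chebi_id, name in pathway["species"].items():
            species.setdefault(chebi_id, name)  # first name seen wins
        for reaction in pathway["reactions"]:
            reactions.setdefault(reaction_key(reaction), reaction)
    return {"species": species, "reactions": list(reactions.values())}


# Hypothetical pathways that share ATP hydrolysis (identifiers illustrative).
pathway_a = {
    "species": {"CHEBI:15422": "ATP", "CHEBI:15377": "water",
                "CHEBI:16761": "ADP", "CHEBI:18367": "phosphate"},
    "reactions": [{"reactants": ["CHEBI:15422", "CHEBI:15377"],
                   "products": ["CHEBI:16761", "CHEBI:18367"]}],
}
pathway_b = {
    "species": {"CHEBI:15422": "adenosine triphosphate", "CHEBI:15377": "H2O",
                "CHEBI:16761": "ADP", "CHEBI:18367": "Pi",
                "CHEBI:17234": "glucose"},
    "reactions": [{"reactants": ["CHEBI:15377", "CHEBI:15422"],
                   "products": ["CHEBI:18367", "CHEBI:16761"]}],
}

merged = merge_pathways([pathway_a, pathway_b])
print(len(merged["species"]), len(merged["reactions"]))  # 5 1
```

Because the keys are database identifiers rather than names, "ATP" and "adenosine triphosphate" collapse into one entry even though the two sources label them differently.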




Protonation state prediction


The InChI [12] (or SMILES) string representing each metabolite is acquired automatically from the ChEBI database. This allows the
protonation state of the metabolite at a given pH to be predicted using cheminformatic resources such as the Chemistry
Development Kit (CDK) [13].
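The principle behind choosing a protonation state at a given pH can be shown with a small Python sketch. It does not use the CDK and is not the method used by Subliminal: it assumes the pKa values of the ionisable groups are already known and simply applies the rule that an acidic group is predominantly deprotonated when the pH exceeds its pKa. The function name and the approximate textbook pKa values for phosphoric acid are assumptions for illustration.

```python
def predominant_charge(acidic_pkas, ph, basic_pkas=()):
    """Net charge of the predominant microspecies at the given pH.

    An acidic group counts as deprotonated (-1) when pH > pKa. A basic group
    counts as protonated (+1) when pH < pKa of its conjugate acid.
    """
    negative = sum(1 for pka in acidic_pkas if ph > pka)
    positive = sum(1 for pka in basic_pkas if ph < pka)
    return positive - negative


# Approximate textbook pKa values for phosphoric acid (an assumption here,
# not data retrieved from ChEBI or computed with the CDK).
PHOSPHORIC_ACID_PKAS = (2.15, 7.20, 12.35)

print(predominant_charge(PHOSPHORIC_ACID_PKAS, 7.4))  # -2
print(predominant_charge(PHOSPHORIC_ACID_PKAS, 5.0))  # -1
```

In a real pipeline the pKa values would come from a prediction tool working on the InChI or SMILES structure rather than from a hand-written table.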




Reaction mass / charge balancing

By acquiring the chemical formula and charge of each metabolite from the ChEBI database, each reaction can be
represented as a matrix, A, whose columns hold the element counts and charge of each reactant and product. The vector, b,
represents the stoichiometric coefficients of the reactants and products. Mixed integer linear programming can be applied to solve

   Ab = 0

producing a vector of stoichiometric coefficients to be applied to each reactant and product. Commonly absent species,
such as water, protons and CO2, can also be considered, allowing previously unbalanceable reactions (for example, from
KEGG) to be balanced automatically.
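The balancing step can be illustrated with a small Python sketch. To stay dependency-free it replaces mixed integer linear programming with an exhaustive search over small integer coefficients, which is only practical for single reactions; the function names, the coefficient limit and the hard-coded formulae (charged forms of ATP, ADP and phosphate, entered by hand rather than fetched from ChEBI) are assumptions for this example. Reactants carry negative coefficients, products positive ones, and optional species may appear on either side or not at all.

```python
from itertools import product

# Element counts and net charge per species (hand-entered for illustration).
SPECIES = {
    "ATP": ({"C": 10, "H": 12, "N": 5, "O": 13, "P": 3}, -4),
    "H2O": ({"H": 2, "O": 1}, 0),
    "ADP": ({"C": 10, "H": 12, "N": 5, "O": 10, "P": 2}, -3),
    "Pi": ({"H": 1, "O": 4, "P": 1}, -2),
    "H+": ({"H": 1}, 1),
    "CO2": ({"C": 1, "O": 2}, 0),
}


def build_matrix(names):
    """Rows of A: one per element plus a final charge row; one column per species."""
    elements = sorted({e for n in names for e in SPECIES[n][0]})
    rows = [[SPECIES[n][0].get(e, 0) for n in names] for e in elements]
    rows.append([SPECIES[n][1] for n in names])
    return rows


def balance(reactants, products, optional=("H+", "CO2"), max_coeff=3):
    """Find signed integer coefficients b with Ab = 0 and minimal sum of |b|."""
    extras = [s for s in optional if s not in reactants and s not in products]
    names = list(reactants) + list(products) + extras
    a = build_matrix(names)
    ranges = ([range(-max_coeff, 0)] * len(reactants)          # consumed
              + [range(1, max_coeff + 1)] * len(products)      # produced
              + [range(-max_coeff, max_coeff + 1)] * len(extras))  # either or absent
    best = None
    for b in product(*ranges):
        if all(sum(x * y for x, y in zip(row, b)) == 0 for row in a):
            if best is None or sum(map(abs, b)) < sum(map(abs, best)):
                best = b
    return dict(zip(names, best)) if best else None


# ATP + H2O -> ADP + Pi is unbalanced as written; a proton is missing.
print(balance(["ATP", "H2O"], ["ADP", "Pi"]))
# {'ATP': -1, 'H2O': -1, 'ADP': 1, 'Pi': 1, 'H+': 1, 'CO2': 0}
```

The search adds one proton to the product side and leaves CO2 out, which is the kind of correction described above for reactions that lack commonly omitted species.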

Protein localisation


Because each enzyme is annotated with UniProt terms, the UniProt web services can be queried to acquire each protein sequence
automatically. The sequences can be fed to subcellular location prediction algorithms such as PSORT [14] to
predict the subcellular location of the enzyme and, by implication, of the reaction(s) that it catalyses.
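Sequence retrieval can be sketched as follows in Python. The URL pattern is that of the current UniProt REST service, which is an assumption here and not necessarily the web service interface used by Subliminal; the function names are hypothetical. The sketch stops at the sequence, since submitting it to a predictor such as PSORT depends on how that tool is installed or hosted.

```python
from urllib.request import urlopen

# Assumed endpoint: the current UniProt REST service.
UNIPROT_FASTA_URL = "https://rest.uniprot.org/uniprotkb/{accession}.fasta"


def parse_fasta(text):
    """Return (header, sequence) for the first record in a FASTA string."""
    lines = [line.strip() for line in text.strip().splitlines()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("not a FASTA record")
    sequence_lines = []
    for line in lines[1:]:
        if line.startswith(">"):  # start of a second record: stop
            break
        sequence_lines.append(line)
    return lines[0][1:], "".join(sequence_lines)


def fetch_sequence(accession):
    """Download one UniProt entry as FASTA (requires network access)."""
    url = UNIPROT_FASTA_URL.format(accession=accession)
    with urlopen(url) as response:
        return parse_fasta(response.read().decode("utf-8"))


# Offline demonstration of the parser on a made-up record.
header, sequence = parse_fasta(">sp|X00000|EXAMPLE made-up protein\nMKT\nLLV\n")
print(header, len(sequence))  # sp|X00000|EXAMPLE made-up protein 6
```

Calling fetch_sequence with each UniProt accession found in the enzyme annotations would yield the sequences to pass on to the localisation predictor.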




Future directions
While individual steps in the reconstruction process are amenable to automation, gap-filling, manual curation and validation remain
essential steps in generating a high-quality reconstruction. Semantic annotations can further aid validation through the automated
harvesting of chemical synonyms, which can be fed to text-mining tools such as PathText [15] to simplify the arduous but necessary task
of finding evidence in the literature for present (and missing) reactions.
1. Applications of genome-scale metabolic reconstructions. Oberhardt MA, Palsson BØ, Papin JA. Mol Syst Biol. (2009) 5, 320.
2. A protocol for generating a high-quality genome-scale metabolic reconstruction. Thiele I, Palsson BØ. Nat Protoc. (2010) 5, 93-121.
3. High-throughput generation, optimization and analysis of genome-scale metabolic models. Henry CS, DeJongh M, et al. Nat Biotechnol. (2010) 28, 977-82.
4. The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Hucka M, Finney A, et al. Bioinformatics. (2003) 19, 524-31.
5. Minimum information requested in the annotation of biochemical models (MIRIAM). Le Novère N, Finney A, et al. Nat Biotechnol. (2005) 23, 1509-15.
6. A consensus yeast metabolic network reconstruction obtained from a community approach to systems biology. Herrgård MJ, Swainston N, et al. Nat Biotechnol. (2008) 26, 1155-60.
7. libAnnotationSBML: a library for exploiting SBML annotations. Swainston N, Mendes P. Bioinformatics. (2009) 25, 2292-3.
8. ChEBI: a database and ontology for chemical entities of biological interest. Degtyarenko K, de Matos P, et al. Nucleic Acids Res. (2008) 36, D344-50.
9. The Universal Protein Resource (UniProt) in 2010. UniProt Consortium. Nucleic Acids Res. (2010) 38, D142-8.
10. http://sbml.org/Software/KEGG2SBML/
11. The EcoCyc and MetaCyc databases. Karp PD, Riley M, et al. Nucleic Acids Res. (2000) 28, 56-9.
12. http://www.iupac.org/inchi/
13. The Chemistry Development Kit (CDK): an open-source Java library for Chemo- and Bioinformatics. Steinbeck C, Han Y, et al. J Chem Inf Comput Sci. (2003) 43, 493-500.
14. PSORT: a program for detecting sorting signals in proteins and predicting their subcellular localization. Nakai K, Horton P. Trends Biochem Sci. (1999) 24, 34-6.
15. PathText: a text mining integrator for biological pathway visualizations. Kemper B, Matsuzaki T, et al. Bioinformatics. (2010) 26, i374-81.
