CHEM2BIO2RDF:
A LINKED OPEN DATA PORTAL FOR
SYSTEMS CHEMICAL BIOLOGY

Bin Chen, Ying Ding, Huijun Wang, David Wild, Xiao Dong,
Yuyin Sun, Qian Zhu, Madhuvanthi Sankaranarayanan

              Indiana University at Bloomington
What is Systems Chemical Biology?
   Chemical:   Compound, Drug
   Biology:    Protein, Gene
   Systems:    PPI, Metabolic Pathway, Gene Regulatory
   Phenotype:  Disease, Side effect, Toxicity
   Diagram labels: Chemogenomics; interacting; mapping
All the public data are scattered around the web…




                                              MATADOR
LODD (drug/chemical data)
Bio2RDF (biological data)
Chem2Bio2RDF (chemogenomics: how chemicals interact with biological data)
Linked Open Data (LOD)
   • Bio2RDF
   • LODD
   • Linked Life Data
   • Chem2Bio2RDF
Workflow for RDF conversion
    External Sources (XML, CSV, TXT, DB, …)
      -> Download -> Local copy
      -> Scripts -> Relational DB
      -> D2R Mapping (Ontology) -> D2R server (Publishing)
      -> Dumping -> Virtuoso Triple Store
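The row-to-triple idea behind the mapping step can be sketched in a few lines. This is only an illustration under stated assumptions: the portal itself uses D2R for this step, and the base URI, table name, and column names below are hypothetical. Predicates are named `table_column`, loosely mirroring the style of names such as `sider_side_effect` used later in the deck.

```python
import csv
import io

# Hypothetical base URI, for illustration only (not the portal's namespace).
BASE = "http://example.org/resource/"


def row_to_ntriples(table, key_column, row):
    """Map one relational row to N-Triples lines.

    One subject per row, one predicate per non-key column.
    Assumes key values are already safe to embed in an IRI.
    """
    subject = f"<{BASE}{table}/{row[key_column]}>"
    lines = []
    for column, value in row.items():
        if column == key_column or not value:
            continue  # skip the key itself and empty cells
        predicate = f"<{BASE}{table}_{column}>"
        literal = (
            value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        )
        lines.append(f'{subject} {predicate} "{literal}" .')
    return lines


def csv_to_ntriples(table, key_column, text):
    """Convert CSV text (with a header row) into a list of N-Triples lines."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        triples.extend(row_to_ntriples(table, key_column, row))
    return triples


# Made-up sample data.
SAMPLE = "id,name,target\nD1,drug one,T9\nD2,drug two,\n"
TRIPLES = csv_to_ntriples("drug", "id", SAMPLE)
```

Each non-empty cell becomes one triple, so the empty `target` cell of the second row produces nothing.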
We are focusing on how chemicals interact with biological data
   • 12 databases
   • 204,981 compounds
   • 17,930 genes
   • 646,608 associations

    Caveat: Not all binding data!



                                    MATADOR
Literature-based Systems Chemical Biology
   Covering 1865-2009
   18,502,916 PubMed/Medline literature records!
Workflow for converting PubMed/Medline data
Chem2Bio2RDF data
Over 110 million triples!

Figure: Chem2Bio2RDF Datasets. Each node represents a database, colored by its RDF vendor; each directed edge shows the linkage from one dataset to another, colored by the linkage type. E.g., the type compound includes CID, CAS, ChEBI, DBID and so on. The size of nodes and the width of edges depend on the number of triples and the number of linkages, respectively.
Legend labels: other data vendors; compound, protein/gene, chemogenomics, literature, others
Diagram components:
   • SPARQL endpoints: Chem2Bio2RDF, Bio2RDF, LODD, uniprot, Others
   • Virtuoso triple store
   • Dereferenceable URI
   • Browsing
   • PlotViz: Visualization
   • Cytoscape Plugin
   • Linked Path Generation and Ranking
   • Third party tools
http://chem2bio2rdf.org/medline/resource/medline/15722552 (Dereferenceable URI)
   • Link to Bio2RDF disease
   • Link to Chem2Bio2RDF Gene
   • Link to PubMed website
   • Link to Chem2Bio2RDF pathway
   • Link to Chem2Bio2RDF side effect
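Dereferencing a URI like the one above usually means sending an HTTP GET and asking for a machine-readable representation through the `Accept` header. The sketch below only prepares such a request and does not send it. That the server honors `application/rdf+xml` is an assumption, and the helper name `build_rdf_request` is made up for illustration.

```python
from urllib.request import Request


def build_rdf_request(uri, accept="application/rdf+xml"):
    """Prepare (but do not send) an HTTP GET that asks for RDF rather than HTML."""
    return Request(uri, headers={"Accept": accept}, method="GET")


REQUEST = build_rdf_request(
    "http://chem2bio2rdf.org/medline/resource/medline/15722552"
)
# To actually fetch the data (requires the server to be reachable):
#   from urllib.request import urlopen
#   body = urlopen(REQUEST).read()
```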
Facet browsers using Exhibit




       http://chem2bio2rdf.org/exhibit/drugbank.html
Search Chem2Bio2RDF




                 Search engine results



SPARQL results                           Cytoscape plugin
Answer scientific questions
   • Give me all information about this compound
   • Give me all information about this target
   • Find chemical-associated genes
   • Find gene-associated chemicals
   • Find disease-associated chemicals
   • Find side-effect-associated chemicals
   • Find all the drug-like compounds in PubChem BioAssay that share at least two targets with a drug in DrugBank
   • Link KEGG / Reactome Pathways and PubChem to identify potential multiple pathway inhibitors for MAPK

        More at http://chem2bio2rdf.wikispaces.com/multiple+sources
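The "share at least two targets" question reduces to a set intersection per compound/drug pair. The sketch below shows that logic on made-up data; the compound, drug, and target names are placeholders, not rows from PubChem BioAssay or DrugBank, and the portal answers such questions with SPARQL rather than with this code.

```python
# Made-up compound-to-target sets; all names are placeholders.
DRUG_TARGETS = {"drug1": {"P1", "P2", "P3"}, "drug2": {"P4"}}
ASSAY_ACTIVES = {
    "cmpdA": {"P1", "P2"},
    "cmpdB": {"P3"},
    "cmpdC": {"P2", "P3", "P4"},
}


def share_targets(candidates, drugs, min_shared=2):
    """Return (compound, drug, shared targets) for pairs sharing enough targets."""
    pairs = []
    for compound, c_targets in sorted(candidates.items()):
        for drug, d_targets in sorted(drugs.items()):
            shared = c_targets & d_targets
            if len(shared) >= min_shared:
                pairs.append((compound, drug, sorted(shared)))
    return pairs


PAIRS = share_targets(ASSAY_ACTIVES, DRUG_TARGETS)
```

With this toy data, `cmpdA` and `cmpdC` each share two targets with `drug1`, while `cmpdB` shares only one and is dropped.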
Case study: Adverse drug reaction
1. Scientific Question
   • Drugs that cause similar adverse side effects often have totally different chemical structures

   Figure: Cholestasis, bile salt transporters in liver
2. Hypothesis
   Drug targets might function in the same pathway
3. Methods
Path finding and visualization: find KEGG pathways containing at least two of the targets associated with a given side effect (e.g., hepatomegaly)

SPARQL:
    PREFIX chem2bio: <http://localhost:2020/vocab/resource/>

    # Count, per KEGG pathway, the target matches for drugs with the given side effect
    SELECT ?pathway_id (COUNT(?pathway_id) AS ?count)
    WHERE {
      ?compound  chem2bio:sider_side_effect ?side_effect .
      ?compound  chem2bio:sider_cid ?dbid .
      ?targetid  chem2bio:DrugBankTarget_dbid ?dbid .
      ?targetid  chem2bio:DrugBankTarget_swissport_id ?UniProt_id .
      ?pathwayid chem2bio:KEGG_pathway_gene_keggid ?UniProt_id .
      ?pathwayid chem2bio:KEGG_pathway_pathway_id ?pathway_id .
      FILTER regex(?side_effect, "hepatomegaly", "i") .
    }
    GROUP BY ?pathway_id
    ORDER BY DESC(?count)
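The intent of the method, keeping only pathways that contain at least two targets of drugs with a given side effect, can be illustrated on toy data. Everything below is made up for illustration: the drug, target, and pathway identifiers are placeholders and do not come from SIDER, DrugBank, or KEGG. Unlike the query, this sketch counts distinct targets and applies the "at least two" threshold explicitly.

```python
# Toy, made-up records; identifiers are placeholders, not real database rows.
DRUG_SIDE_EFFECTS = {
    "drugA": {"hepatomegaly"},
    "drugB": {"hepatomegaly", "nausea"},
    "drugC": {"nausea"},
}
DRUG_TARGETS = {"drugA": {"T1", "T2"}, "drugB": {"T2", "T3"}, "drugC": {"T4"}}
PATHWAY_GENES = {
    "pathwayX": {"T1", "T3", "T4"},
    "pathwayY": {"T2"},
    "pathwayZ": {"T4"},
}


def pathways_for_side_effect(side_effect, min_targets=2):
    """Rank pathways by how many targets of side-effect drugs they contain."""
    # Collect all targets of drugs annotated with the side effect.
    targets = set()
    for drug, effects in DRUG_SIDE_EFFECTS.items():
        if side_effect in effects:
            targets |= DRUG_TARGETS.get(drug, set())
    # Keep pathways containing at least `min_targets` of those targets.
    hits = {}
    for pathway, genes in PATHWAY_GENES.items():
        shared = genes & targets
        if len(shared) >= min_targets:
            hits[pathway] = sorted(shared)
    # Most shared targets first, ties broken by pathway name.
    return sorted(hits.items(), key=lambda item: (-len(item[1]), item[0]))


RESULT = pathways_for_side_effect("hepatomegaly")
```

Here only `pathwayX` survives, because it contains two of the three targets (`T1`, `T3`) linked to the side effect.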
4. Results
Network figure with four layers (edges not recoverable from the extraction):
   Drug:        Doxazosin, Isoflurane, Ziprasidone, Olanzapine, Risperidone, Clozapine
   Target:      PTGS2, PTGS1, GABRA1, GRIA1, GLRA1, HRH1, HTR1A, HTR2A, ADRA1A, ADRA1B, ADRB1, DRD2, DRD1
   Pathway:     Arachidonic acid metabolism, VEGF signaling pathway, Neuroactive ligand-receptor interaction, Small cell lung cancer, Pathways in cancer, Calcium signaling pathway, Gap Junction
   Side Effect: Hepatic Necrosis, Hepatitis, Hepatomegaly

         Hepatomegaly & Gap Junction?
5. Validation
    PREFIX medline: <http://chem2bio2rdf.org/medline/resource/>
    PREFIX kegg: <http://chem2bio2rdf.org/kegg/resource/>
    PREFIX sider: <http://chem2bio2rdf.org/sider/resource/>

    SELECT *
    FROM <http://chem2bio2rdf.org/medline>
    FROM <http://chem2bio2rdf.org/kegg>
    FROM <http://chem2bio2rdf.org/sider>
    WHERE {
      ?kegg_id kegg:Pathway_name ?pathway_name .
      FILTER regex(?pathway_name, "gap junction", "i") .
      ?pmid medline:pathway ?kegg_id .
      ?pmid medline:side_effect ?sider .
      ?sider sider:side_effect ?side_effect .
      FILTER regex(?side_effect, "Hepatomegaly", "i") .
    }

   Literature-based validation
     Retrieve literature discussing hepatomegaly & Gap Junction
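A query like this is typically sent to an endpoint as the `query` parameter of an HTTP GET, as defined by the SPARQL protocol. The sketch below only builds such a URL and does not contact any server. The endpoint address is a hypothetical placeholder (the slides do not give one), and the query is shortened for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint; the slides do not state the real endpoint address.
ENDPOINT = "http://example.org/sparql"

# Shortened illustrative query using the kegg prefix from the slide.
QUERY = """PREFIX kegg: <http://chem2bio2rdf.org/kegg/resource/>
SELECT * WHERE {
  ?kegg_id kegg:Pathway_name ?pathway_name .
  FILTER regex(?pathway_name, "gap junction", "i") .
}"""


def build_query_url(endpoint, query):
    """Encode a SPARQL query as the `query` parameter of a GET URL."""
    return f"{endpoint}?{urlencode({'query': query})}"


URL = build_query_url(ENDPOINT, QUERY)
# Decoding the URL again should give back the exact query text.
ROUND_TRIP = parse_qs(urlsplit(URL).query)["query"][0]
```

URL-encoding matters here because the query contains spaces, quotes, braces, and angle brackets that are not valid in a raw URL.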
Summary
   • The Chem2Bio2RDF portal attempts to collect and link all public data related to Systems Chemical Biology
   • Chem2Bio2RDF offers various tools to browse, search and explore the data sources
   • Case studies demonstrate that it could serve as a useful portal in drug discovery
THANKS!
