1. CHEM2BIO2RDF:
A LINKED OPEN DATA PORTAL FOR
SYSTEMS CHEMICAL BIOLOGY
Bin Chen, Ying Ding, Huijun Wang, David Wild, Xiao Dong,
Yuyin Sun, Qian Zhu, Madhuvanthi Sankaranarayanan
Indiana University at Bloomington
2. What’s Systems Chemical Biology
[Figure: four linked layers]
Chemical: Compound, Drug
Biology: Protein, Gene
Systems: PPI, Metabolic Pathway, Gene Regulatory
Phenotype: Disease, Side effect, Toxicity
Labels: Chemogenomics; interacting; mapping
3. All the public data are scattered around the web…
MATADOR
4. LODD (drug/chemical data)
Bio2RDF (biological data)
Chem2Bio2RDF (chemogenomics: how chemicals interact with biological data)
5. Linked Open Data (LOD)
Bio2RDF
LODD
Linked Life Data
Chem2Bio2RDF
6. Workflow for RDF conversion
[Figure: conversion pipeline]
External Sources: XML, CSV, TXT, Relational DB, …
Steps: Download, Scripts, Local DB copy, Ontology Mapping (D2R), D2R server, Dumping, Virtuoso Triple Store, Publishing
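The workflow above maps tabular source data to RDF through D2R. As a rough illustration of what such a mapping produces, the minimal Python sketch below turns CSV rows into N-Triples lines. It is not D2R and not the authors' scripts. The base URIs, the function names `escape_literal` and `rows_to_ntriples`, and the sample rows are all assumptions made for illustration. The `table_column` predicate naming only mimics the pattern visible in predicates such as `sider_side_effect`.

```python
import csv
import io
from typing import List

# Hypothetical base URIs, chosen only for illustration.
RESOURCE = "http://example.org/chem2bio/resource/"
VOCAB = "http://example.org/chem2bio/vocab/"


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def rows_to_ntriples(csv_text: str, table: str, key: str) -> List[str]:
    """Map each CSV row to one subject and each non-key column to one triple.

    Assumes key values are simple identifiers that are safe inside a URI.
    """
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{RESOURCE}{table}/{row[key]}>"
        for column, value in row.items():
            if column == key or not value:
                continue  # skip the key column and empty cells
            predicate = f"<{VOCAB}{table}_{column}>"
            triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples


# Toy input: identifiers and values are invented.
SAMPLE = "cid,side_effect\nC1,effect_a\nC2,\n"
for line in rows_to_ntriples(SAMPLE, "sider", "cid"):
    print(line)
```

A file of such lines could then be loaded into a triple store. The real pipeline would also need datatype handling and links between datasets, which this sketch leaves out.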
7. We are focusing on how chemicals
interact with biological data
12 databases
204,981 compounds
17,930 genes
646,608 associations
Caveat: Not all binding data!
MATADOR
8. Literature-based Systems Chemical Biology
Covering 1865-2009
18,502,916 PubMed/Medline
literature records!
10. Chem2Bio2RDF Datasets
Over 110 million triples!
Node colors (RDF vendor): Chem2Bio2RDF data; other data vendors
Edge colors (linkage type): compound, protein/gene, chemogenomics, literature, others
Each node represents a database, colored by its RDF vendor. Each directed edge shows
the linkage from one dataset to another, colored by the linkage type.
E.g., the type compound includes CID, CAS, ChEBI, DBID and so on. The size of
nodes and the width of edges depend on the number of triples and the number of
linkages, respectively.
11. Dereferenceable URI
PlotViz: Visualization
Bio2RDF Browsing
Cytoscape Plugin
Virtuoso
Triple store
Chem2Bio2RDF
Linked Path Generation and Ranking
LODD
uniprot
Others
SPARQL ENDPOINTS Third party tools
15. Answer scientific questions
Give me all information about this compound
Give me all information about this target
Find chemical associated genes
Find gene associated chemicals
Find disease associated chemicals
Find side effect associated chemicals
Find all the drug-like compounds in PubChem BioAssay that
share at least two targets with a drug in DrugBank
Link KEGG / Reactome Pathways and PubChem to identify
potential multiple pathway inhibitors for MAPK
More in http://chem2bio2rdf.wikispaces.com/multiple+sources
17. 1. Scientific Question
Drugs that cause similar adverse side effects often
have totally different chemical structures
Cholestasis, Bile salt transporters in liver
18. 2. hypothesis
drug targets might function in the same pathway
19. 3. Methods
Find KEGG pathways containing at least two of the targets
associated with a given side effect (e.g., hepatomegaly)
Path finding and visualization
SPARQL
PREFIX chem2bio: <http://localhost:2020/vocab/resource/>
SELECT ?pathway_id (COUNT(?pathway_id) AS ?count)
WHERE {
  # drugs reported with the side effect
  ?compound chem2bio:sider_side_effect ?side_effect .
  ?compound chem2bio:sider_cid ?dbid .
  # targets of those drugs and their UniProt identifiers
  ?targetid chem2bio:DrugBankTarget_dbid ?dbid .
  ?targetid chem2bio:DrugBankTarget_swissport_id ?UniProt_id .
  # pathways containing those targets
  ?pathwayid chem2bio:KEGG_pathway_gene_keggid ?UniProt_id .
  ?pathwayid chem2bio:KEGG_pathway_pathway_id ?pathway_id .
  FILTER regex(?side_effect, "hepatomegaly", "i") .
}
GROUP BY ?pathway_id
ORDER BY DESC(?count)
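The query above ranks pathways by how many matching rows they collect. The "at least two targets" condition from the method description can be illustrated on in-memory data. The following Python sketch is a simplified stand-in and not the portal's implementation. It counts distinct targets per pathway rather than result rows. The three toy tables, their identifiers, and the function name `pathways_for_side_effect` are assumptions made for illustration.

```python
from collections import defaultdict

# Toy records; every identifier here is invented for illustration.
DRUG_SIDE_EFFECTS = [("D1", "Hepatomegaly"), ("D2", "hepatomegaly"), ("D3", "nausea")]
DRUG_TARGETS = [("D1", "T1"), ("D2", "T2"), ("D2", "T3"), ("D3", "T4")]
PATHWAY_GENES = [("P1", "T1"), ("P1", "T2"), ("P2", "T3"), ("P2", "T4")]


def pathways_for_side_effect(side_effect, min_targets=2):
    """Return (pathway, distinct_target_count) pairs, highest count first."""
    needle = side_effect.lower()
    # Drugs whose side-effect label contains the term (case-insensitive).
    drugs = {d for d, effect in DRUG_SIDE_EFFECTS if needle in effect.lower()}
    # Targets of those drugs.
    targets = {t for d, t in DRUG_TARGETS if d in drugs}
    # Group the matching targets by the pathway that contains them.
    hits = defaultdict(set)
    for pathway, gene in PATHWAY_GENES:
        if gene in targets:
            hits[pathway].add(gene)
    ranked = [(p, len(g)) for p, g in hits.items() if len(g) >= min_targets]
    # Sort by descending count, then by pathway id for a stable order.
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


print(pathways_for_side_effect("hepatomegaly"))
```

With the toy data, only pathway `P1` contains two of the matched targets, so it is the only one returned at the default threshold.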
20. 4. results
[Figure: network linking drugs, targets, pathways and side effects]
Drug: Doxazosin, Isoflurane, Ziprasidone, Risperidone, Clozapine, Olanzapine
Target: PTGS2, PTGS1, GRIA1, GABRA1, GLRA1, HRH1, HTR1A, HTR2A, ADRA1A, ADRA1B, ADRB1, DRD2, DRD1
Pathway: Arachidonic acid metabolism, VEGF signaling pathway, Neuroactive ligand-receptor interaction, Small cell lung cancer, Pathways in cancer, Calcium signaling pathway, Gap Junction
Side Effect: Hepatic Necrosis, Hepatitis, Hepatomegaly
hepatomegaly & Gap Junction?
21. 5. validation
PREFIX medline: <http://chem2bio2rdf.org/medline/resource/>
PREFIX kegg: <http://chem2bio2rdf.org/kegg/resource/>
PREFIX sider: <http://chem2bio2rdf.org/sider/resource/>
SELECT *
FROM <http://chem2bio2rdf.org/medline>
FROM <http://chem2bio2rdf.org/kegg>
FROM <http://chem2bio2rdf.org/sider>
WHERE {
  ?kegg_id kegg:Pathway_name ?pathway_name .
  FILTER regex(?pathway_name, "gap junction", "i") .
  ?pmid medline:pathway ?kegg_id .
  ?pmid medline:side_effect ?sider .
  ?sider sider:side_effect ?side_effect .
  FILTER regex(?side_effect, "Hepatomegaly", "i") .
}
Literature-based validation
Retrieve literature that discusses hepatomegaly & Gap Junction
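The validation step looks for literature records annotated with both a matching pathway and a matching side effect. A minimal Python sketch of that co-occurrence filter follows. The records, their field names, and the function names `matches` and `cooccurring_records` are all assumptions made for illustration, and plain case-insensitive substring matching stands in for the query's `regex(..., "i")` filters.

```python
# Toy annotation records; PMIDs and labels are invented for illustration.
RECORDS = [
    {"pmid": "pmid-1", "pathways": ["Gap junction"], "side_effects": ["Hepatomegaly"]},
    {"pmid": "pmid-2", "pathways": ["Gap junction"], "side_effects": ["Nausea"]},
    {"pmid": "pmid-3", "pathways": ["Calcium signaling pathway"], "side_effects": ["hepatomegaly"]},
]


def matches(term, labels):
    """True if any label contains the term, ignoring case."""
    needle = term.lower()
    return any(needle in label.lower() for label in labels)


def cooccurring_records(records, pathway_term, side_effect_term):
    """Return PMIDs annotated with both the pathway and the side effect."""
    return [
        record["pmid"]
        for record in records
        if matches(pathway_term, record["pathways"])
        and matches(side_effect_term, record["side_effects"])
    ]


print(cooccurring_records(RECORDS, "gap junction", "hepatomegaly"))
```

Only the record carrying both annotations is kept. Records that mention just one of the two terms are dropped.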
22. Summary
The Chem2Bio2RDF portal attempts to collect and link
all public data related to Systems Chemical Biology
Chem2Bio2RDF offers various tools to browse, search
and explore the data sources
Case studies demonstrate that it could serve as a
useful portal in drug discovery