Update on the Druggable Proteome
1. www.guidetopharmacology.org
Intersecting different databases to define the inner and
outer limits of the data-supported druggable proteome
Christopher Southan, Adam J. Pawson, Joanna L. Sharman, Elena
Faccenda, Simon Harding, Jamie Davis, IUPHAR/BPS Guide to
PHARMACOLOGY, Centre for Integrative Physiology, University of
Edinburgh
ACS Tue, Mar 15, CINF 98: Linking Big Data with Chemistry:
Databases Connecting Genomics, Biological Pathways & Targets to
Chemistry 9:30 AM - 11:50 AM Room 24C 11:10am - 11:30am
http://www.slideshare.net/cdsouthan/update-on-the-druggable-proteome
2. Abstract (will be skipped for presentation)
Hopkins and Groom coined the term “druggable genome” in 2002 for the extrapolated total of ~
10% of the human proteome likely to bind small molecules with lead-like chemical properties
and sufficient binding affinity for activity modulation. Fast-forward to 2015 and the UniProtKB
website now includes four database cross-references in the new Chemistry section. These
provide a more detailed picture, based largely on chemistry-to-protein mapping data curated
from the literature. They are thus evidence-supported statistics rather than homology-based
transitive estimates. These included (Sept 2015) human protein links to 2927 target entries
from ChEMBL, 2191 from BindingDB, 1563 from DrugBank and 1340 from the IUPHAR/BPS
Guide to PHARMACOLOGY (GtoPdb). Statistical comparisons between these will be presented
here, defining different levels of evidence support and following their continued expansion. The
union of all four sets, 3603, encompasses ~ 18% of the proteome. However, the proportion that
would match the most stringently curated of these for chemistry-to-protein mapping, GtoPdb, is
lower, and comparisons indicate that curation strategies and source selections for each database
diverge considerably (PMID 24533037). This is manifest in the relatively high unique content of
1147 (31% of the union) for the sources. However, they converge as a 4-way intersect for 490
proteins (13% of the union). Concordance between at least two independent sources (i.e. the
non-unique proportion) expands to 2456 or 12% of the proteome. This represents the most
precise data-supported druggable proteome snapshot for each UniProtKB release. Orthogonal
comparative analyses of these intersecting sets will be presented, including by Gene Ontology
functional categories, target class content, secreted vs. non-secreted, and disease gene links.
The utility of this druggable proteome assessment is very high in pharmacology and drug
discovery, especially in terms of being able to data mine leads as chemical starting points for
target validation experiments.
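The union, source-unique, "at least two sources" and 4-way figures quoted in the abstract are all counts over the same set comparison. A minimal sketch of that bookkeeping follows, assuming each source has been reduced to a set of UniProtKB accessions. The function name `coverage_summary` and the toy accessions are hypothetical and only illustrate the arithmetic; they are not the authors' pipeline or real data.

```python
from collections import Counter


def coverage_summary(sources):
    """Summarise how many sources support each protein accession.

    `sources` maps a source name to a set of protein accessions.
    """
    support = Counter()
    for accessions in sources.values():
        # set() guards against duplicate accessions within one source
        support.update(set(accessions))

    by_level = Counter(support.values())  # level -> number of proteins
    union = len(support)
    return {
        "union": union,                          # outer limit
        "unique": by_level[1],                   # found in one source only
        "at_least_two": union - by_level[1],     # non-unique proportion
        "all_sources": by_level[len(sources)],   # inner limit (n-way intersect)
    }


# Toy accessions, invented for illustration only
toy = {
    "ChEMBL": {"P1", "P2", "P3", "P4"},
    "BindingDB": {"P1", "P2", "P5"},
    "DrugBank": {"P1", "P3"},
    "GtoPdb": {"P1", "P6"},
}
summary = coverage_summary(toy)
print(summary)
```

With the abstract's figures, the same relation gives 3603 - 1147 = 2456 proteins supported by at least two sources.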
3. Outline
• Origins of the druggable genome
• Sources for the druggable proteome
• Comparing coverage
• Inner and outer limits
• Distribution of target attributes
• Selection example
• Future expansion
5. Druggable proteome: 2016 update
Working definitions for IUPHAR/BPS Guide to PHARMACOLOGY curation
• Protein “has ligand”: data-supported pharmacologically relevant interaction
(quantitative if possible)
• Drugged target: molecular mechanism of action (MMOA) involves binding of
drug to primary target
• Drugged proteome (targets of approved drugs):
• 120 in 2002 (PMID 12209152)
• 213 in 2006 (PMID 7139284)
• 312 in 2015 (PMID 26464438)
• Tractable target: assay > documented in vitro activity modulation of target by
small molecule or other therapeutic modality
• Druggable target: data-supported plausibility of in vivo modulation
• Validated target: in vivo modulation via MMOA > clinical efficacy for disease
9. Druggable inner and outer limits
(Swiss-Prot human proteome at 20,198)
Source-unique 1,099
4-way 539
3-way 1,053
2-way 861
All sources (union) 3,568 = 18% of proteome
4-way = 2.7% of proteome
4-way = 15% of the union
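The percentages on this slide can be re-derived from the counts shown. The short check below uses only the numbers on the slide; the variable names and the `pct` helper are illustrative.

```python
PROTEOME = 20198   # Swiss-Prot human proteome size from the slide
UNION = 3568       # all sources (outer limit)
FOUR_WAY = 539     # present in all four sources (inner limit)


def pct(part, whole):
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100 * part / whole, 1)


outer_limit = pct(UNION, PROTEOME)     # about 18% of the proteome
inner_limit = pct(FOUR_WAY, PROTEOME)  # about 2.7% of the proteome
core_share = pct(FOUR_WAY, UNION)      # about 15% of the union

# Sum of the four category counts as listed on the slide
listed_total = 1099 + 539 + 1053 + 861

print(outer_limit, inner_limit, core_share, listed_total)
```

As transcribed, the four category counts sum to 3,552, slightly below the stated union of 3,568. The sketch reports that difference and does not try to resolve it.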
12. Advanced selection example
From the 4-way set
database:(type:merops)
annotation:(type:signal)
database:(type:pdb)
annotation:(type:"alternative products")
database:(type:hpa)
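The filters above are written in the query syntax of the UniProt website at the time of the talk. The slide does not say how they were combined, so the sketch below assumes a simple conjunction with `AND`. The helper name `build_query` is hypothetical, no request is sent, and the field syntax may differ on later versions of the site.

```python
def build_query(clauses, operator="AND"):
    """Join individual filter clauses into a single query string."""
    return f" {operator} ".join(clauses)


# Filter clauses copied from the slide
clauses = [
    "database:(type:merops)",                    # has a MEROPS cross-reference
    "annotation:(type:signal)",                  # has a signal annotation
    "database:(type:pdb)",                       # has a PDB cross-reference
    'annotation:(type:"alternative products")',  # has alternative products
    "database:(type:hpa)",                       # has an HPA cross-reference
]

query = build_query(clauses)
print(query)
```

Keeping the clauses in a list makes it easy to drop or swap a single filter when exploring other subsets of the 4-way set.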
13. Initiatives for expansion
NIH Illuminating the Druggable Genome (IDG)
Program objective is to improve our understanding of
the properties and functions of proteins that are
currently unannotated within the four most commonly
drug-targeted protein families: the G-protein coupled
receptors, nuclear receptors, ion channels, and
protein kinases.
15. Conclusions
• The data-supported druggable proteome is expanding
• UniProt chemistry cross-referencing collates curated sources with
complementary selectivity
• Sources indicate an outer limit of 18% with an inner limit of 3%
• Advanced “slice-and-dice” options can identify subsets
• Expanding choice of experimental perturbagens for systems pharmacology,
drug discovery, chemical biology and synthetic biology
• Challenge of the constitutive “loss of function” for disease causality
• It is hoped the druggable expansion will translate into
• novel validated targets
• broader potential therapeutic coverage (including rare diseases)
• new approved medicines
• new combinations and hybrids
• more repurposing via target-hopping
16. Acknowledgements, references and questions
Benoit Bely, UniProt Release Production Project Leader, EBI (for x-refs)
Database teams of BindingDB, ChEMBL and DrugBank
GtoPdb Team Members and funders from title slide