Trials and tribulations of curating peptide and
antibody ligands for the IUPHAR/BPS Guide to
Pharmacology
Christopher Southan, Joanna L. Sharman, Adam J. Pawson, Simon D.
Harding, Elena Faccenda and Jamie A. Davies, IUPHAR/BPS Guide to Pharmacology,
Discovery Brain Sciences, University of Edinburgh, UK.
ACS Boston 2018, Biologics & Registration Session, Mon Aug 20,
15:50 - 16:15, Harbor Ballroom II
1
https://www.slideshare.net/cdsouthan
Abstract (will not be shown)
As an expert-curated database of approved, clinical or research pharmacological targets mapped to
defined ligands, the IUPHAR/BPS Guide to PHARMACOLOGY (GtoPdb) and its precursor IUPHAR-DB
have been extracting and annotating bioactive peptides from papers for well over a decade. The current
total has reached 2089 peptides, split between exogenous and endogenous, within the 9144 ligand
entries submitted to PubChem in our 2018.2 database release. More recently we have curated 235
antibodies and a small number of therapeutic nucleotides as approved drugs or clinical candidates.
Indexing these entity types in GtoPdb presents challenges similar to those encountered in the
registration of biologicals as explicitly defined structures. In addition, we target-map the
citation-supported quantitative binding parameters where possible. This presentation will outline these
curatorial challenges and our efforts to at least partially ameliorate the problems. For peptides below
the PubChem CID SMILES limit of approximately 70 residues we have been using Sugar and Splice from
NextMove Software to convert more of our peptide SIDs to CIDs, joining the 6969 CIDs we already have.
However, we are often confounded by authors' equivocal structural specifications with respect to
post-translational modifications and the exact positions of radiolabel incorporation. As an interim
compromise we capture at least a primary sequence string that users can hit by BLAST. Among reported
receptor-binding endogenous peptides we find some that do not match the Swiss-Prot features for the
precursor protein. PubChem has been encouraging and supporting us in converting more
activity-mapped peptides to CIDs and InChIKeys, which should enhance inter-source connectivity. Otherwise,
biological SID data can only be joined by equivocal name matching. Antibodies and other large
biological SIDs may also currently remain structurally orphaned and present their own challenges.
Notwithstanding, GtoPdb has successfully curated at least primary sequences for the molecular
specification of clinical Mabs. For this we use IMGT/mAb-DB as a first-stop shop for approved
monoclonals, since it extracts sequences from INN documents. For these and for clinical candidates with
code names we also use the patent sequence databases to source a UniParc accession number, and can
sometimes get binding data that has not appeared in papers.
2
Outline
• Introducing GtoPdb
• GtoPdb peptide content and stats
• Peptide tribulations
• PubChem peptidic pros and cons
• Getting more peptides > SMILES
• GtoPdb antibody content
• Antibody tribulations
• Stats and examples
• Exploiting PubChem SID tagging
• Where we go from here
• Further information
3
Introducing the IUPHAR/BPS Guide to
PHARMACOLOGY (GtoPdb)
• IUPHAR = International Union of Basic and Clinical Pharmacology, BPS = British
Pharmacological Society
• Formerly known as IUPHAR-DB for receptors and channels since 2003
• Since 2012 funded by the Wellcome Trust to cover all targets in the human genome
• Since 2015 Wellcome Trust “fork” as Guide to IMMUNOPHARMACOLOGY
• Molecular mechanism of action (mmoa) mapping primary & secondary targets
• Release cycle time (with PubChem refreshes) ~ 2 months
• Six well-cited NAR Annual Database issues, latest as PMID 29149325 (2018)
• Distilled into the 2-yearly British Journal of Pharmacology “Concise Guide to
PHARMACOLOGY” as a nine-paper series (see PMID 29055037) with outlinks
• Presents users with selected quality compounds for pharmacology research in
silico, in vitro, in cellulo, in vivo, in clinico
• An ELIXIR UK Node resource since 2016 http://www.guidetopharmacology.org
4
5
Expert-curated, citation provenanced,
quantitative binding data
Document > assay > result > compound > location > protein target
D-A-R-C-L-P
Where “C” is not a small molecule, we have ~ 2000 peptides and ~ 250
antibodies included in the ~ 9000 substances we submit to PubChem
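
The D-A-R-C-L-P chain can be pictured as one record per curated binding result. The sketch below is illustrative only: the class name, field names and example values are hypothetical placeholders, not GtoPdb's actual schema or data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DarclpRecord:
    """One curated binding result; all field names are hypothetical."""
    document: str            # D: citation, e.g. a PMID
    assay: str               # A: assay description
    result: str              # R: quantitative binding parameter
    compound: str            # C: ligand (small molecule, peptide or antibody)
    location: Optional[str]  # L: location, as named on the slide (may be absent)
    protein: str             # P: protein target identifier

    def chain(self) -> str:
        """Render the record in the slide's D > A > R > C > L > P order."""
        parts = [self.document, self.assay, self.result,
                 self.compound, self.location or "unspecified", self.protein]
        return " > ".join(parts)


# Placeholder values, not real curated data.
example = DarclpRecord(
    document="PMID:0000000",
    assay="radioligand binding",
    result="pKi 9.0",
    compound="example peptide",
    location=None,
    protein="example receptor",
)
print(example.chain())
```

Keeping "C" as a free-form ligand field reflects the point above: the same chain has to carry peptides and antibodies as well as small molecules.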
Peptides
6
Endogenous peptides (786)
7
http://www.guidetopharmacology.org/GRAC/LigandListForward?type=Endogenous-peptide&database=all
Non-endogenous peptides (1310)
8
http://www.guidetopharmacology.org/GRAC/LigandListForward?type=Peptide&database=all
Peptide stats
• Peptide ligands / all ligands = 22%
• Ligands with quantitative binding data / all ligands = 75%
• Peptides with quantitative binding data / all peptides = 63%
• CID peptides with quantitative binding data / all CID peptides = 89%
9
Tribulations with peptides
• Author specifications may be insufficient for complete molecular definition
• Consequent structural equivocalities slip through the editor/referee net
• Correct IUPAC peptide nomenclature is rare (ad hoc more common)
• Exact location of radiolabels often not specified
• Absence of purity verification and/or in vivo stability data
• Need to surface user-intuitive renderings (but HELM rules OK)
• Poor resolution of peptide name-to-structure (n2s)
• SMILES only copes with ~ 70 residues
• Searching patents for corroborative peptide prior art is much more difficult than
for small molecules
• Literature extraction or author database submissions for bioactive peptides are
proportionally lower than for small molecules
• Species "zoo" for venom peptides and their names
• Conjugates (peptides + linkers + proteins etc.) even more difficult
• The PIR RESID Database of Protein Modifications is no longer maintained
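
The ~ 70 residue SMILES limit suggests a simple triage step before any structure conversion. The following is a minimal sketch under stated assumptions: the function name, the three labels and the toy sequences are hypothetical, and the limit is the approximate figure from the slide rather than a documented PubChem constant.

```python
SMILES_RESIDUE_LIMIT = 70  # approximate figure quoted on the slide (assumption)

# The 20 standard amino acids in one-letter code.
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def triage_peptide(sequence: str, limit: int = SMILES_RESIDUE_LIMIT) -> str:
    """Route a primary sequence to one of three hypothetical curation paths.

    'smiles-candidate': standard residues only and within the length limit.
    'sequence-only':    too long for SMILES; keep the sequence string for BLAST.
    'manual-review':    contains non-standard symbols (e.g. modified residues).
    """
    seq = "".join(sequence.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    if not set(seq) <= STANDARD_RESIDUES:
        return "manual-review"
    if len(seq) <= limit:
        return "smiles-candidate"
    return "sequence-only"


# Toy sequences, not real ligands.
print(triage_peptide("ACDEFGHIK"))
print(triage_peptide("A" * 71))
print(triage_peptide("ACXK"))
```

Anything flagged for manual review is where the author-specification problems listed above (PTMs, radiolabel positions) would need a curator's judgment.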
10
The classic peptidic triple-whammy
11
Endothelin-1, CID 91928636, 1470 "Similar Compounds" and top-100 BLAST hits
• Too big to search or cluster by SMILES
• Too small to BLAST cleanly (and sans PTMs)
• Too many species splits for precursors
Endothelin-1 in GtoPdb
12
• But this now needs a SMILES backfill
Swiss-Prot precursor annotation:
useful but text-only PTMs
13
PubChem bad news:
will the real Endothelin-1 please stand up?
14
• "endothelin 1"[CompleteSynonym] > 6 CIDs > 36 SIDs (10 SID-only)
• "MW 2491.9140 NOT endothelin 1" > 16 CIDs > 23 SIDs (some unnamed)
• BioAssay splitting (including for SID-only) is problematic
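
The first query above uses Entrez field syntax, so it can be reproduced programmatically. As a hedged sketch, the snippet below only builds an NCBI E-utilities `esearch` URL against the `pccompound` database; the helper name and the `retmax` default are assumptions, no request is sent, and the counts returned today will likely differ from those on the slide.

```python
from urllib.parse import urlencode

EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pccompound_search_url(term: str, retmax: int = 100) -> str:
    """Build an esearch URL for a PubChem Compound (pccompound) Entrez query."""
    params = {
        "db": "pccompound",   # Entrez database behind PubChem Compound
        "term": term,         # query string, including any [Field] qualifiers
        "retmax": retmax,     # maximum number of CIDs to return
        "retmode": "json",
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


url = pccompound_search_url('"endothelin 1"[CompleteSynonym]')
print(url)
```

Fetching the URL (for example with `urllib.request.urlopen`) should return the matching CIDs; the SID-only entries mentioned on the slide would need a separate search of the substance database.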
PubChem good news:
GtoPdb > SID SMILES > CID > biologicals annotation
15
PubChem: more good news
16
Our current push:
Peptides > S&S > SMILES > SIDs > CIDs
17
http://www.guidetopharmacology.org/GRAC/LigandDisplayForward?ligandId=3854
Antibodies
18
Tribulations with antibody curation
• Getting at least a primary Mab sequence as a molecular definition
• Not all clinical Mab sequences > patents > INN > IMGT-DB
• May get a persistent UniParc ID for the sequence (on a good day)
• Papers often omit in vitro binding data
• Challenging to track press releases back to primary data
• Papers usually don't cite the patents
• But we sometimes get binding data from patents
• The biosimilars are piling in
• No open specification of glycan chains linked to primary sequences
• Some journals publish Mab characterisation with blinded code names
• Considering research reagents with vendor IDs if well provenanced
19
GtoPdb antibodies (245)
20
http://www.guidetopharmacology.org/GRAC/LigandListForward?type=Antibody&database=all
Example: adalimumab
21
Exploiting PubChem SID-tagging for user selections
22
GtoP plans
• Continue back-fill of peptides > CIDs using S&S
• Resolve our sequences against Swiss-Prot x-refs, ChEMBL and GPCRdb
• Continue adding antibody biosimilar cross-pointers
• Consider adding "peptide" as a new SID tag
• For IUPHAR Guide to Immunopharmacology
– Sub-committee feedback on peptides, antibodies, targets and indications
– Continue curation of peptides relevant to immunity and inflammation
• Anticipate curation of new "binder" therapeutics including minibodies,
polyvalents and hybrids
• Keep watching brief on large-molecule InChIKeys
• Belt-and-braces of linking SMILES with compromise (i.e. sans modifications)
FASTA approximations for BLAST indexing and clustering of peptide ligands
• Introduce local HELM rendering
• Revise legacy data model (e.g. introduce a protein ligand classification)
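
The "compromise FASTA" idea in the plans above can be sketched as follows: write modification-free primary sequences as FASTA for BLAST indexing, and group ligands that share an identical sequence as a crude first clustering pass. The function names, ligand identifiers and sequences are hypothetical, and exact-match grouping is a stand-in here, not the clustering method the authors use.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, str]  # (ligand identifier, primary sequence sans modifications)


def normalise(sequence: str) -> str:
    """Strip whitespace and upper-case a one-letter sequence string."""
    return "".join(sequence.split()).upper()


def to_fasta(records: Iterable[Record], width: int = 60) -> str:
    """Serialise records as FASTA text, wrapping sequence lines at `width`."""
    lines: List[str] = []
    for ligand_id, sequence in records:
        seq = normalise(sequence)
        lines.append(f">{ligand_id}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "".join(line + "\n" for line in lines)


def group_identical(records: Iterable[Record]) -> Dict[str, List[str]]:
    """Map each distinct sequence to the ligand identifiers that share it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for ligand_id, sequence in records:
        groups[normalise(sequence)].append(ligand_id)
    return dict(groups)


# Toy records, not real GtoPdb ligands.
records = [("lig1", "acdef"), ("lig2", "ACDEF"), ("lig3", "GHIK")]
print(to_fasta(records), end="")
print(group_identical(records))
```

Ligands that collapse into one group differ only in what the FASTA string cannot express (PTMs, labels, cyclisation), which is exactly where a linked SMILES or HELM record would be needed to tell them apart.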
23
Acknowledgments, info, COI
24
https://sites.google.com/view/tw2informatics/home
Conflict of interest (minor): has consulted in the peptide area
Thanks to the NextMove team for S&S support
Lin Yikai, for her M.Sc. project: "Developing bio/cheminformatics methods
for converting bioactive peptide structures into machine-readable formats"
Anna Gaulton for ChEMBL FASTA sequences
Paul Thiessen for PubChem FASTA sequences
