ARMADILLO PROTEIN EVOLUTION &
BINDING PREDICTION
Spencer Bliven
Anisimova Journal Club
2018-05-24
ARMADILLO REPEAT PROTEINS
WHY ARMADILLOS? BIOLOGICAL INTEREST
 Roles in localization, protein transport, & more
 β-catenin: cell adhesion, development
 α-importin: nuclear localization
 APC: tumor suppressor gene, linked to colorectal cancer
 Probably homologous to HEAT repeat family
 Slightly different structure
 huntingtin: disease, nerve signaling, protein transport
 β-importin: nuclear localization
 phosphatases/kinases
 Ancient family (primarily eukaryotic, but predates metazoans)
WHY ARMADILLOS? PROTEIN-PROTEIN BINDING
 Bind peptides, so could be used like antibodies or
DARPins (therapeutics, biotech, assays, etc.)
 Bind extended chains
 Target disordered regions and termini
 Linear epitope, so much easier to design
 Modular binding
[Figure: dArmRP structure, PDB 5AEI]
ARMADILLO EVOLUTION
Armadillidium vulgare
TANDEM REPEAT EVOLUTION
 Duplications & fusions within a gene lead to tandem repeats
 Speciation and gene duplication lead to orthologs and paralogs
 Pattern of repeats tells us the sequence of evolutionary events
HEAT & ARM
Andrade MA, Petosa C, O'Donoghue SI, Müller CW, Bork P. Comparison of ARM and HEAT protein repeats. J Mol Biol. 2001 May 25;309(1):1–18.
ARM FAMILY
Gul, I. S., Hulpiau, P., Saeys, Y., & van Roy, F. (2017) Cellular and Molecular Life Sciences, 74(3), 525–541
[Phylogeny figure labels: δ-catenins & ARM formins; β-catenin (not with δ-catenins); β-importin; HEAT (outgroup); catenin beta-like; α-importin]
LIMITATIONS OF PRIOR STUDIES
 Don’t model repeat evolution
 Either use full-length sequences (no support for copy variation) or single
repeats (inconsistent boundaries, repeats segregate differently between
species)
 No reconciliation between gene tree and repeat tree
 Older papers use limited species and sequences
 Inconsistent inclusion of HEAT repeats
MY APPROACH
 Detect repeats with TRAL (cpHMM)
 Alignment & tree inference with ProGraphML+TR
 Joint gene tree and repeat tree inference (future work)
TRAL
 Tandem Repeat Annotation Library
 Circularly permuted Hidden Markov Model (cpHMM) for tandem
repeat alignment
 Integrates repeat detection software
 Important for expanding analysis beyond ArmRP family
Schaper et al. (2015). TRAL: tandem repeat annotation library. Bioinformatics, 31(18), 3051–3053.
Schaper E, Gascuel O, Anisimova M. Deep conservation of human protein tandem repeats within the eukaryotes. Mol Biol Evol. 2014
May;31(5):1132–48.
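Why a circularly permuted model helps can be illustrated with a toy: a tandem repeat region may begin at any position within the repeat unit, so a matcher has to consider every rotation of the unit. The sketch below is a gapless string comparison, not TRAL's cpHMM and not its API; the function names `rotations` and `best_phase` are hypothetical, and a real cpHMM handles this probabilistically, with insertions and deletions.

```python
def rotations(unit):
    """Return every circular permutation of a repeat unit."""
    return [unit[i:] + unit[:i] for i in range(len(unit))]


def best_phase(sequence, unit):
    """Find the rotation of `unit` whose tandem copies best match `sequence`.

    Returns (phase, matches), where `phase` is the offset into `unit` at
    which the repeat region starts. Gapless toy: no insertions or deletions.
    """
    best_phase_index, best_matches = 0, -1
    for phase, rotated in enumerate(rotations(unit)):
        copies = len(sequence) // len(rotated) + 1
        tiled = (rotated * copies)[: len(sequence)]
        matches = sum(a == b for a, b in zip(sequence, tiled))
        if matches > best_matches:
            best_phase_index, best_matches = phase, matches
    return best_phase_index, best_matches


# A repeat region that starts in the middle of the unit "ABCDE".
phase, matches = best_phase("CDEABCDEABCDEAB", "ABCDE")
print(phase, matches)  # 2 15
```

Without the rotation step, the same region would score zero matches against the unrotated unit, which is the boundary inconsistency a circular model is meant to avoid.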
DETECTED REPEATS BY SPECIES (GUL HMM)
Species                     ArmRP proteins
Macrostomum lignano         170
Echinostoma caproni         163
Lingula anatina             125
human                       107
zebrafish                   107
scaled quail                100
tropical clawed frog        95
owl limpet                  93
starlet sea anemone         93
Florida lancelet            90
Japanese sea cucumber       84
Schistocephalus solidus     84
Octopus bimaculoides        82
Biomphalaria glabrata       82
purple sea urchin           81
platypus                    75
green sea turtle            75
Stylophora pistillata       75
Wild Bactrian camel         72
Amphimedon queenslandica    68
[Figure: distribution of ArmRP protein counts across 94 species; axes: number of proteins, number of species]
PROGRAPHML+TR
Szalkowski AM, Anisimova M. Graph-based modeling of tandem repeats improves global multiple sequence alignment. Nucleic Acids Res. 2013 Sep;41(17):e162.
OUTLOOK: EVOLUTION
 Improve Arm profiles based on structural searches
 MMTF-pySpark for rapid structural searches
 Finish phylogenetic reconstruction with ProGraphML+TR on diverse
species
 Joint gene-repeat reconstruction
 Analogous to joint species-gene tree inference (e.g. Szöllősi et al., 2015)
ARM BINDING
MOTIVATION: ANTIBODIES
 Nature’s solution to binding molecules
 Used in diagnostics, therapy, labelling, biochemistry research
 $105 billion industry (2016)
 3D epitope
 Produced in vivo in animals (polyclonal) then optimized biochemically (monoclonal)
MOTIVATION: DARPINS
 Designed Ankyrin Repeat Proteins
 Developed by Andreas Plückthun, UZH
 Commercialized by Molecular Partners AG ($571 million market cap)
 Similar uses to antibodies
 3D epitope
 Produced in vitro from a randomized library
MOTIVATION: dArmRP
 Designed Armadillo Repeat Proteins
 Bind extended peptides (tails, disordered regions, denatured proteins)
 1D epitope
 Rationally designed in silico?
ARM STRUCTURE & CONSERVATION
Gul 2017 Fig 1B
Structure: Repeat from designed ARM YIIIM5AII (Hansen…Plückthun, 2016) [5aei], colored and labeled as in the alignment
[Figure labels: helices H1, H2, H3; hydrophobic core]
BINDING HINTS FROM dArmRP ((KR)n BINDING)
Gul 2017 Fig 1B
Structure: Repeat from designed ARM YIIIM5AII (Hansen…Plückthun, 2016) [5aei], colored and labeled as in the alignment
[Figure labels: helices H1, H2, H3; hydrophobic core; nonspecific binding]
Mutants available for 7 residues in Arg pocket
Lys pocket has only one specific interaction
BINDING MODULARITY
 For dArmRP, binding energy scales linearly with the number of repeats and with the number of single-residue mutations
Predictable binding energies
Single-residue resolution
[Figure labels: K->A, R->A, 2K->2A, 2R->2A]
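One way to formalize the modularity claim, as a sketch: assume each repeat and each point mutation contributes an independent free-energy term. The symbols $\Delta G_0$, $\Delta G_{\text{rep}}$ and $\Delta\Delta G_m$ are illustrative and do not appear in the slides; $n$ is the number of repeats and $K_d$ the dissociation constant (relative to 1 M).

```latex
\Delta G_{\text{bind}} \approx \Delta G_0 + n\,\Delta G_{\text{rep}}
  + \sum_{m \in \text{mutations}} \Delta\Delta G_m ,
\qquad
\Delta G_{\text{bind}} = RT \ln K_d
\;\Longrightarrow\;
\log_{10} K_d \approx \frac{1}{RT \ln 10}
  \Bigl( \Delta G_0 + n\,\Delta G_{\text{rep}} + \sum_{m} \Delta\Delta G_m \Bigr)
```

Under this reading, $\log_{10} K_d$ is linear in the repeat count and in the mutation terms, so a double mutant made of two equivalent substitutions would shift $\log_{10} K_d$ by about twice the single-mutant shift.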
KERNEL MODEL
 Regression problem: predict binding affinity from sequence at 7
positions
 Extract 5 features based on amino acid properties (Atchley 2005)
 Use linear regression with various kernels
$\log_{10} \hat{Y} = K (K + \lambda I)^{-1} \log_{10} Y$
 Linear kernel: $k(a, b) = a^{T} b$
 Gaussian kernel: $k(a, b) = \exp(-\sigma \lVert a - b \rVert^{2})$
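The fitted-value formula and the two kernels on this slide can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used in the talk: the data are synthetic, the layout of 7 positions times 5 factors (35 features) is an assumption, and the function names are made up for this example.

```python
import numpy as np


def linear_kernel(A, B):
    """Gram matrix of inner products: k(a, b) = a^T b."""
    return A @ B.T


def gaussian_kernel(A, B, sigma):
    """Gram matrix of k(a, b) = exp(-sigma * ||a - b||^2)."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sigma * sq_dist)


def kernel_ridge_predict(K_train, y_train, K_query, lam):
    """Compute K_query @ (K_train + lam*I)^{-1} @ y_train.

    A linear solve is used instead of forming the inverse explicitly.
    """
    n = K_train.shape[0]
    alpha = np.linalg.solve(K_train + lam * np.eye(n), y_train)
    return K_query @ alpha


# Synthetic stand-in data (NOT the binding measurements from the talk):
# 40 variants, 7 positions x 5 factors = 35 features (assumed layout).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 35))
log10_y = X @ rng.normal(size=35) + 0.1 * rng.normal(size=40)

K = linear_kernel(X, X)
fitted = kernel_ridge_predict(K, log10_y, K, lam=1e-3)
print("training RMSE:", float(np.sqrt(np.mean((fitted - log10_y) ** 2))))
```

With the linear kernel this is ordinary ridge regression written in its dual form, which is why the same code path can serve both kernels.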
RESULTS
 Train on 138 datapoints from Plückthun group
 Essentially all “positive” binding cases
 Leave-one-out cross validation for error estimation
 Linear: 0.42 standard error (log10 M units)
 Gaussian: 1.42, but numerically unstable
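Leave-one-out cross-validation for this kind of model does not require refitting once per datapoint: for ridge-type smoothers, the held-out residual equals the in-sample residual divided by one minus the corresponding diagonal entry of the hat matrix. The sketch below assumes that the "standard error" on the slide is a root-mean-square LOO error, which the slides do not state; the function names are hypothetical.

```python
import numpy as np


def loo_predictions(K, y, lam):
    """Leave-one-out predictions for kernel ridge regression.

    Uses the identity  y_i - yhat_i^(-i) = (y_i - yhat_i) / (1 - H_ii),
    where H = K (K + lam*I)^{-1} is the hat matrix. The division becomes
    unreliable when H_ii is close to 1 (very small lam with full-rank K).
    """
    n = K.shape[0]
    # K and (K + lam*I) commute, so solving (K + lam*I) H = K yields H.
    H = np.linalg.solve(K + lam * np.eye(n), K)
    fitted = H @ y
    loo_residuals = (y - fitted) / (1.0 - np.diag(H))
    return y - loo_residuals


def loo_rmse(K, y, lam):
    """Root-mean-square leave-one-out error."""
    return float(np.sqrt(np.mean((y - loo_predictions(K, y, lam)) ** 2)))


# Synthetic example (not the data from the talk).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
y = X @ rng.normal(size=5) + 0.2 * rng.normal(size=30)
print("LOO RMSE:", loo_rmse(X @ X.T, y, lam=0.1))
```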
LINEAR KERNEL
lambda=0.001
0.42 standard error (log10 M units)
R=.90
[Scatter plot axes: Measured Binding (log10), Predicted Binding (log10)]
GAUSSIAN KERNEL
lambda=10^-4, sigma=10^8
1.42 standard error (log10 M units)
R=.13
[Scatter plot axes: Measured Binding (log10), Predicted Binding (log10)]
GAUSSIAN KERNEL
 Numerically unstable implementation (hat matrix is near-singular)
 No renormalization currently
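As a general illustration of how a Gaussian Gram matrix can degenerate, the sketch below looks at two limiting regimes of the width parameter. It is not a diagnosis of the implementation in the talk, and the slides do not say which regime applies there. For very small sigma every entry approaches 1, so the matrix is close to rank one and `K + lambda*I` is badly conditioned when lambda is small. For very large sigma the off-diagonal entries vanish and the matrix becomes the identity, which is well conditioned but carries no similarity information.

```python
import numpy as np


def gaussian_gram(X, sigma):
    """Gram matrix of k(a, b) = exp(-sigma * ||a - b||^2)."""
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sigma * sq_dist)


def ridge_condition(K, lam):
    """2-norm condition number of the regularized system K + lam*I."""
    return float(np.linalg.cond(K + lam * np.eye(K.shape[0])))


# Synthetic features (not the data from the talk).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))

for sigma in (1e-8, 1.0, 1e8):
    K = gaussian_gram(X, sigma)
    print(f"sigma={sigma:g}  cond(K + 1e-6*I)={ridge_condition(K, 1e-6):.3g}")
```

Checking the condition number before solving, and rescaling features so that squared distances are of order one, are two common precautions; whether they would fix the instability reported here is not something the slides settle.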
OUTLOOK: BINDING
 Switch from regression to classification
 Additional training data from collaborators
 In particular, need non-binding examples
 More sophisticated classifiers
 Numerically stable implementation
 Better kernels?
 Proactively suggest informative instances for our collaborators to
measure
THANKS!
 ACGTeam: Maria Anisimova, Manuel Gil, Victor Garcia,
Lorenzo Gatti, Max Maiolo, Simone Ulzega, Erich
Zbinden
 Matteo Delucci & Lina Naef (ACLS masters) – TRAL
 Elke Schaper – TRAL
 Somayeh Danafar – Kernel methods
 Andreas Plückthun, Patrick Ernst, Yvonne Stark (UZH) –
Binding data
But wait, there’s more! MMTF format coming next…
MMTF DEMO?
MMTF
 Compression: data normalization, vectorization, run-length encoding,
delta encoding
 Optional lossy/coarse representation
 Now has widespread software support (BioPython, BioJava, most
molecular viewers, etc)
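Two of the compression steps named above, delta encoding and run-length encoding, combine well on regular integer columns such as consecutive atom serial numbers. The sketch below is a toy in plain Python, not the MMTF reference codec; the function names are hypothetical, and the flat `[value, count, ...]` layout is chosen here for illustration.

```python
def delta_encode(values):
    """Store the first value, then successive differences."""
    encoded, previous = [], 0
    for value in values:
        encoded.append(value - previous)
        previous = value
    return encoded


def delta_decode(deltas):
    """Invert delta_encode with a running sum."""
    decoded, total = [], 0
    for delta in deltas:
        total += delta
        decoded.append(total)
    return decoded


def run_length_encode(values):
    """Collapse runs into flat [value, count, value, count, ...] pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-2] == value:
            encoded[-1] += 1
        else:
            encoded.extend([value, 1])
    return encoded


def run_length_decode(pairs):
    """Invert run_length_encode."""
    decoded = []
    for value, count in zip(pairs[0::2], pairs[1::2]):
        decoded.extend([value] * count)
    return decoded


# 1000 consecutive serial numbers shrink to a single (value, count) pair.
atom_serials = list(range(1, 1001))
packed = run_length_encode(delta_encode(atom_serials))
print(packed)  # [1, 1000]
```

Delta encoding turns the steadily increasing column into a run of identical differences, which is exactly the input on which run-length encoding is most effective.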
 MMTF Format: http://mmtf.rcsb.org/
 MMTF-Spark library:
 Java https://github.com/sbl-sdsc/mmtf-spark/
 Python https://github.com/sbl-sdsc/mmtf-pyspark/
 Fast, parallelized whole-PDB analysis
CE-SYMM OPEN REPEAT DETECTION
β-catenin [1I7X]
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported
License.

2018-05-24 Research update on Armadillo Repeat Proteins: Evolution and Design potential


Editor's notes

  1. Micrograph by Mark Peifer https://bio.unc.edu/people/faculty/peifer/ All rights reserved
  2. Armadillidium vulgare by Franco Folini https://commons.wikimedia.org/wiki/File:Armadillidium_vulgare_001.jpg CC-BY
  3. Gul 2017 Fig 1C
  4. Image: Armadillos gloves by muratyusuf (https://www.etsy.com/ch-en/listing/115618947/last-minute-discount-original-design?ref=shop_home_active_2)
  5. Sources: Antibody IgG2 by TimVickers https://commons.wikimedia.org/wiki/File:Antibody_IgG2.png https://www.futuremarketinsights.com/press-release/antibodies-market
  6. Sources: Antibody IgG2 by TimVickers https://commons.wikimedia.org/wiki/File:Antibody_IgG2.png https://www.futuremarketinsights.com/press-release/antibodies-market
  7. Sources: Antibody IgG2 by TimVickers https://commons.wikimedia.org/wiki/File:Antibody_IgG2.png Antibody industry revenue: https://www.futuremarketinsights.com/press-release/antibodies-market DARPin 2QYJ dArmRP 5AEI
  8. Gul 2017 Fig 1C
  9. Image by Tim Pierce https://commons.wikimedia.org/wiki/File:Southern_three-banded_armadillo_(10432292444).jpg CC-BY