Scoring matrices
• A scoring system is a set of values that quantifies the likelihood of one residue being substituted by another in an alignment.
• It is also known as a substitution matrix.
• Scoring matrices for nucleotides are relatively simple: a positive value or high score is given for a match, and a negative value or low score is given for a mismatch.
• Scoring matrices for amino acids are more complicated because the scores have to reflect the physicochemical properties of the amino acid residues.
Transitions: substitutions in which a purine (A/G) is replaced by another purine (A/G), or a pyrimidine (C/T) is replaced by another pyrimidine (C/T).
Transversions: substitutions in which a purine (A/G) is replaced by a pyrimidine (C/T), or a pyrimidine by a purine.
Identity matrix

     G   C   T   A
G    1   0   0   0
C    0   1   0   0
T    0   0   1   0
A    0   0   0   1
Transition-transversion matrix

     G   C   T   A
G    1  -5  -5  -1
C   -5   1  -1  -5
T   -5  -1   1  -5
A   -1  -5  -5   1
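
The two nucleotide matrices above can also be generated from the transition/transversion rule instead of being typed in by hand. The following is a minimal Python sketch; the function names and the default scores (+1, -1, -5, taken from the matrix above) are illustrative choices rather than part of the original slides.

```python
# Purines and pyrimidines as defined in the text above.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = "GCTA"  # same row/column order as the matrices above


def is_transition(x, y):
    """True when x and y differ but are both purines or both pyrimidines."""
    return x != y and ({x, y} <= PURINES or {x, y} <= PYRIMIDINES)


def identity_score(x, y):
    """Identity matrix entry: 1 for a match, 0 otherwise."""
    return 1 if x == y else 0


def ts_tv_score(x, y, match=1, transition=-1, transversion=-5):
    """Transition-transversion matrix entry."""
    if x == y:
        return match
    return transition if is_transition(x, y) else transversion


def build_matrix(score):
    """Build a dict keyed by (row base, column base)."""
    return {(x, y): score(x, y) for x in BASES for y in BASES}


identity_matrix = build_matrix(identity_score)
ts_tv_matrix = build_matrix(ts_tv_score)

# Print the transition-transversion matrix row by row.
for x in BASES:
    print(x, [ts_tv_matrix[(x, y)] for y in BASES])
```

Storing the matrix as a dictionary keyed by base pairs keeps lookups simple and makes it easy to swap in different penalty values.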
• Match score: +1
• Mismatch score: 0
• Gap penalty: -1

ACGTCTGATACGCCGTATAGTCTATCT
    ||||| |||   || ||||||||
----CTGATTCGC---ATCGTCTATCT

• Matches: 18 × (+1) = +18
• Mismatches: 2 × 0 = 0
• Gaps: 7 × (-1) = -7
• Score = 18 + 0 - 7 = +11
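
The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch that assumes a simple linear gap penalty (every gap position costs the same); the function name `alignment_score` is made up for illustration.

```python
def alignment_score(seq1, seq2, match=1, mismatch=0, gap=-1):
    """Score two already-aligned sequences column by column.

    Returns (matches, mismatches, gaps, total score).
    """
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have the same length")
    matches = mismatches = gaps = 0
    for a, b in zip(seq1, seq2):
        if a == "-" or b == "-":
            gaps += 1          # a gap in either sequence
        elif a == b:
            matches += 1       # identical residues
        else:
            mismatches += 1    # different residues
    total = matches * match + mismatches * mismatch + gaps * gap
    return matches, mismatches, gaps, total


top = "ACGTCTGATACGCCGTATAGTCTATCT"
bottom = "----CTGATTCGC---ATCGTCTATCT"
print(alignment_score(top, bottom))  # (18, 2, 7, 11)
```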
PAM (point accepted mutation): based on global alignments [evolutionary model]
BLOSUM (blocks substitution matrix): based on local alignments [similarity among conserved sequences]
• PAM matrices were first compiled by Dayhoff, who aligned 71 groups of very closely related protein sequences.
• PAM stands for point accepted mutation.
• The PAM matrices were derived from the evolutionary divergence between the sequences in each group.
• Construction of the PAM1 matrix involves alignment of full-length sequences and subsequent construction of phylogenetic trees using the parsimony principle.
• Ancestral sequence information is used to count the number of substitutions along each branch of the tree.
• Positive scores in the matrix denote substitutions that occur more frequently than expected among evolutionarily conserved replacements.
• Negative scores correspond to substitutions that occur less frequently than expected.
• One PAM unit is defined as 1% amino acid change, or one mutation per 100 residues.
• Increasing PAM numbers correspond to increasing PAM units and thus to greater evolutionary distances between protein sequences.
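
In Dayhoff's model, matrices for larger evolutionary distances are obtained by multiplying the PAM1 mutation probability matrix by itself, so PAM250 corresponds to 250 successive applications of PAM1. The sketch below illustrates the idea with a made-up three-letter alphabet. The probabilities in `toy_pam1` are hypothetical and chosen only so that each row sums to 1 and each residue has a 1% chance of changing per step; they are not real PAM values.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_power(m, n):
    """Raise a square matrix to the n-th power by repeated multiplication."""
    size = len(m)
    # Start from the identity matrix (zero evolutionary steps).
    result = [[1.0 if i == j else 0.0 for j in range(size)]
              for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, m)
    return result


# Hypothetical one-step mutation probabilities for a 3-letter alphabet.
# Row i, column j: probability that residue i is replaced by residue j.
toy_pam1 = [
    [0.990, 0.007, 0.003],
    [0.007, 0.990, 0.003],
    [0.003, 0.007, 0.990],
]

toy_pam250 = mat_power(toy_pam1, 250)

# After many steps far fewer residues remain unchanged.
print(toy_pam1[0][0], round(toy_pam250[0][0], 3))
```

The scoring matrix itself is then obtained by converting these probabilities to log-odds values; this sketch covers only the extrapolation step.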
• PAM matrices are constructed from phylogenetic relationships that must be inferred before mutations are scored.
• Ancestral relationships among sequences are difficult to determine.
• The matrices are based on a small set of closely related proteins.
• BLOSUM is a series of blocks amino acid substitution matrices.
• The matrices are derived by directly observing every possible amino acid substitution in multiple sequence alignments.
• The conserved sequence patterns used are called blocks.
• Blocks are ungapped alignments of fewer than 60 amino acids in length.
• The BLOSUM numbers are the actual percentage identity values of the sequences selected to construct each matrix.
• BLOSUM62 indicates that the sequences selected for constructing the matrix share an average identity of 62%.
• The BLOSUM score for a particular residue pair is derived from the log ratio of the observed substitution frequency to the frequency expected by chance for that pair.
• The lower the BLOSUM number, the more divergent the sequences represented in the matrix.
     C   S   T   P   A   G
C    9
S   -1   4
T   -1   1   5
P   -3  -1  -1   7
A    0   1   0  -1   4
G   -3   0  -2  -2   0   6
• BLOSUM62 was measured on pairs of sequences with an average of 62% identical amino acids.
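
As a small illustration of how such a matrix is used, the excerpt above can be stored as a lookup table and applied to an ungapped pair of peptides. This is a minimal sketch: it covers only the six residues shown in the excerpt, and the names `BLOSUM62_SUBSET` and `ungapped_score` are made up for this example.

```python
RESIDUES = "CSTPAG"

# Lower triangle of the BLOSUM62 excerpt, one row per residue.
LOWER_TRIANGLE = [
    [9],
    [-1, 4],
    [-1, 1, 5],
    [-3, -1, -1, 7],
    [0, 1, 0, -1, 4],
    [-3, 0, -2, -2, 0, 6],
]

# Expand to a symmetric lookup table: score(a, b) == score(b, a).
BLOSUM62_SUBSET = {}
for i, row in enumerate(LOWER_TRIANGLE):
    for j, value in enumerate(row):
        a, b = RESIDUES[i], RESIDUES[j]
        BLOSUM62_SUBSET[(a, b)] = value
        BLOSUM62_SUBSET[(b, a)] = value


def ungapped_score(seq1, seq2):
    """Sum the substitution scores of two equal-length, gap-free sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same length")
    return sum(BLOSUM62_SUBSET[(a, b)] for a, b in zip(seq1, seq2))


# C-C 9, S-T 1, T-T 5, P-A -1, A-A 4, G-G 6
print(ungapped_score("CSTPAG", "CTTAAG"))  # 24
```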
\[
\text{Log-odds} = \log\left(\frac{\text{chance to see the pair in homologous proteins}}{\text{chance to see the pair in unrelated proteins by chance}}\right)
\]
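
The log-odds ratio can be turned into an integer matrix entry by scaling and rounding. The sketch below assumes base-2 logarithms with a scale factor of 2 (half-bit units); the frequencies in the usage lines are hypothetical numbers chosen only to show a positive and a negative score.

```python
import math


def log_odds_score(observed, expected, scale=2.0):
    """Scaled, rounded log-odds score for one residue pair.

    observed: frequency of the pair in alignments of homologous proteins
    expected: frequency of the pair expected by chance
    scale:    multiplier applied to log2(observed / expected); 2.0 is an
              assumption that yields half-bit units
    """
    if observed <= 0 or expected <= 0:
        raise ValueError("frequencies must be positive")
    return round(scale * math.log2(observed / expected))


# Pair seen twice as often as chance predicts: positive score.
print(log_odds_score(0.02, 0.01))   # 2
# Pair seen four times less often than chance predicts: negative score.
print(log_odds_score(0.005, 0.02))  # -4
```

A score of 0 means the pair occurs about as often as expected by chance.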
• PAM
  › Based on a mutational model of evolution (Markov process)
  › PAM1 is derived from sequences that are at least 85% identical
  › Designed to track evolutionary origins
• BLOSUM
  › Based on multiple alignments of blocks
  › Well suited to comparing distant sequences
  › Designed to find conserved domains in proteins