Sequence comparison technique
Ms. Ruchi Yadav, Lecturer, Amity Institute of Biotechnology, Amity University, Lucknow (UP)
Sequence comparison technique
  • Pairwise Alignment: Local Alignment (Smith-Waterman Algorithm), Global Alignment (Needleman-Wunsch Algorithm)
  • Multiple Alignment
  • Heuristic Methods (FASTA and BLAST)
Rather than struggling to find the optimal alignment, we may save a lot of time by employing heuristic algorithms.
  • Execution time is much faster.
  • May completely miss the optimal alignment.
Heuristic Methods
Problem of Dynamic Programming
  • D.P. computes scores over large areas of the matrix that do not contribute to the optimal alignment.
  • FASTA focuses on the diagonal area.
  A  T  T  G  A  C  T  T  A  A  G
  1  1  1  1  1  1  1  1  1  1  1    G
  2  2  2  2  1  1  1  1  1  1  1    G
  2  2  2  2  2  2  2  2  2  2  1    A
  3  3  3  3  3  3  3  3  2  2  1    T
  4  4  4  4  4  4  3  3  2  2  1    C
  5  5  5  5  4  4  3  3  2  2  1    G
  6  5  5  5  5  4  3  3  3  2  1    A
Heuristic
  • A good local alignment should contain some exactly matching subsequence.
  • FASTA focuses on this area.
Heuristic Methods: FASTA and BLAST
  • FASTA: the first fast sequence searching algorithm for comparing a query sequence against a database.
  • BLAST: Basic Local Alignment Search Tool. An improvement on FASTA in search speed, ease of use, and statistical rigor.
FASTA ALGORITHM
(a) Find runs of identical words.
  • Identify regions shared by the two sequences that have the highest density of single identities (ktup=1) or two consecutive identities (ktup=2).
(b) Re-score using the PAM matrix.
  • The longest diagonals are scored again using the PAM-250 matrix (or another matrix).
  • The best scores are saved as “init1” scores.
FASTA Algorithm “init1”  ktup=2
FASTA ALGORITHM
(c) Join segments using gaps and eliminate other segments.
  • Long diagonals that are neighbors are joined.
  • The score for this joined region is “initn”.
  • This score may be lower due to a gap penalty.
(d) Use DP to create the optimal alignment.
  • Construct an optimal alignment of the query sequence and the library sequence (SW algorithm).
  • This score is reported as the optimized score.
FASTA Alignments “initn”
FASTA Algorithm - Find runs of identical words
  • A look-up table showing the positions of each word of length k (a k-tuple) is constructed for each sequence.
  • The relative position of each word in the two sequences is then calculated by subtracting its position in the first sequence from its position in the second.
  • Words that have the same offset are in phase and reveal a region of alignment between the two sequences.
Look-up table
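FASTA - Algorithm - Use look-up table
Query:    G A A T T C A G T T A
Sequence: G G A T C G A
Look-up table (positions of each letter in the query)
  Q   Location
  A   2,3,7,11
  C   6
  G   1,8
  T   4,5,9,10
Dot matrix (query positions 1 to 11 across, sequence down; * marks an identity)
       1  2  3  4  5  6  7  8  9 10 11
       G  A  A  T  T  C  A  G  T  T  A
  G    *  .  .  .  .  .  .  *  .  .  .
  G    *  .  .  .  .  .  .  *  .  .  .
  A    .  *  *  .  .  .  *  .  .  .  *
  T    .  .  .  *  *  .  .  .  *  *  .
  C    .  .  .  .  .  *  .  .  .  .  .
  G    *  .  .  .  .  .  .  *  .  .  .
  A    .  *  *  .  .  .  *  .  .  .  *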
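
The look-up table and the offset calculation can be sketched in a few lines of Python. This is a minimal illustration, not the FASTA program's own code: the function names `build_lookup` and `diagonal_hits` are made up for this example, the query is treated as the first sequence, and ktup=1 is used so the result matches the table for the query G A A T T C A G T T A.

```python
from collections import Counter, defaultdict


def build_lookup(seq, k=1):
    """Map each k-tuple to the 1-based positions where it starts in seq."""
    table = defaultdict(list)
    for i in range(len(seq) - k + 1):
        table[seq[i:i + k]].append(i + 1)
    return dict(table)


def diagonal_hits(query, subject, k=1):
    """Count word matches per offset (subject position minus query position)."""
    lookup = build_lookup(query, k)
    counts = Counter()
    for j in range(len(subject) - k + 1):
        for i in lookup.get(subject[j:j + k], []):
            # Words with the same offset lie on the same diagonal.
            counts[(j + 1) - i] += 1
    return counts


query = "GAATTCAGTTA"
subject = "GGATCGA"
lookup = build_lookup(query, k=1)
hits = diagonal_hits(query, subject, k=1)
print(lookup)
print(hits.most_common(3))
```

The offset with the most matches points at the diagonal where the densest run of identities lies, which is the region FASTA goes on to re-score.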
FASTA - Algorithm
  • Use dynamic programming in a restricted area (a band) around the best-scoring alignment to look for an alignment with a higher score.
  • The width of this band is a parameter.
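
A banded dynamic programming step can be sketched as follows. This is only an illustration under stated assumptions: the scores (match +1, mismatch -1, gap -2), the linear gap penalty, and the function name `banded_local_score` are chosen for this example and are not taken from FASTA itself. The band is centred on the diagonal `i - j = diag`, and only cells within `width` of that diagonal are filled.

```python
def banded_local_score(a, b, diag=0, width=2, match=1, mismatch=-1, gap=-2):
    """Best local alignment score using only cells with |(i - j) - diag| <= width."""
    prev = {}  # scores of the previous row, keyed by column index j
    best = 0
    for i in range(1, len(a) + 1):
        cur = {}
        lo = max(1, i - diag - width)
        hi = min(len(b), i - diag + width)
        for j in range(lo, hi + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Cells outside the band count as 0, i.e. a fresh local start.
            score = max(
                0,
                prev.get(j - 1, 0) + s,   # extend along the diagonal
                prev.get(j, 0) + gap,     # gap in b
                cur.get(j - 1, 0) + gap,  # gap in a
            )
            cur[j] = score
            best = max(best, score)
        prev = cur
    return best


print(banded_local_score("TTACGT", "ACGT", diag=2, width=0))
print(banded_local_score("GAATTCAGTTA", "GGATCGA", diag=0, width=2))
```

A band of width w touches roughly (2w + 1) cells per row instead of all n columns, which is where the saving over full D.P. comes from. A band placed on the wrong diagonal can miss the alignment entirely.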
FASTA - Complexity
  • Steps 1 and 2 (select the 10 best diagonal runs): let n be the length of a sequence from the database and m the length of the query.
  • The cost is O(n), because Step 1 only uses the look-up table.
  • O(n) << O(mn), with m, n = 100 to 200.
FASTA - Complexity
  • The cost of the partial D.P. depends on the size of the restricted area and is less than O(mn).
  • Therefore, FASTA is faster than full D.P.
  • The width of this band is a parameter.
Step 1: Finding Seeds
Step 2: Re-scoring Segments, Keeping Top 10
Step 3: Eliminating Unlikely Segments
Step 4: Finding the Best Alignment
Versions of FASTA
  • FASTA compares a query protein sequence to a protein sequence library to find similar sequences. FASTA also compares a DNA sequence to a DNA sequence library.
  • TFASTA compares a query protein sequence to a DNA sequence library, after translating the DNA sequence library in all six reading frames.
  • FASTX and FASTY translate a query DNA sequence in all three forward reading frames and compare all three frames to a protein sequence database.
  • TFASTX and TFASTY compare a query protein sequence to a DNA sequence database, translating each DNA sequence in all six possible reading frames.
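
The difference between "three forward frames" and "six frames" can be made concrete with a short sketch. This is an illustrative helper, not part of the FASTA package: the names `reverse_complement` and `reading_frames` are assumptions, and the code only splits the DNA into codons. Translating codons to amino acids would additionally need a codon table, which is omitted here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def reading_frames(dna, both_strands=True):
    """Split DNA into codons: 3 forward frames, plus 3 reverse frames if requested."""
    strands = [dna, reverse_complement(dna)] if both_strands else [dna]
    frames = []
    for strand in strands:
        for offset in range(3):
            # Only complete codons are kept; a trailing partial codon is dropped.
            codons = [strand[i:i + 3] for i in range(offset, len(strand) - 2, 3)]
            frames.append(codons)
    return frames


for frame in reading_frames("ATGGCC"):
    print(frame)
```

Calling `reading_frames(dna, both_strands=False)` gives the three forward frames used for the query in FASTX and FASTY, while the default gives all six frames as in TFASTA, TFASTX and TFASTY.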
BLAST
Basic Local Alignment Search Tool (Altschul et al. 1990, 1994, 1997)
  • Publications: Ungapped BLAST (Altschul et al., 1990); Gapped BLAST and PSI-BLAST (Altschul et al., 1997)
  • Heuristic method for local alignment
  • Designed specifically for database searches
  • Based on the same assumption as FASTA: good alignments contain short stretches of exact matches
Basic Local Alignment Search Tool (BLAST)
Input:
  • Query (target) sequence: DNA, RNA, or protein
  • Scoring scheme: gap penalties, substitution matrix for proteins, identity/mismatch scores for DNA/RNA
  • Word length W: typically W=3 for proteins and W=11 for DNA/RNA
Output:
  • Statistically significant matches
BLAST ALGORITHM PARAMETERS
Algorithm of BLAST
There are three distinct steps:
  • Step 1: Query preprocessing
  • Step 2: Scan the database for hits
  • Step 3: Extension of hits
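BLAST - Algorithm
Step 1: Query preprocessing
  • Create neighbourhood words for each query word.
  • Maximum number of query words: L - w + 1, where L is the query length and w is the word length.
  • Figure labels: query word, neighbourhood words.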
BLAST - Algorithm
Step 1: Query preprocessing
  • Build a list of words of length 3 for protein (word length 11 is used for DNA sequences).
BLAST - Query preprocessing
  • Compile the list of short high-scoring words from the query.
  • The query word length is 3.
  • Words scoring below the threshold are not pursued further.
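
Query preprocessing can be sketched as below. This is a toy illustration, not BLAST's implementation: real protein searches score words with a substitution matrix, whereas this sketch assumes a simple +5 identity / -4 mismatch score on a DNA alphabet to keep the example short. The names `query_words`, `toy_score` and `neighborhood`, and the threshold value 6, are all assumptions made for the example.

```python
from itertools import product


def query_words(query, w=3):
    """All overlapping words of length w; there are at most L - w + 1 of them."""
    return [query[i:i + w] for i in range(len(query) - w + 1)]


def toy_score(a, b):
    """Assumed scores for illustration only: +5 identity, -4 mismatch."""
    return 5 if a == b else -4


def neighborhood(word, alphabet, score, threshold):
    """Words of the same length whose score against `word` is >= threshold."""
    keep = []
    for letters in product(alphabet, repeat=len(word)):
        total = sum(score(x, y) for x, y in zip(word, letters))
        if total >= threshold:
            keep.append("".join(letters))
    return keep


words = query_words("GAATTCAGTTA", w=3)
neighbors = neighborhood("GAT", "ACGT", toy_score, threshold=6)
print(len(words), words)
print(neighbors)
```

Raising the threshold shrinks the neighbourhood toward the exact word only; lowering it admits more words and therefore more hits to extend later.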
BLAST - Algorithm
Step 2: Scan the database for hits
  • For each word list, identify all exact matches with DB sequences.
  • Figure labels: query word, neighbourhood word list (Step 1); sequences in DB, Sequence 1, Sequence 2 (Step 2).
  • The purpose of Steps 1 and 2 is the same as in FASTA.
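
The scanning step can be sketched as a dictionary look-up over every word position in a database sequence. This is a minimal sketch with hypothetical names (`index_words`, `scan`) and 0-based offsets. The word lists are passed in as a plain dictionary from query offset to words, so the example does not depend on any particular way of building them.

```python
from collections import defaultdict


def index_words(word_lists):
    """word_lists maps a query offset to its words; returns word -> query offsets."""
    index = defaultdict(list)
    for q_off, words in word_lists.items():
        for word in words:
            index[word].append(q_off)
    return index


def scan(subject, index, w):
    """Return (query offset, subject offset) for every exact word match."""
    hits = []
    for s_off in range(len(subject) - w + 1):
        for q_off in index.get(subject[s_off:s_off + w], []):
            hits.append((q_off, s_off))
    return hits


word_lists = {0: ["GAT", "GAA"], 1: ["ATC"]}
index = index_words(word_lists)
hits = scan("TTGATCGAA", index, w=3)
print(hits)
```

Each database sequence is read once, and each word costs one dictionary look-up, which is why this step is fast compared with filling a full D.P. matrix.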
Step 3: Extension of the hits
  • Every hit that has been generated is now extended in both directions, without gaps, to determine whether it may be part of a longer segment pair with a higher score.
Step 3: Extension of the hits
  • HSP (High-scoring Segment Pair): if the extended segment pair has a score greater than or equal to S (set as a parameter of the program), it is called an HSP.
  • MSP (Maximal Segment Pair): in a comparison, for every sequence in the database, the best-scoring HSP is called the MSP.
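
Ungapped extension of a single hit can be sketched as follows. Several details here are assumptions for illustration and are not stated on the slides: the +5 / -4 scores, the rule that extension stops once the running score falls more than `x_drop` below the best score seen, and the function name `extend_hit`. Offsets are 0-based and the returned query span is half-open.

```python
def extend_hit(query, subject, q_off, s_off, w, match=5, mismatch=-4, x_drop=10):
    """Extend a word hit in both directions without gaps.

    Returns (score, query_start, query_end) for the best segment found.
    """
    def score(a, b):
        return match if a == b else mismatch

    def extend(step, q, s):
        best = running = 0
        best_len = n = 0
        while 0 <= q < len(query) and 0 <= s < len(subject):
            running += score(query[q], subject[s])
            n += 1
            if running > best:
                best, best_len = running, n
            elif best - running > x_drop:
                break  # assumed stopping rule: score dropped too far
            q += step
            s += step
        return best, best_len

    seed = sum(score(query[q_off + k], subject[s_off + k]) for k in range(w))
    right, r_len = extend(+1, q_off + w, s_off + w)
    left, l_len = extend(-1, q_off - 1, s_off - 1)
    return seed + left + right, q_off - l_len, q_off + w + r_len


query = "ACGTGATCAAT"
subject = "TTGTGATCAGG"
result = extend_hit(query, subject, q_off=4, s_off=4, w=3)
S = 30  # hypothetical HSP cutoff
print(result, "HSP" if result[0] >= S else "below cutoff")
```

Applying the same function to every hit in one database sequence and keeping the segment with the highest score among those that reach S would give that sequence's MSP.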
High-Scoring Segment Pair (HSP)
Maximal Segment Pair (MSP)
Step 2: Extracting Seeds
Step 3: Finding HSPs
Step 4: Combining HSPs
BLAST
Basic BLAST
Specialized BLAST
 Search trace archives
 Find conserved domains in your sequence (cds)
 Find sequences with similar conserved domain architecture (cdart)
 Search sequences that have gene expression  profiles (GEO)

 Search for SNPs (snp)
 Screen sequence for vector contamination (vecscreen)
 Align two (or more) sequences using BLAST (bl2seq)
 Search protein or nucleotide targets in PubChem BioAssay
 Search SRA transcript and genomic libraries
 Constraint Based Protein Multiple Alignment Tool
Databases available on BLAST Web server
Options and parameter settings available on the BLAST server