2. Introduction
BLAST is a sequence similarity search program that
can be used to quickly search a sequence database
for matches to a query sequence.
It is a set of programs that search sequence databases
for statistically significant similarities.
3. Basic Local Alignment Search Tool
– Altschul et al. 1990, 1994, 1997
Heuristic method for local alignment
Designed specifically for database searches
Based on the same assumption as FASTA: good
alignments contain short stretches of exact matches
4. Both BLAST and FASTA search for local sequence
similarity. Indeed, they have exactly the same goals,
though they use somewhat different algorithms and
statistical approaches.
BLAST benefits
– Speed
– User friendly
– Statistical rigor
– More sensitive
5. Input/Output
Input:
– Query sequence Q
– Database of sequences DB
– Maximal E‐value E
Output:
– Sequences from DB (Seq), such that the alignment
of Q and Seq has an E‐value < E
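The input/output contract above can be sketched in a few lines of Python. This is only an illustration, assuming the standard Karlin-Altschul form E = K * m * n * exp(-lambda * S) for the E-value of a raw score S. The function names and the default values of lam and k are assumptions chosen for the example (they are of the order commonly quoted for ungapped protein scoring) and do not come from the slides.

```python
import math

def expected_value(raw_score, query_len, db_len, lam=0.318, k=0.134):
    """E-value under the Karlin-Altschul form E = K * m * n * exp(-lambda * S).

    lam and k are illustrative defaults (assumption); real values depend on
    the scoring matrix and gap penalties in use.
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)

def filter_hits(hits, query_len, db_len, max_evalue):
    """Keep (seq_id, raw_score) hits whose E-value is below max_evalue.

    Returns (seq_id, evalue) pairs sorted from most to least significant.
    """
    kept = []
    for seq_id, raw_score in hits:
        evalue = expected_value(raw_score, query_len, db_len)
        if evalue < max_evalue:
            kept.append((seq_id, evalue))
    return sorted(kept, key=lambda item: item[1])

# Hypothetical hits: sequence identifiers with made-up raw alignment scores.
hits = [("seqA", 100), ("seqB", 20)]
significant = filter_hits(hits, query_len=100, db_len=1_000_000, max_evalue=1e-3)
print(significant)
```

A lower E-value means fewer matches of that score are expected by chance, so only the high-scoring hit passes the threshold here.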
7. Uses word matching like FASTA
Similarity matching of words (word size: 3 amino acids
or 11 bases)
– does not require identical words.
If no words are similar, then no alignment is reported
– Will not find matches for very short sequences
The original BLAST does not handle gaps well
– “Gapped BLAST” is somewhat better
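The idea of similar (rather than identical) words can be illustrated with a small sketch that lists every word scoring at or above a threshold against a query word. The four-letter alphabet, the toy_score function, and the threshold values are hypothetical simplifications for this example; an actual protein search would use all 20 amino acids and a substitution matrix such as BLOSUM62.

```python
from itertools import product

# Toy alphabet (assumption): a real protein search uses 20 amino acids.
ALPHABET = "ACDE"

def toy_score(a, b):
    """Hypothetical scoring: +4 for identity, +1 for the D/E pair, -2 otherwise."""
    if a == b:
        return 4
    if {a, b} == {"D", "E"}:
        return 1
    return -2

def neighborhood(word, threshold):
    """Return all words of the same length scoring >= threshold against word."""
    neighbors = []
    for candidate in product(ALPHABET, repeat=len(word)):
        score = sum(toy_score(a, b) for a, b in zip(word, candidate))
        if score >= threshold:
            neighbors.append("".join(candidate))
    return neighbors

print(neighborhood("ADE", 9))
```

Raising the threshold shrinks the neighborhood toward the exact word, which speeds up the search but makes it less sensitive.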
8. BLAST Algorithm
Three‐step process:
1. Word search method
2. Identification of exact word match method
3. Maximum segment pair alignment method
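The first two steps can be sketched as follows: build a lookup table of the words in the query, then scan a database sequence for occurrences of those words. This is a minimal illustration using exact matches only; the function names are hypothetical, and the third step (extending hits into scored segment pairs) is not covered here.

```python
from collections import defaultdict

def index_query_words(query, word_size):
    """Step 1: map each word of length word_size in the query to its positions."""
    index = defaultdict(list)
    for i in range(len(query) - word_size + 1):
        index[query[i:i + word_size]].append(i)
    return index

def find_word_hits(query, subject, word_size):
    """Step 2: return (query_pos, subject_pos) pairs where a query word occurs."""
    index = index_query_words(query, word_size)
    hits = []
    for j in range(len(subject) - word_size + 1):
        for i in index.get(subject[j:j + word_size], []):
            hits.append((i, j))
    return hits

print(find_word_hits("ACGTAC", "TTACGT", 3))
```

Indexing the query once means each database sequence is scanned in a single pass, which is a large part of why word-based searching is fast.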
13. Use two word matches as anchors to build an
alignment between the query and a database
sequence.
Then score the alignment.
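One way to picture the two-hit anchoring is to group word hits by diagonal (subject position minus query position) and keep pairs of non-overlapping hits on the same diagonal that lie close together. The sketch below assumes hits are given as (query_pos, subject_pos) pairs; the function name and the window parameter are illustrative assumptions, not taken from the slides.

```python
from collections import defaultdict

def two_hit_anchors(hits, word_size, window):
    """Pair non-overlapping word hits on one diagonal that lie within window.

    Returns (diagonal, first_query_pos, second_query_pos) tuples.
    """
    by_diagonal = defaultdict(list)
    for q_pos, s_pos in hits:
        by_diagonal[s_pos - q_pos].append(q_pos)

    anchors = []
    for diagonal, positions in by_diagonal.items():
        positions.sort()
        # Compare each hit with the next hit on the same diagonal.
        for first, second in zip(positions, positions[1:]):
            distance = second - first
            if word_size <= distance <= window:
                anchors.append((diagonal, first, second))
    return anchors

# Hypothetical word hits as (query_pos, subject_pos) pairs.
hits = [(0, 5), (6, 11), (2, 40), (30, 35)]
print(two_hit_anchors(hits, word_size=3, window=10))
```

Two hits on the same diagonal can be joined without gaps, so requiring a nearby pair filters out most isolated chance hits before any extension is attempted.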
14. HSPs are Aligned Regions
The results of the word matching and attempts to
extend the alignment are segments called HSPs
(High‐Scoring Segment Pairs)
BLAST often produces several short HSPs
rather than a single aligned region
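How an extension ends in a short segment can be seen in a small sketch of ungapped extension with a drop-off rule: the alignment grows one position at a time and stops once the running score falls too far below the best score seen. The scoring values (+1 match, -1 mismatch), the x_drop parameter, and the function name are assumptions for illustration. Only rightward extension is shown; leftward extension would be symmetric.

```python
def extend_right(query, subject, q_start, s_start, x_drop, match=1, mismatch=-1):
    """Extend an ungapped alignment to the right; return (best_score, length).

    Extension stops when the running score drops more than x_drop below the
    best score seen so far, or when either sequence ends.
    """
    score = 0
    best_score = 0
    best_len = 0
    length = 0
    while q_start + length < len(query) and s_start + length < len(subject):
        same = query[q_start + length] == subject[s_start + length]
        score += match if same else mismatch
        length += 1
        if score > best_score:
            best_score, best_len = score, length
        elif best_score - score > x_drop:
            break
    return best_score, best_len

print(extend_right("ACGTACGTTTTT", "ACGTACGAAAAA", 0, 0, x_drop=2))
```

Because extension stops at the first poorly matching stretch, two similar regions separated by a divergent stretch or a gap tend to be reported as separate HSPs.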