Dot plots

                    Dr Avril Coghlan
                   alc@sanger.ac.uk

Note: this talk contains animations, which can only be seen by
downloading the file and using ‘View Slide Show’ in PowerPoint
Dot plots
• How can we compare the human and Drosophila melanogaster
  Eyeless protein sequences?
  One method is a dot plot
• A dot plot is a graphical method for assessing
  similarity
  Make a matrix (table) with one row for each letter in sequence 1, and one
       column for each letter in sequence 2
  Colour in each cell whose row letter and column letter are identical
  Regions of local similarity between the 2 sequences appear as diagonal
       lines of coloured cells (‘dots’)
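eg. for sequences ‘RQQEPVRSTC’ and ‘QQESGPVRST’:

                 Q  Q  E  S  G  P  V  R  S  T     Sequence 2
             R   .  .  .  .  .  .  .  X  .  .
             Q   X  X  .  .  .  .  .  .  .  .
             Q   X  X  .  .  .  .  .  .  .  .
             E   .  .  X  .  .  .  .  .  .  .
Sequence 1   P   .  .  .  .  .  X  .  .  .  .
             V   .  .  .  .  .  .  X  .  .  .
             R   .  .  .  .  .  .  .  X  .  .
             S   .  .  .  X  .  .  .  .  X  .
             T   .  .  .  .  .  .  .  .  .  X
             C   .  .  .  .  .  .  .  .  .  .

     X = coloured cell (‘dot’), . = empty cell
     Regions of local similarity between the 2 sequences appear as
     diagonal lines (here, QQE and PVRST)
     Some off-diagonal dots may be due to chance similarities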
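The colouring rule can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not the code used to draw the slides; the function names dot_matrix and render are made up for this example.

```python
def dot_matrix(seq1, seq2):
    """Return a list of rows; cell [i][j] is True when seq1[i] == seq2[j]."""
    return [[a == b for b in seq2] for a in seq1]


def render(matrix, seq1, seq2):
    """Draw the matrix as text: 'X' for a dot, '.' for an empty cell."""
    lines = ["  " + " ".join(seq2)]  # header row: letters of sequence 2
    for letter, row in zip(seq1, matrix):
        cells = " ".join("X" if cell else "." for cell in row)
        lines.append(letter + " " + cells)
    return "\n".join(lines)


seq1 = "RQQEPVRSTC"  # one row per letter
seq2 = "QQESGPVRST"  # one column per letter
matrix = dot_matrix(seq1, seq2)
print(render(matrix, seq1, seq2))
```

Every cell is filled independently of its neighbours, so the work grows with the product of the two sequence lengths.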
Problem
• Make a dot-plot for DNA sequences “GCATCGGC” &
  “CCATCGCCATCG”. Are there regions of similarity?
Answer
• Make a dot-plot for DNA sequences “GCATCGGC” &
  “CCATCGCCATCG”. Are there regions of similarity?
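       C  C  A  T  C  G  C  C  A  T  C  G
   G   .  .  .  .  .  X  .  .  .  .  .  X
   C   X  X  .  .  X  .  X  X  .  .  X  .
   A   .  .  X  .  .  .  .  .  X  .  .  .
   T   .  .  .  X  .  .  .  .  .  X  .  .
   C   X  X  .  .  X  .  X  X  .  .  X  .
   G   .  .  .  .  .  X  .  .  .  .  .  X
   G   .  .  .  .  .  X  .  .  .  .  .  X
   C   X  X  .  .  X  .  X  X  .  .  X  .

  X = coloured cell (‘dot’), . = empty cell
  CATCG in sequence 1 appears twice in sequence 2, giving two diagonal lines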
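Reading diagonal lines off the plot by eye can also be done programmatically. The sketch below lists every unbroken diagonal run of identical letters of at least a chosen length; the function name diagonal_runs and the min_length parameter are assumptions made for this illustration, not part of any package mentioned in the talk.

```python
def diagonal_runs(seq1, seq2, min_length=3):
    """Return (start1, start2, length) for each maximal run of identical
    letters along a diagonal, using 0-based start positions."""
    runs = []
    for i in range(len(seq1)):
        for j in range(len(seq2)):
            if seq1[i] != seq2[j]:
                continue
            # Skip cells that continue a run started further up the diagonal.
            if i > 0 and j > 0 and seq1[i - 1] == seq2[j - 1]:
                continue
            length = 0
            while (i + length < len(seq1) and j + length < len(seq2)
                   and seq1[i + length] == seq2[j + length]):
                length += 1
            if length >= min_length:
                runs.append((i, j, length))
    return runs


seq1 = "GCATCGGC"
seq2 = "CCATCGCCATCG"
for start1, start2, length in diagonal_runs(seq1, seq2):
    # Report 1-based positions, as a biologist would count them.
    print(seq1[start1:start1 + length],
          "starts at position", start1 + 1, "in sequence 1 and",
          start2 + 1, "in sequence 2")
```

Raising min_length hides short chance matches, which is the same idea as the threshold introduced on the next slide.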
Dot plots with thresholds
• If you colour in all cells with an identical letter, some
  dots may be due to chance similarities
• Therefore, it is common to use a threshold to decide
  whether to plot a ‘dot’ in a cell
  A window of a certain size (eg. window size = 3) is moved along every
        diagonal, one position at a time
  A score is calculated for each position of the window on a diagonal:
        the number of identical letters in the window
  If the score is equal to or above the threshold (eg. threshold = score of
        2), all the cells in the window are coloured in
  The window size and threshold for the dot plot are chosen by
        trial and error
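The windowed procedure can be sketched as follows. This is a minimal Python illustration of the steps as stated on this slide (score = number of identical letters, colour every cell of a window that reaches the threshold); the function name windowed_dot_matrix is hypothetical.

```python
def windowed_dot_matrix(seq1, seq2, window=3, threshold=2):
    """Colour all cells of every diagonal window whose score >= threshold."""
    rows, cols = len(seq1), len(seq2)
    coloured = [[False] * cols for _ in range(rows)]
    # (i, j) is the first cell of a window lying along a diagonal.
    for i in range(rows - window + 1):
        for j in range(cols - window + 1):
            # Score: number of identical letters inside the window.
            score = sum(seq1[i + k] == seq2[j + k] for k in range(window))
            if score >= threshold:
                for k in range(window):
                    coloured[i + k][j + k] = True
    return coloured


seq1 = "GCATCGGC"
seq2 = "CCATCGCCATCG"
plot = windowed_dot_matrix(seq1, seq2, window=3, threshold=2)
print("  " + " ".join(seq2))
for letter, row in zip(seq1, plot):
    print(letter + " " + " ".join("X" if cell else "." for cell in row))
```

Because the whole window is coloured, a mismatching cell inside a window that reaches the threshold is coloured too; that is a consequence of the rule as stated, not a bug in the sketch.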
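eg. for sequences “GCATCGGC” and “CCATCGCCATCG”, using a window
      size of 3, and a threshold of ≥2:

  [Animated figure: grid with GCATCGGC down the side and CCATCGCCATCG
   across the top; a sliding window of 3 cells steps along a diagonal
   and its score is shown at each position]

          Score = 2 or 3, ≥ threshold → colour in
          Score = 0 or 1, < threshold → do not colour in
          and so on....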
Real data: fruitfly & human Eyeless
• A dot plot of fruitfly & human Eyeless proteins:
  [Figure: dot plot with axes labelled ‘Fruitfly Eyeless’ and ‘Human Eyeless’;
   window size = 10, threshold = 3]
  Do you think we chose a good value for the
  window size and threshold?
Real data: fruitfly & human Eyeless
• Here is a dot plot of fruitfly and human Eyeless
  proteins, made using window size = 10, threshold = 5:
  [Figure: dot plot with axes labelled ‘Fruitfly Eyeless’ and ‘Human Eyeless’;
   window size = 10, threshold = 5]
  Are there any regions of similarity?
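One rough way to think about the choice of threshold is to ask how often a window of unrelated letters would reach it by chance. The sketch below uses a binomial model and assumes, purely for illustration, that the 20 amino acids are equally frequent and that positions are independent; real proteins violate both assumptions, and the function name chance_of_dot is made up here.

```python
from math import comb


def chance_of_dot(window, threshold, p_match):
    """Probability that a window of unrelated letters scores >= threshold,
    treating each position as an independent match with chance p_match."""
    return sum(
        comb(window, k) * p_match**k * (1 - p_match)**(window - k)
        for k in range(threshold, window + 1)
    )


p_match = 1 / 20  # assumption: 20 equally frequent amino acids
for threshold in (3, 5):
    probability = chance_of_dot(10, threshold, p_match)
    print("window = 10, threshold =", threshold,
          "-> chance per window =", probability)
```

Multiplying the per-window chance by the number of window positions (roughly the product of the two sequence lengths) gives a feel for how many chance dots to expect at each threshold.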
Pros and cons of dot plots
• Advantages
  A dot plot can be used to identify long regions of strong similarity
        between two sequences
  It produces a plot, which is easy to make and to interpret
  It can be used to compare very short or very long sequences (even whole
        chromosomes of millions of bases)
• Disadvantages
  The best window size and threshold have to be found by trial and error
  A dot plot can only compare 2 sequences at a time, not more than 2
  It doesn’t tell you what mutations occurred in the region of similarity
        (if there is one) since the two sequences shared a common ancestor
Software for making dotplots
• dotPlot() function in the SeqinR R package
  Allows you to specify a window size and threshold (the wsize and nmatch
        arguments)
  If the score in a window is ≥ the threshold, it colours in the 1st cell in
        the window (not all cells)
• EMBOSS dottup
  Allows you to specify a window size but not a threshold
  If all cells in a window are identities, it colours in all cells in the window
• EMBOSS dotmatcher
  Allows you to specify a window size and threshold
  Instead of using the number of identities in a window as the window
        score, it calculates a more complex score based on the
        similarities of the bases/amino acids
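The difference between the first two colouring conventions can be illustrated in Python. The sketch only mimics the behaviour as described on this slide; it does not call SeqinR or EMBOSS, and the function names window_scores, first_cell_dots and exact_word_dots are hypothetical.

```python
def window_scores(seq1, seq2, window):
    """Map each window start (i, j) to its number of identical letters."""
    return {
        (i, j): sum(seq1[i + k] == seq2[j + k] for k in range(window))
        for i in range(len(seq1) - window + 1)
        for j in range(len(seq2) - window + 1)
    }


def first_cell_dots(seq1, seq2, window, threshold):
    """Colour only the first cell of each window that reaches the threshold."""
    scores = window_scores(seq1, seq2, window)
    return {start for start, score in scores.items() if score >= threshold}


def exact_word_dots(seq1, seq2, window):
    """Colour every cell of each window in which all letters are identical."""
    cells = set()
    for (i, j), score in window_scores(seq1, seq2, window).items():
        if score == window:
            cells.update((i + k, j + k) for k in range(window))
    return cells


seq1, seq2 = "RQQEPVRSTC", "QQESGPVRST"
print("first-cell convention:", sorted(first_cell_dots(seq1, seq2, 3, 3)))
print("all-cells convention: ", sorted(exact_word_dots(seq1, seq2, 3)))
```

With the same window size, the first-cell convention draws shorter diagonal lines than the all-cells convention, which is worth remembering when comparing plots made by different programs.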
Problem
• Make a dot-plot for amino acid sequences
  “RQQEPVRSTC” and “QQESGPVRST”, using a
  window size of 3, and a threshold of ≥3
Answer
•   Make a dot-plot for sequences “RQQEPVRSTC” and “QQESGPVRST”,
    using window size: 3, threshold: ≥3

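                Q  Q  E  S  G  P  V  R  S  T
            R   .  .  .  .  .  .  .  .  .  .
            Q   X  .  .  .  .  .  .  .  .  .
            Q   .  X  .  .  .  .  .  .  .  .
            E   .  .  X  .  .  .  .  .  .  .
            P   .  .  .  .  .  X  .  .  .  .
            V   .  .  .  .  .  .  X  .  .  .
            R   .  .  .  .  .  .  .  X  .  .
            S   .  .  .  .  .  .  .  .  X  .
            T   .  .  .  .  .  .  .  .  .  X
            C   .  .  .  .  .  .  .  .  .  .

    X = coloured cell (‘dot’), . = empty cell
    Only the two diagonal lines (QQE and PVRST) remain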
Further reading
•   Chapter 3 in Introduction to Computational Genomics by Cristianini & Hahn
•   Practical on dot plots in R in the Little Book of R for Bioinformatics:
    https://a-little-book-of-r-for-bioinformatics.readthedocs.org/en/latest/src/chapter4.html

Editor's Notes

  1. In R:
       setwd("C:/Documents and Settings/Avril Coughlan/My Documents/BACKEDUP/MScCourseLectures/MB6301Lectures/MB6301_Ls3456_Aln")
       library("seqinr")
       seq1 <- "RQQEPVRSTC"
       seq2 <- "QQESGPVRST"
       seq1b <- s2c(seq1)   # split the string into a vector of letters
       seq2b <- s2c(seq2)
       source("dotplot.R")
       makeDotPlot1(seq1b, seq2b, dotsize=1)
  2. In R:
       setwd("C:/Documents and Settings/Avril Coughlan/My Documents/BACKEDUP/MScCourseLectures/MB6301Lectures/MB6301_Ls3456_Aln")
       library("seqinr")
       seq1 <- "GCATCGGC"
       seq2 <- "CCATCGCCATCG"
       seq1b <- s2c(seq1)
       seq2b <- s2c(seq2)
       source("dotplot.R")
       makeDotPlot1(seq1b, seq2b, dotsize=1)
  3. In R:
       setwd("C:/Documents and Settings/Avril Coughlan/My Documents/BACKEDUP/MScCourseLectures/MB6301Lectures/MB6301_Ls3456_Aln")
       library("seqinr")
       seq1 <- "GCATCGGC"
       seq2 <- "CCATCGCCATCG"
       seq1b <- s2c(seq1)
       seq2b <- s2c(seq2)
       source("dotplot.R")
       makeDotPlot2(seq1b, seq2b, dotsize=1, windowsize=3, threshold=2)
  4.   setwd("C:/Documents and Settings/Avril Coughlan/My Documents/BACKEDUP/MScCourseLectures/MB6301Lectures/MB6301_Ls3456_Aln")
       library("seqinr")
       seq1 <- read.fasta("human.fa")   # human Eyeless
       seq2 <- read.fasta("fly.fa")     # fruitfly Eyeless
       seq1b <- seq1[[1]]               # first sequence in the file
       seq2b <- seq2[[1]]
       source("dotplot.R")
       makeDotPlot2(seq1b, seq2b, dotsize=1, windowsize=10, threshold=3)
     Saved picture as dotplot2.png
  5.   setwd("C:/Documents and Settings/Avril Coughlan/My Documents/BACKEDUP/MScCourseLectures/MB6301Lectures/MB6301_Ls3456_Aln")
       library("seqinr")
       seq1 <- read.fasta("human.fa")   # human Eyeless
       seq2 <- read.fasta("fly.fa")     # fruitfly Eyeless
       seq1b <- seq1[[1]]
       seq2b <- seq2[[1]]
       source("dotplot.R")
       makeDotPlot2(seq1b, seq2b, dotsize=1, windowsize=10, threshold=5)
     Saved picture as dotplot1.png
  6. In R:
       setwd("C:/Documents and Settings/Avril Coughlan/My Documents/BACKEDUP/MScCourseLectures/MB6301Lectures/MB6301_Ls3456_Aln")
       library("seqinr")
       seq1 <- "RQQEPVRSTC"
       seq2 <- "QQESGPVRST"
       seq1b <- s2c(seq1)
       seq2b <- s2c(seq2)
       source("dotplot.R")
       makeDotPlot2(seq1b, seq2b, dotsize=1, windowsize=3, threshold=3)