2. Genome sequencing
Genome > Fragments > Cloning > Genome library >
Sequencing > Assembly based on Overlaps
Contig : Two or more clones that can be shown to overlap
make a contig
Physical markers :
-Used to identify overlaps
-Essentially unique DNA sequences
Physical markers are used to build a physical map.
3. Methodologies of Mapping
Three general methodologies of mapping are:
-Restriction enzyme fingerprinting
-Marker sequences
-Hybridization assays
Objective : To create a landmark map of the genome under
study with markers dispersed at regular intervals throughout
the map.
4. Composite maps :
Different kinds of markers are located on the same map
•In practice,
Markers are usually located on cloned genomic fragments
and contigs are built
5. Restriction enzyme fingerprinting
Limitation : Cannot be applied to genomes other than
microbial genomes
Reasons :
i) Double digests and radiolabelling methods are not reproducible
and not amenable to large-scale mapping efforts
ii) Problems of detecting overlaps from fingerprint patterns grow
exponentially as the size of the genome increases
6. Restriction enzyme fingerprinting
Two solutions to the above problems:
1) Using BAC, PAC vectors
Insert size : about 150 kb
Advantages:
- Can yield many easily detectable fragments on digestion with
a single restriction enzyme
- Non-radioactive detection methods can be used
7. Restriction enzyme fingerprinting
2) FPC (Fingerprinted contigs) software :
Carries out the massive task of comparing the fingerprints of
different clones and determining which ones overlap
Advantages : Can accommodate the use of physical markers
such as STSs
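The idea of comparing single-enzyme fingerprints can be sketched as follows. This is only an illustration and not FPC's actual scoring method: the function names `digest` and `shared_bands`, the band-size tolerance, and placing the cut at the start of each recognition site are all assumptions made for the example.

```python
import re

def digest(sequence, site):
    """Return sorted fragment lengths for a linear sequence cut at each site.

    Simplification (assumption): the cut is placed at the first base of
    every occurrence of the recognition site.
    """
    cuts = [m.start() for m in re.finditer(f"(?={re.escape(site)})", sequence)]
    bounds = [0] + cuts + [len(sequence)]
    # Skip zero-length fragments (a site at the very start of the sequence).
    return sorted(b - a for a, b in zip(bounds, bounds[1:]) if b > a)

def shared_bands(fp_a, fp_b, tolerance=0):
    """Count bands present in both fingerprints, within a size tolerance.

    Each band is matched at most once (greedy walk over sorted sizes).
    """
    a, b = sorted(fp_a), sorted(fp_b)
    i = j = shared = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tolerance:
            shared += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return shared

# Toy clone with two GAATTC sites.
clone = "AAGAATTCTTTTGAATTCCC"
fingerprint = digest(clone, "GAATTC")              # [2, 8, 10]
print(shared_bands(fingerprint, [8, 10, 15]))      # 2 bands in common
```

Two clones that share many bands are candidates for overlap; a real analysis would also need a statistical cutoff to separate true overlaps from chance matches.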
8. Marker sequences
Sequence-tagged sites (STS) :
Short region of DNA about 200-300 bases long whose
exact sequence is found nowhere else in the genome
Note : 2 or more clones containing the same STS must overlap
and the overlap must include the STS
Any clone that can be sequenced may be used as an STS
provided it contains a unique sequence
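The rule that clones sharing an STS must overlap can be turned into a small contig-building sketch. The clone names, STS names, and the function `build_contigs` are hypothetical; the sketch only groups clones into contigs and does not order them along the chromosome.

```python
def build_contigs(clone_sts):
    """Group clones into contigs: clones sharing at least one STS are merged.

    clone_sts maps a clone name to the set of STS markers found in it.
    Returns a sorted list of contigs, each a sorted list of clone names.
    """
    parent = {clone: clone for clone in clone_sts}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_clone_with = {}
    for clone, markers in clone_sts.items():
        for sts in markers:
            if sts in first_clone_with:
                # Same STS seen in an earlier clone: the two clones overlap.
                parent[find(clone)] = find(first_clone_with[sts])
            else:
                first_clone_with[sts] = clone

    contigs = {}
    for clone in clone_sts:
        contigs.setdefault(find(clone), set()).add(clone)
    return sorted(sorted(members) for members in contigs.values())

clones = {
    "A": {"sts1", "sts2"},
    "B": {"sts2", "sts3"},
    "C": {"sts4"},
    "D": {"sts3"},
}
print(build_contigs(clones))  # [['A', 'B', 'D'], ['C']]
```

Clone C shares no STS with the others, so it remains a singleton until further markers link it to a contig.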
9. Marker sequences
Sequence-tagged connectors (STCs)
-Proposed by Venter et al.
-Used as an aid to Human genome sequencing
-Both ends of each BAC insert are sequenced for 500 bases from the point
of insertion
-These sequences can act as connectors because they will allow
any one BAC to be connected to about 30 others
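As a rough illustration of what an STC pair is, the sketch below takes the two end reads from an insert sequence. The function names and the convention of reporting the far-end read as a reverse complement (reading inward from that end) are assumptions for the example, and real end reads come from sequencing rather than from a known insert sequence.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def stc_pair(insert_seq, read_length=500):
    """Return the two end reads of an insert, each read inward from its end."""
    left = insert_seq[:read_length]
    right = reverse_complement(insert_seq[-read_length:])
    return left, right

# Tiny toy insert with 3-base reads so the result is easy to follow.
print(stc_pair("AACCGGTTAC", read_length=3))  # ('AAC', 'GTA')
```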
10. Marker sequences
Expressed sequence tags (ESTs):
Spliced mRNA contains sequences that are largely free of
repetitive DNA. Thus partial cDNA sequences (ESTs) can
serve as marker sequences.
Advantage : Point directly to an expressed gene.
Operational considerations :
1) They need to be very short to ensure that the two ends of the
sequences are contiguous in the genome.
2) Large genes may be represented by multiple ESTs.
e.g., serum albumin is represented by 1300 different ESTs.
11. Marker sequences
3' Untranslated regions (3' UTRs) :
Advantages :
1) Rarely contain introns
-leads to PCR product sizes that are small enough to amplify
2) Display less sequence conservation than coding regions
-makes it easier to discriminate among gene families
12. Marker sequences
Single nucleotide polymorphisms (SNPs) :
Single base pair positions in genomic DNA at which different
sequence alternatives (alleles) exist in a population
Note :
Polymorphisms are considered to be SNPs only if the least
abundant allele has a frequency of 1% or more in highly
outbred populations.
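The 1% rule can be checked with a few lines of code. This is a minimal sketch: the function names and the sample allele counts are invented for illustration, and the input is assumed to be a list of alleles observed at one position across sampled chromosomes.

```python
from collections import Counter

def minor_allele_frequency(alleles):
    """Frequency of the least abundant allele observed at one position."""
    counts = Counter(alleles)
    if len(counts) < 2:
        return 0.0  # only one allele seen: no polymorphism in this sample
    return min(counts.values()) / sum(counts.values())

def is_snp(alleles, threshold=0.01):
    """Apply the rule: least abundant allele at frequency >= threshold."""
    return minor_allele_frequency(alleles) >= threshold

# 200 sampled chromosomes at one position (made-up data).
print(is_snp(["A"] * 197 + ["G"] * 3))   # True  (3/200 = 1.5%)
print(is_snp(["A"] * 199 + ["G"] * 1))   # False (1/200 = 0.5%)
```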
13. Marker sequences
Special subsets of SNPs :
1. A base change alters sensitivity to cleavage by a restriction
endonuclease - RFLP (Restriction fragment length
polymorphism)
2. Two alleles can be distinguished by the presence or absence
of a phenotype
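Whether a given SNP falls into the RFLP subset can be tested by checking if the base change creates or destroys a recognition site. The sketch below assumes a palindromic site (EcoRI, GAATTC), so scanning one strand is enough; the function names and the toy allele sequences are hypothetical.

```python
def site_positions(seq, site):
    """Start positions of every occurrence of site in seq (overlaps included)."""
    return [i for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i)]

def snp_alters_site(allele_1, allele_2, site="GAATTC"):
    """True if the two allele sequences differ in where the enzyme can cut."""
    return site_positions(allele_1, site) != site_positions(allele_2, site)

# A -> G change inside the EcoRI site destroys it, so this SNP is an RFLP.
print(snp_alters_site("TTGAATTCAA", "TTGAGTTCAA"))  # True
# A change outside any site leaves the cleavage pattern unchanged.
print(snp_alters_site("TTGAATTCAA", "TTGAATTCAC"))  # False
```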
14. Marker sequences
Importance of SNPs :
Provide greatest density of markers in a physical map
Methods for discovery of SNPs :
1. Mine the sequence data stored in the major databases
2. A variation of the above method.
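One way to picture the mining of stored sequence data is as a column-by-column comparison of aligned sequences covering the same region. The sketch below assumes the sequences are already aligned and of equal length; the function name and the toy sequences are invented, and a real pipeline would also have to separate true polymorphisms from sequencing errors.

```python
from collections import Counter

def candidate_snps(aligned_seqs):
    """Return (position, base counts) for columns with more than one base.

    Only the bases A, C, G and T are counted; gaps and ambiguity codes
    are ignored.
    """
    candidates = []
    for pos, column in enumerate(zip(*aligned_seqs)):
        counts = Counter(base for base in column if base in "ACGT")
        if len(counts) > 1:
            candidates.append((pos, dict(counts)))
    return candidates

reads = ["ACGTAC", "ACGTAC", "ACTTAC"]
print(candidate_snps(reads))  # [(2, {'G': 2, 'T': 1})]
```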