2. Genomics: a field of science that analyzes and compares the complete genomes (the entire genetic
material) of organisms, or large numbers of genes simultaneously.
microbial
(Science: biology, microbiology) Pertaining to microorganisms too small to be seen with the naked eye.
Microbial genomics:
the impact of genomics on our understanding of microbial life
1) Methods for Studying Microbial Genomes
2) Analysis and Interpretation of Whole Genome Sequences
Introduction
3. Genome
Genome Terminology
Circular Chromosome ·The DNA is arranged in a closed circle, which is negatively
supercoiled allowing for the compact nature of many bacterial genomes.
Eg: E. coli, Enterobacter sp.
Linear Chromosome · A non-closed chromosome, which has inverted repeats at the ends,
similar to telomeres in eukaryotic chromosomes.
Eg: Streptomyces sp.,
Plasmid · Extra-chromosomal DNA which replicates independently of the chromosome
and regulates its own replication.
Megaplasmid · A very large plasmid ranging in size from 100 to 1700 Kb
Kb - A kilobase (Kb) is 1,000 bases of DNA.
Mb - A megabase (Mb) is 1,000,000 bases of DNA.
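The unit definitions above can be sketched as a small conversion helper. This is only an illustration; the function names are made up for this example and do not come from the slides.

```python
# Unit sizes as defined in the terminology above.
BASES_PER_KB = 1_000
BASES_PER_MB = 1_000_000


def kb_to_bases(kb: float) -> int:
    """Convert kilobases to bases."""
    return round(kb * BASES_PER_KB)


def mb_to_bases(mb: float) -> int:
    """Convert megabases to bases."""
    return round(mb * BASES_PER_MB)


def bases_to_mb(bases: int) -> float:
    """Convert bases to megabases."""
    return bases / BASES_PER_MB


# Megaplasmid size range quoted above: 100 to 1700 Kb.
low, high = kb_to_bases(100), kb_to_bases(1700)
print(low, high)          # 100000 1700000
print(bases_to_mb(high))  # 1.7
```

In these units, the largest megaplasmids (1700 Kb = 1.7 Mb) approach the size of a small bacterial chromosome.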
4. plasmid: A circle of double-stranded DNA that is separate from the
chromosomes, which is found in bacteria and protozoa.
mobilome: The entirety of the mobile (transposable) elements of a
genome.
replicon: A region of DNA or RNA that replicates from a single origin
of replication.
episome: A plasmid that can integrate into the chromosomal DNA.
5. Bacterial genome
• Genome: the total genetic information of an organism
Eg: a) Haploid set of chromosomes in eukaryotes
b) Single chromosome in bacteria
c) DNA or RNA of a virus (a virus contains either DNA or RNA, but not both at the same
time)
• Prokaryotic genome is different from the eukaryotic genome
• Small sections of DNA, called genes, code for the RNA and protein molecules required
by the organism
• The full range of RNA molecules expressed by a genome is known as its
transcriptome, and
• The full assortment of proteins produced by the genome is called its proteome.
• The structure (and function) of DNA depends on the sequence of the DNA
6. Genomes summary
1. >930 bacterial genomes sequenced.
2. Genomes of >200 eukaryotes (45 “higher”) sequenced.
3. Bacterial genomes are usually circular; exceptional linear genomes also exist.
4. Genes densely packed.
5. 2-10 Mbases, 470 - 7,000 genes
6. On average, ~50% of gene functions “known”.
7. Human genome: <40,000 genes code for >120,000
proteins.
Large gene families (e.g. 500 protein kinases)
98% of human DNA is noncoding.
~3% of human DNA = simple repeats (satellites,
minisatellites, microsatellites)
~50% of DNA = mobile elements (DNA transposons,
retrotransposons (LTR and nonLTR) & pseudogenes)
7. Bacterial genome sizes
Predicted genes in bacterial species:
Mycoplasma genitalium         470
Mycoplasma mycoides           985
E. coli                     4,288
B. anthracis                5,508
P. aeruginosa               5,570
Mycobacterium leprae        1,604
Mycobacterium tuberculosis  3,995
8. Deinococcus radiodurans
• Deinococcus radiodurans was first discovered in 1956 by Arthur W. Anderson.
• The name means "strange berry that withstands radiation."
• Ionizing radiation makes double-strand breaks in the DNA.
• Deinococcus radiodurans is able to survive radiation exposure up to 1,500,000 rads! That is
3,000 times greater than the amount of radiation exposure that would kill a human.
• Cells have mechanisms to repair these lesions, but if too many breaks are made, stitching
the DNA back together in the right order can overwhelm the cell's DNA repair mechanisms.
Somehow, D. radiodurans has the ability to repair a shattered genome.
• The genome of D. radiodurans is unusual in that it is composed of two chromosomes, a
megaplasmid and a small plasmid.
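The "3,000 times" comparison above implies a lethal dose for humans that the slide does not state directly. A minimal arithmetic sketch recovers it; the only fact added here is the standard unit conversion 1 gray (Gy) = 100 rad, and the variable names are illustrative.

```python
RADS_PER_GRAY = 100  # standard conversion: 1 Gy = 100 rad

d_radiodurans_dose_rad = 1_500_000  # survivable dose quoted on the slide
fold_over_human = 3_000             # ratio quoted on the slide

# Human lethal dose implied by the two slide figures.
implied_human_lethal_rad = d_radiodurans_dose_rad / fold_over_human
print(implied_human_lethal_rad)                # 500.0

# The same D. radiodurans dose expressed in gray.
print(d_radiodurans_dose_rad / RADS_PER_GRAY)  # 15000.0
```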
9. Azotobacter vinelandii
• Azotobacter vinelandii is a large, soil-dwelling, obligate aerobic bacterium capable of fixing
nitrogen.
• In addition, A. vinelandii can metabolize a large number of carbohydrates, organic acids and
alcohols.
• During exponential growth, A. vinelandii cells typically contain 2 to 4 copies of their
chromosome.
• However, during stationary phase, the number of chromosomes in an individual cell can
increase to 50-100.
10. Buchnera spp.
• These bacteria are intracellular symbionts of certain aphid species. This mutualistic relationship between aphid
and bacterium evolved millions of years ago.
• Although closely related to E. coli, Buchnera has a genome approximately one-seventh the size of the E.
coli genome. In one Buchnera species, the genome is composed of one 640 kilobase (Kb) chromosome and
two plasmids, which encode the biosynthetic pathways for several amino acids.
• It has been shown that the number of genome copies in Buchnera cells is related to the developmental stage of
their host aphid; as an aphid enters into adulthood, the genomic copy number in individual Buchnera cells
increases.
• As the aphid host ages, the genomic copy number in Buchnera decreases. It has been proposed that this
fluctuation in copy number may be due to the bacterium purging itself of genomes with deleterious mutations,
ensuring only viable chromosomes are transmitted to the next generation of aphids.
11. Agrobacterium tumefaciens
• These ubiquitous, Gram-negative, motile, rod-shaped soil bacteria are the causative agent of
crown-gall disease in plants.
• Agrobacterium tumefaciens is referred to as a natural genetic engineer, as it is capable of
transferring DNA from itself into plant cells.
• The approximately 5.7 megabase (Mb) genome consists of a circular chromosome, a linear
chromosome and two plasmids.
• One of the plasmids, the Ti (tumor-inducing) plasmid, is responsible
for A. tumefaciens virulence.
12. Epulopiscium spp.
• Epulopiscium spp. are intestinal symbionts of certain species
of surgeonfish belonging to the family Acanthuridae.
• Some morphotypes of Epulopiscium can attain lengths
greater than 0.5 mm! The accompanying image shows DAPI-stained
Epulopiscium cells.
• DAPI is a DNA-specific stain, so all of the blue visible
in these cells is DNA.
• Assays using real-time quantitative PCR suggest
that Epulopiscium contains tens of thousands of copies of its
genome. This copy number is unprecedented in bacteria and
may represent a cellular adaptation which
allows Epulopiscium to maintain such a large cell size.
• By having thousands of copies of its
genome, Epulopiscium may be able to synthesize
macromolecules close to where they are needed in the cell,
overcoming the constraints imposed by the diffusion
coefficients of small molecules and biomolecules.
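The diffusion constraint mentioned above can be made concrete with a rough estimate. The sketch below uses the textbook relation for three-dimensional diffusion, t = x^2 / (6 D), where t is the time at which the root-mean-square displacement reaches the distance x. The diffusion coefficient of 10 um^2/s is an assumed order-of-magnitude value for a protein in cytoplasm, not a figure from the slides, and the function name is illustrative.

```python
def diffusion_time_s(distance_um: float, d_um2_per_s: float) -> float:
    """Characteristic 3-D diffusion time t = x^2 / (6 D), in seconds."""
    return distance_um ** 2 / (6.0 * d_um2_per_s)


ASSUMED_D = 10.0  # um^2/s; assumed value for a cytoplasmic protein

cells = [
    ("typical bacterium (~1 um)", 1.0),
    ("Epulopiscium (~500 um)", 500.0),
]
for label, distance_um in cells:
    t = diffusion_time_s(distance_um, ASSUMED_D)
    print(f"{label}: about {t:.3g} s")
```

Under these assumptions a protein crosses a 1 um cell in a small fraction of a second, but needs more than an hour to cross 0.5 mm. Because the time grows with the square of the distance, this is consistent with the idea that many genome copies allow macromolecules to be made close to where they are needed.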
13. Plasmids
• A plasmid is a small DNA molecule within a cell that is physically separated from
chromosomal DNA and can replicate independently.
• The term "plasmid" was introduced by Joshua Lederberg in 1952.
• They are most commonly found in bacteria as small circular, double-stranded DNA molecules;
however, plasmids are sometimes present in archaea and eukaryotic organisms.
• While the chromosomes are big and contain all the essential genetic information for living under
normal conditions, plasmids usually are very small and contain only additional genes that may
be useful to the organism under certain situations or particular conditions.
• Plasmids used in genetic engineering are called vectors. Artificial plasmids are widely used
as vectors in molecular cloning, serving to drive the replication of recombinant DNA sequences
within host organisms.
14. Types of plasmids
• (Fertility) F-plasmids, which contain tra genes. They are capable of conjugation and result
in the expression of sex pili.
• (Resistance) R -plasmids, which contain genes that provide resistance against antibiotics or
poisons. They were historically known as R-factors, before the nature of plasmids was
understood.
• Col plasmids, which contain genes that code for bacteriocins, proteins that can kill other
bacteria.
• Degradative plasmids, which enable the digestion of unusual substances, e.g. toluene and
salicylic acid.
• (Virulence) Vir plasmids, which turn the bacterium into a pathogen.
• (Symbiosis) Sym plasmids, which carry genes for host-specific (e.g. pea-specific) nodulation and nitrogen fixation.
19. Classification
• Plasmids may be classified in a number of ways.
• Plasmids can be broadly classified into conjugative plasmids and non-conjugative plasmids.
Conjugative plasmids
• Conjugative plasmids contain a set of transfer or tra genes which promote sexual conjugation between different
cells.
• In the complex process of conjugation, a plasmid may be transferred from one bacterium to another via
sex pili encoded by some of the tra genes.
Non-conjugative plasmids
• Non-conjugative plasmids are incapable of initiating conjugation, hence they can be transferred only with the
assistance of conjugative plasmids.
• An intermediate class of plasmids is mobilizable; these carry only a subset of the genes required for transfer.
• They can parasitize a conjugative plasmid, transferring at high frequency only in its presence.
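The three classes described above can be sketched as a simple decision rule. This is an illustrative data structure only: the field names and example plasmid names are hypothetical, and real classification relies on annotating the actual transfer genes.

```python
from dataclasses import dataclass


@dataclass
class Plasmid:
    name: str
    has_full_tra_set: bool            # complete set of transfer (tra) genes
    has_partial_transfer_genes: bool  # only a subset of the transfer genes


def transfer_class(p: Plasmid) -> str:
    """Classify a plasmid by its transfer-gene content."""
    if p.has_full_tra_set:
        return "conjugative"
    if p.has_partial_transfer_genes:
        return "mobilizable"
    return "non-conjugative"


def needs_helper(p: Plasmid) -> bool:
    """Only conjugative plasmids can initiate conjugation on their own."""
    return transfer_class(p) != "conjugative"


# Hypothetical examples, not real plasmids.
examples = [
    Plasmid("pExampleA", True, False),
    Plasmid("pExampleB", False, True),
    Plasmid("pExampleC", False, False),
]
for p in examples:
    print(p.name, transfer_class(p), "needs helper:", needs_helper(p))
```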
21. • OriT (Origin of Transfer): The sequence that marks the starting point of
conjugative transfer.
• OriV (Origin of Vegetative Replication): The sequence at which replication of the plasmid
DNA starts in the recipient cell.
• tra-region (transfer genes): Genes encoding the F-pilus and the DNA transfer process.
• IS (Insertion Elements): one copy of IS2 and two copies of IS3, so-called
"selfish genes" (sequence fragments that can integrate copies of
themselves at different locations).
22. R-Plasmids
• Resistance (R) plasmids, which contain genes that provide resistance against antibiotics or poisons.
• Historically known as R-factors, before the nature of plasmids was understood. R-factor was first
demonstrated in Shigella in 1959 by Japanese scientists.
• Often, R-factors code for more than one antibiotic resistance factor: genes that encode resistance to
unrelated antibiotics may be carried on a single R-factor, sometimes up to 8 different resistances.
• Many R-factors can pass from one bacterium to another through bacterial conjugation and are a
common means by which antibiotic resistance spreads between bacterial species, genera and even
families.
• For example, RP1, a plasmid that encodes resistance
to ampicillin, tetracycline and kanamycin originated in a species of Pseudomonas, from the
Family Pseudomonadaceae, but can also be maintained in bacteria belonging to the
family Enterobacteriaceae, such as Escherichia coli.
23. • Plasmids provide a mechanism for horizontal gene transfer (HGT) within a population of microbes
and typically provide a selective advantage under a given environmental state.
• Plasmids can be transmitted from one bacterium to another via three main
mechanisms: transformation, transduction, and conjugation.
• This host-to-host transfer of genetic material is called horizontal gene transfer, and plasmids can be
considered part of the mobilome.
• Unlike viruses (which encase their genetic material in a protective protein coat called a capsid),
plasmids are "naked" DNA and do not encode genes necessary to encase the genetic material for
transfer to a new host.
• The size of the plasmid varies from 1 to over 200 kbp, and the number of identical plasmids in a
single cell can range anywhere from one to thousands under some circumstances.
24. Uses of plasmids
• Plasmids almost always carry at least one gene.
• Many of the genes carried by a plasmid are beneficial for the host cells, for example: enabling the
host cell to survive in an environment that would otherwise be lethal or restrictive for growth.
• Some of these genes encode traits for antibiotic resistance or resistance to heavy metals, while others
may produce virulence factors that enable a bacterium to colonize a host and overcome its defences
(vir plasmids), or have specific metabolic functions that allow the bacterium to utilize a particular
nutrient, including the ability to degrade recalcitrant or toxic organic compounds (degradative
plasmids).
• Plasmids can also provide bacteria with the ability to fix nitrogen (sym plasmids).
• Some plasmids have no observable effect on the phenotype of the host cell, or their benefit to the host
cell cannot be determined; these are called cryptic plasmids.
25. Plasmid copy number
• Plasmid size can range from very small mini-plasmids of less than 1 kilobase pair
(Kbp) to very large megaplasmids of several megabase pairs (Mbp).
• Plasmids are generally circular; however, linear plasmids are also known.
• Plasmids may be present in an individual cell in varying number, ranging from one to
several hundreds.
• The normal number of copies of plasmid that may be found in a single cell is called
the copy number, and is determined by how the replication initiation is regulated and
the size of the molecule.
• Larger plasmids tend to have lower copy numbers.
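One way to picture the trade-off between plasmid size and copy number is to compare the total plasmid DNA a cell has to maintain. The sizes and copy numbers below are hypothetical values chosen only to fall within the ranges quoted above; they do not describe any particular plasmid.

```python
# (description, size in Kbp, copies per cell); all values are hypothetical.
plasmids = [
    ("small high-copy plasmid", 3, 500),
    ("large low-copy plasmid", 100, 2),
]


def plasmid_dna_per_cell_kbp(size_kbp: float, copy_number: int) -> float:
    """Total plasmid DNA carried by one cell, in Kbp."""
    return size_kbp * copy_number


for name, size_kbp, copies in plasmids:
    total = plasmid_dna_per_cell_kbp(size_kbp, copies)
    print(f"{name}: {size_kbp} Kbp x {copies} copies = {total} Kbp per cell")
```

In this toy comparison the small plasmid accounts for 1,500 Kbp of DNA per cell and the large one for 200 Kbp, which shows that total DNA load depends on copy number as much as on plasmid size.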
26. Why study microbial genomes?
• Analysis of whole microbial genomes provides insight into microbial evolution
and diversity beyond single protein or gene phylogenies.
• In practical terms, analysis of whole microbial genomes is also a powerful tool for
identifying new applications in biotechnology and new approaches to the
treatment and control of pathogenic organisms.
• Central role in ecological balance
• Commercial opportunities
• Pathogens kill millions every year
27. History of microbial genome sequencing
• 1977 - first complete genome to be sequenced was bacteriophage φX174
- 5386 bp
• Late 1990s - many additional microbial genomes sequenced, including
Archaea (Methanococcus jannaschii - 1996) and Eukaryotes
(Saccharomyces cerevisiae - 1996)
28. Laboratory tools for studying whole genomes
• Conventional techniques for analysing DNA are designed for the analysis of small
regions of whole genomes, such as individual genes or operons.
• Many of the techniques used to study whole genomes are conventional molecular
biology techniques adapted to operate effectively with DNA in a much larger size
range.
29. Some tests:
• Pulsed Field Gel Electrophoresis
• Large insert cloning vectors – BACs and PACs
• Approaches to whole microbial genome sequencing
• Structural genomics
• Microarray hybridisation
• Random sequencing (shotgun) approach
• Genome Annotation
31. Genome Annotation
The process carried out after sequencing has been completed.
Requires the use of many different tools:
Bioinformatics
Databases
Literature
Sequence
Experimental
33. GLIMMER is a gene-prediction program used by:
BASYS - http://wishart.biology.ualberta.ca/basys/cgi/submit.pl
JCVI (formerly TIGR) - http://www.tigr.org/
SABIA - http://www.sabia.lncc.br/
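To give a feel for what gene prediction involves, here is a deliberately naive open reading frame (ORF) scan. It is not GLIMMER's method (GLIMMER uses interpolated Markov models to score candidate genes) and should be read as a toy sketch. It checks only the forward strand, and the function name, parameters and demo sequence are made up for this example.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 3):
    """Return (start, end, frame) for forward-strand ORFs.

    start is 0-based; end is exclusive and includes the stop codon.
    min_codons counts the start codon but not the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START:
                start = i  # remember the first start codon seen
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs


demo = "CCATGAAATTTGGGTAACC"  # made-up sequence
print(find_orfs(demo))        # [(2, 17, 2)]
```

Real gene finders also scan the reverse strand, allow alternative start codons, and use statistical models to separate true genes from ORFs that occur by chance.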
34. Example - Haemophilus influenzae
• First complete genome sequence of a free-living organism (1995); an important
pathogen.
• Genome is around 1.83 megabases (Mb) in size.
• Random sequencing was done for both small-insert and large-insert (lambda) libraries.
• Sequencing reactions were performed by eight individuals using fourteen ABI 377 DNA
sequencers per day over a three-month period.
• In total, around 33,000 sequencing reactions were performed on 20,000 templates.
• Plasmid extraction was performed in a 96-well format.
• 11 Mb of sequence was initially used to generate 140 contigs.
• Gaps were closed by lambda linking clones (23), peptide links (2), Southern analysis
(37) and PCR (42).
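The figures above allow a rough estimate of sequencing coverage, and of why gaps remain after random (shotgun) sequencing. The sketch below assumes reads land uniformly at random on the genome, so the number of reads covering a base follows a Poisson distribution. That model is an assumption added for illustration; the slides do not state it.

```python
import math

genome_size_bp = 1_830_000    # about 1.83 Mb, from the slide
sequence_bp = 11_000_000      # about 11 Mb of sequence, from the slide

# Average number of times each base is sequenced.
coverage = sequence_bp / genome_size_bp

# Poisson probability that a given base is covered by no read at all.
fraction_unsequenced = math.exp(-coverage)
expected_unsequenced_bp = fraction_unsequenced * genome_size_bp

print(f"coverage: about {coverage:.1f}x")
print(f"fraction of bases expected to be missed: {fraction_unsequenced:.4f}")
print(f"expected unsequenced bases: about {expected_unsequenced_bp:.0f}")
```

With roughly six-fold coverage, the model predicts that a small fraction of the genome (a few thousand bases) is missed by chance, which is consistent with the need for targeted gap closure by linking clones, Southern analysis and PCR.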