Genome assembly: then and now
Keith Bradnam
Image from Wellcome Trust
Image from flickr.com/photos/dougitdesign/5613967601/
Contents
Sequencing 101
Genome assembly: then
Genome assembly: now
Assemblathon 1 & 2
Advice & Angst
The future
More info
✤ http://assemblathon.org
✤ http://gigasciencejournal.com
✤ http://twitter.com/assemblathon
Sequencing 101
A, C, G, T...
Image from nlm.nih.gov
Read
Read pair
Read pair
Mate pair
Contigs
Scaffold
NNNNNNNNNNNNNNNNNNN
Assembly size
[Figure: an example assembly of 12 scaffolds with lengths of 70, 25, 20, 10, 10, 5, 5, 5, 15, 15, 15, and 5 Mbp, some containing runs of Ns. Total assembly size = 200 Mbp.]
N50 length
[Figure: the same scaffolds added up from longest to shortest. The running total goes 70, 95, 115 Mbp, so the 20 Mbp scaffold is the one that takes the total past half of the 200 Mbp assembly. A second version of the assembly without two of the 5 Mbp scaffolds has a total size of 190 Mbp.]
N50 for two assemblies
Assembly size: 208 Mbp vs. 190 Mbp
N50 = 15 Mbp vs. N50 = 25 Mbp
NG50 for two assemblies
Expected genome size = 250 Mbp
NG50 = 15 Mbp vs. NG50 = 15 Mbp
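The difference between N50 and NG50 can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the Assemblathon analyses; the function name `nx_length` and its parameters are made up for this example. The scaffold lengths are those of the 190 Mbp example assembly, and 250 Mbp is the expected genome size from the slides.

```python
def nx_length(lengths, fraction=0.5, genome_size=None):
    """Return the Nx length, or the NGx length if genome_size is given.

    Scaffolds are added from longest to shortest; the answer is the length
    of the scaffold that takes the running total to the target, which is a
    fraction of the assembly size (Nx) or of the genome size (NGx).
    """
    total = genome_size if genome_size is not None else sum(lengths)
    target = total * fraction
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= target:
            return length
    return None  # the assembly is too small to ever reach the target


# Scaffold lengths (Mbp) of the 190 Mbp example assembly
scaffolds = [70, 25, 20, 10, 10, 5, 5, 15, 15, 15]

print(nx_length(scaffolds))                   # N50  -> 25
print(nx_length(scaffolds, genome_size=250))  # NG50 -> 15
```

Passing a different `fraction` (0.6, 0.7, ...) gives NG60, NG70 and so on, which is all that is needed to draw an NG graph for an assembly.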
You should check that high N50 values are not simply due to lots of Ns in the scaffolds
Assembly 'x'
Size: 859 Mbp
Number of scaffolds: 28
N50 = 70.3 Mbp
Ns = 90.6% !!!
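Checking the N content of an assembly takes very little code. The sketch below assumes the scaffold sequences have already been read into a list of strings; the function name `n_percentage` is invented for this example.

```python
def n_percentage(sequences):
    """Return the percentage of all bases that are N (upper or lower case)."""
    total = sum(len(seq) for seq in sequences)
    if total == 0:
        return 0.0
    n_count = sum(seq.upper().count("N") for seq in sequences)
    return 100.0 * n_count / total


# Toy example: 6 of the 12 bases are Ns
print(n_percentage(["ACGTNNNN", "nnAC"]))  # 50.0
```

Reporting this number next to N50 makes it obvious when long scaffolds are mostly gaps.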
Basic assembly metrics
Metric            Description
Assembly size     With or without very short contigs?
N50 / NG50        For contigs and/or scaffolds
Coverage          When compared to a reference sequence
Errors            Base errors from alignment to reference sequence and/or input read data
Number of genes   From comparison to reference transcriptome and/or set of known genes
And many, many more...
Genome assembly
Back in the day...
1998
Genome assembly: then
Genetic maps ✓
Physical maps ✓
Understanding of target genome ✓
Haploid / low heterozygosity genome ✓
Accurate & long reads ✓
Resources (time, money, people) ✓
So what was the result of spending millions of dollars to assemble genomes of well-characterized species, with accurate long reads, and detailed maps?
Arabidopsis thaliana
✤ 2000: published genome size = 125 Mbp
✤ 2007: genome size = 157 Mbp
✤ 2012: genome size = 135 Mbp
✤ Amount sequenced = 119 Mbp
✤ Ns = 0.2% of genome
Two views of the same gene
Top: from genome sequence view on TAIR web site
Bottom: from gene sequence file on TAIR FTP site
Drosophila melanogaster
✤ Genome published 2000
✤ Heterochromatin finished 2007
✤ Ns = 4% of genome
Caenorhabditis elegans
✤ Genome published 1998
✤ 2004: last N removed
✤ 1998–2014: genome sequence changes
  ✤ 558 insertions
  ✤ 230 deletions
  ✤ 614 substitutions
[Bracket annotation on the three counts above: Nov 2012]
Saccharomyces cerevisiae
✤ Genome published 1997
✤ 12 Mbp genome
✤ 1,653 changes to genome since 1997
✤ Last changes made in 2011
Genome assembly: then
Genetic maps ✓
Physical maps ✓
Understanding of target genome ✓
Haploid / low heterozygosity genome ✓
Accurate & long reads ✓
Resources (time, money, people) ✓
Genome assembly: now
Genetic maps ✗
Physical maps ✗
Understanding of target genome ✗
Haploid / low heterozygosity genome ✗
Accurate & long reads ✗
Resources (time, money, people) ✗
Assembling & finishing a genome is not easy!
Assemblathons
A new idea is born
Image from flickr.com/photos/dullhunk/4422952630
If you sequence 10,000 genomes...
...you need to assemble 10,000 genomes
How many assembly tools are out there?
bambus2
Ray
Celera
MIRA
ALLPATHS-LG
SGA
Curtain
Metassembler
Phusion
ABySS
Amos
Arapan
CLC
Cortex
DNAnexus
DNA Dragon
Edena
Forge
Geneious
IDBA
Newbler
PRICE
PADENA
PASHA
Phrap
TIGR
Sequencher
SeqMan NGen
SHARCGS
SOPRA
SSAKE
SPAdes
Taipan
VCAKE
Velvet
Arachne
PCAP
GAM
Monument
Atlas
ABBA
Anchor
ATAC
Contrail
DecGPU
GenoMiner
Lasergene
PE-Assembler
Pipeline Pilot
QSRA
SeqPrep
SHORTY
fermi
Telescoper
Quast
SCARPA
Hapsembler
HapCompass
HaploMerger
SWiPS
GigAssembler
MSR-CA
MaSuRCA
GARM
Cerulean
TIGRA
ngsShoRT
PERGA
SOAPdenovo
REAPR
FRCBam
EULER-SR
SSPACE
Opera
mip
gapfiller
image
PBJelly
HGAP
FALCON
Dazzler
GGAKE
A5
CABOG
SHRAP
SR-ASM
SuccinctAssembly
SUTTA
Ragout
Tedna
Trinity
SWAP-Assembler
SILP3
AutoAssemblyD
KGBAssembler
MetAMOS
iMetAMOS
MetaVelvet-SL
KmerGenie
Nesoni
Pilon
Platanus
CGAL
GAGM
Enly
BESST
Khmer
GRIT
IDBA-MTP
dipSPAdes
WhatsHap
SHEAR
ELOPER
OMACC
Which is the best?
Comparing assemblers
✤ Can't fairly compare two assemblers if they:
  ✤ produced assemblies from different species
  ✤ assembled same species, but used sequence data from different sequencing technologies
  ✤ used same sequencing technologies but have different sequence libraries
✤ Even using different options for the same assembler may produce very different assemblies!
The PRICE genome assembler has 52 command-line options!!!
How many of them are you going to learn?
A genome assembly competition
An attempt to standardize some aspects of the genome assembly process
Genome assembly contests
Assemblathon 1
✤ 2010–2011
✤ Used synthetic data
✤ Small genome (~100 Mbp)
✤ We knew the answer!
Here we go again
                 Type of data   Number of genomes   Size of genomes   Do we know the answer?
Assemblathon 1   Synthetic      1                   Small             ✓
Assemblathon 2   Real           3                   Large             ✗
Bird: Melopsittacus undulatus
Snake: Boa constrictor constrictor
Fish: Maylandia zebra
Why these three species?
Because they were there
Assemble this!
Species   Estimated genome size   Illumina              Roche 454           PacBio
Bird      1.2 Gbp                 285x (14 libraries)   16x (3 libraries)   10x (2 libraries)
Fish      1.0 Gbp                 192x (8 libraries)    -                   -
Snake     1.6 Gbp                 125x (4 libraries)    -                   -
Who took part?
21 teams
43 assemblies
52,013,623,777 bp of sequence
Entries
Species   Competitive entries   Evaluation entries
Bird      12                    3
Fish      10                    6
Snake     12                    0
Goals
✤ Assess 'quality' of assemblies
✤ Define quality!
✤ Produce ranking of assemblies for each species
✤ Produce ranking of assemblers across species?
Who did what?
Person/group                    Jobs
Me, Ian Korf, and Joseph Fass   Perform various analyses of all assemblies
David Schwartz et al.           Produce & evaluate optical maps
Jay Shendure et al.             Produce Fosmid sequences (bird & snake only)
Martin Hunt & Thomas Otto       Perform REAPR analysis
Dent Earl & Benedict Paten      Help with meta-analysis of final rankings
91 co-authors!
flickr.com/photos/jamescridland/613445810
Results!
Lots of results!
102 different metrics!
10 key metrics
Key Metric Description
1 NG50 scaffold length
2 NG50 contig length
3 Amount of assembly in 'gene-sized' scaffolds
4 Number of 'core genes' present
5 Fosmid coverage
6 Fosmid validity
7 Short-range scaffold accuracy
8 Optical map: level 1
9 Optical map: levels 1–3
10 REAPR summary score
1) Scaffold NG50 lengths
✤ Can calculate NG50 length for each assembly
✤ But also calculate NG60, NG70 etc.
✤ Plot all results as a graph
2) Contig vs scaffold NG50
3) Gene-sized scaffolds
✤ Some assembly folks get a little obsessed by length!
✤ How long is 'long enough' for a scaffold?
✤ What if you just wanted to find genes?
✤ Average vertebrate gene = ~25 Kbp
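The idea of a 'gene-sized' metric can be sketched as the share of an assembly that sits in scaffolds of at least 25 Kbp. This is only an illustration: the function name is invented, and using the total assembly size as the denominator is an assumption (an estimated genome size would work equally well).

```python
def fraction_in_gene_sized_scaffolds(lengths, min_length=25_000):
    """Return the fraction of assembled bases in scaffolds >= min_length bp."""
    total = sum(lengths)
    if total == 0:
        return 0.0
    long_enough = sum(length for length in lengths if length >= min_length)
    return long_enough / total


# Toy assembly: one 20 Kbp scaffold falls below the 25 Kbp threshold
scaffold_lengths = [100_000, 50_000, 30_000, 20_000]
print(f"{fraction_in_gene_sized_scaffolds(scaffold_lengths):.0%}")  # 90%
```

An assembly with a modest N50 can still score well here, which is the point of asking how long is 'long enough'.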
4) Core genes
✤ Used CEGMA (Core Eukaryotic Gene Mapping Approach)
✤ CEGMA uses a set of 458 'Core Eukaryotic Genes' (CEGs)
✤ CEGs are conserved in: S. cerevisiae, S. pombe, A. thaliana, C. elegans, D. melanogaster, and H. sapiens
✤ How many full-length CEGs are in each assembly?
Core genes (out of 458)
Species   Best individual assembly   Across all assemblies
Bird      420                        442
Fish      436                        455
Snake     438                        454
4) Core genes
ABYSS MNTVLTRANSLFAFSLSVMAALTFGCFITTAFKERTVPVSIAVSRVML-------KNVED
BCM MNTVLTRANSLFAFSLSVMAALTFGCFITTAFKERTVPVSIAVSRVML-------KNVED
CRACS MNTVLTRANSLFAFSLSVMAALTFGCFITTAFKERTVPVSIAVSRVML-------KNVED
CURT MNTVLTRANSLFAFSLSVMAALTFGCFITTAFKERTVPVSIAVSRVML-------KNVED
GAM MNTVLTRANSLFAFSLSVMAALTFGCFITTAFKERTVPVSIAVSRVMLFYEVRKIKNVED
MERAC MNTVLTRANSLFAFSLSVMAALTFGCFITTAFKERTVPVSIAVSRVML-------KNVED
PHUS MNTVLTRANSLFAFSLSVMAALTFGCFITTAFKERTVPVSIAVSRVML-------KNVED
RAY MNTVLTRANSLFAFSLSVMAALTFGCFITTAFKERTVPVSIAVSRVML-------KNVED
SGA MNTVLTRANSLFAFSLSVMAALTFGCFITTAFKERTVPVSIAVSRVML-------KNVED
SYMB MNTVLTRANSLFAFSLSVMAALTFGCFITTAFKERTVPVSIAVSRVMLFYEVRKIKNVED
SOAP MNTVLTRANSLFAFSLSVMAALTFGCFITTAFKERTVPVSIAVSRVML-------KNVED
************************************************ *****
ABYSS FTGPGERSDLGIITFNISANILYYKHSSLFPNIFDWNVKQLFLYLSAEYSTKNN------
BCM FTGPGERSDLGIITFNISANILYYKHSSLFPNIFDWNVKQLFLYLSAEYSTKNN------
CRACS FTGPGERSDLGIITFNISANILYYKHSSLFPNIFDWNVKQLFLYLSAEYSTKNN------
CURT FTGPGERSDLGIITFNISANILYYKHSSLFPNIFDWNVKQLFLYLSAEYSTKNN------
GAM FTGPGERSDLGIITFNISANILYYKHSSLFPNIFDWNVKQLFLYLSAEYSTKNNLPHTHI
MERAC FTGPGERSDLGIITFNISANILYYKHSSLFPNIFDWNVKQLFLYLSAEYSTKNN------
PHUS FTGPGERSDLGIITFNISANILYYKHSSLFPNIFDWNVKQLFLYLSAEYSTKNN------
RAY FTGPGERSDLGIITFNISANILYYKHSSLFPNIFDWNVKQLFLYLSAEYSTKNN------
SGA FTGPGERSDLGIITFNISANILYYKHSSLFPNIFDWNVKQLFLYLSAEYSTKNN------
SYMB FTGPGERSDLGIITFNISANILYYKHSSLFPNIFDWNVKQLFLYLSAEYSTKNN------
SOAP FTGPGERSDLGIITFNISANILYYKHSSLFPNIFDWNVKQLFLYLSAEYSTKNN------
******************************************************
ABYSS ---ALNQVVLWDKIILRGDDPNLLLKDMKSKYFFFDDGNGLKGNRNVTLTLSWNVVPNAG
BCM ---ALNQVVLWDKIILRGDDPNLLLKDMKSKYFFFDDGNGLKGNRNVTLTLSWNVVPNAG
CRACS ---ALNQVVLWDKIILRGDDPNLLLKDMKSKYFFFDDGNGLKGNRNVTLTLSWNVVPNAG
CURT ---ALNQVVLWDKIILRGDDPNLLLKDMKSKYFFFDDGNGLKGNRNVTLTLSWNVVPNAG
GAM YGHALNQVVLWDKIILRGDDPNLLLKDMKSKYFFFDDGNGLK------------------
MERAC ---ALNQVVLWDKIILRGDDPNLLLKDMKSKYFFFDDGNGLKGNRNVTLTLSWNVVPNAG
PHUS ---ALNQVVLWDKIILRGDDPNLLLKDMKSKYFFFDDGNGLKGNRNVTLTLSWNVVPNAG
RAY ---ALNQVVLWDKIILRGDDPNLLLKDMKSKYFFFDDGNGLKGNRNVTLTLSWNVVPNAG
SGA ---ALNQVVLWDKIILRGDDPNLLLKDMKSKYFFFDDGNGLKGNRNVTLTLSWNVVPNAG
SYMB ---ALNQVVLWDKIILRGDDPNLLLKDMKSKYFFFDDGNGLKGNRNVTLTLSWNVVPNAG
SOAP ---ALNQVVLWDKIILRGDDPNLLLKDMKSKYFFFDDGNGLKGNRNVTLTLSWNVVPNAG
***************************************
4) Core genes
8 & 9) Optical maps
✤ Stretch out DNA
✤ Cut with restriction enzymes
✤ Note lengths of fragments
✤ Compare to in silico digest of scaffolds
✤ Not all scaffolds suitable for analysis
Image from University of Wisconsin-Madison
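The 'in silico digest' step can be sketched as follows. This is a simplified illustration, not the optical mapping software used in the Assemblathon: the function name and the `cut_offset` parameter are made up, the recognition site GAATTC is just an example, and only one strand is searched.

```python
def digest_fragment_lengths(sequence, site, cut_offset=0):
    """Cut a sequence at every occurrence of a recognition site.

    Returns the fragment lengths in order along the sequence. cut_offset is
    the position of the cut relative to the start of the site.
    """
    if not site:
        raise ValueError("recognition site must not be empty")
    sequence, site = sequence.upper(), site.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(boundaries, boundaries[1:])]


# Toy scaffold with two GAATTC sites
print(digest_fragment_lengths("AAGAATTCTTGAATTCAA", "GAATTC"))  # [2, 8, 8]
```

The ordered list of predicted fragment lengths is what gets compared against the fragment lengths measured on the optical map.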
What does this all mean?
102 metrics per assembly
10 key metrics
1 final ranking
Assembly   Number of core genes   Rank   Z-score
CRACS      438                    1      +0.68
SYMB       436                    2      +0.59
PHUS       435                    3      +0.54
BCM        434                    4      +0.49
SGA        433                    5      +0.44
MERAC      430                    6      +0.30
ABYSS      429                    7      +0.25
SOAP       428                    8      +0.21
RAY        422                    9      –0.08
GAM        415                    10     –0.41
CURT       360                    11     –3.02
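The Z-scores for the core gene counts can be reproduced with a short sketch. It assumes each Z-score is (value minus mean) divided by the population standard deviation of the 11 assemblies; that choice is an inference that happens to match the numbers on the slide, not a documented detail of the analysis.

```python
from statistics import mean, pstdev

# Number of core genes found in each snake assembly (from the slide)
core_genes = {
    "CRACS": 438, "SYMB": 436, "PHUS": 435, "BCM": 434, "SGA": 433,
    "MERAC": 430, "ABYSS": 429, "SOAP": 428, "RAY": 422, "GAM": 415,
    "CURT": 360,
}


def z_scores(values):
    """Return {name: z-score} using the population standard deviation."""
    numbers = list(values.values())
    mu = mean(numbers)
    sigma = pstdev(numbers)
    return {name: (value - mu) / sigma for name, value in values.items()}


for name, z in z_scores(core_genes).items():
    print(f"{name:6s} {z:+.2f}")
```

A Z-score puts metrics with different units on one scale, so they can be averaged into a single ranking. It also shows how one outlier (CURT) compresses the scores of everyone else.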
What does this all mean?
No really, what does this all mean?
Some conclusions
✤ Very hard to find assemblers that performed well across all 10 key metrics
✤ Assemblers that perform well in one species do not always perform as well in another
✤ Bird & snake assemblies appear better than fish
✤ No real 'winner' for bird and fish
SGA: best assembler for snake?
Description                                    Rank of snake SGA assembly
NG50 scaffold length                           2
NG50 contig length                             5
Amount of assembly in 'gene-sized' scaffolds   7
Number of 'core genes' present                 5
Fosmid coverage                                2
Fosmid validity                                2
Short-range scaffold accuracy                  3
Optical map: level 1                           2
Optical map: levels 1–3                        1
REAPR summary score                            2
Best assembler across species?
Assembler    Number of 1st places (out of 27)
BCM          5
Meraculous   4
Symbiose     4
Ray          3
Excluding evaluation entries
Ray performance
Species   Final ranking
Bird      7th
Fish      7th
Snake     9th
BCM bird assemblies
Assembler           Final rank   NGS data used in assembly   Coverage Z-score   Validity Z-score   NG50 contig Z-score
BCM (evaluation)    1            Illumina + 454              +2.0               +1.4               +1.5
BCM (competitive)   2            Illumina + 454 + PacBio     –0.3               –0.8               +2.7
BCM evaluation scaffold
NNNNNNNNNNNNNNNNNNN
BCM competition scaffold (gap filled with PacBio sequence)
CGTCGNNATCNNGGTTACG
Mismatches from PacBio sequence penalized alignment score more than matching unknown bases
The choice of one command-line option, used by one tool in the calculation of one key metric...
...probably made enough difference to drop the PacBio-containing assembly to 2nd place.
Other conclusions
✤ Different metrics tell different stories
✤ Heterozygosity was a big issue for bird & fish assemblies
✤ Final rankings very sensitive to changes in metrics
✤ N50 is a semi-useful predictor of assembly quality
Inter-specific differences matter
✤ The three species have genomes with different properties
  ✤ repeats
  ✤ heterozygosity
✤ The three genomes had very different NGS data sets
  ✤ Only bird had PacBio & 454 data
  ✤ Different insert sizes in short-insert libraries
The Big Conclusion
"You can't always get what you want"
Sir Michael Jagger, 1969
What comes next?
3?
A wish list for Assemblathon 3
✤ Only have 1 species
✤ Teams have to 'buy' resources using virtual budgets
✤ Factor in CPU time/cost?
✤ Agree on metrics before evaluating assemblies!
✤ Encourage experimental assemblies
✤ Use new FASTG genome assembly file format
✤ Get someone else to write the paper!
Intermission
NGS must die!
‘NGS’ is used to refer to everything post-Sanger
Pyrosequencing was developed ~1996
NGS madness
Next generation sequencing
aka second generation sequencing
but there’s also:
third generation sequencing
fourth generation sequencing
next-next generation sequencing
next-next-next generation sequencing
Technology          According to some papers…   According to other papers…
Complete Genomics   2nd generation              3rd generation
Ion Torrent         2nd generation              3rd generation
PacBio              2nd generation              3rd generation
Oxford Nanopore     3rd generation              4th generation
“PacBio is a 2.5th generation”
“Helicos lies between the transition of next-generation to third generation”
There are different sequencing methodologies, and there are different sequencing platforms.
Use one or the other.
Or just say ‘current sequencing technologies’.
Intermission
My #1 piece of advice
flickr.com/julia_manzerova
flickr.com/thomashawk
Look at your data!
I looked at the shortest 10 sequences in 34 different genome assemblies…
From a vertebrate genome assembly with 72,214 sequences…
Length of 10 shortest sequences:
100, 100, 99, 88, 87, 76, 73, 63, 12, and 3 bp!
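Looking at the shortest sequences in an assembly needs nothing more than a FASTA parser. The sketch below is one way to do it with the Python standard library; the function names are invented for this example, and the input is any iterable of FASTA lines (an open file handle works too).

```python
import heapq


def sequence_lengths(fasta_lines):
    """Yield (name, length) for each record in an iterable of FASTA lines."""
    name, length = None, 0
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, length
            fields = line[1:].split()
            name, length = (fields[0] if fields else ""), 0
        elif line and name is not None:
            length += len(line)
    if name is not None:
        yield name, length


def shortest_sequences(fasta_lines, n=10):
    """Return the n shortest records as (name, length), shortest first."""
    return heapq.nsmallest(n, sequence_lengths(fasta_lines), key=lambda rec: rec[1])


# Toy input; for a real assembly, pass an open file handle instead
example = [">s1", "ACGT", "ACGT", ">s2 tiny contig", "ACG", ">s3", "ACGTACGTACGT"]
print(shortest_sequences(example, n=2))  # [('s2', 3), ('s1', 8)]
```

A 3 bp 'sequence' in a published assembly is exactly the kind of thing this check surfaces in seconds.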
Reasons to be cheerful
flickr.com/danielygo
Data from Lex Nederbragt’s blog, June 2014
Long-read technology
Moleculo read data from Illumina BaseSpace, July 2013
PacBio data
From https://flxlexblog.wordpress.com (Lex Nederbragt's blog)
MinION from Oxford Nanopore
Where is the data?
Nick Loman published the first real-world data on June 10th
Single chromosome assembly?
Tackling heterozygosity
1000 Genomes project plans to sequence 15 'trios' at high depth
Hi-C
✤ Nature Biotechnology, 31, 2013
✤ Burton et al.
✤ Selvaraj et al.
✤ Kaplan & Dekker
The future of genome assembly
Kwik-E-Assembler
acgtaacacaancac
gggaacnnnacatta
acnactagcataata
nnnnnnnnnnaacac
actttaaattatatc
✤ At some point we will look back with embarrassment at this era.
✤ Assembly must, and will, get better, but...
✤ ...'perfect' genomes may remain elusive.
✤ Data management will remain an issue:
  ✤ the human genome -> human genomes -> tissue-specific genomes
Summary
✤ There is no real consensus on how to make a good genome assembly
✤ Try different assemblers, try different command-line options
✤ Decide what it is you want to get out of a genome assembly
✤ Look at your input and output data
✤ Wait 5 years and come back, we’ll (probably) have solved everything!
Resources
✤ Lex Nederbragt’s blog - https://flxlexblog.wordpress.com
✤ Nick Loman’s blog - http://pathogenomics.bham.ac.uk/blog/
✤ Assemblathon twitter feed - https://twitter.com/assemblathon

Weitere ähnliche Inhalte

Was ist angesagt?

2013 stamps-assembly-methods.pptx
2013 stamps-assembly-methods.pptx2013 stamps-assembly-methods.pptx
2013 stamps-assembly-methods.pptx
c.titus.brown
 
2013 stamps-intro-assembly
2013 stamps-intro-assembly2013 stamps-intro-assembly
2013 stamps-intro-assembly
c.titus.brown
 
U Florida / Gainesville talk, apr 13 2011
U Florida / Gainesville  talk, apr 13 2011U Florida / Gainesville  talk, apr 13 2011
U Florida / Gainesville talk, apr 13 2011
c.titus.brown
 
2014 whitney-research
2014 whitney-research2014 whitney-research
2014 whitney-research
c.titus.brown
 
2013 stamps-intro-assembly
2013 stamps-intro-assembly2013 stamps-intro-assembly
2013 stamps-intro-assembly
c.titus.brown
 

Genome assembly: then and now — v1.1