SlideShare ist ein Scribd-Unternehmen logo
1 von 35
Downloaden Sie, um offline zu lesen
Jared Simpson
!
Ontario Institute for Cancer Research
&
Department of Computer Science
University ofToronto
Error correction, assembly
and consensus algorithms
for MinION data
London Calling, May 14th, 2015
Our collaboration
2
An overview of NGS assembly
β€’ Illumina data: short reads, very accurate, very deep	

β€’ nearly all Illumina assembly is based on exact matching algorithms	

β€’ fragmented assemblies	

!
β€’ Algorithms for Illumina data do not work for long, noisy reads	

β€’ PacBio developed a pipeline (β€œHGAP”) to assemble their data	

β€’ We used this recipe as a starting point but with custom components
3
Long read assembly pipeline
4
Error correction
Celera Assembler
Consensus
Input reads
Genome Assembly
Input Data
β€’ First challenge is finding overlaps for reads with 15-20% errors
5
Overlap Detection
6
we use github.com/thegenemyers/daligner to compute overlaps
Partial Order Graphs
7
add read GCTACGAT that we want to correct to graph
Partial Order Graphs
8
add sequence GCTCGAT to graph
Partial Order Graphs
9
add sequence GCTCGATT to graph
Partial Order Graphs
10
maximum weight path GCTCGAT is the corrected read
Error Correction
11
Contig Assembly
12
Celera Assembler produces one contig at 98.5% identity
Assembly Polishing
β€’ Consensus problem is viewed as choosing a sequence C’ that maximizes
the probability of the event data
13
C0
= arg max
S2C
P(D|S)
P(D|S) =
rY
k=1
P(ei,k, ei+1,k, ..., ej,k|S, β‡₯)
where
Selecting a Consensus
14
Mutate
ACTACGATCGACTTACGA
CCTACGATCGACTTACGA
TCTACGATCGACTTACGA
...
-CTACGATCGACTTACGA
G-TACGATCGACTTACGA
GC-ACGATCGACTTACGA
GCT-CGATCGACTTACGA
...
GACTACGATCGACTTACGA
GCCTACGATCGACTTACGA
GGCTACGATCGACTTACGA
GTCTACGATCGACTTACGA
P(D|S)
-190
-187
-192
-176
-191
-193
-168
-198
-191
-195
-181
GCTACGATCGACTTACGA
Selecting a Consensus
15
GCT-CGATCGACTTACGA
Mutate
A
C
T
...
-
G
GC
GCT
...
G
G
G
G
P(D|S)
-190
-187
-192
-176
-191
-193
-168
-198
-191
-195
-181
Select new consensus
GCT-CGATCGACTTACGA -168
Pore Models
16
Generating Events
β€’ What do we expect events from a given sequence to look like?
17
GCTACGATT
Sample Current ●●●
●
●●●●●
●●
●
●
●●
●
●●●
●
●●
●
●●●●●
●
●
●
●●●●
●
●●
●●
●●●●●
●
●
●●●●
●
●●
●●
●
●
●
●●●
●●●
●●●●
●●
●
●
●●●●
●
●
●
●
●●●
●●
●●●●●●●
●
●
●
●
●
●●●
●
●
●●●●●●●●
●
●
●
●●●●
●
●●●
●●●
●
●●
●●●
●●
●●
●●●
●
●●
●●●●●●●●
●
●●●●
●
●●●●●●
●●●●
●●●
●
●●●●
●
●
●●
●
●
●●●●●●●
●●●
●●●
●●●
●
●●●
●●
●●
●
●
●●●
●●
●●●
●●●●●●●
●
●
●
●●●●
●
●
●●●●●
●
●●●●
●●
●
●
●
●
●
●
●
●
●●
●
●●
●
●●●●
●●
●●
●
●
●●
●
●
●●●●●
●
●
●●
●●●●
●
●
●●●
●
●
●●
●●
●●
●
●●
●
●●●●●●●●●●●
●●●
●
●●
●
●●●
●
●●●
●
●
●●●●●●●
●
●●
●
●●
●●●●●
●
●
●●
●
●
●●
●●
●
●
●●
●
●●●
●●●
●
●
●
●
●
●
●
●
●
●●●
●
●●●●●●●●●●●●●●
●
●
●●●●●●
●●
●●
●
●●
●●●
●●●
●●●
●
●●
●
●
●●
●●
●●●●●
●●
●●●
●●●●
●
●
●●●
●
●●
●●●
●
●●●●●●
●
●
●●
●●●
●●●●
●
●
●●●
●
●
●
●●●●
●
●●●●●
●●
●
●
●●
●
●●●●
●●●
●
●●
●●
●●
●
●●●●●
●●●●●
●●
●
0
25
50
75
100
0.00 0.25 0.50 0.75 1.00 1.25
time (s)Current(pA)
Generating Events
β€’ What do we expect events from a given sequence to look like?
18
GCTACGATT
Sample Current ●●●
●
●●●●●
●●
●
●
●●
●
●●●
●
●●
●
●●●●●
●
●
●
●●●●
●
●●
●●
●●●●●
●
●
●●●●
●
●●
●●
●
●
●
●●●
●●●
●●●●
●●
●
●
●●●●
●
●
●
●
●●●
●●
●●●●●●●
●
●
●
●
●
●●●
●
●
●●●●●●●●
●
●
●
●●●●
●
●●●
●●●
●
●●
●●●
●●
●●
●●●
●
●●
●●●●●●●●
●
●●●●
●
●●●●●●
●●●●
●●●
●
●●●●
●
●
●●
●
●
●●●●●●●
●●●
●●●
●●●
●
●●●
●●
●●
●
●
●●●
●●
●●●
●●●●●●●
●
●
●
●●●●
●
●
●●●●●
●
●●●●
●●
●
●
●
●
●
●
●
●
●●
●
●●
●
●●●●
●●
●●
●
●
●●
●
●
●●●●●
●
●
●●
●●●●
●
●
●●●
●
●
●●
●●
●●
●
●●
●
●●●●●●●●●●●
●●●
●
●●
●
●●●
●
●●●
●
●
●●●●●●●
●
●●
●
●●
●●●●●
●
●
●●
●
●
●●
●●
●
●
●●
●
●●●
●●●
●
●
●
●
●
●
●
●
●
●●●
●
●●●●●●●●●●●●●●
●
●
●●●●●●
●●
●●
●
●●
●●●
●●●
●●●
●
●●
●
●
●●
●●
●●●●●
●●
●●●
●●●●
●
●
●●●
●
●●
●●●
●
●●●●●●
●
●
●●
●●●
●●●●
●
●
●●●
●
●
●
●●●●
●
●●●●●
●●
●
●
●●
●
●●●●
●●●
●
●●
●●
●●
●
●●●●●
●●●●●
●●
●
●●
●●
●●●●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●●
●●●●●●●
●
●●
●
●●●●
●●
●
●
●
●
●
●●
●
●
●
●●●●●
●
●●●
●
●
●
●●
●
●
●●●●●●
●●
●
●●●●
●●
●
●
●●●
●●●●
●
●
●●●●●
●●
●●
●
●
●●●●●●●
0
25
50
75
100
0.00 0.25 0.50 0.75 1.00 1.25
time (s)Current(pA)
Generating Events
β€’ What do we expect events from a given sequence to look like?
19
GCTACGATT
Sample Current ●●●
●
●●●●●
●●
●
●
●●
●
●●●
●
●●
●
●●●●●
●
●
●
●●●●
●
●●
●●
●●●●●
●
●
●●●●
●
●●
●●
●
●
●
●●●
●●●
●●●●
●●
●
●
●●●●
●
●
●
●
●●●
●●
●●●●●●●
●
●
●
●
●
●●●
●
●
●●●●●●●●
●
●
●
●●●●
●
●●●
●●●
●
●●
●●●
●●
●●
●●●
●
●●
●●●●●●●●
●
●●●●
●
●●●●●●
●●●●
●●●
●
●●●●
●
●
●●
●
●
●●●●●●●
●●●
●●●
●●●
●
●●●
●●
●●
●
●
●●●
●●
●●●
●●●●●●●
●
●
●
●●●●
●
●
●●●●●
●
●●●●
●●
●
●
●
●
●
●
●
●
●●
●
●●
●
●●●●
●●
●●
●
●
●●
●
●
●●●●●
●
●
●●
●●●●
●
●
●●●
●
●
●●
●●
●●
●
●●
●
●●●●●●●●●●●
●●●
●
●●
●
●●●
●
●●●
●
●
●●●●●●●
●
●●
●
●●
●●●●●
●
●
●●
●
●
●●
●●
●
●
●●
●
●●●
●●●
●
●
●
●
●
●
●
●
●
●●●
●
●●●●●●●●●●●●●●
●
●
●●●●●●
●●
●●
●
●●
●●●
●●●
●●●
●
●●
●
●
●●
●●
●●●●●
●●
●●●
●●●●
●
●
●●●
●
●●
●●●
●
●●●●●●
●
●
●●
●●●
●●●●
●
●
●●●
●
●
●
●●●●
●
●●●●●
●●
●
●
●●
●
●●●●
●●●
●
●●
●●
●●
●
●●●●●
●●●●●
●●
●
●●
●●
●●●●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●●
●●●●●●●
●
●●
●
●●●●
●●
●
●
●
●
●
●●
●
●
●
●●●●●
●
●●●
●
●
●
●●
●
●
●●●●●●
●●
●
●●●●
●●
●
●
●●●
●●●●
●
●
●●●●●
●●
●●
●
●
●●●●●●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●●●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●●●
●
●
●●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●●
●
●●
●
●
●
●●
●●
●●
●
●●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●●●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●●
●
●●
●
●
●●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●●
●
0
25
50
75
100
0.00 0.25 0.50 0.75 1.00 1.25
time (s)Current(pA)
Generating Events
β€’ What do we expect events from a given sequence to look like?
20
GCTACGATT
Sample Current ●●●
●
●●●●●
●●
●
●
●●
●
●●●
●
●●
●
●●●●●
●
●
●
●●●●
●
●●
●●
●●●●●
●
●
●●●●
●
●●
●●
●
●
●
●●●
●●●
●●●●
●●
●
●
●●●●
●
●
●
●
●●●
●●
●●●●●●●
●
●
●
●
●
●●●
●
●
●●●●●●●●
●
●
●
●●●●
●
●●●
●●●
●
●●
●●●
●●
●●
●●●
●
●●
●●●●●●●●
●
●●●●
●
●●●●●●
●●●●
●●●
●
●●●●
●
●
●●
●
●
●●●●●●●
●●●
●●●
●●●
●
●●●
●●
●●
●
●
●●●
●●
●●●
●●●●●●●
●
●
●
●●●●
●
●
●●●●●
●
●●●●
●●
●
●
●
●
●
●
●
●
●●
●
●●
●
●●●●
●●
●●
●
●
●●
●
●
●●●●●
●
●
●●
●●●●
●
●
●●●
●
●
●●
●●
●●
●
●●
●
●●●●●●●●●●●
●●●
●
●●
●
●●●
●
●●●
●
●
●●●●●●●
●
●●
●
●●
●●●●●
●
●
●●
●
●
●●
●●
●
●
●●
●
●●●
●●●
●
●
●
●
●
●
●
●
●
●●●
●
●●●●●●●●●●●●●●
●
●
●●●●●●
●●
●●
●
●●
●●●
●●●
●●●
●
●●
●
●
●●
●●
●●●●●
●●
●●●
●●●●
●
●
●●●
●
●●
●●●
●
●●●●●●
●
●
●●
●●●
●●●●
●
●
●●●
●
●
●
●●●●
●
●●●●●
●●
●
●
●●
●
●●●●
●●●
●
●●
●●
●●
●
●●●●●
●●●●●
●●
●
●●
●●
●●●●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●●
●●●●●●●
●
●●
●
●●●●
●●
●
●
●
●
●
●●
●
●
●
●●●●●
●
●●●
●
●
●
●●
●
●
●●●●●●
●●
●
●●●●
●●
●
●
●●●
●●●●
●
●
●●●●●
●●
●●
●
●
●●●●●●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●●●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●●●
●
●
●●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●●
●
●●
●
●
●
●●
●●
●●
●
●●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●●●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●●
●
●●
●
●
●●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●●
●●
●
●
●●●
●
●
●●●
●●●
●
●●●
●
●
●
●●●
●
●
●
●●
●
●
●●
●
●
●
●
●●
●
●●
●●
0
25
50
75
100
0.00 0.25 0.50 0.75 1.00 1.25
time (s)Current(pA)
Generating Events
β€’ What do we expect events from a given sequence to look like?
21
CTACGATT
Sample Current ●●●
●
●●●●●
●●
●
●
●●
●
●●●
●
●●
●
●●●●●
●
●
●
●●●●
●
●●
●●
●●●●●
●
●
●●●●
●
●●
●●
●
●
●
●●●
●●●
●●●●
●●
●
●
●●●●
●
●
●
●
●●●
●●
●●●●●●●
●
●
●
●
●
●●●
●
●
●●●●●●●●
●
●
●
●●●●
●
●●●
●●●
●
●●
●●●
●●
●●
●●●
●
●●
●●●●●●●●
●
●●●●
●
●●●●●●
●●●●
●●●
●
●●●●
●
●
●●
●
●
●●●●●●●
●●●
●●●
●●●
●
●●●
●●
●●
●
●
●●●
●●
●●●
●●●●●●●
●
●
●
●●●●
●
●
●●●●●
●
●●●●
●●
●
●
●
●
●
●
●
●
●●
●
●●
●
●●●●
●●
●●
●
●
●●
●
●
●●●●●
●
●
●●
●●●●
●
●
●●●
●
●
●●
●●
●●
●
●●
●
●●●●●●●●●●●
●●●
●
●●
●
●●●
●
●●●
●
●
●●●●●●●
●
●●
●
●●
●●●●●
●
●
●●
●
●
●●
●●
●
●
●●
●
●●●
●●●
●
●
●
●
●
●
●
●
●
●●●
●
●●●●●●●●●●●●●●
●
●
●●●●●●
●●
●●
●
●●
●●●
●●●
●●●
●
●●
●
●
●●
●●
●●●●●
●●
●●●
●●●●
●
●
●●●
●
●●
●●●
●
●●●●●●
●
●
●●
●●●
●●●●
●
●
●●●
●
●
●
●●●●
●
●●●●●
●●
●
●
●●
●
●●●●
●●●
●
●●
●●
●●
●
●●●●●
●●●●●
●●
●
●●
●●
●●●●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●●
●●●●●●●
●
●●
●
●●●●
●●
●
●
●
●
●
●●
●
●
●
●●●●●
●
●●●
●
●
●
●●
●
●
●●●●●●
●●
●
●●●●
●●
●
●
●●●
●●●●
●
●
●●●●●
●●
●●
●
●
●●●●●●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●●●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●●●
●
●
●●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●●
●
●●
●
●
●
●●
●●
●●
●
●●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●●●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●●
●
●●
●
●
●●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●●
●●
●
●
●●●
●
●
●●●
●●●
●
●●●
●
●
●
●●●
●
●
●
●●
●
●
●●
●
●
●
●
●●
●
●●
●●
●
●
●
●
●●
●
●
●
●
●
●●
●
●●●●
●●●
●
●
●●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●●●●●●
●
●
●
●
●●
●●●●
●
●●●
●●
●
●
●
●
●●●
●●
●
●
●
●●●
●
●
●
●●
●●
●
●
●
●
●●
●
●●
●●●●
●●●
●
●
●
●
●●
●●
●●●
●
●
●●
●●●
●
●
●
●●●
●●●●●
●
●
●
●
●
●
0
25
50
75
100
0.00 0.25 0.50 0.75 1.00 1.25
time (s)Current(pA)
Event Detection
22
●●●
●
●●●●●
●●
●
●
●●
●
●●●
●
●●
●
●●●●●
●
●
●
●●●●
●
●●
●●
●●●●●
●
●
●●●●
●
●●
●●
●
●
●
●●●
●●●
●●●●
●●
●
●
●●●●
●
●
●
●
●●●
●●
●●●●●●●
●
●
●
●
●
●●●
●
●
●●●●●●●●
●
●
●
●●●●
●
●●●
●●●
●
●●
●●●
●●
●●
●●●
●
●●
●●●●●●●●
●
●●●●
●
●●●●●●
●●●●
●●●
●
●●●●
●
●
●●
●
●
●●●●●●●
●●●
●●●
●●●
●
●●●
●●
●●
●
●
●●●
●●
●●●
●●●●●●●
●
●
●
●●●●
●
●
●●●●●
●
●●●●
●●
●
●
●
●
●
●
●
●
●●
●
●●
●
●●●●
●●
●●
●
●
●●
●
●
●●●●●
●
●
●●
●●●●
●
●
●●●
●
●
●●
●●
●●
●
●●
●
●●●●●●●●●●●
●●●
●
●●
●
●●●
●
●●●
●
●
●●●●●●●
●
●●
●
●●
●●●●●
●
●
●●
●
●
●●
●●
●
●
●●
●
●●●
●●●
●
●
●
●
●
●
●
●
●
●●●
●
●●●●●●●●●●●●●●
●
●
●●●●●●
●●
●●
●
●●
●●●
●●●
●●●
●
●●
●
●
●●
●●
●●●●●
●●
●●●
●●●●
●
●
●●●
●
●●
●●●
●
●●●●●●
●
●
●●
●●●
●●●●
●
●
●●●
●
●
●
●●●●
●
●●●●●
●●
●
●
●●
●
●●●●
●●●
●
●●
●●
●●
●
●●●●●
●●●●●
●●
●
●●
●●
●●●●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●●
●●●●●●●
●
●●
●
●●●●
●●
●
●
●
●
●
●●
●
●
●
●●●●●
●
●●●
●
●
●
●●
●
●
●●●●●●
●●
●
●●●●
●●
●
●
●●●
●●●●
●
●
●●●●●
●●
●●
●
●
●●●●●●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●●●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●●●
●
●
●●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●●
●
●●
●
●
●
●●
●●
●●
●
●●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●●
●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●●●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●●
●
●●
●
●
●●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●●
●●
●
●
●●●
●
●
●●●
●●●
●
●●●
●
●
●
●●●
●
●
●
●●
●
●
●●
●
●
●
●
●●
●
●●
●●
●
●
●
●
●●
●
●
●
●
●
●●
●
●●●●
●●●
●
●
●●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●●●●●●
●
●
●
●
●●
●●●●
●
●●●
●●
●
●
●
●
●●●
●●
●
●
●
●●●
●
●
●
●●
●●
●
●
●
●
●●
●
●●
●●●●
●●●
●
●
●
●
●●
●●
●●●
●
●
●●
●●●
●
●
●
●●●
●●●●●
●
●
●
●
●
●
0
25
50
75
100
0.00 0.25 0.50 0.75 1.00 1.25
time (s)Current(pA)
Event mean current (pA) current stdv duration (s)
1 60.3 0.7 0.521
2 40.6 1.0 0.112
3 52.2 2.0 0.356
4 54.1 1.2 0.291
5 49.5 1.5 0.141
A simple model
β€’ What is the probability of observing events E given a sequence S?	

β€’ Assuming for the moment there are no missing or extra events:
23
P(e1, e2, ..., en|s1, s2, ..., sn, β‡₯) =
nY
i=1
P(ei|si, Β΅si , si )
P(ei|k, Β΅k, k) = N(Β΅k, 2
k)
Complications
24
●
●
●●
●●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●●●
●
●
●
●
●
●
●●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●●
●●
●
●●●●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●●
●
●●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●●●
●
●●
●●
●
●
●●
●
●
●
●
●
●●
●●
●●
●
●
●●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●●
●●
●
●●
●●
●
●●
●
●
●
●
●
●
●●
●
●
●●●
●
●
●
●●
●
●
●
●●
●
●
●●●
●
●
●
●
●
●
●
●
●
●●
●●●
●
●●
●●
●
●
●●●
●
●
●●●●
●
●
●●
●
●
●
●
●
●●●
●
●
●
●●●
●
●
●
●
●
●
●
●
●●
●
●
●●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●●
●
●
●●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●●●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●●●
●
●
●
●
●●
●
●
●
●
●
●
●
●●●
●●
●
●
●
●
●
●●●
●
●
●
●
●
30
40
50
60
70
0.0 0.2 0.4 0.6
time (s)
Current(pA)
Complications
25
●
●
●●
●●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●●●
●
●
●
●
●
●
●●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●●
●●
●
●●●●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●●
●
●●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●●●
●
●●
●●
●
●
●●
●
●
●
●
●
●●
●●
●●
●
●
●●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●●
●●
●
●●
●●
●
●●
●
●
●
●
●
●
●●
●
●
●●●
●
●
●
●●
●
●
●
●●
●
●
●●●
●
●
●
●
●
●
●
●
●
●●
●●●
●
●●
●●
●
●
●●●
●
●
●●●●
●
●
●●
●
●
●
●
●
●●●
●
●
●
●●●
●
●
●
●
●
●
●
●
●●
●
●
●●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●●
●
●
●●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●●●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●●●
●
●
●
●
●●
●
●
●
●
●
●
●
●●●
●●
●
●
●
●
●
●●●
●
●
●
●
●
30
40
50
60
70
0.0 0.2 0.4 0.6
time (s)
Current(pA)
Is this an event ?
Complications
26
●
●
●●
●●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●●●
●
●
●
●
●
●
●●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●●
●●
●
●●●●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●●
●
●●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●●●
●
●●
●●
●
●
●●
●
●
●
●
●
●●
●●
●●
●
●
●●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●●
●●
●
●●
●●
●
●●
●
●
●
●
●
●
●●
●
●
●●●
●
●
●
●●
●
●
●
●●
●
●
●●●
●
●
●
●
●
●
●
●
●
●●
●●●
●
●●
●●
●
●
●●●
●
●
●●●●
●
●
●●
●
●
●
●
●
●●●
●
●
●
●●●
●
●
●
●
●
●
●
●
●●
●
●
●●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●●
●
●
●●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●●●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●●●
●
●
●
●
●●
●
●
●
●
●
●
●
●●●
●●
●
●
●
●
●
●●●
●
●
●
●
●
30
40
50
60
70
0.0 0.2 0.4 0.6
time (s)
Current(pA) One event or two ?
Nanopore HMM
β€’ must consider:	

β€’ over segmentation	

β€’ under segmentation	

β€’ missed short events	

β€’ HMM:	

β€’ M states: match event to 5-mers	

β€’ E states: extra obs. of an event	

β€’ K states: no event obs. for 5-mer
27
P(D|S)
P(⇑, e1, e2, ..., en|S, β‡₯) =
nY
i=1
P(ei|⇑i, Β΅si , si )P(⇑i|⇑i 1, S)
P(e1, e2, ..., en|S, β‡₯) =
X
⇑
P(⇑, e1, e2, ..., en|S, β‡₯)
Transition Probabilities
β€’ Probability of not observing an
event is a function of absolute
difference between (expected)
current
28
●
●
●●
●●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●●●
●
●
●
●
●
●
●●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●●
●●
●
●●●●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●●
●
●●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●●●
●
●●
●●
●
●
●●
●
●
●
●
●
●●
●●
●●
●
●
●●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●●
●●
●
●●
●●
●
●●
●
●
●
●
●
●
●●
●
●
●●●
●
●
●
●●
●
●
●
●●
●
●
●●●
●
●
●
●
●
●
●
●
●
●●
●●●
●
●●
●●
●
●
●●●
●
●
●●●●
●
●
●●
●
●
●
●
●
●●●
●
●
●
●●●
●
●
●
●
●
●
●
●
●●
●
●
●●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●●
●
●
●●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●●●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●●●
●
●
●
●
●●
●
●
●
●
●
●
●
●●●
●●
●
●
●
●
●
●●●
●
●
●
●
●
30
40
50
60
70
0.0 0.2 0.4 0.6
time (s)
Current(pA)
Transition Probabilities
β€’ Probability of not observing an
event is a function of absolute
difference between (expected)
current
29
●
●
●●
●●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●●●
●
●
●
●
●
●
●●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●
●
●
●●
●●
●
●●●●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●●
●
●●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●●●
●
●●
●●
●
●
●●
●
●
●
●
●
●●
●●
●●
●
●
●●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●●
●●
●
●●
●●
●
●●
●
●
●
●
●
●
●●
●
●
●●●
●
●
●
●●
●
●
●
●●
●
●
●●●
●
●
●
●
●
●
●
●
●
●●
●●●
●
●●
●●
●
●
●●●
●
●
●●●●
●
●
●●
●
●
●
●
●
●●●
●
●
●
●●●
●
●
●
●
●
●
●
●
●●
●
●
●●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●●
●
●
●●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●●
●
●
●
●
●
●
●
●
●●●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●●●●
●
●
●
●
●●
●
●
●
●
●
●
●
●●●
●●
●
●
●
●
●
●●●
●
●
●
●
●
30
40
50
60
70
0.0 0.2 0.4 0.6
time (s)
Current(pA)
Transition Probabilities
30
●
●
●
●
●
●
●
●
●
●
●
● ● ● ●
●
●
●
● ● ● ● ●
● ●
● ● ●
●
●
●
●
● ●
●
●
●
●
● ●
0.1
0.2
0.3
0.4
0.5
0 5 10 15 20
absolute difference (pA)
SkipProbability
Assembly Accuracy
31
0
5000
10000
0 5000 10000
5 mer count in reference
5mercountindraftassembly
0
3000
6000
9000
12000
TTTTT AAAAA TTTTG CAAAA CTTTT AAAAG CCCCC GGGGG
kmer
count
draft
reference
y
12000
A
C
B
D
0
5000
0 5000 10000
5 mer count in reference
5mercou
0
5000
10000
0 5000 10000
5 mer count in reference
5mercountinpolishedassembly
C D
Draft: 98.5% accuracy Polished: 99.5% accuracy
Assembly Accuracy
32
0
3000
6000
9000
12000
TTTTT AAAAA TTTTG CAAAA CTTTT AAAAG CCCCC GGGGG
kmer
count
draft
reference
9000
12000
B
D
0
0 5000 10000
5 mer count in reference
5mer
0
3000
TTTTT AAAAA TTTTG CAAAA CTTTT AAAAG CCCCC GGGGG
kmer
0
5000
10000
0 5000 10000
5 mer count in reference
5mercountinpolishedassembly
0
3000
6000
9000
12000
TTTTT AAAAA TTTTG CAAAA CTTTT AAAAG CCCCC GGGGG
kmer
count
polished
reference
C D
Aligning Events to a Reference
β€’ HMM can also align events to a reference genome



!
!
!
!
!
β€’ Read about it here:	

β€’ http://simpsonlab.github.io/2015/04/08/eventalign/
33
Planned Improvements
β€’ Model dwell duration to better call homopolymers	

!
!
!
β€’ SNP calling/genotyping	

!
!
!
β€’ Improve scalability to handle larger genomes	

β€’ Use signal data during error correction
34
CTAAAAAAAAAAAAGTACA
P(gi|D) =
P(D|gi)P(gi)
P(D)
Acknowledgements & Code
β€’ Collaborators:	

β€’ Nick Loman, Josh Quick (Birmingham)	

β€’ Jonathan Dursi (OICR)	

!
!
β€’ Code:	

β€’ github.com/jts/nanocorrect (error correction)	

β€’ github.com/jts/nanopolish (signal-level algorithms)	

β€’ github.com/jts/nanopore-paper-analysis (reproduce our paper)
35

Weitere Γ€hnliche Inhalte

Γ„hnlich wie 150514 jts london_calling

Metrics with Ganglia
Metrics with GangliaMetrics with Ganglia
Metrics with GangliaGareth Rushgrove
Β 
Towards reading genomic data using deep learning-driven NLP techniques
Towards reading genomic data using deep learning-driven NLP techniquesTowards reading genomic data using deep learning-driven NLP techniques
Towards reading genomic data using deep learning-driven NLP techniquesWesley De Neve
Β 
design-compiler.pdf
design-compiler.pdfdesign-compiler.pdf
design-compiler.pdfFrangoCamila
Β 
A Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert HodgesA Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert HodgesAltinity Ltd
Β 
Proportional-integral genetic algorithm controller for stability of TCP network
Proportional-integral genetic algorithm controller for stability of TCP network Proportional-integral genetic algorithm controller for stability of TCP network
Proportional-integral genetic algorithm controller for stability of TCP network IJECEIAES
Β 
Overlap Layout Consensus assembly
Overlap Layout Consensus assemblyOverlap Layout Consensus assembly
Overlap Layout Consensus assemblyZhuyi Xue
Β 
Python lec 1004_ch02_excercies
Python lec 1004_ch02_excerciesPython lec 1004_ch02_excercies
Python lec 1004_ch02_excerciesRamadan Babers, PhD
Β 
Automated Machine Learning via Sequential Uniform Designs
Automated Machine Learning via Sequential Uniform DesignsAutomated Machine Learning via Sequential Uniform Designs
Automated Machine Learning via Sequential Uniform DesignsAijun Zhang
Β 
Scalable and Adaptive Graph Querying with MapReduce
Scalable and Adaptive Graph Querying with MapReduceScalable and Adaptive Graph Querying with MapReduce
Scalable and Adaptive Graph Querying with MapReduceKyong-Ha Lee
Β 
PLNOG20 - PaweΕ‚ MaΕ‚achowski - Stress your DUT–wykorzystanie narzΔ™dzi open sou...
PLNOG20 - PaweΕ‚ MaΕ‚achowski - Stress your DUT–wykorzystanie narzΔ™dzi open sou...PLNOG20 - PaweΕ‚ MaΕ‚achowski - Stress your DUT–wykorzystanie narzΔ™dzi open sou...
PLNOG20 - PaweΕ‚ MaΕ‚achowski - Stress your DUT–wykorzystanie narzΔ™dzi open sou...PROIDEA
Β 
CLUSTBIGFIM-FREQUENT ITEMSET MINING OF BIG DATA USING PRE-PROCESSING BASED ON...
CLUSTBIGFIM-FREQUENT ITEMSET MINING OF BIG DATA USING PRE-PROCESSING BASED ON...CLUSTBIGFIM-FREQUENT ITEMSET MINING OF BIG DATA USING PRE-PROCESSING BASED ON...
CLUSTBIGFIM-FREQUENT ITEMSET MINING OF BIG DATA USING PRE-PROCESSING BASED ON...ijfcstjournal
Β 
Clustbigfim frequent itemset mining of
Clustbigfim frequent itemset mining ofClustbigfim frequent itemset mining of
Clustbigfim frequent itemset mining ofijfcstjournal
Β 
CARI2020: A CGM-Based Parallel Algorithm Using the Four-Russians Speedup for ...
CARI2020: A CGM-Based Parallel Algorithm Using the Four-Russians Speedup for ...CARI2020: A CGM-Based Parallel Algorithm Using the Four-Russians Speedup for ...
CARI2020: A CGM-Based Parallel Algorithm Using the Four-Russians Speedup for ...Mokhtar SELLAMI
Β 
Improving go-git performance
Improving go-git performanceImproving go-git performance
Improving go-git performancesource{d}
Β 
Langmead bosc2010 cloud-genomics
Langmead bosc2010 cloud-genomicsLangmead bosc2010 cloud-genomics
Langmead bosc2010 cloud-genomicsBOSC 2010
Β 

Γ„hnlich wie 150514 jts london_calling (20)

Tree building 2
Tree building 2Tree building 2
Tree building 2
Β 
Metrics with Ganglia
Metrics with GangliaMetrics with Ganglia
Metrics with Ganglia
Β 
Towards reading genomic data using deep learning-driven NLP techniques
Towards reading genomic data using deep learning-driven NLP techniquesTowards reading genomic data using deep learning-driven NLP techniques
Towards reading genomic data using deep learning-driven NLP techniques
Β 
design-compiler.pdf
design-compiler.pdfdesign-compiler.pdf
design-compiler.pdf
Β 
A Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert HodgesA Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert Hodges
Β 
Proportional-integral genetic algorithm controller for stability of TCP network
Proportional-integral genetic algorithm controller for stability of TCP network Proportional-integral genetic algorithm controller for stability of TCP network
Proportional-integral genetic algorithm controller for stability of TCP network
Β 
Overlap Layout Consensus assembly
Overlap Layout Consensus assemblyOverlap Layout Consensus assembly
Overlap Layout Consensus assembly
Β 
Python lec 1004_ch02_excercies
Python lec 1004_ch02_excerciesPython lec 1004_ch02_excercies
Python lec 1004_ch02_excercies
Β 
Automated Machine Learning via Sequential Uniform Designs
Automated Machine Learning via Sequential Uniform DesignsAutomated Machine Learning via Sequential Uniform Designs
Automated Machine Learning via Sequential Uniform Designs
Β 
Scalable and Adaptive Graph Querying with MapReduce
Scalable and Adaptive Graph Querying with MapReduceScalable and Adaptive Graph Querying with MapReduce
Scalable and Adaptive Graph Querying with MapReduce
Β 
Stress your DUT
Stress your DUTStress your DUT
Stress your DUT
Β 
PLNOG20 - PaweΕ‚ MaΕ‚achowski - Stress your DUT–wykorzystanie narzΔ™dzi open sou...
PLNOG20 - PaweΕ‚ MaΕ‚achowski - Stress your DUT–wykorzystanie narzΔ™dzi open sou...PLNOG20 - PaweΕ‚ MaΕ‚achowski - Stress your DUT–wykorzystanie narzΔ™dzi open sou...
PLNOG20 - PaweΕ‚ MaΕ‚achowski - Stress your DUT–wykorzystanie narzΔ™dzi open sou...
Β 
TiReX: Tiled Regular eXpression matching architecture
TiReX: Tiled Regular eXpression matching architectureTiReX: Tiled Regular eXpression matching architecture
TiReX: Tiled Regular eXpression matching architecture
Β 
Blast fasta 4
Blast fasta 4Blast fasta 4
Blast fasta 4
Β 
CLUSTBIGFIM-FREQUENT ITEMSET MINING OF BIG DATA USING PRE-PROCESSING BASED ON...
CLUSTBIGFIM-FREQUENT ITEMSET MINING OF BIG DATA USING PRE-PROCESSING BASED ON...CLUSTBIGFIM-FREQUENT ITEMSET MINING OF BIG DATA USING PRE-PROCESSING BASED ON...
CLUSTBIGFIM-FREQUENT ITEMSET MINING OF BIG DATA USING PRE-PROCESSING BASED ON...
Β 
Clustbigfim frequent itemset mining of
Clustbigfim frequent itemset mining ofClustbigfim frequent itemset mining of
Clustbigfim frequent itemset mining of
Β 
CARI2020: A CGM-Based Parallel Algorithm Using the Four-Russians Speedup for ...
CARI2020: A CGM-Based Parallel Algorithm Using the Four-Russians Speedup for ...CARI2020: A CGM-Based Parallel Algorithm Using the Four-Russians Speedup for ...
CARI2020: A CGM-Based Parallel Algorithm Using the Four-Russians Speedup for ...
Β 
Improving go-git performance
Improving go-git performanceImproving go-git performance
Improving go-git performance
Β 
Langmead bosc2010 cloud-genomics
Langmead bosc2010 cloud-genomicsLangmead bosc2010 cloud-genomics
Langmead bosc2010 cloud-genomics
Β 
Thiele
ThieleThiele
Thiele
Β 

KΓΌrzlich hochgeladen

Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhousejana861314
Β 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)Areesha Ahmad
Β 
Stunning βž₯8448380779β–» Call Girls In Panchshil Enclave Delhi NCR
Stunning βž₯8448380779β–» Call Girls In Panchshil Enclave Delhi NCRStunning βž₯8448380779β–» Call Girls In Panchshil Enclave Delhi NCR
Stunning βž₯8448380779β–» Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
Β 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPirithiRaju
Β 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsSΓ©rgio Sacani
Β 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...SΓ©rgio Sacani
Β 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksSΓ©rgio Sacani
Β 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...SΓ©rgio Sacani
Β 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
Β 
Broad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptxBroad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptxjana861314
Β 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)Areesha Ahmad
Β 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)Areesha Ahmad
Β 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfSumit Kumar yadav
Β 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfSumit Kumar yadav
Β 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptxanandsmhk
Β 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: β€œEg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: β€œEg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: β€œEg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: β€œEg...SΓ©rgio Sacani
Β 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfSumit Kumar yadav
Β 
DIFFERENCE IN BACK CROSS AND TEST CROSS
DIFFERENCE IN  BACK CROSS AND TEST CROSSDIFFERENCE IN  BACK CROSS AND TEST CROSS
DIFFERENCE IN BACK CROSS AND TEST CROSSLeenakshiTyagi
Β 

KΓΌrzlich hochgeladen (20)

Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhouse
Β 
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
Β 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
Β 
Stunning βž₯8448380779β–» Call Girls In Panchshil Enclave Delhi NCR
Stunning βž₯8448380779β–» Call Girls In Panchshil Enclave Delhi NCRStunning βž₯8448380779β–» Call Girls In Panchshil Enclave Delhi NCR
Stunning βž₯8448380779β–» Call Girls In Panchshil Enclave Delhi NCR
Β 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
Β 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Β 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Β 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
Β 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
Β 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Β 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
Β 
Broad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptxBroad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptx
Β 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)
Β 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
Β 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdf
Β 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdf
Β 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Β 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: β€œEg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: β€œEg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: β€œEg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: β€œEg...
Β 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
Β 
DIFFERENCE IN BACK CROSS AND TEST CROSS
DIFFERENCE IN  BACK CROSS AND TEST CROSSDIFFERENCE IN  BACK CROSS AND TEST CROSS
DIFFERENCE IN BACK CROSS AND TEST CROSS
Β 

150514 jts london_calling