The document discusses tree-based indexes for databases, including B-trees, Cache-Oblivious Lookahead Arrays (COLAs), LSM-trees, and fractal trees. It analyzes the performance tradeoffs among these structures for search, insertion, and space efficiency. A key point is that fractal trees can match B-tree search performance while inserting faster by exploiting the block size of storage. The document also proposes a block-size-dependent fanout for ε-B-trees to achieve optimal performance.
10. COLA
[Diagram: log2N levels of sorted arrays, each twice the size of the previous]
Binary search in one level: O(log2N); across all log2N levels, a search costs O((log2N)^2) block transfers.
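The insert path of a COLA can be sketched as follows. This is a toy illustration, not nessDB's or any real COLA implementation: level i holds one sorted array (the slide's doubling sizes), and an insert cascades upward by merging full levels.

```python
import bisect

class COLA:
    """Toy cache-oblivious lookahead array: each level holds one sorted
    array, double the size of the level below; inserts cascade by merging."""

    def __init__(self):
        self.levels = []  # levels[i] is a sorted list, or None if empty

    def insert(self, key):
        carry = [key]
        i = 0
        while True:
            if i == len(self.levels):
                self.levels.append(carry)   # grow a new, larger level
                return
            if self.levels[i] is None:
                self.levels[i] = carry      # empty slot: drop the run here
                return
            # Level is full: merge it into the carry and move up one level.
            # (A real COLA does a linear-time merge, giving the amortized
            # O((1/B)log2N) block transfers per insert.)
            carry = sorted(self.levels[i] + carry)
            self.levels[i] = None
            i += 1

    def search(self, key):
        # Without fractional cascading: binary search in each of the
        # log2N levels, so O((log2N)^2) comparisons in total.
        for arr in self.levels:
            if arr:
                j = bisect.bisect_left(arr, key)
                if j < len(arr) and arr[j] == key:
                    return True
        return False
```

Note how each key is re-merged every time its level cascades upward; that repeated copying is exactly the merge cost the later slides complain about.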
11. COLA (Using Fractional Cascading)
[Diagram: log2N levels of doubling arrays, with fractional-cascading pointers between levels]
● Search: O(log2N) block transfers.
● Insert: O((1/B)log2N) amortized block transfers.
● Data is stored in log2N arrays of sizes 2, 4, 8, 16,..
● Balanced Binary Search Tree
12. COLA Conclusions
● Search: O(log2N) block transfers (using fractional cascading).
● Insert: O((1/B)log2N) amortized block transfers.
● Data is stored in log2N arrays of sizes 2, 4, 8, 16,..
● Balanced Binary Search Tree
● Lookahead (prefetch) friendly; good for data-intensive workloads!
● BUT the bottom level keeps getting bigger and bigger, so merging into it becomes expensive.
13. COLA vs B-tree
● Search:
  (log2N) / (logBN) = log2B times slower than a B-tree (in theory)
● Insert:
  (logBN) / ((1/B)·log2N) = B/(log2B) times faster than a B-tree (in theory)
If B = 4096 (a 4KB block):
  COLA search is 12 times slower than a B-tree
  COLA insert is ~341 times faster than a B-tree
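The slide's numbers follow directly from the two ratios, assuming B = 4096 keys per block:

```python
import math

B = 4096  # keys per block (the slide's "4KB" read as 4096 entries)

# Search: COLA does log2(N) block transfers vs. the B-tree's logB(N),
# so the slowdown factor is log2(N) / logB(N) = log2(B).
search_slowdown = math.log2(B)

# Insert: B-tree pays logB(N) transfers vs. COLA's (1/B)*log2(N),
# so the speedup factor simplifies to B / log2(B).
insert_speedup = B / math.log2(B)

print(round(search_slowdown), round(insert_speedup))  # 12 341
```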
17. LSM-tree (Merging)
[Diagram: an in-memory buffer flushes to disk, and on-disk runs are repeatedly merged into larger runs, level by level]
A lot of I/O is wasted during merging!
Like a headless fly buzzing around aimlessly... Zzz...
25. ε-B-tree
[Chart: search speed vs. insert speed. A B-tree (ε=1) has fast search but slow inserts; an append-only file (AOF, ε=0) has fast inserts but slow search; ε=1/2 sits on the optimal curve, close to both.]
26. ε-B-tree
               insert           search
B-tree (ε=1)   O(logBN)         O(logBN)
ε=1/2          O((logBN)/√B)    O(logBN)
ε=0            O((logN)/B)      O(logN)
If we want optimal point queries plus very fast inserts, we should choose ε=1/2.
27. ε-B-tree
So, if the block size is B, the fanout should be √B (that is, B^ε with ε=1/2).
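A quick way to see why ε=1/2 keeps search asymptotically optimal: with fanout B^ε the tree height is logB(N)/ε, so halving ε only doubles the height (a constant factor). A small check, with B and N chosen just for illustration:

```python
import math

def height(N, B, eps):
    # An ε-B-tree has fanout B**eps, so its height is
    # log_{B^eps}(N) = logB(N) / eps.
    return math.log(N, B ** eps)

B, N = 4096, 2 ** 30          # illustrative values, not from the slides
h_btree = height(N, B, 1.0)   # ε=1: plain B-tree, height logB(N) = 2.5
h_half  = height(N, B, 0.5)   # ε=1/2: fanout √B = 64, height doubles to 5.0
print(h_btree, h_half)
```

Search stays O(logBN) for ε=1/2, while the per-node buffer space freed by the smaller fanout is what buys the √B-times-faster inserts in the table above.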
29. Cache Oblivious Data Structure
Question:
Reading a sequence of k consecutive blocks
at once is not much more expensive than
reading a single block. How to take advantage
of this feature?
30. Cache Oblivious Data Structure
My questions (translated from Chinese):
Q1:
With only 1MB of memory, how do you merge two 64MB sorted files into one sorted file?
Q2:
On most mechanical disks, reading several consecutive blocks costs about the same as reading a single block. How can we exploit this in Q1?
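One answer to Q1/Q2 can be sketched as a streaming two-way merge: keep only a bounded read buffer per input file, so memory stays at a few MB no matter how large the files are, and the big buffers turn disk traffic into long sequential reads (the Q2 advantage). The function name and buffer size here are illustrative, not from the slides:

```python
import heapq

def merge_sorted_files(path_a, path_b, out_path, buf_bytes=1 << 20):
    """Merge two sorted line-oriented files using ~1MB of buffer per
    stream. heapq.merge consumes both iterators lazily, so only a small
    window of each file is in memory at any time."""
    with open(path_a, buffering=buf_bytes) as fa, \
         open(path_b, buffering=buf_bytes) as fb, \
         open(out_path, "w", buffering=buf_bytes) as out:
        out.writelines(heapq.merge(fa, fb))
```

The same idea scales to the k-way merges an LSM-tree or COLA performs: one bounded buffer per input run, one sequential output stream.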
31. nessDB
You should agree that the VFS (the OS page cache) does better than a cache you build yourself!
https://github.com/shuttler/nessDB
32. nessDB
[Diagram: a single row of blocks]
Each block is a small, splittable tree.
33. nessDB, What's going on?
[Diagram: the single row of blocks grows into multiple levels of blocks]
From the line to the plane...
34. Thanks!
Most of the references are from:
Tokutek & MIT CSAIL & Stony Brook.
Drafted by BohuTANG using Google Drive, @2012/12/12