Lightning Talk on TokuMX from the MongoDB New York User Group, 7/16/2013.
TokuMX integrates fractal tree indexes into MongoDB. Fractal tree indexes are the technology behind the TokuDB storage engine for MySQL. The big benefits:
- reduced I/O utilization on writes for large data
- compression
- faster indexing, and more indexes
- no fragmentation (no compact)
2. What is TokuMX?
• TokuMX = MongoDB with improved storage
• Drop-in replacement for MongoDB v2.2 applications
(including sharding and replication)
o Same data model
o Same query language
o Drivers just work
• Open source
3. TokuMX Benefits
• Improved write performance on large data
o Much faster on data > RAM
o Significantly reduced I/O
o Don't need to keep working set in RAM
o Maintaining indexes uses fewer resources
• Compression! (up to 10x)
• No fragmentation (compact is deprecated!)
• Transactions (MVCC + multi-statement)
Bottom line: TokuMX makes MongoDB applications stable
and fast for large databases.
4. TokuMX comes from TokuDB
TokuMX v. 1.0:
• Performance
• Compression
• No fragmentation (no compact)
• Transactions
TokuDB v. 7.0:
• Performance
• Compression
• No fragmentation (no optimize table)
• Agility (online schema changes)
TokuMX uses the same core storage code as TokuDB for
MySQL. This base provides the general improvements.
5. TokuMX: How?
Built a storage core from the ground up, with Fractal Tree indexes, a
data structure designed with large data in mind.
Key attributes:
• Buffers are used to batch up writes, so write I/O is drastically reduced
• Data is stored in large blocks, so it compresses well
• Large blocks prevent fragmentation
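The first attribute, batching writes in a buffer, can be illustrated with a toy sketch. This is not TokuMX's storage code and not a real Fractal Tree: the class `BufferedIndex`, the helper `count_writes`, and the `leaf_writes` counter are all hypothetical names invented here, and a Python dict stands in for on-disk leaf data. It only shows why applying many pending writes in one flush means fewer write I/Os than writing each change through immediately.

```python
class BufferedIndex:
    """Toy index (hypothetical): an in-memory write buffer in front of 'disk'."""

    def __init__(self, buffer_capacity):
        self.buffer_capacity = buffer_capacity
        self.buffer = {}      # pending writes; the newest value for a key wins
        self.leaf = {}        # stands in for on-disk leaf data
        self.leaf_writes = 0  # counts simulated write I/Os

    def put(self, key, value):
        # Writes land in the buffer first; disk is touched only when it fills.
        self.buffer[key] = value
        if len(self.buffer) >= self.buffer_capacity:
            self.flush()

    def flush(self):
        # Apply every pending write in a single simulated I/O.
        if not self.buffer:
            return
        self.leaf.update(self.buffer)
        self.buffer.clear()
        self.leaf_writes += 1

    def get(self, key):
        # Reads must check the buffer before the leaf to see recent writes.
        if key in self.buffer:
            return self.buffer[key]
        return self.leaf.get(key)


def count_writes(n_keys, buffer_capacity):
    """Insert n_keys documents and return the number of simulated write I/Os."""
    index = BufferedIndex(buffer_capacity)
    for k in range(n_keys):
        index.put(k, f"doc-{k}")
    index.flush()  # push out anything still pending
    return index.leaf_writes


if __name__ == "__main__":
    print("unbuffered:", count_writes(100, 1))
    print("buffered:  ", count_writes(100, 25))
```

With a capacity of 1 every insert is its own write, while a capacity of 25 groups the same 100 inserts into a handful of flushes. The trade-off visible even in this sketch is that reads have to consult the buffer as well as the stored data.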
6. TokuDB Customer case
• One of the leading partners of the software,
movie and music industry.
• Traces copyright infringements and illegal file
sharing in P2P networks.
• Logs IP addresses while also connecting to each peer
and fetching data to match against the copyrighted
material as proof of copyright violation.
• Ingests large amounts of logging information each
hour for parallel processing and storage
7. TokuDB Customer case
Translation:
• Needed indexes on data > RAM to query the data in a
variety of ways
• Needed compression so that they could store larger
volumes of data on a single server
8. TokuDB Customer case
Results:
• Dramatic I/O reduction
• Over 3x compression
• Downtime periods previously spent running
"optimize table" eliminated
9. TokuDB: compression
Levels of compression depend on how naturally
compressible your data is.
[Figure labels: 9x compression; 7x compression]
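The point that compression depends on the data itself can be checked with a small, self-contained experiment. The sketch below uses Python's standard `zlib` module purely as a stand-in; the slides do not say which compressor TokuDB or TokuMX uses, and the sample log line and the helper `compression_ratio` are made up for illustration.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return original size divided by zlib-compressed size."""
    return len(data) / len(zlib.compress(data))


# Highly repetitive data, loosely resembling structured log records.
repetitive = b'{"status": "ok", "host": "web-01"}\n' * 10_000

# Random bytes of the same length: essentially incompressible.
random_bytes = os.urandom(len(repetitive))

if __name__ == "__main__":
    print(f"repetitive records: {compression_ratio(repetitive):.1f}x")
    print(f"random bytes:       {compression_ratio(random_bytes):.2f}x")
```

Repetitive, structured records compress by a large factor, while random bytes do not shrink at all, so the ratio achieved on a real database will fall somewhere in between depending on its contents.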