g414-inno
Embedded InnoDB, Voldemort, St8,
    and a Few Other Tidbits



        Sunny Gleason
What’s in this Preso

• What is InnoDB?
• Relation to MySQL & Other Products
• InnoDB Model
• g414-inno: a Java Access Library for InnoDB
What else is in this Preso

 • Creating a Voldemort Storage Engine with
   Embedded InnoDB
 • St8: A REST-based Storage Server
 • Faban Benchmark Results
What is InnoDB?
• High-Performance “guts” of MySQL
• Finely Tuned B-Tree Storage Engine
• MVCC Transactional Store a la Jim Gray
  (“Transaction Processing: Concepts and Techniques”)
• Available Stand-Alone as Embedded
  InnoDB (stagnant) or HailDB (drizzle)
Relation to MySQL
• One of many MySQL storage engines
• Transactional, in contrast to MyISAM
• Well-known, Bullet-Proof Backup, Failure &
  Recovery Modes
• Advanced Buffer Pool Management
  (adaptive hash index, tunable LRU)
• Online Backup Support (Xtrabackup / Hot)
Other Products
• Tokyo BDB, Oracle BDB & BDB-JE
• Schema-Free (No Structure / Data Types)
• Lower Concurrency (fewer writers)
• Performance Degradation in Larger DBs
• (TODO: quantify performance gap - in
  meantime, see Dynamo & Voldemort)
InnoDB Model (Logical)
• Database == Tablespace
• Tablespace has Table(s) and Log(s)
• Table has columns (rich datatypes)
• Tables have a PRIMARY clustered index
• Tables may have SECONDARY indexes
• Row == Tuple
• Tuples are stored / clustered by index sort
• Secondary index stores full Primary Key
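
The logical model above can be sketched with two sorted maps. This is a toy illustration only, not InnoDB code: the class name, the column names (key1, key2, val) and the use of `TreeMap` in place of a B-tree are all assumptions made for the example. It shows rows coming back in primary-key order and a secondary lookup going through the stored primary key to reach the full row.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy model of a clustered primary index plus one secondary index (hypothetical names). */
class ClusteredTableSketch {
    // Primary (clustered) index: primary key -> full row {key2, val}, kept in key order.
    private final TreeMap<Integer, String[]> primary = new TreeMap<>();
    // Secondary index: secondary key -> primary keys of the matching rows.
    private final TreeMap<String, List<Integer>> secondary = new TreeMap<>();

    void insert(int key1, String key2, String val) {
        if (primary.containsKey(key1)) {
            throw new IllegalArgumentException("duplicate primary key: " + key1);
        }
        primary.put(key1, new String[] {key2, val});
        secondary.computeIfAbsent(key2, k -> new ArrayList<>()).add(key1);
    }

    /** Secondary lookup: read primary keys from the secondary index, then fetch full rows. */
    List<String> findValsByKey2(String key2) {
        List<String> vals = new ArrayList<>();
        for (int pk : secondary.getOrDefault(key2, List.of())) {
            vals.add(primary.get(pk)[1]);
        }
        return vals;
    }

    /** A full scan returns keys in primary-key order, whatever the insertion order was. */
    List<Integer> scanKeys() {
        return new ArrayList<>(primary.keySet());
    }
}
```

Updates and deletes are left out to keep the sketch short; a real secondary index has to be maintained on every change to the row.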
InnoDB Model (Txns)
• Everything uses a Transaction
• Isolation Level: Serializable, Repeatable Read, Read
  Committed, Read Uncommitted
• Locks: Shared (Read-only), Exclusive (Read/Write)
• Cursors provide access to tables: Lookup by index,
  Iteration / Traversal
• Secondary index contains partial Tuples
• Secondary cursor can access primary (full tuple)
InnoDB Model (Physical)
• Tablespace is a collection of pages (16K)
• Pages organized as a B-Tree: infimum &
  supremum keys, pointers to children
• Pages contain row or index tuple data, or
  blob overflow data
• Page changes are written to the log first, then
  flushed to tablespace based on ‘sync’ policy
Physical Considerations
• New pages requested from OS in extend_size increments
• OS assigns space from file system / partition “free list”
• Temporal Locality (pages close together)
• Spatial Locality / Fragmentation from Updates
• Prefer “narrow” rows / indexes: faster scan, keeps
  working set in-memory
• Secondary “covering” indexes can save primary index
  access
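
A rough back-of-the-envelope sketch of why narrow rows help, assuming the 16K page size from the previous slide and ignoring page headers, per-row overhead and fill factor (so the numbers are upper bounds, not InnoDB's actual layout). The class and method names are made up for the example.

```java
/** Simplified rows-per-page arithmetic; real pages lose space to headers and row overhead. */
class PageFitSketch {
    static final int PAGE_BYTES = 16 * 1024; // 16K page size from the slides

    /** Upper bound on the number of rows of the given width that fit in one page. */
    static int rowsPerPage(int rowBytes) {
        if (rowBytes <= 0 || rowBytes > PAGE_BYTES) {
            throw new IllegalArgumentException("row width must be in 1.." + PAGE_BYTES);
        }
        return PAGE_BYTES / rowBytes;
    }

    /** Pages needed to hold rowCount rows under the same simplification. */
    static long pagesNeeded(long rowCount, int rowBytes) {
        int perPage = rowsPerPage(rowBytes);
        return (rowCount + perPage - 1) / perPage; // ceiling division
    }
}
```

Under these assumptions, one million 64-byte rows need 3,907 pages (about 61 MB), while one million 4,100-byte rows need 333,334 pages (about 5.1 GB), which is the difference between a working set that fits in the buffer pool and one that may not.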
InnoDB Database Files
[Figure: diagram of InnoDB database files; labels not recoverable from extraction]
        *source: http://www.mysqlconf.com/mysql2009/public/schedule/detail/7052
InnoDB Tablespaces
[Figure: diagram of InnoDB tablespace layout; labels not recoverable from extraction]
       *source: http://www.mysqlconf.com/mysql2009/public/schedule/detail/7052
InnoDB Logs
[Figure: diagram of InnoDB logs; labels not recoverable from extraction]
      *source: http://www.mysqlconf.com/mysql2009/public/schedule/detail/7052
InnoDB Indexes - Primary
• Data rows are stored in the B-tree leaf
  nodes of a clustered index
• B-tree is organized by primary key or
  non-null unique key of table, if defined;
  else, an internal column with 6-byte
  ROW_ID is added
[Figure: clustered (primary key) index, labeled “Primary Index”]
*source: http://www.mysqlconf.com/mysql2009/public/schedule/detail/7052
InnoDB Indexes - Secondary
• Secondary index B-tree leaf nodes contain,
  for each key value, the primary keys of the
  corresponding rows, used to access the
  clustering index to obtain the data
[Figure: clustered (primary key) index with B-tree leaf nodes containing data; secondary index with B-tree leaf nodes containing PKs]
        *source: http://www.mysqlconf.com/mysql2009/public/schedule/detail/7052
How can we use InnoDB?
 • Download Embedded InnoDB or HailDB
 • Use C-API for access to InnoDB tables
 • Innostore: Erlang library for InnoDB access
   (from Basho’s Riak NoSQL project)
 • g414-inno: Open-Source Java access library
   for Embedded InnoDB
g414-inno Foundations
• Uses JNA (Java Native Access): Like JNI, but
  doesn’t provoke (as much) insanity
• JNAerate: creates thin Java Class wrapper
  from a C-based header file (innodb.h)
• But, complex C APIs are super ugly in Java
• Need to clean that up a bit...
g414-inno Library
• Provides a more Object-Oriented API to
  mask all of the JNA “Pointer” madness
• Transaction Objects, Cursors, Table Builder,
  Tuple Builder, Datatype Validation
• Java Enum Types for ‘int’ enums in C API
• inTransaction() templates (like Spring, JDBI)
• Contains sanity checks to prevent common
  errors (mostly C API order of operations)
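
The inTransaction() template idea can be sketched as follows. This is not the g414-inno API: `SketchTransaction`, its `log` field and the method signature are hypothetical stand-ins chosen so the example runs on its own. The point is only the shape of the pattern: the caller supplies the work, and the template decides between commit and rollback.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical stand-in for a transaction handle; records calls instead of touching a database. */
class SketchTransaction {
    final List<String> log = new ArrayList<>();
    void commit() { log.add("commit"); }
    void rollback() { log.add("rollback"); }
}

class TransactionTemplateSketch {
    /** Runs work inside a transaction: commit on success, rollback on any runtime exception. */
    static <T> T inTransaction(SketchTransaction txn, Function<SketchTransaction, T> work) {
        try {
            T result = work.apply(txn);
            txn.commit();
            return result;
        } catch (RuntimeException e) {
            txn.rollback();
            throw e; // the caller still sees the original failure
        }
    }
}
```

A caller would write something like `inTransaction(txn, t -> { /* cursor work */ return value; })` and never call commit or rollback directly, which removes one class of order-of-operations mistakes.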
Use Case: Voldemort
• Voldemort: High-Performance Key-Value
  Store (Amazon Dynamo clone)
• Nokia: good results with Voldemort on
  MySQL with InnoDB
• Typical features of DB (network
  connectivity, SQL language) not really
  necessary
• Thought: why bother with DB layer?
  The g414-inno project is born ...
Voldemort Storage Engines
• Trivial to integrate new persistence
  mechanisms with Voldemort
• 2 Classes: Config & Storage Engine
• Trivial InnoDB Table:
  key_     VARBINARY(200) NOT NULL
  version_ VARBINARY(200) NOT NULL
  value_   BLOB
  PRIMARY KEY (key_, version_)



• 3 Operations: put(k, v), get(k), delete(k)
• Complication: k is Versioned<Key>
V Storage Engine: put
•   put(byte[] key, byte[] version, byte[] value)


• Start transaction, open table cursor
• Create search tuple for key
• Cursor.find(key)
• Foreach row matching key
     if row.version is below (older than) the new version, delete row
     if row.version is above (newer than) the new version, throw exception
• Cursor.insert(key, version, value)
V Storage Engine: get
•   get(byte[] key, byte[] version)


• Start transaction, open table cursor
• Create search tuple for key
• Cursor.find(key)
• Foreach row matching key
     add to results
• Return results
V Storage Engine: delete
 •   delete(byte[] key)


 • Start transaction, open table cursor
 • Create search tuple for key
 • Cursor.find(key)
 • Foreach row matching key
      delete row
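
The put, get and delete steps on the three slides above can be sketched against an in-memory map. This is a simplification under loud assumptions, not the actual storage engine: versions are plain `long` values rather than Voldemort's versioned keys (so concurrent versions cannot occur), a nested `TreeMap` stands in for the (key_, version_) table and its cursor, and a put with an equal version is treated as obsolete, which the slides do not specify.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** In-memory stand-in for the (key_, version_) table; hypothetical, for illustration only. */
class VersionedStoreSketch {
    // key -> (version -> value); the nested sorted map mimics PRIMARY KEY (key_, version_).
    private final Map<String, TreeMap<Long, String>> rows = new TreeMap<>();

    void put(String key, long version, String value) {
        TreeMap<Long, String> versions = rows.computeIfAbsent(key, k -> new TreeMap<>());
        // A stored version that is the same or newer makes this write obsolete (assumption for ties).
        if (!versions.isEmpty() && versions.lastKey() >= version) {
            throw new IllegalStateException("obsolete version " + version + " for key " + key);
        }
        versions.clear();             // delete rows whose version is below the new one
        versions.put(version, value); // insert the new row
    }

    /** Returns the values of every row matching the key (empty if there are none). */
    List<String> get(String key) {
        TreeMap<Long, String> versions = rows.get(key);
        return versions == null ? new ArrayList<>() : new ArrayList<>(versions.values());
    }

    /** Deletes every row matching the key; returns true if anything was removed. */
    boolean delete(String key) {
        return rows.remove(key) != null;
    }
}
```

In the real engine each of these methods would run inside a transaction and walk a table cursor positioned by a search tuple, as the slides describe.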
V Storage Engine: TODO

 • Perform Benchmarks (in EC2, local)
 • Tuning / Optimization
 • Clarify licenses (GPLv2 + Apache == ouch)
 • Organize & streamline distribution
St8
• Simple, Open Source REST-based Storage Server
• Wraps InnoDB with thin “but pleasant” HTTP API
• Custom Tables using JSON table definitions
• Natural, JSON-based access to tables: CRUD, Index-
  based Query & Iteration
• Under the hood: Jetty, Jersey, Guice, Jackson, g414-
  inno, Embedded InnoDB
St8 Table Def
{
"columns":[
   {"name":"key1","type":"INT","length":4},
   {"name":"key2","type":"VARCHAR","length":50},
   {"name":"val","type":"BLOB","length":0}
],
"indexes":[
   {
     "name":"PRIMARY",
     "clustered":true,"unique":true,
     "indexColumns":[{"name":"key1"}]
   }, {
     "name":"key2",
     "clustered":false,"unique":false,
     "indexColumns":[{"name":"key2"}]
   }
]
}
St8 Interface
• Operations for Table Management: create,
  describe, delete, truncate
• Operations for Data Management: Create,
  Retrieve, Update, Delete
• Influences g414-inno design: template
  methods for inTransaction(), insert, update,
  insertOrUpdate, delete, load
• Coming Soon: Query & Iteration APIs
St8: Sample Requests
SIMPLE GET:

curl "http://localhost:8080/d/atable;key1=123"

INSERT:

curl -X PUT "http://localhost:8080/d/atable;key1=123;key2=ABC;val=AVERYLONGDATA"

UPDATE:

curl -X POST "http://localhost:8080/d/atable;key1=123;key2=CDE;val=NEWDATA"

DELETE:

curl -X DELETE "http://localhost:8080/d/atable;key1=123"
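
For Java clients, the URL shape used in the requests above can be assembled with a small helper. This is a sketch inferred only from the sample requests (the `/d/` prefix and `;name=value` parameters); the class and method names are hypothetical, and values are assumed to be URL-safe already, so no escaping is done.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Builds St8-style row URLs of the form <base>/d/<table>;col=value;... (shape inferred from the samples). */
class St8UrlSketch {
    static String rowUrl(String base, String table, Map<String, String> columns) {
        StringBuilder sb = new StringBuilder(base).append("/d/").append(table);
        // Column order in the URL follows the map's iteration order.
        for (Map.Entry<String, String> e : columns.entrySet()) {
            sb.append(';').append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    /** Reproduces the URL from the INSERT sample request. */
    static String sampleInsertUrl() {
        Map<String, String> cols = new LinkedHashMap<>();
        cols.put("key1", "123");
        cols.put("key2", "ABC");
        cols.put("val", "AVERYLONGDATA");
        return rowUrl("http://localhost:8080", "atable", cols);
    }
}
```

The HTTP method (GET, PUT, POST or DELETE) is chosen separately by whichever HTTP client sends the request.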
g414-inno: Faban Benchmark
• Row: 4-byte Key, 4096-byte value
• Insert Sequential, Random
• Single disk, 3-disk RAID 0, SSD
• TODO: Concurrent Benchmarks, Mixed
  Read/Write
Benchmark Results
[Chart: Embedded InnoDB Latency (ms) for InsertSeq, InsertRnd and SelectRnd; series: Single Disk (OS X 1), 3-Disk RAID 0 (OS X 1), SSD (OS X 2)]

Single-Threaded Benchmarks (latency, ms)   InsertSeq   InsertRnd   SelectRnd
Single Disk (OS X 1)                             9.0         9.3          16
3-Disk RAID 0 (OS X 1)                          0.47         1.4         5.2
SSD (OS X 2)                                    0.51         1.2        0.71
Next Steps / Future Work
 • Finish St8: Queries & Iteration, Benchmark
 • Package / Qualify Voldemort Storage Engine
 • Integrate with Xtrabackup (hot backup)
 • Integrate with Sqoop (hadoop export)
 • Explore more advanced App-Level
   Replication Support
Questions?
• Thank you for listening!
References / More Info
• Embedded InnoDB, HailDB (drizzle)
• InnoDB Performance
• GitHub: g414-inno, st8, voldemort, xfaban
• Java Native Access (JNA)
• Tokyo BDB, Oracle BDB & BDB-JE
• Amazon Dynamo; Voldemort Project

Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Digital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentDigital Tools & AI in Career Development
Digital Tools & AI in Career Development
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architectures
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
JET Technology Labs White Paper for Virtualized Security and Encryption Techn...
JET Technology Labs White Paper for Virtualized Security and Encryption Techn...JET Technology Labs White Paper for Virtualized Security and Encryption Techn...
JET Technology Labs White Paper for Virtualized Security and Encryption Techn...
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 

InnoDB Magic

  • 1. g414-inno: Embedded InnoDB, Voldemort, St8, and a Few Other Tidbits • Sunny Gleason
  • 2. What’s in this Preso • What is InnoDB? • Relation to MySQL & Other Products • InnoDB Model • g414-inno: a Java Access Library for InnoDB
  • 3. What else is in this Preso • Creating a Voldemort Storage Engine with Embedded InnoDB • St8: A REST-based Storage Server • Faban Benchmark Results
  • 4. What is InnoDB? • High-Performance “guts” of MySQL • Finely Tuned B-Tree Storage Engine • MVCC Transactional Store a la Jim Gray (“Transaction Processing: Concepts and Techniques”) • Available Stand-Alone as Embedded InnoDB (stagnant) or HailDB (Drizzle)
  • 5. Relation to MySQL • One of many MySQL storage engines • Transactional, in contrast to MyISAM • Well-known, Bullet-Proof Backup, Failure & Recovery Modes • Advanced Buffer Pool Management (adaptive hash index, tunable LRU) • Online Backup Support (XtraBackup / Hot)
  • 6. Other Products • Tokyo BDB, Oracle BDB & BDB-JE • Schema-Free (No Structure / Data Types) • Lower Concurrency (fewer writers) • Performance Degradation in Larger DBs • (TODO: quantify performance gap; in the meantime, see Dynamo & Voldemort)
  • 7. InnoDB Model (Logical) • Database == Tablespace • Tablespace has Table(s) and Log(s) • Table has columns (rich datatypes) • Tables have a PRIMARY clustered index • Tables may have SECONDARY indexes • Row == Tuple • Tuples are stored / clustered by index sort • Secondary index stores full Primary Key
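
The logical model on slide 7 can be pictured with a toy in-memory structure. This is a minimal sketch, not InnoDB code: the class `ClusteredTableSketch`, its `Row` type, and the column names are hypothetical, and two sorted maps stand in for the B-trees. It shows the two properties the slide names: rows live in the primary index in key order, and the secondary index stores only the primary key, so a secondary lookup needs a second hop to fetch the full row.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model of a clustered table; all names here are illustrative assumptions.
class ClusteredTableSketch {
    static final class Row {
        final int id;
        final String name;
        final String payload;

        Row(int id, String name, String payload) {
            this.id = id;
            this.name = name;
            this.payload = payload;
        }
    }

    // Primary (clustered) index: full rows, sorted by primary key.
    private final TreeMap<Integer, Row> primary = new TreeMap<>();
    // Secondary index on "name": stores primary keys only, never the row itself.
    private final TreeMap<String, TreeSet<Integer>> secondary = new TreeMap<>();

    void insert(int id, String name, String payload) {
        Row old = primary.put(id, new Row(id, name, payload));
        if (old != null) {
            // Replacing a row: drop its stale secondary entry first.
            TreeSet<Integer> ids = secondary.get(old.name);
            ids.remove(id);
            if (ids.isEmpty()) {
                secondary.remove(old.name);
            }
        }
        secondary.computeIfAbsent(name, k -> new TreeSet<>()).add(id);
    }

    // Full scan returns rows in primary-key order, as in a clustered index.
    List<Row> scan() {
        return new ArrayList<>(primary.values());
    }

    // Secondary lookup: find primary keys, then fetch each full row.
    List<Row> findByName(String name) {
        List<Row> out = new ArrayList<>();
        TreeSet<Integer> ids = secondary.get(name);
        if (ids != null) {
            for (int id : ids) {
                out.add(primary.get(id));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        ClusteredTableSketch t = new ClusteredTableSketch();
        t.insert(3, "bob", "c");
        t.insert(1, "amy", "a");
        t.insert(2, "bob", "b");
        for (Row r : t.findByName("bob")) {
            System.out.println(r.id + " " + r.payload);
        }
    }
}
```

A wide primary key makes every secondary entry wider as well, which is one reason the later slide on physical considerations prefers narrow rows and indexes.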
  • 8. InnoDB Model (Txns) • Everything uses a Transaction • Isolation Level: Serializable, Repeatable Read, Read Committed, Read Uncommitted • Locks: Shared (Read-only), Exclusive (Read/Write) • Cursors provide access to tables: Lookup by index, Iteration / Traversal • Secondary index contains partial Tuples • Secondary cursor can access primary (full tuple)
  • 9. InnoDB Model (Physical) • Tablespace is a collection of pages (16K) • Pages organized as a B-Tree: infimum & supremum keys, pointers to children • Pages contain row or index tuple data, or blob overflow data • Pages written to log first and flushed to tablespace based on ‘sync’ policy
  • 10. Physical Considerations • New pages requested from OS in extend_size increments • OS Assigns space from file system / partition “free list” • Temporal Locality (pages close together) • Spatial Locality / Fragmentation from Updates • Prefer “narrow” rows / indexes: faster scan, keeps working set in-memory • Secondary “covering” indexes can save primary index access
  • 11. InnoDB Database Files • [Diagram; labels not recoverable from the extracted text] *source: http://www.mysqlconf.com/mysql2009/public/schedule/detail/7052
  • 12. InnoDB Tablespaces • [Diagram; labels not recoverable from the extracted text] *source: http://www.mysqlconf.com/mysql2009/public/schedule/detail/7052
  • 13. InnoDB Rows • [Diagram; labels not recoverable from the extracted text] *source: http://www.mysqlconf.com/mysql2009/public/schedule/detail/7052
  • 14. InnoDB Indexes - Primary • Data rows are stored in the B-tree leaf nodes of a clustered index • B-tree is organized by primary key or non-null unique key of table, if defined; else, an internal column with 6-byte ROWID is added • [Diagram; labels not recoverable from the extracted text] *source: http://www.mysqlconf.com/mysql2009/public/schedule/detail/7052
  • 15. InnoDB Indexes - Secondary • Secondary index B-tree leaf nodes contain, for each key value, the primary keys of the corresponding rows, used to access clustering index to obtain the data • [Diagram; labels not recoverable from the extracted text] *source: http://www.mysqlconf.com/mysql2009/public/schedule/detail/7052
  • 16. How can we use InnoDB? • Download Embedded InnoDB or HailDB • Use C-API for access to InnoDB tables • Innostore: Erlang library for InnoDB access (from Basho’s Riak NoSQL project) • g414-inno: Open-Source Java access library for Embedded InnoDB
  • 17. g414-inno Foundations • Uses JNA (Java Native Access): Like JNI, but doesn’t provoke (as much) insanity • JNAerator: creates thin Java Class wrapper from a C-based header file (innodb.h) • But, complex C APIs are super ugly in Java • Need to clean that up a bit...
  • 18. g414-inno Library • Provides a more Object-Oriented API to mask all of the JNA “Pointer” madness • Transaction Objects, Cursors, Table Builder, Tuple Builder, Datatype Validation • Java Enum Types for ‘int’ enums in C API • inTransaction() templates (like Spring, JDBI) • Contains sanity checks to prevent common errors (mostly C API order of operations)
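
The inTransaction() template idea from slide 18 can be sketched as follows. This is a hedged illustration of the general pattern (as in Spring or JDBI), not the g414-inno API: `TxTemplateSketch`, `FakeTransaction`, and the `last` field are hypothetical names, and the fake transaction only records its state instead of calling into InnoDB. The point is that the caller supplies the work and the template owns the begin / commit / rollback ordering.

```java
import java.util.function.Function;

// Sketch of a transaction template; every name here is an illustrative assumption.
class TxTemplateSketch {
    // Stand-in for a real transaction handle; it only tracks its own state.
    static final class FakeTransaction {
        String state = "OPEN";

        void commit() {
            state = "COMMITTED";
        }

        void rollback() {
            state = "ROLLED_BACK";
        }
    }

    // Exposed only so the outcome of the most recent call can be inspected.
    static FakeTransaction last;

    // Runs the callback inside a transaction: commit on success, rollback on failure.
    static <T> T inTransaction(Function<FakeTransaction, T> work) {
        FakeTransaction txn = new FakeTransaction();
        last = txn;
        try {
            T result = work.apply(txn);
            txn.commit();
            return result;
        } catch (RuntimeException e) {
            txn.rollback();
            throw e;
        }
    }

    public static void main(String[] args) {
        int answer = inTransaction(txn -> 6 * 7);
        System.out.println(answer + " " + last.state);
    }
}
```

Because the template is the only place that commits or rolls back, application code cannot get the order of operations wrong, which matches the slide's goal of preventing common C API sequencing errors.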
  • 19. Use Case: Voldemort • Voldemort: High-Performance Key-Value Store (Amazon Dynamo clone) • Nokia: good results with Voldemort on MySQL with InnoDB • Typical features of DB (network connectivity, SQL language) not really necessary • Thought: why bother with DB layer? The g414-inno project is born ...
  • 20. Voldemort Storage Engines • Trivial to integrate new persistence mechanisms with Voldemort • 2 Classes: Config & Storage Engine • Trivial InnoDB Table: key_ VARBINARY(200) NOT NULL, version_ VARBINARY(200) NOT NULL, value_ BLOB, PRIMARY KEY (key_, version_) • 3 Operations: put(k, v), get(k), delete(k) • Complication: k is Versioned<Key>
  • 21. V Storage Engine: put • put(byte[] key, byte[] version, byte[] value) • Start transaction, open table cursor • Create search tuple for key • Cursor.find(key) • For each row matching key: if row.version is below, delete row; if row.version is above, throw exception • Cursor.insert(key, version, value)
  • 22. V Storage Engine: get • get(byte[] key, byte[] version) • Start transaction, open table cursor • Create search tuple for key • Cursor.find(key) • For each row matching key: add to results • Return results
  • 23. V Storage Engine: delete • delete(byte[] key) • Start transaction, open table cursor • Create search tuple for key • Cursor.find(key) • For each row matching key: delete row
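
The put / get / delete steps on slides 21 to 23 can be sketched against an in-memory stand-in for the (key_, version_) table. This is a simplified model under stated assumptions, not the actual storage engine: `VersionedStoreSketch` is a hypothetical class, keys and values are strings rather than byte arrays, and the version is a plain `long` instead of a Voldemort vector clock, so "below" and "above" reduce to numeric comparison. An equal version is treated as "above" here, which the slides do not specify.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory model of the versioned key-value table; names are illustrative assumptions.
class VersionedStoreSketch {
    // key -> (version -> value), mirroring PRIMARY KEY (key_, version_).
    private final Map<String, TreeMap<Long, String>> rows = new HashMap<>();

    // put: reject if any stored version is at or above the new one,
    // otherwise delete the older rows and insert the new row.
    void put(String key, long version, String value) {
        TreeMap<Long, String> versions = rows.computeIfAbsent(key, k -> new TreeMap<>());
        // Checking before deleting mimics a transaction that rolls back on the exception.
        if (!versions.isEmpty() && versions.lastKey() >= version) {
            throw new IllegalStateException("obsolete version " + version + " for key " + key);
        }
        versions.clear(); // every remaining row is below the new version
        versions.put(version, value);
    }

    // get: collect the value of every row matching the key.
    List<String> get(String key) {
        List<String> results = new ArrayList<>();
        TreeMap<Long, String> versions = rows.get(key);
        if (versions != null) {
            results.addAll(versions.values());
        }
        return results;
    }

    // delete: remove every row matching the key; true if anything was removed.
    boolean delete(String key) {
        return rows.remove(key) != null;
    }

    public static void main(String[] args) {
        VersionedStoreSketch store = new VersionedStoreSketch();
        store.put("user:1", 1L, "first");
        store.put("user:1", 2L, "second");
        System.out.println(store.get("user:1"));
    }
}
```

With totally ordered versions at most one row per key survives a put. Vector clocks are only partially ordered, so a real engine may keep several concurrent versions, which is why get returns a list.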
  • 24. V Storage Engine: TODO • Perform Benchmarks (in EC2, local) • Tuning / Optimization • Clarify licenses (GPLv2 + Apache == ouch) • Organize & streamline distribution
  • 25. St8 • Simple, Open Source REST-based Storage Server • Wraps InnoDB with thin “but pleasant” HTTP API • Custom Tables using JSON table definitions • Natural, JSON-based access to tables: CRUD, Index-based Query & Iteration • Under the hood: Jetty, Jersey, Guice, Jackson, g414-inno, Embedded InnoDB
  • 26. St8 Table Def { "columns":[ {"name":"key1","type":"INT","length":4}, {"name":"key2","type":"VARCHAR","length":50}, {"name":"val","type":"BLOB","length":0} ], "indexes":[ { "name":"PRIMARY", "clustered":true,"unique":true, "indexColumns":[{"name":"key1"}] }, { "name":"key2", "clustered":false,"unique":false, "indexColumns":[{"name":"key2"}] } ] }
  • 27. St8 Interface • Operations for Table Management: create, describe, delete, truncate • Operations for Data Management: Create, Retrieve, Update, Delete • Influences g414-inno design: template methods for inTransaction(), insert, update, insertOrUpdate, delete, load • Coming Soon: Query & Iteration APIs
  • 28. St8: Sample Requests • SIMPLE GET: curl "http://localhost:8080/d/atable;key1=123" • INSERT: curl -X PUT "http://localhost:8080/d/atable;key1=123;key2=ABC;val=AVERYLONGDATA" • UPDATE: curl -X POST "http://localhost:8080/d/atable;key1=123;key2=CDE;val=NEWDATA" • DELETE: curl -X DELETE "http://localhost:8080/d/atable;key1=123"
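
The sample requests on slide 28 carry the table name and column values in a single path segment, separated by semicolons. A minimal sketch of how such a segment could be split is shown below. It is an assumption about the URL shape only, not St8's actual request handling: `MatrixPathSketch` and its methods are hypothetical, and no URL decoding or type conversion is attempted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Splits a segment like "atable;key1=123;key2=ABC"; names are illustrative assumptions.
class MatrixPathSketch {
    // Everything before the first ';' is taken as the table name.
    static String tableName(String segment) {
        int semi = segment.indexOf(';');
        return semi < 0 ? segment : segment.substring(0, semi);
    }

    // Each "name=value" part after the table name becomes one map entry, in order.
    static Map<String, String> params(String segment) {
        Map<String, String> out = new LinkedHashMap<>();
        String[] parts = segment.split(";");
        for (int i = 1; i < parts.length; i++) {
            int eq = parts[i].indexOf('=');
            if (eq <= 0) {
                continue; // skip parts with no name or no '='
            }
            out.put(parts[i].substring(0, eq), parts[i].substring(eq + 1));
        }
        return out;
    }

    public static void main(String[] args) {
        String segment = "atable;key1=123;key2=ABC;val=AVERYLONGDATA";
        System.out.println(tableName(segment) + " " + params(segment));
    }
}
```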
  • 29. g414-inno: Faban Benchmark • Row: 4-byte Key, 4096-byte value • Insert Sequential, Random • Single disk, 3-disk RAID 0, SSD • TODO: Concurrent Benchmarks, Mixed Read/Write
  • 30. Benchmark Results • Embedded InnoDB Latency (ms), Single-Threaded Benchmarks
        Configuration             InsertSeq   InsertRnd   SelectRnd
        Single Disk (OS X 1)      9.0         9.3         16
        3-Disk Raid 0 (OS X 1)    0.47        1.4         5.2
        SSD (OS X 2)              0.51        1.2         0.71
  • 31. Next Steps / Future Work • Finish St8: Queries & Iteration, Benchmark • Package / Qualify Voldemort Storage Engine • Integrate with XtraBackup (hot backup) • Integrate with Sqoop (Hadoop export) • Explore more advanced App-Level Replication Support
  • 32. Questions? • Thank you for listening!
  • 33. References / More Info • Embedded InnoDB, HailDB (Drizzle) • InnoDB Performance • GitHub: g414-inno, st8, voldemort, xfaban • Java Native Access (JNA) • Tokyo BDB, Oracle BDB & BDB-JE • Amazon Dynamo; Voldemort Project