Accelerating NoSQL
   Running Voldemort on HailDB


         Sunny Gleason
         March 11, 2011
whoami
• Sunny Gleason, human
• passion: distributed systems engineering
• previous...
   Ning : custom social networks
   Amazon.com : infra & web services
• now...
   building cloud infrastructure
whereami

• twitter : twitter.com/sunnygleason
• github : github.com/sunnygleason
• linkedin : linkedin.com/in/sunnygleason
what’s in this presentation?

  • NoSQL Roundup
  • Voldemort who?
  • HailDB wha?
  • Results & Next Steps
  • Special Bonus Material
NoSQL

• “Not Only” SQL
• What’s the point?
• Proponent: “reaching next level of scale”
• Cynic: “cloud is hype, ops nightmare”
what does it gain?

• Higher performance, scalability, availability
• More robust fault-tolerance
• Simplified systems design
• Easier operations
what does it lose?
• Reduced / simplified programming model
• No ad-hoc queries, no joins, no txns
• Not ACID: Atomicity / Consistency /
  Isolation / Durability
• Operations / management is still evolving
• Challenging to quantify health of system
• Fewer domain experts
NoSQL Map

• Key-Value Store
   KV Stores (volatile) : Memcached, Redis
   KV Stores (durable) : Dynamo, Voldemort, Riak
• Document Store : CouchDB, MongoDB
• Column Store : Cassandra, BigTable, HBase
• Graph Store : Neo4J
NoSQL Map

• Key-Value Store
   KV Stores (volatile)
   KV Stores (durable) : Dynamo, Voldemort, Riak
• Document Store
• Column Store
• Graph Store
motivation

• database on 1 box : ok
• database with master/slave replication : ok
• database on cluster : tricky
• database on SAN : time bomb
performance

[Chart: Complexity (y-axis) vs. Aggregate Operations / Sec (x-axis, 1K to 1M). Plotted: MySQL, Voldemort, Memcached, +SSD, +FusionIO, +Cluster, +Sharding]
dynamo case study
• Amazon : high read throughput, always-
  accessible writes
• Shopping cart application
• ‘Glitches’ ok, duplicate or missing item
• Data loss or unavailability is unacceptable
• Solution: K-V schema plus smart routing &
  data placement
key-value storage
• Essentially, a gigantic hash table
• Typically assign byte[] values to byte[] keys
• Plus versioning mixed in to handle failures
  and conflicts
• Yes, you *can* do range partitioning; in
  practice, avoid it because of hot spots
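
The "versioning" bullet can be illustrated with vector clocks, which a later slide mentions by name. This is a minimal sketch, not Voldemort's implementation: the function names `increment` and `compare` and the dict-based clock format are assumptions made for illustration.

```python
def increment(clock, node):
    """Return a copy of the clock with this node's counter advanced by one."""
    new_clock = dict(clock)
    new_clock[node] = new_clock.get(node, 0) + 1
    return new_clock


def compare(a, b):
    """Compare two vector clocks given as {node_id: counter} dicts."""
    nodes = set(a) | set(b)
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "concurrent"  # neither version descends from the other: a conflict
    if a_ahead:
        return "after"
    if b_ahead:
        return "before"
    return "equal"
```

Two writes that both start from the same version but go through different nodes compare as "concurrent"; the store keeps both values and leaves the resolution to the reader.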
k-v: durable vs. volatile

• RAM is ridiculous speed (ns), not durable
• Disk is persistent and slow (3-7ms)
• RAID eases the pain a bit (4-8x throughput)
• SSD shows good promise (100-300us)
• FusionIO is redefining the space (30-100us)
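
To put the latency figures above in perspective, the sketch below converts them into a rough ceiling on operations per second. It assumes one outstanding request at a time with no caching, queueing or batching, so it is a back-of-the-envelope bound rather than a benchmark; the function name is made up for illustration.

```python
def serial_ops_per_sec(latency_seconds):
    """Upper bound on ops/sec when each operation waits for the previous one."""
    return 1.0 / latency_seconds


# Latency ranges taken from the bullets above: (best case, worst case) in seconds.
DEVICES = {
    "disk (3-7 ms)": (3e-3, 7e-3),
    "SSD (100-300 us)": (100e-6, 300e-6),
    "FusionIO (30-100 us)": (30e-6, 100e-6),
}

for name, (best, worst) in DEVICES.items():
    low = serial_ops_per_sec(worst)
    high = serial_ops_per_sec(best)
    print(f"{name}: roughly {low:,.0f} to {high:,.0f} ops/sec")
```

Under these assumptions a single disk tops out at a few hundred operations per second, while flash moves the ceiling into the thousands or tens of thousands.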
dynamo clones

• Voldemort : from LinkedIn, dynamo
  implementation in Java (default: BDB-JE)
• Riak : from Basho, dynamo implementation
  in Erlang (default: Bitcask; embedded InnoDB optional)
Voldemort

• Developed at LinkedIn
• Scalable Key-Value Storage
• Based on Amazon Dynamo model
• High Read Throughput
• Always Writable
Voldemort features

• Consistent Hashing
• Quorum settings : R, W, N
• Auto-sharding & rebalancing
• Pluggable storage engines
Consistent Hashing
• Arrange keys around ring
• Compute token in ring using hash function
• Determine nodes responsible for token using live set
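
The three steps above can be sketched as follows. This is a generic consistent-hashing ring, not Voldemort's routing code: the class name `Ring`, the `vnodes` parameter, and the choice of MD5 as the hash function are all assumptions made for illustration.

```python
import bisect
import hashlib


def token(data: bytes) -> int:
    """Map bytes to a position on the ring (assumption: first 8 bytes of MD5)."""
    return int.from_bytes(hashlib.md5(data).digest()[:8], "big")


class Ring:
    def __init__(self, nodes, vnodes=16):
        # Each node owns several points on the ring to even out the load.
        self._points = sorted(
            (token(f"{node}#{i}".encode()), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._tokens = [t for t, _ in self._points]

    def preference_list(self, key: bytes, n: int, live=None):
        """Walk clockwise from the key's token, collecting n distinct live nodes."""
        all_nodes = {node for _, node in self._points}
        live = all_nodes if live is None else set(live) & all_nodes
        start = bisect.bisect_right(self._tokens, token(key))
        chosen = []
        for offset in range(len(self._points)):
            node = self._points[(start + offset) % len(self._points)][1]
            if node in live and node not in chosen:
                chosen.append(node)
                if len(chosen) == n:
                    break
        return chosen
```

For example, `Ring(["node0", "node1", "node2", "node3"]).preference_list(b"hello", 3)` returns three distinct nodes; if the first of them drops out of the live set, the remaining two keep their order and the next node on the ring takes the last slot.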
R/W/N
• N : replication factor, i.e. the maximum number
  of nodes to query for an operation
• R : read quorum (nodes that must answer a read)
• W : write quorum (nodes that must acknowledge a write)
• Can adjust ‘quorum’ to balance throughput
  and fault-tolerance
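
The trade-off can be made concrete with a small helper. It relies on the usual Dynamo-style reasoning, which the slides do not spell out: when R + W > N, every read quorum overlaps every write quorum, so a read reaches at least one node holding the latest acknowledged write. The function name and the returned keys are hypothetical.

```python
def quorum_properties(n, r, w):
    """Summarize what a given N / R / W setting buys you."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("R and W must be between 1 and N")
    return {
        # Read and write quorums must share a node for reads to see the latest write.
        "read_sees_latest_write": r + w > n,
        # How many replicas can be down while the operation still succeeds.
        "read_failures_tolerated": n - r,
        "write_failures_tolerated": n - w,
    }


print(quorum_properties(3, 2, 2))
print(quorum_properties(3, 1, 1))
```

With N=3, R=2, W=2 the quorums overlap and one replica may be down for either operation; with R=1, W=1 two replicas may be down, but a read can miss the latest write.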
setting up Voldemort 1
Step 1: Download the code
Download either a recent stable release or, for those who like to live more
dangerously, the up-to-the-minute build from the build server.

Step 2: Start single node cluster

   > bin/voldemort-server.sh config/single_node_cluster > /tmp/voldemort.log &

Step 3: Start commandline test client and do some operations

   > bin/voldemort-shell.sh test tcp://localhost:6666
   Established connection to test via tcp://localhost:6666
   > put "hello" "world"
   > get "hello"
   version(0:1): "world"
   > delete "hello"
   > get "hello"
   null
   > exit
   k k thx bye.
setting up Voldemort 2

• For a cluster, use cloud startup scripts
• Works with Amazon EC2
• See https://github.com/voldemort/
  voldemort/wiki/EC2-Testing-Infrastructure
Voldemort client libraries


• Java, Scala, Clojure
• Ruby
• Python
• C++
storage engines

• BDB-JE (Oracle Sleepycat, the original)
• Krati (LinkedIn, pretty new)
• HailDB (new!)
• MySQL (old / dated)
BDB-JE

• Log-Structured B-Tree
• Fast Storage When Mostly Cached
• Configured without fsync() by default -
  writes are batched and flushed periodically
Krati

• Fast Hash-Oriented Storage
• Uses memory-mapped files for speed
• Configured without fsync() by default -
  writes are batched and flushed periodically
HailDB
• Fork of MySQL InnoDB plugin
  (contributors : Oracle, Google, Facebook,
  Percona)
• Higher stability for large data sets
• Fast crash recovery
• External from Java heap (ease GC pain)
• apt-get install haildb (from launchpad PPA)
• Use “flush-once-per-second” mode
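
All three engines above trade a little durability for throughput by not calling fsync() on every write. The sketch below shows the general idea of a time-based flush policy. It is not how BDB-JE, Krati or HailDB implement it; the class `BatchedLog` and its parameters are hypothetical.

```python
import os
import time


class BatchedLog:
    """Append-only log that calls fsync() at most once per flush interval."""

    def __init__(self, path, flush_interval=1.0, clock=time.monotonic):
        self._file = open(path, "ab")
        self._interval = flush_interval
        self._clock = clock
        self._last_sync = clock()
        self.sync_count = 0

    def append(self, record: bytes):
        # The write lands in a buffer; it is only durable after the next sync.
        self._file.write(record)
        if self._clock() - self._last_sync >= self._interval:
            self.sync()

    def sync(self):
        self._file.flush()              # push Python's buffer to the OS
        os.fsync(self._file.fileno())   # ask the OS to push it to the device
        self._last_sync = self._clock()
        self.sync_count += 1

    def close(self):
        self.sync()
        self._file.close()
```

With a one-second interval, a crash can lose up to roughly the last second of writes; in exchange, many appends share the cost of a single fsync().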
HailDB, Java & Voldemort

[Diagram: a Voldemort Client above three Voldemort Nodes. The stack inside a node, top to bottom: v-storage-inno, g414-haildb, JNA, HailDB (log, buffer pool, tablespace)]
HailDB & Java

• g414-haildb : where the magic happens
• uses JNA: Java Native Access
• dynamic binding to libhaildb shared library
• auto-generated from .h file (w/ JNAerator)
• Pointer classes & other shenanigans
HailDB schema

_key VARBINARY(200)
_version VARBINARY(200)
_value BLOB
PRIMARY KEY(_key, _version)
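
To show how this schema behaves, the sketch below recreates it in SQLite through Python's `sqlite3` module. SQLite is only a stand-in here: HailDB is driven through its C API rather than SQL, and the table name `store` and the helpers `put` and `get` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE store (
        _key     VARBINARY(200) NOT NULL,  -- SQLite does not enforce the length
        _version VARBINARY(200) NOT NULL,
        _value   BLOB,
        PRIMARY KEY (_key, _version)
    )
""")


def put(key, version, value):
    """Store one version of a key; the same (key, version) pair is overwritten."""
    conn.execute(
        "INSERT OR REPLACE INTO store (_key, _version, _value) VALUES (?, ?, ?)",
        (key, version, value),
    )


def get(key):
    """Return every stored (version, value) pair for the key."""
    rows = conn.execute(
        "SELECT _version, _value FROM store WHERE _key = ? ORDER BY _version",
        (key,),
    )
    return rows.fetchall()
```

Because the primary key includes `_version`, several versions of the same key can coexist, which is what conflict handling needs. The cost is a wide primary key, which the "schema refinements" slide later addresses.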
implementation gotchas

• InnoDB API-level usage is unclear
• Synchronization & locking is unclear
• Therefore... I learned to love reading C
• Error handling is *nasty*
• Installation a bit of a pain
experimental setup
• OS X: 8-Core Xeon, 32GB RAM, 200GB OWC SSD
• Faban Benchmark : PUT 64-byte key, 1024-byte value
• Scenarios: 1, 2, 4, 8 threads
• 512M Java Heap
Perf: BDB Put 100%
Perf: BDB Put 20%/Get 80%
Perf: BDB Put 20% Detail
Perf: Krati Put 100%
Perf: Krati Put 20%/Get 80%
Perf: Krati Put 20% Detail
Perf: HailDB Put 100%
Perf: HailDB Put 20%/Get 80%
Perf: HailDB Put 20% Detail
future work

• Improve Packaging / Installation
• Schema refinements & perf enhancements
• Online backup/export with XtraBackup
• JNI Bindings
schema refinements
• Build upon Nokia work on fast k-v schema
• 8-byte ‘long’ key hash vs. full key bytes
• Smart use of secondary indexes
• Native representation of vector clocks
• Delayed / soft deletion
• Expect 40-50% performance boost
InnoDB tuning
• Skinny columns, skinny rows! (esp. Primary Key)
 • Varchar enum ‘bad’, int or smallint ‘good’
 • Fixed-width rows allow in-place updates
• Use covering indexes strategically
• More data per page means faster index scans,
  more efficient buffer pool utilization
• You only get so many transactions on a given CPU/RAM
  configuration - benchmark this!
refined schema
_id BIGINT (auto increment)
_key_hash BIGINT
_key VARBINARY(200)
_version VARBINARY(200)
_value BLOB
PRIMARY KEY(_id)
KEY(_key_hash)
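
A sketch of how lookups could work against the refined schema, again with SQLite standing in for HailDB. The slides do not say which hash produces `_key_hash`; truncated SHA-1 is an assumption here, as are the table name `store` and the helper names.

```python
import hashlib
import sqlite3
import struct


def key_hash(key: bytes) -> int:
    """First 8 bytes of SHA-1, read as a signed 64-bit integer (fits a BIGINT)."""
    return struct.unpack(">q", hashlib.sha1(key).digest()[:8])[0]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE store (
        _id       INTEGER PRIMARY KEY AUTOINCREMENT,
        _key_hash BIGINT NOT NULL,
        _key      VARBINARY(200) NOT NULL,
        _version  VARBINARY(200) NOT NULL,
        _value    BLOB
    );
    CREATE INDEX store_key_hash ON store (_key_hash);
""")


def put(key, version, value):
    conn.execute(
        "INSERT INTO store (_key_hash, _key, _version, _value) VALUES (?, ?, ?, ?)",
        (key_hash(key), key, version, value),
    )


def get(key):
    # The narrow index finds candidates by hash; comparing the full key
    # filters out any rows whose hash happens to collide.
    rows = conn.execute(
        "SELECT _version, _value FROM store "
        "WHERE _key_hash = ? AND _key = ? ORDER BY _id",
        (key_hash(key), key),
    )
    return rows.fetchall()
```

The primary key and the secondary index are both 8 bytes wide regardless of key length, in line with the "skinny columns" advice on the InnoDB tuning slide.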
online backup

• hot backup of data to other machine /
  destination
• test Percona XtraBackup with HailDB
• next step: backup/export to Hadoop/HDFS
  (similar to Cloudera Sqoop tool)
JNI bindings

• JNI can get 2-5x perf boost vs. JNA
• ... at the expense of nasty code
• Will go for schema optimizations and
  InnoDB tuning tips *first*
resources

• github.com/voldemort/voldemort
  freenode #voldemort
• github.com/sunnygleason/v-storage-haildb
  github.com/sunnygleason/v-storage-bench
  github.com/sunnygleason/g414-haildb
• jna.dev.java.net
more resources

• Amazon Dynamo
• Faban / XFaban
• HailDB
• Drizzle
• PBXT
Thank You!

Accelerating NoSQL Running Voldemort on HailDB

  • 1. Accelerating NoSQL Running Voldemort on HailDB Sunny Gleason March 11, 2011
  • 2. whoami • Sunny Gleason, human • passion: distributed systems engineering • previous... Ning : custom social networks Amazon.com : infra & web services • now... building cloud infrastructure
  • 3. whereami • twitter : twitter.com/sunnygleason • github : github.com/sunnygleason • linkedin : linkedin.com/in/sunnygleason
  • 4. what’s in this presentation? • NoSQL Roundup • Voldemort who? • HailDB wha? • Results & Next Steps • Special Bonus Material
  • 5. NoSQL • “Not Only” SQL • What’s the point? • Proponent: “reaching next level of scale” • Cynic: “cloud is hype, ops nightmare”
  • 6. what does it gain? • Higher performance, scalability, availability • More robust fault-tolerance • Simplified systems design • Easier operations
  • 7. what does it lose? • Reduced / simplified programming model • No ad-hoc queries, no joins, no txns • Not ACID: Atomicity / Consistency / Isolation / Durability • Operations / management is still evolving • Challenging to quantify health of system • Fewer domain experts
  • 8. NoSQL Map • Key-Value Store, KV Stores (volatile): Memcached, Redis • Key-Value Store, KV Stores (durable): Dynamo, Voldemort, Riak • Document Store: CouchDB, MongoDB • Column Store: Cassandra, BigTable, HBase • Graph Store: Neo4J
  • 9. NoSQL Map • Key-Value Store, KV Stores (volatile) • Key-Value Store, KV Stores (durable): Dynamo, Voldemort, Riak • Document Store • Column Store • Graph Store
  • 10. motivation • database on 1 box : ok • database with master/slave replication : ok • database on cluster : tricky • database on SAN : time bomb
  • 11. performance [chart: axes labeled Complexity and Aggregate Operations / Sec (1K, 10K, 100K, 1M); labels on the chart: MySQL, Voldemort, Memcached, +Sharding, +SSD, +FusionIO, +Cluster]
  • 12. dynamo case study • Amazon : high read throughput, always-accessible writes • Shopping cart application • ‘Glitches’ are ok (a duplicate or missing item) • Data loss or unavailability is unacceptable • Solution: K-V schema plus smart routing & data placement
  • 13. key-value storage • Essentially, a gigantic hash table • Typically assign byte[] values to byte[] keys • Plus versioning mixed in to handle failures and conflicts • Yes, you *can* do range partitioning; in practice, avoid it because of hot spots
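The "hash table plus versioning" idea on this slide can be sketched as follows. This is a minimal illustration and not Voldemort's actual implementation: the `VersionedStore` class, the `descends` helper, and the use of a plain dict as a vector clock are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

Clock = Dict[str, int]  # hypothetical vector clock: node id -> counter


def descends(a: Clock, b: Clock) -> bool:
    """True if clock a has seen every event recorded in clock b."""
    return all(a.get(node, 0) >= count for node, count in b.items())


class VersionedStore:
    """A hash table from byte[] keys to one or more versioned byte[] values."""

    def __init__(self) -> None:
        self._data: Dict[bytes, List[Tuple[Clock, bytes]]] = {}

    def put(self, key: bytes, clock: Clock, value: bytes) -> bool:
        existing = self._data.get(key, [])
        # A write is obsolete if some stored version already covers its clock.
        if any(descends(c, clock) for c, _ in existing):
            return False
        # Drop versions the new write supersedes; keep concurrent ones as siblings.
        kept = [(c, v) for c, v in existing if not descends(clock, c)]
        kept.append((clock, value))
        self._data[key] = kept
        return True

    def get(self, key: bytes) -> List[Tuple[Clock, bytes]]:
        # More than one result means a conflict the caller has to resolve.
        return list(self._data.get(key, []))
```

Two writes with clocks that do not dominate each other both survive, which is how a shopping cart can briefly show a duplicate or missing item without losing data.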
  • 14. k-v: durable vs. volatile • RAM is ridiculous speed (ns), not durable • Disk is persistent and slow (3-7ms) • RAID eases the pain a bit (4-8x throughput) • SSD is providing good promise (100-300us) • FusionIO is redefining the space (30-100us)
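To put the latencies on this slide in perspective, here is a rough back-of-the-envelope conversion to operations per second. It assumes strictly serial access with one operation in flight at a time, which ignores queueing, caching, and parallelism; the dictionary and function names are made up for the example.

```python
# Latency ranges quoted on the slide, in seconds (best case, worst case).
LATENCY_SECONDS = {
    "disk": (3e-3, 7e-3),
    "ssd": (100e-6, 300e-6),
    "fusionio": (30e-6, 100e-6),
}


def serial_ops_per_sec(latency_s: float) -> float:
    """Upper bound on ops/sec when each operation waits for the previous one."""
    return 1.0 / latency_s


if __name__ == "__main__":
    for device, (best, worst) in LATENCY_SECONDS.items():
        low = serial_ops_per_sec(worst)
        high = serial_ops_per_sec(best)
        print(f"{device}: roughly {low:.0f} to {high:.0f} serial ops/sec")
```

Under this simplification a single spinning disk stays in the low hundreds of operations per second, while flash devices reach the thousands to tens of thousands.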
  • 15. dynamo clones • Voldemort : from LinkedIn, dynamo implementation in Java (default: BDB-JE) • Riak : from Basho, dynamo implementation in Erlang (default: embedded InnoDB)
  • 16. Voldemort • Developed at LinkedIn • Scalable Key-Value Storage • Based on Amazon Dynamo model • High Read Throughput • Always Writable
  • 17. Voldemort features • Consistent Hashing • Quorum settings : R, W, N • Auto-sharding & rebalancing • Pluggable storage engines
  • 18. Consistent Hashing • Arrange keys around ring • Compute token in ring using hash function • Determine nodes responsible for token using live set
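The three steps on this slide can be sketched as a generic consistent-hash ring. This is an illustration of the technique rather than Voldemort's routing code: the `Ring` class, the `token` helper, the MD5-based token, and the virtual-node count are assumptions chosen for the example.

```python
import bisect
import hashlib
from typing import List, Set


def token(data: bytes) -> int:
    """Map bytes to a position on the ring (first 8 bytes of MD5, an arbitrary choice)."""
    return int.from_bytes(hashlib.md5(data).digest()[:8], "big")


class Ring:
    def __init__(self, nodes: List[str], vnodes: int = 16) -> None:
        # Place several points per node on the ring to even out the load.
        points = sorted(
            (token(f"{node}#{i}".encode()), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._tokens = [t for t, _ in points]
        self._owners = [n for _, n in points]

    def preference_list(self, key: bytes, n: int, live: Set[str]) -> List[str]:
        """Walk clockwise from the key's token, collecting n distinct live nodes."""
        start = bisect.bisect_right(self._tokens, token(key))
        chosen: List[str] = []
        for step in range(len(self._owners)):
            node = self._owners[(start + step) % len(self._owners)]
            if node in live and node not in chosen:
                chosen.append(node)
                if len(chosen) == n:
                    break
        return chosen
```

When a node drops out of the live set, only the keys it was responsible for move to the next nodes on the ring; the rest of the preference list keeps its order.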
  • 19. R/W/N • N : maximum number of nodes to query for an operation • R : read quorum • W : write quorum • Can adjust ‘quorum’ to balance throughput and fault-tolerance
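The throughput versus fault-tolerance trade-off comes down to simple arithmetic on R, W and N. The sketch below uses hypothetical helper names: a read is guaranteed to overlap the latest successful write when R + W > N, and an operation keeps working as long as enough replicas remain to form its quorum.

```python
from typing import Dict


def _check(n: int, r: int, w: int) -> None:
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("need 1 <= R <= N and 1 <= W <= N")


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """True if every read quorum shares at least one node with every write quorum."""
    _check(n, r, w)
    return r + w > n


def tolerated_failures(n: int, r: int, w: int) -> Dict[str, int]:
    """How many of the N replicas can be down while reads/writes still succeed."""
    _check(n, r, w)
    return {"reads": n - r, "writes": n - w}
```

For example, N=3, R=2, W=2 overlaps and survives one replica failure for both reads and writes, while N=3, R=1, W=1 is faster per operation but can return stale data.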
  • 20. setting up Voldemort 1
      Step 1: Download the code
        Download either a recent stable release or, for those who like to live more dangerously, the up-to-the-minute build from the build server.
      Step 2: Start single node cluster
        > bin/voldemort-server.sh config/single_node_cluster > /tmp/voldemort.log &
      Step 3: Start commandline test client and do some operations
        > bin/voldemort-shell.sh test tcp://localhost:6666
        Established connection to test via tcp://localhost:6666
        > put "hello" "world"
        > get "hello"
        version(0:1): "world"
        > delete "hello"
        > get "hello"
        null
        > exit
        k k thx bye.
  • 21. setting up Voldemort 2 • For a cluster, use cloud startup scripts • Works with Amazon EC2 • See https://github.com/voldemort/voldemort/wiki/EC2-Testing-Infrastructure
  • 22. Voldemort client libraries • Java, Scala, Clojure • Ruby • Python • C++
  • 23. storage engines • BDB-JE (Oracle Sleepycat, the original) • Krati (LinkedIn, pretty new) • HailDB (new!) • MySQL (old / dated)
  • 24. BDB-JE • Log-Structured B-Tree • Fast Storage When Mostly Cached • Configured without fsync() by default - writes are batched and flushed periodically
  • 25. Krati • Fast Hash-Oriented Storage • Uses memory-mapped files for speed • Configured without fsync() by default - writes are batched and flushed periodically
  • 26. HailDB • Fork of MySQL InnoDB plugin (contributors : Oracle, Google, Facebook, Percona) • Higher stability for large data sets • Fast crash recovery • External from Java heap (ease GC pain) • apt-get install haildb (from launchpad PPA) • Use “flush-once-per-second” mode
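The "batched and flushed periodically" behaviour of BDB-JE and Krati, and the "flush-once-per-second" mode mentioned here, trade a small durability window for throughput. The sketch below shows the general idea only; it is not how any of these engines implement it. The `BatchedLog` class is hypothetical, and it checks the timer on each append, whereas a real engine would typically flush from a background thread.

```python
import os
import tempfile
import time
from typing import Callable


class BatchedLog:
    """Append-only log that calls fsync() at most once per flush interval."""

    def __init__(self, path: str, flush_interval: float = 1.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._file = open(path, "ab")
        self._interval = flush_interval
        self._clock = clock
        self._last_sync = clock()
        self.syncs = 0  # number of fsync() calls so far

    def append(self, record: bytes) -> None:
        # Writes land in the OS/page cache first; they are NOT yet durable.
        self._file.write(record)
        if self._clock() - self._last_sync >= self._interval:
            self.sync()

    def sync(self) -> None:
        self._file.flush()
        os.fsync(self._file.fileno())  # force buffered data to stable storage
        self._last_sync = self._clock()
        self.syncs += 1

    def close(self) -> None:
        self.sync()
        self._file.close()


if __name__ == "__main__":
    log = BatchedLog(os.path.join(tempfile.mkdtemp(), "demo.log"))
    for i in range(1000):
        log.append(b"record %d\n" % i)
    log.close()
    print("fsync calls:", log.syncs)
```

A crash can lose up to one interval's worth of acknowledged writes, which is the price paid for not calling fsync() on every put.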
  • 27. HailDB, Java & Voldemort [diagram: a Voldemort Client and three Voldemort Nodes; layers shown: v-storage-inno, g414-haildb, JNA, HailDB (log, buffer pool, tablespace)]
  • 28. HailDB & Java • g414-haildb : where the magic happens • uses JNA: Java Native Access • dynamic binding to libhaildb shared library • auto-generated from .h file (w/ JNAerator) • Pointer classes & other shenanigans
  • 29. HailDB schema • _key VARBINARY(200) • _version VARBINARY(200) • _value BLOB • PRIMARY KEY(_key, _version)
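To see what the composite primary key buys, here is a small sketch that uses SQLite as a stand-in for HailDB (HailDB itself is driven through its C API, not SQL). The table name `kv`, the `insert` and `versions` helpers, and the mapping of VARBINARY to SQLite's BLOB type are assumptions for the example.

```python
import sqlite3
from typing import List, Tuple

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE kv (
        _key     BLOB NOT NULL,
        _version BLOB NOT NULL,
        _value   BLOB,
        PRIMARY KEY (_key, _version)
    )
    """
)


def insert(key: bytes, version: bytes, value: bytes) -> bool:
    """Store one version of a key; returns False if that exact version exists."""
    try:
        conn.execute("INSERT INTO kv VALUES (?, ?, ?)", (key, version, value))
        return True
    except sqlite3.IntegrityError:
        return False


def versions(key: bytes) -> List[Tuple[bytes, bytes]]:
    """All (version, value) pairs for a key, found via the primary-key prefix."""
    cur = conn.execute(
        "SELECT _version, _value FROM kv WHERE _key = ? ORDER BY _version", (key,)
    )
    return cur.fetchall()
```

Because the key is the leading column of the primary key, all versions of one key sit next to each other in the index, and the same (key, version) pair cannot be stored twice.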
  • 30. implementation gotchas • InnoDB API-level usage is unclear • Synchronization & locking is unclear • Therefore... I learned to love reading C • Error handling is *nasty* • Installation a bit of a pain
  • 31. experimental setup • OS X: 8-Core Xeon, 32GB RAM, 200GB OWC SSD • Faban Benchmark : PUT 64-byte key, 1024-byte value • Scenarios: 1, 2, 4, 8 threads • 512M Java Heap
  • 33. Perf: BDB Put 20%/Get 80%
  • 34. Perf: BDB Put 20% Detail
  • 36. Perf: Krati Put 20%/Get 80%
  • 37. Perf: Krati Put 20% Detail
  • 39. Perf: HailDB Put 20%/Get 80%
  • 40. Perf: HailDB Put 20% Detail
  • 41. future work • Improve Packaging / Installation • Schema refinements & perf enhancements • Online backup/export with XtraBackup • JNI Bindings
  • 42. schema refinements • Build upon Nokia work on fast k-v schema • 8-byte ‘long’ key hash vs. full key bytes • Smart use of secondary indexes • Native representation of vector clocks • Delayed / soft deletion • Expect 40-50% performance boost
  • 43. InnoDB tuning • Skinny columns, skinny rows! (esp. Primary Key) • Varchar enum ‘bad’, int or smallint ‘good’ • Fixed-width rows allow in-place updates • Use covering indexes strategically • More data per page means faster index scans, more efficient buffer pool utilization • You only get so many transactions on a given CPU/RAM configuration - benchmark this!
  • 44. refined schema • _id BIGINT (auto increment) • _key_hash BIGINT • _key VARBINARY(200) • _version VARBINARY(200) • _value BLOB • PRIMARY KEY(_id) • KEY(_key_hash)
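A lookup against the refined schema could work roughly as sketched below, again with SQLite standing in for HailDB. The slides do not say which hash function produces the 8-byte `_key_hash`, so the SHA-1 prefix here is an assumption, as are the table name `kv2`, the index name, and the `put` and `get` helpers.

```python
import hashlib
import sqlite3
from typing import List, Tuple


def key_hash(key: bytes) -> int:
    """8-byte signed hash of the full key (SHA-1 prefix, an arbitrary choice)."""
    return int.from_bytes(hashlib.sha1(key).digest()[:8], "big", signed=True)


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE kv2 (
        _id       INTEGER PRIMARY KEY AUTOINCREMENT,
        _key_hash INTEGER NOT NULL,
        _key      BLOB NOT NULL,
        _version  BLOB NOT NULL,
        _value    BLOB
    )
    """
)
conn.execute("CREATE INDEX kv2_key_hash ON kv2 (_key_hash)")


def put(key: bytes, version: bytes, value: bytes) -> None:
    conn.execute(
        "INSERT INTO kv2 (_key_hash, _key, _version, _value) VALUES (?, ?, ?, ?)",
        (key_hash(key), key, version, value),
    )


def get(key: bytes) -> List[Tuple[bytes, bytes]]:
    # The narrow secondary index finds candidates; comparing the full key
    # filters out rows whose keys merely collide on the 8-byte hash.
    cur = conn.execute(
        "SELECT _version, _value FROM kv2 WHERE _key_hash = ? AND _key = ?",
        (key_hash(key), key),
    )
    return cur.fetchall()
```

The primary key and the secondary index both become fixed-width 8-byte integers, in line with the "skinny columns" advice, at the cost of one extra comparison against the full key on reads.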
  • 45. online backup • hot backup of data to other machine / destination • test Percona Xtrabackup with HailDB • next step: backup/export to Hadoop/HDFS (similar to Cloudera Sqoop tool)
  • 46. JNI bindings • JNI can get 2-5x perf boost vs. JNA • ... at the expense of nasty code • Will go for schema optimizations and InnoDB tuning tips *first*
  • 47. resources • github.com/voldemort/voldemort • freenode #voldemort • github.com/sunnygleason/v-storage-haildb • github.com/sunnygleason/v-storage-bench • github.com/sunnygleason/g414-haildb • jna.dev.java.net
  • 48. more resources • Amazon Dynamo • Faban / XFaban • HailDB • Drizzle • PBXT