Cassandra from the trenches:
      migrating Netflix
          Jason Brown
    Senior Software Engineer
             Netflix
       @jasobrown jasedbrown@gmail.com

      http://www.linkedin.com/in/jasedbrown
Your host for the evening
• Sr. Software Engineer at Netflix > 3 years
  – Currently lead a team developing and operating
    AB testing infrastructure in EC2
  – Spent time migrating core e-commerce
    functionality out of PL/SQL and scaling it up
• MLB Advanced Media
  – Ran Ecommerce engineering group
• Wandered about in the wireless space
  (J2ME, BREW)
History
• In the beginning, there was the webapp
  – And a database, too
  – In one datacenter
• Then we grew, and grew, and grew
  – More databases, all conjoined
  – Database links with PL/SQL and materialized views
  – Multi-Master replication
History, 2
• Then it melted down (2008)
  – Oracle MMR between two databases
  – SPOF – one Oracle instance for website (no
    backup)
• Couldn’t ship DVDs for ~3 days
History, 3
• Time to rethink everything
  – Abandon datacenter for EC2
     • We’re not in the business of building datacenters
  – Ditch monolithic webapp for distributed systems
     • Greater independence for all teams/initiatives
  – Migrate SPOF database to …
History, 4
• SimpleDb/S3
  – Somebody else manages your database (yeah!)
  – Tried it out, but didn’t quite work well for us
  – Problems: high latency, rate limiting (throttling), no
    auto-sharding, no backups
• Time to try out one of them (other) newfangled
  NoSQL things…
Shiny new toy
• We selected Cassandra
  – Dynamo-model appealed to us
  – Column-based, key-value data model seemed
    sufficient for most needs
  – Performance looked great (rudimentary tests)
• Now what?
  – Put something into it
  – Run it in EC2
  – Sounds easy enough…
• Data Modeling
  – Where the rubber meets the road
About Netflix’s AB Testing
• We use it everywhere (no, really)
• Basic concepts
  – Test – An experiment where several competing
    behaviors are implemented and compared
  – Cell – different experiences within a test that are
    being compared against each other
  – Allocation – a customer-specific assignment to a
    cell within a test
     • Customer can only be in one cell of a test at a time
     • Generally immutable (very important for analysis)
Data Modeling - background
• AB has two sets of data
  – metadata about tests
  – allocations
• Both need to be migrated out of Oracle and
  into Cassandra in the cloud
AB - allocations
• Single table to hold allocations
  – Currently at ~950 million records
  – Plus indices!
• One record for every test that every customer
  is allocated into
• Unique constraint on customer/test
AB - metadata
• Fairly typical parent-child table relationship
• Not updated frequently, so service can cache
Data modeling in cassandra
• Everywhere I looked, the internets told me to
  understand my data use patterns
  – Understand the questions that you need to
    answer from the data
     • Meaning: know how you will query your data, and
       structure the persistence model to match


• There’s no free lunch here, apparently
Identifying the AB questions that need to be answered
• get all allocations for a customer
• get count of customers in test/cell
• find all customers in a test/cell
  – So we can kick them out of the test
  – So we can clean up ancient data
  – So we can move them to a different cell in test
• find all customers allocated to test within a
  date range
  – So we can kick them out of the test
Modeling allocations in cassandra
• As we’re read-heavy, read all allocations for a
  customer as fast as possible
  – Denormalize allocations into a single row
  – But, how do I denormalize?
• Find all customers in a test/cell = reverse
  index
• Get count of customers in test/cell = count the
  entries in the reverse index
Denormalization-HOWTO
• The internets talk about it, but no real world
  examples
  – ‘Normalization is for sissies’, Pat Helland
• Denormalizing allocations per customer
  – Trivial with a schema-less database
Denormalized allocations
• Sample normalized data




• Sample denormalized data (sparse!)
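
As a stand-in for the slide's sample data, here is a minimal Python sketch of the idea. The customer ids, test ids and field names are made up for illustration; only the shape matters: one normalized record per customer/test pair folds into a single sparse row per customer.

```python
from collections import defaultdict

# Hypothetical normalized records: one (customer, test, cell) row per allocation.
normalized = [
    {"cust_id": 1001, "test_id": 42, "cell": 2},
    {"cust_id": 1001, "test_id": 47, "cell": 0},
    {"cust_id": 1002, "test_id": 42, "cell": 1},
]

def denormalize(records):
    """Fold allocation records into one row per customer, keyed by test id."""
    rows = defaultdict(dict)
    for rec in records:
        rows[rec["cust_id"]][rec["test_id"]] = rec["cell"]
    return dict(rows)

rows = denormalize(normalized)
# Each row only carries the tests that customer is in, hence "sparse".
```

Reading all allocations for a customer then becomes a single row lookup, which is the read-heavy case called out earlier.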
Implementing allocations
• As each allocation for a customer has a handful of
  data points, they can logically be grouped
  together
• Hello, super columns
• Avoided blobs, json or otherwise
  – data race concerns
  – BI integration
  – Serialization alg changes could tank the data
Implementing allocations, second round
• But Cassandra devs don’t enjoy super columns
• Switched to standard column family, using
  composite columns
• Composite columns are sorted by each ‘token’
  in name
  – This sorts each allocation’s data together (by
    testId)
Composite columns
• Allocation column naming convention
  – <testId>:<field>
  – 42:cell = 2
  – 42:enabled = Y
  – 47:cell = 0
  – 47:enabled = Y
• Using terse field names, but still have column
  name overhead (~15 bytes)
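
The naming convention can be sketched in a few lines of Python. This is only a model of the idea, not a client API: the row is a plain dict keyed by a hypothetical (test_id, field) tuple, and sorting those tuples stands in for the composite comparator.

```python
# Hypothetical model: a row is a dict of (test_id, field) -> value.
row = {
    (47, "enabled"): "Y",
    (42, "cell"): "2",
    (47, "cell"): "0",
    (42, "enabled"): "Y",
}

def sorted_columns(row):
    """Order columns the way a composite comparator would: by test id, then field."""
    return sorted(row.items())

def slice_for_test(row, test_id):
    """Return every field of one allocation: a column slice on the first component."""
    return {field: value for (tid, field), value in sorted_columns(row) if tid == test_id}

names = [f"{tid}:{field}" for (tid, field), _ in sorted_columns(row)]
```

Because the first component sorts first, all columns for test 42 sit next to each other, and all columns for test 47 follow them.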
Implementing indices
• Cassandra’s secondary indices vs. hand-built
  and maintained alternate indices
• Secondary indices work great on uniform data
  between rows
• But sparse column data not so easy
Hand-built Indices, 1

• Reverse index
  – Test/cell (key) to custIds (columns)
     • Column value is timestamp
• Mutate on allocating a customer into test
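
A rough sketch of the reverse index, using in-memory dicts as hypothetical stand-ins for the two column families. The function names and the `now` parameter are assumptions made for the example.

```python
import time
from collections import defaultdict

# Hypothetical in-memory stand-ins for the two column families.
allocations = defaultdict(dict)    # cust_id -> {(test_id, field): value}
reverse_index = defaultdict(dict)  # (test_id, cell) -> {cust_id: timestamp}

def allocate(cust_id, test_id, cell, now=None):
    """Write the allocation and its reverse-index entry together."""
    ts = time.time() if now is None else now
    allocations[cust_id][(test_id, "cell")] = cell
    reverse_index[(test_id, cell)][cust_id] = ts

def customers_in(test_id, cell):
    """Find all customers in a test/cell by reading one index row."""
    return sorted(reverse_index[(test_id, cell)])

allocate(1001, 42, 2, now=100.0)
allocate(1002, 42, 2, now=101.0)
allocate(1003, 42, 1, now=102.0)
```

Keeping the allocation timestamp as the column value is what makes the "allocated within a date range" question answerable from the same index row.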
Hand-built indices, 2
• Counter column family
  – Test/cell (key) to count of customers in test (column)
  – Mutate on allocating a customer into test
• Counters are not idempotent!
• Mutates need to write to every node that
  hosts that key
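
Why non-idempotent counters hurt can be shown with a toy retry. The dicts below are hypothetical stand-ins for a counter column and a reverse-index row; the point is only that replaying an increment changes the result, while replaying a column write does not.

```python
key = (42, 2)            # hypothetical (test_id, cell)
counters = {key: 0}      # stand-in for the counter column family
index_rows = {key: {}}   # stand-in for the reverse index row

def increment(k):
    """Counter mutation: applying it twice counts twice."""
    counters[k] += 1

def index_write(k, cust_id, ts):
    """Column write: applying it twice leaves one column with the same value."""
    index_rows[k][cust_id] = ts

# A client times out and retries the same allocation for customer 1001.
for _attempt in range(2):
    increment(key)
    index_write(key, 1001, 100.0)
```

After the retry the counter says two customers while the index row still holds one, which is the kind of drift that index rebuilding has to correct.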
Index rebuilding
• Yeah, even Oracle needs to have its indices
  rebuilt
• Easy enough to rebuild the reverse index, but
  how about that counter column?
  – Read the reverse index for the count and write
    that as counter’s value
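
A minimal sketch of the counter rebuild, assuming the reverse index is available as a dict of rows (hypothetical data below). How the corrected value gets written back to the counter column is left out.

```python
def rebuild_counters(reverse_index):
    """Derive each test/cell count from the number of columns in its index row."""
    return {key: len(columns) for key, columns in reverse_index.items()}

# Hypothetical index contents: (test_id, cell) -> {cust_id: timestamp}.
reverse_index = {
    (42, 2): {1001: 100.0, 1002: 101.0},
    (42, 1): {1003: 102.0},
}
drifted_counters = {(42, 2): 3, (42, 1): 1}  # (42, 2) was double counted by a retry
counters = rebuild_counters(reverse_index)
```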
Modeling AB metadata in cassandra
• Explored several models, including json
  blobs, spreading across multiple CFs, differing
  degrees of denormalization
• Reverse index to identify all tests for loading
Implementing metadata
• One CF, one row for all of a test’s data
  – Every data point is a column – no blobs
• Composite columns
  – type:id:field
     • Types = base info, cells, allocation plans
     • Id = cell number, allocation plan (gu)id
     • Field = type-specific
        – Base info = test name, description, enabled
        – Cell’s name / description
        – Plan’s start/end dates, country to allocate to
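
One way to picture the type:id:field layout is to load a test's row back into a nested structure. The column names and values below are invented for illustration, and using an empty id for base info is an assumption of this sketch, not something the slides specify.

```python
from collections import defaultdict

# Hypothetical columns for one test's row, named type:id:field.
columns = {
    "base::name": "new signup flow",
    "base::enabled": "Y",
    "cell:0:name": "control",
    "cell:1:name": "one-page signup",
    "plan:p-1:country": "US",
}

def load_test(columns):
    """Group flat composite columns into {type: {id: {field: value}}}."""
    test = defaultdict(lambda: defaultdict(dict))
    for name, value in columns.items():
        type_, id_, field = name.split(":", 2)
        test[type_][id_][field] = value
    return {t: dict(ids) for t, ids in test.items()}

test = load_test(columns)
```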
Into the real world … here comes the hurt
Allocation mutates
• AB allocations are immutable, so how do you
  prevent mutating?
  – Oracle – unique constraint on table
  – Cassandra – read before write
• Read before write in a distributed system is a
  data race
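
The race is easy to reproduce on paper. The sketch below replays one unlucky interleaving of two app servers against a hypothetical single-key store; nothing here is Cassandra-specific.

```python
# Hypothetical single-key store standing in for a customer's allocation column.
store = {}

def read(key):
    return store.get(key)

def write(key, value):
    store[key] = value

key = (1001, 42)  # (cust_id, test_id)

# Two app servers both try to allocate customer 1001 into test 42.
seen_by_a = read(key)      # A reads: nothing there
seen_by_b = read(key)      # B reads before A writes: also nothing
if seen_by_a is None:
    write(key, 1)          # A allocates cell 1
if seen_by_b is None:
    write(key, 2)          # B allocates cell 2 and silently overwrites A
```

Both checks pass, both writes go through, and the "immutable" allocation has changed cells without either server noticing.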
Running cassandra
• Compactions happen
  – Part of the Cassandra lifestyle
  – Mutations are written to memory (memtable)
  – Flushed to disk (sstable) on triggering threshold
     • Time
     • Size
     • Operations against column family
  – Eventually, Cassandra decides to merge sstables as
    data for individual rows becomes scattered
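
A toy model of the write path may help here. It is a sketch only: the class, the flush threshold of two columns, and the merge-everything compaction are simplifications invented for the example, not how Cassandra schedules real compactions.

```python
class ToyColumnFamily:
    """Toy model of the write path: memtable, flushed sstables, compaction."""

    def __init__(self, flush_threshold=2):
        self.flush_threshold = flush_threshold  # hypothetical size trigger, in columns
        self.memtable = {}
        self.sstables = []  # oldest first; each is treated as immutable

    def write(self, row_key, column, value):
        self.memtable[(row_key, column)] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        if self.memtable:
            self.sstables.append(dict(self.memtable))
            self.memtable = {}

    def sstables_holding(self, row_key):
        """How many sstables a read of this row may have to touch."""
        return sum(any(k == row_key for k, _ in t) for t in self.sstables)

    def compact(self):
        """Merge all sstables into one; newer values win."""
        merged = {}
        for table in self.sstables:
            merged.update(table)
        self.sstables = [merged]

cf = ToyColumnFamily()
cf.write("cust1", "42:cell", "2")
cf.write("cust2", "42:cell", "1")   # memtable reaches 2 columns, so it flushes
cf.write("cust1", "47:cell", "0")
cf.write("cust3", "42:cell", "1")   # second flush
scattered = cf.sstables_holding("cust1")
cf.compact()
```

Before compaction, the row for cust1 is spread over two sstables; afterwards a read touches one.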
Compactions, 2
• Spikes happen, esp. on read-heavy systems
  – Everything can slow down
  – Sometimes, average latency > 95%ile
  – Throttling in newer Cass versions helps, I think
  – Affects clients (hector, astyanax)
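
An average above the 95th percentile sounds impossible until a few requests stall badly. A small worked example with made-up numbers:

```python
import statistics

# Hypothetical latencies in ms: 96 fast reads and 4 that stalled behind a compaction.
latencies = [10] * 96 + [5000] * 4

def percentile(values, pct):
    """Nearest-rank percentile."""
    ordered = sorted(values)
    rank = max(1, -(-len(ordered) * pct // 100))  # ceil(n * pct / 100)
    return ordered[rank - 1]

mean = statistics.mean(latencies)
p95 = percentile(latencies, 95)
```

Here the 95th percentile is still 10 ms, while the four stalled reads drag the mean to 209.6 ms.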
Repairs
• Different from read repair!
• Fix all the data in a single node by pulling
  shared ranges from neighbor nodes
Repairs, 2
• Replication factor determines number of
  nodes involved in repair of single node
• Neighbor nodes will perform validation
  compaction
  – Pushes disk and network hard, depending on data size
• Guess what happens when you run a multi-
  region cluster?
Client libraries
• Round-robin is not the way to go for
  connection pooling
  – Coordinator Cassandra nodes will incorrectly be
    marked down rather than target slow node
• Token-aware is safer, faster, but harder to
  implement
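
To make "token-aware" concrete, here is a minimal sketch of picking the node that owns a key. The class name, node names, tiny 100-token ring and MD5-based toy partitioner are all assumptions for illustration; real clients such as Hector or Astyanax have their own APIs and also account for replicas.

```python
import bisect
import hashlib

class TokenAwarePool:
    """Pick the node that owns a key's token instead of rotating through all nodes."""

    def __init__(self, node_tokens):
        # node_tokens: hypothetical {node_name: token}; a node owns tokens up to its own.
        self.ring = sorted((token, node) for node, token in node_tokens.items())
        self.tokens = [token for token, _ in self.ring]

    @staticmethod
    def token_for(key, ring_size=100):
        """Toy partitioner: MD5 of the key, reduced to a small ring for readability."""
        return int(hashlib.md5(key.encode()).hexdigest(), 16) % ring_size

    def node_for_token(self, token):
        idx = bisect.bisect_left(self.tokens, token)
        return self.ring[idx % len(self.ring)][1]  # wrap around past the last token

    def node_for(self, key):
        return self.node_for_token(self.token_for(key))

pool = TokenAwarePool({"node-a": 25, "node-b": 50, "node-c": 75, "node-d": 99})
```

Sending the request straight to an owning node means a slow node only affects requests for its own keys, rather than making every coordinator that forwards to it look slow.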
Tunings, 1
• Key and row caches
  – Left unbounded can chew up jvm memory needed
    for normal work
  – Latencies will spike as the jvm needs to fight for
    memory
  – Off-heap row cache is better but still maintains
    data structures on-heap
Tunings, 2
• mmap() as in-memory cache
  – When the process terminates, mmap pages are added
    to the free list
Tunings, 3
• Sizing memtable flushes for optimizing
  compactions
  – Easier when writes are uniformly
    distributed, timewise – easier to reason about
    flush patterns
  – Best to optimize flushes based on memtable
    size, not time
Tunings, 4
• Sharding
  – Not dead yet!
  – If a single row has disproportionately high
    gets/mutates, the nodes holding it will become
    hot spots
  – If a row grows too large, it won’t fit into memory
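
One common way to shard a hot or oversized row is to split it across several physical row keys. The sketch below assumes a reverse-index row keyed by test/cell; the shard count, key format and CRC32 bucketing are illustrative choices, not what the talk describes.

```python
import zlib

NUM_SHARDS = 4  # hypothetical; would be sized from row width and traffic

def shard_key(test_id, cell, cust_id):
    """Spread one logical index row over NUM_SHARDS physical rows."""
    shard = zlib.crc32(str(cust_id).encode()) % NUM_SHARDS
    return f"{test_id}:{cell}:{shard}"

def all_shard_keys(test_id, cell):
    """Reads of the whole logical row must fetch every shard."""
    return [f"{test_id}:{cell}:{s}" for s in range(NUM_SHARDS)]
```

Writes for a given customer always land on the same shard, while a full scan of the test/cell costs one read per shard.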
Takeaways
• Netflix is making all of our components
  distributed and fault tolerant as we grow
  domestically and internationally.

• Cassandra is a core piece of our cloud
  infrastructure.
終わり(The End)


• Q&A



References
• Pat Helland, “Normalization Is for Sissies”
  http://blogs.msdn.com/b/pathelland/archive/2007/07/23/normalization-is-for-sissies.aspx
• btoddb, “Storage Sizing”
  http://btoddb-cass-storage.blogspot.com/

  • 2. Your host for the evening • Sr. Software Engineer at Netflix > 3 years – Currently lead a team developing and operating AB testing infrastructure in EC2 – Spent time migrating core e-commerce functionality out of PL/SQL and scaling it up • MLB Advanced Media – Ran Ecommerce engineering group • Wandered about in the wireless space (J2ME, BREW)
  • 3. History • In the beginning, there was the webapp – And a database, too – In one datacenter • Then we grew, and grew, and grew – More databases, all conjoined – Database links with PL/SQL and M views – Multi-Master replication
  • 4. History,2 • Then it melted down (2008) – Oracle MMR between two databases – SPOF – one Oracle instance for website (no backup) • Couldn’t ship DVDs for ~3 days
  • 5. History,3 • Time to rethink everything – Abandon datacenter for EC2 • We’re not in the business of building datacenters – Ditch monolithic webapp for distributed systems • Greater independence for all teams/initiatives – Migrate SPOF database to …
  • 6. History,4 • SimpleDb/S3 – Somebody else manages your database (yeah!) – Tried it out, but didn’t quite work well for us – High latency, rate limiting (throttling), (no) auto-sharding, no backup problems • Time to try out one of them (other) new fangled NoSql things…
  • 7. Shiny new toy • We selected Cassandra – Dynamo-model appealed to us – Column-based, key-value data model seemed sufficient for most needs – Performance looked great (rudimentary tests) • Now what? – Put something into it – Run it in EC2 – Sounds easy enough…
  • 8. • Data Modeling – Where the rubber meets the road
  • 9. About Netflix’s AB Testing • We use it everywhere (no, really) • Basic concepts – Test – An experiment where several competing behaviors are implemented and compared – Cell – different experiences within a test that are being compared against each other – Allocation – a customer-specific assignment to a cell within a test • Customer can only be in one cell of a test at a time • Generally immutable (very important for analysis)
  • 10. Data Modeling - background • AB has two sets of data – metadata about tests – allocations • Both need to be migrated out of Oracle and into Cassandra in the cloud
  • 11. AB - allocations • Single table to hold allocations – Currently at ~950 million records – Plus indices! • One record for every test that every customer is allocated into • Unique constraint on customer/test
  • 12. AB - metadata • Fairly typical parent-child table relationship • Not updated frequently, so service can cache
  • 13. Data modeling in cassandra • Everywhere I looked, the internets told me to understand my data use patterns – Understand the questions that you need to answer from the data • Meaning: know how to query your data, and structure the persistence model to match • There’s no free lunch here, apparently
  • 14. Identifying the AB questions that need to be answered • get all allocations for a customer • get count of customers in test/cell • find all customers in a test/cell – So we can kick them out of the test – So we can clean up ancient data – So we can move them to a different cell in test • find all customers allocated to test within a date range – So we can kick them out of the test
  • 15. Modeling allocations in cassandra • As we’re read-heavy, read all allocations for a customer as fast as possible – Denormalize allocations into a single row – But, how do I denormalize? • Find all customers in a test/cell = reverse index • Get count of customers in test/cell = count the entries in the reverse index
  • 16. Denormalization-HOWTO • The internets talk about it, but no real world examples – ‘Normalization is for sissies’, Pat Helland • Denormalizing allocations per customer – Trivial with a schema-less database
  • 17. Denormalized allocations • Sample normalized data • Sample denormalized data (sparse!)
  • 18. Implementing allocations • As allocation for a customer has a handful of data points, they logically can be grouped together • Hello, super columns • Avoided blobs, json or otherwise – data race concerns – BI integration – Serialization alg changes could tank the data
  • 19. Implementing allocations, second round • But, cassandra devs (secretly despise) don’t enjoy super columns • Switched to standard column family, using composite columns • Composite columns are sorted by each ‘token’ in name – This sorts each allocation’s data together (by testId)
  • 20. Composite columns • Allocation column naming convention – <testId>:<field> – 42:cell = 2 – 42:enabled = Y – 47:cell = 0 – 47:enabled = Y • Using terse field names, but still have column name overhead (~15 bytes)
  • 21. Implementing indices • Cassandra’s secondary indices vs. hand-built and maintained alternate indices • Secondary indices work great on uniform data between rows • But sparse column data not so easy
  • 22. Hand-built Indices, 1 • Reverse index – Test/cell (key) to custIds (columns) • Column value is timestamp • Mutate on allocating a customer into test
  • 23. Hand-built indices, 2 • Counter column family – Test/cell to count of customers in test columns – Mutate on allocating a customer into test • Counters are not idempotent! • Mutates need to write to every node that hosts that key
  • 24. Index rebuilding • Yeah, even Oracle needs to have its indices rebuilt • Easy enough to rebuild the reverse index, but how about that counter column? – Read the reverse index for the count and write that as counter’s value
  • 25. Modeling AB metadata in cassandra • Explored several models, including json blobs, spreading across multiple CFs, differing degrees of denormalization • Reverse index to identify all tests for loading
  • 26. Implementing metadata • One CF, one row for all test’s data – Every data point is a column – no blobs • Composite columns – type:id:field • Types = base info, cells, allocation plans • Id = cell number, allocation plan (gu)id • Field = type-specific – Base info = test name, description, enabled – Cell’s name / description – Plan’s start/end dates, country to allocate to
  • 27. Into the real world … here comes the hurt
  • 28. Allocation mutates • AB allocations are immutable, so how do you prevent mutating? – Oracle – unique constraint on table – Cassandra – read before write • Read before write in a distributed system is a data race
  • 29. Running cassandra • Compactions happen – Part of the Cassandra lifestyle – Mutations are written to memory (memtable) – Flushed to disk (sstable) on triggering threshold • Time • Size • Operations against column family – Eventually, Cassandra decides to merge sstables as data for individual rows becomes scattered
  • 30. Compactions, 2 • Spikes happen, esp. on read-heavy systems – Everything can slow down – Sometimes, average latency > 95%ile – Throttling in newer Cass versions helps, I think – Affects clients (hector, astyanax)
  • 31. Repairs • Different from read repair! • Fix all the data in a single node by pulling shared ranges from neighbor nodes
  • 32. Repairs, 2 • Replication factor determines number of nodes involved in repair of single node • Neighbor nodes will perform validation compaction – Pushes disk and network hard, depending on data size • Guess what happens when you run a multi-region cluster?
  • 33. Client libraries • Round-robin is not the way to go for connection pooling – Coordinator Cassandra nodes will incorrectly be marked down rather than target slow node • Token-aware is safer, faster, but harder to implement
  • 34. Tunings, 1 • Key and row caches – Left unbounded can chew up jvm memory needed for normal work – Latencies will spike as the jvm needs to fight for memory – Off-heap row cache is better but still maintains data structures on-heap
  • 35. Tunings, 2 • mmap() as in-memory cache – When process terminated, mmap pages are added to the free list
  • 36. Tunings, 3 • Sizing memtable flushes for optimizing compactions – Easier when writes are uniformly distributed, timewise – easier to reason about flush patterns – Best to optimize flushes based on memtable size, not time
  • 37. Tunings, 4 • Sharding – Not dead yet! – If a single row has disproportionately high gets/mutates, the nodes holding it will become hot spots – If a row grows too large, it won’t fit into memory
  • 38. Takeaways • Netflix is making all of our components distributed and fault tolerant as we grow domestically and internationally. • Cassandra is a core piece of our cloud infrastructure.
  • 39. 終わり(The End) • Q&A @jasobrown jasedbrown@gmail.com http://www.linkedin.com/in/jasedbrown
  • 40. References • Pat Helland, “Normalization Is for Sissies” http://blogs.msdn.com/b/pathelland/archive/2007/07/23/normalization-is-for-sissies.aspx • btoddb, “Storage Sizing” http://btoddb-cass-storage.blogspot.com/
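
The allocation model from slides 15 through 24 (composite column names, a hand-built reverse index, and a counter rebuilt from that index) can be sketched in a few lines of Python. This is only an in-memory illustration under stated assumptions, not Netflix's code and not a Cassandra client API: plain dicts stand in for column families, and the names `allocate`, `allocations_for`, and `rebuild_counter` are hypothetical.

```python
from collections import defaultdict
import time

# In-memory stand-ins for three column families. Assumption: real code would
# go through a Cassandra client (the talk mentions hector and astyanax).
allocations = defaultdict(dict)    # custId -> {"<testId>:<field>": value}
reverse_index = defaultdict(dict)  # (testId, cell) -> {custId: timestamp}
counters = defaultdict(int)        # (testId, cell) -> count of customers


def allocate(cust_id, test_id, cell, now=None):
    """Allocate a customer into a test cell; return False if already allocated."""
    row = allocations[cust_id]
    # Read before write: in a real distributed store this check is a data race.
    if f"{test_id}:cell" in row:
        return False
    row[f"{test_id}:cell"] = cell
    row[f"{test_id}:enabled"] = "Y"
    stamp = now if now is not None else time.time()
    reverse_index[(test_id, cell)][cust_id] = stamp
    counters[(test_id, cell)] += 1  # not idempotent: a retry would double count
    return True


def allocations_for(cust_id):
    """Group a customer's composite columns back into per-test records."""
    row = allocations[cust_id]
    tests = defaultdict(dict)
    # Sorting by (testId, field) mimics how composite columns keep each
    # allocation's data together.
    ordered = sorted(row, key=lambda n: (int(n.split(":")[0]), n.split(":")[1]))
    for name in ordered:
        test_id, field = name.split(":")
        tests[int(test_id)][field] = row[name]
    return dict(tests)


def rebuild_counter(test_id, cell):
    """Overwrite the counter with the number of entries in the reverse index."""
    counters[(test_id, cell)] = len(reverse_index[(test_id, cell)])
    return counters[(test_id, cell)]
```

Calling `allocate(1001, 42, 2)` and then `allocate(1001, 47, 0)` yields a single row for customer 1001 holding the `42:cell`, `42:enabled`, `47:cell`, and `47:enabled` columns from slide 20. The read-before-write guard is safe here only because everything runs in one process.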

Editor's Notes

  1. Point of departure from the datacenter; data modeling - relational to non-rel; implementation(s); real world – ops, tuning, compactions, gotchas
  2. Background as to why netflix has moved to the cloud and embraced new databases
  3. Circa mid-late 2010, we evaluated a bunch of database systems, primarily focusing on the new NoSQL breed.
  4. I lead AB testing and we’ll be using that data set as a model for discussion. I’ll describe the legacy oracle implementation and how I went about moving it to cass
  5. Show example of an AB test (1482) on the homepage
  6. Existing data sets in our legacy Oracle database that need to be migrated and transformed
  7. LAST SLIDE ON DATA MODELING! Next is running this in prod!
  8. Going to share real world issues from design, ops, performance
  9. For some systems, as long as one write wins (eventual consistency), all is fine
  10. Explain difference between read repair and node repair
  11. Makes minor compactions smoother
  12. Too large - AB indices ran afoul of this. Problem for reads, compactions, and repairs
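
One common way to keep a reverse-index row from growing too large, or from becoming a hot spot, is to spread it across several row keys. The sketch below is an assumption about how that could look for the test/cell index, not the approach described in the talk: `NUM_SHARDS`, `shard_key`, `all_shard_keys`, and the key format are all hypothetical.

```python
import zlib

NUM_SHARDS = 16  # hypothetical value; pick based on expected row size


def shard_key(test_id, cell, cust_id, num_shards=NUM_SHARDS):
    """Row key of the reverse-index shard that holds this customer."""
    # crc32 is stable across processes, unlike Python's built-in hash() on str.
    shard = zlib.crc32(str(cust_id).encode("utf-8")) % num_shards
    return f"{test_id}:{cell}:{shard}"


def all_shard_keys(test_id, cell, num_shards=NUM_SHARDS):
    """Every row key that must be read to find all customers in a test/cell."""
    return [f"{test_id}:{cell}:{s}" for s in range(num_shards)]
```

The trade-off is that finding every customer in a test/cell now takes one read per shard, in exchange for smaller rows that are cheaper to read, compact, and repair.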