Hindsight is 20/20:
MySQL to Cassandra
Michael Kjellman (@mkjellman)
Barracuda Networks
#cassandra13
What I Do
• Build and maintain “real-time” Spam detection
and Web Filter classification
• Java/Perl/C (and bits of everything else)
• Author perlcassa (Perl C* client)
• Frontend? Backend? Customer? Internal?
Broken RAID Card? Bad Disk? I touch it all.
Our C* Cluster
• In production for ~2 years since 0.8
• Running 1.2.5 + minor patches
• 24 nodes in 2 datacenters
• (2) 2TB Hard Drives (no RAID)
• (1) Small SSD for small hot CFs
• 64GB of RAM
• Puppet for management
• Cobbler for deployment
• Target max load at 600GB per node
What is “real-time” exactly?
Our Rewrite by the Numbers
Metric                                         Cassandra Based   MySQL Based
Average Application Latency                    2.41 ms           5.0 ms
Elements in Database                           32,836,767        3,946,713
Elements Application Handles                   32,836,767        314,974
Element Seen Prior to Tracking                 1st request       Various thresholds
Datacenters                                    2                 1
Average Latency of Automated Classification    3 seconds         8 minutes
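As a rough reading of the table above, the ratios between the two systems can be worked out directly. The snippet below is only arithmetic on the figures quoted on this slide; the dictionary keys and the `ratio` helper are illustrative names, not anything from the talk.

```python
# Figures copied from the "Our Rewrite by the Numbers" slide.
cassandra = {"latency_ms": 2.41, "elements_handled": 32_836_767, "classification_s": 3}
mysql = {"latency_ms": 5.0, "elements_handled": 314_974, "classification_s": 8 * 60}

def ratio(metric, higher_is_better=False):
    """How many times better the Cassandra-based system is on one metric."""
    if higher_is_better:
        return cassandra[metric] / mysql[metric]
    return mysql[metric] / cassandra[metric]

latency_speedup = ratio("latency_ms")
classification_speedup = ratio("classification_s")
elements_growth = ratio("elements_handled", higher_is_better=True)

print(f"application latency: {latency_speedup:.1f}x lower")
print(f"automated classification: {classification_speedup:.0f}x faster")
print(f"elements handled: {elements_growth:.0f}x more")
```

The largest gain is in automated classification latency (8 minutes down to 3 seconds), while the application now handles roughly a hundred times more elements.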
Should you Rewrite?
• How To Survive a Ground-Up Rewrite Without
Losing Your Sanity[1] – Joel Spolsky
• Past engineering decisions preventing
implementation of new business requirements
• New threats smarter and more targeted
[1]http://onstartups.com/tabid/3339/bid/97052/How-To-Survive-a-Ground-Up-Rewrite-Without-Losing-Your-Sanity.aspx
Evolving Legacy Systems
• Even good developers can write sloppy code
• Too much duct tape
– Most layers applied around the database
Hitting the Reset Button
• Plan for continuous failure
• Easily Scalable
• No Single Point of Failure – that you know of
• Many smaller boxes vs. one monolithic box
Whiteboard to Reality
• Get technical buy-in from all parties
• Migrate and rewrite in stages
– Business requirements forced hybrid period with the
old and new systems operated in parallel
Cassandra is Not…
1. Direct MySQL replacement
2. Magic bullet to solve everything
Migrating
• Painful
• Painful
• Painful
• Tons of rewriting
• Tons of regressions
• Did I say painful?
So Why Migrate?
• C* is the best option for persistence tier
• Business success motivation
• Don't let your database hold you back
Lessons Learned (the good)
• Carefully defining data model up front
• Creating a flexible systems architecture that
adapts well to changes during implementation
• Seriously – “Measure twice, cut once.”
Lessons Learned (the bad)
• Consider migration and delivery requirements
from the very beginning
• Adjust expectations: we didn't expect to rely on
legacy systems for so long
• Make syncing data between systems a priority
Tips
1. Define requirements early
2. Start with the queries
3. Think differently regarding reads
4. Syncing and migrating data
5. Don't use C* as a queue
6. Estimate capacity
7. Automate, Automate, Automate
8. Some maintenance required
1. Define Requirements Early
• What kind of queries will your application make?
• Do you need ordered results for all of your
rows?
• What is your read load? Write load?
2. Start with the Queries
• C* != “#dontneedtothinkaboutmyschema”
• Counters and Composites
• Optimize for use case
– Don't be afraid of writes. Storage is cheap.
– Optimize to reduce the number of tombstones
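One way to picture "start with the queries" is to write each record once per query it must serve, so that every read becomes a single-key lookup. The sketch below uses plain Python dictionaries as stand-ins for column families; the two queries, the field names, and the `record_classification` helper are hypothetical and not taken from the talk.

```python
from collections import defaultdict

# Stand-ins for two column families, each keyed for exactly one query.
# (Hypothetical queries: "classifications for a domain" and
#  "domains in a category".)
by_domain = defaultdict(list)      # partition key: domain
by_category = defaultdict(list)    # partition key: category

def record_classification(domain, category, seen_at):
    """Write the same fact twice: extra writes buy cheap single-key reads."""
    by_domain[domain].append((seen_at, category))
    by_category[category].append((seen_at, domain))

record_classification("example.com", "news", 1)
record_classification("example.org", "news", 2)
record_classification("example.com", "sports", 3)

# Each query touches one key; no scan or join is needed at read time.
print(by_domain["example.com"])   # [(1, 'news'), (3, 'sports')]
print(by_category["news"])        # [(1, 'example.com'), (2, 'example.org')]
```

The duplicated write is the price of the fast read, which is the trade-off the "storage is cheap" bullet points at.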
3. Think Differently Regarding Reads
• Do you really need all that data at once?
• mysql> SELECT * FROM mysupercooltable WHERE foo = 'bar';
– Slow, but will eventually work
• cqlsh> SELECT * FROM myreallybigcf WHERE foo = 'bar';
– Won't work. Expect RPC timeout exceptions on reads generally
after ~10,000 rows even with paging
• Our solutions:
– ElasticSearch
– Hadoop/Pig
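The idea of asking for less data at once can be sketched in code: instead of one unbounded SELECT, read a fixed-size slice and resume from the last key seen. The example below simulates this against an in-memory sorted list; `fetch_page`, the page size, and the data are assumptions for illustration, not a real driver API.

```python
from bisect import bisect_right

# Stand-in for one wide row: column names kept in sorted order.
columns = [f"col{i:05d}" for i in range(25)]

def fetch_page(start_after, limit):
    """Return up to `limit` columns that sort strictly after `start_after`."""
    begin = bisect_right(columns, start_after) if start_after is not None else 0
    return columns[begin:begin + limit]

def iter_all(limit=10):
    """Walk the whole row in bounded slices instead of one huge read."""
    last = None
    while True:
        page = fetch_page(last, limit)
        if not page:
            return
        yield page
        last = page[-1]

pages = list(iter_all(limit=10))
print([len(p) for p in pages])  # [10, 10, 5]
```

Each request stays small and bounded, so a slow slice can be retried on its own rather than timing out the whole read.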
4. Syncing and Migrating Data
• Sync and migration scripts: take them more seriously
than production code
• Design sync to be continuous with both systems
running in parallel during migration
• Prioritize the sync
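A continuous sync can be sketched as a repeatable pass that copies everything changed since the last checkpoint and is safe to run again. The code below is a minimal illustration with dictionaries standing in for the legacy and new stores; the `updated_at` field, the checkpoint scheme, and `sync_once` are assumptions, not the scripts used at Barracuda.

```python
def sync_once(legacy, target, checkpoint):
    """Copy rows modified after `checkpoint`; return the new checkpoint.

    Re-running with an old checkpoint only rewrites identical rows,
    so the pass is idempotent.
    """
    newest = checkpoint
    for key, row in legacy.items():
        if row["updated_at"] > checkpoint:
            target[key] = dict(row)          # upsert: overwrite or create
            newest = max(newest, row["updated_at"])
    return newest

legacy = {
    "a": {"value": 1, "updated_at": 10},
    "b": {"value": 2, "updated_at": 20},
}
target = {}

checkpoint = sync_once(legacy, target, checkpoint=0)      # initial backfill
legacy["a"] = {"value": 5, "updated_at": 30}              # legacy keeps changing
checkpoint = sync_once(legacy, target, checkpoint)        # incremental pass

print(checkpoint)        # 30
print(target == legacy)  # True
```

This sketch does not handle deletes in the legacy store; a real sync would need some way to propagate them, for example soft-delete markers.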
5. Don't use C* as a Queue
• Cassandra anti-patterns: Queues and queue-like
datasets[2] – Aleksey Yeschenko
• Tombstones + read performance
• Our solution:
– Kafka (multiple publisher, multiple consumer durable
queue)
[2]http://www.datastax.com/dev/blog/cassandra-anti-patterns-queues-and-queue-like-datasets
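The tombstone problem behind this anti-pattern can be illustrated with a toy model: consuming from the head of a queue-like row leaves deletion markers that every later read must step over until compaction purges them. The class below is a simplified simulation, not how Cassandra stores data internally, and all names in it are hypothetical.

```python
class QueueLikeRow:
    """Toy model of a row used as a queue, with deletes kept as tombstones."""

    def __init__(self):
        self.cells = []  # list of [value, deleted] in insertion order

    def push(self, value):
        self.cells.append([value, False])

    def pop(self):
        """Read the first live cell; return (value, cells_scanned)."""
        scanned = 0
        for cell in self.cells:
            scanned += 1
            if not cell[1]:
                cell[1] = True  # delete = mark as tombstone; the cell stays
                return cell[0], scanned
        return None, scanned

row = QueueLikeRow()
for i in range(1000):
    row.push(i)

scans = [row.pop()[1] for _ in range(1000)]
print(scans[0], scans[-1])  # 1 1000
print(sum(scans))           # 500500
```

In this model the cost of each dequeue grows with the number of items already consumed, which is the read-performance penalty the slide refers to.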
6. Estimate Capacity
• Don't forget the Java heap (8GB Max)
• Plan capacity – today and future
• Stress Tool – profile node and multiply
• MySQL hardware != Cassandra hardware
• New bottlenecks thanks to C* being so
awesome?
• I/O still an important concern with C*
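A back-of-the-envelope node count follows from the per-node load target. The sketch below uses the 600GB per-node target mentioned earlier in the talk; the dataset size, replication factor, and growth rate are made-up inputs, and the formula ignores compaction headroom and other overheads.

```python
import math

def nodes_needed(raw_data_gb, replication_factor, target_load_gb_per_node,
                 yearly_growth=0.0, years=0):
    """Estimate node count from replicated data size and a per-node load cap."""
    future_gb = raw_data_gb * (1 + yearly_growth) ** years
    replicated_gb = future_gb * replication_factor
    return math.ceil(replicated_gb / target_load_gb_per_node)

# Hypothetical inputs: 4 TB of unique data, RF=3, 600 GB per node.
today = nodes_needed(4000, 3, 600)
in_two_years = nodes_needed(4000, 3, 600, yearly_growth=0.5, years=2)
print(today, in_two_years)  # 20 45
```

A storage-only estimate like this is a starting point; profiling a single node with the stress tool and multiplying, as the slide suggests, also covers throughput and I/O.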
7. Automate, Automate, Automate
• Love your inner Ops self. Distributed systems
move complexity to operations.
• Puppet or something similar (really)
• Learn CCM earlier rather than later
– www.github.com/pcmanus/ccm
8. Some Maintenance Required
• Repairs & Cleanup ops
– automate and run frequently
• Rolling restart, meet rolling repair
• Learn jconsole
• Solution:
– Jolokia (JMX via HTTP)
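Jolokia exposes JMX attributes over plain HTTP, so values one would otherwise look up in jconsole can be polled by scripts. The sketch below builds a Jolokia read URL for the standard `java.lang:type=Memory` MBean and parses a response of the shape Jolokia returns. The host name, the agent port (8778 is the Jolokia JVM agent default), and the sample numbers are assumptions, and no request is actually sent.

```python
import json

def jolokia_read_url(host, mbean, attribute, port=8778):
    """Build a Jolokia GET URL of the form /jolokia/read/<mbean>/<attribute>."""
    return f"http://{host}:{port}/jolokia/read/{mbean}/{attribute}"

def heap_used_fraction(response_text):
    """Extract used/max heap from a HeapMemoryUsage read response."""
    body = json.loads(response_text)
    if body.get("status") != 200:
        raise RuntimeError(f"Jolokia error: {body.get('error')}")
    usage = body["value"]
    return usage["used"] / usage["max"]

url = jolokia_read_url("cass-node-01.example.com",
                       "java.lang:type=Memory", "HeapMemoryUsage")

# Canned response with made-up numbers (8 GB max heap, 6 GB used).
sample = json.dumps({
    "request": {"mbean": "java.lang:type=Memory",
                "attribute": "HeapMemoryUsage", "type": "read"},
    "value": {"init": 8589934592, "committed": 8589934592,
              "max": 8589934592, "used": 6442450944},
    "status": 200,
})

print(url)
print(heap_used_fraction(sample))  # 0.75
```

In a real check the response text would come from an HTTP GET on `url`, for example with `urllib.request.urlopen`, and the result could feed the same alerting used for the rest of the cluster.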
Where is Barracuda Today?
• 2 years in production with Cassandra
• Definitely the right choice for our persistence tier
• 2 product lines on C* based system and another
major product in beta
• Achieved “real-time” response
2.0 and Beyond
• Thrift -> CQL
• CQL helps the MySQL to C* migration
– Easier to comprehend / grasp
• Everyone understands SELECT * FROM cf WHERE key = 'foo';
• CAS and other 2.0 features make C* an even
better replacement option for MySQL
C* Community
• Supercalifragilisticexpialidocious community!
• Riak, HBase, Oracle are other options. How is
their dev community?
• Great client support. Great people. Great
motivated developers.
• IRC: #cassandra on freenode
• Mailing List: user@cassandra.apache.org

 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 

Kürzlich hochgeladen (20)

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

C* Summit 2013 - Hindsight is 20/20. MySQL to Cassandra by Michael Kjellman

  • 1. Hindsight is 20/20: MySQL to Cassandra Michael Kjellman (@mkjellman) Barracuda Networks #cassandra13
  • 2. What I Do • Build and maintain “real-time” Spam detection and Web Filter classification • Java/Perl/C (and bits of everything else) • Author perlcassa (Perl C* client) • Frontend? Backend? Customer? Internal? Broken RAID Card? Bad Disk? I touch it all.
  • 3. Our C* Cluster • In production for ~2 years since 0.8 • Running 1.2.5 + minor patches • 24 nodes in 2 datacenters • (2) 2TB Hard Drives (no RAID) • (1) Small SSD for small hot CFs • 64GB of RAM • Puppet for management • Cobbler for deployment • Target max load at 600GB per node
  • 4. What is “real-time” exactly?
  • 6. Our Rewrite by the Numbers (Cassandra-based vs. MySQL-based)
      – Average application latency: 2.41 ms vs. 5.0 ms
      – Elements in database: 32,836,767 vs. 3,946,713
      – Elements application handles: 32,836,767 vs. 314,974
      – Element seen prior to tracking: 1st request vs. various thresholds
      – Datacenters: 2 vs. 1
      – Average latency of automated classification: 3 seconds vs. 8 minutes
  • 7. Should you Rewrite? • How To Survive a Ground-Up Rewrite Without Losing Your Sanity[1] – Joel Spolsky • Past engineering decisions preventing implementation of new business requirements • New threats smarter and more targeted [1] http://onstartups.com/tabid/3339/bid/97052/How-To-Survive-a-Ground-Up-Rewrite-Without-Losing-Your-Sanity.aspx
  • 8. Evolving Legacy Systems • Even good developers can write sloppy code • Too much duct tape – Most layers applied around the database
  • 9. Hitting the Reset Button • Plan for continuous failure • Easily Scalable • No Single Point of Failure – that you know of • Many smaller boxes vs. one monolithic box
  • 10. Whiteboard to Reality • Get technical buy-in from all parties • Migrate and rewrite in stages – Business requirements forced hybrid period with the old and new systems operated in parallel
  • 12. Cassandra is Not… 1. Direct MySQL replacement 2. Magic bullet to solve everything
  • 13. Migrating • Painful • Painful • Painful • Tons of rewriting • Tons of regressions • Did I say painful?
  • 14. So Why Migrate? • C* is the best option for persistence tier • Business success motivation • Don’t let your database hold you back
  • 15. Lessons Learned (the good) • Carefully defining data model up front • Creating a flexible systems architecture that adapts well to changes during implementation • Seriously – “Measure twice, cut once.”
  • 16. Lessons Learned (the bad) • Consider migration and delivery requirements from the very beginning • Adjust expectations – didn’t expect relying on legacy systems for so long • Make syncing data between systems a priority
  • 17. Tips 1. Define requirements early 2. Start with the queries 3. Think differently regarding reads 4. Syncing and migrating data 5. Don’t use C* as a queue 6. Estimate capacity 7. Automate, Automate, Automate 8. Some maintenance required
  • 18. 1. Define Requirements Early • What kind of queries will your application make? • Do you need ordered results for all of your rows? • What is your read load? Write load?
  • 19. 2. Start with the Queries • C* != “#dontneedtothinkaboutmyschema” • Counters and Composites • Optimize for use case – Don’t be afraid of writes. Storage is cheap. – Optimize to reduce the number of tombstones
  • 20. 3. Think Differently Regarding Reads • Do you really need all that data at once? • mysql> SELECT * FROM mysupercooltable WHERE foo = 'bar'; – Slow, but eventually will work • cqlsh> SELECT * FROM myreallybigcf WHERE foo = 'bar'; – Won’t work. Expect RPC timeout exceptions on reads generally after ~10,000 rows even with paging • Our solutions: – ElasticSearch – Hadoop/Pig
  • 21. 4. Syncing and Migrating Data • Sync and migration scripts – take more seriously than production code • Design sync to be continuous with both systems running in parallel during migration • Prioritize the sync
  • 22. 5. Don’t use C* as a Queue • Cassandra anti-patterns: Queues and queue-like datasets[2] – Aleksey Yeschenko • Tombstones + read performance • Our solution: – Kafka (multiple publisher, multiple consumer durable queue) [2] http://www.datastax.com/dev/blog/cassandra-anti-patterns-queues-and-queue-like-datasets
  • 23. 6. Estimate Capacity • Don’t forget the Java heap (8GB Max) • Plan capacity – today and future • Stress Tool – profile node and multiply • MySQL hardware != Cassandra hardware • New bottlenecks thanks to C* being so awesome? • I/O still an important concern with C*
  • 24. 7. Automate, Automate, Automate • Love your inner Ops self. Distributed systems move complexity to operations. • Puppet or something similar (really) • Learn CCM earlier rather than later – www.github.com/pcmanus/ccm
  • 25. 8. Some Maintenance Required • Repairs & Cleanup ops – automate and run frequently • Rolling restart meet rolling repair • Learn jconsole • Solution: – Jolokia (JMX via HTTP)
  • 26. Where is Barracuda Today? • 2 years in production with Cassandra • Definitely the right choice for our persistence tier • 2 product lines on C* based system and another major product in beta • Achieved “real-time” response
  • 27. 2.0 and Beyond • Thrift -> CQL • CQL helps the MySQL to C* migration – Easier to comprehend / grasp • Everyone understands SELECT * FROM cf WHERE key = 'foo'; • CAS and other 2.0 features make C* an even better replacement option for MySQL
  • 28. C* Community • Supercalifragilisticexpialidocious community! • Riak, HBase, Oracle are other options. How is their dev community? • Great client support. Great people. Great motivated developers. • IRC: #cassandra on freenode • Mailing List: user@cassandra.apache.org
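
For tip 3 (reads), one common alternative to a single unbounded SELECT * is to walk the partitioner's token range in bounded slices. This is not the approach the talk describes (the slides name ElasticSearch and Hadoop/Pig); it is a minimal sketch that only generates CQL strings. It assumes the Murmur3Partitioner token range of -2^63 to 2^63-1, and the partition key column name `id` is hypothetical.

```python
MIN_TOKEN = -(2 ** 63)    # assumed: Murmur3Partitioner lower bound
MAX_TOKEN = 2 ** 63 - 1   # assumed: Murmur3Partitioner upper bound


def token_slices(num_slices):
    """Split the full token range into contiguous (start, end] slices."""
    if num_slices < 1:
        raise ValueError("num_slices must be at least 1")
    span = MAX_TOKEN - MIN_TOKEN
    # Integer arithmetic keeps the bounds exact for 64-bit token values.
    bounds = [MIN_TOKEN + (span * i) // num_slices for i in range(num_slices + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def slice_queries(table, key_column, num_slices):
    """Build one bounded CQL query string per token slice."""
    template = "SELECT * FROM {t} WHERE token({k}) > {lo} AND token({k}) <= {hi};"
    return [
        template.format(t=table, k=key_column, lo=lo, hi=hi)
        for lo, hi in token_slices(num_slices)
    ]


if __name__ == "__main__":
    # Hypothetical table and key column, for illustration only.
    for query in slice_queries("myreallybigcf", "id", 4):
        print(query)
```

Each query touches only a fraction of the ring, so a slow or failed slice can be retried on its own. The slice count would need tuning against real row counts and timeouts.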

Editor's Notes

  1. Usage changed and increased significantly.
  2. It’s never really real time. Is it 1 second? 3 seconds? 1 hour? When do you have a business problem because you are not “real-time” enough?
  3. We had a technical “real-time” issue that translated (more importantly) into a business problem: we weren’t catching spam fast enough. Example: vimaseg.com.br took 8 minutes from the first hit to classification, which translated into 180 messages in customers’ inboxes. How to close that gap to near zero? The new system classified the same domain 3 seconds after the first hit, with 0 messages in customers’ inboxes.
  4. Our Rewrite by the numbers
  5. The data grows as the business continues to grow, and there is a need to consolidate and aggregate data across products and systems.
  6. What does “legacy” bring to mind at most companies? Ops team duct tape (the data has a life of its own). Over time, the various layers of duct tape make operations harder and harder. Systems are built with good intentions but frequently hit an inflection point where the underlying database problem can’t be fixed anymore. Duct tape isn’t good enough anymore (add a slave, add memcache, attempt to better batch queries).
  7. If the legacy system is preventing implementation, then a new system design is required. Our inflection point: throwing away valuable data to keep the system stable. Five years ago, continuous failure in your persistence tier was virtually unthinkable.
  8. Getting technical buy-in from all parties that C* and other tools were the “right” tools going forward. We had to engineer our migration and rewrite in stages to provide tangible business value earlier. We couldn’t just “go away” for a year and promise a perfect solution sometime down the road. Business requirements forced a hybrid period with the old and new systems operated in parallel.
  9. The up-front costs are high, but the ability to implement anything going forward is a powerful proposition.
  10. The old problems won’t go away during the migration. Prepare to manage expectations that things might get worse before they get better.
  11. What kind of queries will your application make? Do you need ordered results for all of your rows? (Solr or ElasticSearch) What is your read load? What is your write load? It almost certainly won’t be what you think it is. Get real numbers.
  12. C* != “#dontneedtothinkaboutmyschema”. Counters and composites. Optimize for the use case. Don’t be afraid of writes. Storage is cheap. If multiple writes make for a cleaner, simpler read path, do it. Optimize to reduce the number of tombstones.
  13. Talk about the first iteration, where I also tried the SELECT * approach to prefill our cache: not necessary and, more importantly, bad design. MySQL / relational database mentality of batch retrieval. It was possible to get the same result, but it required different thinking and logic.
  14. Almost impossible to get it right the first time. Give the example of elements that were stored in MySQL incorrectly with a timestamp of 0 (the epoch). I incorrectly assumed that only timestamps > 0 would be valid, so our initial sync missed all elements with the incorrect timestamp of 0. How we had to split up our sync code into pieces. How important is the speed of your syncing?
  15. Give the example of bcd: to remove and make external changes in the hashtable, bcd would read every n seconds from MySQL (SELECT *) and then delete all rows after retrieving the records. Goes back to article number 2.
  16. If MySQL was the bottleneck before, other elements might become the bottleneck after migrating to C*.
  17. Deploying changes to distributed systems is more complicated and more prone to human error. Give the example of a person who tried to manually upgrade a 30+ node cluster and made a human error that resulted in the app being down. With distributed systems comes more complication, and minor mistakes can lead to cascading failures.
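
The timestamp pitfall in note 14 can be illustrated with a small sketch. The row layout, the `updated_at` field name, the sample rows, and both function names are hypothetical and not taken from the talk. The point is only that a `> 0` filter silently skips rows whose timestamp was stored as 0.

```python
def rows_to_sync_buggy(rows):
    """Initial assumption: only timestamps greater than 0 are valid."""
    return [row for row in rows if row["updated_at"] > 0]


def rows_to_sync(rows):
    """Sync every row, and separately report rows with a timestamp of 0."""
    suspicious = [row for row in rows if row["updated_at"] == 0]
    return list(rows), suspicious


# Hypothetical legacy rows, for illustration only.
legacy_rows = [
    {"element": "a.example", "updated_at": 1370000000},
    {"element": "b.example", "updated_at": 0},  # stored incorrectly as the epoch
]

missed = len(legacy_rows) - len(rows_to_sync_buggy(legacy_rows))
synced, flagged = rows_to_sync(legacy_rows)
print(f"buggy filter missed {missed} row(s); "
      f"fixed sync copies {len(synced)} and flags {len(flagged)}")
```

Comparing source and destination row counts after each sync pass is one cheap way to catch this kind of silent drop early.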