Scaling With Cassandra
    Jeff Bollinger – CTO - @jbollinger
  Jeff Smoley – Infrastructure Architect
Agenda


About NativeX
The Backstory
Why Cassandra
Cassandra Overview
NativeX Cassandra Implementation / Metrics
What we Learned
NativeX
Formerly W3i
Marketing technology platform
that enables developers to build
successful businesses around
their apps.
Vanity Metrics


Over 620M unique devices on our network
Over 500 apps in network
> 100M Monthly Active Users
100 GB of data ingest per week
Backstory

A growing mobile advertising network
[Chart: API Requests per quarter (billions), 2011 Q4 to 2013 Q1]
Infrastructure Intensive Model

[Chart: Session Calls by Week After User Acquired (millions), weeks 0 to 12, annotated "Lifetime of user"]
Scale Up Architecture

Microsoft SQL Server
  2 Node Cluster (failover)
  12 cores / node
  192 GB of RAM / node
Compellent SAN
  172 Disk (SSD,FC,SATA)
CAP Theorem

[Diagram: CAP triangle]
Consistency + Availability: SQL Server, MySQL
Consistency + Partition Tolerance: MongoDB
Availability + Partition Tolerance: Cassandra
Objectives


Scale
  Horizontal
  Incremental cost structure
Resiliency
  No single point of failure
  Geographically distributed
What Needed to Scale


Web Application Tier
Database Tier


Web Application Tier is already a server farm that can scale
horizontally through our VMWare environment.
Database Tier was one giant monolithic Microsoft SQL
Server machine.
What is NoSQL?


Stands for Not Only SQL
The NoSQL movement is not about silver bullets and
black boxes.
It’s about understanding problems and focusing on
solutions.
It’s about using the right tool for the right problem.
Selecting Cassandra

DB             Distributed  Maturity  High Availability  Style                      Documentation  Native Language Drivers  Popularity
MongoDB        Yes          Medium    Yes                Document - NoSQL           Excellent      Major Languages          High
VoltDB         Yes          Low       Yes                RDBMS - SQL                Good           Major Languages          Low
MySQL Cluster  Yes          High      Yes                RDBMS - SQL & Key/Value    Excellent      Major Languages          Medium
MySQL ScaleDB  Yes          Low       Yes                RDBMS - SQL                Good           Major Languages          Low
Cassandra      Yes          Medium    Yes                Key/Value - Column Family  Excellent      Major; Poor .Net         High
CouchDB        No           Medium    Yes                Document - NoSQL           ?              No - REST only           Medium
RavenDB        Yes?         Low       No                 Document - NoSQL           Poor           C#, JS, REST             Medium
Couchbase      Yes          Medium    Yes                Key/Value - Document       Good           Major Languages          Medium

 http://nosql.mypopescu.com/ is a helpful site for discovering and learning about
 different DB Systems.


  *Disclaimer: this data was compiled in spring of 2012 and may not reflect the
  current state of each database system shown here.
Top Choices


Considered Multiple DB Providers
  MySQL Cluster
    Relational and very familiar.
    Has physical row limitations.
  MongoDB
    Data modeling was simpler than C*.
    Not very clear if it had multi-cluster support.
  Cassandra
    At the very core it’s all about scalability and resiliency.
    Data modeling a little scary, limited .Net support.
Cassandra

Multi-node
Multi-cluster
Highly Available
Durable
Tunable Consistency
Shared Nothing
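
Tunable consistency is usually explained with the replica-overlap rule: with N replicas, a write acknowledged by W replicas and a read answered by R replicas must share at least one replica whenever R + W > N. The sketch below only illustrates that rule. The function names and the replication factor of 3 are assumptions for the example, not settings taken from the NativeX cluster.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that form a majority (QUORUM)."""
    return replication_factor // 2 + 1


def read_overlaps_write(n: int, r: int, w: int) -> bool:
    """True when every read set must intersect every write set (R + W > N)."""
    return r + w > n


N = 3  # assumed replication factor, for illustration only
q = quorum(N)
print("ONE/ONE overlaps:", read_overlaps_write(N, r=1, w=1))
print("QUORUM/QUORUM overlaps:", read_overlaps_write(N, r=q, w=q))
```

Lower R and W trade that overlap guarantee for lower latency and higher availability, which is the "tunable" part.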
C* at NativeX


C* was not a replacement DB system, but an addition.
C* solves a very specific problem (for us).
  Writing large volumes of data quickly.
  Reading very specific data out of a large record set.
NoSQL solutions, like C*, are not meant to be a
replacement for everything.
  You will make your life harder if you try!
The same should be said about Relational Databases.
  They don’t solve every problem!
Data Classification


We have three major classifications of data.
  Configuration
  Activity Tracking
  Device History
Configuration Data


This data is relatively small in total size and is used
to operationally run our products. Examples
include:
   Mobile Apps
   Offers
   Campaigns
   Restrictions
   Queue Settings
This data is typically relational and therefore
continues to be stored in MS SQL Server.
The Very Basics of C* Data Modeling


Data is stored inside of Column Families using nested Key/Value pairs.
A Row Key maps to a collection of Columns.
A Column Name (AKA Column Key) maps to a Column Value.
The Column Name is stored alongside the Value.
A common strategy is to store JSON/XML in the Column Value.
(Side note, if you’ve heard of Super Columns, forget about them, they
hurt more than they help)
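
The nested Key/Value layout described above can be pictured as a map of maps. The following is a toy in-memory sketch in Python, not Cassandra's API. The class name, the device ID, and the JSON fields are hypothetical and chosen only to show a Row Key mapping to Columns that are read back ordered by Column Name.

```python
import json


class ColumnFamily:
    """Toy model of a Column Family: row key -> {column name -> value}."""

    def __init__(self):
        self._rows = {}

    def insert(self, row_key, column_name, value):
        # The column name is stored alongside its value inside the row.
        self._rows.setdefault(row_key, {})[column_name] = value

    def get_slice(self, row_key, start, end):
        """Return (name, value) pairs with start <= name <= end, sorted by name."""
        columns = self._rows.get(row_key, {})
        return [(name, columns[name]) for name in sorted(columns)
                if start <= name <= end]


history = ColumnFamily()
history.insert("device-123", "2013-04-01T10:00:00Z",
               json.dumps({"event": "click", "offer": 42}))
history.insert("device-123", "2013-04-02T09:30:00Z",
               json.dumps({"event": "run", "app": 7}))
april_first = history.get_slice("device-123", "2013-04-01", "2013-04-01T23:59:59Z")
print(april_first)
```

Because columns within a row are ordered by name, a timestamp-like Column Name makes "everything for this key in a time range" a single slice.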
Activity Tracking Data

Raw tracking data for all activities used by the ETL process to
produce OLAP data on an hourly basis.
Synonymous with Time Series, Event Series, or Logging data.
Examples include:
  Running of Mobile Apps
  Viewing Offers
  Clicking on Offers
  Receiving Rewards
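
One common way to lay out time series data for an hourly consumer is to bucket events into one row per activity type per hour, so the ETL job can read a whole hour with a single row lookup. The sketch below assumes that layout; the slides do not say how NativeX actually keyed these rows, and the key formats and names here are hypothetical.

```python
from datetime import datetime, timezone


def row_key(activity: str, when: datetime) -> str:
    """Bucket events by activity type and UTC hour (assumed key format)."""
    return f"{activity}:{when.astimezone(timezone.utc):%Y%m%d%H}"


def column_name(when: datetime, event_id: str) -> str:
    """Timestamp first so columns sort chronologically; event_id breaks ties."""
    return f"{when.astimezone(timezone.utc).isoformat()}:{event_id}"


clicked_at = datetime(2013, 4, 6, 15, 30, tzinfo=timezone.utc)
print(row_key("offer_click", clicked_at))
print(column_name(clicked_at, "evt-1"))
```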
Device History Data


Historical activities that each device has performed while
being part of NativeX’s network.
Used for offer classification for a given device.
Examples include:
   Clicking on Offers
   Running Mobile Apps
   Redeeming Rewards
Hardware


12 Nodes
Cisco UCS Blades
  12 Cores @ 2.0GHz with Hyper-threading
  64 GB of RAM
2 x 480GB Intel commodity SSDs in RAID 0
  10.5 TB total, ~7 TB usable
Red Hat Linux
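
The raw capacity figure can be checked with simple arithmetic. The sketch below assumes the 10.5 TB total is the drives' decimal capacity expressed in binary terabytes (TiB), which is an interpretation and not something the slide states. It does not attempt to derive the ~7 TB usable figure.

```python
NODES = 12
SSDS_PER_NODE = 2
SSD_GB = 480  # decimal gigabytes per drive

raw_bytes = NODES * SSDS_PER_NODE * SSD_GB * 10**9
raw_tb = raw_bytes / 10**12  # decimal terabytes
raw_tib = raw_bytes / 2**40  # binary terabytes (assumed unit of the slide's total)

print(f"raw capacity: {raw_tb:.2f} TB decimal, {raw_tib:.2f} TiB binary")
```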
Commodity Vs. Enterprise


We chose to use Enterprise hardware for the servers
so that we would have support for them.
However, our workload is very read heavy and 15K
rpm rotational disks were a bottleneck.
We chose to swap out the rotational for commodity
SSDs. (Enterprise SSDs were 10x as expensive)
We have limited support on the hardware because of
this.
Internal C* Cluster Stats


240 peak Writes per second per node
  2,880/sec cluster wide
888 peak Reads per second per node
  10,656/sec cluster wide
0.53 ms average Write Latency per request
1.7 ms average Read Latency per request
Almost 3 TB of data adding 1 TB a month
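
The cluster-wide numbers above are the per-node peaks multiplied by the 12 nodes, which implicitly assumes every node peaks at the same time. A short sketch of that arithmetic, along with the resulting read-to-write ratio (variable names are illustrative):

```python
NODES = 12
PEAK_WRITES_PER_NODE = 240
PEAK_READS_PER_NODE = 888

cluster_writes = NODES * PEAK_WRITES_PER_NODE
cluster_reads = NODES * PEAK_READS_PER_NODE
read_write_ratio = PEAK_READS_PER_NODE / PEAK_WRITES_PER_NODE

print(f"writes/sec cluster wide: {cluster_writes:,}")
print(f"reads/sec cluster wide:  {cluster_reads:,}")
print(f"reads per write at peak: {read_write_ratio:.1f}")
```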
Application Side Latencies


MS SQL
 Writes 12 ms
 Reads 1.5 ms
C*
 Writes 3 ms
 Reads 4 ms
Can We Make Reads Faster?


We think that in SQL Server, reads were faster
because most of the data sat in memory.
We might be able to achieve lower latencies in C* if
we gave each node just as much memory as our SQL
Server.
To counteract the increased latencies we used
certain techniques like parallel reads using
multi-threading in our web application.
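
The parallel-read idea can be sketched as follows. This is Python rather than the web application's own language, and `read_row` is a hypothetical stand-in that just sleeps for about 4 ms to imitate one blocking read; no real Cassandra driver is called.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def read_row(row_key: str) -> str:
    """Stand-in for one blocking read (hypothetical; sleeps ~4 ms)."""
    time.sleep(0.004)
    return f"value-for-{row_key}"


def read_many(row_keys, max_workers=8):
    """Issue the reads concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(read_row, row_keys))


keys = [f"device-{i}" for i in range(8)]
start = time.perf_counter()
results = read_many(keys)
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"{len(results)} reads in {elapsed_ms:.1f} ms")
```

Issued one after another, eight such reads would take roughly eight times the single-read latency; issued together, the request waits for about as long as the slowest read.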
Not all Roses


There are still challenges with C*, like any complex
system.
More moving parts and things that need to stay in
sync.
Misconfigurations can literally destroy your data.
Certain config settings cannot be changed after you
are live, such as the number of virtual Racks.
Lessons Learned

Get into production early
Data Import = Reality
Break down communication barriers
Understanding your IO profile is really important
Cassandra changes quickly, you need to keep up
Scalable systems like C* have a massive number of
knobs, you need to know them
Leverage cloud resources in working toward right
sizing your cluster
Thanks


We’re hiring
  http://nativex.com/careers/
Join the MSP C* Meetup
  http://www.meetup.com/Minneapolis-St-Paul-Cassandra-Meetup/
Email us
  Jeff.Smoley@nativex.com
  Jeff.Bollinger@nativex.com or @jbollinger
Slide Deck
  http://www.slideshare.net/JBollinger/minnebar-2013-scaling-with-cassandra

 

Scaling With Cassandra: Lessons Learned From NativeX's Implementation

  • 1. Scaling With Cassandra Jeff Bollinger – CTO - @jbollinger Jeff Smoley – Infrastructure Architect
  • 2. Agenda: About NativeX; The Backstory; Why Cassandra; Cassandra Overview; NativeX Cassandra Implementation / Metrics; What We Learned.
  • 3. NativeX Formerly W3i Marketing technology platform that enables developers to build successful businesses around their apps.
  • 4. Vanity Metrics: Over 620M unique devices on our network; over 500 apps in network; > 100M Monthly Active Users; 100 GB of data ingest per week.
  • 5. Backstory: A growing mobile advertising network. [Chart: API Requests, in billions (axis 0 to 6), by quarter from 2011 Q4 to 2013 Q1]
  • 6. Infrastructure Intensive Model. [Chart: Session Calls by Week After User Acquired, in millions (axis 0 to 12), for weeks 0 to 12, annotated "Lifetime of user"]
  • 7. Scale Up Architecture: Microsoft SQL Server: 2-node cluster (failover), 12 cores / node, 192 GB of RAM / node. Compellent SAN: 172 disks (SSD, FC, SATA).
  • 8. CAP Theorem: Consistency + Availability: SQL Server, MySQL. Consistency + Partition Tolerance: MongoDB. Availability + Partition Tolerance: Cassandra.
  • 9. Objectives: Scale: horizontal; incremental cost structure. Resiliency: no single point of failure; geographically distributed.
  • 10. What Needed to Scale: Web Application Tier; Database Tier. The Web Application Tier is already a server farm that can scale horizontally through our VMware environment. The Database Tier was one giant monolithic Microsoft SQL Server machine.
  • 11. What is NoSQL? Stands for Not Only SQL. The NoSQL movement is not about silver bullets and black boxes. It’s about understanding problems and focusing on solutions. It’s about using the right tool for the right problem.
  • 12. Selecting Cassandra
      DB | Distributed | Maturity | High Availability | Style | Documentation | Native Language Drivers | Popularity
      MongoDB | Yes | Medium | Yes | Document - NoSQL | Excellent | Major Languages | High
      VoltDB | Yes | Low | Yes | RDBMS - SQL | Good | Major Languages | Low
      MySQL Cluster | Yes | High | Yes | RDBMS - SQL & Key/Value | Excellent | Major Languages | Medium
      MySQL ScaleDB | Yes | Low | Yes | RDBMS - SQL | Good | Major Languages | Low
      Cassandra | Yes | Medium | Yes | Key/Value - Column Family | Excellent | Major; Poor .Net | High
      CouchDB | No | Medium | Yes | Document - NoSQL | ? | No - REST only | Medium
      RavenDB | Yes? | Low | No | Document - NoSQL | Poor | C#, JS, REST | Medium
      Couchbase | Yes | Medium | Yes | Key/Value - Document | Good | Major Languages | Medium
      http://nosql.mypopescu.com/ is a helpful site for discovering and learning about different DB systems.
      *Disclaimer: this data was compiled in spring of 2012 and may not reflect the current state of each database system shown here.
  • 13. Top Choices: Considered multiple DB providers. MySQL Cluster: Relational and very familiar. Has physical row limitations. MongoDB: Data modeling was simpler than C*. Not very clear if it had multi-cluster support. Cassandra: At the very core it’s all about scalability and resiliency. Data modeling a little scary, limited .Net support.
  • 14. Cassandra: Multi-node; Multi-cluster; Tunable Consistency; Highly Available; Durable; Shared Nothing.
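"Tunable Consistency" on slide 14 is usually reasoned about with a simple overlap rule: a read is guaranteed to touch at least one replica that holds the latest acknowledged write when (replicas read) + (replicas written) exceeds the replication factor. The sketch below only illustrates that arithmetic. The function names and the replication factor of 3 are assumptions chosen for illustration, not NativeX's actual settings.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that form a majority (what QUORUM waits for)."""
    return replication_factor // 2 + 1


def overlaps(read_replicas: int, write_replicas: int, replication_factor: int) -> bool:
    """True when every read set must share at least one replica with every write set."""
    return read_replicas + write_replicas > replication_factor


RF = 3  # assumed replication factor, for illustration only
levels = {"ONE": 1, "QUORUM": quorum(RF), "ALL": RF}

for write_name, w in levels.items():
    for read_name, r in levels.items():
        verdict = "overlap guaranteed" if overlaps(r, w, RF) else "may read stale data"
        print(f"write={write_name:<6} read={read_name:<6} -> {verdict}")
```

Lower levels trade that guarantee for latency and availability, which is the "tunable" part.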
  • 15. C* at NativeX: C* was not a replacement DB system, but an addition. C* solves a very specific problem (for us): writing large volumes of data quickly, and reading very specific data out of a large record set. NoSQL solutions, like C*, are not meant to be a replacement for everything. You will make your life harder if you try! The same should be said about relational databases. They don’t solve every problem!
  • 16. Data Classification: We have three major classifications of data: Configuration; Activity Tracking; Device History.
  • 17. Configuration Data: This data is relatively small in total size and is used to operationally run our products. Examples include: mobile apps; offers; campaigns; restrictions; queue settings. This data is typically relational and therefore continues to be stored in MS SQL Server.
  • 18. The Very Basics of C* Data Modeling: Data is stored inside of Column Families using nested Key/Value pairs. A Row Key maps to a collection of Columns. A Column Name (AKA Column Key) maps to a Column Value. The Column Name is stored alongside the Value. A common strategy is to store JSON/XML in the Column Value. (Side note: if you’ve heard of Super Columns, forget about them; they hurt more than they help.)
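The nested key/value structure on slide 18 can be pictured as a map of maps. The following is a minimal in-memory sketch, not Cassandra code: the `put` and `get_slice` helpers, the row key `device-123`, and the event payloads are all hypothetical names invented for illustration. It shows a row key mapping to columns, JSON stored in the column value, and a range read over column names, which works because columns within a row are kept sorted by name.

```python
import json
from collections import defaultdict

# One column family: row key -> {column name -> column value}
column_family = defaultdict(dict)


def put(row_key: str, column_name: str, value: dict) -> None:
    """Store a JSON-encoded value under (row key, column name)."""
    column_family[row_key][column_name] = json.dumps(value)


def get_slice(row_key: str, start: str, end: str) -> list:
    """Return (name, value) pairs with start <= name <= end, ordered by column name."""
    row = column_family.get(row_key, {})
    return [(name, json.loads(row[name])) for name in sorted(row) if start <= name <= end]


# Hypothetical device row: one column per event, named by its timestamp.
put("device-123", "2013-04-01T10:00:00", {"event": "offer_click", "offer": 42})
put("device-123", "2013-04-02T09:30:00", {"event": "app_run", "app": 7})
put("device-123", "2013-05-01T08:00:00", {"event": "reward", "amount": 5})

april = get_slice("device-123", "2013-04-01", "2013-04-30T23:59:59")
print(april)
```

Reading "very specific data out of a large record set" then amounts to picking one row key and a narrow slice of its columns.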
  • 19. Activity Tracking Data: Raw tracking data for all activities, used by the ETL process to produce OLAP data on an hourly basis. Synonymous with Time Series, Event Series, or Logging data. Examples include: running of mobile apps; viewing offers; clicking on offers; receiving rewards.
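Slide 19 says the ETL process consumes activity data hourly. One common way to lay out time-series data for that kind of batch read is to bucket row keys by hour, so each ETL run reads a single row per activity type. The slides do not say NativeX did this; the key format, the `hour_bucket_key` helper, and the sample events below are assumptions made purely for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone


def hour_bucket_key(activity: str, when: datetime) -> str:
    """Row key of the form '<activity>:<YYYYMMDDHH>' (hypothetical layout)."""
    return f"{activity}:{when.strftime('%Y%m%d%H')}"


rows = defaultdict(dict)  # row key -> {column name (event timestamp) -> payload}

events = [
    ("offer_view", datetime(2013, 4, 6, 14, 5, 12, tzinfo=timezone.utc), "device-1"),
    ("offer_view", datetime(2013, 4, 6, 14, 59, 1, tzinfo=timezone.utc), "device-2"),
    ("offer_view", datetime(2013, 4, 6, 15, 0, 3, tzinfo=timezone.utc), "device-1"),
]

for activity, when, device in events:
    rows[hour_bucket_key(activity, when)][when.isoformat()] = device

# An hourly ETL run would read exactly one bucket per activity type.
hourly_batch = rows["offer_view:2013040614"]
print(len(hourly_batch), "events in the 14:00 UTC bucket")
```

Bucketing also caps how large any single row can grow, at the cost of needing several reads for queries that span many hours.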
  • 20. Device History Data: Historical activities that each device has performed while being part of NativeX’s network. Used for offer classification for a given device. Examples include: clicking on offers; running mobile apps; redeeming rewards.
  • 21. Hardware: 12 nodes; Cisco UCS blades; 12 cores @ 2.0 GHz with Hyper-Threading; 64 GB of RAM; 2 x 480 GB Intel commodity SSDs in RAID 0; 10.5 TB total, ~7 TB usable; Red Hat Linux.
  • 22. Commodity vs. Enterprise: We chose to use enterprise hardware for the servers so that we would have support for them. However, our workload is very read heavy and 15K rpm rotational disks were a bottleneck. We chose to swap out the rotational disks for commodity SSDs. (Enterprise SSDs were 10x as expensive.) We have limited support on the hardware because of this.
  • 23. Internal C* Cluster Stats: 240 peak writes per second per node (2,880/sec cluster wide); 888 peak reads per second per node (10,656/sec cluster wide); 0.53 ms average write latency per request; 1.7 ms average read latency per request; almost 3 TB of data, adding 1 TB a month.
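The cluster-wide figures on slide 23 follow directly from the per-node peaks and the 12-node count on slide 21, and the same numbers give a rough capacity runway. The sketch below reproduces that arithmetic. The runway estimate assumes linear growth of 1 TB a month against the quoted ~7 TB usable, with no allowance for compaction overhead or added nodes, so treat it as a back-of-the-envelope figure rather than anything stated in the slides.

```python
NODES = 12                    # slide 21
PEAK_WRITES_PER_NODE = 240    # per second, slide 23
PEAK_READS_PER_NODE = 888     # per second, slide 23

cluster_writes = NODES * PEAK_WRITES_PER_NODE
cluster_reads = NODES * PEAK_READS_PER_NODE
read_write_ratio = PEAK_READS_PER_NODE / PEAK_WRITES_PER_NODE

USABLE_TB = 7.0               # "~7 TB usable", slide 21
CURRENT_TB = 3.0              # "almost 3 TB", rounded up
GROWTH_TB_PER_MONTH = 1.0     # "adding 1 TB a month"

# Assumption: growth stays linear and nothing else consumes disk.
months_of_headroom = (USABLE_TB - CURRENT_TB) / GROWTH_TB_PER_MONTH

print(f"cluster writes/sec: {cluster_writes}")
print(f"cluster reads/sec:  {cluster_reads}")
print(f"reads per write:    {read_write_ratio:.1f}")
print(f"months of headroom: {months_of_headroom:.0f}")
```

The read-to-write ratio is consistent with the "very read heavy" workload described on slide 22.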
  • 24. Application Side Latencies: MS SQL: writes 12 ms, reads 1.5 ms. C*: writes 3 ms, reads 4 ms.
  • 25. Can We Make Reads Faster? We think that in SQL Server, reads were faster because most of the data sat in memory. We might be able to achieve lower latencies in C* if we gave each node just as much memory as our SQL Server. To counteract the increased latencies we used techniques such as parallel reads using multithreading in our web application.
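The parallel-read technique mentioned on slide 25 can be sketched with a thread pool: when a request needs several independent rows, issuing the reads concurrently makes the total wait closer to one read's latency than to the sum of all of them. This is not NativeX's code. `read_row` is a hypothetical stand-in that sleeps for 4 ms (the application-side C* read latency from slide 24) instead of querying a database, and the key names and pool size are illustrative.

```python
import time
from concurrent.futures import ThreadPoolExecutor

SIMULATED_READ_LATENCY_S = 0.004  # 4 ms, the C* read latency quoted on slide 24


def read_row(row_key: str) -> dict:
    """Hypothetical stand-in for a single-row read; sleeps instead of querying."""
    time.sleep(SIMULATED_READ_LATENCY_S)
    return {"row_key": row_key}


def read_sequential(keys):
    """Issue the reads one after another."""
    return [read_row(key) for key in keys]


def read_parallel(keys, max_workers=8):
    """Issue the reads concurrently; results come back in the order of `keys`."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(read_row, keys))


keys = [f"device-{i}" for i in range(8)]

start = time.perf_counter()
sequential = read_sequential(keys)
sequential_s = time.perf_counter() - start

start = time.perf_counter()
parallel = read_parallel(keys)
parallel_s = time.perf_counter() - start

print(f"sequential: {sequential_s * 1000:.1f} ms, parallel: {parallel_s * 1000:.1f} ms")
```

Threads suit this case because the work is I/O-bound waiting; the exact timings printed will vary from run to run.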
  • 26. Not All Roses: There are still challenges with C*, like any complex system. More moving parts and things that need to stay in sync. Misconfigurations can literally destroy your data. Certain config settings cannot be changed after you are live, such as the number of virtual racks.
  • 27. Lessons Learned: Get into production early. Data import = reality. Break down communication barriers. Understanding your IO profile is really important. Cassandra changes quickly; you need to keep up. Scalable systems like C* have a massive number of knobs; you need to know them. Leverage cloud resources in working toward right-sizing your cluster.
  • 28. Thanks. We’re hiring: http://nativex.com/careers/ Join the MSP C* Meetup: http://www.meetup.com/Minneapolis-St-Paul-Cassandra-Meetup/ Email us: Jeff.Smoley@nativex.com, Jeff.Bollinger@nativex.com, or @jbollinger. Slide deck: http://www.slideshare.net/JBollinger/minnebar-2013-scaling-with-cassandra