Make the most of your technology investments before it’s too late

A Vendor-independent Comparison of NoSQL
Databases: Cassandra, HBase, MongoDB, Riak
By Sergey Bushik, Senior R&D Engineer at Altoros Systems, Inc.
“The more alternatives, the more difficult the choice.”
Abbé D'Allainval

In 2010, when the world became enchanted by the capabilities of cloud systems and the new databases
designed to serve them, a group of researchers from Yahoo decided to look into NoSQL. They
developed the YCSB framework to assess the performance of the new tools and find the best use cases
for them. The results were published in the paper “Benchmarking Cloud Serving Systems with YCSB.”
The Yahoo researchers did a great job, but like any paper, it could not cover everything:
● The research did not provide all the information we needed for our own analysis.
● Although Cassandra, HBase, Yahoo’s PNUTS, and a simple sharded MySQL implementation
were analyzed, some of the databases we often work with were not covered.
● Yahoo used high-performance hardware, whereas most companies would find it more useful to
see how these databases perform on average hardware.
As R&D engineers at Altoros Systems, Inc., a big data specialist, we were inspired by Yahoo’s
work and decided to add some effort of our own. This article is our vendor-independent analysis
of NoSQL databases, based on performance measured under different system workloads.
What Makes This Research Unique?
Often referred to as NoSQL, non-relational databases combine elasticity and scalability
with the capability to store big data and work with cloud computing systems, all of which makes them
extremely popular. NoSQL data management systems are typically schema-free (with a flexible data
model and little structural overhead) and eventually consistent (complying with BASE rather than
ACID). They have a simple API, serve huge amounts of data, and provide high throughput.

In 2012, the number of NoSQL products exceeded 120, and the figure is still growing. That variety
makes it difficult to select the best tool for a particular case. Database vendors usually measure
the performance of their products with custom hardware and software settings designed to demonstrate
the advantages of their solutions. We wanted to do independent and unbiased research to complement
the work done by the team at Yahoo.
Using Amazon virtual machines to ensure verifiable results and research transparency (which also
helped minimize errors due to hardware differences), we analyzed and evaluated the following
NoSQL solutions:
● Cassandra, a column family store
● HBase (column-oriented, too)
● MongoDB, a document-oriented database
● Riak, a key-value store
We also tested MySQL Cluster and sharded MySQL, using them as baselines.
After some of the results had been presented to the public, some observers said MongoDB should not
be compared to other NoSQL databases because it is geared more toward working with memory directly.
We understand this concern, but the aim of this investigation is to determine the best use cases for
different NoSQL products. Therefore, the databases were tested under the same conditions, regardless
of their specifics.

Headquarters
830 Stewart Dr., Suite 119
Sunnyvale, CA 94085
Phone: (650) 395-7002
Fax: (866) 201-3646
www.altoros.com

Norway Office
Thorvald Meyers gate 70B
0552 Oslo, Norway
Phone: +47 35 29 13 00
www.altoros.no

Denmark Office
Vejlsøvej 51, Bygning O
DK-8600, Silkeborg
Denmark
Phone: +45 40 46 79 64
www.altoros.dk

Development Center
9-6 Dombrovskaya Str.,
Minsk, Belarus, 220140
Phone: +375 (17) 388-01-32
+375 (29) 367-0849

UK Office
London
Phone: +44(0) 203 318 4785
Mobile: +44(0) 7979 907559

engineering@altoros.com

Tools, Libraries, and Methods
For benchmarking, we used the Yahoo Cloud Serving Benchmark (YCSB), which consists of the following
components:
● a framework with a workload generator
● a set of workload scenarios
We measured database performance under several types of workloads. A workload was defined
by the distributions assigned to the two main choices:
● which operation to perform
● which record to read or write
Operations against a data store were randomly selected and could be of the following types:
● Insert: Inserts a new record.
● Update: Updates a record by replacing the value of one field.
● Read: Reads a record, either one randomly selected field or all fields.
● Scan: Scans records in order, starting at a randomly selected record key. The number of records
to scan is also selected randomly from the range between 1 and 100.
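
The two choices described above (which operation, which record) can be illustrated with a minimal Python sketch. This is not YCSB's own code: the function names, the 50/50 operation mix, and the uniform key distribution are assumptions made for illustration only, since the text here does not state the proportions or distributions used in the study.

```python
import random

# Hypothetical operation mix (NOT taken from the study); weights should sum to 1.
OPERATION_MIX = {"read": 0.5, "update": 0.5}


def choose_operation(mix, rng):
    """Pick the next operation type according to the configured proportions."""
    operations = list(mix)
    weights = [mix[op] for op in operations]
    return rng.choices(operations, weights=weights, k=1)[0]


def choose_record_key(record_count, rng):
    """Pick which record to touch; uniform here, but a skewed distribution could be swapped in."""
    return f"user{rng.randrange(record_count)}"


def choose_scan_length(rng, max_length=100):
    """Scans cover a random number of records between 1 and max_length, inclusive."""
    return rng.randint(1, max_length)


def next_request(mix, record_count, rng):
    """Combine both choices into one request description."""
    operation = choose_operation(mix, rng)
    request = {"op": operation, "key": choose_record_key(record_count, rng)}
    if operation == "scan":
        request["length"] = choose_scan_length(rng)
    return request
```

Changing only the `OPERATION_MIX` dictionary or the key-selection function yields a different workload, which is the sense in which a workload is "defined by distributions."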

Each workload targeted a table of 100,000,000 records; each record was 1,000 bytes in size
and contained 10 fields. Each record was identified by a primary key, which was a string such as
“user234123.” The fields were named field0, field1, and so on. The value of each field was a random
string of ASCII characters, 100 bytes long.
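
As a rough illustration of the record layout just described, the sketch below builds one such record and computes the raw payload size of the whole table. The helper names and the choice of alphabet (letters and digits) are assumptions; the text only specifies random ASCII strings.

```python
import random
import string

FIELD_COUNT = 10            # fields per record, as described above
FIELD_LENGTH = 100          # bytes per field value
RECORD_COUNT = 100_000_000  # records per table


def build_key(record_number):
    """Primary keys are strings of the form 'user<number>'."""
    return f"user{record_number}"


def build_record(rng):
    """Create one record: field0..field9, each a random 100-character ASCII string."""
    alphabet = string.ascii_letters + string.digits  # assumed alphabet
    return {
        f"field{i}": "".join(rng.choices(alphabet, k=FIELD_LENGTH))
        for i in range(FIELD_COUNT)
    }


def payload_bytes(record):
    """Size of the field values only, excluding keys and field names."""
    return sum(len(value.encode("ascii")) for value in record.values())


def raw_dataset_bytes():
    """Field payload across the whole table, before replication or storage overhead."""
    return RECORD_COUNT * FIELD_COUNT * FIELD_LENGTH
```

With these figures, the raw field payload comes to 100,000,000 × 1,000 bytes, or about 100 GB, not counting keys, replication, or per-database storage overhead.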

Database performance was defined by the speed at which a database executed basic operations. A
basic operation is an action performed by the workload executor, which drives multiple client threads.
Each thread executes a sequential series of operations by making calls to the database interface layer,
both to load the database (the load phase) and to execute the workload (the transaction phase). The
threads throttle the rate at which they generate requests so that we can directly control the offered
load against the database. In addition, the threads measure the latency and achieved throughput of
their operations and report these measurements to the statistics module.
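
The throttling and measurement loop described above might look roughly like the following Python sketch. It is a simplified stand-in rather than the benchmark's actual implementation: `Stats`, `client_thread`, and `run_workload` are hypothetical names, and `do_operation` represents whatever call the database interface layer would make.

```python
import threading
import time


class Stats:
    """Minimal statistics module: collects per-operation latencies from all threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self.latencies = []

    def record(self, latency_seconds):
        with self._lock:
            self.latencies.append(latency_seconds)


def client_thread(do_operation, op_count, target_ops_per_sec, stats):
    """Run operations sequentially, pacing them to the target rate."""
    interval = 1.0 / target_ops_per_sec
    next_start = time.perf_counter()
    for _ in range(op_count):
        delay = next_start - time.perf_counter()
        if delay > 0:
            time.sleep(delay)  # throttle: wait until the next scheduled start
        started = time.perf_counter()
        do_operation()
        stats.record(time.perf_counter() - started)
        next_start += interval


def run_workload(do_operation, thread_count, ops_per_thread, target_ops_per_sec):
    """Drive several client threads and summarize throughput and latency."""
    stats = Stats()
    per_thread_rate = target_ops_per_sec / thread_count
    threads = [
        threading.Thread(
            target=client_thread,
            args=(do_operation, ops_per_thread, per_thread_rate, stats),
        )
        for _ in range(thread_count)
    ]
    started = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    elapsed = time.perf_counter() - started
    total = len(stats.latencies)
    return {
        "operations": total,
        "throughput_ops_per_sec": total / elapsed,
        "avg_latency_ms": 1000.0 * sum(stats.latencies) / total,
    }
```

Because each thread schedules its next request from a fixed interval rather than from the completion time of the previous one, the offered load stays at the target rate as long as the database keeps up; when it does not, the achieved throughput falls below the target and the gap becomes visible in the report.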

The performance of the system was evaluated under different workloads:
● Workload A: Update heavily
● Workload B: Read mostly
● Workload C: Read only
● Workload D: Read latest
● Workload E: Scan short ranges
● Workload F: Read-modify-write
● Workload G: Write heavily

Further Actions
Please follow this link to download the full research for free.
Also, browse through other Altoros research published by CIO.com,
NetworkWorld, ComputerWorld, TechWorld, and other online magazines.

