Evolution of Distributed
Operational Database
Technologies in the Digital Era
An architectural and digital fitment analysis
Vishal Puri
Executive Architect
Data Platforms Services
IBM GBS Cognitive Business Decisions Support
Email - vpuri@us.ibm.com
Mobile - 4692197510
Table of contents
• Overview
• Characteristics of Distributed Databases
• Distributed Database Models
• The Incumbents
• The Challengers
Executive Brief
The use of distributed databases (also called NoSQL DBs) for supporting operational processes, including
operational intelligence, is increasing as companies adopt digital process engagement with their
consumers and digitize internal and partner-facing operations. With digitization come challenges of
scale, flexibility and agility that distributed databases are uniquely positioned to address.
Distributed database marketplace
• Distributed databases have been in the market for a long time, with mature solutions that have seen
significant enterprise adoption; examples include MongoDB, Cassandra, Redis and HBase.
Incumbent leaders continue to innovate on their feature sets, some more successfully than
others, such as MongoDB (with MongoRocks).
• The market leaders are saddled with legacy architectural debt, which has opened the door to new
challengers that provide attractive propositions and make seamless replacement of existing
solutions a key architectural goal. These solutions include ScyllaDB, Aerospike, Accumulo and Yugabyte,
among others.
• While this paper does not touch upon public-cloud-only offerings (Amazon DynamoDB, Microsoft
Cosmos or Google Spanner), there is significant interest in these offerings in the marketplace. There
is also significant interest in adopting solutions that are cloud native and multi-cloud enabled, with
as-a-service offerings from multiple vendors.
Database practitioner recommendations
• Data Architects must keep on top of the changing landscape and innovations. As this paper
articulates, the best way forward is to understand distributed database architecture patterns. This
paper also postulates a framework to break down the characteristics of any distributed database for
comparison purposes and to gauge its suitability for specific architectures.
• Most distributed databases require upfront data modeling, and data modeling as a discipline for
implementing such databases is a skill that must be nurtured in technology organizations.
• With changing landscapes and use-case-specific implementations, it is important for database
DevOps SMEs to be multiskilled, especially with respect to database operations in a cloud environment.
Core Audience and Goals
Who is the core audience?
• The core audience for this document is a practicing architect providing solutions for digital business
functions.
What are the key goals?
• Explore core architectural patterns for distributed databases.
• Explore a class of distributed databases that are used to enable digital engagement use cases.
• Establish an architectural framework to determine suitability of specific Distributed DBs in addressing
various digital use cases.
• Apply the above architectural framework in assessing market leading DBs as well as emerging
challengers in the marketplace.
• Enable deep architectural thinking as opposed to providing convenient but shallow answers.
What is not a goal?
• Providing a simple decision tree for choosing a database.
• Direct and convenient comparison of all DBs.
Which classes of NoSQL DBs are not covered?
• Any Relational or hybrid relational + NoSQL DB – Postgres, MySQL, DashDB.
• Any solution that only scales vertically or via replication (MySQL, Postgres, Aurora etc).
• Graph Databases or purpose-built time-series DBs – Neo4j, OrientDB, Titan etc.
• Analytics oriented NoSQL solutions – Hive, Impala, Drill, Spark SQL etc.
• Search oriented solutions – Solr, Elasticsearch etc.
What problems do Distributed Databases solve in
the digital world?
Internet Scale
• By making compromises along the consistency and availability spectrum, NoSQL databases
can scale horizontally and can be distributed geographically. This was not possible with
traditional relational DBs with ACID semantics and enforced referential integrity (although
some distributed NewSQL architectures such as Spanner challenge this).
Insights Driven Digital Engagement
• Enable alternate data models that more naturally represent problem domains in the dynamic
digital world (as opposed to force-fitting everything into a predefined relational DB), thus
enabling more efficient queries for operational insights and interactions:
• Dynamic customer attributes and activities
• Time series data – sensors, stock prices
• Session caches
• Searchable logs
• Large collections – queues, lists, maps, counters etc.
• Network and semantic graphs
• Real-time counters
• Global customer transaction databases
• Overview
• Characteristics of Distributed Databases
• Distributed Database Models
• The Incumbents
• The Challengers
Common characteristics of distributed databases
1. Horizontally scalable distributed database, for internet-scale use cases
2. High availability and resiliency built in
3. Data is partitioned and replicated across the cluster
4. Data model is some form of multi-dimensional map
5. Primarily optimized for operational use cases
How to dissect and analyze fitment to purpose of a Distributed DB
Consistency, Availability, and Latency
• Eventual vs strong consistency
• What happens on network partitions or server failures
• Write optimized, read optimized or throughput optimized

Data Model
• Wide Column / Key Value / Document
• Static vs dynamic typing

Horizontal Scaling Strategy
• Hash-based partitioning / ordered partitioning / replication
• Load-balancing reads and writes

Operational Analytics and Search
• Support for secondary indexes – transactionally consistent indexes / external indexes / distributed indexes
• Support for range scans
• Support for joins
• Support for counters and aggregation

Replication
• Master-slave / distributed
• Cross-datacenter replication support

Management Operation Behavior – Add/Remove Node
• Data redistribution, master election

Storage Support
• Tiered vs memory-centric vs disk-centric vs flash-centric

Developer Friendliness
• Client APIs – REST API / CQL / Java library / JSON / other
• Popularity score in DB-Engines

Handling Updates
• Locking vs MVCC
• Partial vs full update

Technology Used
• Open source vs closed source
• Java/JVM based vs C/C++ based
• Cross-platform vs Linux optimized

Security
• DB and schema level security – Read/Write/Delete/DDL
• Data-level security – row/column/cell level security
• LDAP/Kerberos integration

Ease of Setup and Scale
• Minimum infrastructure required to start
• Ease of scaling
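One way to apply this framework is to capture each dimension for a candidate database and compare the resulting profiles side by side. The sketch below is illustrative only; the field names and the example values shown for Cassandra are assumptions for illustration, not a scoring from this paper.

```python
# Illustrative sketch: encoding the fitment framework as a simple data structure so
# candidate databases can be captured and compared side by side.
from dataclasses import dataclass, field

@dataclass
class DbFitmentProfile:
    name: str
    consistency: str          # e.g. "eventual", "tunable", "strong"
    data_model: str           # e.g. "wide column", "document", "key-value"
    scaling: str              # e.g. "consistent hashing", "range sharding"
    secondary_indexes: bool
    range_scans: bool
    replication: str          # e.g. "masterless", "master-slave", "cross-DC"
    storage: str              # e.g. "disk", "memory", "tiered"
    notes: list = field(default_factory=list)

# Example profile (values are illustrative, not an official assessment):
cassandra = DbFitmentProfile(
    name="Cassandra",
    consistency="tunable (eventual by default)",
    data_model="wide column",
    scaling="consistent hashing",
    secondary_indexes=True,
    range_scans=False,
    replication="masterless, cross-datacenter",
    storage="disk",
    notes=["secondary indexes are local to each node"],
)
```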
The Consistency Availability Spectrum for Distributed Databases – CAP
Theorem
CAP Theorem
Only 2 of the 3 properties - Consistency, Availability, and Partition-
tolerance can be satisfied by a distributed database.
Consistency: A read operation is guaranteed to return the most recent
write
Availability: Any operation is guaranteed to receive a response saying
whether it has succeeded or failed
Partition tolerance: The system continues to operate when a network
partition occurs
While the CAP theorem is not considered to be sufficient* to articulate
the behavior of a distributed database it does make for a useful
classification in making high level decisions about a distributed database
choice.
* https://arxiv.org/pdf/1302.0309.pdf
An alternative to understanding distributed systems is the PACELC
Theorem, described in the original paper as -
“if there is a partition (P), how does the system trade off availability and
consistency (A and C); else (E), when the system is running normally in
the absence of partitions, how does the system trade off latency (L) and
consistency (C)?”
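The latency/consistency dial that PACELC describes shows up concretely in quorum-based, Dynamo-style systems (Cassandra, Riak, etc.) through tunable read and write consistency levels. The sketch below is not from the paper; it simply illustrates the classic quorum rule that when R + W > N, read and write quorums must overlap, so reads see the latest acknowledged write.

```python
# Illustrative sketch of the Dynamo-style quorum rule.
# N = replication factor, W = write acknowledgements required, R = replicas consulted on read.

def is_strongly_consistent(n_replicas: int, write_quorum: int, read_quorum: int) -> bool:
    """True if every read quorum overlaps every write quorum (reads see latest write)."""
    return read_quorum + write_quorum > n_replicas

# Typical settings for a replication factor of 3:
print(is_strongly_consistent(3, 1, 1))  # False: lowest latency, but reads may be stale (eventual)
print(is_strongly_consistent(3, 2, 2))  # True: QUORUM writes + QUORUM reads overlap
print(is_strongly_consistent(3, 3, 1))  # True: write to ALL, read from ONE
```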
The Distributed Operational DB Architecture Spectrum
Architecture categories and example databases:
• Wide Column Store – BigTable model: Google BigTable, HBase, Accumulo, Cassandra, Scylla
• Wide Column Store – Dynamo model: DynamoDB, Cassandra, Scylla
• Key-Value Store: Redis, Memcache, Aerospike
• In-Memory Data Grid: Hazelcast, Ignite, Coherence
• Clustered / Sharded / Distributed SQL DB: MySQL NDB Cluster, Citus DB (Postgres), MemSQL, Google Spanner, CockroachDB, YugaByte DB, Azure Cosmos DB
• Document Store: MongoDB, Couchbase, CouchDB, MarkLogic
• Graph / RDF Store: OrientDB, Titan
• Search: Elasticsearch, Solr
• There is no perfect all-purpose distributed database.
• Each distributed database has made specific architectural choices and compromises so as to be best suited for a narrow range of use cases.
• Fortunately, each architectural category has several DBs to choose from, providing us with options no matter what our use case is.
• Architectures continue to evolve, with newer distributed databases addressing the shortcomings and technical debt of existing leaders by making better ground-up architectural decisions and technology choices.
Data Modeling in a Distributed Database
Along with making choices on availability, consistency and latency, every
database offering makes fundamental architectural choices regarding the
data models it would like to enable and the associated restrictions it needs
to put in those data-models to honor the laws of computational physics.
Most of the distributed database data model characteristics have evolved
based on popular and expanded usage scenarios along with new
developments in Hardware architectures. Key characteristics of a Data
Model to look for are –
1. What is the fundamental data structure ? – Wide Column,
Document, Key Value, Object Collections
2. How is the data distributed/sharded in the cluster? – Random
partitioning Or Ordered partitioning, Range partitions or Hash
Partitions
3. Does the data model support clustering or ordered data storage
within a server to support scans? – Clustering Keys, Composite
primary Keys, Ordered Storage
4. Does the data model support secondary indexes? If so, are the
secondary indexes transactionally consistent? Are the secondary
indexes efficient in supporting a search without a primary key
reference?
5. Does the data model support CRDTs (Conflict-free replicated data
types) – These are especially useful in eventually consistent DBs like
Cassandra, Riak and Aerospike to enable reliable distributed
updates. For example Counters, Sets etc
6. Does the data processing engine support server side operations
such as stored procedures, map-reduce functions or triggers
Needless to say, data modelling in a Distributed NoSQL or newSQL
database is very different from modeling a relational schema. One
requires a shift in mindset -
• View the data as a composite dataset driven by narrow access
patterns.
• Since multi-row or multi-document transactions are at a premium,
having denormalized or nested data is preferred
• Since joins of data are either not supported or prohibitively expensive,
plan for manual joins in the application
• Design of the primary key is the most important part of modeling and
getting it right will dictate
• Ease of access to the data for most access patterns
• Scalability of reads and writes across the cluster
• Ability to have shared multi-tenant database
• Referential integrity is not maintained by the database in most
solutions (barring distributed SQL DBs like Spanner), and the database
should not be relied on for maintaining it
• Know your CRDTs
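To make the primary-key-driven, access-pattern-first modeling mindset concrete, here is a minimal sketch using the DataStax Python driver. The keyspace, table and column names are assumptions for illustration; the point is that the partition key drives data distribution and the clustering column drives ordering within a partition, so one narrow query becomes efficient.

```python
# Minimal sketch (assumed names; keyspace "activity" assumed to already exist).
# Access pattern modeled: "fetch the most recent activities for a user on a given day".
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect("activity")

session.execute("""
    CREATE TABLE IF NOT EXISTS events_by_user_day (
        user_id        text,
        day            text,
        activity_time  timestamp,
        action         text,
        details        text,              -- denormalized payload; no joins at read time
        PRIMARY KEY ((user_id, day), activity_time)
    ) WITH CLUSTERING ORDER BY (activity_time DESC)
""")

# The query hits exactly one partition and reads rows already stored in time order.
rows = session.execute(
    "SELECT activity_time, action FROM events_by_user_day "
    "WHERE user_id = %s AND day = %s LIMIT 20",
    ("user42", "2018-06-01"),
)
```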
Distributed DB Engine Popularity*
• Popularity in the marketplace and developer
mind-space is often a consideration in selecting
a Distributed DB of choice
• A very high popularity score often reflects the
ease of setup and developer friendliness of the
solution, rather than merits of the solution at
scale
• MongoDB, Redis and Cassandra are the incumbent
leaders in this space
• However, cloud centric offerings (Dynamo,
Cosmos) along with newer entrants such as
Scylla, Aerospike and Yugabyte are on the rise
• While the popularity may not reflect the fit of
the DB for the use case, it does reflect on
availability of skills in the marketplace
• Stagnation in the popularity of a DB also indicates
the emergence of alternatives that are addressing
shortcomings of that solution. It may also reflect
the stagnation of the ecosystem itself; e.g.
Hadoop adoption stagnation impacts adoption
of Accumulo and HBase
* https://db-engines.com/en/ranking_definition
* https://db-engines.com/en/ranking_trend
Distributed DB architecture limitations and consequences
Category: Inefficient Processing
• Issue: Use of Java/JVM technologies. Consequence: garbage collection stops, inefficient memory usage, complexity in operations and expensive at scale. Alternative solutions: use C/C++ frameworks; use and manage off-heap memory where feasible.
• Issue: Not optimized for OS/hardware. Consequence: uses generic OS caches and default kernels that do not optimize for target use cases and access patterns, leading to large installations at scale. Alternative solutions: optimize for multi-core, NUMA architectures, L1-L2 caches, vector processing etc.
• Issue: Not optimized for storage. Consequence: uses a single storage system that does not provide adequate flexibility between speed and cost. Alternative solutions: intelligent use of a multi-tiered storage strategy, dynamically moving data between flash, RAM and disk in order to optimize based on access pattern.

Category: Flexibility
• Issue: Fixed access patterns based on primary or shard keys. Consequence: multiple access patterns require changes to data model design and/or duplication of data. Alternative solutions: support for a scalable secondary index implementation; support for range scans.
• Issue: Fixed schemas. Consequence: adding schema elements is an operationally complex process. Alternative solutions: support for easy schema evolution through flexible schemas with dynamic attributes.

Category: Operational complexity
• Issue: Eventual consistency. Consequence: results in stale data across the distributed nodes that needs to be reconciled and repaired. Alternative solutions: strongly consistent solutions; incremental and fast repair tools that do not impact availability/performance; support for distributed transactions.
• Issue: Data redistribution. Consequence: adding or removing nodes causes redistribution of large volumes of data, which impacts performance and takes a long time to complete. Alternative solutions: architectures that minimize data movement for scaling and load balancing.
References
• Jepsen – A tool to understand Distributed database Availability and Consistency characteristics –
• https://aphyr.com/tags/jepsen
• https://aphyr.com/posts/343-scala-days-2017-jepsen-keynote
• Conflict-Free Replicated Data Types (CRDT) - https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type
• Overview
• Characteristics of Distributed Databases
• Distributed Database Models
• The Incumbents
• The Challengers
Wide-Column - Big Table
Architecture
• Provides horizontal scalability, immediate consistency and network partition tolerance, at the cost of loss of availability in some scenarios
• Requires Master Server for metadata
• Write optimized as no disk seek is needed – Write to log and Memtable, flushed periodically to SSTable on disk. Many SSTables on disk – 1
per generation/version
• Merged reads – read from Memtable and SSTable.
• Compaction - Many SSTables increase read time, therefore compaction occurs to reduce disk seek time. Compaction = multiple SSTables
combined into single SSTable at O(logN) rate
• Relies on distributed file system for durability
Data model
• Column Family oriented
• Supports sparse data – lots of null values
• File per column family. Data sorted per column family by row id, column name and timestamp. Sorted data is compressed
• Cell is smallest addressable unit – row, column. Data in each cell has configurable number of versions
Best used for
• high throughput writes, where read to write ratio is balanced
Example
• HBase, Cassandra, Scylla
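The BigTable write path described above (write to log and memtable, flush to immutable SSTables, merged reads, compaction) is essentially a log-structured merge tree. The sketch below is illustrative only and not any real engine; it shows how writes avoid disk seeks while reads merge the memtable with SSTables.

```python
# Minimal, illustrative LSM-style store: memtable + immutable sorted "SSTables".
class TinyLSM:
    def __init__(self, memtable_limit=4):
        self.memtable = {}            # in-memory map (a real engine keeps it sorted)
        self.sstables = []            # immutable flushed snapshots, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        # A real engine would first append to a commit log for durability.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Flush the memtable to a new sorted, immutable SSTable (one per generation).
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}

    def get(self, key):
        # Merged read: memtable first, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            if key in sstable:
                return sstable[key]
        return None

    def compact(self):
        # Compaction merges many SSTables into one, keeping the newest value per key,
        # which reduces the number of files a read has to consult.
        merged = {}
        for sstable in self.sstables:
            merged.update(sstable)
        self.sstables = [dict(sorted(merged.items()))]
```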
Wide-Column - Dynamo
Architecture
• Enables high availability and partition tolerance with horizonal scalability, at the cost of immediate consistency, by enabling eventual
consistency models
• Dynamo allows read and write operations to continue even during network partitions and resolves update conflicts using different conflict
resolution mechanisms, some client driven.
• Dynamo's Gossip based membership algorithm helps every node maintain information about every other node.
• Dynamo can be defined as a structured overlay with at most one-hop request routing.
• Dynamo detects update conflicts using a vector clock scheme, but prefers a client side conflict resolution mechanism. A write operation in
Dynamo also requires a read to be performed for managing the vector timestamps. This can be very limiting in environments where
systems need to handle a very high write throughput.
Data model
• Item model – {Key – Attribute Maps}
Example
• DynamoDB, Riak, Cassandra, Scylla
Document Stores
Architecture
• Wide variety of architectures –
• from strongly consistent (MongoDB, Marklogic, YugaByte) to eventually consistent (CouchDB, Cloudant)
• Automated sharding with fixed shards (MongoDB) to truly distributed and dynamic sharding (YugaByte, CouchDB)
• Read Scaling and fault tolerance with dedicated Replica Sets (MongoDB) to sharded replication with distributed read and writes
• Enables dynamic schemas, leaving data validation and integrity to be largely driven by applications
• Data model is usually JSON in text or Binary form (BSON) or XML
• Most distributed document databases only support local secondary indexes forcing one to search all nodes in a cluster when primary key is
not part of the query
Data model
• Item model – {Key – Binary JSON}
Example
• MongoDB, CouchDB, Cloudant, MarkLogic, YugaByte
In Memory Key-Value Store
Architecture
• Most solutions (Memcache, Redis) in this category have roots in single server data storage with a focus on speed and throughput.
Application driven sharding, as well as lack of availability and data consistency guarantees has been the rule rather than the exception
• Some solutions such as Redis Cluster evolved to provide horizontal scalability in addition to performance, without providing true
availability or consistency guarantees under network partitions.
• Some solutions such as Aerospike support tunable consistency with varying degrees of availability, i.e., always available with simple conflict
resolution for eventual consistency or primary key consistency with lower availability
• Master slave architectures
• Automated or manual sharding
• In Memory storage of keys, Values stored in memory or on other storage media (Disk/Flash)
Data model
• Key – Value Pairs, support for complex data structures such as Lists and Nested Maps
Example
• Memcache, Redis, Redis Cluster, Aerospike, Riak KV
In Memory Data Grid
Architecture
• In – Memory cache
• Configurable Replication
• Limited ACID Compliance through transactions
• Configurable Consistency and availability semantics with limited Network partition tolerance
Data model
• Distributed Collections – Maps, Sets, Lists
Example
• Hazelcast, Ignite (GridGain), Coherence
Distributed ACID compliant DBs - Spanner
Architecture
• The architecture was mostly developed to take care of three problems at massive scale (Trillions of rows):
• Global Consistency – if a record is updated in one region (say, Asia), someone reading in another region (say, Africa) will see the same updated
record on reading. This is supported through synchronous globally distributed transactions using the TrueTime API and atomic clocks
supported by GPS, in order to prevent any time inconsistency
• Table-Like Schema – Support for table like schema via rows and columns, backed by a key-value store
• SQL Query Language – A SQL query parser to support SQL clients
• In order to support the above characteristics this architecture sacrifices support for raw performance required by low latency use cases and
incurs possible additional latency with every new node added
• The TrueTime API along with the atomic clock infrastructure ensures that data is consistent within a very small time interval (~1-7 ms)
• Spanner has three main types of operations: read-write transaction, read-only transaction and snapshot read at a timestamp, which are all
supported on a globally distributed database
• Spanner provides first class support for unified cross data center and cross availability zone data clusters. It enables this via concepts of
global indexes, zone masters, span servers (which are similar to HBase region servers) and use of paxos for fine grained cluster consensus
Data model
• Item model – Table like schemas implemented using timestamped key value data structure
Use case
• Data Consistency in a global deployment, across many data centers. E.g. Global digital banking or any global ecommerce application
• Very large data sets with relaxed read write latencies (>20 ms)
• Blockchain and crypto currency implementations
Example
• Spanner, CockroachDB, Azure Cosmos DB, YugaByte, NuoDB
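A minimal sketch of two of the three Spanner operation types described above, using the google-cloud-spanner Python client. The project, instance, database and table names are assumptions for illustration.

```python
# Minimal sketch (assumed instance/database/table names): run_in_transaction gives a
# globally consistent read-write transaction; snapshot() gives a lock-free read at a
# consistent timestamp.
from google.cloud import spanner

client = spanner.Client()
database = client.instance("global-banking").database("ledger")

def transfer(transaction):
    transaction.execute_update(
        "UPDATE Accounts SET Balance = Balance - 100 WHERE AccountId = 'A'")
    transaction.execute_update(
        "UPDATE Accounts SET Balance = Balance + 100 WHERE AccountId = 'B'")

database.run_in_transaction(transfer)        # read-write transaction

with database.snapshot() as snapshot:        # snapshot read at a timestamp
    for row in snapshot.execute_sql("SELECT AccountId, Balance FROM Accounts"):
        print(row)
```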
References
• Google BigTable Paper - https://static.googleusercontent.com/media/research.google.com/en//archive/bigtable-osdi06.pdf
• Amazon Dynamo Paper - https://www.dynamodbguide.com/the-dynamo-paper/
• Spanner Architecture - https://kunigami.blog/2017/04/27/paper-reading-spanner-googles-globally-distributed-database/
• Comparing scaleout sql - https://blog.yugabyte.com/practical-tradeoffs-in-google-cloud-spanner-azure-cosmos-db-and-yugabyte-db-
ce720e07c0fd
• Overview
• Characteristics of Distributed Databases
• Distributed Database Models
• The Incumbents
• The Challengers
Cassandra - Overview
Cassandra is a distributed, eventually consistent database which follows the
BigTable wide column format and the Dynamo masterless “ring” distributed
storage architecture
Data Objects in Cassandra
• Keyspace – a container for data tables and indexes; analogous to a database in
many relational databases. It is also the level at which replication is defined.
• Replication factor − Number of machines in the cluster that will receive
copies of the same data.
• Replica placement strategy − Strategy to place replicas in the ring.
• Column families − Keyspace is a container for a list of one or more column
families. A column family, in turn, is a container of a collection of rows.
Each row contains ordered columns. Column families represent the
structure of your data. Each Keyspace has at least one and often many
column families.
• Row key – used to identify a row uniquely in a Column Family and also distribute
a table’s rows across multiple nodes in a cluster.
• Index – similar to a relational index in that it speeds some read operations; also
different from relational indices in important ways.
Query Language – Cassandra Query Language (CQL)
Architecture highlights – masterless ring architecture that is multi-datacenter and cloud deployable; a Keyspace acts as a container for Column Families; a Column Family consists of a Row Key and Columns
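A minimal sketch (assumed node addresses, datacenter names and schema; DataStax Python driver) showing how the keyspace-level replication factor and replica placement strategy described above are declared, and how a table (column family) lives inside the keyspace.

```python
# Minimal sketch (assumed names): replication is defined at the keyspace level,
# here as 3 copies in each of two datacenters using NetworkTopologyStrategy.
from cassandra.cluster import Cluster

session = Cluster(["10.0.0.1", "10.0.0.2"]).connect()

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS customer360
    WITH replication = {'class': 'NetworkTopologyStrategy', 'dc_east': 3, 'dc_west': 3}
""")

session.execute("""
    CREATE TABLE IF NOT EXISTS customer360.profile (
        customer_id text PRIMARY KEY,      -- row key: identifies the row and places it on the ring
        name        text,
        segment     text
    )
""")

session.execute(
    "INSERT INTO customer360.profile (customer_id, name, segment) VALUES (%s, %s, %s)",
    ("c-1001", "Ada", "premium"),
)
```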
Cassandra – Features and limitations
Consistency, Availability, and Latency
•Provides very high availability and eventual
consistency based on Dynamo architecture
•Remains available in the face of network partitions
• Optimized for very low latency writes.
• However, reads incur cost through quorum checks, if
read consistency levels are set to >1
Data Model
•Adopts the BigTable Wide Column data model
•Enables Clustering columns to provide order within a
partition
•Supports secondary indexes as well
•Schema needs to be modeled upfront, including
column families and columns
Horizontal Scaling Strategy
•Supports scalability via consistent hashing and random
spread of data across the cluster.
• Large number of nodes in a cluster (>100) causes
excessive inter node communication. Similarly large
number of tables in a cluster also causes issues
Operational Analytics and Search
• Since data is randomly distributed, range scans are not
supported on primary keys or on secondary indexes,
• All clustering columns must be provided in a query
• Secondary index lookups require querying all partitions
• Materialized views are supported but are very limited
in their function
• Good support for distributed Counters
•Most common implementation is to enable secondary
indexes using Elastic Search connectors. These are not
strongly consistent indexes
Replication
• Very good support for cross datacenter replication out
of the box
Management Operation Behavior –
Add/Remove Node
• Adding a node can take a long time (~days) because data must replicate. Using more, smaller servers alleviates this
•Node repair is a costly operation that can impact
resources used to serve data requests. Incremental
node repair can minimize this impact but is only
available in the enterprise version
•Similarly data compactions can also impact online
performance
Storage Support
•Primarily uses hard disk for persistent storage. Node
density of no more than 2 TBs, although 1 TB is
recommended. This can lead to large cluster sizes
quickly
Developer Friendliness
• Most popular wide column DB and 3rd most popular NoSQL DB after MongoDB and Redis
• Supports a SQL-like language called CQL, used for queries and DDL
Handling Updates
• Supports lightweight transactions through compare-and-set features.
• There is no support for multi-row transactions. Must
eliminate need for multi-row and multi-table
transactions through appropriate data modeling
techniques
Technology Used
•Java based implementation
• Need to monitor and tune JVM garbage collection and
heap usage constantly
• Offheap memory can also be allocated for memtables,
bloom filters etc
Security
• Row-level access control is supported, but requires exact string matches and the DataStax distribution
• Column-level security is not yet implemented
Ease of Setup and Scale
•Requires minimal setup to get started.
•Scales from small to large datasets seamlessly.
•Unlike HBase, all nodes are equal
Cassandra - Practices and Use Cases
Best Practices
• One must commit to upfront design based on known use cases and
data access paths. If the use case evolves and there are multiple
optimal access paths to the data required then it will almost certainly
have to be supported through duplication of at least a subset of data
• It is common to pair Cassandra with an implementation of Solr or
Elastic search in order to enable secondary indexes
• It is recommended to keep Cassandra Cluster sizes to a moderate
number of servers (<150 ) with at most 2 TB disk size
• Cassandra requires an operations team that will work towards
monitoring and optimizing of the settings and infrastructure on a
regular basis, apart from monitoring complex operations such as
splitting, compaction and node repair
When to consider
• For application features or microservices that serve a narrow use case
and access pattern, for large data sets and users
• For maintaining global or fine grained counts through counters
• Suited well for applications and feature sets that need to start small
but grow with increased usage
• Ingesting large volumes of data with high throughput and low latency
• Data services across geographies and data centers
Use cases
• Consumer activity and extended profile DB for recommendation and
personalization
• Web analytics data such as clickstreams and counters
• Graph DB for social network analysis, fraud analysis
• Storage for IOT data
• Storage for firehose datasets
• E-commerce carts and checkout
• Product catalogs and playlists
When not to consider
• For applications that require flexible access patterns through sql like
queries, range scans etc
• For applications that require a flexible and dynamic schema
• As an analytics DB. Cassandra is often paired with Spark to perform
analytical operations, however Cassandra does not provide any
benefits to the spark engine such as query execution or data set
filtering/pruning
HBase - Overview
HBase is a distributed strongly consistent database which follows the BigTable
architecture and wide column format, and uses HDFS for storage, while providing low
latency read and write access
Data Objects in HBase
• HBase Tables – Logical collection of rows stored in individual partitions
known as Regions.
• HBase Row – Instance of data in a table.
• RowKey -Every entry in an HBase table is identified and indexed by a
RowKey.
• Columns - For every RowKey an unlimited number of attributes can be
stored.
• Column Family – Data in rows is grouped together as column families
and all columns are stored together in a low level storage file known as
HFile.
Query Language – SQL via Phoenix or Java API
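A minimal sketch of HBase access from Python via its Thrift gateway, using the happybase library. The Thrift host, table name and column family are assumptions; the table with a column family "d" is assumed to exist. It shows a write, a point read by RowKey, and the ordered range scan that HBase's sorted-by-row-key storage makes efficient.

```python
# Minimal sketch (assumed host/table/column-family names): happybase talks to HBase
# through its Thrift gateway. Row keys are byte strings stored in sorted order.
import happybase

connection = happybase.Connection("hbase-thrift-host")
table = connection.table("sensor_readings")

# Write: one column family ("d"); column qualifiers are free-form per row.
table.put(b"sensor42#2018-06-01T10:00:00", {b"d:temp": b"21.4", b"d:humidity": b"53"})

# Point read by RowKey.
row = table.row(b"sensor42#2018-06-01T10:00:00")

# Range scan: all readings for sensor42 on 2018-06-01, returned in key order.
for key, data in table.scan(row_start=b"sensor42#2018-06-01",
                            row_stop=b"sensor42#2018-06-02"):
    print(key, data)
```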
HBase – Features and limitations
Consistency, Availability, and Latency
•Provides strong Consistency at the expense of
Availability, based on the BigTable architecture
• Availability can be increased by adding read replicas
on failure of the primary replica
• Reads and writes are optimized for throughput
rather than latency, since the storage is dependent
on HDFS.
•Can provide MySQL like latencies at scale
Data Model
•Adopts the BigTable Wide Column data model
•Data is stored and ordered by row key
•Fixed column families – No more than 2 column
families advised. Within a column family flexible
number of columns are allowed
Horizontal Scaling Strategy
•Supports scalability via auto sharding. Basic unit of
sharding is region. Regions are created and merged
depending upon a configurable policy
•Writes are bound to a primary region and unlike
reads, are not load-balanced to the replicas
• Metadata does not split and scale
• Can easily handle billions of rows X millions of
columns
Operational Analytics and Search
• Stores keys in ordered fashion. Good at range scans
and sorts
•Query predicate push down via server side scan and
get filters
• Most common implementation is to enable
secondary indexes using Phoenix or Solr connectors.
These are not strongly consistent indexes
Replication
•Support for synchronous replication, as of HBase 2.0,
as well as eventually consistent replication to read
only replica
Management Operation Behavior –
Add/Remove Node
•Rolling restart for configuration changes and minor
upgrades
•Region Splits caused by data growth results in loss of
availability as Region of data is forced offline to
enable movement of data
Storage Support
•HBase can use tiered storage across RAM, SSD and
disk
•Uses HDFS for storage – HDFS is optimized for
throughput, not latency
•Uses Region Server as a middleware for
implementing Cache and mediating access – Adds
network latency
Developer Friendliness
• 2nd most popular wide column DB, second only to
Cassandra
•Supports Thrift, REST and Java Client APIs
•HTTP supports XML, Protobuf, and binary
Handling Updates
•Supports single row transactions for safe atomic
updates, including safe server side row locks
• No support for cross-table or multi-row transactions.
However, alternatives exist via external libraries such
as OMID
Technology Used
•Java based implementation with offheap read and
write capability (as of HBase 2.0)
Security
•HBase now supports cell level security via Co-
Processors. Supports Kerberos integration
Ease of Setup and Scale
•Requires considerable starter infrastructure with a
minimum of 3 data nodes and 3 master nodes
•Works well with large data sets. Accessing Small to
medium sparsely distributed data is not performant
HBase - Practices and Use Cases
Best Practices
• One must commit to upfront design based on known use cases and
data access paths. If the use case evolves and there are multiple
optimal access paths to the data required then it will almost certainly
have to be supported through duplication of at least a subset of data
• It is common to pair HBase with an implementation of Solr or Elastic
search in order to enable secondary indexes
• HBase requires an operations team that will work towards monitoring
and optimizing of the settings and infrastructure on a regular basis,
apart from monitoring complex operations such as splitting and
merging of data, backups and multi-cluster replication
When to consider
• When the Hadoop/HDFS stack is well adopted
• Data sets that are naturally ordered (time series data) such as sensor,
stock prices, IOT etc
• Large Data Processing using MapReduce like algorithms
• Hybrid Operations and Analytics along with Hadoop
• Real time high performance scans of large join-less dataset
• Large master data sets with frequent updates
Use cases
• Consumer activity and extended profile DB for recommendation and
personalization
• Web analytics data such as clickstreams and counters
• Log processing
• Time Series data storage and analysis for network and sensor data
• Graph DB for social network analysis, fraud analysis
• Product Price by day by Store, location and competitor
• https://blogs.apache.org/HBase/entry/HBase-application-archetypes-
redux
When not to consider
• General purpose database with support for multiple access patterns and
an evolving schema.
• Small data sets – Does not perform comparatively well for small data
sets as it has significant overhead. Consider a minimum of 5 data nodes
of storage before using HBase
• Need guarantees of very low latencies – Due to overheads and possible
blocking operations (region splits, garbage collection), low latency
guarantees cannot be given
• Need Complex 2 phase commits across database tables or across
resources
• No existing or planned implementations of Hadoop/HDFS
Redis Cluster - Overview
Redis is an in-memory Key-Value data store built for low latency access. Redis Cluster
is a distributed database that automatically shards Redis key-value data structures
across a cluster of nodes, in a master-slave HA architecture
• Redis Cluster was designed as a general solution for high availability and
horizontal scalability while keeping the core focus of Redis in mind - low latency
and a strong data model. Because of this, Redis Cluster implements neither true
availability nor consistency of the CAP theorem
Data Objects in Redis
• Redis supports any Key Value data structure, where the value data structure may
be of type String, HashMap, List, Sorted Set, Set, Bitmap or HyperLogLog
• Redis Cluster places restrictions on multi-key operations and the ability to
support multiple databases in a single cluster
Query Language – Via client side libraries in multiple languages
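A minimal sketch of the data structures called out above (counters, sorted sets, expiring keys) using redis-py's cluster client, available in recent redis-py releases. The node addresses are assumptions for illustration.

```python
# Minimal sketch (assumed node addresses): the cluster client hashes each key to a
# hash slot and routes the command to the node owning that slot.
from redis.cluster import RedisCluster   # redis-py 4.x; older setups used the redis-py-cluster package

r = RedisCluster(host="redis-node-1", port=6379)

r.incr("pageviews:home")                          # distributed counter
r.zincrby("leaderboard:daily", 10, "player:42")   # sorted set as a leaderboard
top = r.zrevrange("leaderboard:daily", 0, 9, withscores=True)

r.setex("session:abc123", 1800, "serialized-session-state")  # session key expires after 30 minutes
```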
Redis Cluster – Features and limitations
Consistency, Availability, and Latency
• Provides raw performance and scalability at the cost
of true high availability and consistency
• It is possible to lose data if a failure occurs after a
master has acknowledged a write but before
replication has completed.
• Redis Clusters are unavailable on network partitions
Data Model
•Key value data store with support for complex data
structures such as lists, sets, sorted sets and Maps
•Values can be set to expire (as in a cache)
•Supports Lua scripting for data processing
Horizontal Scaling Strategy
•Supports scalability via sharding based on evenly
distributing keys across the cluster using a hashing or
range partitioning algorithm
• The number of shards is fixed across all data
collections
• Supports Client assisted query routing for data shard
Operational Analytics and Search
• First class support for Counters, top N queries
• No support for secondary indexes, which must be
manually created as top level collections
Replication
• Manual configuration of master-slave. A node is only a
master or a slave, requiring more machines to be
managed.
•All replication is performed asynchronously. It is
possible to lose data if a failure occurs after a master
has acknowledged a write but before replication has
completed.
• No support for cross datacenter replication
• Redis enterprise now supports geo distributed active
active replication using CRDTs
Management Operation Behavior –
Add/Remove Node
• Adding a node can take a long time (~days) because data must replicate. Using more, smaller servers alleviates this
• Supports only manual resharding of keys while staying online. However, this procedure is not guaranteed to survive all kinds of failure and may result in loss of data
Storage Support
• Disk-backed, in-memory DB, highly tuned for RAM usage. This can make infrastructure costs go up quickly with large data sets
• Redis Enterprise now supports Flash
Developer Friendliness
• Most popular key value store and 2nd most popular
NoSQL DB after MongoDB
•Memcache API, client libraries in most languages
•Supports Lua Scripting
• Supports messaging semantics out of the box
Handling Updates
•Supports node local transactions in theory, however
most Redis cluster clients do not support this
Technology Used
•Implemented in C, but cross platform. Single threaded
and tuned for RAM
Security
•No fine grained data level security
• Supports LDAP authentication and RBAC
Ease of Setup and Scale
•Requires minimal setup to get started.
•Scales from small to large datasets easily
Redis Cluster - Practices and Use Cases
Best Practices
• One must commit to upfront design based on known use cases and
data access paths. If the use case evolves and there are multiple
optimal access paths to the data required then it will almost
certainly have to be supported through duplication of at least a
subset of data through multiple key value collections
• Redis cluster is fairly new and it is best used with the Redis
Enterprise distribution in order to enable easier data operations
and administrative functions
When to consider
• As a data cache with a narrow access pattern
• For small data sets that need to be accessed many times
• Storing temporary state for fast access
• Message queues and pub sub scenarios
Use cases
• Analytics Counters and leaderboards
• Mobile Event notifications and subscriptions
• Spam filtering
• Item expiration by time
• User session information
• Cache for serving backend analytics
When not to consider
• For applications that require consistency or data integrity guarantees
• For applications that require flexible sql like queries
• Storing wide data, such as thousands of attributes for a key
• Storing data that requires queries with high time complexity
• Storing data that requires secondary access paths
MongoDB - Overview
MongoDB is a distributed document (binary JSON) database that provides a tunable
consistency and availability model, in a master-slave architecture. MongoDB supports a
flexible, MySQL-like storage layer architecture with pluggable storage engines – such
as WiredTiger, MMAP, RocksDB etc.
Data Objects in MongoDB
• Namespace – A logical grouping of collections
• Collections– Analogous to a table in a relational database, represents a
set of documents.
• Document – A record in a binary JSON (BSON) format .
• ObjectID – Unique identity of a document
• Index – A named index on a collection. Index is maintained local to a
shard
Query Language – Mongo DB custom query language and API
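A minimal sketch of the objects above using PyMongo: a collection of schemaless BSON documents, the auto-generated ObjectID, and a secondary index that is maintained locally on each shard. The connection URI, database and collection names are assumptions for illustration.

```python
# Minimal sketch (assumed URI and names): documents in a collection need not share a
# schema; _id (an ObjectID) is generated automatically if not supplied.
from pymongo import MongoClient, ASCENDING

client = MongoClient("mongodb://mongos-host:27017")   # connect through a mongos query router
products = client["catalog"]["products"]

result = products.insert_one({
    "sku": "A-100",
    "name": "Espresso machine",
    "attributes": {"color": "red", "warranty_months": 24},   # nested/denormalized document
})
print(result.inserted_id)                                    # the generated ObjectID

products.create_index([("sku", ASCENDING)])                  # secondary index, local to each shard
doc = products.find_one({"sku": "A-100"})
```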
MongoDB– Features and limitations
Consistency, Availability, and Latency
•Provides tunable consistency semantics, including strong consistency
at the highest read and write concerns using the V1 protocol
•Also provides tunable availability at the (considerable) cost of
consistency
• Expect read latency if you want linearizable reads, as a quorum is
enforced at read time
• Lack of centralized resource management across the cluster can lead
to unmanaged performance issues, such as when all nodes
independently decide to do compaction or garbage collection
Data Model
•Provides a document model with support for both embedded and
normalized documents
• Does not impose a schema on the documents and this is largely
managed by applications using the schema, allowing for maximum
flexibility, albeit with considerable scope for application induced
integrity and quality issues
•Cannot guarantee uniqueness of index across shards. This must be
managed by the application, if required
Horizontal Scaling Strategy
•Supports scalability via consistent hashing as per a shard id and
splitting data into a fixed number of predetermined shards
• A replicated set of Config servers store metadata and configuration
settings
• Each shard has one write replica set for handling writes and multiple
read replica sets for handling reads. This keeps read replicas idle for
any write activity, leading to lower resource utilization, but provides
better separation of workloads
•Requires additional mongos instances for query routing to the correct
shard
Operational Analytics and Search
• MongoDB supports secondary indexes.
• Secondary indexes are local to the shard, and any query without a
shard key will result in querying every shard
• MongoDB does not support a group-by operation in a sharded cluster;
instead, map-reduce and aggregation operations must be used. It also
does not support covered indexes
• Supports multikey indexes for searching on array structures embedded
inside a document
Replication
• Replication is enabled using a master slave architecture via Replica
Sets
• Primary replication sets are used for writes and secondary replication
sets for reads
•Replication is asynchronous from master to slave, consistent reads can
only be achieved by incurring the cost of a majority based quorum at
the time of a read
• Good replication characteristics for cross data center replication
Management Operation Behavior – Add/Remove Node
• Easy to add a shard. The balancer process automatically balances the
cluster by migrating chunks
• MongoDB also manages splitting and rebalancing data chunks once
they reach a threshold
• However, these operations come at a cost and impact read and write
performance, leading operations teams to often shut down the
auto-balancing and splitting processes
• Need to manually run compact processes in order to release unused
memory/space to the OS
• Supports background rolling builds that keep DB available during index
rebuilds (although performance is impacted)
Storage Support
•WiredTiger storage engine - Primarily uses hard disk for persistent
storage and RAM for indexes and working datasets. There is also an in-
memory storage engine that stores data entirely in memory
•The wiredtiger storage engine supports TTL indexes and tiered storage
via MongoDB Zones
•Typical Disk space used for Mongo production deployment /per
physical instance is 512 GB- 1 TB, largely limited by backup/restore
processes
Developer Friendliness
• Most popular NoSQL DB
•Supports a proprietary json based query syntax.
• No imposition of document structure and ease of indexing is a big plus
in terms of getting started quickly
• Support for Java, Python and other language APIs
Handling Updates
• There is no support for multi-document or multi-collection
transactions. Must eliminate the need for multi-row and multi-table
transactions through appropriate data modeling techniques. Although
the MongoDB documentation suggests the use of a two-phase-commit-like
pattern, it is not recommended for production apps that need to
ensure consistency and integrity of data
•WiredTiger uses MVCC for non locking algorithms for concurrent
updates which improves efficiency
•Supports change streams out of the box, to support near real time
event notification and synchronization scenarios
Technology Used
• Written in C++
• Does not optimize for the linux kernel, especially for disk or SSD based
IO
•The wiredtiger storage engine does provide storage efficiencies via
enhanced compression and granular concurrency control. It also
supports intra cluster network compression
• While MongoDB is multi-threaded, it does not utilize OS-native techniques
that optimize for modern multi-core and NUMA-aware hardware
architectures
Security
•Comes with loose default configurations that have been recently
exploited in malicious attacks
• Provides LDAP based authentication, role based ACLs and Encryption
at rest as well as in transit.
• Provides only collection-level access control policies, and not row- or
field-level security
Ease of Setup and Scale
•Considerable initial setup is required with 3 replica sets per shard
equating to 9 nodes with 1 mongod instance each or 3 nodes with 3
mongod instances each
•Requires additional mongos instances for query routing to the correct
shard
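The tunable consistency discussed above is exposed per collection handle or per operation. A minimal sketch (assumed names; PyMongo) requesting majority write acknowledgement and majority read concern, trading latency for stronger guarantees.

```python
# Minimal sketch (assumed names): w="majority" waits for a majority of the replica set
# to acknowledge the write; read concern "majority" returns only majority-committed data.
from pymongo import MongoClient
from pymongo.write_concern import WriteConcern
from pymongo.read_concern import ReadConcern

client = MongoClient("mongodb://mongos-host:27017")
orders = client["shop"].get_collection(
    "orders",
    write_concern=WriteConcern(w="majority", wtimeout=5000),
    read_concern=ReadConcern("majority"),
)

orders.insert_one({"order_id": "o-9001", "status": "PLACED"})   # slower, but durable on a majority
latest = orders.find_one({"order_id": "o-9001"})                # will not see rolled-back writes
```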
MongoDB- Practices and Use Cases
Best Practices
• It is important to give significant attention to data modeling upfront
– including deciding on embedding documents or to have references,
sharding keys and indexing strategy. Avoid scatter-gather queries and
choose the right level of write guarantees and read concerns
• Attention needs to be given to hardware sizing and configuration
parameters that accounts for growth in volume and usage, thus
avoiding cumbersome migrations. Ensure working sets fit in RAM
and avoid large documents. Dedicate each server to a single role, as
each Mongo DB server role has significantly different workload
characteristics. Ensure proper configuration of compression and data
tiering
When to consider
• When schema flexibility is required
• When MySQL like latencies are desired but the data does not fit a
single server
• When eventual consistency can be tolerated
Use cases
• A datastore for customer data
• Ecommerce product catalog
• For near real time event notifications and collaboration
• Real time analytics
• Mobile and social networking applications
• Storing semi-structured data such as blogs, content and logs
• For any application that has evolving data requirements.
When not to consider
• When analytical or general search queries are required
• When ACID transactions need to be guaranteed across
documents or collections
• When very low latency reads or writes need to be guaranteed
• When very high throughput writes are required
• For large batch data processing jobs
• For very large datasets (>100 TB) – While MongoDB can handle
these large datasets, the lack of cluster wide resource
optimization, and replica sets based architecture can make such
large clusters expensive to provision and maintain
References
• Cassandra - https://academy.datastax.com/resources/brief-introduction-apache-cassandra
• Cassandra Consistency - https://aphyr.com/posts/294-jepsen-cassandra
• HBase - https://mapr.com/blog/in-depth-look-HBase-architecture/
• HBase Splitting and merging - https://hortonworks.com/blog/apache-HBase-region-splitting-and-merging/
• HBase Filters - https://intellipaat.com/tutorial/HBase-tutorial/client-api-advanced-features/
• HBase vs Cassandra - http://bigdatanoob.blogspot.in/2012/11/HBase-vs-cassandra.html
• Redis scale-out - https://www.credera.com/blog/technology-insights/open-source-technology-insights/an-introduction-to-redis-cluster/
• MongoDB performance guide - https://neotan.github.io/images/media/MongoDB-Performance-Best-Practices.pdf
• MongoDb at Baidu - https://www.slideshare.net/matkeep/mongodb-at-baidu/7
• Overview
• Characteristics of Distributed Databases
• Distributed Database Models
• The Incumbents
• The Challengers
Scylla - Overview
Scylla is a distributed database designed from the ground up, using the Seastar framework, to be a significantly more efficient and scalable
drop-in replacement of Cassandra. In using the Seastar framework, Scylla has optimized heavily across the utilization of CPU, memory, network
and IO resources, significantly reducing costs compared to a Cassandra deployment with similar workloads
As a distributed database, Scylla has the same architectural foundation as Cassandra, in that it uses the wide column data model and masterless
ring architecture. There are, however, a few system architecture choices that allow it to offer enhanced operational capabilities, such as
guaranteed low latencies (no JVM stops) and availability during repair processes (due to parallel repair)
Scylla – Innovations
Scylla system architecture innovations and implications
• Faster packet processing by bypassing the kernel space (in ~80 CPU cycles)
• No garbage collection pauses, expensive locking or low CPU utilization
• No thread context switches; asynchronous lockless inter-core communication which is highly scalable
• Reconciling data in cache with incoming writes – reduces IO and data model complexity
• Row cache format is the same as the serialized format – reducing serialization and deserialization overhead
• Direct storage access, with explicit cache management, leading to better control and reduced serialization and deserialization overhead
Scylla – as an evolution of Cassandra
The implications of the system architecture improvements Scylla has made result in:
• 5-10 times better throughput for combined read/write workloads when compared to Cassandra, making Scylla a more cost-effective alternative to Cassandra
• Ability to scale effectively with additional cores in a node
• Guaranteed low latency, which cannot be offered by Cassandra because of garbage collection stops and thread locking behavior
• Better compression rates, compaction rates and IO efficiency, leading to deployment of higher density storage per node (2-5 TB/node), thus reducing the total cost of infrastructure
Scylla offers additional operations benefits
• Running repair and compaction processes in parallel with query workloads
• Tuning – Self tuning capability removes a lot of the manual overhead and guessing
• Isolation and scheduling of background and foreground jobs
• Provisioning – Ease of adding nodes to a cluster – multiple nodes can be added at once and standing up nodes is a faster process compared
to adding a Cassandra node to a Cassandra cluster
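Because Scylla speaks the CQL wire protocol, the drop-in-replacement claim can be tested with existing Cassandra driver code pointed at Scylla nodes. The sketch below is illustrative; host names and the keyspace/table (reusing the schema from the Cassandra modeling example earlier) are assumptions.

```python
# Minimal sketch (assumed hosts and schema): the standard Cassandra driver works
# against Scylla unchanged; ScyllaDB also ships a shard-aware fork ("scylla-driver").
from cassandra.cluster import Cluster

cluster = Cluster(["scylla-node1", "scylla-node2", "scylla-node3"])
session = cluster.connect("activity")   # same keyspace/table as the earlier Cassandra sketch

row = session.execute(
    "SELECT activity_time, action FROM events_by_user_day "
    "WHERE user_id=%s AND day=%s LIMIT 10",
    ("user42", "2018-06-01"),
).one()
```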
Accumulo – Overview
Apache Accumulo is a highly scalable structured store based on Google’s BigTable. Accumulo is written in Java and operates over the Hadoop
Distributed File System (HDFS). Accumulo supports efficient storage and retrieval of structured data, including queries for ranges, and provides
support for using Accumulo tables as input and output for MapReduce jobs. Accumulo provides strong consistency models and is CP on the
CAP spectrum
Accumulo is the 4th most popular wide column data store after Cassandra, HBase and Microsoft Cosmos. It stands at 60th overall in database
popularity as per DB-Engines ranking
Accumulo has a lot in common with HBase and can be considered for similar use cases. However, Accumulo has implemented several
enhanced capabilities that give it an edge when compared to HBase
Accumulo – Innovations
Accumulo architecture innovations
Security
• Accumulo data model adds the
concept of column visibility to the
original BigTable model. This
enabled implementing very fine
grained cell level security for big
data
Data Model Flexibility
• Ability to add and change column
families “after the fact”
• Flexible locality groups, which
allow application designers to
control how columns are grouped
on disk, a trick that conventional
column-oriented databases rely on
for performance of ad-hoc queries.
• Configurable conditions under
which writes to a table will be
rejected. Constraints are written in
Java and configurable on a per
table basis.
Secondary Index support
• While Accumulo does not support
secondary indexes out of the box,
it provides several architectural
features that make it easy to
implement secondary indexes
• support for very large rows and
partial scans of rows, which
allows applications to build and
maintain their own secondary
index tables without hitting
memory limits
• batch scanners, which can
facilitate fetching many small
reads in a random access fashion
that allows applications to
quickly return full rows
corresponding to matches found
via index tables
• Through the use of specialized
iterators, Accumulo can be a
parallel sharded document store.
For example Wikipedia could be
stored and searched for
documents containing certain
words.
Volume support
• Supports a collection of HDFS URIs
(host and path) which allows
Accumulo to operate over multiple
disjoint HDFS instances. This
allows Accumulo to scale beyond
the limits of a single namenode.
• Allows splitting of metadata files,
thus allowing scaling to very large
volumes in trillions of rows of data
Accumulo – A better BigTable implementation
The implications of the architecture improvements for Accumulo enable the following additional use cases to be implemented with Accumulo
• When flexible data models are required in big data scenarios
• When data ingest rates are high and the data sets can grow up to trillions of rows
• For flexible querying of data using search terms or for Graph traversal in very large graphs
• When data security requirements are complex and fine grained at the data attribute level
However, Accumulo does have a few downsides compared to HBase
• The Accumulo server instances tend to be large to accommodate large rows. Therefore, recovery from a failed node can take significant
time
• Accumulo is not as well integrated into the Hadoop ecosystem as HBase (eg. Atlas integration, Oozie integration etc)
Aerospike - Overview
Aerospike is a distributed, scalable NoSQL database for storing key-Value based structures. Although
Aerospike architecture is fundamentally geared towards maximizing availability, it has made
accommodations for strong consistency models since Aerospike 3.0, albeit at the cost of latency.
Aerospike is unlike other popular key-value stores – Redis Server, Redis Cluster or Memcached –
and has made drastically different architectural choices while making significant improvements to
the system architecture
Aerospike Architecture
• The Aerospike architecture comprises three layers:
• Client Layer: This cluster-aware layer includes open source client libraries, which implement Aerospike
APIs, track nodes, and know where data resides in the cluster.
• Clustering and Data Distribution Layer: This layer manages cluster communications and automates fail-
over, replication, cross data center synchronization, and intelligent re-balancing and data migration.
• Data Storage Layer: This layer reliably stores data in DRAM and Flash for fast retrieval.
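A minimal sketch of the key-value model above using Aerospike's Python client: a record addressed by (namespace, set, key), with bins that can mix datatypes, stored through the hybrid DRAM/flash data layer. The host, namespace and set names are assumptions for illustration.

```python
# Minimal sketch (assumed host and namespace): the cluster-aware client layer routes
# each (namespace, set, key) directly to the node that owns its partition.
import aerospike

client = aerospike.client({"hosts": [("aerospike-node-1", 3000)]}).connect()

key = ("profiles", "users", "user42")        # (namespace, set, user key)
client.put(key, {                            # bins can mix datatypes
    "name": "Ada",
    "segments": ["electronics", "books"],
    "visits": 17,
})

(_, meta, bins) = client.get(key)            # single-key reads/writes are atomic
print(bins["segments"])
client.close()
```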
Aerospike – Innovation
Distributed Architecture innovation
• Unlike other key-value databases, which started out as single server high performance caches, the architecture of Aerospike has three key
objectives:
• Create a flexible, scalable platform for web-scale applications – Multi cluster masterless setup without complex master slave configurations. It also provides
flexibility in its data model in that it allows multiple datatypes to be mixed into bins
• Provide the robustness and reliability (as in ACID) expected from traditional databases – Provides single key atomic transactions (Does not support multi-key
acid transactions)
• Provide operational efficiency with minimal manual involvement – Automated balancing, indexing, sharding, cross datacenter replication and recovery
System Architecture Innovation
• Aerospike is implemented in C and is optimized for SSDs and processing speed
• Supports hybrid storage across SSD, HDD and RAM – This enables scalability without compromising on speed
Limitations
• Indexes are all stored in memory, which increases the cost of storage for very large data sets. Also indexes are not global, and need to be
queried using scatter gather queries
• Some data structures have Maps associated with them
Aerospike – The best key-value clustered DB
The applications of Aerospike will tend to be in the following areas
• Storing massive amounts of profile data in online advertising or retail Web sites.
• For real time low latency streaming analytics applications, such as real time fraud detection, financial front office applications etc
What is not a good use
• A data store with a large number of indexes
• Where transactional integrity across datasets or keys is required – e.g. inventory management, financial transaction processing etc
• Very large volumes - ~PBs
Yugabyte – Overview
Yugabyte is an implementation of the Google Spanner architecture (as laid
out in the Google Spanner paper). Like Google Spanner, it is meant to be a
system-of-record/authoritative database that geo-distributed applications
can rely on for correctness and availability.
It is written in C++ and is Apache 2.0-licensed open source software.
Yugabyte's API is wire compatible with CQL, Redis and PostgreSQL.
Unlike Google Spanner, Yugabyte maintains the goal of low latency
in spite of respecting ACID semantics.
This opens up a host of application possibilities for Yugabyte, including
blockchain implementations
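Because Yugabyte's YSQL API is PostgreSQL wire compatible, a standard PostgreSQL driver can talk to it directly. The sketch below is illustrative; the host, port, credentials and table name are assumptions (5433 is commonly the default YSQL port).

```python
# Minimal sketch (assumed host/port/credentials): rows inserted here are auto-sharded
# and replicated across the cluster, and the commit is a distributed ACID transaction.
import psycopg2

conn = psycopg2.connect(host="yb-tserver-1", port=5433,
                        dbname="yugabyte", user="yugabyte")
cur = conn.cursor()
cur.execute("""
    CREATE TABLE IF NOT EXISTS quotes (
        symbol TEXT,
        ts     TIMESTAMP,
        price  NUMERIC,
        PRIMARY KEY (symbol, ts)
    )
""")
cur.execute("INSERT INTO quotes VALUES (%s, now(), %s)", ("IBM", 145.20))
conn.commit()
cur.execute("SELECT price FROM quotes WHERE symbol = %s ORDER BY ts DESC LIMIT 1", ("IBM",))
print(cur.fetchone())
```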
Yugabyte - Innovation
Yugabyte is built with the following very ambitious goals in mind
1. Transactional
• Distributed acid transactions that allow multi-row updates across any number of shards at any scale.
• Strongly consistent secondary indexes
• Transactional key-document storage engine that’s backed by self-healing, strongly consistent replication.
2. High Performance
• Low latency for geo-distributed applications with multiple read consistency levels and read-only replicas.
• High throughput for ingesting and serving ever-growing datasets.
3. Planet-Scale
• Global data distribution that brings consistent data close to users through multi-region and multi-cloud deployments.
• Auto-sharding and auto-rebalancing to ensure uniform load balancing across all nodes even for very large clusters.
4. Cloud Native
• Built for the container era with highly elastic scaling and infrastructure portability, including Kubernetes-driven orchestration.
• Self-healing database that automatically tolerates any failures common in the inherently unreliable modern cloud infrastructure.
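To make the distributed transaction goal concrete, the following is a minimal sketch of a multi-row ACID transaction through Yugabyte's PostgreSQL-compatible API, using the standard psycopg2 driver. It assumes a local YugabyteDB node with YSQL on its default port 5433 and the default yugabyte/yugabyte credentials; the accounts table is illustrative. Because rows are auto-sharded, the two updated rows may live on different shards, yet still commit atomically.

# Minimal sketch: a multi-row ACID transaction via Yugabyte's PostgreSQL-compatible API.
# Assumptions: local YugabyteDB node with YSQL on the default port 5433,
# default yugabyte/yugabyte credentials, and psycopg2 installed.
import psycopg2

conn = psycopg2.connect(host='127.0.0.1', port=5433,
                        dbname='yugabyte', user='yugabyte', password='yugabyte')
conn.autocommit = False

with conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE IF NOT EXISTS accounts (
            id      int PRIMARY KEY,
            balance numeric NOT NULL
        )
    """)
    cur.execute("""
        INSERT INTO accounts (id, balance) VALUES (1, 500), (2, 500)
        ON CONFLICT (id) DO NOTHING
    """)
    conn.commit()

    # Both updates are part of one transaction and commit atomically,
    # even if the two rows are stored on different shards/nodes.
    cur.execute("UPDATE accounts SET balance = balance - 100 WHERE id = %s", (1,))
    cur.execute("UPDATE accounts SET balance = balance + 100 WHERE id = %s", (2,))
    conn.commit()

conn.close()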
Key Weaknesses
• Relatively new and not yet widely adopted in the enterprise, although there is promising adoption among startups
• No independently published benchmarks, although there are many positive benchmarks comparing it to MongoDB, Cassandra and Google Spanner, highlighting
performance and throughput strengths when the other DBs are tuned to strong consistency levels
Yugabyte – Promising cloud native, globally consistent DB
Yugabyte's unique capabilities make it usable for many demanding use cases:
• Elastic DB service for IoT data, especially where sensor data may be geographically distributed
• Geographically distributed, consumer-facing digital operations
• Financial data services – real-time, strongly consistent updates – e.g. stock quote services, financial portfolio management
• Lambda architectures for serving real-time analytics – especially for time-ordered datasets – e.g. personalization based on user activity
References
• Scylla
• https://github.com/scylladb/scylla/wiki/Repair---Scylla-vs-Cassandra
• https://www.youtube.com/watch?v=YBsbXYvyZnA
• https://www.scylladb.com/product/technology/
• Accumulo
• https://www.slideshare.net/DonaldMiner/survey-of-accumulo-techniques-for-indexing-data
• http://accumulosummit.com/program/talks/comparing-accumulo-cassandra-HBase/
• Yugabyte vs MongoDB – https://blog.yugabyte.com/overcoming-mongodb-sharding-and-replication-limitations-with-yugabyte-db-ec4eefa5bbd5
• Aerospike availability and consistency – https://aphyr.com/posts/324-jepsen-aerospike
• Aerospike consistency – https://www.aerospike.com/docs/architecture/acid.html
Databases to be added
• Google Spanner
• MongoRocks (MongoDB with the RocksDB engine)
• Microsoft Cosmos
• CouchDB (maybe)
Evolution of Distributed Database Technologies in the Digital era

  • 1. Evolution of Distributed Operational Database Technologies in the Digital Era An architectural and digital fitment analysis Vishal Puri Executive Architect Data Platforms Services IBM GBS Cognitive Business Decisions Support Email - vpuri@us.ibm.com Mobile - 4692197510
  • 2. Table of contents • Overview • Characteristics of Distributed Databases • Distributed Database Models • The Incumbents • The Challengers
  • 3. • Overview • Characteristics of Distributed Databases • Distributed Database Models • The Incumbents • The Challengers
  • 4. Executive Brief The use of distributed databases (also called NoSQL DBs) for supporting operational processes including operational intelligence is increasing as companies increase the adoption of digital process engagement with their consumers as well as digitizing internal and partner facing operations. With digitization, comes the challenge of scale, flexibility and agility that distributed databases are uniquely positioned to address. Distributed database marketplace • Distributed databases have been in the market for a long time with mature solutions with significant enterprise adoption. The examples of such databases is MongoDB, Cassandra, Redis and HBase. Incumbent leaders continue to innovate their feature sets, with some doing so more successfully than others, such as Mongo DB (with MongoRocks). • The market leaders are saddled with legacy architectural debt, which has opened the door to new challengers that provide attractive propositions and are making seamless replacement of existing solutions a key architectural goal. These solutions are ScyllaDB, Aerospike, Accumulo and Yugabyte among others. • While this paper does not touch upon public cloud only offerings (Amazon DynamoDB, Microsoft Cosmos or Google Spanner) there is a significant interest in these offerings in the marketplace. There is also significant interest in adopting solutions that are cloud native and multi-cloud enabled, with as- a service offerings from multiple vendors. Database practitioner recommendations • Data Architects must keep on top of the changing landscape and innovations. As this paper articulates, the best way forward is to understand distributed database architecture patterns. This paper also postulates a framework to breakdown the characteristics of any distributed database for comparison purposes and to gauge its suitability for specific architectures. • Most distributed databases require upfront data modeling and data modeling as a discipline for implementing such databases is a skill that must be nurtured in technology organizations • With changing landscapes and use case specific implementations, it is important to have database devops SME’s to be multiskilled, especially w.r.t database operations in a cloud environment
  • 5. Core Audience and Goals Who is the core audience? • The core audience for this document is a practicing architect providing solutions for digital business functions. What are the key goals ? • Explore core architectural patterns for distributed databases. • Explore a class of distributed databases that are used to enable digital engagement use cases. • Establish an architectural framework to determine suitability of specific Distributed DBs in addressing various digital use cases. • Apply the above architectural framework in assessing market leading DBs as well as emerging challengers in the marketplace. • Enable deep architectural thinking as opposed to providing convenient but shallow answers. What is not a goal? • Providing a simple decision tree for choosing a database. • Direct and convenient comparison of all DBs. What are the class of NoSQL DBs not covered ? • Any Relational or hybrid relational + NoSQL DB – Postgres, MySQL, DashDB. • Any solution that only scales vertically or via replication (MySQL, Postgres, Aurora etc). • Graph Databases or purpose built time-series DBs – Neo4J,OrientDB, Titan etc. • Analytics oriented NoSQL solutions – Hive, Impala, Drill, Spark SQL etc. • Search oriented solutions –Solr, ElasticSearch etc.
  • 6. What problems do Distributed Databases solve in the digital world? Internet Scale • By making compromises along the consistency and availability spectrum, NoSQL databases enable distributed databases that can scale horizontally and can possibly be distributed geographically. This was not possible with traditional relational DB’s with ACID semantics and enforced referential integrity (However, some distributed NewSQL architectures such as Spanner challenge this) Insights Driven Digital Engagement • Enable alternate data models that more naturally represent problem domains in the dynamic digital world (As opposed to force fitting everything into a predefined relational DB), thus enabling more efficient queries for operational insights and interactions • Dynamic Customer attributes and activities • Time series data – Sensors, Stock price • Session caches • Searchable logs, • Large Collections – Queues, Lists, Maps, Counters etc. • Network and semantic Graphs • Real time counters • Global custom • er transaction databases
  • 7. • Overview • Characteristics of Distributed Databases • Distributed Database Models • The Incumbents • The Challengers
  • 8. Common characteristics of distributed databases Horizontally scalable distributed database, for internet scale use cases 1 High availability and resiliency built in 2 Data is partitioned and replicated across the cluster 3 Data model is some form of Multi- Dimensional Map 4 Primarily optimized for operational use cases 5
  • 9. How to dissect and analyze fitment to purpose of a Distributed DB Consistency, Availability, and Latency •Eventual vs strong consistency •What happens on network partitions or server failures • Write optimized, read optimized or throughput optimized Data Model •Wide Column/Key Value/Document •Static vs Dynamic Typing Horizontal Scaling Strategy •Hash based partitioning/ Ordered Partitioning/ Replication •Load-balancing reads and writes Operational Analytics and Search •Support for secondary indexes – transitionally consistent indexes/external indexes/distributed indexes •Support for range scans •Support for joins •Support for counters and aggregation Replication •Master Slave / Distributed •Cross datacenter replication support Management Operation Behavior – Add/Remove Node •Data redistribution, Master election Storage Support •Tiered vs Memory centric vs Disk Centric vs Flash Centric Developer Friendliness •Client APIs - REST API/CQL/Java Library/JSON/Other •Popularity score in DbEngines Handling Updates •Locking vs MVCC •Partial vs full update Technology Used •Open source vs Closed source •Java/JVM based vs C/C++ based •Cross platform vs Linux optimized Security •DB and schema level security – Read/Write/Delete/DDL •Data Level security – Row/Column/Cell level security •LDAP/Kerberos integration Ease of Setup and Scale •Minimum infrastructure required to start •Ease of scaling
  • 10. The Consistency Availability Spectrum for Distributed Databases – CAP Theorem CAP Theorem Only 2 of the 3 properties - Consistency, Availability, and Partition- tolerance can be satisfied by a distributed database. Consistency: A read operation is guaranteed to return the most recent write Availability: Any operation is guaranteed to receive a response saying whether it has succeeded or failed Partition tolerance: The system continues to operate when a network partition occurs While the CAP theorem is not considered to be sufficient* to articulate the behavior of a distributed database it does make for a useful classification in making high level decisions about a distributed database choice. • https://arxiv.org/pdf/1302.0309.pdf An alternative to understanding distributed systems is the PACELC Theorem, described in the original paper as - “if there is a partition (P), how does the system trade off availability and consistency (A and C); else (E), when the system is running normally in the absence of partitions, how does the system trade off latency (L) and consistency (C)?”
  • 11. The Distributed Operational DB Architecture Spectrum Big Table Dynamo In Mem Data Grid KV Store Cassandra Scylla Google BigTable Dynamo DB HBase Redis Aerospike Memcache Hazelcast Ignite Clustered / Sharded / Distributed SQL DB Mysql NDB Cluster Citus DB (Postgres) MemSQL Document Store Wide Column Store Graph / RDF Store Orient DB Titan CoucHBase MongoDB Couch DB Marklogic Marklogic Elastic Search Search Solr Accumulo • There is no perfect all purpose distributed database. • Each distributed database has made specific architectural choices and compromises so as to be best suited for a narrow range of use cases • Fortunately each architectural category has several choices of DB’s, providing us with choices no what our use case • Architectures continue to evolve with newer distributed databases addressing shortcomings and technical debt of existing leaders by making better ground up architectural decisions and technology choices Google Spanner Cockroach DB YugaByte DB Coherence Azure Cosmos DB
  • 12. Data Modeling in a Distributed Database Along with making choices on availability, consistency and latency, every database offering makes fundamental architectural choices regarding the data models it would like to enable and the associated restrictions it needs to put in those data-models to honor the laws of computational physics. Most of the distributed database data model characteristics have evolved based on popular and expanded usage scenarios along with new developments in Hardware architectures. Key characteristics of a Data Model to look for are – 1. What is the fundamental data structure ? – Wide Column, Document, Key Value, Object Collections 2. How is the data distributed/sharded in the cluster? – Random partitioning Or Ordered partitioning, Range partitions or Hash Partitions 3. Does the data model support clustering or ordered data storage within a server to support scans? – Clustering Keys, Composite primary Keys, Ordered Storage 4. Does the the data model support secondary indexes? If so, are the secondary indexes transitionally consistent. Are the secondary indexes efficient in supporting a search without a primary key reference? 5. Does the data model support CRDTs (Conflict-free replicated data types) – These are especially useful in eventually consistent DBs like Cassandra, Riak and Aerospike to enable reliable distributed updates. For example Counters, Sets etc 6. Does the data processing engine support server side operations such as stored procedures, map-reduce functions or triggers Needless to say, data modelling in a Distributed NoSQL or newSQL database is very different from modeling a relational schema. One requires a shift in mindset - • View the data as a composite dataset driven by narrow access patterns. • Since multi-row or multi-document transactions are at a premium, having denormalized or nested data is preferred • Since joins of data are either not supported or prohibitively expensive, plan for manual joins in the application • Design of the primary key is the most important part of modeling and getting it right will dictate • Ease of access to the data for most access patterns • Scalability of reads and writes across the cluster • Ability to have shared multi-tenant database • Referential integrity is not maintained by the database in most solutions (barring distributed sql DBs like Spanner) and should not be relied on for maintaining this • Know your CRDTs
  • 13. Distributed DB Engine Popularity* • Popularity in the marketplace and developer mind-space is often a consideration in selecting a Distributed DB of choice • A very high popularity score often reflects the ease of setup and developer friendliness of the solution, rather than merits of the solution at scale • MongoDB, Redis and Cassandra and incumbent leaders in this space • However, cloud centric offerings (Dynamo, Cosmos) along with newer entrants such as Scylla, Aerospike and Yugabyte are on the rise • While the popularity may not reflect the fit of the DB for the use case, it does reflect on availability of skills in the marketplace • Stagnation in popularity of a DB also indicates emergence of alternatives that are addressing shortcomings of a solution. It may also reflect the stagnation of the ecosystem itself, for e.g. Hadoop adoption stagnation impacts adoption of Accumulo and HBase * https://db-engines.com/en/ranking_definition * https://db-engines.com/en/ranking_trend
  • 14. Distributed DB architecture limitations and consequences Category Issue Consequence Alternative Solutions Inefficient Processing Use of Java/JVM technologies Garbage Collection Stops, Inefficient memory usage, Complexity in operations and expensive at scale • Use C/C++ frameworks • Use and manage offheap memory where feasible Not optimized for OS/Hardware, Uses generic OS Caches and default kernels that do not optimize for target use cases and access patterns leading to large installations at scale • Optimized for multi core, NUMA architectures, L1-L2 Caches, vector processing etc Not optimized for storage Uses a single storage system that does not provide adequate flexibility between speed and cost • Intelligently use of multi-tiered storage strategy, dynamically moving data between flash, RAM and disk in order to optimize based on access pattern Flexibility Fixed access patterns based on primary or shard keys Multiple access patterns require changes to data model design and/or duplication of data • Support for scalable secondary index implementation • Support for range scans Fixed schemas Addition of schema elements is operationally complex process • Support for easy schema evolution through flexible schemas with dynamic attributes Operational complexity Eventual consistency Results in stale data across the distributed nodes, that needs to be reconciled and repaired • Strongly consistent solutions • Incremental and fast repair tools that do not impact availability/performance • Support for distributed transactions Data redistribution Adding or removing nodes causes redistribution of large volumes of data which impacts performance and takes long time to complete • Architectures that minimize data movement for scaling and load balancing
  • 15. References • Jepsen – A tool to understand Distributed database Availability and Consistency characteristics – • https://aphyr.com/tags/jepsen • https://aphyr.com/posts/343-scala-days-2017-jepsen-keynote • Conflict-Free Replicated Data Types (CRDT) - https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type
  • 16. • Overview • Characteristics of Distributed Databases • Distributed Database Models • The Incumbents • The Challengers
  • 17. Wide-Column - Big Table Architecture • Provides horizontal scalability, immediate consistency and network partition tolerance, at the cost of loss of availability in some scenarios • Requires Master Server for metadata • Write optimized as no disk seek is needed – Write to log and Memtable, flushed periodically to SSTable on disk. Many SSTables on disk – 1 per generation/version • Merged reads – read from Memtable and SSTable. • Compaction - Many SSTables increase read time, therefore compaction occurs to reduce disk seek time. Compaction = multiple SSTables combined into single SSTable at O(logN) rate • Relies on distributed file system for durability Data model • Column Family oriented • Supports sparse data – lots of null values • File per column family. Data sorted per column family by row id, column name and timestamp. Sorted data is compressed • Cell is smallest addressable unit – row, column. Data in each cell has configurable number of versions Best used for • high throughput writes, where read to write ratio is balanced Example • HBase, Cassandra, Scylla
  • 18. Wide-Column - Dynamo Architecture • Enables high availability and partition tolerance with horizonal scalability, at the cost of immediate consistency, by enabling eventual consistency models • Dynamo allows read and write operations to continue even during network partitions and resolves update conflicts using different conflict resolution mechanisms, some client driven. • Dynamo's Gossip based membership algorithm helps every node maintain information about every other node. • Dynamo can be defined as a structured overlay with at most one-hop request routing. • Dynamo detects updated conflicts using a vector clock scheme, but prefers a client side conflict resolution mechanism. A write operation in Dynamo also requires a read to be performed for managing the vector timestamps. This is can be very limiting in environments where systems need to handle a very high write throughput. Data model • Item model – {Key – Attribute Maps} Example • DynamoDB, Riak, Cassandra, Scylla
  • 19. Document Stores Architecture • Wide variety of architectures – • from strongly consistent (MongoDB, Marklogic, YugaByte) to eventually consistent (CouchDB, Cloudant) • Automated sharding with fixed shards (MongoDB) to truly distributed and dynamic sharding (YugaByte, CouchDB) • Read Scaling and fault tolerance with dedicated Replica Sets (MongoDB) to sharded replication with distributed read and writes • Enables dynamic schemas, leaving data validation and integrity to be largely driven by applications • Data model is usually JSON in text or Binary form (BSON) or XML • Most distributed document databases only support local secondary indexes forcing one to search all nodes in a cluster when primary key is not part of the query Data model • Item model – {Key – Binary JSON} Example • MongoDB, CouchDB, Cloudant, MarkLogic, YugaByte
  • 20. In Memory Key-Value Store Architecture • Most solutions (Memcache, Redis) in this category have roots in single server data storage with a focus on speed and throughput. Application driven sharding, as well as lack of availability and data consistency guarantees has been the rule rather than the exception • Some solutions such as Redis Cluster evolved to provide horizontal scalability in addition to performance, without providing true availability or consistency guarantees under network partitions. • Some solutions such as Aerospike support tunable consistency with varying degrees of availability, i.e., always available with simple conflict resolution for eventual consistency or primary key consistency with lower availability • Master slave architectures • Automated or manual sharding • In Memory storage of keys, Values stored in memory or on other storage media (Disk/Flash) Data model • Key – Value Pairs, support for complex data structures such as Lists and Nested Maps Example • Memcache, Redis, Redis Cluster, Aerospike, Riak KV
  • 21. In Memory Data Grid Architecture • In – Memory cache • Configurable Replication • Limited ACID Compliance through transactions • Configurable Consistency and availability semantics with limited Network partition tolerance Data model • Distributed Collections – Maps, Sets Lists Example • Hazelcast, Ignite (GridGain), Coherence
  • 22. Distributed ACID compliant DBs - Spanner Architecture • The architecture was mostly developed to take care of three problems at massive scale (Trillions of rows): • Global Consistency - if a record is updated in one region (say, Asia) someone reading in another region (Say Africa) will have the same record updated on reading. This is supported through synchronous globally distributed transactions using the TrueTime API and atomic clocks supported by GPS, in order to prevent any time inconsistency • Table-Like Schema – Support for table like schema via rows and columns, backed by a key-value store • SQL Query Language – A SQL query parser to support SQL clients • In order to support the above characteristics this architecture sacrifices support for raw performance required by low latency use cases and incurs possible additional latency with every new node added • The TrueTime API along with the atomic clock infrastructure ensures that data is consistent within a very small time interval (~1-7 ms) • Spanner has three main types of operations: read-write transaction, read-only transaction and snapshot read at a timestamp, which are all supported on a globally distributed database • Spanner provides first class support for unified cross data center and cross availability zone data clusters. It enables this via concepts of global indexes, zone masters, span servers (which are similar to HBase region servers) and use of paxos for fine grained cluster consensus Data model • Item model – Table like schemas implemented using timestamped key value data structure Use case • Data Consistency in a global deployment, across many data centers. E.g. Global digital banking or any global ecommerce application • Very large data sets with relaxed read write latencies (>20 ms) • Blockchain and crypto currency implementations Example • Spanner, CockroachDB, Azure Cosmos DB, YugaByte, NuoDB
  • 23. References • Google BigTable Paper - https://static.googleusercontent.com/media/research.google.com/en//archive/bigtable-osdi06.pdf • Amazon Dynamo Paper - https://www.dynamodbguide.com/the-dynamo-paper/ • Spanner Architecture - https://kunigami.blog/2017/04/27/paper-reading-spanner-googles-globally-distributed-database/ • Comparing scaleout sql - https://blog.yugabyte.com/practical-tradeoffs-in-google-cloud-spanner-azure-cosmos-db-and-yugabyte-db- ce720e07c0fd
  • 24. • Overview • Characteristics of Distributed Databases • Distributed Database Models • The Incumbents • The Challengers
  • 25. Cassandra - Overview Cassandra is a distributed eventually consistent database which follows the BigTable wide column format and the Dynamo DB masterless “ring” distributed storage architecture Data Objects in Cassandra • Keyspace – a container for data tables and indexes; analogous to a database in many relational databases. It is also the level at which replication is defined. • Replication factor − Number of machines in the cluster that will receive copies of the same data. • Replica placement strategy − Strategy to place replicas in the ring. • Column families − Keyspace is a container for a list of one or more column families. A column family, in turn, is a container of a collection of rows. Each row contains ordered columns. Column families represent the structure of your data. Each Keyspace has at least one and often many column families. • Row key – used to identity a row uniquely in a Column Family and also distribute a table’s rows across multiple nodes in a cluster. • Index – similar to a relational index in that it speeds some read operations; also different from relational indices in important ways. Query Language – Cassandra Query Language (CQL) Masterless Ring architecture – Multi-datacenter and cloud deployable Keyspace as a Column Family container Column Family consists of Row Key and Columns
  • 26. Cassandra – Features and limitations Consistency, Availability, and Latency •Provides very high availability and eventual consistency based on Dynamo architecture •Remains available in the face of network partitions • Optimized for very low latency writes. • However, reads incur cost through quorum checks, if read consistency levels are set to >1 Data Model •Adopts the BigTable Wide Column data model •Enables Clustering columns to provide order within a partition •Supports secondary indexes as well •Schema needs to be modeled upfront, including column families and columns Horizontal Scaling Strategy •Supports scalability via consistent hashing and random spread of data across the cluster. • Large number of nodes in a cluster (>100) causes excessive inter node communication. Similarly large number of tables in a cluster also causes issues Operational Analytics and Search • Since data is randomly distributed, range scans are not supported on primary keys or on secondary indexes, • All clustering columns must be provided in a query • Secondary index lookups require querying all partitions • Materialized views are supported but are very limited in their function • Good support for distributed Counters •Most common implementation is to enable secondary indexes using Elastic Search connectors. These are not strongly consistent indexes Replication • Very good support for cross datacenter replication out of the box Management Operation Behavior – Add/Remove Node •Adding a node can take a long time (~days) because data must replicate. More smaller servers alleviates this •Node repair is a costly operation that can impact resources used to serve data requests. Incremental node repair can minimize this impact but is only available in the enterprise version •Similarly data compactions can also impact online performance Storage Support •Primarily uses hard disk for persistent storage. Node density of no more than 2 TBs, although 1 TB is recommended. This can lead to large cluster sizes quickly Developer Friendliness • Most popular wide column DB and 3nd most popular NoSQL DB after MongoDB and Redis •Supports SQl like language called CQL –used for query and DDL Handling Updates •Supports lightweight transactions through compare and Set features. • There is no support for multi-row transactions. Must eliminate need for multi-row and multi-table transactions through appropriate data modeling techniques Technology Used •Java based implementation • Need to monitor and tune JVM garbage collection and heap usage constantly • Offheap memory can also be allocated for memtables, bloom filters etc Security •Row level access control supported, but require exact string matches and Datastax •Columns level security not yet implemented Ease of Setup and Scale •Requires minimal setup to get started. •Scales from small to large datasets seamlessly. •Unlike HBase, all nodes are equal
  • 27. Cassandra - Practices and Use Cases Best Practices • One must commit to upfront design based on known use cases and data access paths. If the use case evolves and there are multiple optimal access paths to the data required then it will almost certainly have to be supported through duplication of atleast a subset of data • It is common to pair Cassandra with an implementation of Solr or Elastic search in order to enable secondary indexes • It is recommended to keep Cassandra Cluster sizes to a moderate number of servers (<150 ) with at most 2 TB disk size • Cassandra requires an operations team that will work towards monitoring and optimizing of the settings and infrastructure on a regular basis, apart from monitoring complex operations such as splitting, compaction and node repair When to consider • For application features or microservices that serve a narrow use case and access pattern, for large data sets and users • For maintaining global or fine grained counts through counters • Suited well for applications and feature sets that need to start small but grow with increased usage • Ingesting large volumes of data with high throughput and low latency • Data services across geographies and data centers Use cases • Consumer activity and extended profile DB for recommendation and personalization • Web analytics data such as clickstreams and counters • Graph DB for social network analysis, fraud analysis • Storage for IOT data • Storage for firehose datasets • Web analytics data such as clickstreams and counters • E-commerce carts and checkout • Product catalogs and playlists When not to consider • For applications that require flexible access patterns through sql like queries, range scans etc • For applications that require a flexible and dynamic schema • As an analytics DB. Cassandra is often paired with Spark to perform analytical operations, however Cassandra does not provide any benefits to the spark engine such as query execution or data set filtering/pruning
  • 28. HBase - Overview HBase is a distributed strongly consistent database which follows the BigTable architecture and wide column format, and uses HDFS for storage, while providing low latency read and write access Data Objects in HBase • HBase Tables – Logical collection of rows stored in individual partitions known as Regions. • HBase Row – Instance of data in a table. • RowKey -Every entry in an HBase table is identified and indexed by a RowKey. • Columns - For every RowKey an unlimited number of attributes can be stored. • Column Family – Data in rows is grouped together as column families and all columns are stored together in a low level storage file known as HFile. Query Language – SQL via Phoenix or Java API
  • 29. HBase – Features and limitations Consistency, Availability, and Latency •Provides strong Consistency at the expense of Availability, based on the BigTable architecture •Availability can be increased by adding read replica’s on failure of primary replica • Reads and writes are optimized for throughput rather than latency, since the storage is dependent on HDFS. •Can provide MySQL like latencies at scale Data Model •Adopts the BigTable Wide Column data model •Data is stored and ordered by row key •Fixed column families – No more than 2 column families advised. Within a column family flexible number of columns are allowed Horizontal Scaling Strategy •Supports scalability via auto sharding. Basic unit of sharding is region. Regions are created and merged depending upon a configurable policy •Writes are bound to a primary region and unlike reads, are not load-balanced to the replicas • Metadata does not split and scale • Can easily handle billions of rows X millions of columns Operational Analytics and Search • Stores keys in ordered fashion. Good at range scans and sorts •Query predicate push down via server side scan and get filters • Most common implementation is to enable secondary indexes using Phoenix or Solr connectors. These are not strongly consistent indexes Replication •Support for synchronous replication, as of HBase 2.0, as well as eventually consistent replication to read only replica Management Operation Behavior – Add/Remove Node •Rolling restart for configuration changes and minor upgrades •Region Splits caused by data growth results in loss of availability as Region of data is forced offline to enable movement of data Storage Support •HBase can use tiered storage across RAM, SSD and disk •Uses HDFS for storage – HDFS is optimized for throughput, not latency •Uses Region Server as a middleware for implementing Cache and mediating access – Adds network latency Developer Friendliness • 2nd most popular wide column DB, second only to Cassandra •Supports Thrift, REST and Java Client APIs •HTTP supports XML, Protobuf, and binary Handling Updates •Supports single row transactions for safe atomic updates, including safe server side row locks •No support for cross table or multi row transactions . However, Alternatives exist via external libraries such as OMID Technology Used •Java based implementation with offheap read and write capability (as of HBase 2.0) Security •HBase now supports cell level security via Co- Processors. Supports Kerberos integration Ease of Setup and Scale •Requires considerable starter infrastructure with a minimum of 3 data nodes and 3 master nodes •Works well with large data sets. Accessing Small to medium sparsely distributed data is not performant
  • 30. HBase - Practices and Use Cases Best Practices • One must commit to upfront design based on known use cases and data access paths. If the use case evolves and there are multiple optimal access paths to the data required then it will almost certainly have to be supported through duplication of atleast a subset of data • It is common to pair HBase with an implementation of Solr or Elastic search in order to enable secondary indexes • HBase requires an operations team that will work towards monitoring and optimizing of the settings and infrastructure on a regular basis, apart from monitoring complex operations such as splitting and merging of data, backups and multi-cluster replication When to consider • When the Hadoop/HDFS stack is well adopted • Data sets that are naturally ordered (time series data) such as sensor, stock prices, IOT etc • Large Data Processing using MapReduce like algorithms • Hybrid Operations and Analytics along with Hadoop • Real time high performance scans of large join-less dataset • Large master data sets with frequent updates Use cases • Consumer activity and extended profile DB for recommendation and personalization • Web analytics data such as clickstreams and counters • Log processing • Time Series data storage and analysis for network and sensor data • Graph DB for social network analysis, fraud analysis • Product Price by day by Store, location and competitor • https://blogs.apache.org/HBase/entry/HBase-application-archetypes- redux When not to consider • General purpose database with support for multiple access patterns and an evolving schema. • Small data sets - Does not perform comparatively well for small data sets as it has significant overhead. Consider a minimum of 5 data nodes storage before using HBase • Need guarantees of very low latencies – Due to overheads and possible blocking operations (Region Splits, Garbage collection), low latency guaranties cannot be given • Need Complex 2 phase commits across database tables or across resources • No existing or planned implementations of Hadoop/HDFS
• 31. Redis Cluster - Overview Redis is an in-memory key-value data store built for low-latency access. Redis Cluster is a distributed database that automatically shards Redis key-value data structures across a cluster of nodes in a master-slave HA architecture • Redis Cluster was designed as a general solution for high availability and horizontal scalability while keeping the core focus of Redis in mind - low latency and a strong data model. Because of this, Redis Cluster implements neither true availability nor consistency in the CAP sense Data Objects in Redis • Redis stores key-value pairs, where the value data structure may be of type String, HashMap, List, Sorted Set, Set, Bitmap or HyperLogLog • Redis Cluster places restrictions on multi-key operations and on the ability to support multiple databases in a single cluster Query Language – Via client-side libraries in multiple languages
• 32. Redis Cluster – Features and limitations Consistency, Availability, and Latency • Provides raw performance and scalability at the cost of true high availability and consistency • It is possible to lose data if a failure occurs after a master has acknowledged a write but before replication has completed • Redis Clusters become unavailable on network partitions Data Model • Key-value data store with support for complex data structures such as lists, sets, sorted sets and maps • Values can be set to expire (as in a cache) • Supports Lua scripting for data processing Horizontal Scaling Strategy • Supports scalability via sharding, evenly distributing keys across the cluster using a hashing or range partitioning algorithm • The number of shards is fixed across all data collections • Supports client-assisted query routing to the correct data shard Operational Analytics and Search • First-class support for counters and top-N queries • No support for secondary indexes, which must be manually created as top-level collections Replication • Manual configuration of master-slave. A node is only a master or a slave, requiring more machines to be managed • All replication is performed asynchronously. It is possible to lose data if a failure occurs after a master has acknowledged a write but before replication has completed • No support for cross-datacenter replication • Redis Enterprise now supports geo-distributed active-active replication using CRDTs Management Operation Behavior – Add/Remove Node • Adding a node can take a long time (~days) because data must replicate; using more, smaller servers alleviates this • Supports only manual resharding of keys while staying online; however, this procedure is not guaranteed to survive all kinds of failures and may result in loss of data Storage Support • Disk-backed, in-memory DB, highly tuned for RAM usage. This can make infrastructure costs go up quickly with large data sets • Redis Enterprise now supports Flash Developer Friendliness • Most popular key-value store and 2nd most popular NoSQL DB after MongoDB • Memcache API, client libraries in most languages • Supports Lua scripting • Supports messaging semantics out of the box Handling Updates • Supports node-local transactions in theory; however, most Redis Cluster clients do not support this Technology Used • Implemented in C, but cross-platform. Single-threaded and tuned for RAM Security • No fine-grained data-level security • Supports LDAP authentication and RBAC Ease of Setup and Scale • Requires minimal setup to get started • Scales from small to large datasets easily
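To make the counter, top-N and expiry features above concrete, here is a minimal Python sketch using the redis-py client against a single node; the key names are hypothetical, and the same commands work against Redis Cluster through a cluster-aware client.

import redis

r = redis.Redis(host="localhost", port=6379)  # hypothetical node

# Counter: atomic increments back analytics counters
r.incr("page:home:views")

# Leaderboard: sorted sets give top-N queries out of the box
r.zadd("leaderboard:daily", {"player:42": 1300, "player:7": 2100})
top10 = r.zrevrange("leaderboard:daily", 0, 9, withscores=True)

# Expiring value: session-style state with a TTL, as in a cache
r.setex("session:abc123", 3600, "serialized-session-blob")

print(top10)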
• 33. Redis Cluster - Practices and Use Cases Best Practices • One must commit to upfront design based on known use cases and data access paths. If the use case evolves and multiple optimal access paths to the data are required, then they will almost certainly have to be supported through duplication of at least a subset of the data across multiple key-value collections • Redis Cluster is fairly new and is best used with the Redis Enterprise distribution in order to enable easier data operations and administrative functions When to consider • As a data cache with a narrow access pattern • For small data sets that need to be accessed many times • Storing temporary state for fast access • Message queues and pub/sub scenarios Use cases • Analytics counters and leaderboards • Mobile event notifications and subscriptions • Spam filtering • Item expiration by time • User session information • Cache for serving backend analytics When not to consider • For applications that require consistency or data integrity guarantees • For applications that require flexible SQL-like queries • Storing wide data, such as thousands of attributes for a key • Storing data that requires queries with high time complexity • Storing data that requires secondary access paths
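For the "data cache with a narrow access pattern" scenario listed above, a minimal cache-aside sketch in Python is shown below; the key scheme, TTL and the stand-in loader function are hypothetical.

import json
import redis

r = redis.Redis()

def load_profile_from_primary_db(user_id: str) -> dict:
    # Stand-in for a query against the system of record
    return {"id": user_id, "name": "example"}

def get_user_profile(user_id: str, ttl_seconds: int = 300) -> dict:
    # Cache-aside: try Redis first, fall back to the primary store, then populate the cache
    cache_key = f"user:profile:{user_id}"
    cached = r.get(cache_key)
    if cached is not None:
        return json.loads(cached)
    profile = load_profile_from_primary_db(user_id)
    r.setex(cache_key, ttl_seconds, json.dumps(profile))
    return profile

print(get_user_profile("42"))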
• 34. MongoDB - Overview MongoDB is a distributed document (binary JSON) database that provides a tunable consistency and availability model, in a master-slave architecture. MongoDB supports a flexible, MySQL-like storage layer architecture with pluggable storage engines such as WiredTiger, MMAP and RocksDB Data Objects in MongoDB • Namespace – A logical grouping of collections • Collection – Analogous to a table in a relational database; represents a set of documents • Document – A record in a binary JSON (BSON) format • ObjectID – Unique identity of a document • Index – A named index on a collection. An index is maintained local to a shard Query Language – MongoDB custom query language and API
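A minimal Python sketch of the data objects listed above, using the pymongo driver; the connection string, database, collection and fields are hypothetical.

from pymongo import MongoClient, ASCENDING

client = MongoClient("mongodb://localhost:27017")  # hypothetical mongos or standalone
db = client["shop"]            # database within a namespace
products = db["products"]      # collection, analogous to a table

# Documents are schemaless BSON; an ObjectID is generated for _id if omitted
result = products.insert_one({"sku": "A-100", "name": "kettle",
                              "attrs": {"colour": "red", "volume_l": 1.7}})
print(result.inserted_id)

# A secondary index, maintained locally on each shard in a sharded cluster
products.create_index([("sku", ASCENDING)])

print(products.find_one({"sku": "A-100"}))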
• 35. MongoDB – Features and limitations Consistency, Availability, and Latency • Provides tunable consistency semantics, including strong consistency at the highest read and write concerns using the V1 replication protocol • Also provides tunable availability at the (considerable) cost of consistency • Expect read latency if you want linearizable reads, as a quorum is enforced at read time • Lack of centralized resource management across the cluster can lead to unmanaged performance issues, such as when all nodes independently decide to do compaction or garbage collection Data Model • Provides a document model with support for both embedded and normalized documents • Does not impose a schema on the documents; the schema is largely managed by applications, allowing for maximum flexibility, albeit with considerable scope for application-induced integrity and quality issues • Cannot guarantee uniqueness of an index across shards. This must be managed by the application, if required Horizontal Scaling Strategy • Supports scalability via consistent hashing on a shard key, splitting data into a fixed number of predetermined shards • A replicated set of config servers stores metadata and configuration settings • Within each shard, a primary replica handles writes and secondary replicas handle reads. This keeps secondaries idle for write activity, leading to lower resource utilization, but provides better separation of workloads • Requires additional mongos instances for query routing to the correct shard Operational Analytics and Search • MongoDB supports secondary indexes • Secondary indexes are local to the shard, and any query without a shard key will result in querying every shard • MongoDB does not support a group-by operation in a sharded cluster; instead, map-reduce and aggregation operations must be run. It also does not support covered indexes • Supports multikey indexes for searching on array structures embedded inside a document Replication • Replication is enabled using a master-slave architecture via replica sets • Primary replicas are used for writes and secondary replicas for reads • Replication is asynchronous from master to slave; consistent reads can only be achieved by incurring the cost of a majority-based quorum at the time of a read • Good replication characteristics for cross-data-center replication Management Operation Behavior – Add/Remove Node • Easy to add a shard. The balancer process automatically balances the cluster by migrating chunks • MongoDB also manages splitting and rebalancing data chunks once they reach a threshold • However, these operations come at a cost and impact read and write performance, leading operations teams to often shut down the auto-balancing and splitting processes • Compaction processes need to be run manually in order to release unused memory/space to the OS • Supports background rolling index builds that keep the DB available during index rebuilds (although performance is impacted) Storage Support • WiredTiger storage engine – primarily uses hard disk for persistent storage and RAM for indexes and working datasets. There is also an in-memory storage engine that stores data entirely in memory • The WiredTiger storage engine supports TTL indexes and tiered storage via MongoDB zones • Typical disk space used for a MongoDB production deployment per physical instance is 512 GB - 1 TB, largely limited by backup/restore processes Developer Friendliness • Most popular NoSQL DB • Supports a proprietary JSON-based query syntax • No imposition of document structure and ease of indexing are a big plus in terms of getting started quickly • Support for Java, Python and other language APIs Handling Updates • There is no support for multi-document or multi-collection transactions. The need for multi-row and multi-table transactions must be eliminated through appropriate data modeling techniques. Although the MongoDB documentation suggests a two-phase-commit-like pattern, it is not recommended for production apps that need to ensure consistency and integrity of data • WiredTiger uses MVCC with non-locking algorithms for concurrent updates, which improves efficiency • Supports change streams out of the box, to support near real-time event notification and synchronization scenarios Technology Used • Written in C++ • Does not optimize for the Linux kernel, especially for disk- or SSD-based IO • The WiredTiger storage engine does provide storage efficiencies via enhanced compression and granular concurrency control. It also supports intra-cluster network compression • While MongoDB is multi-threaded, it does not utilize OS-native techniques that optimize for modern multi-core and NUMA-aware hardware architectures Security • Comes with loose default configurations that have been recently exploited in malicious attacks • Provides LDAP-based authentication, role-based ACLs, and encryption at rest as well as in transit • Provides only collection-level access control policies, not row- or field-level security Ease of Setup and Scale • Considerable initial setup is required, with 3 replica-set members per shard (e.g. 3 shards equating to 9 nodes with 1 mongod instance each, or 3 nodes running 3 mongod instances each) • Requires additional mongos instances for query routing to the correct shard
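A minimal Python sketch of the tunable read and write concerns discussed above, using pymongo; the collection and documents are hypothetical, and the right concerns to use depend on the workload.

from pymongo import MongoClient, WriteConcern
from pymongo.read_concern import ReadConcern

client = MongoClient("mongodb://localhost:27017")
db = client["shop"]

# Stricter handle: writes wait for a majority acknowledgement and reads return
# only majority-committed data, trading latency for consistency
orders_strict = db.get_collection(
    "orders",
    write_concern=WriteConcern(w="majority", wtimeout=5000),
    read_concern=ReadConcern("majority"),
)
orders_strict.insert_one({"order_id": 1, "status": "placed"})

# Latency-optimized handle that accepts weaker guarantees (w=1)
orders_fast = db.get_collection("orders", write_concern=WriteConcern(w=1))
orders_fast.update_one({"order_id": 1}, {"$set": {"status": "shipped"}})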
• 36. MongoDB - Practices and Use Cases Best Practices • It is important to give significant attention to data modeling upfront, including deciding whether to embed documents or use references, sharding keys and the indexing strategy. Avoid scatter-gather queries and choose the right level of write guarantees and read concerns • Attention needs to be given to hardware sizing and configuration parameters that account for growth in volume and usage, thus avoiding cumbersome migrations. Ensure working sets fit in RAM and avoid large documents. Dedicate each server to a single role, as each MongoDB server role has significantly different workload characteristics. Ensure proper configuration of compression and data tiering When to consider • When schema flexibility is required • When MySQL-like latencies are desired but the data does not fit a single server • When eventual consistency can be tolerated Use cases • A datastore for customer data • Ecommerce product catalog • For near real-time event notifications and collaboration • Real-time analytics • Mobile and social networking applications • Storing semi-structured data such as blogs, content and logs • For any application that has evolving data requirements When not to consider • When analytical or general search queries are required • When ACID transactions need to be guaranteed across documents or collections • When very low latency reads or writes need to be guaranteed • When very high throughput writes are required • For large batch data processing jobs • For very large datasets (>100 TB) – while MongoDB can handle these large datasets, the lack of cluster-wide resource optimization and the replica-set-based architecture can make such large clusters expensive to provision and maintain
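To illustrate the embed-versus-reference decision called out in the best practices above, here are two hypothetical ways to model the same customer and order data (plain Python dictionaries, no driver calls).

# Embedded: the whole aggregate is read in one round trip; suits cases where
# orders are bounded in number and almost always read together with the customer
customer_embedded = {
    "_id": "cust-1",
    "name": "Ada",
    "orders": [
        {"order_id": 1, "total": 40.0},
        {"order_id": 2, "total": 15.5},
    ],
}

# Referenced: orders live in their own collection keyed by customer_id; suits cases
# where orders grow without bound or are queried independently of the customer
customer_ref = {"_id": "cust-1", "name": "Ada"}
order_ref = {"_id": 1, "customer_id": "cust-1", "total": 40.0}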
• 37. References • Cassandra - https://academy.datastax.com/resources/brief-introduction-apache-cassandra • Cassandra Consistency - https://aphyr.com/posts/294-jepsen-cassandra • HBase - https://mapr.com/blog/in-depth-look-HBase-architecture/ • HBase Splitting and merging - https://hortonworks.com/blog/apache-HBase-region-splitting-and-merging/ • HBase Filters - https://intellipaat.com/tutorial/HBase-tutorial/client-api-advanced-features/ • HBase vs Cassandra - http://bigdatanoob.blogspot.in/2012/11/HBase-vs-cassandra.html • Redis scale-out - https://www.credera.com/blog/technology-insights/open-source-technology-insights/an-introduction-to-redis-cluster/ • MongoDB performance guide - https://neotan.github.io/images/media/MongoDB-Performance-Best-Practices.pdf • MongoDB at Baidu - https://www.slideshare.net/matkeep/mongodb-at-baidu/7
  • 38. • Overview • Characteristics of Distributed Databases • Distributed Database Models • The Incumbents • The Challengers
• 39. Scylla - Overview Scylla is a distributed database designed from the ground up, using the Seastar framework, to be a significantly more efficient and scalable drop-in replacement for Cassandra. By using the Seastar framework, Scylla has optimized heavily across the utilization of CPU, memory, network and IO resources, significantly reducing costs compared to a Cassandra deployment with similar workloads As a distributed database, Scylla has the same architectural foundation as Cassandra, in that it uses the wide column data model and masterless ring architecture. There are, however, a few system architecture choices that allow it to offer enhanced operational capabilities, such as guaranteed low latencies (no JVM pauses) and availability during repair processes (due to parallel repair)
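Because Scylla positions itself as a drop-in replacement, the same CQL driver used for Cassandra can be pointed at a Scylla cluster. A minimal Python sketch with the cassandra-driver package follows; the contact points, keyspace and table are hypothetical.

from cassandra.cluster import Cluster

cluster = Cluster(["scylla-node1", "scylla-node2"])  # hypothetical contact points
session = cluster.connect()

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS demo
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3}
""")
session.execute("""
    CREATE TABLE IF NOT EXISTS demo.readings (
        device_id text, ts timestamp, temperature double,
        PRIMARY KEY (device_id, ts)
    )
""")
session.execute(
    "INSERT INTO demo.readings (device_id, ts, temperature) VALUES (%s, toTimestamp(now()), %s)",
    ("device42", 21.5),
)
for row in session.execute("SELECT * FROM demo.readings WHERE device_id = %s", ("device42",)):
    print(row.device_id, row.ts, row.temperature)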
  • 40. Scylla – Innovations Scylla system architecture innovations and their implications • Faster packet processing by bypassing the kernel space (in ~80 CPU cycles) • No garbage collection pauses, expensive locking or low CPU utilization • No thread context switches; asynchronous, lockless inter-core communication which is highly scalable • Data in cache is reconciled with incoming writes – reduces IO and data model complexity • Row cache format is the same as the serialized format – reducing serialization and deserialization overhead • Direct storage access, with explicit cache management, leading to better control
• 41. Scylla – as an evolution of Cassandra The implications of the system architecture improvements Scylla has made result in • 5-10 times better throughput for combined read/write workloads when compared to Cassandra, making Scylla a more cost-effective alternative to Cassandra • Ability to scale effectively with additional cores in a node • Guaranteed low latency, which cannot be offered by Cassandra because of garbage collection pauses and thread locking behavior • Better compression rates, compaction rates and IO efficiency lead to deployment of higher density storage per node (2-5 TB/node), thus reducing the total cost of infrastructure Scylla offers additional operations benefits • Running repair and compaction processes in parallel with query workloads • Tuning – self-tuning capability removes a lot of the manual overhead and guessing • Isolation and scheduling of background and foreground jobs • Provisioning – ease of adding nodes to a cluster; multiple nodes can be added at once, and standing up a node is faster than adding a Cassandra node to a Cassandra cluster
• 42. Accumulo – Overview Apache Accumulo is a highly scalable structured store based on Google’s BigTable. Accumulo is written in Java and operates over the Hadoop Distributed File System (HDFS). Accumulo supports efficient storage and retrieval of structured data, including queries for ranges, and provides support for using Accumulo tables as input and output for MapReduce jobs. Accumulo provides strong consistency models and is CP on the CAP spectrum Accumulo is the 4th most popular wide column data store after Cassandra, HBase and Microsoft Cosmos. It stands 60th overall in database popularity as per the DB-Engines ranking Accumulo has a lot in common with HBase and can be considered for similar use cases. However, Accumulo has implemented several enhanced capabilities that give it an edge when compared to HBase
• 43. Accumulo – Innovations Accumulo architecture innovations Security • The Accumulo data model adds the concept of column visibility to the original BigTable model. This enables very fine-grained, cell-level security for big data Data Model Flexibility • Ability to add and change column families “after the fact” • Flexible locality groups, which allow application designers to control how columns are grouped on disk, a trick that conventional column-oriented databases rely on for performance of ad-hoc queries • Configurable conditions under which writes to a table will be rejected. Constraints are written in Java and configurable on a per-table basis Secondary Index support • While Accumulo does not support secondary indexes out of the box, it provides several architectural features that make it easy to implement secondary indexes • Support for very large rows and partial scans of rows, which allows applications to build and maintain their own secondary index tables without hitting memory limits • Batch scanners, which can fetch many small reads in a random-access fashion, allowing applications to quickly return full rows corresponding to matches found via index tables • Through the use of specialized iterators, Accumulo can act as a parallel sharded document store. For example, Wikipedia could be stored and searched for documents containing certain words Volume support • Supports a collection of HDFS URIs (host and path), which allows Accumulo to operate over multiple disjoint HDFS instances. This allows Accumulo to scale beyond the limits of a single NameNode • Allows splitting of metadata files, thus allowing scaling to very large volumes of trillions of rows of data
• 44. Accumulo – A better BigTable implementation The implications of the architecture improvements for Accumulo enable the following additional use cases to be implemented with Accumulo • When flexible data models are required in big data scenarios • When data ingest rates are high and the data sets can grow up to trillions of rows • For flexible querying of data using search terms or for graph traversal in very large graphs • When data security requirements are complex and fine-grained at the data attribute level However, Accumulo does have a few downsides compared to HBase • The Accumulo server instances tend to be large to accommodate large rows. Therefore, recovery from a failed node can take significant time • Accumulo is not as well integrated into the Hadoop ecosystem as HBase (e.g. Atlas integration, Oozie integration)
• 45. Aerospike - Overview Aerospike is a distributed, scalable NoSQL database for storing key-value based structures. Although the Aerospike architecture is fundamentally geared towards maximizing availability, it has made accommodations for strong consistency models since Aerospike 3.0, albeit at the cost of latency. Aerospike is unlike other popular key-value stores – Redis Server, Redis Cluster or Memcached – and has made drastically different architectural choices while making significant improvements to the system architecture Aerospike Architecture • The Aerospike architecture comprises three layers: • Client Layer: This cluster-aware layer includes open source client libraries, which implement Aerospike APIs, track nodes, and know where data resides in the cluster. • Clustering and Data Distribution Layer: This layer manages cluster communications and automates fail-over, replication, cross-data-center synchronization, and intelligent re-balancing and data migration. • Data Storage Layer: This layer reliably stores data in DRAM and Flash for fast retrieval.
• 46. Aerospike – Innovation Distributed Architecture innovation • Unlike other key-value databases, which started out as single-server high performance caches, the architecture of Aerospike has three key objectives: • Create a flexible, scalable platform for web-scale applications – multi-cluster masterless setup without complex master-slave configurations. It also provides flexibility in its data model in that it allows multiple datatypes to be mixed into bins • Provide the robustness and reliability (as in ACID) expected from traditional databases – provides single-key atomic transactions (does not support multi-key ACID transactions) • Provide operational efficiency with minimal manual involvement – automated balancing, indexing, sharding, cross-datacenter replication and recovery System Architecture Innovation • Aerospike is implemented in C and is optimized for SSDs and processing speed • Supports hybrid storage across SSD, HDD and RAM – this enables scalability without compromising on speed Limitations • Indexes are all stored in memory, which increases the cost of storage for very large data sets. Also, indexes are not global and need to be queried using scatter-gather queries • Some data structures have Maps associated with them
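A minimal Python sketch of the key-value model and single-record atomicity described above, using the aerospike client library; the namespace, set and bin names are hypothetical.

import aerospike

config = {"hosts": [("127.0.0.1", 3000)]}   # hypothetical seed node; the client is cluster-aware
client = aerospike.client(config).connect()

# Keys are (namespace, set, user key); bins are the named values within a record
key = ("test", "profiles", "user:42")
client.put(key, {"name": "Ada", "visits": 1, "segments": ["sports", "tech"]})

# Operations on a single record are atomic on that record
client.increment(key, "visits", 1)

(_, meta, bins) = client.get(key)
print(meta, bins)

client.close()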
• 47. Aerospike – The best key-value clustered DB The applications of Aerospike will tend to be in the following areas • Storing massive amounts of profile data in online advertising or retail web sites • For real-time, low-latency streaming analytics applications, such as real-time fraud detection, financial front office applications, etc. What is not a good use • A data store with a large number of indexes • Where transactional integrity across datasets or keys is required – e.g. inventory management, financial transaction processing, etc. • Very large volumes (~PBs)
• 48. Yugabyte – Overview Yugabyte is an implementation of the Google Spanner architecture (as laid out in the Google Spanner paper). Like Google Spanner, it is meant to be a system-of-record/authoritative database that geo-distributed applications can rely on for correctness and availability. It is written in C++ and is Apache 2.0-licensed open source software. The Yugabyte APIs are wire compatible with CQL, Redis and PostgreSQL Unlike Google Spanner, Yugabyte maintains the goal of low latency while still respecting ACID semantics This opens up a host of application possibilities for Yugabyte, including blockchain implementations
• 49. Yugabyte - Innovation Yugabyte is built with the following very ambitious goals in mind 1. Transactional • Distributed ACID transactions that allow multi-row updates across any number of shards at any scale • Strongly consistent secondary indexes • Transactional key-document storage engine that is backed by self-healing, strongly consistent replication 2. High Performance • Low latency for geo-distributed applications with multiple read consistency levels and read-only replicas • High throughput for ingesting and serving ever-growing datasets 3. Planet-Scale • Global data distribution that brings consistent data close to users through multi-region and multi-cloud deployments • Auto-sharding and auto-rebalancing to ensure uniform load balancing across all nodes, even for very large clusters 4. Cloud Native • Built for the container era with highly elastic scaling and infrastructure portability, including Kubernetes-driven orchestration • Self-healing database that automatically tolerates the failures common in inherently unreliable modern cloud infrastructure Key Weaknesses • Relatively new and not yet adopted in the enterprise, although there is some promising adoption with startups • No independently published benchmarks, although there are many positive benchmarks in comparison to MongoDB, Cassandra and Google Spanner, highlighting performance and throughput strengths when the other DBs are tuned to strong consistency levels
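Since the paper notes that Yugabyte is wire compatible with PostgreSQL, a standard PostgreSQL driver can exercise its distributed ACID transactions. The following Python sketch uses psycopg2; the host, port, credentials and the accounts table are hypothetical.

import psycopg2

conn = psycopg2.connect(host="yugabyte-node1", port=5433,
                        dbname="yugabyte", user="yugabyte")
try:
    with conn:                       # commits on success, rolls back on exception
        with conn.cursor() as cur:
            # A multi-row update executed as a single ACID transaction
            cur.execute("UPDATE accounts SET balance = balance - 100 WHERE id = %s", (1,))
            cur.execute("UPDATE accounts SET balance = balance + 100 WHERE id = %s", (2,))
finally:
    conn.close()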
• 50. Yugabyte – Promising cloud native, globally consistent DB Yugabyte’s unique capabilities make it usable for many demanding use cases – • Elastic DB service for IoT data, especially where sensor data may be geographically distributed • Geographically distributed, consumer-facing digital operations • Financial data services requiring real-time, strongly consistent updates – stock quote services, finance portfolio management • Lambda architectures for serving real-time analytics, especially for time-ordered datasets – e.g. personalization based on user activity
  • 51. References • Scylla – • https://github.com/scylladb/scylla/wiki/Repair---Scylla-vs-Cassandra • https://www.youtube.com/watch?v=YBsbXYvyZnA • https://www.scylladb.com/product/technology/ • Accumulo • https://www.slideshare.net/DonaldMiner/survey-of-accumulo-techniques-for-indexing-data • http://accumulosummit.com/program/talks/comparing-accumulo-cassandra-HBase/ • Yugabyte vs MongoDB - https://blog.yugabyte.com/overcoming-mongodb-sharding-and-replication-limitations-with-yugabyte-db-ec4eefa5bbd5 • Aerospike Availability and Consistency - https://aphyr.com/posts/324-jepsen-aerospike • Aerospike Consistency - https://www.aerospike.com/docs/architecture/acid.html
  • 52. Databases to be added • Google Spanner • MongoRocks (MongoDB with the RocksDB engine) • Microsoft Cosmos • CouchDB (maybe)