NoSQL Containers get Rich

NoSQL Day, Udine – Italy 15-11-2013
STEFANO VALLE

http://www.mvlabs.it
THE BIG DATA WE USE EVERY DAY
NOSQL TO THE RESCUE!
OMG! How to choose?
DATA MODEL
key: nosqlday
value: http://nosqlday.it

// Store value
> SET nosqlday http://nosqlday.it

// Retrieve value
> GET nosqlday
http://nosqlday.it
KEY - VALUE
the_godfather
DOCUMENT
the_godfather
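A minimal sketch of how the two models differ for the same key, using plain in-memory JavaScript rather than any real datastore. The field names `year` and `cast` follow the PHP example later in the deck; the stored values and the `findKeys` helper are illustrative assumptions.

```javascript
// Key-value: the store treats the value as an opaque blob.
const kvStore = new Map();
kvStore.set(
  'the_godfather',
  JSON.stringify({ year: 1972, cast: [{ name: 'Marlon Brando' }] })
);

// The only lookup is by key; the application decodes the blob itself.
const decoded = JSON.parse(kvStore.get('the_godfather'));

// Document: the store understands the structure, so it can filter on fields.
const docStore = new Map();
docStore.set('the_godfather', { year: 1972, cast: [{ name: 'Marlon Brando' }] });

// Hypothetical helper: keys of all documents matching a predicate.
function findKeys(store, predicate) {
  return [...store].filter(([, doc]) => predicate(doc)).map(([key]) => key);
}

const seventies = findKeys(docStore, (doc) => doc.year >= 1970 && doc.year < 1980);
console.log(decoded.year, seventies);
```

In the key-value case any query other than "get by key" has to be done by the application after fetching and decoding the value.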
COLUMN
Row key           | Year | Cast                                         | Rating
Forrest Gump      | 1994 | Tom Hanks, Robin Wright, Gary Sinise         | 8.7
Braveheart        | 1995 | Mel Gibson, Sophie Marceau, Patrick McGoohan |
Fast & Furious 7  | 2014 |                                              |
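The column example can be sketched as rows that each carry only the columns they actually have. This is an in-memory illustration, not a real column-family API; it assumes the empty cells on the slide mean the column is absent for that row, and `readColumn` is a hypothetical helper.

```javascript
// Each row key maps to its own set of columns; absent columns are not stored.
const movies = new Map([
  ['Forrest Gump', { Year: 1994, Cast: ['Tom Hanks', 'Robin Wright', 'Gary Sinise'], Rating: 8.7 }],
  ['Braveheart', { Year: 1995, Cast: ['Mel Gibson', 'Sophie Marceau', 'Patrick McGoohan'] }],
  ['Fast & Furious 7', { Year: 2014 }],
]);

// Hypothetical helper: read one column across all rows, skipping rows that lack it.
function readColumn(rows, column) {
  const result = {};
  for (const [rowKey, columns] of rows) {
    if (column in columns) result[rowKey] = columns[column];
  }
  return result;
}

const ratings = readColumn(movies, 'Rating');
const years = readColumn(movies, 'Year');
console.log(ratings, years);
```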
GRAPH
[Diagram, built up over four slides: a graph whose nodes are home, landing page 1, landing page 2, product 1, product 2 and product 3]
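A graph like this one can be sketched as an adjacency list plus a traversal. Only the node names come from the slides; the edges below are assumptions made for illustration, and `shortestPath` is a hypothetical helper, not the API of any graph database.

```javascript
// Adjacency list: each page maps to the pages it links to.
// The edges are assumptions; only the node names come from the slides.
const links = {
  'home': ['landing page 1', 'landing page 2'],
  'landing page 1': ['product 1', 'product 2'],
  'landing page 2': ['product 3'],
  'product 1': [],
  'product 2': [],
  'product 3': [],
};

// Breadth-first search returning the shortest click path between two pages,
// or null when the target cannot be reached.
function shortestPath(graph, start, target) {
  const queue = [[start]];
  const visited = new Set([start]);
  while (queue.length > 0) {
    const path = queue.shift();
    const node = path[path.length - 1];
    if (node === target) return path;
    for (const next of graph[node] || []) {
      if (!visited.has(next)) {
        visited.add(next);
        queue.push([...path, next]);
      }
    }
  }
  return null;
}

console.log(shortestPath(links, 'home', 'product 3'));
```

Relationship-heavy questions such as "how does a visitor get from here to there" are the kind of query this model makes natural.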
Classification by data model

Key - value
Column
Document
Graph
Is it all about
the data model?
Let's suppose there were an RDBMS «category»…
ARE NOSQL DATASTORES SIMPLY DATA CONTAINERS?
There is much more to say about NoSQL datastores
OFFLINE WEB APPLICATIONS
GEOSPATIAL SEARCH
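As a rough illustration of what a geospatial "near" query computes, here is a sketch using the haversine great-circle distance and a linear scan. Datastores with geospatial support would normally use a spatial index instead; the function names and the approximate coordinates are assumptions for this example.

```javascript
const EARTH_RADIUS_KM = 6371; // mean Earth radius

// Great-circle distance between two {lat, lon} points (haversine formula).
function haversineKm(a, b) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// Linear scan over all places; a real datastore would consult a spatial index.
function near(places, center, radiusKm) {
  return places
    .filter((place) => haversineKm(place, center) <= radiusKm)
    .map((place) => place.name);
}

// Approximate coordinates, for illustration only.
const places = [
  { name: 'Udine', lat: 46.07, lon: 13.24 },
  { name: 'Rome', lat: 41.9, lon: 12.5 },
  { name: 'London', lat: 51.51, lon: -0.13 },
];
const venice = { lat: 45.44, lon: 12.32 };
const nearby = near(places, venice, 150);
console.log(nearby);
```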
KEY-VALUE + LISTS
KEY-VALUE + SETS
VALUE EXPIRATION
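The three features above (lists, sets, value expiration) can be sketched with a toy in-memory store. This is not the Redis API, although it loosely mirrors what Redis offers through commands such as RPUSH, SADD and EXPIRE; the class `TinyStore` and its method names are assumptions, and the clock is injectable so that expiration can be shown without waiting.

```javascript
// Toy in-memory store; `now` is injectable so expiration is easy to demonstrate.
class TinyStore {
  constructor(now = () => Date.now()) {
    this.now = now;
    this.data = new Map();      // key -> value (string, array or Set)
    this.expiresAt = new Map(); // key -> deadline in milliseconds
  }
  // Drop the key if its time to live has passed, then return the live value.
  _live(key) {
    const deadline = this.expiresAt.get(key);
    if (deadline !== undefined && this.now() >= deadline) {
      this.data.delete(key);
      this.expiresAt.delete(key);
    }
    return this.data.get(key);
  }
  // Plain key-value write, with an optional time to live.
  set(key, value, ttlMs) {
    this.data.set(key, value);
    if (ttlMs !== undefined) this.expiresAt.set(key, this.now() + ttlMs);
    else this.expiresAt.delete(key);
  }
  get(key) {
    return this._live(key) ?? null;
  }
  // List: append to the tail and return the new length.
  push(key, item) {
    const list = this._live(key) ?? [];
    list.push(item);
    this.data.set(key, list);
    return list.length;
  }
  // Set: returns true only when the member was not already present.
  add(key, member) {
    const set = this._live(key) ?? new Set();
    const isNew = !set.has(member);
    set.add(member);
    this.data.set(key, set);
    return isNew;
  }
}

let clock = 0;
const store = new TinyStore(() => clock);
store.set('session', 'abc', 1000); // expires 1000 ms from "now"
store.push('recent', 'home');
store.push('recent', 'product 1');
store.add('tags', 'nosql');
const addedTwice = store.add('tags', 'nosql');
const before = store.get('session');
clock = 1000; // advance the fake clock past the deadline
const after = store.get('session');
console.log(before, after, addedTwice);
```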
NoSQL Mobile Databases

[Lite]
ALL useful things!
But don't limit yourself to a features-first comparison
Durability vs Performance
safe mode? off = data loss risk
file system: consider the use of journaling
Disk / RAID cache?
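The trade-off can be made concrete with a small Node.js sketch: an append that returns as soon as the data is handed to the operating system, versus one that also calls `fsync` before returning. The function `appendRecord`, its `safe` flag and the temporary file name are assumptions for illustration; this is not how any particular datastore implements its safe mode.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Append one record; when `safe` is true, ask the OS to flush it to the
// storage device before returning (slower, but less exposed to data loss).
function appendRecord(file, record, safe) {
  const fd = fs.openSync(file, 'a');
  try {
    fs.writeSync(fd, JSON.stringify(record) + '\n');
    if (safe) fs.fsyncSync(fd);
  } finally {
    fs.closeSync(fd);
  }
}

const file = path.join(os.tmpdir(), `durability-demo-${process.pid}.log`);
appendRecord(file, { x: 1 }, true);  // durable write
appendRecord(file, { x: 2 }, false); // fast write, may sit in OS buffers
const lines = fs.readFileSync(file, 'utf8').trim().split('\n');
fs.unlinkSync(file);
console.log(lines.length);
```

Even with the flush, a disk or RAID controller cache may still hold the data in volatile memory, which is the question the last slide raises.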
And what about scaling?
Goldfish, not thoroughbreds
Scale up
Scale out
Allow for fast, cost-effective, on-demand growth (or shrinking)
"I know two companies that collapsed
due to inability to reduce operating
costs when the utilization of their sites
diminished"

Theo Schlossnagle

From Theo’s book "Scalable Internet Architectures"
Ease of scalability vs Complexity
[Chart: KeyValue stores, ColumnFamily stores, Document databases and Graph databases plotted by ease of scalability against complexity; annotation: > 90% of use cases]
Adapted from http://www.slideshare.net/emileifrem/an-overview-of-nosql-jfokus-2011
Query capability

function(doc) {
  if (doc.city == 'London') {
    emit(doc._id, null)
  }
}

We can't use user input here:

function(doc) {
  if (doc.city == ???) {
    emit(doc._id, null)
  }
}

Here we can use user input!

db.events.find(
  { city: 'Rome' }
)
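One common way around this limitation with map-function views is to emit the field itself as the view key and apply the user's value at query time, as a key lookup on the prebuilt index. The sketch below emulates that idea in plain JavaScript; `buildView` and `queryView` are hypothetical helpers, and `emit` is passed as an argument here rather than being a global as it is inside a real view.

```javascript
// Build a view index by running a map function over every document.
function buildView(docs, mapFn) {
  const rows = [];
  for (const doc of docs) {
    mapFn(doc, (key, value) => rows.push({ key, value, id: doc._id }));
  }
  return rows;
}

// The map function never sees user input: it emits the city as the key.
const byCity = (doc, emit) => {
  if (doc.city) emit(doc.city, null);
};

// User input is applied at query time, as a lookup on the emitted key.
function queryView(rows, key) {
  return rows.filter((row) => row.key === key).map((row) => row.id);
}

const events = [
  { _id: 'e1', city: 'London' },
  { _id: 'e2', city: 'Rome' },
  { _id: 'e3', city: 'Rome' },
  { _id: 'e4' }, // no city: the map function emits nothing for it
];
const view = buildView(events, byCity);
const userInput = 'Rome';
const matches = queryView(view, userInput);
console.log(matches);
```

The index is computed once from the stored documents, so the choice of emitted key decides in advance which questions can be answered cheaply.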
Distribution model

Distribution model: master-slave replication
MASTER -> (all data) -> SLAVE

Distribution model: multi-master replication
MASTER, MASTER, MASTER

Filtered multi-master replication
MASTER (e.g. headquarters), MASTER (e.g. customer plant)
Data: Product list, Purchases

Scaling reads
[Diagram: three clients and three MASTER nodes]

Scaling writes?
[Diagram: one client and three MASTER nodes]

Sharding
Shard 1 [A to F]
Shard 2 [G to N]
Shard 3 [O to T]
Shard 4 [U to Z]

Scaling writes
[Diagram: one client and the four shards]
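Range-based sharding as drawn on the slides can be sketched as a routing function. The ranges come from the slides; routing on the first letter of the key is an assumption, and `shardFor` is a hypothetical helper rather than the API of any datastore.

```javascript
// Shard ranges from the slides, matched against the first letter of the key.
const shards = [
  { name: 'Shard 1', from: 'A', to: 'F' },
  { name: 'Shard 2', from: 'G', to: 'N' },
  { name: 'Shard 3', from: 'O', to: 'T' },
  { name: 'Shard 4', from: 'U', to: 'Z' },
];

// Route a key to its shard; keys that do not start with A-Z are rejected.
function shardFor(key) {
  const first = key.charAt(0).toUpperCase();
  const shard = shards.find((s) => first >= s.from && first <= s.to);
  if (!shard) throw new Error(`No shard for key "${key}"`);
  return shard.name;
}

console.log(shardFor('godfather'), shardFor('Udine'));
```

Because every key belongs to exactly one shard, writes for different key ranges land on different nodes, which is what lets writes scale.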
R / W data from 2 nodes
         T1          T2
Node 1   C, X=0      C, X=1
Node 2   C, X=0      C, X=0
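The situation at T2 can be reproduced with a tiny simulation in which a write is acknowledged by one node and copied to the other only later. The class `Replica`, the replication queue and the function names are assumptions for illustration; real replication protocols are considerably more involved.

```javascript
// Two replicas of the same value; copying from node1 to node2 is deferred.
class Replica {
  constructor(name) {
    this.name = name;
    this.x = 0;
  }
}

const node1 = new Replica('Node 1');
const node2 = new Replica('Node 2');
const pending = []; // updates accepted by node1 but not yet applied to node2

// A write is acknowledged as soon as node1 has it.
function write(value) {
  node1.x = value;
  pending.push(() => {
    node2.x = value;
  });
}

// Apply every queued update to node2.
function replicate() {
  while (pending.length > 0) pending.shift()();
}

// T1: both nodes agree.
const t1 = [node1.x, node2.x];
// T2: X=1 is written on Node 1; a read from Node 2 still returns 0.
write(1);
const t2 = [node1.x, node2.x];
// Later: replication catches up and the nodes converge.
replicate();
const t3 = [node1.x, node2.x];
console.log(t1, t2, t3);
```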
CAP Theorem
Consistency
Availability
Partition Tolerance
Choose 2
(Some of the) available solutions
CP: BigTable, HBase, MongoDB, Redis, MemcacheDB, etc.
AP: Dynamo, CouchDB, Cassandra, SimpleDB, Tokyo Cabinet, Voldemort, etc.
CA: RDBMS, etc.
From CONSISTENCY to EVENTUAL CONSISTENCY
BASE: Basic Availability, Soft state, Eventual consistency
ACID: Atomicity, Consistency, Isolation, Durability
Aggregates
Atomicity and Isolation are guaranteed inside an aggregate

Source: AggregateOrientedDatabase - http://martinfowler.com/bliki/AggregateOrientedDatabase.html
ARE YOU SURE WE NEED ACID?
Safety vs Liveness
Availability is revenue!
BACK TO CONTAINERS
STARTING FROM DATA MODEL…
Data Model

MANY OTHER THINGS TO CONSIDER
Data Model
Scalability model
Data durability
Query model
Position on CAP
Some needed features
Performance
RELATIONAL DBMS
"the relational model is pretty magical"
Laurie Voss

http://seldo.com/weblog/2010/07/12/in_defence_of_sql
NOSQL DATASTORES
"Big data is like teenage sex: everyone
talks about it, nobody really knows
how to do it, everyone thinks everyone
else is doing it, so everyone claims
they are doing it..."

Dan Ariely

https://www.facebook.com/dan.ariely/posts/904383595868
Schemaless
…are you sure?

// $myDb is assumed to be a document-store client created elsewhere
$doc = $myDb->getDoc('the_godfather');
$year = $doc['year'];
$castCount = count($doc['cast']);
if ($castCount > 0) {
    $firstCastName = $doc['cast'][0]['name'];
}

The application is aware of the document schema!
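Since the datastore does not enforce the schema, the reading code has to. A JavaScript counterpart of the PHP snippet, written defensively, might look like the sketch below; `describeMovie` and the sample documents are assumptions made for illustration.

```javascript
// Read a movie document defensively: every field the code relies on is checked,
// because nothing in a schemaless store guarantees that it is present.
function describeMovie(doc) {
  const year = Number.isInteger(doc.year) ? doc.year : null;
  const cast = Array.isArray(doc.cast) ? doc.cast : [];
  const first = cast[0];
  const firstCastName = first && typeof first.name === 'string' ? first.name : null;
  return { year, castCount: cast.length, firstCastName };
}

const complete = describeMovie({ year: 1972, cast: [{ name: 'Marlon Brando' }] });
const partial = describeMovie({ title: 'a document with no year and no cast' });
console.log(complete, partial);
```

The checks are the implicit schema: it now lives in the application code instead of the database.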
Polyglot persistence
[Diagram labels: Farm (Node 1 … Node n), Provisioning, LAPP stack, Devices status, Data aggregation]
Polyglot persistence made safe
[Diagram, built up over three slides; labels: Other component, Provisioning, layer, Data Store as a Service, Anti Corruption Layer]
GOOD APPLICATION DESIGN
THINK ABOUT DATA LIFECYCLE
NOT ONLY DATA MODEL
That’s all, folks!

Stefano Valle
@stefanovalle
s.valle@mvassociati.it
Photo credits
http://www.flickr.com/photos/aloha75/4571410233
http://www.flickr.com/photos/djnordic/167433120
http://www.flickr.com/photos/jpstanley/69523927
http://www.flickr.com/photos/lodigs/2833648828
http://www.flickr.com/photos/ppym1/387781444
http://www.flickr.com/photos/freefoto/3844247553
http://www.flickr.com/photos/jamesgood/1708602693
http://www.flickr.com/photos/ms_cwang/133084413
http://www.flickr.com/photos/birminghammag/7979485144
http://www.flickr.com/photos/capcase/4970062870
