NoSQL Smackdown!
6. relation (n.)
An unordered set of
tuples of the same type.
7. tuple (n.)
A function that maps
attributes to values.
8. tuple (n.)
(A bundle of key-value
pairs—but don’t tell
anyone!)
9. id username pwd_hash born_at monkey
1 mluther d8c82af9 Nov 1483 FALSE
2 aaugustine 329b8dae Nov 354 FALSE
3 gnyssa e50ec9e0 Jun 335 FALSE
4 bonzo 330e01f2 Apr 2007 TRUE
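The three definitions above can be sketched in Python. This is only an illustration: the `User` type and the `users` set are names assumed here, not taken from the slides, and the rows are the ones from the table on slide 9.

```python
from typing import NamedTuple


class User(NamedTuple):
    """One tuple: a fixed mapping from attribute names to values."""
    id: int
    username: str
    pwd_hash: str
    born_at: str
    monkey: bool


# A relation: an unordered set of tuples that all share one type.
users = {
    User(1, "mluther", "d8c82af9", "Nov 1483", False),
    User(2, "aaugustine", "329b8dae", "Nov 354", False),
    User(3, "gnyssa", "e50ec9e0", "Jun 335", False),
    User(4, "bonzo", "330e01f2", "Apr 2007", True),
}

# The same tuple viewed as a bundle of key-value pairs.
bonzo = next(u for u in users if u.username == "bonzo")
bonzo_as_pairs = bonzo._asdict()

# A set ignores duplicates, so re-adding an existing tuple changes nothing.
users.add(User(1, "mluther", "d8c82af9", "Nov 1483", False))
```

Because the relation is a set, it has no row order and no duplicate rows; any ordering is something a query imposes afterwards.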
17. availability (n.)
All clients can always
read or write within some
maximum latency.
18. partition tolerance (n.)
No set of failures less than
total network failure is
allowed to cause the system
to respond incorrectly.
20. [Figure: four cluster nodes connected through one switch]
21. [Figure: four cluster nodes connected through one switch]
28. NoSQL is a set of different
approaches to storing and
retrieving data.
30. Tradeoffs
Complex transactions vs. scalability
Consistency vs. availability (often)
Performance vs. durability
Horizontal vs. vertical scale
Cheap writes vs. cheap reads
35. Origin-
Facebook Inbox search
back in 2007
License-
Apache License 2.0
37. Name Value Timestamp
Column
38. email tlberglund@gmail.com 20101011T120502Z
Column
39. Key
Column
Column
Column
Column
Column
Row
40. Key Column Column
Key Column Column Column
Key Column
Column Family
41. bbf77f01d full_name email
050fe74e2 full_name email mobile
8b20d8f6 full_name
“Contacts” Column Family
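The column, row, and column family slides can be read as nested maps. A minimal sketch, assuming plain Python dicts stand in for the storage engine; `Column`, `insert`, and the placeholder values are hypothetical, and only the row keys, column names, and timestamp format come from the slides.

```python
from typing import Dict, NamedTuple


class Column(NamedTuple):
    name: str
    value: str
    timestamp: str  # same format as the slide, e.g. "20101011T120502Z"


# A row maps column names to columns; a column family maps row keys to rows.
Row = Dict[str, Column]
ColumnFamily = Dict[str, Row]


def insert(cf: ColumnFamily, key: str, column: Column) -> None:
    """Store a column under a row key, creating the row if needed."""
    cf.setdefault(key, {})[column.name] = column


# Row keys and column names from the "Contacts" slide.
layout = {
    "bbf77f01d": ["full_name", "email"],
    "050fe74e2": ["full_name", "email", "mobile"],
    "8b20d8f6": ["full_name"],
}

contacts: ColumnFamily = {}
for key, names in layout.items():
    for name in names:
        # The slide shows no values, so a placeholder is stored here.
        insert(contacts, key, Column(name, "<placeholder>", "20101011T120502Z"))
```

The point of the sketch is that rows in one column family need not share the same columns: each row carries only the columns that were written to it.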
42. Name
key Column
key Column
key Column
SuperColumn
43. 4145bfaf15f10c2e6033f8b9c3143297a36f5fe3
full_name full_name Tim Berglund 20101011T120502Z
email email tlberglund@gmail.com 20101011T120503Z
mobile mobile [redacted] 19940217T145637Z
postal_code postal_code 80123 20101011T120452Z
Contact Info SuperColumn
44. Key SuperColumn SuperColumn
Key SuperColumn SuperColumn
Key SuperColumn
SuperColumn Family
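A supercolumn adds one more level of nesting: row key, then supercolumn name, then column name. The sketch below assumes nested dicts again; `SuperColumnFamily` and `get` are illustrative names, not Cassandra API calls. The data is the "Contact Info" example from slide 43, with the redacted mobile column left out.

```python
from typing import Dict, NamedTuple


class Column(NamedTuple):
    name: str
    value: str
    timestamp: str


# row key -> supercolumn name -> column name -> Column
SuperColumnFamily = Dict[str, Dict[str, Dict[str, Column]]]

ROW_KEY = "4145bfaf15f10c2e6033f8b9c3143297a36f5fe3"

people: SuperColumnFamily = {
    ROW_KEY: {
        "Contact Info": {
            "full_name": Column("full_name", "Tim Berglund", "20101011T120502Z"),
            "email": Column("email", "tlberglund@gmail.com", "20101011T120503Z"),
            "postal_code": Column("postal_code", "80123", "20101011T120452Z"),
        }
    }
}


def get(scf: SuperColumnFamily, key: str, super_name: str, column_name: str) -> str:
    """Walk the three levels and return the stored value."""
    return scf[key][super_name][column_name].value


full_name = get(people, ROW_KEY, "Contact Info", "full_name")
```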
48. Scalability-
Rock star!
(see Amazon Dynamo)
[Figure: ring of eight nodes with tokens 0000, 2000, 4000, 6000, 8000, A000, C000, E000]
49. [Figure: token ring with nodes at 0000, 2000, 4000, 6000, 8000, A000, C000, E000]
50. Scalability
- Consistent hashing
- No distinguished nodes
- Add and remove nodes
on a live cluster
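The ring on the preceding slides can be sketched as a toy consistent-hash ring. The assumptions are loud ones: a 16-bit token space chosen only to match the four-hex-digit tokens on the slides, MD5 truncated to two bytes as the key hash, and a hypothetical `Ring` class. A real cluster uses a far larger token space and also handles replication.

```python
import bisect
import hashlib


class Ring:
    """Toy consistent-hash ring over a 16-bit token space (0x0000-0xFFFF)."""

    def __init__(self, tokens):
        self.tokens = sorted(tokens)  # one token per node

    @staticmethod
    def token_for(key: str) -> int:
        # Assumption: MD5 truncated to 16 bits, to match the slide's tokens.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:2], "big")

    def owner(self, key: str) -> int:
        # The first node token at or after the key's token owns the key;
        # past the last token, wrap around to the first node.
        index = bisect.bisect_left(self.tokens, self.token_for(key))
        return self.tokens[index % len(self.tokens)]

    def add_node(self, token: int) -> None:
        bisect.insort(self.tokens, token)


ring = Ring(range(0x0000, 0x10000, 0x2000))  # 0000, 2000, ..., E000
keys = [f"user{i}" for i in range(1000)]
before = {key: ring.owner(key) for key in keys}

ring.add_node(0x3000)  # a node joins the live ring
after = {key: ring.owner(key) for key in keys}
moved = [key for key in keys if before[key] != after[key]]
```

Every key in `moved` used to belong to node 4000 and now belongs to the new node 3000. No other node gives up or receives data, which is why nodes can join a running cluster without a global reshuffle.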
51. API
- Thrift RPC
- Easy to fetch columns
by key
- Native clients
- Hadoop integration
53. Support/Community-
- www.datastax.com
- Eben Hewitt’s book
- Plus, it is an Apache
project...
56. Origin-
Founders of DoubleClick
were totally going to
take over the Cloud
License-
Database: GNU Affero GPL 3.0
Drivers: APL 2
58. { "_id" : ObjectId("4cbd00455280f73d395922a4"),
"contact" : {
"tags" : ["man", "", "", ""],
"firstName" : "Myron",
"lastName" : "Dalton",
"address1" : "4322 Maple Street",
"city" : "Santa Ana",
"state" : "CA",
"postalCode" : "92705",
"email" : "Myron.C.Dalton@spambob.com"
},
"occupation" : "Long haul truck driver"
}
60. API
- Native JavaScript
console
- Binary drivers
- Ad-hoc query language
(but it’s NOT SQL, okay?)
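MongoDB queries are themselves documents, and nested fields are addressed with dotted paths such as "contact.state". The sketch below imitates only the simplest case, exact equality on dotted paths, in plain Python. `get_path` and `matches` are hypothetical helpers written for this illustration, not part of any MongoDB driver, and the real query language supports far more (operators, array matching, indexes).

```python
from typing import Any, Dict

_MISSING = object()  # sentinel for "path not present in the document"


def get_path(doc: Dict[str, Any], path: str) -> Any:
    """Follow a dotted path like "contact.state" through nested dicts."""
    current: Any = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


def matches(doc: Dict[str, Any], query: Dict[str, Any]) -> bool:
    """True if every dotted path in the query equals the value in the doc."""
    return all(get_path(doc, path) == value for path, value in query.items())


# A trimmed copy of the document from slide 58.
myron = {
    "_id": "4cbd00455280f73d395922a4",
    "contact": {
        "firstName": "Myron",
        "lastName": "Dalton",
        "city": "Santa Ana",
        "state": "CA",
        "postalCode": "92705",
    },
    "occupation": "Long haul truck driver",
}

people = [myron]
californians = [doc for doc in people if matches(doc, {"contact.state": "CA"})]
```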
62. API
- Can write MapReduce
jobs in JavaScript
- Morphia for Java
- Mongoose for node.js
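The shape of a MapReduce job can be shown without a database. The sketch below is plain Python used only for illustration (MongoDB jobs are written in JavaScript, as the slide says); `map_reduce`, `emit_state`, `count`, and the three sample documents are all hypothetical.

```python
from collections import defaultdict


def map_reduce(docs, mapper, reducer):
    """Run mapper over every document, group emitted values by key, reduce."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in mapper(doc):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


def emit_state(doc):
    # Map step: emit one (state, 1) pair per document.
    yield doc["contact"]["state"], 1


def count(key, values):
    # Reduce step: add up the ones emitted for this state.
    return sum(values)


docs = [
    {"contact": {"state": "CA"}},
    {"contact": {"state": "CA"}},
    {"contact": {"state": "CO"}},
]

counts = map_reduce(docs, emit_state, count)
```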
65. Concerns
- Write durability?
Journaling coming
in 1.8!
- Sharding performance
- But everyone still wants
to date her
68. Origin
- Neo Technologies in 2003
- Malmö and San Francisco
69. License
- GPL3, full-featured
- Commercial
$49/mo antiviral
$499/mo advanced
$1,999/mo enterprise
70. Maturity
- Production since 2003
- 1.0 in Feb 2010
Implementation Language
- Java 6
- Easily embeddable!
72. All nodes and relationships have
arbitrary properties
[Figure: example graph with labels including Tim, Matthew, Brian, "Knows", "Writes with", and "Works with"]
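A property graph can be sketched with two plain structures: a map of nodes and a list of relationships, each carrying its own properties. Everything below is assumed for illustration, including the node ids, the relationship types, the `since` and `project` properties, and the `traverse` helper. It is not the Neo4j API.

```python
from collections import deque

# Nodes: id -> arbitrary properties.
nodes = {
    "tim": {"name": "Tim"},
    "matthew": {"name": "Matthew"},
    "brian": {"name": "Brian"},
}

# Relationships: (start, type, end, arbitrary properties).
relationships = [
    ("tim", "KNOWS", "matthew", {"since": 2005}),
    ("matthew", "KNOWS", "brian", {"since": 2008}),
    ("tim", "WORKS_WITH", "brian", {"project": "example"}),
]


def traverse(start, rel_type):
    """Breadth-first walk along outgoing relationships of one type."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        current = queue.popleft()
        for src, kind, dst, _props in relationships:
            if src == current and kind == rel_type and dst not in seen:
                seen.add(dst)
                order.append(dst)
                queue.append(dst)
    return order


known = traverse("tim", "KNOWS")
known_names = [nodes[node_id]["name"] for node_id in known]
```

The traversal follows relationships from node to node rather than joining tables, which is the access pattern a graph database is built around.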
73. Query Model
- REST/JSON
- Java traversal API
- Bindings in Clojure, Ruby,
Python, PHP, Scala, Grails
- JTA/JTS XA
74. Scale Idiom
- Traditionally focused on
single-node performance
- Recent HA support
- Master/slave
- ZK master election
- Writeable slaves
78. Origin-
Internal datastore for
Basho’s Salesforce.com
apps
(Hey, it seemed like a
good idea at the time!)
79. License-
APL 2 for OSS version
Closed-source
“Enterprise DS” version
80. Implementation Language-
Erlang, C, SpiderMonkey
JavaScript VM
Data Model-
Key/value store, but
with buckets!
81. Key Value
That’s it.
82. Key Value Key Value
Key Value Key Value
Bucket A
Key Value Key Value
Key Value Key Value
Bucket B
83. name Tim birthday 061972
occupation Developer city Littleton
Bucket A
name Aurelius birthday 110354
occupation Bishop city Hippo
Bucket B
84. Does it scale?
- Like a boss!
- No distinguished node
- Tunable consistency,
replication
- Add nodes without
taking the cluster down
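In Dynamo-style stores, "tunable consistency" usually means choosing, per request, how many of the N replicas must acknowledge a read (R) or a write (W). A minimal sketch of the quorum-overlap rule, assuming that is what the slide refers to; `quorums_overlap` is a hypothetical helper, not a Riak call.

```python
def quorums_overlap(n: int, r: int, w: int) -> bool:
    """True if every read quorum must share a replica with every write quorum.

    n: replicas per key, r: replicas a read waits for,
    w: replicas a write waits for.
    """
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    # Two subsets of sizes r and w drawn from n replicas must intersect
    # exactly when r + w > n.
    return r + w > n


strict = quorums_overlap(n=3, r=2, w=2)   # reads overlap the latest write
relaxed = quorums_overlap(n=3, r=1, w=1)  # faster, but reads may be stale
```

Lower R and W trade consistency for latency and availability, which is the tradeoff listed earlier in the deck.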
85. API
- HTTP interface (slow,
but featureful)
- Protocol Buffers (a
performance beast)
86. API
- Key CRUD
- MapReduce in
JavaScript
- Graph traversals
translate to MapReduce
91. Origin-
Salvatore Sanfilippo
wrote it for his analytics
site, lloogg.com
License-
Open Source-Brand
open source
93. Data Model
- Key/value store++
- Strings
- Hashes, Sets
- Lists
- Sorted Sets
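For readers who know Python, the value types listed above map roughly onto built-in structures. This is an analogy only, with made-up keys and values; the `ranked` helper is hypothetical and mimics just the ordering rule of a sorted set (by score, ties broken by member name).

```python
store = {}

store["greeting"] = "hello"                              # string
store["user:1"] = {"name": "Tim", "city": "Littleton"}   # hash: field -> value
store["tags"] = {"nosql", "redis"}                       # set: unique members
store["queue"] = ["first", "second"]                     # list: ordered, repeats allowed
store["scores"] = {"alice": 120.0, "bob": 95.5, "carol": 120.0}  # sorted set: member -> score


def ranked(sorted_set):
    """Members ordered by score, with ties broken by member name."""
    return sorted(sorted_set, key=lambda member: (sorted_set[member], member))


leaderboard = ranked(store["scores"])
```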
94. Does it scale?
- Vertically, sure
- Plus it’s really fast
- Master/slave options
- Technically a CA system
95. API
- Binary socket interface
- Drivers for 22+
languages
- Commands look like
assembly language
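The terse, mnemonic commands (SET, LPUSH, ZADD) travel over the socket as length-prefixed arguments. A minimal sketch of encoding one command in the Redis multi-bulk request format; `encode_command` is a hypothetical helper, it only builds the bytes, and it does not open a connection or parse replies.

```python
def encode_command(*args: str) -> bytes:
    """Encode a command as a multi-bulk request.

    Layout: "*<argument count>" followed, for each argument, by
    "$<byte length>" and the argument itself, each terminated by CRLF.
    """
    parts = [b"*" + str(len(args)).encode("ascii") + b"\r\n"]
    for arg in args:
        data = arg.encode("utf-8")
        parts.append(b"$" + str(len(data)).encode("ascii") + b"\r\n")
        parts.append(data + b"\r\n")
    return b"".join(parts)


wire = encode_command("SET", "name", "Tim")
```

Because every argument is prefixed with its byte length, values can contain spaces, newlines, or arbitrary binary data without any escaping.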
105. Thank You
Tim Berglund
www.augusttechgroup.com
tim.berglund@augusttechgroup.com
@tlberglund
106. Further Reading
Brewer’s Conjecture
http://www.podc.org/podc2000/
Proof of Brewer’s Conjecture (the “CAP Theorem”)
http://bit.ly/cap-theorem-proof
Amazon Dynamo
http://bit.ly/amazon-dynamo
http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
Google BigTable
http://bit.ly/big-table
The CAP Theorem Explained
http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
Visualizing NoSQL Databases on the CAP Venn Diagram
http://blog.nahurst.com/visual-guide-to-nosql-systems
Redis
http://redis.io/
Cassandra
http://cassandra.apache.org
MongoDB
http://mongodb.org