4. What is NoSQL
Not a well defined term
(just the name of a single meetup held in
San Francisco in 2009)
5. So, what does it stand for?
It is better to pay attention to what it
means rather than what it stands for
6. Common characteristics of
NoSQL
● Don't use SQL as a query language
(each provides its own query mechanism)
● Non relational
● Open-source projects
● Run on clusters
● Developed in the 21st century
● Schemaless
8. Why do you use NoSQL
To operate on big data spread across multiple
machines in a cluster
To increase developer productivity
(even if there is no demand for big data)
9. What is wrong with traditional
RDBMS
● Nothing really, they will not disappear
(who knows ;)
● Well-established tools
(there is even a whole profession behind
them: the DBA)
● It is not a black-or-white choice: NoSQL
and RDBMS will continue to work side by
side, hence the rise of Polyglot
Persistence
10. But, RDBMS is not perfect
Impedance mismatch
Running on a cluster is a challenge
12. Data Model
Aggregate Oriented VS Relational
- Access by key
- Make it easier to manage data storage over
clusters
- Usually you adapt your aggregate/data model to
the query patterns your application has
Aggregate – a collection of related objects that we wish to treat as a unit
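A minimal sketch of the aggregate idea, assuming a hypothetical order aggregate and a plain in-memory dict standing in for a key-value store. The names `store`, `put`, `get` and the order fields are illustrative only and do not come from any particular database.

```python
import copy

# In-memory stand-in for an aggregate-oriented store (hypothetical).
store = {}

def put(key, aggregate):
    """Write the whole aggregate under a single key."""
    store[key] = copy.deepcopy(aggregate)

def get(key):
    """Read the whole aggregate back by its key."""
    return copy.deepcopy(store[key])

# The order, its customer and its lines are stored and fetched as one unit.
order = {
    "customer": {"id": 7, "name": "Ann"},
    "lines": [
        {"sku": "A-1", "qty": 2, "price": 10.0},
        {"sku": "B-9", "qty": 1, "price": 5.5},
    ],
}
put("order:1001", order)

# One key lookup returns everything the "show order" query needs.
loaded = get("order:1001")
total = sum(line["qty"] * line["price"] for line in loaded["lines"])
```

In a relational model the same data would be split across customer, order and order-line tables and joined at read time; here the shape of the aggregate follows the query the application runs most often.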
13. ACID
NoSQL has ACID, but only within the scope
of one aggregate
(we can atomically manipulate a single
aggregate at a time)
Graph databases actually have full ACID support
14. Distribution Models
● Single Server (no distribution at all)
● Sharding (can be combined with replication)
(shard key – range based or hash based)
● Master-Slave Replication (“read” scalability)
(writes go to the master, reads can be served from slaves)
(the master is a single point of failure)
● Peer-to-Peer Replication (common to CF)
(consistency issue)
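The two shard-key strategies mentioned above can be sketched as follows. This is an illustration under assumptions, not how any specific database routes keys: the function names, the use of MD5 and the boundary list are all hypothetical choices.

```python
import bisect
import hashlib

def hash_shard(key, num_shards):
    """Hash-based sharding: spread keys evenly, ignoring key order."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

def range_shard(key, upper_bounds):
    """Range-based sharding: shard i holds keys below upper_bounds[i]
    (and at or above the previous bound); the last shard holds the rest."""
    return bisect.bisect_right(upper_bounds, key)

# Three range shards: keys < "h", keys in ["h", "p"), and keys >= "p".
bounds = ["h", "p"]
placement = {name: range_shard(name, bounds) for name in ("alice", "mike", "zed")}
```

Range-based keys keep neighbouring keys on the same shard, which helps range scans but can create hot spots; hash-based keys balance load at the cost of scattering adjacent keys.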
16. NWR
● N – number of nodes to replicate to
(replication factor, number of copies in
the cluster)
● W – number of nodes that must acknowledge a
write before it counts as successful
● R – number of nodes that must respond to a
read before it counts as successful
17. NWR
● W+R <= N – eventual consistency
(eventually all the nodes in the cluster will get
the data)
● W = N, R = 1 – consistency by writes
(what RDBMS does)
● W = 1, R = N – consistency by reads
(conflicts must be resolved somehow)
● W + R > N – consistency by quorum
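The four cases above can be expressed as a small classifier. This is a sketch for illustration: the function name `consistency_mode` and the returned labels are assumptions, and note that the "by writes" and "by reads" settings are special cases of W + R > N.

```python
def consistency_mode(n, w, r):
    """Classify an (N, W, R) setting into the cases listed on the slide."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("W and R must be between 1 and N")
    if w + r <= n:
        # A read set may miss every node that took the latest write.
        return "eventual consistency"
    if w == n and r == 1:
        return "consistency by writes"
    if w == 1 and r == n:
        return "consistency by reads"
    return "consistency by quorum"

modes = [consistency_mode(3, w, r) for (w, r) in [(1, 1), (3, 1), (1, 3), (2, 2)]]
```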
18. Quorum (W+R > N)
Read from more than half and
write to more than half
(QUORUM = floor(N/2) + 1)
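A short sketch of why the quorum rule works, assuming nodes are simply numbered 0..N-1. The helper names `quorum` and `always_overlap` are hypothetical. The brute-force check confirms that whenever W + R > N, every possible write set shares at least one node with every possible read set, so a read always reaches a node holding the latest write.

```python
from itertools import combinations

def quorum(n):
    """Smallest number of nodes that is more than half of n."""
    return n // 2 + 1

def always_overlap(n, w, r):
    """True if every set of w nodes intersects every set of r nodes."""
    nodes = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )

sizes = [quorum(n) for n in (3, 4, 5)]
quorum_ok = always_overlap(5, quorum(5), quorum(5))   # W + R = 6 > 5
too_small = always_overlap(3, 1, 2)                   # W + R = 3, not > 3
```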