NoSQL is not a buzzword anymore. The array of non-relational technologies has found wide-scale adoption even outside Internet-scale focus areas. With the advent of the Cloud, the churn has increased even more, yet there is no crystal-clear guidance on adoption techniques and architectural choices among the plethora of options available. This session initiates you into the whys and wherefores, architectural patterns, caveats and techniques that will augment your decision-making process and sharpen your understanding of architecting scalable, fault-tolerant, distributed solutions.
6. Agenda
• What NoSQL is & What it is not
• Why NoSQL – 2 specific reasons
• Conceptual Fundamentals & Grounding
• 3 techniques to classify & choose
• Way ahead
7. What
• Variety of non-relational database systems
• Usually schema-less
• Mostly open-source
• Not anti-RDBMS
• Not a replacement
8. No relational tables were harmed in the making of this presentation.
15. 4 Vs of Big Data
Volume
• Terabytes and petabytes
Velocity
• Time-sensitive real-time data processing & decision making
Variety
• Of structured and unstructured data
Value
• Inherent value always
16. RDBMS can handle all that. Right?
• Scaling up has a limit.
• Sharding - spread data across servers.
• Denormalization - potentially duplicates data in the database, requiring updates to multiple tables when a duplicated data item is changed.
• Distributed caching - caching recently accessed data in memory and storing that data across any number of servers or virtual machines. Think Memcached.
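The sharding tactic above can be sketched in a few lines. This is a minimal illustration, assuming a hash-modulo placement scheme and plain in-memory dicts standing in for database servers; the names `shard_for`, `put` and `get` are hypothetical, not any product's API.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (illustrative scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Hypothetical cluster: each "server" is just an in-memory dict here.
shards = [dict() for _ in range(4)]

def put(key: str, row: dict) -> None:
    # The application (or a routing layer) decides which server owns the key.
    shards[shard_for(key, len(shards))][key] = row

def get(key: str):
    # Reads must use the same routing rule as writes.
    return shards[shard_for(key, len(shards))].get(key)

put("user:42", {"name": "Ada"})
put("user:43", {"name": "Grace"})
print(get("user:42"))
```

Note that the routing rule lives outside the database, which is part of why sharding an RDBMS adds application complexity.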
17. RDBMS tactics - Downside & Pitfalls
• Re-sharding is disruptive.
• Maintain schema on every server
• Distributed Caching accelerates just the reads
• You lose relational benefits anyway.
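Why re-sharding is disruptive can be shown with a small experiment. The sketch below assumes the simple hash-modulo placement (an assumption, not something the slides prescribe) and counts how many keys land on a different server when a fifth server is added. With modulo placement a key stays put only when its hash gives the same remainder for 4 and for 5, so roughly four out of five keys are expected to move.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Stable hash-modulo placement (illustrative scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

def fraction_moved(keys, old_count: int, new_count: int) -> float:
    """Share of keys whose owning shard changes when the shard count changes."""
    moved = sum(
        1 for k in keys if shard_for(k, old_count) != shard_for(k, new_count)
    )
    return moved / len(keys)

# Hypothetical key set, for illustration only.
keys = [f"user:{i}" for i in range(10_000)]
moved = fraction_moved(keys, 4, 5)
print(f"{moved:.0%} of keys change shard when growing from 4 to 5 servers")
```

Every moved key means data copied between servers while the system is live, which is the disruption the slide refers to.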
19. Aggregate-orientation
• Unit of data can have a more complex
structure than a set of simple tuples.
• Excellent fit to run on a cluster.
• Atomic manipulation of single
aggregate.
• Application code takes precedence.
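A minimal sketch of an aggregate, assuming a hypothetical order with nested line items and a plain dict standing in for an aggregate store. The whole order is read and written as one unit keyed by its id, instead of being spread across several tables of tuples.

```python
import json

# Hypothetical order aggregate: one nested structure, not a set of flat tuples.
order = {
    "order_id": "o-1001",
    "customer": {"id": "c-7", "name": "Ada"},
    "lines": [
        {"sku": "book-1", "qty": 2, "price": 15.0},
        {"sku": "pen-9", "qty": 1, "price": 3.5},
    ],
}

store = {}  # stand-in for an aggregate store, keyed by aggregate id

def save(aggregate: dict) -> None:
    # The whole aggregate is written in a single step.
    store[aggregate["order_id"]] = json.dumps(aggregate)

def load(order_id: str) -> dict:
    return json.loads(store[order_id])

def add_line(order_id: str, sku: str, qty: int, price: float) -> None:
    # Read, modify and write back one aggregate; no other record is touched.
    aggregate = load(order_id)
    aggregate["lines"].append({"sku": sku, "qty": qty, "price": price})
    save(aggregate)

def order_total(aggregate: dict) -> float:
    # The application code, not the store, knows how to interpret the structure.
    return sum(line["qty"] * line["price"] for line in aggregate["lines"])

save(order)
add_line("o-1001", "clip-3", 1, 1.5)
print(order_total(load("o-1001")))
```

Because everything an order needs lives under one key, the aggregate can be placed on a single node of a cluster. Actual atomicity guarantees depend on the store in use.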
25. 3 properties of distributed databases
• Consistency means that each client always has the
same view of the data.
• Availability means that a node is always available for
reads and writes.
• Partition tolerance means that the system keeps working
when the network splits nodes into groups that cannot reach each other.
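The three properties can be made concrete with a toy two-replica store. This is an illustrative simulation only, not how any real database is implemented; the class names and the "CP" and "AP" mode labels are assumptions made for the sketch. While the network is partitioned, the store either refuses writes (keeping the replicas consistent) or accepts them locally (staying available while the replicas drift apart).

```python
class Replica:
    def __init__(self, name: str):
        self.name = name
        self.value = None

class TwoNodeStore:
    """Toy store with two replicas; mode is "CP" or "AP" (illustrative labels)."""

    def __init__(self, mode: str):
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False  # True when the replicas cannot reach each other

    def write(self, replica: Replica, value) -> None:
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # Stay consistent by refusing the write: availability is lost.
                raise RuntimeError("unavailable during partition")
            # Stay available by accepting locally: replicas may now disagree.
            replica.value = value
            return
        # Healthy network: update both replicas together.
        replica.value = value
        other.value = value

    def read(self, replica: Replica):
        return replica.value

ap = TwoNodeStore("AP")
ap.write(ap.a, 1)
ap.partitioned = True
ap.write(ap.a, 2)
print(ap.read(ap.a), ap.read(ap.b))  # clients of a and b see different data

cp = TwoNodeStore("CP")
cp.write(cp.a, 1)
cp.partitioned = True
try:
    cp.write(cp.a, 2)
    cp_write_accepted = True
except RuntimeError:
    cp_write_accepted = False
print(cp_write_accepted, cp.read(cp.a), cp.read(cp.b))
```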
26. CAP Theorem
Consistency, availability, partition tolerance: only 2 out of 3
27. Consistency, availability, partition tolerance
This is incorrect
33. For the academically inclined:
• Google BigTable: proprietary DB, high-performance, Google App Engine
• Amazon Dynamo: proprietary system, high-availability, AWS, key-value
35. Object oriented
Faster and Declarative.
Lack of interoperability and recovery standards.
End-to-end development, database & deployment platform.
Embeddable and fast. Lack of querying capabilities.
36. XML
Native XML database systems.
Typically XQuery used as querying mechanism.
Advantage or Disadvantage based on XML affinity.
Examples: Sedna, Tamino.
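To give a feel for querying XML by structure, here is a small sketch. It does not use XQuery or any native XML database; as a stand-in it uses the limited XPath subset in Python's standard `xml.etree.ElementTree` module, and the sample document is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, for illustration only.
xml_text = """
<library>
  <book year="2012"><title>First</title></book>
  <book year="1970"><title>Second</title></book>
</library>
""".strip()

doc = ET.fromstring(xml_text)

# Select books by attribute value, then pull out the nested title text.
titles = [
    book.findtext("title")
    for book in doc.findall("./book[@year='2012']")
]
print(titles)
```

A native XML database would evaluate queries like this inside the server, typically with the richer XQuery language rather than this XPath subset.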
53. We learnt that ...
RDBMSs are here to stay. NoSQL is not creating
a paradigm shift.
NoSQL provides a set of non-relational data
stores & technologies that have an affinity for
being processed in a clustered environment.
Some NoSQL databases also offer a
solution to Impedance Mismatch, thus
increasing application developer productivity.
What Aggregate-Orientation in data modeling
means.
What the different types of databases
are.
And most importantly ... we now know that
RDBMS systems need DBAs - Database
Architects & Admins.
NoSQL systems need DBAs too - Developers
Beyond Awesome!