2. 2 Dell - Restricted - Confidential
What are databases?
• Ted Codd & Chris Date
– Codd's rules for relational databases (13 in all, numbered 0 to 12)
– An Introduction to Database Systems
• Wikia/Wikipedia
• Mike
– An organized collection of data offering varying levels
of availability, scalability, performance, consistency,
management, accessibility and quality.
• Matt
Databases defined
NoSQL background, issues and considerations
• History
– Google Bigtable, Amazon Dynamo
• What does schema-less mean?
– Schema on read
– Still structured
– Embedded
– Can vary between records
• Languages & formats used
– Java, Python
– JSON, BSON, XML, CSV
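As a minimal sketch of what "schema on read" can look like in practice, the Python snippet below parses JSON records whose fields vary from one record to the next and applies a shape only when reading them. The record contents and the `read_contact` helper are hypothetical, chosen for illustration only.

```python
import json

# Hypothetical records: each one is still structured, but the fields vary.
raw_records = [
    '{"id": 1, "name": "Ada", "email": "ada@example.com"}',
    '{"id": 2, "name": "Grace", "phones": ["555-0100", "555-0101"]}',
    '{"id": 3, "name": "Edgar", "address": {"city": "San Jose"}}',
]


def read_contact(raw):
    """Apply the expected shape at read time, tolerating missing fields."""
    doc = json.loads(raw)
    return {
        "id": doc["id"],
        "name": doc["name"],
        "email": doc.get("email"),                  # absent in some records
        "phones": doc.get("phones", []),            # embedded list
        "city": doc.get("address", {}).get("city"), # embedded sub-document
    }


contacts = [read_contact(r) for r in raw_records]
for contact in contacts:
    print(contact)
```

Nothing is validated when the records are written; the reader decides which fields it needs and how to handle the ones that are missing.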
NoSQL background, issues and considerations (continued)
• Eric Brewer’s CAP theorem
– Can’t guarantee all three (consistency, availability, partition tolerance) at once.
• What does NoSQL really mean?
– Distributed, shared-nothing, aggregate-oriented database
– “Not only SQL” versus “No SQL”
• What are the factors for the various choices?
– Best fit
– Use case(s)
– Key-value (KV)
– HA, Multi-site
– Network
– Kevin Bacon
• Sharding
– Partitioning
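Sharding by hash partitioning can be sketched as follows. This is an illustrative toy, not how any particular product does it: the shard count, the in-memory `shards` dictionaries, and the `shard_for`, `put`, and `get` helpers are all assumptions made for the example.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical cluster size


def shard_for(key, num_shards=NUM_SHARDS):
    """Map a key to a shard with a stable hash (same key, same shard)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Each shard is modeled as an independent dict (shared-nothing).
shards = {i: {} for i in range(NUM_SHARDS)}


def put(key, value):
    shards[shard_for(key)][key] = value


def get(key):
    return shards[shard_for(key)].get(key)


for user in ["alice", "bob", "carol", "dave"]:
    put(user, {"name": user})

print(get("alice"))
print({i: sorted(s) for i, s in shards.items()})
```

Simple modulo hashing remaps most keys when the shard count changes, which is why production systems often use consistent hashing or range partitioning instead.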
NewSQL
• SQL as predominant access method
• OLTP
• Larger user populations than NoSQL
• Better consistency than NoSQL
• Still subject to Brewer’s CAP theorem
• Examples
– VoltDB, MemSQL, Clustrix, NuoDB
RDBMS or NoSQL?
• RDBMS
– Large user populations
– Structured
– Static schema
– Strong typing
– Access by PK, AK, indexes
– Complex structures
– Feature rich
– Multi-purpose, shared by apps
– OLTP
– ACID
– Complex queries
– >3 way joins
– Small to medium-sized databases
– COTS packages
– Datamarts
• NoSQL
– Smaller user populations
– Multi-structured
– Schema evolution
– Weak typing
– Mostly random access by PK
– Simple structures
– Bare bones functionality
– Single purpose/use case, not shared by apps
– Not transactional
– BASE
– Simple queries
– VLDB
– Horizontal scalability
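The contrast between the two lists above can be illustrated with a small sketch: the same customer data is read once through a relational join (using Python's built-in `sqlite3` module) and once as a single aggregate fetched by primary key from a plain dictionary standing in for a key-value or document store. The table names, keys, and data are hypothetical.

```python
import sqlite3

# Relational side: static schema, typed columns, access via joins.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        item TEXT NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ada');
    INSERT INTO orders VALUES (10, 1, 'disk'), (11, 1, 'cable');
""")
rows = conn.execute(
    "SELECT c.name, o.item FROM customers c "
    "JOIN orders o ON o.customer_id = c.id "
    "WHERE c.id = ? ORDER BY o.id",
    (1,),
).fetchall()
print(rows)

# NoSQL side: one aggregate per key, fetched by primary key, no join.
# A dict stands in for the store here; this is an assumption of the sketch.
kv_store = {
    "customer:1": {
        "name": "Ada",
        "orders": [{"id": 10, "item": "disk"}, {"id": 11, "item": "cable"}],
    }
}
doc = kv_store["customer:1"]
items = [order["item"] for order in doc["orders"]]
print(doc["name"], items)
```

The aggregate is easy to place on one node and read in one lookup, but queries that cut across aggregates (for example, all customers who ordered a given item) are where the relational model's joins and indexes pay off.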
NoSQL Database Types
• Four types
– Columnar
– HBase, Cassandra
– Document
– MongoDB, Couchbase
– KV
– Riak, Redis
– Graph
– Neo4j, Titan
• How many do you need?
– By type
– Within type
• Who will manage them?
– DBAs
• How do you access them?
– SQL, NoSQL
– Sequential
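To make the four types listed above more concrete, here is a rough sketch of how each data model might represent the same small fact ("Ada follows Grace") using plain Python structures. These shapes are simplifications for illustration and do not reflect the storage format or API of any of the products named on the slide.

```python
# Key-value: an opaque value looked up by key.
kv = {"user:42": b'{"name": "Ada", "follows": [7]}'}

# Document: nested fields that the store can index and query.
document = {"_id": 42, "name": "Ada", "follows": [7]}

# Columnar (wide-column): row key -> column family -> column -> value.
wide_column = {
    "42": {
        "profile": {"name": "Ada"},
        "follows": {"7": True},  # columns can differ from row to row
    }
}

# Graph: nodes plus explicit, typed edges.
nodes = {42: {"name": "Ada"}, 7: {"name": "Grace"}}
edges = [(42, "FOLLOWS", 7)]


def followed_by(user_id):
    """Traverse FOLLOWS edges out of one node."""
    return [
        nodes[dst]["name"]
        for src, rel, dst in edges
        if src == user_id and rel == "FOLLOWS"
    ]


print(followed_by(42))
```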
NoSQL commonalities
• Mostly open source
• Weak typing
• Multi-structured
• Horizontal scale
• No standardization
• VLDB
• Single purpose, per database
How are NoSQL databases typically used?
• As an adjunct to Hadoop
• As a partial replacement for some RDBMS workloads
• To scale linearly
• As a data store for semi-structured and multi-structured data
What questions do our customers ask?
• Why is my HBase cluster so CPU hungry?
• Do you have a reference architecture (RA) for <your favorite NoSQL database goes here>?
• Can I replace all my Oracle databases with NoSQL databases?
What are some common problems?
• Cohabitation with Hadoop and other programs on a cluster.
• Poor database design
• Falling prey to vendor hype
How about some general recommendations?
• Read a book or two on your target NoSQL database.
• Search through the blogosphere & Twitterverse.
• Don’t use more than one type, unless you’re a systems integrator (SI) or large service provider.
• If performance & service levels are important, isolate the cluster.
• Review your database design with DBAs & those who have done it already.
– Presentations, conference proceedings, boutique consultancies