The days of the relational database being a one-stop shop for all of your persistence needs are over. Although NoSQL databases address some issues that can’t be addressed by relational databases, the opposite is true as well. The relational database offers an unparalleled feature set and rock-solid stability. One cannot overestimate the importance of using the right tool for the job, and for some jobs, one tool is not enough. This talk focuses on the strengths and weaknesses of both relational and NoSQL databases, the benefits and challenges of polyglot persistence, and examples of polyglot persistence in the wild.
These slides were presented at WindyCityDB 2010.
Polyglot Persistence - Two Great Tastes That Taste Great Together
1. Polyglot Persistence
Two Great Tastes
That Taste Great Together!
John Wood
john_p_wood@yahoo.com
@johnpwood
2. About Me
● Software Developer at Interactive Mediums
● Primarily work on a web application that allows
our customers to engage and interact with their
customers
● Writing code for about 15 years
● Tinkering with NoSQL for about 1.5 years
● Have a NoSQL solution that has been running
in production for a year
14. The RDBMS Is No Longer The
Default Choice
● Can be very difficult to scale horizontally
● Schemas can be difficult to maintain and
migrate
● For some applications, the data integrity
features of the RDBMS are an unnecessary
overhead
● Data constraints and JOINs can be expensive
at runtime
16. NoSQL Databases Have Stepped
Up To Address These Issues
● Schema-less
● Little to no data integrity enforcement
● Self-contained data
● Eventually consistent
● Easy to scale horizontally to add processing
power and storage
18. But The RDBMS Is Far From Dead
● Incredibly mature, and battle tested
● Immediate and constant consistency
● Integrity of data is enforced
● Efficient use of storage space if data is
normalized properly
● Supported by everyone and everything (tools,
frameworks, libraries, etc.)
● Incredibly flexible and powerful query language
● Help is plentiful and easy to find
28. “Polyglot Persistence, like
polyglot programming, is all
about choosing the right
persistence option for the task at
hand.” - Scott Leberknight,
October, 2008
http://www.nearinfinity.com/blogs/scott_leberknight/polyglot_persistence.html
53. class User < ActiveRecord::Base
end
class ContestEntry < CouchRest::ExtendedDocument
  property :entry_number
end
54. class User < ActiveRecord::Base
  def contest_entries
    ContestEntry.entries_for_user(self.id)
  end
end

class ContestEntry < CouchRest::ExtendedDocument
  property :entry_number
  property :user_id

  def self.entries_for_user(user_id)
    # Execute your view to fetch the contest entries
  end

  def user
    User.find_by_id(user_id)
  end
end
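The cross-store association on this slide can be tried without either database installed. What follows is a minimal sketch, not the ActiveRecord or CouchRest API: `FakeStore`, `RELATIONAL`, `DOCUMENTS`, and the sample data are all hypothetical stand-ins, and `entries_for_user` filters an in-memory list where the real method would query a CouchDB view.

```ruby
# Hypothetical in-memory store; stands in for both MySQL and CouchDB.
class FakeStore
  def initialize
    @rows = []
  end

  def insert(row)
    @rows << row
    row
  end

  def select(&block)
    @rows.select(&block)
  end
end

RELATIONAL = FakeStore.new # plays the role of the relational database
DOCUMENTS  = FakeStore.new # plays the role of the document database

User = Struct.new(:id, :name) do
  def self.find_by_id(id)
    RELATIONAL.select { |user| user.id == id }.first
  end

  # Cross-store "has many": ask the document store for entries
  # that carry this user's id.
  def contest_entries
    ContestEntry.entries_for_user(id)
  end
end

ContestEntry = Struct.new(:entry_number, :user_id) do
  # Assumption: a real implementation would query a view keyed by user_id.
  def self.entries_for_user(user_id)
    DOCUMENTS.select { |entry| entry.user_id == user_id }
  end

  # Cross-store "belongs to": look the owner up in the relational store.
  def user
    User.find_by_id(user_id)
  end
end

alice = RELATIONAL.insert(User.new(1, "Alice"))
DOCUMENTS.insert(ContestEntry.new(101, alice.id))
DOCUMENTS.insert(ContestEntry.new(102, alice.id))
```

The only link between the two stores is the `user_id` value saved in each document. Neither database enforces it, so keeping that reference valid is the application's job.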
58. ● Primary MySQL database with a backup
● A few very large tables, containing 5M – 30M
rows each, and growing quickly
● Increasing query execution time
● Some pages on the web app were timing out
● Increasing database migration time
● Rigid schema of the RDBMS was preventing
some planned features from moving forward
59. ● Brought in a consultant to help us optimize our
MySQL setup
● Optimized slow queries
● Added some indexes
● Offloaded some work to the backup database
● Considered the use of summary tables for
statistics
61. ● Migrated old data from large tables to CouchDB
● Using CouchDB views to aggregate summary
data
● Data is imported and views are updated nightly
● Queries for statistics now very fast
● Using Lucene (via couchdb-lucene) for full text
searching
● Taking full advantage of CouchDB's schema-less
nature in several new application features
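To illustrate how a map/reduce view can aggregate summary data, here is a small in-memory sketch in plain Ruby. It is not CouchDB code: real CouchDB views are normally written in JavaScript and stored in design documents. The document fields (`campaign`, `day`), the sample data, and the `build_view` helper are hypothetical.

```ruby
# Hypothetical documents, standing in for rows migrated out of MySQL.
docs = [
  { "campaign" => "spring", "day" => "2010-04-01" },
  { "campaign" => "spring", "day" => "2010-04-01" },
  { "campaign" => "spring", "day" => "2010-04-02" },
  { "campaign" => "fall",   "day" => "2010-04-01" }
]

# Map step: emit one [key, value] pair per document.
map = ->(doc) { [[[doc["campaign"], doc["day"]], 1]] }

# Reduce step: combine every value emitted for the same key.
reduce = ->(values) { values.sum }

# Build the whole "view" in one pass, the way a nightly job might.
def build_view(docs, map, reduce)
  emitted = docs.flat_map { |doc| map.call(doc) }
  emitted.group_by { |key, _value| key }
         .transform_values { |pairs| reduce.call(pairs.map { |_key, value| value }) }
end

summary = build_view(docs, map, reduce)
```

Once the view is built, a statistics query is a lookup by key, for example `summary[["spring", "2010-04-01"]]`, rather than a scan over the underlying documents.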
63. ● CouchDB databases and views can be very
large on disk
● Some queries could not be substituted with
CouchDB views
● Indexing tens of millions of documents for full
text search with Lucene takes weeks
● Development takes longer, as the map/reduce
model requires additional thought and planning
● Changing/Upgrading views in production not
straightforward
http://www.couch.io/migrating-to-couchdb
67. ● Vertically and horizontally partitioned MySQL
● Several layers of aggressive caching, all
application managed
● Schema changes impossible, resulting in the
use of bitfields and piggyback tables
● Hardware intensive
● Error prone
● Hitting MySQL limits
● Already eventually consistent
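The bitfield workaround mentioned above packs several boolean attributes into one existing integer column, so a new attribute needs a new bit rather than a schema change. This is a generic sketch of the technique, not the implementation used by the system described on this slide. The `Flags` module and the flag names are hypothetical.

```ruby
# Hypothetical flags packed into a single integer column.
module Flags
  PROTECTED   = 1 << 0
  VERIFIED    = 1 << 1
  GEO_ENABLED = 1 << 2 # a later addition: new bit, no ALTER TABLE

  # Turn a flag on.
  def self.set(bits, flag)
    bits | flag
  end

  # Turn a flag off.
  def self.clear(bits, flag)
    bits & ~flag
  end

  # Check whether a flag is on.
  def self.set?(bits, flag)
    (bits & flag) != 0
  end
end

flags = 0
flags = Flags.set(flags, Flags::PROTECTED)
flags = Flags.set(flags, Flags::GEO_ENABLED)
```

The trade-off is that the meaning of each bit lives only in application code, so the database cannot easily index, constrain, or document the individual attributes.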
69. ● Migrating from MySQL to Cassandra as their
main online data store
● Hadoop/HBase used for people search feature
● FlockDB used to manage the social graph
● Hadoop for analytics
● “As with all NoSQL systems, strengths in
different situations” - Kevin Weil, Analytics
Lead, Twitter
http://www.slideshare.net/kevinweil/nosql-at-twitter-nosql-eu-2010
70. ● Increased availability
● The ability to support new features
● The ability to analyze their massive amount of
data in a reasonable amount of time
http://www.slideshare.net/kevinweil/nosql-at-twitter-nosql-eu-2010