1. Forget the ORM!
Persistent data with
Non-traditional Databases
By Randal L. Schwartz,
Stonehenge Consulting Services, Inc.
<merlyn@stonehenge.com>
http://www.stonehenge.com/merlyn/
version 1.2 at 13 April 2010
12. Brief review of ORMs
• Objects in memory
• Tables in RDBMS
• Load objects from tables
• Store objects (and changes) back to tables
• Typically mapped one attribute per column
• Occasionally more complex mappings
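As a minimal sketch of the one-attribute-per-column mapping described above, the following hand-rolls the load and store steps that an ORM would generate. The `Person` class, the `person` table, and both helper functions are hypothetical names chosen for illustration, and SQLite (via Python's standard `sqlite3` module) merely stands in for whatever RDBMS is in use.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Person:
    # One attribute per column (hypothetical example class).
    id: int
    name: str
    email: str


def load_person(conn, person_id):
    """Load an object from a table row, or return None if no row matches."""
    row = conn.execute(
        "SELECT id, name, email FROM person WHERE id = ?", (person_id,)
    ).fetchone()
    return Person(*row) if row else None


def store_person(conn, person):
    """Store the object back to its table, replacing any existing row."""
    conn.execute(
        "INSERT OR REPLACE INTO person (id, name, email) VALUES (?, ?, ?)",
        (person.id, person.name, person.email),
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
store_person(conn, Person(1, "Fred", "fred@example.com"))
loaded = load_person(conn, 1)
```

Note that `store_person` rewrites every column even if only one attribute changed, which is the cost the next slide discusses.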
19. But things change
• Object attributes get updated
• New objects get created
• Need to map that back to the RDB
• Updates often require custom SQL
• Or live with updating more than needed
• Detecting what has changed can be hard
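One way to detect what has changed is to record each attribute assignment and build an UPDATE that touches only those columns. The sketch below is an illustration under assumptions, not how any particular ORM does it: the `Tracked` class and `build_update` helper are hypothetical, and table and column names are assumed to come from trusted code rather than user input.

```python
class Tracked:
    """Hypothetical object wrapper that remembers which attributes changed."""

    def __init__(self, **attrs):
        # Write through __dict__ so that __setattr__ below is not triggered.
        self.__dict__["_data"] = dict(attrs)
        self.__dict__["_dirty"] = set()

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for mapped attributes.
        try:
            return self.__dict__["_data"][name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        # Mark the attribute dirty only if the value really changed.
        if self._data.get(name) != value:
            self._data[name] = value
            self._dirty.add(name)


def build_update(table, key_col, obj):
    """Return (sql, params) updating only dirty columns, or None if clean."""
    cols = sorted(obj._dirty)
    if not cols:
        return None
    assignments = ", ".join(f"{col} = ?" for col in cols)
    sql = f"UPDATE {table} SET {assignments} WHERE {key_col} = ?"
    params = [getattr(obj, col) for col in cols] + [getattr(obj, key_col)]
    return sql, params
```

Even this small version shows the difficulty: it misses changes made inside a mutable attribute (appending to a list, say), because no assignment happens.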
26. One-to-many messes
• Messy when an attribute is a reference
• Or a non-DB data type, like a set
• Mapping involves joins
• Left joins to get child rows
• Or multiple trips to get joined data
• Either way, expensive
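The two strategies above can be sketched side by side. The `author` and `book` tables and both function names are hypothetical, and SQLite stands in for the RDBMS. The join version makes one trip but has to regroup flat rows into objects; the round-trip version issues one extra query per parent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO author VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO book VALUES (1, 1, 'First'), (2, 1, 'Second');
""")


def load_with_join(conn):
    """One trip: LEFT JOIN keeps authors with no books (title is NULL)."""
    authors = {}
    rows = conn.execute("""
        SELECT a.name, b.title
        FROM author a LEFT JOIN book b ON b.author_id = a.id
        ORDER BY a.id, b.id
    """)
    for name, title in rows:
        titles = authors.setdefault(name, [])
        if title is not None:
            titles.append(title)
    return authors


def load_with_round_trips(conn):
    """Multiple trips: one query for the parents, then one per parent."""
    authors = {}
    parents = conn.execute("SELECT id, name FROM author ORDER BY id").fetchall()
    for author_id, name in parents:
        children = conn.execute(
            "SELECT title FROM book WHERE author_id = ? ORDER BY id",
            (author_id,),
        )
        authors[name] = [title for (title,) in children]
    return authors
```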
34. Why this is bad
• If your car were an ORM...
• Reduce it to parts each night in garage
• Rebuild it each morning
• Is this sane in the 21st century?
• Most ORMs generate the SQL on the fly
• Recompiling text on each hit?
• Often called the “object-relational impedance mismatch”
42. How to solve it
• Rectangles don’t fit today’s data
• Solution: Don’t store rectangles
• Also known as “NoSQL” solutions
• Two main options:
• Document storage (typically JSON)
• Object storage
• (Generally) not possible if other apps still
need the data as RDB
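To make "don't store rectangles" concrete, here is a small sketch of the document-storage option: a nested structure is serialized to JSON as a single unit instead of being split across tables. The `order` document and its field names are invented for illustration; JSON has no set type, so the set is stored as a sorted list.

```python
import json

# A hypothetical order document: nested objects and lists, no joins needed.
order = {
    "id": 42,
    "customer": {"name": "Fred", "email": "fred@example.com"},
    "tags": sorted({"rush", "gift"}),  # sets become lists in JSON
    "lines": [
        {"sku": "A1", "qty": 2},
        {"sku": "B7", "qty": 1},
    ],
}

stored = json.dumps(order)     # what would be handed to a document store
restored = json.loads(stored)  # the whole object comes back in one read
```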
49. Dropping ACID
• Atomicity, Consistency, Isolation, Durability
• Often expressed through transactions
• Most NoSQL offer no multi-update ACID
• But atomic within a single doc update
• Thus, think about your “schema” carefully
• Ensure single doc update suffices
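The schema advice above can be illustrated with a toy in-memory store in which each single-document update is atomic but nothing spans two documents. `ToyDocStore` is a hypothetical class written for this sketch, not the API of any real product. Because the order lines and the total live in the same document, one update keeps them consistent.

```python
import copy
import threading


class ToyDocStore:
    """Toy store: each single-document update is all-or-nothing."""

    def __init__(self):
        self._docs = {}
        self._lock = threading.Lock()

    def put(self, key, doc):
        with self._lock:
            self._docs[key] = copy.deepcopy(doc)

    def get(self, key):
        with self._lock:
            return copy.deepcopy(self._docs[key])

    def update(self, key, fn):
        """Apply fn to a private copy; publish it only if fn succeeds."""
        with self._lock:
            doc = copy.deepcopy(self._docs[key])
            fn(doc)  # if this raises, the stored document is untouched
            self._docs[key] = doc


store = ToyDocStore()
store.put("order:42", {"lines": [], "total": 0})


def add_line(doc):
    # Lines and total are embedded in one document, so one update suffices.
    doc["lines"].append({"sku": "A1", "price": 5})
    doc["total"] += 5


store.update("order:42", add_line)
```

Had the lines and the total been kept in two separate documents, a failure between the two writes would leave them disagreeing.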
58. Eric Brewer’s CAP
• Consistency
• All working, or not working
• Yes, this is the “A” in ACID
• Availability
• Is the service up and running?
• Partition Tolerance
• Can parts of it go offline safely?
• .... Pick any two
92. MongoDB
• Open source (C++, GNU AGPL3)
• Wire/Storage protocol is BSON
• Embedded JavaScript for map/reduce
• Interactive JavaScript shell
• In-place updates
• Replication
• Auto-sharding
• Bindings for many languages
• FLOSS Weekly #105
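For readers unfamiliar with map/reduce, the shape of the computation can be sketched in a few lines. This is plain Python for illustration only: MongoDB's own map and reduce functions are written in JavaScript, and the `map_reduce` helper and the sample documents here are hypothetical.

```python
from collections import defaultdict


def map_reduce(docs, map_fn, reduce_fn):
    """Group the (key, value) pairs emitted by map_fn, then reduce each group."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def emit_tags(doc):
    # Map step: emit one (tag, 1) pair per tag in the document.
    for tag in doc.get("tags", []):
        yield tag, 1


def count(key, values):
    # Reduce step: combine all values emitted for one key.
    return sum(values)


docs = [{"tags": ["perl", "db"]}, {"tags": ["db"]}, {"title": "untagged"}]
tag_counts = map_reduce(docs, emit_tags, count)
```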
100. Apache Jackrabbit
• Open source (Java, Apache 2.0)
• XML content on disk
• Implements Java Content Repository API
• Full text and XPath search
• Versioning, transactions, observation
• Authentication
• Object persistence using Object Content
Manager
109. MarkLogic Server
• Not open-source
• XML database
• Implements XQuery
• Full-text and structured search
• Native geospatial searches
• In heavy use by their customers
• Scalable to “hundreds of terabytes”
• Both native and RESTful APIs
126. eXist
• Open source (Java, LGPL)
• Native XML queries
• XQuery, XPath, XSLT, XUpdate
• Rich APIs: REST, WebDAV, SOAP
• Can serve entire webapps!
• Used by large installations
• US State Department (history.state.gov)
• FLOSS Weekly #97
137. InfoGrid
• Open source (Java, AGPL3)
• GraphDatabase stores nodes and edges
• MeshBase - self contained
• NetMeshBase - distributed knowledge
• API in front of other SQL and NoSQL DBs
• Includes components for authentication
145. Neo4j
• Open source (Java, AGPL3)
• Commercial support/version available
• Java objects on disk
• Transactional
• Scalable (several billion nodes on a single machine)
• Small footprint (JAR under 500KB)
• RDF mappings
154. AllegroGraph
• Free versions (not open source)
• Holds Resource Description Framework (RDF) data
• From Franz, Inc (Franz Lisp, etc)
• Interfaces in many common languages
• Active development
• In use by FLOSS/commercial/government
• Queries with SPARQL
• ... Protocol and RDF Query Language
166. Memcached
• Open source (C, BSD)
• LRU cache (data may be lost)
• Developed for LiveJournal
• Used by many large sites
• Simple key/value storage
• Keys up to 250 bytes, values up to 1 MB (default limit)
• Clientside horizontal scaling
• Can also store on disk (memcachedb)
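Two of the ideas above, LRU eviction and client-side horizontal scaling, are easy to sketch. This is an illustration under assumptions rather than Memcached's implementation: `LRUCache` and `pick_server` are hypothetical names, and real clients typically use consistent hashing instead of the simple modulo shown here.

```python
import zlib
from collections import OrderedDict


class LRUCache:
    """Bounded cache that evicts the least recently used key when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None  # a miss: the data may have been evicted
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def set(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry


def pick_server(key, servers):
    """Client-side sharding: the client hashes the key to choose a server."""
    return servers[zlib.crc32(key.encode("utf-8")) % len(servers)]
```

Because the servers never talk to each other, adding capacity is a matter of giving the clients a longer server list, at the price of remapping many keys when the list changes.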
176. Redis
• Open source (C, BSD)
• Development funded by VMware
• Similar to Memcached (key/value pair)
• Values can be lists, sets, hashes
• Master/slave replication supported
• Many client languages available
• Used by many big sites
• Github, Craigslist, Engine Yard, Guardian
• Can also store on disk
190. Amazon SimpleDB
• Built by Amazon in Erlang
• Supports EC2 applications
• Highly scalable
• “Eventual” Consistency
• Items have query-able attributes
• No text search
• Build your own indexes
196. Berkeley DB
• Open source (C, Sleepycat)
• Enhancement of original DBM from AT&T
• Key/value pairs in various storage formats
• Transactional locking, HA features
• Widely used
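The DBM-style key/value model is simple enough to show with Python's standard `dbm` family. Note the assumption: this sketch uses the pure-Python `dbm.dumb` backend as a stand-in to show the programming model, and is not Berkeley DB's own API. The file path and keys are invented for illustration.

```python
import dbm.dumb
import os
import tempfile

# A throwaway location for the demo database files.
path = os.path.join(tempfile.mkdtemp(), "demo")

# "c" creates the database if needed; keys and values are stored as bytes.
with dbm.dumb.open(path, "c") as db:
    db["user:1"] = "Fred"
    db["user:2"] = "Barney"

# Reopen read-only: the pairs persisted across the close.
with dbm.dumb.open(path, "r") as db:
    value = db["user:1"]
    has_missing = "user:3" in db
```

There is no schema and no query language: the application decides what the keys mean and how the values are encoded.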
205. Tokyo Cabinet
• Open source (C, LGPL)
• Successor to GDBM and QDBM
• Very efficient in both space and speed
• Simple key/value pairs
• Multiple storage strategies
• Multiple client language interfaces
• Local storage only: no network interface
• But see Tokyo Tyrant
213. Cassandra
• Open source (Java, Apache 2)
• Originally developed at Facebook
• Uses key-column-value storage
• High availability/elastic through replication
• Used by Facebook, Digg, Twitter, Rackspace
• Eventually consistent
• Queries can ask “majority” or “all”
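The effect of asking for a "majority" or "all" can be worked out with simple arithmetic. With N replicas, a read is guaranteed to overlap the most recent acknowledged write when the number of replicas read (R) plus the number written (W) exceeds N. The function names below are hypothetical, and the sketch covers only this counting rule, not Cassandra's actual consistency-level API.

```python
def quorum(n):
    """Smallest majority of n replicas."""
    return n // 2 + 1


def read_overlaps_write(n, w, r):
    """True if every read set of size r must share a replica with every
    write set of size w, so a read always reaches the latest write."""
    return r + w > n


replicas = 3
majority = quorum(replicas)  # 2 of 3

# Majority writes plus majority reads always overlap.
strong = read_overlaps_write(replicas, majority, majority)

# Writing and reading a single replica is fast, but only eventually consistent.
weak = read_overlaps_write(replicas, 1, 1)
```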
220. GT.M
• Open source (MUMPS, GPL)
• Commercial support from Fidelity
• Distributed persistence of key/value pairs
• Transactions via optimistic concurrency
• Everything you read must be unchanged
• Mumps-to-C-to-Mumps APIs
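The rule "everything you read must be unchanged" can be sketched with per-key version numbers: a transaction remembers the version of each key it read, and the commit is rejected and retried if any of those versions has moved. `VersionedStore` and `run_transaction` are hypothetical names for this single-threaded illustration; it is not GT.M's API, and a real implementation would make the validate-and-write step atomic.

```python
class VersionedStore:
    """Toy key/value store that keeps a version number per key."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def read(self, key):
        return self._data.get(key, (0, None))

    def commit(self, read_versions, writes):
        # Validate: every key the transaction read must be unchanged.
        for key, seen_version in read_versions.items():
            if self.read(key)[0] != seen_version:
                return False
        # Apply: bump the version of each written key.
        for key, value in writes.items():
            self._data[key] = (self.read(key)[0] + 1, value)
        return True


def run_transaction(store, body, max_retries=10):
    """Run body(read, write) optimistically, retrying on conflict."""
    for _ in range(max_retries):
        read_versions, writes = {}, {}

        def read(key):
            version, value = store.read(key)
            read_versions.setdefault(key, version)
            return writes[key] if key in writes else value

        def write(key, value):
            writes[key] = value

        body(read, write)
        if store.commit(read_versions, writes):
            return True
    return False
```

No locks are held while the body runs; the cost of a conflict is paid only when one actually happens, by rerunning the body.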
230. Mnesia
• Open source (Erlang, Erlang license)
• Key/value pairs
• Value is any Erlang datatype
• Live reconfiguration
• Supports transactions and distribution
237. HBase
• Open source (Java, Apache 2)
• Mimics Google’s “Bigtable”
• Push map/reduce down to shards
• Runs on Hadoop Distributed File System
• Java, REST, Thrift APIs
• Might be the DB behind Bing
256. db4o
• Open source (Java/C#, GPL)
• Commercial license/support available
• Supports object persistence
• Both Java and .NET
• Can replicate to traditional RDBMS
• Supports class migration
• ... if you provide the code
• Large community
263. InterSystems Caché
• Commercial
• “World’s fastest object database”
• Cross-platform
• Persistence of multidimensional arrays
• Similar to MUMPS/Pick data
• Can be embedded in web pages
280. GemStone/S
• Commercial (free license for small apps)
• Smalltalk objects just “persist”
• Includes Java interface
• Automatic and/or guided class upgrades
• Horizontally scalable
• OOCL manages 40% of overseas traffic
• JPM’s Kapital created the financial crisis :)
• Integrates with Seaside
• Web apps with transparent persistence
288. Magma
• Open source (Squeak Smalltalk, MIT)
• Transparent Smalltalk object persistence
• “Free GemStone/S”
• Client-server model
• HA mode (multiple slaves ready for master)
• Transaction-based (commit/rollback)
• Works nicely with Seaside