Over the past few years, the NoSQL movement has produced a variety of open-source document stores. Most of them focus on high availability and horizontal scalability, and are designed to run on commodity hardware. These products have gained considerable traction in the industry for storing large amounts of flexible data (mostly JSON). In the meantime, XQuery has evolved into a standardized, full-fledged programming language for XML with native support for complex queries, indexes, updates, full-text search, and scripting. Moreover, JSON has recently been added to the language as a first-class data type. As of today, XQuery is without doubt the most robust and productive technology for processing flexible data.
The aim of this talk is to showcase the benefits of integrating the Zorba XQuery Processor with MongoDB. We will introduce the 28msec platform, which seamlessly stores, indexes, and manages flexible data entirely in XQuery. The data itself is stored in MongoDB. The platform leverages MongoDB’s indexes, sharding, and consistency guarantees to scale out horizontally. The talk will conclude with a benchmark of the platform and a discussion of the prospects of the outlined approach.
5.
                              MongoDB, CouchBase    BaseX, eXist-db
Standardized Query Language   X                     ✔
Flexible Data
Modern Query Processing       X                     ✔
Typing                        X                     ✔
High Availability             ✔                     X
Scalability
Sharding                      ✔                     X
Available as a Service        ✔                     X
9. JSONiq
• Open Specification: jsoniq.org
• Extension of the mature XQuery for JSON
- Joins, Group-by, Filters, Search...
• Leverage the complete XQuery Family
- Scripting, Updates, Full-Text
• Standardized Query Language
- Run the same code across multiple JSON stores
13. The Goal
[Figure: systems plotted by scalability & performance against depth of functionality. In order of appearance: memcached, key/value stores, MongoDB, RDBMS, XML DB.]
14. The Goal
28msec - XQuery on top of MongoDB
[Figure: systems plotted by scalability & performance against depth of functionality. In order of appearance: memcached, key/value stores, MongoDB with 28msec alongside it, RDBMS, XML DB.]
15. Meet Zorba
• Open Source XQuery Processor
- Apache 2 License
- Contributors: Oracle, 28msec, FLWOR Foundation
• The Complete Family
- XQuery 3.0, Updates, Full-Text, Scripting, JSONiq
- XQuery Data Definition Facility
• Pluggable Store API
- Run Zorba on your own persistence layer
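To make the idea of a pluggable store concrete, here is a minimal Python sketch. It is not Zorba's actual Store API; the names `Store`, `InMemoryStore`, `insert_node`, `collection`, and `run_insert` are hypothetical and only illustrate how a query runtime could delegate persistence to an interchangeable backend.

```python
from abc import ABC, abstractmethod


class Store(ABC):
    """Hypothetical persistence interface the query runtime talks to."""

    @abstractmethod
    def insert_node(self, collection, node):
        """Persist one node in the named collection."""

    @abstractmethod
    def collection(self, name):
        """Return all nodes of the named collection in insertion order."""


class InMemoryStore(Store):
    """Simplest possible backend: keeps everything in a dict of lists."""

    def __init__(self):
        self._collections = {}

    def insert_node(self, collection, node):
        self._collections.setdefault(collection, []).append(node)

    def collection(self, name):
        # Return a copy so callers cannot mutate the stored list.
        return list(self._collections.get(name, []))


def run_insert(store, collection, nodes):
    """Stand-in for the runtime: it only sees the Store interface."""
    for node in nodes:
        store.insert_node(collection, node)
    return len(store.collection(collection))
```

Swapping `InMemoryStore` for a class that writes to a database would leave `run_insert` unchanged, which is the point of keeping the store behind an interface.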
17. Meet MongoDB
• Open Source JSON Document Store
- License AGPL 3.0
• Focus on scalability
- Replication across multiple availability zones
- Sharding
- Atomic updates on documents
• Available as a service
- MongoHQ, MongoLab
18. MongoDB Deployment Example
[Diagram: three shards (Shard1, Shard2, Shard3), each a replica set of MongoD processes; three config servers (C1, C2, C3), each a MongoD; two MongoS processes and two app servers.]
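The role of the MongoS processes in such a deployment is to route each request to the shard that owns the relevant data. The sketch below illustrates range-based routing on a shard key in a simplified form. It is not MongoDB's implementation; the chunk boundaries, the `CHUNKS` table, and the `route` function are made up for illustration.

```python
from bisect import bisect_right

# Hypothetical chunk table: (lower bound of the shard key, shard name).
# A chunk covers keys from its lower bound up to the next chunk's lower bound.
CHUNKS = [
    ("", "Shard1"),   # keys below "h"
    ("h", "Shard2"),  # keys from "h" up to, but excluding, "p"
    ("p", "Shard3"),  # keys from "p" upwards
]
LOWER_BOUNDS = [lower for lower, _ in CHUNKS]


def route(shard_key):
    """Return the shard whose chunk range contains shard_key."""
    index = bisect_right(LOWER_BOUNDS, shard_key) - 1
    return CHUNKS[index][1]
```

With this table, `route("baseball")` selects `Shard1` and `route("soccer")` selects `Shard3`. Adding a shard amounts to splitting one range and updating the table.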
22. Application Example
• Fetching sports news from XMLTeam.com
• Stored and indexed on MongoDB
• 1 million documents and counting
• Entirely built in XQuery from backend to frontend
• 1k LOC, 1 developer, 1 week of work
25. Index Declarations
declare %an:value-range index sports:by-datetime
on nodes db:collection(xs:QName('sports:docs'))
by ./sports-content/sports-metadata/@date-time;
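As a conceptual illustration of what a value-range index on a date-time key provides, the following Python sketch keeps index entries sorted so that a range probe becomes two binary searches. It is a toy model, not how Zorba or MongoDB implement indexes; `ValueRangeIndex`, `insert`, and `probe_range` are hypothetical names, and keys are assumed to be ISO 8601 strings in the same format so that string order matches chronological order.

```python
from bisect import bisect_left, bisect_right, insort


class ValueRangeIndex:
    """Toy value-range index: a sorted list of (key, doc_id) pairs."""

    def __init__(self):
        self._entries = []  # kept sorted by (key, doc_id)

    def insert(self, key, doc_id):
        insort(self._entries, (key, doc_id))

    def probe_range(self, low, high):
        """Return doc ids with low <= key <= high, in key order."""
        # (low,) sorts before every (low, doc_id) entry.
        start = bisect_left(self._entries, (low,))
        # (high, inf) sorts after every (high, doc_id) entry.
        end = bisect_right(self._entries, (high, float("inf")))
        return [doc_id for _, doc_id in self._entries[start:end]]
```

A query that filters documents by a date-time interval can then read only the matching slice of the index instead of scanning the whole collection.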
26. Index Declarations
declare index ...
1. Compile Query (Zorba: Compiler, Runtime)
2. createIndex(qname, ordpath, keys) (Zorba: Store API)
3. Create Index (MongoDB)
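The `ordpath` argument above is not explained on the slide. Assuming it refers to ORDPATH-style hierarchical node labels, the sketch below shows the basic idea: each child receives an odd ordinal appended to its parent's label, so comparing labels gives document order and a prefix test gives the ancestor relationship. The functions `label_nodes` and `is_ancestor` are hypothetical and do not reflect the platform's actual encoding.

```python
import xml.etree.ElementTree as ET


def label_nodes(root):
    """Assign ORDPATH-style labels: children get odd ordinals 1, 3, 5, ..."""
    labels = []

    def visit(element, label):
        labels.append((label, element.tag))
        for position, child in enumerate(element):
            visit(child, label + (2 * position + 1,))

    visit(root, (1,))
    return labels


def is_ancestor(ancestor_label, label):
    """A node is an ancestor of another if its label is a proper prefix."""
    return (
        len(ancestor_label) < len(label)
        and label[:len(ancestor_label)] == ancestor_label
    )
```

In the ORDPATH scheme, even ordinals are left unused at first so that nodes can later be inserted between siblings without relabeling existing ones.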
27. Insert Nodes
let $uri := 'http://xmlteam.com/...'
let $doc := http:get($uri)
return db:insert-nodes($sports:docs, $doc)
28. Insert Nodes
db:insert-nodes(...)
1. Process Query (Zorba: Compiler, Runtime)
2. insertNode(qname, xdm) (Zorba: Store API)
3. Insert BSON (MongoDB)
29. MongoDB Store Layer
• Direct XQuery to MongoDB mapping
- Collections
- Indexes
• Converts XDM to BSON
• Inherits MongoDB consistency model
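The slides do not specify how XDM nodes are laid out in BSON. As one possible illustration, the Python sketch below maps an XML element to a nested dictionary of the kind a BSON encoder could store. The field names (`name`, `attributes`, `text`, `children`) and the function `element_to_document` are assumptions made for this example, not the platform's actual mapping.

```python
import xml.etree.ElementTree as ET


def element_to_document(element):
    """Map an XML element to a nested dict that a BSON encoder could store.

    Simplification: tail text, comments, processing instructions, and
    namespace handling are ignored.
    """
    document = {"name": element.tag}
    if element.attrib:
        document["attributes"] = dict(element.attrib)
    text = (element.text or "").strip()
    if text:
        document["text"] = text
    children = [element_to_document(child) for child in element]
    if children:
        document["children"] = children
    return document
```

Because each XML document becomes a single stored document under such a mapping, an update to it would fall under MongoDB's per-document atomicity.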
30. Request Processing on 28msec
[Diagram: a request in nine numbered steps, from the HTTP client through the ELB (Availability Zone 1) to Sausalito (request handler, Zorba processor, store) and MongoDB (compiled code, data), and back.]
31. Scaling Out
[Chart: average response time in ms (axis 0 to 1000) against number of concurrent requests (10 to 150), comparing 2 app servers with 4 app servers.]
32. XQuery on Top of MongoDB
• Seamless Integration of XQuery with MongoDB
- XDM to BSON
- Collections and indexes mapping
- Atomicity per document
• 28msec
- XQuery Platform on top of MongoDB
- Deploy your XQuery apps in one click
- Scale up & down automatically
33. Take Away
• Two Drivers
- Flexible Data
- Scalability
• Two Champions
- XQuery for Flexible Data
- JSON Stores for Scalability
• Two Contributions
- JSONiq: The SQL of NoSQL
- XQuery Platform on top of MongoDB