Query Languages for Document Stores
- 1. © 2013 triAGENS GmbH | 2013-06-18
Query Languages
for Document Stores
2013-06-18
Jan Steemann
- 2.
me
I'm a software developer
working at triAGENS GmbH, CGN
on ArangoDB, a document store
- 4.
Documents
documents are self-contained,
aggregate data structures...
...consisting of named and typed attributes,
which can be nested / hierarchical
documents can be used to model complex
business objects
- 5.
Example order document
{
  "id": "abc10022",
  "date": "20130426",
  "customer": {
    "id": "c199023",
    "name": "acme corp."
  },
  "items": [ {
    "id": "p123",
    "quantity": 1,
    "price": 25.13
  } ]
}
- 6.
Document stores
document stores are databases
specialised in handling documents
they've been around for a while
got really popular with the NoSQL buzz
(CouchDB, MongoDB, ...)
- 8.
Saving programming language data
document stores allow saving a
programming language object as a whole
your programming language object
becomes a document in the database,
without the need for much transformation
compare this to saving data in a relational
database...
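A minimal sketch of that idea in JavaScript, assuming the store persists documents as JSON text (actual storage formats differ between products). The object mirrors the example order document; the variable names are illustrative only.

```javascript
// A plain JavaScript object, mirroring the example order document.
const order = {
  id: "abc10022",
  date: "20130426",
  customer: { id: "c199023", name: "acme corp." },
  items: [{ id: "p123", quantity: 1, price: 25.13 }]
};

// Serialising yields the JSON text that would be handed to the store.
const stored = JSON.stringify(order);

// Reading it back restores the same nested structure,
// without any object-relational mapping layer in between.
const loaded = JSON.parse(stored);

console.log(loaded.customer.name);  // nested attribute access
console.log(loaded.items[0].price); // array element access
```

The whole object graph travels as one unit, which is the contrast with the relational layout on the next slide.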
- 9.
Persistence the relational way
orders
id  date        customer
1   2013-04-20  c1
2   2013-04-21  c2
3   2013-04-21  c1
4   2013-04-22  c3

customers
id  name
c1  acme corp.
c2  sample.com
c3  abc co.

orderitems
order  item  price  quantity
1      1     23.25  1
- 10.
Benefits of document stores
no impedance mismatch,
no complex object-relational mapping,
no normalisation requirements
querying documents is often easier and
faster than querying highly normalised
relational data
- 11.
Schema-less
in document stores, there is no "table"-
schema as in the relational world
each document can have different attributes
there is no such thing as ALTER TABLE
that's why document stores are called
schema-less or schema-free
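As a rough illustration, an in-memory array can stand in for a collection. The documents and attribute names below are hypothetical; the point is only that documents in one collection need not share the same attributes.

```javascript
// Hypothetical in-memory "collection": an array of documents.
const users = [
  { id: 1, name: "alice", age: 41 },
  { id: 2, name: "bob", active: true },              // no "age" attribute
  { id: 3, name: "carol", address: { city: "CGN" } } // nested attribute only here
];

// Collect the attribute names actually in use across all documents.
const attributeNames = new Set(users.flatMap((doc) => Object.keys(doc)));

// Queries must tolerate missing attributes, since no schema guarantees them.
const withAge = users.filter((doc) => doc.age !== undefined);

console.log([...attributeNames]);
console.log(withAge.length);
```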
- 13.
Querying by document id is easy
every document store allows querying a
single document at a time
accessing documents by their unique ids is
almost always dead-simple
- 14.
Complex queries?
what if you want to run complex queries (e.g.
projections, filters, aggregations,
transformations, joins, ...)??
let's check the available options in some of
the popular document stores
- 15.
CouchDB: map-reduce
querying by anything other than the document
key / id requires writing a view
views are JavaScript functions that are
stored inside the database
views are populated by incremental map-
reduce
- 16.
map-reduce
the map function is applied on each document
(that changed)
map can filter out non-matching documents
or emit modified or unmodified versions of them
emitted documents can optionally be passed into
a reduce function
reduce is called with groups of similar
documents and can thus perform aggregation
- 17.
CouchDB map-reduce example
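// map: called for each new or changed document
map = function (doc) {
  var i, n = doc.orderItems.length;
  for (i = 0; i < n; ++i) {
    // emit one row per order item: key = item, value = 1
    emit(doc.orderItems[i], 1);
  }
};
// reduce: counts the rows emitted per key
reduce = function (keys, values, rereduce) {
  if (rereduce) {
    // values are partial counts from earlier reduce calls
    return sum(values);
  }
  return values.length;
};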
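To see what these two functions compute, they can be run outside the database with a small harness. This is a sketch only: emit() and sum() below are stand-ins for CouchDB's built-ins, the grouping loop only approximates a grouped view query, the documents are made up, and orderItems is assumed to hold plain item ids.

```javascript
// Stand-ins for CouchDB's built-in emit() and sum(); assumptions for this sketch.
const emitted = [];
const emit = (key, value) => emitted.push({ key, value });
const sum = (values) => values.reduce((a, b) => a + b, 0);

// The map and reduce functions from the slide, unchanged in behaviour.
const map = function (doc) {
  for (let i = 0; i < doc.orderItems.length; ++i) {
    emit(doc.orderItems[i], 1);
  }
};
const reduce = function (keys, values, rereduce) {
  return rereduce ? sum(values) : values.length;
};

// Hypothetical documents; orderItems holds plain item ids here.
const docs = [
  { orderItems: ["p123", "p456"] },
  { orderItems: ["p123"] }
];
docs.forEach(map);

// Group emitted rows by key, roughly what a grouped view query does.
const groups = new Map();
for (const { key, value } of emitted) {
  if (!groups.has(key)) groups.set(key, []);
  groups.get(key).push(value);
}

// First pass: reduce each group (rereduce = false; keys are not needed here).
const counts = {};
for (const [key, values] of groups) {
  counts[key] = reduce(null, values, false);
}

// Second pass: combine the partial counts (rereduce = true).
const total = reduce(null, Object.values(counts), true);

console.log(counts, total);
```

The first pass yields the number of orders per item, the second pass the overall number of emitted rows.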
- 18.
map-reduce
map-reduce is generic and powerful
provides a programming language
need to create views for everything that is
queried
access to a single "table" at a time (no
cross-"table" views)
a bit clumsy for ad-hoc exploratory queries
- 19.
MongoDB: find()
ad-hoc queries in MongoDB are much easier
can directly apply filters on collections,
making it easy to find specific documents:
mongo> db.orders.find({
  "customer": {
    "id": "c1",
    "name": "acme corp."
  }
});
- 20.
MongoDB: complex filters
can filter on any document attribute or
sub-attribute
indexes will automatically be used if present
nesting filters allows complex queries
quite flexible and powerful, but tends to be
hard to use and read for more complex
queries
- 21.
MongoDB: complex filtering
mongo> db.users.find({
  "$or": [
    {
      "active": true
    },
    {
      "age": {
        "$gte": 40
      }
    }
  ]
});
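The meaning of such a nested filter can be sketched with a tiny in-memory matcher. This is an illustration of the semantics only, not MongoDB's implementation: the matches() helper is hypothetical and supports just equality, "$gte" and "$or", and the sample users are made up.

```javascript
// Simplified matcher supporting only equality, "$gte" and "$or".
function matches(doc, filter) {
  return Object.entries(filter).every(([key, condition]) => {
    if (key === "$or") {
      // at least one of the sub-filters must match
      return condition.some((sub) => matches(doc, sub));
    }
    if (condition !== null && typeof condition === "object" && "$gte" in condition) {
      // a missing attribute compares as undefined >= n, which is false
      return doc[key] >= condition.$gte;
    }
    return doc[key] === condition;
  });
}

// Hypothetical sample data.
const users = [
  { name: "alice", active: true, age: 25 },
  { name: "bob", active: false, age: 52 },
  { name: "carol", active: false, age: 31 }
];

// Same filter as in the find() call: active users, or users aged 40 and above.
const filter = { $or: [{ active: true }, { age: { $gte: 40 } }] };
const found = users.filter((doc) => matches(doc, filter)).map((doc) => doc.name);

console.log(found);
```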
- 22.
MongoDB: more options
can also use JavaScript functions for filtering,
or JavaScript map-reduce
several aggregation functions are also
provided
neither option allows running cross-"table"
queries
- 24.
Query languages
a good query language should
allow writing both simple and complex
queries, without having to switch the
methodology
provide the required features for filtering,
aggregation, joining etc.
hide the database internals
- 25.
SQL
in the relational world, there is one accepted
general-purpose query language: SQL
it is quite well-known and mature:
35+ years of experience
many developers and established tools
around it
standardised (but mind the "dialects"!)
- 26.
SQL in document stores?
SQL is good at handling relational data
not good at handling multi-valued or
hierarchical attributes, which are common in
documents
(too) powerful: SQL provides features many
document stores intentionally lack (e.g. joins,
transactions)
SQL has not been adopted by document
stores yet
- 28.
XQuery?
XQuery is a query and programming
language
targeted mainly at processing XML data
can process hierarchical data
very powerful and extensible
W3C recommendation
- 29.
XQuery
XQuery has been adopted mostly in the area
of XML processing
today people want to use JSON, not XML
XQuery not available in popular document
stores
- 30.
ArangoDB Query Language (AQL)
ArangoDB provides AQL, a query language
made for JSON document processing
it allows running complex queries on
documents, including joins and aggregation
language syntax was inspired by XQuery and
provides similar concepts such as
FOR, LET, RETURN, ...
the language integrates JSON "naturally"
- 31.
AQL example
FOR order IN orders
  FILTER order.status == "processed"
  LET itemsValue = SUM((
    FOR item IN order.items
      FILTER item.status == "confirmed"
      RETURN item.price * item.quantity
  ))
  FILTER itemsValue >= 500
  RETURN {
    "items" : order.items,
    "itemsValue" : itemsValue,
    "itemsCount" : LENGTH(order.items)
  }
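For readers more at home in JavaScript, the same query logic can be sketched over a plain array. This is an approximation for illustration, not how ArangoDB executes AQL; the sample orders are hypothetical.

```javascript
// Hypothetical sample data standing in for the "orders" collection.
const orders = [
  { status: "processed", items: [
    { status: "confirmed", price: 300, quantity: 2 },
    { status: "cancelled", price: 900, quantity: 1 } ] },
  { status: "processed", items: [
    { status: "confirmed", price: 100, quantity: 1 } ] },
  { status: "open", items: [
    { status: "confirmed", price: 1000, quantity: 1 } ] }
];

const result = orders
  // FILTER order.status == "processed"
  .filter((order) => order.status === "processed")
  // LET itemsValue = SUM(( ... confirmed items only ... ))
  .map((order) => ({
    order,
    itemsValue: order.items
      .filter((item) => item.status === "confirmed")
      .reduce((total, item) => total + item.price * item.quantity, 0)
  }))
  // FILTER itemsValue >= 500
  .filter(({ itemsValue }) => itemsValue >= 500)
  // RETURN { ... }
  .map(({ order, itemsValue }) => ({
    items: order.items,
    itemsValue,
    itemsCount: order.items.length
  }));

console.log(result);
```

Only the first order survives both filters: its confirmed items are worth 600, while itemsCount still counts all of its items, as LENGTH(order.items) does.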
- 32.
AQL: some features
queries can combine data from multiple
"tables"
this allows joins using any document
attributes or sub-attributes
indexes will be used if present
- 33.
AQL: join example
FOR user IN users
  FILTER user.id == 1234
  RETURN {
    "user" : user,
    "posts" : (FOR post IN blogPosts
      FILTER post.userId == user.id &&
             post.date >= '20130613'
      RETURN post
    )
  }
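Conceptually, this join is a nested loop over two collections. A rough JavaScript equivalent over in-memory arrays, with made-up sample data, might look as follows; a database would normally use indexes instead of scanning.

```javascript
// Hypothetical sample data standing in for the two collections.
const users = [{ id: 1234, name: "jan" }, { id: 99, name: "other" }];
const blogPosts = [
  { userId: 1234, date: "20130615", title: "new" },
  { userId: 1234, date: "20130601", title: "old" },
  { userId: 99, date: "20130620", title: "foreign" }
];

const result = users
  // FILTER user.id == 1234
  .filter((user) => user.id === 1234)
  // RETURN { "user": ..., "posts": (sub-query over blogPosts) }
  .map((user) => ({
    user,
    // dates in YYYYMMDD form compare correctly as strings
    posts: blogPosts.filter(
      (post) => post.userId === user.id && post.date >= "20130613"
    )
  }));

console.log(JSON.stringify(result, null, 2));
```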
- 34.
AQL: additional features
AQL provides basic functionality to query
graphs, too
the language can be extended with user-
defined JavaScript functions
- 35.
JSONiq
JSONiq is a data processing and query
language for handling JSON data
it is based on XQuery, thus provides the same
FLWOR expressions: FOR, LET, WHERE,
ORDER, ...
JSON is integrated "naturally"
most of the XML handling is removed
- 36.
JSONiq: example
for $order in collection("orders")
where $order.customer.id eq "abc123"
return {
  customer : $order.customer,
  items : $order.items
}
- 37.
JSONiq: join example
for $post in collection("posts")
let $postId := $post.id
for $comment in collection("comments")
where $comment.postId eq $postId
group by $postId
order by count($comment) descending
return {
  id : $postId,
  comments : count($comment)
}
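The join, grouping and ordering in this query can be approximated in JavaScript over in-memory arrays. This is a sketch with hypothetical data, not a JSONiq implementation; note that posts without comments drop out, as in the query.

```javascript
// Hypothetical sample data standing in for the two collections.
const posts = [{ id: "a" }, { id: "b" }, { id: "c" }];
const comments = [
  { postId: "a" }, { postId: "b" }, { postId: "b" }, { postId: "x" }
];

// for $post ... for $comment ... where ... group by $postId
const counts = new Map();
for (const post of posts) {
  for (const comment of comments) {
    if (comment.postId === post.id) {
      counts.set(post.id, (counts.get(post.id) || 0) + 1);
    }
  }
}

// order by count($comment) descending, then build the result objects
const result = [...counts]
  .map(([id, n]) => ({ id, comments: n }))
  .sort((x, y) => y.comments - x.comments);

console.log(result);
```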
- 38.
JSONiq
JSONiq is a generic, database-agnostic
language
it can be extended with user-defined XQuery
functions
JSONiq is currently not implemented inside
any document database...
- 39.
JSONiq
...but it can be used via a service (at 28.io)
the service provides the JSONiq query
language and implements functionality not
provided by a specific database
such features are implemented client-side,
e.g. joins for MongoDB
- 41.
Summary
today's document stores provide different,
proprietary mechanisms for querying data
there is currently no standard query
mechanism for document stores as there is
in the relational world (SQL)
- 42.
Summary
you CAN use query languages in document
stores today, e.g. AQL and JSONiq
if you like the idea, give them a try, provide
feedback and contribute!