This presentation covers the history, implementation details, and challenges of querying embedded documents in MongoDB, along with examples of how to create and use an extension built on top of the official MongoDB Scala Driver. The extension makes it possible to fully utilize Spray JSON: it provides bidirectional serialization of case classes to BSON, as well as a flexible DSL for MongoDB query operators, documents, and collections.
Spray Json and MongoDB Queries: Insights and Simple Tricks.
1. Spray Json and MongoDB Queries
Insights and Simple Tricks
2. Spray JSON
spray-json is a lightweight, clean and efficient JSON implementation in Scala.
It is currently maintained by the Akka team at Lightbend.
https://github.com/spray/spray-json
3. MongoDB / NoSQL / BSON
Introduction to MongoDB: https://docs.mongodb.com/manual/introduction/
MongoDB is an open-source document database that provides high performance, high
availability, and automatic scaling. MongoDB stores data records as BSON documents.
BSON is a binary representation of JSON documents, though it contains more data types
than JSON. https://docs.mongodb.com/manual/core/document
NoSQL Databases Explained: https://www.mongodb.com/nosql-explained
4. MongoDB Scala Driver
The MongoDB Scala Driver was introduced in September 2015.
http://rosslawley.co.uk/introducing-mongodb-scala-driver/
The driver requires BSON codecs to be defined in order to work with custom data types.
Support for case classes was added in March 2017.
http://mongodb.github.io/mongo-scala-driver/2.0/changelog/
https://github.com/mongodb/mongo-scala-driver/releases/tag/r2.0.0
5. Application components
Models, JSON formats for API and BSON codecs for MongoDB
[Diagram labels: INTERNAL API / WEB, EXTERNAL API / WEB, SERVICES, MODELS, { JSON } FORMATS, { BSON } CODECS, MongoDB API]
6. Coding example of application components
Models, JSON formats for API and BSON codecs for MongoDB
// MODEL
case class Test(id: Long, number: Int, comment: String)
// JSON PROTOCOL
trait TestJsonProtocol extends DefaultJsonProtocol {
  implicit def testJf = jsonFormat3(Test)
}
// BSON CODEC
class TestBsonCodec extends Codec[Test] {
  override def getEncoderClass: Class[Test] = classOf[Test]
  override def encode(writer: BsonWriter, value: Test, encoderContext: EncoderContext) = // ...
  override def decode(reader: BsonReader, decoderContext: DecoderContext): Test = // ...
}
val registry = fromRegistries(fromCodecs(new TestBsonCodec()), DEFAULT_CODEC_REGISTRY)
val database: MongoDatabase = MongoClient().getDatabase("mydb").withCodecRegistry(registry)
val collection: MongoCollection[Test] = database.getCollection("test-collection")
8. Implementation of BSON codec decode
class TestBsonCodec extends Codec[Test] {
  override def decode(reader: BsonReader, decoderContext: DecoderContext): Test = {
    reader.readStartDocument()
    reader.readObjectId("_id")
    val number = reader.readInt32()
    val id = reader.readInt64()
    val comment = reader.readString()
    reader.readEndDocument()
    Test(id, number, comment)
  }
}
9. Use the test codec to write and read test record
collection.insertOne(Test(123L, 123, "some string data"))
.toFuture().onComplete(res => println(s"INSERT: $res"))
INSERT: Success(The operation completed successfully)
collection.find(Filters.eq("id", 123L))
.toFuture().onComplete(res => println(s"FIND: $res"))
FIND: Failure(org.bson.BsonInvalidOperationException: readInt32 can only be
called when CurrentBSONType is INT32, not when CurrentBSONType is INT64.)
10. Implementation of BSON codec decode
read fields by name
class TestScalaCodec extends Codec[Test] {
  override def decode(reader: BsonReader, decoderContext: DecoderContext): Test = {
    reader.readStartDocument()
    reader.readObjectId("_id")
    val number = reader.readInt32("number")
    val id = reader.readInt64("id")
    val comment = reader.readString("comment")
    reader.readEndDocument()
    Test(id, number, comment)
  }
}
11. Use the test codec to write and read test record
collection.insertOne(Test(123L, 123, "some string data"))
.toFuture().onComplete(res => println(s"INSERT: $res"))
INSERT: Success(The operation completed successfully)
collection.find(Filters.eq("id", 123L))
.toFuture().onComplete(res => println(s"FIND: $res"))
FIND: Failure(org.bson.BsonSerializationException: Expected element name to be
'number', not 'id'.)
12. MongoDB and BSON
BSON is a binary format in which zero or more ordered key/value pairs are stored as a single
entity.
http://bsonspec.org/spec.html
MongoDB represents JSON documents in binary-encoded format called BSON behind the
scenes. BSON extends the JSON model to provide additional data types, ordered fields, and
to be efficient for encoding and decoding within different languages.
https://www.mongodb.com/json-and-bson
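Because BSON fields are ordered and the stored order need not match the case class, a decoder that reads fields in a fixed sequence is fragile. One possible workaround, sketched here, is to loop over element names until the end of the document. The class name OrderAgnosticTestCodec, the default values used for missing fields, and the decision to skip unknown fields (including _id) are assumptions made for illustration; they are not taken from the driver or from the presentation.

```scala
import org.bson._
import org.bson.codecs.{Codec, DecoderContext, EncoderContext}

case class Test(id: Long, number: Int, comment: String)

// Hypothetical codec: reads fields by name in whatever order they are stored.
class OrderAgnosticTestCodec extends Codec[Test] {
  override def getEncoderClass: Class[Test] = classOf[Test]

  override def encode(writer: BsonWriter, value: Test, ctx: EncoderContext): Unit = {
    writer.writeStartDocument()
    writer.writeInt64("id", value.id)
    writer.writeInt32("number", value.number)
    writer.writeString("comment", value.comment)
    writer.writeEndDocument()
  }

  override def decode(reader: BsonReader, ctx: DecoderContext): Test = {
    // Assumption: missing fields silently keep these defaults.
    var id = 0L
    var number = 0
    var comment = ""
    reader.readStartDocument()
    while (reader.readBsonType() != BsonType.END_OF_DOCUMENT) {
      reader.readName() match {
        case "id"      => id = reader.readInt64()
        case "number"  => number = reader.readInt32()
        case "comment" => comment = reader.readString()
        case _         => reader.skipValue() // e.g. the generated _id
      }
    }
    reader.readEndDocument()
    Test(id, number, comment)
  }
}
```

The loop trades strictness for robustness: it accepts any field order, but it no longer detects a missing field unless extra checks are added.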
13. Motivation: reuse existing JSON formats in MongoDB
[Diagram labels: INTERNAL API / WEB, EXTERNAL API / WEB, SERVICES, MODELS, { JSON / BSON } FORMATS, { BSON } CODECS, MongoDB API]
14. MongoDB Extended JSON
http://mongodb.github.io/mongo-scala-driver/2.6/bson/extended-json/
The Scala driver supports reading and writing BSON documents represented as MongoDB
Extended JSON.
Furthermore, the Document provides two sets of convenient methods for this purpose:
● Document.toJson(): a set of overloaded methods that convert a Document instance to a
JSON string
● Document(json): a set of overloaded static factory methods that convert a JSON string to a
Document instance
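To make the two conversions concrete, here is a small sketch that uses org.bson.BsonDocument from the Java BSON library underneath the Scala driver, rather than the Scala Document wrapper, so that it relies only on calls that are stable across driver versions. It assumes a BSON library version that provides JsonWriterSettings.builder() and JsonMode.EXTENDED (3.5 or later); the object name ExtendedJsonSketch is hypothetical.

```scala
import org.bson.BsonDocument
import org.bson.json.{JsonMode, JsonWriterSettings}

object ExtendedJsonSketch {
  // Extended JSON carries the BSON type inside a wrapper object such as $numberLong.
  val parsed: BsonDocument =
    BsonDocument.parse("""{ "id": { "$numberLong": "123" }, "number": 123 }""")

  // Request Extended JSON output explicitly, because the default output mode
  // of toJson() is not the same in every driver version.
  private val settings: JsonWriterSettings =
    JsonWriterSettings.builder().outputMode(JsonMode.EXTENDED).build()

  val rendered: String = parsed.toJson(settings)
}
```

Here "id" is parsed as a 64-bit integer because of the wrapper, while the bare 123 in "number" becomes a 32-bit integer.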
15. Motivation: reuse existing JSON formats in MongoDB
[Diagram labels: INTERNAL API / WEB, EXTERNAL API / WEB, SERVICES, MODELS, { JSON / BSON } FORMATS, { BSON } CODECS, MongoDB API]
16. Data transformation pipeline
[T] (custom data types)
-> { JSON } (related JSON representation)
-> Document(JSON) (Document from JSON)
-> BSON (binary data in MongoDB)
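The pipeline can be sketched end to end with spray-json and the Java BSON parser. The object name PipelineSketch and the helper toBson are hypothetical, and BsonDocument.parse stands in for the Document(json) factory of the Scala driver. The sketch also illustrates why plain JSON numbers are a problem for Long fields: the parser picks the BSON integer type from the magnitude of the number, so a small Long ends up as INT32.

```scala
import org.bson.{BsonDocument, BsonType}
import spray.json._

case class Test(id: Long, number: Int, comment: String)

object PipelineSketch extends DefaultJsonProtocol {
  implicit val testFormat: RootJsonFormat[Test] = jsonFormat3(Test)

  // [T] -> JSON (spray-json) -> BSON document (parsed from the JSON text)
  def toBson[T: JsonWriter](value: T): BsonDocument =
    BsonDocument.parse(value.toJson.compactPrint)

  // The Long field "id" holds a small value, so it is parsed as INT32.
  val smallId: BsonType = toBson(Test(123L, 123, "x")).get("id").getBsonType

  // Only values outside the Int range are parsed as INT64.
  val largeId: BsonType = toBson(Test(2147483648L, 123, "x")).get("id").getBsonType
}
```

A decoder that always calls readInt64 for "id" would therefore fail on the first document, which is one motivation for writing Long values with an explicit type wrapper.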
21. Override default LongJsonFormat in JsonProtocol
https://github.com/spray/spray-json/blob/v1.3.5/src/main/scala/spray/json/BasicFormats.scala#L33-L39
package spray.json
/**
 * Provides the JsonFormats for the most important Scala types.
 */
trait BasicFormats {
  implicit object LongJsonFormat extends JsonFormat[Long] {
    def write(x: Long) = JsNumber(x)
    def read(value: JsValue) = value match {
      case JsNumber(x) => x.longValue
      case x => deserializationError("Expected Long as JsNumber, but got " + x)
    }
  }
  // …
}
Error: object LongJsonFormat cannot override final member
22. Hide LongJsonFormat in JsonProtocol
Liskov substitution principle
Subtype Requirement: Let φ(x) be a property provable about objects x of type T. Then
φ(y) should be true for objects y of type S where S is a subtype of T.
https://en.wikipedia.org/wiki/Liskov_substitution_principle
Error: overriding object LongJsonFormat in trait BasicFormats;
value LongJsonFormat has weaker access privileges; it should be public
23. Hide LongJsonFormat from import
Don’t extend DefaultJsonProtocol
trait TestJsonProtocol {
  import DefaultJsonProtocol.{ LongJsonFormat => _, _ }
  implicit val LongJsonFormat: JsonFormat[Long] = new JsonFormat[Long] {
    def read(jsValue: JsValue): Long = jsValue match {
      case JsObject(fields) => fields("$numberLong") match {
        case JsString(v) => v.toLong
        case _ => deserializationError("Long expected")
      }
      case JsNumber(v) => v.toLong
      case _ => deserializationError("Long expected")
    }
    override def write(obj: Long): JsValue = JsObject("$numberLong" -> JsString(obj.toString))
  }
  implicit val testJf = jsonFormat3(Test)
}
27. LongJsonFormat is available in both JSON protocols
import DefaultJsonProtocol._
import TestJsonProtocol._
collection.find(Document(s"""{ "id": ${2147483648L.toJson} }""")).toFuture()
.map { bson => println(s"FIND BSON: $bson"); bson.map(_.toJson()) }
.map { json => println(s"FIND JSON: $json"); json.map(_.parseJson.convertTo[Test]) }
.map { obj => println(s"FIND OBJ: $obj") }
Error: Cannot find JsonWriter or JsonFormat type class for Long
collection.find(Document(s"""{ "id": ${2147483648L.toJson} }""")).toFuture()
Error: not enough arguments for method toJson: (implicit writer:
spray.json.JsonWriter[Long])spray.json.JsValue. Unspecified value parameter writer.
collection.find(Document(s"""{ "id": ${2147483648L.toJson} }""")).toFuture()
28. Hide LongJsonFormat from import
Both queries by long id and int number work well
import DefaultJsonProtocol.{ LongJsonFormat => _, _ }
import TestJsonProtocol._
collection.find(Document(s"""{ "id": ${2147483648L.toJson} }""")).toFuture()
.map { bson => bson.map(_.toJson().parseJson.convertTo[Test]) }
.map { obj => println(s"FIND BY ID (long): $obj") }
// will print
FIND BY ID (long): List(Test(2147483648,123,some string data))
collection.find(Document(s"""{ "number": ${123.toJson} }""")).toFuture()
.map { bson => bson.map(_.toJson().parseJson.convertTo[Test]) }
.map { obj => println(s"FIND BY NUMBER (int): $obj") }
// will print
FIND BY NUMBER (int): List(Test(2147483648,123,some string data))
29. GreenLeafJsonProtocol
trait GreenLeafJsonProtocol
  extends StandardFormats
  with CollectionFormats
  with ProductFormats
  with AdditionalFormats {
  implicit val IntJsonFormat: JsonFormat[Int] = DefaultJsonProtocol.IntJsonFormat
  implicit val LongJsonFormat: JsonFormat[Long] = DefaultJsonProtocol.LongJsonFormat
  // …
}
object GreenLeafJsonProtocol extends GreenLeafJsonProtocol
The new GreenLeafJsonProtocol is based on DefaultJsonProtocol from Spray JSON. It allows
predefined JsonFormats to be overridden, which makes custom serialization to BSON possible.
30. GreenLeafBsonProtocol
Override LongJsonFormat to serialize Long to BSON
trait GreenLeafBsonProtocol extends GreenLeafJsonProtocol {
  override implicit val LongJsonFormat: JsonFormat[Long] = new JsonFormat[Long] {
    def read(jsValue: JsValue): Long = jsValue match {
      case JsObject(fields) => fields("$numberLong") match {
        case JsString(v) => v.toLong
        case _ => deserializationError("Long expected")
      }
      case JsNumber(v) => v.toLong
      case _ => deserializationError("Long expected")
    }
    override def write(obj: Long): JsValue = JsObject("$numberLong" -> JsString(obj.toString))
  }
}
object GreenLeafBsonProtocol extends GreenLeafBsonProtocol
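As a variation on the format above, the following sketch replaces the fields("$numberLong") lookup, which throws NoSuchElementException when the key is absent, with fields.get, so that malformed input surfaces as a spray-json DeserializationException. The object name ExtendedJsonLongFormat is made up for this example and is not part of the extension.

```scala
import scala.util.Try
import spray.json._

// Hypothetical standalone format: Long <-> {"$numberLong": "<digits>"}
object ExtendedJsonLongFormat extends JsonFormat[Long] {
  def write(value: Long): JsValue =
    JsObject("$numberLong" -> JsString(value.toString))

  def read(js: JsValue): Long = js match {
    case JsObject(fields) =>
      fields.get("$numberLong") match {
        case Some(JsString(digits)) =>
          // Non-numeric text is reported as a deserialization error as well.
          Try(digits.toLong).getOrElse(
            deserializationError("Expected digits in $numberLong, but got " + digits))
        case other =>
          deserializationError("Expected a $numberLong string, but got " + other)
      }
    // Plain JSON numbers are still accepted, as in the format above.
    case JsNumber(n) => n.toLong
    case other => deserializationError("Expected Long, but got " + other)
  }
}
```

Whether this stricter behaviour is desirable depends on how the surrounding code handles errors; the original lookup is shorter and sufficient when the input always comes from MongoDB.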
31. Define JSON and BSON protocols for Test entity
// MODEL
case class Test(id: Long, number: Int, notes: String)
// JSON
trait TestJsonProtocol extends GreenLeafJsonProtocol {
  implicit def testJf: RootJsonFormat[Test] = jsonFormat3(Test)
}
object TestJsonProtocol extends TestJsonProtocol
// BSON
object TestBsonProtocol extends TestJsonProtocol with GreenLeafBsonProtocol
32. Verify Test JSON protocol
import TestJsonProtocol._
val obj = Test(0xC0FFEE, 12648430, "Coffee")
println(obj.toJson.prettyPrint)
// will print
{
"id": 12648430,
"notes": "Coffee",
"number": 12648430
}
33. Verify Test BSON protocol
import TestBsonProtocol._
val obj = Test(0xC0FFEE, 12648430, "Coffee")
println(obj.toJson.prettyPrint)
// will print
{
"id": {
"$numberLong": "12648430"
},
"notes": "Coffee",
"number": 12648430
}
34. Use Test BSON protocol to write and to read BSON
// import DefaultJsonProtocol.{ LongJsonFormat => _, _ }
// import TestJsonProtocol._
import TestBsonProtocol._
collection.find(Document(s"""{ "id": ${2147483648L.toJson} }""")).toFuture()
  .map { bson => bson.map(_.toJson().parseJson.convertTo[Test]) }
  .map { obj => println(s"FIND BY ID (long): $obj") }
// will print
FIND BY ID (long): List(Test(2147483648,123,some string data))
collection.find(Document(s"""{ "number": ${123.toJson} }""")).toFuture()
  .map { bson => bson.map(_.toJson().parseJson.convertTo[Test]) }
  .map { obj => println(s"FIND BY NUMBER (int): $obj") }
// will print
FIND BY NUMBER (int): List(Test(2147483648,123,some string data))
35. Support for additional types
GreenLeaf JSON and BSON protocols provide formats for basic types such as Int and Long.
However, they also support various other common types such as Enumeration and
ZonedDateTime.
36. Scala Enumeration
Defines a set of values specific to the enumeration. Typically these values enumerate
all possible forms something can take and provide a lightweight alternative to case classes.
https://www.scala-lang.org/api/current/scala/Enumeration.html
// Define a new enumeration with a type alias and work with the full set of enumerated values
object WeekDay extends Enumeration {
  type WeekDay = Value
  val Mon, Tue, Wed, Thu, Fri, Sat, Sun = Value
}
WeekDay.values.foreach(x => println(s"${x.id}\t$x"))
// output:
// 0 Mon
// 1 Tue
// 2 Wed
// 3 Thu
// 4 Fri
// 5 Sat
// 6 Sun
37. Support scala Enumeration in JSON/BSON protocols
trait GreenLeafJsonProtocol {
  def enumToJsonFormatAsString(e: Enumeration): JsonFormat[e.Value] = new JsonFormat[e.Value] {
    def write(v: e.Value): JsValue = JsString(v.toString)
    def read(value: JsValue): e.Value = value match {
      case JsString(v) => e.withName(v)
      case x => deserializationError(s"Expected enum, but got $x")
    }
  }
  def enumToJsonFormatAsInt(e: Enumeration): JsonFormat[e.Value] = new JsonFormat[e.Value] {
    def write(v: e.Value): JsValue = JsNumber(v.id)
    def read(value: JsValue): e.Value = value match {
      case JsNumber(v) => e.apply(v.intValue())
      case x => deserializationError(s"Expected enum, but got $x")
    }
  }
}
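Enumeration.withName throws NoSuchElementException for an unknown name, which bypasses spray-json's own error type. A hedged alternative, assuming that a DeserializationException is the preferred way to report bad input, looks the value up explicitly. EnumFormats.asString is a hypothetical helper name, and WeekDay repeats the enumeration from the Scala Enumeration slide so that the block is self-contained.

```scala
import spray.json._

object WeekDay extends Enumeration {
  type WeekDay = Value
  val Mon, Tue, Wed, Thu, Fri, Sat, Sun = Value
}

object EnumFormats {
  // Serializes an enumeration value by name; unknown names become
  // DeserializationException instead of NoSuchElementException.
  def asString(e: Enumeration): JsonFormat[e.Value] = new JsonFormat[e.Value] {
    def write(v: e.Value): JsValue = JsString(v.toString)
    def read(js: JsValue): e.Value = js match {
      case JsString(name) =>
        e.values.find(_.toString == name)
          .getOrElse(deserializationError(s"Unknown enum name '$name'"))
      case other =>
        deserializationError(s"Expected enum name as JsString, but got $other")
    }
  }
}
```

A format for a concrete enumeration is then obtained with EnumFormats.asString(WeekDay) and can be declared as an implicit val inside a protocol trait.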
38. Verify scala Enumeration JSON/BSON formats
// MODEL
object StockSymbol extends Enumeration {
  type StockSymbol = Value
  val AAPL, GOOG, MSFT, AMZN, FB, BABA, JNJ, JPM = Value
}
import StockSymbol._ // brings the StockSymbol type alias into scope
case class Quote(symbol: StockSymbol, open: BigDecimal, high: BigDecimal, low: BigDecimal)
// JSON
trait StockJsonProtocol extends GreenLeafJsonProtocol {
  implicit val stockJf: JsonFormat[StockSymbol] = enumToJsonFormatAsString(StockSymbol)
  implicit val quoteJf: RootJsonFormat[Quote] = jsonFormat4(Quote)
}
object StockJsonProtocol extends StockJsonProtocol
// BSON
object StockBsonProtocol extends StockJsonProtocol with GreenLeafBsonProtocol
40. ObjectId type
ObjectIds are small, likely unique, fast to generate and ordered.
ObjectId values consist of 12 bytes, where the first four bytes are a timestamp that
reflects the ObjectId’s creation.
In MongoDB, each document stored in a collection requires a unique _id field that acts as a
primary key. If an inserted document omits the _id field, the MongoDB driver automatically
generates an ObjectId for the _id field.
https://docs.mongodb.com/manual/reference/bson-types/#objectid
41. Add ObjectId JSON format
trait GreenLeafJsonProtocol {
  implicit val ObjectIdJsonFormat: JsonFormat[ObjectId] = new JsonFormat[ObjectId] {
    def write(obj: ObjectId): JsValue = JsString(obj.toString)
    def read(jsValue: JsValue): ObjectId = jsValue match {
      case JsString(value) => new ObjectId(value)
      case x => deserializationError(s"Expected ObjectId, but got $x")
    }
  }
}
42. Add ObjectId BSON format
https://docs.mongodb.com/manual/reference/mongodb-extended-json/#oid
trait GreenLeafBsonProtocol {
  override implicit val ObjectIdJsonFormat: JsonFormat[ObjectId] = new JsonFormat[ObjectId] {
    def write(obj: ObjectId): JsValue = JsObject("$oid" -> JsString(obj.toString))
    def read(jsValue: JsValue): ObjectId = jsValue match {
      case JsObject(fields) => fields("$oid") match {
        case JsString(oid) => new ObjectId(oid)
        case x => deserializationError("Expected ObjectId, but got " + x)
      }
      case x => deserializationError("Expected ObjectId, but got " + x)
    }
  }
}
43. Sorting on ObjectId field
MongoDB clients should add an _id field with a unique ObjectId. Using ObjectIds for the _id
field provides the following additional benefits:
● in the mongo shell, users can access the creation time of the ObjectId by utilizing the
ObjectId.getTimestamp() method.
● sorting on an _id field that stores ObjectId values is roughly equivalent to sorting by
creation time.
https://docs.mongodb.com/manual/reference/bson-types/#objectid
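The same two properties can be checked from Scala with the org.bson.types.ObjectId class that ships with the driver. This is only a sketch: the object name ObjectIdSketch is hypothetical, and the two hexadecimal ids are sample values whose first four bytes are 0x5c8863e0.

```scala
import java.time.Instant
import org.bson.types.ObjectId

object ObjectIdSketch {
  val first = new ObjectId("5c8863e0d639e92a5f953273")
  val second = new ObjectId("5c8863e0d639e92a5f953274")

  // getTimestamp returns the first four bytes as seconds since the Unix epoch.
  val createdAt: Instant = Instant.ofEpochSecond(first.getTimestamp.toLong)

  // ObjectId is Comparable. Both ids share the same timestamp here, so the
  // ordering is decided by the remaining bytes, not by creation time.
  val firstSortsBeforeSecond: Boolean = first.compareTo(second) < 0
}
```

The second value shows why sorting by _id is only roughly equivalent to sorting by creation time: within one second the order comes from the non-timestamp bytes.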
44. Define model and JSON/BSON protocols to verify ObjectId formats
// MODEL
case class Message(id: ObjectId = new ObjectId, text: String)
// JSON
trait MessageJsonProtocol extends GreenLeafJsonProtocol {
  implicit val messageJf: RootJsonFormat[Message] = jsonFormat2(Message)
}
object MessageJsonProtocol extends MessageJsonProtocol
// BSON
object MessageBsonProtocol extends MessageJsonProtocol with GreenLeafBsonProtocol {
  override implicit val messageJf: RootJsonFormat[Message] = jsonFormat(Message, "_id", "text")
}
46. Find test records and sort them by _id field
import MessageBsonProtocol._
collection.find(/* all messages */).sort(Document("""{"_id": 1}""")).toFuture()
  .map { bson => bson.map(_.toJson().parseJson.convertTo[Message]) }
  .map { messages => println(messages.mkString("\n")) }
// output:
Message(5c8863e0d639e92a5f953273,A)
Message(5c8863e0d639e92a5f953274,B)
Message(5c8863e0d639e92a5f953275,C)
Message(5c8863e0d639e92a5f953276,D)
Message(5c8863e0d639e92a5f953277,E)
…
Message(5c8863e0d639e92a5f953288,V)
Message(5c8863e0d639e92a5f953289,W)
Message(5c8863e0d639e92a5f95328a,X)
Message(5c8863e0d639e92a5f95328b,Y)
Message(5c8863e0d639e92a5f95328c,Z)
47. Important note about sorting on ObjectId field
While ObjectId values should increase over time, they are not necessarily monotonic for the
following reasons:
● They only contain one second of temporal resolution, so ObjectId values created
within the same second do not have a guaranteed ordering
● They get generated by clients, which may have differing system clocks
https://docs.mongodb.com/manual/reference/bson-types/#objectid
48. UUID JSON format
By default a UUID is serialized as a plain string; this format can be overridden.
trait GreenLeafJsonProtocol {
  implicit val UuidAsStrJsonFormat: JsonFormat[UUID] = new JsonFormat[UUID] {
    def write(v: UUID): JsValue = JsString(v.toString)
    def read(value: JsValue): UUID = value match {
      case JsString(v) => UUID.fromString(v)
      case x => deserializationError(s"Expected UUID, but got $x")
    }
  }
}
49. ZonedDateTime
ZonedDateTime is an immutable representation of a date-time with a time-zone.
This class stores all date and time fields to a precision of nanoseconds and a time-zone, with a
zone offset used to handle ambiguous local date-times.
https://docs.oracle.com/javase/8/docs/api/java/time/ZonedDateTime.html
50. ZonedDateTime JSON format
trait GreenLeafJsonProtocol {
  implicit val ZdtJsonFormat: JsonFormat[ZonedDateTime] = new JsonFormat[ZonedDateTime] {
    def write(obj: ZonedDateTime): JsValue = JsString(obj.format(DateTimePattern))
    def read(jsValue: JsValue): ZonedDateTime = jsValue match {
      // 1970-01-01T01:02:03+04:00
      case JsString(zdt) if zdt.length >= 20 => parseDateTimeIso(zdt)
      // 1970-01-01T00:00:00
      case JsString(zdt) if zdt.length >= 19 && zdt.contains('T') => parseDateTimeIso(zdt)
      // 1970-01-01 00:00:00
      case JsString(zdt) if zdt.length == 19 => parseDateTime(zdt)
      case JsString(zdt) => parseDate(zdt)
      case x => deserializationError(s"Expected ZonedDateTime, but got $x")
    }
  }
}
51. BSON date type
BSON Date is a 64-bit integer that represents the number of milliseconds since the Unix
epoch.
This results in a representable date range of about 290 million years into the past and future.
The official BSON specification refers to the BSON Date type as the UTC datetime.
BSON Date type is signed. Negative values represent dates before 1970.
https://docs.mongodb.com/manual/reference/bson-types/#date
52. ZonedDateTime BSON format
trait GreenLeafBsonProtocol {
  override implicit val ZdtJsonFormat: JsonFormat[ZonedDateTime] = new JsonFormat[ZonedDateTime] {
    def write(obj: ZonedDateTime): JsValue = {
      JsObject("$date" -> JsNumber(obj.toInstant.toEpochMilli))
    }
    def read(jsValue: JsValue): ZonedDateTime = jsValue match {
      case JsObject(fields) => fields("$date") match {
        case JsNumber(v) => Instant.ofEpochMilli(v.toLong).atZone(ZoneOffset.UTC)
        case x => deserializationError("Expected ZonedDateTime, but got " + x)
      }
      case x => deserializationError("Expected ZonedDateTime, but got " + x)
    }
  }
}
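One consequence of storing epoch milliseconds is that the original zone offset is not preserved: a value read back from BSON denotes the same instant, but is always expressed in UTC. The following sketch uses only java.time to show this; the object name BsonDateSketch and the sample timestamps are illustrative assumptions.

```scala
import java.time.{Instant, ZoneOffset, ZonedDateTime}

object BsonDateSketch {
  val original: ZonedDateTime = ZonedDateTime.parse("2019-01-02T03:00:00+03:00")

  // What the BSON format writes: milliseconds since the Unix epoch.
  val millis: Long = original.toInstant.toEpochMilli

  // What the BSON format reads back: the same instant, expressed in UTC.
  val restored: ZonedDateTime = Instant.ofEpochMilli(millis).atZone(ZoneOffset.UTC)

  val sameInstant: Boolean = restored.isEqual(original) // compares instants only
  val sameValue: Boolean = restored == original         // also compares the offset

  // BSON dates are signed, so instants before 1970 map to negative values.
  val beforeEpoch: Long = ZonedDateTime.parse("1969-12-31T23:59:59Z").toInstant.toEpochMilli
}
```

Code that compares round-tripped values should therefore use isEqual (or compare instants) rather than equals.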
53. Verify ZonedDateTime JSON/BSON formats
// MODEL
case class LogMessage(text: String, timestamp: ZonedDateTime)
// JSON
trait LogMessageJsonProtocol extends GreenLeafJsonProtocol {
  implicit def logMessageJf = jsonFormat2(LogMessage)
}
object LogMessageJsonProtocol extends LogMessageJsonProtocol
// BSON
object LogMessageBsonProtocol extends LogMessageJsonProtocol with GreenLeafBsonProtocol
56. MongoDB Extended JSON and Date format
In Strict mode, <date> is an ISO-8601 date format with a mandatory time zone field following
the template YYYY-MM-DDTHH:mm:ss.mmm<+/-Offset>.
The MongoDB JSON parser currently DOESN’T support loading ISO-8601 strings representing
dates prior to the Unix epoch.
https://docs.mongodb.com/manual/reference/mongodb-extended-json/#date
58. BigDecimal type
BigDecimal represents decimal floating-point numbers of an arbitrary precision.
By default, the precision approximately matches that of IEEE 128-bit floating point numbers
(34 decimal digits, HALF_EVEN rounding mode).
https://www.scala-lang.org/api/current/scala/math/BigDecimal.html
59. Verify BigDecimal BSON format
// MODEL
case class MathematicalConstant(value: BigDecimal, symbol: String, description: String)
// JSON
trait MathematicalConstantJsonProtocol extends GreenLeafJsonProtocol {
  implicit def mathematicalConstantJf = jsonFormat3(MathematicalConstant)
}
object MathematicalConstantJsonProtocol extends MathematicalConstantJsonProtocol
// BSON
object MathematicalConstantBsonProtocol extends MathematicalConstantJsonProtocol with GreenLeafBsonProtocol
import MathematicalConstantBsonProtocol._
val pi = MathematicalConstant(BigDecimal("3.141592653589793238462643383279"), "π", "Archimedes' constant")
val e = MathematicalConstant(BigDecimal("2.718281828459045235360287471352"), "e", "Euler number")
collection.insertOne(Document(pi.toJson.compactPrint)).toFuture().onComplete(x => println(x))
collection.insertOne(Document(e.toJson.compactPrint)).toFuture().onComplete(x => println(x))
// output:
Success(The operation completed successfully)
Success(The operation completed successfully)
61. MongoDB Extended JSON: NumberDecimal
The mongo shell treats all numbers as 64-bit floating-point double values by default.
The mongo shell provides the NumberDecimal() constructor to explicitly specify 128-bit decimal-based
floating-point values capable of emulating decimal rounding with exact precision.
This functionality is intended for applications that handle monetary data, such as financial, tax, and
scientific computations.
The decimal BSON type uses the IEEE 754 decimal128 floating-point numbering format which supports 34
decimal digits (i.e. significant digits) and an exponent range of −6143 to +6144.
https://docs.mongodb.com/manual/core/shell-types/#numberdecimal
https://docs.mongodb.com/manual/tutorial/model-monetary-data/#using-the-decimal-bson-type
https://docs.mongodb.com/manual/reference/mongodb-extended-json/#numberdecimal
62. BigDecimal BSON format
trait GreenLeafBsonProtocol {
  override implicit val BigDecimalJsonFormat: JsonFormat[BigDecimal] = new JsonFormat[BigDecimal] {
    override def read(jsValue: JsValue): BigDecimal = jsValue match {
      case JsObject(fields) => fields("$numberDecimal") match {
        case JsString(v) => BigDecimal(v)
        case x => deserializationError("Expected BigDecimal/NumberDecimal, but got " + x)
      }
      case x => deserializationError("Expected BigDecimal/NumberDecimal, but got " + x)
    }
    override def write(obj: BigDecimal): JsValue = {
      JsObject("$numberDecimal" -> JsString(obj.toString()))
    }
  }
}
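The reason for the $numberDecimal wrapper can be illustrated with org.bson.types.Decimal128, the Java class behind the BSON decimal type. In this sketch (object name DecimalSketch is hypothetical) a 31-digit constant survives a round trip through Decimal128, while a round trip through Double does not.

```scala
import org.bson.types.Decimal128

object DecimalSketch {
  val pi: BigDecimal = BigDecimal("3.141592653589793238462643383279")

  // decimal128 keeps up to 34 significant digits, so all 31 digits fit.
  val asDecimal128: Decimal128 = new Decimal128(pi.bigDecimal)
  val restored: BigDecimal = BigDecimal(asDecimal128.bigDecimalValue())

  // A 64-bit double cannot hold that many significant digits.
  val viaDouble: BigDecimal = BigDecimal(pi.toDouble)
}
```

Values with more than 34 significant digits are outside the range of this sketch and would need rounding before conversion.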
64. Order of fields in JSON, BSON and MongoDB documents
JSON is built on two structures: a collection of name/value pairs and an ordered list of values.
An object is an unordered set of name/value pairs.
https://www.json.org/
BSON is a binary format in which zero or more ordered key/value pairs are stored as a single entity.
http://bsonspec.org/spec.html
MongoDB preserves the order of the document fields following write operations except for the following cases:
● The _id field is always the first field in the document.
● Updates that include renaming of field names may result in the reordering of fields in the document.
https://docs.mongodb.com/manual/core/document/#embedded-documents
65. Example of different order of JSON fields
case class Test(i: Int, l: Long)
object TestJsonProtocol extends GreenLeafJsonProtocol { implicit val testJf = jsonFormat2(Test) }
import TestJsonProtocol._
println(Test(1, 1024L).toJson)
// "spray-json" % "1.3.5" output:
// {"i":1,"l":1024}
// "spray-json" % "1.3.4" output:
// {"i":1,"l":1024}
case class Test(i: Int, l: Long, f: Float, d: Double)
object TestJsonProtocol extends GreenLeafJsonProtocol { implicit val testJf = jsonFormat4(Test) }
import TestJsonProtocol._
println(Test(1, 1024L, 2.0f, 20.48d).toJson)
// "spray-json" % "1.3.5" output:
// {"d":20.48,"f":2.0,"i":1,"l":1024}
// "spray-json" % "1.3.4" output:
// {"i":1,"l":1024,"f":2.0,"d":20.48}
66. JSON fields order and related MongoDB filter issue
// MODEL
object Currency extends Enumeration {
  type Currency = Value
  val USD, GBP, CAD, PLN, JPY, EUR = Value
}
import Currency._
// ID as object { "id": { "base": "USD", "date": "2019-01-18" }, "rates": ... }
case class ExchangeRateId(base: Currency, date: ZonedDateTime)
// The macro codecs of the official driver do not allow an Enumeration value to be used as a Map key
case class ExchangeRate(id: ExchangeRateId, rates: Map[Currency, BigDecimal], updated: ZonedDateTime)
// BSON
object ExchangeRateJsonProtocol extends GreenLeafBsonProtocol {
  implicit def ccyJf = enumToJsonFormatAsString(Currency)
  implicit def erIdJf = jsonFormat2(ExchangeRateId)
  implicit def erJf = jsonFormat(ExchangeRate, "_id", "rates", "updated")
}
69. JSON fields order and related MongoDB filter issue
Plain query where the field order of the filter doesn’t match the field order of the 2nd record
val filter = Document("""{ "_id": {"base": "USD", "date": { "$date": "2019-01-02T00:00:00.000Z" } } }""")
val filter = Filters.eq("_id", Document(ExchangeRateId(USD, "2019-01-02").toJson.compactPrint))
val filter = Document(s"""{ "_id": ${ExchangeRateId(USD, "2019-01-02").toJson} }""")
// FILTER: {"_id": {"base": "USD", "date": {"$date": 1546387200000}}}
// output:
BSON: List()
JSON: List()
OBJ: List()
70. MongoDB query operator ‘$eq’
Specifies equality condition. The $eq operator matches documents where the value of a field
equals the specified value. The $eq expression is equivalent to { field: <value> }.
If the specified <value> is a document, the order of the fields in the document matters.
https://docs.mongodb.com/manual/reference/operator/query/eq/#match-a-document-value
Equality matches on the whole embedded document require an exact match of the specified
<value> document, including the field order.
https://docs.mongodb.com/manual/tutorial/query-embedded-documents/#match-an-embedded-nested-document
71. JSON fields order and related MongoDB filter issue
Plain query where the field order of the filter matches the field order of the 2nd record
// explicit correct fields order as stored in MongoDB for this record
val filter = Document("""{"_id": { "date": { "$date": "2019-01-02T00:00:00.000Z" }, "base": "USD" } }""")
// wrong fields order
// val filter = Filters.eq("_id", Document(ExchangeRateId(USD, "2019-01-02").toJson.compactPrint))
// wrong fields order
// val filter = Document(s"""{"_id": ${ExchangeRateId(USD, "2019-01-02").toJson} }""")
// FILTER: {"_id": {"date": {"$date": 1546387200000}, "base": "USD"}}
// output:
BSON: List(Document((_id,{"date": {"$date": 1546387200000}, "base": "USD"}), (rates,{"PLN": 3.7671665351,
"CAD": 1.3442076646, "GBP": 0.7891607472, "JPY": 108.0417434009, "USD": 1.0, "EUR": 0.8769622029}),
(updated,BsonDateTime{value=1546387200000})))
JSON: List({"_id": {"date": {"$date": 1546387200000}, "base": "USD"}, "rates": {"PLN": 3.7671665351,
"CAD": 1.3442076646, "GBP": 0.7891607472, "JPY": 108.0417434009, "USD": 1.0, "EUR": 0.8769622029},
"updated": {"$date": 1546387200000}})
OBJ: List(ExchangeRate(ExchangeRateId(USD,2019-01-02T00:00Z),Map(USD -> 1.0, EUR -> 0.8769622029,
GBP -> 0.7891607472, CAD -> 1.3442076646, PLN -> 3.7671665351, JPY -> 108.0417434009),2019-01-02T00:00Z))
72. MongoDB Query on Nested Field
Dot notation is used to specify a query condition on fields in an embedded/nested
document.
https://docs.mongodb.com/manual/tutorial/query-embedded-documents/#query-on-nested-field
To specify or access a field of an embedded document with dot notation, concatenate the
embedded document name with the dot (.) and the field name, and enclose the result in quotes.
https://docs.mongodb.com/manual/core/document/#embedded-documents
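Dot notation also suggests a way around the field-order problem: instead of comparing a whole embedded document, each nested field can be compared separately. The sketch below flattens a spray-json value into dot-notation paths and builds a filter of the shape {"_id.base": {"$eq": ...}}. It is an illustration only, not the implementation used by the extension; DotNotationSketch, flatten and eqFilter are hypothetical names, and treating objects whose keys all start with "$" as single values is an assumption made to keep Extended JSON wrappers such as {"$date": ...} intact.

```scala
import spray.json._

object DotNotationSketch {
  // Objects like {"$date": ...} are Extended JSON values, not nested documents.
  private def isNestedDocument(obj: JsObject): Boolean =
    obj.fields.nonEmpty && !obj.fields.keys.forall(_.startsWith("$"))

  // Maps every leaf value to its dot-notation path, e.g. "_id.base".
  def flatten(path: String, value: JsValue): Map[String, JsValue] = value match {
    case obj: JsObject if isNestedDocument(obj) =>
      obj.fields.flatMap { case (key, child) => flatten(s"$path.$key", child) }
    case leaf => Map(path -> leaf)
  }

  // Builds one $eq condition per leaf, so the field order no longer matters.
  def eqFilter(path: String, value: JsValue): JsObject = {
    val conditions: Map[String, JsValue] =
      flatten(path, value).map { case (field, v) => field -> JsObject("$eq" -> v) }
    JsObject(conditions)
  }
}
```

For example, DotNotationSketch.eqFilter("_id", id.toJson).compactPrint yields JSON text that can be turned into a filter document, assuming an implicit format for the id type is in scope.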
73. JSON fields order and related MongoDB filter issue
BSON document can’t be created from basic types
val filter = Filters.and(
Filters.eq("_id.base", Document(USD.toJson.compactPrint)),
org.bson.BsonInvalidOperationException: readStartDocument can only be called
when CurrentBSONType is DOCUMENT, not when CurrentBSONType is STRING.
Filters.eq("_id.date", Document(ZonedDateTime("2019-01-03").toJson.compactPrint))
org.bson.BsonInvalidOperationException: readStartDocument can only be called
when CurrentBSONType is DOCUMENT, not when CurrentBSONType is DATE_TIME.
)
74. JSON fields order and related MongoDB filter issue
Plain ZonedDateTime JSON can’t be used in Filter
val filter = Filters.and(
  Filters.eq("_id.base", USD.toJson.compactPrint),
  Filters.eq("_id.date", ZonedDateTime("2019-01-03").toJson.compactPrint)
)
// returns nothing because the filter is incorrect: the date is a string, but should be an object
// FILTER: {"_id.base": "\"USD\"", "_id.date": "{\"$date\":1546473600000}"}
79. MongoDB query on Nested fields limitations
upsert:true and dotted _id in insert operation
// ENTITY DOESN’T EXIST
val filter = "_id" $eq ExchangeRateId(USD, "2039-01-03")
val replacement = // ...
val options = FindOneAndReplaceOptions().upsert(true)
collection.findOneAndReplace(filter, replacement, options).toFuture()
com.mongodb.MongoCommandException: Command failed with error 111
(NotExactValueField): 'field at '_id' must be exactly specified, field at
sub-path '_id.base' found'.
The full response is {"ok": 0.0, "errmsg": "field at '_id' must be exactly
specified, field at sub-path '_id.base' found", "code": 111, "codeName":
"NotExactValueField"}
80. MongoDB query on Nested fields limitations
upsert:true and dotted _id
When users execute an update() with upsert: true and the query matches no existing
document, MongoDB will refuse to insert a new document if the query specifies conditions
on the _id field using dot notation.
This restriction ensures that the order of fields embedded in the _id document is well-defined
and not bound to the order specified in the query.
If users attempt to insert a document in this way, MongoDB will raise an error.
https://docs.mongodb.com/manual/reference/method/db.collection.update/#upsert-true-with-a-dotted-id-query
81. Optional fields
// MODEL
case class GeoKey(country: String, state: Option[String] = None, city: Option[String] = None)
case class GeoRecord(key: GeoKey, name: String, population: Int)
// JSON
trait GeoModelJsonProtocol extends GreenLeafJsonProtocol {
  implicit val GeoKeyFormat: RootJsonFormat[GeoKey] = jsonFormat3(GeoKey)
  implicit val GeoRecordFormat: RootJsonFormat[GeoRecord] = jsonFormat3(GeoRecord)
}
object GeoModelJsonProtocol extends GeoModelJsonProtocol
// BSON
object GeoModelBsonProtocol extends GeoModelJsonProtocol with GreenLeafBsonProtocol {
  override implicit val GeoRecordFormat: RootJsonFormat[GeoRecord] =
    jsonFormat(GeoRecord, "_id", "name", "population")
}
82. Incorrect use of query with optional fields
import GeoModelBsonProtocol._
import GreenLeafMongoDsl._
val filter: Document = "_id" $eq GeoKey("6252001")
// the filter selects all records with _id.country = USA, including states and cities,
// which looks strange because we used a seemingly explicit filter by primary key
// FILTER: {"_id.country": {"$eq": "6252001"}}
// output:
GeoRecord(GeoKey(6252001,None,None),United States of America,310232863)
GeoRecord(GeoKey(6252001,Some(5128638),None),New York,19274244)
GeoRecord(GeoKey(6252001,Some(5128638),Some(5128581)),New York City,8175133)
GeoRecord(GeoKey(6252001,Some(5128638),Some(5133273)),Queens,2272771)
GeoRecord(GeoKey(6252001,Some(5128638),Some(5110302)),Brooklyn,2300664)
GeoRecord(GeoKey(6252001,Some(5101760),None),New Jersey,8751436)
GeoRecord(GeoKey(6252001,Some(5101760),Some(5099836)),Jersey City,264290)
GeoRecord(GeoKey(6252001,Some(5101760),Some(5099133)),Hoboken,53635)
GeoRecord(GeoKey(6252001,Some(5332921),None),California,37691912)
GeoRecord(GeoKey(6252001,Some(5332921),Some(5391959)),San Francisco,864816)
...
83. Optional fields, spray-json and MongoDB queries
Usually, optional members that are undefined (None) are not rendered at all.
The NullOptions trait supplies an alternative rendering mode for optional case class
members.
By mixing this trait into custom JsonProtocol users can enforce the rendering of undefined
members as null.
https://github.com/spray/spray-json#nulloptions
The { item : null } query matches documents that either contain the item field with a value of
null or do not contain the field at all.
https://docs.mongodb.com/manual/tutorial/query-for-null-fields/
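The difference between the two rendering modes can be seen with spray-json alone. The sketch below defines the same case class format twice, once with and once without NullOptions; the object names PlainProtocol and NullProtocol are hypothetical, and DefaultJsonProtocol is used instead of the GreenLeaf protocols to keep the block self-contained.

```scala
import spray.json._

case class GeoKey(country: String, state: Option[String] = None, city: Option[String] = None)

// Default behaviour: None members are left out of the JSON object.
object PlainProtocol extends DefaultJsonProtocol {
  implicit val geoKeyFormat: RootJsonFormat[GeoKey] = jsonFormat3(GeoKey)
}

// With NullOptions: None members are rendered as explicit nulls.
object NullProtocol extends DefaultJsonProtocol with NullOptions {
  implicit val geoKeyFormat: RootJsonFormat[GeoKey] = jsonFormat3(GeoKey)
}

object NullOptionsSketch {
  val country = GeoKey("6252001")
  val withoutNulls: JsValue = PlainProtocol.geoKeyFormat.write(country)
  val withNulls: JsValue = NullProtocol.geoKeyFormat.write(country)
}
```

With explicit nulls, a filter built from the key constrains state and city as well, which is what narrows a query down to a single record.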
84. Verify BSON protocol with NullOptions
trait GreenLeafBsonProtocol extends GreenLeafJsonProtocol with NullOptions { // ...
// select country by primary key
val filter: Document = "_id" $eq GeoKey(country = "6252001")
// {"_id.city": {"$eq": null}, "_id.country": {"$eq": "6252001"}, "_id.state": {"$eq": null}}
// output:
GeoRecord(GeoKey(6252001,None,None),United States of America,310232863)
// select state by primary key
val filter: Document = "_id" $eq GeoKey(country = "6252001", state = Some("5128638"))
// {"_id.city": {"$eq": null}, "_id.country": {"$eq": "6252001"}, "_id.state": {"$eq": "5128638"}}
// output:
GeoRecord(GeoKey(6252001,Some(5128638),None),New York,19274244)
// etc.
87. GreenLeafMongoDao
GreenLeafMongoDao extends GreenLeafMongoDsl and provides a simple DSL to
transform Mongo's Observable[Document] instances into Future[Seq[T]],
Future[Option[T]] and Future[T].
This trait also provides useful generic methods such as insert, getById, findById,
updateById and replaceById.
GreenLeafMongoDao uses the _id field name for the primary key by default, but it can be
overridden with a different name such as “id” or “key”, which is then reflected in all
predefined queries.
Methods such as insert and replace preprocess the JSON to skip fields with
null values.
88. Example of GreenLeafMongoDao usage
// MODEL
case class Building(id: Long, name: String, height: Int, floors: Int, year: Int, address: String)
// JSON
trait BuildingModelJsonProtocol extends GreenLeafJsonProtocol {
  implicit lazy val BuildingFormat: RootJsonFormat[Building] = jsonFormat6(Building)
}
object BuildingModelJsonProtocol extends BuildingModelJsonProtocol
// BSON
trait BuildingModelBsonProtocol extends BuildingModelJsonProtocol with GreenLeafBsonProtocol {
  override implicit lazy val BuildingFormat: RootJsonFormat[Building] =
    jsonFormat(Building, "_id", "name", "height", "floors", "year", "address")
}
object BuildingModelBsonProtocol extends BuildingModelBsonProtocol with DaoBsonProtocol[Long, Building] {
  override implicit val idFormat: JsonFormat[Long] = LongJsonFormat
  override implicit val entityFormat: JsonFormat[Building] = BuildingFormat
}
90. Conclusion:
1) GreenLeafJsonProtocol provides all JSON formats from Spray’s DefaultJsonProtocol.
JSON formats for additional types such as ZonedDateTime and Scala Enumeration are also
defined. All these JSON formats can be overridden.
2) GreenLeafBsonProtocol extends GreenLeafJsonProtocol and overrides the JSON formats for
some types as required for BSON serialization.
3) GreenLeafMongoDsl provides a DSL that makes it possible to write queries with a syntax
that is closer to real queries in MongoDB.
4) GreenLeafMongoDao extends GreenLeafMongoDsl, defines a DSL for Mongo’s observables
and provides typical methods such as insert, findById, getById, updateById, etc.