The document discusses the MySQL Document Store, which provides both NoSQL and SQL capabilities on a single platform. It allows for schemaless document storage and querying using JSON documents, while also providing the reliability, security and transaction support of MySQL. Examples are given in several programming languages of basic CRUD operations on document collections using simple APIs that avoid the need for SQL. The document also shows how JSON documents can be queried using SQL/JSON functions, enabling more complex analysis that was previously only possible in a relational database. This provides the best aspects of both NoSQL and SQL on a proven database platform.
6. Safe Harbor Agreement
THE FOLLOWING IS INTENDED TO OUTLINE OUR GENERAL PRODUCT
DIRECTION. IT IS INTENDED FOR INFORMATION PURPOSES ONLY, AND MAY
NOT BE INCORPORATED INTO ANY CONTRACT. IT IS NOT A COMMITMENT TO
DELIVER ANY MATERIAL, CODE, OR FUNCTIONALITY, AND SHOULD NOT BE
RELIED UPON IN MAKING PURCHASING DECISIONS. THE DEVELOPMENT,
RELEASE, AND TIMING OF ANY FEATURES OR FUNCTIONALITY DESCRIBED
FOR ORACLE'S PRODUCTS REMAINS AT THE SOLE DISCRETION OF ORACLE.
10. Relational Databases
● Original Goal was to save data with minimal duplication
● Disks were expensive
● … and slow
● 45 years old
● Introduced the concept of accessing many records with a single command
12. Relational Databases
● Need to set up tables BEFORE use
● Relations, indexes, data normalization, query optimizations
● Hard to change on the fly
● Need a DBA or someone who has DBA skills
● This can be a chokepoint
14. NoSQL or Document Store
● Schemaless
○ No schema design, no normalization, no foreign keys, no data types, …
○ Very quick initial development
● Flexible data structure
○ Embedded arrays or objects
○ Valid solution when natural data cannot be modeled optimally in a
relational model
○ Object persistence without the use of any ORM (object-relational mapping)
15. NoSQL or JSON Document Store
● JSON
● close to frontend
● native in JS
● easy to learn
16. How DBAs see data as opposed to how Developers see data
{
"GNP" : 249704,
"Name" : "Belgium",
"government" : {
"GovernmentForm" :
"Constitutional Monarchy, Federation",
"HeadOfState" : "Philippe I"
},
"_id" : "BEL",
"IndepYear" : 1830,
"demographics" : {
"Population" : 10239000,
"LifeExpectancy" : 77.8000030517578
},
"geography" : {
"Region" : "Western Europe",
"SurfaceArea" : 30518,
"Continent" : "Europe"
}
}
17. What if there was a way to provide both
SQL and NoSQL on one stable platform that
has proven stability on well-known
technology with a large Community and a
diverse ecosystem?
With the MySQL Document
Store it is now an option!
18. A Solution for all
Developers:
★ schemaless
★ rapid prototyping & simpler APIs
★ document model
★ transactions
Operations:
★ performance management/visibility
★ robust replication, backup, restore
★ comprehensive tooling ecosystem
★ simpler application schema upgrades
Business Owner:
★ don't lose my data = ACID trx
★ capture all my data = extensible/schemaless
★ product on schedule/time to market = rapid development
19. Built on the MySQL JSON Data type and Proven MySQL Server Technology
★ Provides a schema flexible JSON Document Store
★ No SQL required
★ No need to define all possible attributes, tables,
etc.
★ Uses new MySQL X DevAPI
★ Can leverage generated columns to extract JSON
values into materialized columns that can be
indexed for fast SQL searches
20. Built on the MySQL JSON Data type and Proven MySQL Server Technology
★ Document can be ~1GB
○ It's a column in a row of a table
★ Allows use of modern programming styles
○ No more embedded strings of SQL in your code
○ Easy to read
★ Also works with relational Tables
★ Proven MySQL Technology
21. Connectors for
★ C++
★ Java
★ .Net
★ Node.js
★ JavaScript
★ Python
★ PHP
○ Working with other Communities to help them support it too
22. New MySQL Shell
★ Command Completion
★ Python, JavaScript & SQL modes
★ Admin functions
★ New Util object
★ A new high-level session concept that can scale from a single MySQL
Server to a multiple server environment
23. New Model
★ Non-blocking, asynchronous calls follow common language patterns
★ Send out many queries and process other things until they return
★ Supports CRUD operations
★ Concentrate on basic functions
★ Easily scale from one server to InnoDB cluster w/o changing application!
28. JavaScript
// Connecting to MySQL Server and working with a Collection
var mysqlx = require('mysqlx');
// Connect to server
var mySession = mysqlx.getSession( {
host: 'localhost', port: 33060,
user: 'user', password: 'password'} );
var myDb = mySession.getSchema('test');
// Create a new collection 'my_collection'
var myColl = myDb.createCollection('my_collection');
// Insert documents
myColl.add({_id: '1', name: 'Sakila', age: 15}).execute();
myColl.add({_id: '2', name: 'Susanne', age: 24}).execute();
myColl.add({_id: '3', name: 'User', age: 39}).execute();
// Find a document
var docs = myColl.find('name like :param1 AND age < :param2').limit(1).
bind('param1','S%').bind('param2',20).execute();
// Print document
print(docs.fetchOne());
// Drop the collection
myDb.dropCollection('my_collection');
No SQL!!
29. Python
# Connecting to MySQL Server and working with a Collection
from mysqlsh import mysqlx
# Connect to server
mySession = mysqlx.get_session( {
'host': 'localhost', 'port': 33060,
'user': 'user', 'password': 'password'} )
myDb = mySession.get_schema('test')
# Create a new collection 'my_collection'
myColl = myDb.create_collection('my_collection')
# Insert documents
myColl.add({'_id': '1', 'name': 'Sakila', 'age': 15}).execute()
myColl.add({'_id': '2', 'name': 'Susanne', 'age': 24}).execute()
myColl.add({'_id': '3', 'name': 'User', 'age': 39}).execute()
# Find a document
docs = myColl.find('name like :param1 AND age < :param2') \
.limit(1) \
.bind('param1','S%') \
.bind('param2',20) \
.execute()
# Print document
doc = docs.fetch_one()
print(doc)
30. Node.js
// Connecting to MySQL Server and working with a Collection
var mysqlx = require('@mysql/xdevapi');
var db;
// Connect to server
mysqlx
.getSession({
user: 'user',
password: 'password',
host: 'localhost',
port: '33060',
})
.then(function (session) {
db = session.getSchema('test');
// Create a new collection 'my_collection'
return db.createCollection('my_collection');
})
.then(function (myColl) {
// Insert documents
return Promise
.all([
myColl.add({ name: 'Sakila', age: 15 }).execute(),
myColl.add({ name: 'Susanne', age: 24 }).execute(),
myColl.add({ name: 'User', age: 39 }).execute()
])
.then(function () {
// Find a document
return myColl
.find('name like :name && age < :age')
.bind({ name: 'S%', age: 20 })
.limit(1)
.execute(function (doc) {
// Print document
console.log(doc);
});
});
})
.then(function(docs) {
// Drop the collection
return db.dropCollection('my_collection');
})
.catch(function(err) {
// Handle error
});
31. C# (.NET)
// Connect to server
var mySession = MySQLX.GetSession("server=localhost;port=33060;user=user;password=password;");
var myDb = mySession.GetSchema("test");
// Create a new collection "my_collection"
var myColl = myDb.CreateCollection("my_collection");
// Insert documents
myColl.Add(new { name = "Sakila", age = 15}).Execute();
myColl.Add(new { name = "Susanne", age = 24}).Execute();
myColl.Add(new { name = "User", age = 39}).Execute();
// Find a document
var docs = myColl.Find("name like :param1 AND age < :param2").Limit(1)
.Bind("param1", "S%").Bind("param2", 20).Execute();
// Print document
Console.WriteLine(docs.FetchOne());
// Drop the collection
myDb.DropCollection("my_collection");
32. Java
// Connect to server
Session mySession = new
SessionFactory().getSession("mysqlx://localhost:33060/test?user=user&password=password");
Schema myDb = mySession.getSchema("test");
// Create a new collection 'my_collection'
Collection myColl = myDb.createCollection("my_collection");
// Insert documents
myColl.add("{\"name\":\"Sakila\", \"age\":15}").execute();
myColl.add("{\"name\":\"Susanne\", \"age\":24}").execute();
myColl.add("{\"name\":\"User\", \"age\":39}").execute();
// Find a document
DocResult docs = myColl.find("name like :name AND age < :age")
.bind("name", "S%").bind("age", 20).execute();
// Print document
DbDoc doc = docs.fetchOne();
System.out.println(doc);
// Drop the collection
myDb.dropCollection("my_collection");
39. Migration from MongoDB to MySQL Document Store
For this example, I will use the well-known restaurants collection:
We need to dump the data to a file and
we will use the MySQL Shell
with the Python interpreter to load the data.
40. Dump and load using MySQL Shell & Python
This example is inspired by @datacharmer's work: https://www.slideshare.net/datacharmer/mysql-documentstore
$ mongo --quiet --eval 'DBQuery.shellBatchSize=30000;
db.restaurants.find().shellPrint()' \
| perl -pe 's/(?:ObjectId|ISODate)\(("[^"]+")\)/ $1/g' > all_recs.json
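The Perl one-liner above strips the ObjectId(...) and ISODate(...) wrappers so that each dumped record becomes plain JSON. The same substitution can be sketched in Python; the function name and the sample record below are made up for illustration and do not come from the original script.

```python
import json
import re

# Matches ObjectId("...") or ISODate("...") and captures the quoted string inside.
WRAPPER = re.compile(r'(?:ObjectId|ISODate)\(("[^"]+")\)')


def strip_mongo_wrappers(line):
    """Replace ObjectId("x") / ISODate("x") with the bare JSON string "x"."""
    return WRAPPER.sub(r"\1", line)


# Hypothetical record in the style of a mongo shell dump.
sample = (
    '{ "_id" : ObjectId("55cba2476c522cafdb053add"), '
    '"date" : ISODate("2014-10-01T00:00:00Z"), "name" : "Example Diner" }'
)

cleaned = strip_mongo_wrappers(sample)
doc = json.loads(cleaned)  # parses only because the wrappers are gone
print(doc["_id"])
```

Once every line parses with json.loads, each record can be handed to a collection's add() call from the shell's Python mode.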
56. What does a collection look like on the server?
57. _id
Every document has a unique identifier called the document ID, which can be
thought of as the equivalent of a table's primary key. The document ID value can
be manually assigned when adding a document.
If no value is assigned, a document ID is generated and assigned to the
document automatically!
Use getDocumentId() or getDocumentIds() to get the _id(s)
58. Mapping to SQL Examples
createCollection('mycollection')
versus
CREATE TABLE `test`.`mycoll` (
doc JSON,
_id VARCHAR(32)
GENERATED ALWAYS AS (doc->>'$._id') STORED
PRIMARY KEY
) CHARSET utf8mb4;
59. Mapping to SQL Examples
mycollection.add({'test': 1234})
versus
INSERT INTO `test`.`mycoll` (doc)
VALUES ( JSON_OBJECT( 'test',1234));
60. More Mapping to SQL Examples
mycollection.find("test > 100")
Versus
SELECT doc
FROM `test`.`mycoll`
WHERE (JSON_EXTRACT(doc,'$.test') >100);
73. Find the top 10 restaurants by grade for each cuisine
WITH cte1 AS
(SELECT doc->>"$.name" AS name,
doc->>"$.cuisine" AS cuisine,
(SELECT AVG(score) FROM
JSON_TABLE(doc, "$.grades[*]" COLUMNS
(score INT PATH "$.score")) AS r) AS avg_score
FROM restaurants)
SELECT *, RANK() OVER
(PARTITION BY cuisine ORDER BY avg_score DESC) AS `rank`
FROM cte1
ORDER BY `rank`, avg_score DESC LIMIT 10;
This query uses a Common Table Expression (CTE) and a window function to rank each
restaurant's average score within its cuisine; JSON_TABLE turns the embedded grades array into rows
74. No SQL Consumed In This Query!!
$schema = $session->getSchema("world");
$table = $schema->getTable("city");
$row = $table->select('Name','District')
->where('District like :district')
->bind(['district' => 'Texas'])
->limit(25)
->execute()->fetchAll();
76. This is the best of the two worlds in one product !
● Data integrity
● ACID Compliant
● Transactions
● SQL
● Schemaless
● flexible data structure
● easy to start (CRUD)
80. New in MySQL 8.0
1. True Data Dictionary
2. Default UTF8MB4
3. Windowing Functions, CTEs, Lateral Derived Joins
4. InnoDB SKIP LOCKED and NOWAIT
5. Instant Add Column
6. Histograms
7. Resource Groups
8. Better optimizer with new temporary table engine
9. True Descending Indexes
10. 3D GIS
11. JSON Enhancements
81. Please buy my book!
If you deal with the JSON
Data Type or have an
interest in the MySQL
Document Store, this text is a
great guide with many
examples to help you
understand the complexities
and opportunities with a
native JSON Data Type –
Available on Amazon
Good morning, good afternoon, or good evening, depending on your location. This is the MySQL Best of Both Worlds webinar and I am Dave Stokes. I am a MySQL Community Manager and I will be your guide for this presentation. If any questions pop up after this presentation, please do not hesitate to contact me, and there is a mechanism for you to ask questions during this webinar. Remember that the only bad question is the one that you never get answered.
The old timers will pronounce it My-Ess-queue-ell but you may pronounce it My-sequel.
I will be talking about software already available for your use today, but we may go off on some tangents where we talk about future products. However, I am not blessed with perfect knowledge of the future, so take anything I say about future features with a grain of salt.
I will mainly be talking about functionality in the MySQL Community Edition, which is the free-to-download version under the GNU General Public License Version 2, but there is an Enterprise Edition available for those who want phone support, the best monitoring software, enhanced backup, at-rest data encryption, and other very nice features. The core of the server is the same between the two and I will try to point out if I do reference the Enterprise Edition. And your local MySQL sales professional can guide you through the Enterprise Edition.
We will be talking about the MySQL Document Store which was introduced with version 5.7 and greatly enhanced with version 8.0. This is built on our new X DevAPI, our new X Protocol, and the JSON native data type. This provides you the ability to use MySQL as a SQL (structured query language) relational database, or as a NoSQL JSON Document Store Database, or both. Depending on your use cases, you can decide how to handle your data.
So let us start with a little review of relational databases and why they are the dominant data store.
Relational databases have been around for quite a long time and their core functionality maps very well to so much business logic. But the model was originally designed to save disk space, since disks were slow and very expensive. In the last four and a half decades relational technology has proven itself despite some flaws. And an interesting bit of trivia is that it was the first to introduce the concept of accessing many records with a single command.
Relational databases depend on having your data put into normal forms to reduce redundancy and help with data integrity. Your data is organized so that the dependencies between columns in tables are enforced by integrity constraints. Add to this the ability to perform transactions, so that all or none of the actions take place, and you get a very powerful tool. And over the past handful of decades SQL has become a very powerful programming language.
However, there is some work that needs to be done before you can use a relational database. Data normalization can be very hard work. Then you need to set up the schema, set up tables, establish indexes, and handle other tasks that are usually performed by a DBA. And once set up, ad hoc changes are difficult, often requiring down time. That is much improved in MySQL 8.0, but a relational schema still does not give you mutable data.
The rise in popularity of the NoSQL Document Store style of database has been due to many factors, including those drawbacks of relational databases.
First is that you do not need to normalize data, do not need to set up keys, foreign or otherwise, and you can start saving data right away. The JSON document format is very flexible and provides for data mutability. And oftentimes you will start a project with limited knowledge of the data requirements. And you do not need to use an Object Relational Mapper because your programming language of choice is not SQL.
JSON has become the de facto data interchange format in the last few years. It is very close to the frontend of applications, mainly because so much frontend work is now JavaScript. And JSON document stores are easier to learn than SQL as there are fewer prerequisites to learn before you can use them efficiently.
Now DBAs see the world as relations but developers often have a different view. On the left you have the traditional relational model and on the right a JSON document model.
But what if there was a way to support both views of the world on one very stable and well-known platform from a proven technological base that comes with a very active community? Well, now you can have the best of both the SQL and NoSQL worlds.
Developers get fast access to storing data and transactions. Operations gets a platform they already know how to support. And management gets safely stored data and rapid development.
So you get the JSON document format, you no longer have to embed ugly strings of SQL in your beautiful code, and you get to use the new MySQL X DevAPI. And if you need to get some of that JSON formatted data out into SQL relational columns, there are ways to do that with generated columns.
Your payload is one GIGAbyte compared to MongoDB’s 16 MEGAbytes. And that is per column so you can add multiple columns. Since you do not embed SQL in your code and are using a modern API, the queries are easier to comprehend. And this new NoSQL approach for MySQL also works with your relational tables too. And this is all on MySQL technology that you already know.
There are connectors out there for using the X DevAPI from all the popular languages.
And there is a new shell built on the new protocol that you will appreciate. It features command completion, amazing command help, and three modes (Python, JavaScript, and SQL) to help you work with your data. The new shell also has several utilities for setting up InnoDB Clusters and checking the status of your MySQL 5.7 server before upgrading to MySQL 8.0.
Part of the new functionality is the ability to send out a group of queries asynchronously. And with the emphasis on CRUD (Create, Read, Update, Delete), the emphasis is off the intricacies of SQL and on the basics of handling data. And you can scale from one server to a multi-node, highly available, master-master InnoDB cluster without having to change your application code.
The new X Protocol is built on Google Protobufs, which provide a language-neutral and platform-neutral mechanism for serializing structured data. Yes, even unstructured NoSQL data has a structure, and that is JSON.
Our older architecture is on the left where you use the SQL API over the standard protocol to talk to the server. On the right you can use the SQL protocol or the X Protocol to talk to the server.
As I mentioned before, you can scale from one to many servers easily without changing code. I would argue that your application really should not care if it is talking to one or many servers and here is an example of using MySQL Router and letting MySQL Router divvy up the reads and the writes as needed.
Here is a quick example in PHP. Simply choose your schema, then pick the document collection you need to use. Here we are looking to find the records where the _id field is equal to ‘USA’. Notice that there is ZERO SQL in that query. And if you do need to modify that query you do not need to fight SQL syntax, as you will see later.
The following examples are from the X DevAPI guide that you can find on https://dev.mysql.com and I would like you to notice the lack of SQL.
Moving to Python, you see here the same code but in a different language. And again, no embedded SQL
The MySQL Document Store supports both Node.js and JavaScript.
As well as C and C++.
And Java.
The new MySQL Shell also has a lot of interesting features that we will cover next
Here we start the new shell and then use \c to connect. The shell can store your connection information, if you wish, in an encrypted file for later connections. Here we create a schema named ‘scale’ and then tell the shell we want to use that schema as the default. Note that the default schema is accessible via a pointer named ‘db’.
Now we can create a JSON Document collection named ‘Pasadena’. I used these as an example at the Southern California Linux Expo or SCaLE earlier this month. Then we can add a document with the add() function. No setting up relations or normalizing data and we are storing data in MySQL with NoSQL
If we use the find() function, we see the record we just added. The server added the _id field and I will talk more about that in a little bit.
What about modifying a record? Here we use the modify function and specify how to find the specific record we want to change. Then we set a key/value pair to foo and bar. Running find again shows us the new key/value pair.
There are some other things to notice about the new shell. First, at the bottom of the page you will notice that MySQL 8 is all UTF8MB4 and the server is optimized to support that character set. Next, note that the server is listening on port 33060 for the X Protocol while the traditional port is 3306. And on the SSL line note that we are using TLS by default.
There may be some MongoDB fans out there who are wondering how hard it would be to migrate to the MySQL Document Store. For the next slides we will use the restaurants collection that Mongo has used in their tutorials. This data set is a list of restaurants and their health grades in the boroughs of New York city.
One of my predecessors, Giuseppe Maxia, also known as the Datacharmer, came up with this Python script, here run on the new shell, to read in a dump of the MongoDB restaurant collection and load it into MySQL.
Or you could use the new JSON bulk loader built in to the new shell which runs much faster
And yes, we know how to handle the BSON or Binary JSON format data that Mongo uses. So you can choose to convert or not
Well, I thought that was pretty impressive.
So what does that data look like? Well, there are 25 thousand records. If we append limit to our query we see the data for one restaurant. Please note the embedded array of grades starting about a third of the way down; I will detail how to parse it later.
What if you do not want the entire JSON document? Use the fields operator to specify the desired fields. And yes, you can alias fields to another name.
Syntax differences between MongoDB and MySQL are for you to judge. I find the MySQL version easier to understand and write over the embedding of a JSON object in the MongoDB query
Of course we have created, read, and updated JSON documents. So what about remove? As you can see remove is just as simple.
The railroad diagrams tell a lot about the CRUD operators for the MySQL Document Store. Add is very straightforward as you are adding one or more JSON formatted documents.
Modify is a bit more complex. Note that you can sort, limit, bind values, set & unset, plus manipulate arrays.
You will notice that the design for these functions is very complete.
And you still have all the power that you are used to having in your queries from SQL with the ability to use group by, having, sort, bind values, and transactions.
Here is a hierarchical view of the objects involved with the MySQL Document Store. Please notice on the right we have relational CRUD functions with the familiar insert, select, update, and delete. So the document store works with relational tables too.
Transactions are a strong point of the MySQL InnoDB table type. Here we start a transaction, add a record, and then find both records in the collection. Notice there has been no ‘commit’ so the transaction is still in play
By issuing a rollback, the new record is not committed to the server.
This is a diagram of how to use the two MySQL Protocols with either the new X DevAPI on port 33060 or the classic protocol on 3306. You do not need to use the document store unless you want to.
When you create a collection, the server creates a two-column table behind the scenes. One column is named ‘doc’ for the JSON document and the other is ‘_id’, for that field that I promise you I will cover shortly. If you go into SQL mode with the new shell, you can see the DESCRIBE and SHOW CREATE TABLE details.
Every document needs a unique identifier and we call that _id. If you do not have a key/value pair with _id, the server will create one for you. Or you can assign your own but remember to keep them unique!
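The _id rule can be sketched in plain Python as a conceptual model only. The uuid-based generator and the names with_document_id and add_all are assumptions made for this illustration; the real server has its own ID format and enforces uniqueness through the primary key.

```python
import uuid


def with_document_id(doc):
    """Return a copy of doc that is guaranteed to carry an '_id' key."""
    if "_id" in doc:
        return dict(doc)
    # Hypothetical generator: a 32-character hex string, not the server's real format.
    return {**doc, "_id": uuid.uuid4().hex}


def add_all(collection, docs):
    """Store docs in a dict keyed by _id and reject duplicates, like a primary key would."""
    ids = []
    for doc in docs:
        stored = with_document_id(doc)
        if stored["_id"] in collection:
            raise ValueError("duplicate _id: " + stored["_id"])
        collection[stored["_id"]] = stored
        ids.append(stored["_id"])
    return ids


collection = {}
ids = add_all(collection, [{"_id": "BEL", "Name": "Belgium"}, {"Name": "Unnamed"}])
print(ids[0])
```

The first document keeps the _id it was given, the second receives a generated one, and adding another document with _id 'BEL' would raise an error.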
We have examples comparing the MySQL Document Store versus traditional SQL. In red we create a collection, versus the CREATE TABLE command to do the equivalent. If you are a lazy typist like me, you will enjoy the shorter syntax.
Adding a record is similarly shorter
As are queries to retrieve data
Let us say you have data in JSON but you want to extract it and make it a relational column. Here we can use a generated column to extract the value matching the key ‘borough’ and materialize the data in its own column.
And yes, it is also easy to create indexes on values in the JSON document. Here we cast the cuisine key’s data into a text 20 and create an index on that field.
You can use EXPLAIN to see if the new index aids in a query.
Without the index we have a FULL table scan where the entire table has to be read, and that is about 25,000 records to search for the desired information.
Now we create our index on the borough
And we are down to 192 records to read instead of 25,000! Much faster
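Why the index cuts the number of records read can be shown with a conceptual Python sketch. This is not how InnoDB stores its indexes; the sample documents and the function names are made up for illustration.

```python
from collections import defaultdict

# Made-up documents standing in for the restaurants collection.
docs = [
    {"name": "A", "borough": "Queens"},
    {"name": "B", "borough": "Bronx"},
    {"name": "C", "borough": "Queens"},
    {"name": "D", "borough": "Brooklyn"},
    {"name": "E", "borough": "Bronx"},
]


def full_scan(docs, borough):
    """Read every document; return the matches and how many documents were examined."""
    examined = 0
    hits = []
    for doc in docs:
        examined += 1
        if doc["borough"] == borough:
            hits.append(doc)
    return hits, examined


def build_index(docs, key):
    """Map each value of key to the positions of the documents that hold it."""
    index = defaultdict(list)
    for position, doc in enumerate(docs):
        index[doc[key]].append(position)
    return index


def index_lookup(docs, index, value):
    """Read only the documents the index points at."""
    positions = index.get(value, [])
    return [docs[p] for p in positions], len(positions)


borough_index = build_index(docs, "borough")
scan_hits, scan_examined = full_scan(docs, "Queens")
index_hits, index_examined = index_lookup(docs, borough_index, "Queens")
print(scan_examined, index_examined)
```

Both paths return the same documents, but the scan examines every record while the lookup examines only the matching ones, which is the effect EXPLAIN reports.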
Remember those restaurant health scores I mentioned earlier? Well, embedded arrays are always messy to parse but the MySQL Document Store makes it much easier.
If we get all the grades for one record, or ‘$.grades’, you will see the complete list in the top pane. Great, but you may want to be more selective with just the first grade, $.grades[0], or the last, $.grades[last], or just the second and third, $.grades[1 to 2]; remember that arrays here start at zero.
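For readers who think in Python lists, the three path selectors correspond roughly to the indexing below. The document is made up, and note one difference: the JSON path range 1 to 2 includes both ends, while a Python slice excludes its end.

```python
# Made-up document shaped like one restaurant record.
doc = {
    "name": "Example Diner",
    "grades": [
        {"grade": "A", "score": 2},
        {"grade": "A", "score": 6},
        {"grade": "B", "score": 10},
        {"grade": "A", "score": 9},
    ],
}

all_grades = doc["grades"]             # $.grades
first = doc["grades"][0]               # $.grades[0]
last = doc["grades"][-1]               # $.grades[last]
second_and_third = doc["grades"][1:3]  # $.grades[1 to 2], inclusive in JSON path

print(first, last, second_and_third)
```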
So it becomes very easy to deal with embedded arrays
Now sometimes you may want to take that unstructured NoSQL data and turn it into a structured table temporarily. Here we can use the JSON_TABLE function. In this case we take the name key/value, cast the value into a CHAR(40), and call it ‘name’. From here we can treat this data as if it was in a structured table.
Here is another example where we cast the street and zip code values and make them available for SQL processing.
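What JSON_TABLE does here, turning documents into flat rows, can be sketched in Python. The key names address.street and address.zipcode and the sample documents are assumptions for this illustration, and the 40-character cut only mimics the CHAR(40) cast.

```python
# Made-up documents; the nested 'address' keys are assumed for this sketch.
docs = [
    {"name": "Example Diner", "address": {"street": "Main St", "zipcode": "11201"}},
    {"name": "Corner Cafe", "address": {"street": "Broadway", "zipcode": "10002"}},
    {"name": "No Address Grill"},
]


def restaurants_as_rows(docs):
    """Flatten each document into a (name, street, zipcode) tuple."""
    rows = []
    for doc in docs:
        address = doc.get("address", {})
        rows.append((
            str(doc.get("name", ""))[:40],  # mimic CHAR(40)
            address.get("street"),          # None stands in for SQL NULL
            address.get("zipcode"),
        ))
    return rows


rows = restaurants_as_rows(docs)
print(rows[0])
```

Once the data is in this shape it can be sorted, filtered, or joined like any other set of rows, which is what the SQL examples rely on.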
Okay Dave, but show me something you cannot do easily with another document database.
Here is something you can do with MySQL 8.0: take those embedded grade arrays and turn them into integers by using JSON_TABLE (the lines in red), then use a Common Table Expression to create an average score, and then use a window function to rank the top rated restaurants by cuisine. You can do this type of sophisticated analysis with MySQL 8.0.
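The logic of that query can be followed step by step in a Python sketch: average the embedded scores per restaurant, then rank within each cuisine the way RANK() does, with ties sharing a rank and the next rank skipped. The sample data and function names are made up, and the sketch assumes every document has at least one grade.

```python
from collections import defaultdict
from statistics import mean

# Made-up documents; each is assumed to have a non-empty grades array.
restaurants = [
    {"name": "A Cafe", "cuisine": "Bakery", "grades": [{"score": 10}, {"score": 20}]},
    {"name": "B Cafe", "cuisine": "Bakery", "grades": [{"score": 5}, {"score": 5}]},
    {"name": "C Cafe", "cuisine": "Bakery", "grades": [{"score": 15}]},
    {"name": "D Pizza", "cuisine": "Pizza", "grades": [{"score": 8}, {"score": 12}]},
]


def average_scores(docs):
    """The CTE step: one row per restaurant with its average score."""
    return [
        {
            "name": doc["name"],
            "cuisine": doc["cuisine"],
            "avg_score": mean(g["score"] for g in doc["grades"]),
        }
        for doc in docs
    ]


def rank_by_cuisine(rows):
    """The window step: RANK() OVER (PARTITION BY cuisine ORDER BY avg_score DESC)."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row["cuisine"]].append(row)
    ranked = []
    for group in partitions.values():
        group.sort(key=lambda r: r["avg_score"], reverse=True)
        for position, row in enumerate(group, start=1):
            tied = position > 1 and row["avg_score"] == group[position - 2]["avg_score"]
            # A tie reuses the previous row's rank; otherwise the rank is the position.
            rank = ranked[-1]["rank"] if tied else position
            ranked.append({**row, "rank": rank})
    return ranked


ranked = rank_by_cuisine(average_scores(restaurants))
ranks = {row["name"]: row["rank"] for row in ranked}
print(ranks)
```

In the sample, A Cafe and C Cafe tie at an average of 15 and share rank 1, so B Cafe gets rank 3 rather than 2.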
But what if you have all your data in relational tables and want to use the new API? Well, as you can see here, we can also make queries to relational tables without SQL. And you will see in the PHP example here that the query is very easy to read and will be easy to modify, say by filtering on population over a certain amount, or sorting the names. And you can also access collections as tables if you need to. So you do get a lot of flexibility in dealing with your data.
So, what do you gain by using MySQL 8.0 and the MySQL Document Store
You do get the best of both worlds in one product. You get the data integrity, transactions, and SQL plus the schemaless JSON document store that provides data mutability with easy CRUD functions
For years I have been preaching third normal form or better. But there are cases where the JSON data type suits mutable information: where things change quickly or do not fit the relational model, you can put the information in a JSON field and save one or more dives into the indexes and tables where you would normally keep that data in what I call ‘stub’ tables.
And yes, you can make Non JSON data into JSON very easily with functions like JSON_OBJECT
And yes, we do support GeoJSON with the 3D Boost.Geometry libraries in MySQL 8.0.
What is new in MySQL 8.0? Much more than I can cover here but 8.0 is a big advance for MySQL and you should look at these features too.
Please forgive the shameless plug but I wrote this book to help those new to the JSON data type to show with numerous coding examples how to use the JSON data type and how to get started with the MySQL Document Store. It is a short, easy read that will become an essential part of your reference library.