So MANY databases, which one do I pick?
BY KRISTIN FERRIER
About Me – Kristin Ferrier
 18+ Years in IT
 Principal Consultant at Ferrier Solutions
 Full stack web developer with specialty and passion for data
 Twitter: @SQLEnergy
 Techlahoma Slack: @EnergyDev
 GitHub: @EnergyDev
10-15 Years Ago
Oracle, SQL Server, Postgres, MySQL
Which database should I use? There are four to pick from.
Now
Oracle
SQL Server
Postgres
MySQL
Redis
Neo4j
OrientDB
Cosmos DB
Riak
DynamoDB
HBase
Google Cloud Bigtable
Cassandra
MongoDB
Firebase
Couchbase
Overview
 Difference between SQL and NoSQL databases
 ACID – Something you may want your database to have
 5 Main Database Families
 Great scenarios for each of these families
SQL and NoSQL
SQL Databases
 SQL databases are also known as Relational Databases
 They store data in tables (in math terms called relations)
 Use SQL to interact with the data
ID   Product            Price
3    Gryffindor Scarf   9.99
4    Ravenclaw Scarf    9.99
5    Hufflepuff         9.99
6    Slytherin Scarf    9.99
NoSQL
 Varying opinions on the definition
 Purpose-built databases for specific data models with an emphasis on flexible schemas
and new demands of modern applications
 The data models include document, graph, key-value, and columnar
 Non-relational databases (i.e. any database that isn’t relational)
ACID
ACID
Atomicity
Consistency
Isolation
Durability
Atomicity
 Transaction must execute completely or not at all
 Such atomicity must be guaranteed in all situations, including power failures and
hardware crashes
Transfer $1,000 from Account A to Account B
1. Account A Withdraw $1,000
2. Account B Deposit $1,000
This entire “unit of work” succeeds or fails. No intermediate state.
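The all-or-nothing behaviour can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch, not something from the talk: the `accounts` table, the `transfer` helper, and the rule that a balance may not go negative are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 1000), ("B", 0)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Withdraw from src and deposit into dst as a single unit of work."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src))
            row = conn.execute("SELECT balance FROM accounts WHERE name = ?", (src,)).fetchone()
            if row is None or row[0] < 0:
                raise ValueError("unknown source account or insufficient funds")
            cur = conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst))
            if cur.rowcount != 1:
                raise ValueError("unknown destination account")
        return True
    except ValueError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))

ok = transfer(conn, "A", "B", 1000)      # both steps succeed
failed = transfer(conn, "B", "Z", 500)   # step 1 runs, step 2 fails, step 1 is undone
final = balances(conn)
```

In the failed transfer the withdrawal from B has already executed when the deposit fails, yet the rollback leaves B's balance untouched, so no intermediate state is ever visible.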
Consistency
 Once a transaction has been committed, the data must conform to the
database schema.
 Example: Database has foreign key constraint requiring deposits or withdrawals to
happen with bank accounts that exist in the Accounts table. Thus, the system
would only commit deposits or withdrawals corresponding to an account in the
Accounts table.
Accounts table: 123, 456, 789
Deposit $250 into Account 123: Yes
Deposit $250 into Account 999: No. Account doesn’t exist
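The foreign key example can be reproduced with SQLite. This is a rough sketch: the table and column names and the `deposit` helper are hypothetical, and SQLite only enforces foreign keys once the `foreign_keys` pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.execute("CREATE TABLE accounts (account_id INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE deposits (
        deposit_id INTEGER PRIMARY KEY,
        account_id INTEGER NOT NULL REFERENCES accounts(account_id),
        amount     INTEGER NOT NULL
    )""")
conn.executemany("INSERT INTO accounts VALUES (?)", [(123,), (456,), (789,)])
conn.commit()

def deposit(conn, account_id, amount):
    """Try to commit a deposit; the database rejects unknown accounts."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO deposits (account_id, amount) VALUES (?, ?)",
                (account_id, amount),
            )
        return "Yes"
    except sqlite3.IntegrityError:
        return "No. Account doesn't exist"

results = [deposit(conn, 123, 250), deposit(conn, 999, 250)]
committed = conn.execute("SELECT COUNT(*) FROM deposits").fetchone()[0]
```

Only the deposit for account 123 is committed; the one for account 999 violates the constraint and is rolled back.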
Isolation
 Concurrent transactions must leave the database in the same state as if they had been executed sequentially
 Let’s look at an example where April is transferring $1,000 from Account A to Account B, while
Taylor is withdrawing $500 from Account B. Below is the expected result.
                      Account A    Account B
Starting balance      $3,000       $2,000
April’s transfer      -$1,000      +$1,000
After the transfer    $2,000       $3,000
Taylor’s withdrawal                -$500
Final balance                      $2,500
Isolation (No isolation, account loses money)
Transfer $1,000 from Account A
to Account B
 Read Account A as $3,000
 Withdraw $1,000 from Account A
 Read Account B as $2,000
 Add $1,000 to Account B
 Account B has $3,000
Withdraw $500 from Account B
 Read Account B as $2,000
 Withdraw $500 from Account B
 Account B has $1,500
 We LOST $1,000!
Isolation (With isolation, we don’t lose the money)
Transfer $1,000 from Account A
to Account B
 Read Account A as $3,000
 Pull $1,000 from Account A
 Read Account B as $2,000
 Add $1,000 to Account B
 Account B has $3,000
Withdraw $500 from Account B
 Read Account B (sorry, please, wait)
 We’re waiting
 We’re waiting
 Account B has $3,000
 Withdraw $500 from Account B
 Account B has $2,500
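The two timelines can be replayed step by step in plain Python. This is a single-threaded sketch that hard-codes the interleavings described in the two Isolation slides; a real database would get the second behaviour through locking or a similar mechanism, and the function names here are made up for illustration.

```python
def run_without_isolation(account_a, account_b):
    """Both transactions read Account B before either one writes it back."""
    transfer_sees_b = account_b        # transfer reads B as 2,000
    withdrawal_sees_b = account_b      # withdrawal also reads B as 2,000
    account_a = account_a - 1000       # transfer withdraws from A
    account_b = transfer_sees_b + 1000     # transfer writes B = 3,000
    account_b = withdrawal_sees_b - 500    # withdrawal overwrites B = 1,500
    return account_a, account_b

def run_with_isolation(account_a, account_b):
    """The withdrawal waits until the transfer has finished, then reads B."""
    account_a = account_a - 1000
    account_b = account_b + 1000       # transfer completes: B = 3,000
    withdrawal_sees_b = account_b      # withdrawal now reads the updated B
    account_b = withdrawal_sees_b - 500
    return account_a, account_b

lost_update = run_without_isolation(3000, 2000)
isolated = run_with_isolation(3000, 2000)
```

The first function ends with Account B $1,000 short because the withdrawal's write is based on a stale read; the second matches the expected result.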
Durability
 Once a transaction has completed execution, the updates to the database are
persisted in a way that is recoverable upon system failure.
 Example: April’s $1,000 transfer, once completed, could be recovered in case of
hardware failure.
Account A to Account B: $1,000 transferred
Hardware crash: we still know about the transfer
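A small way to see durability is to commit to a file-backed SQLite database, drop the connection, and open the file again. This is only a sketch: closing the connection stands in for the crash, and the file name and table layout are assumptions.

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "bank.db")  # hypothetical file name

conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 3000), ("B", 2000)])
conn.commit()

with conn:  # April's transfer, committed when the block exits
    conn.execute("UPDATE accounts SET balance = balance - 1000 WHERE name = 'A'")
    conn.execute("UPDATE accounts SET balance = balance + 1000 WHERE name = 'B'")
conn.close()  # stand-in for the process or machine going away

# A fresh connection reads the committed state back from disk.
conn_after = sqlite3.connect(db_path)
recovered = dict(conn_after.execute("SELECT name, balance FROM accounts"))
conn_after.close()
```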
5 Database Families
5 Main Database Families
SQL
  Relational: Postgres, MySQL, MS SQL Server, Oracle
NoSQL
  Key-Value: Redis, Riak, DynamoDB*
  Document: MongoDB, Couchbase, CouchDB, Firebase
  Columnar: Cassandra, HBase, Google Cloud Bigtable
  Graph: Neo4j, OrientDB*, Cosmos DB*
* Multi-model database
SQL / Relational
Relational - Major Players
Logos used via “fair use”
Relational – Data model
 Data resides in tables containing rows and columns
ProductID (PK)   ProductName                    CurrentPrice   SubcategoryID (FK)
1                Red Headphones                 49.99          1
2                Blue Headphones                59.99          1
3                Gryffindor Scarf               9.99           2
4                Ravenclaw Scarf                9.99           2
5                Hufflepuff                     9.99           2
6                Slytherin Scarf                9.99           2
7                Developer Dragon T-shirt (M)   19.99          2
Relational – Data model
 Each column has a data type, like INT or VARCHAR(50)
 At any given time, every row in a table has the same fixed set of columns
 Table structure must be defined prior to adding data
Relational – Data model
 Data is accessed using SQL
SELECT ProductName, CurrentPrice
FROM Products;
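To run this query without installing a server, Python's built-in `sqlite3` module is enough. The sketch below defines the table up front, as the relational model requires, and loads a few of the sample rows. The column types are an assumption, and SQLite accepts `VARCHAR(50)` without enforcing the length.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The table structure, with a type per column, must exist before any data goes in.
conn.execute("""
    CREATE TABLE Products (
        ProductID     INTEGER PRIMARY KEY,
        ProductName   VARCHAR(50) NOT NULL,
        CurrentPrice  NUMERIC NOT NULL,
        SubcategoryID INTEGER NOT NULL
    )""")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?, ?)",
    [
        (1, "Red Headphones", 49.99, 1),
        (2, "Blue Headphones", 59.99, 1),
        (3, "Gryffindor Scarf", 9.99, 2),
    ],
)

rows = conn.execute(
    "SELECT ProductName, CurrentPrice FROM Products ORDER BY ProductID"
).fetchall()
```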
Data lives in tables
Customers, Benefits, Orders, Categories, OrderDetails, Products
That can be tied together
In many ways
Providing much flexibility
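As a rough illustration of tying tables together in more than one way, the sketch below joins four of these tables with SQLite to answer two different questions from the same data. The columns and sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Products  (ProductID INTEGER PRIMARY KEY, ProductName TEXT, CurrentPrice REAL);
CREATE TABLE Orders    (OrderID INTEGER PRIMARY KEY,
                        CustomerID INTEGER REFERENCES Customers(CustomerID));
CREATE TABLE OrderDetails (OrderID INTEGER REFERENCES Orders(OrderID),
                           ProductID INTEGER REFERENCES Products(ProductID),
                           Quantity INTEGER);
INSERT INTO Customers VALUES (1, 'April'), (2, 'Taylor');
INSERT INTO Products VALUES (1, 'Red Headphones', 49.99), (3, 'Gryffindor Scarf', 9.99);
INSERT INTO Orders VALUES (10, 1), (11, 2);
INSERT INTO OrderDetails VALUES (10, 3, 2), (10, 1, 1), (11, 3, 1);
""")

# Question 1: how many items has each customer ordered?
items_per_customer = conn.execute("""
    SELECT c.Name, SUM(d.Quantity)
    FROM Customers c
    JOIN Orders o       ON o.CustomerID = c.CustomerID
    JOIN OrderDetails d ON d.OrderID = o.OrderID
    GROUP BY c.Name
    ORDER BY c.Name""").fetchall()

# Question 2: starting from a product, who bought it?
scarf_buyers = conn.execute("""
    SELECT DISTINCT c.Name
    FROM Products p
    JOIN OrderDetails d ON d.ProductID = p.ProductID
    JOIN Orders o       ON o.OrderID = d.OrderID
    JOIN Customers c    ON c.CustomerID = o.CustomerID
    WHERE p.ProductName = 'Gryffindor Scarf'
    ORDER BY c.Name""").fetchall()
```

Neither question had to be anticipated when the tables were designed, which is the flexibility this slide sequence points at.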
Relational - Overview
 Design schema with types ahead of time
 Can query the data with lots of flexibility
 ACID-compliant
Relational – Popular for
 Online Transaction Processing (OLTP)
 Order entry
 Financial transactions
 Retail sales
 Data warehousing
 Great for many things – Some people recommend using relational databases
unless expecting to reach 500 GB or more fairly quickly
Relational – Not great for
 Large volume (Petabytes) of data
 Rapidly ingesting large volumes of data with unknown structures
 Scaling out / Horizontal scaling
Relational – What if I want more?
 Postgres – JSON data type (9.2), JSONB data type (9.4)
 SQL Server – JSON functionality (2016), R (2016), Python (2017)
 MySQL – JSON data type (5.7.8)
 Oracle – JSON functionality (12.1.0.2)
Document
Document – Major Players
Document – Data model
 Data resides in documents, such as JSON or XML documents
{
"EpisodeTitle": "And the Crown of King Arthur",
"Director": "Dean Devlin",
"FranchiseName": "The Librarians",
"Characters": [
{
"CharacterName": "Cassandra",
"Actor": "Lindy Booth"
},
{
"CharacterName": "Ezekiel",
"Actor": "John Harlan Kim"
},
{
"CharacterName": "Jake",
"Actor": "Christian Kane"
}
]
}
Document – Data model
 Each document corresponds to a
document key
 Can store nested objects within a
document
 Typically store all data for a single object
in a single document
 Typically can query by document key or
data within the document
DocumentKey: “7E8ABGED92A”
{
"EpisodeTitle": "And the Crown of King Arthur",
"Director": "Dean Devlin",
"FranchiseName": "The Librarians",
"Characters": [
{
"CharacterName": "Cassandra",
"Actor": "Lindy Booth"
},
{
"CharacterName": "Ezekiel",
"Actor": "John Harlan Kim"
},
{
"CharacterName": "Jake",
"Actor": "Christian Kane"
}
]
}
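The two access paths, by document key and by data inside the document, can be mimicked with an ordinary Python dictionary. This is a toy sketch rather than any vendor's API: `put`, `get`, and `find_keys_by_actor` are hypothetical helpers, the episode document is trimmed to two characters, and the second document and its key are made up to show that documents need not share a structure.

```python
import copy

store = {}  # document key -> document

def put(key, document):
    store[key] = copy.deepcopy(document)  # keep our own copy of the nested data

def get(key):
    return store.get(key)

def find_keys_by_actor(actor):
    """Query by data nested inside the documents instead of by key."""
    return [
        key
        for key, doc in store.items()
        if any(ch.get("Actor") == actor for ch in doc.get("Characters", []))
    ]

put("7E8ABGED92A", {
    "EpisodeTitle": "And the Crown of King Arthur",
    "Director": "Dean Devlin",
    "FranchiseName": "The Librarians",
    "Characters": [
        {"CharacterName": "Cassandra", "Actor": "Lindy Booth"},
        {"CharacterName": "Jake", "Actor": "Christian Kane"},
    ],
})
# A document with a completely different shape lives in the same store.
put("QUOTE-0001", {"Franchise": "Star Wars", "Character": "Yoda"})

by_key = get("7E8ABGED92A")
by_content = find_keys_by_actor("Lindy Booth")
```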
Document
 Schemaless, nested data documents
 You don’t need to know the structure ahead of time
 ACID-compliance depends upon the provider and might be complete, partial, or
none
Document – Data access
 Data manipulation and querying depend upon the specific provider
 MongoDB: JavaScript
 Couchbase: N1QL (SQL for JSON) for querying; other options for insert/update/delete
MongoDB – Example
Insert
db.quotes.insertMany([
{ Franchise: "Star Wars", Character: "Yoda", QuoteText: "Do. Or do not. There is no try."},
{ Franchise: "The Librarians", Character: "Cassandra", QuoteText: "Mathemagics. I like it.."}
]);
Retrieve
db.quotes.find();
MongoDB – Insert Example
 Inserting two quotes
MongoDB – Retrieve Example
 Retrieving all quotes
Document – Popular for
 OLTP
 Customer 360
 IoT
Document – Real Examples
 eBay – Stored metadata for every item for sale on eBay using MongoDB
 Gap – Many supply chain systems run against MongoDB
 Marriott - Entire reservation system run via Couchbase
 Viber - Mobile devices with always-available messaging using Couchbase
Document – Not great for
 Large-scale batch analytics
 Highly interconnected data sets
Key-Value
Key-Value – Major Players
Key-Value – Data model
 Data resides in Key-Value pairs where the Key must be unique
Key                     Value
Key1                    25
ab928019281019210       “Carmen Electron”
ae0918384901-01102      <QuoteText>Mathemagics. I like it..</QuoteText>
user:jackson123:name    “Jackson Peterson”
librarians:characters   {"Characters": [
                          { "CharacterName": "Cassandra", "Actor": "Lindy Booth" },
                          { "CharacterName": "Ezekiel", "Actor": "John Harlan Kim" },
                          { "CharacterName": "Jake", "Actor": "Christian Kane" }
                        ]}
Key-Value – Data Model
 Highly flexible in what you can store
 Don’t need to know the data structure ahead of time
 Can store various kinds of simple data like integers and strings and more complex
data like JSON or XML with very high levels of nesting.
 When designing, strive for a system that will know the key when querying
Key-Value – Data Access
 Typically you query the database by the key and not the data corresponding to
the key
 In simplest Key-Value systems the data value is opaque
 Some providers provide additional capabilities to allow range queries or other
extended functionality
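A toy key-value store makes the access pattern concrete. In this sketch, which models no particular product's API, values are stored as opaque bytes, lookups go through the key, and a prefix scan stands in for the kind of extended capability some providers add. The `KeyValueStore` class and the e-mail entry are invented for illustration.

```python
class KeyValueStore:
    """Minimal in-memory key-value store; values are opaque bytes."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = bytes(value)  # the store never looks inside the value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def keys_with_prefix(self, prefix):
        # An "extended" feature: range-like lookup over keys, still not over values.
        return sorted(k for k in self._data if k.startswith(prefix))

store = KeyValueStore()
store.put("Key1", b"25")
store.put("user:jackson123:name", "Jackson Peterson".encode("utf-8"))
store.put("user:jackson123:email", b"jackson@example.com")  # hypothetical entry

name = store.get("user:jackson123:name")
user_keys = store.keys_with_prefix("user:jackson123:")
missing = store.get("user:unknown:name")
```

Finding every user named Jackson would mean decoding and scanning every value, which is why the design advice is to build keys the application will already know at query time.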
Key-Value – Often strong with
 Horizontal scaling
 Speed
 Handling data with unknown structure
(Diagram: data spread across three servers)
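One common way key-value data is spread across servers is to hash each key and use the hash to choose a server. The sketch below shows that idea only. The server names are placeholders, and real systems typically use more elaborate schemes (consistent hashing, for instance) so that adding a server moves fewer keys.

```python
import hashlib

SERVERS = ["server-0", "server-1", "server-2"]  # placeholder names

def server_for(key):
    """Pick a server from a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return SERVERS[int.from_bytes(digest[:8], "big") % len(SERVERS)]

keys = ["Key1", "user:jackson123:name", "librarians:characters"]
placement = {key: server_for(key) for key in keys}
```

Because the key alone decides where a value lives, any client can find it without consulting the other servers.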
Key Value – Popular For
 Messaging and chat
 User and session data
   bet365, Hibernum, Riot Games, and Rovio have used Riak KV to store session data for gamers and players
   Virgin America and Flywheel have used Riak KV to store passenger information and session data
Key-Value – Not great for
 Flexible querying
 Querying capabilities are limited
 Data warehousing with aggregations of numbers
 Complex data query needs
Columnar
Columnar – Major Players
Columnar – Data model
 Data resides in a keyspace (table in HBase) that contains column families
Row key        Column family “color”      Column family “shape”
“primary”      “red”: “#FF0000”           “rectangle”: “4”
               “yellow”: “#FFFF00”        “triangle”: “3”
               “blue”: “#0000FF”
“secondary”    “purple”: “#A020F0”        “triangle”: “3”
               “orange”: “#FFA500”
               “green”: “#008000”
“rainbow”      “red”: “#FF0000”           “icosagon”: “20”
               “orange”: “#FFA500”
               “yellow”: “#FFFF00”
               “green”: “#008000”
               “blue”: “#0000FF”
               “indigo”: “#4b0082”
               “violet”: “#EE82EE”
Columnar – Data model
 Terminology varies even between HBase and Cassandra. We’ll look at HBase.
 A column family may contain multiple columns
 For each row, the column family may have different columns
 Columns don’t need to be known at time of column family creation
Columnar – Data model
 For “primary” the columns are
color:red
color:yellow
color:blue
shape:rectangle
shape:triangle
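The family:qualifier naming can be modelled with nested dictionaries. This is a loose sketch of the data model, not of HBase itself: `put` and `columns` are hypothetical helpers, the column families are fixed up front, and the columns inside each family are created on first write.

```python
COLUMN_FAMILIES = ("color", "shape")  # fixed when the "table" is created
table = {}  # row key -> column family -> column qualifier -> value

def put(row_key, column, value):
    """Write one cell; `column` is written as 'family:qualifier'."""
    family, qualifier = column.split(":", 1)
    if family not in COLUMN_FAMILIES:
        raise KeyError(f"unknown column family: {family}")
    row = table.setdefault(row_key, {name: {} for name in COLUMN_FAMILIES})
    row[family][qualifier] = value  # qualifiers need no prior declaration

def columns(row_key):
    """List the fully qualified columns that exist for one row."""
    return [
        f"{family}:{qualifier}"
        for family, cells in table[row_key].items()
        for qualifier in cells
    ]

for column, value in [
    ("color:red", "#FF0000"), ("color:yellow", "#FFFF00"), ("color:blue", "#0000FF"),
    ("shape:rectangle", "4"), ("shape:triangle", "3"),
]:
    put("primary", column, value)
put("secondary", "color:purple", "#A020F0")
put("secondary", "shape:triangle", "3")

primary_columns = columns("primary")
secondary_columns = columns("secondary")
```

The two rows end up with different columns inside the same two families, which is the property the preceding slides describe.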
Columnar – Data model
 HBase code to create an empty version of this “table”
hbase> create 'visualizations', 'color', 'shape'
Columnar – Strong at
 Fast retrieval of columns of data
 Scaling “out” / horizontal scaling
Columnar – Popular For
 Analytics, like data warehousing, on large amounts of data
 Internet of Things (IoT)
Columnar – Real Examples
 Twitter - people search capability
 Facebook Messenger (previously)
Columnar – Not great for
 Online Transaction Processing (OLTP)
   Inserting, updating, or deleting an entire row is relatively slow
 Systems with small amounts of data
   Low gigabytes or smaller
 Systems where you don’t understand your query needs upfront, as design tends to be focused on meeting those query needs
Graph
Graph – Major Players
Graph – Data model
 Data resides in Graphs containing Nodes and Relationships
(Diagram: nodes Jesse, Jeff, and Jamie, with “Friends With” relationships between nodes)
Graph – Data model
 Nodes (vertices) – Entities like people,
accounts, products
 Nodes can contain multiple pieces of data
for an entity
 Relationships (edges) – Represent
relationships between nodes
 Relationships can contain additional data
about a relationship
Example node: Jamie (Joined: 2016, Province: OK)
Example relationship: Friends With (As of: 2017)
Graph Example
Appeared In
Neo4j - Example
Neo4j – Answering questions with Cypher
 Who are the friends of Supergirl?
MATCH (n)-[:FRIENDS_WITH]-(m) WHERE n.name="Supergirl" RETURN n, m;
 Who are the enemies of Supergirl?
MATCH (n)-[:ENEMY_OF]-(m) WHERE n.name="Supergirl" RETURN n, m;
Friends of Supergirl
Kinds of Questions?
 Who are the friends of friends of Supergirl?
 Who are within 2 degrees of Supergirl?
 Who are friends of people who worked with Supergirl?
 Who are friends of enemies of Supergirl?
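Questions like these come down to walking relationships outward from a starting node. The sketch below does that with a breadth-first search over a small undirected friendship graph in plain Python. The friendships are invented sample data rather than anything from the talk's Neo4j database, and a graph database would express the same traversal as a query instead.

```python
from collections import deque

# Invented sample data: each pair is one FRIENDS_WITH relationship.
FRIENDSHIPS = [
    ("Supergirl", "Alex"),
    ("Supergirl", "Winn"),
    ("Alex", "Maggie"),
    ("Winn", "James"),
]

adjacency = {}
for left, right in FRIENDSHIPS:
    adjacency.setdefault(left, set()).add(right)
    adjacency.setdefault(right, set()).add(left)  # friendship goes both ways

def distances(start, max_depth):
    """Breadth-first search: hops from `start` to every node within max_depth."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_depth:
            continue  # do not expand past the requested depth
        for neighbor in adjacency.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def friends_of(name):
    return sorted(adjacency.get(name, ()))

def friends_of_friends(name):
    return sorted(n for n, d in distances(name, 2).items() if d == 2)

def within_degrees(name, degrees):
    return sorted(n for n, d in distances(name, degrees).items() if d > 0)

direct = friends_of("Supergirl")
second_degree = friends_of_friends("Supergirl")
within_two = within_degrees("Supergirl", 2)
```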
Graph – Popular For
 Applications that work with highly connected datasets
 Social networking
 Recommendation engines
 Fraud detection
 Knowledge graphs
 Asset Management
Graph – Real Examples
 Walmart – Online real-time recommendations using Neo4j
 Monsanto – Analysis of plant genetics using Neo4j
Graph – Not great for
 Large scale analytics
 Example: Not great at aggregating numbers
 Online Transaction Processing (OLTP)
CAP Theorem – Important Too
 In distributed database systems, you can have 2 of these 3:
   Consistency
   Availability
   Partition Tolerance
 Possible pairings: CA, CP, AP
Where to go from here?
 https://www.db-fiddle.com/ - Playground for multiple SQL databases
 http://www.sqlfiddle.com – Playground for multiple SQL databases
 https://neo4j.com/sandbox-v2/ - Neo4j sandbox
 https://docs.mongodb.com/manual/tutorial/query-documents/ - MongoDB
documentation with an online MongoDB Web Shell
 https://university.mongodb.com/courses/catalog - MongoDB Courses
 https://www.couchbase.com/get-started - Couchbase Get Started, including interactive
tutorial for N1QL
 Seven Databases in Seven Weeks - Book by Eric Redmond and Jim Wilson
Q&A and Thank You
Q&A
Catch up with me later
 Twitter @SQLEnergy
 Techlahoma Slack @EnergyDev
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 

So MANY databases, which one do I pick?

  • 1. So MANY databases, which one do I pick? BY KRISTIN FERRIER
  • 2. About Me – Kristin Ferrier
      • 18+ years in IT
      • Principal Consultant at Ferrier Solutions
      • Full stack web developer with specialty and passion for data
      • Twitter: @SQLEnergy
      • Techlahoma Slack: @EnergyDev
      • GitHub: @EnergyDev
  • 3. 10-15 Years Ago: Oracle, SQL Server, Postgres, MySQL. Which database should I use? There are four to pick from.
  • 5. Overview
      • Difference between SQL and NoSQL databases
      • ACID – Something you may want your database to have
      • 5 Main Database Families
      • Great scenarios for each of these families
  • 7. SQL Databases
      • SQL databases are also known as Relational Databases
      • They store data in tables (in math terms called relations)
      • Use SQL to interact with the data
          ID  Product           Price
          3   Gryffindor Scarf  9.99
          4   Ravenclaw Scarf   9.99
          5   Hufflepuff        9.99
          6   Slytherin Scarf   9.99
  • 8. NoSQL
      • Varying opinions on the definition
      • Purpose-built databases for specific data models with an emphasis on flexible schemas and new demands of modern applications
      • The data models include document, graph, key-value, and columnar
      • Non-relational databases (i.e. any database that isn’t relational)
  • 11. Atomicity
      • Transaction must execute completely or not at all
      • Such atomicity must be guaranteed in all situations, including power failures and hardware crashes
      • Example: moving $1,000 from Account A to Account B
          1. Account A: withdraw $1,000
          2. Account B: deposit $1,000
      • This entire “unit of work” succeeds or fails. No intermediate state.
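The atomicity slide can be tried out with a small sketch. It uses Python's built-in `sqlite3` module as a stand-in for a relational database. The `accounts` table, the starting balances, and the `transfer` helper are assumptions made for illustration and are not part of the talk.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single unit of work."""
    try:
        # `with conn` commits if the block succeeds and rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            cur = conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            if cur.rowcount != 1:
                # The deposit found no account, so abandon the whole transfer.
                raise ValueError(f"unknown account: {dst}")
        return True
    except ValueError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
# Hypothetical starting balances.
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 1000), ("B", 0)])
conn.commit()

ok = transfer(conn, "A", "B", 1000)          # both steps succeed
failed = transfer(conn, "B", "missing", 500)  # withdrawal is undone by the rollback
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

In the second call the withdrawal from B has already run when the deposit fails, yet the rollback leaves B's balance untouched, so no intermediate state is ever visible.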
  • 12. Consistency
      • Once a transaction has been committed, the data must conform to the database schema.
      • Example: Database has foreign key constraint requiring deposits or withdrawals to happen with bank accounts that exist in the Accounts table. Thus, the system would only commit deposits or withdrawals corresponding to an account in the Accounts table.
      • Accounts table: 123, 456, 789
          • Deposit $250 into Account 123: Yes
          • Deposit $250 into Account 999: No. Account doesn’t exist
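A minimal sketch of the foreign key example follows, again using `sqlite3`. The `Deposits` table, its column names, and the `deposit` helper are assumptions. Only the `Accounts` table and the account numbers come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE Accounts (AccountID INTEGER PRIMARY KEY)")
conn.execute(
    """
    CREATE TABLE Deposits (
        DepositID INTEGER PRIMARY KEY,
        AccountID INTEGER NOT NULL REFERENCES Accounts(AccountID),
        Amount    INTEGER NOT NULL
    )
    """
)
conn.executemany("INSERT INTO Accounts VALUES (?)", [(123,), (456,), (789,)])
conn.commit()


def deposit(account_id, amount):
    """Try to record a deposit; the database rejects unknown accounts."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO Deposits (AccountID, Amount) VALUES (?, ?)",
                (account_id, amount),
            )
        return "Yes"
    except sqlite3.IntegrityError:
        return "No. Account doesn't exist"


results = {123: deposit(123, 250), 999: deposit(999, 250)}
deposit_count = conn.execute("SELECT COUNT(*) FROM Deposits").fetchone()[0]
```

The application code never checks whether account 999 exists. The constraint in the schema does that work, which is the point of the slide.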
  • 13. Isolation
      • Concurrent transactions must leave the database in the same state as if they had been executed sequentially
      • Let’s look at an example where April is transferring $1,000 from Account A to Account B, while Taylor is withdrawing $500 from Account B. Below is the expected result.
          Step                         Account A   Account B
          Start                        $3,000      $2,000
          April transfers $1,000       $2,000      $3,000
          Taylor withdraws $500        $2,000      $2,500
  • 14. Isolation (without isolation, the account loses money)
      • Transfer $1,000 from Account A to Account B
          • Read Account A as $3,000
          • Withdraw $1,000 from Account A
          • Read Account B as $2,000
          • Add $1,000 to Account B
          • Account B has $3,000
      • Withdraw $500 from Account B (running at the same time)
          • Read Account B as $2,000
          • Withdraw $500 from Account B
          • Account B has $1,500
      • We LOST $1,000!
  • 15. Isolation (with isolation, we don’t lose the money)
      • Transfer $1,000 from Account A to Account B
          • Read Account A as $3,000
          • Pull $1,000 from Account A
          • Read Account B as $2,000
          • Add $1,000 to Account B
          • Account B has $3,000
      • Withdraw $500 from Account B (running at the same time)
          • Read Account B (sorry, please, wait)
          • We’re waiting
          • We’re waiting
          • Account B has $3,000
          • Withdraw $500 from Account B
          • Account B has $2,500
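The two isolation slides can be replayed in plain Python. This is only a sketch: the first function hard-codes the unlucky interleaving from slide 14, and the second uses a `threading.Lock` as a stand-in for whatever locking or versioning a real database uses. All function names are hypothetical.

```python
import threading


def run_without_isolation():
    """Replay slide 14: both operations read Account B before either writes it."""
    accounts = {"A": 3000, "B": 2000}
    a_seen = accounts["A"]                        # transfer: read A as 3,000
    accounts["A"] = a_seen - 1000                 # transfer: withdraw 1,000 from A
    b_seen_by_transfer = accounts["B"]            # transfer: read B as 2,000
    b_seen_by_withdrawal = accounts["B"]          # withdrawal: also reads B as 2,000
    accounts["B"] = b_seen_by_transfer + 1000     # transfer writes 3,000
    accounts["B"] = b_seen_by_withdrawal - 500    # withdrawal overwrites it with 1,500
    return accounts


def run_with_isolation():
    """Replay slide 15: each operation holds a lock, so the other one waits."""
    accounts = {"A": 3000, "B": 2000}
    lock = threading.Lock()

    def transfer():
        with lock:
            accounts["A"] -= 1000
            accounts["B"] += 1000

    def withdraw():
        with lock:
            accounts["B"] -= 500

    threads = [threading.Thread(target=transfer), threading.Thread(target=withdraw)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return accounts


lost = run_without_isolation()
safe = run_with_isolation()
```

With the lock in place the final balances are the same whichever thread runs first, which matches the "as if executed sequentially" wording on slide 13.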
  • 16. Durability
      • Once a transaction has completed execution, the updates to the database are persisted in a way that is recoverable upon system failure.
      • Example: April’s $1,000 transfer, once completed, could be recovered in case of hardware failure.
      • Diagram: $1,000 moves from Account A to Account B, then the hardware crashes. We still know about the transfer.
  • 18. 5 Main Database Families
          Type    Family       Examples
          SQL     Relational   Postgres, MySQL, MS SQL Server, Oracle
          NoSQL   Key-Value    Redis, Riak, DynamoDB*
          NoSQL   Document     MongoDB, Couchbase, CouchDB, Firebase
          NoSQL   Columnar     Cassandra, HBase, Google Cloud BigTable
          NoSQL   Graph        Neo4j, OrientDB*, Cosmos DB*
          * Multi-model database
  • 20. Relational - Major Players Logos used via “fair use”
  • 21. Relational – Data model
      • Data resides in tables containing rows and columns
          ProductID (PK)  ProductName                   CurrentPrice  SubcategoryID (FK)
          1               Red Headphones                49.99         1
          2               Blue Headphones               59.99         1
          3               Gryffindor Scarf              9.99          2
          4               Ravenclaw Scarf               9.99          2
          5               Hufflepuff                    9.99          2
          6               Slytherin Scarf               9.99          2
          7               Developer Dragon T-shirt (M)  19.99         2
  • 22. Relational – Data model
      • Each column has a data type, like INT or VARCHAR(50)
      • At any one time, every row in a table has the same fixed set of columns
      • Table structure must be defined prior to adding data
      • (The slide repeats the product table from slide 21.)
  • 23. Relational – Data model
      • Data is accessed using SQL
          SELECT ProductName, CurrentPrice FROM Products
  • 24. Data lives in tables: Customers, Benefits, Orders, Categories, OrderDetails, Products
  • 25. That can be tied together: Customers, Benefits, Orders, Categories, OrderDetails, Products
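As a rough illustration of typed columns, a schema defined before any data arrives, and tables being tied together, here is a small `sqlite3` sketch. The `Products` columns follow the slide. The `Subcategories` table and its names ("Audio", "Apparel") are assumptions, since the talk only shows the `SubcategoryID` foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- The structure, including column types, is declared before any rows exist.
    CREATE TABLE Subcategories (
        SubcategoryID   INTEGER PRIMARY KEY,
        SubcategoryName VARCHAR(50) NOT NULL
    );
    CREATE TABLE Products (
        ProductID     INTEGER PRIMARY KEY,
        ProductName   VARCHAR(50) NOT NULL,
        CurrentPrice  DECIMAL(10, 2) NOT NULL,
        SubcategoryID INTEGER NOT NULL REFERENCES Subcategories(SubcategoryID)
    );
    -- Hypothetical subcategory names.
    INSERT INTO Subcategories VALUES (1, 'Audio'), (2, 'Apparel');
    INSERT INTO Products VALUES
        (1, 'Red Headphones', 49.99, 1),
        (3, 'Gryffindor Scarf', 9.99, 2);
    """
)

# The join ties each product to its subcategory through the foreign key.
rows = conn.execute(
    """
    SELECT p.ProductName, p.CurrentPrice, s.SubcategoryName
    FROM Products AS p
    JOIN Subcategories AS s ON s.SubcategoryID = p.SubcategoryID
    ORDER BY p.ProductID
    """
).fetchall()
```

The same two tables could answer many other questions (products per subcategory, average price, and so on) without changing how the data is stored, which is the query flexibility the relational overview refers to.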
  • 28. Relational - Overview
      • Design schema with types ahead of time
      • Can query the data with lots of flexibility
      • ACID-compliant
  • 29. Relational – Popular for
      • Online Transaction Processing (OLTP)
          • Order entry
          • Financial transactions
          • Retail sales
      • Data warehousing
      • Great for many things – Some people recommend using relational databases unless expecting to reach 500 GB or more fairly quickly
  • 30. Relational – Not great for
      • Large volume (Petabytes) of data
      • Rapidly ingesting large volumes of data with unknown structures
      • Scaling out / Horizontal scaling
  • 31. Relational – What if I want more?
      • Postgres – JSON data type (9.2), JSONB data type (9.4)
      • SQL Server – JSON functionality (2016), R (2016), Python (2017)
      • MySQL – JSON data type (5.7.8)
      • Oracle – JSON functionality (12.1.0.2)
  • 34. Document – Data model
      • Data resides in documents, such as JSON or XML documents
          {
            "EpisodeTitle": "And the Crown of King Arthur",
            "Director": "Dean Devlin",
            "FranchiseName": "The Librarians",
            "Characters": [
              { "CharacterName": "Cassandra", "Actor": "Lindy Booth" },
              { "CharacterName": "Ezekiel", "Actor": "John Harlin Kim" },
              { "CharacterName": "Jake", "Actor": "Christian Kane" }
            ]
          }
  • 35. Document – Data model
      • Each document corresponds to a document key
      • Can store nested objects within a document
      • Typically store all data for a single object in a single document
      • Typically can query by document key or data within the document
      • Example: DocumentKey "7E8ABGED92A" maps to the episode document shown on slide 34
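To make "query by document key or data within the document" concrete, here is a toy in-memory sketch in Python. It is not how MongoDB or Couchbase work internally. The second document, its key, and both helper functions are hypothetical.

```python
# A toy document store: document key -> nested document.
documents = {
    "7E8ABGED92A": {
        "EpisodeTitle": "And the Crown of King Arthur",
        "Director": "Dean Devlin",
        "FranchiseName": "The Librarians",
        "Characters": [
            {"CharacterName": "Cassandra", "Actor": "Lindy Booth"},
            {"CharacterName": "Jake", "Actor": "Christian Kane"},
        ],
    },
    # Hypothetical second document with a different shape: no schema forbids this.
    "HYPOTHETICAL-KEY-2": {
        "Franchise": "Star Wars",
        "QuoteText": "Do. Or do not. There is no try.",
    },
}


def get_by_key(key):
    """Look a document up directly by its document key."""
    return documents.get(key)


def find_keys_by_actor(actor):
    """Scan the documents and match on data nested inside them."""
    return [
        key
        for key, doc in documents.items()
        if any(c.get("Actor") == actor for c in doc.get("Characters", []))
    ]


episode = get_by_key("7E8ABGED92A")
matches = find_keys_by_actor("Lindy Booth")
```

Real document databases typically add indexes so that the second kind of lookup does not require scanning every document.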
  • 36. Document
      • Schemaless, nested data documents
      • You don’t need to know the structure ahead of time
      • ACID-compliance depends upon the provider and might be complete, partial, or none
  • 37. Document – Data access
      • Data manipulation and querying depends upon the specific provider
      • MongoDB: JavaScript
      • Couchbase: N1QL (SQL for JSON) for querying; other options for insert/update/delete
  • 38. MongoDB – Example
      • Insert
          db.quotes.insertMany([
            { Franchise: "Star Wars", Character: "Yoda", QuoteText: "Do. Or do not. There is no try." },
            { Franchise: "The Librarians", Character: "Cassandra", QuoteText: "Mathemagics. I like it.." }
          ]);
      • Retrieve
          db.quotes.find();
  • 39. MongoDB – Insert Example
      • Inserting two quotes
  • 40. MongoDB – Retrieve Example
      • Retrieving all quotes
  • 41. Document – Popular for
      • OLTP
      • Customer 360
      • IoT
  • 42. Document – Real Examples
      • eBay – Stored metadata for every item for sale on eBay using MongoDB
      • Gap – Many supply chain systems run against MongoDB
      • Marriott – Entire reservation system run via Couchbase
      • Viber – Mobile devices with always-available messaging using Couchbase
  • 43. Document – Not great for
      • Large-scale batch analytics
      • Highly interconnected data sets
  • 46. Key-Value – Data model
      • Data resides in Key-Value pairs where the Key must be unique
          Key                     Value
          Key1                    25
          ab928019281019210       "Carmen Electron"
          ae0918384901-01102      <QuoteText>Mathemagics. I like it..</QuoteText>
          user:jackson123:name    "Jackson Peterson"
          librarians:characters   {"Characters": [ { "CharacterName": "Cassandra", "Actor": "Lindy Booth" }, { "CharacterName": "Ezekiel", "Actor": "John Harlin Kim" }, { "CharacterName": "Jake", "Actor": "Christian Kane" } ]}
  • 47. Key-Value – Data Model
      • Highly flexible in what you can store
      • Don’t need to know the data structure ahead of time
      • Can store various kinds of simple data like integers and strings and more complex data like JSON or XML with very high levels of nesting.
      • When designing, strive for a system that will know the key when querying
  • 48. Key-Value – Data Access
      • Typically you query the database by the key and not the data corresponding to the key
      • In simplest Key-Value systems the data value is opaque
      • Some providers provide additional capabilities to allow range queries or other extended functionality
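The idea of opaque values and key-only access can be sketched with a plain Python dictionary. The `put`, `get`, and `user_name_key` helpers are hypothetical. The key naming pattern follows the `user:jackson123:name` example from slide 46.

```python
import json

# A toy key-value store: the store sees every value as opaque bytes.
store = {}


def put(key, value):
    """Store raw bytes under a unique key, replacing any previous value."""
    store[key] = value


def get(key):
    """Fetch by key only; there is no way to search inside the values."""
    return store.get(key)


def user_name_key(user_id):
    """Build the key from data the application already knows at query time."""
    return f"user:{user_id}:name"


put("Key1", b"25")
put(user_name_key("jackson123"), "Jackson Peterson".encode("utf-8"))
put(
    "librarians:characters",
    json.dumps({"Characters": [{"CharacterName": "Jake", "Actor": "Christian Kane"}]}).encode("utf-8"),
)

# The caller, not the store, knows how to interpret each value.
name = get(user_name_key("jackson123")).decode("utf-8")
characters = json.loads(get("librarians:characters"))["Characters"]
```

Because every read is a lookup on one known key, such a store is simple to spread across servers, but a question like "which users are named Jackson?" has no direct answer without extra features from the provider.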
  • 49. Key-Value – Often strong with
      • Horizontal scaling (diagram: data spread across three servers)
      • Speed
      • Handling data with unknown structure
  • 50. Key-Value – Popular For
      • Messaging and chat
      • User and session data
          • bet365, Hibernum, Riot Games, Rovio have used Riak KV to store session data for gamers and players
          • Virgin America and Flywheel have used Riak KV to store passenger information and session data
  • 51. Key-Value – Not great for
      • Flexible querying
          • Querying capabilities are limited
      • Data warehousing with aggregations of numbers
      • Complex data query needs
  • 54. Columnar – Data model
      • Data resides in a keyspace (a table in HBase) that contains column families
          Row key "primary"
              Column family "color": "red": "#FF0000", "yellow": "#FFFF00", "blue": "#0000FF"
              Column family "shape": "rectangle": "4", "triangle": "3"
          Row key "secondary"
              Column family "color": "purple": "#A020F0", "orange": "#FFA500", "green": "#008000"
              Column family "shape": "triangle": 3
          Row key "rainbow"
              Column family "color": "red": "#FF0000", "orange": "#FFA500", "yellow": "#FFFF00", "green": "#008000", "blue": "#0000FF", "indigo": "#4b0082", "violet": "#EE82EE"
              Column family "shape": "icosagon": "20"
  • 55. Columnar – Data model
      • Terminology varies even between HBase and Cassandra. We’ll look at HBase.
      • A column family may contain multiple columns
      • For each row, the column family may have different columns
      • Columns don’t need to be known at time of column family creation
      • (The slide repeats the table from slide 54.)
  • 56. Columnar – Data model
      • For “primary” the columns are color:red, color:yellow, color:blue, shape:rectangle, shape:triangle
      • (The slide repeats the table from slide 54.)
  • 57. Columnar – Data model
      • HBase code to create an empty version of this “table”
          hbase> create 'visualizations', 'color', 'shape'
      • (The slide repeats the table from slide 54.)
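The row key, column family, and column layout can be modelled with nested dictionaries. This is a sketch of the data model only, not of HBase's storage or API. The `put` and `qualified_columns` helpers are hypothetical names.

```python
# Column families are fixed when the "table" is created ...
COLUMN_FAMILIES = ("color", "shape")

# ... but the columns inside each family can differ from row to row.
table = {
    "primary": {
        "color": {"red": "#FF0000", "yellow": "#FFFF00", "blue": "#0000FF"},
        "shape": {"rectangle": "4", "triangle": "3"},
    },
    "secondary": {
        "color": {"purple": "#A020F0", "orange": "#FFA500", "green": "#008000"},
        "shape": {"triangle": "3"},
    },
}


def put(row_key, family, column, value):
    """Add a cell; new columns are fine, unknown column families are not."""
    if family not in COLUMN_FAMILIES:
        raise KeyError(f"unknown column family: {family}")
    table.setdefault(row_key, {}).setdefault(family, {})[column] = value


def qualified_columns(row_key):
    """List a row's columns in family:column form."""
    return [
        f"{family}:{column}"
        for family, columns in table[row_key].items()
        for column in columns
    ]


# A column nobody planned for at creation time.
put("rainbow", "shape", "icosagon", "20")
primary_columns = qualified_columns("primary")
```

For the "primary" row this yields the same five qualified column names listed on slide 56.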
  • 58. Columnar – Strong at
      • Fast retrieval of columns of data
      • Scaling “out” / horizontal scaling
  • 59. Columnar – Popular For
      • Analytics, like data warehousing, on large amounts of data
      • Internet of Things (IoT)
  • 60. Columnar – Real Examples
      • Twitter – people search capability
      • Facebook Messenger (previously)
  • 61. Columnar – Not great for
      • Online Transaction Processing (OLTP)
          • Insert/update/delete of an entire row is relatively slow
      • Systems with small amounts of data
          • Low gigabytes or smaller
      • Systems where you don’t understand your query needs upfront, as design tends to be focused on meeting those query needs
  • 62. Graph
  • 63. Graph – Major Players
  • 64. Graph – Data model
      • Data resides in Graphs containing Nodes and Relationships
      • Diagram: nodes Jesse, Jeff, and Jamie, connected by “Friends With” relationships
  • 65. Graph – Data model
      • Nodes (vertices) – Entities like people, accounts, products
      • Nodes can contain multiple pieces of data for an entity
      • Relationships (edges) – Represent relationships between nodes
      • Relationships can contain additional data about a relationship
      • Diagram: node Jamie (Joined: 2016, Province: OK) with a “Friends With” relationship (As of: 2017)
  • 68. Neo4j – Answering questions with Cypher
      • Who are the friends of Supergirl?
          MATCH (n)-[:FRIENDS_WITH]-(m) WHERE n.name="Supergirl" RETURN n, m;
      • Who are the enemies of Supergirl?
          MATCH (n)-[:ENEMY_OF]-(m) WHERE n.name="Supergirl" RETURN n, m;
  • 70. Kinds of Questions?
      • Who are the friends of friends of Supergirl?
      • Who are within 2 degrees of Supergirl?
      • Who are friends of people who worked with Supergirl?
      • Who are friends of enemies of Supergirl?
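Questions such as "who is within 2 degrees of Supergirl?" come down to walking relationships outward from a node. The sketch below does that with a breadth-first search over an adjacency map in Python. Every name other than Supergirl is a hypothetical placeholder, and a graph database would do this traversal natively rather than in application code.

```python
from collections import deque

# Undirected FRIENDS_WITH edges; "A" through "D" are placeholder people.
edges = [("Supergirl", "A"), ("Supergirl", "B"), ("A", "C"), ("B", "C"), ("C", "D")]

adjacency = {}
for left, right in edges:
    adjacency.setdefault(left, set()).add(right)
    adjacency.setdefault(right, set()).add(left)


def within_degrees(start, max_degrees):
    """Return {person: degrees of separation} for everyone within max_degrees."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_degrees:
            continue  # do not walk past the requested number of hops
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    del distance[start]
    return distance


nearby = within_degrees("Supergirl", 2)
friends_of_friends = {name for name, hops in nearby.items() if hops == 2}
```

Here "C" is a friend of a friend, while "D" is three hops away and is left out.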
  • 71. Graph – Popular For
      • Applications that work with highly connected datasets
      • Social networking
      • Recommendation engines
      • Fraud detection
      • Knowledge graphs
      • Asset Management
  • 72. Graph – Real Examples
      • Walmart – Online real-time recommendations using Neo4j
      • Monsanto – Analysis of plant genetics using Neo4j
  • 73. Graph – Not great for
      • Large scale analytics
          • Example: Not great at aggregating numbers
      • Online Transaction Processing (OLTP)
  • 74. CAP Theorem – Important Too
      • Applies to distributed database systems
      • You can have 2 of these 3: Consistency, Availability, Partition Tolerance
      • The possible pairings are CA, CP, and AP
  • 75. Where to go from here?
      • https://www.db-fiddle.com/ - Playground for multiple SQL databases
      • http://www.sqlfiddle.com - Playground for multiple SQL databases
      • https://neo4j.com/sandbox-v2/ - Neo4j sandbox
      • https://docs.mongodb.com/manual/tutorial/query-documents/ - MongoDB documentation with an online MongoDB Web Shell
      • https://university.mongodb.com/courses/catalog - MongoDB Courses
      • https://www.couchbase.com/get-started - Couchbase Get Started, including interactive tutorial for N1QL
      • Seven Databases in Seven Weeks - Book by Eric Redmond and Jim Wilson
  • 76. Q&A and Thank You
      • Catch up with me later
          • Twitter @SQLEnergy
          • Techlahoma Slack @EnergyDev