{
"name": "Andrew Liu",
"e-mail": "andrl@microsoft.com",
"twitter": "@aliuy8"
}
Heterogeneous data
Item | Author | Pages | Language
Harry Potter and the Sorcerer’s Stone | J.K. Rowling | 309 | English
Game of Thrones: A Song of Ice and Fire | George R.R. Martin | 864 | English
Lenovo Thinkpad X1 Carbon | ??? | ??? | ???
A fully managed, scalable, queryable, schema-free JSON document database service for modern applications
transactional processing
rich query
managed as a service
elastic scale
internet accessible over HTTP/REST
schema-free data model
arbitrary data formats
• query over schema-free JSON
• transactional integrated JavaScript
• tunable performance
• fully managed as a service
Query over schema-free JSON
No need to define secondary indices / schema hints for indexing!
SQL Query Grammar
-- Nested lookup against index
SELECT Books.Author
FROM Books
WHERE Books.Author.Name = "Leo Tolstoy"
-- Transformation, Filters, Array access
SELECT { Name: Books.Title, Author: Books.Author.Name }
FROM Books
WHERE Books.Price > 10 AND Books.Languages[0] = "English"
-- Joins, User Defined Functions (UDF)
SELECT CalculateRegionalTax(Books.Price, "USA", "WA")
FROM Books
JOIN LanguagesArr IN Books.Languages
WHERE LanguagesArr.Language = "Russian"
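As a rough illustration of what the intra-document JOIN in the third query does, the sketch below emulates it over an in-memory array in plain JavaScript. The sample documents, the `joinLanguages` helper and the assumption that each `Languages` entry is an object with a `Language` property are made up for illustration; none of this is a DocumentDB API.

```javascript
// Hypothetical sample documents; the field layout is assumed from the queries above.
const books = [
  { Title: "War and Peace", Price: 15, Languages: [{ Language: "Russian" }, { Language: "English" }] },
  { Title: "Emma", Price: 8, Languages: [{ Language: "English" }] },
];

// Emulates: FROM Books JOIN LanguagesArr IN Books.Languages
//           WHERE LanguagesArr.Language = <language>
// Each document is paired with each element of its own Languages array,
// then the pairs are filtered. No other document takes part in the join.
function joinLanguages(docs, language) {
  return docs
    .flatMap((book) => (book.Languages || []).map((lang) => ({ book, lang })))
    .filter((pair) => pair.lang.Language === language)
    .map((pair) => pair.book.Title);
}

const russianTitles = joinLanguages(books, "Russian");
console.log(russianTitles);
```

The point of the sketch is that the join is scoped to a single document: it expands an array inside the document rather than combining separate documents.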
Transactional integrated JavaScript
Database (stored procedure):
// Swaps the items held by two players inside a single transaction.
function(playerId1, playerId2) {
    // Fetch both player documents.
    var playersToSwap = __.filter(function (document) {
        return (document.id == playerId1 || document.id == playerId2);
    });
    var player1 = playersToSwap[0], player2 = playersToSwap[1];
    // Swap the items.
    var player1ItemTemp = player1.item;
    player1.item = player2.item;
    player2.item = player1ItemTemp;
    // Persist both documents; a throw aborts the whole transaction.
    __.replaceDocument(player1)
        .then(function() { return __.replaceDocument(player2); })
        .fail(function(error) { throw 'Unable to update players, abort'; });
}
Client:
client.executeStoredProcedureAsync("procs/1234", ["MasterChief", "SolidSnake"])
    .then(function (response) {
        console.log("success!");
    }, function (err) {
        console.log("Failed to swap!", err);
    });
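The all-or-nothing behaviour of the stored procedure above can be sketched in plain JavaScript, assuming an in-memory `Map` stands in for the collection. `runInTransaction`, `swapItems` and the sample items are hypothetical names for illustration; the real service runs the procedure inside the database, not like this.

```javascript
// Runs `procedure` against a working copy of the store and commits the copy
// only if the procedure returns normally. A throw leaves `store` untouched.
function runInTransaction(store, procedure, args) {
  const working = new Map([...store].map(([id, doc]) => [id, { ...doc }]));
  procedure(working, ...args);
  for (const [id, doc] of working) store.set(id, doc);
}

// Same swap logic as the slide, written against the in-memory stand-in.
function swapItems(docs, playerId1, playerId2) {
  const player1 = docs.get(playerId1);
  const player2 = docs.get(playerId2);
  if (!player1 || !player2) throw new Error("Unable to update players, abort");
  const player1ItemTemp = player1.item;
  player1.item = player2.item;
  player2.item = player1ItemTemp;
}

// Hypothetical sample data.
const store = new Map([
  ["MasterChief", { id: "MasterChief", item: "rifle" }],
  ["SolidSnake", { id: "SolidSnake", item: "box" }],
]);

runInTransaction(store, swapItems, ["MasterChief", "SolidSnake"]);
console.log(store.get("MasterChief").item, store.get("SolidSnake").item);
```

Working on a copy and committing at the end is only one simple way to get rollback in a sketch; it says nothing about how the service implements its transactions.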
Tunable performance
Brewer’s CAP Theorem
Consistency, Availability, Partition Tolerance
DocumentDB offers 4 consistency levels: Strong, Bounded Staleness, Session, Eventual
99.95% Availability SLA
Fully managed as a service
• Predictable Performance
• Hourly Billing
• 99.95% Availability
• Adjustable Performance Levels
Performance levels: S1, S2, S3
I’m not crying anymore
“With Azure DocumentDB, we didn’t have to say ‘no’ to
the business, and we weren’t a bottleneck to launching
the promotion — in fact, we came in ahead of schedule.”
http://aka.ms/docdbsearch
http://aka.ms/docdbhdi
{
    "id": "1",
    "firstName": "Thomas",
    "lastName": "Andersen",
    "addresses": [
        {
            "line1": "100 Some Street",
            "line2": "Unit 1",
            "city": "Seattle",
            "state": "WA",
            "zip": 98012
        }
    ],
    "contactDetails": [
        {"email": "thomas@andersen.com"},
        {"phone": "+1 555 555-5555", "extension": 5555}
    ]
}
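Try to model your entity as a self-contained document
Generally, use embedded data models when:
• There are "contains" relationships between entities
• Relationships are one-to-few
• The embedded data changes infrequently
• The embedded data won’t grow without bound
• The embedded data is integral to the document
Embedding typically provides better read performance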
In general, use normalized data models when:
• Write performance matters more than read performance
• Relationships are one-to-many
• Relationships are many-to-many
• Related data changes frequently
{
    "id": "xyz",
    "username": "user xyz"
}
{
    "id": "address_xyz",
    "userid": "xyz",
    "address": {
        …
    }
}
{
    "id": "contact_xyz",
    "userid": "xyz",
    "email": "user@user.com",
    "phone": "555 5555"
}
Normalizing typically provides better write performance
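To make the trade-off concrete, here is a minimal sketch that splits one embedded user document into the three referenced documents shown above. The `normalizeUser` and `findByUser` helpers and the sample field values are assumptions for illustration only.

```javascript
// Splits one embedded user document into three referenced documents,
// mirroring the "xyz" / "address_xyz" / "contact_xyz" layout above.
function normalizeUser(user) {
  const userDoc = { id: user.id, username: user.username };
  const addressDoc = { id: "address_" + user.id, userid: user.id, address: user.address };
  const contactDoc = { id: "contact_" + user.id, userid: user.id, email: user.email, phone: user.phone };
  return [userDoc, addressDoc, contactDoc];
}

// Reading the full profile back now needs a lookup for the referenced documents.
function findByUser(docs, userId) {
  return docs.filter((doc) => doc.userid === userId);
}

// Hypothetical embedded input document.
const docs = normalizeUser({
  id: "xyz",
  username: "user xyz",
  address: { city: "Seattle" },
  email: "user@user.com",
  phone: "555 5555",
});

console.log(docs.map((doc) => doc.id));
console.log(findByUser(docs, "xyz").length);
```

An update to the phone number now touches only the small contact document, while reading the whole profile needs the extra lookup. That is the write-versus-read trade-off the slide describes.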
No magic bullet
Think about how your data is going to be written and read, and model accordingly
{
    "id": "1",
    "firstName": "Thomas",
    "lastName": "Andersen",
    "countOfBooks": 3,
    "books": [1, 2, 3],
    "images": [
        {"thumbnail": "http://....png"},
        {"profile": "http://....png"}
    ]
}
{
    "id": 1,
    "name": "DocumentDB 101",
    "authors": [
        {"id": 1, "name": "Thomas Andersen", "thumbnail": "http://....png"},
        {"id": 2, "name": "William Wakefield", "thumbnail": "http://....png"}
    ]
}
Request Unit (RU) is the normalized currency
• % Memory
• % IOPS
• % CPU
Replica gets a fixed budget of Request Units
[Diagram: operations on resources and resource sets: documents, SQL, sprocs with args]
Predictable Performance
Operation | Request units (RUs) consumed*
Reading a single 1 KB document | 1
Reading a single 2 KB document | 2
Query with a simple predicate for a 1 KB document | 3
Creating a single 1 KB document with 10 JSON properties (consistent indexing) | 14
Creating a single 1 KB document with 100 JSON properties (consistent indexing) | 20
Replacing a single 1 KB document | 28
Executing a stored procedure with two create documents | 30
• Data Size
A single collection holds 10GB
• Throughput
3 Performance tiers with a max of 2,500 RU/sec
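A back-of-the-envelope capacity estimate follows from the RU figures and the per-collection limits above. The sketch below is only arithmetic on those numbers; the workload and the helper names (`requiredRuPerSecond`, `collectionsNeeded`) are hypothetical.

```javascript
// RU costs per operation, taken from the table above.
const RU_COST = { read1KB: 1, querySimple: 3, create10Props: 14, replace1KB: 28 };
// Per-collection limits from the slide above.
const MAX_RU_PER_SEC = 2500;
const MAX_GB_PER_COLLECTION = 10;

// Sums cost * rate over a workload given as operations per second.
function requiredRuPerSecond(opsPerSecond) {
  return Object.entries(opsPerSecond).reduce(
    (total, [op, rate]) => total + RU_COST[op] * rate,
    0
  );
}

// Minimum number of collections needed to satisfy both limits.
function collectionsNeeded(ruPerSecond, dataGB) {
  return Math.max(
    Math.ceil(ruPerSecond / MAX_RU_PER_SEC),
    Math.ceil(dataGB / MAX_GB_PER_COLLECTION),
    1
  );
}

// Hypothetical workload: 1,000 reads/s, 100 creates/s, 50 replaces/s, 25 GB of data.
const workload = { read1KB: 1000, create10Props: 100, replace1KB: 50 };
const ru = requiredRuPerSecond(workload);
console.log(ru);                        // 1000*1 + 100*14 + 50*28 = 3800
console.log(collectionsNeeded(ru, 25)); // max(ceil(3800/2500), ceil(25/10)) = 3
```

In this made-up workload the storage limit, not throughput, decides the number of collections, which is why both limits are checked.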
Tenant | Partition Id
Customer | 1
Big Customer | 2
Another | 3
{ "record": "1", "created": { "date": "6/1/2014", "epoch": 1401662986 } },
{ "record": "3", "created": { "date": "9/23/2014", "epoch": 1411512586 } },
{ "record": "123", "created": { "date": "8/17/2013", "epoch": 1376779786 } }
SELECT * FROM root r WHERE r.created.epoch BETWEEN 1376779786 AND 1401662986
{ "record": "1", "created": { "date": "6/1/2014", "epoch": 1401662986 } },
{ "record": "3", "created": { "date": "9/23/2014", "epoch": 1411512586 } }
{ "record": "43233", "created": { "epoch": 1411512586 } },
{ "record": "1123", "created": { "date": "8/17/2013", "epoch": 1376779786 } },
{ "record": "43234", "created": { "epoch": 1376779786 } }
Hash sharding
• Examples: Profile data (user ID, app ID) or (user ID), Device and vehicle data (device/VIN ID), Catalog data (item ID)
• Pros: balanced, stateless
• Cons: reshuffling is hard
Range sharding
• Examples: Operational data (timestamp) or (timestamp, event ID)
• Pros: easy sliding window, range queries
• Cons: stateful
Lookup sharding
• Examples: SaaS/multitenant service (tenant ID), Metadata store (type ID)
• Pros: simple, easy to reshuffle, can span accounts
• Cons: stateful, works only on discrete keys
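The three strategies can be sketched as small client-side routing functions. This is a minimal illustration under assumptions: the function names, the simple string hash and the sample boundary are hypothetical, and a real application would pick its own hash and keep its state in durable storage.

```javascript
// Hash sharding: a stable string hash modulo the partition count (stateless).
function hashPartition(key, partitionCount) {
  let hash = 0;
  for (const ch of String(key)) {
    hash = (hash * 31 + ch.codePointAt(0)) >>> 0; // keep an unsigned 32-bit value
  }
  return hash % partitionCount;
}

// Range sharding: upperBounds is a sorted list of exclusive upper bounds.
// Stateful, because the boundary list has to be stored and maintained.
function rangePartition(value, upperBounds) {
  const index = upperBounds.findIndex((bound) => value < bound);
  return index === -1 ? upperBounds.length : index;
}

// Lookup sharding: an explicit map from a discrete key to a partition id.
function lookupPartition(key, table) {
  if (!Object.prototype.hasOwnProperty.call(table, key)) {
    throw new Error("No partition registered for " + key);
  }
  return table[key];
}

// Tenant map from the Tenant / Partition Id table above.
const tenants = { "Customer": 1, "Big Customer": 2, "Another": 3 };
// Hypothetical boundary: epoch seconds for 2014-01-01T00:00:00Z.
const yearBounds = [1388534400];

console.log(hashPartition("user-42", 4));
console.log(rangePartition(1376779786, yearBounds), rangePartition(1401662986, yearBounds));
console.log(lookupPartition("Big Customer", tenants));
```

Changing `partitionCount` in the hash variant remaps most keys, which is the "reshuffling is hard" drawback. The lookup variant can move a single tenant by editing one map entry.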
How it works
Automatic indexing of documents
JSON documents are represented as trees
Structural information and instance values are normalized into a JSON-Path
Fixed upper bound on index size (typically 5-10% in real production data)
Example
{"headquarters": "Belgium"} → /"headquarters"/"Belgium"
{"exports": [{"city": "Moscow"}, {"city": "Athens"}]} → /"exports"/0/"city"/"Moscow" and /"exports"/1/"city"/"Athens"
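The path notation in the example can be reproduced with a short recursive walk over a JSON value. The `indexPaths` helper below is a hypothetical sketch that only mirrors the slide's notation; it makes no claim about how the actual index is stored.

```javascript
// Walks a JSON value as a tree and emits one path per leaf value,
// e.g. {"headquarters": "Belgium"} -> /"headquarters"/"Belgium".
function indexPaths(value, prefix = "") {
  if (Array.isArray(value)) {
    // Array elements are addressed by their numeric position.
    return value.flatMap((item, i) => indexPaths(item, prefix + "/" + i));
  }
  if (value !== null && typeof value === "object") {
    // Property names become quoted path segments.
    return Object.entries(value).flatMap(([key, child]) =>
      indexPaths(child, prefix + "/" + JSON.stringify(key))
    );
  }
  // Leaf: the instance value is the last path segment.
  return [prefix + "/" + JSON.stringify(value)];
}

console.log(indexPaths({ headquarters: "Belgium" }));
console.log(indexPaths({ exports: [{ city: "Moscow" }, { city: "Athens" }] }));
```

Because both structure (property names, array positions) and values end up in one path, an equality predicate on any property can be answered by looking up a single path.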
Configuration | Level | Options
Automatic | Per collection | True (default) or False. Override with each document write.
Indexing Mode | Per collection | Consistent or Lazy. Lazy for eventual updates/bulk ingestion.
Included and excluded paths | Per path | Individual path or recursive includes (? and *)
Indexing Type | Per path | Supports Hash (default) and Range. Hash for equality, Range for range queries.
Indexing Precision | Per path | Supports 3 to 7 per path. Trade-off between storage, query RUs and write RUs.
Path | Description/use case
/ | Default path for collection. Recursive and applies to whole document tree.
/"prop"/? | Serves queries like the following (with Hash or Range types respectively): SELECT * FROM collection c WHERE c.prop = "value"; SELECT * FROM collection c WHERE c.prop > 5
/"prop"/* | All paths under the specified label.
/"prop"/"subprop"/ | Used during query execution to prune documents that do not have the specified path.
/"prop"/"subprop"/? | Serves queries like the following (with Hash or Range types respectively): SELECT * FROM collection c WHERE c.prop.subprop = "value"; SELECT * FROM collection c WHERE c.prop.subprop > 5
Introducing Azure DocumentDB - NoSQL, No Problem

Introducing Azure DocumentDB - NoSQL, No Problem

  • 1. { "name": "Andrew Liu", "e-mail": "andrl@microsoft.com", "twitter": "@aliuy8" }
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 12. Item Author Pages Language Harry Potter and the Sorcerer’s Stone J.K. Rowling 309 English Game of Thrones: A Song of Ice and Fire George R.R. Martin 864 English
  • 13. Item Author Pages Language Harry Potter and the Sorcerer’s Stone J.K. Rowling 309 English Game of Thrones: A Song of Ice and Fire George R.R. Martin 864 English Lenovo Thinkpad X1 Carbon ??? ??? ???
  • 14.
  • 15.
  • 16.
  • 17.
  • 18.
  • 19.
  • 20. fully managed, scalable, queryable, schemafree JSON document database service for modern applications transactional processing rich query managed as a service elastic scale internet accessible http/rest schema-free data model arbitrary data formats
  • 23. No need to define secondary indices / schema hints for indexing!
  • 24. -- Nested lookup against index SELECT Books.Author FROM Books WHERE Books.Author.Name = "Leo Tolstoy" -- Transformation, Filters, Array access SELECT { Name: Books.Title, Author: Books.Author.Name } FROM Books WHERE Books.Price > 10 AND Books.Languages[0] = "English" -- Joins, User Defined Functions (UDF) SELECT CalculateRegionalTax(Books.Price, "USA", "WA") FROM Books JOIN LanguagesArr IN Books.Languages WHERE LanguagesArr.Language = "Russian" SQL Query Grammar
  • 26.
  • 27.
  • 28. function(playerId1, playerId2) { var playersToSwap = __.filter (function (document) { return (document.id == playerId1 || document.id == playerId2); }); var player1 = playersToSwap[0], player2 = playersToSwap[1]; var player1ItemTemp = player1.item; player1.item = player2.item; player2.item = player1ItemTemp; __.replaceDocument(player1) .then(function() { return __.replaceDocument(player2); }) .fail(function(error){ throw 'Unable to update players, abort'; }); } client.executeStoredProcedureAsync ("procs/1234", ["MasterChief", "SolidSnake“]) .then(function (response) { console.log(“success!"); }, function (err) { console.log("Failed to swap!", error); } ); Client Database
  • 31. DocumentDB offers 4 consistency levelsBrewer’s CAP Theorem Consistency Availability Partition Tolerance 99.95% Availability SLA
  • 33. • Predictable Performance • Hourly Billing • 99.95% Availability • Adjustable Performance Levels S1 S2 S3 I’m not crying anymore
  • 34.
  • 35. “With Azure DocumentDB, we didn’t have to say ‘no’ to the business, and we weren’t a bottleneck to launching the promotion — in fact, we came in ahead of schedule.”
  • 36.
  • 37.
  • 38.
  • 39.
  • 40.
  • 41.
  • 42.
  • 45.
  • 46.
  • 47.
  • 48.
  • 49.
  • 50. { "id": "1", "firstName": "Thomas", "lastName": "Andersen", "addresses": [ { "line1": "100 Some Street", "line2": "Unit 1", "city": "Seattle", "state": "WA", "zip": 98012 } ], "contactDetails": [ {"email: "thomas@andersen.com"}, {"phone": "+1 555 555-5555", "extension": 5555} ] } Try model your entity as a self- contained document Generally, use embedded data models when: contains one-to-few changes infrequently won’t grow integral better read performance
  • 51. In general, use normalized data models when: Write performance one-to-many many-to-many changes frequently { "id": "xyz", "username: "user xyz" } { "id": "address_xyz", "userid": "xyz", "address" : { … } } { "id: "contact_xyz", "userid": "xyz", "email" : "user@user.com" "phone" : "555 5555" } Normalizing typically provides better write performance
  • 52. No magic bullet Think about how your data is going to be written, read and model accordingly { "id": "1", "firstName": "Thomas", "lastName": "Andersen", "countOfBooks": 3, "books": [1, 2, 3], "images": [ {"thumbnail": "http://....png"} {"profile": "http://....png"} ] } { "id": 1, "name": "DocumentDB 101", "authors": [ {"id": 1, "name": "Thomas Andersen", "thumbnail": "http://....png"}, {"id": 2, "name": "William Wakefield", "thumbnail": "http://....png"} ] }
  • 53.
  • 54. Request Unit (RU) is the normalized currency % Memory % IOPS % CPU Replica gets a fixed budget of Request Units Resource Resource set Resource Resource DocumentsSQL sprocs args Resource Resource Predictable Performance
  • 55. Operation Request units (RUs) consumed* Reading a single 1KB document 1 Reading a single 2KB document 2 Query with a simple predicate for a 1KB document 3 Creating a single 1 KB document with 10 JSON properties (consistent indexing) 14 Create a single 1 KB document with 100 JSON properties (consistent indexing) 20 Replacing a single 1 KB document 28 Execute a stored procedure with two create documents 30
  • 56.
  • 57. • Data Size A single collection holds 10GB • Throughput 3 Performance tiers with a max of 2,500 RU/sec
  • 58.
  • 59.
  • 60. Tenant Partition Id Customer 1 Big Customer 2 Another 3
  • 61.
  • 62. { record: "1", created: { "date": "6/1/2014", "epoch": 1401662986 } }, { record: "3", created: { "date": "9/23/2014" "epoch": 1411512586 } } , { record: "123", created: { "date": "8/17/2013" "epoch": 1376779786 } } SELECT * FROM root r WHERE r.date.epoch BETWEEN 1376779786 AND 1401662986 { record: "1", created: { "date": "6/1/2014", "epoch": 1401662986 } }, { record: "3", created: { "date": "9/23/2014" "epoch": 1411512586 } } { record: "43233", created: { "epoch": 1411512586 } } , { record: "1123", created: { "date": "8/17/2013" "epoch": 1376779786 } }, { record: "43234", created: { "epoch": 1376779786 }
  • 63. Hash sharding • Examples: Profile data (user ID, app ID), (user ID), Device and vehicle data (device/vin ID), Catalog data (item ID) • Pros: balanced, stateless • Cons: reshuffling is hard Range sharding • Examples: Operational data (timestamp), (timestamp, event ID) • Pros: easy sliding window, range queries • Cons: stateful Lookup sharding • SaaS/multitenant service (tenant ID), Metadata store (type ID) • Pros: simple, easy to reshuffle, can span accounts • Cons: stateful, works only on discrete keys
  • 64.
  • 65. How it works Automatic indexing of documents JSON documents are represented as trees Structural information and instance values are normalized into a JSON-Path Fixed upper bound on index size (typically 5-10% in real production data) Example {"headquarters": "Belgium"}  /"headquarters"/"Belgium" {"exports": [{"city": “Moscow"}, {"city": Athens"}]}  /"exports"/0/"city"/"Moscow" and /"exports"/1/"city"/"Athens".
  • 66. Configuration Level Options Automatic Per collection True (default) or False Override with each document write Indexing Mode Per collection Consistent or Lazy Lazy for eventual updates/bulk ingestion Included and excluded paths Per path Individual path or recursive includes (? And *) Indexing Type Per path Support Hash (Default) and Range Hash for equality, range for range queries Indexing Precision Per path Supports 3 – 7 per path Tradeoff storage, query RUs and write RUs
  • 67. Path | Description/use case • / | Default path for collection. Recursive and applies to whole document tree. • /"prop"/? | Serves queries like the following (with Hash or Range types respectively): SELECT * FROM collection c WHERE c.prop = "value" and SELECT * FROM collection c WHERE c.prop > 5 • /"prop"/* | All paths under the specified label. • /"prop"/"subprop"/ | Used during query execution to prune documents that do not have the specified path. • /"prop"/"subprop"/? | Serves queries (with Hash or Range types respectively): SELECT * FROM collection c WHERE c.prop.subprop = "value" and SELECT * FROM collection c WHERE c.prop.subprop > 5
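One way to read the `?` and `*` wildcards in this table is as a small matching rule over index paths. The sketch below is a simplified interpretation of the slide, not the service's actual matcher; `pattern_covers` is a hypothetical helper, and it assumes no `/` characters inside property names or values.

```python
def split_path(path):
    """Split '/"prop"/"subprop"/5' into ['"prop"', '"subprop"', '5']."""
    return [segment for segment in path.split("/") if segment]


def pattern_covers(pattern, path):
    """Return True if an index path pattern applies to a concrete document path."""
    pattern_segments = split_path(pattern)
    path_segments = split_path(path)
    if not pattern_segments:
        # "/" is the recursive default for the whole document tree.
        return True
    *labels, last = pattern_segments
    if last == "?":
        # Exactly the value stored directly under the labelled property.
        return path_segments[:-1] == labels and len(path_segments) == len(labels) + 1
    if last == "*":
        # Anything, at any depth, under the labelled property.
        return path_segments[: len(labels)] == labels and len(path_segments) > len(labels)
    # A plain path such as /"prop"/"subprop"/ : the document must contain this prefix.
    return path_segments[: len(pattern_segments)] == pattern_segments
```

Under this reading, `/"prop"/?` covers `/"prop"/"value"` but not `/"prop"/"subprop"/5`, while `/"prop"/*` covers both.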

Editor's notes

  1. Image licensed under the Creative Commons Attribution-Share Alike 2.0 Generic license: http://commons.wikimedia.org/wiki/File:Crying-girl.jpg
  2. The "write" index for consistent queries. Highly concurrent, lock-free, log-structured indexing technology developed with Microsoft Research. Optimized for SSD (works well for HDD). Resource governed for tenant isolation. Automatic indexing of JSON documents without requiring schema or secondary indices, but configurable via modes, policies, paths, and types.
  3. Query over heterogeneous documents without defining schema or managing indexes. Query arbitrary paths, properties, and values without specifying secondary indexes or indexing hints. Execute queries with consistent results in the face of sustained writes. Query through fluent language integration, including LINQ for .NET developers and a "document-oriented" SQL grammar for traditional SQL developers. Extend query execution through application-supplied JavaScript UDFs. Supported SQL features include: predicates, iterations (arrays), sub-queries, logical operators, UDFs, intra-document JOINs, and JSON transforms.
  4. Stored Procedures and Triggers: Familiar programming model constructs for executing application logic. Registered as named, URI-addressable, durable resources. Scoped to a DocumentDB collection. JavaScript as a procedural language to express business logic. Language integration: a JavaScript throw statement aborts the transaction. Execution: the JavaScript runtime is hosted on each replica. Pre-compiled on registration. The entire procedure is wrapped in an implicit database transaction. Fully resource governed and sandboxed execution.
  5. Stored Procedures and Triggers: Familiar programming model constructs for executing application logic. Registered as named, URI-addressable, durable resources. Scoped to a DocumentDB collection. JavaScript as a procedural language to express business logic. Language integration: a JavaScript throw statement aborts the transaction. Execution: the JavaScript runtime is hosted on each replica. Pre-compiled on registration. The entire procedure is wrapped in an implicit database transaction. Fully resource governed and sandboxed execution.
  6. Stored Procedures and Triggers: Familiar programming model constructs for executing application logic. Registered as named, URI-addressable, durable resources. Scoped to a DocumentDB collection. JavaScript as a procedural language to express business logic. Language integration: a JavaScript throw statement aborts the transaction. Execution: the JavaScript runtime is hosted on each replica. Pre-compiled on registration. The entire procedure is wrapped in an implicit database transaction. Fully resource governed and sandboxed execution.
  7. In theoretical computer science, the CAP theorem, also known as Brewer's theorem, states that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees: Consistency (all nodes see the same data at the same time); Availability (a guarantee that every request receives a response about whether it succeeded or failed); Partition tolerance (the system continues to operate despite arbitrary message loss or failure of part of the system).
  8. Strong: guarantees that a write is visible only after it is durably committed by a majority quorum of replicas, and that reads are always acknowledged by a majority read quorum. Session: provides predictable read consistency within a session while offering low-latency writes. Reads are also low latency because each read is served by a single replica. Bounded Staleness: guarantees the total order of propagation of writes, but reads may lag writes by N seconds or operations (configurable). Eventual: the weakest form of consistency, wherein a client may, over time, read values older than ones it has seen before. Image licensed under the Creative Commons Attribution-Share Alike 3.0 Unported license: http://commons.wikimedia.org/wiki/File:Fale_F1_Monza_2004_73.jpg
  9. Image licensed under the Creative Commons Attribution 2.0 Generic license: http://en.wikipedia.org/wiki/File:A_smiling_baby.jpg
  10. Talk about productivity and iterative development. No rigid schemas to weigh you down!
  11. In computing, denormalization is the process of attempting to optimize the read performance of a database by adding redundant data or by grouping data. In some cases, denormalization is a means of addressing performance or scalability in relational database software. Source: http://en.wikipedia.org/wiki/Denormalization
  12. With DocumentDB, you can also choose a hybrid model to mimic the advantages of normalization.
  13. With DocumentDB, you can also choose a hybrid model to mimic the advantages of normalization.