SlideShare ist ein Scribd-Unternehmen logo
1 von 61
Downloaden Sie, um offline zu lesen
Azure Cosmos DB – The Swiss Army
NoSQL Cloud Database
Steef-Jan Wiggers
https://nl.linkedin.com/in/steefjan
What can you expect?
• Introduction of Cosmos DB
• Scenarios
• Some In-depth discussion of capabilities (Scale, Consistency,
Indexing, Models, API)
• Reference case
• RU/s, Latency, SLA
• Demos
Introduction
Evolution
Service
Customers
Scenarios
Azure Cosmos DB Evolution
Originally started to address the
problems faced by large scale apps
inside Microsoft
• Built from the ground up for the
cloud
• Used extensively inside Microsoft
• One of the fastest growing
services on Azure
2010 2014 2015 2017
DocumentDB Cosmos DBProject Florence
Cosmos DB Service
5
Column-family
Document
Graph
Turnkey global distribution
Elastic scale out
of storage & throughput
Guaranteed low latency at the 99th percentile
Comprehensive SLAs
Five well-defined consistency models
Table API
Key-value
MongoDB API
A globally distributed, massively scalable, multi-model database service
Cassandra API
Customers
Scenario - Telemetry&SensorData
Azure IoT Hub
Apache Storm on
Azure HDInsight
Azure Cosmos DB
(telemetry and
device state)
events
Azure Web Jobs
(Change feed
processor)
Azure Function
latest state
Azure Data Lake
(archival)
Scenario – Order processing
Azure Functions
(E-Commerce Checkout API)
Azure Cosmos DB
(Order Event Store)
Azure Functions
(Microservice 1: Tax)
Azure Functions
(Microservice 2: Payment)
Azure Functions
(Microservice N: Fufillment)
. . .
Scenario – Multiplayer gaming
Azure Cosmos DB
(game database)
Azure HDInsight
(game analytics)
Azure CDN
Azure API Apps
(game backend)
Azure Storage
(game files)
Azure Notification Hubs
(push notifications)
Azure Functions
Azure Traffic
Manager
Capabilities
Global distribution
Resource model
Scale
Consistency
Indexing
Turnkey Global Distribution
High Availability
• Automatic and Manual Failover
• Multi-homing API removes need for app redeployment
Low Latency (anywhere in the world)
• Packets cannot move fast than the speed of light
• Sending a packet across the world under ideal network
conditions takes 100’s of milliseconds
• You can cheat the speed of light – using data locality
• CDN’s solved this for static content
• Azure Cosmos DB solves this for dynamic content
Global Distribution
• Automatic and transparent replication
worldwide
• Each partition hosts a replica set per
region
• Customers can test end to end
application availability by
programmatically simulating failovers
• All regions are hidden behind a single
global URI with multi-homing
capabilities
• Customers can dynamically add /
remove additional regions at any time
Writes/
Reads
Reads
"airport" : “AMS" "airport" : “MEL"
West US
Container
"airport" : "LAX"
Local Distribution (via horizontal partitioning)
GlobalDistribution(ofresourcepartitions)
Reads
30K transactions/sec
Writes/
Reads
Reads
Reads
West Europ
30K transactions/sec
Partition-key = "airport"
Replication globally
Resource model
• Resources identified by their logical
and stable URI
• Hierarchical overlay over horizontally
partitioned entities; spanning
machines, clusters and regions
• Extensible custom projections based on
specific type of API interface
• Stateless interaction (HTTP and TCP)
Account
DatabaseDatabaseDatabase
DatabaseDatabaseContainer
DatabaseDatabaseItem
Account URI and Credentials
Account
DatabaseDatabaseDatabase
DatabaseDatabaseContainer
DatabaseDatabaseItem
********.azure.com
IGeAvVUp …
Creating Account
Account
DatabaseDatabaseDatabase
DatabaseDatabaseContainer
DatabaseDatabaseItem
Database representation
Account
DatabaseDatabaseDatabase
DatabaseDatabaseContainer
DatabaseDatabaseItem
DatabaseDatabaseContainer
DatabaseDatabaseItem
Container representation
Account
DatabaseDatabaseDatabase
DatabaseDatabaseContainer
DatabaseDatabaseItem
= Collection Graph Table
Creating collection – SQL API
Account
DatabaseDatabaseDatabase
DatabaseDatabaseContainer
DatabaseDatabaseItem
Container level resources
Account
DatabaseDatabaseDatabase
DatabaseDatabaseContainer
DatabaseDatabaseItem ConflictSproc Trigger UDF
System topology (behind the scenes)
Resource Hierarchy
Containers
Resource Partitions
CollectionsTables Graphs
Tenants
CONTAINERS
Logical resources “surfaced” to APIs as tables,
collections or graphs, which are made up of one
or more physical partitions or servers.
RESOURCE PARTITIONS
Consistent, highly available, and resource-
governed coordination primitives
Consist of replica sets, with each replica hosting
an instance of the database engine
Leader
Follower
Follower
Forwarder
Replica Set
To remote resource partition(s)
Scale
• Pay as go for storage and throughput
• Elastic scale across regions
• Partitions
Horizontal Scaling
• Containers are horizontally
partitioned
• Each partition made highly available
via a replica set
• Partition management is
transparent and highly responsive
• Partitioning scheme is dictated by a
“partition-key”
(partition-key = “airport”)
Replica set
Resource Partitions
{"airport":“DUB"}
Container = Collection Graph Table
Elastic Scale Out of Storage and
Throughput
SCALES AS YOUR APPS’ NEEDS CHANGE
• Database elastically scales storage and
throughput
• How? Scale-out!
• Collections can span across large clusters
of machines
• Can start small and seamlessly grow as
your app grows
Partitions
Cosmos DB Container
(e.g. Collection)
Partition Key: User ID
Logical Partitioning Abstraction
Behind the Scenes:
Physical Partition Sets
hash(User ID)
Psuedo-random distribution of data over range of possible hashed values
Partitions
…
Partition 1 Partition 2 Partition n
Frugal # of Partitions based on actual storage and throughput needs
(yielding scalability with low total cost of ownership)
hash(User ID)
Pseudo-random distribution of data over range of possible hashed values
Andrew
Mike
…
Bob
Dharma
Shireesh
Karthik
Rimma
Alice
Carol
…
Partitions
Best Practices: Design Goals for Choosing a Good Partition Key
Distribute the overall request + storage volume
Avoid “hot” partition keys
Steps for Success
• Ballpark scale needs (size/throughput)
• Understand the workload
• # of reads/sec vs writes per sec
• Use pareto principal (80/20 rule) to help optimize bulk of workload
• For reads – understand top 3-5 queries (look for common filters)
• For writes – understand transactional needs
General Tips
• Build a POC to strengthen your understanding of the workload and
iterate (avoid analyses paralysis)
• Don’t be afraid of having too many partition keys
• Partitions keys are logical
• More partition keys more scalability
• Partition Key is scope for multi-record transactions and routing queries
• Queries can be intelligently routed via partition key
• Omitting partition key on query requires fan-out
Consistency
• Global distribution forces us to navigate the CAP
theorem
• Writing correct distributed applications is hard
• Five well-defined consistency levels
• Intuitive and practical with clear PACELC tradeoffs
• Programmatically change at anytime
• Can be overridden on a per-request basis
Programmable data consistency
Strong consistency
High latency
Eventual consistency,
Low latency
Well-defined consistency models
• Intuitive programming model
• 5 Well-defined, consistency models
• Overridable on a per-request basis
Clear tradeoffs
• Latency
• Availability
• Throughput
5 Well-defined, consistency models
PACELC Theorem: In the case of network partitioning (P) in a distributed computer
system, one has to choose between availability (A) and consistency (C) (as per the CAP
theorem), but else (E), even when the system is running normally in the absence of
partitions, one has to choose between latency (L) and consistency (C).
Consistency levels
Choose the right consistency
• Five well-defined, consistency models
• Overridable on a per-request basis
• Provides control over performance-consistency
tradeoffs, backed by comprehensive SLAs.
• An intuitive programming model offering low
latency and high availability for your planet-scale
app.
CLEAR TRADEOFFS
• Latency
• Availability
• Throughput
Strong Bounded-staleness Session Consistent prefix Eventual
Handle any data with no schema or
indexing required
Azure Cosmos DB’s schema-less service automatically indexes all your data,
regardless of the data model, to delivery blazing fast queries.
• Automatic index management
• Synchronous auto-indexing
• No schemas or secondary indices needed
• Works across every data model
Indexing Json Documents
{
"locations": [
{
"country": "Germany",
"city": "Berlin"
},
{
"country": "France",
"city": "Paris"
}
],
"headquarter": "Belgium",
"exports": [
{ "city": "Moscow" },
{ "city": "Athens" }
]
}
locations headquarter exports
0
country city
Germany Berlin
1
country city
France Paris
0 1
city
Athens
city
Moscow
Belgium
Indexing Json Documents
{
"locations": [
{
"country": "Germany",
"city": "Bonn",
"revenue": 200
}
],
"headquarter": "Italy",
"exports": [
{
"city": "Berlin",
"dealers": [
{ "name": "Hans" }
]
},
{ "city": "Athens" }
]
}
locations headquarter exports
0
country city
Germany Bonn
revenue
200
0 1
citycity
Berlin
Italy
dealers
0
name
Hans
Indexing Json Documents
locations headquarter exports
0
country city
Germany Bonn
revenue
200
0 1
citycity
Berlin
Italy
dealers
0
name
Hans
locations headquarter exports
0
country city
Germany Berlin
1
country city
France Paris
0 1
city
Athens
city
Moscow
Belgium
Inverted Index
locations headquarter exports
0
country city
Germany
Berlin
revenue
200
0 1
city
Athens
city
Berlin
Italy
dealers
0
name
Hans
Bonn
1
country city
France Paris
Belgium
Moscow
Index Policies
CUSTOM INDEXING POLICIES
Though all Azure Cosmos DB data is indexed by default, you
can specify a custom indexing policy for your collections.
Custom indexing policies allow you to design and customize
the shape of your index while maintaining schema flexibility.
• Define trade-offs between storage, write and query
performance, and query consistency
• Include or exclude documents and paths to and from the
index
• Configure various index types
{
"automatic": true,
"indexingMode": "Consistent",
"includedPaths": [{
"path": "/*",
"indexes": [{
"kind": "Hash",
"dataType": "String",
"precision": -1
}, {
"kind": "Range",
"dataType": "Number",
"precision": -1
}, {
"kind": "Spatial",
"dataType": "Point"
}]
}],
"excludedPaths": [{
"path": "/nonIndexedContent/*"
}]
}
Swiss Army
Knife
Multi-model + Multi API
Reference Case
Demos
Multi-model + multi-API
• Different models:
• Graph
• Key-Value
• Document DB
• No schema or index management
• Automatic indexing
• API support:
• SQL
• JavaScript
• Gremlin
• MongoDB
• Azure Table Storage
• Cassandra
Mimic strategy
• MongoDB
• Cassandra
What model?
Graph model
A graph is a structure that's composed of vertices and
edges. Both vertices and edges can have an arbitrary
number of properties.
• Vertices - Vertices denote discrete objects, such as a
person, a place, or an event.
• Edges - Edges denote relationships between vertices.
For example, a person might know another person,
be involved in an event, and recently been at a
location.
• Properties - Properties express information about
the vertices and edges.
Demo - Graph
Use case – Cosmos DB Graph
https://customers.microsoft.com/en-in/story/reed-business-information-professional-services-azure
Document (JSON)
• A schema-less JSON database engine with rich SQL querying capabilities.
• Store documents
• Searchable by integrating with Azure Search
• Easy integration with Azure Functions
• Change Feed
Demo - Home Automation
data
data
data
data
start/stop
data
start/stop
start/stop
Cosmos DB
Event Hubs
IoT Hub
Functions
data
data
notify
Logic App Cosmos DB Connector
• Cosmos DB connector is a standard connector.
• Several operation including create and update document.
• Note: The maximum size of a document that is supported by the
Document DB (Azure Cosmos DB) connector is 2 MB.
Demo – Data Ingestion
Convert Epoch to
DateTime
Ingest
Pull Push Pull Push
Function A
Store in Collection
More…
Request Units
Latency
SLA
Request Units
• Request Units (RU) is a rate-based
currency
• Abstracts physical resources for
performing requests
• Key to multi-tenancy, SLAs, and COGS
efficiency
• Foreground and background activities
% IOPS
% CPU
% Memory
Request Units
• Normalized across various access
methods
• 1 RU = 1 read of 1 KB document
• Each request consumes fixed RUs
• Applies to reads, writes, queries, and
stored procedure execution
GET
POST
PUT
Query
Request Units
• Provisioned in terms of RU/sec and RU/min
granularities
• Rate limiting based on amount of throughput
provisioned
• Can be increased or decreased instantaneously
• Metered Hourly
• Background processes like TTL expiration,
index transformations scheduled when
quiescent
Min RU/sec
Max RU/sec
IncomingRequests
Replica
Quiescent
Rate
limit
No
throttling
Latency
• Simultaneously read/write
• Latch free and write-optimized
• 10-ms latency on read
• 15-ms latency on writes
~ 100 reads per second
~ 70 writes per second
Performance Demo
Rules
Subscription0
Subscription1
Subscription2
Subscription3
Subscription4
Cosmos DB
Database
Document Collection
SLA
• Fully managed service
• 99,99% SLA for latency
• Guaranteed throughput,
consistency and high
availability
Key Takeaways
• Multiple options with models and API’s
• Various consistency models
• Global scale
• Flexible through put
• Support for diverse scenario’s
Call to action
• Documentation: https://docs.microsoft.com/en-us/azure/
• Middleware Friday: https://www.youtube.com/watch?v=ZplZgOoGUzY
• Graph demo: https://github.com/anthonychu/cosmosdb-gremlin-flights
• Pluralsight: https://www.pluralsight.com/courses/azure-cosmos-db
• Channel 9: https://channel9.msdn.com/Events/Build/2018/BRK3319
Thanks for watching!
facebook.com/BizTalk360
twitter.com/BizTalk360
http://www.biztalk360.com

Weitere ähnliche Inhalte

Was ist angesagt?

Technical Deep Dive: Using Apache Kafka to Optimize Real-Time Analytics in Fi...
Technical Deep Dive: Using Apache Kafka to Optimize Real-Time Analytics in Fi...Technical Deep Dive: Using Apache Kafka to Optimize Real-Time Analytics in Fi...
Technical Deep Dive: Using Apache Kafka to Optimize Real-Time Analytics in Fi...
confluent
 

Was ist angesagt? (20)

Explore Azure Cosmos DB
Explore Azure Cosmos DBExplore Azure Cosmos DB
Explore Azure Cosmos DB
 
Developing a custom Kafka connector? Make it shine! | Igor Buzatović, Porsche...
Developing a custom Kafka connector? Make it shine! | Igor Buzatović, Porsche...Developing a custom Kafka connector? Make it shine! | Igor Buzatović, Porsche...
Developing a custom Kafka connector? Make it shine! | Igor Buzatović, Porsche...
 
Data integration with Apache Kafka
Data integration with Apache KafkaData integration with Apache Kafka
Data integration with Apache Kafka
 
Public Cloud Workshop
Public Cloud WorkshopPublic Cloud Workshop
Public Cloud Workshop
 
Kafka Deployment to Steel Thread
Kafka Deployment to Steel ThreadKafka Deployment to Steel Thread
Kafka Deployment to Steel Thread
 
Concepts and Patterns for Streaming Services with Kafka
Concepts and Patterns for Streaming Services with KafkaConcepts and Patterns for Streaming Services with Kafka
Concepts and Patterns for Streaming Services with Kafka
 
What's new in AWS?
What's new in AWS?What's new in AWS?
What's new in AWS?
 
APAC Kafka Summit - Best Of
APAC Kafka Summit - Best Of APAC Kafka Summit - Best Of
APAC Kafka Summit - Best Of
 
Evolving from Messaging to Event Streaming
Evolving from Messaging to Event StreamingEvolving from Messaging to Event Streaming
Evolving from Messaging to Event Streaming
 
Technical Deep Dive: Using Apache Kafka to Optimize Real-Time Analytics in Fi...
Technical Deep Dive: Using Apache Kafka to Optimize Real-Time Analytics in Fi...Technical Deep Dive: Using Apache Kafka to Optimize Real-Time Analytics in Fi...
Technical Deep Dive: Using Apache Kafka to Optimize Real-Time Analytics in Fi...
 
Building Event Driven Architectures with Kafka and Cloud Events (Dan Rosanova...
Building Event Driven Architectures with Kafka and Cloud Events (Dan Rosanova...Building Event Driven Architectures with Kafka and Cloud Events (Dan Rosanova...
Building Event Driven Architectures with Kafka and Cloud Events (Dan Rosanova...
 
Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...
Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...
Best practices for Data warehousing with Amazon Redshift - AWS PS Summit Canb...
 
WSO2 Stream Processor: Graphical Editor, HTTP & Message Trace Analytics and More
WSO2 Stream Processor: Graphical Editor, HTTP & Message Trace Analytics and MoreWSO2 Stream Processor: Graphical Editor, HTTP & Message Trace Analytics and More
WSO2 Stream Processor: Graphical Editor, HTTP & Message Trace Analytics and More
 
Netflix Massively Scalable, Highly Available, Immutable Infrastructure
Netflix Massively Scalable, Highly Available, Immutable InfrastructureNetflix Massively Scalable, Highly Available, Immutable Infrastructure
Netflix Massively Scalable, Highly Available, Immutable Infrastructure
 
Real time data processing and model inferncing platform with Kafka streams (N...
Real time data processing and model inferncing platform with Kafka streams (N...Real time data processing and model inferncing platform with Kafka streams (N...
Real time data processing and model inferncing platform with Kafka streams (N...
 
Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...
Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...
Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...
 
Getting Started with Amazon Redshift
Getting Started with Amazon RedshiftGetting Started with Amazon Redshift
Getting Started with Amazon Redshift
 
Realizing the promise of portable data processing with Apache Beam
Realizing the promise of portable data processing with Apache BeamRealizing the promise of portable data processing with Apache Beam
Realizing the promise of portable data processing with Apache Beam
 
Data Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftData Warehousing with Amazon Redshift
Data Warehousing with Amazon Redshift
 
What database
What databaseWhat database
What database
 

Ähnlich wie Azure Cosmos DB - The Swiss Army NoSQL Cloud Database

Ähnlich wie Azure Cosmos DB - The Swiss Army NoSQL Cloud Database (20)

Technical overview of Azure Cosmos DB
Technical overview of Azure Cosmos DBTechnical overview of Azure Cosmos DB
Technical overview of Azure Cosmos DB
 
Azure CosmosDb - Where we are
Azure CosmosDb - Where we areAzure CosmosDb - Where we are
Azure CosmosDb - Where we are
 
Azure Cosmos DB - NoSQL Strikes Back (An introduction to the dark side of you...
Azure Cosmos DB - NoSQL Strikes Back (An introduction to the dark side of you...Azure Cosmos DB - NoSQL Strikes Back (An introduction to the dark side of you...
Azure Cosmos DB - NoSQL Strikes Back (An introduction to the dark side of you...
 
Azure DocumentDB Overview
Azure DocumentDB OverviewAzure DocumentDB Overview
Azure DocumentDB Overview
 
Build 2017 - P4010 - A lap around Azure HDInsight and Cosmos DB Open Source A...
Build 2017 - P4010 - A lap around Azure HDInsight and Cosmos DB Open Source A...Build 2017 - P4010 - A lap around Azure HDInsight and Cosmos DB Open Source A...
Build 2017 - P4010 - A lap around Azure HDInsight and Cosmos DB Open Source A...
 
Azure basics
Azure basicsAzure basics
Azure basics
 
Presentacion redislabs-ihub
Presentacion redislabs-ihubPresentacion redislabs-ihub
Presentacion redislabs-ihub
 
Azure CosmosDb
Azure CosmosDbAzure CosmosDb
Azure CosmosDb
 
Using Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SFUsing Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SF
 
Building FoundationDB
Building FoundationDBBuilding FoundationDB
Building FoundationDB
 
Using Data Lakes
Using Data Lakes Using Data Lakes
Using Data Lakes
 
SpringPeople - Introduction to Cloud Computing
SpringPeople - Introduction to Cloud ComputingSpringPeople - Introduction to Cloud Computing
SpringPeople - Introduction to Cloud Computing
 
Kognitio - an overview
Kognitio - an overviewKognitio - an overview
Kognitio - an overview
 
Best of re:Invent
Best of re:InventBest of re:Invent
Best of re:Invent
 
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
 
Using Data Lakes
Using Data LakesUsing Data Lakes
Using Data Lakes
 
Select Stars: A SQL DBA's Introduction to Azure Cosmos DB (SQL Saturday Orego...
Select Stars: A SQL DBA's Introduction to Azure Cosmos DB (SQL Saturday Orego...Select Stars: A SQL DBA's Introduction to Azure Cosmos DB (SQL Saturday Orego...
Select Stars: A SQL DBA's Introduction to Azure Cosmos DB (SQL Saturday Orego...
 
MongoDB 4.0 새로운 기능 소개
MongoDB 4.0 새로운 기능 소개MongoDB 4.0 새로운 기능 소개
MongoDB 4.0 새로운 기능 소개
 
Tech-Spark: Exploring the Cosmos DB
Tech-Spark: Exploring the Cosmos DBTech-Spark: Exploring the Cosmos DB
Tech-Spark: Exploring the Cosmos DB
 
Azure CosmosDB the new frontier of big data and nosql
Azure CosmosDB the new frontier of big data and nosqlAzure CosmosDB the new frontier of big data and nosql
Azure CosmosDB the new frontier of big data and nosql
 

Mehr von BizTalk360

Mehr von BizTalk360 (20)

Optimise Business Activity Tracking – Insights from Smurfit Kappa
Optimise Business Activity Tracking – Insights from Smurfit KappaOptimise Business Activity Tracking – Insights from Smurfit Kappa
Optimise Business Activity Tracking – Insights from Smurfit Kappa
 
Optimise Business Activity Tracking – Insights from Smurfit Kappa
Optimise Business Activity Tracking – Insights from Smurfit KappaOptimise Business Activity Tracking – Insights from Smurfit Kappa
Optimise Business Activity Tracking – Insights from Smurfit Kappa
 
What's inside "migrating to biz talk server 2020" Book (BizTalk360 Webinar)
What's inside "migrating to biz talk server 2020" Book (BizTalk360 Webinar)What's inside "migrating to biz talk server 2020" Book (BizTalk360 Webinar)
What's inside "migrating to biz talk server 2020" Book (BizTalk360 Webinar)
 
Integration Monday - Logic Apps: Development Experiences
Integration Monday - Logic Apps: Development ExperiencesIntegration Monday - Logic Apps: Development Experiences
Integration Monday - Logic Apps: Development Experiences
 
Integration Monday - BizTalk Migrator Deep Dive
Integration Monday - BizTalk Migrator Deep DiveIntegration Monday - BizTalk Migrator Deep Dive
Integration Monday - BizTalk Migrator Deep Dive
 
Testing for Logic App Solutions | Integration Monday
Testing for Logic App Solutions | Integration MondayTesting for Logic App Solutions | Integration Monday
Testing for Logic App Solutions | Integration Monday
 
No-Slides
No-SlidesNo-Slides
No-Slides
 
System Integration using Reactive Programming | Integration Monday
System Integration using Reactive Programming | Integration MondaySystem Integration using Reactive Programming | Integration Monday
System Integration using Reactive Programming | Integration Monday
 
Building workflow solution with Microsoft Azure and Cloud | Integration Monday
Building workflow solution with Microsoft Azure and Cloud | Integration MondayBuilding workflow solution with Microsoft Azure and Cloud | Integration Monday
Building workflow solution with Microsoft Azure and Cloud | Integration Monday
 
Serverless Minimalism: How to architect your apps to save 98% on your Azure b...
Serverless Minimalism: How to architect your apps to save 98% on your Azure b...Serverless Minimalism: How to architect your apps to save 98% on your Azure b...
Serverless Minimalism: How to architect your apps to save 98% on your Azure b...
 
Migrating BizTalk Solutions to Azure: Mapping Messages | Integration Monday
Migrating BizTalk Solutions to Azure: Mapping Messages | Integration MondayMigrating BizTalk Solutions to Azure: Mapping Messages | Integration Monday
Migrating BizTalk Solutions to Azure: Mapping Messages | Integration Monday
 
Integration-Monday-Infrastructure-As-Code-With-Terraform
Integration-Monday-Infrastructure-As-Code-With-TerraformIntegration-Monday-Infrastructure-As-Code-With-Terraform
Integration-Monday-Infrastructure-As-Code-With-Terraform
 
Integration-Monday-Stateful-Programming-Models-Serverless-Functions
Integration-Monday-Stateful-Programming-Models-Serverless-FunctionsIntegration-Monday-Stateful-Programming-Models-Serverless-Functions
Integration-Monday-Stateful-Programming-Models-Serverless-Functions
 
Integration-Monday-Serverless-Slackbots-with-Azure-Durable-Functions
Integration-Monday-Serverless-Slackbots-with-Azure-Durable-FunctionsIntegration-Monday-Serverless-Slackbots-with-Azure-Durable-Functions
Integration-Monday-Serverless-Slackbots-with-Azure-Durable-Functions
 
Integration-Monday-Building-Stateful-Workloads-Kubernetes
Integration-Monday-Building-Stateful-Workloads-KubernetesIntegration-Monday-Building-Stateful-Workloads-Kubernetes
Integration-Monday-Building-Stateful-Workloads-Kubernetes
 
Integration-Monday-Logic-Apps-Tips-Tricks
Integration-Monday-Logic-Apps-Tips-TricksIntegration-Monday-Logic-Apps-Tips-Tricks
Integration-Monday-Logic-Apps-Tips-Tricks
 
Integration-Monday-Terraform-Serverless
Integration-Monday-Terraform-ServerlessIntegration-Monday-Terraform-Serverless
Integration-Monday-Terraform-Serverless
 
Integration-Monday-Microsoft-Power-Platform
Integration-Monday-Microsoft-Power-PlatformIntegration-Monday-Microsoft-Power-Platform
Integration-Monday-Microsoft-Power-Platform
 
One name unify them all
One name unify them allOne name unify them all
One name unify them all
 
Securely Publishing Azure Services
Securely Publishing Azure ServicesSecurely Publishing Azure Services
Securely Publishing Azure Services
 

Kürzlich hochgeladen

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Kürzlich hochgeladen (20)

"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 

Azure Cosmos DB - The Swiss Army NoSQL Cloud Database

  • 1. Azure Cosmos DB – The Swiss Army NoSQL Cloud Database Steef-Jan Wiggers https://nl.linkedin.com/in/steefjan
  • 2. What can you expect? • Introduction of Cosmos DB • Scenarios • Some In-depth discussion of capabilities (Scale, Consistency, Indexing, Models, API) • Reference case • RU/s, Latency, SLA • Demos
  • 4. Azure Cosmos DB Evolution Originally started to address the problems faced by large scale apps inside Microsoft • Built from the ground up for the cloud • Used extensively inside Microsoft • One of the fastest growing services on Azure 2010 2014 2015 2017 DocumentDB Cosmos DBProject Florence
  • 5. Cosmos DB Service 5 Column-family Document Graph Turnkey global distribution Elastic scale out of storage & throughput Guaranteed low latency at the 99th percentile Comprehensive SLAs Five well-defined consistency models Table API Key-value MongoDB API A globally distributed, massively scalable, multi-model database service Cassandra API
  • 7. Scenario - Telemetry&SensorData Azure IoT Hub Apache Storm on Azure HDInsight Azure Cosmos DB (telemetry and device state) events Azure Web Jobs (Change feed processor) Azure Function latest state Azure Data Lake (archival)
  • 8. Scenario – Order processing Azure Functions (E-Commerce Checkout API) Azure Cosmos DB (Order Event Store) Azure Functions (Microservice 1: Tax) Azure Functions (Microservice 2: Payment) Azure Functions (Microservice N: Fufillment) . . .
  • 9. Scenario – Multiplayer gaming Azure Cosmos DB (game database) Azure HDInsight (game analytics) Azure CDN Azure API Apps (game backend) Azure Storage (game files) Azure Notification Hubs (push notifications) Azure Functions Azure Traffic Manager
  • 11. Turnkey Global Distribution High Availability • Automatic and Manual Failover • Multi-homing API removes need for app redeployment Low Latency (anywhere in the world) • Packets cannot move fast than the speed of light • Sending a packet across the world under ideal network conditions takes 100’s of milliseconds • You can cheat the speed of light – using data locality • CDN’s solved this for static content • Azure Cosmos DB solves this for dynamic content
  • 12. Global Distribution • Automatic and transparent replication worldwide • Each partition hosts a replica set per region • Customers can test end to end application availability by programmatically simulating failovers • All regions are hidden behind a single global URI with multi-homing capabilities • Customers can dynamically add / remove additional regions at any time Writes/ Reads Reads "airport" : “AMS" "airport" : “MEL" West US Container "airport" : "LAX" Local Distribution (via horizontal partitioning) GlobalDistribution(ofresourcepartitions) Reads 30K transactions/sec Writes/ Reads Reads Reads West Europ 30K transactions/sec Partition-key = "airport"
  • 14. Resource model • Resources identified by their logical and stable URI • Hierarchical overlay over horizontally partitioned entities; spanning machines, clusters and regions • Extensible custom projections based on specific type of API interface • Stateless interaction (HTTP and TCP) Account DatabaseDatabaseDatabase DatabaseDatabaseContainer DatabaseDatabaseItem
  • 15. Account URI and Credentials Account DatabaseDatabaseDatabase DatabaseDatabaseContainer DatabaseDatabaseItem ********.azure.com IGeAvVUp …
  • 19. Creating collection – SQL API Account DatabaseDatabaseDatabase DatabaseDatabaseContainer DatabaseDatabaseItem
  • 21. System topology (behind the scenes)
  • 22. Resource Hierarchy Containers Resource Partitions CollectionsTables Graphs Tenants CONTAINERS Logical resources “surfaced” to APIs as tables, collections or graphs, which are made up of one or more physical partitions or servers. RESOURCE PARTITIONS Consistent, highly available, and resource- governed coordination primitives Consist of replica sets, with each replica hosting an instance of the database engine Leader Follower Follower Forwarder Replica Set To remote resource partition(s)
  • 23. Scale • Pay as go for storage and throughput • Elastic scale across regions • Partitions
  • 24. Horizontal Scaling • Containers are horizontally partitioned • Each partition made highly available via a replica set • Partition management is transparent and highly responsive • Partitioning scheme is dictated by a “partition-key” (partition-key = “airport”) Replica set Resource Partitions {"airport":“DUB"} Container = Collection Graph Table
  • 25. Elastic Scale Out of Storage and Throughput SCALES AS YOUR APPS’ NEEDS CHANGE • Database elastically scales storage and throughput • How? Scale-out! • Collections can span across large clusters of machines • Can start small and seamlessly grow as your app grows
  • 26. Partitions Cosmos DB Container (e.g. Collection) Partition Key: User ID Logical Partitioning Abstraction Behind the Scenes: Physical Partition Sets hash(User ID) Psuedo-random distribution of data over range of possible hashed values
  • 27. Partitions … Partition 1 Partition 2 Partition n Frugal # of Partitions based on actual storage and throughput needs (yielding scalability with low total cost of ownership) hash(User ID) Pseudo-random distribution of data over range of possible hashed values Andrew Mike … Bob Dharma Shireesh Karthik Rimma Alice Carol …
  • 28. Partitions Best Practices: Design Goals for Choosing a Good Partition Key Distribute the overall request + storage volume Avoid “hot” partition keys Steps for Success • Ballpark scale needs (size/throughput) • Understand the workload • # of reads/sec vs writes per sec • Use pareto principal (80/20 rule) to help optimize bulk of workload • For reads – understand top 3-5 queries (look for common filters) • For writes – understand transactional needs General Tips • Build a POC to strengthen your understanding of the workload and iterate (avoid analyses paralysis) • Don’t be afraid of having too many partition keys • Partitions keys are logical • More partition keys more scalability • Partition Key is scope for multi-record transactions and routing queries • Queries can be intelligently routed via partition key • Omitting partition key on query requires fan-out
  • 29. Consistency • Global distribution forces us to navigate the CAP theorem • Writing correct distributed applications is hard • Five well-defined consistency levels • Intuitive and practical with clear PACELC tradeoffs • Programmatically change at anytime • Can be overridden on a per-request basis
  • 30. Programmable data consistency Strong consistency High latency Eventual consistency, Low latency
  • 31. Well-defined consistency models • Intuitive programming model • 5 Well-defined, consistency models • Overridable on a per-request basis Clear tradeoffs • Latency • Availability • Throughput
  • 32. 5 Well-defined, consistency models PACELC Theorem: In the case of network partitioning (P) in a distributed computer system, one has to choose between availability (A) and consistency (C) (as per the CAP theorem), but else (E), even when the system is running normally in the absence of partitions, one has to choose between latency (L) and consistency (C).
  • 34. Choose the right consistency • Five well-defined, consistency models • Overridable on a per-request basis • Provides control over performance-consistency tradeoffs, backed by comprehensive SLAs. • An intuitive programming model offering low latency and high availability for your planet-scale app. CLEAR TRADEOFFS • Latency • Availability • Throughput Strong Bounded-staleness Session Consistent prefix Eventual
  • 35. Handle any data with no schema or indexing required Azure Cosmos DB’s schema-less service automatically indexes all your data, regardless of the data model, to delivery blazing fast queries. • Automatic index management • Synchronous auto-indexing • No schemas or secondary indices needed • Works across every data model
  • 36. Indexing Json Documents { "locations": [ { "country": "Germany", "city": "Berlin" }, { "country": "France", "city": "Paris" } ], "headquarter": "Belgium", "exports": [ { "city": "Moscow" }, { "city": "Athens" } ] } locations headquarter exports 0 country city Germany Berlin 1 country city France Paris 0 1 city Athens city Moscow Belgium
  • 37. Indexing Json Documents { "locations": [ { "country": "Germany", "city": "Bonn", "revenue": 200 } ], "headquarter": "Italy", "exports": [ { "city": "Berlin", "dealers": [ { "name": "Hans" } ] }, { "city": "Athens" } ] } locations headquarter exports 0 country city Germany Bonn revenue 200 0 1 citycity Berlin Italy dealers 0 name Hans
  • 38. Indexing Json Documents locations headquarter exports 0 country city Germany Bonn revenue 200 0 1 citycity Berlin Italy dealers 0 name Hans locations headquarter exports 0 country city Germany Berlin 1 country city France Paris 0 1 city Athens city Moscow Belgium
  • 39. Inverted Index locations headquarter exports 0 country city Germany Berlin revenue 200 0 1 city Athens city Berlin Italy dealers 0 name Hans Bonn 1 country city France Paris Belgium Moscow
  • 40. Index Policies CUSTOM INDEXING POLICIES Though all Azure Cosmos DB data is indexed by default, you can specify a custom indexing policy for your collections. Custom indexing policies allow you to design and customize the shape of your index while maintaining schema flexibility. • Define trade-offs between storage, write and query performance, and query consistency • Include or exclude documents and paths to and from the index • Configure various index types { "automatic": true, "indexingMode": "Consistent", "includedPaths": [{ "path": "/*", "indexes": [{ "kind": "Hash", "dataType": "String", "precision": -1 }, { "kind": "Range", "dataType": "Number", "precision": -1 }, { "kind": "Spatial", "dataType": "Point" }] }], "excludedPaths": [{ "path": "/nonIndexedContent/*" }] }
  • 41. Swiss Army Knife Multi-model + Multi API Reference Case Demos
  • 42. Multi-model + multi-API • Different models: • Graph • Key-Value • Document DB • No schema or index management • Automatic indexing • API support: • SQL • JavaScript • Gremlin • MongoDB • Azure Table Storage • Cassandra
  • 45. Graph model A graph is a structure that's composed of vertices and edges. Both vertices and edges can have an arbitrary number of properties. • Vertices - Vertices denote discrete objects, such as a person, a place, or an event. • Edges - Edges denote relationships between vertices. For example, a person might know another person, be involved in an event, and recently been at a location. • Properties - Properties express information about the vertices and edges.
  • 47. Use case – Cosmos DB Graph https://customers.microsoft.com/en-in/story/reed-business-information-professional-services-azure
  • 48. Document (JSON) • A schema-less JSON database engine with rich SQL querying capabilities. • Store documents • Searchable by integrating with Azure Search • Easy integration with Azure Functions • Change Feed
  • 49. Demo - Home Automation data data data data start/stop data start/stop start/stop Cosmos DB Event Hubs IoT Hub Functions data data notify
  • 50. Logic App Cosmos DB Connector • Cosmos DB connector is a standard connector. • Several operation including create and update document. • Note: The maximum size of a document that is supported by the Document DB (Azure Cosmos DB) connector is 2 MB.
  • 51. Demo – Data Ingestion Convert Epoch to DateTime Ingest Pull Push Pull Push Function A Store in Collection
  • 53. Request Units • Request Units (RU) is a rate-based currency • Abstracts physical resources for performing requests • Key to multi-tenancy, SLAs, and COGS efficiency • Foreground and background activities % IOPS % CPU % Memory
  • 54. Request Units • Normalized across various access methods • 1 RU = 1 read of 1 KB document • Each request consumes fixed RUs • Applies to reads, writes, queries, and stored procedure execution GET POST PUT Query
  • 55. Request Units • Provisioned in terms of RU/sec and RU/min granularities • Rate limiting based on amount of throughput provisioned • Can be increased or decreased instantaneously • Metered Hourly • Background processes like TTL expiration, index transformations scheduled when quiescent Min RU/sec Max RU/sec IncomingRequests Replica Quiescent Rate limit No throttling
  • 56. Latency • Simultaneously read/write • Latch free and write-optimized • 10-ms latency on read • 15-ms latency on writes ~ 100 reads per second ~ 70 writes per second
  • 58. SLA • Fully managed service • 99,99% SLA for latency • Guaranteed throughput, consistency and high availability
  • 59. Key Takeaways • Multiple options with models and API’s • Various consistency models • Global scale • Flexible through put • Support for diverse scenario’s
  • 60. Call to action • Documentation: https://docs.microsoft.com/en-us/azure/ • Middleware Friday: https://www.youtube.com/watch?v=ZplZgOoGUzY • Graph demo: https://github.com/anthonychu/cosmosdb-gremlin-flights • Pluralsight: https://www.pluralsight.com/courses/azure-cosmos-db • Channel 9: https://channel9.msdn.com/Events/Build/2018/BRK3319