SlideShare ist ein Scribd-Unternehmen logo
1 von 57
I N T H I S S E S S I O N …
Azure Cosmos DB Core Concepts and What’s New @ //Build/ 2018
TL;DR High-Level Overview
Resource Model
Request Units
Partitioning
Replication
Automatic Indexing
New Goodies
Q&A
SQL
MongoDB
Table API
Turnkey global
distribution
Elastic scale out
of storage & throughput
Guaranteed low latency
at the 99th percentile
Comprehensive
SLAs
Five well-defined
consistency models
A Z U R E C O S M O S D B
DocumentColumn-family
Key-value Graph
A globally distributed, massively scalable, multi-model database service
Leveraging Azure Cosmos DB to automatically scale
your data across the globe
This module will reference partitioning in the context
of all Azure Cosmos DB modules and APIs.
R E S O U R C E M O D E L
Account
DatabaseDatabaseDatabase
DatabaseDatabaseContainer
DatabaseDatabaseItem
Account
DatabaseDatabaseDatabase
DatabaseDatabaseContainer
DatabaseDatabaseItem
A C C O U N T U R I A N D C R E D E N T I A L S
********.azure.com
IGeAvVUp …
C R E AT I N G A C C O U N T
Account
DatabaseDatabaseDatabase
DatabaseDatabaseContainer
DatabaseDatabaseItem
D ATA B A S E R E P R E S E N TAT I O N S
Account
DatabaseDatabaseDatabase
DatabaseDatabaseContainer
DatabaseDatabaseItem
DatabaseDatabaseContainer
DatabaseDatabaseItem
C O N TA I N E R R E P R E S E N TAT I O N S
Account
DatabaseDatabaseDatabase
DatabaseDatabaseContainer
DatabaseDatabaseItem
= Collection Graph Table
C R E AT I N G C O L L E C T I O N S – S Q L A P I
Account
DatabaseDatabaseDatabase
DatabaseDatabaseContainer
DatabaseDatabaseItem
C O N TA I N E R - L E V E L R E S O U R C E S
Account
DatabaseDatabaseDatabase
DatabaseDatabaseContainer
DatabaseDatabaseItem ConflictSproc Trigger UDF
S Y S T E M TO P O LO G Y ( B E H I N D T H E S C E N E S )
Resource
Manager
Language
Runtime(s)
Hosts
Query
Processor
RSM
Index Manager
Bw-tree++/ LLAMA++
Log Manager
IO Manager
Resource Governor
Transport
Database engine
Admission control
…
…
Planet Earth Azure regions Datacenters Stamps Fault domains
Cluster Machine Replica Database engine
Container
Various agents
R E S O U R C E H I E R A R C H Y
CONTAINERS
Logical resources “surfaced” to APIs as tables,
collections or graphs, which are made up of one or
more physical partitions or servers.
RESOURCE PARTITIONS
• Consistent, highly available, and resource-governed
coordination primitives
• Consist of replica sets, with each replica hosting an
instance of the database engine
Containers
Resource Partitions
CollectionsTables Graphs
Tenants
Leader
Follower
Follower
Forwarder
Replica Set
To remote resource partition(s)
R E Q U E S T U N I T S
Request Units (RUs) is a rate-based currency
Abstracts physical resources for performing requests
Key to multi-tenancy, SLAs, and COGS efficiency
Foreground and background activities
% IOPS% CPU% Memory
R E Q U E S T U N I T S
Normalized across various access methods
1 read of 1 KB document from a single partition
Each request consumes fixed RUs
Applies to reads, writes, query, and stored procedures
GET
POST
PUT
Query
…
=
=
=
=
R E Q U E S T U N I T S
Provisioned in terms of RU/sec
Rate limiting based on amount of throughput provisioned
Can be increased or decreased instantaneously
Metered Hourly
Background processes like TTL expiration, index
transformations scheduled when quiescent
Min RU/sec
Max RU/sec
IncomingRequests
Replica Quiescent
Rate limit
No rate limiting
* N E W * P R O V I S I O N R U / S F O R A S E T O F C O N TA I N E R S
Remove friction for OSS NoSQL APIs
Provision RU/sec shared across containers
Mix containers with dedicated throughput and
containers with shared throughput
Elastically scale provisioned throughput for a
set of containers at any time
E L A S T I C S C A L E O U T O F S TO R A G E A N D T H R O U G H P U T
SCALES AS YOUR APPS’ NEEDS CHANGE
Database elastically scales storage and throughput
How? Scale-out!
Collections can span across large clusters of machines
Can start small and seamlessly grow as your app grows
E L A S T I C S C A L E O U T O F S TO R A G E A N D T H R O U G H P U T
SCALES AS YOUR APPS’ NEEDS CHANGE
Database elastically scales storage and throughput
How? Scale-out!
Collections can span across large clusters of machines
Can start small and seamlessly grow as your app grows
PA R T I T I O N S
Cosmos DB Container
(e.g. Collection)
Partition Key: User ID
Logical Partitioning Abstraction
Behind the Scenes:
Physical Partition Sets
hash(User ID)
Psuedo-random distribution of data over range of possible hashed values
PA R T I T I O N S
…
Partition 1 Partition 2 Partition n
Frugal # of Partitions based on actual storage and throughput needs
(yielding scalability with low total cost of ownership)
hash(User ID)
Pseudo-random distribution of data over range of possible hashed values
Andrew
Mike
…
Bob
Dharma
Shireesh
Karthik
Rimma
Alice
Carol
…
PA R T I T I O N S
…
Partition 1 Partition 2 Partition n
What happens when partitions need to grow?
hash(User ID)
Pseudo-random distribution of data over range of possible hashed values
Andrew
Mike
…
Bob
Dharma
Shireesh
Karthik
Rimma
Alice
Carol
…
PA R T I T I O N S
Partition Ranges can be dynamically sub-divided to seamlessly
grow database as the application grows while simultaneously
maintaining high availability.
Partition management is fully managed by Azure Cosmos DB,
so you don't have to write code or manage your partitions.
+
Partition x Partition x1 Partition x2
hash(User ID)
Pseudo-random distribution of data over range of possible hashed values
Rimma
Karthik
…
Dharma
Shireesh
Karthik
Rimma
Alice
Carol
…
Dharma
Shireesh
…
PA R T I T I O N S
Best Practices: Design Goals for Choosing a Good Partition Key
• Distribute the overall request + storage volume
• Avoid “hot” partition keys
Steps for Success
• Ballpark scale needs (size/throughput)
• Understand the workload
• # of reads/sec vs writes per sec
• Use pareto principal (80/20 rule) to help optimize bulk of workload
• For reads – understand top 3-5 queries (look for common filters)
• For writes – understand transactional needs
General Tips
• Build a POC to strengthen your understanding of the workload and
iterate (avoid analyses paralysis)
• Don’t be afraid of having too many partition keys
• Partitions keys are logical
• More partition keys  more scalability
• Partition Key is scope for multi-record transactions and routing queries
• Queries can be intelligently routed via partition key
• Omitting partition key on query requires fan-out
* N E W * B U L K E X E C U TO R L I B R A R Y
Easy out-of-the-box bulk operation functionality
Supports bulk import and update
Auto handles congestion control + transient errors
10x client-side performance improvement
Easily scale-out clients across more VMs
Available starting with .NET and Java
T U R N K E Y G LO B A L D I S T R I B U T I O N
High Availability
• Automatic and Manual Failover
• Multi-homing API removes need for app redeployment
Low Latency (anywhere in the world)
• Packets cannot move fast than the speed of light
• Sending a packet across the world under ideal network
conditions takes 100’s of milliseconds
• You can cheat the speed of light – using data locality
• CDN’s solved this for static content
• Azure Cosmos DB solves this for dynamic content
T U R N K E Y G LO B A L D I S T R I B U T I O N
• Automatic and transparent replication worldwide
• Each partition hosts a replica set per region
• Customers can test end to end application
availability by programmatically simulating failovers
• All regions are hidden behind a single global URI
with multi-homing capabilities
• Customers can dynamically add / remove
additional regions at any time
Writes/
Reads
Reads
"airport" : “AMS" "airport" : “MEL"
West US
Container
"airport" : "LAX"
Local Distribution (via horizontal partitioning)
GlobalDistribution(ofresourcepartitions)
Reads
30K transactions/sec
Writes/
Reads
Reads
Reads
West Europe
30K transactions/sec
Partition-key = "airport"
R E P L I C AT I N G D ATA G LO B A L LY
R E P L I C AT I N G D ATA G LO B A L LY
R E P L I C AT I N G D ATA G LO B A L LY
A U TO M AT I C FA I LO V E R
A U TO M AT I C FA I LO V E R
M A N U A L FA I LO V E R
Strong Bounded-staleness Session Consistent prefix Eventual
F I V E W E L L - D E F I N E D C O N S I S T E N C Y M O D E L S
CHOOSE THE BEST CONSISTENCY MODEL FOR YOUR APP
Five well-defined, consistency models
Overridable on a per-request basis
Provides control over performance-consistency tradeoffs,
backed by comprehensive SLAs.
An intuitive programming model offering low latency and
high availability for your planet-scale app.
CLEAR TRADEOFFS
• Latency
• Availability
• Throughput
* N E W * M U LT I - M A S T E R ( P R E V I E W )
Perfect for Intelligent Cloud
and Intelligent Edge Applications
Write scalability around the world
Low latency writes around the world
99.999% High Availability around the world
Well-defined consistency models
Comprehensive conflict management
H A N D L E A N Y D ATA W I T H N O
S C H E M A O R I N D E X I N G R E Q U I R E D
Azure Cosmos DB’s schema-less service automatically indexes all your
data, regardless of the data model, to delivery blazing fast queries.
Item Color
Microwave
safe
Liquid
capacity
CPU Memory Storage
Geek
mug
Graphite Yes 16ox ??? ??? ???
Coffee
Bean
mug
Tan No 12oz ??? ??? ???
Surface
book
Gray ??? ??? 3.4 GHz
Intel
Skylake
Core i7-
6600U
16GB 1 TB SSD
• Automatic index management
• Synchronous auto-indexing
• No schemas or secondary indices needed
• Works across every data model
GEEK
I N D E X I N G J S O N D O C U M E N T S
{
"locations": [
{
"country": "Germany",
"city": "Berlin"
},
{
"country": "France",
"city": "Paris"
}
],
"headquarter": "Belgium",
"exports": [
{ "city": "Moscow" },
{ "city": "Athens" }
]
}
locations headquarter exports
0
country city
Germany Berlin
1
country city
France Paris
0 1
city
Athens
city
Moscow
Belgium
I N D E X I N G J S O N D O C U M E N T S
{
"locations": [
{
"country": "Germany",
"city": "Bonn",
"revenue": 200
}
],
"headquarter": "Italy",
"exports": [
{
"city": "Berlin",
"dealers": [
{ "name": "Hans" }
]
},
{ "city": "Athens" }
]
}
locations headquarter exports
0
country city
Germany Bonn
revenue
200
0 1
citycity
Berlin
Italy
dealers
0
name
Hans
I N D E X I N G J S O N D O C U M E N T S
Athens
locations headquarter exports
0
country city
Germany Bonn
revenue
200
0 1
citycity
Berlin
Italy
dealers
0
name
Hans
locations headquarter exports
0
country city
Germany Berlin
1
country city
France Paris
0 1
city
Athens
city
Moscow
Belgium
I N V E R T E D I N D E X
locations headquarter exports
0
country city
Germany
Berlin
revenue
200
0 1
city
Athens
city
Berlin
Italy
dealers
0
name
Hans
Bonn
1
country city
France Paris
Belgium
Moscow
I N D E X P O L I C I E S
CUSTOM INDEXING POLICIES
Though all Azure Cosmos DB data is indexed by default, you
can specify a custom indexing policy for your collections.
Custom indexing policies allow you to design and customize
the shape of your index while maintaining schema flexibility.
• Define trade-offs between storage, write and query
performance, and query consistency
• Include or exclude documents and paths to and from the
index
• Configure various index types
{
"automatic": true,
"indexingMode": "Consistent",
"includedPaths": [{
"path": "/*",
"indexes": [{
"kind": "Hash",
"dataType": "String",
"precision": -1
}, {
"kind": "Range",
"dataType": "Number",
"precision": -1
}, {
"kind": "Spatial",
"dataType": "Point"
}]
}],
"excludedPaths": [{
"path": "/nonIndexedContent/*"
}]
}
P R O V I S I O N T H R O U G H P U T F O R A S E T O F C O N TA I N E R S
Remove friction for OSS NoSQL APIs
Provision RU/sec shared across containers
Mix containers with dedicated throughput and
containers with shared throughput
Elastically scale provisioned throughput for a
set of containers at any time
B U L K E X E C U TO R L I B R A R Y
Easy out-of-the-box bulk operation functionality
Supports bulk import and update
Auto handles congestion control + transient errors
10x client-side performance improvement
Easily scale-out clients across more VMs
Available starting with .NET and Java
M U LT I - M A S T E R @ G LO B A L S C A L E ( P R E V I E W )
Perfect for Intelligent Cloud
and Intelligent Edge Applications
Write scalability around the world
Low latency writes around the world
99.999% High Availability around the world
Well-defined consistency models
Comprehensive conflict management
V N E T S E R V I C E E N D P O I N T
Secure communication without
exposing public endpoints
Limit access to specific VNET(s) subnet(s)
Compatible with IP Firewall ACLs
Available in all Azure regions
J AVA A S Y N C L I B R A R Y F O R S Q L A P I
New Async API surface for event-based
programs w/ observable sequencies
Leverages popular RxJava library
2x client-side performance improvement
Improved user experience
R E C A P
Azure Cosmos DB Core Concepts and What’s New @ //Build/ 2018
TL;DR High-Level Overview
Resource Model
Request Units
Partitioning
Replication
Automatic Indexing
New Goodies
Q&A
Technical overview of Azure Cosmos DB
Technical overview of Azure Cosmos DB

Weitere ähnliche Inhalte

Was ist angesagt?

Data platform modernization with Databricks.pptx
Data platform modernization with Databricks.pptxData platform modernization with Databricks.pptx
Data platform modernization with Databricks.pptxCalvinSim10
 
Migration From Oracle to PostgreSQL
Migration From Oracle to PostgreSQLMigration From Oracle to PostgreSQL
Migration From Oracle to PostgreSQLPGConf APAC
 
Introduction to Azure SQL DB
Introduction to Azure SQL DBIntroduction to Azure SQL DB
Introduction to Azure SQL DBChristopher Foot
 
Azure SQL Database Managed Instance
Azure SQL Database Managed InstanceAzure SQL Database Managed Instance
Azure SQL Database Managed InstanceJames Serra
 
Azure SQL Database
Azure SQL DatabaseAzure SQL Database
Azure SQL Databaserockplace
 
How to govern and secure a Data Mesh?
How to govern and secure a Data Mesh?How to govern and secure a Data Mesh?
How to govern and secure a Data Mesh?confluent
 
Migrating Your Oracle Database to PostgreSQL - AWS Online Tech Talks
Migrating Your Oracle Database to PostgreSQL - AWS Online Tech TalksMigrating Your Oracle Database to PostgreSQL - AWS Online Tech Talks
Migrating Your Oracle Database to PostgreSQL - AWS Online Tech TalksAmazon Web Services
 
CAF presentation 09 16-2020
CAF presentation 09 16-2020CAF presentation 09 16-2020
CAF presentation 09 16-2020Michael Nichols
 
Visualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSightVisualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSightAmazon Web Services
 
AWS Quicksight에서의 비즈니스 인텔리전스 - 김기완 :: 2015 리인벤트 리캡 게이밍
AWS Quicksight에서의 비즈니스 인텔리전스 - 김기완 :: 2015 리인벤트 리캡 게이밍AWS Quicksight에서의 비즈니스 인텔리전스 - 김기완 :: 2015 리인벤트 리캡 게이밍
AWS Quicksight에서의 비즈니스 인텔리전스 - 김기완 :: 2015 리인벤트 리캡 게이밍Amazon Web Services Korea
 
Amazon Relational Database Service (Amazon RDS)
Amazon Relational Database Service (Amazon RDS)Amazon Relational Database Service (Amazon RDS)
Amazon Relational Database Service (Amazon RDS)Amazon Web Services
 

Was ist angesagt? (20)

Data platform modernization with Databricks.pptx
Data platform modernization with Databricks.pptxData platform modernization with Databricks.pptx
Data platform modernization with Databricks.pptx
 
Migration From Oracle to PostgreSQL
Migration From Oracle to PostgreSQLMigration From Oracle to PostgreSQL
Migration From Oracle to PostgreSQL
 
Introduction to Amazon DynamoDB
Introduction to Amazon DynamoDBIntroduction to Amazon DynamoDB
Introduction to Amazon DynamoDB
 
Introduction to Azure SQL DB
Introduction to Azure SQL DBIntroduction to Azure SQL DB
Introduction to Azure SQL DB
 
Azure storage
Azure storageAzure storage
Azure storage
 
Azure SQL Database Managed Instance
Azure SQL Database Managed InstanceAzure SQL Database Managed Instance
Azure SQL Database Managed Instance
 
Azure SQL Database
Azure SQL DatabaseAzure SQL Database
Azure SQL Database
 
How to govern and secure a Data Mesh?
How to govern and secure a Data Mesh?How to govern and secure a Data Mesh?
How to govern and secure a Data Mesh?
 
Azure Data Factory v2
Azure Data Factory v2Azure Data Factory v2
Azure Data Factory v2
 
Migrating Your Oracle Database to PostgreSQL - AWS Online Tech Talks
Migrating Your Oracle Database to PostgreSQL - AWS Online Tech TalksMigrating Your Oracle Database to PostgreSQL - AWS Online Tech Talks
Migrating Your Oracle Database to PostgreSQL - AWS Online Tech Talks
 
CAF presentation 09 16-2020
CAF presentation 09 16-2020CAF presentation 09 16-2020
CAF presentation 09 16-2020
 
Visualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSightVisualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSight
 
Amazon QuickSight
Amazon QuickSightAmazon QuickSight
Amazon QuickSight
 
AWS Storage services
AWS Storage servicesAWS Storage services
AWS Storage services
 
AWS Quicksight에서의 비즈니스 인텔리전스 - 김기완 :: 2015 리인벤트 리캡 게이밍
AWS Quicksight에서의 비즈니스 인텔리전스 - 김기완 :: 2015 리인벤트 리캡 게이밍AWS Quicksight에서의 비즈니스 인텔리전스 - 김기완 :: 2015 리인벤트 리캡 게이밍
AWS Quicksight에서의 비즈니스 인텔리전스 - 김기완 :: 2015 리인벤트 리캡 게이밍
 
Azure storage
Azure storageAzure storage
Azure storage
 
Amazon Relational Database Service (Amazon RDS)
Amazon Relational Database Service (Amazon RDS)Amazon Relational Database Service (Amazon RDS)
Amazon Relational Database Service (Amazon RDS)
 
Introduction to Amazon Athena
Introduction to Amazon AthenaIntroduction to Amazon Athena
Introduction to Amazon Athena
 
RDBMS vs NoSQL
RDBMS vs NoSQLRDBMS vs NoSQL
RDBMS vs NoSQL
 
Aurora Deep Dive | AWS Floor28
Aurora Deep Dive | AWS Floor28Aurora Deep Dive | AWS Floor28
Aurora Deep Dive | AWS Floor28
 

Ähnlich wie Technical overview of Azure Cosmos DB

Azure Cosmos DB - Technical Deep Dive
Azure Cosmos DB - Technical Deep DiveAzure Cosmos DB - Technical Deep Dive
Azure Cosmos DB - Technical Deep DiveAndre Essing
 
Azure Cosmos DB - NoSQL Strikes Back (An introduction to the dark side of you...
Azure Cosmos DB - NoSQL Strikes Back (An introduction to the dark side of you...Azure Cosmos DB - NoSQL Strikes Back (An introduction to the dark side of you...
Azure Cosmos DB - NoSQL Strikes Back (An introduction to the dark side of you...Andre Essing
 
Azure Cosmos DB - The Swiss Army NoSQL Cloud Database
Azure Cosmos DB - The Swiss Army NoSQL Cloud DatabaseAzure Cosmos DB - The Swiss Army NoSQL Cloud Database
Azure Cosmos DB - The Swiss Army NoSQL Cloud DatabaseBizTalk360
 
Tour de France Azure PaaS 3/7 Stocker des informations
Tour de France Azure PaaS 3/7 Stocker des informationsTour de France Azure PaaS 3/7 Stocker des informations
Tour de France Azure PaaS 3/7 Stocker des informationsAlex Danvy
 
Using Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SFUsing Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SFAmazon Web Services
 
Modeling data and best practices for the Azure Cosmos DB.
Modeling data and best practices for the Azure Cosmos DB.Modeling data and best practices for the Azure Cosmos DB.
Modeling data and best practices for the Azure Cosmos DB.Mohammad Asif
 
Azure CosmosDB the new frontier of big data and nosql
Azure CosmosDB the new frontier of big data and nosqlAzure CosmosDB the new frontier of big data and nosql
Azure CosmosDB the new frontier of big data and nosqlRiccardo Cappello
 
Azure Cosmos DB + Gremlin API in Action
Azure Cosmos DB + Gremlin API in ActionAzure Cosmos DB + Gremlin API in Action
Azure Cosmos DB + Gremlin API in ActionDenys Chamberland
 
Time Series Analytics Azure ADX
Time Series Analytics Azure ADXTime Series Analytics Azure ADX
Time Series Analytics Azure ADXRiccardo Zamana
 
Azure DocumentDB Overview
Azure DocumentDB OverviewAzure DocumentDB Overview
Azure DocumentDB OverviewAndrew Liu
 
Presentacion redislabs-ihub
Presentacion redislabs-ihubPresentacion redislabs-ihub
Presentacion redislabs-ihubssuser9d7c90
 
Keynote 1: AWS re:Invent 2017 Recap - Solutions Overview
Keynote 1: AWS re:Invent 2017 Recap - Solutions OverviewKeynote 1: AWS re:Invent 2017 Recap - Solutions Overview
Keynote 1: AWS re:Invent 2017 Recap - Solutions OverviewAmazon Web Services
 
Azure CosmosDb - Where we are
Azure CosmosDb - Where we areAzure CosmosDb - Where we are
Azure CosmosDb - Where we areMarco Parenzan
 
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...The Hive
 
Azure Cosmos DB L100 Pitch Deck
Azure Cosmos DB L100 Pitch DeckAzure Cosmos DB L100 Pitch Deck
Azure Cosmos DB L100 Pitch DeckNicholas Vossburg
 

Ähnlich wie Technical overview of Azure Cosmos DB (20)

Azure Cosmos DB - Technical Deep Dive
Azure Cosmos DB - Technical Deep DiveAzure Cosmos DB - Technical Deep Dive
Azure Cosmos DB - Technical Deep Dive
 
Azure Cosmos DB - NoSQL Strikes Back (An introduction to the dark side of you...
Azure Cosmos DB - NoSQL Strikes Back (An introduction to the dark side of you...Azure Cosmos DB - NoSQL Strikes Back (An introduction to the dark side of you...
Azure Cosmos DB - NoSQL Strikes Back (An introduction to the dark side of you...
 
Azure Cosmos DB - The Swiss Army NoSQL Cloud Database
Azure Cosmos DB - The Swiss Army NoSQL Cloud DatabaseAzure Cosmos DB - The Swiss Army NoSQL Cloud Database
Azure Cosmos DB - The Swiss Army NoSQL Cloud Database
 
Tour de France Azure PaaS 3/7 Stocker des informations
Tour de France Azure PaaS 3/7 Stocker des informationsTour de France Azure PaaS 3/7 Stocker des informations
Tour de France Azure PaaS 3/7 Stocker des informations
 
Using Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SFUsing Data Lakes: Data Analytics Week SF
Using Data Lakes: Data Analytics Week SF
 
Using Data Lakes
Using Data LakesUsing Data Lakes
Using Data Lakes
 
Using Data Lakes
Using Data Lakes Using Data Lakes
Using Data Lakes
 
Modeling data and best practices for the Azure Cosmos DB.
Modeling data and best practices for the Azure Cosmos DB.Modeling data and best practices for the Azure Cosmos DB.
Modeling data and best practices for the Azure Cosmos DB.
 
Azure CosmosDB the new frontier of big data and nosql
Azure CosmosDB the new frontier of big data and nosqlAzure CosmosDB the new frontier of big data and nosql
Azure CosmosDB the new frontier of big data and nosql
 
Azure Cosmos DB + Gremlin API in Action
Azure Cosmos DB + Gremlin API in ActionAzure Cosmos DB + Gremlin API in Action
Azure Cosmos DB + Gremlin API in Action
 
Time Series Analytics Azure ADX
Time Series Analytics Azure ADXTime Series Analytics Azure ADX
Time Series Analytics Azure ADX
 
Azure DocumentDB Overview
Azure DocumentDB OverviewAzure DocumentDB Overview
Azure DocumentDB Overview
 
Presentacion redislabs-ihub
Presentacion redislabs-ihubPresentacion redislabs-ihub
Presentacion redislabs-ihub
 
Keynote 1: AWS re:Invent 2017 Recap - Solutions Overview
Keynote 1: AWS re:Invent 2017 Recap - Solutions OverviewKeynote 1: AWS re:Invent 2017 Recap - Solutions Overview
Keynote 1: AWS re:Invent 2017 Recap - Solutions Overview
 
Using Data Lakes
Using Data LakesUsing Data Lakes
Using Data Lakes
 
Azure CosmosDb - Where we are
Azure CosmosDb - Where we areAzure CosmosDb - Where we are
Azure CosmosDb - Where we are
 
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
 
Azure Cosmos DB L100 Pitch Deck
Azure Cosmos DB L100 Pitch DeckAzure Cosmos DB L100 Pitch Deck
Azure Cosmos DB L100 Pitch Deck
 
Using Data Lakes
Using Data LakesUsing Data Lakes
Using Data Lakes
 
AWS Big Data Landscape
AWS Big Data LandscapeAWS Big Data Landscape
AWS Big Data Landscape
 

Mehr von Microsoft Tech Community

Removing Security Roadblocks to IoT Deployment Success
Removing Security Roadblocks to IoT Deployment SuccessRemoving Security Roadblocks to IoT Deployment Success
Removing Security Roadblocks to IoT Deployment SuccessMicrosoft Tech Community
 
Building mobile apps with Visual Studio and Xamarin
Building mobile apps with Visual Studio and XamarinBuilding mobile apps with Visual Studio and Xamarin
Building mobile apps with Visual Studio and XamarinMicrosoft Tech Community
 
Best practices with Microsoft Graph: Making your applications more performant...
Best practices with Microsoft Graph: Making your applications more performant...Best practices with Microsoft Graph: Making your applications more performant...
Best practices with Microsoft Graph: Making your applications more performant...Microsoft Tech Community
 
Interactive emails in Outlook with Adaptive Cards
Interactive emails in Outlook with Adaptive CardsInteractive emails in Outlook with Adaptive Cards
Interactive emails in Outlook with Adaptive CardsMicrosoft Tech Community
 
Unlocking security insights with Microsoft Graph API
Unlocking security insights with Microsoft Graph APIUnlocking security insights with Microsoft Graph API
Unlocking security insights with Microsoft Graph APIMicrosoft Tech Community
 
Break through the serverless barriers with Durable Functions
Break through the serverless barriers with Durable FunctionsBreak through the serverless barriers with Durable Functions
Break through the serverless barriers with Durable FunctionsMicrosoft Tech Community
 
Multiplayer Server Scaling with Azure Container Instances
Multiplayer Server Scaling with Azure Container InstancesMultiplayer Server Scaling with Azure Container Instances
Multiplayer Server Scaling with Azure Container InstancesMicrosoft Tech Community
 
Media Streaming Apps with Azure and Xamarin
Media Streaming Apps with Azure and XamarinMedia Streaming Apps with Azure and Xamarin
Media Streaming Apps with Azure and XamarinMicrosoft Tech Community
 
Real-World Solutions with PowerApps: Tips & tricks to manage your app complexity
Real-World Solutions with PowerApps: Tips & tricks to manage your app complexityReal-World Solutions with PowerApps: Tips & tricks to manage your app complexity
Real-World Solutions with PowerApps: Tips & tricks to manage your app complexityMicrosoft Tech Community
 
Ingestion in data pipelines with Managed Kafka Clusters in Azure HDInsight
Ingestion in data pipelines with Managed Kafka Clusters in Azure HDInsightIngestion in data pipelines with Managed Kafka Clusters in Azure HDInsight
Ingestion in data pipelines with Managed Kafka Clusters in Azure HDInsightMicrosoft Tech Community
 
Getting Started with Visual Studio Tools for AI
Getting Started with Visual Studio Tools for AIGetting Started with Visual Studio Tools for AI
Getting Started with Visual Studio Tools for AIMicrosoft Tech Community
 
Mobile Workforce Location Tracking with Bing Maps
Mobile Workforce Location Tracking with Bing MapsMobile Workforce Location Tracking with Bing Maps
Mobile Workforce Location Tracking with Bing MapsMicrosoft Tech Community
 
Cognitive Services Labs in action Anomaly detection
Cognitive Services Labs in action Anomaly detectionCognitive Services Labs in action Anomaly detection
Cognitive Services Labs in action Anomaly detectionMicrosoft Tech Community
 

Mehr von Microsoft Tech Community (20)

100 ways to use Yammer
100 ways to use Yammer100 ways to use Yammer
100 ways to use Yammer
 
10 Yammer Group Suggestions
10 Yammer Group Suggestions10 Yammer Group Suggestions
10 Yammer Group Suggestions
 
Removing Security Roadblocks to IoT Deployment Success
Removing Security Roadblocks to IoT Deployment SuccessRemoving Security Roadblocks to IoT Deployment Success
Removing Security Roadblocks to IoT Deployment Success
 
Building mobile apps with Visual Studio and Xamarin
Building mobile apps with Visual Studio and XamarinBuilding mobile apps with Visual Studio and Xamarin
Building mobile apps with Visual Studio and Xamarin
 
Best practices with Microsoft Graph: Making your applications more performant...
Best practices with Microsoft Graph: Making your applications more performant...Best practices with Microsoft Graph: Making your applications more performant...
Best practices with Microsoft Graph: Making your applications more performant...
 
Interactive emails in Outlook with Adaptive Cards
Interactive emails in Outlook with Adaptive CardsInteractive emails in Outlook with Adaptive Cards
Interactive emails in Outlook with Adaptive Cards
 
Unlocking security insights with Microsoft Graph API
Unlocking security insights with Microsoft Graph APIUnlocking security insights with Microsoft Graph API
Unlocking security insights with Microsoft Graph API
 
Break through the serverless barriers with Durable Functions
Break through the serverless barriers with Durable FunctionsBreak through the serverless barriers with Durable Functions
Break through the serverless barriers with Durable Functions
 
Multiplayer Server Scaling with Azure Container Instances
Multiplayer Server Scaling with Azure Container InstancesMultiplayer Server Scaling with Azure Container Instances
Multiplayer Server Scaling with Azure Container Instances
 
Explore Azure Cosmos DB
Explore Azure Cosmos DBExplore Azure Cosmos DB
Explore Azure Cosmos DB
 
Media Streaming Apps with Azure and Xamarin
Media Streaming Apps with Azure and XamarinMedia Streaming Apps with Azure and Xamarin
Media Streaming Apps with Azure and Xamarin
 
DevOps for Data Science
DevOps for Data ScienceDevOps for Data Science
DevOps for Data Science
 
Real-World Solutions with PowerApps: Tips & tricks to manage your app complexity
Real-World Solutions with PowerApps: Tips & tricks to manage your app complexityReal-World Solutions with PowerApps: Tips & tricks to manage your app complexity
Real-World Solutions with PowerApps: Tips & tricks to manage your app complexity
 
Azure Functions and Microsoft Graph
Azure Functions and Microsoft GraphAzure Functions and Microsoft Graph
Azure Functions and Microsoft Graph
 
Ingestion in data pipelines with Managed Kafka Clusters in Azure HDInsight
Ingestion in data pipelines with Managed Kafka Clusters in Azure HDInsightIngestion in data pipelines with Managed Kafka Clusters in Azure HDInsight
Ingestion in data pipelines with Managed Kafka Clusters in Azure HDInsight
 
Getting Started with Visual Studio Tools for AI
Getting Started with Visual Studio Tools for AIGetting Started with Visual Studio Tools for AI
Getting Started with Visual Studio Tools for AI
 
Using AML Python SDK
Using AML Python SDKUsing AML Python SDK
Using AML Python SDK
 
Mobile Workforce Location Tracking with Bing Maps
Mobile Workforce Location Tracking with Bing MapsMobile Workforce Location Tracking with Bing Maps
Mobile Workforce Location Tracking with Bing Maps
 
Cognitive Services Labs in action Anomaly detection
Cognitive Services Labs in action Anomaly detectionCognitive Services Labs in action Anomaly detection
Cognitive Services Labs in action Anomaly detection
 
Speech Devices SDK
Speech Devices SDKSpeech Devices SDK
Speech Devices SDK
 

Kürzlich hochgeladen

Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 

Kürzlich hochgeladen (20)

Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 

Technical overview of Azure Cosmos DB

  • 1.
  • 2.
  • 3. I N T H I S S E S S I O N … Azure Cosmos DB Core Concepts and What’s New @ //Build/ 2018 TL;DR High-Level Overview Resource Model Request Units Partitioning Replication Automatic Indexing New Goodies Q&A
  • 4.
  • 5. SQL MongoDB Table API Turnkey global distribution Elastic scale out of storage & throughput Guaranteed low latency at the 99th percentile Comprehensive SLAs Five well-defined consistency models A Z U R E C O S M O S D B DocumentColumn-family Key-value Graph A globally distributed, massively scalable, multi-model database service
  • 6.
  • 7. Leveraging Azure Cosmos DB to automatically scale your data across the globe This module will reference partitioning in the context of all Azure Cosmos DB modules and APIs. R E S O U R C E M O D E L Account DatabaseDatabaseDatabase DatabaseDatabaseContainer DatabaseDatabaseItem
  • 8. Account DatabaseDatabaseDatabase DatabaseDatabaseContainer DatabaseDatabaseItem A C C O U N T U R I A N D C R E D E N T I A L S ********.azure.com IGeAvVUp …
  • 9. C R E AT I N G A C C O U N T Account DatabaseDatabaseDatabase DatabaseDatabaseContainer DatabaseDatabaseItem
  • 10. D ATA B A S E R E P R E S E N TAT I O N S Account DatabaseDatabaseDatabase DatabaseDatabaseContainer DatabaseDatabaseItem DatabaseDatabaseContainer DatabaseDatabaseItem
  • 11. C O N TA I N E R R E P R E S E N TAT I O N S Account DatabaseDatabaseDatabase DatabaseDatabaseContainer DatabaseDatabaseItem = Collection Graph Table
  • 12. C R E AT I N G C O L L E C T I O N S – S Q L A P I Account DatabaseDatabaseDatabase DatabaseDatabaseContainer DatabaseDatabaseItem
  • 13. C O N TA I N E R - L E V E L R E S O U R C E S Account DatabaseDatabaseDatabase DatabaseDatabaseContainer DatabaseDatabaseItem ConflictSproc Trigger UDF
  • 14. S Y S T E M TO P O LO G Y ( B E H I N D T H E S C E N E S ) Resource Manager Language Runtime(s) Hosts Query Processor RSM Index Manager Bw-tree++/ LLAMA++ Log Manager IO Manager Resource Governor Transport Database engine Admission control … … Planet Earth Azure regions Datacenters Stamps Fault domains Cluster Machine Replica Database engine Container Various agents
  • 15. R E S O U R C E H I E R A R C H Y CONTAINERS Logical resources “surfaced” to APIs as tables, collections or graphs, which are made up of one or more physical partitions or servers. RESOURCE PARTITIONS • Consistent, highly available, and resource-governed coordination primitives • Consist of replica sets, with each replica hosting an instance of the database engine Containers Resource Partitions CollectionsTables Graphs Tenants Leader Follower Follower Forwarder Replica Set To remote resource partition(s)
  • 16.
  • 17. R E Q U E S T U N I T S Request Units (RUs) is a rate-based currency Abstracts physical resources for performing requests Key to multi-tenancy, SLAs, and COGS efficiency Foreground and background activities % IOPS% CPU% Memory
  • 18. R E Q U E S T U N I T S Normalized across various access methods 1 read of 1 KB document from a single partition Each request consumes fixed RUs Applies to reads, writes, query, and stored procedures GET POST PUT Query … = = = =
  • 19. R E Q U E S T U N I T S Provisioned in terms of RU/sec Rate limiting based on amount of throughput provisioned Can be increased or decreased instantaneously Metered Hourly Background processes like TTL expiration, index transformations scheduled when quiescent Min RU/sec Max RU/sec IncomingRequests Replica Quiescent Rate limit No rate limiting
  • 20. * N E W * P R O V I S I O N R U / S F O R A S E T O F C O N TA I N E R S Remove friction for OSS NoSQL APIs Provision RU/sec shared across containers Mix containers with dedicated throughput and containers with shared throughput Elastically scale provisioned throughput for a set of containers at any time
  • 21.
  • 22. E L A S T I C S C A L E O U T O F S TO R A G E A N D T H R O U G H P U T SCALES AS YOUR APPS’ NEEDS CHANGE Database elastically scales storage and throughput How? Scale-out! Collections can span across large clusters of machines Can start small and seamlessly grow as your app grows
  • 23. E L A S T I C S C A L E O U T O F S TO R A G E A N D T H R O U G H P U T SCALES AS YOUR APPS’ NEEDS CHANGE Database elastically scales storage and throughput How? Scale-out! Collections can span across large clusters of machines Can start small and seamlessly grow as your app grows
  • 24. PA R T I T I O N S Cosmos DB Container (e.g. Collection) Partition Key: User ID Logical Partitioning Abstraction Behind the Scenes: Physical Partition Sets hash(User ID) Psuedo-random distribution of data over range of possible hashed values
  • 25. PA R T I T I O N S … Partition 1 Partition 2 Partition n Frugal # of Partitions based on actual storage and throughput needs (yielding scalability with low total cost of ownership) hash(User ID) Pseudo-random distribution of data over range of possible hashed values Andrew Mike … Bob Dharma Shireesh Karthik Rimma Alice Carol …
  • 26. PA R T I T I O N S … Partition 1 Partition 2 Partition n What happens when partitions need to grow? hash(User ID) Pseudo-random distribution of data over range of possible hashed values Andrew Mike … Bob Dharma Shireesh Karthik Rimma Alice Carol …
  • 27. PA R T I T I O N S Partition Ranges can be dynamically sub-divided to seamlessly grow database as the application grows while simultaneously maintaining high availability. Partition management is fully managed by Azure Cosmos DB, so you don't have to write code or manage your partitions. + Partition x Partition x1 Partition x2 hash(User ID) Pseudo-random distribution of data over range of possible hashed values Rimma Karthik … Dharma Shireesh Karthik Rimma Alice Carol … Dharma Shireesh …
  • 28. PA R T I T I O N S Best Practices: Design Goals for Choosing a Good Partition Key • Distribute the overall request + storage volume • Avoid “hot” partition keys Steps for Success • Ballpark scale needs (size/throughput) • Understand the workload • # of reads/sec vs writes per sec • Use pareto principal (80/20 rule) to help optimize bulk of workload • For reads – understand top 3-5 queries (look for common filters) • For writes – understand transactional needs General Tips • Build a POC to strengthen your understanding of the workload and iterate (avoid analyses paralysis) • Don’t be afraid of having too many partition keys • Partitions keys are logical • More partition keys  more scalability • Partition Key is scope for multi-record transactions and routing queries • Queries can be intelligently routed via partition key • Omitting partition key on query requires fan-out
  • 29. * N E W * B U L K E X E C U TO R L I B R A R Y Easy out-of-the-box bulk operation functionality Supports bulk import and update Auto handles congestion control + transient errors 10x client-side performance improvement Easily scale-out clients across more VMs Available starting with .NET and Java
  • 30.
  • 31.
  • 32. T U R N K E Y G LO B A L D I S T R I B U T I O N High Availability • Automatic and Manual Failover • Multi-homing API removes need for app redeployment Low Latency (anywhere in the world) • Packets cannot move fast than the speed of light • Sending a packet across the world under ideal network conditions takes 100’s of milliseconds • You can cheat the speed of light – using data locality • CDN’s solved this for static content • Azure Cosmos DB solves this for dynamic content
  • 33. T U R N K E Y G LO B A L D I S T R I B U T I O N • Automatic and transparent replication worldwide • Each partition hosts a replica set per region • Customers can test end to end application availability by programmatically simulating failovers • All regions are hidden behind a single global URI with multi-homing capabilities • Customers can dynamically add / remove additional regions at any time Writes/ Reads Reads "airport" : “AMS" "airport" : “MEL" West US Container "airport" : "LAX" Local Distribution (via horizontal partitioning) GlobalDistribution(ofresourcepartitions) Reads 30K transactions/sec Writes/ Reads Reads Reads West Europe 30K transactions/sec Partition-key = "airport"
  • 34. R E P L I C AT I N G D ATA G LO B A L LY
  • 35. R E P L I C AT I N G D ATA G LO B A L LY
  • 36. R E P L I C AT I N G D ATA G LO B A L LY
  • 37. A U TO M AT I C FA I LO V E R
  • 38. A U TO M AT I C FA I LO V E R
  • 39. M A N U A L FA I LO V E R
  • 40. Strong Bounded-staleness Session Consistent prefix Eventual F I V E W E L L - D E F I N E D C O N S I S T E N C Y M O D E L S CHOOSE THE BEST CONSISTENCY MODEL FOR YOUR APP Five well-defined, consistency models Overridable on a per-request basis Provides control over performance-consistency tradeoffs, backed by comprehensive SLAs. An intuitive programming model offering low latency and high availability for your planet-scale app. CLEAR TRADEOFFS • Latency • Availability • Throughput
  • 41. * N E W * M U LT I - M A S T E R ( P R E V I E W ) Perfect for Intelligent Cloud and Intelligent Edge Applications Write scalability around the world Low latency writes around the world 99.999% High Availability around the world Well-defined consistency models Comprehensive conflict management
  • 42.
  • 43. H A N D L E A N Y D ATA W I T H N O S C H E M A O R I N D E X I N G R E Q U I R E D Azure Cosmos DB’s schema-less service automatically indexes all your data, regardless of the data model, to delivery blazing fast queries. Item Color Microwave safe Liquid capacity CPU Memory Storage Geek mug Graphite Yes 16ox ??? ??? ??? Coffee Bean mug Tan No 12oz ??? ??? ??? Surface book Gray ??? ??? 3.4 GHz Intel Skylake Core i7- 6600U 16GB 1 TB SSD • Automatic index management • Synchronous auto-indexing • No schemas or secondary indices needed • Works across every data model GEEK
  • 44. I N D E X I N G J S O N D O C U M E N T S { "locations": [ { "country": "Germany", "city": "Berlin" }, { "country": "France", "city": "Paris" } ], "headquarter": "Belgium", "exports": [ { "city": "Moscow" }, { "city": "Athens" } ] } locations headquarter exports 0 country city Germany Berlin 1 country city France Paris 0 1 city Athens city Moscow Belgium
  • 45. I N D E X I N G J S O N D O C U M E N T S { "locations": [ { "country": "Germany", "city": "Bonn", "revenue": 200 } ], "headquarter": "Italy", "exports": [ { "city": "Berlin", "dealers": [ { "name": "Hans" } ] }, { "city": "Athens" } ] } locations headquarter exports 0 country city Germany Bonn revenue 200 0 1 citycity Berlin Italy dealers 0 name Hans
  • 46. I N D E X I N G J S O N D O C U M E N T S Athens locations headquarter exports 0 country city Germany Bonn revenue 200 0 1 citycity Berlin Italy dealers 0 name Hans locations headquarter exports 0 country city Germany Berlin 1 country city France Paris 0 1 city Athens city Moscow Belgium
  • 47. I N V E R T E D I N D E X locations headquarter exports 0 country city Germany Berlin revenue 200 0 1 city Athens city Berlin Italy dealers 0 name Hans Bonn 1 country city France Paris Belgium Moscow
  • 48. I N D E X P O L I C I E S CUSTOM INDEXING POLICIES Though all Azure Cosmos DB data is indexed by default, you can specify a custom indexing policy for your collections. Custom indexing policies allow you to design and customize the shape of your index while maintaining schema flexibility. • Define trade-offs between storage, write and query performance, and query consistency • Include or exclude documents and paths to and from the index • Configure various index types { "automatic": true, "indexingMode": "Consistent", "includedPaths": [{ "path": "/*", "indexes": [{ "kind": "Hash", "dataType": "String", "precision": -1 }, { "kind": "Range", "dataType": "Number", "precision": -1 }, { "kind": "Spatial", "dataType": "Point" }] }], "excludedPaths": [{ "path": "/nonIndexedContent/*" }] }
  • 49.
  • 50. P R O V I S I O N T H R O U G H P U T F O R A S E T O F C O N TA I N E R S Remove friction for OSS NoSQL APIs Provision RU/sec shared across containers Mix containers with dedicated throughput and containers with shared throughput Elastically scale provisioned throughput for a set of containers at any time
  • 51. B U L K E X E C U TO R L I B R A R Y Easy out-of-the-box bulk operation functionality Supports bulk import and update Auto handles congestion control + transient errors 10x client-side performance improvement Easily scale-out clients across more VMs Available starting with .NET and Java
  • 52. M U LT I - M A S T E R @ G LO B A L S C A L E ( P R E V I E W ) Perfect for Intelligent Cloud and Intelligent Edge Applications Write scalability around the world Low latency writes around the world 99.999% High Availability around the world Well-defined consistency models Comprehensive conflict management
  • 53. V N E T S E R V I C E E N D P O I N T Secure communication without exposing public endpoints Limit access to specific VNET(s) subnet(s) Compatible with IP Firewall ACLs Available in all Azure regions
  • 54. J AVA A S Y N C L I B R A R Y F O R S Q L A P I New Async API surface for event-based programs w/ observable sequencies Leverages popular RxJava library 2x client-side performance improvement Improved user experience
  • 55. R E C A P Azure Cosmos DB Core Concepts and What’s New @ //Build/ 2018 TL;DR High-Level Overview Resource Model Request Units Partitioning Replication Automatic Indexing New Goodies Q&A