IoT day 2015
NoSQL in Azure for IoT
(and Business)
Marco Parenzan
Microsoft Azure MVP
@marco_parenzan
marco [dot] parenzan [at] 1nn0va [dot] it
Speaker info: Marco Parenzan
www.slideshare.net/marco.parenzan
www.github.com/marcoparenzan
www.1nnova.it
Training, outreach and consulting with 1nn0va
Microsoft MVP 2014 for Microsoft Azure
Cloud Architect, .NET developer
Loves Functional Programming, HTML5 Game Programming and Internet of Things
Microservices Saturday 2015: a journey with NServiceBus
Live Azure Community Bootcamp 2015
IoT as a hobby (now…?)
Data Ecosystem
Where do I put data
received in EventHub?
From private to public Cloud
A Continuous offering
Microsoft Relational Storage Options
SQL Server database technology “as a Service”
Fully managed database-as-a-service built on SQL with near zero administration
Enterprise-ready with automatic support for HA, DR, Backups, replication and more
Highly available and elastically scalable for unpredictable SaaS workloads
Uptime SLA of 99.99%
Predictable performance & Pricing
Built-in regional database geo-replication for additional protection
All core search capabilities - faceting, suggestions, geospatial
Secure and compliant for your sensitive data
Fully compatible with SQL Server 2014 databases
SQL Azure features
[Figure: The Microsoft data platform. Labels: data (relational, non-relational, NoSQL, streaming, internal & external); information management; orchestration; complex event processing; modeling; machine learning; applications; dashboards; reports; mobile; natural language query]
The traditional world
Business, no longer data, is the foundation of software design
DDD!=OOP
Don’t start from Data
Data are not unique
No more ACID: ACID transactions are not useful in a distributed model that spans different stores
Paradigm Shift
How many queries can be determined at analysis time?
“A repository should offer an explicit and well-defined contract and avoid arbitrary queries”
In business … don’t delete anything (a repository doesn’t delete anything)
From theory to practice
[Diagram: Classic MVC. View, Controller, Business Logic, Contract BL/P]
[Diagram: CQRS (Service Bus powered). UI, Command Handler, Event Handler, Event, Queue, Topics/Subscription]
[Diagram: CQRS for IoT (Service Bus powered). Device, Event Hub, UI, Command Handler, Event Handler, Event, Queue, Topics/Subscription, Write Model, Read/Search Model]
No longer build on data…but on “what happens”
No more one single data store
Data store types
Logs
Persistence
Saga (long transactions)
Search
Event-based systems
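Building on "what happens" rather than on a single data store can be sketched as an append-only event log (the write side) with a separate read model derived from it. The following is only a minimal in-memory sketch: the event shape (`type`, `deviceId`, `value`) and the function names are assumptions made for illustration, not something shown in the talk.

```javascript
// Write side: an append-only log of events ("what happens").
// Events are never updated or deleted.
const eventLog = [];

function appendEvent(event) {
  // Copy and freeze so a stored event cannot be changed afterwards.
  eventLog.push(Object.freeze({ ...event, sequence: eventLog.length + 1 }));
}

// Read side: a projection rebuilt from the events and shaped for one
// specific query (here: the latest temperature per device).
function projectLatestTemperature(events) {
  const latest = new Map();
  for (const e of events) {
    if (e.type === "TemperatureMeasured") {
      latest.set(e.deviceId, e.value);
    }
  }
  return latest;
}

appendEvent({ type: "TemperatureMeasured", deviceId: "dev-1", value: 20.5 });
appendEvent({ type: "TemperatureMeasured", deviceId: "dev-2", value: 18 });
appendEvent({ type: "TemperatureMeasured", deviceId: "dev-1", value: 21 });

const readModel = projectLatestTemperature(eventLog);
console.log(readModel.get("dev-1")); // 21
```

The log and the read model could live in different stores, which is the point of "no more one single data store".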
The Big Picture
A modern view:
The traditional world in Azure
Why Use a NoSQL Technology on Azure?
Choosing a Data Technology
Db for what?
To store data?
To manipulate data?
Long-term theme
NoSql Introduction
Key/Value
Table
Blob
Queue
Graph
Document
Not Only Sql Paradigms
What is a document database?
Definitely NOT this
kind of document !
What is a document database?
Not ideal, but it can work -
{
"id": "13244_post",
"text": "Lorizzle ghetto dolor tellivizzle boofron, stuff pimpin' elizzle. Nullam sapizzle
velizzle, my shizz tellivizzle, suscipizzle funky fresh, shizzle my nizzle crocodizzle
vizzle, arcu. Pellentesque eget tortizzle. Sizzle erizzle. Mammasay mammasa mamma oo sa
break it down dolor own yo' things fo shizzle mah nizzle fo rizzle, mah home g-dizzle
sure. Maurizzle pellentesque dawg ghetto turpizzle. Shiz izzle my shizz. Pellentesque
eleifend rhoncizzle nisi. In its fo rizzle owned ma nizzle dictumst. Sizzle gangsta.
Curabitur tellizzle urna, pretizzle go to hizzle, mattizzle izzle, eleifend vitae,
tellivizzle. Dawg shizzlin dizzle. Integer semper velit sizzle stuff.
Boofron mofo auctizzle ma nizzle. Pot a elizzle ut nibh pretium tincidunt. Maecenizzle
things erat. Own yo' in lacizzle sed maurizzle elementizzle tristique. I'm in the
shizzle yippiyo sizzle daahng dawg eros ultricizzle . In velit tortor, ultricizzle
ghetto, hendrerizzle fo shizzle mah nizzle fo rizzle, mah home g-dizzle, adipiscing
crunk, boom shackalack. Etizzle velit doggy, hizzle consequizzle, pharetra get down
get down, dictizzle sed, shut the shizzle up. Fo shizzle neque. Fo lorizzle. Bling
bling vitae pizzle ut libero commodo gizzle. Fusce izzle augue eu yo mamma dang.
Phasellizzle break it down fo nizzle erat. Suspendisse shizzlin dizzle owned,
sollicitudin sizzle, mah nizzle izzle, commodo nec, justo. Donizzle fizzle
porttitizzle ligula. Nunc feugizzle, tellus tellivizzle ornare tempor, sapizzle break
it down tincidunt gangster, eget dapibus daahng dawg enizzle izzle that's the shizzle.
Stuff quizzle leo, imperdizzle izzle, fo shizzle my nizzle izzle, semper izzle,
sapien. Ut boofron magna vizzle ghetto. I'm in the shizzle ante bling bling,
suscipizzle vitae, yo mamma stuff, rutrizzle pizzle, velizzle.
Mauris da bomb go to zzle. Sizzle mammasay mammasa mamma oo sa magna own yo' amet risus
congue. Boofron mofo auctizzle ma nizzle. Pot a elizzle ut nibh pretium tincidunt.
things erat. Own yo' in lacizzle sed maurizzle elementizzle tristique. I'm in the
shizzle yippiyo sizzle daahng dawg eros ultricizzle . In velit tortor, ultricizzle
ghetto, hendrerizzle fo shizzle mah nizzle fo rizzle, mah home g-dizzle, adipiscing
crunk, boom shackalack. Etizzle velit doggy, hizzle consequizzle, pharetra get down
get down, dictizzle sed, shut the shizzle up. Fo shizzle neque. Fo lorizzle. Bling "
}
What is a document database?
Ideally suited to this
kind of document -
{
  "id": "13244_user",
  "firstName": "John",
  "lastName": "Smith",
  "age": 25,
  "employmentHistory": [
    {
      "company": "Contoso Inc",
      "start": {"date": "Thu, 02 Apr 2015 20:54:45 GMT", "epoch": 1428008086},
      "position": "CEO"
    },
    {
      "start": {"date": "Thu, 02 Apr 2012 20:54:45 GMT", "epoch": 1428008086},
      "end": {"date": "Thu, 01 Apr 2015 20:54:45 GMT", "epoch": 1428008086},
      "position": "GM"
    }
  ],
  "address": {
    "streetAddress": "21 2nd Str",
    "city": "New York",
    "state": "NY",
    "postalCode": "10021"
  },
  "children": [
    {"name": "Megan", "age": 10},
    {"name": "Bruce", "age": 7},
    {"name": "Angus", "sports": ["football", "basketball", "hockey"]}
  ],
  "mobileNumber": "212 555-1234"
}
JSON can represent complex containment relationships that are difficult to represent in an RDBMS
Schema-less: great for growing requirements during development, unlike an RDBMS, where you must know the structure up front and it's painful to modify it
Native notation for JavaScript
Why JSON?
When working with relational databases, we've been taught for years to normalize, normalize, normalize.
With a document database, try instead to treat your entities as self-contained documents represented in JSON.
In general, embed when:
• There are "contains" relationships between entities.
• There are one-to-few relationships between entities.
• There is embedded data that changes infrequently.
• There is embedded data that won't grow without bound.
• There is embedded data that is integral to data in a document.
Embedding
Embedding typically provides better read performance
In general, reference (normalize) when:
• Representing one-to-many relationships.
• Representing many-to-many relationships.
• Related data changes frequently.
• Referenced data could be unbounded.
Provides more flexibility than embedding
More round trips to read data
Referencing
Normalizing typically provides better write performance
No magic bullet
Think about how your data is going to be written and read, and model accordingly
Hybrid models ~ denormalize + reference + aggregate
{
  "id": "1",
  "firstName": "Thomas",
  "lastName": "Andersen",
  "countOfBooks": 3,
  "books": [1, 2, 3],
  "images": [
    {"thumbnail": "http://....png"},
    {"profile": "http://....png"}
  ]
}
{
  "id": 1,
  "name": "DocumentDB 101",
  "authors": [
    {"id": 1, "name": "Thomas Andersen", "thumbnail": "http://....png"},
    {"id": 2, "name": "William Wakefield", "thumbnail": "http://....png"}
  ]
}
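The read-side trade-off between embedding and referencing can be made concrete by counting round trips. This is a hedged sketch using an in-memory `Map` as a stand-in for a collection; `readDocument`, the document ids and the second book title are hypothetical and exist only for this example.

```javascript
// In-memory stand-in for a document collection, keyed by document id.
const docs = new Map();
let roundTrips = 0;

function readDocument(id) {
  roundTrips += 1; // every read counts as one round trip to the store
  return docs.get(id);
}

// Embedded model: the books live inside the author document.
docs.set("author-embedded", {
  id: "author-embedded",
  name: "Thomas Andersen",
  books: [{ name: "DocumentDB 101" }, { name: "Hypothetical Book 2" }],
});

// Referenced model: the author only stores the ids of separate book documents.
docs.set("author-ref", { id: "author-ref", name: "Thomas Andersen", bookIds: ["book-1", "book-2"] });
docs.set("book-1", { id: "book-1", name: "DocumentDB 101" });
docs.set("book-2", { id: "book-2", name: "Hypothetical Book 2" });

function bookNamesEmbedded(authorId) {
  return readDocument(authorId).books.map((b) => b.name);
}

function bookNamesReferenced(authorId) {
  return readDocument(authorId).bookIds.map((id) => readDocument(id).name);
}

roundTrips = 0;
const embeddedNames = bookNamesEmbedded("author-embedded");
const embeddedTrips = roundTrips; // 1 read

roundTrips = 0;
const referencedNames = bookNamesReferenced("author-ref");
const referencedTrips = roundTrips; // 1 read for the author + 1 per book

console.log(embeddedTrips, referencedTrips); // 1 3
```

Both models return the same book names; the referenced one pays for its flexibility with extra reads, which is why a hybrid model often duplicates the few fields that are read most.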
Promote code first development (mapping objects to json)
Resilient to iterative schema changes
Richer query and indexing (compared to KV stores)
Low impedance as object / JSON store; no ORM required
It just works
It’s fast
Developer Appeal
DocumentDb Introduction
Store schema-less JSON documents
Excels at search w/ SQL syntax
JavaScript for Stored Procs, Triggers and UDFs
Elastic capacity (not in specific Azure sense, up to now)
Multi-document transaction (Batch)
Tweak everything (read/write performance vs. consistency, index
performance, security)
Designed for massive scale
What is DocumentDb?
Applications that need managed elastic scale
Customer does not want to add additional IT resources for
support and maintenance
Avoiding CAPEX and OPEX
Built-for-the-cloud database technology
Access via RESTful HTTP API or client library
DocumentDB: DbaaS
Catalog data
Preferences and state
Event store
User generated content
Data exchange
Typical usage
Resource Model
Database Account → Database → Collections* → Documents
Also part of the model: Users, Server Scripts, Attachments
* collection != table of homogeneous entities
collection ~ a data partition
Example document:
{
  "id": "123",
  "name": "joe",
  "age": 30,
  "address": {
    "street": "some st"
  }
}
Collections
a container of JSON documents and the associated JavaScript
application logic
JSON docs inside of a collection can vary dramatically
A unit of scale for transaction and query throughput (capacity
units allocated uniformly across all collections)
A unit of scale for capacity
A unit of replication
What is a collection?
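If capacity units are allocated uniformly across all collections, the throughput each collection receives is simple arithmetic. The sketch below assumes the figure of 2,000 request units per capacity unit quoted in the performance tips later in this deck, and assumes the uniform split is the only factor; the function name is made up for the example.

```javascript
// Request units made available by one capacity unit (figure quoted in the
// performance tips of this deck; treat it as an assumption of the sketch).
const RU_PER_CAPACITY_UNIT = 2000;

// Throughput per collection when capacity is split uniformly.
function requestUnitsPerCollection(capacityUnits, collectionCount) {
  if (collectionCount < 1) {
    throw new RangeError("At least one collection is required");
  }
  return (capacityUnits * RU_PER_CAPACITY_UNIT) / collectionCount;
}

console.log(requestUnitsPerCollection(1, 3)); // roughly 666.67 RU/s each
console.log(requestUnitsPerCollection(1, 2)); // 1000
```

Under this model an empty collection still takes its share, which is why deleting unused collections frees throughput for the others.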
Collections in DocumentDB are not just logical containers, but
also physical containers
They are the transaction boundary for stored procedures and
triggers
entry point to queries and CRUD operations
Each collection is assigned a reserved amount of throughput
which is not shared with other collections in the same account
Collections do not enforce schema
Collections
Partitioning
Design: Partitioning
Why Partition?
• Data Size
A single collection (currently*) holds 10GB
• Throughput
3 Performance tiers with a max of 2,500 RU/sec
In hash partitioning, partitions are assigned based on the value
of a hash function, allowing you to evenly distribute requests
and data across a number of partitions. This is commonly used
to partition data produced or consumed from a large number of
distinct clients, and is useful for storing user profiles, catalog
items, and IoT ("Internet of Things") telemetry data.
Hash Partitioning
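Hash partitioning can be sketched as "hash the partition key, then take the remainder by the number of partitions". The example below uses an FNV-1a style hash purely as an illustration; the hash choice, the collection names and `resolveHashPartition` are assumptions, not the resolver used by any SDK.

```javascript
// FNV-1a style 32-bit hash. It is applied to UTF-16 code units here, which
// matches byte-wise FNV-1a for ASCII keys and is good enough for a sketch.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash;
}

// Map a partition key to one of the available collections.
function resolveHashPartition(partitionKey, collectionLinks) {
  const index = fnv1a(String(partitionKey)) % collectionLinks.length;
  return collectionLinks[index];
}

// Hypothetical collection names.
const collections = ["telemetry0", "telemetry1", "telemetry2", "telemetry3"];

// Count how 1000 device ids spread over the collections.
const counts = {};
for (let device = 0; device < 1000; device++) {
  const target = resolveHashPartition("device-" + device, collections);
  counts[target] = (counts[target] || 0) + 1;
}
console.log(counts);
```

The same key always resolves to the same collection, so a point read by device id touches exactly one partition.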
In range partitioning, partitions are assigned based on whether
the partition key is within a certain range
This is commonly used for partitioning with time
stamp properties
Keep current data hot, Warm historical data, Scale-down older
data, Purge / Archive
Range partitioning
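A range resolver for timestamps can be sketched as a list of half-open epoch ranges, each mapped to a collection. The yearly ranges and collection names below are hypothetical; only the idea of "the partition key falls within a certain range" comes from the slide.

```javascript
// Each range covers [fromEpoch, toEpoch) in seconds since the Unix epoch.
const rangeMap = [
  { fromEpoch: 1356998400, toEpoch: 1388534400, collection: "telemetry-2013" }, // year 2013 (UTC)
  { fromEpoch: 1388534400, toEpoch: 1420070400, collection: "telemetry-2014" }, // year 2014 (UTC)
];

// Return the collection whose range contains the given timestamp.
function resolveRangePartition(epoch, ranges) {
  const match = ranges.find((r) => epoch >= r.fromEpoch && epoch < r.toEpoch);
  if (!match) {
    throw new RangeError("No partition covers epoch " + epoch);
  }
  return match.collection;
}

console.log(resolveRangePartition(1401662986, rangeMap)); // telemetry-2014
console.log(resolveRangePartition(1376779786, rangeMap)); // telemetry-2013
```

Because whole ranges map to whole collections, an old range can be scaled down, archived or purged by acting on a single collection.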
In lookup partitioning, partitions are assigned based on a lookup map (a.k.a. a partition map or shard map) that assigns discrete partition values to specific partitions
This is commonly used for partitioning by region
Lookup partitioning
Tenant | Partition Id
Customer | 1
Big Customer | 2
Another | 3
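The tenant table above is in effect a shard map. A minimal sketch of resolving a partition from it, with `resolveLookupPartition` as a hypothetical helper name:

```javascript
// Shard map taken from the tenant table: tenant name -> partition id.
const shardMap = new Map([
  ["Customer", 1],
  ["Big Customer", 2],
  ["Another", 3],
]);

// Look up the partition for a tenant; unknown tenants are an error because
// there is no rule to fall back on in lookup partitioning.
function resolveLookupPartition(tenant, map) {
  if (!map.has(tenant)) {
    throw new Error("No partition assigned to tenant: " + tenant);
  }
  return map.get(tenant);
}

console.log(resolveLookupPartition("Big Customer", shardMap)); // 2
```

Moving a tenant to another partition only requires changing its entry in the map (and moving its data).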
{ "record": "1", "created": { "date": "6/1/2014", "epoch": 1401662986 } },
{ "record": "3", "created": { "date": "9/23/2014", "epoch": 1411512586 } },
{ "record": "123", "created": { "date": "8/17/2013", "epoch": 1376779786 } }
SELECT * FROM root r WHERE r.created.epoch BETWEEN 1376779786 AND 1401662986
{ "record": "1", "created": { "date": "6/1/2014", "epoch": 1401662986 } },
{ "record": "3", "created": { "date": "9/23/2014", "epoch": 1411512586 } },
{ "record": "43233", "created": { "epoch": 1411512586 } },
{ "record": "1123", "created": { "date": "8/17/2013", "epoch": 1376779786 } },
{ "record": "43234", "created": { "epoch": 1376779786 } }
Partitioning - Fan-out Queries
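A fan-out query runs the same filter against every partition and merges the partial results on the client. The sketch below simulates this in memory with a few of the sample records; the partition names and helper functions are assumptions made for illustration.

```javascript
// Hypothetical partitions, each holding some of the sample documents.
const partitions = {
  "telemetry-2013": [{ record: "123", created: { epoch: 1376779786 } }],
  "telemetry-2014": [
    { record: "1", created: { epoch: 1401662986 } },
    { record: "3", created: { epoch: 1411512586 } },
  ],
};

// The per-partition query: an inclusive range filter, like SQL BETWEEN.
function queryPartition(documents, fromEpoch, toEpoch) {
  return documents.filter(
    (d) => d.created.epoch >= fromEpoch && d.created.epoch <= toEpoch
  );
}

// Fan out to every partition, then merge and sort on the client.
function fanOutQuery(allPartitions, fromEpoch, toEpoch) {
  return Object.values(allPartitions)
    .flatMap((documents) => queryPartition(documents, fromEpoch, toEpoch))
    .sort((a, b) => a.created.epoch - b.created.epoch);
}

const hits = fanOutQuery(partitions, 1376779786, 1401662986);
console.log(hits.map((d) => d.record)); // [ '123', '1' ]
```

The cost grows with the number of partitions touched, so a partition key that lets most queries target a single partition keeps fan-out the exception.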
Consistency
Query / transaction throughput (and reliability – i.e., hardware failure) depend on
replication!
All writes to the primary are replicated across two secondary replicas
All reads are distributed across three copies
“Scalability of throughput” – allowing different clients to read from different replicas helps prevent
bottlenecks
BUT replication takes time!
Potential scenario: some clients are
reading while another is writing
Now, the data is out-of-date, inconsistent!
Why worry about consistency?
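The stale-read scenario can be reproduced with a toy replica set in which replication only happens when explicitly triggered. This is a simulation written for this text, not how the service is implemented; the class and method names are hypothetical.

```javascript
// One primary and two secondaries, as on the slide.
class ReplicaSet {
  constructor() {
    this.primary = new Map();
    this.secondaries = [new Map(), new Map()];
    this.pending = []; // writes accepted by the primary but not yet replicated
  }

  write(key, value) {
    this.primary.set(key, value);
    this.pending.push([key, value]);
  }

  // Replication takes time; in this sketch it happens only when called.
  replicate() {
    for (const [key, value] of this.pending) {
      for (const secondary of this.secondaries) {
        secondary.set(key, value);
      }
    }
    this.pending = [];
  }

  readFromPrimary(key) {
    return this.primary.get(key);
  }

  readFromSecondary(index, key) {
    return this.secondaries[index].get(key);
  }
}

const replicas = new ReplicaSet();
replicas.write("sensor-1", 20);
replicas.replicate();
replicas.write("sensor-1", 25); // not replicated yet

const fresh = replicas.readFromPrimary("sensor-1"); // 25
const stale = replicas.readFromSecondary(0, "sensor-1"); // still 20

replicas.replicate();
const caughtUp = replicas.readFromSecondary(0, "sensor-1"); // 25

console.log(fresh, stale, caughtUp); // 25 20 25
```

The consistency levels that follow differ in whether, and for how long, a client is allowed to observe the `stale` value.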
Trade-off: speed (performance & availability) or consistency
(data correctness)?
“Does every read need the MOST current data?”
“Or do I need every request to be handled and handled quickly?”
No “one size fits all” answer … so it’s up to you!
4 options …
For the entire Db…
…In a future release, we intend to support overriding the default consistency level on
a per collection basis.
Tweakable Consistency
client always sees completely consistent data
Slowest reads / writes
Mission critical: e.g. stock market, banking, airline reservation
Strong
Default – even trade-off between performance & availability vs.
data correctness
client reads its own writes, but other clients reading this same
data might see older values
Session
client might see old data, but it can specify a limit for how old that data can be (e.g. 2 seconds)
Updates happen in order received
similar to Session consistency, but speeds up reads while still
preserving the order of updates
Bounded Staleness
client might see old data for as long as it takes a write to
propagate to all replicas
High performance & availability, but a client might sometimes
read out-of-date information or see updates out of order
Eventual
At the database level (see preview portal)
On a per-read or per-query basis (optional parameter on
CreateDocumentQuery method)
Setting Consistency
Use Weaker Consistency Levels for better Read latencies
• IoT
• Data Analysis
http://azure.microsoft.com/blog/2015/01/27/performance-tips-for-azure-documentdb-part-2/
Consistency Tips
Indexing
Efficient, rich hierarchical and relational queries without any schema or
index definitions.
Consistent query results while handling a sustained volume of writes. For
high write throughput workloads with consistent queries, the index is
updated incrementally, efficiently, and online while handling a sustained
volume of writes.
Storage efficiency. For cost effectiveness, the on-disk storage overhead of
the index is bounded and predictable.
Indexing
var collection = new DocumentCollection
{
    Id = "lazyCollection"
};
// Lazy mode: the index is updated asynchronously after writes.
collection.IndexingPolicy.IndexingMode = IndexingMode.Lazy;
await client.CreateDocumentCollectionAsync(databaseLink, collection);
Indexing modes
Consistent
Default mode
Index updated synchronously on writes
Lazy
Useful for bulk ingestion scenarios
Indexing policies
Automatic
Default
Manual
Can choose to index documents via
RequestOptions
Can read non-indexed documents
via selflink
Indexing – Modes and policies
Set indexing mode
Set indexing policy
var collection = new DocumentCollection
{
    Id = "manualCollection"
};
// Manual policy: documents are indexed only when explicitly requested.
collection.IndexingPolicy.Automatic = false;
await client.CreateDocumentCollectionAsync(databaseLink, collection);
Setting paths, types, and precision
var collection = new DocumentCollection
{
    Id = "Orders"
};
// Exclude everything under "metaData" from the index.
collection.IndexingPolicy.ExcludedPaths.Add(@"/""metaData""/*");
// Hash index on all other paths (optimized for equality matches).
collection.IndexingPolicy.IncludedPaths.Add(new IndexingPath
{
    IndexType = IndexType.Hash,
    Path = "/"
});
// Range index with higher numeric precision on shippedTimestamp (comparison queries).
collection.IndexingPolicy.IncludedPaths.Add(new IndexingPath
{
    IndexType = IndexType.Range,
    Path = @"/""shippedTimestamp""/?",
    NumericPrecision = 7
});
await client.CreateDocumentCollectionAsync(databaseLink, collection);
Index paths
Include and/or Exclude paths
Index types
Hash
Supported for strings and numbers
Optimized for equality matches
Range
Supported for numbers
Optimized for comparison queries
Index precision
String precision
Default is 3
Numeric precision
Default is 3
Increase for larger number fields
Indexing – Paths and types
Use lazy indexing for faster peak time ingestion rates
Exclude unused paths from indexing for faster writes
Specify range index path type for all paths used in range queries
Vary index precision for write vs query performance vs storage
tradeoffs
http://azure.microsoft.com/blog/2015/01/27/performance-tips-for-azure-documentdb-part-2/
Indexing tips
Querying
Optimize for queries with small result sets for scalability
Limit use of scans (no range index, NOT, UDFs in WHERE)
Use page size (MaxItemCount) and continuation tokens
For large result sets, use a larger page size (1000)
Querying
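Reading a result set page by page with a continuation token follows the loop sketched below. `queryPage` is a hypothetical in-memory stand-in for a paged query call, with `maxItemCount` playing the role of the page size; it is not an SDK function.

```javascript
// Stand-in for a paged query: returns one page of items plus a continuation
// token, or null as the token when there is nothing left to read.
function queryPage(allResults, maxItemCount, continuation) {
  const start = continuation ? Number(continuation) : 0;
  const items = allResults.slice(start, start + maxItemCount);
  const next = start + maxItemCount < allResults.length
    ? String(start + maxItemCount)
    : null;
  return { items, continuation: next };
}

// Keep requesting pages until no continuation token comes back.
function readAll(allResults, maxItemCount) {
  const collected = [];
  let continuation = null;
  let pages = 0;
  do {
    const page = queryPage(allResults, maxItemCount, continuation);
    collected.push(...page.items);
    continuation = page.continuation;
    pages += 1;
  } while (continuation !== null);
  return { collected, pages };
}

const data = Array.from({ length: 2500 }, (_, i) => ({ id: String(i) }));
const small = readAll(data, 100);
const large = readAll(data, 1000);
console.log(small.pages, large.pages); // 25 3
```

With a large result set, the larger page size needs far fewer round trips for the same data, which is the reasoning behind the tip above.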
Query over heterogeneous documents without defining schema or managing indexes
• Query arbitrary paths, properties and values without specifying secondary indexes or indexing hints
• Execute queries with consistent results
• Supported SQL features: predicates, iterations (arrays), sub-queries, logical operators, UDFs, intra-document JOINs, JSON transforms
• In general, more predicates result in a larger request charge.
• Additional predicates can help if they narrow the overall result set.
from book in client.CreateDocumentQuery<Book>(collectionSelfLink)
where book.Title == "War and Peace"
select book;
from book in client.CreateDocumentQuery<Book>(collectionSelfLink)
where book.Author.Name == "Leo Tolstoy"
select book.Author;
-- Nested lookup against index
SELECT B.Author
FROM Books B
WHERE B.Author.Name = "Leo Tolstoy"
-- Transformation, Filters, Array access
SELECT { Name: B.Title, Author: B.Author.Name }
FROM Books B
WHERE B.Price > 10 AND B.Language[0] = "English"
-- Joins, User Defined Functions (UDF)
SELECT udf.CalculateRegionalTax(B.Price, "USA", "WA")
FROM Books B
JOIN L IN B.Languages
WHERE L.Language = "Russian"
LINQ Query
SQL Query Grammar
Query
Programmability
function region(doc)
{
switch (doc.Location.Region)
{
case 0:
return "North";
case 1:
return "Middle";
case 2:
return "South";
}
}
The complexity of a query impacts the request units consumed for an operation, for example the use of user-defined functions (UDFs) in SELECT or WHERE clauses.
To take advantage of indexing, try to have at least one filter against an indexed property when using a UDF in the WHERE clause.
Query with user-defined function
function count(filterQuery, continuationToken) {
    var collection = getContext().getCollection();
    // Maximum number of docs to process in one batch; when reached, return to the
    // client and request a continuation. Intentionally set low to demonstrate the
    // concept. This can be much higher (try experimenting): we've had it in the
    // high thousands before seeing the stored procedure time out.
    var maxResult = 25;
    // The number of documents counted.
    var result = 0;
    // tryQuery is not shown on the slide.
    tryQuery(continuationToken);
}
Execute “explicit” Javascript
code on collection
Executing Stored Procedures
function normalize() {
var collection = getContext().getCollection();
var collectionLink = collection.getSelfLink();
var doc = getContext().getRequest().getBody();
var newDoc = {
"Sensor": {
"Id": doc.sensorId,
"Class": 0
},
"Degree": {
"Value": doc.degreeValue,
"Type": 0
},
"Location": {
"Name": doc.locationName,
"Region": doc.locationRegion,
"Longitude": doc.locationLong,
"Latitude": doc.locationLat
},
"id": doc.id
};
// Update the request -- this is what is going to be inserted.
getContext().getRequest().setBody(newDoc);
}
Execute “implicit” Javascript
code on CRUD operations
(Insert, Update, Delete) on
collections
Triggers!
Performance
Data is saved on SSD
All writes to the primary are replicated across two secondary
replicas
(Replicas are spread on different hardware in same region to protect
against failures)
All reads are distributed across the
three copies (when and how depend
on consistency level for db account
and query)
DocumentDb Performance
Measure and Tune for lower request units/second usage
DocumentDB offers a rich set of database operations including relational and hierarchical queries with UDFs, stored procedures and triggers – all operating on the
documents within a database collection. The cost associated with each of these operations will vary based on the CPU, IO and memory required to complete the operation.
Instead of thinking about and managing hardware resources, you can think of a request unit (RU) as a single measure for the resources required to perform various database
operations and service an application request.
Handle Server throttles/request rate too large
When a client attempts to exceed the reserved throughput for an account, there will be no performance degradation at the server and no use of throughput capacity beyond the reserved
level. The server will preemptively end the request with RequestRateTooLarge (HTTP status code 429) and return the x-ms-retry-after-ms header indicating the amount of time, in
milliseconds, that the user must wait before reattempting the request.
Delete empty collections to utilize all provisioned throughput
Every document collection created in a DocumentDB account is allocated reserved throughput capacity based on the number of Capacity Units (CUs) provisioned, and the number of
collections created. A single CU makes available 2,000 request units (RUs) and supports up to 3 collections
Design for smaller documents for higher throughput
The Request Charge (i.e. request processing cost) of a given operation is directly correlated to the size of the document
http://azure.microsoft.com/blog/2015/01/27/performance-tips-for-azure-documentdb-part-2/
Performance Tips
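Handling throttled requests comes down to waiting for the interval the server asks for and then retrying. The sketch below is one possible shape for that logic; `statusCode` and `retryAfterMs` are hypothetical field names standing in for the HTTP 429 status and the x-ms-retry-after-ms header, and the one-second fallback and the limit of five attempts are assumptions.

```javascript
// Decide whether a failed request should be retried and how long to wait.
// Returns the delay in milliseconds, or null when the caller should give up.
function retryDelayMs(error, attempt, maxAttempts) {
  if (!error || error.statusCode !== 429 || attempt >= maxAttempts) {
    return null;
  }
  // Prefer the server-provided delay; the 1000 ms fallback is an assumption.
  return Number.isFinite(error.retryAfterMs) ? error.retryAfterMs : 1000;
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run an async operation, retrying only when it was throttled.
async function withThrottleRetry(operation, maxAttempts = 5) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      const delay = retryDelayMs(error, attempt, maxAttempts);
      if (delay === null) throw error;
      await sleep(delay);
    }
  }
}

// Usage with a fake operation that is throttled twice before succeeding.
let calls = 0;
withThrottleRetry(async () => {
  calls += 1;
  if (calls < 3) {
    throw Object.assign(new Error("Request rate too large"), {
      statusCode: 429,
      retryAfterMs: 10,
    });
  }
  return "ok";
}).then((result) => console.log(result, "after", calls, "calls"));
```

Errors other than throttling are rethrown immediately, so the helper does not hide real failures behind retries.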
Considerations
User generated content
Many specific data (varbinary(MAX) in SQL)
Catalog data
Log data
User preferences data
Device sensor data
IoT use cases commonly share some patterns in how they ingest, process and store data. First, these
systems allow for data intake that can ingest bursts of data from device sensors of various locales. Next,
these systems process and analyze streaming data to derive real time insights. And last but not least,
most if not all data will eventually land in a data store for adhoc querying and offline analytics.
Usage: what is DocumentDb for?
Maturity: Balancing embedding (ok) and relating (limits)
Searching and Denormalizing
Opportunity
Storing transient Data
Better Opportunities
Storing Files
Append Only
(Table) Storage
Limits from DocumentDb
Logs
Attachments
Transient Data
Search
Alternatives for some scenarios
Targeted at streaming workloads (E.g. files read from beginning
to end like media files)
Each blob consists of a sequence of blocks
Each block is identified by a Block ID
Each block can be a maximum of 4 MB in size (a blob of up to 64 MB can be uploaded in a single write operation)
Size limit 200GB per blob
Azure Storage Blob: Block Blob
Targeted at random read/write workloads (E.g. backing storage
for the VHDs used in Azure VMs)
Each blob consists of an array of pages
Each page is identified by its offset from the start of the blob
Size limit 1TB per blob
Azure Storage Blob: Page Blob
Not an RDBMS Table!
The mental picture is ‘Entities’
Entity can have up to 255 properties
Up to 1MB per entity
Partitioning
PartitionKey & RowKey are mandatory properties
Composite key which uniquely identifies an entity
They are the only indexed properties
Defines the sort order
Purpose of the PartitionKey:
Entity Locality
Entities in the same partition will be stored together
Efficient querying and cache locality
Entity Group Transactions
Target throughput – 500 tps/partition, several thousand tps/account
Microsoft Azure monitors the usage patterns of partitions
Automatically load balance partitions
Each partition can be served by a different storage node
Scale to meet the traffic needs of your table
Supports full manipulation (CRUD)
Table Scalability
Azure Table Storage Details
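Because PartitionKey and RowKey are the only indexed properties and also define the sort order, key design does most of the modeling work. The sketch below shows one possible scheme for device telemetry, assuming the device id as PartitionKey and a zero-padded "inverted" timestamp as RowKey so that the newest reading sorts first; the scheme, the constant and the function name are illustrative assumptions, not something prescribed by the talk.

```javascript
// Hypothetical upper bound for millisecond timestamps (13 digits).
const MAX_EPOCH_MS = 9999999999999;

// Build the composite key for one telemetry reading.
function telemetryKeys(deviceId, epochMs) {
  // Subtracting from a fixed maximum makes later timestamps produce smaller
  // numbers; zero-padding keeps string order equal to numeric order.
  const inverted = MAX_EPOCH_MS - epochMs;
  return {
    PartitionKey: deviceId,
    RowKey: String(inverted).padStart(13, "0"),
  };
}

const older = telemetryKeys("device-1", 1401662986000);
const newer = telemetryKeys("device-1", 1411512586000);

console.log(older.RowKey); // 8598337013999
console.log(newer.RowKey); // 8588487413999
console.log(newer.RowKey < older.RowKey); // true
```

All readings of one device share a partition, so they are stored together and a query for the latest readings of a device stays within a single partition.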
Embed a sophisticated search experience into web and mobile
applications without having to worry about the complexities of
full-text search and without having to deploy, maintain or
manage any infrastructure.
Perfect for enterprise cloud developers, cloud software vendors,
cloud architects who need a fully-managed search solution.
Search is a natural backend for Cortana
Take a bunch of words → apply linguistics → return relevant results
Azure Search
“Search service”
Scope for capacity
Bound to a region
Has keys, indexes, indexers, data sources
Provisioning
Azure Portal
Azure resource management API
Elastic scale
Capacity can be changed dynamically
Replicas ~ more QPS, HA
Partitions ~ more documents, write throughput
Azure Search Service
Simple HTTP/JSON API for creating indexes, pushing documents, searching
Keyword search with user-friendly operators (+, -, *, “”, etc.)
Hit highlighting
Faceting (histograms over ranges, typically used in catalog browsing)
Based on ElasticSearch
Search Functionality
Linguistics are key in search
Support for 50 languages
Word breaking, stop words, inflections
Lucene analyzers
Well-known analyzer stack
Stemming
Microsoft analyzers
Same NLP stack used by parts of Office, Bing
Lemmatization in many languages
Linguistics
Suggestions (auto-complete)
Rich structured queries (filter, select, sort) that combines with search
Scoring profiles to model search result relevance
Geo-spatial support integrated in filtering, sorting and ranking (such as finding all
restaurants within 5 KM of your current location)
Search Functionality
Redis is an open source, BSD licensed, networked, single-
threaded, in-memory key-value cache and store.
Key-value cache and store (value can be a couple of things)
In-memory (no persistence required, but you can enable it)
Single-threaded (atomic operations & transactions)
Networked (it’s a server and it does master/slave)
Some other stuff (scripting, pub/sub, Sentinel, snapshot)
Caching: Redis
Conclusions
Pro:
partitioning, replication and scaling at its core
self-contained documents
programmability in JavaScript
SQL-like “intra-document” queries
Cons:
No generic SQL queries
Can work alone in only a few scenarios
So DocumentDb…
Great storage opportunities in Azure
• Log
• Search
• Transient
• Files/Attachments
• SQL!
• And all new Data Analysis/Machine Learning opportunities
Other Not Only SQL alternatives
http://bit.do/documentdb-pricing
Capacity Units (CU)
Capacity
Throughput (in terms of rate of transactions / second)
• 1 Capacity Unit (CU) = 2,000 Request Units (RU) per second
• “Request” depends on the size of the document – ex. Uploading 1000 large JSON documents
might count as more than one request
Pricing
Standard pricing tier with hourly
billing
1 hr from just $0.034!
Performance levels can be
adjusted
Each collection = 10GB of SSD
Collection* perf is set by S1, S2,
S3
Limit of 100 collections (1 TB)
Soft limit, can be lifted as
needed per account
What does DocumentDB cost?
* collection != table of homogenous entities
collection ~ a data partition
Weitere ähnliche Inhalte

Andere mochten auch

Real World Use Case with Cassandra (Eddie Satterly, DataNexus) | C* Summit 2016
Real World Use Case with Cassandra (Eddie Satterly, DataNexus) | C* Summit 2016Real World Use Case with Cassandra (Eddie Satterly, DataNexus) | C* Summit 2016
Real World Use Case with Cassandra (Eddie Satterly, DataNexus) | C* Summit 2016DataStax
 
VoltDB and HPE Vertica Present: Building an IoT Architecture for Fast + Big Data
VoltDB and HPE Vertica Present: Building an IoT Architecture for Fast + Big DataVoltDB and HPE Vertica Present: Building an IoT Architecture for Fast + Big Data
VoltDB and HPE Vertica Present: Building an IoT Architecture for Fast + Big DataVoltDB
 
MongoDB and the Internet of Things
MongoDB and the Internet of ThingsMongoDB and the Internet of Things
MongoDB and the Internet of ThingsMongoDB
 
Aeris + Cassandra: An IOT Solution Helping Automakers Make the Connected Car ...
Aeris + Cassandra: An IOT Solution Helping Automakers Make the Connected Car ...Aeris + Cassandra: An IOT Solution Helping Automakers Make the Connected Car ...
Aeris + Cassandra: An IOT Solution Helping Automakers Make the Connected Car ...DataStax
 
(SDD407) Amazon DynamoDB: Data Modeling and Scaling Best Practices | AWS re:I...
(SDD407) Amazon DynamoDB: Data Modeling and Scaling Best Practices | AWS re:I...(SDD407) Amazon DynamoDB: Data Modeling and Scaling Best Practices | AWS re:I...
(SDD407) Amazon DynamoDB: Data Modeling and Scaling Best Practices | AWS re:I...Amazon Web Services
 
Data Modeling for Apache Cassandra
Data Modeling for Apache CassandraData Modeling for Apache Cassandra
Data Modeling for Apache CassandraDataStax Academy
 
Cassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingCassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingDataStax Academy
 
Real-time Data Processing with Amazon DynamoDB Streams and AWS Lambda
Real-time Data Processing with Amazon DynamoDB Streams and AWS LambdaReal-time Data Processing with Amazon DynamoDB Streams and AWS Lambda
Real-time Data Processing with Amazon DynamoDB Streams and AWS LambdaAmazon Web Services
 
Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsDataStax Academy
 
A management introduction to IoT - Myths - Pitfalls - Challenges
A management introduction to IoT - Myths - Pitfalls - ChallengesA management introduction to IoT - Myths - Pitfalls - Challenges
A management introduction to IoT - Myths - Pitfalls - ChallengesSven Beauprez
 
Getting started with azure event hubs and stream analytics services
Getting started with azure event hubs and stream analytics servicesGetting started with azure event hubs and stream analytics services
Getting started with azure event hubs and stream analytics servicesEastBanc Tachnologies
 
Internet of Things (IoT) - Seminar ppt
NoSQL Database in Azure for IoT and Business

  • 1. IoT day 2015: NoSQL in Azure for IoT (and Business). Marco Parenzan, Microsoft Azure MVP, @marco_parenzan, marco [dot] parenzan [at] 1nn0va [dot] it
  • 3. Speaker info: Marco Parenzan. www.slideshare.net/marco.parenzan; www.github.com/marcoparenzan; marco [dot] parenzan [at] 1nn0va [dot] it; www.1nnova.it; @marco_parenzan. Training, outreach and consulting with 1nn0va. Microsoft MVP 2014 for Microsoft Azure. Cloud architect, .NET developer. Loves functional programming, HTML5 game programming and the Internet of Things. Microservices Saturday 2015: a journey with NServiceBus. LIVE Azure Community Bootcamp 2015.
  • 4. IoT as a hobby (now…?)
  • 5. Data Ecosystem: Where do I put the data received in Event Hub?
  • 6. Microsoft Relational Storage Options: from private to public cloud, a continuous offering
  • 7. SQL Azure features: SQL Server database technology “as a Service”; fully managed database-as-a-service built on SQL with near-zero administration; enterprise-ready with automatic support for HA, DR, backups, replication and more; highly available and elastically scalable for unpredictable SaaS workloads; uptime SLA of 99.99%; predictable performance and pricing; built-in regional database geo-replication for additional protection; all core search capabilities (faceting, suggestions, geospatial); secure and compliant for your sensitive data; fully compatible with SQL Server 2014 databases
  • 10. Paradigm Shift: Business, no longer data, is the foundation of software design. DDD != OOP. Don’t start from data. Data are not unique. No more ACID: ACID transactions are not useful with a distributed model over different storages.
  • 11. From theory to practice: How many queries can be determined at the analysis stage? “A repository should offer an explicit and well defined contract and avoid arbitrary query.” In business, don’t delete anything (the repository doesn’t delete anything).
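The repository idea on this slide (an explicit contract, no arbitrary queries, nothing ever deleted) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Reading`, `ReadingRepository` and every method name are hypothetical and do not come from the talk, and an in-memory list stands in for a real store.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Reading:
    """Hypothetical entity: one measurement sent by a device."""
    device_id: str
    value: float
    retired_at: Optional[datetime] = None  # set instead of deleting the record


class ReadingRepository:
    """Hypothetical repository: a few named queries, no generic query, no delete."""

    def __init__(self) -> None:
        self._items: List[Reading] = []

    def add(self, reading: Reading) -> None:
        self._items.append(reading)

    def active_for_device(self, device_id: str) -> List[Reading]:
        # One of the queries agreed on at analysis time.
        return [
            r for r in self._items
            if r.device_id == device_id and r.retired_at is None
        ]

    def retire_device(self, device_id: str) -> int:
        # "Deleting" only marks the records; the history stays available.
        now = datetime.now(timezone.utc)
        count = 0
        for r in self._items:
            if r.device_id == device_id and r.retired_at is None:
                r.retired_at = now
                count += 1
        return count

    def history_count(self) -> int:
        return len(self._items)
```

Callers can only ask the questions the contract names, and retiring a device leaves `history_count()` unchanged.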
  • 12. Classic MVC (diagram labels): Business Logic, Contract, BL/P, View, Controller
  • 13. CQRS (Service Bus powered) (diagram labels): Event Handler, UI, Event, Command Handler, Queue, Topics/Subscription
  • 14. CQRS for IoT (Service Bus powered) (diagram labels): Event Handler, UI, Event, Command Handler, Event, Device, Queue, Topics/Subscription, Event Hub, Write Model, Read/Search Model
  • 15. Event-based systems: No longer build on data, but on “what happens”. No more one single data store. Data store types: logs, persistence, saga (long transactions), search.
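As a rough sketch of how the pieces named in the CQRS diagrams relate, the Python below wires a command queue, a command handler, a write model, a topic and a read model together in memory. It assumes nothing about the real Azure Service Bus or Event Hubs APIs: a `deque` stands in for the queue, the `Topic` class for a topic with subscriptions, and all class and function names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict, List


@dataclass(frozen=True)
class RecordTelemetry:
    """Hypothetical command: a device asks to record a temperature."""
    device_id: str
    temperature: float


@dataclass(frozen=True)
class TelemetryRecorded:
    """Hypothetical event: the fact that a temperature was recorded."""
    device_id: str
    temperature: float


class Topic:
    """In-memory stand-in for a topic with subscriptions."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[TelemetryRecorded], None]] = []

    def subscribe(self, handler: Callable[[TelemetryRecorded], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, event: TelemetryRecorded) -> None:
        for handler in self._subscribers:
            handler(event)


class WriteModel:
    """Append-only log of what happened."""

    def __init__(self) -> None:
        self.log: List[TelemetryRecorded] = []

    def append(self, event: TelemetryRecorded) -> None:
        self.log.append(event)


class ReadModel:
    """Denormalized view kept up to date by events: latest value per device."""

    def __init__(self) -> None:
        self.latest: Dict[str, float] = {}

    def on_event(self, event: TelemetryRecorded) -> None:
        self.latest[event.device_id] = event.temperature


def handle_commands(queue: Deque[RecordTelemetry],
                    write_model: WriteModel,
                    topic: Topic) -> int:
    """Command handler: drain the queue, store each event, then publish it."""
    processed = 0
    while queue:
        command = queue.popleft()
        event = TelemetryRecorded(command.device_id, command.temperature)
        write_model.append(event)
        topic.publish(event)
        processed += 1
    return processed
```

The write side only appends events; the read side is a separate structure built for the queries the UI needs, which is the separation the diagrams describe.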
  • 16. The Big Picture A modern view:
  • 18. Why Use a NoSQL Technology on Azure?
  • 19. Choosing a Data Technology
  • 20. Long-term theme: A DB for what? To store data? To manipulate data?
  • 21. NoSQL Introduction
  • 23. What is a document database? Definitely NOT this kind of document !
  • 24. What is a document database? Not ideal, but it can work - { "id": "13244_post", "text": "Lorizzle ghetto dolor tellivizzle boofron, stuff pimpin' elizzle. Nullam sapizzle velizzle, my shizz tellivizzle, suscipizzle funky fresh, shizzle my nizzle crocodizzle vizzle, arcu. Pellentesque eget tortizzle. Sizzle erizzle. Mammasay mammasa mamma oo sa break it down dolor own yo' things fo shizzle mah nizzle fo rizzle, mah home g-dizzle sure. Maurizzle pellentesque dawg ghetto turpizzle. Shiz izzle my shizz. Pellentesque eleifend rhoncizzle nisi. In its fo rizzle owned ma nizzle dictumst. Sizzle gangsta. Curabitur tellizzle urna, pretizzle go to hizzle, mattizzle izzle, eleifend vitae, tellivizzle. Dawg shizzlin dizzle. Integer semper velit sizzle stuff. Boofron mofo auctizzle ma nizzle. Pot a elizzle ut nibh pretium tincidunt. Maecenizzle things erat. Own yo' in lacizzle sed maurizzle elementizzle tristique. I'm in the shizzle yippiyo sizzle daahng dawg eros ultricizzle . In velit tortor, ultricizzle ghetto, hendrerizzle fo shizzle mah nizzle fo rizzle, mah home g-dizzle, adipiscing crunk, boom shackalack. Etizzle velit doggy, hizzle consequizzle, pharetra get down get down, dictizzle sed, shut the shizzle up. Fo shizzle neque. Fo lorizzle. Bling bling vitae pizzle ut libero commodo gizzle. Fusce izzle augue eu yo mamma dang. Phasellizzle break it down fo nizzle erat. Suspendisse shizzlin dizzle owned, sollicitudin sizzle, mah nizzle izzle, commodo nec, justo. Donizzle fizzle porttitizzle ligula. Nunc feugizzle, tellus tellivizzle ornare tempor, sapizzle break it down tincidunt gangster, eget dapibus daahng dawg enizzle izzle that's the shizzle. Stuff quizzle leo, imperdizzle izzle, fo shizzle my nizzle izzle, semper izzle, sapien. Ut boofron magna vizzle ghetto. I'm in the shizzle ante bling bling, suscipizzle vitae, yo mamma stuff, rutrizzle pizzle, velizzle. Mauris da bomb go to zzle. Sizzle mammasay mammasa mamma oo sa magna own yo' amet risus congue. 
Boofron mofo auctizzle ma nizzle. Pot a elizzle ut nibh pretium tincidunt. things erat. Own yo' in lacizzle sed maurizzle elementizzle tristique. I'm in the shizzle yippiyo sizzle daahng dawg eros ultricizzle . In velit tortor, ultricizzle ghetto, hendrerizzle fo shizzle mah nizzle fo rizzle, mah home g-dizzle, adipiscing crunk, boom shackalack. Etizzle velit doggy, hizzle consequizzle, pharetra get down get down, dictizzle sed, shut the shizzle up. Fo shizzle neque. Fo lorizzle. Bling " }
  • 25. What is a document database? Ideally suited to this kind of document - { "id": "13244_user", "firstName": "John", "lastName": "Smith", "age": 25, "employmentHistory": [ { "company": "Contoso Inc", "start": {"date": "Thu, 02 Apr 2015 20:54:45 GMT", "epoch": 1428008086}, "position": "CEO" }, { "start": {"date": "Thu, 02 Apr 2012 20:54:45 GMT", "epoch": 1428008086}, "end": {"date": "Thu, 01 Apr 2015 20:54:45 GMT", "epoch": 1428008086}, "position": "GM" } ], "address": { "streetAddress": "21 2nd Str", "city": "New York", "state": "NY", "postalCode": "10021" }, "children": [ {"name": "Megan", "age": 10}, {"name": "Bruce", "age": 7}, {"name": "Angus", "sports": ["football", "basketball", "hockey"]} ], "mobileNumber": "212 555-1234" }
  • 26. Why JSON? JSON can represent complex containment relationships that are difficult to represent in an RDBMS. It is schema-less, which is great for requirements that grow during development, unlike an RDBMS, where you must know the structure up front and it is painful to modify. It is the native notation for JavaScript.
  • 27. Embedding: When working with relational databases, we've been taught for years to normalize, normalize, normalize. With a document database, try instead to treat your entities as self-contained documents represented in JSON. Embed when: there are "contains" relationships between entities; there are one-to-few relationships between entities; the embedded data changes infrequently; the embedded data won't grow without bound; the embedded data is integral to the data in a document. Embedding typically gives better read performance.
  • 28. Referencing: Reference when: representing one-to-many relationships; representing many-to-many relationships; related data changes frequently; referenced data could be unbounded. Referencing provides more flexibility than embedding, at the cost of more round trips to read data. Normalizing typically provides better write performance.
  • 29. No magic bullet: think about how your data is going to be written and read, and model accordingly. Hybrid models ~ denormalize + reference + aggregate. { "id": "1", "firstName": "Thomas", "lastName": "Andersen", "countOfBooks": 3, "books": [1, 2, 3], "images": [ {"thumbnail": "http://....png"}, {"profile": "http://....png"} ] } { "id": 1, "name": "DocumentDB 101", "authors": [ {"id": 1, "name": "Thomas Andersen", "thumbnail": "http://....png"}, {"id": 2, "name": "William Wakefield", "thumbnail": "http://....png"} ] }
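As a minimal sketch of the hybrid model on this slide, the JavaScript below resolves an author's referenced books from an in-memory map. The map, the `loadBooksFor` helper and the titles of books 2 and 3 are illustrative assumptions, not part of the slides; in a real store each lookup would be a separate read.

```javascript
// Author document: embeds an aggregate (countOfBooks) and references books by id.
const author = {
  id: "1",
  firstName: "Thomas",
  lastName: "Andersen",
  countOfBooks: 3,
  books: [1, 2, 3],
};

// Hypothetical in-memory stand-in for the book documents of a collection.
const booksById = new Map([
  [1, { id: 1, name: "DocumentDB 101", authors: [{ id: 1, name: "Thomas Andersen" }] }],
  [2, { id: 2, name: "Placeholder title 2", authors: [{ id: 1, name: "Thomas Andersen" }] }],
  [3, { id: 3, name: "Placeholder title 3", authors: [{ id: 1, name: "Thomas Andersen" }] }],
]);

// Resolve references: one lookup (round trip) per referenced book id.
function loadBooksFor(authorDoc, store) {
  let lookups = 0;
  const books = authorDoc.books.map((bookId) => {
    lookups += 1;
    return store.get(bookId);
  });
  return { books, lookups };
}

const { books, lookups } = loadBooksFor(author, booksById);
console.log(author.countOfBooks, "books known without any lookup");
console.log(lookups, "lookups needed to read:", books.map((b) => b.name).join(", "));
```

The denormalized `countOfBooks` answers "how many?" from the author document alone, while the full book details cost one extra read per reference.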
  • 30. Developer Appeal: Promotes code-first development (mapping objects to JSON). Resilient to iterative schema changes. Richer query and indexing (compared to KV stores). Low impedance as an object / JSON store; no ORM required. It just works. It’s fast.
  • 31. DocumentDB Introduction
  • 32. What is DocumentDB? Stores schema-less JSON documents. Excels at search with SQL syntax. JavaScript for stored procedures, triggers and UDFs. Elastic capacity (not in the specific Azure sense, up to now). Multi-document transactions (batch). Tweak everything (read/write performance vs. consistency, index performance, security). Designed for massive scale.
  • 33. DocumentDB: DBaaS. For applications that need managed elastic scale. The customer does not want to add IT resources for support and maintenance. Avoids CAPEX and OPEX. Built-for-the-cloud database technology. Access via RESTful HTTP API or client library.
  • 34. Typical usage: catalog data; preferences and state; event store; user-generated content; data exchange.
  • 38. Collections: a collection != a table of homogeneous entities; a collection ~ a data partition.
  • 39. Documents: { "id": "123", "name": "joe", "age": 30, "address": { "street": "some st" } }
  • 40. Users, Server Scripts, Attachments
  • 42. What is a collection? A container of JSON documents and the associated JavaScript application logic. JSON documents inside a collection can vary dramatically. A unit of scale for transaction and query throughput (capacity units are allocated uniformly across all collections). A unit of scale for capacity. A unit of replication.
  • 43. Collections: Collections in DocumentDB are not just logical containers but also physical containers. They are the transaction boundary for stored procedures and triggers, and the entry point to queries and CRUD operations. Each collection is assigned a reserved amount of throughput, which is not shared with other collections in the same account. Collections do not enforce schema.
  • 45. Design: Partitioning. Why partition? Data size: a single collection (currently) holds 10 GB. Throughput: 3 performance tiers, with a maximum of 2,500 RU/sec.
  • 46. Hash Partitioning: In hash partitioning, partitions are assigned based on the value of a hash function, allowing you to evenly distribute requests and data across a number of partitions. This is commonly used to partition data produced or consumed by a large number of distinct clients, and is useful for storing user profiles, catalog items, and IoT ("Internet of Things") telemetry data.
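A minimal client-side sketch of hash partitioning, assuming four partitions with made-up names and FNV-1a as the hash function (any stable hash would do; neither choice comes from the slides):

```javascript
// 32-bit FNV-1a hash over the UTF-16 code units of the key.
function fnv1a(key) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep the value an unsigned 32-bit integer
  }
  return hash;
}

// Hypothetical partition names; in DocumentDB these would be collection links.
const partitions = ["telemetry-0", "telemetry-1", "telemetry-2", "telemetry-3"];

// The same key always maps to the same partition.
function pickHashPartition(partitionKey, partitionList) {
  return partitionList[fnv1a(partitionKey) % partitionList.length];
}

for (const deviceId of ["device-001", "device-002", "device-003"]) {
  console.log(deviceId, "->", pickHashPartition(deviceId, partitions));
}
```

Note that a plain modulo scheme remaps most keys when the number of partitions changes, which is one reason to decide the partition count carefully up front.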
  • 47. Range partitioning: In range partitioning, partitions are assigned based on whether the partition key falls within a certain range. This is commonly used for partitioning on timestamp properties: keep current data hot, warm historical data, scale down older data, purge / archive.
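Range partitioning on a timestamp can be sketched as below. The one-partition-per-year layout and the partition names are assumptions chosen for illustration; the boundaries are midnight UTC on 1 January of each year, in epoch seconds.

```javascript
// Hypothetical range map. Each range is [from, to): "from" inclusive, "to" exclusive.
const ranges = [
  { from: 1356998400, to: 1388534400, partition: "telemetry-2013" },
  { from: 1388534400, to: 1420070400, partition: "telemetry-2014" },
];

// Find the partition whose range contains the given epoch value.
function pickRangePartition(epoch, rangeMap) {
  const match = rangeMap.find((r) => epoch >= r.from && epoch < r.to);
  if (!match) throw new RangeError(`No partition covers epoch ${epoch}`);
  return match.partition;
}

console.log(pickRangePartition(1376779786, ranges)); // a reading from August 2013
console.log(pickRangePartition(1401662986, ranges)); // a reading from June 2014
```

Archiving or purging old data then becomes an operation on a whole partition rather than on individual documents.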
  • 48. Lookup partitioning: In lookup partitioning, partitions are assigned based on a lookup map that assigns discrete partition values to specific partitions (a.k.a. a partition or shard map). This is commonly used for partitioning by region. Example map (Tenant → Partition Id): Customer → 1; Big Customer → 2; Another → 3.
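A lookup (shard) map is just an explicit table. The sketch below uses one reading of the example table on this slide (tenant name to partition id); the helper name is hypothetical.

```javascript
// Shard map: tenant -> partition id.
const shardMap = new Map([
  ["Customer", 1],
  ["Big Customer", 2],
  ["Another", 3],
]);

// Unknown tenants are an error: there is no formula to fall back on.
function pickLookupPartition(tenant, map) {
  if (!map.has(tenant)) {
    throw new Error(`No partition registered for tenant "${tenant}"`);
  }
  return map.get(tenant);
}

console.log(pickLookupPartition("Big Customer", shardMap));
```

Unlike hash or range partitioning, moving a tenant only requires copying its data and updating one map entry.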
  • 49. Partitioning - Fan-out Queries: { "record": "1", "created": { "date": "6/1/2014", "epoch": 1401662986 } }, { "record": "3", "created": { "date": "9/23/2014", "epoch": 1411512586 } }, { "record": "123", "created": { "date": "8/17/2013", "epoch": 1376779786 } } SELECT * FROM root r WHERE r.created.epoch BETWEEN 1376779786 AND 1401662986 { "record": "1", "created": { "date": "6/1/2014", "epoch": 1401662986 } }, { "record": "3", "created": { "date": "9/23/2014", "epoch": 1411512586 } }, { "record": "43233", "created": { "epoch": 1411512586 } }, { "record": "1123", "created": { "date": "8/17/2013", "epoch": 1376779786 } }, { "record": "43234", "created": { "epoch": 1376779786 } }
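A fan-out query runs the same filter against every partition and merges the partial results. The sketch below does this over two in-memory arrays; the split of documents across partitions and the function names are assumptions. The filter reads the epoch from under `created`, where the sample documents keep it.

```javascript
// Hypothetical partitions, each holding some of the documents shown on the slide.
const partitionA = [
  { record: "1", created: { date: "6/1/2014", epoch: 1401662986 } },
  { record: "3", created: { date: "9/23/2014", epoch: 1411512586 } },
];
const partitionB = [
  { record: "123", created: { date: "8/17/2013", epoch: 1376779786 } },
];

// Apply the predicate to each partition in turn and concatenate the matches.
function fanOutQuery(partitionList, predicate) {
  return partitionList.flatMap((docs) => docs.filter(predicate));
}

// Equivalent of: WHERE r.created.epoch BETWEEN 1376779786 AND 1401662986
const inRange = (doc) =>
  doc.created.epoch >= 1376779786 && doc.created.epoch <= 1401662986;

const hits = fanOutQuery([partitionA, partitionB], inRange);
console.log(hits.map((d) => d.record));
```

Against a real service each partition would be a separate request, usually issued in parallel, so the cost of a fan-out grows with the number of partitions touched.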
  • 51. Why worry about consistency? Query / transaction throughput (and reliability, i.e. surviving hardware failure) depend on replication! All writes to the primary are replicated across two secondary replicas. All reads are distributed across the three copies. “Scalability of throughput”: allowing different clients to read from different replicas helps prevent bottlenecks. BUT replication takes time! Potential scenario: some clients are reading while another is writing. Now the data they read is out of date, inconsistent!
  • 52. Tweakable Consistency: Trade-off: speed (performance & availability) or consistency (data correctness)? “Does every read need the MOST current data?” “Or do I need every request to be handled, and handled quickly?” There is no “one size fits all” answer, so it’s up to you! There are 4 options, set for the entire database. “In a future release, we intend to support overriding the default consistency level on a per collection basis.”
  • 53. Strong: the client always sees completely consistent data. Slowest reads / writes. For mission-critical data, e.g. stock market, banking, airline reservations.
  • 54. Session: the default; an even trade-off between performance & availability and data correctness. A client reads its own writes, but other clients reading the same data might see older values.
  • 55. Bounded Staleness: the client might see old data, but it can specify a limit for how old that data can be (e.g. 2 seconds). Updates happen in the order received. Similar to Session consistency, but speeds up reads while still preserving the order of updates.
  • 56. Eventual: the client might see old data for as long as it takes a write to propagate to all replicas. High performance & availability, but a client might sometimes read out-of-date information or see updates out of order.
  • 57. Setting Consistency: at the database level (see the preview portal), or on a per-read or per-query basis (optional parameter on the CreateDocumentQuery method).
  • 58. Consistency Tips: use weaker consistency levels for better read latencies, e.g. for IoT and data analysis. http://azure.microsoft.com/blog/2015/01/27/performance-tips-for-azure-documentdb-part-2/
  • 60. Indexing: Efficient, rich hierarchical and relational queries without any schema or index definitions. Consistent query results while handling a sustained volume of writes: for high-write-throughput workloads with consistent queries, the index is updated incrementally, efficiently, and online. Storage efficiency: for cost effectiveness, the on-disk storage overhead of the index is bounded and predictable.
  • 61. Indexing: Modes and policies. Indexing modes: Consistent (default mode; index updated synchronously on writes) and Lazy (useful for bulk ingestion scenarios). Indexing policies: Automatic (default) and Manual (you can choose to index documents via RequestOptions and can read non-indexed documents via selflink). Set indexing mode: var collection = new DocumentCollection { Id = "lazyCollection" }; collection.IndexingPolicy.IndexingMode = IndexingMode.Lazy; client.CreateDocumentCollectionAsync(databaseLink, collection); Set indexing policy: var collection = new DocumentCollection { Id = "manualCollection" }; collection.IndexingPolicy.Automatic = false; client.CreateDocumentCollectionAsync(databaseLink, collection);
  • 62. Indexing: Paths and types. Index paths: include and/or exclude paths. Index types: Hash (supported for strings and numbers; optimized for equality matches) and Range (supported for numbers; optimized for comparison queries). Index precision: string precision defaults to 3; numeric precision defaults to 3, increase it for larger number fields. Setting paths, types, and precision: var collection = new DocumentCollection { Id = "Orders" }; collection.IndexingPolicy.ExcludedPaths.Add(@"/""metaData""/*"); collection.IndexingPolicy.IncludedPaths.Add(new IndexingPath { IndexType = IndexType.Hash, Path = "/" }); collection.IndexingPolicy.IncludedPaths.Add(new IndexingPath { IndexType = IndexType.Range, Path = @"/""shippedTimestamp""/?", NumericPrecision = 7 }); client.CreateDocumentCollectionAsync(databaseLink, collection);
  • 63. Indexing tips: Use lazy indexing for faster peak-time ingestion rates. Exclude unused paths from indexing for faster writes. Specify the range index path type for all paths used in range queries. Vary index precision to trade off write performance, query performance and storage. http://azure.microsoft.com/blog/2015/01/27/performance-tips-for-azure-documentdb-part-2/
  • 65. Querying: Optimize for queries with small result sets for scalability. Limit the use of scans (no range index, NOT, UDFs in WHERE). Use page size (MaxItemCount) and continuation tokens. For large result sets, use a larger page size (1000).
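The page-size and continuation-token pattern can be sketched without any SDK. Here `readPage` is a hypothetical stand-in for a paged query: it returns at most `maxItemCount` items plus a token that tells the next call where to resume (in this sketch simply an array index, which is an assumption; real tokens are opaque).

```javascript
// Hypothetical page reader over an in-memory array. The continuation token is
// the index of the next item, or null when there is nothing left to read.
function readPage(allItems, maxItemCount, continuation) {
  const start = continuation === null ? 0 : continuation;
  const items = allItems.slice(start, start + maxItemCount);
  const nextIndex = start + items.length;
  return { items, continuation: nextIndex < allItems.length ? nextIndex : null };
}

// Drain a result set page by page instead of asking for everything at once.
function readAll(allItems, maxItemCount) {
  const results = [];
  let pages = 0;
  let continuation = null;
  do {
    const page = readPage(allItems, maxItemCount, continuation);
    results.push(...page.items);
    continuation = page.continuation;
    pages += 1;
  } while (continuation !== null);
  return { results, pages };
}

const docs = Array.from({ length: 2500 }, (_, i) => ({ id: String(i) }));
const { results, pages } = readAll(docs, 1000);
console.log(pages, "pages,", results.length, "documents");
```

A larger page size means fewer round trips for big result sets, while a small one keeps each response cheap when only the first few results matter.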
  • 66. Query: Query over heterogeneous documents without defining a schema or managing indexes. Query arbitrary paths, properties and values without specifying secondary indexes or indexing hints. Execute queries with consistent results. Supported SQL features: predicates, iterations (arrays), sub-queries, logical operators, UDFs, intra-document JOINs, JSON transforms. In general, more predicates result in a larger request charge; additional predicates can help if they narrow the overall result set. LINQ query: from book in client.CreateDocumentQuery<Book>(collectionSelfLink) where book.Title == "War and Peace" select book; from book in client.CreateDocumentQuery<Book>(collectionSelfLink) where book.Author.Name == "Leo Tolstoy" select book.Author; SQL query grammar: -- Nested lookup against index SELECT B.Author FROM Books B WHERE B.Author.Name = "Leo Tolstoy" -- Transformation, Filters, Array access SELECT { Name: B.Title, Author: B.Author.Name } FROM Books B WHERE B.Price > 10 AND B.Language[0] = "English" -- Joins, User Defined Functions (UDF) SELECT udf.CalculateRegionalTax(B.Price, "USA", "WA") FROM Books B JOIN L IN B.Languages WHERE L.Language = "Russian"
  • 68. Query with user-defined function: function region(doc) { switch (doc.Location.Region) { case 0: return "North"; case 1: return "Middle"; case 2: return "South"; } } The complexity of a query impacts the request units consumed for an operation, for example through the use of user-defined functions (UDFs) in SELECT or WHERE clauses. To take advantage of indexing, try to have at least one filter against an indexed property when using a UDF in the WHERE clause.
  • 69. Executing Stored Procedures: execute “explicit” JavaScript code on a collection. function count(filterQuery, continuationToken) { var collection = getContext().getCollection(); /* Maximum number of documents to process in one batch; when reached, return to the client and request a continuation. Intentionally set low to demonstrate the concept. This can be much higher: values in the high thousands have worked before the stored procedure started timing out. */ var maxResult = 25; /* The number of documents counted. */ var result = 0; /* tryQuery is not shown on the slide. */ tryQuery(continuationToken); }
  • 70. Triggers! Execute “implicit” JavaScript code on CRUD operations (insert, update, delete) on collections. function normalize() { var collection = getContext().getCollection(); var collectionLink = collection.getSelfLink(); var doc = getContext().getRequest().getBody(); var newDoc = { "Sensor": { "Id": doc.sensorId, "Class": 0 }, "Degree": { "Value": doc.degreeValue, "Type": 0 }, "Location": { "Name": doc.locationName, "Region": doc.locationRegion, "Longitude": doc.locationLong, "Latitude": doc.locationLat }, "id": doc.id }; /* Update the request: this is what is going to be inserted. */ getContext().getRequest().setBody(newDoc); }
  • 72. DocumentDB Performance: Data is saved on SSD. All writes to the primary are replicated across two secondary replicas (replicas are spread across different hardware in the same region to protect against failures). All reads are distributed across the three copies (when and how depends on the consistency level for the database account and the query).
  • 73. Performance Tips. Measure and tune for lower request units/second usage: DocumentDB offers a rich set of database operations including relational and hierarchical queries with UDFs, stored procedures and triggers, all operating on the documents within a database collection. The cost associated with each of these operations varies with the CPU, IO and memory required to complete the operation. Instead of thinking about and managing hardware resources, you can think of a request unit (RU) as a single measure for the resources required to perform various database operations and service an application request. Handle server throttles / request rate too large: when a client attempts to exceed the reserved throughput for an account, there is no performance degradation at the server and no use of throughput capacity beyond the reserved level. The server preemptively ends the request with RequestRateTooLarge (HTTP status code 429) and returns the x-ms-retry-after-ms header indicating the amount of time, in milliseconds, that the user must wait before reattempting the request. Delete empty collections to utilize all provisioned throughput: every document collection created in a DocumentDB account is allocated reserved throughput capacity based on the number of Capacity Units (CUs) provisioned and the number of collections created. A single CU makes available 2,000 request units (RUs) and supports up to 3 collections. Design for smaller documents for higher throughput: the request charge (i.e. request processing cost) of a given operation is directly correlated to the size of the document. http://azure.microsoft.com/blog/2015/01/27/performance-tips-for-azure-documentdb-part-2/
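The throttling behaviour described here (HTTP 429 plus an `x-ms-retry-after-ms` header) suggests a simple client-side retry loop. The sketch below is one possible shape, not an SDK API: `sendRequest`, `withThrottleRetry`, `makeFakeServer` and the attempt limit are all hypothetical, and the fake server exists only to make the example runnable.

```javascript
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// sendRequest is a hypothetical function resolving to { status, headers, body }.
async function withThrottleRetry(sendRequest, { maxAttempts = 5, sleep = defaultSleep } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await sendRequest();
    if (response.status !== 429) return response; // not throttled: done
    if (attempt === maxAttempts) break;           // no attempts left
    // Wait for as long as the server asked (0 ms if the header is missing).
    const waitMs = Number(response.headers["x-ms-retry-after-ms"]) || 0;
    await sleep(waitMs);
  }
  throw new Error("RequestRateTooLarge: retries exhausted");
}

// Fake server for illustration: throttles twice, then succeeds.
function makeFakeServer() {
  let calls = 0;
  return async () => {
    calls += 1;
    return calls <= 2
      ? { status: 429, headers: { "x-ms-retry-after-ms": "20" } }
      : { status: 200, headers: {}, body: "ok" };
  };
}

withThrottleRetry(makeFakeServer()).then((res) => console.log("final status:", res.status));
```

Honouring the server-provided delay, rather than retrying immediately, keeps a throttled client from consuming even more of the reserved throughput.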
  • 75. Usage: what is DocumentDB for? User-generated content; lots of specific data (varbinary(MAX) in SQL); catalog data; log data; user preferences data; device sensor data. IoT use cases commonly share some patterns in how they ingest, process and store data. First, these systems allow for data intake that can ingest bursts of data from device sensors in various locales. Next, these systems process and analyze streaming data to derive real-time insights. And last but not least, most if not all data will eventually land in a data store for ad hoc querying and offline analytics.
  • 76. Limits of DocumentDB. Maturity: balancing embedding (ok) and relating (limits); searching and denormalizing. Opportunity: storing transient data. Better opportunities: storing files; append-only (Table) storage.
  • 77. Alternatives for some scenarios: logs; attachments; transient data; search.
  • 78. Azure Storage Blob: Block Blob. Targeted at streaming workloads (e.g. files read from beginning to end, like media files). Each blob consists of a sequence of blocks. Each block is identified by a block ID. Each block can be at most 4 MB in size (a blob of up to 64 MB can be uploaded in a single write operation). Size limit: 200 GB per blob.
  • 79. Azure Storage Blob: Page Blob. Targeted at random read/write workloads (e.g. backing storage for the VHDs used in Azure VMs). Each blob consists of an array of pages. Each page is identified by its offset from the start of the blob. Size limit: 1 TB per blob.
  • 80. Azure Table Storage Details. Not an RDBMS table! The mental picture is ‘entities’: an entity can have up to 255 properties and be up to 1 MB in size. Partitioning: PartitionKey & RowKey are mandatory properties; together they form a composite key which uniquely identifies an entity; they are the only indexed properties; they define the sort order. Purpose of the PartitionKey: entity locality (entities in the same partition are stored together, giving efficient querying and cache locality) and Entity Group Transactions. Table scalability: target throughput of 500 tps/partition and several thousand tps/account; Microsoft Azure monitors the usage patterns of partitions and automatically load-balances them; each partition can be served by a different storage node; scales to meet the traffic needs of your table. Supports full manipulation (CRUD).
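Because PartitionKey and RowKey are the only indexed properties and define the sort order, key design does most of the modelling work in Table storage. The sketch below shows one possible choice for sensor readings; the field names, the device-per-partition layout and the zero-padded RowKey are assumptions for illustration, not something prescribed by the slides.

```javascript
// Build a Table Storage-style entity from a sensor reading.
// Assumption: PartitionKey = device id (keeps one device's readings together),
// RowKey = zero-padded epoch seconds (string order then matches time order).
function toTelemetryEntity(reading) {
  return {
    PartitionKey: reading.deviceId,
    RowKey: String(reading.epoch).padStart(10, "0"),
    Degree: reading.degree,
  };
}

// Plain code-unit string comparison, used to mimic key ordering.
function compareOrdinal(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

const entities = [
  { deviceId: "sensor-42", epoch: 1401662986, degree: 21.5 },
  { deviceId: "sensor-42", epoch: 1376779786, degree: 19.0 },
].map(toTelemetryEntity);

// Sort by (PartitionKey, RowKey), the order the composite key defines.
entities.sort((a, b) =>
  compareOrdinal(a.PartitionKey, b.PartitionKey) || compareOrdinal(a.RowKey, b.RowKey));

console.log(entities.map((e) => `${e.PartitionKey}/${e.RowKey}`));
```

With this layout a time-range query for one device stays inside a single partition and reads a contiguous run of row keys.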
  • 81. Azure Search: Embed a sophisticated search experience into web and mobile applications without having to worry about the complexities of full-text search and without having to deploy, maintain or manage any infrastructure. Perfect for enterprise cloud developers, cloud software vendors and cloud architects who need a fully managed search solution. Search is a natural backend for Cortana. Take a bunch of words → apply linguistics → return relevant results.
  • 82. Azure Search Service. A “search service” is the scope for capacity, is bound to a region, and has keys, indexes, indexers and data sources. Provisioning: Azure Portal or the Azure resource management API. Elastic scale: capacity can be changed dynamically; replicas ~ more QPS and HA; partitions ~ more documents and write throughput.
  • 83. Search Functionality: Simple HTTP/JSON API for creating indexes, pushing documents and searching. Keyword search with user-friendly operators (+, -, *, “”, etc.). Hit highlighting. Faceting (histograms over ranges, typically used in catalog browsing). Based on Elasticsearch.
  • 84. Linguistics: Linguistics are key in search. Support for 50 languages: word breaking, stop words, inflections. Lucene analyzers: well-known analyzer stack; stemming. Microsoft analyzers: the same NLP stack used by parts of Office and Bing; lemmatization in many languages.
  • 85. Search Functionality: Suggestions (auto-complete). Rich structured queries (filter, select, sort) that combine with search. Scoring profiles to model search result relevance. Geo-spatial support integrated in filtering, sorting and ranking (such as finding all restaurants within 5 km of your current location).
  • 86. Caching: Redis. Redis is an open source, BSD-licensed, networked, single-threaded, in-memory key-value cache and store. Key-value cache and store (the value can be a couple of things). In-memory (no persistence, but you can enable it). Single-threaded (atomic operations & transactions). Networked (it’s a server and it does master/slave). Some other stuff (scripting, pub/sub, Sentinel, snapshots).
  • 88. So DocumentDB… Pros: partitioning, replication and scaling at its core; self-contained documents; programmability in JavaScript; SQL-like “intra-document” queries. Cons: no generic SQL queries; can work alone in just a few scenarios.
  • 89. Other Not Only SQL alternatives: great storage opportunities in Azure for logs, search, transient data, files/attachments, SQL!, and all the new data analysis / machine learning opportunities.
  • 90. Pricing: http://bit.do/documentdb-pricing Capacity Units (CU) cover capacity and throughput (in terms of rate of transactions / second). Request Units (RU): 1 CU = 2,000 RU per second. What counts as a “request” depends on the size of the document, e.g. uploading 1000 large JSON documents might count as more than one request.
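As a rough worked example of the capacity-unit arithmetic, the sketch below uses only the figures quoted in the performance tips (2,000 RUs per CU, up to 3 collections per CU) and assumes throughput is split uniformly across collections, as the collection slides state. The function name and the validation rule are illustrative assumptions.

```javascript
const RU_PER_CAPACITY_UNIT = 2000;  // figure quoted in the slides
const MAX_COLLECTIONS_PER_CU = 3;   // figure quoted in the slides

// Reserved throughput per collection when RUs are split uniformly.
function throughputPerCollection(capacityUnits, collections) {
  if (collections < 1 || collections > capacityUnits * MAX_COLLECTIONS_PER_CU) {
    throw new RangeError("collection count not supported by this many capacity units");
  }
  return (capacityUnits * RU_PER_CAPACITY_UNIT) / collections;
}

console.log(throughputPerCollection(1, 1));            // one collection gets all 2000 RUs
console.log(throughputPerCollection(1, 3).toFixed(1)); // three collections share them
```

This is why the tips recommend deleting empty collections: an unused collection still takes its share of the reserved throughput.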
  • 91. What does DocumentDB cost? Standard pricing tier with hourly billing: 1 hr from just $0.034! Performance levels can be adjusted. Each collection = 10 GB of SSD. Collection* performance is set by S1, S2, S3. Limit of 100 collections (1 TB) per account; this is a soft limit that can be lifted as needed. * collection != table of homogeneous entities; collection ~ a data partition.
  • 92. IoT day 2015 NoSQL in Azure per l’IoT (e il Business) Marco Parenzan Microsoft Azure MVP @marco_parenzan marco [dot] parenzan [at] 1nn0va [dot] it

Editor's notes

  1. Slide Objectives: Show Microsoft’s continuous private-to-public cloud offering; this presentation focuses on Microsoft’s relational database PaaS offering. Transition: Microsoft provides a continuous solution from the private cloud to the public cloud. No matter where you are on your technology roadmap, we have a solution to fit your needs. We are a trusted advisor and platform in the traditional enterprise and ISV space, with new IaaS offerings that make it easier to bring this same level of trust and ease of use to the public cloud. Microsoft Azure SQL Database, in turn, extends SQL Server capabilities to the cloud by offering SQL Server as a relational database service. Speaking Points: SQL Database provides SQL Server as a relational service.
  2. Slide Objectives: Understand the overall concepts and benefits of SQL Database. Transition: Let’s clear up any confusion and look at the basics of what SQL Database really is and some of its benefits. Speaking Points: The same great SQL Server database technology that you know, love, and use on-premises, provided as a service. Enterprise-ready, with automatic support for high availability and DR (disaster recovery). Designed to scale on demand to provide the same great elasticity. Notes: High availability means 3 copies of the database, always in sync, for the cost of one database. Doing this on-premises isn’t cheap; in SQL Database it is free.
  3. Notes: A data lake is a massive, easily accessible, centralized repository of large volumes of structured and unstructured data.
  6. Slide Objective: Understand block blobs. Speaker Notes: Block blobs are composed of blocks, each of which is identified by a block ID. You create or modify a block blob by uploading a set of blocks and committing them by their block IDs. Each block can be at most 4 MB in size; a block blob of up to 64 MB can be uploaded in a single write operation. The maximum size for a block blob in version 2009-09-19 is 200 GB, or up to 50,000 blocks. Notes: http://msdn.microsoft.com/en-us/library/dd135734.aspx
  7. Slide Objective: Understand page blobs. Speaker Notes: Page blobs are a collection of pages. A page is a range of data that is identified by its offset from the start of the blob. To create a page blob, you initialize the page blob by calling Put Blob and specifying its maximum size. To add content to or update a page blob, you call the Put Page operation to modify a page or range of pages by specifying an offset and range. All pages must align to 512-byte page boundaries. Unlike writes to block blobs, writes to page blobs happen in place and are immediately committed to the blob. The maximum size for a page blob is 1 TB. A page written to a page blob may be up to 1 TB in size but will typically be much smaller. Notes: http://msdn.microsoft.com/en-us/library/dd135734.aspx
  8. Slide Objectives: Understand Tables. Speaker Notes: Within a storage account, a developer may create named tables. Tables store data as entities. An entity is a collection of named properties and their values, similar to a row. Tables are partitioned to support load balancing across storage nodes. Each table has as its first property a partition key that specifies the partition an entity belongs to. The second property is a row key that identifies an entity within a given partition. The combination of the partition key and the row key forms a primary key that identifies each entity uniquely within the table. The Table service does not enforce any schema. A developer may choose to implement and enforce a schema on the client side. Notes: http://msdn.microsoft.com/en-us/library/dd573356.aspx
  9. Azure Search is a fully managed search solution that allows developers to enable search experiences in applications.
  10. Azure Search sits right next to your data store (relational or NoSQL), which can be on-premises or in the cloud (Azure or any other public cloud), and provides the index used to search the operational data. The service is used only by the application developer and saves them the overhead of developing a search function specifically for their app. Faceted navigation is a filtering mechanism that provides self-directed drill-down navigation in search applications.