This document summarizes a presentation on Azure Data Explorer (ADX). It discusses ingesting data into ADX from various sources using different techniques like LightIngest and batch ingestion. It also covers visualizing data using tools like notebooks, querying data using the Kusto Query Language (KQL), and orchestrating workflows with Logic Apps. Examples of querying techniques like filtering, extending, and binning data are also provided.
4. MCT Summit 2021
What is ADX for me, today
• A telemetry data search engine => an ELK replacement
• A TSDB involved in Lambda-architecture replacements (as the WARM path) => an OSS Lambda stack (MinIO + Kafka) replacement
• A tool to materialize data into ADLS & SQL
• A tool for monitoring, summarizing information, and sending notifications
7. MCT Summit 2021
What is Azure Data Explorer
• Any append-only stream of records
• Relational query model: filter, aggregate, join, calculated columns, …
• Fully managed (PaaS, vanilla, database)
• Rapid iterations to explore the data
• High volume, high velocity, high variance (structured, semi-structured, free-text)
• Purposely built
8. MCT Summit 2021
The role of ADX
ADX sits at the center: raw data flows into the DWH as refined data, while ADX serves real-time derived data, data comparison, and fast KPIs.
THREE KEY USERS IN ONE TOOL:
• IoT Developer (data check, rule engine for insights)
• Data engineer (data comparison)
• Data scientist (data exploration)
9. MCT Summit 2021
How ADX is Organized
INSTANCE → DATABASE → SOURCES: a cluster exposes an ingestion URL and a querying URL; DB users/apps connect to query; data lives in cache storage and blob storage. External sources and destinations include IoT Hub, Event Hub, Storage, ADLS, SQL Server, and many more.
11. MCT Summit 2021
FIRST PHASE: Ingestion
• Many connectors & plugins
• Many SDKs
• Many managed pipelines
• Many tools to ingest rapidly
Managed pipelines:
• Ingest blobs using Event Grid
• Ingest an Event Hub stream
• Ingest an IoT Hub stream
• Ingest data from ADF
Connectors & plugins:
• Logstash plugin
• Kafka connector
• Apache Spark connector
Many SDKs:
• Python SDK
• .NET SDK
• Java SDK
• Node SDK
• REST API
• Go API
Tools:
• One-click ingestion
• LightIngest
12. MCT Summit 2021
Ingestion Types:
• Streaming ingestion: optimized for low volume of data per table, across thousands of tables
• Operation completes in under 10 seconds
• Data is available for query after completion
• Batching ingestion: optimized for high ingestion throughput
• Default batch parameters: 5 minutes, 500 items, or 1000 MB
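As a sketch, both behaviors are governed by policies that can be set with control commands; the database and table names below are placeholders, and the batching values are illustrative:

```kusto
// Enable streaming ingestion on a database
.alter database MyDatabase policy streamingingestion enable

// Tune the batching policy on a table (time / item count / size triggers)
.alter table MyTable policy ingestionbatching
@'{"MaximumBatchingTimeSpan":"00:01:00","MaximumNumberOfItems":100,"MaximumRawDataSizeMB":256}'
```

Whichever of the three batching thresholds is reached first seals the batch.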
13. MCT Summit 2021
Ingestion Techniques
Batch ingestion (provided by the SDKs): for high-volume, reliable, and cheap data ingestion. The client uploads the data to Azure Blob Storage (designated by the Azure Data Explorer data management service) and posts a notification to an Azure Queue. Batch ingestion is the recommended technique.
Inline ingestion (provided by query tools): most appropriate for exploration and prototyping.
• Inline ingestion: a control command (.ingest inline) containing in-band data, intended for ad hoc testing purposes.
• Ingest from query: control commands (.set, .set-or-append, .set-or-replace) that point to query results, used for generating reports or small temporary tables.
• Ingest from storage: a control command (.ingest into) with data stored externally (for example, Azure Blob Storage), allowing efficient bulk ingestion of data.
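A minimal sketch of the three techniques; MyTable, MySummary, and the column names are placeholders, and the blob URL follows the placeholder convention used elsewhere in this deck:

```kusto
// .ingest inline: in-band data, ad hoc testing only
.ingest inline into table MyTable <|
"foo",1
"bar",2

// Ingest from query: materialize a query result into a table
.set-or-append MySummary <| MyTable | summarize Count=count() by Name

// Ingest from storage: efficient bulk ingestion from a blob
.ingest into table MyTable (h'https://ACCOUNT_NAME.blob.core.windows.net/CONTAINER_NAME/file.csv?SAS_TOKEN')
```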
14. MCT Summit 2021
What is LightIngest
• A command-line utility for ad hoc data ingestion into Kusto
• Can pull source data from a local folder or from an Azure Blob Storage container
• Useful for ingesting data quickly and experimenting with ADX
• Most useful when you want to ingest a large amount of data, because there is no time constraint on ingestion duration
[Ingest JSON data from blobs]
LightIngest "https://adxclu001.kusto.windows.net;Federated=true"
-database:db001
-table:LAB
-sourcePath:"https://ACCOUNT_NAME.blob.core.windows.net/CONTAINER_NAME?SAS_TOKEN"
-prefix:MyDir1/MySubDir2
-format:json
-mappingRef:DefaultJsonMapping
-pattern:*.json
-limit:100
[Ingest CSV data with headers from local files]
LightIngest "https://adxclu001.kusto.windows.net;Federated=true"
-database:MyDb
-table:MyTable
-sourcePath:"D:\MyFolder\Data"
-format:csv
-ignoreFirstRecord:true
-mappingPath:"D:\MyFolder\CsvMapping.txt"
-pattern:*.csv.gz
-limit:100
REFERENCE:
https://docs.microsoft.com/en-us/azure/kusto/tools/lightingest
15. MCT Summit 2021
LightIngest: pay attention!
Queued ingestion
Direct ingestion
IMPORTANT: the -creationTimePattern argument allows users to partition the data by creation time, not ingestion time
16. MCT Summit 2021
LightIngest: pay attention!
IMPORTANT: all the data is indexed, but how is it partitioned? By ingestion TIME! The -creationTimePattern argument allows users to partition the data by creation time instead of ingestion time.
17. MCT Summit 2021
One Click ingestion GA
• One Click provides an intuitive ingestion UX
• Start ingesting data, creating tables, and mapping structures
• Supports different data formats
• One-time or continuous ingestion
FIRST: check your data; create and destroy tons of test tables
18. MCT Summit 2021
Kafka Gold certified connector
• From an Apache Kafka cluster (in the cloud or on-premises)
• Use Kafka to ingest data into ADX at scale
• GOLD (partner-supported, with Microsoft)
What's the VISION behind it?
19. MCT Summit 2021
What is FluentBIT
• Collaboration with the CNCF Fluent Bit project
• A multi-platform log processor and forwarder that collects data/logs from different sources
• Unifies them and sends them to Block Blob storage
• Ingests them into ADX using Event Grid
• Can use Azurite as a storage endpoint for simulation
https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azurite?toc=/azure/storage/blobs/toc.json
20. MCT Summit 2021
Ingestion: Format & UseCases
• Ingest data using native formats: Apache Avro, CSV (RFC 4180), JSON, MultiJSON (JSON Lines), ORC, Parquet, PSV, SCSV, TSV, TXT
• Files/blobs can be compressed: ZIP, GZIP
• Prefer declarative names: MyData.csv.zip, MyData.json.gz
21. MCT Summit 2021
Supported data formats
For all ingestion methods other than ingest from query, format the data so that Azure Data Explorer can parse it. The
supported data formats are:
• CSV, TSV, TSVE, PSV, SCSV, SOH
• JSON (line-separated, multi-line), Avro
• ZIP and GZIP
Schema mapping helps bind source data fields to destination table columns.
• CSV Mapping (optional) works with all ordinal-based formats. It can be performed using the ingest
command parameter or pre-created on the table and referenced from the ingest command
parameter.
• JSON Mapping (mandatory) and Avro mapping (mandatory) can be performed using the ingest
command parameter. They can also be pre-created on the table and referenced from the ingest
command parameter.
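For example, a JSON mapping can be pre-created on the table and then referenced by name at ingest time; the table, columns, and mapping name here are illustrative:

```kusto
.create table MyTable ingestion json mapping "DefaultJsonMapping"
'[{"column":"Timestamp","path":"$.ts","datatype":"datetime"},{"column":"Message","path":"$.msg","datatype":"string"}]'
```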
22. MCT Summit 2021
My ingestion best experience
Open points:
• Why an Event Hub after the IoT Hub?
• Why the second Event Hub?
24. MCT Summit 2021
How about the Tools?
1. LOAD
• LightIngest
• Azure Data Factory
2. QUERY
• Kusto.Explorer
• Web UI
3. VISUALIZE
• Notebooks
• Power BI
• Grafana
• ADX Web UI
4. ORCHESTRATE
• Microsoft Flow
• Microsoft Logic Apps
Load → Query → Visualize → Orchestrate (BI people, IT people, ML people)
25. MCT Summit 2021
Azure data studio plugins:
Plugins: Cluster Manager, Notebooks
1. Select New connection from the Connections pane.
2. Fill in the Connection Details information.
3. For Connection type, select Kusto.
4. For Cluster, enter your Azure Data Explorer cluster name (don't include the https:// prefix or a trailing /).
5. For Authentication type, use the default: Azure Active Directory - Universal with MFA account.
6. For Account, use your account information.
7. For Database, use Default.
8. For Server group, use Default.
9. For Name (optional), leave blank.
26. MCT Summit 2021
Azure data studio plugins:
• Filter/view data
• Build 3D charts
• Take a snapshot as a declarative JSON file
27. MCT Summit 2021
Notebooks + ADX = KQL Magic
KQL magic:
https://github.com/microsoft/jupyter-Kqlmagic
• extends the capabilities of the Python kernel in Jupyter
• can run Kusto language queries natively
• combine Python and Kusto query language
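A minimal notebook sketch, following the Kqlmagic README; the 'help' cluster and 'Samples' database are the public demo, and the exact connection-string form may vary by version:

```
%reload_ext Kqlmagic
%kql AzureDataExplorer://code;cluster='help';database='Samples'
%kql StormEvents | summarize Count=count() by State | top 5 by Count
```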
29. MCT Summit 2021
Which are the OSS Alternatives that we should compare with?
From db-engines.com:
• Azure Data Explorer: fully managed big data interactive analytics platform
• Elasticsearch: a distributed, RESTful modern search and analytics engine
• Splunk: real-time insights engine to boost productivity & security
• InfluxDB: DBMS for storing time series, events and metrics
ADX can be a replacement for search and log analytics engines such as Elasticsearch, Splunk, and InfluxDB.
30. MCT Summit 2021
Comparison chart
| | Elasticsearch (Elastic) | InfluxDB (InfluxData Inc.) | Azure Data Explorer (Microsoft) | Splunk (Splunk Inc.) |
|---|---|---|---|---|
| Description | A distributed, RESTful modern search and analytics engine based on Apache Lucene | DBMS for storing time series, events and metrics | Fully managed big data interactive analytics platform | Analytics platform for big data |
| Database models | Search engine, document store | Time Series DBMS | Time Series DBMS, search engine, document store, event store, relational DBMS | Search engine |
| Initial release | 2010 | 2013 | 2019 | 2003 |
| License | Open source | Open source | Commercial | Commercial |
| Cloud-based only | no | no | yes | no |
| Implementation language | Java | Go | | |
| Server operating systems | All OS with a Java VM | Linux, OS X | hosted | Linux, OS X, Solaris, Windows |
| Data scheme | schema-free | schema-free | Fixed schema with schema-less datatypes (dynamic) | yes |
| Typing | yes | Numeric data and strings | yes | yes |
| XML support | no | no | yes | yes |
| Secondary indexes | yes | no | all fields are automatically indexed | yes |
| SQL | SQL-like query language | SQL-like query language | Kusto Query Language (KQL), SQL subset | no |
| APIs and other access methods | RESTful HTTP/JSON API, Java API | HTTP API, JSON over UDP | RESTful HTTP API, Microsoft SQL Server communication protocol (MS-TDS) | HTTP REST |
| Supported programming languages | .Net, Java, JavaScript, Python, Ruby, PHP, Perl, Groovy, community-contributed clients | .Net, Java, JavaScript, Python, R, Ruby, PHP, Perl, Haskell, Clojure, Erlang, Go, Lisp, Rust, Scala | .Net, Java, JavaScript, Python, R, PowerShell | .Net, Java, JavaScript, Python, Ruby, PHP |
| Server-side scripts | yes | no | Yes (KQL, Python, R) | yes |
| Triggers | yes | no | yes | yes |
| Partitioning methods | Sharding | Sharding | Sharding | Sharding |
| Replication methods | yes | selectable replication factor | yes | Master-master replication |
| MapReduce | ES-Hadoop connector | no | no | yes |
| Consistency concepts | Eventual consistency | Eventual consistency | Eventual consistency, immediate consistency | |
| Foreign keys | no | no | no | no |
| Transaction concepts | no | no | no | no |
| Concurrency | yes | yes | yes | yes |
| Durability | yes | yes | yes | yes |
| In-memory capabilities | Memcached and Redis integration | yes | no | no |
| User concepts | | simple rights management via user accounts | Azure Active Directory authentication | Access rights for users and roles |
31. MCT Summit 2021
Update Policy
Automatically append data to a target table whenever new data is inserted into the source table, based on a
transformation query that runs on the data inserted into the source table.
USE IT IF:
• The source table is «free-text column» based
• The target table accepts only a specific morphology
Cascading updates are allowed (TableA → TableB → TableC → ...).
Raw table → Refined table
32. MCT Summit 2021
How to use Update Policy
// Create a function that will be used for update
.create function
MyUpdateFunction()
{
MyTableX
| where ColumnA == 'some-string'
| summarize MyCount=count() by ColumnB, Key=ColumnC
| join (OtherTable | project OtherColumnZ, Key=OtherColumnC) on Key
| project ColumnB, ColumnZ=OtherColumnZ, Key, MyCount
}
// Create the target table (if it doesn't already exist)
.set-or-append DerivedTableX <| MyUpdateFunction() | limit 0
// Use update policy on table DerivedTableX
.alter table DerivedTableX policy update
@'[{"IsEnabled": true, "Source": "MyTableX", "Query": "MyUpdateFunction()", "IsTransactional": false, "PropagateIngestionProperties": false}]'
33. MCT Summit 2021
Pay attention to failures!
Evaluate resource usage
.show table MySourceTable extents;
// The following provides the extent ID for the not-yet-merged extent in the source table which has the most records
let extentId = $command_results
| where MaxCreatedOn > ago(1hr) and MinCreatedOn == MaxCreatedOn
| top 1 by RowCount desc
| project ExtentId;
let MySourceTable = MySourceTable | where extent_id() == toscalar(extentId);
MyFunction()
Failures
.show ingestion failures
| where FailedOn > ago(1hr) and OriginatesFromUpdatePolicy == true
• Non-transactional policy: failures are ignored
• Transactional policy: if the ingestion method is pull, the entire ingestion operation is automatically retried (up to a maximum time)
SO: you should check failures to catch «BROKEN FILES» … but HOW?
34. MCT Summit 2021
Use this pattern
The first table is NEVER wide!! … but the second one is!
• First table schema: K, V, TS, Metadata (telemetry oriented)
• Second table schema: WT, Wide Table (ML oriented)
35. MCT Summit 2021
My personal approach
DATA → FUNCTION1 → FUNCTION2 → FUNCTION3 (split into FUNCTION3.1 / FUNCTION3.2 / FUNCTION3.3) → KPI DEFINITIONS → DASHBOARD (use the KPIs to embed and filter them)
37. MCT Summit 2021
Some code Examples
• Query with between
• Function with parameters; «toscalar» expression
• «extend» usage
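A sketch of those techniques against the standard StormEvents sample table (its StartTime, EndTime, and DamageProperty columns):

```kusto
StormEvents
// between: closed-range filter
| where StartTime between (datetime(2007-11-01) .. datetime(2007-12-01))
// extend: add a calculated column
| extend DurationHours = (EndTime - StartTime) / 1h
// toscalar: collapse a single-value query into a scalar
| where DamageProperty > toscalar(StormEvents | summarize avg(DamageProperty))
| take 10

// A query-defined function with a parameter (MinDamage is an illustrative name)
let MinDamage = (threshold:long) { StormEvents | where DamageProperty > threshold };
MinDamage(1000) | count
```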
38. MCT Summit 2021
Kusto for SQL Users
• Perform SQL SELECT (no DDL, only SELECT)
• Use KQL (Kusto Query Language)
• Supports translating T-SQL queries to Kusto query language
explain
select top(10) * from StormEvents
order by DamageProperty desc

translates to:

StormEvents
| sort by DamageProperty desc nulls first
| take 10
39. MCT Summit 2021
Language examples
Alias:
database["wiki"] = cluster("https://somecluster.kusto.windows.net:443").database("somedatabase");
database("wiki").PageViews | count
Let:
let start = ago(5h);
let period = 2h;
T | where Time > start and Time < start + period | ...
Bin:
T | summarize Hits=count() by bin(Duration, 1s)
Batch:
let m = materialize(StormEvents | summarize n=count() by State);
m | where n > 2000;
m | where n < 10
Tabular expression:
Logs
| where Timestamp > ago(1d)
| join ( Events | where continent == 'Europe' ) on RequestId
40. MCT Summit 2021
Time Series Analysis – Bin Operator
bin operator: bin(value, roundTo)
Rounds values down to an integer multiple of the given bin size. If you have a scattered set of values, they are grouped into a smaller set of specific values.
Example:
T | summarize Hits=count() by bin(Duration, 1s)
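For instance (both cases follow the documented bin() behavior of rounding down):

```kusto
print bin(4.5, 1)                               // 4
print bin(datetime(1970-05-11 13:45:07), 1d)    // 1970-05-11T00:00:00
```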
41. MCT Summit 2021
Time Series Analysis – Make Series Operator
make-series operator:
T | make-series [MakeSeriesParameters] [Column =] Aggregation [default = DefaultValue] [, ...] on AxisColumn from start to end step step [by [Column =] GroupExpression [, ...]]
Example:
T | make-series sum(amount) default=0, avg(price) default=0 on timestamp from datetime(2016-01-01) to datetime(2016-01-10) step 1d by supplier
42. MCT Summit 2021
Time Series Analysis – Basket Operator
basket operator: finds all frequent patterns of discrete attributes (dimensions) in the data and returns all frequent patterns that pass the frequency threshold in the original query.
T | evaluate basket([Threshold, WeightColumn, MaxDimensions, CustomWildcard, CustomWildcard, ...])
Example:
StormEvents
| where monthofyear(StartTime) == 5
| extend Damage = iff(DamageCrops + DamageProperty > 0, "YES", "NO")
| project State, EventType, Damage, DamageCrops
| evaluate basket(0.2)
43. MCT Summit 2021
Time Series Analysis – Autocluster Operator
autocluster operator: finds common patterns of discrete attributes (dimensions) in the data and reduces the results of the original query (whether it's 100 or 100k rows) to a small number of patterns.
T | evaluate autocluster([SizeWeight, WeightColumn, NumSeeds, CustomWildcard, CustomWildcard, ...])
Examples:
StormEvents
| where monthofyear(StartTime) == 5
| extend Damage = iff(DamageCrops + DamageProperty > 0, "YES", "NO")
| project State, EventType, Damage
| evaluate autocluster(0.6)

StormEvents
| where monthofyear(StartTime) == 5
| extend Damage = iff(DamageCrops + DamageProperty > 0, "YES", "NO")
| project State, EventType, Damage
| evaluate autocluster(0.2, '~', '~', '*')
44. MCT Summit 2021
ADX Functions
Functions are reusable queries or query parts. Kusto supports several kinds of functions:
• Stored functions: user-defined functions that are stored and managed as one kind of database schema entity. See Stored functions.
• Query-defined functions: user-defined functions that are defined and used within the scope of a single query, via a let statement. See User-defined functions.
• Built-in functions: hard-coded functions (defined by Kusto; they cannot be modified by users).
45. MCT Summit 2021
Materialized views
The view exposes an always up-to-date view of the defined aggregation.
Advantages:
• Performance improvement
• Freshness
• Cost reduction
Behind the scenes:
• The source table is periodically materialized into the view table
• At query time, the view combines the materialized part with the DELTA in the raw table since the last materialization, returning complete results
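A minimal sketch; the Telemetry table, its columns, and the view name are illustrative:

```kusto
// Always-fresh "latest reading per device" aggregation
.create materialized-view DeviceLastReading on table Telemetry
{
    Telemetry
    | summarize arg_max(Timestamp, *) by DeviceId
}
```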
46. MCT Summit 2021
QUERY AND PERFORMANCE OPTIMIZATION
• Materialized views
• Partitioning
• Query result caching
• Near real time scoring of AML and ONNX models
• FFT functions
• Geospatial
47. MCT Summit 2021
Query result caching
• Better query performance
• Lower resource consumption
• The queries need to be identical
• The cache policy is defined by MAX AGE
• Common use case: DASHBOARDS
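A client opts in per query by declaring the maximum acceptable age of a cached result:

```kusto
// Serve from the results cache if a result no older than 5 minutes exists
set query_results_cache_max_age = time(5m);
StormEvents
| summarize Count=count() by State
```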
48. MCT Summit 2021
Geospatial joins
• Use cases
• Connected mobility solutions
• Geospatial risk analysis
• Agriculture optimization using weather data
• Technical background
• Join of polygon reference data and geospatial time-series data
• Based on three-dimensional S2 geometry
• Consists of a coarse-grained join using S2 cell coverage, plus exact validation using the geo_point_in_polygon function
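The exact-validation step can be sketched like this; the polygon is an illustrative GeoJSON rectangle around Seattle:

```kusto
print inside = geo_point_in_polygon(
    -122.33, 47.61,    // longitude, latitude
    dynamic({"type":"Polygon","coordinates":[[
        [-122.40, 47.55], [-122.25, 47.55],
        [-122.25, 47.70], [-122.40, 47.70],
        [-122.40, 47.55]]]}))
// inside == true
```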
50. MCT Summit 2021
ADX Dashboards
• Integrated into the Kusto Web Explorer
• Optimized for big data
• Uses powerful KQL to retrieve visual data
• Makes dynamic views or widgets
51. MCT Summit 2021
Grafana query builder
• Create Grafana panels with no KQL knowledge
• Select values/filters/grouping using simple UI dropdowns
• Switch to raw mode to enhance queries with KQL
52. MCT Summit 2021
How to use Grafana easily
Go to https://grafana.com/
Sign up and get an account
53. MCT Summit 2021
How to use Grafana easily
Go to the All Plugins section, search for the ADX datasource, and install the plugin
54. MCT Summit 2021
How to use Grafana easily
Go to your Grafana instance at https://<workbenchname>.grafana.net/datasources and configure the ADX datasource.
Then start building dashboards!
56. MCT Summit 2021
How about orchestration?
Three use cases in which Flow + Kusto are the solution:
• Push data to a Power BI dataset: periodically run queries and push the results to a Power BI dataset
• Conditional queries: run data checks and send notifications with no code
• Email multiple ADX Flow charts: send rich HTML5 emails with a chart as the query result
57. MCT Summit 2021
Orchestration?
• Manage costs: start and stop the cluster by evaluating a condition
• Query sets to check data: plan a set of queries in order to say «IT'S OK, even today!»
• Manage data retention: based on a dynamic condition
58. MCT Summit 2021
An Example of:
1. Set a trigger
2. Connect and test the ADX block
3. Configure the email block with dynamic params
61. MCT Summit 2021
Export
• To Storage
.export async compressed to csv (
  h@"https://storage1.blob.core.windows.net/containerName;secretKey",
  h@"https://storage1.blob.core.windows.net/containerName2;secretKey"
) with (
  sizeLimit=100000, namePrefix=export, includeHeaders=all, encoding=UTF8NoBOM
) <| myLogs | where id == "moshe" | limit 10000
• To SQL
.export async to sql ['dbo.MySqlTable']
  h@"Server=tcp:myserver.database.windows.net,1433;Database=MyDatabase;Authentication=Active Directory Integrated;Connection Timeout=30;"
  with (createifnotexists="true", primarykey="Id")
  <| print Message = "Hello World!", Timestamp = now(), Id=12345678
1. DEFINE COMMAND
Define ADX command and try your
recurrent export strategy
2. TRY IN EDITOR
Use an editor to try the command, verifying connection strings and parametrizing them
3. BUILD A JOB
Build a Notebook or a C# JOB using
the command as a SQL QUERY in
your CODE
62. MCT Summit 2021
External tables & Continuous Export
• It’s an external
endpoint:
• Azure Storage
• Azure Datalake Store
• SQL Server
• You need to define:
• Destination
• Continuous-Export
Strategy
EXT TABLE CREATION
.create external table ExternalAdlsGen2 (Timestamp:datetime, x:long, s:string)
  kind=adl
  partition by bin(Timestamp, 1d)
  dataformat=csv
  ( h@'abfss://filesystem@storageaccount.dfs.core.windows.net/path;secretKey' )
  with ( docstring = "Docs", folder = "ExternalTables", namePrefix="Prefix" )
EXPORT to EXT TABLE
.create-or-alter continuous-export MyExport over (T) to table ExternalAdlsGen2
  with (intervalBetweenRuns=1h, forcedLatency=10m, sizeLimit=104857600)
  <| T
63. MCT Summit 2021
My best experience
Open points:
• How to extract insights using a dynamic and codeless approach?
• How to integrate ADX with low-cost DB solutions?
66. MCT Summit 2021
Data encryption in ADX
• Encryption at rest (using Azure Storage encryption)
• A Microsoft-managed key is used
• Customer-managed keys can be enabled
• Key rotation, temporary disable, and revoke-access controls can be implemented
• Soft Delete and Purge Protection will be enabled on the Key Vault and cannot be disabled
67. MCT Summit 2021
Extents, policies and Partition
• What are data shards or extents
• Column, segments, and blocks
• merge policy and sharding policy
• Data partitioning policy (post-ingestion)
68. MCT Summit 2021
FACTS:
A) Kusto stores its ingested data in reliable storage (most commonly Azure Blob Storage).
B) To speed up queries on that data, Kusto caches this data (or parts of it) on its processing nodes.
The Kusto cache provides a granular cache policy that customers can use to differentiate between two data cache policies: hot data cache and cold data cache.
YOU CAN SPECIFY WHICH LOCATION MUST BE USED:
set query_datascope="hotcache";
T | union U | join (T datascope=all | where Timestamp < ago(365d)) on X
The cache policy is independent from the retention policy!
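The hot-cache window itself is set with the caching policy; the table name is a placeholder:

```kusto
// Keep the most recent 7 days of data in the hot cache
.alter table MyTable policy caching hot = 7d
```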
69. MCT Summit 2021
Retention policy
2 parameters, applicable to a DB or a table:
• SoftDeletePeriod (number)
  • Data is available for query (ts is the ADX ingestion date)
  • Default is set to 100 YEARS
• Recoverability (Enabled/Disabled)
  • Default is set to ENABLED
  • Recoverable for 14 days after deletion
.alter database DatabaseName policy retention "{}"
.alter table TableName policy retention "{}"
EXAMPLE:
{ "SoftDeletePeriod": "36500.00:00:00", "Recoverability": "Enabled" }
.delete database DatabaseName policy retention
.delete table TableName policy retention
.alter-merge table MyTable1 policy retention softdelete = 7d
70. MCT Summit 2021
Data Purge
The purge process is final and irreversible.
PURGE PROCESS:
1. It requires database admin permissions.
2. Prior to purging, you have to be ENABLED by opening a SUPPORT TICKET.
3. Run the purge QUERY; it reports the SIZE and EXECUTION TIME and returns a VerificationToken.
4. Run the REAL purge QUERY, passing the verification token.
2-STEP PROCESS:
.purge table MyTable records in database MyDatabase <| where CustomerId in ('X', 'Y')
NumRecordsToPurge: 1,596
EstimatedPurgeExecutionTime: 00:00:02
VerificationToken: e43c7184ed22f4f23c7a9d7b124d196be2e570096987e5baadf65057fa65736b
.purge table MyTable records in database MyDatabase with (verificationtoken='e43c7184ed22f4f23c7a9d7b124d196be2e570096987e5baadf65057fa65736b') <| where CustomerId in ('X', 'Y')
1-STEP PROCESS (with no regrets!!!!):
.purge table MyTable records in database MyDatabase with (noregrets='true')
71. MCT Summit 2021
Virtual Network
BENEFITS
• Use NSG rules to limit traffic.
• Connect your on-premises network to the Azure Data Explorer cluster's subnet.
• Secure your data connection sources (Event Hub and Event Grid) with service endpoints.
A VNet gives you TWO independent IPs:
• Private IP: access the cluster inside the VNet.
• Public IP: access the cluster from outside the VNet (management and monitoring), and as a source address for outbound connections initiated from the cluster.
73. MCT Summit 2021
Enterprise readiness
• RLS (Row Level Security)
• Provides fine-grained control of access to table data by different users
• Allows specifying user access to specific rows in tables
• Provides mechanics to mask PII data in tables
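A sketch of wiring RLS to a filtering function; the Sales table, TenantId column, and SalesFilter function are illustrative:

```kusto
// A function that returns only the rows the calling principal may see
.create-or-alter function SalesFilter() {
    Sales
    | where TenantId == tostring(current_principal_details()["UserPrincipalName"])
}
// Attach it as the table's row-level-security policy
.alter table Sales policy row_level_security enable "SalesFilter"
```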
74. MCT Summit 2021
Leader and Follower
• Azure Data Share creates a symbolic link between two ADX clusters.
• Sharing occurs in near-real-time (no data pipeline).
• ADX decouples storage and compute.
• This allows customers to run multiple compute (read-only) instances on the same underlying storage.
• You can attach a database as a follower database, which is a read-only database on a remote cluster.
• You can share the data at the database level or at the cluster level.
The cluster sharing the database is the leader cluster and the
cluster receiving the share is the follower cluster.
A follower cluster can follow one or more leader cluster
databases. The follower cluster periodically synchronizes to
check for changes.
The queries running on the follower cluster use local cache
and don't use the resources of the leader cluster.
EXAMPLE of QUEUED INGESTION
https://docs.microsoft.com/en-us/azure/kusto/api/netfx/kusto-ingest-queued-ingest-sample
Example of INLINE INGESTION
https://docs.microsoft.com/it-it/azure/kusto/management/data-ingestion/ingest-inline
DEMO:
LightIngest is a command-line utility for ad-hoc data ingestion into Azure Data Explorer (ADX). The utility can pull source data from a local folder or from an Azure blob storage container. LightIngest is most useful when you want to ingest a large amount of data, because there is no time constraint on ingestion duration. When historical data is loaded from an existing system to ADX, all records receive the same ingestion date. The -creationTimePattern argument allows users to partition the data by creation time, not ingestion time. It extracts the CreationTime property from the file or blob path; the pattern doesn't need to reflect the entire item path, just the section enclosing the timestamp you want.
Show the NOTEBOOKS
Try it in VS Code
CTRL+P => kuskus
Then:
cluster(adxclu001).database('db001').table('TBL_LAB01')
| count
An update policy instructs Kusto to automatically append data to a target table whenever new data is inserted into the source table, based on a transformation query that runs on the data inserted into the source table.
The query can invoke stored functions, but can't include cross-database or cross-cluster queries.
Update policy is initiated following ingestion
Update policies take effect when data is ingested or moved to (extents are created in) a defined source table by any of the ingestion commands.
The update policy will behave like regular ingestion when the following conditions are met:
The source table is a high-rate trace table with interesting data formatted as a free-text column.
The target table on which the update policy is defined accepts only specific trace lines.
The table has a well-structured schema that is a transformation of the original free-text data created by the parse operator.
If you use `has`, indexes are used; `contains` does not use indexes.
Explain how to do `between`.
DO SOME TESTS
Do a `distinct` to introduce SUMMARIZE
T | summarize Hits=count() by bin(Duration, 1s)
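The has/contains distinction and the distinct-to-summarize step, sketched against the StormEvents sample table:

```kusto
// `has` matches whole terms and can use the term index (fast)
StormEvents | where EventType has "Flood" | count

// `contains` matches arbitrary substrings and cannot use the index (slower)
StormEvents | where EventType contains "loo" | count

// distinct, and the equivalent summarize
StormEvents | distinct State
StormEvents | summarize by State
```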
.create-or-alter function with (folder = "AzureSaturday2019", docstring = "Func1", skipvalidation = "true") MyFunction1(i:long) {TBL_LAB0X | limit 100 | where minimum_nights > i}
MyFunction1(80);
explain SELECT name, minimum_nights from TBL_LAB0X
.create-or-alter function with (folder = "AzureSaturday2019", docstring = "Func1", skipvalidation = "true") MyFunction1(i:long) {TBL_LAB0X | project name, minimum_nights | limit 100 | where minimum_nights > i | render columnchart}
MyFunction1(80);
DO AN EXPORT EXAMPLE
Kusto is built to support tables with a huge number of records (rows) and large amounts of data. To handle such large tables, each table's data is divided into smaller "tablets" called data shards or extents (the two terms are synonymous). The union of all the table's extents holds the table's data. Individual extents are kept smaller than a single node's capacity, and the extents are spread over the cluster's nodes, achieving scale-out.
An extent is like a type of mini-table. It contains data and metadata, such as its creation time and optional tags that are associated with its data. Additionally, the extent usually holds information that lets Kusto query the data efficiently: for example, an index for each column of data in the extent, and an encoding dictionary if column data is encoded. As a result, the table's data is the union of all the data in the table's extents.
Extents are immutable and can never be modified. An extent may only be queried, reassigned to a different node, or dropped out of the table. Data modification happens by creating one or more new extents and transactionally swapping old extents with new ones.
Extents hold a collection of records that are physically arranged in columns. This technique is called columnar store. It enables efficient encoding and compression of the data, because different values from the same column often "resemble" each other. It also makes querying large spans of data more efficient, because only the columns used by the query need to be loaded. Internally, each column of data in the extent is subdivided into segments, and the segments into blocks. This division isn't observable to queries, and lets Kusto optimize column compression and indexing.
To maintain query efficiency, smaller extents are merged into larger extents. The merge is done automatically, as a background process, according to the configured merge policy and sharding policy. Merging extents reduces the management overhead of having a large number of extents to track. More importantly, it allows Kusto to optimize its indexes and improve compression.
Extent merging stops once an extent reaches certain limits, such as size, since beyond a certain point, merging reduces rather than increases efficiency.
When a Data partitioning policy is defined on a table, extents go through another background process after they're created (post-ingestion). This process reingests the data from the source extents and creates homogeneous extents, in which the values of the column that is the table's partition key all belong to the same partition. If the policy includes a hash partition key, all homogeneous extents that belong to the same partition will be assigned to the same data node in the cluster.