Raven: End-to-end Optimization of ML Prediction Queries
Konstantinos Karanasos, Kwanghyun Park
Gray Systems Lab, Microsoft
Enterprise-grade ML lifecycle
[Figure: ML lifecycle diagram. Offline: model development/training (offline featurization, model training), data catalogs, governance, model tracking & provenance, access control, logs & telemetry. Online: model inference (featurization, model), model optimization, policies, orchestration, deployment, app logic, live data, other data, decisions.]
[Figure: Data Scientist and Analyst/Developer workflows: data exploration/preparation, data selection/transformation, model training, model deployment, model scoring.]
Use Case: Length-of-stay in Hospital
Model: “Predict length of stay of a patient in the hospital”
Prediction query: “Find pregnant patients that are expected to stay in the hospital more than a week”
Prediction Queries: Baseline Approach
[Figure: application stack with web server (HTTP), app logic (ODBC), and DBMS, plus a container serving the featurization and model over REST; policies.]
Enterprise Features
• Security: data and models outside of the DB
• Extra infrastructure
• High TCO
• Lack of tooling/best practices
Performance
• Data movement
• Latency
• Throughput on batch-scoring
Prediction Queries: In-Engine Evaluation
[Figure: application stack with web server (HTTP), app logic (ODBC), and DBMS; policies.]
Enterprise Features
• Security: Data and models within the DBMS
• Reuse existing infrastructure
• Language/tools/best practices
• Low TCO
Performance ?
• Up to 13x faster on Spark
• Up to 330x faster on SQL Server
Raven: An Optimizer for Prediction Queries in Azure Data
[Figure: data + models combined into a unified IR.]
M: model pipeline (Data Scientist)
INSERT INTO scoring_models (model_name, model) VALUES
('duration_of_stay',
 'from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier
from …
model_pipeline =
  Pipeline([("union", FeatureUnion(…
              ("scaler", StandardScaler()), …)),
            ("clf", DecisionTreeClassifier())])');

Q: SQL query invoking model (Data Analyst)
DECLARE @model varbinary(max) = (
  SELECT model FROM scoring_models
  WHERE model_name = 'duration_of_stay');
WITH data AS (
  SELECT *
  FROM patient_info AS pi
  JOIN blood_tests AS be ON pi.id = be.id
  JOIN prenatal_tests AS pt ON be.id = pt.id
)
SELECT d.id, p.length_of_stay
FROM PREDICT(MODEL = @model, DATA = data AS d)
WITH (length_of_stay Pred float) AS p
WHERE d.pregnant = 1 AND p.length_of_stay > 7;
[Figure: Unified IR for MQ. Operators shown: inputs patient_info, blood_tests, prenatal_tests; σ pregnant = 1; Categorical Encoding; Rescaling; Concat; FeatureExtractor; DecisionTreeClassifier (splits on pregnant 1/0, age <35 / >=35, gender F/M/X, bp; leaf values 2, 4, 7); σ length_of_stay >= 7.]
[Figure: Optimized plan for MQ. Operators shown: inputs patient_info, blood_tests, prenatal_tests; σ pregnant = 1; σ age > 35 and σ age <= 35; projections (π); NeuralNet; union (U); σ bp > 140; σ length_of_stay >= 7; and a SQL-inlined model:
  switch:
    case (bp>140): 7
    case (120<bp<140): 4
    case (bp<120): 2
Stage labels: MQ: inference query, Static Analysis, Cross Optimization, Optimized IR, Runtime Code gen.]
• Embed high-performance ML inference runtimes within our data engines
• Express data and ML operations in a common graph
Constructing the IR
Raven IR operators
• Relational algebra
• Linear algebra
• Other ML operators and data featurizers
• UDFs
Static analysis of the prediction query
• Support for SQL+ML
• Adding support for Python
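As a rough illustration of what a common graph over data and ML operators could look like, the sketch below models an IR node tagged with one of the operator categories listed above. The names `OpKind`, `IRNode`, and `walk`, as well as the toy plan, are hypothetical and are not Raven's actual data structures.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Iterator, List


class OpKind(Enum):
    # Operator categories as listed on the slide.
    RELATIONAL = "relational algebra"
    LINEAR_ALGEBRA = "linear algebra"
    ML = "other ML operators and data featurizers"
    UDF = "UDF"


@dataclass
class IRNode:
    """One operator in a hypothetical unified IR graph."""
    name: str
    kind: OpKind
    inputs: List["IRNode"] = field(default_factory=list)


def walk(node: IRNode) -> Iterator[IRNode]:
    """Yield operators bottom-up: inputs before the operators that consume them."""
    for child in node.inputs:
        yield from walk(child)
    yield node


# Toy plan loosely following the length-of-stay example.
scan = IRNode("scan(patient_info)", OpKind.RELATIONAL)
select = IRNode("select(pregnant = 1)", OpKind.RELATIONAL, [scan])
scale = IRNode("Rescaling", OpKind.ML, [select])
tree = IRNode("DecisionTreeClassifier", OpKind.ML, [scale])
final = IRNode("select(length_of_stay > 7)", OpKind.RELATIONAL, [tree])

plan_names = [n.name for n in walk(final)]
kinds = {n.kind for n in walk(final)}
```

Because relational and ML operators live in one graph, a rewrite rule can inspect a filter and the model that follows it in the same traversal.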
ML Inference in Azure Data Engines
SQL Server
• PREDICT statement in SQL Server
• Embedded ONNX Runtime in the engine
• Available in Azure SQL Edge and SQL DW (part of Azure Synapse Analytics)
Spark
• Introduced a new PREDICT operator
• Similar syntax to SQL Server
• Support for different types of models
Q: “Find pregnant patients expected to stay in the hospital more than a week”
Raven: An Optimizer for Prediction Queries
Raven optimizations in practice
1. Predicate-based model pruning
2. Model projection pushdown
3. Model splitting
4. Model-to-SQL translation
5. NN translation
6. Standard DB optimizations
7. Compiler optimizations
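To illustrate predicate-based model pruning from the list above, here is a minimal sketch under stated assumptions: the model is a small binary decision tree, and the query predicate fixes one feature (`pregnant = 1`). The `Leaf`/`Split` classes, the helper functions, and all thresholds are hypothetical and only loosely mirror the slide's example tree.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    value: float


@dataclass
class Split:
    """Internal node: go left if row[feature] <= threshold, else right."""
    feature: str
    threshold: float
    left: "Node"
    right: "Node"


Node = Union[Leaf, Split]


def prune(node: Node, known: Dict[str, float]) -> Node:
    """Drop branches that cannot be reached once the `known` features are fixed."""
    if isinstance(node, Leaf):
        return node
    if node.feature in known:
        taken = node.left if known[node.feature] <= node.threshold else node.right
        return prune(taken, known)
    return Split(node.feature, node.threshold,
                 prune(node.left, known), prune(node.right, known))


def predict(node: Node, row: Dict[str, float]) -> float:
    """Walk the tree from the root to a leaf for one input row."""
    while isinstance(node, Split):
        node = node.left if row[node.feature] <= node.threshold else node.right
    return node.value


def count_splits(node: Node) -> int:
    """Number of internal nodes, used here as a proxy for model size."""
    if isinstance(node, Leaf):
        return 0
    return 1 + count_splits(node.left) + count_splits(node.right)


# Illustrative tree: the root tests `pregnant`, the subtrees test age or bp.
tree = Split("pregnant", 0.5,
             Split("age", 35, Leaf(2.0), Leaf(3.0)),
             Split("bp", 140, Split("bp", 120, Leaf(2.0), Leaf(4.0)), Leaf(7.0)))

# The WHERE clause guarantees pregnant = 1, so the other subtree is unreachable.
pruned = prune(tree, {"pregnant": 1})
```

The pruned tree no longer reads `pregnant` or `age` at all, which is what lets a later step drop those columns from the scan.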
Raven optimizations: Key Ideas
1. Avoid unnecessary computation
   Information passing between model and data
2. Pick the right runtime for each operation
   Translation between data and ML operations
3. Hardware acceleration
   Translation to tensor computations (Hummingbird)
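To make the "translation between data and ML operations" idea more concrete, the following sketch turns a tiny decision tree into a nested SQL CASE expression and evaluates it inside a database. It is an illustration only: SQLite (from the Python standard library) stands in for SQL Server, and the tuple-based tree encoding, the thresholds, the `patients` table, and `tree_to_sql` are assumptions introduced here. The `>= 7` filter follows the optimized-plan figure.

```python
import sqlite3

# A tiny tree as nested tuples: (feature, threshold, left, right) or a leaf value.
# Branch left when feature <= threshold. Thresholds are illustrative.
TREE = ("bp", 120, 2, ("bp", 140, 4, 7))


def tree_to_sql(node) -> str:
    """Render the tree as a nested SQL CASE expression."""
    if not isinstance(node, tuple):
        return str(node)
    feature, threshold, left, right = node
    return (f"CASE WHEN {feature} <= {threshold} "
            f"THEN {tree_to_sql(left)} ELSE {tree_to_sql(right)} END")


expr = tree_to_sql(TREE)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (id INTEGER, pregnant INTEGER, bp REAL)")
conn.executemany("INSERT INTO patients VALUES (?, ?, ?)",
                 [(1, 1, 150.0), (2, 1, 130.0), (3, 0, 160.0), (4, 1, 110.0)])

# The model is now an ordinary expression that the SQL optimizer can see through.
query = (f"SELECT id, length_of_stay FROM "
         f"(SELECT id, pregnant, {expr} AS length_of_stay FROM patients) "
         f"WHERE pregnant = 1 AND length_of_stay >= 7 ORDER BY id")
rows = conn.execute(query).fetchall()
```

Once the model is inlined this way, no separate ML runtime is invoked for it; whether that is faster depends on model size, which is presumably why the slides apply it to decision trees and logistic regressions rather than to large ensembles.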
Performance Evaluation: Raven in Spark (HDI)
[Figure: End-to-end inference query time. Y-axis: elapsed time (seconds), 0 to 5,000. Models: DT-depth5, DT-depth8, LR-.001, GB-20est on Hospital (2 billion rows) and Expedia (500 million rows). Series: SparkML, Sklearn, ONNX runtime, Raven.]
Best of Raven:
• Decision Trees (DT) and Logistic Regressions (LR): Model Projection Pushdown + ML-to-SQL
• Gradient Boost (GB): Model Projection Pushdown
SELECT PREDICT(model, col1, …)
FROM Hospital
SELECT PREDICT(model, S.col1, …)
FROM listings S, hotels R1, searches R2
WHERE S.prop_id = R1.prop_id AND S.srch_id = R2.srch_id
Raven outperforms other ML runtimes (SparkML, Sklearn, ONNX runtime) by up to ~44x
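The "Best of Raven" configurations above all include model projection pushdown. Read here as "only fetch the columns the model actually uses", it can be sketched as below for a linear model whose unused features have a weight of exactly zero. The weights, column names, and helper functions are hypothetical.

```python
from typing import Dict, List


def used_features(weights: Dict[str, float]) -> List[str]:
    """Features with a non-zero weight; the rest cannot affect the score."""
    return [name for name, w in weights.items() if w != 0.0]


def pushed_down_query(table: str, key: str, weights: Dict[str, float]) -> str:
    """Build a scan that reads only the key and the columns the model needs."""
    cols = ", ".join([key] + used_features(weights))
    return f"SELECT {cols} FROM {table}"


def score(row: Dict[str, float], weights: Dict[str, float], bias: float) -> float:
    """Linear score computed over the used features only."""
    return bias + sum(weights[f] * row[f] for f in used_features(weights))


# Hypothetical sparse linear model: most weights are exactly zero.
weights = {"age": 0.04, "bp": 0.02, "gender": 0.0, "weight": 0.0, "height": 0.0}
sql = pushed_down_query("patient_info", "id", weights)
value = score({"age": 30.0, "bp": 100.0}, weights, bias=1.0)
```

With wide tables and joins, as in the Expedia query, skipping unused columns reduces both the data read and the data handed to the ML runtime.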
Performance Evaluation: Raven in Spark with GPU
[Figure: End-to-end inference query time for gradient boost models (Hospital, 200M rows). Y-axis: elapsed time (seconds), 0 to 3,500. Models: 60Est/Dep5, 100Est/Dep4, 100Est/Dep8, 500Est/Dep8. Series: ONNX runtime, Raven - CPU, Raven - GPU.]
SELECT PREDICT(model, col1, …) FROM Hospital
Raven + GPU outperforms ONNX runtime by up to ~8x for complex models
Performance Evaluation: Raven Plans in SQL Server
[Figure: End-to-end inference query time. Y-axis: end-to-end time (sec), log scale, 1 to 100,000. Models: DT-depth5, DT-depth8, LR-.001, GB/RF-20est on hospital (100M rows); DT-depth5, DT-depth8, LR-.001, GB-20est on expedia (100M rows). Series: MADlib, SQL Server (DOP1), Raven (DOP1), SQL Server (DOP16), Raven (DOP16).]
Potential gains with Raven in SQL Server are large (annotated speedups in the plot: ~230x and ~100x).
Best of Raven:
• Decision Trees (DT) and Logistic Regressions (LR): Model Projection Pushdown + ML-to-SQL
• Gradient Boost (GB): Model Projection Pushdown
Performance Evaluation: Raven in SQL Server with GPU
Potential gains with Raven and GPU acceleration are large (annotated speedup: ~100x).
Batch size:
• CPU: Minimum query time obtained with optimal choice of batch size (50K/100K rows).
• GPU: 600K rows.
[Figure: End-to-end time (secs), 0 to 1,400, for GB models on hospital (100M rows): depth3-20est, depth5-60est, depth4-100est, depth8-100est, depth8-500est. Series: Min. CPU-SKL, GPU-HB. Annotated speedup: ~2.6x.]
Demo
Conclusion: in-DBMS model inference
• Raven is the first step in a long journey of incorporating ML inference as a foundational extension of relational algebra and an integral part of SQL query optimizers and runtimes
• Novel Raven optimizer with cross optimizations and operator transformations
  - Up to 13x performance improvements on Spark
  - Up to 330x performance improvements on SQL Server
• Integration of Raven within Spark and SQL Server
Backup
Current state of affairs: In-application model inference
Use case: hospital length-of-stay
“Find pregnant patients that are expected to stay in the hospital more than a week”
Security
• Data leaves the DB
• Model outside of the DB
Performance
• Data movement
• Use of Python for data operations
Raven: In-DBMS model inference
Inference query: SQL + PREDICT (SQL Server 2017 syntax) to combine SQL operations with ML inference
[Figure: Raven inside the DBMS, taking model and data as SQL + ML.]
Raven: In-DB model inference
Security
• Data and models within the DB
• Treat models as data
User experience
• Leverage maturity of RDBMS
• Connectivity, tool integration
Can in-DBMS ML inference match (or exceed?) the performance of state-of-the-art ML frameworks?
Yes, by up to 230x!
Cross-optimizations in practice
Cross-IR optimizations and operator transformations:
• Predicate-based model pruning
• Model projection pushdown
• Model splitting
• Model inlining
• NN translation
• Standard DB optimizations
• Compiler optimizations
Raven overview
Key ideas:
1. Novel cross-optimizations between SQL and ML operations
2. Combine high-performance ML inference engines with SQL Server
Effect of cross optimizations
[Figure: Inference time (ms), log scale (1 to 10^4), vs. dataset size (1K, 10K, 100K, 1M). Series: RF (scikit-learn), RF-NN (CPU), RF-NN (GPU). Annotated speedups: 24.5x, 15x, 5.3x.]
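The RF-NN series refers to tree models translated into tensor computations. One way such a translation can work is to encode the tree structure in matrices so that inference becomes a few matrix products and element-wise comparisons. The plain-Python sketch below illustrates that idea for a single toy tree; it is not Hummingbird's code, the matrices are hand-built for this example, and `matmul` is a naive stand-in for the GEMM a tensor runtime would run on CPU or GPU.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Naive dense matrix product (stand-in for a tensor runtime's GEMM)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Toy tree over features [age, bp] (thresholds are illustrative):
#   n0: bp < 120 ? leaf0 (2) : n1
#   n1: bp < 140 ? leaf1 (4) : leaf2 (7)
A = [[0.0, 0.0],           # age feeds no internal node
     [1.0, 1.0]]           # bp feeds n0 and n1
B = [120.0, 140.0]         # threshold per internal node
C = [[1.0, -1.0, -1.0],    # n0: leaf0 in left subtree, leaf1/leaf2 in right
     [0.0, 1.0, -1.0]]     # n1: leaf1 left, leaf2 right, leaf0 not below n1
D = [1.0, 1.0, 0.0]        # number of "left" ancestors on each leaf's path
E = [[2.0], [4.0], [7.0]]  # value stored at each leaf


def predict_batch(X: Matrix) -> List[float]:
    # Step 1: evaluate every split at once; 1.0 where the row goes left.
    went_left = [[1.0 if v < t else 0.0 for v, t in zip(row, B)]
                 for row in matmul(X, A)]
    # Step 2: a leaf is selected when its path agrees with all split outcomes.
    leaf_hit = [[1.0 if v == d else 0.0 for v, d in zip(row, D)]
                for row in matmul(went_left, C)]
    # Step 3: read out the value of the selected leaf.
    return [row[0] for row in matmul(leaf_hit, E)]


preds = predict_batch([[30.0, 110.0], [40.0, 130.0], [25.0, 150.0]])
```

Every row goes through the same fixed sequence of dense operations with no data-dependent branching, which is the property that makes this formulation a good fit for batch execution on accelerators.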
Execution modes
• In-process: deep integration of ONNX Runtime in SQL Server
• Out-of-process: for queries/models not supported by our static analyzer; sp_execute_external_script (Python, R, Java)
• Containerized: for languages not supported by out-of-process execution
In-process execution
Native predict: execute the model in the same process as SQL Server
Rudimentary support since SQL Server 2017 (five hardcoded models)
Take advantage of state-of-the-art ML inference engines
Compiler optimizations, Code generation, Hardware acceleration
SQL Server + ONNX Runtime
Some challenges
Align schemata between DB and model
Transform data to/from tensors (avoid copying)
Cache inference sessions
Allow for different ML engines
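The "cache inference sessions" challenge can be illustrated with a small sketch: loading a model is assumed to be expensive, so a loaded session is kept per distinct model payload and reused across queries. `SessionCache` and `toy_loader` are hypothetical; in a real system the loader would create a session in the chosen ML engine.

```python
import hashlib
from typing import Callable, Dict, List

Session = Callable[[List[float]], float]


class SessionCache:
    """Keep one loaded inference session per distinct model payload."""

    def __init__(self, loader: Callable[[bytes], Session]):
        self._loader = loader              # assumed expensive: parses the model
        self._sessions: Dict[str, Session] = {}
        self.loads = 0                     # how many times the loader actually ran

    def get(self, model_bytes: bytes) -> Session:
        # Key on the content hash so identical models share one session.
        key = hashlib.sha256(model_bytes).hexdigest()
        if key not in self._sessions:
            self._sessions[key] = self._loader(model_bytes)
            self.loads += 1
        return self._sessions[key]


def toy_loader(model_bytes: bytes) -> Session:
    """Hypothetical stand-in for building a real inference session."""
    scale = float(model_bytes.decode())
    return lambda features: scale * sum(features)


cache = SessionCache(toy_loader)
first = cache.get(b"2.0")([1.0, 2.0])   # loads the model
second = cache.get(b"2.0")([3.0])       # reuses the cached session
```

Passing the loader in as a parameter also leaves room for different ML engines behind the same cache.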
Current status
In-process predictions
• Implementation in SQL Server 2019
• Public preview in Azure SQL DB Edge
• Private preview in Azure SQL DW
Out-of-process predictions
• ONNX Runtime as an external language (ongoing)
Benefits of deep integration
[Figure: Total inference time (ms), log scale (1 to 10K), vs. dataset size (1K to 10M) for Random Forest and MLP models. Series: ORT, Raven, Raven ext.]
Weitere ähnliche Inhalte

Was ist angesagt?

NOSQLEU - Graph Databases and Neo4j
NOSQLEU - Graph Databases and Neo4jNOSQLEU - Graph Databases and Neo4j
NOSQLEU - Graph Databases and Neo4jTobias Lindaaker
 
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017AWS Chicago
 
How Kafka Powers the World's Most Popular Vector Database System with Charles...
How Kafka Powers the World's Most Popular Vector Database System with Charles...How Kafka Powers the World's Most Popular Vector Database System with Charles...
How Kafka Powers the World's Most Popular Vector Database System with Charles...HostedbyConfluent
 
Simplifying Big Data Analytics with Apache Spark
Simplifying Big Data Analytics with Apache SparkSimplifying Big Data Analytics with Apache Spark
Simplifying Big Data Analytics with Apache SparkDatabricks
 
What’s New in the Upcoming Apache Spark 3.0
What’s New in the Upcoming Apache Spark 3.0What’s New in the Upcoming Apache Spark 3.0
What’s New in the Upcoming Apache Spark 3.0Databricks
 
Apache Spark Core – Practical Optimization
Apache Spark Core – Practical OptimizationApache Spark Core – Practical Optimization
Apache Spark Core – Practical OptimizationDatabricks
 
Introduction to Apache Calcite
Introduction to Apache CalciteIntroduction to Apache Calcite
Introduction to Apache CalciteJordan Halterman
 
Getting The Best Performance With PySpark
Getting The Best Performance With PySparkGetting The Best Performance With PySpark
Getting The Best Performance With PySparkSpark Summit
 
Apache Calcite: One planner fits all
Apache Calcite: One planner fits allApache Calcite: One planner fits all
Apache Calcite: One planner fits allJulian Hyde
 
Don’t optimize my queries, optimize my data!
Don’t optimize my queries, optimize my data!Don’t optimize my queries, optimize my data!
Don’t optimize my queries, optimize my data!Julian Hyde
 
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...StreamNative
 
Apache Spark GraphX & GraphFrame Synthetic ID Fraud Use Case
Apache Spark GraphX & GraphFrame Synthetic ID Fraud Use CaseApache Spark GraphX & GraphFrame Synthetic ID Fraud Use Case
Apache Spark GraphX & GraphFrame Synthetic ID Fraud Use CaseMo Patel
 
Intro to Delta Lake
Intro to Delta LakeIntro to Delta Lake
Intro to Delta LakeDatabricks
 
A Practical Enterprise Feature Store on Delta Lake
A Practical Enterprise Feature Store on Delta LakeA Practical Enterprise Feature Store on Delta Lake
A Practical Enterprise Feature Store on Delta LakeDatabricks
 
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...HostedbyConfluent
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data EngineeringC4Media
 
Introduction to Spark with Python
Introduction to Spark with PythonIntroduction to Spark with Python
Introduction to Spark with PythonGokhan Atil
 
Cost Efficiency Strategies for Managed Apache Spark Service
Cost Efficiency Strategies for Managed Apache Spark ServiceCost Efficiency Strategies for Managed Apache Spark Service
Cost Efficiency Strategies for Managed Apache Spark ServiceDatabricks
 
Large Scale Lakehouse Implementation Using Structured Streaming
Large Scale Lakehouse Implementation Using Structured StreamingLarge Scale Lakehouse Implementation Using Structured Streaming
Large Scale Lakehouse Implementation Using Structured StreamingDatabricks
 
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...Databricks
 

Was ist angesagt? (20)

NOSQLEU - Graph Databases and Neo4j
NOSQLEU - Graph Databases and Neo4jNOSQLEU - Graph Databases and Neo4j
NOSQLEU - Graph Databases and Neo4j
 
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017
 
How Kafka Powers the World's Most Popular Vector Database System with Charles...
How Kafka Powers the World's Most Popular Vector Database System with Charles...How Kafka Powers the World's Most Popular Vector Database System with Charles...
How Kafka Powers the World's Most Popular Vector Database System with Charles...
 
Simplifying Big Data Analytics with Apache Spark
Simplifying Big Data Analytics with Apache SparkSimplifying Big Data Analytics with Apache Spark
Simplifying Big Data Analytics with Apache Spark
 
What’s New in the Upcoming Apache Spark 3.0
What’s New in the Upcoming Apache Spark 3.0What’s New in the Upcoming Apache Spark 3.0
What’s New in the Upcoming Apache Spark 3.0
 
Apache Spark Core – Practical Optimization
Apache Spark Core – Practical OptimizationApache Spark Core – Practical Optimization
Apache Spark Core – Practical Optimization
 
Introduction to Apache Calcite
Introduction to Apache CalciteIntroduction to Apache Calcite
Introduction to Apache Calcite
 
Getting The Best Performance With PySpark
Getting The Best Performance With PySparkGetting The Best Performance With PySpark
Getting The Best Performance With PySpark
 
Apache Calcite: One planner fits all
Apache Calcite: One planner fits allApache Calcite: One planner fits all
Apache Calcite: One planner fits all
 
Don’t optimize my queries, optimize my data!
Don’t optimize my queries, optimize my data!Don’t optimize my queries, optimize my data!
Don’t optimize my queries, optimize my data!
 
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
 
Apache Spark GraphX & GraphFrame Synthetic ID Fraud Use Case
Apache Spark GraphX & GraphFrame Synthetic ID Fraud Use CaseApache Spark GraphX & GraphFrame Synthetic ID Fraud Use Case
Apache Spark GraphX & GraphFrame Synthetic ID Fraud Use Case
 
Intro to Delta Lake
Intro to Delta LakeIntro to Delta Lake
Intro to Delta Lake
 
A Practical Enterprise Feature Store on Delta Lake
A Practical Enterprise Feature Store on Delta LakeA Practical Enterprise Feature Store on Delta Lake
A Practical Enterprise Feature Store on Delta Lake
 
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data Engineering
 
Introduction to Spark with Python
Introduction to Spark with PythonIntroduction to Spark with Python
Introduction to Spark with Python
 
Cost Efficiency Strategies for Managed Apache Spark Service
Cost Efficiency Strategies for Managed Apache Spark ServiceCost Efficiency Strategies for Managed Apache Spark Service
Cost Efficiency Strategies for Managed Apache Spark Service
 
Large Scale Lakehouse Implementation Using Structured Streaming
Large Scale Lakehouse Implementation Using Structured StreamingLarge Scale Lakehouse Implementation Using Structured Streaming
Large Scale Lakehouse Implementation Using Structured Streaming
 
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
 

Raven: End-to-end Optimization of ML Prediction Queries

  • 1. Raven: End-to-end Optimization of ML Prediction Queries Konstantinos Karanasos, Kwanghyun Park Gray Systems Lab, Microsoft
  • 2. Enterprise-grade ML lifecycle. Diagram labels: Model Development / Training (Data Catalogs, Governance, Model Tracking & Provenance, Access Control, Logs & Telemetry, offline featurization, Model Training, Model); deployment; Model Inference (offline, online, Featurization, Model, Live Data, other data, App logic, Decisions); model optimization; policies; orchestration.
  • 3. Use Case: Length-of-stay in Hospital. Data Scientist: data exploration/preparation, model training, model deployment. Analyst/Developer: data selection/transformation, model scoring. Model: “Predict length of stay of a patient in the hospital”. Prediction query: “Find pregnant patients that are expected to stay in the hospital more than a week”.
  • 4. Prediction Queries: Baseline Approach. Diagram labels: App logic, WebServer (HTTP), DBMS (ODBC), Container with Featurization and Model (REST), policies. Enterprise Features: security (data and models outside of the DB); extra infrastructure; high TCO; lack of tooling/best practices. Performance: data movement; latency; throughput on batch scoring.
  • 5. Prediction Queries: In-Engine Evaluation. Diagram labels: App logic, WebServer (HTTP), DBMS (ODBC), policies. Enterprise Features: security (data and models within the DBMS); reuse of existing infrastructure; language/tools/best practices; low TCO. Performance? Up to 13x faster on Spark; up to 330x faster on SQL Server.
  • 6. Raven: An Optimizer for Prediction Queries in Azure Data. Embed high-performance ML inference runtimes within our data engines. Express data and ML operations in a common graph.
    M: model pipeline (Data Scientist): INSERT INTO model (name, model) AS ("duration_of_stay", "from sklearn.pipeline import Pipeline from sklearn.preprocessing import StandardScaler from sklearn.tree import DecisionTreeClassifier from … model_pipeline = Pipeline([('union', FeatureUnion(… ('scaler', StandardScaler()), …)) ('clf', DecisionTreeClassifier())])");
    Q: SQL query invoking model (Data Analyst): DECLARE @model varbinary(max) = ( SELECT model FROM scoring_models WHERE model_name = "duration_of_stay" ); WITH data AS( SELECT * FROM patient_info AS pi JOIN blood_tests AS be ON pi.id = be.id JOIN prenatal_tests AS pt ON be.id = pt.id ); SELECT d.id, p.length_of_stay FROM PREDICT(MODEL=@model, DATA=data AS d) WITH(length_of_stay Pred float) AS p WHERE d.pregnant = 1 AND p.length_of_stay > 7;
    MQ: inference query. Stage labels: data + models, Static Analysis, Unified IR, Cross Optimization, Optimized IR, Runtime Code gen.
    Unified IR for MQ (figure labels): patient_info, blood_tests, prenatal_tests; σ pregnant = 1; Categorical Encoding, Rescaling, FeatureExtractor, Concat; DecisionTreeClassifier with splits on pregnant (1 / 0), gender (F / M), age (<35 / >=35), bp, and leaves 2, 4, 7; σ length_of_stay >= 7.
    Optimized plan for MQ (figure labels): patient_info, blood_tests, prenatal_tests; σ pregnant = 1; σ age > 35; σ age <= 35; π; U; NeuralNet; SQL-inlined model (switch: case (bp>140): 7, case (120<bp<140): 4, case (bp<120): 2); σ bp > 140; σ length_of_stay >= 7.
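The "SQL-inlined model" label on slide 6 shows a tree fragment rewritten as a switch over bp. A minimal sketch of that idea follows, assuming a tiny hand-built tree rather than a trained scikit-learn model. The names Leaf, Split, predict, tree_to_sql and the patients table are hypothetical, the handling of the boundary values 120 and 140 is an assumption (the slide leaves it open), and SQLite stands in for the data engine only to show that the inlined expression matches the tree.

```python
import sqlite3
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    value: int  # predicted length of stay, in days


@dataclass
class Split:
    column: str
    threshold: float
    left: "Node"   # taken when column <= threshold (assumed boundary rule)
    right: "Node"  # taken when column > threshold


Node = Union[Leaf, Split]


def predict(node: Node, row: dict) -> int:
    """Evaluate the tree the usual way, outside the database."""
    while isinstance(node, Split):
        node = node.left if row[node.column] <= node.threshold else node.right
    return node.value


def tree_to_sql(node: Node) -> str:
    """Inline the tree as a nested SQL CASE expression."""
    if isinstance(node, Leaf):
        return f"{node.value:g}"
    return (
        f"CASE WHEN {node.column} <= {node.threshold:g} "
        f"THEN {tree_to_sql(node.left)} ELSE {tree_to_sql(node.right)} END"
    )


# Tree fragment from the slide: bp < 120 -> 2, 120 < bp < 140 -> 4, bp > 140 -> 7.
tree = Split("bp", 120, Leaf(2), Split("bp", 140, Leaf(4), Leaf(7)))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (id INTEGER, bp REAL)")
conn.executemany(
    "INSERT INTO patients VALUES (?, ?)", [(1, 110.0), (2, 130.0), (3, 150.0)]
)

# The model is now part of the query text, so the engine's own optimizer sees it.
sql = f"SELECT id, {tree_to_sql(tree)} AS length_of_stay FROM patients ORDER BY id"
rows = conn.execute(sql).fetchall()
print(sql)
print(rows)
```

Once the model is plain SQL, a predicate on the prediction can be folded into a predicate on the inputs, which is what the σ bp > 140 label in the optimized plan suggests.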
  • 7. Constructing the IR. Raven IR operators: relational algebra; linear algebra; other ML operators and data featurizers; UDFs. Static analysis of the prediction query: support for SQL+ML; adding support for Python. The figure repeats M, Q, Static Analysis, Cross Optimization and the Unified IR for MQ from slide 6.
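The static-analysis step on slide 7 can be illustrated with Python's standard ast module. This is a sketch under stated assumptions, not Raven's analyzer: MODEL_SCRIPT is a simplified, hypothetical version of the slide's pipeline (the FeatureUnion and the elided parts are left out), and pipeline_steps is a made-up helper that only recognizes a literal Pipeline([...]) call. The script is parsed, never executed, so scikit-learn does not need to be installed.

```python
import ast
from typing import List, Tuple

# Hypothetical, simplified model script; the slide's version elides several parts.
MODEL_SCRIPT = """
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

model_pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('clf', DecisionTreeClassifier()),
])
"""


def pipeline_steps(source: str) -> List[Tuple[str, str]]:
    """Return (step name, estimator class name) pairs found in Pipeline([...]) calls."""
    steps = []
    for node in ast.walk(ast.parse(source)):
        is_pipeline_call = (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "Pipeline"
        )
        if not is_pipeline_call or not node.args:
            continue
        step_list = node.args[0]
        if not isinstance(step_list, ast.List):
            continue
        for element in step_list.elts:
            # Each step is expected to look like ('name', Estimator(...)).
            if not (isinstance(element, ast.Tuple) and len(element.elts) == 2):
                continue
            name, estimator = element.elts
            if (
                isinstance(name, ast.Constant)
                and isinstance(estimator, ast.Call)
                and isinstance(estimator.func, ast.Name)
            ):
                steps.append((name.value, estimator.func.id))
    return steps


steps = pipeline_steps(MODEL_SCRIPT)
print(steps)
```

Each recovered step would then map to an IR operator, for example a rescaling node for the scaler and a tree node for the classifier.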
  • 8. ML Inference in Azure Data Engines. SQL Server: PREDICT statement in SQL Server; embedded ONNX Runtime in the engine; available in Azure SQL Edge and SQL DW (part of Azure Synapse Analytics). Spark: introduced a new PREDICT operator; similar syntax to SQL Server; support for different types of models. The figure repeats M and Q from slide 6.
  • 9. Raven: An Optimizer for Prediction Queries
    Q: "Find pregnant patients expected to stay in the hospital more than a week"
    M: model pipeline (Data Scientist)
        INSERT INTO model (name, model) AS
        ("duration_of_stay",
         "from sklearn.pipeline import Pipeline
          from sklearn.preprocessing import StandardScaler
          from sklearn.tree import DecisionTreeClassifier
          from …
          model_pipeline =
            Pipeline([('union', FeatureUnion(…
                        ('scaler', StandardScaler()), …))
                      ('clf', DecisionTreeClassifier())])");
    Q: SQL query invoking model (Data Analyst)
        DECLARE @model varbinary(max) = (
          SELECT model FROM scoring_models
          WHERE model_name = "duration_of_stay");
        WITH data AS (
          SELECT * FROM patient_info AS pi
          JOIN blood_tests AS be ON pi.id = be.id
          JOIN prenatal_tests AS pt ON be.id = pt.id);
        SELECT d.id, p.length_of_stay
        FROM PREDICT(MODEL=@model, DATA=data AS d)
        WITH(length_of_stay Pred float) AS p
        WHERE d.pregnant = 1 AND p.length_of_stay > 7;
    [Figure: Static Analysis turns M and Q into MQ (inference query) and a Unified IR for MQ: inputs patient_info, blood_tests, prenatal_tests; operators Categorical Encoding, Rescaling, Concat, FeatureExtractor, DecisionTreeClassifier; filter σ pregnant = 1. Cross Optimization produces the Optimized plan for MQ: σ pregnant = 1, σ age > 35 and σ age <= 35 branches, a NeuralNet, a SQL-inlined model (switch: case (bp>140): 7; case (120<bp<140): 4; case (bp<120): 2), σ bp>140, σ length_of_stay >= 7, projections and a union. Runtime Code gen follows.]
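The "SQL-inlined model" in the optimized plan can be illustrated with a small sketch. This is not Raven's code: the `Leaf`/`Split` classes, the `tree_to_sql` helper, the `<=` split convention, and the toy `data` table are assumptions made for illustration. Only the bp thresholds (120, 140) and the predicted values (2, 4, 7) come from the slide. The sketch renders a decision tree as a nested CASE expression and runs it in SQLite, so no ML runtime is needed at query time.

```python
import sqlite3
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    value: float  # predicted length of stay, in days


@dataclass
class Split:
    column: str       # feature column tested at this node
    threshold: float  # assumption: rows with column <= threshold go left
    left: "Node"
    right: "Node"


Node = Union[Leaf, Split]


def tree_to_sql(node: Node) -> str:
    """Render a decision tree as a nested SQL CASE expression."""
    if isinstance(node, Leaf):
        return str(node.value)
    return (
        f"CASE WHEN {node.column} <= {node.threshold} "
        f"THEN {tree_to_sql(node.left)} "
        f"ELSE {tree_to_sql(node.right)} END"
    )


# Hypothetical tree mirroring the bp thresholds shown on the slide.
tree = Split("bp", 120, Leaf(2), Split("bp", 140, Leaf(4), Leaf(7)))
inlined = tree_to_sql(tree)
query = f"SELECT id, {inlined} AS length_of_stay FROM data"

# Run the inlined model on a toy table to show it is plain SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER, bp REAL)")
conn.executemany("INSERT INTO data VALUES (?, ?)", [(1, 110), (2, 130), (3, 150)])
rows = conn.execute(query + " ORDER BY id").fetchall()
```

Once the model is an ordinary SQL expression, the relational optimizer can treat it like any other scalar computation, for example by pushing the length_of_stay filter down to a filter on bp, as the plan on the slide suggests.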
  • 10. Raven optimizations in practice
    [Figure: Unified IR for MQ, Cross Optimization, and Optimized plan for MQ with a SQL-inlined model]
    1. Predicate-based model pruning
    2. Model projection pushdown
    3. Model splitting
    4. Model-to-SQL translation
    5. NN translation
    6. Standard DB optimizations
    7. Compiler optimizations
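As a rough illustration of the first optimization in the list, predicate-based model pruning, the sketch below removes tree branches that can never be reached once the query guarantees `pregnant = 1`. The tree classes, the `prune` helper, the 0/1 encoding of `pregnant`, and the example tree are all hypothetical; the slides only name the optimization.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    value: float


@dataclass
class Split:
    column: str
    threshold: float  # assumption: rows with column <= threshold go left
    left: "Node"
    right: "Node"


Node = Union[Leaf, Split]


def prune(node: Node, column: str, value: float) -> Node:
    """Drop branches that are unreachable when `column` always equals `value`."""
    if isinstance(node, Leaf):
        return node
    if node.column == column:
        # The predicate fixes the outcome of this test, so keep one side only.
        taken = node.left if value <= node.threshold else node.right
        return prune(taken, column, value)
    return Split(
        node.column,
        node.threshold,
        prune(node.left, column, value),
        prune(node.right, column, value),
    )


def count_splits(node: Node) -> int:
    """Number of internal (test) nodes, a proxy for inference cost."""
    if isinstance(node, Leaf):
        return 0
    return 1 + count_splits(node.left) + count_splits(node.right)


# Hypothetical tree: the root tests `pregnant`, the subtrees test age and bp.
tree = Split(
    "pregnant",
    0.5,
    Split("age", 35, Leaf(1), Leaf(3)),
    Split("bp", 140, Leaf(4), Leaf(7)),
)

# The query's WHERE clause says pregnant = 1, so the left subtree is dead code.
pruned = prune(tree, "pregnant", 1)
```

In this toy case the pruned model has one test instead of three, and it no longer reads `age` at all, which in turn lets projection pushdown drop that column.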
  • 11. Raven optimizations: Key Ideas
    1. Avoid unnecessary computation
       • Information passing between model and data
    2. Pick the right runtime for each operation
       • Translation between data and ML operations
    3. Hardware acceleration
       • Translation to tensor computations (Hummingbird)
    [Figure: Unified IR for MQ, Cross Optimization, and Optimized plan for MQ with a SQL-inlined model]
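To make the third idea concrete, here is a minimal sketch of evaluating a decision tree with matrix operations only, in the style of the GEMM strategy described for Hummingbird. The matrices are hand-built for a hypothetical two-feature tree (age, bp), and plain Python lists stand in for tensors. A real deployment would use a tensor runtime, which is where the hardware acceleration comes from.

```python
# Hypothetical tree over features [age, bp]:
#   n0: age < 35 ? leaf 2 : n1
#   n1: bp < 140 ? leaf 4 : leaf 7

A = [[1, 0],          # features x internal nodes: which feature each node tests
     [0, 1]]
B = [35, 140]         # threshold of each internal node
C = [[1, -1, -1],     # internal nodes x leaves: +1 if the leaf is in the node's
     [0, 1, -1]]      # left subtree, -1 if in its right subtree, 0 otherwise
D = [1, 1, 0]         # per leaf: number of ancestors for which it lies on the left
E = [2, 4, 7]         # value stored in each leaf


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def predict(X):
    """Evaluate the tree for a batch of rows using only matrix-style operations."""
    XA = matmul(X, A)                                          # gather tested features
    T = [[1 if v < t else 0 for v, t in zip(row, B)] for row in XA]  # all node tests
    TC = matmul(T, C)                                          # combine test outcomes
    onehot = [[1 if v == d else 0 for v, d in zip(row, D)] for row in TC]  # pick leaf
    return [sum(h * e for h, e in zip(row, E)) for row in onehot]     # read leaf value


X = [[30, 150], [40, 130], [40, 150], [30, 130]]
predictions = predict(X)
```

Every node test is evaluated for every row, which is redundant work on a CPU but maps onto dense batched operations that accelerators handle well.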
  • 12. Performance Evaluation: Raven in Spark (HDI)
    [Chart: end-to-end inference query time, elapsed time in seconds (axis 0 to 5,000), for SparkML, Sklearn, ONNX runtime, and Raven; models DT-depth5, DT-depth8, LR-.001, GB-20est on Hospital (2 billion rows) and Expedia (500 million rows)]
    Queries:
        SELECT PREDICT(model, col1, …) FROM Hospital
        SELECT PREDICT(model, S.col1, …) FROM listings S, hotels R1, searches R2
        WHERE S.prop_id = R1.prop_id AND S.srch_id = R2.srch_id
    Best of Raven:
    • Decision Trees (DT) and Logistic Regressions (LR): Model Projection Pushdown + ML-to-SQL
    • Gradient Boost (GB): Model Projection Pushdown
    Raven outperforms other ML runtimes (SparkML, Sklearn, ONNX runtime) by up to ~44x
  • 13. Performance Evaluation: Raven in Spark with GPU
    [Chart: end-to-end inference query time, elapsed time in seconds, for Gradient Boost models (Hospital, 200M rows); configurations 60Est/Dep5, 100Est/Dep4, 100Est/Dep8, 500Est/Dep8; series ONNX runtime, Raven - CPU, Raven - GPU]
    Query:
        SELECT PREDICT(model, col1, …) FROM Hospital
    Raven + GPU outperforms ONNX runtime by up to ~8x for complex models
  • 14. Performance Evaluation: Raven Plans in SQL Server
    [Chart: end-to-end inference query time in seconds (log scale, 1 to 100,000); models DT-depth5, DT-depth8, LR-.001, GB/RF-20est on hospital (100M rows) and DT-depth5, DT-depth8, LR-.001, GB-20est on expedia (100M rows); series MADlib, SQL Server (DOP1), Raven (DOP1), SQL Server (DOP16), Raven (DOP16); annotated speedups ~230x and ~100x]
    Potential gains with Raven in SQL Server are large: up to ~230x
    Best of Raven:
    • Decision Trees (DT) and Logistic Regressions (LR): Model Projection Pushdown + ML-to-SQL
    • Gradient Boost (GB): Model Projection Pushdown
  • 15. Performance Evaluation: Raven in SQL Server with GPU
    [Chart: end-to-end time in seconds (axis 0 to 1400) for GB models on hospital (100M rows); configurations depth3-20est, depth5-60est, depth4-100est, depth8-100est, depth8-500est; series Min. CPU-SKL and GPU-HB; annotated speedups ~100x and ~2.6x]
    Potential gains with Raven and GPU acceleration are large
    Batch size:
    • CPU: minimum query time obtained with the optimal choice of batch size (50K/100K rows)
    • GPU: 600K rows
  • 16. Demo
  • 17. Conclusion: in-DBMS model inference
    • Raven is the first step in a long journey of incorporating ML inference as a foundational extension of relational algebra and an integral part of SQL query optimizers and runtimes
    • Novel Raven optimizer with cross optimizations and operator transformations
      • Up to 13x performance improvements on Spark
      • Up to 330x performance improvements on SQL Server
    • Integration of Raven within Spark and SQL Server
  • 20. Current state of affairs: In-application model inference
    Use case: hospital length-of-stay
    "Find pregnant patients that are expected to stay in the hospital more than a week"
    Security
    • Data leaves the DB
    • Model outside of the DB
    Performance
    • Data movement
    • Use of Python for data operations
  • 21. Raven: In-DBMS model inference
    Inference query: SQL + PREDICT (SQL Server 2017 syntax) to combine SQL operations with ML inference
    • M: model pipeline (Data Scientist)
    • Q: SQL query invoking model (Data Analyst)
    • MQ: inference query, produced from M and Q by static analysis
    [Figure: model and data both reside in the DBMS; Raven evaluates SQL + ML inside the DBMS]
  • 22. Raven: In-DB model inference
    Security
    • Data and models within the DB
    • Treat models as data
    User experience
    • Leverage maturity of RDBMS
    • Connectivity, tool integration
    Can in-DBMS ML inference match (or exceed?) the performance of state-of-the-art ML frameworks?
    Yes, by up to 230x!
  • 23. Cross-optimizations in practice
    [Figure: Unified IR for MQ, Cross Optimization, and Optimized plan for MQ with a SQL-inlined model]
    Cross-IR optimizations and operator transformations:
    • Predicate-based model pruning
    • Model projection pushdown
    • Model splitting
    • Model inlining
    • NN translation
    • Standard DB optimizations
    • Compiler optimizations
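Model projection pushdown, from the list above, can be sketched as follows: inspect which columns the model actually reads and project only those (plus whatever the query itself outputs or filters on) before inference. The nested-tuple tree encoding, the helper names, the extra column names (`glucose`, `heart_rate`), and the `joined_inputs` relation are hypothetical and chosen only for illustration.

```python
from typing import List, Set, Tuple, Union

# A tree is either a leaf value or a tuple (column, threshold, left, right).
Tree = Union[float, Tuple[str, float, "Tree", "Tree"]]


def used_columns(tree: Tree) -> Set[str]:
    """Collect every column that appears in some test of the tree."""
    if not isinstance(tree, tuple):
        return set()
    column, _threshold, left, right = tree
    return {column} | used_columns(left) | used_columns(right)


def pushdown_projection(table_columns: List[str], tree: Tree, keep: List[str]) -> List[str]:
    """Columns that still have to be read: model inputs plus query outputs/filters."""
    needed = used_columns(tree) | set(keep)
    return [c for c in table_columns if c in needed]


# Hypothetical wide input produced by joining the three tables.
columns = ["id", "age", "gender", "pregnant", "bp", "glucose", "heart_rate"]

# Hypothetical trained tree that only ever looks at age and bp.
tree = ("age", 35, 2.0, ("bp", 140, 4.0, 7.0))

# The query returns d.id and filters on d.pregnant, so those must survive too.
projected = pushdown_projection(columns, tree, keep=["id", "pregnant"])
data_cte = f"SELECT {', '.join(projected)} FROM joined_inputs"
```

Replacing `SELECT *` with the narrower projection reduces the data that has to be scanned, joined, and converted into model inputs.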
  • 24. Raven overview
    [Figure: M (model pipeline) and Q (SQL query invoking the model) pass through Static Analysis to form MQ and the Unified IR for MQ; Cross Optimization yields the Optimized plan for MQ with a SQL-inlined model; Runtime Code gen follows]
    Key ideas:
    1. Novel cross-optimizations between SQL and ML operations
    2. Combine high-performance ML inference engines with SQL Server
  • 25. Effect of cross optimizations
    [Chart: inference time in ms (log scale, 1 to 10^4) vs. dataset size (1K, 10K, 100K, 1M); series RF (scikit-learn), RF-NN (CPU), RF-NN (GPU); annotated speedups 24.5x, 15x, 5.3x]
  • 26. Execution modes
    • In-process: deep integration of ONNX Runtime in SQL Server
    • Out-of-process: for queries/models not supported by our static analyzer; sp_execute_external_script (Python, R, Java)
    • Containerized: for languages not supported by out-of-process execution
  • 27. In-process execution
    Native predict: execute the model in the same process as SQL Server
    • Rudimentary support since SQL Server 2017 (five hardcoded models)
    Take advantage of state-of-the-art ML inference engines
    • Compiler optimizations, code generation, hardware acceleration
    • SQL Server + ONNX Runtime
    Some challenges
    • Align schemata between DB and model
    • Transform data to/from tensors (avoid copying)
    • Cache inference sessions
    • Allow for different ML engines
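One of the challenges listed above, caching inference sessions, might look roughly like the sketch below. It is an assumption about the general shape of such a cache, not a description of what SQL Server does: sessions are keyed by a hash of the model bytes and evicted least-recently-used. The `SessionCache` class and the `fake_loader` stand-in are hypothetical. With ONNX Runtime, the loader could plausibly be `onnxruntime.InferenceSession`, which accepts serialized model bytes, but that dependency is left out so the example stays self-contained.

```python
import hashlib
from collections import OrderedDict
from typing import Any, Callable


class SessionCache:
    """LRU cache of inference sessions keyed by the SHA-256 of the model bytes."""

    def __init__(self, loader: Callable[[bytes], Any], capacity: int = 4) -> None:
        self._loader = loader            # builds a session from model bytes
        self._capacity = capacity        # maximum number of sessions kept
        self._sessions: "OrderedDict[str, Any]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, model_bytes: bytes) -> Any:
        key = hashlib.sha256(model_bytes).hexdigest()
        if key in self._sessions:
            self._sessions.move_to_end(key)      # mark as most recently used
            self.hits += 1
            return self._sessions[key]
        session = self._loader(model_bytes)      # expensive: parse and optimize
        self._sessions[key] = session
        if len(self._sessions) > self._capacity:
            self._sessions.popitem(last=False)   # evict least recently used
        self.misses += 1
        return session


loads = []


def fake_loader(model_bytes: bytes) -> dict:
    """Stand-in for a real session constructor; records each load."""
    loads.append(len(model_bytes))
    return {"size": len(model_bytes)}


cache = SessionCache(fake_loader, capacity=2)
first = cache.get(b"model-a")
second = cache.get(b"model-a")   # same bytes: served from the cache
```

Keying on a content hash means two queries that load the same stored model share one session, even if the bytes were fetched by different statements.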
  • 28. Current status
    In-process predictions
    • Implementation in SQL Server 2019
    • Public preview in Azure SQL DB Edge
    • Private preview in Azure SQL DW
    Out-of-process predictions
    • ONNX Runtime as an external language (ongoing)
  • 29. Benefits of deep integration
    [Chart: total inference time in ms (log scale, 1 to 10K) vs. dataset size (1K to 10M) for Random Forest and MLP models; series ORT, Raven, Raven ext.]