Building a machine learning model is an iterative process. A data scientist typically builds tens to hundreds of models before arriving at one that meets the acceptance criteria. However, model building today is ad hoc: there is no practical way for a data scientist to manage the models built over time, and no means to run complex queries on models and their related data.
In this talk, we present ModelDB, a novel end-to-end system for managing machine learning (ML) models. Using client libraries, ModelDB automatically tracks and versions ML models in their native environments (e.g., spark.ml, scikit-learn). A common set of abstractions enables ModelDB to capture models and pipelines built across different languages and environments. The structured representation of models and metadata then provides a platform for users to issue complex queries across various modeling artifacts. Our rich web frontend provides a way to query ModelDB at varying levels of granularity.
ModelDB has been open-sourced at https://github.com/mitdbg/modeldb.
ModelDB: A System to Manage Machine Learning Models: Spark Summit East talk by Manasi Vartak
1. ModelDB: A system to manage machine learning models
Manasi Vartak
PhD Student, MIT DB Group
2. People
Manasi Vartak, PhD student, MIT
Srinidhi Viswanathan, MEng, MIT
Samuel Madden, Faculty, MIT
Matei Zaharia, Faculty, Stanford
Harihar Subramanyam, MEng, MIT
Wei-En Lee, MEng student, MIT
3. Building a credit recommendation algorithm
(photo) | Profession | Credit History | Risk of Default
Barack Obama | Politician | Reasonable | 0.3
Lindsay Lohan | Struggling artist | Poor | 0.7
Warren Buffett | Investor | Has more money than our company | 0.0
… | … | … | …
8. df.withColumn("timesDelayed", udf1)
  .withColumn("percentPaid", udf2)
  .withColumn("creditUsed", udf3)
…
val lrGrid = new ParamGridBuilder()
  .addGrid(lr.elasticNetParam, Array(0.01, 0.1, 0.5, 0.7))
val scaler = new StandardScaler()
  .setInputCol("features")
…
val labelIndexer1 = new LabelIndexer()
val labelIndexer2 = new LabelIndexer()
…
Model 50
val udf1: (Int => Int) = (delayed..)
val udf2: (String, Int) = …
credit-default-clean.csv
14. Why is this a problem?
• No record of model history: “Did my colleague do that already?”
• Insights lost along the way: “How did normalization affect my ROC?”
• Difficult to reproduce results: “What params did I use?”
• Cannot search for or query models: “Where’s the LR model I tried last week with featureX?”
• Difficult to collaborate: “How does someone review your model?”
28. User quotes
“I should have had this in my self-driving cars class; it would have made things so much easier”
“…it can really help with reproducibility … and collaboration in multi-person teams…”
“I used it to track models for a research project; it was so simple”
35. ModelDB Architecture & Design Decisions
1. Support for diverse languages and environments
2. Minimal changes to existing workflows
3. Rich visual interface
4. Support for complex queries
36. “Oh, but why not git?”
• All code treated equal
• Some elements are special: data sources, parameters, metrics, models
• Difficult to tease that out
• No semantics, so can’t run interesting queries
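To make the contrast with git concrete, here is a minimal Python sketch of what "semantics" could look like: each modeling run is stored as a record with separate fields for the data source, features and parameters, so a question such as "which LR runs used featureX?" becomes a filter rather than a text search over code. The `ModelRecord` class, the `find` helper and the example runs are hypothetical illustrations, not ModelDB's schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class ModelRecord:
    """Hypothetical structured record for one modeling run (not ModelDB's schema)."""
    model_type: str
    data_source: str
    features: list
    params: dict = field(default_factory=dict)
    metrics: dict = field(default_factory=dict)  # left empty here; no results are invented


# Illustrative runs; names are borrowed from the slides, the combinations are made up.
records = [
    ModelRecord("LogisticRegression", "credit-default-clean.csv",
                ["timesDelayed", "featureX"], {"elasticNetParam": 0.1}),
    ModelRecord("LogisticRegression", "credit-default-clean.csv",
                ["timesDelayed", "percentPaid"], {"elasticNetParam": 0.5}),
    ModelRecord("GBDT", "credit-default-clean.csv",
                ["creditUsed", "featureX"]),
]


def find(records, model_type, feature):
    """Return the runs of a given model type that used a given feature."""
    return [r for r in records
            if r.model_type == model_type and feature in r.features]


lr_with_feature_x = find(records, "LogisticRegression", "featureX")
for record in lr_with_feature_x:
    print(record.model_type, record.params)
```

In a git history the same information would be spread across code diffs, with nothing marking which lines are parameters and which are ordinary code.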
49. ModelDB Features (currently available)
• Experiment tracking: Log models, params, pipelines etc. via ModelDB API
• Versioning: Every modeling run = version
• Reproducibility: All pipeline details, params logged
• Comparisons, queries, search: Model search, query, comparison via frontend
• Collaboration: Central repository of models; review models, annotate
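The tracking, versioning and comparison features listed above can be illustrated with a toy in-memory tracker. This is a sketch under stated assumptions: `RunTracker`, `log_run` and `diff_params` are hypothetical names invented for this example and are not the ModelDB client API. The sketch only shows the idea that every logged run gets its own version and that two versions can be compared by their parameters.

```python
import copy
import itertools


class RunTracker:
    """Toy in-memory tracker; a stand-in for illustration, not the ModelDB client."""

    def __init__(self):
        self._runs = {}
        self._ids = itertools.count(1)

    def log_run(self, pipeline, params, metrics=None):
        """Store one modeling run and return its version number."""
        version = next(self._ids)
        self._runs[version] = {
            "pipeline": list(pipeline),
            "params": copy.deepcopy(params),
            "metrics": dict(metrics or {}),
        }
        return version

    def get(self, version):
        """Return everything that was logged for a version."""
        return self._runs[version]

    def diff_params(self, a, b):
        """Return {param: (value_in_a, value_in_b)} for params that differ."""
        pa, pb = self._runs[a]["params"], self._runs[b]["params"]
        return {k: (pa.get(k), pb.get(k))
                for k in sorted(set(pa) | set(pb))
                if pa.get(k) != pb.get(k)}


tracker = RunTracker()
# Hypothetical runs: same pipeline stages, different elastic-net setting.
v1 = tracker.log_run(["StandardScaler", "LogisticRegression"],
                     {"elasticNetParam": 0.01, "maxIter": 100})
v2 = tracker.log_run(["StandardScaler", "LogisticRegression"],
                     {"elasticNetParam": 0.5, "maxIter": 100})
diff = tracker.diff_params(v1, v2)
print(diff)
```

Because every run is stored in full, "What params did I use?" becomes a lookup with `tracker.get(version)` instead of a search through old notebooks.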
58. ModelDB Features (ongoing)
• Unified Querying of Modeling Artifacts
Base data, intermediates, models, predictions, metadata
“How did the GBDTs do on married customers who are interested in gardening?”
Base Data: is_married=T
Intermediates: gardening=T
Predictions: accuracy(…)
Metadata: type=GBDT
Models: ids={..}
What query language?
How to persist data?
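As a rough illustration of why this query spans several artifacts, the sketch below answers the gardening question over tiny in-memory tables. Everything here is an assumption made for the example: the table layouts, the `defaulted` label, the customer rows and the `accuracy_by_model` helper are hypothetical, and the slide itself leaves the query language and storage format as open questions.

```python
# Base data: raw customer attributes and the true label (made-up rows).
base_data = {
    1: {"is_married": True, "defaulted": 0},
    2: {"is_married": True, "defaulted": 1},
    3: {"is_married": False, "defaulted": 0},
    4: {"is_married": True, "defaulted": 1},
}
# Intermediates: a derived feature produced somewhere inside the pipeline.
intermediates = {
    1: {"gardening": True},
    2: {"gardening": True},
    3: {"gardening": True},
    4: {"gardening": False},
}
# Metadata: what kind of model each id refers to.
model_metadata = {"M1": {"type": "GBDT"}, "M2": {"type": "LR"}}
# Predictions: model id -> customer id -> predicted label (made up).
predictions = {
    "M1": {1: 0, 2: 0, 3: 0, 4: 1},
    "M2": {1: 0, 2: 1, 3: 1, 4: 1},
}


def accuracy_by_model(model_type, row_filter, feature_filter):
    """Accuracy of every model of model_type on the filtered customers."""
    ids = [cid for cid, row in base_data.items()
           if row_filter(row) and feature_filter(intermediates[cid])]
    result = {}
    for model_id, meta in model_metadata.items():
        if meta["type"] != model_type:
            continue
        hits = sum(predictions[model_id][cid] == base_data[cid]["defaulted"]
                   for cid in ids)
        result[model_id] = hits / len(ids) if ids else None
    return result


gbdt_accuracy = accuracy_by_model(
    "GBDT",
    row_filter=lambda row: row["is_married"],
    feature_filter=lambda feats: feats["gardening"],
)
print(gbdt_accuracy)
```

The single question touches four stores (base data, intermediates, metadata, predictions), which is what makes a unified query interface attractive.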
65. ModelDB Features (ongoing)
• Mining data in ModelDB
Given model history, what should we try next?
Bayesian Modeling/AutoML
Model | Features | Params | Metric
M13 | X3,X9... | l1=0.3 | 0.63
M22 | X1,X4,X7 | l2=0.7 | 0.8
M34 | X11,X13 | l1=0.7 | 0.55
… | … | … | …
• Full model lifecycle management
Model performance degrades
Retrain model over time
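A very simple way to start mining such a history is to summarize it, for example by finding the best run and averaging the metric per regularization parameter. The sketch below uses the three rows from the slide's table. It is a naive summary, not Bayesian optimization or AutoML, and the `best_run` and `mean_metric_by_param` helpers are hypothetical names chosen for this example.

```python
# Model history, copied from the table on the slide (feature lists kept verbatim).
history = [
    {"model": "M13", "features": "X3,X9...", "params": {"l1": 0.3}, "metric": 0.63},
    {"model": "M22", "features": "X1,X4,X7", "params": {"l2": 0.7}, "metric": 0.8},
    {"model": "M34", "features": "X11,X13", "params": {"l1": 0.7}, "metric": 0.55},
]


def best_run(history):
    """Return the run with the highest metric."""
    return max(history, key=lambda run: run["metric"])


def mean_metric_by_param(history):
    """Average the metric over all runs that set a given parameter name."""
    grouped = {}
    for run in history:
        for name in run["params"]:
            grouped.setdefault(name, []).append(run["metric"])
    return {name: sum(values) / len(values) for name, values in grouped.items()}


best = best_run(history)
by_param = mean_metric_by_param(history)
print(best["model"], by_param)
```

With only three runs this says very little, but the same history table is the input a Bayesian or AutoML method would need in order to propose the next configuration.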