DevOps Days Rockies MLOps

DevOps & MLOps -
The Same But Different?
@mattreyuk

Agenda
● What is machine learning
● What’s the same as regular service development
● What’s different (and can go wrong)
● Building ML teams and their place in the company
● How we do ML at Ibotta
● The future

Background
● I'm Matt Reynolds, a principal platform engineer on the
machine learning team at Ibotta
● Ibotta is a rewarded shopping company with mobile, web
and white label platform components
● Based here in Denver but now fully "remote-friendly"
● We're hiring - https://home.ibotta.com/work-with-us/careers/

What Is ML?
Machine learning (ML): a program or system that builds
(trains) a predictive model from input data. The system uses
the learned model to make useful predictions from new
(never-before-seen) data drawn from the same distribution
as the one used to train the model.
https://developers.google.com/machine-learning/glossary#machine-learning
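
The definition above can be made concrete with a minimal sketch, assuming a made-up data-generating process (y = 3x + 2 plus noise) and a plain least-squares fit. The names `train`, `predict`, and `sample` are illustrative only, not from any particular library.

```python
import random

def train(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Use the learned parameters to predict y for an unseen x."""
    slope, intercept = model
    return slope * x + intercept

def sample(rng, n):
    # Hypothetical distribution: y = 3x + 2 plus small Gaussian noise
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [3 * x + 2 + rng.gauss(0, 0.5) for x in xs]
    return xs, ys

rng = random.Random(42)

# "Builds (trains) a predictive model from input data"
train_x, train_y = sample(rng, 200)
model = train(train_x, train_y)

# "New (never-before-seen) data drawn from the same distribution"
new_x, new_y = sample(rng, 50)
mean_abs_error = sum(
    abs(predict(model, x) - y) for x, y in zip(new_x, new_y)
) / len(new_x)
```

If the new data came from a different distribution than the training data, the error would likely grow, which is why the "same distribution" clause in the definition matters.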

Types Of ML
● “Analytical” ML
One-off, exploratory; findings used in reports to
management
● “Engineering” ML
Models deployed to production, called by services

What’s The Same For ML?
● Frameworks/libraries/tools
● Git/PRs for code
● CI/CD - process automation for repeatability
● Provide a service in production
● Service monitoring*

Some Companies Struggle…
55% of companies surveyed did not have a model in production
https://info.algorithmia.com/hubfs/2019/Whitepapers/The-State-of-Enterprise-ML-2020/Algorithmia_2020_State_of_Enterprise_ML.pdf
87% of data science projects don’t make it to production
https://venturebeat.com/2019/07/19/why-do-87-of-data-science-projects-never-make-it-into-production/

What’s Different?
● DATA

Data
[Figure: the AI hierarchy of needs]
https://medium.com/hackernoon/the-ai-hierarchy-of-needs-18f111fcc007

Exploratory Data Analysis (EDA)
New models require:
● Finding data sources that may be suitable
● Checking Data Quality, distribution
● Figuring out label generation
● Building initial Features
● Testing with algorithm(s)
● Validating results and tuning
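
As a rough illustration of the "checking data quality, distribution" and label steps above, here is a small standard-library sketch. The records, column names (`basket_total`, `items`, `redeemed`), and helper functions are all hypothetical; a real EDA pass would typically run in a notebook against much larger data.

```python
import statistics

# Hypothetical raw records; None marks a missing value
records = [
    {"basket_total": 23.5, "items": 4, "redeemed": 1},
    {"basket_total": None, "items": 2, "redeemed": 0},
    {"basket_total": 71.0, "items": 12, "redeemed": 1},
    {"basket_total": 15.25, "items": None, "redeemed": 0},
    {"basket_total": 40.0, "items": 6, "redeemed": 0},
]

def profile(rows, column):
    """Summarize missingness and basic distribution stats for one column."""
    values = [row[column] for row in rows]
    present = [v for v in values if v is not None]
    return {
        "missing_rate": 1 - len(present) / len(values),
        "min": min(present),
        "max": max(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
    }

def label_balance(rows, label):
    """Fraction of rows with a positive (1) label."""
    return sum(row[label] for row in rows) / len(rows)

report = {col: profile(records, col) for col in ("basket_total", "items")}
positive_rate = label_balance(records, "redeemed")
```

A high missing rate or a heavily skewed label balance is usually worth resolving before any features or algorithms are tried.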

What’s Different?
● DATA
● People

People
“Data Scientists” have different skill sets:
● Have their own jargon
● May not be used to writing “production ready” code
● May not be used to being on-call, production support
● Mostly work in Python

What’s Different?
● DATA
● People
● Different tools

Different Tools
As well as the tooling to run a “regular” service, you also need:
● Data pipeline
● Feature engineering
● Feature store
● Training & hyperparameter tuning infrastructure
● Maybe specialized inference hardware (GPU)
● Inference monitoring (data drift)
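
To make "inference monitoring (data drift)" more concrete, one common heuristic is the Population Stability Index (PSI), which compares the distribution of a feature at training time with what the model sees in production. The sketch below is an assumption about how such a check might look, not a description of any specific monitoring product; the bin count, sample data, and thresholds are illustrative.

```python
import math
import random

def psi(expected, actual, bins=10):
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant feature

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values
            counts[idx] += 1
        # A small floor avoids log(0) for empty bins
        return [max(c / len(values), 1e-6) for c in counts]

    e = fractions(expected)
    a = fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

rng = random.Random(7)
training = [rng.gauss(50, 10) for _ in range(2000)]      # feature at training time
live_ok = [rng.gauss(50, 10) for _ in range(2000)]       # same distribution
live_shifted = [rng.gauss(65, 10) for _ in range(2000)]  # drifted input

psi_ok = psi(training, live_ok)
psi_shifted = psi(training, live_shifted)
```

A frequently quoted rule of thumb treats PSI below about 0.1 as stable and above about 0.25 as significant drift, but the right alert threshold depends on the feature and the model.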

Jupyter Notebooks
https://jupyter.org/try-jupyter/retro/notebooks/?path=notebooks/Intro.ipynb

Jupyter Notebooks
https://jupyter.org/try-jupyter/retro/notebooks/?path=notebooks/Intro.ipynb
{
"cell_type": "code",
"source": "from matplotlib import pyplot as plt\nimport numpy as np\n\n# Generate 100 random data points along 3 dimensions\nx, y, scale = np.random.randn(3, 100)\nfig, ax = plt.subplots()\n\n# Map each onto a scatterplot we'll create with Matplotlib\nax.scatter(x=x, y=y, c=scale, s=np.abs(scale)*500)\nax.set(title=\"Some random data, created with JupyterLab!\")\nplt.show()",
"metadata": {
"trusted": true
},
"execution_count": 1,
"outputs": [
{
"output_type": "display_data",
"data": {
"image/png":
"iVBORw0KGgoAAAANSUhEUgAAAoAAAAHgCAYAAAA10dzkAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMywgaHR
0cHM6Ly9tYXRwbG90bGliLm9yZy/Il7ecAAAACXBIWXMAAA9hAAAPYQGoP6dpAADYYUlEQVR4nOzdd3wcxfn48c/sXlMvlmTJslzl3rE
dgwvYxlRTDHFoScD0JEBCKAlOAgECIaRQvkBov9BJAIPpmG5sgzu4V7nJRbJ6l67tzu+Pk2Sf1U7S3anNOy+9gvf2ZubqPjflGSGllCi
KoiiKoig9htbRDVAURVEURVHCSwWAiqIoiqIoPYwKABVFURR…

Ideal Team Composition
● Fighter (Software Engineering)
● Cleric (Data Engineering)
● Wizard (Data Science)
● Rogue (Ops/Infrastructure)

How Can You Help?
● Take some time to learn the lay of the land
● Look for pain points - local dev, process automation
● Make suggestions, listen to feedback
● Jump in and learn the ropes
● Work from the more “engineering” side toward the more “ML” side
● Teach what you do and learn what they do
● Encourage collaboration, standardization
● Explain why

ML In The Larger Organization
● Need to work with Data/Analytics & Engineering orgs
● Involve product
● Advocate for big picture concerns like:
  ● Data catalog, more metadata
  ● Data quality
  ● More (timely) data - events from engineering services

Our Process - Data & Training
● Airflow for job orchestration
● PySpark for data transformation
● SageMaker for managing training and hyperparameter tuning jobs
● Local dev with Docker for Airflow DAGs
● Jupyter notebooks for EDA and troubleshooting

Our Process - Inference
● SageMaker Endpoints using Docker images built on top of
AWS-supplied bases
● Postgres DB for storing real-time features
● All behind API Gateway for a consistent API
● Lambda for A/B testing, model aggregation
● Local dev with Docker for inference testing, along with Jupyter
notebooks; integration testing in the staging environment
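
The "Lambda for A/B testing" bullet can be illustrated with a hedged sketch of deterministic traffic splitting. Everything here is an assumption made for illustration: the variant names, the 90/10 split, the `user_id` field in the event, and the handler shape. It shows one common technique (hashing a stable ID into a bucket), not how Ibotta's Lambda is actually written.

```python
import hashlib

# Hypothetical variant names and traffic shares (must sum to 1.0)
VARIANTS = [("model-a", 0.9), ("model-b", 0.1)]

def assign_variant(user_id, variants=VARIANTS):
    """Map a user ID to a variant; the same ID always gets the same variant."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # uniform value in [0, 1)
    cumulative = 0.0
    for name, share in variants:
        cumulative += share
        if bucket < cumulative:
            return name
    return variants[-1][0]  # fallback for floating-point rounding

def handler(event, context=None):
    """Lambda-style entry point; the event shape is an assumption."""
    variant = assign_variant(event["user_id"])
    # A real handler would now invoke the model endpoint backing `variant`.
    return {"user_id": event["user_id"], "variant": variant}

assignments = [assign_variant(f"user-{i}") for i in range(10000)]
share_b = assignments.count("model-b") / len(assignments)
```

Hashing instead of calling a random number generator keeps each user on the same model across requests, which makes the comparison between variants easier to interpret.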

The Future
● How to scale
● Quality monitoring
● More real-time feature generation
● Serverless inference
