Quby is a leading company offering data-driven home services technology across European markets, known for creating the in-home display and smart thermostat Toon. In this talk, Erni takes you on a tour of how Quby uses the full Databricks stack to quickly prototype, validate, scale, and launch data science products. We explore the technical workflow of a data science project from end to end: starting with a notebook prototype and tracking machine learning model performance with MLflow, moving to production-grade Databricks jobs with a CI/CD pipeline, debugging production code with Databricks Connect, and finally setting up a monitoring system for the jobs.
Quby - Making Homes Efficient and Comfortable using AI, IoT data and the full Databricks stack - Erni Durdevic at GoDataFest 2019
1. Making Homes Efficient and Comfortable using AI, IoT data and the full Databricks stack
Erni Durdevic
Quby’s (customer) story
2. We believe that the future can be better. Easier, more comfortable, and more sustainable without compromising on the important things in life.
13. Our unified data analytics setup
[Architecture diagram. Labels: AWS S3; IoT stream; Click data; Batch processing; Exploratory Data Science Notebooks; Real time services; Live dashboards; Batch; Streaming; R & D; SQL; Services API; SQL / NoSQL]
14. Our unified data analytics setup
[Architecture diagram, batch path only. Labels: IoT stream; Batch processing; AWS S3; Batch; Services API; SQL / NoSQL]
22. Data pipeline – lessons learned
• Enforce idempotent constraints
• Enforce reproducibility
• Let data transformations be chainable
• Leverage partitioning and data locality
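The four lessons above can be illustrated with a minimal sketch. It is plain Python on a local directory, not the Spark or Databricks code from the talk; the function names, the `temp_decic` field, the CSV format and the `date=` partition layout are all hypothetical choices made for the illustration. Each transformation is a pure function so that steps can be chained, the output is sorted so that reruns are byte-for-byte reproducible, and a write replaces one whole date partition so that running the job twice leaves the same result as running it once.

```python
import csv
import shutil
import tempfile
from pathlib import Path


def to_celsius(rows):
    """Pure step: convert hypothetical deci-degree readings to degrees Celsius."""
    return [{**r, "temp_c": r["temp_decic"] / 10} for r in rows]


def drop_outliers(rows, low=-30.0, high=60.0):
    """Pure step: keep only readings inside a plausible (assumed) range."""
    return [r for r in rows if low <= r["temp_c"] <= high]


def chain(*steps):
    """Compose transformations so the output of one feeds the next."""
    def run(rows):
        for step in steps:
            rows = step(rows)
        return rows
    return run


def write_partition(base_dir, day, rows):
    """Idempotent write: replace the whole partition for `day`.

    Data is written to a temporary directory first and then swapped in,
    so a rerun overwrites the partition instead of appending duplicates.
    """
    final_dir = Path(base_dir) / f"date={day}"
    tmp_dir = Path(tempfile.mkdtemp(dir=base_dir))
    with open(tmp_dir / "part-0.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["device_id", "temp_c"])
        writer.writeheader()
        # Sort for a deterministic, reproducible file layout.
        for r in sorted(rows, key=lambda r: r["device_id"]):
            writer.writerow({"device_id": r["device_id"], "temp_c": r["temp_c"]})
    if final_dir.exists():
        shutil.rmtree(final_dir)
    tmp_dir.rename(final_dir)
    return final_dir
```

A job would build its pipeline once, for example `chain(to_celsius, drop_outliers)`, and call `write_partition` with the day being processed. Because each day lives in its own directory, a reprocessing run touches only the partitions it rewrites.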
23. Our unified data analytics setup
[Architecture diagram, batch and R & D paths. Labels: IoT stream; Click data; Batch processing; Exploratory Data Science Notebooks; Batch; R & D; AWS S3; SQL; Services API; SQL / NoSQL]
24. Our unified data analytics setup
[Architecture diagram, full setup. Labels: AWS S3; IoT stream; Click data; Batch processing; Exploratory Data Science Notebooks; Real time services; Live dashboards; Batch; Streaming; R & D; SQL; Services API; SQL / NoSQL]
25. Apache Hive vs Delta Lake
[Comparison diagram. Labels: Apache Hive; Delta Lake; Transaction log]
27. United Teams of Data
Use-case proposition
Software & Data
ML & Algorithms
Analytics Translators, Data Scientists, Machine Learning Engineers, Data Engineers
29. Why is it important?
We want to work in a healthy and balanced team, so we can build, deliver and maintain great products and services for our customers.
30. Why does it work?
“developing and deploying ML systems is relatively fast and cheap, but maintaining them over time is difficult and expensive.”