Illarion Khlestov "Developing robust ML training systems"

This document discusses developing robust training systems for machine learning models. It covers topics like data handling, validation, visualization, training, evaluation, debugging, and libraries to help with experiments. The goal is to create standardized, reproducible systems for developing and testing models. Sections provide guidance on storing data in a unified format, validating data quality, visualizing results, configuring training processes, evaluating performance, and open source tools to aid experimentation.

About me
Illarion Khlestov, Senior Research Engineer
GitHub: https://github.com/ikhlestov
Blog: https://medium.com/@illarionkhlestov
Facebook: https://www.facebook.com/i.khlestov

Format validation
- jsonschema
- trafaret
- validr
- voluptuous

Inbound data validation
- Allowed range of entries
- Annotation quality
- Data quantity
- Checksums
- Suitability
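
A minimal sketch of how three of the checks above (checksums, allowed range of entries, data quantity) could look in plain Python. The record layout (a `label` string and a normalized `bbox` list) and the function names are assumptions made for illustration; the slides do not prescribe a schema.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate_inbound(records, min_records, allowed_labels, value_range=(0.0, 1.0)):
    """Collect every problem found instead of stopping at the first one.

    `records` is a hypothetical list of dicts with "label" and "bbox" keys.
    """
    problems = []
    # Data quantity: refuse to train on a suspiciously small delivery.
    if len(records) < min_records:
        problems.append(
            f"data quantity: got {len(records)} records, expected at least {min_records}"
        )
    low, high = value_range
    for index, record in enumerate(records):
        # Allowed range of entries: labels must come from a known set.
        if record["label"] not in allowed_labels:
            problems.append(f"record {index}: unknown label {record['label']!r}")
        # Allowed range of entries: coordinates must stay inside the range.
        if not all(low <= value <= high for value in record["bbox"]):
            problems.append(f"record {index}: bbox outside [{low}, {high}]")
    return problems
```

Comparing `sha256_of(path)` against a digest shipped with the dataset catches truncated or corrupted transfers; annotation quality and suitability usually need human review or task-specific heuristics and are not covered here.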

Results saving
- id:
  - Logs
  - Weights
  - Graphs
  - Predictions
- id example:
  - 1541023813_mobile_net_trained_with_SGD_001
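
One way such an id could be generated, as a sketch. It assumes the example id is a Unix timestamp, a free-form description, and a zero-padded counter joined by underscores; that reading of the example, the function names, and the folder names are assumptions, not something the slides state.

```python
import time
from pathlib import Path


def make_run_id(description, index, timestamp=None):
    """Build an id like 1541023813_mobile_net_trained_with_SGD_001."""
    # Use the current Unix time unless the caller pins one (handy for tests).
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"{ts}_{description}_{index:03d}"


def create_run_dirs(root, run_id):
    """Create one folder per artifact type under a single run id."""
    run_dir = Path(root) / run_id
    for name in ("logs", "weights", "graphs", "predictions"):
        (run_dir / name).mkdir(parents=True, exist_ok=True)
    return run_dir
```

Keeping every artifact of a run under the same id means logs, weights, graphs, and predictions can always be traced back to one training run.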

Python configs import
https://docs.python.org/3/library/importlib.html
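
A minimal sketch of loading a Python file as a config with `importlib`. The convention that settings are upper-case module-level names, and the `load_config` helper itself, are assumptions for illustration.

```python
import importlib.util
from pathlib import Path


def load_config(path):
    """Execute a .py file as a module and return its upper-case names as a dict."""
    path = Path(path)
    spec = importlib.util.spec_from_file_location(path.stem, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load config from {path}")
    module = importlib.util.module_from_spec(spec)
    # Runs the config file; it is ordinary Python, so only load trusted files.
    spec.loader.exec_module(module)
    # Assumed convention: settings are upper-case, helpers and imports are not.
    return {name: value for name, value in vars(module).items() if name.isupper()}
```

For example, a file containing `LEARNING_RATE = 0.01` and `OPTIMIZER = "SGD"` would load as a dict with those two keys. Because the config is real Python, it can compute values or import shared defaults, which plain JSON or YAML cannot do.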

Network validation
- Test run:
  - Train
  - Validation
  - Test
- Results saving
- Size calculation
- Speed calculation
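
The size and speed calculations can be sketched without tying them to a framework. Here the model is represented only by its parameter shapes and a `predict` callable; both, along with the function names and the 4-bytes-per-parameter default (float32), are assumptions for illustration.

```python
import math
import time


def parameter_count(shapes):
    """Total number of parameters given an iterable of tensor shapes."""
    return sum(math.prod(shape) for shape in shapes)


def size_in_megabytes(shapes, bytes_per_parameter=4):
    """Approximate in-memory size, assuming float32 parameters by default."""
    return parameter_count(shapes) * bytes_per_parameter / (1024 ** 2)


def seconds_per_batch(predict, batch, repeats=10, warmup=2):
    """Average wall-clock time of `predict(batch)` after a few warm-up calls."""
    for _ in range(warmup):
        predict(batch)
    start = time.perf_counter()
    for _ in range(repeats):
        predict(batch)
    return (time.perf_counter() - start) / repeats
```

Running these on a tiny batch before a long training job gives an early warning when a config change makes the network much larger or slower than intended.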

Model evaluation table
- Dataset ID
- Link to dataset statistic
- Data preprocessing
- Train graphs and logs
- Performance
  - Accuracy
  - Speed
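
A sketch of keeping such a table as a CSV file with one row per evaluated model. The column names are derived from the bullets above, but the exact names, the CSV format, and the `append_evaluation_row` helper are assumptions.

```python
import csv
from pathlib import Path

# Hypothetical column names mirroring the bullets of the evaluation table.
FIELDS = [
    "dataset_id",
    "dataset_statistics_link",
    "data_preprocessing",
    "train_graphs_and_logs",
    "accuracy",
    "speed",
]


def append_evaluation_row(table_path, row):
    """Append one model's results, writing the header if the file is new."""
    table_path = Path(table_path)
    write_header = not table_path.exists()
    with table_path.open("a", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        # DictWriter raises ValueError on keys that are not in FIELDS,
        # which keeps ad-hoc columns from creeping into the table.
        writer.writerow(row)
```

Storing the dataset id and a link to the run's logs next to the metrics makes each number in the table traceable to the data and training run that produced it.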

Can you spot a frog-cat?
Generated by facets

Pre-production
- Additional metrics
- Don’t mix ids
- Test auto-check

Bonus: Libraries
- https://github.com/IDSIA/sacred - configure, organize, log and reproduce experiments
- https://github.com/henripal/labnotebook - flexibly monitor, record, save, and query experiments
- https://github.com/facebookresearch/visdom - visualizations of live, rich data
- https://github.com/uber/horovod - distributed training framework
- https://github.com/autonomio/talos - Hyperparameter Optimization for Keras Models
- https://github.com/hyperopt/hyperopt - Hyperparameter Optimization in Python

1. Developing robust training systems

2. About me Illarion Khlestov, Senior Research Engineer GitHub: https://github.com/ikhlestov Blog: https://medium.com/@illarionkhlestov Facebook: https://www.facebook.com/i.khlestov

3. Motivation

4. It’s not complicated

5. Structure 25-30 mins 10 mins

6. Data handling

7. Data variety issue

8. Solution - unified format

9. How to store data?

10. Format validation - jsonschema - trafaret - validr - voluptuous

11. Inbound data validation - Allowed range of entries - Annotation quality - Data quantity - Checksums - Suitability

12. Visualization

13. Plotly & Dash

14. Plotly & Dash

15. Plotly structuring

16. Moving between servers

17. Data downloaders

18. Auto synchronization

19. Data update & cleanup

20.

21. Training

22.

23. The simplest approach

24. Results saving - id: - Logs - Weights - Graphs - Predictions - id example: - 1541023813_mobile_net_trained_with_SGD_001

25. Configs

26. Python configs

27. Python configs import https://docs.python.org/3/library/importlib.html

28. Configs updates

29. Network validation - Test run: - Train - Validation - Test - Results saving - Size calculation - Speed calculation

30. Training Speedup Image source

31.

32. Evaluation

33. Summary table

34. Configs diffs

35. Model evaluation table - Dataset ID - Link to dataset statistic - Data preprocessing - Train graphs and logs - Performance - Accuracy - Speed

36. Can you spot a frog-cat? Generated by facets

37. Mislabeling Debugging

38. Pre-production - Additional metrics - Don’t mix ids - Test auto-check

39. Training result structure

40. Bonus: Libraries - https://github.com/IDSIA/sacred - configure, organize, log and reproduce experiments - https://github.com/henripal/labnotebook - flexibly monitor, record, save, and query experiments - https://github.com/facebookresearch/visdom - visualizations of live, rich data - https://github.com/uber/horovod - distributed training framework - https://github.com/autonomio/talos - Hyperparameter Optimization for Keras Models - https://github.com/hyperopt/hyperopt - Hyperparameter Optimization in Python

41.

42. http://bit.ly/training_systems

Editor's Notes

Welcome, everyone. My name is Larry. A few words about me: I work as a Senior Research Engineer, developing computer vision algorithms for video and image processing. Mostly I build cloud-based solutions. Sometimes I post my ideas or notes on my Medium blog, and there are a few interesting projects on my GitHub, so subscribe. It's also always cool to work with clever and talented people, so if you want to join the team, feel free to talk to me after the lecture.
But let's get back to the presentation. As the title says, we are going to talk about developing training systems. You may ask: why? From my point of view, the main motivation is that there are a lot of tutorials on how to build and train simple networks. Many sources try to give you a first impression of what machine learning is. Mostly they go like this: take some existing code, find pretrained weights, tune them a little, and you are ready for production. Nothing complicated there.
Unfortunately, there is not much information about how to build large systems that can be updated, reused, and retrained as new data arrives. Such systems should also be easy for many people to use if your team has more than one person. Today I'll try to show that such projects don't have to be very complicated, and that a few simple architecture decisions can speed up and improve your development process. And of course, you may not build such systems on your own, but at least you will know what you should be looking for.
During the lecture I'll cover three main chapters:
* How to store, handle and prepare data for training
* How to manage training itself
* How to analyze results after training
Additionally, at the end I'll discuss some libraries and packages that are ready to use as components. The lecture won't take very long, and I'll be happy to answer all questions at the end. I'll also provide a link to the slides at the end, so there is no need to photograph every link or library name. So let's start.
TODO: replace with icons?
