2. About Me
● Software Engineer @ mPharma
● Technical Reviewer, Data Science @ Packt
● Writer @ AnalyticsVidhya
3. Disclaimer
I’m not going to talk about:
● What machine learning is
● Applications of machine learning
● What algorithms to use
● What frameworks or libraries to use
4. What I will talk about
● The stuff that most tutorials don’t cover
○ Version Control
○ Testing
○ Performance Metrics
○ Reproducibility
○ Going to Production
○ Ethics
● Lessons learned from building and deploying models to production
● My 2 pesewas on how to get started
7. Version Control
Recording changes to certain components of the machine learning process so you can recall
specific versions later.
What to Version?
● The general idea is to try versioning anything that requires iteration and continuous
improvement
● Most important components to version:
○ Code
○ Data
○ Models
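One way to picture versioning code, data, and models together is a small manifest of content hashes, so that a trained model can always be traced back to the exact code and data that produced it. The sketch below is only an illustration: the function names and the in-memory byte strings are hypothetical stand-ins for real files, and dedicated tools exist for doing this at scale.

```python
import hashlib
import json


def fingerprint(payload: bytes) -> str:
    """Return a short content hash identifying one exact version of an artifact."""
    return hashlib.sha256(payload).hexdigest()[:12]


def build_manifest(code: bytes, data: bytes, model: bytes) -> dict:
    """Record which code, data, and model versions belong together."""
    return {
        "code": fingerprint(code),
        "data": fingerprint(data),
        "model": fingerprint(model),
    }


# Hypothetical artifacts standing in for files read from disk.
manifest_v1 = build_manifest(b"def train(): ...", b"age,label\n31,1\n", b"weights-v1")
manifest_v2 = build_manifest(b"def train(): ...", b"age,label\n31,1\n45,0\n", b"weights-v2")

print(json.dumps(manifest_v1, indent=2))
print(json.dumps(manifest_v2, indent=2))
```

Here the code hash is identical in both manifests while the data and model hashes differ, which shows at a glance that only the data changed between the two training runs.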
9. Testing
● Data cleaning, modelling, and deployment are all done with code, so treat them as code.
● Test your data if possible
An idea for the brave
● Continuous Integration for data
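As a rough illustration of testing data, the sketch below validates rows against a few rules and wraps the checks in test functions that a runner such as pytest could collect. The column names (`age`, `label`) and the allowed ranges are assumptions made up for this example, not rules from the talk.

```python
def validate_rows(rows):
    """Return a list of human-readable problems found in the rows."""
    problems = []
    for i, row in enumerate(rows):
        age = row.get("age")
        if age is None:
            problems.append(f"row {i}: missing age")
        elif not 0 <= age <= 120:
            problems.append(f"row {i}: age {age} out of range")
        if row.get("label") not in (0, 1):
            problems.append(f"row {i}: label must be 0 or 1")
    return problems


def test_clean_data_passes():
    rows = [{"age": 34, "label": 1}, {"age": 58, "label": 0}]
    assert validate_rows(rows) == []


def test_bad_data_is_reported():
    rows = [{"age": -3, "label": 1}, {"age": None, "label": 2}]
    assert validate_rows(rows) == [
        "row 0: age -3 out of range",
        "row 1: missing age",
        "row 1: label must be 0 or 1",
    ]


# Run the checks directly; a test runner would normally do this.
test_clean_data_passes()
test_bad_data_is_reported()
```

Running checks like these automatically on every new batch of data is one possible reading of continuous integration for data.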
10. Performance Metrics
It’s always good to have one number that tells you how good your model is.
But in some cases, you need to choose your evaluation metric with some amount of
domain expertise.
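A small sketch of why the choice of metric can need domain knowledge: on imbalanced data, a model that never predicts the positive class can still score high accuracy. The dataset below is hypothetical (5 positives out of 100), and both metrics are written out by hand to keep the example self-contained.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred):
    """Fraction of actual positives that the model found."""
    true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == 1 for t in y_true)
    return true_pos / actual_pos if actual_pos else 0.0


# Hypothetical screening data: 5 positive cases out of 100.
y_true = [1] * 5 + [0] * 95
always_negative = [0] * 100  # a "model" that never flags anyone

print(accuracy(y_true, always_negative))  # 0.95
print(recall(y_true, always_negative))    # 0.0
```

If missing a positive case is costly, as it might be in a medical setting, recall is likely the more honest single number here, and knowing that is the domain expertise.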
11. Reproducibility
The ability to replicate a data science experiment using the same data and code running in the
same environment, producing the same results.
“non-reproducible single occurrences are of no significance to science.” - Karl Popper
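A minimal sketch of one ingredient of reproducibility: fixing the random seed and recording it alongside the environment. The `run_experiment` function is a hypothetical stand-in for a real training run, and a seed alone does not cover everything (library versions and hardware can also matter).

```python
import platform
import random


def run_experiment(seed: int, n: int = 5) -> list:
    """Shuffle a toy dataset with a seeded generator and return a sample."""
    rng = random.Random(seed)  # local generator, no hidden global state
    data = list(range(100))
    rng.shuffle(data)
    return data[:n]


def describe_run(seed: int) -> dict:
    """Record what is needed to repeat the run later."""
    return {
        "seed": seed,
        "python": platform.python_version(),
        "sample": run_experiment(seed),
    }


first = describe_run(42)
second = describe_run(42)
print(first["sample"] == second["sample"])  # True: same seed, same result
```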
14. Going to Production
Production means getting your application used by its intended audience in a real-world
situation.
Requirements:
● Accessibility
● Performance
● Fault Tolerance
● Scalability
● Maintenance
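To make one of these requirements concrete, fault tolerance could be sketched as a wrapper that keeps the service answering when the model call fails. Everything here is hypothetical: `flaky_model` stands in for a real model, and returning a fixed default is only one of several possible fallback policies.

```python
import logging

logger = logging.getLogger("serving")


def predict_with_fallback(model_fn, features, default=0.0):
    """Call the model, returning a safe default if it raises."""
    try:
        return model_fn(features)
    except Exception:
        # Log the full traceback so the failure is visible to maintainers.
        logger.exception("model call failed; returning default")
        return default


def flaky_model(features):
    """Hypothetical model that rejects empty input."""
    if not features:
        raise ValueError("empty feature vector")
    return sum(features) / len(features)


print(predict_with_fallback(flaky_model, [1.0, 3.0]))  # 2.0
print(predict_with_fallback(flaky_model, []))          # 0.0
```

Logging the failure instead of silently swallowing it also helps with the maintenance requirement, since errors in production stay visible.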
19. Choosing ML libraries and Frameworks
● Focus on people over tools
● Think of stability in production
● If you’re still undecided, follow the crowd
20. Choosing Deep Learning Architectures
● A good place to start: Research papers
● General advice: first try to overfit, then add regularization to generalize
21. My 2 pesewas on how to get started in ML/DS
● Understand what data science is and how it can be used
● Learn the basics
○ Data Science from Scratch by Joel Grus (O’Reilly)
○ Doing Data Science by Cathy O’Neil and Rachel Schutt
● Work on projects
○ Kaggle
○ Zindi
● Read other people’s work
○ Papers with Code
○ Medium
○ arXiv
● Attend events like this and continue solving more problems
● Learn the rest as you go