SlideShare ist ein Scribd-Unternehmen logo
1 von 26
Downloaden Sie, um offline zu lesen
OvertonOverton
Apple- avored MLApple- avored ML
2019/10/07 Nantes Machine Learning Meetup2019/10/07 Nantes Machine Learning Meetup
OverviewOverview
OverviewOverview
What / When / WhereWhat / When / Where
Publishing date
Early September
Conference
NeurIPS
Goal
ML software lifecycle management
Maturity
Production
OverviewOverview
Challenges to overcomeChallenges to overcome
Precise monitoring
Complex pipelines
Ef cient feedback loop
OverviewOverview
Architectural choicesArchitectural choices
Code-less deep learning
Multi-task learning
Weak supervision
Code-less deepCode-less deep
learninglearning
Code-less deep learningCode-less deep learning
PrinciplesPrinciples
Models & training code = experts
“black box” by engineers
Code-less deep learningCode-less deep learning
ModularityModularity
Code-less deep learningCode-less deep learning
Con gurationCon guration
Fine grained MLFine grained ML
& multi-tasks& multi-tasks
Fine grained ML & multi-tasksFine grained ML & multi-tasks
ProblemsProblems
Lots of subtasks (implicit & explicit)
Need to evaluate & monitor them
Need to improve on them
Fine grained ML & multi-tasksFine grained ML & multi-tasks
ApproachApproach
Slice-based learning
De nition of data subsets
Augmentation of model capacity
Dedicated metrics
Fine grained ML & multi-tasksFine grained ML & multi-tasks
Slice-based learningSlice-based learning
Fine grained ML & multi-tasksFine grained ML & multi-tasks
De nition of critical dataDe nition of critical data
subsetssubsets
With “slice functions”:
def sf_bike(x):
return "bike" in object_detector(x)
def sf_night(x):
return avg(X.pixels.intensity) < 0.3
Fine grained ML & multi-tasksFine grained ML & multi-tasks
Slice expertsSlice experts
For each slice, an expert:
should add capacity to the model
has to know when to trigger
has to have dedicated metrics
Fine grained ML & multi-tasksFine grained ML & multi-tasks
Hard partsHard parts
Noise : slices are de ned with heuristics
Scale 1 : when the number of slices goes up, does the
model still run fast enough?
Scale 2 : when the number of slices goes up, is the
model still good enough?
Fine grained ML & multi-tasksFine grained ML & multi-tasks
Proposed solutionProposed solution
ResourcesResources
Snorkel blog “Slice-based Learning”
NeurIPS paper “Slice-based Learning: A
Programming Model for Residual Learning in Critical
Data Slices”
WeakWeak
supervisionsupervision
Weak supervisionWeak supervision
Problem statementProblem statement
We need data, but:
It's expensive, not available, yadda yadda
Especially for interesting corner cases
It's noisy
Weak supervisionWeak supervision
SolutionSolution
Use heuristics
@labeling_function()
def lf_regex_check_out(x):
"""Spam comments say 'check out my video', 'check it out', et
return SPAM if re.search(r"check.*out", x.text, flags=re.I) e
Weak supervisionWeak supervision
SolutionSolution
Then correct the loss accounting for their correlation
Weak supervisionWeak supervision
ImplementationImplementation
Overton uses a modi ed version of the “Label Model”
from Snorkel.
ResourcesResources
Snorkel blog « Introducing the New Snorkel »
ICML paper “Learning Dependency Structures for
Weak Supervision Models”
ConclusionConclusion
Modularity (alike AllenNLP, tensor2tensor)
Dedicated metrics & multi-tasks
Improvements on critical data subsets
Arti cal data with theoretical corrections
Questions /Questions /
DiscussionDiscussion

Weitere ähnliche Inhalte

Ähnlich wie Overton, Apple Flavored ML

Yann le cun
Yann le cunYann le cun
Yann le cun
Yandex
 
Leveraging Open Source Automated Data Science Tools
Leveraging Open Source Automated Data Science ToolsLeveraging Open Source Automated Data Science Tools
Leveraging Open Source Automated Data Science Tools
Domino Data Lab
 

Ähnlich wie Overton, Apple Flavored ML (20)

Machine Learning In Production
Machine Learning In ProductionMachine Learning In Production
Machine Learning In Production
 
Why is dev ops for machine learning so different
Why is dev ops for machine learning so differentWhy is dev ops for machine learning so different
Why is dev ops for machine learning so different
 
Yann le cun
Yann le cunYann le cun
Yann le cun
 
Feature store: Solving anti-patterns in ML-systems
Feature store: Solving anti-patterns in ML-systemsFeature store: Solving anti-patterns in ML-systems
Feature store: Solving anti-patterns in ML-systems
 
Certification Study Group - NLP & Recommendation Systems on GCP Session 5
Certification Study Group - NLP & Recommendation Systems on GCP Session 5Certification Study Group - NLP & Recommendation Systems on GCP Session 5
Certification Study Group - NLP & Recommendation Systems on GCP Session 5
 
What's Missing in Language Workbenches
What's Missing in Language WorkbenchesWhat's Missing in Language Workbenches
What's Missing in Language Workbenches
 
Lessons learned from building practical deep learning systems
Lessons learned from building practical deep learning systemsLessons learned from building practical deep learning systems
Lessons learned from building practical deep learning systems
 
Best Artificial Intelligence Course | Online program | certification course
Best Artificial Intelligence Course | Online program | certification course Best Artificial Intelligence Course | Online program | certification course
Best Artificial Intelligence Course | Online program | certification course
 
Use Case Patterns for LLM Applications (1).pdf
Use Case Patterns for LLM Applications (1).pdfUse Case Patterns for LLM Applications (1).pdf
Use Case Patterns for LLM Applications (1).pdf
 
Stefan Geissler kairntech - SDC Nice Apr 2019
Stefan Geissler kairntech - SDC Nice Apr 2019 Stefan Geissler kairntech - SDC Nice Apr 2019
Stefan Geissler kairntech - SDC Nice Apr 2019
 
ITARC15 Workshop - Architecting a Large Software Project - Lessons Learned
ITARC15 Workshop - Architecting a Large Software Project - Lessons LearnedITARC15 Workshop - Architecting a Large Software Project - Lessons Learned
ITARC15 Workshop - Architecting a Large Software Project - Lessons Learned
 
Leveraging Open Source Automated Data Science Tools
Leveraging Open Source Automated Data Science ToolsLeveraging Open Source Automated Data Science Tools
Leveraging Open Source Automated Data Science Tools
 
“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...
“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...
“An Industry Standard Performance Benchmark Suite for Machine Learning,” a Pr...
 
Ibm aix wlm idea
Ibm aix wlm ideaIbm aix wlm idea
Ibm aix wlm idea
 
XP2018 presentation for Phoenix Scrum User Group 2018
XP2018 presentation for Phoenix Scrum User Group 2018XP2018 presentation for Phoenix Scrum User Group 2018
XP2018 presentation for Phoenix Scrum User Group 2018
 
EVAIN Artificial intelligence and semantic annotation: are you serious about it?
EVAIN Artificial intelligence and semantic annotation: are you serious about it?EVAIN Artificial intelligence and semantic annotation: are you serious about it?
EVAIN Artificial intelligence and semantic annotation: are you serious about it?
 
Building intelligent applications with Large Language Models
Building intelligent applications with Large Language ModelsBuilding intelligent applications with Large Language Models
Building intelligent applications with Large Language Models
 
FortranCalculus Class
FortranCalculus ClassFortranCalculus Class
FortranCalculus Class
 
Nautral Langauge Processing - Basics / Non Technical
Nautral Langauge Processing - Basics / Non Technical Nautral Langauge Processing - Basics / Non Technical
Nautral Langauge Processing - Basics / Non Technical
 
Building a MLOps Platform Around MLflow to Enable Model Productionalization i...
Building a MLOps Platform Around MLflow to Enable Model Productionalization i...Building a MLOps Platform Around MLflow to Enable Model Productionalization i...
Building a MLOps Platform Around MLflow to Enable Model Productionalization i...
 

Mehr von source{d}

Deduplication on large amounts of code
Deduplication on large amounts of codeDeduplication on large amounts of code
Deduplication on large amounts of code
source{d}
 

Mehr von source{d} (15)

Unlocking Engineering Observability with advanced IT analytics
Unlocking Engineering Observability with advanced IT analyticsUnlocking Engineering Observability with advanced IT analytics
Unlocking Engineering Observability with advanced IT analytics
 
What's new in the latest source{d} releases!
What's new in the latest source{d} releases!What's new in the latest source{d} releases!
What's new in the latest source{d} releases!
 
Code as Data workshop: Using source{d} Engine to extract insights from git re...
Code as Data workshop: Using source{d} Engine to extract insights from git re...Code as Data workshop: Using source{d} Engine to extract insights from git re...
Code as Data workshop: Using source{d} Engine to extract insights from git re...
 
Gitbase, SQL interface to Git repositories
Gitbase, SQL interface to Git repositoriesGitbase, SQL interface to Git repositories
Gitbase, SQL interface to Git repositories
 
Deduplication on large amounts of code
Deduplication on large amounts of codeDeduplication on large amounts of code
Deduplication on large amounts of code
 
Assisted code review with source{d} lookout
Assisted code review with source{d} lookoutAssisted code review with source{d} lookout
Assisted code review with source{d} lookout
 
Machine Learning on Code - SF meetup
Machine Learning on Code - SF meetupMachine Learning on Code - SF meetup
Machine Learning on Code - SF meetup
 
Inextricably linked reproducibility and productivity in data science and ai ...
Inextricably linked reproducibility and productivity in data science and ai  ...Inextricably linked reproducibility and productivity in data science and ai  ...
Inextricably linked reproducibility and productivity in data science and ai ...
 
source{d} Engine - your code as data
source{d} Engine - your code as datasource{d} Engine - your code as data
source{d} Engine - your code as data
 
Introduction to the source{d} Stack
Introduction to the source{d} Stack Introduction to the source{d} Stack
Introduction to the source{d} Stack
 
source{d} Engine: Exploring git repos with SQL
source{d} Engine: Exploring git repos with SQLsource{d} Engine: Exploring git repos with SQL
source{d} Engine: Exploring git repos with SQL
 
Introduction to source{d} Engine and source{d} Lookout
Introduction to source{d} Engine and source{d} Lookout Introduction to source{d} Engine and source{d} Lookout
Introduction to source{d} Engine and source{d} Lookout
 
Machine learning on Go Code
Machine learning on Go CodeMachine learning on Go Code
Machine learning on Go Code
 
Improving go-git performance
Improving go-git performanceImproving go-git performance
Improving go-git performance
 
Machine learning on source code
Machine learning on source codeMachine learning on source code
Machine learning on source code
 

Kürzlich hochgeladen

Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Kürzlich hochgeladen (20)

"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 

Overton, Apple Flavored ML