Agile Experiments in Machine Learning
About me
• Mathias @brandewinder
• F# & Machine Learning
• Based in San Francisco
• I do have a tiny accent
Why this talk?
• Machine learning competition as a team
• Team work requires process
• Code, but “subtly different”
• Statically typed functional with F#
These are unfinished thoughts
Code on GitHub
• JamesSDixon/Kaggle.HomeDepot
• mathias-brandewinder/Presentations
Plan
• The problem
• Creating & iterating Models
• Pre-processing of Data
• Parting thoughts
Kaggle Home Depot
Team & Results
• Jamie Dixon (@jamie_Dixon), Taylor Wood (@squeekeeper), & alii
• Final ranking: 122nd/2125 (top 6%)
The question
Search: “6 inch damper”
Product: “Battic Door Energy Conservation Products Premium 6 in. Back Draft Damper”
Is this any good?
The data
"Simpson Strong-Tie 12-Gauge Angle","l bracket",2.5
"BEHR Premium Textured DeckOver 1-gal. #SC-141 Tugboat Wood and Concrete Coating","deck over",3
"Delta Vero 1-Handle Shower Only Faucet Trim Kit in Chrome (Valve Not Included)","rain shower head",2.33
"Toro Personal Pace Recycler 22 in. Variable Speed Self-Propelled Gas Lawn Mower with Briggs & Stratton Engine","honda mower",2
"Hampton Bay Caramel Simple Weave Bamboo Rollup Shade - 96 in. W x 72 in. L","hampton bay chestnut pull up shade",2.67
"InSinkErator SinkTop Switch Single Outlet for InSinkErator Disposers","disposer",2.67
"Sunjoy Calais 8 ft. x 5 ft. x 8 ft. Steel Tile Fabric Grill Gazebo","grill gazebo",3
...
The problem
• Given a Search, and the Product that was recommended,
• Predict how Relevant the recommendation is,
• Rated from terrible (1.0) to awesome (3.0).
The competition
• 70,000 training examples
• 20,000 search + product to predict
• Smallest RMSE* wins
• About 3 months
*RMSE ~ average distance between correct and predicted values
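More precisely, RMSE is the square root of the mean squared difference between actual and predicted values. A minimal sketch of the computation follows; the function name and the sample numbers are illustrative assumptions, not taken from the competition code.

```fsharp
// Root mean squared error over (actual, predicted) pairs.
let rmse (pairs: (float * float) seq) =
    pairs
    |> Seq.averageBy (fun (actual, predicted) -> (actual - predicted) ** 2.0)
    |> sqrt

// Hypothetical ratings, each paired with a constant prediction of 2.0.
let sample = [ 2.5, 2.0; 3.0, 2.0; 2.33, 2.0; 2.0, 2.0 ]
printfn "RMSE: %.3f" (rmse sample)
```

Squaring the errors penalizes a few large misses more than many small ones, which is worth keeping in mind when comparing models.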
Machine Learning: Experiments in Code
An obvious solution
// domain model
type Observation = {
    Search: string
    Product: string
    }
// prediction function
let predict (obs:Observation) = 2.0
So… Are we done?
Code, but…
• Domain is trivial
• No obvious tests to write
• Correctness is (mostly) unimportant
What are we trying to do here?
We will change the function predict,
over and over and over again,
trying to be creative, and
come up with a predict function that
fits the data better.
Observation
• Single feature
• Never complete, no binary test
• Many experiments
• Possibly in parallel
• No “correct” model - any model could work. If it performs better, it is better.
Experiments
We care about “something”
What we want
Observation → Model → Prediction
What we really mean
Observation (x1, x2, x3) → Model f(x1, x2, x3) → Prediction y
We formulate a model
What we have
[Table: a list of known (Observation, Result) pairs]
We calibrate the model
Prediction is very difficult,
especially if it’s about the
future.
We validate the model
… which becomes the
“current best truth”
Overall process
Formulate model
Calibrate model
Validate model
ML: experiments in code
Formulate model: features
Calibrate model: learn
Validate model
Modelling
• Transform Observation into Vector
• Ex: Search length, % matching words, …
• [17.0; 0.35; 3.5; …]
• Learn f, such that f(vector)~Relevance
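The transformation from Observation to vector can be sketched as applying a list of feature functions to one observation. The two features and the `featurize` helper below are hypothetical names chosen for illustration; the word-matching rule (lower-case, split on spaces) is an assumption.

```fsharp
type Observation = { Search: string; Product: string }
type Feature = Observation -> float

// Length of the search query, in characters.
let searchLength : Feature =
    fun obs -> float obs.Search.Length

// Share of the search words that also appear in the product title.
let matchingWordsRatio : Feature =
    fun obs ->
        let searchWords = obs.Search.ToLowerInvariant().Split(' ') |> set
        let productWords = obs.Product.ToLowerInvariant().Split(' ') |> set
        let shared = Set.intersect searchWords productWords |> Set.count
        float shared / float (Set.count searchWords)

// Apply every feature to one observation to obtain its vector.
let featurize (features: Feature []) (obs: Observation) : float [] =
    features |> Array.map (fun feature -> feature obs)

let example =
    { Search = "6 inch damper"
      Product = "Battic Door Energy Conservation Products Premium 6 in. Back Draft Damper" }

let vector = featurize [| searchLength; matchingWordsRatio |] example
printfn "%A" vector
```

Note that "inch" and "in." do not match here, which is one reason pre-processing of the text matters.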
Learning with Algorithms
Validating
• Leave some of the data out
• Learn on part of the data
• Evaluate performance on the rest
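A hold-out split can be sketched in a few lines. This is a minimal version under stated assumptions: the `holdout` name, the 75% share, and the fixed seed are arbitrary choices for illustration.

```fsharp
// Shuffle the examples, then cut them into a training part and a
// held-out validation part. Share and seed are arbitrary choices.
let holdout (trainingShare: float) (seed: int) (examples: 'a []) =
    let rng = System.Random(seed)
    let shuffled =
        examples
        |> Array.map (fun example -> rng.Next(), example)
        |> Array.sortBy fst
        |> Array.map snd
    let cut = int (trainingShare * float shuffled.Length)
    Array.splitAt cut shuffled

let training, validation = holdout 0.75 42 [| 1 .. 100 |]
printfn "training: %d, validation: %d" training.Length validation.Length
```

Fixing the seed keeps the split identical between runs, so two experiments are compared on the same validation data.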
Recap
• Traditional software: incrementally build solutions by completing discrete features,
• Machine Learning: create experiments, hoping to improve a predictor
• Traditional process likely inadequate
Practice
How the Sausage is Made
How does it look?
// load data
// extract features as vectors
// use some algorithm to learn
// check how good/bad the model does
An example
What are the problems?
• Hard to track features
• Hard to swap algorithm
• Repeat same steps
• Code doesn’t reflect what we are after
wasteful
/ˈweɪstfʊl, -f(ə)l/
adjective
1. (of a person, action, or process) using or expending something of value carelessly, extravagantly, or to no purpose.
To avoid waste,
build flexibility where
there is volatility,
and automate repeatable steps.
Strategy
• Use types to represent what we are doing
• Automate everything that doesn’t change: data loading, algorithm learning, evaluation
• Make what changes often (and is valuable) easy to change: creation of features
Core model
type Observation = {
    Search: string
    Product: string }
type Relevance = float
type Predictor = Observation -> Relevance
type Feature = Observation -> float
type Example = Relevance * Observation
type Model = Feature []
type Learning = Model -> Example [] -> Predictor
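To see how these types fit together, here is a sketch of the simplest possible `Learning` function: a baseline that ignores the features and always predicts the average relevance. The `averageLearner` name is hypothetical, and the product titles are shortened from the sample data.

```fsharp
type Observation = { Search: string; Product: string }
type Relevance = float
type Predictor = Observation -> Relevance
type Feature = Observation -> float
type Example = Relevance * Observation
type Model = Feature []
type Learning = Model -> Example [] -> Predictor

// Baseline learner: ignores the features and always predicts
// the average relevance seen in the training examples.
let averageLearner : Learning =
    fun _model examples ->
        let average = examples |> Array.averageBy fst
        fun _obs -> average

let examples : Example [] =
    [|
        (3.0, { Search = "deck over"; Product = "BEHR Premium Textured DeckOver" })
        (2.0, { Search = "honda mower"; Product = "Toro Personal Pace Recycler" })
    |]

let predictor = averageLearner [||] examples
let prediction = predictor { Search = "disposer"; Product = "InSinkErator SinkTop Switch" }
printfn "%.2f" prediction
```

Any algorithm wrapped to match the `Learning` signature can be swapped in without touching the rest of a script.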
“Catalog of Features”
let ``search length`` : Feature =
    fun obs -> obs.Search.Length |> float
let ``product title length`` : Feature =
    fun obs -> obs.Product.Length |> float
let ``matching words`` : Feature =
    fun obs ->
        let w1 = obs.Search.Split ' ' |> set
        let w2 = obs.Product.Split ' ' |> set
        Set.intersect w1 w2 |> Set.count |> float
Experiments
// shared/common data loading code
let model = [|
    ``search length``
    ``product title length``
    ``matching words``
    |]
let predictor = RandomForest.regression model training
let quality = evaluate predictor validation
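[Diagram: a shared / reusable catalog of features (Feature 1, Feature 2, Feature 3, …) and algorithms (Algorithm 1, Algorithm 2, Algorithm 3, …), together with shared Data and Validation. An Experiment/Model picks from the catalog, e.g. Feature 1 and Feature 3 with Algorithm 2.]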
Example, revisited
Food for thought
• Use types for modelling
• Model the process, not the entity
• Cross-validation replaces tests
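A minimal k-fold cross-validation can be sketched as follows. This is an illustration under stated assumptions, not the team's implementation: the `crossValidate` and `averageLearner` names are hypothetical, RMSE is used as the score, and examples are assumed to be already shuffled.

```fsharp
// k-fold cross-validation: each fold is held out once, a predictor is
// learnt on the remaining folds, and the k RMSE scores are averaged.
let crossValidate (k: int)
                  (learn: (float * 'obs) [] -> ('obs -> float))
                  (examples: (float * 'obs) []) =
    let folds = examples |> Array.splitInto k
    folds
    |> Array.mapi (fun i heldOut ->
        let training =
            folds
            |> Array.indexed
            |> Array.filter (fun (j, _) -> j <> i)
            |> Array.collect snd
        let predictor = learn training
        heldOut
        |> Array.averageBy (fun (actual, obs) -> (actual - predictor obs) ** 2.0)
        |> sqrt)
    |> Array.average

// Hypothetical learner: always predicts the training average.
let averageLearner (training: (float * string) []) =
    let average = training |> Array.averageBy fst
    fun (_: string) -> average

let examples = [| for i in 1 .. 10 -> (float (i % 3) + 1.0, sprintf "query %i" i) |]
let score = crossValidate 5 averageLearner examples
printfn "cross-validated RMSE: %.3f" score
```

The score plays the role a test suite would play elsewhere: a change to the features is kept if the number goes down.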
Domain modelling?
// Object oriented style
type Observation = {
    Search: string
    Product: string }
    with member this.SearchLength =
            this.Search.Length
// Properties as functions
type Observation = {
    Search: string
    Product: string }
let searchLength (obs:Observation) =
    obs.Search.Length
// "object" as a bag of functions
let model = [
    fun obs -> searchLength obs
    ]
Did it work?
Recap
• F# Types to model Domain with common “language” across scripts
• Separate code elements by role, to enable focusing on high value activity, the creation of features
The unbearable
heaviness of data
Reproducible research
• Anyone must be able to re-compute everything, from scratch
• Model is meaningless without the data
• Don’t tamper with the source data
• Script everything
Analogy: Source Control + Automated Build
If I check out code from source control,
it should work.
One simple main idea:
does the Search query look like the Product?
Dataset normalization
• “ductless air conditioners”, “GREE Ultra Efficient 18,000 BTU (1.5Ton) Ductless (Duct Free) Mini Split Air Conditioner with Inverter, Heat, Remote 208-230V”
• “6 inch damper”, “Battic Door Energy Conservation Products Premium 6 in. Back Draft Damper”
• “10000 btu windowair conditioner”, “GE 10,000 BTU 115-Volt Electronic Window Air Conditioner with Remote”
Pre-processing pipeline
let normalize (txt:string) =
    txt
    |> fixPunctuation
    |> fixThousands
    |> cleanUnits
    |> fixMisspellings
    // |> etc…
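The individual steps are not shown on the slides. As a sketch of what such steps might look like, here are two of them written as `string -> string` functions built on `Regex.Replace`. The regular expressions, and the extra `cleanWhitespace` step, are illustrative assumptions and not the team's actual rules.

```fsharp
open System.Text.RegularExpressions

// Each step is string -> string, so steps compose with |>.
// The rules below are illustrative assumptions.

// Drop thousands separators: "18,000" becomes "18000".
let fixThousands (txt: string) =
    Regex.Replace(txt, @"(?<=\d),(?=\d{3})", "")

// Use one spelling for inches: "6 in." and "6 inches" become "6 inch".
let cleanUnits (txt: string) =
    Regex.Replace(txt, @"(\d+)\s*(inches|inch|in\.)", "$1 inch")

// Hypothetical extra step: collapse whitespace and lower-case.
let cleanWhitespace (txt: string) =
    Regex.Replace(txt, @"\s+", " ").Trim().ToLowerInvariant()

let normalize (txt: string) =
    txt
    |> fixThousands
    |> cleanUnits
    |> cleanWhitespace

printfn "%s" (normalize "GE 10,000 BTU 115-Volt Electronic Window Air Conditioner")
printfn "%s" (normalize "Premium 6 in. Back Draft Damper")
```

Applying the same `normalize` to both the search and the product title is what makes “6 inch” and “6 in.” comparable.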
Lesson learnt
• Pre-processing data matters
• Pre-processing is slow
• Also, Regex. Plenty of Regex.
Tension
Keep data intact
& regenerate outputs
vs.
Cache intermediate results
There are only two hard problems
in computer science.
Cache invalidation, and
being willing to relocate to San Francisco.
Observations
• If re-computing everything is fast, then re-compute everything, every time.
• Can you isolate causes of change?
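When re-computing is not fast, one simple option is to cache results in memory, keyed by the input text. The sketch below is an assumption about how that could look; `memoize`, `slowNormalize`, and the call counter are hypothetical and only there to make the effect visible.

```fsharp
open System.Collections.Generic

// Wrap a slow function so each distinct input is computed only once.
let memoize (f: string -> string) =
    let cache = Dictionary<string, string>()
    fun input ->
        match cache.TryGetValue input with
        | true, cached -> cached
        | false, _ ->
            let result = f input
            cache.[input] <- result
            result

// Stand-in for an expensive pre-processing step; it counts its calls.
let calls = ref 0
let slowNormalize (txt: string) =
    calls.Value <- calls.Value + 1
    txt.Trim().ToLowerInvariant()

let cachedNormalize = memoize slowNormalize

let first = cachedNormalize "  Grill Gazebo "
let second = cachedNormalize "  Grill Gazebo "
printfn "%s / %s, computed %d time(s)" first second calls.Value
```

The cache is only valid for the function it wraps: when the pre-processing code changes, the cached results must be discarded, which is exactly the tension described above.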
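[Diagram: the shared / reusable catalog of features (Feature 1, Feature 2, Feature 3, …) and algorithms (Algorithm 1, Algorithm 2, Algorithm 3, …) with shared Data and Validation, and an Experiment/Model using Feature 1 and Feature 3 with Algorithm 2; now also showing Pre-Processing and Cache.]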
Conclusion
General
• Don’t be religious about process
• Why do you follow a process?
• Identify where you waste energy
• Build flexibility around volatility
• Automate the repeatable parts
Statically typed functional
• Super clean scripts / data pipelines
• Types help define clear domain models
• Types prevent dumb mistakes
Open questions
• Better way to version features?
• Experiment is not an entity?
• Is pre-processing a feature?
• Something missing in overall versioning
• Better understanding of data/code dependencies (reuse computation, …)
Shameless plug
I have a book out, “Machine Learning Projects for .NET Developers”, Apress
Thank you
@brandewinder / brandewinder.com
• Come chat if you are interested in the topic!
• Check out fsharp.org…

Agile experiments in Machine Learning with F#

  • 1. Agile Experiments in Machine Learning
  • 2. About me • Mathias @brandewinder • F# & Machine Learning • Based in San Francisco • I do have a tiny accent 
  • 3. Why this talk? • Machine learning competition as a team • Team work requires process • Code, but “subtly different” • Statically typed functional with F#
  • 5. Code on GitHub • JamesSDixon/Kaggle.HomeDepot • mathias-brandewinder/Presentations
  • 6. Plan • The problem • Creating & iterating Models • Pre-processing of Data • Parting thoughts
  • 8. Team & Results • Jamie Dixon (@jamie_Dixon), Taylor Wood (@squeekeeper), & alii • Final ranking: 122nd/2125 (top 6%)
  • 9. The question “6 inch damper” “Battic Door Energy Conservation Products Premium 6 in. Back Draft Damper” Is this any good? Search Product
  • 10. The data "Simpson Strong-Tie 12-Gauge Angle","l bracket",2.5 "BEHR Premium Textured DeckOver 1-gal. #SC-141 Tugboat Wood and Concrete Coating","deck over",3 "Delta Vero 1-Handle Shower Only Faucet Trim Kit in Chrome (Valve Not Included)","rain shower head",2.33 "Toro Personal Pace Recycler 22 in. Variable Speed Self-Propelled Gas Lawn Mower with Briggs & Stratton Engine","honda mower",2 "Hampton Bay Caramel Simple Weave Bamboo Rollup Shade - 96 in. W x 72 in. L","hampton bay chestnut pull up shade",2.67 "InSinkErator SinkTop Switch Single Outlet for InSinkErator Disposers","disposer",2.67 "Sunjoy Calais 8 ft. x 5 ft. x 8 ft. Steel Tile Fabric Grill Gazebo","grill gazebo",3 ...
  • 11. The problem • Given a Search, and the Product that was recommended, • Predict how Relevant the recommendation is, • Rated from terrible (1.0) to awesome (3.0).
  • 12. The competition • 70,000 training examples • 20,000 search + product to predict • Smallest RMSE* wins • About 3 months *RMSE ~ average distance between correct and predicted values
  • 14. An obvious solution // domain model type Observation = { Search: string Product: string } // prediction function let predict (obs:Observation) = 2.0
  • 15. So… Are we done?
  • 16. Code, but… • Domain is trivial • No obvious tests to write • Correctness is (mostly) unimportant What are we trying to do here?
  • 17. We will change the function predict, over and over and over again, trying to be creative, and come up with a predict function that fits the data better.
  • 18. Observation • Single feature • Never complete, no binary test • Many experiments • Possibly in parallel • No “correct” model - any model could work. If it performs better, it is better.
  • 20. We care about “something”
  • 21. What we want Observation Model Prediction
  • 22. What we really mean Observation Model Prediction x1, x2, x3 f(x1, x2, x3) y
  • 23. We formulate a model
  • 24. What we have Observation Result Observation Result Observation Result Observation Result Observation Result Observation Result
  • 25. We calibrate the model 0 10 20 30 40 50 60 0 2 4 6 8 10 12
  • 26. Prediction is very difficult, especially if it’s about the future.
  • 27. We validate the model … which becomes the “current best truth”
  • 29. ML: experiments in code Formulate model: features Calibrate model: learn Validate model
  • 30. Modelling • Transform Observation into Vector • Ex: Search length, % matching words, … • [17.0; 0.35; 3.5; …] • Learn f, such that f(vector)~Relevance
  • 32. Validating • Leave some of the data out • Learn on part of the data • Evaluate performance on the rest
  • 33. Recap • Traditional software: incrementally build solutions by completing discrete features, • Machine Learning: create experiments, hoping to improve a predictor • Traditional process likely inadequate
  • 35. How does it look? // load data // extract features as vectors // use some algorithm to learn // check how good/bad the model does
  • 37. What are the problems? • Hard to track features • Hard to swap algorithm • Repeat same steps • Code doesn’t reflect what we are after
  • 38. wasteful /ˈweɪstfʊl,-f(ə)l/ adjective 1. (of a person, action, or process) using or expending something of value carelessly, extravagantly, or to no purpose.
  • 39. To avoid waste, build flexibility where there is volatility, and automate repeatable steps.
  • 40. Strategy • Use types to represent what we are doing • Automate everything that doesn’t change: data loading, algorithm learning, evaluation • Make what changes often (and is valuable) easy to change: creation of features
  • 41. Core model
      type Observation = {
          Search: string
          Product: string }
      type Relevance = float
      type Predictor = Observation -> Relevance
      type Feature = Observation -> float
      type Example = Relevance * Observation
      type Model = Feature []
      type Learning = Model -> Example [] -> Predictor
  • 42. “Catalog of Features”
      let ``search length`` : Feature =
          fun obs -> obs.Search.Length |> float
      let ``product title length`` : Feature =
          fun obs -> obs.Product.Length |> float
      let ``matching words`` : Feature =
          fun obs ->
              let w1 = obs.Search.Split ' ' |> set
              let w2 = obs.Product.Split ' ' |> set
              Set.intersect w1 w2 |> Set.count |> float
  • 43. Experiments
      // shared/common data loading code
      let model = [|
          ``search length``
          ``product title length``
          ``matching words`` |]
      let predictor = RandomForest.regression model training
      let quality = evaluate predictor validation
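To see the pieces fit together without the shared infrastructure, here is a self-contained sketch of one experiment. Two parts are assumptions of ours: `nearestNeighbour` is a deliberately naive stand-in for the deck's `RandomForest.regression`, and this `evaluate` (RMSE over labelled examples) is a guess at what the deck's function of the same name computes. The two training examples come from the data slide.

```fsharp
type Observation = { Search: string; Product: string }
type Relevance = float
type Predictor = Observation -> Relevance
type Feature = Observation -> float
type Example = Relevance * Observation
type Model = Feature []
type Learning = Model -> Example [] -> Predictor

let ``search length`` : Feature = fun obs -> float obs.Search.Length

let ``matching words`` : Feature =
    fun obs ->
        let w1 = obs.Search.Split(' ') |> set
        let w2 = obs.Product.Split(' ') |> set
        Set.intersect w1 w2 |> Set.count |> float

// Stand-in learner (our assumption, not the deck's random forest):
// predict the relevance of the training example whose feature vector is closest.
let nearestNeighbour : Learning =
    fun model examples ->
        let featurize obs = model |> Array.map (fun f -> f obs)
        let known = examples |> Array.map (fun (rel, obs) -> featurize obs, rel)
        fun obs ->
            let v = featurize obs
            known
            |> Array.minBy (fun (v', _) ->
                Array.map2 (fun a b -> (a - b) ** 2.0) v v' |> Array.sum)
            |> snd

// RMSE of a predictor over labelled examples (hypothetical definition).
let evaluate (predictor: Predictor) (examples: Example []) =
    examples
    |> Array.averageBy (fun (rel, obs) -> (predictor obs - rel) ** 2.0)
    |> sqrt

// One experiment: pick features, learn, evaluate.
let training : Example [] =
    [| (3.0, { Search = "grill gazebo"
               Product = "Sunjoy Calais 8 ft. x 5 ft. x 8 ft. Steel Tile Fabric Grill Gazebo" })
       (2.0, { Search = "honda mower"
               Product = "Toro Personal Pace Recycler 22 in. Variable Speed Self-Propelled Gas Lawn Mower with Briggs & Stratton Engine" }) |]

let model : Model = [| ``search length``; ``matching words`` |]
let predictor = nearestNeighbour model training
let quality = evaluate predictor training
```

Because every learner has the type `Learning`, swapping the algorithm means changing one name, and changing the experiment means editing the `model` array only.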
  • 44. Feature 1 … Feature 2 Feature 3 Algorithm 1 Algorithm 2 Algorithm 3 … Feature 1 Feature 3 Algorithm 2 Data Validation Experiment/Model Shared / Reusable
  • 46. Food for thought • Use types for modelling • Model the process, not the entity • Cross-validation replaces tests
  • 47. Domain modelling?
      // Object oriented style
      type Observation = {
          Search: string
          Product: string }
          with member this.SearchLength = this.Search.Length
      // Properties as functions
      type Observation = {
          Search: string
          Product: string }
      let searchLength (obs:Observation) = obs.Search.Length
      // "object" as a bag of functions
      let model = [ fun obs -> searchLength obs ]
  • 49. Recap • F# Types to model Domain with common “language” across scripts • Separate code elements by role, to enable focusing on high value activity, the creation of features
  • 51. Reproducible research • Anyone must be able to re-compute everything, from scratch • Model is meaningless without the data • Don’t tamper with the source data • Script everything
  • 52. Analogy: Source Control + Automated Build If I check out code from source control, it should work.
  • 53. One simple main idea: does the Search query look like the Product?
  • 54. Dataset normalization • “ductless air conditioners”, “GREE Ultra Efficient 18,000 BTU (1.5Ton) Ductless (Duct Free) Mini Split Air Conditioner with Inverter, Heat, Remote 208-230V” • “6 inch damper”, “Battic Door Energy Conservation Products Premium 6 in. Back Draft Damper” • “10000 btu windowair conditioner”, “GE 10,000 BTU 115-Volt Electronic Window Air Conditioner with Remote”
  • 55. Pre-processing pipeline
      let normalize (txt:string) =
          txt
          |> fixPunctuation
          |> fixThousands
          |> cleanUnits
          |> fixMisspellings
          |> etc…
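The slide only names the steps, so the sketch below fills them in with guesses. Every function body, every regular expression and the one-entry misspelling table are hypothetical, as is the initial lowercasing; only the step names and their order come from the deck.

```fsharp
open System.Text.RegularExpressions

// Hypothetical: replace everything except letters, digits, commas, periods
// and spaces with a space, then collapse runs of whitespace.
// Commas and periods are kept here because later steps still need them.
let fixPunctuation (txt: string) =
    let cleaned = Regex.Replace(txt, @"[^a-z0-9,. ]+", " ")
    Regex.Replace(cleaned, @"\s+", " ").Trim()

// Hypothetical: "10,000" -> "10000" (drop commas used as thousands separators).
let fixThousands (txt: string) =
    Regex.Replace(txt, @"(?<=\d),(?=\d{3})", "")

// Hypothetical: "6 inch", "6 inches", "6 in." -> "6 in".
let cleanUnits (txt: string) =
    Regex.Replace(txt, @"(\d+)\s*(inches|inch|in\.?)(?![a-z])", "$1 in")

// Hypothetical: a hand-written correction table with a single entry.
let fixMisspellings (txt: string) =
    txt.Replace("windowair", "window air")

let normalize (txt: string) =
    txt.ToLowerInvariant()
    |> fixPunctuation
    |> fixThousands
    |> cleanUnits
    |> fixMisspellings

// Usage, on a search and a product title from the normalization slide.
let search = normalize "6 inch damper"
let product = normalize "Battic Door Energy Conservation Products Premium 6 in. Back Draft Damper"
```

After normalization both strings contain the tokens `6`, `in` and `damper`, so a word-matching feature has something to match on. Order matters: stripping all punctuation before `fixThousands` would turn "10,000" into two separate tokens.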
  • 56. Lesson learnt • Pre-processing data matters • Pre-processing is slow • Also, Regex. Plenty of Regex.
  • 57. Tension Keep data intact & regenerate outputs vs. Cache intermediate results
  • 58. There are only two hard problems in computer science. Cache invalidation, and being willing to relocate to San Francisco.
  • 59. Observations • If re-computing everything is fast – then re-compute everything, every time. • Can you isolate causes of change?
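The simplest form of caching is in-memory memoization, sketched below. This is an assumption about one possible approach, not what the team did; `memoize`, `slowNormalize` and `cachedNormalize` are hypothetical names. It side-steps invalidation because the cache disappears with the session; a cache persisted to disk would additionally need a key that changes whenever the pre-processing code changes.

```fsharp
open System.Collections.Generic

// Wrap a slow function so that each distinct input is computed once per session.
let memoize (f: 'a -> 'b) =
    let cache = Dictionary<'a, 'b>()
    fun x ->
        match cache.TryGetValue x with
        | true, cached -> cached
        | false, _ ->
            let result = f x
            cache.[x] <- result
            result

// Stand-in for an expensive pre-processing step; counts how often it really runs.
let calls = ref 0
let slowNormalize (txt: string) =
    calls.Value <- calls.Value + 1
    txt.ToLowerInvariant()

let cachedNormalize = memoize slowNormalize

let first = cachedNormalize "6 Inch Damper"
let second = cachedNormalize "6 Inch Damper" // served from the cache
```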
  • 60. Feature 1 … Feature 2 Feature 3 Algorithm 1 Algorithm 2 Algorithm 3 … Feature 1 Feature 3 Algorithm 2 Data Validation Experiment/Model Shared / Reusable Pre-Processing Cache
  • 62. General • Don’t be religious about process • Why do you follow a process? • Identify where you waste energy • Build flexibility around volatility • Automate the repeatable parts
  • 63. Statically typed functional • Super clean scripts / data pipelines • Types help define clear domain models • Types prevent dumb mistakes
  • 64. Open questions • Better way to version features? • Experiment is not an entity? • Is pre-processing a feature? • Something missing in overall versioning • Better understanding of data/code dependencies (reuse computation, …)
  • 65. Shameless plug I have a book out, “Machine Learning Projects for .NET Developers”, Apress
  • 66. Thank you • @brandewinder / brandewinder.com • Come chat if you are interested in the topic! • Check out fsharp.org…