Video: http://videos.re-work.co/videos/464-agile-deep-learning
Deep Learning has been called the ‘new electricity’, transforming every industry. Innovative architectures and applications receive deserved attention. But turning innovation into value requires integrating deep learning into practical technology products. Such products, including Spotify's, are often developed following agile principles. This talk focuses on approaching deep learning in an agile way and on integrating deep learning into the agile cadence of a modern software development organization.
6. @dmurga
If a typical person can do a mental task with
less than one second of thought, we can
probably automate it using AI either now or in
the near future.
- Andrew Ng, HBR Nov 2016
Identifying a Problem: Perception
7. @dmurga
For any concrete, repeated event that we
observe, we can reasonably try to predict the
outcome of the next such event.
- Andrew Ng, NIPS 2016
Identifying a Problem: Prediction
8. @dmurga
If a desire for content is shared by many
individuals but should be met in ways specific
to each of them, we can probably automate
satisfying those desires with AI.
- (yours truly, today :-)
Identifying a Problem: Personalization
12. @dmurga
Need both:
Offline: for quick experimentation -- model
training and quality analysis
Online: for monitoring alignment with
business goals
Identifying Metrics
13. @dmurga
The art of picking out some part of the signal and trying to predict it.
Example ways to identify signal:
‣ Create it directly: humans annotate the right output.
‣ Usage of feature: predict historical usage from data prior to it.
‣ Usage of other features: identify other ways users satisfied need.
‣ Outside product: predict related data in the same domain.
Identifying Metrics: Offline
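One way to read the "usage of feature" bullet is as a time-based split: everything before a cutoff date is model input, and usage after the cutoff is the label to predict. The sketch below assumes a made-up event log; the names `events`, `split_by_time` and `build_examples` are hypothetical and not from the talk.

```python
from datetime import date

# Hypothetical usage log: (user_id, day, used_feature). All values are made up.
events = [
    ("u1", date(2017, 1, 3), True),
    ("u1", date(2017, 1, 20), False),
    ("u2", date(2017, 1, 5), False),
    ("u2", date(2017, 2, 2), True),
    ("u1", date(2017, 2, 10), True),
]


def split_by_time(events, cutoff):
    """Events before the cutoff are inputs; events on or after it supply labels."""
    history = [e for e in events if e[1] < cutoff]
    future = [e for e in events if e[1] >= cutoff]
    return history, future


def build_examples(events, cutoff):
    """Pair each user's pre-cutoff history with whether they used the feature later."""
    history, future = split_by_time(events, cutoff)
    users = sorted({user for user, _, _ in events})
    examples = []
    for user in users:
        past = [used for u, _, used in history if u == user]
        label = any(used for u, _, used in future if u == user)
        examples.append((user, past, label))
    return examples


examples = build_examples(events, cutoff=date(2017, 2, 1))
```

Keeping the cutoff strict (inputs only from before it) is what stops the label from leaking into the features.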
14. @dmurga
Usage metrics:
‣ How many users?
‣ How much do they use it?
‣ How often do they use it?
‣ How do they use it?
Explicit feedback metrics:
‣ Thumbs up or down, etc.
Identifying Metrics: Online
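The first three usage questions can be answered from a plain event log. The sketch below is only an illustration: the log layout (user, day, seconds used) and the function name `usage_metrics` are assumptions, and real products would usually compute these in their analytics pipeline.

```python
from collections import Counter
from datetime import date

# Hypothetical event log: (user_id, day, seconds_used). All values are made up.
log = [
    ("u1", date(2017, 3, 1), 120),
    ("u1", date(2017, 3, 2), 60),
    ("u2", date(2017, 3, 1), 30),
    ("u1", date(2017, 3, 2), 90),
]


def usage_metrics(log):
    """Summarize how many users there are, how much and how often they use the feature."""
    users = {user for user, _, _ in log}
    if not users:
        return {"users": 0, "seconds_per_user": 0.0, "active_days_per_user": 0.0}
    total_seconds = sum(seconds for _, _, seconds in log)
    # Count distinct active days per user, ignoring repeat events on the same day.
    active_days = Counter(user for user, _ in {(u, d) for u, d, _ in log})
    return {
        "users": len(users),                                   # how many users?
        "seconds_per_user": total_seconds / len(users),        # how much?
        "active_days_per_user": sum(active_days.values()) / len(users),  # how often?
    }


metrics = usage_metrics(log)
```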
16. @dmurga
Concrete data is the way deep learning products are
specified. Uses:
‣ Truth: start with manual generation, especially by product
manager.
‣ Fodder: not the ultimate output, but valuable for training
subcomponents
‣ Baseline: output of simplest solution you can think of,
worst case random.
Watch out for bias against under-represented subpopulations!
Identifying Data
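The baseline and bias points can be checked together: score the simplest possible predictor, then break its accuracy down by subpopulation. This is a minimal sketch under assumed data; the `truth` records, the group names and all three function names are hypothetical.

```python
import random
from collections import Counter, defaultdict

# Hypothetical hand-labeled truth: (item_id, subpopulation, label). Values are made up.
truth = [
    ("a", "group_x", "pos"),
    ("b", "group_x", "pos"),
    ("c", "group_x", "neg"),
    ("d", "group_y", "neg"),
    ("e", "group_x", "pos"),
]


def majority_baseline(truth):
    """Simplest baseline: always predict the most common label."""
    most_common_label = Counter(label for _, _, label in truth).most_common(1)[0][0]
    return {item: most_common_label for item, _, _ in truth}


def random_baseline(truth, seed=0):
    """Worst-case baseline: pick a label uniformly at random."""
    rng = random.Random(seed)
    labels = sorted({label for _, _, label in truth})
    return {item: rng.choice(labels) for item, _, _ in truth}


def accuracy_by_group(truth, predictions):
    """Accuracy per subpopulation, so a weak group is not hidden by the average."""
    correct, total = defaultdict(int), defaultdict(int)
    for item, group, label in truth:
        total[group] += 1
        correct[group] += predictions[item] == label
    return {group: correct[group] / total[group] for group in total}


by_group = accuracy_by_group(truth, majority_baseline(truth))
```

In this toy data the majority baseline looks acceptable on the large group and fails entirely on the small one, which is the pattern the warning above is about.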
18. @dmurga
Identifying Model(s)
[Figure: model families placed by quantity of data and structure of data (categorical, 1D discrete, 1D continuous, 2D+ continuous). Labels shown: rules, SVM, CRF, FF, CNN, RNN (BiLSTM), GAN, Deep RL, "...ever deeper with richer attention".]
19. @dmurga
Seed with pre-trained models from similar tasks.
Consider other properties of model’s output:
‣ interpretable
‣ confidence scores
‣ time / space performance
Identifying Model(s)
25. @dmurga
Analyze: Bugs v. Errors
‣ Error: incorrect output from a
model despite the model
being correctly implemented.
Egregious examples are
“howlers” or “WTFs”.
‣ Bug: implementation does
something other than what
was intended
This distinction is useful for
managing expectations about cost
of addressing.
Bug Error
27. @dmurga
Analyze: Isolate functional tests
Options:
Black-box style: ensure “can’t be
wrong” (“earmark”) input/output
pairs. Might lead to spurious test
failures.
Clear-box style: use a mock
implementation of the model that
produces expected answers.
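The two styles might look like this in a test suite. Everything here is a hypothetical sketch: `recommend` stands in for product code that wraps a model, the model is assumed to expose a `score` method returning item scores, and `failed_earmarks` is an invented helper name.

```python
from unittest.mock import Mock


def recommend(model, user_id, limit=2):
    """Hypothetical product code: rank the model's scored items, return the top few."""
    scores = model.score(user_id)  # assumed to return {item: score}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:limit]


def failed_earmarks(model, earmarks):
    """Black-box style: run the real model on "can't be wrong" pairs, report misses."""
    return [
        (user, expected)
        for user, expected in earmarks
        if recommend(model, user, limit=1) != [expected]
    ]


def test_recommend_with_mock_model():
    """Clear-box style: a mock model returns fixed scores, so only our code is tested."""
    model = Mock()
    model.score.return_value = {"a": 0.1, "b": 0.9, "c": 0.5}
    assert recommend(model, "u1") == ["b", "c"]
    model.score.assert_called_once_with("u1")
```

The clear-box test cannot fail because of retraining, while the earmark check can, which is the source of the spurious failures mentioned above.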
28. @dmurga
Analyze: Automate all tests
Deep Learning’s dependence on
data means changing anything
changes everything.
Look at aggregate results across
data sets to gauge importance.
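A small helper can make the aggregate view routine: compare per-data-set scores before and after a change and flag the sets that got worse. This is a sketch only; the function name `compare_runs`, the `tolerance` parameter, and the data set names and accuracies are all invented for illustration.

```python
def compare_runs(before, after, tolerance=0.01):
    """Compare {dataset_name: accuracy} from two runs of the automated tests."""
    deltas = {name: after[name] - before[name] for name in before if name in after}
    # A data set counts as regressed only if it drops by more than the tolerance.
    regressions = sorted(name for name, delta in deltas.items() if delta < -tolerance)
    mean_delta = sum(deltas.values()) / len(deltas) if deltas else 0.0
    return {"deltas": deltas, "regressions": regressions, "mean_delta": mean_delta}


# Hypothetical accuracies for three data sets.
before = {"news": 0.80, "chat": 0.70, "legal": 0.60}
after = {"news": 0.85, "chat": 0.66, "legal": 0.60}
report = compare_runs(before, after)
```

Here one data set regresses while the mean change is slightly positive, so the aggregate and the per-set view together decide whether the change matters.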
34. @dmurga
High evaluation error? (train/dev OK)
Get more development data similar to test data
so you can return to the
“High development error?” step.
36. @dmurga
Kinds of overrides:
‣ Always give this answer.
‣ Never give this answer.
Beware of ‘whack a mole’.
Be sad when overrides are used.
Productize: Overrides
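Both kinds of override can sit in a thin wrapper around the model, with a counter so their use stays visible. This is a minimal sketch, assuming the model is a callable that returns answers ranked best first; the class name `OverriddenModel` and its fields are hypothetical.

```python
class OverriddenModel:
    """Wraps a model with 'always' and 'never' overrides and counts how often they fire."""

    def __init__(self, model, always=None, never=None):
        self.model = model  # assumed callable: input -> answers ranked best first
        self.always = dict(always or {})  # input -> forced answer
        self.never = {k: set(v) for k, v in (never or {}).items()}  # input -> banned answers
        self.override_hits = 0  # monitor this; a rising count means whack-a-mole

    def predict(self, x):
        if x in self.always:
            self.override_hits += 1
            return self.always[x]
        ranked = self.model(x)
        banned = self.never.get(x, set())
        allowed = [answer for answer in ranked if answer not in banned]
        if len(allowed) != len(ranked):
            self.override_hits += 1
        return allowed[0] if allowed else None


wrapped = OverriddenModel(
    lambda x: ["a", "b"],  # stand-in for a trained model
    always={"q1": "z"},
    never={"q2": ["a"]},
)
answers = [wrapped.predict(q) for q in ("q1", "q2", "q3")]
```

Counting hits gives a concrete number to watch, so each override is treated as a model error still waiting for a real fix.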
37. @dmurga
Productize: How and when to scale
Start with data parallelism as the
data grows, then move to model
parallelism as the models become
more complex.
Only scale rest of product when
you’re sure what problem you’re
solving.
38. @dmurga
Productize: Milestones
1. By hand examples
2. Glued-together with some rules (Prototype)
3. Functions on some data (“Labs” / Alpha)
4. Measurable & inspectable (0.1% / early Beta)
5. Accurate, not slow, nice demo, documented & configurable (1% / late Beta)
6. Simple & fast (100% / GA)
7. Handle new domains (post-100% / post-GA)
43. Thanks! Questions?
David Murgatroyd (@dmurga)
Suggestions:
Pros and cons of different team organization strategies?
What are the different roles in a Deep Learning oriented group?
Does Scrum or Kanban work better for Deep Learning?
What about “presentation bias” for measuring on historic data?
We’re hiring in
Boston, NYC,
and Stockholm!
45. How does deep learning
affect team
organization?
Machine Learning Expert
46. Encourages alignment with
business goals.
Challenges machine learning
collaboration, depth and reuse.
Best for products with many
small, simpler models.
Option 1: integrated
teams with cross-team
groups (chapters!)
47. Encourages machine learning
collaboration, depth and reuse.
Challenges alignment with
business goals.
Best for products with fewer large,
complex model(s).
Option 2: independent
machine learning team
delivering models
48. Just one kind of ML Expert?
Machine Learning Expert
52.
An Applied Machine Learning Engineer:
● crafts specific (parts of) products
● by applying tools (e.g., libraries)
● to materials (e.g., data)
with an understanding of what sort of product is desired.
Carpenters
53.
A Machine Learning Toolist (Engineer/Scientist):
● implements practical machine learning ideas into
industrial-strength tools (like a blacksmith firing metal into
carpentry tools)
● understands the latest in ML theory and prototypes to see what’s
practical (like a blacksmith smelting ore into metal)
Blacksmiths
54.
A Machine Learning Theoretician:
● distills new material from nature to be made into tools
● understands the fundamental characteristics of that material to inform its use
Miners