Which Algorithms Really Matter?

©MapR Technologies 2013

Me, Us


Ted Dunning, Chief Application Architect, MapR
Committer and PMC member: Mahout, ZooKeeper, Drill
Bought the beer at...
Topic For Today


What is important? What is not?



Why?



What is the difference from academic research?



Some ex...
What is Important?


Deployable



Robust



Transparent



Skillset and mindset matched?



Proportionate

What is Important?


Deployable
– Clever prototypes don’t count if they can’t be standardized



Robust



Transparent...
What is Important?


Deployable
– Clever prototypes don’t count

Robust
– Mishandling is common

Transparent
– Wi...
What is Important?


Deployable
–



Robust
–



Will degradation be obvious?

Skillset and mindset matched?
–



Mish...
Academic Goals vs Pragmatics


Academic goals
– Reproducible
– Isolate theoretically important aspects
– Work on nov...
Example 1:
Making Recommendations Better

Recommendation Advances


What are the most important algorithmic advances in
recommendations over the last 10 years?


...
The Winner – None of the Above


What are the most important algorithmic advances in
recommendations over the last 10 yea...
The Real Issues


Exploration



Diversity



Speed



Not the last fraction of a percent

Result Dithering


Dithering is used to re-order recommendation results
– Re-ordering is done randomly

Dithering is ...
Simple Dithering Algorithm


Generate synthetic score from log rank plus Gaussian noise

s = log r + N(0, ε)


Pick noise scal...
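
The scoring rule above can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function name `dither` is hypothetical, and ε is assumed to be the standard deviation of the Gaussian noise (the slide does not say whether it is a standard deviation or a variance).

```python
import math
import random


def dither(results, epsilon, rng=None):
    """Re-order `results` (best first) by synthetic score s = log(rank) + noise.

    Assumption: `epsilon` is the standard deviation of the Gaussian noise.
    """
    rng = rng or random.Random()
    scored = []
    for rank, item in enumerate(results, start=1):
        score = math.log(rank) + rng.gauss(0.0, epsilon)
        # rank is kept as a tie-breaker so items themselves are never compared
        scored.append((score, rank, item))
    scored.sort()
    return [item for _, _, item in scored]


if __name__ == "__main__":
    ranks = list(range(1, 21))
    for _ in range(5):
        # Each call gives a different ordering; top results move least.
        print(dither(ranks, 0.5)[:8])
```

With epsilon set to 0 the original order is returned unchanged; larger values pull more items up from deeper in the list.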
Example … ε = 0.5
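[Table: sample dithered orderings, apparently one per row, with each entry an original rank shown in its new position. Only the first three positions are complete in this extract; the fourth column begins 5, 8, 2... and is cut off.]

1  2  6
1  2  3
1  4  3
1  2  4
1  6  2
1  2  3
1  2  3
2  1  3
4  1  2
2  1  5
3  1  5
2  1  3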
Example … ε = log 2 = 0.69
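[Table: sample dithered orderings, apparently one per row, with each entry an original rank shown in its new position. Only the first two positions are complete in this extract; the third column is cut off after eight rows.]

1   2   8
1   8   14
1   3   8
1   2   10
1   5   33
1   2   7
1   3   5
2   4   11
2   3   ...
3   4   ...
11  1   ...
1   8   ...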
Exploring The Second Page

Lesson 1:
Exploration is good

Example 2:
Bayesian Bandits

Bayesian Bandits


Based on Thompson sampling



Very general sequential test



Near optimal regret



Trade-off expl...
Thompson Sampling


Select each shell according to the probability that it is the best



Probability that it is the bes...
Thompson Sampling – Take 2


Sample θ

θ ~ P(θ | D)

Pick i to maximize reward

i = argmax_j E[r_j | θ]



Record resu...
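
The sample, pick, record loop above can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's code: rewards are assumed to be 0/1 (Bernoulli) with a uniform Beta(1, 1) prior per arm, so that E[r_j | θ] is simply the sampled θ_j. The class name `BetaBandit` and its methods are hypothetical.

```python
import random


class BetaBandit:
    """Thompson sampling for 0/1 rewards with a Beta posterior per arm."""

    def __init__(self, n_arms, seed=None):
        self.rng = random.Random(seed)
        # Beta(1, 1) prior: one pseudo-success and one pseudo-failure per arm.
        self.wins = [1.0] * n_arms
        self.losses = [1.0] * n_arms

    def choose(self):
        # Sample theta ~ P(theta | D), then pick the arm whose sample is largest.
        samples = [self.rng.betavariate(a, b)
                   for a, b in zip(self.wins, self.losses)]
        return max(range(len(samples)), key=samples.__getitem__)

    def record(self, arm, reward):
        # Update the posterior for the arm that was played.
        if reward:
            self.wins[arm] += 1.0
        else:
            self.losses[arm] += 1.0


if __name__ == "__main__":
    true_rates = [0.1, 0.5]          # hypothetical conversion rates
    env = random.Random(0)
    bandit = BetaBandit(len(true_rates), seed=1)
    pulls = [0] * len(true_rates)
    for _ in range(2000):
        arm = bandit.choose()
        pulls[arm] += 1
        bandit.record(arm, env.random() < true_rates[arm])
    print(pulls)  # most pulls should go to the better arm
```

Because each arm is chosen whenever its sampled value happens to be the largest, an arm is played in proportion to the posterior probability that it is the best one.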
Fast Convergence
[Plot: regret (vertical axis, ticks from 0.04 to 0.12) for ε-greedy with ε = 0.05 and for Bayesian Bandit with Gamma-...]
Thompson Sampling on Ads

An Empirical Evaluation of Thompson Sampling - Chapelle and Li, 2011
Bayesian Bandits versus Result Dithering


Many useful systems are difficult to frame in fully Bayesian form



Thompson...
Lesson 2:
Exploration is pretty easy to do and pays big benefits.

Example 3:
On-line Clustering

The Problem


K-means clustering is useful for feature extraction or compression



At scale and at high dimension, the ...
The Solution


Sketch-based algorithms produce a sketch of the data



Streaming k-means uses adaptive dp-means to produ...
An Example

The Cluster Proximity Features


Every point can be described by the nearest cluster
–
–



Or by the proximity to the 2...
Diagonalized Cluster Proximity

Lots of Clusters Are Fine

Typical k-means Failure

Selecting two seeds here cannot be fixed with Lloyd's algorithm
Result is that these two clusters get glued
...
Streaming k-means Ideas


By using a sketch with lots (k log N) of centroids, we avoid
pathological cases



We still ge...
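
The idea of keeping many weighted centroids as a sketch can be illustrated with a much simplified single-pass routine. This is a sketch under stated assumptions, not the algorithm from the slides or from any library: the function name `streaming_sketch`, the distance-proportional rule for starting a new centroid, and the `cutoff` and `growth` parameters are all hypothetical choices made for illustration.

```python
import math
import random


def streaming_sketch(points, max_centroids, cutoff=1.0, growth=1.5, seed=0):
    """Summarize `points` as at most `max_centroids` (centroid, weight) pairs."""
    if max_centroids < 1:
        raise ValueError("max_centroids must be at least 1")
    rng = random.Random(seed)
    centroids = []  # each entry is [vector, weight]

    def absorb(vec, weight):
        if not centroids:
            centroids.append([list(vec), weight])
            return
        nearest = min(centroids, key=lambda c: math.dist(c[0], vec))
        d = math.dist(nearest[0], vec)
        # Far-away points are likely to start a new centroid (assumed rule).
        if rng.random() < min(1.0, d / cutoff):
            centroids.append([list(vec), weight])
        else:
            # Otherwise fold the point into its nearest centroid (weighted mean).
            total = nearest[1] + weight
            nearest[0] = [(a * nearest[1] + b * weight) / total
                          for a, b in zip(nearest[0], vec)]
            nearest[1] = total

    for p in points:
        absorb(p, 1.0)
        # Too many centroids: loosen the cutoff and re-cluster the sketch itself.
        while len(centroids) > max_centroids:
            cutoff *= growth
            old = centroids[:]
            centroids.clear()
            rng.shuffle(old)
            for vec, w in old:
                absorb(vec, w)
    return [(tuple(v), w) for v, w in centroids]


if __name__ == "__main__":
    rng = random.Random(42)
    data = [(rng.gauss(cx, 0.1), rng.gauss(cy, 0.1))
            for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(100)]
    for centroid, weight in streaming_sketch(data, max_centroids=20):
        print(centroid, weight)
```

The weighted centroids returned here are small enough to be clustered again in memory with an ordinary k-means implementation.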
Lesson 3:
Sketches make big data small.

Example 4:
Search Abuse

Recommendations

Alice got an apple and a puppy

Charles got a bicycle
Recommendations

Alice got an apple and a puppy

Bob got an apple

Charles g...
Recommendations

Alice

Bob

?

What else would Bob like?

Charles

Log Files
Alice
Charles
Charles
Alice

Alice
Bob
Bob
History Matrix: Users by Items

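[Figure: binary history matrix with one row per user (Alice, Bob, Charles) and one column per item; check marks show which items each user interacted with. The positions of the check marks are not recoverable from this extract.]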
Co-occurrence Matrix: Items by Items
How do you tell which co-occurrences are useful?

[Figure: item-by-item co-occurrence count matrix. The values 1, 2, 1, 1, 2 survive in this extract, but their positions do not.]
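
Going from a users-by-items history to items-by-items co-occurrence counts can be sketched as below. This is a minimal illustration only: the function name `cooccurrence` is hypothetical, and the example history uses just the three facts stated on the earlier slides (Alice got an apple and a puppy, Bob got an apple, Charles got a bicycle), not the fuller log shown in the figures.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(history):
    """Count, for each unordered item pair, how many users have both items."""
    counts = Counter()
    for items in history.values():
        # sorted() gives each pair a canonical (a, b) order with a < b
        for a, b in combinations(sorted(set(items)), 2):
            counts[(a, b)] += 1
    return counts


if __name__ == "__main__":
    history = {
        "Alice": {"apple", "puppy"},
        "Bob": {"apple"},
        "Charles": {"bicycle"},
    }
    print(cooccurrence(history))
```

Raw counts like these do not answer the question of which co-occurrences are useful; that needs a further test for anomalous co-occurrence, which this sketch does not attempt.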
Co-occurrence Binary Matrix

[Figure: binary co-occurrence matrix with "not" row and column labels; only the entries 1, 1, 1 survive in this extract.]
Indicator Matrix: Anomalous Co-Occurrence
Result: The marked row will be added to the indicator
field in the item document...
Indicator Matrix
That one row from indicator matrix becomes the indicator field in the Solr
document used to deploy the re...
Internals of the Recommender Engine

Looking Inside LucidWorks
Real-time recommendation query and results: Evaluation

What to recommend if new user listened t...
Real-life example

Lesson 4:
Recursive search abuse pays
Search can implement recs
Which can implement search

Summary
