Adapting Rankers Online

Maarten de Rijke
Joint work with Katja Hofmann and Shimon Whiteson
   Growing complexity of search engines
   Current methods for optimizing rankers mostly work offline
Online learning to rank

   No distinction between training and operating
   The search engine observes users’ natural interactions with the
    search interface, infers information from them, and improves its
    ranking function automatically
   Expensive data collection is not required; the collected data matches
    the target users and target setting
Users’ natural interactions with the search interface

   Behavior category refers to the purpose of the observed behavior;
   minimum scope refers to the smallest possible scope of items being
   acted upon.

   Behavior category | Segment                           | Object                                  | Class
   ------------------+-----------------------------------+-----------------------------------------+----------
   Examine           | View, Listen, Scroll, Find, Query | Select                                  | Browse
   Retain            | Print                             | Bookmark, Save, Delete, Purchase, Email | Subscribe
   Reference         | Copy-and-paste, Quote             | Forward, Reply, Link, Cite              |
   Annotate          | Mark up                           | Rate, Publish                           | Organize
   Create            | Type, Edit                        | Author                                  |

Oard and Kim, 2001
Kelly and Teevan, 2004
Users’ interactions

   Relevance feedback
     History goes back close to forty years
     Typically used for query expansion and user profiling
   Explicit feedback
     Users explicitly give feedback: keywords, selecting or marking
      documents, answering questions
     Natural explicit feedback can be difficult to obtain
     “Unnatural” explicit feedback through TREC assessors and
      crowdsourcing
Users’ interactions (2)

   Implicit feedback for learning, query expansion, and user profiling
     Observe users’ natural interactions with the system
     Reading time, saving, printing, bookmarking, selecting, clicking, …
     Thought to be less accurate than explicit measures
     Available in very large quantities at no cost
Learning to rank online

   Using online learning to rank approaches, retrieval systems can learn
    directly from implicit feedback, while they are running
     Algorithms need to explore new solutions to obtain feedback for
      effective learning, and to exploit what has been learned to
      produce results acceptable to users
     Interleaved comparison methods can use implicit feedback to detect
      small differences between rankers, and can be used to learn
      ranking functions online
Agenda

   Balancing exploration and exploitation
   Inferring preferences from clicks
Balancing Exploitation and Exploration
(recent work)

K. Hofmann et al. (2011). Balancing exploration and exploitation.
In: ECIR ’11.
Challenges

   Generalize over queries and documents
   Learn from implicit feedback that is …
     noisy
     relative
     rank-biased
   Keep users happy while learning
Learning document pair-wise preferences

   Insight: infer preferences from clicks

   [Figure: example result list for the query “Vienna”, with clicks
   indicating preferences between documents]

Joachims, T. (2002). Optimizing search engines using clickthrough data.
In KDD '02, pages 133-142.
Learning document pair-wise preferences (2)

   Input: feature vectors constructed from document pairs,
    (x(q, d_i), x(q, d_j)) ∈ R^n × R^n
   Output: y ∈ {−1, +1}, the correct / incorrect order
   Learning method: supervised learning, e.g., an SVM

Joachims, T. (2002). Optimizing search engines using clickthrough data.
In KDD '02, pages 133-142.
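
As an illustration of the pairwise setup, here is a minimal sketch (not
from the talk), assuming scikit-learn's LinearSVC and a toy stand-in for
the feature function x(q, d):

  import numpy as np
  from sklearn.svm import LinearSVC

  # Toy stand-in: a real x(q, d) would compute query-document features
  # such as BM25 or PageRank; here it is random but deterministic.
  def x(q, d):
      rng = np.random.default_rng(hash((q, d)) % 2**32)
      return rng.normal(size=5)

  # Click-inferred preferences: (query, preferred doc, less-preferred doc).
  prefs = [("vienna", "d2", "d1"), ("vienna", "d4", "d3")]

  # Pairwise transform: a preference d_i > d_j becomes a training example
  # with feature vector x(q, d_i) - x(q, d_j) and label +1 (plus the
  # mirrored pair with label -1), reducing ranking to classification.
  X, y = [], []
  for q, di, dj in prefs:
      diff = x(q, di) - x(q, dj)
      X.extend([diff, -diff])
      y.extend([+1, -1])

  model = LinearSVC().fit(np.array(X), np.array(y))
  # model.coef_ is a weight vector w; documents are ranked by w . x(q, d).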
Dueling bandit gradient descent

   Learns a ranking function, consisting of a weight vector for a linear
    weighted combination of feature vectors, from feedback about the
    relative quality of rankings
     Outcome: weights w for the ranking score S = w · x(q, d)
   Approach
     Maintain a current “best” ranking function
     On each incoming query:
        Generate a new candidate ranking function
        Compare it to the current “best”
        If the candidate is better, update the “best” ranking function

   [Figure: in weight space (x1, x2), a candidate weight vector sampled
   near the current best w]

Yue, Y. and Joachims, T. (2009). Interactively optimizing information
retrieval systems as a dueling bandits problem. In ICML '09.
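
A minimal sketch of the DBGD update step, following the description
above; the step sizes delta and alpha and the compare() stub are
assumptions:

  import numpy as np

  def dbgd_step(w, compare, delta=1.0, alpha=0.01):
      """One dueling bandit gradient descent step.

      w       -- current best weight vector
      compare -- compare(w, w_candidate) -> True if the candidate wins
                 the interleaved comparison on the incoming query
                 (stubbed here; in practice inferred from clicks)
      """
      # Sample a random direction u uniformly from the unit sphere.
      u = np.random.normal(size=w.shape)
      u /= np.linalg.norm(u)

      # Candidate ranker: a perturbation of the current best.
      w_candidate = w + delta * u

      # If the candidate wins the duel, take a small step toward it.
      if compare(w, w_candidate):
          w = w + alpha * u
      return w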
Exploration and exploitation

   Exploration: the need to learn effectively from rank-biased feedback
   Exploitation: the need to present high-quality results while learning

   Previous approaches are either purely exploratory or purely
   exploitative
Questions

   Can we improve online performance by balancing exploration and
    exploitation?
   How much exploration is needed for effective learning?
Problem formulation

   Reinforcement learning
     No explicit labels
     Learn from feedback from the environment in response to actions
      (document lists)
   Contextual bandit problem

   [Diagram: the retrieval system tries something and gets feedback from
   the environment (the user); concretely, it presents documents and
   observes clicks]
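
In code, this interaction loop might look like the following skeleton
(ranker, user, and their methods are hypothetical stubs):

  def online_learning_loop(ranker, user, queries):
      """Skeleton of the online learning-to-rank interaction loop."""
      for query in queries:
          docs = ranker.rank(query)           # action: present a document list
          clicks = user.examine(docs)         # feedback: observed clicks
          ranker.update(query, docs, clicks)  # learn from the feedback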
Our method

   Learning based on dueling bandit gradient descent
     Relative evaluations of the quality of two document lists
     Such comparisons are inferred from implicit feedback
   Balance exploration and exploitation with k-greedy comparison of
    document lists
k-greedy exploration

   To compare document lists, interleave them
   An exploration rate k influences the relative number of documents
    contributed by each list

   [Figure: an interleaved comparison with exploration rate k = 0.5;
   blue wins the comparison]
k-greedy exploration (2)

   [Figure: interleaved result lists for exploration rates k = 0.5 and
   k = 0.2]
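
A minimal sketch of k-greedy interleaving under one reading of the
slides: at each rank the exploratory list contributes the next document
with probability k, otherwise the exploitative list does:

  import random

  def k_greedy_interleave(exploit, explore, k=0.5, length=10):
      """Interleave two ranked lists; k is the exploration rate."""
      exploit, explore = list(exploit), list(explore)
      result = []
      while len(result) < length and (exploit or explore):
          prefer_explore = random.random() < k
          # Take from the preferred source if it still has documents.
          source = explore if (prefer_explore and explore) or not exploit else exploit
          doc = source.pop(0)
          if doc not in result:  # skip documents already placed
              result.append(doc)
      return result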
Evaluation

   Simulated interactions
   We need to
     observe clicks on arbitrary result lists
     measure online performance
   Simulate clicks and measure online performance
     Probabilistic click model: assume the dependent click model and
      define click and stop probabilities based on standard learning to
      rank data sets
     Measure the cumulative reward of the rankings displayed to the user
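
A small sketch of the click simulation under the dependent click model;
the probabilities shown are the “navigational” instantiation from a
later slide, and binary relevance is assumed:

  import random

  # P(click | relevant/non-relevant) and P(stop | relevant/non-relevant),
  # here the "navigational" instantiation.
  P_CLICK = {True: 0.95, False: 0.05}
  P_STOP  = {True: 0.90, False: 0.20}

  def simulate_clicks(result_list, relevance):
      """Scan the list top-down; click and possibly stop, per the model."""
      clicks = []
      for doc in result_list:
          rel = relevance.get(doc, False)
          if random.random() < P_CLICK[rel]:
              clicks.append(doc)
              if random.random() < P_STOP[rel]:
                  break
      return clicks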
Experiments

   Vary the exploration rate k
   Three click models
     “perfect”
     “navigational”
     “informational”
   Evaluate on nine data sets (LETOR 3.0 and 4.0)
“Perfect” click model

   Click model:

     P(c|R)   P(c|NR)   P(s|R)   P(s|NR)
      1.0       0.0      0.0       0.0

   Provides an upper bound

   [Figure: final performance over time (0 to 1000 queries) for data set
   NP2003 under the perfect click model]
“Perfect” online performance

              k = 0.5   k = 0.4   k = 0.3   k = 0.2   k = 0.1
   HP2003      119.91    125.71    129.99    130.55    128.50
   HP2004      109.21    111.57    118.54    119.86    116.46
   NP2003      108.74    113.61    117.44    120.46    119.06
   NP2004      112.33    119.34    124.47    126.20    123.70
   TD2003       82.00     84.24     88.20     89.36     86.20
   TD2004       85.67     90.23     91.00     91.71     88.98
   OHSUMED     128.12    130.40    131.16    133.37    131.93
   MQ2007       96.02     97.48     98.54    100.28     98.32
   MQ2008       90.97     92.99     94.03     95.59     95.14

   Best performance with only two exploratory documents (k = 0.2) in
   top-10 result lists

   (In the original slide, darker shades indicate higher performance and
   dark borders indicate significant improvements over the k = 0.5
   baseline.)
“Navigational” click model

   Click model:

     P(c|R)   P(c|NR)   P(s|R)   P(s|NR)
      0.95      0.05      0.9       0.2

   Simulates realistic but reliable interaction

   [Figure: final performance over time (0 to 1000 queries) for data set
   NP2003 under the navigational click model]
“Navigational” online performance

              k = 0.5   k = 0.4   k = 0.3   k = 0.2   k = 0.1
   HP2003      102.58    109.78    118.84    116.38    117.52
   HP2004       89.61     97.08     99.03    103.36    105.69
   NP2003       90.32    100.94    105.03    108.15    110.12
   NP2004       99.14    104.34    110.16    112.05    116.00
   TD2003       70.93     75.20     77.64     77.54     75.70
   TD2004       78.83     80.17     82.40     83.54     80.98
   OHSUMED     125.35    126.92    127.37    127.94    127.21
   MQ2007       95.50     94.99     95.70     96.02     94.94
   MQ2008       89.39     90.55     91.24     92.36     92.25

   Best performance with little exploration and lots of exploitation

   (Darker shades indicate higher performance; dark borders indicate
   significant improvements over the k = 0.5 baseline.)
“Informational” click model

   Click model:

     P(c|R)   P(c|NR)   P(s|R)   P(s|NR)
      0.9       0.4      0.5       0.1

   Simulates very noisy interaction

   [Figure: final performance over time (0 to 1000 queries) for data set
   NP2003 under the informational click model, for k = 0.5, k = 0.2,
   and k = 0.1]
“Informational” online performance

              k = 0.5   k = 0.4   k = 0.3   k = 0.2   k = 0.1
   HP2003       59.53     63.91     61.43     70.11     71.19
   HP2004       41.12     52.88     48.54     55.88     55.16
   NP2003       53.63     53.64     57.60     63.23     69.90
   NP2004       60.59     63.38     64.17     58.40     69.96
   TD2003       52.78     52.95     51.58     55.76     57.30
   TD2004       58.49     61.43     59.75     62.88     63.37
   OHSUMED     121.39    123.26    124.01    126.76    125.40
   MQ2007       91.57     92.00     91.66     90.79     90.19
   MQ2008       86.06     87.26     85.83     87.62     86.29

   Highest improvements with low exploration rates: interaction between
   noise and data set

   (Darker shades indicate higher performance; dark borders indicate
   significant improvements over the k = 0.5 baseline.)
Summary

   What?
     Developed the first method for balancing exploration and
      exploitation in online learning to rank
     Devised an experimental framework for simulating user interactions
      and measuring online performance
   And so?
     Balancing exploration and exploitation improves online performance
      for all click models and all data sets
     Best results are achieved with two exploratory documents per result
      list
What’s next here?

   Validate simulation assumptions
   Evaluate using click logs
   Develop new algorithms for online learning to rank for IR that can
    balance exploration and exploitation
Inferring Preferences from Clicks
(ongoing work)
Interleaved ranker comparison methods

   Use implicit feedback (“clicks”), not to infer absolute judgments,
    but to compare two rankers by observing clicks on an interleaved
    result list
     Interleave two ranked lists (the outputs of two rankers)
     Use click data to detect even very small differences between
      rankers
   Examine three existing methods for interleaving, identify issues with
    them, and propose a new one
Three methods (1)

   Balanced interleave method
     An interleaved list is generated for each query based on the two
      rankers
     The user’s clicks on the interleaved list are attributed to each
      ranker based on how it ranked the clicked documents
     The ranker that obtains more clicks is deemed superior

Joachims, Evaluating retrieval performance using clickthrough data.
In: Text Mining, 2003.
   1) Interleaving
      List l1: d1, d2, d3, d4
      List l2: d2, d3, d4, d1
      Two possible interleaved lists l: (d1, d2, d3, d4) and
      (d2, d1, d3, d4)

   2) Comparison
      First list, observed clicks on d2 and d4:
        k = min(4, 3) = 3; click counts c1 = 1, c2 = 2
      Second list, observed clicks on d1 and d4:
        k = min(4, 4) = 4; click counts c1 = 2, c2 = 2

      l2 wins the first comparison, and the lists tie for the second.
      In expectation, l2 wins.
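
A sketch of the balanced interleave comparison implied by the example:
k is the smallest cutoff at which one original list covers all clicks,
and clicks are then counted within each list's top k (this assumes the
clicked documents appear in both lists):

  def balanced_interleave_compare(l1, l2, clicks):
      """Attribute clicks on an interleaved list to the two source lists.

      l1, l2  -- the two original rankings (lists of doc ids)
      clicks  -- set of clicked doc ids observed on the interleaved list
      Returns -1 if l1 wins, +1 if l2 wins, 0 for a tie.
      """
      if not clicks:
          return 0
      # Lowest rank needed in each list to cover all clicked documents.
      depth = lambda l: max(l.index(d) + 1 for d in clicks)
      k = min(depth(l1), depth(l2))
      # Count clicks that fall within the top k of each original list.
      c1 = sum(1 for d in l1[:k] if d in clicks)
      c2 = sum(1 for d in l2[:k] if d in clicks)
      return 0 if c1 == c2 else (-1 if c1 > c2 else +1)

  # Slide example: clicks on d2 and d4 of the first interleaved list.
  print(balanced_interleave_compare(
      ["d1", "d2", "d3", "d4"], ["d2", "d3", "d4", "d1"],
      {"d2", "d4"}))  # +1: l2 wins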
Three methods (2)

   Team draft method
     Create an interleaved list following the model of “team captains”
      selecting their teams from a set of players
     For each pair of documents to be placed in the interleaved list, a
      coin flip determines which list gets to select a document first
     Record which list contributed which document

Radlinski et al., How does click-through data reflect retrieval quality?
2008.
   1) Interleaving
      List l1: d1, d2, d3, d4
      List l2: d2, d3, d4, d1

      Four possible interleaved lists l, with different assignments a
      (in parentheses: the list that contributed each document; x marks
      the observed click):

      a) d1 (1), d2 (2), x d3 (1), d4 (2)
      b) d2 (2), d1 (1), x d3 (1), d4 (2)
      c) d2 (2), d1 (1), x d3 (2), d4 (1)
      d) d1 (1), d2 (2), x d3 (2), d4 (1)

   2) Comparison
      For interleaved lists a) and b), l1 wins the comparison; l2 wins
      in the other two cases.
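
A sketch of team-draft interleaving and comparison as described above;
the per-round random pick order stands in for the coin flip:

  import random

  def team_draft_interleave(l1, l2):
      """Build an interleaved list, remembering which list contributed what."""
      interleaved, assignment = [], {}
      i = j = 0
      while i < len(l1) or j < len(l2):
          # Coin flip decides which "captain" picks first in this round.
          for team in random.sample([1, 2], 2):
              src, idx = (l1, i) if team == 1 else (l2, j)
              # Advance past documents already placed.
              while idx < len(src) and src[idx] in assignment:
                  idx += 1
              if idx < len(src):
                  doc = src[idx]
                  interleaved.append(doc)
                  assignment[doc] = team
              if team == 1:
                  i = idx
              else:
                  j = idx
      return interleaved, assignment

  def team_draft_compare(assignment, clicks):
      """+1 if l2's documents got more clicks, -1 if l1's did, 0 for a tie."""
      c1 = sum(1 for d in clicks if assignment.get(d) == 1)
      c2 = sum(1 for d in clicks if assignment.get(d) == 2)
      return 0 if c1 == c2 else (-1 if c1 > c2 else +1)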
Three methods (3)

   Document-constraint method
     Result lists are interleaved and clicks observed as for the
      balanced interleave method
     Constraints on pairs of individual documents are inferred from
      clicks and ranks
        For each pair of a clicked document and a higher-ranked
         non-clicked document, a constraint is inferred that requires
         the former to be ranked higher than the latter
        The original list that violates fewer constraints is deemed
         superior

He et al., Evaluation of methods for relative comparison of retrieval
systems based on clickthroughs, 2009.
   1) Interleaving
      List l1: d1, d2, d3, d4
      List l2: d2, d3, d4, d1
      Two possible interleaved lists l: (d1, d2, d3, d4) and
      (d2, d1, d3, d4)

   2) Comparison
      First list, observed clicks on d2 and d3:
        inferred constraints: d2 ≻ d1 (violated by l1),
        d3 ≻ d1 (violated by l1)
      Second list, observed clicks on d1 and d3:
        inferred constraints: d1 ≻ d2 (violated by l2),
        d3 ≻ d2 (violated by both l1 and l2)

      l2 wins the first comparison, and loses the second.
      In expectation, l2 wins.
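
A sketch of the document-constraint comparison, using the slide's first
example as a check:

  def infer_constraints(interleaved, clicks):
      """Each clicked doc is preferred over every higher-ranked non-clicked doc."""
      constraints = []
      for rank, doc in enumerate(interleaved):
          if doc in clicks:
              for above in interleaved[:rank]:
                  if above not in clicks:
                      constraints.append((doc, above))  # doc should outrank `above`
      return constraints

  def violations(ranking, constraints):
      """Count constraints (winner, loser) that `ranking` orders the wrong way."""
      pos = {d: i for i, d in enumerate(ranking)}
      return sum(1 for winner, loser in constraints if pos[winner] > pos[loser])

  # Slide example: interleaved list (d1, d2, d3, d4), clicks on d2 and d3.
  l1, l2 = ["d1", "d2", "d3", "d4"], ["d2", "d3", "d4", "d1"]
  cs = infer_constraints(l1, {"d2", "d3"})   # [(d2, d1), (d3, d1)]
  print(violations(l1, cs), violations(l2, cs))  # 2 0 -> l2 wins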
Assessing comparison methods

   Bias
     A method should not prefer either ranker when clicks are random
   Sensitivity
     The ability of a comparison method to detect differences in the
      quality of rankings

   Balanced interleave and document constraint are biased
   Team draft may suffer from insensitivity
A new proposal

   Briefly
     Based on team draft
     Instead of interleaving deterministically, model the interleaving
      process as random sampling from softmax functions that define
      probability distributions over documents
     Derive an estimator that is unbiased and sensitive to small
      ranking changes
     Marginalize over all possible assignments to make estimates more
      reliable
   1) Probabilistic interleaving
      Each list li is converted into a softmax function si that defines
      a probability distribution over documents, e.g.
        P(d at rank 1) = 0.85, P(d at rank 2) = 0.10,
        P(d at rank 3) = 0.03, P(d at rank 4) = 0.02
      For each rank of the interleaved list l, draw one of {s1, s2} and
      sample a document from it; all permutations of the documents in D
      are possible

   2) Probabilistic comparison
      Observe clicks on the interleaved list, then marginalize over all
      possible assignments a: all 16 assignments for a list of length
      four are enumerated, and each outcome o(c_i, a) is weighted by its
      probability P(a | l_i, q_i)
      In the example, P(c1 ≻ c2) = 0.108 and P(c1 ≺ c2) = 0.144, so s2
      (based on l2) wins the comparison; s1 and s2 tie in expectation

   For an incoming query
     The system generates an interleaved list
     Observe clicks
     Compute the probability of each possible outcome
        All possible assignments are generated and the probability of
         each is computed
        Expensive; only needed for ranks down to the lowest observed
         click
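
A sketch of the probabilistic interleaving step; the softmax shape
(proportional to 1/rank^tau) is an assumption of the same flavor as the
example probabilities above, and the marginalized comparison over
assignments is omitted:

  import random

  def softmax_over_ranks(ranking, tau=3.0):
      """Turn a ranking into a distribution over its documents,
      heavily favoring top ranks: P(d) proportional to 1 / rank^tau."""
      weights = [1.0 / (r + 1) ** tau for r in range(len(ranking))]
      total = sum(weights)
      return {d: w / total for d, w in zip(ranking, weights)}

  def probabilistic_interleave(l1, l2, length=4):
      """Draw each rank from s1 or s2 (fair coin), skipping used documents."""
      s = [softmax_over_ranks(l1), softmax_over_ranks(l2)]
      interleaved, assignment = [], []
      while len(interleaved) < length:
          team = random.randint(0, 1)
          docs = [d for d in s[team] if d not in interleaved]
          if not docs:                    # this ranker is exhausted; use the other
              team = 1 - team
              docs = [d for d in s[team] if d not in interleaved]
          weights = [s[team][d] for d in docs]
          doc = random.choices(docs, weights=weights, k=1)[0]
          interleaved.append(doc)
          assignment.append(team + 1)     # remember who contributed the document
      return interleaved, assignment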
Question

   Do analytical differences between the methods translate into
    performance differences?
Evaluation

   Set-up
     Simulation based on the dependent click model
        Perfect and realistic instantiations
        Not binary, but graded relevance levels
     MSLR-WEB30k Microsoft learning to rank data set
        136 document features (i.e., rankers)
   Three experiments
     Exhaustive comparison of all distinct ranker pairs
        9,180 distinct pairs
     Selection of small subsets for detailed analysis
     Add noise
Results (1)

   Experiment 1
     Accuracy: percentage of ranker pairs for which a comparison method
      identified the better ranker after 1000 queries

   Method                 Accuracy
   balanced interleave     0.881
   team draft              0.898
   document constraint     0.857
   new                     0.914
Results (2): overview

   “Problematic” pairs, selected for detailed analysis
     Excluded: pairs of rankers for which all methods correctly
      identified the better one; three methods achieved perfect accuracy
      within 1000 queries
     Selected: for each method, the incorrectly judged pair with the
      highest difference in NDCG
Results (3): perfect model

   [Figure: accuracy (0 to 1) over the number of queries (1 to 10k, log
   scale) for balanced interleave, team draft, document constraint, and
   marginalized probabilities, under the perfect click model]
Results (4): realistic model

   [Figure: accuracy (0 to 1) over the number of queries (1 to 10k, log
   scale) for the four comparison methods, under the realistic click
   model]
Summary

   What?
     Methods for evaluating rankers using implicit feedback
     Analysis of interleaved comparison methods in terms of bias and
      sensitivity
   And so?
     Introduced a new probabilistic interleaved comparison method that
      is unbiased and sensitive
     Experimental analysis: more accurate with substantially fewer
      observed queries, and more robust
What’s next here?

   Evaluate in a real-life setting
   With more reliable and faster convergence, our approach can pave the
    way for online learning to rank methods that require many
    comparisons
Wrap-up

   Online learning to rank
   Emphasis on implicit feedback collected during normal operation of
    the search engine
   Balancing exploration and exploitation
   Probabilistic method for inferring preferences from clicks
Information retrieval observatory

   Academic experiments on online learning and implicit feedback have
    used simulators
     Need to validate the simulators
   What’s really needed
     Move away from artificial explicit feedback to natural implicit
      feedback
     A shared experimental environment for observing users in the wild
      as they interact with systems
Adapting Rankers Online
Maarten de Rijke, derijke@uva.nl

Presentation on how to chat with PDF using ChatGPT code interpreter
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 

Adapting Rankers Online, Maarten de Rijke

  • 2. Joint work with Katja Hofmann and Shimon Whiteson
  • 3. Growing complexity of search engines; current methods for optimizing them mostly work offline
  • 4. Online learning to rank
       - No distinction between training and operating
       - The search engine observes users’ natural interactions with the search interface, infers information from them, and improves its ranking function automatically
       - Expensive data collection is not required; the collected data matches the target users and the target setting
  • 5. Users’ natural interactions with the search interface (Oard and Kim, 2001; Kelly and Teevan, 2004). Behavior categories (rows; the purpose of the observed behavior) by minimum scope (columns; the smallest scope of the items being acted upon):
       - Examine: Segment (view, listen, scroll, find, query); Object (select); Class (browse)
       - Retain: Segment (print); Object (bookmark, save, delete, purchase, email); Class (subscribe)
       - Reference: Segment (copy-and-paste, quote); Object (forward, reply, link, cite)
       - Annotate: Segment (mark up); Object (rate, publish); Class (organize)
       - Create: Segment (type, edit); Object (author)
  • 6. Users’ interactions
       - Relevance feedback: a history going back close to forty years; typically used for query expansion and user profiling
       - Explicit feedback: users explicitly give feedback via keywords, selecting or marking documents, or answering questions
       - Natural explicit feedback can be difficult to obtain; “unnatural” explicit feedback comes from TREC assessors and crowdsourcing
  • 7. Users’ interactions (2)
       - Implicit feedback for learning, query expansion, and user profiling
       - Observe users’ natural interactions with the system: reading time, saving, printing, bookmarking, selecting, clicking, …
       - Thought to be less accurate than explicit measures, but available in very large quantities at no cost
  • 8. Learning to rank online
       - Using online learning to rank approaches, retrieval systems can learn directly from implicit feedback while they are running
       - Algorithms need to explore new solutions to obtain feedback for effective learning, and to exploit what has been learned to produce results acceptable to users
       - Interleaved comparison methods can use implicit feedback to detect small differences between rankers and can be used to learn ranking functions online
  • 9. Agenda: balancing exploration and exploitation; inferring preferences from clicks
  • 10. Balancing Exploitation and Exploration (recent work). K. Hofmann et al. (2011), Balancing exploration and exploitation. In: ECIR ’11.
  • 11. Challenges: generalize over queries and documents; learn from implicit feedback that is noisy, relative, and rank-biased; keep users happy while learning
  • 12. Learning document pairwise preferences. Insight: infer preferences from clicks on a result list (the slide shows clicked results for the query “Vienna”). Joachims, T. (2002). Optimizing search engines using clickthrough data. In KDD ’02, pages 133-142.
  • 13. Learning document pairwise preferences
       - Input: feature vectors constructed from document pairs, (x(q, d_i), x(q, d_j)) ∈ R^n × R^n
       - Output: y ∈ {−1, +1}, indicating correct / incorrect order
       - Learning method: supervised learning, e.g., an SVM (see the sketch below)
       Joachims, T. (2002). Optimizing search engines using clickthrough data. In KDD ’02, pages 133-142.
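To make the pipeline concrete, here is a minimal Python sketch (our own illustration, not Joachims’ implementation): toy documents d1–d4 carry hypothetical two-dimensional feature vectors x(q, d), the “skip-above” heuristic turns clicks into pairwise preferences, and a perceptron-style update on feature differences stands in for the SVM:

    import numpy as np

    def pairwise_preferences(ranking, clicked):
        # Skip-above: a clicked document is preferred over every
        # non-clicked document ranked above it.
        prefs = []
        for i, doc in enumerate(ranking):
            if doc in clicked:
                prefs += [(doc, up) for up in ranking[:i] if up not in clicked]
        return prefs

    def train_pairwise(prefs, x, epochs=50, lr=0.1):
        # Perceptron-style stand-in for a ranking SVM: for each preferred
        # pair (winner, loser), nudge w so that w . (x_winner - x_loser) > 0.
        w = np.zeros(len(next(iter(x.values()))))
        for _ in range(epochs):
            for winner, loser in prefs:
                diff = x[winner] - x[loser]
                if w.dot(diff) <= 0:
                    w += lr * diff
        return w

    x = {"d1": np.array([0.1, 0.9]), "d2": np.array([0.8, 0.2]),
         "d3": np.array([0.7, 0.4]), "d4": np.array([0.2, 0.3])}
    prefs = pairwise_preferences(["d1", "d2", "d3", "d4"], clicked={"d2", "d3"})
    w = train_pairwise(prefs, x)
    print(prefs)                                  # [('d2', 'd1'), ('d3', 'd1')]
    print(sorted(x, key=lambda d: -w.dot(x[d])))  # re-ranking under learned w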
  • 14. Challenges: generalize over queries and documents; learn from implicit feedback that is noisy, relative, and rank-biased; keep users happy while learning
  • 15. Dueling bandit gradient descent
       - Learns a ranking function, a weight vector w for a linear weighted combination of feature vectors, from feedback about the relative quality of rankings
       - Outcome: weights for the ranking score S = w^T x(q, d)
       - Approach: maintain a current “best” ranking function w; on each incoming query, generate a new candidate ranking function (the slide shows a perturbation of w in weight space), compare it to the current best, and if the candidate is better, update the “best” ranking function
       Yue, Y. and Joachims, T. (2009). Interactively optimizing information retrieval systems as a dueling bandits problem. In ICML ’09.
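A minimal sketch of the DBGD loop, under the assumption that an interleaved comparison tells us which of two weight vectors produces the better ranking; here that comparison is faked with a hidden “true” weight vector plus noise, purely so the example runs, and delta and alpha are the exploration and update step sizes:

    import numpy as np

    rng = np.random.default_rng(42)

    def unit_vector(dim):
        u = rng.normal(size=dim)
        return u / np.linalg.norm(u)

    def candidate_wins(w, w_cand):
        # Stand-in for an interleaved comparison on a live query:
        # rankers closer to a hidden optimum win, up to some noise.
        w_true = np.array([0.8, 0.2, 0.5])
        return (np.linalg.norm(w - w_true) + rng.normal(scale=0.05)
                > np.linalg.norm(w_cand - w_true))

    def dbgd(dim=3, queries=1000, delta=0.5, alpha=0.1):
        w = np.zeros(dim)                  # current "best" ranker
        for _ in range(queries):
            u = unit_vector(dim)
            w_cand = w + delta * u         # explore a nearby candidate
            if candidate_wins(w, w_cand):  # compare on the incoming query
                w = w + alpha * u          # small step towards the winner
        return w

    print(dbgd())  # drifts towards the hidden optimum (0.8, 0.2, 0.5)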
  • 16. Challenges: generalize over queries and documents; learn from implicit feedback that is noisy, relative, and rank-biased; keep users happy while learning
  • 17. Exploration and exploitation. We need to learn effectively from rank-biased feedback (exploration), and we need to present high-quality results while learning (exploitation). Previous approaches are either purely exploratory or purely exploitative.
  • 18. Questions: Can we improve online performance by balancing exploration and exploitation? How much exploration is needed for effective learning?
  • 19. Problem formulation
       - Reinforcement learning: no explicit labels; learn from feedback from the environment in response to actions (document lists)
       - Contextual bandit problem: the retrieval system tries something (presents documents) and the environment (the user) gives feedback (clicks)
  • 20. Our method
       - Learning based on Dueling Bandit Gradient Descent
       - Relative evaluations of the quality of two document lists; such comparisons are inferred from implicit feedback
       - Balance exploration and exploitation with k-greedy comparison of document lists
  • 21. k-greedy exploration. To compare document lists, interleave them; an exploration rate k controls the relative number of documents contributed by each list. (The slide shows an example with exploration rate k = 0.5, where the “blue” list wins the comparison.)
  • 22. k-greedy exploration (continued): example interleavings with exploration rate k = 0.5 versus k = 0.2 (a sketch of the interleaving step follows below)
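A sketch of the k-greedy interleaving step as we read it from the slides (function and variable names are ours): at each rank, the next document comes from the exploratory ranker with probability k and from the exploitative (current best) ranker otherwise, so k = 0.2 yields on average two exploratory documents in a top-10 list:

    import random

    def k_greedy_interleave(exploit, explore, k=0.2, length=10, rnd=random):
        # At each rank, draw from the exploratory list with probability k,
        # otherwise from the exploitative list; skip already-placed documents.
        result, origin, used = [], [], set()

        def next_unused(lst):
            return next((d for d in lst if d not in used), None)

        while len(result) < length:
            explorative = rnd.random() < k
            primary, fallback = (explore, exploit) if explorative else (exploit, explore)
            doc = next_unused(primary)
            if doc is None:                 # chosen list exhausted: fall back
                explorative = not explorative
                doc = next_unused(fallback)
            if doc is None:
                break
            used.add(doc)
            result.append(doc)
            origin.append("explore" if explorative else "exploit")
        return result, origin

    exploit = ["a", "b", "c", "d", "e"]
    explore = ["c", "f", "a", "g", "h"]
    print(k_greedy_interleave(exploit, explore, k=0.5, length=5,
                              rnd=random.Random(0)))

Clicks on the resulting list can then be credited to whichever ranker contributed the clicked document, giving the relative comparison that the learner needs.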
  • 23. Evaluation
       - Simulated interactions: we need to observe clicks on arbitrary result lists and measure online performance
       - Probabilistic click model: assume a dependent click model and define click and stop probabilities based on standard learning to rank data sets (see the simulation sketch below)
       - Measure the cumulative reward of the rankings displayed to the user
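The click simulation can be sketched as follows, using the click and stop probabilities given on the following slides; the dependent click model scans the list top-down, clicks a document with P(c|R) or P(c|NR) depending on its relevance, and stops after a click with P(s|R) or P(s|NR):

    import random

    # (P(c|R), P(c|NR)) and (P(s|R), P(s|NR)) per click model, from the slides.
    CLICK_MODELS = {
        "perfect":       ((1.0, 0.0), (0.0, 0.0)),
        "navigational":  ((0.95, 0.05), (0.9, 0.2)),
        "informational": ((0.9, 0.4), (0.5, 0.1)),
    }

    def simulate_clicks(result_list, relevant, model="navigational", rnd=random):
        p_click, p_stop = CLICK_MODELS[model]
        clicks = []
        for doc in result_list:             # scan top-down
            rel = 0 if doc in relevant else 1
            if rnd.random() < p_click[rel]:
                clicks.append(doc)
                if rnd.random() < p_stop[rel]:
                    break                   # user is satisfied and stops
        return clicks

    print(simulate_clicks(["d1", "d2", "d3", "d4"], relevant={"d2", "d3"},
                          model="informational", rnd=random.Random(1)))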
  • 24. Experiments: vary the exploration rate k; three click models (“perfect”, “navigational”, “informational”); evaluate on nine data sets (LETOR 3.0 and 4.0)
  • 25. “Perfect” click model: P(c|R) = 1.0, P(c|NR) = 0.0, P(s|R) = 0.0, P(s|NR) = 0.0; provides an upper bound. [Figure: final performance over time (0–1000 queries) for data set NP2003 under the perfect click model.]
  • 26. “Perfect” online performance per exploration rate k (cumulative reward; darker shades in the slide indicate higher performance, dark borders indicate significant improvements over the k = 0.5 baseline). Best performance with only two exploratory documents per top-10 result list:
       dataset   k=0.5   k=0.4   k=0.3   k=0.2   k=0.1
       HP2003    119.91  125.71  129.99  130.55  128.50
       HP2004    109.21  111.57  118.54  119.86  116.46
       NP2003    108.74  113.61  117.44  120.46  119.06
       NP2004    112.33  119.34  124.47  126.20  123.70
       TD2003    82.00   84.24   88.20   89.36   86.20
       TD2004    85.67   90.23   91.71   91.00   88.98
       OHSUMED   128.12  130.40  131.16  133.37  131.93
       MQ2007    96.02   97.48   98.54   100.28  98.32
       MQ2008    90.97   92.99   94.03   95.59   95.14
  • 27. “Navigational” click model: P(c|R) = 0.95, P(c|NR) = 0.05, P(s|R) = 0.9, P(s|NR) = 0.2; simulates realistic but reliable interaction. [Figure: final performance over time (0–1000 queries) for data set NP2003 under the navigational click model.]
  • 28. “Navigational” online performance per exploration rate k (same legend as above). Best performance with little exploration and lots of exploitation:
       dataset   k=0.5   k=0.4   k=0.3   k=0.2   k=0.1
       HP2003    102.58  109.78  118.84  116.38  117.52
       HP2004    89.61   97.08   99.03   103.36  105.69
       NP2003    90.32   100.94  105.03  108.15  110.12
       NP2004    99.14   104.34  110.16  112.05  116.00
       TD2003    70.93   75.20   77.64   77.54   75.70
       TD2004    78.83   80.17   82.40   83.54   80.98
       OHSUMED   125.35  126.92  127.37  127.94  127.21
       MQ2007    95.50   94.99   95.70   96.02   94.94
       MQ2008    89.39   90.55   91.24   92.36   92.25
  • 29. “Informational” click model: P(c|R) = 0.9, P(c|NR) = 0.4, P(s|R) = 0.5, P(s|NR) = 0.1; simulates very noisy interaction. [Figure: final performance over time (0–1000 queries) for data set NP2003 under the informational click model, for k = 0.5, k = 0.2, and k = 0.1.]
  • 30. “Informational” online performance per exploration rate k (same legend as above). Highest improvements with low exploration rates; an interaction between noise and data set:
       dataset   k=0.5   k=0.4   k=0.3   k=0.2   k=0.1
       HP2003    59.53   63.91   61.43   70.11   71.19
       HP2004    41.12   52.88   55.88   58.40   48.54
       NP2003    53.63   53.64   57.60   69.90   63.38
       NP2004    55.76   60.59   64.17   69.96   51.58
       TD2003    52.78   55.16   52.95   57.30   59.75
       TD2004    58.49   63.23   61.43   62.88   63.37
       OHSUMED   121.39  123.26  124.01  125.40  126.76
       MQ2007    91.57   92.00   91.66   90.79   90.19
       MQ2008    86.06   87.26   85.83   87.62   86.29
  • 31. Summary
       - What? Developed the first method for balancing exploration and exploitation in online learning to rank, and devised an experimental framework for simulating user interactions and measuring online performance
       - And so? Balancing exploration and exploitation improves online performance for all click models and all data sets; the best results are achieved with two exploratory documents per result list
  • 32. What’s next here? Validate the simulation assumptions; evaluate using click logs; develop new algorithms for online learning to rank for IR that can balance exploration and exploitation
  • 33. Inferring Preferences from Clicks (ongoing work)
  • 34. Interleaved ranker comparison methods
       - Use implicit feedback (“clicks”) not to infer absolute judgments, but to compare two rankers by observing clicks on an interleaved result list
       - Interleave two ranked lists (the outputs of two rankers) and use click data to detect even very small differences between rankers
       - Examine three existing interleaving methods, identify issues with them, and propose a new one
  • 35. Three methods (1): balanced interleave
       - An interleaved list is generated for each query based on the two rankers
       - The user’s clicks on the interleaved list are attributed to each ranker based on how it ranked the clicked documents
       - The ranker that obtains more clicks is deemed superior
       Joachims, Evaluating retrieval performance using clickthrough data. In: Text Mining, 2003.
  • 36. Worked example: l1 = (d1, d2, d3, d4) and l2 = (d2, d3, d4, d1). Balanced interleaving yields two possible interleaved lists, (d1, d2, d3, d4) and (d2, d1, d3, d4). A user who clicks on ranks 2 and 4 of either list produces: for the first list (clicks on d2 and d4), a cut-off k = min(4, 3) = 3 and click counts c1 = 1, c2 = 2; for the second list (clicks on d1 and d4), k = min(4, 4) = 4 and click counts c1 = 2, c2 = 2. So l2 wins the first comparison and the lists tie for the second: in expectation l2 wins, even under random clicking.
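A sketch of balanced interleaving and its click attribution, reproducing the example above (our own reconstruction from the slide, not Joachims’ code):

    def balanced_interleave(l1, l2, first):
        # Merge so that every prefix holds near-equal numbers of documents
        # from the tops of l1 and l2; `first` is the per-query coin flip.
        merged, i, j = [], 0, 0
        while i < len(l1) and j < len(l2):
            if i < j or (i == j and first == 1):
                doc, i = l1[i], i + 1
            else:
                doc, j = l2[j], j + 1
            if doc not in merged:
                merged.append(doc)
        return merged

    def balanced_compare(l1, l2, clicks):
        # k: the smaller cut-off needed for either list to cover all clicks;
        # the list with more clicked documents in its top k wins.
        k = min(max(l.index(d) + 1 for d in clicks) for l in (l1, l2))
        c1 = sum(d in l1[:k] for d in clicks)
        c2 = sum(d in l2[:k] for d in clicks)
        return "l1" if c1 > c2 else "l2" if c2 > c1 else "tie"

    l1, l2 = ["d1", "d2", "d3", "d4"], ["d2", "d3", "d4", "d1"]
    for first in (1, 2):
        shown = balanced_interleave(l1, l2, first)
        clicks = {shown[1], shown[3]}       # random clicker: ranks 2 and 4
        print(shown, balanced_compare(l1, l2, clicks))
    # -> l2 wins once and ties once: biased towards l2 under random clicks.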
  • 37. Three methods (2): team draft
       - Create an interleaved list following the model of “team captains” selecting their teams from a set of players
       - For each pair of documents to be placed in the interleaved list, a coin flip determines which list gets to select a document first
       - Record which ranker contributed which document
       Radlinski et al., How does click-through data reflect retrieval quality? 2008.
  • 38. Worked example: l1 = (d1, d2, d3, d4) and l2 = (d2, d3, d4, d1). Team draft produces four possible interleaved lists with different assignments; in all of them d3, the only clicked document, appears at rank 3, contributed by l1 in two cases and by l2 in the other two. For the interleaved lists a) and b) l1 wins the comparison; l2 wins in the other two cases. The comparison ties in expectation, even though l2 ranks the clicked document higher.
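A sketch of team-draft interleaving; enumerating both coin-flip outcomes per round reproduces the four lists above, and a click on d3 credits whichever team contributed it:

    import itertools

    def team_draft(l1, l2, flips):
        # One coin flip per round decides which "captain" picks first; each
        # captain adds its highest-ranked document not yet in the list.
        interleaved, teams = [], []
        docs = set(l1) | set(l2)
        for flip in itertools.cycle(flips):
            if len(interleaved) == len(docs):
                break
            order = [(l1, 1), (l2, 2)] if flip == 0 else [(l2, 2), (l1, 1)]
            for lst, team in order:
                pick = next(d for d in lst if d not in interleaved)
                interleaved.append(pick)
                teams.append(team)
        return interleaved, teams

    def team_draft_compare(interleaved, teams, clicks):
        credit = {1: 0, 2: 0}
        for doc, team in zip(interleaved, teams):
            credit[team] += doc in clicks
        if credit[1] == credit[2]:
            return "tie"
        return "l1" if credit[1] > credit[2] else "l2"

    l1, l2 = ["d1", "d2", "d3", "d4"], ["d2", "d3", "d4", "d1"]
    for flips in itertools.product([0, 1], repeat=2):   # two rounds of flips
        shown, teams = team_draft(l1, l2, flips)
        print(shown, teams, team_draft_compare(shown, teams, {"d3"}))
    # -> l1 wins twice and l2 wins twice: a tie in expectation.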
  • 39. Three methods (3): document constraint
       - Result lists are interleaved and clicks observed as for the balanced interleave method
       - Constraints on pairs of individual documents are inferred from clicks and ranks: for each pair of a clicked document and a higher-ranked non-clicked document, a constraint is inferred that requires the former to be ranked above the latter
       - The original list that violates fewer constraints is deemed superior
       He et al., Evaluation of methods for relative comparison of retrieval systems based on clickthroughs, 2009.
  • 40. Worked example: l1 = (d1, d2, d3, d4) and l2 = (d2, d3, d4, d1), with the same two possible interleaved lists as before. For the first list, clicks on d2 and d3 (skipping d1) yield the constraints d2 ≻ d1 and d3 ≻ d1, both violated by l1 and neither by l2. For the second list, clicks on d1 and d3 (skipping d2) yield d1 ≻ d2, violated only by l2, and d3 ≻ d2, violated by both. So l2 wins the first comparison and loses the second; in expectation l2 wins.
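A sketch of the document-constraint comparison, reproducing the first case of the example above:

    def infer_constraints(shown, clicks):
        # Each clicked document should rank above every non-clicked
        # document that appeared above it in the interleaved list.
        return [(doc, up)
                for i, doc in enumerate(shown) if doc in clicks
                for up in shown[:i] if up not in clicks]

    def num_violations(ranking, constraints):
        # (winner, loser) is violated if `ranking` puts loser above winner.
        return sum(ranking.index(w) > ranking.index(l) for w, l in constraints)

    l1, l2 = ["d1", "d2", "d3", "d4"], ["d2", "d3", "d4", "d1"]
    shown, clicks = ["d1", "d2", "d3", "d4"], {"d2", "d3"}
    cons = infer_constraints(shown, clicks)
    print(cons)                                               # d2 > d1, d3 > d1
    print(num_violations(l1, cons), num_violations(l2, cons)) # 2 vs 0: l2 wins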
  • 41. Assessing comparison methods
       - Bias: don’t prefer either ranker when clicks are random
       - Sensitivity: the ability of a comparison method to detect differences in the quality of rankings
       - Balanced interleave and document constraint are biased; team draft may suffer from insensitivity
  • 42. A new proposal, briefly
       - Based on team draft, but instead of interleaving deterministically, model the interleaving process as random sampling from softmax functions that define probability distributions over documents
       - Derive an estimator that is unbiased and sensitive to small ranking changes
       - Marginalize over all possible assignments to make estimates more reliable
  • 43. Probabilistic interleave and probabilistic comparison. Each ranking l_i is turned into a softmax s_i (e.g., P(d at rank 1) = 0.85, P(d at rank 2) = 0.10, P(d at rank 3) = 0.03, P(d at rank 4) = 0.02); for each rank of the interleaved list l, draw one of {s1, s2} and sample a document, so all permutations of the documents in D are possible. For an incoming query: the system generates an interleaved list and observes clicks; then all possible assignments are generated, the probability of each is computed, and the probability of each possible outcome follows by marginalization. This is expensive, but it only needs to be done down to the lowest observed click. In the example shown, s2 (based on l2) wins the observed comparison, while s1 and s2 tie in expectation.
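A simplified sketch of the idea (our reconstruction; the actual estimator is derived more carefully and only marginalizes down to the lowest click): each list defines a softmax with P(d) proportional to 1/rank(d)^τ, where τ = 3 gives roughly the 0.85 / 0.10 / 0.03 / 0.02 of the slide for four documents; interleaving samples each rank from a randomly chosen softmax, and the comparison weighs the click credit of every possible assignment by its probability given the list that was shown:

    import itertools
    import random

    TAU = 3.0

    def softmax(ranking, docs):
        # P(d) proportional to 1 / rank(d)^tau, restricted to `docs`.
        w = {d: (ranking.index(d) + 1) ** -TAU for d in docs}
        z = sum(w.values())
        return {d: v / z for d, v in w.items()}

    def probabilistic_interleave(l1, l2, rnd=random):
        shown, remaining = [], set(l1)
        while remaining:                    # any permutation is possible
            src = l1 if rnd.random() < 0.5 else l2
            probs = softmax(src, remaining)
            doc = rnd.choices(list(probs), weights=probs.values())[0]
            shown.append(doc)
            remaining.remove(doc)
        return shown

    def expected_credit(l1, l2, shown, clicks):
        # Marginalize click credit over all assignments a in {1,2}^n,
        # weighting each by its probability given the shown list.
        credit, total = {1: 0.0, 2: 0.0}, 0.0
        for a in itertools.product([1, 2], repeat=len(shown)):
            p, remaining = 1.0, set(shown)
            for doc, team in zip(shown, a):
                p *= 0.5 * softmax(l1 if team == 1 else l2, remaining)[doc]
                remaining.remove(doc)
            total += p
            for doc, team in zip(shown, a):
                credit[team] += p * (doc in clicks)
        return {team: c / total for team, c in credit.items()}

    l1, l2 = ["d1", "d2", "d3", "d4"], ["d2", "d3", "d4", "d1"]
    shown = probabilistic_interleave(l1, l2, rnd=random.Random(7))
    print(shown, expected_credit(l1, l2, shown, clicks={shown[1]}))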
  • 44. Question: do the analytical differences between the methods translate into performance differences?
  • 45. Evaluation
       - Set-up: simulation based on the dependent click model, with perfect and realistic instantiations; not binary, but with graded relevance levels; MSLR-WEB30k Microsoft learning to rank data set, with 136 document features (i.e., rankers)
       - Three experiments: exhaustive comparison of all 9,180 distinct ranker pairs; selection of small subsets for detailed analysis; adding noise
  • 46. Results (1), experiment 1. Accuracy: the percentage of ranker pairs for which a comparison method identified the better ranker after 1,000 queries:
       balanced interleave: 0.881
       team draft: 0.898
       document constraint: 0.857
       new method: 0.914
  • 47. Results (2): overview of “problematic” pairs. Pairs of rankers for which all methods correctly identified the better one are set aside (three methods achieved perfect accuracy within 1,000 queries); for each method, the incorrectly judged pair with the highest difference in NDCG is examined.
  • 48. Results (3): perfect click model. [Figure: accuracy over the number of queries (1 to 10k, log scale) for balanced interleave, team draft, document constraint, and marginalized probabilities.]
  • 49. Results (4): realistic click model. [Figure: accuracy over the number of queries (1 to 10k, log scale) for the same four methods.]
  • 50. Summary
       - What? Methods for evaluating rankers using implicit feedback; an analysis of interleaved comparison methods in terms of bias and sensitivity
       - And so? Introduced a new probabilistic interleaved comparison method that is unbiased and sensitive; experimental analysis shows it is more accurate, needs substantially fewer observed queries, and is more robust
  • 51. What’s next here? Evaluate in a real-life setting; with more reliable and faster convergence, our approach can pave the way for online learning to rank methods that require many comparisons
  • 52. Wrap-up
  • 53. Online learning to rank
       - Emphasis on implicit feedback collected during the normal operation of the search engine
       - Balancing exploration and exploitation
       - A probabilistic method for inferring preferences from clicks
  • 54. Information retrieval observatory
       - Academic experiments on online learning and implicit feedback have used simulators; these simulators need to be validated
       - What’s really needed: move away from artificial explicit feedback to natural implicit feedback, and a shared experimental environment for observing users in the wild as they interact with systems
  • 55. Adapting Rankers Online. Maarten de Rijke, derijke@uva.nl
  • 56. (Intentionally left blank)
  • 57. Bias (appendix): the balanced-interleave example from slide 36 again; with random clicks on ranks 2 and 4, l2 wins one of the two possible comparisons and ties the other, so l2 wins in expectation even though the clicks carry no preference.
  • 58. Sensitivity (appendix): the team-draft example from slide 38 again; a single click on d3 makes l1 win for two of the four possible interleaved lists and l2 for the other two, a tie in expectation that hides the difference between the rankers.