SlideShare ist ein Scribd-Unternehmen logo
1 von 31
Frank: A Ranking Method with
         Fidelity Loss
  Ming-Feng Tsai, Tie-Yan Liu, Tao Qin, Hsin-Hsi
        chen, Wei-Ying Ma, SIGIR 2007


                              Advisor: Chia-Hui Chang
                              Student: Chen-Ling Chen
                              Date: 2011-10-21




                                                        1
Outline
   Introduction
   Related Work
   Fidelity Rank
   Experiments
   Conclusions



                   2
Introduction
The IR problem can be formulated as a ranking problem:
provided a query and a set of documents, an IR system should
provide a ranked list of documents

 In addition to traditional IR approaches, machine learning
techniques are becoming more widely used for the ranking
problem of IR
   the learning-based methods aim to use labeled data for
   learning an effective ranking function
   several learning-based methods, such as RankBoost, RankSVM,
   and RankNet, view the problem as a ranking problem

                                                              3
Introduction(cont.)
The probabilistic ranking framework has many effective
properties for ranking, such as pair-wise differentiable loss,
and can better model multiple relevance levels

A further studies on the probabilistic ranking framework, and
propose a novel loss function named fidelity loss
   inherits those properties of the probabilistic ranking
   framework, and has several additional useful characteristics for
   ranking

Fidelity Rank (FRank) combines the probabilistic ranking
framework with the generalized additive model

                                                                      4
Related work
Treated IR as a binary classification problem that directly
classifies a document as relevant or irrelevant with respect to
the given query

– However, in real Web searching, a page has multiple relevance
  levels, such as highly relevant, partially relevant, and definitely
  irrelevant


– some studies regard IR as a ranking problem



                                                                  5
Related work(cont.)
RankBoost used the Boosting approach for learning a ranking
function and solved the problem of combining preferences
   aims to minimize the weighted number of pairs of instances that
   are misordered by the final ranking function
   For each round t, RankBoost chooses αt and the weak learner ht
   so as to minimize the pair-wise loss, which is defined by



   The rule of adjustment is to decrease the weight of pairs if ht
   gives a correct ranking (ht(xi) > ht(xj)) and increase otherwise.
   Finally, this algorithm outputs the ranking function by combining
   selected weak learners


                                                    6
Related work(cont.)
RankSVM used Support Vector Machines for optimizing
search performance using click-through data

  aims to minimize the number of discordant pairs, and to
  maximize the margin of pair

is well-founded in the structure risk minimization framework,
and is verified in a controlled experiment

   the advantage is that it needs no human-labeled data, and can
   automatically learn the ranking function from click-through data



                                                 7
Related work(cont.)
RankNet a probabilistic ranking framework and used neural
network for minimizing a pair-wise differentiable loss

compared to RankBoost and RankSVM,the loss function in RankNet
is pair-wise differentiable, which can be regarded as an advantage

    loss function in RankNet still has some problems:
       it has no real minimal loss and appropriate upper bound
       These problems may cause the trained ranking function inaccurate and
       the training process biased by some hard pairs




                                                         8
Fidelity Rank




                9
Fidelity Rank(cont.)




                   10
Fidelity Rank(cont.)
Problems within cross entropy loss function:

                            The cross entropy loss function
                            cannot achieve the real minimal loss,
                            zero, expect for the pair with
                            posterior probability is 0 or 1

                            The cross entropy loss of a pair
                            has no upper bound

                            The query-level normalization is hard
                            to define in cross entropy loss

                                            11
Fidelity Rank(cont.)




                   12
Fidelity Rank(cont.)




                   13
Fidelity Rank(cont.)




                   14
Fidelity Rank(cont.)




                   15
Fidelity Rank(cont.)




                   16
Fidelity Rank(cont.)




                   17
Fidelity Rank(cont.)
FRank algorithm Derivation




                             18
Fidelity Rank(cont.)
FRank algorithm Derivation
  When a binary weak learner is introduced, the above equation can
  be simplified because      only takes values -1, 0, and 1
  Therefore, the equation can be expressed by




                                                 19
Fidelity Rank(cont.)




                   20
Experiments




              21
Experiments(cont.)




                 22
Experiments(cont.)
Comparison methods
   RankBoost, RankNet_Linear, RankNet_TwoLayer, RankSVM, BM25
Experiment on TREC Dataset
   Four-fold cross validation


Experimental Results on TREC Dataset
   All learning-based methods
  outperform BM25
   FRank even obtains about 40%
  improvement over BM25, in MAP
   Frank is suitable for Web search


                                                   23
Experiments(cont.)
Experiment on Web Search Dataset
   These web pages are partially labeled with ratings from 1 to 5
        The unlabeled pages are given the rating 0
            20 unlabeled documents are randomly selected for training
   For a given query, a page is represented by query-dependent (e.g., term
   frequency) and query-independent (e.g., PageRank) features
   The total number of features is 619




Training process of FRank


                                                       24
Experiments(cont.)
The power of representation of RankBoost is quite limited with few
weak learners
   Set the number of weak learners as 224 for FRank and 271 for RankBoost
The performance of RankNet_TwoLayer dropped when the number
of epochs was more than 10
   Set the number of epochs as 25 for RankNet Linear and 9 for
   RankNet_TwoLayer




                                                         25
Experiments(cont.)
An empirical approach of using linear
combination of BM25 and PageRank is also
performed
Using conventional IR model is inadequate
   Because the learning-based methods is
   capable of leveraging various features in a large-
   scale Web search dataset
These results indicate that the probabilistic
ranking framework is a suitable framework for
learning to rank
FRank algorithm significantly outperforms
other learned based ranking algorithms
   The corresponding p-values are 0.0114 for
   NDCG@1, 0.007 for NDCG@5, and 0.0056 for
   NDCG@10.

                                                        26
Experiments(cont.)

Several observations as follows:
The FRank algorithm performs effectively in small training dataset
since it introduces query-level normalization in the fidelity loss
function
Using more training data does not guarantee improvement on
performance (e.g., RankBoost)
The probabilistic ranking framework performs well when the
amount of training data is large
   This is because more pair-wise information is capable of making the
   trained ranking function more accurate
FRank outperformed other ranking algorithms for all cases

                                                                27
Conslusion
• The FRank algorithm performs well in practice, even
  for conventional TREC dataset and real Web search
  dataset

• Future work
• For theoretical aspects, investigate how to prove the generalization bound
  based on the probabilistic ranking framework
• Study whether it is more effective to combine the fidelity loss with other
  machine learning techniques, such as kernel methods
• On scalability issues, implement a parallel version of FRank that can handle
  even larger training datasets


                                                          28
Thank you for listening.




                    29
Related work(cont.)
 Joachims proposed RankSVM algorithm, which uses Support
 Vector Machines for optimizing search performance using
 click-through data
 RankSVM aims to minimize the number of discordant pairs, and to
 maximize the margin of pair
 The margin maximization is equal to minimize the L2-norm of
 hyperplane parameter


 Given a query, if the ground truths assert that document di is
 more relevant than dj, the constraint of RankSVM is
                     Where          is the feature vector
calculated from document d relative to query q

                                               30
Related work(cont.)
the ranking problem can be expressed as the following
constrained optimization problem




RankSVM is well-founded in the structure risk minimization
framework, and is verified in a controlled experiment
    the advantage is that it needs no human-labeled data, and
   can automatically learn the ranking function from click-
   through data.

                                             31

Weitere ähnliche Inhalte

Ähnlich wie 2011-10-28大咪報告

probabilistic ranking
probabilistic rankingprobabilistic ranking
probabilistic rankingFELIX75
 
Web Page Ranking using Machine Learning
Web Page Ranking using Machine LearningWeb Page Ranking using Machine Learning
Web Page Ranking using Machine LearningPradip Rahul
 
Enhancing Keyword Query Results Over Database for Improving User Satisfaction
Enhancing Keyword Query Results Over Database for Improving User Satisfaction Enhancing Keyword Query Results Over Database for Improving User Satisfaction
Enhancing Keyword Query Results Over Database for Improving User Satisfaction ijmpict
 
acmsigtalkshare-121023190142-phpapp01.pptx
acmsigtalkshare-121023190142-phpapp01.pptxacmsigtalkshare-121023190142-phpapp01.pptx
acmsigtalkshare-121023190142-phpapp01.pptxdongchangim30
 
powerpoint
powerpointpowerpoint
powerpointbutest
 
Performance analysis of machine learning approaches in software complexity pr...
Performance analysis of machine learning approaches in software complexity pr...Performance analysis of machine learning approaches in software complexity pr...
Performance analysis of machine learning approaches in software complexity pr...Sayed Mohsin Reza
 
Record matching over multiple query result - Document
Record matching over multiple query result - DocumentRecord matching over multiple query result - Document
Record matching over multiple query result - DocumentNishna Ma
 
A Study on Optimization of Top-k Queries in Relational Databases
A Study on Optimization of Top-k Queries in Relational DatabasesA Study on Optimization of Top-k Queries in Relational Databases
A Study on Optimization of Top-k Queries in Relational DatabasesIOSR Journals
 
Download
DownloadDownload
Downloadbutest
 
Download
DownloadDownload
Downloadbutest
 
lecture14-learning-ranking.ppt
lecture14-learning-ranking.pptlecture14-learning-ranking.ppt
lecture14-learning-ranking.pptVishalKumar725248
 
lecture14-learning-ranking.ppt
lecture14-learning-ranking.pptlecture14-learning-ranking.ppt
lecture14-learning-ranking.pptVishalKumar725248
 
Dataiku at SF DataMining Meetup - Kaggle Yandex Challenge
Dataiku at SF DataMining Meetup - Kaggle Yandex ChallengeDataiku at SF DataMining Meetup - Kaggle Yandex Challenge
Dataiku at SF DataMining Meetup - Kaggle Yandex ChallengeDataiku
 
MLConf - Emmys, Oscars & Machine Learning Algorithms at Netflix
MLConf - Emmys, Oscars & Machine Learning Algorithms at NetflixMLConf - Emmys, Oscars & Machine Learning Algorithms at Netflix
MLConf - Emmys, Oscars & Machine Learning Algorithms at NetflixXavier Amatriain
 
Xavier amatriain, dir algorithms netflix m lconf 2013
Xavier amatriain, dir algorithms netflix m lconf 2013Xavier amatriain, dir algorithms netflix m lconf 2013
Xavier amatriain, dir algorithms netflix m lconf 2013MLconf
 
Internet 信息检索中的数学
Internet 信息检索中的数学Internet 信息检索中的数学
Internet 信息检索中的数学Xu jiakon
 
Presentation
PresentationPresentation
Presentationbutest
 

Ähnlich wie 2011-10-28大咪報告 (20)

Learn to Rank search results
Learn to Rank search resultsLearn to Rank search results
Learn to Rank search results
 
probabilistic ranking
probabilistic rankingprobabilistic ranking
probabilistic ranking
 
Web Page Ranking using Machine Learning
Web Page Ranking using Machine LearningWeb Page Ranking using Machine Learning
Web Page Ranking using Machine Learning
 
Enhancing Keyword Query Results Over Database for Improving User Satisfaction
Enhancing Keyword Query Results Over Database for Improving User Satisfaction Enhancing Keyword Query Results Over Database for Improving User Satisfaction
Enhancing Keyword Query Results Over Database for Improving User Satisfaction
 
acmsigtalkshare-121023190142-phpapp01.pptx
acmsigtalkshare-121023190142-phpapp01.pptxacmsigtalkshare-121023190142-phpapp01.pptx
acmsigtalkshare-121023190142-phpapp01.pptx
 
powerpoint
powerpointpowerpoint
powerpoint
 
Performance analysis of machine learning approaches in software complexity pr...
Performance analysis of machine learning approaches in software complexity pr...Performance analysis of machine learning approaches in software complexity pr...
Performance analysis of machine learning approaches in software complexity pr...
 
Record matching over multiple query result - Document
Record matching over multiple query result - DocumentRecord matching over multiple query result - Document
Record matching over multiple query result - Document
 
A Study on Optimization of Top-k Queries in Relational Databases
A Study on Optimization of Top-k Queries in Relational DatabasesA Study on Optimization of Top-k Queries in Relational Databases
A Study on Optimization of Top-k Queries in Relational Databases
 
Download
DownloadDownload
Download
 
Download
DownloadDownload
Download
 
lecture14-learning-ranking.ppt
lecture14-learning-ranking.pptlecture14-learning-ranking.ppt
lecture14-learning-ranking.ppt
 
lecture14-learning-ranking.ppt
lecture14-learning-ranking.pptlecture14-learning-ranking.ppt
lecture14-learning-ranking.ppt
 
ppt
pptppt
ppt
 
Dataiku at SF DataMining Meetup - Kaggle Yandex Challenge
Dataiku at SF DataMining Meetup - Kaggle Yandex ChallengeDataiku at SF DataMining Meetup - Kaggle Yandex Challenge
Dataiku at SF DataMining Meetup - Kaggle Yandex Challenge
 
MLConf - Emmys, Oscars & Machine Learning Algorithms at Netflix
MLConf - Emmys, Oscars & Machine Learning Algorithms at NetflixMLConf - Emmys, Oscars & Machine Learning Algorithms at Netflix
MLConf - Emmys, Oscars & Machine Learning Algorithms at Netflix
 
Xavier amatriain, dir algorithms netflix m lconf 2013
Xavier amatriain, dir algorithms netflix m lconf 2013Xavier amatriain, dir algorithms netflix m lconf 2013
Xavier amatriain, dir algorithms netflix m lconf 2013
 
Mazhiming
MazhimingMazhiming
Mazhiming
 
Internet 信息检索中的数学
Internet 信息检索中的数学Internet 信息检索中的数学
Internet 信息检索中的数学
 
Presentation
PresentationPresentation
Presentation
 

Kürzlich hochgeladen

Just Call Vip call girls Etawah Escorts ☎️9352988975 Two shot with one girl (...
Just Call Vip call girls Etawah Escorts ☎️9352988975 Two shot with one girl (...Just Call Vip call girls Etawah Escorts ☎️9352988975 Two shot with one girl (...
Just Call Vip call girls Etawah Escorts ☎️9352988975 Two shot with one girl (...gajnagarg
 
Whitefield Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Ba...
Whitefield Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Ba...Whitefield Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Ba...
Whitefield Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Ba...amitlee9823
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756dollysharma2066
 
❤Personal Whatsapp Number 8617697112 Samba Call Girls 💦✅.
❤Personal Whatsapp Number 8617697112 Samba Call Girls 💦✅.❤Personal Whatsapp Number 8617697112 Samba Call Girls 💦✅.
❤Personal Whatsapp Number 8617697112 Samba Call Girls 💦✅.Nitya salvi
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
Gamestore case study UI UX by Amgad Ibrahim
Gamestore case study UI UX by Amgad IbrahimGamestore case study UI UX by Amgad Ibrahim
Gamestore case study UI UX by Amgad Ibrahimamgadibrahim92
 
Just Call Vip call girls diu Escorts ☎️9352988975 Two shot with one girl (diu )
Just Call Vip call girls diu Escorts ☎️9352988975 Two shot with one girl (diu )Just Call Vip call girls diu Escorts ☎️9352988975 Two shot with one girl (diu )
Just Call Vip call girls diu Escorts ☎️9352988975 Two shot with one girl (diu )gajnagarg
 
call girls in Vasundhra (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝...
call girls in Vasundhra (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝...call girls in Vasundhra (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝...
call girls in Vasundhra (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝...Delhi Call girls
 
The hottest UI and UX Design Trends 2024
The hottest UI and UX Design Trends 2024The hottest UI and UX Design Trends 2024
The hottest UI and UX Design Trends 2024Ilham Brata
 
WhatsApp Chat: 📞 8617697112 Call Girl Baran is experienced
WhatsApp Chat: 📞 8617697112 Call Girl Baran is experiencedWhatsApp Chat: 📞 8617697112 Call Girl Baran is experienced
WhatsApp Chat: 📞 8617697112 Call Girl Baran is experiencedNitya salvi
 
ab-initio-training basics and architecture
ab-initio-training basics and architectureab-initio-training basics and architecture
ab-initio-training basics and architecturesaipriyacoool
 
Call Girls Jalgaon Just Call 8617370543Top Class Call Girl Service Available
Call Girls Jalgaon Just Call 8617370543Top Class Call Girl Service AvailableCall Girls Jalgaon Just Call 8617370543Top Class Call Girl Service Available
Call Girls Jalgaon Just Call 8617370543Top Class Call Girl Service AvailableNitya salvi
 
Hingoli ❤CALL GIRL 8617370543 ❤CALL GIRLS IN Hingoli ESCORT SERVICE❤CALL GIRL
Hingoli ❤CALL GIRL 8617370543 ❤CALL GIRLS IN Hingoli ESCORT SERVICE❤CALL GIRLHingoli ❤CALL GIRL 8617370543 ❤CALL GIRLS IN Hingoli ESCORT SERVICE❤CALL GIRL
Hingoli ❤CALL GIRL 8617370543 ❤CALL GIRLS IN Hingoli ESCORT SERVICE❤CALL GIRLNitya salvi
 
➥🔝 7737669865 🔝▻ Kolkata Call-girls in Women Seeking Men 🔝Kolkata🔝 Escorts...
➥🔝 7737669865 🔝▻ Kolkata Call-girls in Women Seeking Men  🔝Kolkata🔝   Escorts...➥🔝 7737669865 🔝▻ Kolkata Call-girls in Women Seeking Men  🔝Kolkata🔝   Escorts...
➥🔝 7737669865 🔝▻ Kolkata Call-girls in Women Seeking Men 🔝Kolkata🔝 Escorts...amitlee9823
 
call girls in Dakshinpuri (DELHI) 🔝 >༒9953056974 🔝 genuine Escort Service 🔝✔️✔️
call girls in Dakshinpuri  (DELHI) 🔝 >༒9953056974 🔝 genuine Escort Service 🔝✔️✔️call girls in Dakshinpuri  (DELHI) 🔝 >༒9953056974 🔝 genuine Escort Service 🔝✔️✔️
call girls in Dakshinpuri (DELHI) 🔝 >༒9953056974 🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
➥🔝 7737669865 🔝▻ jhansi Call-girls in Women Seeking Men 🔝jhansi🔝 Escorts S...
➥🔝 7737669865 🔝▻ jhansi Call-girls in Women Seeking Men  🔝jhansi🔝   Escorts S...➥🔝 7737669865 🔝▻ jhansi Call-girls in Women Seeking Men  🔝jhansi🔝   Escorts S...
➥🔝 7737669865 🔝▻ jhansi Call-girls in Women Seeking Men 🔝jhansi🔝 Escorts S...amitlee9823
 
Call Girls Basavanagudi Just Call 👗 7737669865 👗 Top Class Call Girl Service ...
Call Girls Basavanagudi Just Call 👗 7737669865 👗 Top Class Call Girl Service ...Call Girls Basavanagudi Just Call 👗 7737669865 👗 Top Class Call Girl Service ...
Call Girls Basavanagudi Just Call 👗 7737669865 👗 Top Class Call Girl Service ...amitlee9823
 
Hire 💕 8617697112 Meerut Call Girls Service Call Girls Agency
Hire 💕 8617697112 Meerut Call Girls Service Call Girls AgencyHire 💕 8617697112 Meerut Call Girls Service Call Girls Agency
Hire 💕 8617697112 Meerut Call Girls Service Call Girls AgencyNitya salvi
 
Anamika Escorts Service Darbhanga ❣️ 7014168258 ❣️ High Cost Unlimited Hard ...
Anamika Escorts Service Darbhanga ❣️ 7014168258 ❣️ High Cost Unlimited Hard  ...Anamika Escorts Service Darbhanga ❣️ 7014168258 ❣️ High Cost Unlimited Hard  ...
Anamika Escorts Service Darbhanga ❣️ 7014168258 ❣️ High Cost Unlimited Hard ...nirzagarg
 

Kürzlich hochgeladen (20)

Just Call Vip call girls Etawah Escorts ☎️9352988975 Two shot with one girl (...
Just Call Vip call girls Etawah Escorts ☎️9352988975 Two shot with one girl (...Just Call Vip call girls Etawah Escorts ☎️9352988975 Two shot with one girl (...
Just Call Vip call girls Etawah Escorts ☎️9352988975 Two shot with one girl (...
 
Whitefield Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Ba...
Whitefield Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Ba...Whitefield Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Ba...
Whitefield Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Ba...
 
Abortion Pills in Oman (+918133066128) Cytotec clinic buy Oman Muscat
Abortion Pills in Oman (+918133066128) Cytotec clinic buy Oman MuscatAbortion Pills in Oman (+918133066128) Cytotec clinic buy Oman Muscat
Abortion Pills in Oman (+918133066128) Cytotec clinic buy Oman Muscat
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
 
❤Personal Whatsapp Number 8617697112 Samba Call Girls 💦✅.
❤Personal Whatsapp Number 8617697112 Samba Call Girls 💦✅.❤Personal Whatsapp Number 8617697112 Samba Call Girls 💦✅.
❤Personal Whatsapp Number 8617697112 Samba Call Girls 💦✅.
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Gamestore case study UI UX by Amgad Ibrahim
Gamestore case study UI UX by Amgad IbrahimGamestore case study UI UX by Amgad Ibrahim
Gamestore case study UI UX by Amgad Ibrahim
 
Just Call Vip call girls diu Escorts ☎️9352988975 Two shot with one girl (diu )
Just Call Vip call girls diu Escorts ☎️9352988975 Two shot with one girl (diu )Just Call Vip call girls diu Escorts ☎️9352988975 Two shot with one girl (diu )
Just Call Vip call girls diu Escorts ☎️9352988975 Two shot with one girl (diu )
 
call girls in Vasundhra (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝...
call girls in Vasundhra (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝...call girls in Vasundhra (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝...
call girls in Vasundhra (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝...
 
The hottest UI and UX Design Trends 2024
The hottest UI and UX Design Trends 2024The hottest UI and UX Design Trends 2024
The hottest UI and UX Design Trends 2024
 
WhatsApp Chat: 📞 8617697112 Call Girl Baran is experienced
WhatsApp Chat: 📞 8617697112 Call Girl Baran is experiencedWhatsApp Chat: 📞 8617697112 Call Girl Baran is experienced
WhatsApp Chat: 📞 8617697112 Call Girl Baran is experienced
 
ab-initio-training basics and architecture
ab-initio-training basics and architectureab-initio-training basics and architecture
ab-initio-training basics and architecture
 
Call Girls Jalgaon Just Call 8617370543Top Class Call Girl Service Available
Call Girls Jalgaon Just Call 8617370543Top Class Call Girl Service AvailableCall Girls Jalgaon Just Call 8617370543Top Class Call Girl Service Available
Call Girls Jalgaon Just Call 8617370543Top Class Call Girl Service Available
 
Hingoli ❤CALL GIRL 8617370543 ❤CALL GIRLS IN Hingoli ESCORT SERVICE❤CALL GIRL
Hingoli ❤CALL GIRL 8617370543 ❤CALL GIRLS IN Hingoli ESCORT SERVICE❤CALL GIRLHingoli ❤CALL GIRL 8617370543 ❤CALL GIRLS IN Hingoli ESCORT SERVICE❤CALL GIRL
Hingoli ❤CALL GIRL 8617370543 ❤CALL GIRLS IN Hingoli ESCORT SERVICE❤CALL GIRL
 
➥🔝 7737669865 🔝▻ Kolkata Call-girls in Women Seeking Men 🔝Kolkata🔝 Escorts...
➥🔝 7737669865 🔝▻ Kolkata Call-girls in Women Seeking Men  🔝Kolkata🔝   Escorts...➥🔝 7737669865 🔝▻ Kolkata Call-girls in Women Seeking Men  🔝Kolkata🔝   Escorts...
➥🔝 7737669865 🔝▻ Kolkata Call-girls in Women Seeking Men 🔝Kolkata🔝 Escorts...
 
call girls in Dakshinpuri (DELHI) 🔝 >༒9953056974 🔝 genuine Escort Service 🔝✔️✔️
call girls in Dakshinpuri  (DELHI) 🔝 >༒9953056974 🔝 genuine Escort Service 🔝✔️✔️call girls in Dakshinpuri  (DELHI) 🔝 >༒9953056974 🔝 genuine Escort Service 🔝✔️✔️
call girls in Dakshinpuri (DELHI) 🔝 >༒9953056974 🔝 genuine Escort Service 🔝✔️✔️
 
➥🔝 7737669865 🔝▻ jhansi Call-girls in Women Seeking Men 🔝jhansi🔝 Escorts S...
➥🔝 7737669865 🔝▻ jhansi Call-girls in Women Seeking Men  🔝jhansi🔝   Escorts S...➥🔝 7737669865 🔝▻ jhansi Call-girls in Women Seeking Men  🔝jhansi🔝   Escorts S...
➥🔝 7737669865 🔝▻ jhansi Call-girls in Women Seeking Men 🔝jhansi🔝 Escorts S...
 
Call Girls Basavanagudi Just Call 👗 7737669865 👗 Top Class Call Girl Service ...
Call Girls Basavanagudi Just Call 👗 7737669865 👗 Top Class Call Girl Service ...Call Girls Basavanagudi Just Call 👗 7737669865 👗 Top Class Call Girl Service ...
Call Girls Basavanagudi Just Call 👗 7737669865 👗 Top Class Call Girl Service ...
 
Hire 💕 8617697112 Meerut Call Girls Service Call Girls Agency
Hire 💕 8617697112 Meerut Call Girls Service Call Girls AgencyHire 💕 8617697112 Meerut Call Girls Service Call Girls Agency
Hire 💕 8617697112 Meerut Call Girls Service Call Girls Agency
 
Anamika Escorts Service Darbhanga ❣️ 7014168258 ❣️ High Cost Unlimited Hard ...
Anamika Escorts Service Darbhanga ❣️ 7014168258 ❣️ High Cost Unlimited Hard  ...Anamika Escorts Service Darbhanga ❣️ 7014168258 ❣️ High Cost Unlimited Hard  ...
Anamika Escorts Service Darbhanga ❣️ 7014168258 ❣️ High Cost Unlimited Hard ...
 

2011-10-28大咪報告

  • 1. Frank: A Ranking Method with Fidelity Loss Ming-Feng Tsai, Tie-Yan Liu, Tao Qin, Hsin-Hsi chen, Wei-Ying Ma, SIGIR 2007 Advisor: Chia-Hui Chang Student: Chen-Ling Chen Date: 2011-10-21 1
  • 2. Outline Introduction Related Work Fidelity Rank Experiments Conclusions 2
  • 3. Introduction The IR problem can be formulated as a ranking problem: provided a query and a set of documents, an IR system should provide a ranked list of documents In addition to traditional IR approaches, machine learning techniques are becoming more widely used for the ranking problem of IR the learning-based methods aim to use labeled data for learning an effective ranking function several learning-based methods, such as RankBoost, RankSVM, and RankNet, view the problem as a ranking problem 3
  • 4. Introduction(cont.) The probabilistic ranking framework has many effective properties for ranking, such as pair-wise differentiable loss, and can better model multiple relevance levels A further studies on the probabilistic ranking framework, and propose a novel loss function named fidelity loss inherits those properties of the probabilistic ranking framework, and has several additional useful characteristics for ranking Fidelity Rank (FRank) combines the probabilistic ranking framework with the generalized additive model 4
  • 5. Related work Treated IR as a binary classification problem that directly classifies a document as relevant or irrelevant with respect to the given query – However, in real Web searching, a page has multiple relevance levels, such as highly relevant, partially relevant, and definitely irrelevant – some studies regard IR as a ranking problem 5
  • 6. Related work(cont.) RankBoost used the Boosting approach for learning a ranking function and solved the problem of combining preferences aims to minimize the weighted number of pairs of instances that are misordered by the final ranking function For each round t, RankBoost chooses αt and the weak learner ht so as to minimize the pair-wise loss, which is defined by The rule of adjustment is to decrease the weight of pairs if ht gives a correct ranking (ht(xi) > ht(xj)) and increase otherwise. Finally, this algorithm outputs the ranking function by combining selected weak learners 6
  • 7. Related work(cont.) RankSVM used Support Vector Machines for optimizing search performance using click-through data aims to minimize the number of discordant pairs, and to maximize the margin of pair is well-founded in the structure risk minimization framework, and is verified in a controlled experiment the advantage is that it needs no human-labeled data, and can automatically learn the ranking function from click-through data 7
  • 8. Related work(cont.) RankNet a probabilistic ranking framework and used neural network for minimizing a pair-wise differentiable loss compared to RankBoost and RankSVM,the loss function in RankNet is pair-wise differentiable, which can be regarded as an advantage loss function in RankNet still has some problems: it has no real minimal loss and appropriate upper bound These problems may cause the trained ranking function inaccurate and the training process biased by some hard pairs 8
  • 11. Fidelity Rank(cont.) Problems within cross entropy loss function: The cross entropy loss function cannot achieve the real minimal loss, zero, expect for the pair with posterior probability is 0 or 1 The cross entropy loss of a pair has no upper bound The query-level normalization is hard to define in cross entropy loss 11
  • 19. Fidelity Rank(cont.) FRank algorithm Derivation When a binary weak learner is introduced, the above equation can be simplified because only takes values -1, 0, and 1 Therefore, the equation can be expressed by 19
  • 23. Experiments(cont.) Comparison methods RankBoost, RankNet_Linear, RankNet_TwoLayer, RankSVM, BM25 Experiment on TREC Dataset Four-fold cross validation Experimental Results on TREC Dataset All learning-based methods outperform BM25 FRank even obtains about 40% improvement over BM25, in MAP Frank is suitable for Web search 23
  • 24. Experiments(cont.) Experiment on Web Search Dataset These web pages are partially labeled with ratings from 1 to 5 The unlabeled pages are given the rating 0 20 unlabeled documents are randomly selected for training For a given query, a page is represented by query-dependent (e.g., term frequency) and query-independent (e.g., PageRank) features The total number of features is 619 Training process of FRank 24
  • 25. Experiments(cont.) The power of representation of RankBoost is quite limited with few weak learners Set the number of weak learners as 224 for FRank and 271 for RankBoost The performance of RankNet_TwoLayer dropped when the number of epochs was more than 10 Set the number of epochs as 25 for RankNet Linear and 9 for RankNet_TwoLayer 25
  • 26. Experiments(cont.) An empirical approach of using linear combination of BM25 and PageRank is also performed Using conventional IR model is inadequate Because the learning-based methods is capable of leveraging various features in a large- scale Web search dataset These results indicate that the probabilistic ranking framework is a suitable framework for learning to rank FRank algorithm significantly outperforms other learned based ranking algorithms The corresponding p-values are 0.0114 for NDCG@1, 0.007 for NDCG@5, and 0.0056 for NDCG@10. 26
  • 27. Experiments(cont.) Several observations as follows: The FRank algorithm performs effectively in small training dataset since it introduces query-level normalization in the fidelity loss function Using more training data does not guarantee improvement on performance (e.g., RankBoost) The probabilistic ranking framework performs well when the amount of training data is large This is because more pair-wise information is capable of making the trained ranking function more accurate FRank outperformed other ranking algorithms for all cases 27
  • 28. Conslusion • The FRank algorithm performs well in practice, even for conventional TREC dataset and real Web search dataset • Future work • For theoretical aspects, investigate how to prove the generalization bound based on the probabilistic ranking framework • Study whether it is more effective to combine the fidelity loss with other machine learning techniques, such as kernel methods • On scalability issues, implement a parallel version of FRank that can handle even larger training datasets 28
  • 29. Thank you for listening. 29
  • 30. Related work(cont.) Joachims proposed RankSVM algorithm, which uses Support Vector Machines for optimizing search performance using click-through data RankSVM aims to minimize the number of discordant pairs, and to maximize the margin of pair The margin maximization is equal to minimize the L2-norm of hyperplane parameter Given a query, if the ground truths assert that document di is more relevant than dj, the constraint of RankSVM is Where is the feature vector calculated from document d relative to query q 30
  • 31. Related work(cont.) the ranking problem can be expressed as the following constrained optimization problem RankSVM is well-founded in the structure risk minimization framework, and is verified in a controlled experiment the advantage is that it needs no human-labeled data, and can automatically learn the ranking function from click- through data. 31