This paper introduces and compares several methods for sampling query expansion terms, using both query-independent and query-dependent techniques. The methods take the query and sample documents as input; the sample documents supply aspects of the topic that the query alone does not capture, with the goal of improving aspect recall. Several query modeling, document modeling, and ranking techniques are evaluated on a test collection. Results show that combining the expanded query with the original query performs best, and that the quality of the sample documents also affects performance.
Constructing Query Models from Elaborate Query Formulations: A Few Examples Go a Long Way
1. Constructing Query Models from Elaborate Query Formulations: A Few Examples Go a Long Way. Krisztian Balog (kbalog@science.uva.nl), Wouter Weerkamp (weerkamp@science.uva.nl), Maarten de Rijke (mdr@science.uva.nl), ISLA, University of Amsterdam. Presented by Tanvi Motwani.
3. Along with the query, it takes sample documents as input. Sample documents are additional information provided by the user: a small number of "key references" (pages that should be linked to by a good overview page of the topic).
5. Overview: Retrieval Model, Experimental Setup, Query Representation, Baseline, Parameters, Experimental Evaluation.
6. Overview: Retrieval Model (Query Likelihood, Document Modeling, Query Modeling), Experimental Setup, Query Representation, Baseline, Parameters, Experimental Evaluation.
8. Example: for the query Q = "What is a Rainforest?", documents are ranked by P(D|Q): P(D1|Q) = 0.32, P(D2|Q) = 0.26, P(D3|Q) = 0.19, P(D4|Q) = 0.12, P(D5|Q) = 0.09.
9. Query Likelihood. By Bayes' rule, $P(D|Q) = P(Q|D)P(D)/P(Q)$. Ignoring the document-independent $P(Q)$ and assuming independence of query terms, $P(D|Q) \propto P(D) \prod_{t \in Q} P(t|D)$. Taking the log gives $\log P(D|Q) \propto \log P(D) + \sum_{t \in Q} \log P(t|D)$, which is computed using query and document models: $\mathrm{Score}(Q,D) = \sum_{t} P(t|\theta_Q) \log P(t|\theta_D)$.
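To make the derivation concrete, here is a minimal Python sketch of query-likelihood scoring. The function name, toy interface, and the Jelinek-Mercer weight are illustrative assumptions, not the paper's code; smoothing itself is the subject of slide 13.

```python
import math
from collections import Counter

def log_query_likelihood(query_terms, doc_tokens, collection_tokens, lam=0.1):
    """Compute log P(Q|D) = sum_t log P(t|D), smoothing the ML estimate
    n(t,D)/|D| with the collection model so unseen terms do not give log 0.
    Assumes every query term occurs at least once in the collection."""
    doc_tf, col_tf = Counter(doc_tokens), Counter(collection_tokens)
    score = 0.0
    for t in query_terms:
        p_doc = doc_tf[t] / len(doc_tokens)
        p_col = col_tf[t] / len(collection_tokens)
        score += math.log((1 - lam) * p_doc + lam * p_col)
    return score

# Documents are then ranked by descending score, as in the example on slide 8:
# ranking = sorted(docs, key=lambda d: log_query_likelihood(q, d, coll), reverse=True)
```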
11. Underlying Relevance Model. The query and the relevant documents are random samples from an underlying relevance model R. Documents are ranked by their similarity to the query model: the Kullback-Leibler divergence between the query and document models, $KL(\theta_Q \| \theta_D) = \sum_t P(t|\theta_Q) \log \frac{P(t|\theta_Q)}{P(t|\theta_D)}$, can be used to provide a ranking of documents.
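Because the query-model entropy term of the KL divergence is identical for every document, ranking by negative KL divergence reduces to ranking by cross-entropy. A minimal sketch, assuming the document model has already been smoothed (names are mine):

```python
import math

def neg_kl_score(query_model, doc_model):
    """Rank-equivalent to -KL(theta_Q || theta_D): drop the document-independent
    entropy of theta_Q and keep the cross-entropy sum_t P(t|theta_Q) log P(t|theta_D).
    Assumes doc_model assigns nonzero probability to every query-model term."""
    return sum(p * math.log(doc_model[t]) for t, p in query_model.items())
```

With a maximum-likelihood query model this reduces to the query-likelihood score on slide 9, which is why KL ranking generalizes it.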
12. Overview: Retrieval Model (Query Likelihood, Document Modeling, Query Modeling), Experimental Setup, Query Representation, Baseline, Parameters, Experimental Evaluation.
13. Document Modeling. The maximum-likelihood estimate is $P(t|D) = n(t,D)/|D|$. The example document does not contain the word "rain", so its ML estimate gives $P(\text{"rain"}|D) = 0$; thus smoothing against a collection model is required.
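A short sketch of the two estimates for the "rain" example; Jelinek-Mercer interpolation is one standard smoothing choice and is assumed here purely for illustration.

```python
from collections import Counter

def p_ml(term, doc_tokens):
    """Maximum-likelihood estimate: P(t|D) = n(t,D) / |D|."""
    return Counter(doc_tokens)[term] / len(doc_tokens)

def p_smoothed(term, doc_tokens, collection_model, lam=0.1):
    """Jelinek-Mercer smoothing: interpolate the ML estimate with a collection
    model so a term absent from D, like "rain" above, keeps nonzero probability."""
    return (1 - lam) * p_ml(term, doc_tokens) + lam * collection_model.get(term, 0.0)
```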
14. Query Modeling. $P(t|Q)$ is extremely sparse, and thus query expansion is necessary. The example document does not contain the words "rain" and "forest" but does contain related words such as "wild life". Expanding the query brings in different "aspects" of the topic.
15. Overview: Retrieval Model, Experimental Setup, Query Representation, Baseline, Parameters, Experimental Evaluation.
20. Judgments are made on a 3-point scale: 2 = highly relevant "key reference"; 1 = candidate key page; 0 = not a "key reference".
21. Overview: Retrieval Model, Experimental Setup, Query Representation, Baseline, Parameters (Maximizing Average Precision (MAX_AP), Maximizing Query Log-Likelihood (MAX_QLL), Best Empirical Estimate (EMP_BEST)), Experimental Evaluation.
22. Parameter Estimation: Maximizing Average Precision (MAX_AP), Maximizing Query Log-Likelihood (MAX_QLL), Best Empirical Estimate (EMP_BEST).
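As an illustration of the MAX_AP strategy, here is a minimal grid search over a single free parameter. The helpers `rank_fn` and `topics` are hypothetical stand-ins, not the paper's code; the AP definition is the standard non-interpolated one.

```python
def average_precision(ranked_ids, relevant_ids):
    """Standard (non-interpolated) average precision for one topic."""
    hits, ap = 0, 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            ap += hits / rank
    return ap / max(len(relevant_ids), 1)

def max_ap(candidate_params, rank_fn, topics):
    """MAX_AP: pick the parameter value whose rankings maximize mean AP.
    rank_fn(param, query) -> ranked doc ids; topics: query -> set of relevant ids."""
    def mean_ap(param):
        return sum(average_precision(rank_fn(param, q), rel)
                   for q, rel in topics.items()) / len(topics)
    return max(candidate_params, key=mean_ap)
```

MAX_QLL would swap the objective for the log-likelihood of held-out queries, keeping the same sweep.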
29. RM2: given the term "wild", we first pick a document from the set M with probability P(D|t) and then sample the query words from it; under RM2, a document is drawn anew for each query word. Assume P(D|"wild") = 0.7, the document is 200 words long with 10 occurrences of "rain" and 20 of "forest", P("wild") = 0.2, and M consists of just this document. Then P("wild", "rain", "forest") = 0.2 × (0.7 × 10/200) × (0.7 × 20/200) ≈ 0.00049.
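A sketch of the RM2 joint probability applied to the worked example above; note that because a document is sampled independently for each query term, the P(D|"wild") factor appears once per term. Helper names are mine.

```python
def rm2_joint(p_w, query_terms, p_doc_given_w, doc_models):
    """RM2: P(w, q1..qk) = P(w) * prod_i sum_D P(q_i|D) P(D|w).
    A document is drawn anew for every query term."""
    prob = p_w
    for q in query_terms:
        prob *= sum(p_d * doc_models[d].get(q, 0.0)
                    for d, p_d in p_doc_given_w.items())
    return prob

# The slide's example: M holds one 200-word document with 10 "rain" and
# 20 "forest" occurrences; P(D|"wild") = 0.7 and P("wild") = 0.2.
doc_models = {"D": {"rain": 10 / 200, "forest": 20 / 200}}
print(rm2_joint(0.2, ["rain", "forest"], {"D": 0.7}, doc_models))
# 0.2 * (0.7 * 10/200) * (0.7 * 20/200) = 0.00049
```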
30. Overview: Retrieval Model, Experimental Setup, Query Representation (Query Model from Sample Documents, Feedback Using Relevance Models, Relevance Models from Sample Documents), Baseline, Parameters, Experimental Evaluation.
33. Query Model from Sample Documents. Given the sample document set S, select a document D from S with probability P(D|S) and generate term t from it with probability P(t|D); summing over all sample documents gives $P(t|S) = \sum_{D \in S} P(t|D)\,P(D|S)$. The top K terms with the highest probability P(t|S) are used to formulate the expanded query.
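A minimal sketch of building the expanded query from sample documents, assuming a uniform P(D|S) = 1/|S| and ML document models (the function names are illustrative):

```python
from collections import Counter

def term_dist_from_samples(sample_docs):
    """P(t|S) = sum_D P(t|D) P(D|S), with uniform P(D|S) = 1/|S| and
    ML document models P(t|D) = n(t,D)/|D|."""
    p_t_s = Counter()
    for tokens in sample_docs:
        tf = Counter(tokens)
        for t, n in tf.items():
            p_t_s[t] += (n / len(tokens)) / len(sample_docs)
    return p_t_s

def expanded_query(sample_docs, k=10):
    """Formulate the expanded query from the top-K terms under P(t|S)."""
    return [t for t, _ in term_dist_from_samples(sample_docs).most_common(k)]
```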
47. Since aspect recall is obtained from the sample documents, aren't we dependent on the "goodness" of the sample documents, i.e., how many different aspects they cover, for obtaining high aspect recall?
48. There is only a slight increase in MAP compared to BFB-RM2 (around 0.07); for an end user, will it make any noticeable difference in experience? Is such a small gain in MAP worth the high cost of obtaining sample documents?