Probabilistic Retrieval Model

Baradhidasan P
2nd Year
Pondicherry University
INTRODUCTION
• Probability theory has been used as a principal
means for modeling the retrieval process in
mathematical terms.
• In conventional retrieval situations a
document is retrieved whenever the keyword
set attached to the document appears similar in some
sense to the query keywords.
• In this case the document is considered
relevant to the query.
Cont..
• The relevance of a document with
respect to a query is a matter of degree, so it can
be postulated that when the document and
query vectors are sufficiently similar, the
corresponding probability of relevance is large
enough to make it reasonable to retrieve the
document in response to the query.
• The probabilistic model applies the theory of
probability to this retrieval decision.
Why use Probabilities?
• Information Retrieval deals with uncertain
information
• Probability is a measure of uncertainty
• Probabilistic Ranking Principle
   • provable
   • minimization of risk
• Probabilistic Inference
   • to justify your decision
Approach
• The basic underlying tenet of the probabilistic
approach to retrieval is that, for optimal
performance, documents should be ranked in
order of decreasing probability of relevance.
• Several models based on probabilistic
approaches have been advocated; here we
shall briefly look into three such models.
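The ranking tenet above can be illustrated with a minimal sketch. It assumes that a probability of relevance has already been estimated for each document by some model. The function name `rank_by_relevance` and the example values are hypothetical and not part of the original slides.

```python
from typing import Dict, List, Tuple


def rank_by_relevance(prob_relevance: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (doc_id, probability) pairs, highest probability first."""
    # Sort by probability descending; break ties by doc_id for a stable order.
    return sorted(prob_relevance.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical probability estimates for three documents.
estimates = {"d1": 0.20, "d2": 0.85, "d3": 0.55}
ranking = rank_by_relevance(estimates)
print([doc_id for doc_id, _ in ranking])
```

How the probabilities themselves are obtained is the hard part, and it is what distinguishes the individual models.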
Objectives
• Highlight influential work on probabilistic models for IR
• Provide a working understanding of the probabilistic
techniques through a set of common implementation
tricks
• Establish relationships between the popular
approaches: stress common ideas, explain differences
• Outline issues in extending the models to interactive,
cross-language, and multi-media settings
Maron and Kuhns
• Maron and Kuhns proposed a model for
probabilistic retrieval as early as 1960. They
advocated that the probability that a given
document would be relevant to a user can be
assessed by calculating, for each document in
the collection, the probability that a user
submitting a particular query would judge that
document relevant.
Cont..
• For a query consisting of only one term
(B), the probability that a particular document
(Dm) will be judged relevant is the ratio of the
number of users who submit query term (B) and
consider the document (Dm) relevant to the
total number of users who submitted the
query term (B).
Cont..
• Adopting this approach, one has to employ
historical information to calculate the
probability of relevance: the number of times
users who submitted a particular query term (B)
judged a document (Dm) relevant, compared
with the total number of users who submitted
that particular query term (B).
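The ratio described above can be sketched as follows. This is only an illustration. It assumes a hypothetical log in which each entry records one user who submitted a query term and judged one document, and the names `LogEntry` and `estimate_relevance` are invented for the example.

```python
from typing import Iterable, Tuple

# Hypothetical log format: (query_term, doc_id, judged_relevant).
LogEntry = Tuple[str, str, bool]


def estimate_relevance(log: Iterable[LogEntry], term: str, doc_id: str) -> float:
    """Fraction of users who submitted `term` and judged `doc_id` relevant."""
    submitted = 0        # users who submitted the term and judged this document
    judged_relevant = 0  # of those, users who judged it relevant
    for query_term, doc, relevant in log:
        if query_term == term and doc == doc_id:
            submitted += 1
            if relevant:
                judged_relevant += 1
    if submitted == 0:
        raise ValueError("no historical judgments for this term and document")
    return judged_relevant / submitted


# Four hypothetical users submitted term "B"; three judged "Dm" relevant.
history = [
    ("B", "Dm", True),
    ("B", "Dm", True),
    ("B", "Dm", False),
    ("B", "Dm", True),
    ("C", "Dm", True),  # different term, ignored for "B"
]
print(estimate_relevance(history, "B", "Dm"))
```

The sketch makes the practical limitation visible: without historical judgments for a term and document pair, no estimate is available.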
Salton approach
• The model suggested by Salton and McGill
takes a different approach. The essence of
this model is that if estimates for the
probability of occurrence of various terms in
relevant documents can be calculated, then the
probability that a document will be retrieved,
given that it is relevant, can be estimated.
Several experiments have shown that the
probabilistic model can yield good results.
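One simple way to obtain term-occurrence estimates of the kind mentioned above is to count how often each term appears in a sample of documents already judged relevant. The sketch below is an illustration under assumptions: the function name, the set-of-terms document representation, and the add-0.5 smoothing are not taken from the slides.

```python
from typing import Dict, List, Set


def term_probabilities(relevant_docs: List[Set[str]], vocabulary: Set[str]) -> Dict[str, float]:
    """Estimate P(term occurs | document is relevant) for each term."""
    n = len(relevant_docs)
    probs: Dict[str, float] = {}
    for term in sorted(vocabulary):
        count = sum(1 for doc in relevant_docs if term in doc)
        # Add-0.5 smoothing (an assumption) keeps estimates away from 0 and 1.
        probs[term] = (count + 0.5) / (n + 1.0)
    return probs


# Three hypothetical documents judged relevant, each a set of index terms.
judged_relevant = [
    {"probability", "retrieval"},
    {"retrieval", "ranking"},
    {"retrieval"},
]
vocab = {"probability", "retrieval", "ranking", "boolean"}
term_probs = term_probabilities(judged_relevant, vocab)
print(term_probs)
```

Terms that occur in most relevant documents receive estimates close to 1, and terms never seen in a relevant document receive a small non-zero value.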
Two basic parameters
• The probability of relevance, Pr(rel)
• The probability of non-relevance, Pr(non-rel)
If relevance is considered a binary property,
then Pr(non-rel) = 1 - Pr(rel).
However, there are two cost parameters
associated with the process of retrieval:
• a1: the loss associated with the retrieval of a
non-relevant record
Cont…
• a2: the loss associated with the non-retrieval
of a relevant record
• Because the retrieval of a non-relevant record
carries a loss of $a_1 [1 - \Pr(\mathrm{rel})]$, and the
rejection of a relevant item has an associated
loss of $a_2 \Pr(\mathrm{rel})$, the total loss for a given
retrieval process will be minimized if an item is
retrieved whenever
$a_2 \Pr(\mathrm{rel}) \ge a_1 [1 - \Pr(\mathrm{rel})]$
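The retrieval condition can be restated as a threshold on the probability of relevance. The short derivation below is a sketch that uses only the two loss parameters defined above; the numeric values at the end are hypothetical.

```latex
% Expected loss if the item is retrieved (it may be non-relevant):
%   L_{\text{retrieve}} = a_1 \, [1 - \Pr(\mathrm{rel})]
% Expected loss if the item is rejected (it may be relevant):
%   L_{\text{reject}} = a_2 \, \Pr(\mathrm{rel})
% Retrieve when rejecting costs at least as much as retrieving:
\begin{align*}
a_2 \Pr(\mathrm{rel}) &\ge a_1 \, [1 - \Pr(\mathrm{rel})] \\
(a_1 + a_2) \Pr(\mathrm{rel}) &\ge a_1 \\
\Pr(\mathrm{rel}) &\ge \frac{a_1}{a_1 + a_2}
\end{align*}
% Hypothetical example: a_1 = 1, a_2 = 3 gives a threshold of 1/4,
% so an item is retrieved once \Pr(\mathrm{rel}) \ge 0.25.
```

When missing a relevant record is costlier than retrieving a non-relevant one (a2 larger than a1), the threshold falls below 0.5 and retrieval becomes more liberal.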
Cont…
• A discriminant function g (or DISC) may be defined,
and an item may be retrieved whenever the value of g
is greater than or equal to zero, where
$$ g = \mathrm{DISC} = \frac{\Pr(\mathrm{rel})}{1 - \Pr(\mathrm{rel})} - \frac{a_1}{a_2} $$
• The relevance properties of a record must be related to
the relevance properties of the various terms attached to
the record. The probabilities that a document is
relevant and not relevant, given that it has been
selected, are denoted by P(rel | selected) and
P(non-rel | selected) respectively.
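The discriminant rule on this slide can be sketched in a few lines. The sketch assumes the discriminant takes the form g = Pr(rel) / (1 - Pr(rel)) - a1 / a2, which follows from the loss-minimization condition; the function names and example values are hypothetical.

```python
def discriminant(p_rel: float, a1: float, a2: float) -> float:
    """Compute g = Pr(rel) / (1 - Pr(rel)) - a1 / a2."""
    if not 0.0 <= p_rel < 1.0:
        raise ValueError("p_rel must lie in [0, 1)")
    if a2 <= 0:
        raise ValueError("a2 must be positive")
    # Odds of relevance minus the ratio of the two loss parameters.
    return p_rel / (1.0 - p_rel) - a1 / a2


def should_retrieve(p_rel: float, a1: float, a2: float) -> bool:
    """Retrieve the item when the discriminant is non-negative."""
    return discriminant(p_rel, a1, a2) >= 0.0


# Hypothetical losses: missing a relevant record costs three times as much
# as retrieving a non-relevant one.
print(should_retrieve(0.5, a1=1.0, a2=3.0))
print(should_retrieve(0.1, a1=1.0, a2=3.0))
```

Writing the rule in terms of odds keeps the probability estimate and the loss ratio separate, so either can be changed without touching the other.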
Historical Background
• The first attempts to develop a probabilistic theory of
retrieval were made over 30 years ago [Maron and
Kuhns 1960; Miller 1971], and since then there has
been a steady development of the approach. There
are already several operational IR systems based upon
probabilistic or semi-probabilistic models.
• One major obstacle in probabilistic or
semi-probabilistic IR models is finding methods for
estimating the probabilities used to evaluate the
probability of relevance that are both theoretically
sound and computationally efficient.
Conclusion
