SlideShare a Scribd company logo
1 of 22
Query Formulation Process
Presented by
M. Malathi,
MLISc II nd Year,
Department of Library and Information Science,
Central university of Pondicherry.
Query Formulation Process
Definition of Query:
Query is defined as any question, especially one
expressing doubt or requesting information or to
check its validity or accuracy of information.
Query formulation and Information and
information retrieval:
“Information retrieval embraces the intellectual
aspects of the description of information and its
specification for search, and also whatever systems,
techniques, or machines are employed to carry out
the operation.”
Information Retrieval (IR) is finding material (usually
documents) of an unstructured nature (usually text)
that satisfies an information need from within large
collections (usually stored on computers).
Information Retrieval is a research-driven theoretical
and experimental discipline. The focus is on different
aspects of the information–seeking process,
depending on the researcher’s background or
interest.
The Standard Retrieval Interaction Model
Query is one of the components in the IR cycle. The information search process has
the following steps such as Enrich query formulation, Expand result management,
Enable long-term effort, Enhance collaboration.
Information Searching Strategies
Starting strategies
Select-Break complex query into topics and deal with each topic separately
Exhaust-Include most elements of the query in the initial query formulation
Continuation strategies
Building blocks - Combination of discrete topics
Pearl growing - Small relevant set expanded gradually
Successive fractions - Large relevant set refined gradually
Query Formulation:
Most standard information retrieval models use a
single source of information (e.g., the retrieval
corpus) for query formulation tasks such as term
and phrase weighting and query expansion
Query formulation – a process during which the
original keyword query issued by the user is
transformed into a structured query
representation that is consumed by the search
engine.
Query Formulation Processing:
The process of query formulation (also referred to
as query rewriting or query transformation)
modifies the original keyword query submitted by
the user to the search engine in order to better
represent the underlying intent of the query.
The formulated query is then used as an input to
the search engine’s ranking algorithm.
Thus, the primary goal of query formulation is to
improve the overall quality of the ranking
presented to the user in response to their query.
Query formulation is usually divided into two
main processing stages.
•The first processing stage, which is usually
referred to as query refinement, alters the
query on the morphological level (e.g.,
tokenization, spelling corrections,stemming,
etc.).
Query term processing:
Tokenization
•Cut character sequence into word tokens
–Deal with “John’s”, a state-of-the-art solution
Normalization
Map text and query term to same form
–You want U.S.A. and USA to match
Stemming
•We may wish different forms of a root to match
–authorize, authorization
Stop words
•We may omit very common words (or not)
–the, a, to, of
•After the query refinement stage is completed,
the second processing stage alters the query on
the structural level.
Such structural alterations may include, among
other actions, segmenting the query into atomic
concepts (i.e., combinations of terms), assigning
weights to these concepts, or expanding the
query with related weighted concepts.
Query expansion:
Known relevant documents contain terms that can be
used to describe a larger cluster of relevant documents.
Query expansion based on Thesauri, Lexical/statistic
analysis of text / context and concept formation and
Relevance feedback.
1.Global strategy: all documents in the collection used
to determine a global thesaurus-like structure which
defines term relationships. This is shown to the user
who selects terms for query expansion.
2.Local strategy: local set for a query are examined at
query time to determine terms for query expansion.
Query refinement
Encourage users dissatisfied with the top search results to
query again.
User types a search query. Result is search results plus
possible new search queries, which when clicked on issue
a new query whose results to replace the current results
seen by the user.
A mechanism that recommends query modifications to
reduce false positives
Incremental process of transforming a query into a new
query that more accurately reflects the user’s information
need.
Different forms of queries:
Query-By-Form is the simplest querying method,
but it is neither flexible nor expressive.
Query-By-Example A known approach in databases,
where users formulate queries as filling
Conceptual Queries
As many databases are modeled at the conceptual
level using EER, ORM or UML diagrams, one can
query these databases starting from their diagrams.
Users can select part of a given diagram, and their
selection is translated into SQL, ConQuer & Mquery
etc.
Natural Language Queries allow people to write
their queries as natural language sentences, and
then translate these sentences into a formal
language (e.g., SQL , XQuery ).
Visualize queries
Several Semantic Web approaches (Isparql,
RDFAuthor, GRQL, Nitelight) propose to formulate a
SPARQL query by visualizing its triple patterns as
ellipses connected with arrows, so that one would
need less technical skills to formulate a query.
Interactive Queries
Asking and answering method.
Enrich query formulation
Query formulation enriched by the following
aspects:
Previous & Similar, Structured input, Spell check,
Query previews, Finding Aids, Limit:(Time,
Geography, Language, Sources, Media etc )
Supporting query formulation:
Spelling correction of the query, Suggestions with
definitions to the query, Searching for
disambiguated concepts, Implicit information
about users used for search, Geo location and user
language.
Querying is an iterative process
Expand the original query with new terms.
Re-weight the terms in the expanded query.
Direct feedback from user – filtering.
Information derived from the set of documents initially retrieved
(local set).
Global information derived from the whole document collection.
Shields the user from the details of the query formulation process,
and permits the construction of useful search statements without
intimate knowledge of collection make-up and search
environment.
Breaks down the search operation into a sequence of small search
steps, designed to approach the wanted subject area gradually.
Provides a controlled query alteration process designed to
emphasize some terms (relevant ones) and to de-emphasize others
(non-relevant ones).
Levels of search activities according to Bates 1990
Move: Low-level search function
(e.g. type in search term, view retrieved document)
Tactic: several moves to further a search
(e.g. broaden/narrow a query)
Stratagem: set of actions on a single domain
(citation database, tables of contents of journals)
Strategy: complete plan for satisfying an
information need
(e.g. subject search, browse relevant journals, find
referenced articles
Examples for query formulation process:
Boolean Queries
The Boolean retrieval model is being able to ask a query that is a
Boolean expression.
Boolean Queries are queries using AND, OR and NOT to join query
terms. Many search systems you still use are Boolean:
•Email, library catalog, Mac OS X Spotlight
Westlaw
Largest commercial (paying subscribers) legal search service
(started 1975; ranking added 1992; new federated search added
2010)
Phrase queries
We want to be able to answer queries such as “stanford university” –
as a phrase.
Thus the sentence “I went to university at Stanford” is not a match.
The concept of phrase queries has proven easily understood by users;
one of the few “advanced search” ideas that works
Advantages of query formulation process:
Query easy to specify
The output is ranked based on the estimated relevance of
the documents to the query
A wide variety of theoretical models exist
Very precise queries can be specified
Very easy to implement (in the simple form)
Disadvantages of query formulation process:
Specifying the query may be difficult for casual users
Lack of control over the size of the retrieved set
Query less precise (although weighting can be used)
Summary:
Large relevant set refined gradually
Matching between the document and the query in
the abstracted space of the set of index words is
very imprecise.
Relational Databases and Query language
exemplify data retrieval due to semantic clarity
and precision.
Document querying can be transitioned to data
retrieval if only we could re-author all the docs so
as to provide formalnsemantics to them and
implement sufficiently powerful query language.
Thank you

More Related Content

What's hot

Web scale discovery service
Web scale discovery serviceWeb scale discovery service
Web scale discovery serviceKankana Baishya
 
Web Scale Discovery Vs Federated Search
Web Scale Discovery Vs Federated SearchWeb Scale Discovery Vs Federated Search
Web Scale Discovery Vs Federated SearchNikesh Narayanan
 
Information retrieval system!
Information retrieval system!Information retrieval system!
Information retrieval system!Jane Garay
 
Library consortia
Library consortia Library consortia
Library consortia Dheeraj Negi
 
Planning and implementation of library automation by Aman Kumar Kushwaha
Planning and implementation of library automation by Aman Kumar KushwahaPlanning and implementation of library automation by Aman Kumar Kushwaha
Planning and implementation of library automation by Aman Kumar KushwahaAMAN KUMAR KUSHWAHA
 
Library automation software
Library automation softwareLibrary automation software
Library automation softwareJancypriya M
 
Information retrieval (introduction)
Information  retrieval (introduction) Information  retrieval (introduction)
Information retrieval (introduction) Primya Tamil
 
Information searching & retrieving techniques khalid
Information searching & retrieving techniques khalidInformation searching & retrieving techniques khalid
Information searching & retrieving techniques khalidKhalid Mahmood
 
Artificial Intelligence role in Libraries
Artificial Intelligence role in Libraries Artificial Intelligence role in Libraries
Artificial Intelligence role in Libraries Muhammad Yousuf Ali
 
Post coordinate indexing .. Library and information science
Post coordinate indexing .. Library and information sciencePost coordinate indexing .. Library and information science
Post coordinate indexing .. Library and information scienceharshaec
 
Electronic resources management presentation 2021
Electronic resources management presentation 2021Electronic resources management presentation 2021
Electronic resources management presentation 2021chrisokiki69
 
Electronic Resource Management in the library
Electronic Resource Management in the libraryElectronic Resource Management in the library
Electronic Resource Management in the libraryDr. Nihar K. Patra
 
Information services
Information servicesInformation services
Information servicesJohan Koren
 
Technical services presentation
Technical services presentation Technical services presentation
Technical services presentation Ali Hassan Maken
 

What's hot (20)

Web scale discovery service
Web scale discovery serviceWeb scale discovery service
Web scale discovery service
 
Webometrics
WebometricsWebometrics
Webometrics
 
Web Scale Discovery Vs Federated Search
Web Scale Discovery Vs Federated SearchWeb Scale Discovery Vs Federated Search
Web Scale Discovery Vs Federated Search
 
Information retrieval system!
Information retrieval system!Information retrieval system!
Information retrieval system!
 
Library consortia
Library consortia Library consortia
Library consortia
 
Resource Sharing and Networking
Resource Sharing and NetworkingResource Sharing and Networking
Resource Sharing and Networking
 
collection development ppt
collection development pptcollection development ppt
collection development ppt
 
Planning and implementation of library automation by Aman Kumar Kushwaha
Planning and implementation of library automation by Aman Kumar KushwahaPlanning and implementation of library automation by Aman Kumar Kushwaha
Planning and implementation of library automation by Aman Kumar Kushwaha
 
Library automation software
Library automation softwareLibrary automation software
Library automation software
 
Information retrieval (introduction)
Information  retrieval (introduction) Information  retrieval (introduction)
Information retrieval (introduction)
 
Information searching & retrieving techniques khalid
Information searching & retrieving techniques khalidInformation searching & retrieving techniques khalid
Information searching & retrieving techniques khalid
 
Artificial Intelligence role in Libraries
Artificial Intelligence role in Libraries Artificial Intelligence role in Libraries
Artificial Intelligence role in Libraries
 
Post coordinate indexing .. Library and information science
Post coordinate indexing .. Library and information sciencePost coordinate indexing .. Library and information science
Post coordinate indexing .. Library and information science
 
Information Analysis Consolidation and Repackaging (IACR): an overview
Information Analysis Consolidation and Repackaging (IACR): an overviewInformation Analysis Consolidation and Repackaging (IACR): an overview
Information Analysis Consolidation and Repackaging (IACR): an overview
 
Electronic resources management presentation 2021
Electronic resources management presentation 2021Electronic resources management presentation 2021
Electronic resources management presentation 2021
 
POPSI
POPSIPOPSI
POPSI
 
Electronic Resource Management in the library
Electronic Resource Management in the libraryElectronic Resource Management in the library
Electronic Resource Management in the library
 
Knowledge organization system
Knowledge organization systemKnowledge organization system
Knowledge organization system
 
Information services
Information servicesInformation services
Information services
 
Technical services presentation
Technical services presentation Technical services presentation
Technical services presentation
 

Viewers also liked

Query formulation (chapter 1)
Query formulation (chapter 1)Query formulation (chapter 1)
Query formulation (chapter 1)Mohamed Rafique
 
Linked Open Data in the World of Patents
Linked Open Data in the World of Patents Linked Open Data in the World of Patents
Linked Open Data in the World of Patents Dr. Haxel Consult
 
Computational approaches to cell cycle analysis: Current research topics (tho...
Computational approaches to cell cycle analysis: Current research topics (tho...Computational approaches to cell cycle analysis: Current research topics (tho...
Computational approaches to cell cycle analysis: Current research topics (tho...Lars Juhl Jensen
 
Information retrieval: Creating a Search Engine
Information retrieval: Creating a Search EngineInformation retrieval: Creating a Search Engine
Information retrieval: Creating a Search EngineMahendra Kariya
 
RESLVE: Leveraging User Interest to Improve Entity Disambiguation on Short Text
RESLVE: Leveraging User Interest to Improve Entity Disambiguation on Short TextRESLVE: Leveraging User Interest to Improve Entity Disambiguation on Short Text
RESLVE: Leveraging User Interest to Improve Entity Disambiguation on Short TextElizabeth Murnane
 
Probablistic sampling group 3 assighnment
Probablistic sampling group 3 assighnmentProbablistic sampling group 3 assighnment
Probablistic sampling group 3 assighnmentShimelis Birhanu
 
Skew-symmetric matrix completion for rank aggregation
Skew-symmetric matrix completion for rank aggregationSkew-symmetric matrix completion for rank aggregation
Skew-symmetric matrix completion for rank aggregationDavid Gleich
 
Indexing languages (2)
Indexing languages (2)Indexing languages (2)
Indexing languages (2)yhen06
 
Social recommender system
Social recommender systemSocial recommender system
Social recommender systemKapil Kumar
 
Implicit Concept Testing & 3 Keys to Broader Adoption of Consumer Neuroscience
Implicit Concept Testing & 3 Keys to Broader Adoption of Consumer NeuroscienceImplicit Concept Testing & 3 Keys to Broader Adoption of Consumer Neuroscience
Implicit Concept Testing & 3 Keys to Broader Adoption of Consumer NeuroscienceSentient Decision Science
 
Shpend Stojkaj baza e të dhënave
Shpend Stojkaj baza e të dhënaveShpend Stojkaj baza e të dhënave
Shpend Stojkaj baza e të dhënaveShpend Stojkaj
 
Model of information retrieval (3)
Model  of information retrieval (3)Model  of information retrieval (3)
Model of information retrieval (3)9866825059
 
15 μαρτιου 2010 ημεριδα γγκ ομιλια π. καλαποθαρακου
15 μαρτιου 2010 ημεριδα γγκ  ομιλια π. καλαποθαρακου15 μαρτιου 2010 ημεριδα γγκ  ομιλια π. καλαποθαρακου
15 μαρτιου 2010 ημεριδα γγκ ομιλια π. καλαποθαρακουΕ.Κ.ΠΟΙ.ΖΩ.
 
Tutorial 1 (information retrieval basics)
Tutorial 1 (information retrieval basics)Tutorial 1 (information retrieval basics)
Tutorial 1 (information retrieval basics)Kira
 
Microsoft Access
Microsoft AccessMicrosoft Access
Microsoft AccessAjla Hasani
 
Tdm information retrieval
Tdm information retrievalTdm information retrieval
Tdm information retrievalKU Leuven
 
Bazat e Te Dhenave - ACCESS
Bazat e Te Dhenave  - ACCESSBazat e Te Dhenave  - ACCESS
Bazat e Te Dhenave - ACCESSAjla Hasani
 

Viewers also liked (20)

Query formulation (chapter 1)
Query formulation (chapter 1)Query formulation (chapter 1)
Query formulation (chapter 1)
 
Linked Open Data in the World of Patents
Linked Open Data in the World of Patents Linked Open Data in the World of Patents
Linked Open Data in the World of Patents
 
QUERY BD ACCESS
QUERY BD ACCESSQUERY BD ACCESS
QUERY BD ACCESS
 
Computational approaches to cell cycle analysis: Current research topics (tho...
Computational approaches to cell cycle analysis: Current research topics (tho...Computational approaches to cell cycle analysis: Current research topics (tho...
Computational approaches to cell cycle analysis: Current research topics (tho...
 
Information retrieval: Creating a Search Engine
Information retrieval: Creating a Search EngineInformation retrieval: Creating a Search Engine
Information retrieval: Creating a Search Engine
 
Operador logistico
Operador logisticoOperador logistico
Operador logistico
 
RESLVE: Leveraging User Interest to Improve Entity Disambiguation on Short Text
RESLVE: Leveraging User Interest to Improve Entity Disambiguation on Short TextRESLVE: Leveraging User Interest to Improve Entity Disambiguation on Short Text
RESLVE: Leveraging User Interest to Improve Entity Disambiguation on Short Text
 
Probablistic sampling group 3 assighnment
Probablistic sampling group 3 assighnmentProbablistic sampling group 3 assighnment
Probablistic sampling group 3 assighnment
 
Skew-symmetric matrix completion for rank aggregation
Skew-symmetric matrix completion for rank aggregationSkew-symmetric matrix completion for rank aggregation
Skew-symmetric matrix completion for rank aggregation
 
Indexing languages (2)
Indexing languages (2)Indexing languages (2)
Indexing languages (2)
 
Database
DatabaseDatabase
Database
 
Social recommender system
Social recommender systemSocial recommender system
Social recommender system
 
Implicit Concept Testing & 3 Keys to Broader Adoption of Consumer Neuroscience
Implicit Concept Testing & 3 Keys to Broader Adoption of Consumer NeuroscienceImplicit Concept Testing & 3 Keys to Broader Adoption of Consumer Neuroscience
Implicit Concept Testing & 3 Keys to Broader Adoption of Consumer Neuroscience
 
Shpend Stojkaj baza e të dhënave
Shpend Stojkaj baza e të dhënaveShpend Stojkaj baza e të dhënave
Shpend Stojkaj baza e të dhënave
 
Model of information retrieval (3)
Model  of information retrieval (3)Model  of information retrieval (3)
Model of information retrieval (3)
 
15 μαρτιου 2010 ημεριδα γγκ ομιλια π. καλαποθαρακου
15 μαρτιου 2010 ημεριδα γγκ  ομιλια π. καλαποθαρακου15 μαρτιου 2010 ημεριδα γγκ  ομιλια π. καλαποθαρακου
15 μαρτιου 2010 ημεριδα γγκ ομιλια π. καλαποθαρακου
 
Tutorial 1 (information retrieval basics)
Tutorial 1 (information retrieval basics)Tutorial 1 (information retrieval basics)
Tutorial 1 (information retrieval basics)
 
Microsoft Access
Microsoft AccessMicrosoft Access
Microsoft Access
 
Tdm information retrieval
Tdm information retrievalTdm information retrieval
Tdm information retrieval
 
Bazat e Te Dhenave - ACCESS
Bazat e Te Dhenave  - ACCESSBazat e Te Dhenave  - ACCESS
Bazat e Te Dhenave - ACCESS
 

Similar to Query formulation process

Improving search result via search keywords and data classification similarity
Improving search result via search keywords and data classification similarityImproving search result via search keywords and data classification similarity
Improving search result via search keywords and data classification similarityConference Papers
 
Performance Evaluation of Query Processing Techniques in Information Retrieval
Performance Evaluation of Query Processing Techniques in Information RetrievalPerformance Evaluation of Query Processing Techniques in Information Retrieval
Performance Evaluation of Query Processing Techniques in Information Retrievalidescitation
 
Inverted files for text search engines
Inverted files for text search enginesInverted files for text search engines
Inverted files for text search enginesunyil96
 
System and design chapter-2
System and design chapter-2System and design chapter-2
System and design chapter-2Best Rahim
 
Ontology Based Approach for Semantic Information Retrieval System
Ontology Based Approach for Semantic Information Retrieval SystemOntology Based Approach for Semantic Information Retrieval System
Ontology Based Approach for Semantic Information Retrieval SystemIJTET Journal
 
Information_Retrieval_Models_Nfaoui_El_Habib
Information_Retrieval_Models_Nfaoui_El_HabibInformation_Retrieval_Models_Nfaoui_El_Habib
Information_Retrieval_Models_Nfaoui_El_HabibEl Habib NFAOUI
 
Classification-based Retrieval Methods to Enhance Information Discovery on th...
Classification-based Retrieval Methods to Enhance Information Discovery on th...Classification-based Retrieval Methods to Enhance Information Discovery on th...
Classification-based Retrieval Methods to Enhance Information Discovery on th...IJMIT JOURNAL
 
Open domain question answering system using semantic role labeling
Open domain question answering system using semantic role labelingOpen domain question answering system using semantic role labeling
Open domain question answering system using semantic role labelingeSAT Publishing House
 
A Survey on Automatically Mining Facets for Queries from their Search Results
A Survey on Automatically Mining Facets for Queries from their Search ResultsA Survey on Automatically Mining Facets for Queries from their Search Results
A Survey on Automatically Mining Facets for Queries from their Search ResultsIRJET Journal
 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...ijceronline
 
Open domain Question Answering System - Research project in NLP
Open domain  Question Answering System - Research project in NLPOpen domain  Question Answering System - Research project in NLP
Open domain Question Answering System - Research project in NLPGVS Chaitanya
 
Query Recommendation by using Collaborative Filtering Approach
Query Recommendation by using Collaborative Filtering ApproachQuery Recommendation by using Collaborative Filtering Approach
Query Recommendation by using Collaborative Filtering ApproachIRJET Journal
 
professional fuzzy type-ahead rummage around in xml type-ahead search techni...
professional fuzzy type-ahead rummage around in xml  type-ahead search techni...professional fuzzy type-ahead rummage around in xml  type-ahead search techni...
professional fuzzy type-ahead rummage around in xml type-ahead search techni...Kumar Goud
 
SAD _ Fact Finding Techniques.pptx
SAD _ Fact Finding Techniques.pptxSAD _ Fact Finding Techniques.pptx
SAD _ Fact Finding Techniques.pptxSharmilaMore5
 
Machine Learned Relevance at A Large Scale Search Engine
Machine Learned Relevance at A Large Scale Search EngineMachine Learned Relevance at A Large Scale Search Engine
Machine Learned Relevance at A Large Scale Search EngineSalford Systems
 
Architecture of an ontology based domain-specific natural language question a...
Architecture of an ontology based domain-specific natural language question a...Architecture of an ontology based domain-specific natural language question a...
Architecture of an ontology based domain-specific natural language question a...IJwest
 
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI) International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI) inventionjournals
 
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)inventionjournals
 

Similar to Query formulation process (20)

Improving search result via search keywords and data classification similarity
Improving search result via search keywords and data classification similarityImproving search result via search keywords and data classification similarity
Improving search result via search keywords and data classification similarity
 
Performance Evaluation of Query Processing Techniques in Information Retrieval
Performance Evaluation of Query Processing Techniques in Information RetrievalPerformance Evaluation of Query Processing Techniques in Information Retrieval
Performance Evaluation of Query Processing Techniques in Information Retrieval
 
Inverted files for text search engines
Inverted files for text search enginesInverted files for text search engines
Inverted files for text search engines
 
System and design chapter-2
System and design chapter-2System and design chapter-2
System and design chapter-2
 
Ontology Based Approach for Semantic Information Retrieval System
Ontology Based Approach for Semantic Information Retrieval SystemOntology Based Approach for Semantic Information Retrieval System
Ontology Based Approach for Semantic Information Retrieval System
 
Information_Retrieval_Models_Nfaoui_El_Habib
Information_Retrieval_Models_Nfaoui_El_HabibInformation_Retrieval_Models_Nfaoui_El_Habib
Information_Retrieval_Models_Nfaoui_El_Habib
 
Classification-based Retrieval Methods to Enhance Information Discovery on th...
Classification-based Retrieval Methods to Enhance Information Discovery on th...Classification-based Retrieval Methods to Enhance Information Discovery on th...
Classification-based Retrieval Methods to Enhance Information Discovery on th...
 
Open domain question answering system using semantic role labeling
Open domain question answering system using semantic role labelingOpen domain question answering system using semantic role labeling
Open domain question answering system using semantic role labeling
 
A Survey on Automatically Mining Facets for Queries from their Search Results
A Survey on Automatically Mining Facets for Queries from their Search ResultsA Survey on Automatically Mining Facets for Queries from their Search Results
A Survey on Automatically Mining Facets for Queries from their Search Results
 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
 
Open domain Question Answering System - Research project in NLP
Open domain  Question Answering System - Research project in NLPOpen domain  Question Answering System - Research project in NLP
Open domain Question Answering System - Research project in NLP
 
Query Recommendation by using Collaborative Filtering Approach
Query Recommendation by using Collaborative Filtering ApproachQuery Recommendation by using Collaborative Filtering Approach
Query Recommendation by using Collaborative Filtering Approach
 
professional fuzzy type-ahead rummage around in xml type-ahead search techni...
professional fuzzy type-ahead rummage around in xml  type-ahead search techni...professional fuzzy type-ahead rummage around in xml  type-ahead search techni...
professional fuzzy type-ahead rummage around in xml type-ahead search techni...
 
SAD _ Fact Finding Techniques.pptx
SAD _ Fact Finding Techniques.pptxSAD _ Fact Finding Techniques.pptx
SAD _ Fact Finding Techniques.pptx
 
Machine Learned Relevance at A Large Scale Search Engine
Machine Learned Relevance at A Large Scale Search EngineMachine Learned Relevance at A Large Scale Search Engine
Machine Learned Relevance at A Large Scale Search Engine
 
Lec1
Lec1Lec1
Lec1
 
Lec1,2
Lec1,2Lec1,2
Lec1,2
 
Architecture of an ontology based domain-specific natural language question a...
Architecture of an ontology based domain-specific natural language question a...Architecture of an ontology based domain-specific natural language question a...
Architecture of an ontology based domain-specific natural language question a...
 
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI) International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)
 
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)
 

Recently uploaded

Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 

Recently uploaded (20)

Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 

Query formulation process

  • 1. Query Formulation Process Presented by M. Malathi, MLISc II nd Year, Department of Library and Information Science, Central university of Pondicherry.
  • 2. Query Formulation Process Definition of Query: Query is defined as any question, especially one expressing doubt or requesting information or to check its validity or accuracy of information.
  • 3. Query formulation and Information and information retrieval: “Information retrieval embraces the intellectual aspects of the description of information and its specification for search, and also whatever systems, techniques, or machines are employed to carry out the operation.” Information Retrieval (IR) is finding material (usually documents) of an unstructured nature (usually text) that satisfies an information need from within large collections (usually stored on computers).
  • 4. Information Retrieval is a research-driven theoretical and experimental discipline. The focus is on different aspects of the information–seeking process, depending on the researcher’s background or interest.
  • 5. The Standard Retrieval Interaction Model Query is one of the components in the IR cycle. The information search process has the following steps such as Enrich query formulation, Expand result management, Enable long-term effort, Enhance collaboration.
  • 6. Information Searching Strategies Starting strategies Select-Break complex query into topics and deal with each topic separately Exhaust-Include most elements of the query in the initial query formulation Continuation strategies Building blocks - Combination of discrete topics Pearl growing - Small relevant set expanded gradually Successive fractions - Large relevant set refined gradually
  • 7. Query Formulation: Most standard information retrieval models use a single source of information (e.g., the retrieval corpus) for query formulation tasks such as term and phrase weighting and query expansion Query formulation – a process during which the original keyword query issued by the user is transformed into a structured query representation that is consumed by the search engine.
  • 8. Query Formulation Processing: The process of query formulation (also referred to as query rewriting or query transformation) modifies the original keyword query submitted by the user to the search engine in order to better represent the underlying intent of the query. The formulated query is then used as an input to the search engine’s ranking algorithm. Thus, the primary goal of query formulation is to improve the overall quality of the ranking presented to the user in response to their query.
  • 9. Query formulation is usually divided into two main processing stages. •The first processing stage, which is usually referred to as query refinement, alters the query on the morphological level (e.g., tokenization, spelling corrections,stemming, etc.).
  • 10. Query term processing: Tokenization •Cut character sequence into word tokens –Deal with “John’s”, a state-of-the-art solution Normalization Map text and query term to same form –You want U.S.A. and USA to match Stemming •We may wish different forms of a root to match –authorize, authorization Stop words •We may omit very common words (or not) –the, a, to, of
  • 11. •After the query refinement stage is completed, the second processing stage alters the query on the structural level. Such structural alterations may include, among other actions, segmenting the query into atomic concepts (i.e., combinations of terms), assigning weights to these concepts, or expanding the query with related weighted concepts.
  • 12. Query expansion: Known relevant documents contain terms that can be used to describe a larger cluster of relevant documents. Query expansion based on Thesauri, Lexical/statistic analysis of text / context and concept formation and Relevance feedback. 1.Global strategy: all documents in the collection used to determine a global thesaurus-like structure which defines term relationships. This is shown to the user who selects terms for query expansion. 2.Local strategy: local set for a query are examined at query time to determine terms for query expansion.
  • 13. Query refinement Encourage users dissatisfied with the top search results to query again. User types a search query. Result is search results plus possible new search queries, which when clicked on issue a new query whose results to replace the current results seen by the user. A mechanism that recommends query modifications to reduce false positives Incremental process of transforming a query into a new query that more accurately reflects the user’s information need.
  • 14. Different forms of queries: Query-By-Form is the simplest querying method, but it is neither flexible nor expressive. Query-By-Example A known approach in databases, where users formulate queries as filling Conceptual Queries As many databases are modeled at the conceptual level using EER, ORM or UML diagrams, one can query these databases starting from their diagrams. Users can select part of a given diagram, and their selection is translated into SQL, ConQuer & Mquery etc.
  • 15. Natural Language Queries allow people to write their queries as natural language sentences, and then translate these sentences into a formal language (e.g., SQL , XQuery ). Visualize queries Several Semantic Web approaches (Isparql, RDFAuthor, GRQL, Nitelight) propose to formulate a SPARQL query by visualizing its triple patterns as ellipses connected with arrows, so that one would need less technical skills to formulate a query. Interactive Queries Asking and answering method.
  • 16. Enrich query formulation Query formulation enriched by the following aspects: Previous & Similar, Structured input, Spell check, Query previews, Finding Aids, Limit:(Time, Geography, Language, Sources, Media etc ) Supporting query formulation: Spelling correction of the query, Suggestions with definitions to the query, Searching for disambiguated concepts, Implicit information about users used for search, Geo location and user language.
  • 17. Querying is an iterative process Expand the original query with new terms. Re-weight the terms in the expanded query. Direct feedback from user – filtering. Information derived from the set of documents initially retrieved (local set). Global information derived from the whole document collection. Shields the user from the details of the query formulation process, and permits the construction of useful search statements without intimate knowledge of collection make-up and search environment. Breaks down the search operation into a sequence of small search steps, designed to approach the wanted subject area gradually. Provides a controlled query alteration process designed to emphasize some terms (relevant ones) and to de-emphasize others (non-relevant ones).
  • 18. Levels of search activities according to Bates 1990 Move: Low-level search function (e.g. type in search term, view retrieved document) Tactic: several moves to further a search (e.g. broaden/narrow a query) Stratagem: set of actions on a single domain (citation database, tables of contents of journals) Strategy: complete plan for satisfying an information need (e.g. subject search, browse relevant journals, find referenced articles
  • 19. Examples for query formulation process: Boolean Queries The Boolean retrieval model is being able to ask a query that is a Boolean expression. Boolean Queries are queries using AND, OR and NOT to join query terms. Many search systems you still use are Boolean: •Email, library catalog, Mac OS X Spotlight Westlaw Largest commercial (paying subscribers) legal search service (started 1975; ranking added 1992; new federated search added 2010) Phrase queries We want to be able to answer queries such as “stanford university” – as a phrase. Thus the sentence “I went to university at Stanford” is not a match. The concept of phrase queries has proven easily understood by users; one of the few “advanced search” ideas that works
  • 20. Advantages of query formulation process: Query easy to specify The output is ranked based on the estimated relevance of the documents to the query A wide variety of theoretical models exist Very precise queries can be specified Very easy to implement (in the simple form) Disadvantages of query formulation process: Specifying the query may be difficult for casual users Lack of control over the size of the retrieved set Query less precise (although weighting can be used)
  • 21. Summary: Large relevant set refined gradually Matching between the document and the query in the abstracted space of the set of index words is very imprecise. Relational Databases and Query language exemplify data retrieval due to semantic clarity and precision. Document querying can be transitioned to data retrieval if only we could re-author all the docs so as to provide formalnsemantics to them and implement sufficiently powerful query language.