Despite the attention Semantic Search is continuously gaining, several challenges affecting tool performance and user experience remain unsolved. Among these are matching user terms with the search space, adopting view-based interfaces in the Open Web, and supporting users while they build their queries. This paper proposes an approach that moves a step towards tackling these challenges by creating models of usage of Linked Data concepts and properties extracted from semantic query logs as a source of collaborative knowledge. We use two sets of query logs from the USEWOD workshops to create our models and show the potential of using them in the mentioned areas.
1. Improving Semantic Search Using Query Log Analysis
Khadija Elbedweihy, Stuart N. Wrigley and Fabio Ciravegna
OAK Research Group,
Department of Computer Science,
University of Sheffield, UK
2. Outline
• Introduction
• Semantic Query Logs Analysis
- Query-Concepts Model
- Concept-Predicates Model
- Instance-Types Model
• Results Augmentation
• Data Visualisation
4. Motivation
• Little work on results returned (answers) and
presentation style.
– Users want direct answers augmented with more
information for a richer experience [1]
– Users want a more user-friendly and attractive results
presentation format [1]
• Semantic query logs: logs of queries issued to repositories
containing RDF data.
[1] See our paper from this morning’s IWEST 2012 workshop
5. Related Work
Semantic query logs analysis:
• Moller et al. identified patterns of Linked Data usage with
respect to different types of agents.
• Arias et al. analysed the structure of the SPARQL queries
to identify most frequent language elements.
• Luczak-Rösch et al. analysed query logs to detect errors
and weaknesses in LD ontologies and support their
maintenance.
6. Related Work (cont’d)
How our work is different:
Analyze semantic query logs to produce models capturing
different patterns of information needs on Linked Data:
Concepts used together in a query: query-concepts model
Predicate used with a concept: concept-predicates model
Concepts used as types of a LD entity: instance-types model
The models make use of the “collaborative knowledge”
inherent in the logs to enhance the search process.
9. Analysis
SELECT DISTINCT ?genre, ?instrument WHERE
{
<…dbpedia.org…/Ringo_Starr> ?rel <…dbpedia.org/…/The_Beatles>.
<…dbpedia.org…/Ringo_Starr> dbpedia:genre ?genre.
<…dbpedia.org…/Ringo_Starr> dbpedia:instrument ?instrument.
}
• For each bound resource (subject or object), query the
endpoint for the type of the resource:
http://dbpedia.org/resource/Ringo_Starr type http://dbpedia.org/ontology/MusicalArtist
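The analysis step above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the triple patterns have already been parsed out of the logged query into (subject, predicate, object) tuples, uses the short names Ringo_Starr and The_Beatles as stand-ins for the elided URIs, and the helper names `is_variable`, `bound_resources` and `type_query` are hypothetical.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def is_variable(term: str) -> bool:
    """SPARQL variables start with '?' or '$'."""
    return term.startswith("?") or term.startswith("$")


def bound_resources(patterns: List[Triple]) -> List[str]:
    """Collect bound subjects and objects in first-seen order, without duplicates.

    Assumption: objects are resources; literal objects would need filtering.
    """
    seen: List[str] = []
    for subject, _predicate, obj in patterns:
        for term in (subject, obj):
            if not is_variable(term) and term not in seen:
                seen.append(term)
    return seen


def type_query(uri: str) -> str:
    """Build the type-lookup query sent to the endpoint for one resource."""
    return f"SELECT DISTINCT ?type WHERE {{ <{uri}> a ?type }}"


# Triple patterns of the example query (short names stand in for elided URIs).
patterns = [
    ("Ringo_Starr", "?rel", "The_Beatles"),
    ("Ringo_Starr", "dbpedia:genre", "?genre"),
    ("Ringo_Starr", "dbpedia:instrument", "?instrument"),
]
resources = bound_resources(patterns)
lookup = type_query("http://dbpedia.org/resource/Ringo_Starr")
```

Each resource in `resources` would then get one type lookup; the answers feed the three models described on the following slides.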
10. Query-Concepts Model
SELECT DISTINCT ?genre, ?instrument WHERE
{ <…dbpedia.org…/Ringo_Starr> ?rel <…dbpedia.org/…/The_Beatles>.
<…dbpedia.org…/Ringo_Starr> dbpedia:instrument ?instrument. }
1) Retrieve types of resources in the query:
Ringo_Starr type dbpedia-owl:MusicalArtist, umbel:MusicalPerformer
The_Beatles type dbpedia-owl:Band, schema:MusicGroup
2) Increment the co-occurrence of each concept in the first list
with each concept in the second:
(MusicalArtist, Band) (MusicalPerformer, MusicGroup)
(MusicalArtist, MusicGroup) (MusicalPerformer, Band)
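The two steps above might be sketched like this. It is an illustration under assumptions: the type lists are taken as already retrieved, the model is a plain counter keyed by an unordered concept pair (the slides do not say whether order matters), and the names `update_query_concepts` and `cooccurrence` are hypothetical.

```python
from collections import Counter
from itertools import combinations, product
from typing import List, Tuple


def pair_key(a: str, b: str) -> Tuple[str, str]:
    """Order-independent key (assumption: co-occurrence is symmetric)."""
    return tuple(sorted((a, b)))


def update_query_concepts(model: Counter, types_per_resource: List[List[str]]) -> None:
    """Increment co-occurrence of every concept of one resource with every
    concept of each other resource appearing in the same query."""
    for first, second in combinations(types_per_resource, 2):
        for a, b in product(first, second):
            model[pair_key(a, b)] += 1


def cooccurrence(model: Counter, a: str, b: str) -> int:
    """Look up how often two concepts were queried together."""
    return model[pair_key(a, b)]


query_concepts: Counter = Counter()
update_query_concepts(
    query_concepts,
    [
        ["MusicalArtist", "MusicalPerformer"],  # types of Ringo_Starr
        ["Band", "MusicGroup"],                 # types of The_Beatles
    ],
)
```

Running the update once per logged query accumulates the counts; after the single example query, each of the four pairs listed above has a count of 1.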
11. Concept-Predicates Model
SELECT DISTINCT ?genre, ?instrument WHERE
{ <…dbpedia.org…/Ringo_Starr> ?rel <…dbpedia.org/…/The_Beatles>.
<…dbpedia.org…/Ringo_Starr> dbpedia:genre ?genre.
<…dbpedia.org…/Ringo_Starr> dbpedia:instrument ?instrument. }
1) Retrieve types of resources used as subjects in the query:
Ringo_Starr type dbpedia-owl:MusicalArtist, umbel:MusicalPerformer
2) Identify bound predicates (dbpedia:genre, dbpedia:instrument)
3) Increment the co-occurrence of each type with the predicate used in
the same triple pattern:
(MusicalPerformer, genre) (MusicalPerformer, instrument)
(MusicalArtist, genre) (MusicalArtist, instrument)
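A possible sketch of the three steps above, assuming parsed triple patterns and an already-retrieved mapping from subject to its types. The function name `update_concept_predicates` and the use of short names in place of the elided URIs are assumptions for illustration only.

```python
from collections import Counter
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def is_variable(term: str) -> bool:
    return term.startswith("?") or term.startswith("$")


def update_concept_predicates(
    model: Counter, patterns: List[Triple], types_of: Dict[str, List[str]]
) -> None:
    """For each triple pattern with a bound subject and a bound predicate,
    increment the count of (subject type, predicate) for every subject type."""
    for subject, predicate, _obj in patterns:
        if is_variable(subject) or is_variable(predicate):
            continue  # e.g. the ?rel pattern contributes nothing
        for concept in types_of.get(subject, []):
            model[(concept, predicate)] += 1


concept_predicates: Counter = Counter()
update_concept_predicates(
    concept_predicates,
    [
        ("Ringo_Starr", "?rel", "The_Beatles"),
        ("Ringo_Starr", "genre", "?genre"),
        ("Ringo_Starr", "instrument", "?instrument"),
    ],
    {"Ringo_Starr": ["MusicalArtist", "MusicalPerformer"]},
)
```

The pattern with the unbound predicate `?rel` is skipped, so only the four (type, predicate) pairs shown above are counted.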
12. Instance-Types Model
SELECT DISTINCT ?genre, ?instrument WHERE
{ <…dbpedia.org…/Ringo_Starr> ?rel <…dbpedia.org/…/The_Beatles>.
<…dbpedia.org…/Ringo_Starr> dbpedia:instrument ?instrument. }
1) Retrieve types of resources in the query:
Ringo_Starr type dbpedia-owl:MusicalArtist, umbel:MusicalPerformer
The_Beatles type dbpedia-owl:Band, schema:MusicGroup
2) Increment the co-occurrence of concepts found as types for the
same instance:
(MusicalArtist, MusicalPerformer)
(Band, MusicGroup)
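The instance-types update above could look roughly like this. It is a sketch under the assumption that the model counts unordered pairs of concepts found as types of the same instance; `update_instance_types` is a hypothetical name.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List


def update_instance_types(model: Counter, types_of: Dict[str, List[str]]) -> None:
    """Increment the co-occurrence of every pair of concepts that are
    types of the same instance. Pairs are stored in sorted order so that
    (A, B) and (B, A) share one counter (assumption)."""
    for types in types_of.values():
        for a, b in combinations(sorted(set(types)), 2):
            model[(a, b)] += 1


instance_types: Counter = Counter()
update_instance_types(
    instance_types,
    {
        "Ringo_Starr": ["MusicalArtist", "MusicalPerformer"],
        "The_Beatles": ["Band", "MusicGroup"],
    },
)
```

Unlike the query-concepts model, types of different instances are never paired here, so (MusicalArtist, Band) is not counted.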
14. Dataset
• Two sets of DBpedia query logs made available at the
USEWOD2011 and USEWOD2012 workshops.
• The logs contained around 5 million queries issued to
DBpedia over a time period spanning almost 2 years
                                         USEWOD2012   USEWOD2011
Number of analyzed queries                  8866028      4951803
Number of unique triple patterns            4095011      2641098
Number of unique bound triple patterns      3619216      2571662
15. Results Enhancement
• Google, Yahoo!, Bing, etc. enhance search
results using structured data
• FalconS and VisiNav return extra information together
with each entity in the answers (e.g. type, label)
• Evaluation of Semantic Search showed that augmenting
answers with extra information provides a richer user
experience [2].
[2] See our paper from this morning’s IWEST 2012 workshop
17. Motivation for proposed approach
• Utilizing query logs as a source of collaborative knowledge
able to capture implicit associations between Linked Data
entities and properties.
• Use this to select which information to show the user.
• Two recent studies [3] analyzed semantic query logs and
observed that a class of entities is usually queried with
similar relations and concepts.
[3] Luczak-Rösch et al.; Elbedweihy et al.
18. Two Related Types of Result Augmentation
1. Additional result-related information.
– More details about each result item
– Provides better understanding of the answer.
2. Additional query-related information.
– More results related to the query entities
– Assists users in discovering useful findings
(serendipity)
19. Return additional result-related information
Steps
1) For each result item, find the types of the instance.
2) Extract the predicates most frequently queried with those
types from the concept-predicates model.
3) Generate a query for each (instance, predicate) pair,
e.g. (<…dbpedia.org…/Ringo_Starr>, genre).
4) Show aggregated results to the user.
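The steps above can be sketched as below. This is an illustration only: the counts in the toy model are invented for the example, summing counts across an instance's types is an assumed ranking rule, the function names are hypothetical, and the generated SPARQL leaves out the PREFIX declarations that would be needed before sending it to an endpoint.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_predicates(model: Counter, types: Iterable[str], k: int) -> List[str]:
    """Rank predicates by how often they were queried with any of the
    given types (counts summed across types: an assumption)."""
    wanted = set(types)
    totals: Counter = Counter()
    for (concept, predicate), count in model.items():
        if concept in wanted:
            totals[predicate] += count
    return [predicate for predicate, _ in totals.most_common(k)]


def augmentation_queries(instance_uri: str, predicates: List[str]) -> List[Tuple[str, str]]:
    """One query per (instance, predicate) pair; PREFIX lines omitted."""
    return [
        (p, f"SELECT ?value WHERE {{ <{instance_uri}> {p} ?value }}")
        for p in predicates
    ]


# Toy concept-predicates model; the counts are hypothetical.
model = Counter({
    ("MusicalArtist", "dbpedia:genre"): 5,
    ("MusicalArtist", "dbpedia:instrument"): 3,
    ("MusicalArtist", "dbpedia:birthDate"): 1,
    ("Band", "dbpedia:genre"): 9,
})
best = top_predicates(model, ["MusicalArtist"], k=2)
queries = augmentation_queries("http://dbpedia.org/resource/Ringo_Starr", best)
```

The answers to these queries are what would be aggregated and shown next to each result item.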
20. Return additional result-related information
• MusicalArtist-> genre, associatedBand, occupation, instrument,
birthDate, birthPlace, hometown, prop:yearsActive, foaf:surname,
prop:associatedActs, …
Query: “Who played drums for the Beatles?”
Result: Ringo Starr
Pop music, Rock music (genre)
Keyboard, Drum, Acoustic guitar (instrument)
The Beatles, Plastic Ono Band, Rory Storm (assoc. Band)
21. Return additional query-related information
Steps
1) Extract all concepts from query.
2) For any instances, find their types.
3) For each query concept, find most frequently occurring
concepts from the query-concepts model.
4) For each related concept, query for instances that have
relation with the originating instance.
5) Show aggregated results to the user.
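Steps 3 and 4 above might be sketched as follows. The model contents are invented for the example, pairs are assumed to be stored unordered, the concept URI is a placeholder, and matching relations in either direction between the related instance and the originating instance is an assumption the slides do not spell out.

```python
from collections import Counter
from typing import List


def related_concepts(model: Counter, concept: str, k: int) -> List[str]:
    """Concepts most frequently queried together with `concept`."""
    totals: Counter = Counter()
    for (a, b), count in model.items():
        if a == concept:
            totals[b] += count
        elif b == concept:
            totals[a] += count
    return [c for c, _ in totals.most_common(k)]


def related_instances_query(instance_uri: str, concept_uri: str) -> str:
    """Query for instances of a related concept that are linked to the
    originating instance in either direction (assumption)."""
    return (
        "SELECT DISTINCT ?related WHERE {\n"
        f"  ?related a <{concept_uri}> .\n"
        f"  {{ ?related ?p <{instance_uri}> }} UNION {{ <{instance_uri}> ?p ?related }}\n"
        "}"
    )


# Toy query-concepts model; the counts are hypothetical.
model = Counter({
    ("City", "Person"): 7,
    ("Book", "City"): 9,
    ("City", "Country"): 4,
    ("Band", "MusicalArtist"): 12,
})
top = related_concepts(model, "City", k=2)
query = related_instances_query(
    "http://example.org/resource/Sheffield",  # placeholder instance URI
    "http://example.org/ontology/Person",     # placeholder concept URI
)
```

One such query would be issued per related concept, and the combined answers shown alongside the direct result.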
22. Return additional query-related information
• City-> Book, Person, Country, Organisation, SportsTeam, MusicGroup,
Film, RadioStation, River, University, SoccerPlayer, Hospital, ...
Query: “Where is the University of Sheffield located?”
Result: Sheffield, UK
Nick Clegg, Clive Betts, David Blunkett (Person)
Sheffield United F.C., Sheffield Wednesday (SportsTeam)
Hallam FM, Real Radio, BBC Radio Sheffield (RadioStation)
Jessop Hosp., Northern General, Royal Hallamshire (Hospital)
Uni. of Sheffield, Sheffield Hallam Uni. (University)
24. Data Visualization
• View-based interfaces (e.g. Semantic Crystal and Smeagol)
support users in query formulation by showing the
underlying data and connections.
• Helpful for users, especially those unfamiliar with the
search domain.
• Try to bridge the gap between user terms and tool terms
(habitability problem)
• Face the challenge of visualizing large datasets without
cluttering the view and degrading the user experience.
25. Data Visualization: Proposed approach
• Visualizing large datasets (especially heterogeneous ones)
is a challenge.
• To overcome this, we need to select and visualize specific
parts of the data.
• Exploit collaborative knowledge in query logs to derive
selection of concepts and predicates added to user’s
subgraph of interest.
26. Data Visualization: Proposed approach
Steps
1) User enters NL query
2) Return best-attempt results
3) Identify query instances and find their types
4) For each type:
• Extract most queried predicates associated with it from
concept-predicates model.
• Extract most queried concepts associated with it from
query-concepts model.
5) Add these to the user’s query graph (see next slide)
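Steps 3 to 5 above can be illustrated with a small sketch. It assumes the per-type rankings have already been extracted from the two models; the rankings shown are invented, and the edge representation, the "relatedConcept" label and the function name `extend_query_graph` are hypothetical choices, not part of the proposed tool.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, str]  # (source node, edge label, target node)


def extend_query_graph(
    instance: str,
    types: List[str],
    predicates_for: Dict[str, List[str]],
    concepts_for: Dict[str, List[str]],
    k: int = 3,
) -> List[Edge]:
    """Build the edges to add to the user's query graph: the instance's
    types, plus the top-k predicates and top-k concepts for each type."""
    edges: List[Edge] = []
    for concept in types:
        edges.append((instance, "type", concept))
        for predicate in predicates_for.get(concept, [])[:k]:
            edges.append((concept, predicate, "?"))  # open slot for the user
        for related in concepts_for.get(concept, [])[:k]:
            edges.append((concept, "relatedConcept", related))
    return edges


# Hypothetical rankings for the "capital of Egypt" example.
edges = extend_query_graph(
    "Egypt",
    ["Country"],
    predicates_for={"Country": ["capital", "population", "currency", "language"]},
    concepts_for={"Country": ["City", "River"]},
    k=2,
)
```

Capping each list at `k` is what keeps the view from being cluttered: only the most queried predicates and concepts are drawn around the query instance.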
27. Example
Query: “What is the capital of Egypt?”
Answer: Cairo (best-attempt results)
Result-related information:
➔ latitude: 30.058056
➔ longitude: 31.228889
➔ depiction: [image]
➔ population: 6758581
➔ area: 453000000
➔ time zone: Eastern European Time
➔ subdivision: Governorates of Egypt
➔ page: http://www.cairo.gov.eg/default.aspx
➔ nickname: The City of a Thousand Minarets, Capital of the
Arab World
28. Example
Query: “What is the capital of Egypt?”
Answer: Cairo
Query-related information:
➔ Cairo Uni., Ain Shams Uni., German Uni., British Uni. (University)
➔ Ittihad El Shorta, El Shams Club, AlNasr Egypt (SportsTeam)
➔ Orascom Telecom, HSBC Bank, EgyptAir, Olympic Grp (Organisation)
➔ Nile River (River)
➔ Al Azhar Park (Park)
➔ Hani Shaker, Sherine, Umm Kulthum, Amr Diab (MusicalArtist)
➔ Nile TV, AL Nile, Al-Baghdadia TV (BroadCaster)
➔ Egyptian Museum, Museum of Islamic Art (Museum)
29. Data Visualization: Proposed approach
Step 5: Add concepts and predicates to user’s query graph
[Figure: query graph showing the query instance together with the most queried predicates with “Country” and the most queried concepts with “Country”]