Peter Mika | Yahoo! Research, Spain
pmika@yahoo-inc.com
Thanh Tran | Semsolute, Germany
Tran@semsolute.com
Semantic Search on the Rise
About the speakers
 Peter Mika
 Senior Research Scientist
 Head of Semantic Search group at
Yahoo! Labs
 Expertise: Semantic Search, Web
Object Retrieval, Natural Language
Processing
 Tran Duc Thanh
 CEO of Semsolute, Semantic Search
Technologies Company
 Served as Assistant Professor for
Karlsruhe Institute of Technology and
Stanford University
 Expertise: Semantic Search,
Semantic / Linked Data Management
Agenda
 Why Semantic Search
 What is Semantic Search
 Innovative Semantic Search Applications
 Behind the Scenes
 Questions
Why Semantic Search?
Why Semantic Search? I.
 “We are at the beginning of search.” (Marissa Mayer)
 Solved large classes of queries, e.g. navigational
 Remaining queries are hard, not solvable by brute
force, require deep understanding of the world and
human cognition, e.g.
 Ambiguous searches: paris hilton
 Imprecise or overly precise searches
 Searches for descriptions: 34 year old computer scientist
living in barcelona
 Background knowledge and metadata can help to
address poorly solved queries
Many of these queries would not be asked by users, who have learned over time what search technology can and cannot do.
Why Semantic Search? II.
 The Semantic Web is now a reality
 Large amounts of data published in RDF
 Linked Data
 Metadata in HTML
 Facebook's Open Graph Protocol
 Schema.org
 Casual users
 Don't know SPARQL
 Unaware of the schema of the data
 Searching data instead of, or in addition to, searching
documents
 Enable innovative search applications / tasks
What is Semantic Search?
Semantic Search: Using Semantic Models for Search
 Semantic search is a retrieval paradigm that
 Exploits the semantics of the data or explicit background
knowledge to understand user intent and the meaning of
content
 Incorporates the intent of the query and the meaning of
content into the search process (semantic models)
Semantic Search: Different Kinds / Different Uses of Semantic Models
 Wide range of semantic search systems
 Employ different semantic models, possibly at
different steps of the search process and in order to
support different tasks
 Query formulation
 Query processing / understanding
 Ranking
 Result presentation
 Result / query refinement
Semantic models
 Semantics is concerned with the meaning of the
resources made available for search
 Various representations of meaning
 Word-level models: models of relationships among
words
 Taxonomies, thesauri, dictionaries of entity names
 Inference along linguistic relations, e.g. broader/narrower
terms
 Concept-level models: models of relationships
among objects
 Ontologies capture entities in the world and their
relationships
 Inference along domain-specific relations
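As a rough illustration of word-level inference along narrower-term links, the sketch below expands a query term using a toy thesaurus. The thesaurus contents and the name `expand_query` are made up for illustration and are not taken from any system mentioned in these slides.

```python
from collections import deque

# Hypothetical toy thesaurus: term -> list of narrower terms (illustrative data only).
NARROWER = {
    "vehicle": ["car", "bicycle"],
    "car": ["sedan", "convertible"],
}

def expand_query(term, max_depth=2):
    """Return the term plus all narrower terms reachable within max_depth hops."""
    seen = [term]
    queue = deque([(term, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not follow links beyond the depth limit
        for child in NARROWER.get(current, []):
            if child not in seen:
                seen.append(child)
                queue.append((child, depth + 1))
    return seen

print(expand_query("vehicle"))
```

Limiting the depth keeps the expansion from drifting too far from the original term; a real system would likely also weight expanded terms lower than the original one.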
Graph-based Conceptual Models
 Core of W3C standards for knowledge representation
and data exchange: RDF, OWL
 Large amount of data / knowledge on the Web
available as graphs
 Linked Data: hundreds of interconnected datasets
capturing domain-independent and domain-specific
knowledge
 Metadata in HTML
 RDFa, microdata, Facebook's OGP
 Private graphs
 Google's Knowledge Graph
 Facebook Graph
 Yahoo's Knowledge Base (talk yesterday)
 Microsoft's Satori
Linked Data
Where can you find Linked Data?
 Downloads
 DBpedia data dumps
 SPARQL access
 LOD cache by OpenLink: 51 billion triples
 Keyword search
 Sindice by SindiceTech
Google Knowledge Graph
 Start with Freebase's database, which had 12 million
entities
 As of June 2012, Knowledge Graph has 500 million
entities and over 3.5 billion relationships between
those entities
 Prioritize properties based on what users were most
Facebook's Open Graph Protocol
 The 'Like' button provides publishers with a way to
promote their content on Facebook and build
communities
 Shows up in profiles and news feed
 Site owners can later reach users who have liked an
object
 Facebook Graph API allows 3rd party developers to
access the data
 Open Graph Protocol is an RDFa-based format for
describing the object that the user 'Likes'
Facebook's Open Graph Protocol
 RDF vocabulary to be used in conjunction with RDFa
 Simplify the work of developers by restricting the freedom in RDFa
 Activities, Businesses, Groups, Organizations, People, Places,
Products and Entertainment
 Only HTML <head> accepted
 http://opengraphprotocol.org/
<html xmlns:og="http://opengraphprotocol.org/schema/">
<head>
<title>The Rock (1996)</title>
<meta property="og:title" content="The Rock" />
<meta property="og:type" content="movie" />
<meta property="og:url" content="http://www.imdb.com/title/tt0117500/" />
<meta property="og:image" content="http://ia.media-imdb.com/images/rock.jpg" /> …
</head> ...
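To show how a consumer might read this markup, here is a minimal sketch that collects the `og:` properties from a page using only Python's standard `html.parser` module. The class and function names (`OpenGraphParser`, `extract_open_graph`) are illustrative assumptions, and the sample page is a shortened copy of the slide's example.

```python
from html.parser import HTMLParser

class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..." content="..."> pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing <meta ... /> tags are also routed here by HTMLParser.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property") or ""
        if prop.startswith("og:") and "content" in attr_map:
            self.properties[prop] = attr_map["content"]

def extract_open_graph(html):
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.properties

SAMPLE = """<html xmlns:og="http://opengraphprotocol.org/schema/">
<head>
<title>The Rock (1996)</title>
<meta property="og:title" content="The Rock" />
<meta property="og:type" content="movie" />
<meta property="og:url" content="http://www.imdb.com/title/tt0117500/" />
</head>
</html>"""

print(extract_open_graph(SAMPLE))
```

Because the protocol restricts the markup to `<meta>` elements in the page head, a flat property-to-content dictionary is enough here; general RDFa would need a proper RDFa processor.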
Semantic Web markup: schema.org
 Agreement on a shared set of schemas for common types
of web content
 Use a single format to communicate the same information to all three
search engines
 Bing, Google, and Yahoo! (June, 2011), Yandex (Nov, 2011)
 Microdata and RDFa support
 Schemas for most common web content
 Business listings, images/video, recipes, reviews, products, jobs…
 Community
 public-vocabs@w3.org
Schema.org
Current state of metadata on the Web
 Analysis of the Bing/Yahoo! Search Crawl
 US crawl, January, 2012
 31% of webpages, 5% of domains contain some metadata
 P. Mika, T. Potter. Metadata Statistics for a Large Web Corpus,
LDOW 2012
 WebDataCommons.org
 Data extracted from a public crawl (commoncrawl.org)
 February, 2012 results show 11% of URLs with metadata
compared to 5% in 2009/2010 data
 7.3 billion triples available for download
 H. Mühleisen, C. Bizer. Web Data Commons - Extracting
Structured Data from Two Large Web Corpora, LDOW 2012
 Large increase in RDFa and microdata adoption compared
to microformats
Where can you find HTML metadata?
 Web Data Commons
 Glimmer: glimmer.research.yahoo.com
 Online index of the schema.org data in Web Data
Commons
Innovative Semantic Search Applications
 Entity search: entity/entities as results
 Factual search: direct answers, facts (about entities)
 Relational search: complex relationships between entities
 Semantic auto-completion: suggesting queries based on
the intent of the provided inputs
 Results aggregation / analysis / prediction: apply
computational models
 Semantic log analysis: understanding user behavior in
terms of objects
 Semantic profiling: recommendations based on particular
interests
 Semantic context: contextual model of users / interests
 Support for complex tasks, e.g. booking a vacation using a
combination of services
 Conversational search
Entity Search: Entity-based Disambiguation
Entity Search: Entity Summary
Entity Search: Entity-based Navigation / Exploration
Factual Search
Relational Search
Semantic Auto-completion: Facebook Graph Search
Semantic Auto-completion: Semsolute's semantic search engine
[Figure: auto-completion screenshot; labels: Keywords, Syntactic Completions, Semantic Completions]
Results Aggregation
Contextual (pervasive, ambient) search
Yahoo! Connected TV: Widget engine embedded into the TV
Yahoo! IntoNow: recognize audio and show related content
Interactive Voice Search
 Siri
 Question-Answering
 Variety of backend sources
including Wolfram Alpha and
various Yahoo! services
 Task completion
 E.g. schedule an event
Conversational Search
 Google's Interactive Voice Search
Conversational Search
 Parlance EU project
 Complex dialogs around a set of objects
 Restaurant
 Area
 Price range
 Type of cuisine
 Complete system
 Automated Speech Recognition (ASR)
 Spoken Language Understanding (SLU)
 Interaction Management
 Knowledge Base
 Natural Language Generation (NLG)
 Text-to-Speech (TTS)
 Video
 Commercial alternatives from Nuance
Behind the Scenes
Main Technological Building Blocks
 Query Interpretation
 Spelling Correction
 Query Segmentation
 Entity Recognition
 Query Intent Interpretation for Semantic Auto-Completion
 Ranking
 Entity Ranking
 Relationship Ranking
 Aggregation
 Result Fusion
 Rank / Score Aggregation
 Result Presentation
 Summary Generation
 Visualization
Semsolute's Building Blocks - Keyword / Key Phrase Interpretation
 Semantic entity index
 Inverted index for entities / triples
 Return entities / entities' relationships as results to keys
 Semantic entity ranking
 Structured language model: one language model for every attribute
 Returns entities' LMs that most likely generate the keywords, i.e. the entity descriptions that best match
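The structured language model idea can be sketched as follows: each attribute of an entity gets its own unigram model, and an entity is scored by how likely a mixture of its attribute models is to generate the keywords. This is only a minimal sketch under stated assumptions: the entity data, the uniform mixture over attributes, the Jelinek-Mercer smoothing weight `lam`, and all function names are illustrative choices, not details of Semsolute's engine.

```python
import math
from collections import Counter

# Hypothetical entity descriptions: entity -> attribute -> text (illustrative data only).
ENTITIES = {
    "acme": {"type": "company", "name": "acme corp", "address": "1 market st san francisco"},
    "globex": {"type": "company", "name": "globex", "address": "5 main st springfield"},
    "sf_zoo": {"type": "zoo", "name": "san francisco zoo", "address": "sloat blvd san francisco"},
}

def term_counts(text):
    return Counter(text.split())

# Collection-wide term counts, used to smooth the per-attribute models.
COLLECTION = Counter()
for attrs in ENTITIES.values():
    for text in attrs.values():
        COLLECTION.update(term_counts(text))
COLLECTION_SIZE = sum(COLLECTION.values())

def attribute_prob(term, text, lam=0.8):
    """P(term | attribute LM), Jelinek-Mercer smoothed with the collection model."""
    counts = term_counts(text)
    length = sum(counts.values())
    p_attr = counts[term] / length if length else 0.0
    p_coll = COLLECTION[term] / COLLECTION_SIZE
    return lam * p_attr + (1 - lam) * p_coll

def entity_log_likelihood(keywords, attrs):
    """log P(keywords | entity), with a uniform mixture over the attribute LMs (an assumption)."""
    score = 0.0
    for term in keywords:
        p = sum(attribute_prob(term, text) for text in attrs.values()) / len(attrs)
        score += math.log(p) if p > 0 else math.log(1e-9)  # floor for terms unseen everywhere
    return score

def rank_entities(query):
    keywords = query.lower().split()
    scored = [(entity_log_likelihood(keywords, attrs), name) for name, attrs in ENTITIES.items()]
    return [name for _, name in sorted(scored, reverse=True)]

print(rank_entities("address company san francisco"))
```

In this toy data the keyword "address" matches no attribute value, so it contributes the same floor penalty to every entity; a fuller version would presumably also match keywords against attribute names.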
[Figure labels: “address company san francisco”, Entity, Relationships / Structure]
Semsolute's Building Blocks - Semantic Graph Construction
 Offline component: query-independent schema graph
 Reuse schema
 Pseudo-schema construction: all possible connections between classes of entities, e.g. friendships between users
 Online component: query-specific keyword matching elements
 Connect keyword matching elements / entities to the classes they belong to
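The split between an offline, query-independent schema graph and an online, query-specific extension might look like the sketch below. The schema triples, the `type` edge label, and both function names are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical schema: (class, relation, class) triples (illustrative only).
SCHEMA_EDGES = [
    ("Company", "locatedIn", "City"),
    ("Company", "hasAddress", "Address"),
    ("Person", "worksFor", "Company"),
]

def build_schema_graph(edges):
    """Offline step: undirected adjacency over classes, labelled with the relation name."""
    graph = defaultdict(set)
    for source, relation, target in edges:
        graph[source].add((relation, target))
        graph[target].add((relation, source))
    return graph

def add_keyword_elements(graph, matches):
    """Online step: link each keyword-matching element to the class it belongs to.

    matches maps a keyword to (element, class) pairs. A copy of the graph is
    returned so the query-independent part stays untouched.
    """
    query_graph = defaultdict(set, {node: set(nbrs) for node, nbrs in graph.items()})
    for elements in matches.values():
        for element, cls in elements:
            query_graph[element].add(("type", cls))
            query_graph[cls].add(("type", element))
    return query_graph

schema = build_schema_graph(SCHEMA_EDGES)
query_graph = add_keyword_elements(schema, {"san francisco": [("San Francisco", "City")]})
print(sorted(query_graph["City"]))
```

Copying the schema graph per query is the simplest way to keep the offline structure reusable; a production system would more likely overlay the query-specific nodes without copying.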
[Figure labels: “address company san francisco”, Entity, Relationships / Structure]
Semsolute's Building Blocks - Graph Exploration
 Top-k graph exploration
 Shortest-path based algorithm that finds top-k graphs connecting keyword matching elements
 Top-k graph ranking
 Language model based
 Aggregated model that combines the LMs of entities matching the keywords
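A much simplified version of shortest-path based exploration is sketched below: it runs a breadth-first search from every keyword-matching element and keeps the k nodes whose total distance to all elements is smallest. Note the substitution: candidates are ranked by hop count here, not by the language-model score the slide describes, and the graph and function names are hypothetical.

```python
import heapq
from collections import deque

# Hypothetical undirected graph over classes and keyword-matching elements (illustrative only).
GRAPH = {
    "San Francisco": ["City"],
    "City": ["San Francisco", "Company"],
    "Company": ["City", "Address", "Person"],
    "Address": ["Company"],
    "Person": ["Company"],
}

def bfs_distances(graph, start):
    """Hop distance from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, []):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def top_k_connections(graph, keyword_elements, k=2):
    """Return the k (cost, node) pairs with the smallest total distance to all keyword elements.

    Each pair identifies a candidate subgraph: the union of shortest paths
    from that node to every keyword element.
    """
    distance_maps = [bfs_distances(graph, element) for element in keyword_elements]
    candidates = []
    for node in graph:
        if all(node in dist for dist in distance_maps):
            candidates.append((sum(dist[node] for dist in distance_maps), node))
    return heapq.nsmallest(k, candidates)

print(top_k_connections(GRAPH, ["San Francisco", "Address", "Company"]))
```

Running a full search from every element is fine for a small schema-level graph; algorithms that interleave the searches and stop early once the top-k results are guaranteed would be the natural refinement.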
Semsolute's Building Blocks - Query Generation & Processing
[Figure labels: “address company san francisco”, Entity, Relationships / Structure, Triple, “Address of companies located in San Francisco?”]
 Graph to query mapping
 Translation rules that map top ranked graphs to structured queries (SQL, SPARQL)
 Translation rules that map structured queries to natural language questions
 Graph matching
 Triple index: cover index supporting different triple patterns
 Various join implementations
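The two ideas on this slide can be illustrated with a small sketch: a triple index that keeps several permutations so that any triple pattern can be answered by lookup, and a translation step that turns graph edges into a SPARQL string. The class `TripleIndex`, the function `to_sparql`, the choice of three permutations (SPO, POS, OSP), and the sample triples are assumptions for illustration; prefix declarations are omitted from the generated query.

```python
from collections import defaultdict

class TripleIndex:
    """Keeps three permutations (SPO, POS, OSP) so every pattern with bound positions has an index."""

    def __init__(self, triples):
        self.spo = defaultdict(lambda: defaultdict(set))
        self.pos = defaultdict(lambda: defaultdict(set))
        self.osp = defaultdict(lambda: defaultdict(set))
        self.triples = set()
        for s, p, o in triples:
            self.triples.add((s, p, o))
            self.spo[s][p].add(o)
            self.pos[p][o].add(s)
            self.osp[o][s].add(p)

    def match(self, s=None, p=None, o=None):
        """Return triples matching the pattern; None acts as a variable."""
        if s is not None and p is not None:
            found = {(s, p, x) for x in self.spo.get(s, {}).get(p, ())}
        elif p is not None and o is not None:
            found = {(x, p, o) for x in self.pos.get(p, {}).get(o, ())}
        elif o is not None and s is not None:
            found = {(s, x, o) for x in self.osp.get(o, {}).get(s, ())}
        elif s is not None:
            found = {(s, y, x) for y, xs in self.spo.get(s, {}).items() for x in xs}
        elif p is not None:
            found = {(y, p, x) for x, ys in self.pos.get(p, {}).items() for y in ys}
        elif o is not None:
            found = {(x, y, o) for x, ys in self.osp.get(o, {}).items() for y in ys}
        else:
            found = set(self.triples)
        # Only needed when all three positions are bound; harmless otherwise.
        return {t for t in found if o is None or t[2] == o}

def to_sparql(edges, select):
    """Translate (subject, predicate, object) pattern edges into a SPARQL SELECT string."""
    patterns = " . ".join(f"{s} {p} {o}" for s, p, o in edges)
    return f"SELECT {' '.join(select)} WHERE {{ {patterns} }}"

# Hypothetical data (illustrative only).
TRIPLES = [
    ("acme", "type", "Company"),
    ("acme", "locatedIn", "San_Francisco"),
    ("acme", "address", "1 Market St"),
    ("globex", "type", "Company"),
    ("globex", "locatedIn", "Springfield"),
]
index = TripleIndex(TRIPLES)
print(index.match(p="locatedIn", o="San_Francisco"))
print(to_sparql([("?c", ":locatedIn", ":San_Francisco"), ("?c", ":address", "?a")], ["?a"]))
```

Storing each triple three times trades memory for lookup speed, which is the usual motivation for covering indexes over triple patterns.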
Yahoo! Spark: Entity Recommendation in Search
 Different use cases in Web Search
 Some users are short on time
 Need direct answers
 Query expansion, question-answering, information boxes, rich
results…
 Other users want to explore
 Long-term interests such as sports, celebrities, movies and music
 Long-running tasks such as travel planning
 Spark is a search assistance tool for exploration
 Recommend related entities given the user's current
query
 Based on explicit relations in a Knowledge Base
Example user sessions
Spark example I.
Spark example II.
High-Level Architecture View
[Figure: architecture diagram; components: Entity graph, Data preprocessing, Feature extraction, Feature sources, Features, Model learning, Editorial judgements, Ranking model, Ranking and disambiguation, Entity data, Datapack]
Spark challenges
 Interpretation and disambiguation
 Obama and Toyota are places in Japan, but maybe
the user is not looking for them
 The popularity of “obama” is not a sign of the
popularity of a Japanese town
 Ranking
 “Release me” from Engelbert Humperdinck should
rank higher than “Lesbian Seagull” which only
appeared on the soundtrack of a Beavis and
Butthead episode
 Editorial relevance vs. what people click
 Large-scale data processing and ML
 Knowledge Base built from Wikipedia, Yahoo! data,
Web extraction
 Feature extraction from query logs, Flickr and Twitter
data
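One plausible kind of feature extracted from query logs is co-occurrence: how often a related entity shows up in the same session as the query entity. The sketch below is a guess at what such a feature could look like, not a description of Spark's actual features or ranking model; the session data, the knowledge-base fragment, and the function names are all invented for illustration.

```python
from collections import Counter
from itertools import permutations

# Hypothetical query sessions, each already reduced to the set of entities it mentions.
SESSIONS = [
    {"engelbert humperdinck", "release me"},
    {"engelbert humperdinck", "release me"},
    {"engelbert humperdinck", "release me"},
    {"engelbert humperdinck", "lesbian seagull"},
]

# Hypothetical knowledge base fragment: entity -> explicitly related entities.
RELATED = {"engelbert humperdinck": ["lesbian seagull", "release me"]}

def cooccurrence_probability(sessions):
    """Estimate P(b appears in a session | a appears) for every ordered entity pair."""
    single = Counter(entity for session in sessions for entity in session)
    joint = Counter(pair for session in sessions for pair in permutations(sorted(session), 2))
    return {(a, b): count / single[a] for (a, b), count in joint.items()}

def recommend(entity, features, related=RELATED):
    """Rank the knowledge-base neighbours of an entity by the co-occurrence feature."""
    candidates = related.get(entity, [])
    return sorted(candidates, key=lambda c: features.get((entity, c), 0.0), reverse=True)

features = cooccurrence_probability(SESSIONS)
print(recommend("engelbert humperdinck", features))
```

A single feature like this would be only one input among many; the slide's point about editorial relevance versus clicks suggests a learned model that weighs several such signals against human judgements.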
Contact
 Peter Mika
 pmika@yahoo-inc.com
 @pmika
 Tran Duc Thanh
 thanh.tran@semsolute.com
Resources
 Detailed information
 Peter Mika. Entity Search on the Web, Keynote at Web of
Linked Entities WS
 Peter Mika, Thanh Tran. Semantic search tutorial
SemTech2012
 Books
 Ricardo Baeza-Yates and Berthier Ribeiro-Neto. Modern
Information Retrieval. ACM Press. 2011
 Survey papers
 Thanh Tran, Peter Mika. Survey of Semantic Search
Approaches. Under submission, 2012.
 Conferences and workshops
 ISWC, ESWC, WWW, SIGIR, CIKM, SemTech
 Semantic Search workshop series
 Exploiting Semantic Annotations in Information Retrieval
(ESAIR)
 Entity-oriented Search (EOS) workshop
 Web of Linked Entities (WoLE) workshop

Over the counter (OTC)- Sale, rational use.pptx
Over the counter (OTC)- Sale, rational use.pptxOver the counter (OTC)- Sale, rational use.pptx
Over the counter (OTC)- Sale, rational use.pptx
 
How to Manage Cross-Selling in Odoo 17 Sales
How to Manage Cross-Selling in Odoo 17 SalesHow to Manage Cross-Selling in Odoo 17 Sales
How to Manage Cross-Selling in Odoo 17 Sales
 
Quality Assurance_GOOD LABORATORY PRACTICE
Quality Assurance_GOOD LABORATORY PRACTICEQuality Assurance_GOOD LABORATORY PRACTICE
Quality Assurance_GOOD LABORATORY PRACTICE
 
CHUYÊN ĐỀ DẠY THÊM TIẾNG ANH LỚP 11 - GLOBAL SUCCESS - NĂM HỌC 2023-2024 - HK...
CHUYÊN ĐỀ DẠY THÊM TIẾNG ANH LỚP 11 - GLOBAL SUCCESS - NĂM HỌC 2023-2024 - HK...CHUYÊN ĐỀ DẠY THÊM TIẾNG ANH LỚP 11 - GLOBAL SUCCESS - NĂM HỌC 2023-2024 - HK...
CHUYÊN ĐỀ DẠY THÊM TIẾNG ANH LỚP 11 - GLOBAL SUCCESS - NĂM HỌC 2023-2024 - HK...
 
Drug Information Services- DIC and Sources.
Drug Information Services- DIC and Sources.Drug Information Services- DIC and Sources.
Drug Information Services- DIC and Sources.
 
The basics of sentences session 10pptx.pptx
The basics of sentences session 10pptx.pptxThe basics of sentences session 10pptx.pptx
The basics of sentences session 10pptx.pptx
 
HED Office Sohayok Exam Question Solution 2023.pdf
HED Office Sohayok Exam Question Solution 2023.pdfHED Office Sohayok Exam Question Solution 2023.pdf
HED Office Sohayok Exam Question Solution 2023.pdf
 
Patient Counselling. Definition of patient counseling; steps involved in pati...
Patient Counselling. Definition of patient counseling; steps involved in pati...Patient Counselling. Definition of patient counseling; steps involved in pati...
Patient Counselling. Definition of patient counseling; steps involved in pati...
 
Riddhi Kevadiya. WILLIAM SHAKESPEARE....
Riddhi Kevadiya. WILLIAM SHAKESPEARE....Riddhi Kevadiya. WILLIAM SHAKESPEARE....
Riddhi Kevadiya. WILLIAM SHAKESPEARE....
 
Prescribed medication order and communication skills.pptx
Prescribed medication order and communication skills.pptxPrescribed medication order and communication skills.pptx
Prescribed medication order and communication skills.pptx
 
Diploma in Nursing Admission Test Question Solution 2023.pdf
Diploma in Nursing Admission Test Question Solution 2023.pdfDiploma in Nursing Admission Test Question Solution 2023.pdf
Diploma in Nursing Admission Test Question Solution 2023.pdf
 
Prelims of Kant get Marx 2.0: a general politics quiz
Prelims of Kant get Marx 2.0: a general politics quizPrelims of Kant get Marx 2.0: a general politics quiz
Prelims of Kant get Marx 2.0: a general politics quiz
 
How to Add Existing Field in One2Many Tree View in Odoo 17
How to Add Existing Field in One2Many Tree View in Odoo 17How to Add Existing Field in One2Many Tree View in Odoo 17
How to Add Existing Field in One2Many Tree View in Odoo 17
 

Sem tech2013 tutorial

  • 1. Semantic Search on the Rise
      - Peter Mika | Yahoo! Research, Spain | pmika@yahoo-inc.com
      - Thanh Tran | Semsolute, Germany | Tran@semsolute.com
  • 2. About the speakers
      - Peter Mika
      - Senior Research Scientist
      - Head of Semantic Search group at Yahoo! Labs
      - Expertise: Semantic Search, Web Object Retrieval, Natural Language Processing
      - Tran Duc Thanh
      - CEO of Semsolute, Semantic Search Technologies Company
      - Served as Assistant Professor for Karlsruhe Institute of Technology and Stanford University
      - Expertise: Semantic Search, Semantic / Linked Data Management
  • 3. Agenda
      - Why Semantic Search
      - What is Semantic Search
      - Innovative Semantic Search Applications
      - Behind the Scene
      - Questions
  • 5. Why Semantic Search? I.
      - “We are at the beginning of search.” (Marissa Mayer)
      - Solved large classes of queries, e.g. navigational
      - Remaining queries are hard, not solvable by brute force, require deep understanding of the world and human cognition, e.g.
      - Ambiguous searches: paris hilton
      - Imprecise or overly precise searches
      - Searches for descriptions: 34 year old computer scientist living in barcelona
      - Background knowledge and metadata can help to address poorly solved queries
      - Note: Many of these queries would not be asked by users, who learned over time what search technology can and cannot do.
  • 6. Why Semantic Search? II.
      - The Semantic Web is now a reality
      - Large amounts of data published in RDF
      - Linked Data
      - Metadata in HTML
      - Facebook's Open Graph Protocol
      - Schema.org
      - Casual users
      - Don't know SPARQL
      - Unaware of the schema of the data
      - Searching data instead of or in addition to searching documents
      - Enable innovative search applications / tasks
  • 8. Semantic Search: Using Semantic Models for Search
      - Semantic search is a retrieval paradigm that
      - Exploits the semantics of the data or explicit background knowledge to understand user intent and the meaning of content
      - Incorporates the intent of the query and the meaning of content into the search process (semantic models)
  • 9. Semantic Search: Different Kinds / Different Uses of Semantic Models
      - Wide range of semantic search systems
      - Employ different semantic models, possibly at different steps of the search process and in order to support different tasks
      - Query formulation
      - Query processing / understanding
      - Ranking
      - Result presentation
      - Result / query refinement
  • 10. Semantic models
      - Semantics is concerned with the meaning of the resources made available for search
      - Various representations of meaning
      - Word-level models: models of relationships among words
      - Taxonomies, thesauri, dictionaries of entity names
      - Inference along linguistic relations, e.g. broader/narrower terms
      - Concept-level models: models of relationships among objects
      - Ontologies capture entities in the world and their relationships
      - Inference along domain-specific relations
  • 11. Graph-based Conceptual Models
      - Core of W3C standards for knowledge representation and data exchange: RDF, OWL
      - Large amount of data / knowledge on the Web available as graphs
      - Linked Data: hundreds of interconnected datasets capturing domain-independent and domain-specific knowledge
      - Metadata in HTML
      - RDFa, microdata, Facebook's OGP
      - Private graphs
      - Google's Knowledge Graph
      - Facebook Graph
      - Yahoo's Knowledge Base (talk yesterday)
      - Microsoft's Satori
  • 13. Where can you find Linked Data?
      - Downloads
      - DBpedia data dumps
      - SPARQL access
      - LOD cache by OpenLink: 51 billion triples
      - Keyword search
      - Sindice by SindiceTech
  • 14. Google Knowledge Graph
      - Start with Freebase's database, which had 12 million entities
      - As of June 2012, Knowledge Graph has 500 million entities and over 3.5 billion relationships between those entities
      - Prioritize properties based on what users were most
  • 15. Facebook's Open Graph Protocol
      - The 'Like' button provides publishers with a way to promote their content on Facebook and build communities
      - Shows up in profiles and news feed
      - Site owners can later reach users who have liked an object
      - Facebook Graph API allows 3rd party developers to access the data
      - Open Graph Protocol is an RDFa-based format for describing the object that the user 'Likes'
  • 16. Facebook's Open Graph Protocol
      - RDF vocabulary to be used in conjunction with RDFa
      - Simplify the work of developers by restricting the freedom in RDFa
      - Activities, Businesses, Groups, Organizations, People, Places, Products and Entertainment
      - Only HTML <head> accepted
      - http://opengraphprotocol.org/
      - Example markup:
          <html xmlns:og="http://opengraphprotocol.org/schema/">
          <head>
          <title>The Rock (1996)</title>
          <meta property="og:title" content="The Rock" />
          <meta property="og:type" content="movie" />
          <meta property="og:url" content="http://www.imdb.com/title/tt0117500/" />
          <meta property="og:image" content="http://ia.media-imdb.com/images/rock.jpg" />
          …
          </head>
          ...
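To make the markup on slide 16 concrete, here is a minimal sketch of how a consumer might read Open Graph properties out of a page. It uses only Python's standard `html.parser`; the class name `OpenGraphParser`, the helper `extract_open_graph`, and the shortened sample page are illustrative assumptions, not part of the slides or of Facebook's tooling.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing <meta ... /> tags are also routed here by HTMLParser.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property") or ""
        if prop.startswith("og:") and "content" in attr_map:
            self.properties[prop] = attr_map["content"]


def extract_open_graph(html):
    """Return a dict of og:* properties found in the given HTML string."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.properties


# Shortened version of the example from the slide (assumed sample input).
SAMPLE_PAGE = """
<html xmlns:og="http://opengraphprotocol.org/schema/">
<head>
<title>The Rock (1996)</title>
<meta property="og:title" content="The Rock" />
<meta property="og:type" content="movie" />
<meta property="og:url" content="http://www.imdb.com/title/tt0117500/" />
</head>
</html>
"""

og = extract_open_graph(SAMPLE_PAGE)
print(og)
```

Because the protocol restricts markup to `<meta>` elements in the page head, a simple tag scan like this is enough for the sketch; a full RDFa processor would not be needed for this subset.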
  • 17. Semantic Web markup: schema.org
      - Agreement on a shared set of schemas for common types of web content
      - Use a single format to communicate the same information to all three search engines
      - Bing, Google, and Yahoo! (June, 2011), Yandex (Nov, 2011)
      - Microdata and RDFa support
      - Schemas for most common web content
      - Business listings, images/video, recipes, reviews, products, jobs…
      - Community
      - public-vocabs@w3.org
  • 19. Current state of metadata on the Web
      - Analysis of the Bing/Yahoo! Search Crawl
      - US crawl, January, 2012
      - 31% of webpages, 5% of domains contain some metadata
      - P. Mika, T. Potter. Metadata Statistics for a Large Web Corpus, LDOW 2012
      - WebDataCommons.org
      - Data extracted from a public crawl (commoncrawl.org)
      - February, 2012 results show 11% of URLs with metadata compared to 5% in 2009/2010 data
      - 7.3 billion triples available for download
      - H. Mühleisen, C. Bizer. Web Data Commons - Extracting Structured Data from Two Large Web Corpora, LDOW 2012
      - Large increase in RDFa and microdata adoption compared to microformats
  • 20. Where can you find HTML metadata?
      - Web Data Commons
      - Glimmer: glimmer.research.yahoo.com
      - Online index of the schema.org data in Web Data Commons
  • 22. Innovative Semantic Search Applications
      - Entity search: entity/entities as results
      - Factual search: direct answers, facts (about entities)
      - Relational search: complex relationships between entities
      - Semantic auto-completion: suggesting queries based on the intent of the provided inputs
      - Results aggregation / analysis / prediction: apply computational models
      - Semantic log analysis: understanding user behavior in terms of objects
      - Semantic profiling: recommendations based on particular interests
      - Semantic context: contextual model of users / interests
      - Support for complex tasks, e.g. booking a vacation using a combination of services
      - Conversational search
  • 25. Entity Search: Entity-based Navigation / Exploration
  • 29. Semantic Auto-completion: Semsolute's semantic search engine
      - [Screenshot labels: Keywords, Syntactic Completions, Semantic Completions]
  • 31. Contextual (pervasive, ambient) search
      - Yahoo! Connected TV: Widget engine embedded into the TV
      - Yahoo! IntoNow: recognize audio and show related content
  • 32. Interactive Voice Search
      - Siri
      - Question-Answering
      - Variety of backend sources including Wolfram Alpha and various Yahoo! services
      - Task completion
      - E.g. schedule an event
  • 33. Conversational Search
      - Google's Interactive Voice Search
  • 34. Conversational Search
      - Parlance EU project
      - Complex dialogs around a set of objects
      - Restaurant
      - Area
      - Price range
      - Type of cuisine
      - Complete system
      - Automated Speech Recognition (ASR)
      - Spoken Language Understanding (SLU)
      - Interaction Management
      - Knowledge Base
      - Natural Language Generation (NLG)
      - Text-to-Speech (TTS)
      - Video
      - Commercial alternatives from Nuance
  • 36. Main Technological Building Blocks
      - Query Interpretation
      - Spelling Correction
      - Query Segmentation
      - Entity Recognition
      - Query Intent Interpretation for Semantic Auto-Completion
      - Ranking
      - Entity Ranking
      - Relationship Ranking
      - Aggregation
      - Result Fusion
      - Rank / Score Aggregation
      - Result Presentation
      - Summary Generation
      - Visualization
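The slides list "Rank / Score Aggregation" as a building block without naming a method. As one possible illustration, the sketch below uses reciprocal rank fusion, a common technique that is an assumption here and is not stated in the slides; the function name, the constant `k=60`, and the toy rankings are likewise illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists: each item scores sum(1 / (k + rank))."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical result lists from three different rankers.
RANKINGS = [
    ["a", "b", "c"],
    ["b", "a", "d"],
    ["b", "c", "a"],
]

fused = reciprocal_rank_fusion(RANKINGS)
print(fused)
```

Rank-based fusion avoids having to normalize scores that come from rankers with different scales, which is why it is a convenient default when the individual scoring functions are not comparable.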
  • 37. Semsolute's Building Blocks - Keyword / Key Phrase Interpretation
      - Example query: “address company san francisco” (figure label: Entity)
      - Semantic entity index
      - Inverted index for entities / triples
      - Return entities / entities' relationships as results to keys
      - Semantic entity ranking
      - Structured language model: one language model for every attribute
      - Returns entities' LMs that most likely generate the keywords, i.e. the entity descriptions that best match
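The two ideas on slide 37 (an inverted index over entities and one language model per attribute) can be sketched roughly as follows. This is a toy under stated assumptions, not Semsolute's implementation: the entity data, the uniform mixture over attributes, the linear smoothing weight of 0.1, and the choice to index attribute names alongside values are all made up for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: entity id -> attribute -> text value.
ENTITIES = {
    "e1": {"name": "Acme Software", "type": "company",
           "city": "San Francisco", "address": "1 Market Street"},
    "e2": {"name": "Francisco Lopez", "type": "person", "city": "Madrid"},
    "e3": {"name": "Bay Bakery", "type": "company",
           "city": "Oakland", "address": "22 Lake Road"},
}


def tokenize(text):
    return text.lower().split()


def attribute_models(entities):
    """One unigram count model per (entity, attribute) pair.

    Attribute names are counted too, so a keyword such as "address"
    can match the attribute itself (an assumption of this sketch).
    """
    return {
        entity_id: {attr: Counter(tokenize(attr + " " + value))
                    for attr, value in attributes.items()}
        for entity_id, attributes in entities.items()
    }


def build_inverted_index(models):
    """Map each term to the set of entity ids whose description contains it."""
    index = defaultdict(set)
    for entity_id, per_attr in models.items():
        for counts in per_attr.values():
            for term in counts:
                index[term].add(entity_id)
    return index


def score(terms, per_attr, background, smoothing=0.1):
    """Log-likelihood that the entity's attribute models generate the query."""
    background_size = sum(background.values())
    log_prob = 0.0
    for term in terms:
        # Uniform mixture over the entity's attribute language models.
        p_entity = sum(counts[term] / sum(counts.values())
                       for counts in per_attr.values()) / len(per_attr)
        p_background = background[term] / background_size
        p = (1 - smoothing) * p_entity + smoothing * p_background
        log_prob += math.log(p) if p > 0 else math.log(1e-12)
    return log_prob


def search(query, entities, k=3):
    models = attribute_models(entities)
    index = build_inverted_index(models)
    background = Counter()
    for per_attr in models.values():
        for counts in per_attr.values():
            background.update(counts)
    terms = tokenize(query)
    # Candidate generation via the inverted index, then LM-based ranking.
    candidates = set().union(*(index.get(t, set()) for t in terms))
    ranked = sorted(candidates,
                    key=lambda e: score(terms, models[e], background),
                    reverse=True)
    return ranked[:k]


results = search("address company san francisco", ENTITIES)
print(results)
```

In this toy data the company located in San Francisco matches all four keywords through different attributes, so it should come out on top; entities that match only one or two keywords rely on the smoothed background probability for the rest.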
  • 38. Semsolute's Building Blocks – Semantic Graph Construction
      - Example query: “address company san francisco” (figure labels: Relationships / Structure, Entity)
      - Offline component: query-independent schema graph
      - Reuse schema
      - Pseudo-schema construction: all possible connections between classes of entities, e.g. friendships between users
      - Online component: query-specific keyword matching elements
      - Connect keyword matching elements / entities to the classes they belong to
  • 39. Semsolute's Building Blocks – Graph Exploration
      - Example query: “address company san francisco” (figure labels: Relationships / Structure, Entity)
      - Top-k graph exploration
      - Shortest-path based algorithm that finds top-k graphs connecting keyword matching elements
      - Top-k graph ranking
      - Language model based
      - Aggregated model that combines the LMs of entities matching the keywords
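A much simplified sketch of the shortest-path idea on slide 39: run a breadth-first search from every keyword-matching element, treat each node reachable from all of them as a candidate connecting root, and keep the k cheapest distinct subgraphs. The schema graph below is hypothetical, and ranking by total path length stands in for the language-model-based ranking the slide mentions, which is not reproduced here.

```python
import heapq
from collections import deque

# Hypothetical undirected schema graph (classes, attributes, one value node).
GRAPH = {
    "Company": {"address", "City", "Person"},
    "address": {"Company"},
    "City": {"Company", "Person", "San Francisco"},
    "San Francisco": {"City"},
    "Person": {"Company", "City"},
}


def bfs(graph, source):
    """Return {node: (distance, predecessor)} for nodes reachable from source."""
    seen = {source: (0, None)}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in sorted(graph[node]):
            if neighbor not in seen:
                seen[neighbor] = (seen[node][0] + 1, node)
                queue.append(neighbor)
    return seen


def path_edges(tree, node):
    """Walk predecessors back to the BFS source, collecting undirected edges."""
    edges = set()
    while tree[node][1] is not None:
        parent = tree[node][1]
        edges.add(frozenset((node, parent)))
        node = parent
    return edges


def top_k_connecting_graphs(graph, keyword_elements, k=2):
    trees = {element: bfs(graph, element) for element in keyword_elements}
    best_by_edges = {}
    for root in graph:
        if not all(root in tree for tree in trees.values()):
            continue  # root cannot connect all keyword elements
        cost = sum(tree[root][0] for tree in trees.values())
        edges = set()
        for tree in trees.values():
            edges |= path_edges(tree, root)
        key = frozenset(edges)
        # Different roots can yield the same subgraph; keep the cheapest.
        if key not in best_by_edges or cost < best_by_edges[key][0]:
            best_by_edges[key] = (cost, root, edges)
    return heapq.nsmallest(k, best_by_edges.values(),
                           key=lambda c: (c[0], c[1]))


top_graphs = top_k_connecting_graphs(
    GRAPH, ["address", "Company", "San Francisco"], k=2)
for cost, root, edges in top_graphs:
    print(cost, root, sorted(tuple(sorted(e)) for e in edges))
```

The deduplication step matters because several roots along the same path produce an identical set of edges; without it the top-k list would be filled with copies of one interpretation.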
  • 40. Semsolute's Building Blocks – Query Generation & Processing
      - Example query: “address company san francisco” (figure labels: Triple, Relationships / Structure, Entity)
      - Generated question: Address of companies located in San Francisco?
      - Graph to query mapping
      - Translation rules that map top ranked graphs to structured queries (SQL, SPARQL)
      - Translation rules that map structured queries to natural language questions
      - Graph matching
      - Triple index: cover index supporting different triple patterns
      - Various join implementations
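The graph-to-query mapping on slide 40 can be illustrated with a small translation rule that turns a list of triple patterns into a SPARQL SELECT query. The `ex:` vocabulary (`ex:Company`, `ex:locatedIn`, `ex:name`, `ex:address`), the `TriplePattern` class, and the term-rendering rule are assumptions for this sketch; the slides do not show the actual translation rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriplePattern:
    subject: str
    predicate: str
    obj: str


def render_term(value):
    """Render variables and prefixed names as-is, anything else as a literal."""
    if value.startswith("?") or ":" in value:
        return value
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_sparql(patterns, select_vars, prefixes):
    """Map a connected graph of triple patterns to a SPARQL SELECT query."""
    lines = [f"PREFIX {name}: <{iri}>" for name, iri in prefixes.items()]
    lines.append("SELECT " + " ".join(select_vars) + " WHERE {")
    for p in patterns:
        lines.append(
            f"  {render_term(p.subject)} {render_term(p.predicate)} "
            f"{render_term(p.obj)} ."
        )
    lines.append("}")
    return "\n".join(lines)


PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "ex": "http://example.org/",  # hypothetical vocabulary
}

# Hypothetical top-ranked graph for "address company san francisco".
TOP_GRAPH = [
    TriplePattern("?company", "rdf:type", "ex:Company"),
    TriplePattern("?company", "ex:locatedIn", "?city"),
    TriplePattern("?city", "ex:name", "San Francisco"),
    TriplePattern("?company", "ex:address", "?address"),
]

query = to_sparql(TOP_GRAPH, ["?company", "?address"], PREFIXES)
print(query)
```

Each edge of the ranked graph becomes one triple pattern, and keyword-matched values become literals, so the structured query keeps the interpretation that graph exploration selected.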
  • 41. Yahoo! Spark: Entity Recommendation in Search
      - Different use cases in Web Search
      - Some users are short on time
      - Need direct answers
      - Query expansion, question-answering, information boxes, rich results…
      - Other users want to explore
      - Long term interests such as sports, celebrities, movies and music
      - Long running tasks such as travel planning
      - Spark is a search assistance tool for exploration
      - Recommend related entities given the user's current query
      - Based on explicit relations in a Knowledge Base
  • 46. Spark challenges
      - Interpretation and disambiguation
      - Obama and Toyota are places in Japan, but maybe the user is not looking for them
      - The popularity of “obama” is not a sign of the popularity of a Japanese town
      - Ranking
      - “Release me” by Engelbert Humperdinck should rank higher than “Lesbian Seagull”, which only appeared on the soundtrack of a Beavis and Butt-Head episode
      - Editorial relevance vs. what people click
      - Large-scale data processing and ML
      - Knowledge Base built from Wikipedia, Yahoo! data, Web extraction
      - Feature extraction from query logs, Flickr and Twitter data
      - [Pipeline diagram labels: Entity graph, Data preprocessing, Feature extraction, Model learning, Feature sources, Editorial judgements, Datapack, Ranking model, Ranking and disambiguation, Entity data, Features]
  • 47. Contact
      - Peter Mika
      - pmika@yahoo-inc.com
      - @pmika
      - Tran Duc Thanh
      - thanh.tran@semsolute.com
  • 49. Resources
      - Detailed information
      - Peter Mika. Entity Search on the Web, Keynote at Web of Linked Entities WS
      - Peter Mika, Thanh Tran. Semantic search tutorial SemTech2012
      - Books
      - Ricardo Baeza-Yates and Berthier Ribeiro-Neto. Modern Information Retrieval. ACM Press. 2011
      - Survey papers
      - Thanh Tran, Peter Mika. Survey of Semantic Search Approaches. Under submission, 2012.
      - Conferences and workshops
      - ISWC, ESWC, WWW, SIGIR, CIKM, SemTech
      - Semantic Search workshop series
      - Exploiting Semantic Annotations in Information Retrieval (ESAIR)
      - Entity-oriented Search (EOS) workshop
      - Web of Linked Entities (WoLE) workshop

Editor's notes

  1. Mobile: Google interactive voice search (conversation), Siri (Peter). Facebook's Graph Search (Thanh). Knowledge Graph (infoboxes)... entity search (“tom cruise actor”) to list/category queries (“tom cruise spouses”) to question-answering (“tom cruise height”) (Thanh). Spark (Yahoo!): related entity recommendation (Peter). Thanh's search engine: auto-complete based on the schema/data, entity search to relational search using YAGO data (Thanh). Glimmer: RDF search engine (Peter).
  2. Semantic search can be seen as a retrieval paradigm centered on the use of semantics. A system that incorporates the semantics entailed by the query and/or the resources into the matching process essentially performs semantic search.
  3. Facebook was invited, but continues to pursue OGP.
  4. We implemented the search paradigms and integrated them as separate search modules into a demonstrator system of the Information Workbench, which has been developed as a showcase for interaction with the Web of data. In particular, keyword search is implemented according to the design and technologies employed by standard Semantic Web search engines. Like Sindice and FalconS, we use an inverted index to store and retrieve RDF resources based on terms. Faceted search also uses the inverted index and is implemented based on the techniques discussed in [25]. Result completion is based on recent work discussed for the TASTIER system [8]. For computing join graphs, we use the top-k procedure elaborated in [9]. This technique is also used for computing top-k interpretations, i.e. to support query completion. We choose to display the top-6 queries and the top-25 results, respectively.