Semantic Search Trend
1. Trends in Semantic Technology and Semantic Search
Sung-Kook Han
Semantic Technology Research Group, Won Kwang University
2010-01-28 skhan@wku.ac.kr page 1
2. Agenda
Information Technology and Semantics
Trends in Semantic Technology
Overview of Semantic Technology
Semantic Search
Summary
4. Information and Communication
Digitally stored information resources are growing.
Communication between humans and computers is becoming more common.
Communication devices are increasingly diverse.
Ubiquitous Computing, Information Integration, Knowledge Management, World-Wide Web
Deliver and share the semantics of information.
5. Semantic Gap
[Figure: Sender and Receiver, each with a meaning triangle (Concept, Symbol, Thing) for the symbol “Jaguar”, connected by Communication / Information.]
6. Missing Piece: Semantics
Business Process
Digital Content
Semantics
Device Convergence
Internet and Web
8. Ontology Spectrum: One View
[Figure: ontology spectrum, ordered from weak semantics to strong semantics]
Taxonomy: “is sub-classification of”; Relational Model, XML; syntactic interoperability
Thesaurus: “has narrower meaning than”; ER, Extended ER, DB Schemas, XML Schema; structural interoperability
Conceptual Model: “is subclass of”; RDF/S, XTM, UML; semantic interoperability
Logical Theory: “is disjoint subclass of”, with transitivity property; Description Logic, DAML+OIL, OWL, First Order Logic, Modal Logic
Example taxonomy: Animal > Mammal (Dog > Cocker Spaniel > Lady; Cat), Reptile (Snake), Bird
[Inset: example organization graph with nodes such as Company, Department, Project, Personnel, Programs, and Technologies; remaining labels are not recoverable.]
9. AI and Knowledge Engineering
[Figure: timeline of AI and knowledge-engineering developments with the marks 1983, 1990, 2001, and Today. Recoverable labels include theorem proving, Prolog and its successors (Prolog II, Prolog III, PARLOG, BinProlog, CHIP, ECLiPSe), expert systems (MYCIN, EMYCIN), semantic networks, frame-based KR, non-monotonic logic, Bayesian networks, description logics (KL-ONE, CLASSIC, LOOM, PowerLOOM), CYC, KIF, Ontolingua, OKBC, KADS, TOVE, agents (SOAR, KQML, JATLite), category theory, and the programs NSF KDI, DARPA HPKB, RKF, DAML, and OIL.]
14. Semantic Web
“The Semantic Web is an extension of the current web in which information is given well-defined
meaning, better enabling computers and people to work in cooperation”
T. Berners-Lee, J. Hendler, O. Lassila, “The Semantic Web”, Scientific American, May 2001
A web that extends the existing web with well-defined semantic vocabulary that computers can process, enabling smooth computer-to-computer and computer-to-human interaction.
Ontology, Ontology-Annotated Web, Agents
15. Semantic Web
[Figure: Semantic Web architecture. Components: Ontology Articulation Toolkit, Ontology Construction Tool, Ontologies, Agents, End User, Community Portal, Inference Engine, Web-Page Annotation Tool, Annotated Web-Pages, Metadata Repository.]
17. Directions in Next-Generation Web Technology
[Figure: progression Web 1.0, Web 2.0, Web 3.0, Web 4.0]
18. Service Orientation
[Figure: evolution toward service orientation, on axes from Local to Global and from Text through User-Friendly to Rich UI. Timeline marks: 1980, 1990, 1995, 2000. Labels: Stand-alone, Database, Application, Windows GUI, Client-Server, Networking, Objects, Components, Web Pages, Script, Applet/Servlet, Web Applications, Service, RIA, Web 2.0, Enterprise 2.0.]
19. Service-oriented Architecture (SOA)
1. Point-to-point systems
[Figure: App A to App D (J2EE and .Net), Partner, Warehouse, Sales, Finance, and a Legacy System connected directly to one another.]
2. Message-based middleware with integration broker
[Figure: the same applications, plus a Shared Legacy Application, connected through Adapters to a Service Bus / MOM.]
Service Oriented Architecture & Enterprise Service Bus
[Figure: “Above the bus”: Business Consumer, custom applications, package applications, Business Rules Engine, Process Orchestration. Enterprise Service Bus: HTTP, Routing, Transformation, Services orchestration, Internet Service Provider. “Below the bus”: Adapters to the Legacy System and the Shared System.]
Author: Peter Campbell, ANZ Banking Group Australia
26. Ontology
An ontology is a formal, explicit specification of a shared
conceptualization [Borst 1997].
Shared Knowledge
Common Vocabulary
27. Ontology in a nutshell
Domain Knowledge Model
A vocabulary for representing knowledge about a domain and for describing
specific situations in a domain:
classes, properties, predicates, and functions, and a set of relationships that
necessarily hold among those vocabulary terms.
Shared formal conceptualizations of particular domains that provide a common
interpretation of topics that can be communicated between people and
applications.
Also allows the definition of axioms and constraints on particular concepts and
properties.
Ontological commitment: general agreement to use a vocabulary.
An ontology is a social contract.
Agreed, explicit semantics
Understandable to outsiders
(Often) derived in a community process
Components: Concept, Instance, Relation, Function, Axiom
28. Ontology
Concepts
concepts of the domain or tasks, which are usually organized in taxonomies
Example: Person, Car, University, …
Relations
a type of interaction between concepts of the domain
Example: subclass-of, is-a, part-of, hasJob, workWith, …
Functions
a special kind of relation that maps its arguments to a value
Example: John = Father_of(Mary), 2006 = PublishingYear(John, Book), …
Axioms
model sentences that are always true
Example: A cow is larger than a dog; a = a + 0, …
Instances
represent specific elements
Example: the student called Peter, …
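The five building blocks on this slide can be pictured as a small data structure. The following is only a minimal sketch in Python, not a standard ontology API: the class `Ontology`, its field names, and the sample data (`Student`, `studiesAt`, `father_of`) are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Toy container for the five ontology components (illustrative only)."""
    concepts: dict = field(default_factory=dict)   # concept -> parent concept (None for roots)
    relations: set = field(default_factory=set)    # (subject, relation name, object) triples
    functions: dict = field(default_factory=dict)  # function name -> Python callable
    axioms: list = field(default_factory=list)     # callables that must return True
    instances: dict = field(default_factory=dict)  # instance name -> concept

    def is_a(self, concept, ancestor):
        """Walk up the taxonomy to test whether `concept` is a kind of `ancestor`."""
        while concept is not None:
            if concept == ancestor:
                return True
            concept = self.concepts.get(concept)
        return False

    def consistent(self):
        """The ontology is accepted here only if every axiom holds."""
        return all(axiom(self) for axiom in self.axioms)


onto = Ontology()
onto.concepts.update({"Person": None, "Student": "Person", "University": None})
onto.relations.add(("Student", "studiesAt", "University"))
onto.instances["Peter"] = "Student"
onto.functions["father_of"] = {"Mary": "John"}.get
# Axiom (assumed for the example): every instance belongs to a declared concept.
onto.axioms.append(lambda o: all(c in o.concepts for c in o.instances.values()))

print(onto.is_a(onto.instances["Peter"], "Person"))
print(onto.functions["father_of"]("Mary"))
print(onto.consistent())
```

Keeping axioms as executable checks is just one convenient choice for a sketch; ontology languages such as OWL express them declaratively instead.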
30. RDF Concept
Resource (Document); Property (Metadata, Tag); Value (Information)
resource (subject) -- property (predicate) --> value (object)
http://www.w3.org/Home/Saron -- Creator --> Saron Stone
(web page being described) (property of the web page) (value of the predicate creator)
31. RDF: Data Model
Saron Stone is the creator of the resource http://www.w3.org/Home/Saron.
Subject (Resource): http://www.w3.org/Home/Saron
Predicate (Property): Creator
Object (literal): “Saron Stone”
[Diagram: http://www.w3.org/Home/Saron -- Creator --> Saron Stone]
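The statement above can be held as a single (subject, predicate, object) triple. The sketch below uses plain Python rather than the API of any RDF library; `Triple` and `match` are hypothetical names introduced for this example.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# The one statement from the slide, stored in a set that plays the role of a graph.
graph = {
    Triple("http://www.w3.org/Home/Saron", "Creator", "Saron Stone"),
}


def match(graph, subject: Optional[str] = None,
          predicate: Optional[str] = None, obj: Optional[str] = None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# "Who is the creator of this page?"
creators = [t.obj for t in match(graph,
                                 subject="http://www.w3.org/Home/Saron",
                                 predicate="Creator")]
print(creators)
```

Pattern matching with wildcards is the same idea that query languages such as SPARQL build on, although this sketch implements none of SPARQL itself.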
32. RDF Schema
RDF Schema
RDF Vocabulary Description Language.
For defining an appropriate RDF vocabulary (classes, properties and
constraints) for each specific domain.
Comprises very limited predefined primitives: subClassOf,
subPropertyOf, domain and range.
Cannot assert that particular properties are equivalent, transitive,
inverses of one another, etc.
Example: #Book -- author --> #Person
Property-centric approach
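What the four primitives allow a reasoner to conclude can be sketched with a naive fixed-point loop. This is an illustrative sketch in plain Python, not a complete RDFS reasoner: the short names `type`, `subClassOf`, `subPropertyOf`, `domain`, and `range` stand in for the corresponding rdf: and rdfs: terms, and the resources `#b1`, `#b2`, `#p1`, `Novel`, and `firstAuthor` are made up for the example.

```python
def rdfs_closure(triples):
    """Apply a few RDFS entailment rules until no new triple appears."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                if p == "subClassOf" and p2 == "subClassOf" and o == s2:
                    new.add((s, "subClassOf", o2))   # subClassOf is transitive
                if p == "type" and p2 == "subClassOf" and o == s2:
                    new.add((s, "type", o2))         # instances inherit superclasses
                if p2 == "subPropertyOf" and p == s2:
                    new.add((s, o2, o))              # a statement also holds for the super-property
                if p2 == "domain" and p == s2:
                    new.add((s, "type", o2))         # domain types the subject
                if p2 == "range" and p == s2:
                    new.add((o, "type", o2))         # range types the object
        if new <= inferred:
            return inferred
        inferred |= new


schema_and_data = {
    ("author", "domain", "Book"),
    ("author", "range", "Person"),
    ("Novel", "subClassOf", "Book"),
    ("firstAuthor", "subPropertyOf", "author"),
    ("#b1", "firstAuthor", "#p1"),
    ("#b2", "type", "Novel"),
}

closure = rdfs_closure(schema_and_data)
print(("#b1", "type", "Book") in closure)
print(("#p1", "type", "Person") in closure)
```

Because `domain` and `range` are attached to the property rather than to the class, the class of `#b1` is derived from its use of `author`, which is one way to read "property-centric".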
34. OWL
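Web Ontology Language (OWL):
A standard language for describing the semantics of web information resources, based on RDF / RDF Schema
A logic language based on Description Logic (DL)
Can express a wide variety of conceptual structures
Three species of OWL
OWL-Lite, OWL-DL, OWL-Full
Chosen according to need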
35. Semantic Web Standards
RDFa
Microformat
GRDDL
전종홍 et al., “Semantic Web”, TTA Journal, No. 107, October 2006
38. Search Engine Market Share
Google by far comprises the largest share of searches.
Microsoft has been trying to buy Yahoo to increase its search share. As of June 12th, both
companies have ended merger talks.
Now, Microsoft is acquiring Powerset…
39. Rich Content and Vertical Search
Books: Amazon
Articles: Wikipedia
Blogs: Blog
Photos: Flickr
Bookmarks: del.icio.us
Events: Upcoming.org
Music: Last.fm
Places: Dopplr
Movies: Netflix
Products: Microsoft Aura
40. Rich Content and Vertical Search
Video: http://kr.youtube.com/
Map: http://maps.live.com/
Blog: http://www.google.com/blogsearch
People: http://www.pipl.com/
41. User-Friendly Interface
Tree: http://www.tafiti.com/
Network: http://www.kartoo.com/
Space: http://www.quintura.com/
43. Beyond the Limits of Keyword Search
[Figure: productivity of search plotted against amount of data]
PC Era (1980 - 1990), The Desktop: files & folders, databases
Web 1.0 (1990 - 2000), The World Wide Web: directories, keyword search
Web 2.0 (2000 - 2010), The Social Web: tagging, natural language search
Web 3.0 (2010 - 2020), The Semantic Web: semantic search
Web 4.0 (2020 - 2030), The Intelligent Web: reasoning
By Radar Networks
44. The Age of Semantic Search
45. The Age of Semantic Search
46. Typical Semantic Search Engine
General Search: Freebase, Yahoo! Microsearch, …
Natural Language Search: Powerset, Hakia, AskMeNow, AskWiki, …
Vertical Search: Kango (now UpTake), AdaptiveBlue, ReportLinker, …
Social Networking Search: SemantiNet, Delver, Google Social Graph API, …
Personalized Search: Twine, MavinIT PSS, …
47. Search
User: roles, goals, tasks
Query: language, vocabulary, syntax
Search Interface: input, interaction, feedback
Search Engine: index, algorithms, linguistics
Content: metadata, controlled vocabulary, knowledge management
Results: design, interaction, behavior
Loop: ask, browse, or search again
No definitive formulation.
Considerable uncertainty.
Complex interdependencies.
Incomplete, contradictory, and changing requirements.
Stakeholders have radically different world views and different frames for
understanding information.
48. Semantic Search
Semantic Search attempts to augment and improve traditional search results
by using data from the Semantic Web.
Syntactic Search vs. Semantic Search
Document view: bag-of-words vs. vocabularies and concepts
Search approach: word matching vs. concept matching
Search process: one-shot vs. reasoning / inference
Ontology and Semantic Search
Help the user formulate semantic queries
Re-formulate or re-interpret queries
Browse the domain
Formulate related queries
Interoperability between search applications
Semantic indexing of documents
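The difference between word matching and concept matching can be made concrete with a toy corpus. The sketch below is a hedged illustration in plain Python: the documents, the concept identifiers (`jaguar_cat`, `jaguar_car`), and the tiny taxonomy are invented for the example and do not come from any real ontology.

```python
# Hypothetical taxonomy: concept -> direct superclass.
SUBCLASS_OF = {
    "jaguar_cat": "mammal",
    "dog": "mammal",
    "mammal": "animal",
    "jaguar_car": "vehicle",
}

# Hypothetical documents, each annotated with concepts (semantic indexing).
DOCS = {
    "d1": {"text": "the jaguar hunts at night", "concepts": {"jaguar_cat"}},
    "d2": {"text": "new jaguar coupe unveiled", "concepts": {"jaguar_car"}},
    "d3": {"text": "training your spaniel", "concepts": {"dog"}},
}


def word_match(query):
    """Syntactic search: a document matches if it shares a word with the query."""
    words = set(query.lower().split())
    return sorted(d for d, doc in DOCS.items() if words & set(doc["text"].split()))


def ancestors(concept):
    """The concept itself followed by all of its superclasses."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = SUBCLASS_OF.get(concept)
    return chain


def concept_match(query_concept):
    """Semantic search: match documents annotated with the concept or a subclass of it."""
    return sorted(
        d for d, doc in DOCS.items()
        if any(query_concept in ancestors(c) for c in doc["concepts"])
    )


print(word_match("jaguar"))     # mixes the animal and the car
print(word_match("mammal"))     # no document contains the word
print(concept_match("mammal"))  # found through the taxonomy
```

The word "jaguar" retrieves both senses, while the concept query for `mammal` finds documents that never mention that word, which is the kind of inference step the comparison above attributes to semantic search.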
49. Semantic Search Problems
III. Optimization: requires massively parallel computers
Example: “What is the best vacation for me now?”
II. Inference: requires NLP + Inference Engine + Database
Example: “What US Senator took money from a foreign entity?”
I. Natural Language: requires query analysis
Example: “What year was Leonardo Da Vinci born?”
I. Simple: solvable with Google's statistical algorithm
Example: “read write web blog”
Alex Iskold, ReadWriteWeb
50. 5 Core technologies for Semantic Search
Semantic tagging
Concept organization (Statistics)
Natural language processing (Linguistics)
Metadata / ontology (Semantic Web)
Reasoning (Artificial Intelligence)
51. Semantic Search
[Figure: Semantic Search Engine at the center, surrounded by Ontology/Metadata, Semantic Annotation, Query Processing, Semantic Processing, User Interaction, Reasoning, System Architecture, and Service Architecture.]
52. Categorical Features of Semantic Search Engine
Architecture
Stand-alone: maintains a concept index of documents
Meta search: uses subordinate search engines
Coupling between documents and ontologies
Tight coupling: data in documents refer explicitly to concepts of a specific ontology.
Loose coupling: not committed to any available ontology
User interaction
Transparent: semantic capabilities are invisible to the user.
Interactive: asks for clarification or recommendation
Hybrid: both
53. Categorical Features of Semantic Search Engine
User context
Learning: extracted dynamically from user interaction
Hard-coded: asks for the query category
Query modification
Manual: the user modifies the query.
Query rewriting: the query can be optimized by the system.
Graph-based: uses a graph traversal algorithm
Ontology construction
Anonymous: disregards the vocabulary and the semantics
Standard property: synonym, hyponym, …
Domain-specific property: domain ontology
Ontology technology
Language: RDF, OWL, …
A survey and classification of semantic search approaches by Christoph Mangold
54. Technology for Semantic Search
Augmenting traditional keyword search with semantic techniques
Semantic annotation
Complex constraint queries
Problem solving
Semantic connectivity discovery
55. Technology for Semantic Search
Augmenting traditional keyword search with semantic techniques
WordNet: synonyms and meronyms
Keyword -> Concept -> RDF Repository
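The keyword-to-concept step on this slide can be sketched as query expansion followed by a lookup in a triple store. The code below is a minimal illustration in plain Python under stated assumptions: the `SYNONYMS` and `MERONYMS` tables are hand-made stand-ins for WordNet lookups (real WordNet data is not used), and the `ex:` identifiers and the toy `REPOSITORY` are hypothetical.

```python
# Hand-made stand-ins for WordNet relations (assumed data, for illustration only).
SYNONYMS = {"car": {"automobile", "auto"}, "automobile": {"car", "auto"}}
MERONYMS = {"car": {"engine", "wheel"}}  # parts of the thing the keyword names

# Hypothetical mapping from surface words to concept identifiers.
WORD_TO_CONCEPT = {
    "car": "ex:Car", "automobile": "ex:Car", "auto": "ex:Car",
    "engine": "ex:Engine", "wheel": "ex:Wheel",
}

# Toy RDF-style repository of (subject, predicate, object) triples.
REPOSITORY = [
    ("ex:doc1", "ex:about", "ex:Car"),
    ("ex:doc2", "ex:about", "ex:Engine"),
    ("ex:doc3", "ex:about", "ex:Boat"),
]


def expand(keyword, use_meronyms=False):
    """Expand a keyword with its synonyms and, optionally, its meronyms."""
    terms = {keyword} | SYNONYMS.get(keyword, set())
    if use_meronyms:
        terms |= MERONYMS.get(keyword, set())
    return terms


def search(keyword, use_meronyms=False):
    """Map expanded terms to concepts, then find resources about those concepts."""
    concepts = {
        WORD_TO_CONCEPT[t]
        for t in expand(keyword, use_meronyms)
        if t in WORD_TO_CONCEPT
    }
    return sorted(s for s, p, o in REPOSITORY if p == "ex:about" and o in concepts)


print(search("automobile"))
print(search("car", use_meronyms=True))
```

Meronym expansion is left optional here because it broadens recall at the cost of precision; whether that trade-off is wanted depends on the application.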
60. Evaluation of Semantic Search
Search phase: Query construction
Feature: Free text input
• Functionality: keyword(s); natural language
• Interface components: single text entry; property-specific fields
Feature: Operators
• Functionality: Boolean operators; semantic constraints; regular expressions
• Interface components: application-specific syntax
Feature: Controlled terms
• Functionality: disambiguate input; restrict output; select predefined queries
• Interface components: value list; faceted; graph
Feature: User feedback
• Functionality: pre-query disambiguation
• Interface components: suggestion list; semantic auto completion
Search phase: Search algorithm
Feature: Syntactic matching
• Functionality: exact, prefix, substring match; minimal edit distance; stemming
Feature: Semantic matching
• Functionality: thesauri expansion; graph traversal; RDFS/OWL reasoning
61. Evaluation of Semantic Search
Search phase: Presentation
Feature: Data selection
• Functionality: selected property values; class specific template; display vocabulary
• Interface components: text; graph; tag cloud; map; timeline; calendar
Feature: Ordering
• Functionality: content and link structure based ranking
• Interface components: ordered list
Feature: Organization
• Functionality: clustering by property or path; dynamic clustering
• Interface components: tree; nested box structure; cluster map
Feature: User feedback
• Functionality: post-query disambiguation; query refinement; recommendation of related resources
• Interface components: facets; tag cloud; value list
refer to: http://swuiwiki.webscience.org/index.php/Semantic_Search_Survey
62. Applications of Semantic Search
Library 2.0: Find books related to “Semantic Search” written by TBL.
BPM: Find PO web services for car repair parts.
Medicine: What are the side effects of rifamycin?
e-Commerce: Search the specifications of RFID chips produced by SamTech.
Science: Which parameters change significantly during CO2 combustion?
Search = Generic Task
63. Summary
Semantic Search is a kind of generic task.
• More than simple document search
• Diverse applications in Bioinformatics, EcoScience, Medical Science, …
Ontology is a key player in Semantic Search.
• RDFa, Microformat, GRDDL, …
• RDF, RDF Schema, OWL, …
• Ontology Annotation and Population
• SPARQL and Query processing
Multi-disciplinary research and development.
• Natural Language Processing and Text Mining
• Web Science
User-friendly
• Diverse vertical semantic search with domain ontologies
• Visualization
• Mobile Search
64. Trends in Semantic Technology and Semantic Search
Thank you for your attention.