IEEE International Conference on Semantic Computing (ICSC 2011).
A Multidimensional Semantic Space for Data Model Independent Queries over RDF Data
André Freitas, João Gabriel Oliveira, Edward Curry, Seán O’Riain
http://andrefreitas.org/papers/preprint_multidimensional_ieee_icsc_2011.pdf
Abstract: The vision of creating a Linked Data Web brings together the challenge of allowing queries across highly heterogeneous and distributed datasets. In order to query Linked Data on the Web today, end-users need to be aware of which datasets potentially contain the data and also which data model describes these datasets. The process of allowing users to expressively query relationships in RDF while abstracting them from the underlying data model represents a fundamental problem for Web-scale Linked Data consumption. This article introduces a multidimensional semantic space model which enables data model independent natural language queries over RDF data. The center of the approach relies on the use of a distributional semantic model to address the level of semantic interpretation demanded to build the data model independent approach. The final multidimensional semantic space proved to be flexible and precise under real-world query conditions, achieving mean reciprocal rank = 0.516, avg. precision = 0.482 and avg. recall = 0.491.
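The three evaluation measures named in the abstract can be illustrated with a minimal sketch. The function names and the toy inputs below are assumptions made for illustration; they are not taken from the paper's evaluation code.

```python
def mean_reciprocal_rank(ranked_results, gold_answers):
    """Average of 1/rank of the first correct answer per query (0 if none)."""
    total = 0.0
    for results, gold in zip(ranked_results, gold_answers):
        for rank, item in enumerate(results, start=1):
            if item in gold:
                total += 1.0 / rank
                break
    return total / len(ranked_results)


def precision_recall(results, gold):
    """Precision and recall of one result list against a set of correct answers."""
    hits = len(set(results) & gold)
    precision = hits / len(results) if results else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical ranked answers for two queries and their correct answers.
ranked = [["a", "b"], ["x", "y", "z"]]
gold = [{"b"}, {"x"}]
mrr = mean_reciprocal_rank(ranked, gold)
```

Average precision and average recall are then the means of the per-query values returned by precision_recall.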
1. Digital Enterprise Research Institute www.deri.ie
A Multidimensional Semantic Space for Data Model Independent Queries over RDF Data
André Freitas, João Gabriel Oliveira, Edward Curry, Seán O’Riain
Copyright 2009 Digital Enterprise Research Institute. All rights reserved.
2. Outline
Problem Space & Motivation
Description of the Approach
Evaluation
Conclusion & Future Work
3. Linked Data
Uses the Web infrastructure and standards to
expose and interlink datasets.
Linked Data vision:
The Web as a single Dataspace.
Web of interlinked datasets.
5. Queries over Linked Data
Linked Data brings a fundamental challenge for data
consumption:
How to query heterogeneous and distributed datasets?
At Web scale it is unfeasible for end-users to be aware of the
location and structure of datasets.
Demand for new query mechanisms for Linked Data (data
model independence).
7. Fundamental Problem
From which university did the wife of
Barack Obama graduate?
Popescu (2003): Semantic tractability problem.
8. Semantic Matching Problem
From which university did the wife of Barack Obama graduate?
9. Semantic Matching Problem
From which university did the wife of Barack Obama graduate?
Entity identification
10. Semantic Matching Problem
From which university did the wife of Barack Obama graduate?
Entity search
11. Semantic Matching Problem
From which university did the wife of Barack Obama graduate?
Approximate semantic matching
14. Semantic Matching Problem
From which university did the wife of Barack Obama graduate?
Structural matching
15. Semantic Matching Problem
From which university did the wife of Barack Obama graduate?
T-Space
16. Strategy
Best-effort query model (ranked results).
Use of a distributional semantic model.
Two-phase search process combining entity search
with spreading activation search.
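The two-phase strategy can be sketched on the example question about Barack Obama's wife. This is a minimal illustration under stated assumptions, not the authors' implementation: the triples are toy data, the RELATEDNESS table is hand-filled and stands in for the distributional model, and the names entity_search, spread and the 0.5 threshold are hypothetical.

```python
# Toy RDF-like triples (hypothetical data for illustration).
TRIPLES = [
    ("Barack_Obama", "spouse", "Michelle_Obama"),
    ("Barack_Obama", "birthPlace", "Honolulu"),
    ("Michelle_Obama", "almaMater", "Princeton_University"),
    ("Michelle_Obama", "birthPlace", "Chicago"),
]

# Hand-filled scores standing in for a distributional relatedness measure.
RELATEDNESS = {
    ("wife", "spouse"): 0.9,
    ("graduate", "almaMater"): 0.8,
}


def relatedness(query_term, dataset_term):
    return RELATEDNESS.get((query_term, dataset_term), 0.0)


def entity_search(label):
    """Phase 1: find the pivot entity by a normalized label match."""
    key = label.lower().replace(" ", "_")
    subjects = {s for s, _, _ in TRIPLES}
    return next((e for e in subjects if e.lower() == key), None)


def spread(pivot, query_terms, threshold=0.5):
    """Phase 2: for each query term, follow the properties whose relatedness
    to that term reaches the threshold, multiplying scores along the path."""
    frontier = [(pivot, 1.0, [pivot])]
    for term in query_terms:
        next_frontier = []
        for node, score, path in frontier:
            for s, p, o in TRIPLES:
                r = relatedness(term, p)
                if s == node and r >= threshold:
                    next_frontier.append((o, score * r, path + [p, o]))
        frontier = next_frontier
    # Best-effort model: return all surviving paths, ranked by score.
    return sorted(frontier, key=lambda item: -item[1])


pivot = entity_search("Barack Obama")
results = spread(pivot, ["wife", "graduate"])
```

Each entry of results holds the reached node, its accumulated score and the path of properties followed from the pivot, which matches the ranked, best-effort character of the query model.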
22. Semantic Relatedness
Computation of a measure of “semantic proximity”
between two terms.
Allows a semantic approximate matching between
query terms and dataset terms.
Most existing approaches use WordNet-based
solutions for approximate semantic matching.
Distributional semantic approaches address the
limitations of WordNet-based solutions.
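One common way to turn term vectors into a "semantic proximity" score is cosine similarity. The sketch below assumes that each term is already represented as a sparse vector of weights; the vectors and their dimension names are invented for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical term vectors (the weights are made up for the example).
wife = {"Marriage": 0.9, "Family": 0.4}
spouse = {"Marriage": 0.8, "Family": 0.5}
university = {"Education": 1.0}

close = cosine(wife, spouse)        # high: the vectors point the same way
distant = cosine(wife, university)  # zero: no shared dimensions
```

A query term such as "wife" can then be matched to the dataset term "spouse" even though the two strings share no characters.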
23. Distributional Semantics
Assumption: the context surrounding a given word
in a text provides important information about its
meaning.
Meaning is mediated by word distribution in the
corpora.
Simplified semantic model.
Opera is an art form in which singers and musicians perform a
dramatic work combining text (called a libretto) and musical score.
Opera is part of the Western classical music tradition. Opera
incorporates many of the elements of spoken theatre, such as acting,
scenery, and costumes and sometimes includes dance. The
performance is typically given in an opera house, accompanied by an
orchestra or smaller musical ensemble.
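The distributional assumption can be made concrete by counting which words occur near each other. The following sketch builds simple co-occurrence vectors from the first two sentences of the passage above; the window size of 2 and the function name context_vectors are illustrative choices, and sentence boundaries are ignored for brevity.

```python
import re
from collections import Counter, defaultdict

TEXT = (
    "Opera is an art form in which singers and musicians perform a "
    "dramatic work combining text (called a libretto) and musical score. "
    "Opera is part of the Western classical music tradition."
)


def context_vectors(text, window=2):
    """Map each word to counts of the words seen within `window` positions."""
    tokens = re.findall(r"[a-z]+", text.lower())
    vectors = defaultdict(Counter)
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                vectors[word][tokens[j]] += 1
    return vectors


vectors = context_vectors(TEXT)
opera_context = vectors["opera"]
```

Over a large corpus, words that appear in similar contexts end up with similar vectors, which is what allows meaning to be approximated from distribution alone.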
24. Explicit Semantic Analysis (ESA)
Based on Wikipedia.
Interpretation vector using Wikipedia article titles.
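An interpretation vector can be sketched as follows: every term receives a TF-IDF weight for each article in which it occurs, and a text is represented by summing the weights of its terms per article title. The three miniature "articles" below are hypothetical stand-ins for Wikipedia, and the function names are assumptions.

```python
import math
import re
from collections import Counter

# Hypothetical miniature articles standing in for Wikipedia concepts.
CONCEPTS = {
    "Opera": "opera singers musicians perform dramatic work libretto musical score",
    "University": "university students graduate degree education campus",
    "Marriage": "marriage wife husband spouse wedding",
}


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def build_index(concepts):
    """Map each term to its TF-IDF weight in every article that contains it."""
    n = len(concepts)
    tf = {title: Counter(tokenize(body)) for title, body in concepts.items()}
    df = Counter(term for counts in tf.values() for term in counts)
    index = {}
    for title, counts in tf.items():
        for term, count in counts.items():
            idf = math.log(n / df[term])
            index.setdefault(term, {})[title] = count * idf
    return index


def interpretation_vector(text, index):
    """Sum the per-article weights of every term in the text."""
    vector = Counter()
    for term in tokenize(text):
        for title, weight in index.get(term, {}).items():
            vector[title] += weight
    return dict(vector)


INDEX = build_index(CONCEPTS)
wife_vector = interpretation_vector("wife", INDEX)
query_vector = interpretation_vector("graduate university", INDEX)
```

The dimensions of the resulting vectors are article titles, so two terms can be compared by the articles they are associated with rather than by their surface form.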
25. Building the T-Space (Steps)
Building the distributional semantic model using
ESA.
Construction of instance spaces (TF/IDF).
Construction of class spaces (ESA).
Construction of relation spaces (ESA).
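The TF/IDF step for the instance space can be sketched as an index over instance labels that is searched with the entity name from the query. The URIs and labels below are hypothetical, and the smoothed IDF (log(n/df) + 1) is a choice made for this example so that tokens shared by all labels still contribute; the slides do not specify the exact weighting.

```python
import math
import re
from collections import Counter

# Hypothetical instance labels, e.g. as they might appear in rdfs:label.
INSTANCES = {
    "dbpedia:Barack_Obama": "Barack Obama",
    "dbpedia:Michelle_Obama": "Michelle Obama",
    "dbpedia:Barack_Obama_Sr.": "Barack Obama Sr.",
}


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(labels):
    """Build one TF-IDF vector per instance from its label tokens."""
    n = len(labels)
    counts = {uri: Counter(tokenize(label)) for uri, label in labels.items()}
    df = Counter(token for c in counts.values() for token in c)
    return {
        uri: {t: tf * (math.log(n / df[t]) + 1.0) for t, tf in c.items()}
        for uri, c in counts.items()
    }


def search(query, vectors):
    """Rank instances by the length-normalized overlap with the query tokens."""
    q = Counter(tokenize(query))
    scores = {}
    for uri, vec in vectors.items():
        dot = sum(vec.get(t, 0.0) * w for t, w in q.items())
        norm = math.sqrt(sum(w * w for w in vec.values()))
        if dot > 0:
            scores[uri] = dot / norm
    return sorted(scores, key=scores.get, reverse=True)


VECTORS = tfidf_vectors(INSTANCES)
ranking = search("Barack Obama", VECTORS)
```

Length normalization makes the exact label match rank above longer labels that merely contain the query tokens.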
26. Building the T-Space
[Figure: T-Space diagram showing instances, classes, properties, and relations.]
42. Conclusion & Future Work
The T-Space semantic model shows a promising direction for
providing data model independent queries over RDF data.
Improvement of semantic tractability.
The distributional semantic model supports a flexible
matching between query terms and dataset terms in a
best-effort scenario.
Further improvements are needed:
QA features (e.g. answer type detection, operators).
User feedback mechanisms (disambiguation).
Entity recognition for complex classes.