The growing size, heterogeneity and complexity of databases
demand the creation of strategies that help users and systems consume
data. Ideally, query mechanisms should be schema-agnostic or
vocabulary-independent, i.e. they should be able to match user queries,
expressed in the users' own vocabulary and syntax, to the data, abstracting data consumers
from the representation of the data. Despite being a central requirement across natural language interfaces and entity search, there is a lack of conceptual analysis of schema-agnosticism and of the associated semantic differences between queries and databases. This work provides an initial conceptualization of schema-agnostic queries, with a fine-grained classification that can support the scoping, evaluation and development of semantic matching approaches for schema-agnostic queries.
On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study
1. On the Semantic Mapping of Schema-agnostic
Queries: A Preliminary Study
André Freitas, João C. Pereira da Silva, Edward Curry
Insight Centre for Data Analytics
NLIWoD, ISWC 2014
Riva del Garda
4. Motivation
What is being evaluated by the test collection?
[Figure: diagram with the labels semantic matching, QA/NLI, the pairs (Q0, R0), (Q1, R1), ..., (Qn, Rn), and f-measure]
5. Goals
Provide a preliminary categorization of the semantic
matching (schema-agnosticism) classes.
Support a conceptual understanding of the semantic
phenomena behind schema-agnostic queries.
Applications:
- Help with the design and evaluation of schema-agnostic
query mechanisms
- Relevant to Question Answering and Natural
Language Interfaces
6. Semantic Tractability
Popescu et al. (2003)
Towards a Theory of Natural Language Interfaces to Databases
Definition focuses on soundness and completeness
conditions for mapping Natural Language Queries to Database
elements
7. Semantic Tractability
Leaves many queries outside the tractability scope
Conditions:
- Query-Database syntactic isomorphism
- Explicit and unambiguous synonymic mapping
The goal here is to provide an all-inclusive categorization system
8. Dimensions of Query-Database Semantic
Heterogeneity
Methodology for the creation of a taxonomy of lexico-semantic
differences
Listing of concepts expressed in the existing semantic
heterogeneity taxonomies
- George, 2005
- Colomb, 1997
- Parent & Spaccapietra, 1998
- Kashyap & Sheth, 1996
Elimination of concepts which were not relevant in the context of
the query-database semantic differences
Merging and renaming of equivalent concepts
13. Semantic Mapping Types
Classifies each semantic mapping
According to the semantic heterogeneity classes
Taking into account some semantic phenomena
(ambiguity, vagueness)
23. Example test collection analysis
Test collection X
Has 4 distinct semantic resolvability classes
50% are trivial mappings
23% are lexical mappings
27% are synonymic mappings
100% of the predicates are structure preserving
100% of the mapping cardinalities are 1:1
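A breakdown like the one above can be computed once each query-to-database mapping in a test collection has been annotated with a mapping type. The following is a minimal sketch in Python, assuming a hypothetical list of labels; the function name `mapping_type_distribution` and the sample data are illustrative and are not taken from the slides.

```python
from collections import Counter


def mapping_type_distribution(labels):
    """Return the percentage of mappings that fall under each mapping type."""
    if not labels:
        return {}
    counts = Counter(labels)
    total = len(labels)
    return {label: 100.0 * n / total for label, n in counts.items()}


# Hypothetical annotations for a tiny collection (illustrative only).
sample_labels = ["trivial", "trivial", "lexical", "synonymic"]
distribution = mapping_type_distribution(sample_labels)
```

The same counting could be applied to any other annotated dimension, such as the mapping cardinality.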
24. Example system evaluation
System Y
Addresses 5 out of 10 semantic resolvability classes
(AP=conceptual, PS=*, MC=1:1, SE=*, M=*, CT=*)
- map = 0.51, recall = 0.7
...
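A class pattern such as the one above can be read as a filter over annotated queries, where `*` accepts any value for a dimension. The following is a minimal sketch in Python, assuming each query carries an annotation keyed by the dimension abbreviations shown on the slide and a per-query recall value. The function names, the values `other` and `1:N`, and the sample data are hypothetical and serve only to illustrate the filtering.

```python
def matches(pattern, annotation):
    """True if every pattern dimension is '*' or equals the annotated value."""
    return all(
        value == "*" or annotation.get(dim) == value
        for dim, value in pattern.items()
    )


def mean_recall(pattern, queries):
    """Average the per-query recall over the queries selected by the pattern."""
    selected = [q for q in queries if matches(pattern, q["annotation"])]
    if not selected:
        return None  # the collection has no query in this class
    return sum(q["recall"] for q in selected) / len(selected)


# Pattern copied from the slide; '*' means "any value".
pattern = {"AP": "conceptual", "PS": "*", "MC": "1:1",
           "SE": "*", "M": "*", "CT": "*"}

# Hypothetical annotated queries; wildcard dimensions are omitted for brevity.
queries = [
    {"annotation": {"AP": "conceptual", "MC": "1:1"}, "recall": 1.0},
    {"annotation": {"AP": "conceptual", "MC": "1:1"}, "recall": 0.5},
    {"annotation": {"AP": "other", "MC": "1:1"}, "recall": 0.0},
    {"annotation": {"AP": "conceptual", "MC": "1:N"}, "recall": 1.0},
]

class_recall = mean_recall(pattern, queries)
```

Reporting one figure per pattern, rather than a single aggregate, shows which classes a system actually handles.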
25. Summary
NLI/QA systems have semantic matching (schema-agnosticism)
at their center
The proposed categorization can be used for a more principled
interpretation of the results of NLI/QA systems
... and also of which dimensions evaluation campaigns actually
measure
It supports deeper comparative analysis
Future work includes the categorization of the QALD test
collection