This presentation was given at the International Workshop on Interacting with Linked Data (ILD 2012), co-located with the 9th Extended Semantic Web Conference 2012, Heraklion, and is related to the publication of the same title.
Much research has been done to combine the fields of databases and natural language processing. While many works focus on the problem of deriving a structured query for a given natural language question, the problem of query verbalization (translating a structured query into natural language) is less explored. In this work we describe our approach to verbalizing SPARQL queries in order to create natural language expressions that are readable and understandable by everyday users. These expressions are helpful for search engines that generate SPARQL queries from user-provided natural language questions or keywords: displaying the verbalization of a generated query lets the user check whether the question has been understood correctly. While our approach enables verbalization of only a subset of SPARQL 1.1, this subset applies to 90% of the 209 queries in our training set. These observations are based on a corpus of SPARQL queries consisting of datasets from the QALD-1 challenge and the ILD2012 challenge.
The publication is available at http://www.aifb.kit.edu/images/b/b7/VerbalizingSparqlQueries.pdf
SPARTIQULATION - Verbalizing SPARQL queries
1. KIT – University of the State of Baden-Wuerttemberg and
National Research Center of the Helmholtz Association
Institute of Applied Informatics and Formal Description Methods (AIFB)
www.kit.edu
Basil Ell, Denny Vrandečić, Elena Simperl
International Workshop on Interacting with Linked Data, Extended Semantic Web Conference 2012
28 May 2012
2. Institute of Applied Informatics and Formal Description Methods, 29.05.2012
MOTIVATION
Basil Ell – Verbalizing SPARQL queries
6. Motivation
[QALD 2011], [Haase et al., 2009], [Shekarpour et al., 2011]: question or keywords → SPARQL
This work: SPARQL → Text
7. APPROACH
12. The pipeline architecture [Reiter and Dale, 2000]
SPARQL → Document Planner → DP (document plan) → Microplanner → TS (text specification) → Surface Realizer → Text
Document Planner:
1. Content determination: select the information to communicate
2. Document structuring: construct messages and decide on their ordering and structure
Microplanner:
3. Lexicalization: decide which words to use in order to express the content
4. Referring expression generation: decide how to refer to an entity
5. Aggregation: map to linguistic structures
Surface Realizer:
6. Linguistic realization: create natural language
7. Surface realization: add structure to text such as HTML elements
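The three pipeline stages can be pictured as functions that are chained together. The following is only a minimal sketch of that chaining, not the SPARTIQULATION implementation: the function names, the use of plain string lists for the document plan and text specification, and the placeholder behaviour of each stage are all assumptions made for illustration.

```python
from typing import List


def document_planner(facts: List[str]) -> List[str]:
    """Steps 1-2: keep the non-empty facts, in input order (placeholder DP)."""
    return [fact for fact in facts if fact]


def microplanner(messages: List[str]) -> List[str]:
    """Steps 3-5: placeholder that only normalizes whitespace (placeholder TS)."""
    return [message.strip() for message in messages]


def surface_realizer(phrases: List[str]) -> str:
    """Steps 6-7: join the phrases into one sentence."""
    return " ".join(phrases) + "."


def verbalize(facts: List[str]) -> str:
    """Run the three stages in sequence: facts -> DP -> TS -> text."""
    document_plan = document_planner(facts)
    text_specification = microplanner(document_plan)
    return surface_realizer(text_specification)
```

Each stage only depends on the output of the previous one, which is what makes it possible to work on the stages separately.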
16. Restrictions
SPARQL 1.0 SELECT
UNION and GROUP BY queries
"Disconnected" query graphs
Regular expressions etc.
18. Example query – graph representation
[Figure: query graph. Variables: ?uri, ?string, ?states, ?population. Resource: yago:AfricanCountries. Edge labels: dbo:capital, rdfs:label (optional). Filters: ?population < 1000000, LANG = en on ?string. Legend: selected variable, variable, resource, filter.]
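One possible in-memory form of this query graph is a list of labeled edges plus filter and selection sets. This is a sketch under assumptions: the slide shows only the edge labels dbo:capital and rdfs:label, so `rdf:type` and the placeholder `ex:population` are hypothetical labels, and the set of selected variables is inferred from the later main-entity slides rather than stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str
    label: str
    target: str
    optional: bool = False


# "rdf:type" and "ex:population" are assumed labels (not shown on the slide).
EDGES = (
    Edge("?states", "rdf:type", "yago:AfricanCountries"),
    Edge("?states", "dbo:capital", "?uri"),
    Edge("?uri", "ex:population", "?population"),
    Edge("?uri", "rdfs:label", "?string", optional=True),
)

# Filters attached to variables, as drawn in the figure.
FILTERS = {"?population": "< 1000000", "?string": "LANG = en"}

# Assumed SELECT variables, inferred from the main-entity discussion.
SELECTED = {"?uri", "?string"}


def variables(edges):
    """Return all variable nodes (names starting with '?') in sorted order."""
    nodes = {edge.source for edge in edges} | {edge.target for edge in edges}
    return sorted(node for node in nodes if node.startswith("?"))
```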
19. Document structuring – 4 Steps
1. Main entity identification
2. Graph transformation
3. Message creation
4. Create Document Plan
20. Example – identify main entity
Select a variable that is verbalized as subject
22. Example – identify main entity
Candidate ?string: Labels if available of capitals of African countries ...
Bad: subject is optional.
23. Example – identify main entity
Candidate ?population: Population < 10^6 of capitals of African countries ...
Bad: variable is not selected.
24. Example – identify main entity
Candidate ?states: African countries having capitals that have populations < 10^6 ...
Bad: variable is not selected.
25. Example – identify main entity
Candidate ?uri: Capitals of African countries having population < 10^6 ...
Good: Label for main entity is requested.
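The four candidate checks can be read as a small scoring rule: rule out variables that are optional or not selected, and prefer a variable whose label is requested. The sketch below encodes only that reading; the function name, the parameters, and the numeric scores are assumptions for illustration, and the actual heuristic may use further criteria.

```python
def choose_main_entity(variables, selected, optional, labelled):
    """Return the best subject candidate, or None if no variable qualifies.

    variables: iterable of variable names
    selected:  variables that appear in the SELECT clause
    optional:  variables bound only inside OPTIONAL
    labelled:  variables whose label is requested by the query
    """
    def score(var):
        if var not in selected or var in optional:
            return -1  # ruled out, as in the "Bad" cases
        return 1 + (var in labelled)  # prefer a variable whose label is requested

    best = max(sorted(variables), key=score, default=None)
    return best if best is not None and score(best) >= 0 else None


main = choose_main_entity(
    variables=["?uri", "?string", "?states", "?population"],
    selected={"?uri", "?string"},
    optional={"?string"},
    labelled={"?uri"},
)
```

With the example query, `?string` is ruled out as optional, `?states` and `?population` as not selected, and `?uri` remains.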
26. Graph transformation
Idea: Reduce the set of message types to simplify verbalization
Main entity is transformed into root node
Reversal of some edges necessary
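Turning the main entity into the root node can be sketched as a breadth-first traversal that flips every edge pointing toward the root. This is an illustrative sketch, assuming a connected, tree-shaped query graph given as (source, label, target) triples; the function name `reroot` and the edge labels `rdf:type` and `ex:population` are hypothetical.

```python
from collections import deque


def reroot(edges, root):
    """Orient all edges away from root.

    edges: list of (source, label, target) triples of a connected graph.
    Returns (parent, label, child, reversed) tuples in breadth-first order.
    Edges that would close a cycle are left out (tree-shaped input assumed).
    """
    remaining = list(edges)
    oriented, queue, seen = [], deque([root]), {root}
    while queue:
        node = queue.popleft()
        for edge in list(remaining):
            src, label, dst = edge
            if src == node and dst not in seen:
                oriented.append((src, label, dst, False))
                child = dst
            elif dst == node and src not in seen:
                oriented.append((dst, label, src, True))  # edge is reversed
                child = src
            else:
                continue
            remaining.remove(edge)
            seen.add(child)
            queue.append(child)
    return oriented


example_edges = [
    ("?states", "rdf:type", "yago:AfricanCountries"),
    ("?states", "dbo:capital", "?uri"),
    ("?uri", "ex:population", "?population"),
    ("?uri", "rdfs:label", "?string"),
]
oriented = reroot(example_edges, "?uri")
```

For the example query only the dbo:capital edge is reversed, since it is the only edge that originally points at `?uri`.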
28. Example – transformed graph
[Figure: transformed query graph with ?uri as root node. The dbo:capital edge is reversed (marked "-"); the remaining nodes, edges, and filters are unchanged.]
29. Message creation
Cut graph into independently verbalizable parts
Filters are stored in VAR messages
Messages (1-9) represent paths, message types are path classes
34. Example – messages
[Figure: transformed query graph with the message paths marked]
(7) M(RV)*RtR
(3) M(RV)*RV
(5) M(RV)*RlV
+ 4 x (10) VAR
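Since the message types are written like regular expressions over path symbols, classifying a path can be sketched as regex matching. The reading of the symbols is an assumption here: M for the main entity, R for a resource or property, V for a variable, Rt for an rdf:type edge, and Rl for a label edge. Only the three path types named on the slide are covered, and `ex:population` is a hypothetical edge label.

```python
import re

# Message types named on the slide, keyed by the numbers shown there.
MESSAGE_TYPES = {
    3: re.compile(r"M(RV)*RV"),
    5: re.compile(r"M(RV)*RlV"),
    7: re.compile(r"M(RV)*RtR"),
}


def encode(path):
    """Encode a path that starts at the main entity as a symbol string.

    path: alternating edge labels and node names after the main entity,
    e.g. ["dbo:capital", "?states", "rdf:type", "yago:AfricanCountries"].
    """
    symbols = ["M"]
    for index, item in enumerate(path):
        if index % 2 == 0:  # edge label
            symbols.append({"rdf:type": "Rt", "rdfs:label": "Rl"}.get(item, "R"))
        else:  # node: variable or resource
            symbols.append("V" if item.startswith("?") else "R")
    return "".join(symbols)


def classify(path):
    """Return the number of the first message type that fully matches, else None."""
    code = encode(path)
    for number, pattern in MESSAGE_TYPES.items():
        if pattern.fullmatch(code):
            return number
    return None
```

Using `fullmatch` rather than `match` matters: a prefix match would let the shorter pattern (3) claim paths that belong to the other two types.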
38. Document Plan
DP: (1) constraints, (2) requests, (3) modifiers
1. Constraints for main entity, e.g. its class, having population < 10^6
2. Requested information, e.g. its name
3. Modifiers, e.g. LIMIT, ORDER BY ...
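The three-part document plan can be sketched as a container with one list per part, read out in the fixed order constraints, requests, modifiers. The class and function names and the kind tags `cons`, `req`, and `mod` are assumptions for illustration (the slides use "cons" and "req" as annotations, "mod" is invented here).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DocumentPlan:
    constraints: List[str] = field(default_factory=list)
    requests: List[str] = field(default_factory=list)
    modifiers: List[str] = field(default_factory=list)

    def ordered(self) -> List[str]:
        """Messages in verbalization order: constraints, requests, modifiers."""
        return self.constraints + self.requests + self.modifiers


def build_plan(messages) -> DocumentPlan:
    """Sort (kind, text) pairs into the plan; kind is 'cons', 'req' or 'mod'."""
    plan = DocumentPlan()
    slots = {"cons": plan.constraints, "req": plan.requests, "mod": plan.modifiers}
    for kind, text in messages:
        slots[kind].append(text)
    return plan
```

The input order of the messages does not matter; only their kind decides where they end up in the plan.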
44. Example - verbalization
M(RV)*RtR (cons): ?uri, dbo:capital (reversed, "-"), ?states, yago:AfricanCountries
Text: Capitals of African countries
M(RV)*RV (cons): ?uri, ?population < 1000000
Text: that are having populations that are less than 1000000
M(RV)*RlV (req): ?uri, rdfs:label, ?string, LANG=en, optional
Text: and where available their English labels.
45. SUMMARY AND FUTURE WORK
47. Summary and Future Work
Summary:
Presented an approach for explaining SPARQL SELECT queries in natural language
Schema-agnostic
Directions for future work:
Tackle challenges in the two missing pipeline components
Exploitation of linguistic features of labels
Evaluation
48. QUESTIONS?
http://km.aifb.kit.edu/projects/spartiqulator/
The work presented here is supported by the European Union's 7th Framework Programme (FP7/2007-2013) under Grant Agreement 257790.
http://bit.ly/KGuDTL
49. REFERENCES
50. References
S. Shekarpour, S. Auer, A.-C. Ngonga Ngomo, D. Gerber, S. Hellmann, and C. Stadler. Keyword-driven SPARQL Query Generation Leveraging Background Knowledge. In International Conference on Web Intelligence, 2011.
E. Reiter and R. Dale. Building Natural Language Generation Systems. Natural Language Processing. Cambridge University Press, 2000.
P. Haase, D. M. Herzig, M. Musen, and D. T. Tran. Semantic Wiki Search. In L. A. P. et al., editor, 6th Annual European Semantic Web Conference, ESWC2009, Heraklion, Crete, Greece, volume 5554 of LNCS, pages 445-460. Springer Verlag, June 2009.
QALD 2011: http://www.sc.cit-ec.uni-bielefeld.de/qald-1