A Multilingual Semantic Wiki Based on
Controlled Natural Language
Tobias Kuhn
Chair of Sociology, in particular of Modeling and Simulation, ETH Zurich,
Switzerland
Insight, National University of Ireland, Galway
19 August 2014
About This Talk
This talk is mainly based on the following papers:
Kaarel Kaljurand and Tobias Kuhn. A Multilingual Semantic Wiki
Based on Attempto Controlled English and Grammatical Framework.
In Proceedings of the 10th Extended Semantic Web Conference
(ESWC). 2013.
http://purl.org/tkuhn/eswc2013acewikigf
Kaarel Kaljurand, Tobias Kuhn, and Laura Canedo. Collaborative
multilingual knowledge management based on controlled natural
language. Semantic Web. Accepted, to appear.
http://www.semantic-web-journal.net/system/files/swj524.pdf
Tobias Kuhn, ETH Zurich A Multilingual Semantic Wiki 2 / 27
Imagine ...
... that Wikipedia can check consistency and answer
questions about the contained knowledge, and
... that all content is instantly available in all
languages!
• AceWiki is a semantic wiki
• Articles are written in Attempto Controlled English (ACE)
• These sentences are internally translated into the Semantic Web
language OWL
• An OWL reasoner is built in to answer questions and detect
inconsistencies
• Special editor for writing ACE statements
• Extended to support multilinguality
Monolingual AceWiki: Screenshot
Attempto Controlled English (ACE)
Subset of natural English:
• Conjunction, disjunction, negation, if-then, ...
• Anaphoric references: pronouns, definite noun phrases, variables
• Quantifiers: every, no, at least 3, ...
• Content words: proper names, nouns, verbs, adjectives, ...
Grammar is fixed, but users can change content words.
Deterministic ambiguity handling:
• Anaphora resolution (France borders Spain and it borders
Portugal.)
• Quantifier scope (Every country borders a country.)
• Attachment (Every EU-country borders a country that is an
EU-country and is a NATO-country.)
Well-defined translations to and from first-order logic, OWL, ...
Predictive Editor
Consistency Checking
AceWiki ensures consistency by checking every new statement.
Question Answering
AceWiki supports simple wh-questions.
Monolingual AceWiki: Demo
http://attempto.ifi.uzh.ch/acewiki/
ACE Reasoning via Translation to OWL
Every country that does not border a sea is a landlocked-country.
SubClassOf(
ObjectIntersectionOf(
:country
ObjectComplementOf(
ObjectSomeValuesFrom(
:border
:sea
)
)
)
:landlocked-country
)
Which country is a landlocked-country?
ObjectIntersectionOf(
:country
:landlocked-country
)
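The two OWL expressions above can be reproduced with a small sketch. It only builds the functional-style syntax strings from a hand-written expression tree; it does not parse ACE and does not reason. The class names (Named, Some, Not, And) and the helper functions are hypothetical, introduced for illustration, and are not part of AceWiki or of any OWL library.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical mini-AST covering only the constructs used on this slide.
@dataclass(frozen=True)
class Named:
    name: str


@dataclass(frozen=True)
class Some:
    prop: str
    filler: "Expr"


@dataclass(frozen=True)
class Not:
    arg: "Expr"


@dataclass(frozen=True)
class And:
    left: "Expr"
    right: "Expr"


Expr = Union[Named, Some, Not, And]


def render(e: Expr) -> str:
    """Render a class expression in OWL functional-style syntax."""
    if isinstance(e, Named):
        return f":{e.name}"
    if isinstance(e, Some):
        return f"ObjectSomeValuesFrom(:{e.prop} {render(e.filler)})"
    if isinstance(e, Not):
        return f"ObjectComplementOf({render(e.arg)})"
    if isinstance(e, And):
        return f"ObjectIntersectionOf({render(e.left)} {render(e.right)})"
    raise TypeError(f"unknown expression: {e!r}")


def subclass_of(sub: Expr, sup: Expr) -> str:
    """Render a SubClassOf axiom."""
    return f"SubClassOf({render(sub)} {render(sup)})"


# "Every country that does not border a sea is a landlocked-country."
axiom = subclass_of(
    And(Named("country"), Not(Some("border", Named("sea")))),
    Named("landlocked-country"),
)

# "Which country is a landlocked-country?"
query = render(And(Named("country"), Named("landlocked-country")))

print(axiom)
print(query)
```

The question becomes a class expression rather than an axiom: answering it amounts to asking the reasoner for the individuals that belong to that class.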
Evaluation
Two small usability experiments with earlier versions of AceWiki:
• Altogether 26 untrained participants
• Task: Collaborative creation of a knowledge base
Results:
• 78%-81% of the sentences were correct and sensible
• 61%-70% of them were complex (containing negations,
implications, disjunctions or number restrictions)
• Creation of a correct sentence every 5–6 minutes
• Definition of a new word every 5–7 minutes
→ Even untrained users can effectively use AceWiki
Multilingual AceWiki: AceWiki-GF
General ideas:
• Make wiki content available in different languages
• Automatically translated content using high-quality rule-based
machine translation: Grammatical Framework (GF)
• Language switching like in Wikipedia
• Localization of the user interface
Grammatical Framework (GF)
GF is a framework for multilingual grammar engineering:
• Rule-based
• Functional programming language (based on Haskell) optimized
to handle natural language
• Resource Grammar Library implementing common morphological
and syntactic structures
• Mildly context sensitive
• Bidirectional translations: concrete language ⇔ abstract syntax
GF grammars and translations
GF grammars consist of:
• One language-neutral abstract syntax
• Concrete syntaxes specify words, agreement, word order, etc. by
implementing the abstract categories and functions
Example
border : Country -> Country -> Relation
English: border x y = x!Nom + "borders" + y!Nom
Estonian: border x y = x!Gen + "naaber on" + y!Nom
GF translations consist of:
• First, parse a string in the original language to a tree (or trees)
in the abstract syntax
• Then, linearize these trees as strings in the target language
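The parse-then-linearize idea can be imitated in a few lines of Python. This is a toy sketch, not GF: the abstract syntax has the single function border, parsing is done by generate-and-test over a two-word lexicon, and the Estonian word forms ("Portugali", "Hispaania") are illustrative assumptions that are not taken from the slides.

```python
from itertools import permutations

# Toy lexicon: abstract constant -> inflection table (case -> word form).
# The Estonian forms are assumptions for illustration only.
LEXICON = {
    "Eng": {
        "Portugal": {"Nom": "Portugal"},
        "Spain": {"Nom": "Spain"},
    },
    "Est": {
        "Portugal": {"Nom": "Portugal", "Gen": "Portugali"},
        "Spain": {"Nom": "Hispaania", "Gen": "Hispaania"},
    },
}


# Concrete syntaxes: one linearization rule per language for `border`.
def lin_eng(x, y):
    table = LEXICON["Eng"]
    return f'{table[x]["Nom"]} borders {table[y]["Nom"]}'


def lin_est(x, y):
    table = LEXICON["Est"]
    return f'{table[x]["Gen"]} naaber on {table[y]["Nom"]}'


LINEARIZERS = {"Eng": lin_eng, "Est": lin_est}


def linearize(tree, lang):
    """Turn an abstract tree ("border", x, y) into a string of `lang`."""
    fun, x, y = tree
    if fun != "border":
        raise ValueError(f"unknown function: {fun}")
    return LINEARIZERS[lang](x, y)


def parse(text, lang):
    """Return every abstract tree whose linearization in `lang` is `text`."""
    names = LEXICON[lang]
    return [
        ("border", x, y)
        for x, y in permutations(names, 2)
        if linearize(("border", x, y), lang) == text
    ]


def translate(text, src, tgt):
    """Parse in the source language, then linearize in the target language."""
    return [linearize(tree, tgt) for tree in parse(text, src)]


print(translate("Portugal borders Spain", "Eng", "Est"))
```

Because parse returns a list, an ambiguous input would yield several trees and therefore several translations, which mirrors the "tree (or trees)" wording above.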
Multilingual AceWiki: Screenshot
Multilingual AceWiki: Demo
http://attempto.ifi.uzh.ch/acewiki-gf/
ACE-in-GF
• Multiple controlled versions of natural languages that map to
ACE (and to each other)
• As a result, they can be bidirectionally mapped to various formal
languages already supported by ACE
ACE-in-GF: Example
German: Jedes Land, das nicht an ein Meer grenzt, ist ein
Binnenland.
ACE-in-GF tree:
baseText (sText (s (vpS (everyNP (relCN (cn_as_VarCN country_CN)
(neg_predRS which_RP (v2VP border_V2 (thereNP_as_NP
(aNP (cn_as_VarCN sea_CN))))))) (npVP (thereNP_as_NP
(aNP (cn_as_VarCN landlocked_country_CN)))))))
ACE: Every country that does not border a sea is a
landlocked-country.
OWL:
SubClassOf(
ObjectIntersectionOf(
:country
ObjectComplementOf(
ObjectSomeValuesFrom( :border :sea )
)
)
:landlocked-country
)
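To make the structure of the ACE-in-GF tree easier to inspect, the following sketch reads the parenthesized notation into nested Python lists and collects the leaf constants. The functions parse_tree and leaves are hypothetical helpers written for this illustration; they are not part of GF or ACE-in-GF.

```python
TREE = (
    "baseText (sText (s (vpS (everyNP (relCN (cn_as_VarCN country_CN) "
    "(neg_predRS which_RP (v2VP border_V2 (thereNP_as_NP "
    "(aNP (cn_as_VarCN sea_CN))))))) (npVP (thereNP_as_NP "
    "(aNP (cn_as_VarCN landlocked_country_CN)))))))"
)


def parse_tree(text):
    """Parse 'f (g a) b' notation into nested lists [fun, arg, ...]."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def application():
        # Read atoms and parenthesized subtrees until ')' or end of input.
        nonlocal pos
        items = []
        while pos < len(tokens) and tokens[pos] != ")":
            if tokens[pos] == "(":
                pos += 1
                items.append(application())
                if pos >= len(tokens) or tokens[pos] != ")":
                    raise ValueError("missing ')'")
                pos += 1
            else:
                items.append(tokens[pos])
                pos += 1
        if not items:
            raise ValueError("empty application")
        # A lone atom stays a string; otherwise [function, arguments...].
        return items[0] if len(items) == 1 else items

    tree = application()
    if pos != len(tokens):
        raise ValueError("unbalanced ')'")
    return tree


def leaves(tree):
    """Collect the argument-free constants of the tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for arg in tree[1:] for leaf in leaves(arg)]


tree = parse_tree(TREE)
print(tree[0])
print(leaves(tree))
```

The leaves are the lexical entries and the relative pronoun; everything above them is language-neutral structure that each concrete syntax linearizes in its own way.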
ACE-in-GF: Implementation
Implementation of the ACE syntax:
• Targeting the subset of ACE that can be mapped to OWL
• Almost 100% coverage at almost 0% ambiguity
Support of most RGL languages:
• Bulgarian, Catalan, Chinese, Danish, Dutch, English, Finnish,
French, German, Greek, Hindi, Italian, Latvian, Norwegian,
Polish, Romanian, Russian, Spanish, Swedish, Thai, Urdu
• RGL-based design provides automatic increase in quality and
language-coverage over time
Status
• Some precision problems, e.g. with anaphoric references
• Ambiguity and coverage problems in some languages
Evaluation of AceWiki-GF
Hypothesis: A group of users reaches almost the same level of agreement
on the content of an article presented to them in different languages as
when the article is presented to all of them in the same language.
Design
• Based on a 500-word lexicon on European geography in three
languages: English, German and Spanish
• 30 participants accessed AceWiki-GF and wrote sentences in
their language (10 participants for each language)
• They had to enter true and false sentences and tag them as such
• In a post-editing task, each participant checked the output of
two other participants: one translated from another language
and one written in the same language (true/false tags were
removed and sentences shuffled); they were asked to remove all
false sentences
Evaluation of AceWiki-GF: Results
30 participants spent on average 37 minutes using AceWiki-GF,
creating 316 sentences in total.
Definition of agreement level: $(T_k + F_d)/S$
$S$ is the total number of sentences, $T_k$ the number of sentences marked as true and kept, and $F_d$ the number marked as false and deleted.
Agreement level (difference is not significant):
82.2% without translation
84.0% with translation
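The agreement level defined on this slide is a simple ratio. A minimal sketch follows; the function name and the example counts are hypothetical and chosen only to illustrate the formula, they are not data from the study.

```python
def agreement_level(true_kept: int, false_deleted: int, total: int) -> float:
    """Share of sentences on which author tag and post-editor decision agree.

    true_kept:     sentences marked as true and kept
    false_deleted: sentences marked as false and deleted
    total:         total number of sentences
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if true_kept < 0 or false_deleted < 0:
        raise ValueError("counts must be non-negative")
    if true_kept + false_deleted > total:
        raise ValueError("agreeing sentences cannot exceed the total")
    return (true_kept + false_deleted) / total


# Hypothetical counts: 10 sentences, 6 true and kept, 3 false and deleted.
example = agreement_level(true_kept=6, false_deleted=3, total=10)
print(example)
```

The one remaining sentence in the example is a disagreement: either a true sentence that was deleted or a false sentence that was kept.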
Evaluation of AceWiki-GF: Results
Assumption: Translation introduces a constant translation error rate r
that has the effect of reducing the agreement level a to (1 − r) · a.
New hypothesis: The translation error rate is less than 5%.
Agreement level:
78.1% with hypothetical translation (r = 5%)
84.0% with translation
p-value with one-tailed Wilcoxon signed-rank test: 0.046
→ With AceWiki-GF, translation error rate is less than 5%
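The 78.1% figure follows from applying the assumed formula to the 82.2% agreement level measured without translation. The sketch below only reproduces that arithmetic; the function name is hypothetical, and the Wilcoxon signed-rank test is not re-run here because it needs the per-participant data.

```python
def adjusted_agreement(a: float, r: float) -> float:
    """Agreement level a reduced by a constant translation error rate r."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie between 0 and 1")
    return (1.0 - r) * a


baseline = 82.2  # percent, agreement level without translation
hypothetical = adjusted_agreement(baseline, 0.05)
print(f"{hypothetical:.1f}%")
```

The observed agreement level with translation (84.0%) lies above this hypothetical value, which is the comparison the significance test evaluates.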
Evaluation of AceWiki-GF: Feedback
Questionnaire for the participants contained these questions:
1. Was AceWiki Geography easy or difficult to use in general?
2. Was the sentence editor easy or difficult to use?
3. Was creating true and false statements easy or difficult to perform?
Possible answers: “very difficult” (0), “difficult” (1), “medium” (2),
“easy” (3), and “very easy” (4)
Results:
1. Average: 2.93 (∼“easy”)
2. Average: 2.77 (∼“easy”)
3. Average: 2.70 (∼“easy”)
The Future...?
Can we make a truly multilingual Wikipedia?
• Store main content in a semantic representation
• Verbalization in different languages
• All content is instantly available in all languages (once the
required vocabulary is defined)
• Breaking the current dominance of English and putting an end
to the lock-out of users speaking less widespread or
underrepresented languages
• Contributing to the Semantic Web
Related:
• http://www.wikidata.org
• http://meta.wikimedia.org/wiki/A_proposal_towards_a_multilingual_Wikipedia
Links
ACE parser (APE) source code: https://github.com/Attempto/APE
ACE-in-GF source code: http://github.com/Attempto/ACE-in-GF
AceWiki and AceWiki-GF:
• Source code: http://github.com/AceWiki/AceWiki
• Demos (non-GF): http://attempto.ifi.uzh.ch/acewiki/
• Demos (GF): http://attempto.ifi.uzh.ch/acewiki-gf/
MOLTO project web site: http://www.molto-project.eu
Attempto web site: http://attempto.ifi.uzh.ch
GF: http://www.grammaticalframework.org
Thank you for your attention!
If you are interested in Controlled Natural Languages,
come visit us at CNL 2014!
Fourth Workshop on Controlled Natural Language
20–22 August 2014, Galway
http://attempto.ifi.uzh.ch/site/cnl2014/

More Related Content

Viewers also liked

Controlled Natural Language and Opportunities for Standardization
Controlled Natural Language and Opportunities for StandardizationControlled Natural Language and Opportunities for Standardization
Controlled Natural Language and Opportunities for StandardizationTobias Kuhn
 
How to Evaluate Controlled Natural Languages
How to Evaluate Controlled Natural LanguagesHow to Evaluate Controlled Natural Languages
How to Evaluate Controlled Natural LanguagesTobias Kuhn
 
Underspecified Scientific Claims in Nanopublications
Underspecified Scientific Claims in NanopublicationsUnderspecified Scientific Claims in Nanopublications
Underspecified Scientific Claims in NanopublicationsTobias Kuhn
 
2013 Report on AngeI Investing Activity in Canada: Accelerating the Asset Class
2013 Report on AngeI Investing Activity in Canada: Accelerating the Asset Class2013 Report on AngeI Investing Activity in Canada: Accelerating the Asset Class
2013 Report on AngeI Investing Activity in Canada: Accelerating the Asset ClassIoana Stoica
 
Data Publishing and Post-Publication Reviews
Data Publishing and Post-Publication ReviewsData Publishing and Post-Publication Reviews
Data Publishing and Post-Publication ReviewsTobias Kuhn
 
Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linke...
Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linke...Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linke...
Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linke...Tobias Kuhn
 
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...Tobias Kuhn
 
AceWiki: Controlled English in a Semantic Wiki
AceWiki: Controlled English in a Semantic WikiAceWiki: Controlled English in a Semantic Wiki
AceWiki: Controlled English in a Semantic WikiTobias Kuhn
 
A Decentralized Network for Publishing Linked Data — Nanopublications, Trusty...
A Decentralized Network for Publishing Linked Data — Nanopublications, Trusty...A Decentralized Network for Publishing Linked Data — Nanopublications, Trusty...
A Decentralized Network for Publishing Linked Data — Nanopublications, Trusty...Tobias Kuhn
 
Broadening the Scope of Nanopublications
Broadening the Scope of NanopublicationsBroadening the Scope of Nanopublications
Broadening the Scope of NanopublicationsTobias Kuhn
 

Viewers also liked (11)

Controlled Natural Language and Opportunities for Standardization
Controlled Natural Language and Opportunities for StandardizationControlled Natural Language and Opportunities for Standardization
Controlled Natural Language and Opportunities for Standardization
 
AceWiki
AceWikiAceWiki
AceWiki
 
How to Evaluate Controlled Natural Languages
How to Evaluate Controlled Natural LanguagesHow to Evaluate Controlled Natural Languages
How to Evaluate Controlled Natural Languages
 
Underspecified Scientific Claims in Nanopublications
Underspecified Scientific Claims in NanopublicationsUnderspecified Scientific Claims in Nanopublications
Underspecified Scientific Claims in Nanopublications
 
2013 Report on AngeI Investing Activity in Canada: Accelerating the Asset Class
2013 Report on AngeI Investing Activity in Canada: Accelerating the Asset Class2013 Report on AngeI Investing Activity in Canada: Accelerating the Asset Class
2013 Report on AngeI Investing Activity in Canada: Accelerating the Asset Class
 
Data Publishing and Post-Publication Reviews
Data Publishing and Post-Publication ReviewsData Publishing and Post-Publication Reviews
Data Publishing and Post-Publication Reviews
 
Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linke...
Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linke...Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linke...
Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linke...
 
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...
A Multilingual Semantic Wiki based on Attempto Controlled English and Grammat...
 
AceWiki: Controlled English in a Semantic Wiki
AceWiki: Controlled English in a Semantic WikiAceWiki: Controlled English in a Semantic Wiki
AceWiki: Controlled English in a Semantic Wiki
 
A Decentralized Network for Publishing Linked Data — Nanopublications, Trusty...
A Decentralized Network for Publishing Linked Data — Nanopublications, Trusty...A Decentralized Network for Publishing Linked Data — Nanopublications, Trusty...
A Decentralized Network for Publishing Linked Data — Nanopublications, Trusty...
 
Broadening the Scope of Nanopublications
Broadening the Scope of NanopublicationsBroadening the Scope of Nanopublications
Broadening the Scope of Nanopublications
 

Similar to A Multilingual Semantic Wiki Based on Controlled Natural Language

How Controlled English can Improve Semantic Wikis
How Controlled English can Improve Semantic WikisHow Controlled English can Improve Semantic Wikis
How Controlled English can Improve Semantic WikisTobias Kuhn
 
Roberto Navigli - From Text to Concepts and Back: Going Multilingual with Bab...
Roberto Navigli - From Text to Concepts and Back: Going Multilingual with Bab...Roberto Navigli - From Text to Concepts and Back: Going Multilingual with Bab...
Roberto Navigli - From Text to Concepts and Back: Going Multilingual with Bab...MeetupDataScienceRoma
 
Towards a Universal Wordnet by Learning from Combined Evidence
Towards a Universal Wordnet by Learning from Combined EvidenceTowards a Universal Wordnet by Learning from Combined Evidence
Towards a Universal Wordnet by Learning from Combined EvidenceGerard de Melo
 
On the Limitations of Unsupervised Bilingual Dictionary Induction
On the Limitations of Unsupervised Bilingual Dictionary InductionOn the Limitations of Unsupervised Bilingual Dictionary Induction
On the Limitations of Unsupervised Bilingual Dictionary InductionSebastian Ruder
 
Multilinguals and Wikipedia Editing
Multilinguals and Wikipedia EditingMultilinguals and Wikipedia Editing
Multilinguals and Wikipedia EditingScott A. Hale
 
What you Can Make Out of Linked Data
What you Can Make Out of Linked DataWhat you Can Make Out of Linked Data
What you Can Make Out of Linked DataMarco Fossati
 
Multilingualism ifla 2014 08
Multilingualism ifla 2014 08Multilingualism ifla 2014 08
Multilingualism ifla 2014 08Janifer Gatenby
 
[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...
[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...
[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...Digital Classicist Seminar Berlin
 
natasha.ppt
natasha.pptnatasha.ppt
natasha.pptMrBrave
 
Language commons wiki_final
Language commons wiki_finalLanguage commons wiki_final
Language commons wiki_finalEd Bice
 
Umd draft-2010 jun22
Umd draft-2010 jun22Umd draft-2010 jun22
Umd draft-2010 jun22Ed Bice
 
Towards Universal Language Understanding (2020 version)
Towards Universal Language Understanding (2020 version)Towards Universal Language Understanding (2020 version)
Towards Universal Language Understanding (2020 version)Yunyao Li
 
Linked Open Data Cloud
Linked Open Data CloudLinked Open Data Cloud
Linked Open Data CloudPretaLLOD
 
Flexible Open Language Education for a MultiLingual World
Flexible Open Language Education for a MultiLingual WorldFlexible Open Language Education for a MultiLingual World
Flexible Open Language Education for a MultiLingual WorldAlannah Fitzgerald
 
Wikipedia as Knowledge Organization System
Wikipedia as Knowledge Organization SystemWikipedia as Knowledge Organization System
Wikipedia as Knowledge Organization SystemJakob .
 
Topic Listener - Observing Key Topics from Multi-Channel Speech Audio Streams...
Topic Listener - Observing Key Topics from Multi-Channel Speech Audio Streams...Topic Listener - Observing Key Topics from Multi-Channel Speech Audio Streams...
Topic Listener - Observing Key Topics from Multi-Channel Speech Audio Streams...Sebastian Ruder
 
FLAX Weaving with Oxford Open Educational Resources: Open Practices for Engli...
FLAX Weaving with Oxford Open Educational Resources: Open Practices for Engli...FLAX Weaving with Oxford Open Educational Resources: Open Practices for Engli...
FLAX Weaving with Oxford Open Educational Resources: Open Practices for Engli...Alannah Fitzgerald
 
Natural Language Inference: for Humans and Machines
Natural Language Inference: for Humans and MachinesNatural Language Inference: for Humans and Machines
Natural Language Inference: for Humans and MachinesValeria de Paiva
 

Similar to A Multilingual Semantic Wiki Based on Controlled Natural Language (20)

How Controlled English can Improve Semantic Wikis
How Controlled English can Improve Semantic WikisHow Controlled English can Improve Semantic Wikis
How Controlled English can Improve Semantic Wikis
 
Roberto Navigli - From Text to Concepts and Back: Going Multilingual with Bab...
Roberto Navigli - From Text to Concepts and Back: Going Multilingual with Bab...Roberto Navigli - From Text to Concepts and Back: Going Multilingual with Bab...
Roberto Navigli - From Text to Concepts and Back: Going Multilingual with Bab...
 
Towards a Universal Wordnet by Learning from Combined Evidence
Towards a Universal Wordnet by Learning from Combined EvidenceTowards a Universal Wordnet by Learning from Combined Evidence
Towards a Universal Wordnet by Learning from Combined Evidence
 
On the Limitations of Unsupervised Bilingual Dictionary Induction
On the Limitations of Unsupervised Bilingual Dictionary InductionOn the Limitations of Unsupervised Bilingual Dictionary Induction
On the Limitations of Unsupervised Bilingual Dictionary Induction
 
Multilinguals and Wikipedia Editing
Multilinguals and Wikipedia EditingMultilinguals and Wikipedia Editing
Multilinguals and Wikipedia Editing
 
What you Can Make Out of Linked Data
What you Can Make Out of Linked DataWhat you Can Make Out of Linked Data
What you Can Make Out of Linked Data
 
Programing Language
Programing LanguagePrograming Language
Programing Language
 
Multilingualism ifla 2014 08
Multilingualism ifla 2014 08Multilingualism ifla 2014 08
Multilingualism ifla 2014 08
 
[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...
[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...
[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...
 
natasha.ppt
natasha.pptnatasha.ppt
natasha.ppt
 
Language commons wiki_final
Language commons wiki_finalLanguage commons wiki_final
Language commons wiki_final
 
Umd draft-2010 jun22
Umd draft-2010 jun22Umd draft-2010 jun22
Umd draft-2010 jun22
 
Towards Universal Language Understanding (2020 version)
Towards Universal Language Understanding (2020 version)Towards Universal Language Understanding (2020 version)
Towards Universal Language Understanding (2020 version)
 
Linked Open Data Cloud
Linked Open Data CloudLinked Open Data Cloud
Linked Open Data Cloud
 
Flexible Open Language Education for a MultiLingual World
Flexible Open Language Education for a MultiLingual WorldFlexible Open Language Education for a MultiLingual World
Flexible Open Language Education for a MultiLingual World
 
Wikipedia as Knowledge Organization System
Wikipedia as Knowledge Organization SystemWikipedia as Knowledge Organization System
Wikipedia as Knowledge Organization System
 
List of wikipedias
List of wikipediasList of wikipedias
List of wikipedias
 
Topic Listener - Observing Key Topics from Multi-Channel Speech Audio Streams...
Topic Listener - Observing Key Topics from Multi-Channel Speech Audio Streams...Topic Listener - Observing Key Topics from Multi-Channel Speech Audio Streams...
Topic Listener - Observing Key Topics from Multi-Channel Speech Audio Streams...
 
FLAX Weaving with Oxford Open Educational Resources: Open Practices for Engli...
FLAX Weaving with Oxford Open Educational Resources: Open Practices for Engli...FLAX Weaving with Oxford Open Educational Resources: Open Practices for Engli...
FLAX Weaving with Oxford Open Educational Resources: Open Practices for Engli...
 
Natural Language Inference: for Humans and Machines
Natural Language Inference: for Humans and MachinesNatural Language Inference: for Humans and Machines
Natural Language Inference: for Humans and Machines
 

More from Tobias Kuhn

Nanopublications and Decentralized Publishing
Nanopublications and Decentralized PublishingNanopublications and Decentralized Publishing
Nanopublications and Decentralized PublishingTobias Kuhn
 
Linked Data Publishing with Nanopublications
Linked Data Publishing with NanopublicationsLinked Data Publishing with Nanopublications
Linked Data Publishing with NanopublicationsTobias Kuhn
 
Genuine semantic publishing
Genuine semantic publishingGenuine semantic publishing
Genuine semantic publishingTobias Kuhn
 
The Controlled Natural Language of Randall Munroe’s Thing Explainer
The Controlled Natural Language of Randall Munroe’s Thing Explainer The Controlled Natural Language of Randall Munroe’s Thing Explainer
The Controlled Natural Language of Randall Munroe’s Thing Explainer Tobias Kuhn
 
Publishing without Publishers: a Decentralized Approach to Dissemination, Ret...
Publishing without Publishers: a Decentralized Approach to Dissemination, Ret...Publishing without Publishers: a Decentralized Approach to Dissemination, Ret...
Publishing without Publishers: a Decentralized Approach to Dissemination, Ret...Tobias Kuhn
 
Semantic Publishing and Nanopublications
Semantic Publishing and NanopublicationsSemantic Publishing and Nanopublications
Semantic Publishing and NanopublicationsTobias Kuhn
 
Scientific Data Publishing
Scientific Data PublishingScientific Data Publishing
Scientific Data PublishingTobias Kuhn
 
Science Bots: A Model for the Future of Scientific Computation?
Science Bots: A Model for the Future of Scientific Computation?Science Bots: A Model for the Future of Scientific Computation?
Science Bots: A Model for the Future of Scientific Computation?Tobias Kuhn
 
Semantic Publishing with Nanopublications
Semantic Publishing with Nanopublications Semantic Publishing with Nanopublications
Semantic Publishing with Nanopublications Tobias Kuhn
 
Citation Graph Analysis to Identify Memes in Scientific Literature
Citation Graph Analysis to Identify Memes in Scientific LiteratureCitation Graph Analysis to Identify Memes in Scientific Literature
Citation Graph Analysis to Identify Memes in Scientific LiteratureTobias Kuhn
 
Citation Graph Analysis to Identify Memes in Scientific Literature
Citation Graph Analysis to Identify Memes in Scientific LiteratureCitation Graph Analysis to Identify Memes in Scientific Literature
Citation Graph Analysis to Identify Memes in Scientific LiteratureTobias Kuhn
 
Automatische Übersetzung in einem multilingualen, semantischen Wiki
Automatische Übersetzung in einem multilingualen, semantischen WikiAutomatische Übersetzung in einem multilingualen, semantischen Wiki
Automatische Übersetzung in einem multilingualen, semantischen WikiTobias Kuhn
 
Improving Text Mining with Controlled Natural Language: A Case Study for Prot...
Improving Text Mining with Controlled Natural Language: A Case Study for Prot...Improving Text Mining with Controlled Natural Language: A Case Study for Prot...
Improving Text Mining with Controlled Natural Language: A Case Study for Prot...Tobias Kuhn
 
AceRules: Executing Rules in Controlled Natural Language
AceRules: Executing Rules in Controlled Natural LanguageAceRules: Executing Rules in Controlled Natural Language
AceRules: Executing Rules in Controlled Natural LanguageTobias Kuhn
 
Wissensrepräsentation in kontrolliertem Englisch
Wissensrepräsentation in kontrolliertem EnglischWissensrepräsentation in kontrolliertem Englisch
Wissensrepräsentation in kontrolliertem EnglischTobias Kuhn
 
Codeco: A Grammar Notation for Controlled Natural Language in Predictive Editors
Codeco: A Grammar Notation for Controlled Natural Language in Predictive EditorsCodeco: A Grammar Notation for Controlled Natural Language in Predictive Editors
Codeco: A Grammar Notation for Controlled Natural Language in Predictive EditorsTobias Kuhn
 

More from Tobias Kuhn (17)

Nanopublications and Decentralized Publishing
Nanopublications and Decentralized PublishingNanopublications and Decentralized Publishing
Nanopublications and Decentralized Publishing
 
Linked Data Publishing with Nanopublications
Linked Data Publishing with NanopublicationsLinked Data Publishing with Nanopublications
Linked Data Publishing with Nanopublications
 
Genuine semantic publishing
Genuine semantic publishingGenuine semantic publishing
Genuine semantic publishing
 
The Controlled Natural Language of Randall Munroe’s Thing Explainer
The Controlled Natural Language of Randall Munroe’s Thing Explainer The Controlled Natural Language of Randall Munroe’s Thing Explainer
The Controlled Natural Language of Randall Munroe’s Thing Explainer
 
Publishing without Publishers: a Decentralized Approach to Dissemination, Ret...
Publishing without Publishers: a Decentralized Approach to Dissemination, Ret...Publishing without Publishers: a Decentralized Approach to Dissemination, Ret...
Publishing without Publishers: a Decentralized Approach to Dissemination, Ret...
 
Semantic Publishing and Nanopublications
Semantic Publishing and NanopublicationsSemantic Publishing and Nanopublications
Semantic Publishing and Nanopublications
 
Scientific Data Publishing
Scientific Data PublishingScientific Data Publishing
Scientific Data Publishing
 
Science Bots: A Model for the Future of Scientific Computation?
Science Bots: A Model for the Future of Scientific Computation?Science Bots: A Model for the Future of Scientific Computation?
Science Bots: A Model for the Future of Scientific Computation?
 
Semantic Publishing with Nanopublications
Semantic Publishing with Nanopublications Semantic Publishing with Nanopublications
Semantic Publishing with Nanopublications
 
Nanopubs
NanopubsNanopubs
Nanopubs
 
Citation Graph Analysis to Identify Memes in Scientific Literature
Citation Graph Analysis to Identify Memes in Scientific LiteratureCitation Graph Analysis to Identify Memes in Scientific Literature
Citation Graph Analysis to Identify Memes in Scientific Literature
 
Citation Graph Analysis to Identify Memes in Scientific Literature
Citation Graph Analysis to Identify Memes in Scientific LiteratureCitation Graph Analysis to Identify Memes in Scientific Literature
Citation Graph Analysis to Identify Memes in Scientific Literature
 
Automatische Übersetzung in einem multilingualen, semantischen Wiki
Automatische Übersetzung in einem multilingualen, semantischen WikiAutomatische Übersetzung in einem multilingualen, semantischen Wiki
Automatische Übersetzung in einem multilingualen, semantischen Wiki
 
Improving Text Mining with Controlled Natural Language: A Case Study for Prot...
Improving Text Mining with Controlled Natural Language: A Case Study for Prot...Improving Text Mining with Controlled Natural Language: A Case Study for Prot...
Improving Text Mining with Controlled Natural Language: A Case Study for Prot...
 
AceRules: Executing Rules in Controlled Natural Language
AceRules: Executing Rules in Controlled Natural LanguageAceRules: Executing Rules in Controlled Natural Language
AceRules: Executing Rules in Controlled Natural Language
 
Wissensrepräsentation in kontrolliertem Englisch
Wissensrepräsentation in kontrolliertem EnglischWissensrepräsentation in kontrolliertem Englisch
Wissensrepräsentation in kontrolliertem Englisch
 
Codeco: A Grammar Notation for Controlled Natural Language in Predictive Editors
Codeco: A Grammar Notation for Controlled Natural Language in Predictive EditorsCodeco: A Grammar Notation for Controlled Natural Language in Predictive Editors
Codeco: A Grammar Notation for Controlled Natural Language in Predictive Editors
 

A Multilingual Semantic Wiki Based on Controlled Natural Language

  • 1. A Multilingual Semantic Wiki Based on Controlled Natural Language
    Tobias Kuhn
    Chair of Sociology, in particular of Modeling and Simulation, ETH Zurich, Switzerland
    Insight, National University of Ireland, Galway
    19 August 2014
  • 2. About This Talk
    This talk is mainly based on the following papers:
    • Kaarel Kaljurand and Tobias Kuhn. A Multilingual Semantic Wiki Based on Attempto Controlled English and Grammatical Framework. In Proceedings of the 10th Extended Semantic Web Conference (ESWC). 2013. http://purl.org/tkuhn/eswc2013acewikigf
    • Kaarel Kaljurand, Tobias Kuhn, and Laura Canedo. Collaborative multilingual knowledge management based on controlled natural language. Semantic Web. Accepted, to appear. http://www.semantic-web-journal.net/system/files/swj524.pdf
  • 3. Imagine ...
    ... that Wikipedia can check consistency and answer questions about the contained knowledge, and
    ... that all content is instantly available in all languages!
  • 4.
    • AceWiki is a semantic wiki
    • Articles are written in Attempto Controlled English (ACE)
    • These sentences are internally translated into the Semantic Web language OWL
    • An OWL reasoner is built in to answer questions and detect inconsistencies
    • Special editor for writing ACE statements
    • Extended to support multilinguality
  • 5. Monolingual AceWiki: Screenshot
  • 6. Attempto Controlled English (ACE)
    Subset of natural English:
    • Conjunction, disjunction, negation, if-then, ...
    • Anaphoric references: pronouns, definite noun phrases, variables
    • Quantifiers: every, no, at least 3, ...
    • Content words: proper names, nouns, verbs, adjectives, ...
    Grammar is fixed, but users can change content words.
    Deterministic ambiguity handling:
    • Anaphora resolution (France borders Spain and it borders Portugal.)
    • Quantifier scope (Every country borders a country.)
    • Attachment (Every EU-country borders a country that is an EU-country and is a NATO-country.)
    Well-defined translations to and from first-order logic, OWL, ...
  • 7. Predictive Editor
  • 8. Consistency Checking: AceWiki ensures consistency by checking every new statement.
  • 9. Question Answering: AceWiki supports simple wh-questions.
  • 10. Monolingual AceWiki: Demo http://attempto.ifi.uzh.ch/acewiki/
  • 11. ACE Reasoning via Translation to OWL
    ACE: Every country that does not border a sea is a landlocked-country.
    OWL: SubClassOf( ObjectIntersectionOf( :country ObjectComplementOf( ObjectSomeValuesFrom( :border :sea ) ) ) :landlocked-country )
    ACE: Which country is a landlocked-country?
    OWL: ObjectIntersectionOf( :country :landlocked-country )
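The OWL expressions on this slide use functional-style syntax, so their nesting can be reproduced with a few string-building helpers. The Python sketch below is only an illustration of that structure: the helper names are hypothetical, and it does not show how AceWiki or the ACE parser actually performs the translation.

```python
# Hypothetical helpers that assemble OWL functional-style syntax as strings.
# They mirror the constructor names on the slide; they are not part of AceWiki or APE.

def some_values_from(prop: str, cls: str) -> str:
    """Existential restriction: things related via `prop` to some `cls`."""
    return f"ObjectSomeValuesFrom({prop} {cls})"

def complement_of(cls: str) -> str:
    """Class negation."""
    return f"ObjectComplementOf({cls})"

def intersection_of(*classes: str) -> str:
    """Class conjunction."""
    return f"ObjectIntersectionOf({' '.join(classes)})"

def sub_class_of(sub: str, sup: str) -> str:
    """Subclass axiom."""
    return f"SubClassOf({sub} {sup})"

# "Every country that does not border a sea is a landlocked-country."
axiom = sub_class_of(
    intersection_of(":country", complement_of(some_values_from(":border", ":sea"))),
    ":landlocked-country",
)

# "Which country is a landlocked-country?" becomes a class expression
# whose instances are the answers.
query = intersection_of(":country", ":landlocked-country")

print(axiom)
print(query)
```

Reading the axiom from the inside out follows the ACE sentence: "border a sea" is the existential restriction, "does not" is the complement, and "country that ..." is the intersection.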
  • 12. Evaluation
    Two small usability experiments with earlier versions of AceWiki:
    • Altogether 26 untrained participants
    • Task: Collaborative creation of a knowledge base
    Results:
    • 78%–81% of the sentences were correct and sensible
    • 61%–70% of them were complex (containing negations, implications, disjunctions or number restrictions)
    • Creation of a correct sentence every 5–6 minutes
    • Definition of a new word every 5–7 minutes
    → Even untrained users can effectively use AceWiki
  • 13. Multilingual AceWiki: AceWiki-GF
    General ideas:
    • Make wiki content available in different languages
    • Automatically translated content using high-quality rule-based machine translation: Grammatical Framework (GF)
    • Language switching like in Wikipedia
    • Localization of the user interface
  • 14. Grammatical Framework (GF)
    GF is a framework for multilingual grammar engineering:
    • Rule-based
    • Functional programming language (based on Haskell) optimized to handle natural language
    • Resource Grammar Library implementing common morphological and syntactic structures
    • Mildly context sensitive
    • Bidirectional translations: concrete language ⇔ abstract syntax
  • 15. GF grammars and translations
    GF grammars consist of:
    • One language-neutral abstract syntax
    • Concrete syntaxes that specify words, agreement, word order, etc. by implementing the abstract categories and functions
    Example:
      border : Country -> Country -> Relation
      English: border x y = x!Nom + "borders" + y!Nom
      Estonian: border x y = x!Gen + "naaber on" + y!Nom
    GF translation proceeds in two steps:
    • First, parse a string in the original language to a tree (or trees) in the abstract syntax
    • Then, linearize these trees as strings in the target language
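The parse-then-linearize idea can be imitated in a few lines of ordinary Python. This is a toy sketch under stated assumptions, not GF code: the `Border` class, the `LEXICON` table, the regex-based English parser, and the Estonian word forms are all illustrative choices made for this example.

```python
import re
from dataclasses import dataclass

# Toy lexicon: abstract country identifiers mapped to case forms per language.
# The entries are illustrative assumptions, not taken from any GF grammar.
LEXICON = {
    "Eng": {
        "Portugal": {"Nom": "Portugal"},
        "Spain": {"Nom": "Spain"},
    },
    "Est": {
        "Portugal": {"Nom": "Portugal", "Gen": "Portugali"},
        "Spain": {"Nom": "Hispaania", "Gen": "Hispaania"},
    },
}

@dataclass(frozen=True)
class Border:
    """Abstract syntax tree for: border : Country -> Country -> Relation."""
    x: str
    y: str

def linearize(tree: Border, lang: str) -> str:
    """Turn an abstract tree into a string, following the slide's two rules."""
    lex = LEXICON[lang]
    if lang == "Eng":
        return f'{lex[tree.x]["Nom"]} borders {lex[tree.y]["Nom"]}'
    if lang == "Est":
        return f'{lex[tree.x]["Gen"]} naaber on {lex[tree.y]["Nom"]}'
    raise ValueError(f"no concrete syntax for {lang}")

def parse_english(sentence: str) -> Border:
    """Parse 'X borders Y' into an abstract tree (English only, for brevity)."""
    match = re.fullmatch(r"(\w+) borders (\w+)", sentence)
    if match is None:
        raise ValueError(f"cannot parse: {sentence!r}")
    by_form = {forms["Nom"]: ident for ident, forms in LEXICON["Eng"].items()}
    return Border(by_form[match.group(1)], by_form[match.group(2)])

# Translation = parse in the source language, then linearize in the target.
tree = parse_english("Portugal borders Spain")
print(linearize(tree, "Est"))
```

The tree in the middle carries no language-specific information, which is why adding a language only requires a new linearization rule and lexicon entries rather than a new translator for every language pair.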
  • 16. Multilingual AceWiki: Screenshot
  • 17. Multilingual AceWiki: Demo http://attempto.ifi.uzh.ch/acewiki-gf/
  • 18. ACE-in-GF
    • Multiple controlled versions of natural languages that map to ACE (and to each other)
    • As a result, they can be bidirectionally mapped to various formal languages already supported by ACE
  • 19. ACE-in-GF: Example
    German: Jedes Land, das nicht an ein Meer grenzt, ist ein Binnenland.
    ACE-in-GF tree: baseText (sText (s (vpS (everyNP (relCN (cn_as_VarCN country_CN) (neg_predRS which_RP (v2VP border_V2 (thereNP_as_NP (aNP (cn_as_VarCN sea_CN))))))) (npVP (thereNP_as_NP (aNP (cn_as_VarCN landlocked_country_CN)))))))
    ACE: Every country that does not border a sea is a landlocked-country.
    OWL: SubClassOf( ObjectIntersectionOf( :country ObjectComplementOf( ObjectSomeValuesFrom( :border :sea ) ) ) :landlocked-country )
  • 20. ACE-in-GF: Implementation
    Implementation of the ACE syntax:
    • Targeting the subset of ACE that can be mapped to OWL
    • Almost 100% coverage at almost 0% ambiguity
    Support of most RGL languages:
    • Bulgarian, Catalan, Chinese, Danish, Dutch, English, Finnish, French, German, Greek, Hindi, Italian, Latvian, Norwegian, Polish, Romanian, Russian, Spanish, Swedish, Thai, Urdu
    • RGL-based design provides automatic increase in quality and language coverage over time
    Status:
    • Some precision problems, e.g. with anaphoric references
    • Ambiguity and coverage problems in some languages
  • 21. Evaluation of AceWiki-GF
    Hypothesis: A group of users reaches almost the same level of agreement on the content of an article presented to them in different languages as when the article is presented to all of them in the same language.
    Design:
    • Based on a 500-word lexicon on European geography in three languages: English, German and Spanish
    • 30 participants accessed AceWiki-GF and wrote sentences in their language (10 participants for each language)
    • They had to enter true and false sentences and tag them as such
    • In a post-editing task, each participant checked the output of two other participants: one translated from another language and one written in the same language (true/false tags were removed and sentences shuffled); they were asked to remove all false sentences
  • 22. Evaluation of AceWiki-GF: Results
    30 participants spent on average 37 minutes using AceWiki-GF, creating 316 sentences in total.
    Definition of agreement level: (T_k + F_d) / S, where S is the total number of sentences, T_k the number of sentences marked as true and kept, and F_d the number marked as false and deleted.
    Agreement level (difference is not significant):
    • 82.2% without translation
    • 84.0% with translation
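The agreement level defined on this slide is a simple ratio. The following Python sketch computes it; the function name and the counts in the usage lines are hypothetical and are not data from the study.

```python
def agreement_level(true_kept: int, false_deleted: int, total: int) -> float:
    """Agreement level (T_k + F_d) / S as defined on the slide.

    true_kept:     sentences marked as true by the author and kept by the reviewer
    false_deleted: sentences marked as false by the author and deleted by the reviewer
    total:         all sentences shown to the reviewer
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return (true_kept + false_deleted) / total

# Hypothetical counts: 10 sentences reviewed, 5 true ones kept, 3 false ones deleted.
level = agreement_level(5, 3, 10)
print(f"{level:.1%}")
```

The two remaining sentences in this made-up case are the disagreements: a true sentence that was deleted or a false one that was kept.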
  • 23. Evaluation of AceWiki-GF: Results
    Assumption: Translation introduces a constant translation error rate r that has the effect of reducing the agreement level a to (1 − r) · a.
    New hypothesis: The translation error rate is less than 5%.
    Agreement level:
    • 78.1% with hypothetical translation (r = 5%)
    • 84.0% with translation
    p-value with one-tailed Wilcoxon signed-rank test: 0.046
    → With AceWiki-GF, translation error rate is less than 5%
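The hypothetical baseline can be checked by applying the assumed reduction (1 − r) · a to the 82.2% agreement level reported without translation on the previous slide. The Python sketch below does only this arithmetic; the function name is hypothetical, and the significance test itself cannot be reproduced here because the per-participant data is not part of the slides.

```python
def adjusted_agreement(agreement: float, error_rate: float) -> float:
    """Agreement level expected if translation removed a fraction `error_rate` of it."""
    return (1 - error_rate) * agreement

# 82.2% is the agreement level without translation; r = 5% is the assumed error rate.
hypothetical = adjusted_agreement(82.2, 0.05)
print(round(hypothetical, 1))
```

The result rounds to the 78.1% shown for the hypothetical translation, which is the value the observed 84.0% is tested against.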
  • 24. Evaluation of AceWiki-GF: Feedback
    Questionnaire for the participants contained these questions:
    1. Was AceWiki Geography easy or difficult to use in general?
    2. Was the sentence editor easy or difficult to use?
    3. Was creating true and false statements easy or difficult to perform?
    Possible answers: “very difficult” (0), “difficult” (1), “medium” (2), “easy” (3), and “very easy” (4)
    Results:
    1. Average: 2.93 (∼“easy”)
    2. Average: 2.77 (∼“easy”)
    3. Average: 2.70 (∼“easy”)
  • 25. The Future...?
    Can we make a truly multilingual Wikipedia?
    • Store main content in a semantic representation
    • Verbalization in different languages
    • All content is instantly available in all languages (once the required vocabulary is defined)
    • Breaking the current dominance of English and putting an end to the lock-out of users speaking less widespread or underrepresented languages
    • Contributing to the Semantic Web
    Related:
    • http://www.wikidata.org
    • http://meta.wikimedia.org/wiki/A_proposal_towards_a_multilingual_Wikipedia
  • 26. Links
    ACE parser (APE) source code: https://github.com/Attempto/APE
    ACE-in-GF source code: http://github.com/Attempto/ACE-in-GF
    AceWiki and AceWiki-GF:
    • Source code: http://github.com/AceWiki/AceWiki
    • Demos (non-GF): http://attempto.ifi.uzh.ch/acewiki/
    • Demos (GF): http://attempto.ifi.uzh.ch/acewiki-gf/
    MOLTO project web site: http://www.molto-project.eu
    Attempto web site: http://attempto.ifi.uzh.ch
    GF: http://www.grammaticalframework.org
  • 27. Thank you for your attention!
    If you are interested in Controlled Natural Languages, come visit us at CNL 2014!
    CNL 2014: Fourth Workshop on Controlled Natural Language, 20–22 August 2014, Galway
    http://attempto.ifi.uzh.ch/site/cnl2014/