LODifier: Generating Linked Data from Unstructured Text
Isabelle Augenstein1 Sebastian Padó1 Sebastian Rudolph2
1Department of Computational Linguistics, Universität Heidelberg, Germany
2Institute AIFB, Karlsruhe Institute of Technology, Germany
{augenste,pado}@cl.uni-heidelberg.de, rudolph@kit.edu
Extended Semantic Web Conference, 2012
Augenstein, Padó, Rudolph LODifier ESWC 2012
Outline
1 Motivation
2 The LODifier Architecture and Workflow
3 Evaluation in a Document Similarity Task
4 Conclusion
Introduction
What?
Represent information from NL text as Linked Data
Creation of graph-based representation for unstructured plain text
Why?
Task is easy for structured text, but quite challenging for unstructured text
Most other approaches are very selective, e.g., extract only pre-defined relations
Our approach translates the text in its entirety
Possible use cases: Domain-independent scenarios in which no pre-defined schema is available
How?
Use noise-tolerant NLP pipeline to analyze text
Translate result into RDF representation
Embed output in the LOD cloud by using vocabulary from DBpedia and WordNet
System Overview Pipeline
Figure: LODifier pipeline from unstructured text to LOD-linked RDF.
Components: Tokenization, Named Entity Recognition (Wikifier), Lemmatization, Word Sense Disambiguation (UKB), Parsing (C&C), Deep Semantic Analysis (Boxer), RDF Generation and Integration
Intermediate results: tokenized text, NE-marked text, DBpedia URIs, lemmatized text, word sense URIs, parsed text, DRS
System NLP analysis steps
NLP analysis steps (NER)
Named Entity Recognition (Wikifier):
Named entities are recognized using Wikifier
Wikifier finds Wikipedia links for named entities
Wikipedia URLs are converted to DBpedia URIs
The New York Times reported that John McCarthy died.
He invented the programming language LISP.
Figure: Test sentences
[[The New York Times]] reported that [[John McCarthy (computer scientist)|John McCarthy]] died. He invented the [[Programming language|programming language]] [[Lisp (programming language)|Lisp]].
Figure: Wikifier output for the test sentences.
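The conversion from Wikifier links to DBpedia URIs can be sketched as follows. This is a minimal illustration, not the LODifier implementation: the function name `wiki_links_to_dbpedia` is hypothetical, and it assumes that a DBpedia resource URI is the Wikipedia page title with spaces replaced by underscores.

```python
import re

DBPEDIA_PREFIX = "http://dbpedia.org/resource/"  # DBpedia resource namespace

# Matches [[Target]] and [[Target|surface text]] wiki-style links.
LINK_RE = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")


def wiki_links_to_dbpedia(text):
    """Return (surface text, DBpedia URI) pairs for each wiki-style link."""
    pairs = []
    for match in LINK_RE.finditer(text):
        # Collapse whitespace so that links broken across lines still work.
        target = " ".join(match.group(1).split())
        surface = " ".join((match.group(2) or target).split())
        # Assumption: page title with underscores is the DBpedia local name.
        pairs.append((surface, DBPEDIA_PREFIX + target.replace(" ", "_")))
    return pairs


sample = ("[[The New York Times]] reported that [[John McCarthy (computer "
          "scientist)|John McCarthy]] died.")
for surface, uri in wiki_links_to_dbpedia(sample):
    print(surface, "->", uri)
```

A real pipeline would probably also need to handle redirects and URI escaping, which this sketch leaves out.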
NLP analysis steps (WSD)
Word Sense Disambiguation (UKB 2):
Words are disambiguated using UKB, an unsupervised graph-based WSD tool
UKB finds WordNet links for word senses
UKB output is converted to RDF WordNet URIs
ctx_0 w1 00965687-v !! report
ctx_0 w4 00358431-v !! die
ctx_1 w7 01634424-v !! invent
ctx_1 w9 06898352-n !! programming_language
ctx_1 w10 06901936-n !! lisp
Figure: UKB output for the test sentences.
2 http://ixa2.si.ehu.es/ukb/
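Reading the UKB output shown above might look like the sketch below. It assumes one synset per line in the form `context word-id synset !! lemma`, as in the figure. The names `UkbSense`, `parse_ukb_line` and `synset_uri` are hypothetical, and the `example.org` base URI is a placeholder, not the actual RDF WordNet naming scheme.

```python
from typing import NamedTuple


class UkbSense(NamedTuple):
    context: str   # context id, e.g. "ctx_1"
    word_id: str   # word id within the context, e.g. "w7"
    synset: str    # WordNet synset offset plus POS tag, e.g. "01634424-v"
    lemma: str


def parse_ukb_line(line):
    """Split one UKB output line of the form 'ctx wid synset !! lemma'."""
    left, lemma = line.split("!!")
    context, word_id, synset = left.split()
    return UkbSense(context, word_id, synset, lemma.strip())


def synset_uri(sense, base="http://example.org/wn30/synset-"):
    """Build a placeholder URI; the real RDF WordNet pattern may differ."""
    return base + sense.synset


ukb_output = """\
ctx_0 w1 00965687-v !! report
ctx_1 w9 06898352-n !! programming_language
"""
senses = [parse_ukb_line(line) for line in ukb_output.splitlines() if line.strip()]
for sense in senses:
    print(sense.lemma, "->", synset_uri(sense))
```

Mapping a synset offset to a word sense URI such as `wn30:wordsense-invent-verb-2` additionally requires a WordNet lookup, which is omitted here.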
NLP analysis steps (Parsing and Semantic Analysis)
Parsing (C&C 3):
Sentences are parsed with the C&C parser
The C&C parser is a statistical parser based on Combinatory Categorial Grammar (CCG)
CCG is a semantically motivated grammar formalism, making it suitable for further semantic analysis
Deep Semantic Analysis (Boxer 4):
Boxer builds on the output of C&C and produces discourse representation structures (DRSs)
DRSs model the meaning of texts in terms of the relevant entities (discourse referents) and relations between them (conditions)
3 http://svn.ask.it.usyd.edu.au/trac/candc/
4 http://svn.ask.it.usyd.edu.au/trac/candc/wiki/boxer
  ______________________________   _______________________
 | x0 x1 x2 x3                  | | x4 x5 x6              |
 |..............................| |.......................|
(| male(x0)                     |+| event(x4)             |)
 | named(x0,john_mccarthy,per)  | | invent(x4)            |
 | programming_language(x1)     | | agent(x4,x0)          |
 | nn(x1,x2)                    | | patient(x4,x2)        |
 | named(x2,lisp,nam)           | | event(x5)             |
 | named(x3,new_york_times,org) | | report(x5)            |
 |______________________________| | agent(x5,x3)          |
                                  | theme(x5,x6)          |
                                  | proposition(x6)       |
                                  |     ______________    |
                                  |    | x7           |   |
                                  | x6:|..............|   |
                                  |    | event(x7)    |   |
                                  |    | die(x7)      |   |
                                  |    | agent(x7,x0) |   |
                                  |    |______________|   |
                                  |_______________________|
Figure: Boxer output for the example sentences
_:var0x0 drsclass:named ne:john_mccarthy ;
    rdf:type drsclass:male , foaf:Person ;
    owl:sameAs dbpedia:John_McCarthy_(computer_scientist) .
_:var0x1 rdf:type class:programming_language ;
    owl:sameAs dbpedia:Programming_language .
_:var0x2 drsrel:nn _:var0x1 .
_:var0x2 drsclass:named ne:lisp ;
    owl:sameAs dbpedia:Lisp_(programming_language) .
_:var0x3 drsclass:named ne:the_new_york_times ;
    owl:sameAs dbpedia:The_New_York_Times .
_:var0x4 rdf:type drsclass:event , wn30:wordsense-invent-verb-2 ;
    drsrel:agent _:var0x0 ; drsrel:patient _:var0x2 .
_:var0x5 rdf:type drsclass:event , wn30:wordsense-report-verb-3 ;
    drsrel:agent _:var0x3 ; drsrel:theme _:var0x6 .
_:var0x6 rdf:type drsclass:proposition , reify:proposition , reify:conjunction ;
    reify:conjunct [ rdf:subject _:var0x7 ;
                     rdf:predicate rdf:type ;
                     rdf:object drsclass:event ] ;
    reify:conjunct [ rdf:subject _:var0x7 ;
                     rdf:predicate rdf:type ;
                     rdf:object wn30:wordsense-die-verb-1 ] ;
    reify:conjunct [ rdf:subject _:var0x7 ;
                     rdf:predicate drsrel:agent ;
                     rdf:object _:var0x0 ] .
Figure: LODifier output for the test sentences.
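The correspondence between the Boxer conditions and the RDF output above can be sketched as a small translation function. This is an illustrative reading of the two figures, not the authors' code: `condition_to_triple` and the `DRS_CLASSES` set are assumptions, and the later steps (linking to DBpedia and WordNet, reifying embedded propositions) are left out.

```python
# Assumption: closed-class predicates, limited to those seen in the example.
DRS_CLASSES = {"event", "male", "proposition"}


def condition_to_triple(name, args, doc=0):
    """Translate one flat DRS condition into a (subject, predicate, object) triple."""
    def node(ref):
        # Discourse referent x0 of document 0 becomes blank node _:var0x0.
        return f"_:var{doc}{ref}"

    if name == "named":  # named(x, label, type)
        return (node(args[0]), "drsclass:named", f"ne:{args[1]}")
    if len(args) == 1:   # unary condition -> class membership
        prefix = "drsclass" if name in DRS_CLASSES else "class"
        return (node(args[0]), "rdf:type", f"{prefix}:{name}")
    if len(args) == 2:   # binary condition -> relation between referents
        return (node(args[0]), f"drsrel:{name}", node(args[1]))
    raise ValueError(f"unsupported condition: {name}{tuple(args)}")


conditions = [
    ("male", ["x0"]),
    ("named", ["x2", "lisp", "nam"]),
    ("programming_language", ["x1"]),
    ("agent", ["x4", "x0"]),
]
for name, args in conditions:
    print(" ".join(condition_to_triple(name, args)), ".")
```

In the figure, open-class conditions such as `invent(x4)` end up as WordNet sense URIs; here they would stay in the `class:` namespace because no disambiguation result is consulted.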
Evaluation Evaluation overview
Evaluation overview
Task:
Assess document similarity
Story link detection task, part of topic detection and tracking (TDT)
family of tasks
Data:
TDT-2 benchmark dataset 5
Consists of newspaper, TV and radio news in English, Arabic and
Mandarin
Subset: English language, only newspaper articles
183 positive and negative document pairs
5 http://projects.ldc.upenn.edu/TDT2/
Method:
Define various document similarity measures
Measures without structural information
Measures with structural information
Documents are predicted to have the same topic if they have a similarity of θ or more
θ is determined with supervised learning
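The slides do not say which learner is used to determine θ. As one simple possibility, the sketch below picks the threshold that maximizes accuracy on labeled training pairs; the function name `best_threshold` and the toy scores are assumptions made for illustration.

```python
def best_threshold(scores, labels):
    """Return (theta, accuracy) for the threshold with the best training accuracy.

    A pair is predicted to be on the same topic when its score is >= theta.
    """
    best_theta, best_acc = None, -1.0
    for theta in sorted(set(scores)):  # every observed score is a candidate
        correct = sum((s >= theta) == y for s, y in zip(scores, labels))
        acc = correct / len(scores)
        if acc > best_acc:
            best_theta, best_acc = theta, acc
    return best_theta, best_acc


# Toy similarity scores for five document pairs and their gold labels.
scores = [0.9, 0.8, 0.35, 0.3, 0.1]
labels = [True, True, True, False, False]
theta, acc = best_threshold(scores, labels)
print(theta, acc)
```

In practice the threshold would be chosen on training data and evaluated on held-out pairs, for example with cross-validation.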
Evaluation Evaluation method
Evaluation method
Similarity measures without structure:
Random baseline
Bag-of-words baseline: Word overlap between the two documents in
a document pair
Bag-of-URI baseline: URI overlap between two RDF documents
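Both overlap baselines can be sketched with one set-based function. The slides do not give the exact overlap formula, so Jaccard overlap is used here as a stand-in; `overlap` is a hypothetical name.

```python
def overlap(items_a, items_b):
    """Jaccard overlap between two bags of items (words or URIs)."""
    a, b = set(items_a), set(items_b)
    if not a and not b:
        return 0.0  # two empty documents share nothing
    return len(a & b) / len(a | b)


# Bag-of-words: items are lower-cased tokens.
words_a = "John McCarthy invented LISP".lower().split()
words_b = "McCarthy died".lower().split()
print(overlap(words_a, words_b))

# Bag-of-URIs: items are the URIs occurring in each RDF document.
uris_a = {"dbpedia:The_New_York_Times", "wn30:wordsense-die-verb-1"}
uris_b = {"dbpedia:The_New_York_Times", "wn30:wordsense-report-verb-3"}
print(overlap(uris_a, uris_b))
```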
URI class variants:
Three variants for URI baseline: Take the relative importance of various URI classes into account
Variant 1: All NEs identified by Wikifier, all words successfully disambiguated by UKB (namespaces dbpedia:, wn30:)
Variant 2: Adds all NEs recognized by Boxer (namespace ne:)
Variant 3: Further adds the URIs for all words that were not recognized by Wikifier, UKB or Boxer (namespace class:)
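Selecting the URIs for each variant amounts to filtering by namespace prefix. The sketch below assumes URIs are given in prefixed form, as in the LODifier output figure; `VARIANT_NAMESPACES` and `relevant_uris` are hypothetical names.

```python
# Namespace prefixes per variant; each variant extends the previous one.
VARIANT_NAMESPACES = {
    1: ("dbpedia:", "wn30:"),
    2: ("dbpedia:", "wn30:", "ne:"),
    3: ("dbpedia:", "wn30:", "ne:", "class:"),
}


def relevant_uris(uris, variant):
    """Keep only the URIs whose namespace belongs to the given variant."""
    prefixes = VARIANT_NAMESPACES[variant]
    return {uri for uri in uris if uri.startswith(prefixes)}


uris = {
    "dbpedia:The_New_York_Times",
    "wn30:wordsense-die-verb-1",
    "ne:lisp",
    "class:programming_language",
    "drsclass:event",
}
for variant in (1, 2, 3):
    print(variant, sorted(relevant_uris(uris, variant)))
```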
Extended setting:
Aims at drawing more information from LOD cloud into the similarity
computation
The generated RDF graph is enriched by:
DBpedia categories (namespace dbpedia:)
WordNet synsets and WordNet senses related to the URIs in the
generated graph (namespace wn30:)
Similarity measures with structure:
Full-fledged graph similarity measures, e.g. based on isomorphic subgraphs, are infeasible due to the size of the RDF graphs
Our similarity measures are based on the shortest paths between relevant nodes
Our intuition is that a short path between a pair of URIs denotes relevant semantic information, so we expect the same pair to be related by a short path in other documents as well
Similarity measures with structure:
\[
\mathrm{proSim}_{k,\mathrm{Rel},f}(G_1, G_2) =
\frac{\displaystyle\sum_{\substack{a,b \in \mathrm{Rel}(G_1) \\ \langle a,b \rangle \in C_k(G_1) \cap C_k(G_2)}} f(\ell(a,b))}
     {\displaystyle\sum_{\substack{a,b \in \mathrm{Rel}(G_1) \\ \langle a,b \rangle \in C_k(G_1)}} f(\ell(a,b))}
\]
k: path length, Rel: semantic relations, f: similarity function
proSimcnt: Uses $f(\ell) = 1$, i.e., just counts the number of paths irrespective of their length
proSimlen: Uses $f(\ell) = 1/\ell$, giving less weight to longer paths
proSimsqlen: Uses $f(\ell) = 1/\sqrt{\ell}$, discounting long paths less aggressively than proSimlen
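A compact sketch of the proSim family follows. It rests on several assumptions not stated on the slide: graphs are undirected and given as symmetric adjacency dictionaries, C_k(G) is the set of node pairs connected by a path of length at most k, and the path length passed to f is the shortest-path length in G1. The names `paths_within` and `pro_sim` are hypothetical.

```python
from collections import deque
from math import sqrt


def paths_within(graph, sources, k):
    """Map (a, b) to the shortest path length from a to b, for lengths 1..k."""
    found = {}
    for a in sources:
        if a not in graph:
            continue
        dist = {a: 0}
        queue = deque([a])
        while queue:  # breadth-first search, cut off at depth k
            node = queue.popleft()
            if dist[node] == k:
                continue
            for nxt in graph.get(node, ()):
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        for b, d in dist.items():
            if b != a and b in sources:
                found[(a, b)] = d
    return found


def pro_sim(g1, g2, relevant, k, f):
    """Weighted share of G1's short paths between relevant nodes also found in G2."""
    rel = {n for n in relevant if n in g1}  # Rel(G1)
    c1 = paths_within(g1, rel, k)
    c2 = paths_within(g2, rel, k)
    total = sum(f(d) for d in c1.values())
    if total == 0:
        return 0.0
    shared = sum(f(d) for pair, d in c1.items() if pair in c2)
    return shared / total


g1 = {"a": {"x"}, "x": {"a", "b"}, "b": {"x", "c"}, "c": {"b"}}
g2 = {"a": {"b"}, "b": {"a"}, "c": set()}
rel = {"a", "b", "c"}
print(pro_sim(g1, g2, rel, 2, lambda d: 1.0))          # proSimcnt
print(pro_sim(g1, g2, rel, 2, lambda d: 1.0 / d))      # proSimlen
print(pro_sim(g1, g2, rel, 2, lambda d: 1 / sqrt(d)))  # proSimsqlen
```

Note that the measure is asymmetric: the denominator only counts paths in G1, so swapping the two graphs can change the score.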
Evaluation Results
Results
Model                                   normal   extended
Similarity measures without structure
Random Baseline                         50.0     –
Bag of Words                            63.0     –
Bag of URIs (Variant 3)                 73.4     76.4
Similarity measures with structure
proSimcnt (k=6, Variant 1)              77.7     77.6
proSimcnt (k=6, Variant 2)              79.2     79.0
proSimcnt (k=8, Variant 3)              82.1     81.9
proSimlen (k=6, Variant 3)              81.5     81.4
proSimsqlen (k=8, Variant 3)            81.1     80.9
Summary
Main contributions:
Service for the LOD community that can be used in different scenarios that can profit from structural representation of text
Combination of well-established concepts and systems in a deep semantic pipeline
Linking existing NLP technologies to the LOD cloud
Evaluation of LODifier in a story link detection task gives clear evidence of its potential
Possible future work directions:
Instead of a pipeline approach, simultaneously consider information from the NLP components to allow interaction between them
Test LODifier in different scenarios
Thank you!
Augenstein, PadĀ“o, Rudolph LODiļ¬er ESWC 2012 28 / 28

Weitere Ƥhnliche Inhalte

Was ist angesagt?

Distributed Query Processing for Federated RDF Data Management
Distributed Query Processing for Federated RDF Data ManagementDistributed Query Processing for Federated RDF Data Management
Distributed Query Processing for Federated RDF Data ManagementOlafGoerlitz
Ā 
Named Entity Recognition from Online News
Named Entity Recognition from Online NewsNamed Entity Recognition from Online News
Named Entity Recognition from Online NewsBernardo Najlis
Ā 
Doctoral Examination at the Karlsruhe Institute of Technology (08.07.2016)
Doctoral Examination at the Karlsruhe Institute of Technology (08.07.2016)Doctoral Examination at the Karlsruhe Institute of Technology (08.07.2016)
Doctoral Examination at the Karlsruhe Institute of Technology (08.07.2016)Dr.-Ing. Thomas Hartmann
Ā 
LinkML Intro (for Monarch devs)
LinkML Intro (for Monarch devs)LinkML Intro (for Monarch devs)
LinkML Intro (for Monarch devs)Chris Mungall
Ā 
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...Olaf Hartig
Ā 
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 1 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 1 (...Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 1 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 1 (...Olaf Hartig
Ā 
CSHALS 2010 W3C Semanic Web Tutorial
CSHALS 2010 W3C Semanic Web TutorialCSHALS 2010 W3C Semanic Web Tutorial
CSHALS 2010 W3C Semanic Web TutorialLeeFeigenbaum
Ā 
Querying Linked Data with SPARQL
Querying Linked Data with SPARQLQuerying Linked Data with SPARQL
Querying Linked Data with SPARQLOlaf Hartig
Ā 
Presentation of OpenNLP
Presentation of OpenNLPPresentation of OpenNLP
Presentation of OpenNLPRobert Viseur
Ā 
TermPicker: Enabling the Reuse of Vocabulary Terms by Exploiting Data from th...
TermPicker: Enabling the Reuse of Vocabulary Terms by Exploiting Data from th...TermPicker: Enabling the Reuse of Vocabulary Terms by Exploiting Data from th...
TermPicker: Enabling the Reuse of Vocabulary Terms by Exploiting Data from th...JohannWanja
Ā 
Neural Architectures for Named Entity Recognition
Neural Architectures for Named Entity RecognitionNeural Architectures for Named Entity Recognition
Neural Architectures for Named Entity RecognitionRrubaa Panchendrarajan
Ā 
EDF2012 Irini Fundulaki - Abstract Access Control Models for Dynamic RDF Da...
EDF2012   Irini Fundulaki - Abstract Access Control Models for Dynamic RDF Da...EDF2012   Irini Fundulaki - Abstract Access Control Models for Dynamic RDF Da...
EDF2012 Irini Fundulaki - Abstract Access Control Models for Dynamic RDF Da...European Data Forum
Ā 
RDF Constraint Checking using RDF Data Descriptions (RDD)
RDF Constraint Checking using RDF Data Descriptions (RDD)RDF Constraint Checking using RDF Data Descriptions (RDD)
RDF Constraint Checking using RDF Data Descriptions (RDD)Alexander SchƤtzle
Ā 
Abstract Access Control Model for Dynamic RDF Datasets
Abstract Access Control Model for Dynamic RDF DatasetsAbstract Access Control Model for Dynamic RDF Datasets
Abstract Access Control Model for Dynamic RDF DatasetsPlanetData Network of Excellence
Ā 
Quantifying RDF data sets
Quantifying RDF data setsQuantifying RDF data sets
Quantifying RDF data setsJanos Hajagos
Ā 
How we use functional programming to find the bad guys @ Build Stuff LT and U...
How we use functional programming to find the bad guys @ Build Stuff LT and U...How we use functional programming to find the bad guys @ Build Stuff LT and U...
How we use functional programming to find the bad guys @ Build Stuff LT and U...Richard Minerich
Ā 

Was ist angesagt? (20)

Distributed Query Processing for Federated RDF Data Management
Distributed Query Processing for Federated RDF Data ManagementDistributed Query Processing for Federated RDF Data Management
Distributed Query Processing for Federated RDF Data Management
Ā 
Named Entity Recognition from Online News
Named Entity Recognition from Online NewsNamed Entity Recognition from Online News
Named Entity Recognition from Online News
Ā 
Triple Stores
Triple StoresTriple Stores
Triple Stores
Ā 
Doctoral Examination at the Karlsruhe Institute of Technology (08.07.2016)
Doctoral Examination at the Karlsruhe Institute of Technology (08.07.2016)Doctoral Examination at the Karlsruhe Institute of Technology (08.07.2016)
Doctoral Examination at the Karlsruhe Institute of Technology (08.07.2016)
Ā 
LinkML Intro (for Monarch devs)
LinkML Intro (for Monarch devs)LinkML Intro (for Monarch devs)
LinkML Intro (for Monarch devs)
Ā 
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...
Ā 
Scaling the (evolving) web data ā€“at low cost-
Scaling the (evolving) web data ā€“at low cost-Scaling the (evolving) web data ā€“at low cost-
Scaling the (evolving) web data ā€“at low cost-
Ā 
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 1 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 1 (...Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 1 (...
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 1 (...
Ā 
CSHALS 2010 W3C Semanic Web Tutorial
CSHALS 2010 W3C Semanic Web TutorialCSHALS 2010 W3C Semanic Web Tutorial
CSHALS 2010 W3C Semanic Web Tutorial
Ā 
Querying Linked Data with SPARQL
Querying Linked Data with SPARQLQuerying Linked Data with SPARQL
Querying Linked Data with SPARQL
Ā 
Presentation of OpenNLP
Presentation of OpenNLPPresentation of OpenNLP
Presentation of OpenNLP
Ā 
ShEx vs SHACL
ShEx vs SHACLShEx vs SHACL
ShEx vs SHACL
Ā 
TermPicker: Enabling the Reuse of Vocabulary Terms by Exploiting Data from th...
TermPicker: Enabling the Reuse of Vocabulary Terms by Exploiting Data from th...TermPicker: Enabling the Reuse of Vocabulary Terms by Exploiting Data from th...
TermPicker: Enabling the Reuse of Vocabulary Terms by Exploiting Data from th...
Ā 
Neural Architectures for Named Entity Recognition
Neural Architectures for Named Entity RecognitionNeural Architectures for Named Entity Recognition
Neural Architectures for Named Entity Recognition
Ā 
EDF2012 Irini Fundulaki - Abstract Access Control Models for Dynamic RDF Da...
EDF2012   Irini Fundulaki - Abstract Access Control Models for Dynamic RDF Da...EDF2012   Irini Fundulaki - Abstract Access Control Models for Dynamic RDF Da...
EDF2012 Irini Fundulaki - Abstract Access Control Models for Dynamic RDF Da...
Ā 
RDF Constraint Checking using RDF Data Descriptions (RDD)
RDF Constraint Checking using RDF Data Descriptions (RDD)RDF Constraint Checking using RDF Data Descriptions (RDD)
RDF Constraint Checking using RDF Data Descriptions (RDD)
Ā 
Abstract Access Control Model for Dynamic RDF Datasets
Abstract Access Control Model for Dynamic RDF DatasetsAbstract Access Control Model for Dynamic RDF Datasets
Abstract Access Control Model for Dynamic RDF Datasets
Ā 
AINL 2016: Yagunova
AINL 2016: YagunovaAINL 2016: Yagunova
AINL 2016: Yagunova
Ā 
Quantifying RDF data sets
Quantifying RDF data setsQuantifying RDF data sets
Quantifying RDF data sets
Ā 
How we use functional programming to find the bad guys @ Build Stuff LT and U...
How we use functional programming to find the bad guys @ Build Stuff LT and U...How we use functional programming to find the bad guys @ Build Stuff LT and U...
How we use functional programming to find the bad guys @ Build Stuff LT and U...
Ā 

Andere mochten auch

Deploying RDF Linked Data via Virtuoso Universal Server
Deploying RDF Linked Data via Virtuoso Universal ServerDeploying RDF Linked Data via Virtuoso Universal Server
Deploying RDF Linked Data via Virtuoso Universal Serverrumito
Ā 
USFD at SemEval-2016 - Stance Detection on Twitter with Autoencoders
USFD at SemEval-2016 - Stance Detection on Twitter with AutoencodersUSFD at SemEval-2016 - Stance Detection on Twitter with Autoencoders
USFD at SemEval-2016 - Stance Detection on Twitter with AutoencodersIsabelle Augenstein
Ā 
Weakly Supervised Machine Reading
Weakly Supervised Machine ReadingWeakly Supervised Machine Reading
Weakly Supervised Machine ReadingIsabelle Augenstein
Ā 
Seed Selection for Distantly Supervised Web-Based Relation Extraction
Seed Selection for Distantly Supervised Web-Based Relation ExtractionSeed Selection for Distantly Supervised Web-Based Relation Extraction
Seed Selection for Distantly Supervised Web-Based Relation ExtractionIsabelle Augenstein
Ā 
Relation Extraction from the Web using Distant Supervision
Relation Extraction from the Web using Distant SupervisionRelation Extraction from the Web using Distant Supervision
Relation Extraction from the Web using Distant SupervisionIsabelle Augenstein
Ā 
Distant Supervision with Imitation Learning
Distant Supervision with Imitation LearningDistant Supervision with Imitation Learning
Distant Supervision with Imitation LearningIsabelle Augenstein
Ā 
Semantic Search tutorial at SemTech 2012
Semantic Search tutorial at SemTech 2012Semantic Search tutorial at SemTech 2012
Semantic Search tutorial at SemTech 2012Peter Mika
Ā 
Question Answering over Linked Data (Reasoning Web Summer School)
Question Answering over Linked Data (Reasoning Web Summer School)Question Answering over Linked Data (Reasoning Web Summer School)
Question Answering over Linked Data (Reasoning Web Summer School)Andre Freitas
Ā 
Information Extraction with Linked Data
Information Extraction with Linked DataInformation Extraction with Linked Data
Information Extraction with Linked DataIsabelle Augenstein
Ā 
Natural Language Processing for the Semantic Web
Natural Language Processing for the Semantic WebNatural Language Processing for the Semantic Web
Natural Language Processing for the Semantic WebIsabelle Augenstein
Ā 
Lecture: Question Answering
Lecture: Question AnsweringLecture: Question Answering
Lecture: Question AnsweringMarina Santini
Ā 
Semantic Search Over The Web
Semantic Search Over The WebSemantic Search Over The Web
Semantic Search Over The Webalierkan
Ā 
Management information system question and answers
Management information system question and answersManagement information system question and answers
Management information system question and answerspradeep acharya
Ā 
Deep Learning Models for Question Answering
Deep Learning Models for Question AnsweringDeep Learning Models for Question Answering
Deep Learning Models for Question AnsweringSujit Pal
Ā 
Intro to Deep Learning for Question Answering
Intro to Deep Learning for Question AnsweringIntro to Deep Learning for Question Answering
Intro to Deep Learning for Question AnsweringTraian Rebedea
Ā 
Talent Sourcing and Matching - Artificial Intelligence and Black Box Semantic...
Talent Sourcing and Matching - Artificial Intelligence and Black Box Semantic...Talent Sourcing and Matching - Artificial Intelligence and Black Box Semantic...
Talent Sourcing and Matching - Artificial Intelligence and Black Box Semantic...Glen Cathey
Ā 
Web 3.0 The Semantic Web
Web 3.0 The Semantic WebWeb 3.0 The Semantic Web
Web 3.0 The Semantic WebHatem Mahmoud
Ā 

Andere mochten auch (19)

Lodifier: Generating Linked Data from Unstructured Text

  • 1. LODifier: Generating Linked Data from Unstructured Text
      Isabelle Augenstein¹, Sebastian Padó¹, Sebastian Rudolph²
      ¹ Department of Computational Linguistics, Universität Heidelberg, Germany
      ² Institute AIFB, Karlsruhe Institute of Technology, Germany
      {augenste,pado}@cl.uni-heidelberg.de, rudolph@kit.edu
      Extended Semantic Web Conference, 2012
      Footer: Augenstein, Padó, Rudolph | LODifier | ESWC 2012 | 1 / 28
  • 2. Outline
      1 Motivation
      2 The LODifier Architecture and Workflow
      3 Evaluation in a Document Similarity Task
      4 Conclusion
  • 3. Introduction
      What?
        - Represent information from natural-language (NL) text as Linked Data
        - Creation of a graph-based representation for unstructured plain text
      Why?
        - The task is easy for structured text, but quite challenging for unstructured text
        - Most other approaches are very selective, e.g., they extract only pre-defined relations
        - Our approach translates the text in its entirety
        - Possible use cases: domain-independent scenarios in which no pre-defined schema is available
  • 4. Introduction
      How?
        - Use a noise-tolerant NLP pipeline to analyze the text
        - Translate the result into an RDF representation
        - Embed the output in the LOD cloud by using vocabulary from DBpedia and WordNet
  • 5. System Overview: Pipeline
      [Figure: pipeline diagram; the arrows between boxes are not recoverable from the transcript.]
      Processing steps: Tokenization; Named Entity Recognition (Wikifier); Parsing (C&C); Deep Semantic Analysis (Boxer); Lemmatization; Word Sense Disambiguation (UKB); RDF Generation and Integration
      Data shown in the diagram: unstructured text; tokenized text; NE-marked text; DBpedia URIs; parsed text; DRS; lemmatized text; word sense URIs; LOD-linked RDF
  • 6. System: NLP analysis steps (NER)
      Named Entity Recognition (Wikifier¹):
        - Named entities are recognized using Wikifier
        - Wikifier finds Wikipedia links for named entities
        - Wikipedia URLs are converted to DBpedia URIs
      Figure: Test sentences
          The New York Times reported that John McCarthy died. He invented the programming language LISP.
      Figure: Wikifier output for the test sentences
          [[The New York Times]] reported that [[John McCarthy (computer scientist)|John McCarthy]] died. He invented the [[Programming language|programming language]] [[Lisp (programming language)|Lisp]].
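The conversion from Wikifier links to DBpedia URIs mentioned on this slide can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes the `[[Target|surface form]]` link syntax shown in the Wikifier output figure and the usual DBpedia convention of appending the Wikipedia page title (spaces replaced by underscores) to `http://dbpedia.org/resource/`. The names `dbpedia_uri` and `extract_links` are hypothetical.

```python
import re
from urllib.parse import quote

DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"

# Matches [[Target]] and [[Target|surface form]] wiki-style links.
WIKI_LINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]*))?\]\]")


def dbpedia_uri(wikipedia_title: str) -> str:
    """Turn a Wikipedia page title into a DBpedia resource URI (assumed convention)."""
    title = wikipedia_title.strip().replace(" ", "_")
    # Leave parentheses and a few common punctuation marks unescaped.
    return DBPEDIA_RESOURCE + quote(title, safe="_(),.:'-")


def extract_links(wikified: str) -> list:
    """Return (surface form, DBpedia URI) pairs for every link in the text."""
    pairs = []
    for match in WIKI_LINK.finditer(wikified):
        target = match.group(1)
        # A link without "|" uses the target itself as its surface form.
        surface = (match.group(2) or target).strip()
        pairs.append((surface, dbpedia_uri(target)))
    return pairs


if __name__ == "__main__":
    text = ("[[The New York Times]] reported that "
            "[[John McCarthy (computer scientist)|John McCarthy]] died.")
    for surface, uri in extract_links(text):
        print(surface, "->", uri)
```

Keeping the surface form next to the URI makes it possible to attach the `owl:sameAs` link to the discourse referent that carries the same name later in the pipeline.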
  • 7. System: NLP analysis steps (WSD)
      Word Sense Disambiguation (UKB²):
        - Words are disambiguated using UKB, an unsupervised graph-based WSD tool
        - UKB finds WordNet links for word senses
        - UKB output is converted to RDF WordNet URIs
      Figure: UKB output for the test sentences
          ctx_0 w1 00965687-v !! report
          ctx_0 w4 00358431-v !! die
          ctx_1 w7 01634424-v !! invent
          ctx_1 w9 06898352-n !! programming_language
          ctx_1 w10 06901936-n !! lisp
      ² http://ixa2.si.ehu.es/ukb/
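Before the UKB output can be turned into WordNet URIs, each line has to be split into its fields. The sketch below parses lines of the shape shown in the figure. It assumes exactly one synset per line, which is true of the example but may not hold for every UKB configuration. `SenseAssignment` and `parse_ukb_line` are hypothetical names, and the mapping from synset offsets to `wn30:` URIs is deliberately left out because the slides do not specify it.

```python
from typing import NamedTuple


class SenseAssignment(NamedTuple):
    context: str  # context (sentence) id, e.g. "ctx_0"
    word_id: str  # token id within the context, e.g. "w1"
    offset: str   # WordNet synset offset, kept as a string to preserve leading zeros
    pos: str      # part-of-speech letter, e.g. "v" or "n"
    lemma: str    # lemma that was disambiguated


def parse_ukb_line(line: str) -> SenseAssignment:
    """Parse one line such as 'ctx_0 w1 00965687-v !! report'."""
    head, lemma = line.split("!!")
    context, word_id, synset = head.split()
    offset, pos = synset.split("-")
    return SenseAssignment(context, word_id, offset, pos, lemma.strip())


if __name__ == "__main__":
    for raw in ["ctx_0 w1 00965687-v !! report",
                "ctx_1 w9 06898352-n !! programming_language"]:
        print(parse_ukb_line(raw))
```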
  • 8. System: NLP analysis steps (Parsing and Semantic Analysis)
      Parsing (C&C³):
        - Sentences are parsed with the C&C parser
        - The C&C parser is a statistical parser that uses Combinatory Categorial Grammar (CCG)
        - CCG is a semantically motivated grammar, making it suitable for further semantic analysis
      Deep Semantic Analysis (Boxer⁴):
        - Boxer builds on the output of C&C and produces discourse representation structures (DRSs)
        - DRSs model the meaning of texts in terms of the relevant entities (discourse referents) and the relations between them (conditions)
      ³ http://svn.ask.it.usyd.edu.au/trac/candc/
      ⁴ http://svn.ask.it.usyd.edu.au/trac/candc/wiki/boxer
  • 9. System: NLP analysis steps
      Figure: Boxer output for the example sentences.
        ______________________________   _______________________
       | x0 x1 x2 x3                  | | x4 x5 x6              |
       |..............................| |.......................|
      (| male(x0)                     |+| event(x4)             |)
       | named(x0,john_mccarthy,per)  | | invent(x4)            |
       | programming_language(x1)     | | agent(x4,x0)          |
       | nn(x1,x2)                    | | patient(x4,x2)        |
       | named(x2,lisp,nam)           | | event(x5)             |
       | named(x3,new_york_times,org) | | report(x5)            |
       |______________________________| | agent(x5,x3)          |
                                        | theme(x5,x6)          |
                                        | proposition(x6)       |
                                        |     ______________    |
                                        |    | x7           |   |
                                        | x6:|..............|   |
                                        |    | event(x7)    |   |
                                        |    | die(x7)      |   |
                                        |    | agent(x7,x0) |   |
                                        |    |______________|   |
                                        |_______________________|
  • 10. System: NLP analysis steps
      Figure: LODifier output for the test sentences.
      _:var0x0 drsclass:named ne:john_mccarthy ;
               rdf:type drsclass:male , foaf:Person ;
               owl:sameAs dbpedia:John_McCarthy_(computer_scientist) .
      _:var0x1 rdf:type class:programming_language ;
               owl:sameAs dbpedia:Programming_language .
      _:var0x2 drsrel:nn _:var0x1 .
      _:var0x2 drsclass:named ne:lisp ;
               owl:sameAs dbpedia:Lisp_(programming_language) .
      _:var0x3 drsclass:named ne:the_new_york_times ;
               owl:sameAs dbpedia:The_New_York_Times .
      _:var0x4 rdf:type drsclass:event , wn30:wordsense-invent-verb-2 ;
               drsrel:agent _:var0x0 ;
               drsrel:patient _:var0x2 .
      _:var0x5 rdf:type drsclass:event , wn30:wordsense-report-verb-3 ;
               drsrel:agent _:var0x3 ;
               drsrel:theme _:var0x6 .
      _:var0x6 rdf:type drsclass:proposition , reify:proposition , reify:conjunction ;
               reify:conjunct [ rdf:subject _:var0x7 ;
                                rdf:predicate rdf:type ;
                                rdf:object drsclass:event . ]
               reify:conjunct [ rdf:subject _:var0x7 ;
                                rdf:predicate rdf:type ;
                                rdf:object wn30:wordsense-die-verb-1 . ]
               reify:conjunct [ rdf:subject _:var0x7 ;
                                rdf:predicate drsrel:agent ;
                                rdf:object _:var0x0 . ]
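The correspondence between Boxer conditions and the generated triples can be sketched roughly as follows. This is an illustration inferred from comparing the two figures, not the authors' implementation: the set `DRS_CLASSES`, the helper names, and the rule that the first argument becomes the subject are all assumptions. The `owl:sameAs` and `wn30:` links come from Wikifier and UKB and are not covered, and `nn` is left out because the figure lists its arguments in the opposite order.

```python
# Assumption: predicates treated as closed-class DRS vocabulary (inferred from the figure).
DRS_CLASSES = {"male", "event", "proposition"}


def node(var):
    """Blank-node label in the style of the figure, e.g. x0 -> _:var0x0."""
    return f"_:var0{var}"


def condition_to_triple(name, args):
    """Translate one Boxer condition into a (subject, predicate, object) triple."""
    if name == "named":
        # named(x, label, type): the entity type is ignored in this sketch.
        var, label, _entity_type = args
        return (node(var), "drsclass:named", f"ne:{label}")
    if len(args) == 1:
        # Unary conditions become rdf:type statements.
        prefix = "drsclass" if name in DRS_CLASSES else "class"
        return (node(args[0]), "rdf:type", f"{prefix}:{name}")
    if len(args) == 2:
        # Binary conditions become relations between two discourse referents.
        return (node(args[0]), f"drsrel:{name}", node(args[1]))
    raise ValueError(f"unsupported condition: {name}{args}")


conditions = [
    ("male", ("x0",)),
    ("named", ("x0", "john_mccarthy", "per")),
    ("programming_language", ("x1",)),
    ("event", ("x4",)),
    ("agent", ("x4", "x0")),
    ("patient", ("x4", "x2")),
]
for name, args in conditions:
    print(" ".join(condition_to_triple(name, args)), ".")
```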
  • 19. Evaluation: Evaluation overview
      Task: Assess document similarity
        Story link detection task, part of the topic detection and tracking (TDT) family of tasks
      Data: TDT-2 benchmark dataset (http://projects.ldc.upenn.edu/TDT2/)
        Consists of newspaper, TV and radio news in English, Arabic and Mandarin
        Subset: English language, only newspaper articles
        183 positive and negative document pairs
  • 20. Evaluation: Evaluation overview
      Method: Define various document similarity measures
        Measures without structural information
        Measures with structural information
      Documents are predicted to have the same topic if their similarity is θ or more
      θ is determined with supervised learning
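The slides do not say which learner is used to determine θ. As one simple possibility, the threshold can be picked by exhaustive search over the similarity scores of labelled training pairs; the function name `best_threshold` and the toy scores below are made up for illustration.

```python
def best_threshold(scores, labels):
    """Return (theta, accuracy) maximizing accuracy of the rule 'score >= theta'.

    scores: similarity values of training document pairs
    labels: True if the pair is on the same topic, False otherwise
    """
    best_theta, best_accuracy = None, -1.0
    # Only observed scores need to be tried as candidate thresholds.
    for theta in sorted(set(scores)):
        correct = sum((score >= theta) == label for score, label in zip(scores, labels))
        accuracy = correct / len(scores)
        if accuracy > best_accuracy:
            best_theta, best_accuracy = theta, accuracy
    return best_theta, best_accuracy


# Toy training data (invented values, not from the TDT-2 experiments).
train_scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2]
train_labels = [False, True, False, True, True, False]
theta, accuracy = best_threshold(train_scores, train_labels)
print(theta, accuracy)
```

The learned θ would then be applied unchanged to held-out pairs, so that the reported figure is not tuned on the test data.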
  • 21. Evaluation: Evaluation method
      Similarity measures without structure:
        Random baseline
        Bag-of-words baseline: word overlap between the two documents in a document pair
        Bag-of-URI baseline: URI overlap between two RDF documents
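The slide does not specify how overlap is normalized. A minimal sketch, assuming Jaccard overlap (intersection over union) as the measure, works for both baselines because it only needs two sets: words in one case, URIs in the other. The tokenizer below is a deliberately crude assumption.

```python
import re


def tokens(text):
    """Lower-cased set of alphanumeric word tokens (simplistic tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(items_a, items_b):
    """Overlap of two collections as |A & B| / |A | B| (assumed normalization)."""
    set_a, set_b = set(items_a), set(items_b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


doc1 = "John McCarthy invented LISP."
doc2 = "LISP was invented by McCarthy."
print(jaccard(tokens(doc1), tokens(doc2)))  # shared: mccarthy, invented, lisp
```

For the bag-of-URI baseline, the same `jaccard` call would take the sets of URIs occurring in the two generated RDF graphs.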
  • 22. Evaluation: Evaluation method
      URI class variants:
        Three variants of the URI baseline take the relative importance of various URI classes into account
        Variant 1: All NEs identified by Wikifier and all words successfully disambiguated by UKB (namespaces dbpedia:, wn30:)
        Variant 2: Adds all NEs recognized by Boxer (namespace ne:)
        Variant 3: Further adds the URIs for all words that were not recognized by Wikifier, UKB or Boxer (namespace class:)
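Since the variants are defined by namespace, selecting the URIs for a given variant amounts to a prefix filter. The sketch below assumes URIs are available in prefixed form as in the LODifier output figure; `VARIANT_NAMESPACES` and `select_uris` are illustrative names.

```python
# Namespaces included in each variant, as listed on the slide.
VARIANT_NAMESPACES = {
    1: ("dbpedia:", "wn30:"),
    2: ("dbpedia:", "wn30:", "ne:"),
    3: ("dbpedia:", "wn30:", "ne:", "class:"),
}


def select_uris(uris, variant):
    """Keep only the prefixed URIs whose namespace belongs to the variant."""
    prefixes = VARIANT_NAMESPACES[variant]
    # str.startswith accepts a tuple of alternatives.
    return {uri for uri in uris if uri.startswith(prefixes)}


graph_uris = {
    "dbpedia:The_New_York_Times",
    "wn30:wordsense-report-verb-3",
    "ne:lisp",
    "class:programming_language",
    "drsclass:event",  # structural vocabulary, not part of any variant
}
for variant in (1, 2, 3):
    print(variant, sorted(select_uris(graph_uris, variant)))
```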
  • 23. Evaluation: Evaluation method
      Extended setting:
        Aims at drawing more information from the LOD cloud into the similarity computation
        The generated RDF graph is enriched by:
          DBpedia categories (namespace dbpedia:)
          WordNet synsets and WordNet senses related to the URIs in the generated graph (namespace wn30:)
  • 24. Evaluation: Evaluation method
      Similarity measures with structure:
        Full-fledged graph similarity measures, e.g. those based on isomorphic subgraphs, are infeasible due to the size of the RDF graphs
        Our similarity measures are based on the shortest paths between relevant nodes
        Our intuition is that a short path between a pair of URIs denotes relevant semantic information, so we expect the same pair to be connected by a short path in related documents as well
  • 25. Evaluation: Evaluation method
      Similarity measures with structure:
        $$\mathrm{proSim}_{k,\mathrm{Rel},f}(G_1, G_2) = \frac{\displaystyle\sum_{\substack{a,b \in \mathrm{Rel}(G_1) \\ (a,b) \in C_k(G_1) \cap C_k(G_2)}} f(\ell(a,b))}{\displaystyle\sum_{\substack{a,b \in \mathrm{Rel}(G_1) \\ (a,b) \in C_k(G_1)}} f(\ell(a,b))}$$
        k: path length, Rel: semantic relations, f: similarity function, $\ell(a,b)$: length of the path between a and b
        proSim_cnt: uses $f(\ell) = 1$, i.e., just counts the number of paths irrespective of their length
        proSim_len: uses $f(\ell) = 1/\ell$, giving less weight to longer paths
        proSim_sqlen: uses $f(\ell) = 1/\sqrt{\ell}$, discounting long paths less aggressively than proSim_len
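A small sketch may help to make the proSim measure concrete. It rests on several assumptions that the slide does not state: graphs are treated as undirected adjacency maps, Rel(G1) is passed in as a set of relevant nodes, C_k(G) is read as the set of node pairs connected by a path of length at most k, and path lengths are shortest-path lengths measured in G1. All function names and the toy graphs are hypothetical.

```python
from collections import deque
from math import sqrt


def undirected(edges):
    """Build a symmetric adjacency map from a list of (node, node) edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_lengths(graph, source, k):
    """Breadth-first search distances from source, cut off at depth k."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        if dist[current] == k:
            continue  # do not expand beyond path length k
        for neighbour in graph.get(current, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[current] + 1
                queue.append(neighbour)
    return dist


def connected_pairs(graph, relevant, k):
    """Map each unordered pair of relevant nodes within distance k to its length."""
    pairs = {}
    for a in relevant:
        dist = shortest_lengths(graph, a, k)
        for b in relevant:
            if a < b and b in dist:
                pairs[(a, b)] = dist[b]
    return pairs


def pro_sim(g1, g2, relevant, k, f):
    """Share of f-weighted short paths of g1 whose endpoints are also close in g2."""
    c1 = connected_pairs(g1, relevant, k)
    c2 = connected_pairs(g2, relevant, k)
    denominator = sum(f(length) for length in c1.values())
    if denominator == 0:
        return 0.0
    numerator = sum(f(length) for pair, length in c1.items() if pair in c2)
    return numerator / denominator


# Toy graphs loosely modelled on the example sentences (invented for illustration).
g1 = undirected([("nyt", "report"), ("report", "die"), ("die", "mccarthy"),
                 ("mccarthy", "invent"), ("invent", "lisp")])
g2 = undirected([("mccarthy", "create"), ("create", "lisp")])
relevant = {"nyt", "mccarthy", "lisp"}

sim_cnt = pro_sim(g1, g2, relevant, 3, lambda length: 1.0)
sim_len = pro_sim(g1, g2, relevant, 3, lambda length: 1.0 / length)
sim_sqlen = pro_sim(g1, g2, relevant, 3, lambda length: 1.0 / sqrt(length))
print(sim_cnt, sim_len, sim_sqlen)
```

In the toy example, the short mccarthy/lisp path is shared while the longer mccarthy/nyt path is not, so the length-discounted variants score higher than the plain count, with the square-root variant in between.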
  • 26. Evaluation: Results
      Model                               normal   extended
      Similarity measures without structure
        Random Baseline                    50.0       –
        Bag of Words                       63.0       –
        Bag of URIs (Variant 3)            73.4      76.4
      Similarity measures with structure
        proSim_cnt (k=6, Variant 1)        77.7      77.6
        proSim_cnt (k=6, Variant 2)        79.2      79.0
        proSim_cnt (k=8, Variant 3)        82.1      81.9
        proSim_len (k=6, Variant 3)        81.5      81.4
        proSim_sqlen (k=8, Variant 3)      81.1      80.9
  • 27. Summary
      Main contributions:
        Service for the LOD community that can be used in different scenarios that can profit from a structural representation of text
        Combination of well-established concepts and systems in a deep semantic pipeline
        Linking existing NLP technologies to the LOD cloud
        Evaluation of LODifier in a topic link detection task gives clear evidence of its potential
      Possible future work directions:
        Instead of a pipeline approach, simultaneously consider information from the NLP components to allow interaction between them
        Test LODifier in different scenarios
  • 28. Thank you!