Case acquisition from text: Ontology-based information extraction with SCOOBIE for myCBR
1. Competence Center Case-Based Reasoning
CASE ACQUISITION FROM TEXT: ONTOLOGY-BASED INFORMATION EXTRACTION WITH SCOOBIE FOR MYCBR
Thomas Roth-Berghofer, Benjamin Adrian, and Andreas Dengel
German Research Center for Artificial Intelligence DFKI GmbH
Thursday, 5 August 2010
2. COMPETENCE CENTER CASE-BASED REASONING (CC CBR)
Klaus-Dieter Althoff, Thomas Roth-Berghofer, Armin Stahl
© 2010 DFKI CC CBR
3. COMPETENCE CENTER CASE-BASED REASONING (CC CBR)
Klaus-Dieter Althoff, Thomas Roth-Berghofer, Armin Stahl, Kerstin Bach, Régis Newo
5. MOTIVATION
[Figure: SCOOBIE, ontology-based information extraction; labels: Ontologies, Texts, RDF]
6. MOTIVATION
[Figure: SCOOBIE +, ontology-based information extraction; labels: Ontologies, Texts, RDF]
7. MOTIVATION
[Figure: Linking Open Data cloud diagram, as of July 2009, with datasets such as DBpedia, Freebase, DBLP, GeoNames, WordNet, UniProt and PubMed, shown behind the SCOOBIE + diagram; labels: Ontologies, Texts, RDF, ontology-based information extraction]
9. OVERVIEW
• Ontology-based Information Extraction with SCOOBIE
• Recap of myCBR
• myCBR+SCOOBIE
• Outlook and future work
11. SCOOBIE
[Figure: ontology-based information extraction; labels: Ontologies, Texts, RDF]
21. RECAP: MOTIVATION FOR DEVELOPING myCBR
• Need for a freely available “out of the box” tool:
  • compact and easy to use
  • comfortable graphical user interface for
    • defining case representations
    • modeling knowledge-intensive similarity measures
    • testing of retrieval functionality
  • support for rapid prototyping
  • adaptable & extendable
22. ➜ ECCBR 2008
Armin Stahl and Thomas R. Roth-Berghofer. Rapid prototyping of CBR applications with the open source tool myCBR. In Ralph Bergmann and Klaus-Dieter Althoff, editors, Advances in Case-Based Reasoning. Springer Verlag, 2008.
23. MOTIVATION
[Figure: Linking Open Data cloud diagram, as of July 2009, shown behind the SCOOBIE + diagram; labels: Ontologies, Texts, RDF, ontology-based information extraction]
24. SEMANTIC WEB VISION
“The Semantic Web is an extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.”
T. Berners-Lee, J. Hendler, O. Lassila, “The Semantic Web”, Scientific American, May 2001
25. SEMANTIC WEB VISION
“The Semantic Web is an extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.”
• Web of content
• Web pages linked by semantic relations
• Machines are able to process contents and links
T. Berners-Lee, J. Hendler, O. Lassila, “The Semantic Web”, Scientific American, May 2001
27. WEB OF DATA
• Characteristics:
  • Expressed in RDF
  • Identified by URIs
  • Accessible via HTTP
28. WEB OF TRIPLES
<rdf:Description rdf:about="http://dbtropes.org/resource/Main/Ratatouille#Remy">
  <does-not-like rdf:resource="http://mycbr-project.net/models/Recipe#velveeta_cheese"/>
</rdf:Description>
29. WEB OF TRIPLES
• Characteristics:
  • Expressed in RDF
  • Identified by URIs
  • Accessible via HTTP
<rdf:Description rdf:about="http://dbtropes.org/resource/Main/Ratatouille#Remy">
  <does-not-like rdf:resource="http://mycbr-project.net/models/Recipe#velveeta_cheese"/>
</rdf:Description>
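The rdf:Description element on the "Web of triples" slides encodes one statement with a subject URI, a property, and an object URI. The following is a minimal sketch of reading such a description as a triple with Python's standard library. The slide omits namespace declarations, so the `rdf:RDF` wrapper and the default namespace `http://example.org/vocab#` for `does-not-like` are assumptions made only to obtain well-formed XML. The helper name `extract_triples` is hypothetical as well.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: the slide shows no namespace declarations, so a hypothetical
# default namespace is supplied here for the does-not-like property.
EX_NS = "http://example.org/vocab#"

DOC = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns="{EX_NS}">
  <rdf:Description rdf:about="http://dbtropes.org/resource/Main/Ratatouille#Remy">
    <does-not-like rdf:resource="http://mycbr-project.net/models/Recipe#velveeta_cheese"/>
  </rdf:Description>
</rdf:RDF>"""


def extract_triples(rdf_xml):
    """Return (subject, predicate, object) URI triples from simple RDF/XML."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree stores qualified names as "{namespace}local".
            namespace, local = prop.tag[1:].split("}")
            obj = prop.get(f"{{{RDF_NS}}}resource")
            if obj is not None:
                triples.append((subject, namespace + local, obj))
    return triples


triples = extract_triples(DOC)
for s, p, o in triples:
    print(s, p, o)
```

This only covers the flat pattern shown on the slide (one rdf:Description with rdf:resource objects). A full RDF parser would be the better choice for anything beyond that.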
31. USING LINKED DATA FOR CASE GENERATION
[Figure: Linking Open Data cloud diagram with a "Case Model" node]
32. USING LINKED DATA FOR CASE GENERATION
[Figure: Linking Open Data cloud diagram with a "Case Model" node, overlaid with the following SKOS concepts]
<skos:Concept rdf:about="http://mycbr-project.net/models/Recipe#Shallots">
  <skos:prefLabel>Shallots</skos:prefLabel>
  <rdf:type rdf:resource="ingredients_vegetables"/>
</skos:Concept>
<skos:Concept rdf:about="http://mycbr-project.net/models/Recipe#Onions">
  <skos:prefLabel>Onions</skos:prefLabel>
  <rdf:type rdf:resource="ingredients_vegetables"/>
</skos:Concept>
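The SKOS concepts on this slide pair a case-model URI with a preferred label and a type. As a rough sketch of how such concepts could be collected into a vocabulary, the code below groups labels and URIs by their rdf:type using only the standard library. The `rdf:RDF` wrapper with namespace declarations is added here because the slide omits it. The function name `concepts_by_type` and the idea of treating each type as one attribute's value list are illustrative assumptions, not the authors' stated implementation.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Assumption: the rdf:RDF wrapper and namespace declarations are not on the
# slide; they are added so the fragment parses as XML.
SKOS_DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}">
  <skos:Concept rdf:about="http://mycbr-project.net/models/Recipe#Shallots">
    <skos:prefLabel>Shallots</skos:prefLabel>
    <rdf:type rdf:resource="ingredients_vegetables"/>
  </skos:Concept>
  <skos:Concept rdf:about="http://mycbr-project.net/models/Recipe#Onions">
    <skos:prefLabel>Onions</skos:prefLabel>
    <rdf:type rdf:resource="ingredients_vegetables"/>
  </skos:Concept>
</rdf:RDF>"""


def concepts_by_type(rdf_xml):
    """Group (label, uri) pairs of skos:Concept elements by their rdf:type."""
    root = ET.fromstring(rdf_xml)
    grouped = defaultdict(list)
    for concept in root.findall(f"{{{SKOS}}}Concept"):
        uri = concept.get(f"{{{RDF}}}about")
        label = concept.findtext(f"{{{SKOS}}}prefLabel", default="").strip()
        type_el = concept.find(f"{{{RDF}}}type")
        # Concepts without an rdf:type end up in a hypothetical "untyped" group.
        type_name = type_el.get(f"{{{RDF}}}resource") if type_el is not None else "untyped"
        grouped[type_name].append((label, uri))
    return dict(grouped)


vocabulary = concepts_by_type(SKOS_DOC)
for type_name, values in vocabulary.items():
    print(type_name, [label for label, _ in values])
```

Keeping the URI next to each label means a label found in text can later be traced back to the case-model value it stands for.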
34. USING LINKED DATA FOR CASE GENERATION
[Figure: Linking Open Data cloud diagram with "Case Model" and "Connection Model" nodes]
35. USING LINKED DATA FOR CASE GENERATION
[Figure: Linking Open Data cloud diagram, as of July 2009, with "Case Model" and "Connection Model" nodes and the label owl:sameAs]
36. USING LINKED DATA FOR CASE GENERATION
[Figure: Linking Open Data cloud diagram, as of July 2009, with "Case Model" and "Connection Model" nodes and the label owl:sameAs, overlaid with the following links]
<http://mycbr-project.net/models/Recipe#onions> owl:sameAs <http://dbpedia.org/resource/Onion>
<http://mycbr-project.net/models/Recipe#green_fettuccine> owl:sameAs <http://dbpedia.org/resource/Fettucine>
<http://mycbr-project.net/models/Recipe#spinach_noodles> owl:sameAs <http://dbpedia.org/resource/Noodle>
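The owl:sameAs statements above connect values of the recipe case model to DBpedia resources. One way to use such links, sketched here under assumptions, is an index from the external URI back to the case-model value, so that a DBpedia entity recognised in a text can be turned into an attribute value of a case. The name `same_as_index` and the line-based input format are hypothetical. The pattern tolerates the stray quotation marks and the lowercase `owl:sameas` spelling that appear in the slide text, which are treated here as typos.

```python
import re

LINKS = """\
<http://mycbr-project.net/models/Recipe#onions> owl:sameAs <http://dbpedia.org/resource/Onion>
<http://mycbr-project.net/models/Recipe#green_fettuccine> owl:sameAs <http://dbpedia.org/resource/Fettucine>
<http://mycbr-project.net/models/Recipe#spinach_noodles> owl:sameAs <http://dbpedia.org/resource/Noodle>
"""

# One statement per line: <subject> owl:sameAs <object>.
# An optional stray quote before ">" is accepted; matching ignores case.
LINK_PATTERN = re.compile(r'<([^>"]+)"?>\s+owl:sameas\s+<([^>"]+)"?>', re.IGNORECASE)


def same_as_index(text):
    """Map each external URI to the local name of its case-model value."""
    index = {}
    for local_uri, remote_uri in LINK_PATTERN.findall(text):
        # The fragment after "#" is the value name inside the case model.
        index[remote_uri] = local_uri.rsplit("#", 1)[-1]
    return index


index = same_as_index(LINKS)
print(index["http://dbpedia.org/resource/Onion"])
```

The index points from DBpedia to the case model because, in this sketch, lookups start from an entity found in text. Reversing the dictionary would serve the opposite use.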