NIF - NLP Interchange Format
1. Creating Knowledge out of Interlinked Data
NIF - NLP Interchange Format
Sebastian Hellmann
AKSW, Universität Leipzig
LOD2 Presentation . 02.09.2010 . Page http://lod2.eu
2. Creating Knowledge out of Interlinked Data
NIF - NLP Interchange Format
Problem:
• Currently, NLP software is organized in pipelines
• Integration is "hard-wired"
  - For each tool and each framework, an adapter has to be created (n*m)
• Difficult to exchange single components
Open Linguistics@OKCon 30.6.2011 2 http://lod2.eu
3. Creating Knowledge out of Interlinked Data
NIF - NLP Interchange Format
Overview:
• NLP tools can be integrated via a common output format (a common pattern in Enterprise Application Integration)
• For each tool, a wrapper needs to be created that reads NIF and produces NIF
• The combination of tools can be ad hoc, i.e. no pipeline needs to be configured
• Multi-layer and overlapping annotations are possible
• Ontologies provide interfaces for each layer and for applications
4. Creating Knowledge out of Interlinked Data
NIF - NLP Interchange Format
• First challenge: Representing strings in RDF
• How to give a part of a document or text an identifier (URI)?
• What properties can such URIs have?
5. Creating Knowledge out of Interlinked Data
NIF - NLP Interchange Format
LOD2 Event . 06.09.2010 . Page 5 http://lod2.eu
6. Creating Knowledge out of Interlinked Data
NIF - NLP Interchange Format
Example URIs for annotating "Semantic Web"
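The URIs shown on the original slide did not survive in this transcript. As a stand-in, here is a minimal sketch of one way to mint a URI for a substring, assuming an offset-based fragment of the hypothetical form `offset_<begin>_<end>_<text>`. The document URI, the sample sentence, and the exact fragment layout are assumptions for illustration and may differ from the recipes in the NIF specification.

```python
from urllib.parse import quote_plus


def offset_uri(document_uri: str, text: str, begin: int, end: int) -> str:
    """Build a fragment URI identifying text[begin:end] within a document."""
    if not 0 <= begin < end <= len(text):
        raise ValueError("offsets out of range")
    # URL-encode the covered string so it is safe inside a fragment.
    snippet = quote_plus(text[begin:end])
    return f"{document_uri}#offset_{begin}_{end}_{snippet}"


text = "The Semantic Web is a web of data."  # hypothetical sample text
begin = text.index("Semantic Web")
uri = offset_uri("http://example.org/doc.txt", text, begin, begin + len("Semantic Web"))
print(uri)
```

Any tool that applies the same recipe to the same text arrives at the same URI, which is what lets independently produced annotations refer to the same string.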
7. Creating Knowledge out of Interlinked Data
NIF - NLP Interchange Format
• First challenge: Representing strings in RDF
• How to give a part of a document or text an identifier (URI)?
• What properties can such URIs have?
8. Creating Knowledge out of Interlinked Data
NIF - NLP Interchange Format
• URIs are used to integrate output. RDF merges naturally if the URIs are the same (or convertible using a certain recipe)
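A small sketch of why shared URIs make merging trivial: if two tools describe the same string URI, the union of their triple sets is already the integrated result. Triples are modelled as plain Python tuples here; the `ex:` predicates, the two tools, and the string URI are hypothetical.

```python
from collections import defaultdict

STRING_URI = "http://example.org/doc.txt#offset_4_16_Semantic+Web"

# Output of a hypothetical chunker.
chunker_output = {
    (STRING_URI, "ex:anchorOf", "Semantic Web"),
    (STRING_URI, "ex:phraseType", "NP"),
}

# Output of a hypothetical entity linker for the same string.
linker_output = {
    (STRING_URI, "ex:anchorOf", "Semantic Web"),
    (STRING_URI, "ex:linksTo", "http://dbpedia.org/resource/Semantic_Web"),
}

# Merging is a set union: identical triples collapse, new ones accumulate.
merged = chunker_output | linker_output

# Group properties by subject to see the integrated description.
by_subject = defaultdict(dict)
for subject, predicate, obj in merged:
    by_subject[subject][predicate] = obj
```

Both tools contribute properties to a single subject, so no alignment step is needed beyond agreeing on the URI.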
9. Creating Knowledge out of Interlinked Data
NIF - NLP Interchange Format
• Second challenge: The output of each layer is required to be stable.
• Components and layers can be interchanged
• OLiA provides an ontological interface
10. Creating Knowledge out of Interlinked Data
NIF - NLP Interchange Format
11. Creating Knowledge out of Interlinked Data
NIF - NLP Interchange Format
12. Creating Knowledge out of Interlinked Data
NIF - NLP Interchange Format
13. Creating Knowledge out of Interlinked Data
Workplan
• EU deliverable almost finished
• Integration of SnowballStemming and the Stanford Parser
• Next step: Integration of knowledge extraction tools (Zemanta, DBpedia Spotlight, Alchemy, OpenCalais)
• Web services that read NIF and output NIF
• Google Code project: http://code.google.com/p/nlp2rdf/
14. Creating Knowledge out of Interlinked Data
Future
• NIF allows NLP output to be represented using knowledge representation formalisms (RDF/OWL)
• It can be mixed with other knowledge (e.g. Wikipedia/DBpedia)
• Good foundation for optimizing machine learning:
  - Choose the best algorithms
  - Choose the best data
15. Creating Knowledge out of Interlinked Data
Reasons for Open Data
• Horváth et al. (ILP 2009): "A Logic-Based Approach to Relation Extraction from Texts"
• POS tags and dependency trees in first-order logic
• ILP machine learning approach
• TIDES Extraction (ACE) 2003 Multilingual Training Data
• closed licence
• about 3000 US$
• Barrier to reproducing results
• Authors could send me a (p)(r)e-print, but not a copy of the benchmark™
16. Creating Knowledge out of Interlinked Data
Thank you for your attention!