ULI meeting – 2013/05/28 – Page 1 http://lod2.eu
Creating Knowledge out of Interlinked Data
LOD2 Presentation . 02.09.2010 . Page http://lod2.eu
AKSW, Universität Leipzig
Sebastian Hellmann
Linked Data for Abbreviations and Segmentation
http://nlp2rdf.org
http://lod2.eu
http://slideshare.net/kurzum
Sebastian Hellmann – researcher working on the LOD2 EU project
AKSW – Agile Knowledge Engineering and Semantic Web research group in Leipzig - http://aksw.org
InfAI – Institute for Applied Informatics - http://infai.org
Contents:
• Introduction to Linked Data
• Linked data close-up: DBpedia data set
• Exploitation of free and open data for CLDR
• Collaboration points
Introduction
http://lod-cloud.net
Linked Open Data
- All datasets provide open access to individual records via HTTP
- Many are free (no payment required, as in royalty-free)
- Some are openly licensed, e.g. CC-0 or CC-BY-SA
=> Published HTML on the WWW is also open access, but here the data itself is
published unrendered, as RDF
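Retrieving a record as unrendered RDF usually relies on HTTP content negotiation. The following is a minimal sketch, assuming the server honors an Accept header for Turtle; the helper name rdf_request and the default media type text/turtle are illustrative assumptions, not taken from the slides. The code only builds the request and does not send it.

```python
from urllib.request import Request


def rdf_request(record_uri: str, media_type: str = "text/turtle") -> Request:
    """Build (but do not send) an HTTP GET request that asks for RDF."""
    # The Accept header asks the server for the data itself rather than HTML.
    return Request(record_uri, headers={"Accept": media_type})


request = rdf_request("http://dbpedia.org/resource/Berlin")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Passing the request to urllib.request.urlopen would perform the actual lookup.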
http://dbpedia.org
• DBpedia is a crowd-sourced community effort to extract structured
information from Wikipedia and make this information available on the
Web.
• allows sophisticated queries against Wikipedia content
• allows links from the different data sets on the Web to Wikipedia data
• data is extracted continuously: http://live.dbpedia.org
• Wikidata will be integrated within the next four months
via a Google Summer of Code project
http://dbpedia.org
http://dbpedia.org/resource/Berlin
First paragraph in more than 20 languages
http://dbpedia.org/resource/Berlin
Facts from Wikipedia infoboxes
http://dbpedia.org/resource/Berlin
Several hierarchical classifications
http://dbpedia.org/resource/Berlin
Links
Multilingual labels
Trend 1: I18n
• DBpedia Extraction Framework can be extended to easily extract any data
from Wikipedia: https://github.com/dbpedia/extraction-framework
• We are using it to extract corpora for NLP
• e.g. URI, surrounding text, surface form
• Probabilities:
• P(sf|URI): probability that wikipedia:Apple_Inc. is written as “apple” in text
• P(URI|sf): probability that “apple” in text refers to wikipedia:Apple_Inc.
Trend 2: DBpedia 4 NLP
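Both probabilities can be estimated from counts of (surface form, URI) pairs. The sketch below reads P(URI|sf) as the probability that a surface form refers to a given URI, and P(sf|URI) as the probability that a URI is written with that surface form. The observations are hypothetical toy data, and the function names are illustrative assumptions rather than part of the DBpedia Extraction Framework.

```python
from collections import Counter

# Hypothetical (surface form, URI) observations, e.g. harvested from wiki links.
observations = [
    ("apple", "wikipedia:Apple_Inc."),
    ("apple", "wikipedia:Apple_Inc."),
    ("apple", "wikipedia:Apple"),
    ("Apple Inc.", "wikipedia:Apple_Inc."),
    ("Apple Inc.", "wikipedia:Apple_Inc."),
]

pair_counts = Counter(observations)
sf_counts = Counter(sf for sf, _ in observations)
uri_counts = Counter(uri for _, uri in observations)


def p_uri_given_sf(uri: str, sf: str) -> float:
    """Relative frequency with which the surface form links to the URI."""
    total = sf_counts[sf]
    return pair_counts[(sf, uri)] / total if total else 0.0


def p_sf_given_uri(sf: str, uri: str) -> float:
    """Relative frequency with which the URI is written as the surface form."""
    total = uri_counts[uri]
    return pair_counts[(sf, uri)] / total if total else 0.0


print(p_uri_given_sf("wikipedia:Apple_Inc.", "apple"))
print(p_sf_given_uri("apple", "wikipedia:Apple_Inc."))
```

In this toy data the two values differ, because “apple” is ambiguous between two URIs while wikipedia:Apple_Inc. is also written as “Apple Inc.”.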
• DBpedia is a data dissemination project:
• as a download for reuse
• as Linked Data for interlinking
• Corpora will be published via the NLP Interchange RDF Format (NIF) -
http://nlp2rdf.org
Trend 2: DBpedia 4 NLP
DBpedia Live Abbreviation Example
Up-to-date gazetteer
- The AfD party was founded earlier this year.
- lexical information and statistics could be included
Linguistic LOD Cloud
• DBpedia
• Main version and I18n chapters
• http://dbpedia.org/Datasets/NLP
• Wiktionary 2 RDF: http://dbpedia.org/Wiktionary
• Wortschatz from Uni Leipzig (planned as Linked Data)
• http://corpora.informatik.uni-leipzig.de/download.html
• JRC Names: http://langtech.jrc.it/JRC-Names.html
• JRC-Names is a highly multilingual named entity resource for person and
organisation names
• Lexvo.org:
• provides URIs for ISO 639-3 language codes
• http://lexvo.org/id/iso639-3/spa
Example data sets from LLOD
http://linguistics.okfn.org/resources/llod/
=> CLDR will make an excellent addition to LLOD
Linguistic LOD
• CLDR as Linked Data
• empowers third parties to link to your authoritative data
• links are reusable
• LIDER EU project (presumably starting in October) will provide some
support for linked data adopters
• ULI members can join the industry and advisory board
• Workshop “DBpedia & NLP” in Oct, 2013
• http://nlp-dbpedia2013.blogs.aksw.org/
• Creation of free and open benchmarks in RDF
• We could promote CLDR and collect contributions
Collaboration points I
• Personally, I can:
• Join ULI mailing list
• Look out for appropriate data
• Look for opportunities (e.g. synergies with other projects)
• Provide some counseling (e.g. pointers, technology Q&A)
=> this will be done as preparation for the LIDER EU project, CLDR
• Academic collaboration:
• Excellent PhD student topic: Create corpora, interlink and fuse data and
benchmark effectiveness for segmentation
• Provide knowledge transfer (e.g. tutorials, visits)
Collaboration points II
Open Community – All feedback is welcome!
http://slideshare.net/kurzum
Websites:
http://dbpedia.org
http://nlp2rdf.org
http://lod2.eu
Thanks for your attention
Wiktionary Example
The LOD2 EU project produces the LOD2 Stack.
Three requirements to unlock Natural Language Processing (NLP) for the project:
1. NLP tool output is required to be in RDF
2. Scalability (fewer triples, focus on usefulness)
3. Common vocabulary to integrate and use NLP tools
The NLP Interchange Format (NIF) is an RDF/OWL-based format that aims to
achieve interoperability between Natural Language Processing (NLP) tools,
language resources and annotations.
• Version 1.0 published in November 2011
• Version 2.0 is scheduled for completion within 2013
NLP Interchange Format 2.0
NIF Architecture
Addressing Primary Data
NIF 1.0: http://www.w3.org/DesignIssues/LinkedData.html#offset_717_729
NIF 2.0 uses RFC 5147:
http://www.w3.org/DesignIssues/LinkedData.html#char=717,729
User extensions possible:
http://www.w3.org/DesignIssues/LinkedData.html#your_own_scheme
(but you have to link to documentation on how it was created)
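The two URI schemes can be sketched as follows. This is a minimal illustration, assuming the NIF 1.0 fragment has exactly the form shown on the slide and handling only the plain `char=begin,end` form of RFC 5147 (no open-ended ranges, no integrity checks). All function names and the sample document URI are hypothetical.

```python
def nif1_offset_uri(document_uri: str, begin: int, end: int) -> str:
    """NIF 1.0 style fragment, following the pattern shown on the slide."""
    return f"{document_uri}#offset_{begin}_{end}"


def nif2_char_uri(document_uri: str, begin: int, end: int) -> str:
    """NIF 2.0 style fragment using the RFC 5147 character scheme."""
    return f"{document_uri}#char={begin},{end}"


def parse_char_fragment(uri: str) -> tuple[str, int, int]:
    """Split a '#char=begin,end' URI into document URI, begin and end."""
    base, _, fragment = uri.partition("#")
    if not fragment.startswith("char="):
        raise ValueError(f"not a char fragment: {uri!r}")
    begin_text, _, end_text = fragment[len("char="):].partition(",")
    return base, int(begin_text), int(end_text)


def addressed_text(primary_data: str, begin: int, end: int) -> str:
    # RFC 5147 positions count from 0 and sit between characters,
    # which matches Python's half-open slicing.
    return primary_data[begin:end]


text = "Linked Data is simply about using the Web"
uri = nif2_char_uri("http://example.org/doc.txt", 0, 11)  # hypothetical document
_, begin, end = parse_char_fragment(uri)
print(uri, "->", addressed_text(text, begin, end))
```

Because the fragment carries only offsets, the addressed string can be recovered from the primary data alone, without any annotation stored inside the document.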
ULI meeting – 2013/05/28 – Page 26 http://lod2.eu
As a Web Service
curl \
  --data-urlencode prefix="http://prefix.given.by/theClient#" \
  --data-urlencode input="[...]" \
  --data-urlencode source="http://www.w3.org/DesignIssues/LinkedData.html" \
  http://nlp2rdf.lod2.eu/demo/NIFStanfordCore
(the source parameter is optional)
ULI meeting – 2013/05/28 – Page 27 http://lod2.eu
• Tibeto-Burman languages: http://purl.org/olia/tibet.owl#VNst
• Russian TreeTagger:
http://purl.org/olia/russ.owl#partizip_prt_sg_neut_passiv_gen_langform
• German STTS: http://purl.org/olia/stts.owl#VAPP
• English Penn: http://purl.org/olia/penn.owl#VBG
→ all map to http://purl.org/olia/olia.owl#NonFiniteVerb
Ontologies of Linguistic Annotation (OLiA) contain mappings for over 50 tagsets (free
and open, CC-BY)
Vocabulary Module: OLiA
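The practical effect of such mappings is that tags from different tagsets become comparable through a shared reference category. The sketch below hard-codes the four mappings named on the slide in a plain dictionary. That table is an assumption made for illustration: in practice the mappings would be read from the OLiA ontologies, and the function name is hypothetical.

```python
OLIA = "http://purl.org/olia/olia.owl#"

# Tagset-specific tag URI -> OLiA reference category, copied from the slide.
TAG_TO_OLIA = {
    "http://purl.org/olia/tibet.owl#VNst": OLIA + "NonFiniteVerb",
    "http://purl.org/olia/russ.owl#partizip_prt_sg_neut_passiv_gen_langform":
        OLIA + "NonFiniteVerb",
    "http://purl.org/olia/stts.owl#VAPP": OLIA + "NonFiniteVerb",
    "http://purl.org/olia/penn.owl#VBG": OLIA + "NonFiniteVerb",
}


def share_reference_category(tag_a: str, tag_b: str) -> bool:
    """True if both tags are known and map to the same OLiA category."""
    category_a = TAG_TO_OLIA.get(tag_a)
    category_b = TAG_TO_OLIA.get(tag_b)
    return category_a is not None and category_a == category_b


print(share_reference_category(
    "http://purl.org/olia/stts.owl#VAPP",
    "http://purl.org/olia/penn.owl#VBG",
))
```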
ULI meeting – 2013/05/28 – Page 28 http://lod2.eu
• NIF 2.0 tries to be compatible with (Vocabulary Module):
• ITS 2.0
• FISE used in Apache Stanbol (IKS-EU Project)
• LAF/GrAF XML – ISO standard, recently published
• Fragment Identifiers by IETF and W3C
• Lemon ontology from Monnet EU Project
• NERD ontology from EURECOM and LinkedTV EU Project
• XPointer/XPath URI scheme
• Open Annotation
NIF 2.0 - plans
ULI meeting – 2013/05/28 – Page 29 http://lod2.eu
NIF 2.0 :
• NIF is free and open (CC-0 or CC-BY)
• All ontologies will be hosted persistently by Universität Leipzig
• Sign up on the mailing list at http://nlp2rdf.org
• Provide Use Cases, Requirements, Implementations at:
• http://wiki.nlp2rdf.org/wiki/Use_cases#Use_cases
• http://wiki.nlp2rdf.org/wiki/Requirements#Requirements
How you can contribute:
ULI meeting – 2013/05/28 – Page 30 http://lod2.eu
LOD2 Stack
• The project is currently at its halfway point
• Most of the tools are free and open source
• Commercial rollout planned
• Many webinars available
• You can integrate your tool via Debian package
http://lod2.eu
http://stack.lod2.eu/
How you can contribute:

  • 2. Sebastian Hellmann – researcher working on LOD2 EU Project AKSW – Agile Knowledge and the Semantic Web research group in Leipzig - http://aksw.org InfAI – Institute for Applied Informatics - http://infai.org Contents: • Introduction to Linked Data • Linked data close-up: DBpedia data set • Exploitation of free and open data for CLDR • Collaboration points Introduction
  • 3. http://lod-cloud.net
  • 4. http://lod-cloud.net Linked Open Data - All datasets provide open access to individual records via HTTP - Many are free (no payment required, as in royalty-free) - Some are openly licensed, e.g. CC-0 or CC-BY-SA => Open access also applies to published HTML on the WWW, but here the data itself is published unrendered via RDF
  • 5. http://dbpedia.org
  • 6. • DBpedia is a crowd-sourced community effort to extract structured information from Wikipedia and make this information available on the Web. • Allows sophisticated queries against Wikipedia content • Allows links from the different data sets on the Web to Wikipedia data • Data is extracted continuously: http://live.dbpedia.org • Wikidata will be integrated within the next four months via a Google Summer of Code project http://dbpedia.org
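As a rough illustration of what a query against Wikipedia content could look like, the sketch below builds (but does not send) a SPARQL request URL that asks for the German label of the Berlin resource. The endpoint address `http://dbpedia.org/sparql`, the `format` parameter and the helper names are assumptions for illustration; none of them appear on the slides.

```python
from urllib.parse import urlencode

# Assumed public endpoint; the slides only mention http://dbpedia.org.
ENDPOINT = "http://dbpedia.org/sparql"


def label_query(resource_uri: str, lang: str) -> str:
    """Build a SPARQL query for the rdfs:label of a resource in one language."""
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        f"SELECT ?label WHERE {{ <{resource_uri}> rdfs:label ?label . "
        f'FILTER (lang(?label) = "{lang}") }}'
    )


def request_url(query: str) -> str:
    """Encode the query as a GET URL in the style of the SPARQL protocol.

    The "format" parameter is an endpoint-specific convenience and is
    assumed here; an Accept header would be the portable alternative.
    """
    params = {"query": query, "format": "application/sparql-results+json"}
    return ENDPOINT + "?" + urlencode(params)


url = request_url(label_query("http://dbpedia.org/resource/Berlin", "de"))
print(url)
```

Fetching `url` with any HTTP client should return the matching labels, provided the endpoint is reachable and accepts these parameters.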
  • 7. http://dbpedia.org/resource/Berlin First paragraph in more than 20 languages
  • 8. http://dbpedia.org/resource/Berlin Facts from Wikipedia infoboxes
  • 9. http://dbpedia.org/resource/Berlin Several Hierarchical Classifications
  • 10. http://dbpedia.org/resource/Berlin Links Multilingual labels
  • 11. Trend 1: I18n
  • 12. • DBpedia Extraction Framework can be extended to easily extract any data from Wikipedia: https://github.com/dbpedia/extraction-framework • We are using it to extract corpora for NLP • e.g. URI, surrounding text, surface form (sf) • Probabilities: • P(URI|sf): probability that “apple” refers to wikipedia:Apple_Inc. • P(sf|URI): probability that wikipedia:Apple_Inc. appears as “apple” in text Trend 2: DBpedia 4 NLP
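The two probabilities can be estimated by simple counting over (surface form, URI) pairs. The sketch below reads P(URI|sf) as the probability of a URI given a surface form and P(sf|URI) as the probability of a surface form given a URI. The sample pairs and function names are hypothetical stand-ins for extracted link anchors, not actual DBpedia output.

```python
from collections import Counter

# Hypothetical (surface form, URI) pairs standing in for extracted link anchors.
pairs = [
    ("apple", "wikipedia:Apple_Inc."),
    ("apple", "wikipedia:Apple_Inc."),
    ("apple", "wikipedia:Apple"),
    ("Apple Inc.", "wikipedia:Apple_Inc."),
    ("Apple Inc.", "wikipedia:Apple_Inc."),
]

pair_counts = Counter(pairs)
sf_counts = Counter(sf for sf, _ in pairs)
uri_counts = Counter(uri for _, uri in pairs)


def p_uri_given_sf(uri: str, sf: str) -> float:
    """Relative frequency with which surface form `sf` links to `uri`."""
    return pair_counts[(sf, uri)] / sf_counts[sf]


def p_sf_given_uri(sf: str, uri: str) -> float:
    """Relative frequency with which `uri` is written as surface form `sf`."""
    return pair_counts[(sf, uri)] / uri_counts[uri]


print(p_uri_given_sf("wikipedia:Apple_Inc.", "apple"))  # 2 of 3 "apple" anchors
print(p_sf_given_uri("apple", "wikipedia:Apple_Inc."))  # 2 of 4 Apple_Inc. anchors
```

These are plain maximum-likelihood estimates; a real corpus would likely need smoothing and normalisation of surface forms.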
  • 13. • DBpedia is a data dissemination project: • as download for reuse • as Linked Data for interlinking • Corpora will be published via the NLP Interchange RDF Format (NIF) - http://nlp2rdf.org Trend 2: DBpedia 4 NLP
  • 14. DBpedia Live Abbreviation Example Up-to-date gazetteer - The AfD party was founded earlier this year. - Lexical information and statistics could be included
  • 15. Linguistic LOD Cloud
  • 16. • DBpedia • Main version and I18n chapters • http://dbpedia.org/Datasets/NLP • Wiktionary 2 RDF: http://dbpedia.org/Wiktionary • Wortschatz from Uni Leipzig (planned as Linked Data) • http://corpora.informatik.uni-leipzig.de/download.html • JRC Names: http://langtech.jrc.it/JRC-Names.html • JRC-Names is a highly multilingual named entity resource for person and organisation names • Lexvo.org: • provides URIs for ISO 639-3 • http://lexvo.org/id/iso639-3/spa Example data sets from LLOD
  • 17. http://linguistics.okfn.org/resources/llod/ => CLDR will make an excellent addition to LLOD Linguistic LOD
  • 18. • CLDR as Linked Data • empowers third parties to link to your authoritative data • links are reusable • LIDER EU project (presumably starting in October) will provide some support for linked data adopters • ULI members can join the industry and advisory board • Workshop “DBpedia & NLP” in Oct, 2013 • http://nlp-dbpedia2013.blogs.aksw.org/ • Creation of free and open benchmarks in RDF • We could promote CLDR and collect contributions Collaboration points I
  • 19. • Personally, I can: • Join ULI mailing list • Look out for appropriate data • Look for opportunities (e.g. synergies with other projects) • Provide some counseling (e.g. pointers, technology Q&A) => this will be done as preparation for the LIDER EU project, CLDR • Academic collaboration: • Excellent PhD student topic: Create corpora, interlink and fuse data and benchmark effectiveness for segmentation • Provide knowledge transfer (e.g. tutorials, visits) Collaboration points II
  • 20. Open Community – All feedback is welcome! http://slideshare.net/kurzum Websites: http://dbpedia.org http://nlp2rdf.org http://lod2.eu Thanks for your attention
  • 21. Wiktionary Example
  • 22. LOD2 EU Project produces LOD2 Stack. Three requirements to unlock Natural Language Processing (NLP) for the project: 1. NLP tool output is required to be in RDF 2. Scalability (fewer triples, focus on usefulness) 3. Common vocabulary to integrate and use NLP tools The NLP Interchange Format (NIF) is an RDF/OWL-based format that aims to achieve interoperability between Natural Language Processing (NLP) tools, language resources and annotations. • Version 1.0 published in November 2011 • Version 2.0 is scheduled for completion within 2013 NLP Interchange Format 2.0
  • 23. NIF Architecture
  • 24. Addressing Primary Data
  • 25. Addressing Primary Data NIF 1.0: http://www.w3.org/DesignIssues/LinkedData.html#offset_717_729 NIF 2.0 uses RFC 5147: http://www.w3.org/DesignIssues/LinkedData.html#char=717,729 User extensions possible: http://www.w3.org/DesignIssues/LinkedData.html#your_own_scheme (but you have to link to documentation on how it was created)
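The two addressing schemes differ only in the fragment syntax appended to the document URL. A minimal sketch, assuming begin/end are zero-based character positions counted between characters (the reading used by RFC 5147, which matches Python slicing); the function names are illustrative and not part of NIF:

```python
def nif1_uri(document: str, begin: int, end: int) -> str:
    """NIF 1.0 style offset fragment, e.g. ...#offset_717_729."""
    return f"{document}#offset_{begin}_{end}"


def nif2_uri(document: str, begin: int, end: int) -> str:
    """NIF 2.0 style fragment following the RFC 5147 char scheme, e.g. ...#char=717,729."""
    return f"{document}#char={begin},{end}"


def addressed_text(text: str, begin: int, end: int) -> str:
    """Return the substring a (begin, end) pair addresses.

    Assumes offsets count characters, not bytes, so the plain-text
    content must be decoded before slicing.
    """
    return text[begin:end]


doc = "http://www.w3.org/DesignIssues/LinkedData.html"
print(nif1_uri(doc, 717, 729))
print(nif2_uri(doc, 717, 729))
print(addressed_text("Linked Data is simply about using the Web", 0, 11))
```

Both URIs identify the same 12-character span; which text that is depends on how the HTML page was converted to plain text before counting, which the slide does not specify.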
  • 26. As a Web Service curl --data-urlencode prefix="http://prefix.given.by/theClient#" --data-urlencode input="[...]" (--data-urlencode source="http://www.w3.org/DesignIssues/LinkedData.html") http://nlp2rdf.lod2.eu/demo/NIFStanfordCore
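The same call can be prepared from Python's standard library. The sketch below only builds the POST request and does not send it, since the demo service may no longer be online. The parameter names `prefix`, `input` and `source` come from the slide; the sample sentence and the helper name are hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

SERVICE = "http://nlp2rdf.lod2.eu/demo/NIFStanfordCore"


def build_nif_request(text: str, prefix: str, source: Optional[str] = None) -> Request:
    """Build a form-encoded POST request mirroring the curl call on the slide."""
    params = {"prefix": prefix, "input": text}
    if source is not None:
        # Optional parameter, shown in parentheses on the slide.
        params["source"] = source
    body = urlencode(params).encode("utf-8")
    # Supplying `data` makes urllib issue a POST, as curl --data-urlencode does.
    return Request(SERVICE, data=body)


req = build_nif_request(
    "Linked Data is about using the Web.",  # hypothetical sample input
    "http://prefix.given.by/theClient#",
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would submit it; the response format is not described on the slide, so it is left out here.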
  • 27. • Tibeto-Burman languages: http://purl.org/olia/tibet.owl#VNst • Russian TreeTagger: http://purl.org/olia/russ.owl#partizip_prt_sg_neut_passiv_gen_langform • German STTS: http://purl.org/olia/stts.owl#VAPP • English Penn: http://purl.org/olia/penn.owl#VBG → all map to http://purl.org/olia/olia.owl#NonFiniteVerb Ontologies of Linguistic Annotation (OLiA) contain mappings for over 50 tagsets (free and open, CC-BY) Vocabulary Module: OLiA
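The practical effect of such mappings is that tags from different tagsets become comparable through a shared reference category. The sketch below hard-codes the four mappings listed on the slide in a plain dictionary; this is a simplification for illustration, since OLiA itself expresses them as OWL ontologies rather than a lookup table, and the helper name is hypothetical.

```python
OLIA = "http://purl.org/olia/olia.owl#"

# Hand-written lookup containing only the four examples from the slide.
TAG_TO_OLIA = {
    "http://purl.org/olia/tibet.owl#VNst": OLIA + "NonFiniteVerb",
    "http://purl.org/olia/russ.owl#partizip_prt_sg_neut_passiv_gen_langform": OLIA + "NonFiniteVerb",
    "http://purl.org/olia/stts.owl#VAPP": OLIA + "NonFiniteVerb",
    "http://purl.org/olia/penn.owl#VBG": OLIA + "NonFiniteVerb",
}


def same_reference_category(tag_a: str, tag_b: str) -> bool:
    """True if both tags are known and map to the same OLiA reference category."""
    cat_a = TAG_TO_OLIA.get(tag_a)
    cat_b = TAG_TO_OLIA.get(tag_b)
    return cat_a is not None and cat_a == cat_b


print(same_reference_category(
    "http://purl.org/olia/stts.owl#VAPP",
    "http://purl.org/olia/penn.owl#VBG",
))
```

With the real ontologies loaded into an RDF store, the same check would be a class-hierarchy query instead of a dictionary lookup.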
  • 28. • NIF 2.0 tries to be compatible with (Vocabulary Module): • ITS 2.0 • FISE used in Apache Stanbol (IKS-EU Project) • LAF/GrAF XML – ISO standard, recently published • Fragment Identifiers by IETF and W3C • Lemon ontology from Monnet EU Project • NERD ontology from EURECOM and LinkedTV EU Project • XPointer/XPath URI scheme • Open Annotation NIF 2.0 - plans
  • 29. NIF 2.0: • NIF is free and open (CC-0 or CC-BY) • All ontologies will be hosted persistently by Universität Leipzig • Sign up on the mailing list at http://nlp2rdf.org • Provide Use Cases, Requirements, Implementations at: • http://wiki.nlp2rdf.org/wiki/Use_cases#Use_cases • http://wiki.nlp2rdf.org/wiki/Requirements#Requirements How you can contribute:
  • 30. LOD2 Stack • The project is currently at half-time • Most of the tools are free and open source • Commercial rollout planned • Many webinars available • You can integrate your tool via Debian package http://lod2.eu http://stack.lod2.eu/ How you can contribute: