Pascal Christoph



Catalog enrichment à la
Linked Open Data


  SWIB12, Cologne, 2012-12-26
  Workshop: Introduction to Linked Open Data
License
2




    This presentation, including the graphics made by the author, is licensed CC0:
    https://creativecommons.org/about/cc0

    Pictures from http://www.istockphoto.com/ at slides 5, 7, 8 and 41 are licensed CC-BY-ND:
    http://creativecommons.org/licenses/by-nd/3.0/de/

    Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch: http://lod-cloud.net/




     Christoph - Catalog enrichment à la Linked Open Data                            2012-12-26
Overview
3




       Catalog enrichment
          Definition

          Technique

          Matching

          Linking

       Implementation demo
       Conclusion

Catalog enrichment?
Catalog enrichment: definition
6



       Any addition to the records:
          links to full texts, web pages, ...
          subjects, tags, reviews

          covers

          ...

       The source of the addition does not matter
        (users, libraries, companies, ...)
       New features: only indirectly

„INSTANT GRATIFICATION“
Catalog enrichment: methods
10




     Source of the pictures: http://findicons.com/about


       database vs. mashup
methods
11




     Local DB:
     + elaborate combination of the data
     + data can be used for search, browsing and other features
     - continuously high effort to integrate the data

     Dynamic mashup:
     + data always up to date
     + relatively easy to integrate the data
     - needs a (performant) API
     - no search etc.




infrastructure
12




     RDF-based storage with SPARQL endpoint:
        Easy to add data
        Open for use by customers
        Self-describing data
        SPARQL is a (too?) powerful API
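
     As a minimal sketch of what "SPARQL as an API" means for a client: the
     SPARQL Protocol lets a query be sent as the `query` parameter of an HTTP
     GET request. The endpoint URL and the helper name `build_sparql_request`
     below are assumptions for illustration, not taken from the slides. The
     request is only built, not sent.

```python
from urllib.parse import urlencode, urlparse, parse_qs
from urllib.request import Request

# Hypothetical endpoint URL, used only for illustration.
ENDPOINT = "http://example.org/sparql/"


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build (but do not send) a SPARQL Protocol GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"
request = build_sparql_request(ENDPOINT, QUERY)
print(request.full_url)
```

     Passing the request to `urllib.request.urlopen` would execute it against
     a real endpoint. Because any client can send arbitrary queries this way,
     the interface is flexible but hard to limit.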



14




     Source of the picture: http://www.flickr.com/photos/jhsum-commons/4419490136/
lobid.org
15


        triple store with SPARQL endpoint: 4store
        open data from the hbz union catalog
        16 M records <=> 1 B triples
        links to:
• 5,500 Project Gutenberg
• 12,000 DBpedia
• 70,000 b3kat
• 200,000 Dewey Decimal Classification
• 270,000 DNB Nationalbiografie
• 420,000 OCLC
• 700,000 ZDB
• 800,000 LOC ISO 639-2
• 1,250,000 Open Library
• 22,000,000 GND authority file
• 32,000,000 lobid-organisations


Software
16



        Silk
        Culturegraph
        Google Refine
        Hadoop
        ...




Matching algorithms
17



        Depend on the data
           Interesting data reside „elsewhere“
           => other cataloging rules

          DBpedia example:
           Creator, ISBN etc. are often missing => only the title can be matched
           Constraints:
               German DBpedia
               category:Literarisches_Werk,
               category:Lexikon,_Enzyklopädie
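
     A title-only match with a category constraint could be sketched as
     follows. This is an illustration under assumptions, not the matching
     actually used: exact comparison of normalized titles stands in for
     whatever similarity measure was configured, the helper names
     `normalize_title` and `match_by_title` are hypothetical, and all records
     and URIs are made-up sample data.

```python
import unicodedata

# Categories taken from the slide; all records below are made-up sample data.
ALLOWED_CATEGORIES = {"Literarisches_Werk", "Lexikon,_Enzyklopädie"}


def normalize_title(title: str) -> str:
    """Lower-case, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", title.casefold())
    kept = [
        c if c.isalnum() else " "
        for c in decomposed
        if not unicodedata.combining(c)
    ]
    return " ".join("".join(kept).split())


def match_by_title(catalog_records, dbpedia_resources):
    """Return (record id, resource URI) pairs whose normalized titles are equal."""
    index = {}
    for resource in dbpedia_resources:
        # Constraint: only resources in one of the allowed categories.
        if ALLOWED_CATEGORIES & set(resource["categories"]):
            key = normalize_title(resource["title"])
            index.setdefault(key, []).append(resource["uri"])
    return [
        (record["id"], uri)
        for record in catalog_records
        for uri in index.get(normalize_title(record["title"]), [])
    ]


catalog = [
    {"id": "rec1", "title": "Die heilige Johanna der Schlachthöfe"},
    {"id": "rec2", "title": "Faust"},
]
dbpedia = [
    {"uri": "http://example.org/dbpedia/Die_heilige_Johanna_der_Schlachthöfe",
     "title": "Die Heilige Johanna der Schlachthöfe",
     "categories": ["Literarisches_Werk"]},
    {"uri": "http://example.org/dbpedia/Faust_(Band)",
     "title": "Faust",
     "categories": ["Musikgruppe"]},
]
links = match_by_title(catalog, dbpedia)
print(links)
```

     The category constraint keeps the record titled "Faust" from being
     linked to a resource of the same name that is not a literary work.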

Problem: disambiguation
18



        Matching is too fuzzy
        Post-processing:
          Allow only bundles whose records have the same creator
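
     The post-processing rule can be sketched like this, assuming a bundle
     is the set of catalog records linked to the same target resource. The
     function name `bundles_with_single_creator` and the sample data are
     hypothetical.

```python
from collections import defaultdict


def bundles_with_single_creator(links, creator_of):
    """Group links by target and keep bundles whose records share one creator."""
    bundles = defaultdict(list)
    for record_id, target_uri in links:
        bundles[target_uri].append(record_id)
    kept = {}
    for target_uri, record_ids in bundles.items():
        creators = {creator_of.get(record_id) for record_id in record_ids}
        # Keep the bundle only if every record names the same, known creator.
        if len(creators) == 1 and None not in creators:
            kept[target_uri] = record_ids
    return kept


# Made-up sample data.
sample_links = [
    ("rec1", "ex:Work_A"), ("rec2", "ex:Work_A"),
    ("rec3", "ex:Work_B"), ("rec4", "ex:Work_B"),
]
sample_creators = {
    "rec1": "Brecht, Bertolt",
    "rec2": "Brecht, Bertolt",
    "rec3": "Goethe, Johann Wolfgang von",
    "rec4": "Marlowe, Christopher",
}
sure = bundles_with_single_creator(sample_links, sample_creators)
print(sure)
```

     Bundles with mixed or missing creators are discarded as a whole, which
     trades recall for precision.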




Bundle having the same creator
19




Bundle having different creators
20




LOW-HANGING FRUIT
Kai Schreiber, „Reiche Ernte” 7. August 2005 via Flickr CC BY-SA 2.0
triplification
23



        Find predicates or mint them yourself
           rdrel:workManifested

           =>      Triple:
             <lobid-resource> <rdrel:workManifested> <dbpedia-resource>
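
     A link of this shape can be written out as one N-Triples line. The
     sketch below assumes that the `rdrel` prefix expands to the RDA
     relationships namespace shown in the code (verify this against the
     vocabulary actually used). Both resource URIs are hypothetical
     placeholders.

```python
# Assumed namespace for the "rdrel" prefix; verify it against the vocabulary in use.
RDREL = "http://rdvocab.info/RDARelationshipsWEMI/"


def to_ntriple(subject: str, predicate: str, obj: str) -> str:
    """Serialize one triple of three IRIs as an N-Triples line."""
    return f"<{subject}> <{predicate}> <{obj}> ."


line = to_ntriple(
    "http://example.org/resource/rec1",      # hypothetical lobid resource
    RDREL + "workManifested",
    "http://example.org/dbpedia/Some_Work",  # hypothetical DBpedia resource
)
print(line)
```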




indexing
24



        What is the license?
        Import triples into the SPARQL endpoint
          A separate „named graph“ has advantages:
               Easily removable/changeable
               Provenance is stored
               Query specific named graphs
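
     The three advantages correspond to standard SPARQL 1.1 forms, sketched
     here as query strings. The graph name and dump URL are hypothetical, and
     whether a given triple store accepts each operation has to be checked
     against its documentation.

```python
# Hypothetical graph name and dump location, for illustration only.
GRAPH = "http://example.org/graph/dbpedia-links"
DUMP = "http://example.org/dumps/dbpedia-links.nt"


def load_into_graph(source: str, graph: str) -> str:
    """SPARQL 1.1 Update: import a file into its own named graph."""
    return f"LOAD <{source}> INTO GRAPH <{graph}>"


def drop_graph(graph: str) -> str:
    """SPARQL 1.1 Update: remove the whole link set again in one step."""
    return f"DROP GRAPH <{graph}>"


def select_from_graph(graph: str, limit: int = 10) -> str:
    """SPARQL query restricted to the triples of one named graph."""
    return (
        f"SELECT ?s ?p ?o WHERE {{ GRAPH <{graph}> {{ ?s ?p ?o }} }} "
        f"LIMIT {limit}"
    )


for statement in (load_into_graph(DUMP, GRAPH),
                  select_from_graph(GRAPH),
                  drop_graph(GRAPH)):
    print(statement)
```

     Since the graph name identifies the imported link set, it also records
     where the triples came from.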




Named Graphs
25




What we achieved
26



        12.000 „sure“ links to 4.000 DBpedia
         resources => 4.000 new „Work“-levels (21.000
         discared links)
          average size of a bundle: 3

        links to freebase: 3.000
        0.1 % enrichment




What we achieved
27



        5,500 links to 400 Project Gutenberg
         resources (full texts in different formats)
          => 0.05% enrichment



        1,200,000 links to the work level of the Open
         Library
          => 12.5% enrichment




What we achieved
28




     Sir Tim Berners-Lee:




                                                   Source of picture: http://www.w3.org/DesignIssues/LinkedData.html




LOW-HANGING FRUIT
Kai Schreiber, „Reiche Ernte” 7. August 2005 via Flickr CC BY-SA 2.0
What we achieved
30




                                    DBpedia example:

                  „Die Heilige Johanna der Schlachthöfe“




What we achieved
34




                              Open Library example:

                             „With reference to reference“




Linking Example: LODUM
36




Integration into the catalog
37



        What is allowed?
        What should be integrated, and what not?
        Human-readable presentation of the
         links/URIs
        (Some) data should be indexed locally (e.g. to
         make it searchable)
        ...


Implementation demo
39




Implementation demo
40




43




     Source of the picture: http://www.flickr.com/photos/library_of_congress/4037490394/
conclusion
44




     Everything that's possible with LOD could also
     be achieved without LOD.


     It's just easier with LOD.




LOD - Definition „linked“
45

     Ad astra?
     Ad data!

     To boldly go where no data has gone before.

     Source of the picture: http://hubblesite.org/gallery/album/star/pr2006050d
Open source
46




     Culturegraph    http://sourceforge.net/projects/culturegraph/
     4store          http://4store.org/
     lobid           https://github.com/lobid/
     Silk            https://www.assembla.com/spaces/silk


Thank you!
47


          Pascal Christoph
          christoph@hbz-nrw.de

          semweb@hbz-nrw.de
List of references
48
- KiM: Empfehlungen zur Öffnung bibliothekarischer Daten
https://wiki.d-nb.de/pages/viewpage.action?pageId=45419980
- Till Kreutzer (2010): Open Data – Freigabe von Daten aus Bibliothekskatalogen
http://www.hbz-nrw.de/dokumentencenter/veroeffentlichungen/open-data-leitfaden.pdf
- Adrian Pohl (2010): Open Data im hbz-Verbund. Erschienen in: ProLibris. 3. Preprint:
http://www.hbz-nrw.de/dokumentencenter/produkte/lod/aktuell/pohl_2010_open-data.pdf
- Tim Berners-Lee's talk on Open Data (2010): http://www.youtube.com/watch?v=3YcZ3Zqk0a8
- Jansen / Christoph: Dynamische Kataloganreicherung auf Basis von Linked Open Data
http://de.slideshare.net/h_jansen/dynamische-kataloganreicherung-auf-basis-von-linked-open-data
- Blog post: First results using SILK to link to DBpedia
https://wiki1.hbz-nrw.de/display/SEM/2012/05/03/First+results+using+SILK+to+link+to+DBpedia
- Blog post: 1.2 M links to Open Library
https://wiki1.hbz-nrw.de/display/SEM/2012/05/23/1.2+M+links+to+Open+Library
- Oliver Flimm (2010): LOD und die Open Library http://de.slideshare.net/flimm/lod-openlibrary20100512
- Directory of data „thedatahub“ aka CKAN: http://www.thedatahub.org/
- 49 bibliographic data sources as LOD: http://thedatahub.org/group/bibliographic?tags=lod

Weitere ähnliche Inhalte

Was ist angesagt?

Wed roman tut_open_datapub
Wed roman tut_open_datapubWed roman tut_open_datapub
Wed roman tut_open_datapub
eswcsummerschool
 
Linguistic Linked Open Data, Challenges, Approaches, Future Work
Linguistic Linked Open Data, Challenges, Approaches, Future WorkLinguistic Linked Open Data, Challenges, Approaches, Future Work
Linguistic Linked Open Data, Challenges, Approaches, Future Work
Sebastian Hellmann
 

Was ist angesagt? (20)

LOD2 Webinar Series FOX
LOD2 Webinar Series FOXLOD2 Webinar Series FOX
LOD2 Webinar Series FOX
 
LOD2 Webinar Series Classification and Quality Analysis with DL Learner and ORE
LOD2 Webinar Series Classification and Quality Analysis with DL Learner and ORELOD2 Webinar Series Classification and Quality Analysis with DL Learner and ORE
LOD2 Webinar Series Classification and Quality Analysis with DL Learner and ORE
 
Wed roman tut_open_datapub
Wed roman tut_open_datapubWed roman tut_open_datapub
Wed roman tut_open_datapub
 
OpenAIRE and the Case of Irish Repositories
OpenAIRE and the Case of Irish RepositoriesOpenAIRE and the Case of Irish Repositories
OpenAIRE and the Case of Irish Repositories
 
Linguistic Linked Open Data, Challenges, Approaches, Future Work
Linguistic Linked Open Data, Challenges, Approaches, Future WorkLinguistic Linked Open Data, Challenges, Approaches, Future Work
Linguistic Linked Open Data, Challenges, Approaches, Future Work
 
LOD2 Webinar Series: D2R and Sparqlify
LOD2 Webinar Series: D2R and SparqlifyLOD2 Webinar Series: D2R and Sparqlify
LOD2 Webinar Series: D2R and Sparqlify
 
DBpedia Tutorial - Feb 2015, Dublin
DBpedia Tutorial - Feb 2015, DublinDBpedia Tutorial - Feb 2015, Dublin
DBpedia Tutorial - Feb 2015, Dublin
 
ROI in Linking Content to CRM by Applying the Linked Data Stack
ROI in Linking Content to CRM by Applying the Linked Data StackROI in Linking Content to CRM by Applying the Linked Data Stack
ROI in Linking Content to CRM by Applying the Linked Data Stack
 
Linked data life cycles
Linked data life cyclesLinked data life cycles
Linked data life cycles
 
Data Integration And Visualization
Data Integration And VisualizationData Integration And Visualization
Data Integration And Visualization
 
Migrating from HDF5 1.6 to 1.8
Migrating from HDF5 1.6 to 1.8Migrating from HDF5 1.6 to 1.8
Migrating from HDF5 1.6 to 1.8
 
Introduction to HDF5
Introduction to HDF5Introduction to HDF5
Introduction to HDF5
 
Data 2 Documents: Modular and Distributive Content Management in RDF
Data 2 Documents: Modular and Distributive Content Management in RDFData 2 Documents: Modular and Distributive Content Management in RDF
Data 2 Documents: Modular and Distributive Content Management in RDF
 
Ivan Herman - Semantic Web Activities @ W3C
Ivan Herman - Semantic Web Activities @ W3CIvan Herman - Semantic Web Activities @ W3C
Ivan Herman - Semantic Web Activities @ W3C
 
Marklogic and the Linked Data Connection
Marklogic and the Linked Data ConnectionMarklogic and the Linked Data Connection
Marklogic and the Linked Data Connection
 
Nicoletta Fornara and Fabio Marfia | Modeling and Enforcing Access Control Ob...
Nicoletta Fornara and Fabio Marfia | Modeling and Enforcing Access Control Ob...Nicoletta Fornara and Fabio Marfia | Modeling and Enforcing Access Control Ob...
Nicoletta Fornara and Fabio Marfia | Modeling and Enforcing Access Control Ob...
 
Scalability 09262012
Scalability 09262012Scalability 09262012
Scalability 09262012
 
Catalog enrichment: importing Dewey Decimal Classification from external sour...
Catalog enrichment: importing Dewey Decimal Classification from external sour...Catalog enrichment: importing Dewey Decimal Classification from external sour...
Catalog enrichment: importing Dewey Decimal Classification from external sour...
 
LDP-DL: A language to define the design of Linked Data Platforms
LDP-DL: A language to define the design of Linked Data PlatformsLDP-DL: A language to define the design of Linked Data Platforms
LDP-DL: A language to define the design of Linked Data Platforms
 
Scaling up Linked Data
Scaling up Linked DataScaling up Linked Data
Scaling up Linked Data
 

Ähnlich wie Swib12 workshop lod_beginners

Soren Auer - LOD2 - creating knowledge out of Interlinked Data
Soren Auer - LOD2 - creating knowledge out of Interlinked DataSoren Auer - LOD2 - creating knowledge out of Interlinked Data
Soren Auer - LOD2 - creating knowledge out of Interlinked Data
Open City Foundation
 

Ähnlich wie Swib12 workshop lod_beginners (20)

Linked Open Library Data @hbz
Linked Open Library Data @hbzLinked Open Library Data @hbz
Linked Open Library Data @hbz
 
Open Educational Data - Datasets and APIs (Athens Green Hackathon 2012)
Open Educational Data - Datasets and APIs (Athens Green Hackathon 2012)Open Educational Data - Datasets and APIs (Athens Green Hackathon 2012)
Open Educational Data - Datasets and APIs (Athens Green Hackathon 2012)
 
Introduction to LDL 2012
Introduction to LDL 2012Introduction to LDL 2012
Introduction to LDL 2012
 
LODLAM Landscape NOTES
LODLAM Landscape NOTESLODLAM Landscape NOTES
LODLAM Landscape NOTES
 
Integrating Globus into LRZ's Data Science Storage Service
Integrating Globus into LRZ's Data Science Storage ServiceIntegrating Globus into LRZ's Data Science Storage Service
Integrating Globus into LRZ's Data Science Storage Service
 
Improving the Performance of the DL-Learner SPARQL Component for Semantic We...
Improving the Performance of the  DL-Learner SPARQL Component for Semantic We...Improving the Performance of the  DL-Learner SPARQL Component for Semantic We...
Improving the Performance of the DL-Learner SPARQL Component for Semantic We...
 
New Adventures in RDF2vec
New Adventures in RDF2vecNew Adventures in RDF2vec
New Adventures in RDF2vec
 
Linked Data Publishing with Drupal (SWIB12 Lightning Talk)
Linked Data Publishing with Drupal (SWIB12 Lightning Talk)Linked Data Publishing with Drupal (SWIB12 Lightning Talk)
Linked Data Publishing with Drupal (SWIB12 Lightning Talk)
 
OpenAIRE and the case of Irish Repositories, by Jochen Schirrwagen (RIAN Work...
OpenAIRE and the case of Irish Repositories, by Jochen Schirrwagen (RIAN Work...OpenAIRE and the case of Irish Repositories, by Jochen Schirrwagen (RIAN Work...
OpenAIRE and the case of Irish Repositories, by Jochen Schirrwagen (RIAN Work...
 
Simplified minimalistic workflows for the publication of Linked Open Data
Simplified minimalistic workflows for the publication of Linked Open DataSimplified minimalistic workflows for the publication of Linked Open Data
Simplified minimalistic workflows for the publication of Linked Open Data
 
Simplified minimalistic workflows for the publication of Linked Open Data
Simplified minimalistic workflows for the publication of Linked Open DataSimplified minimalistic workflows for the publication of Linked Open Data
Simplified minimalistic workflows for the publication of Linked Open Data
 
Soren Auer - LOD2 - creating knowledge out of Interlinked Data
Soren Auer - LOD2 - creating knowledge out of Interlinked DataSoren Auer - LOD2 - creating knowledge out of Interlinked Data
Soren Auer - LOD2 - creating knowledge out of Interlinked Data
 
Charper.lawdi.20120601
Charper.lawdi.20120601Charper.lawdi.20120601
Charper.lawdi.20120601
 
Introduction to NoSQL
Introduction to NoSQLIntroduction to NoSQL
Introduction to NoSQL
 
DBpedia/association Introduction The Hague 12.2.2016
DBpedia/association Introduction The Hague 12.2.2016DBpedia/association Introduction The Hague 12.2.2016
DBpedia/association Introduction The Hague 12.2.2016
 
LOD2 Webinar Series: Zemanta / Open refine
LOD2 Webinar Series: Zemanta / Open refine LOD2 Webinar Series: Zemanta / Open refine
LOD2 Webinar Series: Zemanta / Open refine
 
Linked data activities in the Deutsche Nationalbibliothek
Linked data activities in the Deutsche NationalbibliothekLinked data activities in the Deutsche Nationalbibliothek
Linked data activities in the Deutsche Nationalbibliothek
 
H2o tutorial
H2o tutorialH2o tutorial
H2o tutorial
 
OCLC Linked Data Roundtable event IFLA 2012
OCLC Linked Data Roundtable event IFLA 2012OCLC Linked Data Roundtable event IFLA 2012
OCLC Linked Data Roundtable event IFLA 2012
 
NoTube: Models & Semantics
NoTube: Models & SemanticsNoTube: Models & Semantics
NoTube: Models & Semantics
 

Kürzlich hochgeladen

Kürzlich hochgeladen (20)

Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 

Swib12 workshop lod_beginners

  • 1. Pascal Christoph Catalog enrichment à la Linked Open Data SWIB12, Cologne, 2012-12-26 Workshop: Introduction to Linked Open Data
  • 2. License 2 This presentation – inclusive the graphics made by the author, are licensed CC0: https://creativecommons.org/about/cc0 Pictures from http://www.istockphoto.com/ at slides 5, 7, 8 and 41 are licensed CC-BY-ND: http://creativecommons.org/licenses/by-nd/3.0/de/ Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod- cloud.net/ Christoph - Catalog enrichment à la Linked Open Data 2012-12-26
  • 3. Overview 3  Catalog enrichment  Definition  Technique  Matching  Linking  Implementation demo  Conclusion Christoph - Catalog enrichment à la Linked Open Data 2012-12-26
  • 4. Overview 4  Catalog enrichment  Definition  Technique  Matching  Linking  Implementation demo  Conclusion Christoph - Catalog enrichment à la Linked Open Data 2012-12-26
  • 6. Catalog enrichment: definition 6  Any addendum to the records:  linksto fulltexts/webpages/...  subjects, tags, recensions  covers  ...  The source of the addendum does not matter (users, libraries, companies...)  New features: only indirect Christoph - Catalog enrichment à à la Linked Open Data Kataloganreicherung la Linked Open Data 24.05.2012 2012-12-26 2012-09-27
  • 8.
  • 9. Overview: Catalog enrichment (Definition, Technique, Matching, Linking); Implementation demo; Conclusion
  • 10. Catalog enrichment: methods. Database vs. mashup. Source of the pictures: http://findicons.com/about
  • 11. Methods. Local DB: (+) elaborate combination of the data; (+) data can be used for search, browsing and other features; (-) continuously high effort to integrate the data. Dynamic mashup: (+) data always up to date; (+) relatively easy to integrate the data; (-) needs a (performant) API; (-) no search etc.
  • 12. Infrastructure. RDF-based storage with a SPARQL endpoint: easy to add data; open for use by customers; self-describing data; SPARQL is a (too?) powerful API.
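To make "SPARQL as an API" concrete, here is a minimal sketch of how a client could build a query request against such an endpoint. The endpoint URL and the resource URI are hypothetical placeholders, not taken from the slides; only the `query` parameter name comes from the SPARQL Protocol. The sketch builds the URL and does not send it.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint URL, used for illustration only.
ENDPOINT = "http://example.org/sparql"


def describe_query(resource_uri, limit=10):
    """Return a SELECT query listing predicates and objects of one resource."""
    return "SELECT ?p ?o WHERE { <%s> ?p ?o } LIMIT %d" % (resource_uri, limit)


def build_request_url(endpoint, query):
    """Encode the query as the `query` parameter of an HTTP GET request."""
    return endpoint + "?" + urlencode({"query": query})


query = describe_query("http://example.org/resource/rec1")
url = build_request_url(ENDPOINT, query)
```

Any HTTP client could then fetch `url`; the point is that the same generic interface serves every kind of question, which is what makes it powerful and also hard to restrict.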
  • 13. Overview: Catalog enrichment (Definition, Technique, Matching, Linking); Implementation demo; Conclusion
  • 14. Source of the picture: http://www.flickr.com/photos/jhsum-commons/4419490136/
  • 15. lobid.org: triple store with SPARQL endpoint: 4store; open data from the hbz union catalog; 16 million records <=> 1 billion triples; links to: 5,500 Project Gutenberg; 1,250,000 Open Library; 12,000 DBpedia; 700,000 ZDB; 70,000 b3kat; 800,000 LOC ISO 639-2; 200,000 Dewey Decimal Classification; 22,000,000 GND authority file; 270,000 DNB Nationalbiografie; 32,000,000 lobid-organisations; 420,000 OCLC
  • 16. Software: Silk, Culturegraph, Google Refine, Hadoop, ...
  • 17. Matching algorithms: depend on the data. Interesting data resides "elsewhere" => other cataloging rules. DBpedia example: creator, ISBN etc. are often missing => only the title is available for matching. Constraints: German DBpedia; category:Literarisches_Werk, category:Lexikon,_Enzyklopädie
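A title-only match of this kind can be sketched as follows. This is an illustration under stated assumptions, not the configuration used with Silk in the talk: the normalization rules, the function names and the example identifiers are all hypothetical.

```python
import re
import unicodedata


def normalize_title(title):
    """Lower-case, drop punctuation and collapse whitespace."""
    text = unicodedata.normalize("NFKC", title).casefold()
    text = re.sub(r"[^\w\s]", " ", text)
    return " ".join(text.split())


def match_by_title(records, candidates):
    """Link catalog records to candidate resources with an equal normalized title.

    records:    {record_id: title}   (hypothetical catalog extract)
    candidates: {resource_uri: label} (hypothetical, pre-filtered by category)
    """
    index = {}
    for uri, label in candidates.items():
        index.setdefault(normalize_title(label), []).append(uri)
    links = []
    for record_id, title in records.items():
        for uri in index.get(normalize_title(title), []):
            links.append((record_id, uri))
    return links


records = {"rec1": "Die heilige Johanna der Schlachthöfe.", "rec2": "Faust"}
candidates = {"http://example.org/work/1": "Die Heilige Johanna der Schlachthöfe"}
links = match_by_title(records, candidates)
```

Restricting `candidates` to a few categories beforehand, as the slide suggests, keeps a match on the title alone from hitting unrelated resources that happen to share a name.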
  • 18. Problem: disambiguation. Matching is too fuzzy. Post-processing: allow only bundles with the same creator.
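The post-processing step can be sketched like this, assuming a "bundle" means all catalog records linked to the same target resource. The data structures, the function names and the rule for records without a creator are assumptions made for the example.

```python
from collections import defaultdict


def bundle_links(links):
    """Group (record_id, target_uri) links into bundles keyed by target."""
    bundles = defaultdict(list)
    for record_id, target in links:
        bundles[target].append(record_id)
    return dict(bundles)


def keep_same_creator(bundles, creators):
    """Keep only bundles whose records all share one known creator.

    creators: {record_id: creator name}; records without a creator
    make their bundle ambiguous, so that bundle is discarded.
    """
    kept = {}
    for target, record_ids in bundles.items():
        names = {creators.get(record_id) for record_id in record_ids}
        if len(names) == 1 and None not in names:
            kept[target] = record_ids
    return kept


links = [("r1", "T1"), ("r2", "T1"), ("r3", "T2"), ("r4", "T2")]
creators = {"r1": "Brecht, Bertolt", "r2": "Brecht, Bertolt", "r3": "A", "r4": "B"}
sure = keep_same_creator(bundle_links(links), creators)
```

Here the bundle for `T1` survives, while `T2` is dropped because its records name different creators.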
  • 19. Bundle having the same creator
  • 20. Bundle having different creators
  • 21. LOW-HANGING FRUIT. Picture: Kai Schreiber, "Reiche Ernte", 7 August 2005, via Flickr, CC BY-SA 2.0
  • 22. Overview: Catalog enrichment (Definition, Technique, Matching, Linking); Implementation demo; Conclusion
  • 23. Triplification: find predicates or mint them yourself, e.g. rdrel:workManifested => triple: <lobid-resource> <rdrel:workManifested> <dbpedia-resource>
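As a sketch of the triplification step, the following turns one link into an N-Triples line. The expansion of the `rdrel:` prefix is an assumption (the slide gives only the prefixed name), and the subject and object URIs are hypothetical placeholders.

```python
# Assumed expansion of the rdrel: prefix; the slide shows only the prefixed name.
RDREL = "http://rdvocab.info/RDARelationshipsWEMI/"


def to_ntriple(subject, predicate, obj):
    """Serialize one URI-to-URI statement as an N-Triples line."""
    for uri in (subject, predicate, obj):
        # Reject characters that may not appear unescaped inside <...>.
        if any(ch in uri for ch in '<>" \n'):
            raise ValueError("not a plain URI: %r" % uri)
    return "<%s> <%s> <%s> ." % (subject, predicate, obj)


line = to_ntriple(
    "http://example.org/lobid/rec1",      # hypothetical lobid resource
    RDREL + "workManifested",
    "http://example.org/dbpedia/Work1",   # hypothetical DBpedia resource
)
```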
  • 24. Indexing: What is the license? Import the triples into the SPARQL endpoint. A dedicated "named graph" has advantages: easily removable/changeable; provenance is stored; specific named graphs can be queried.
  • 25. Named Graphs
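The advantages of a dedicated named graph can be sketched with three small helpers: writing links as N-Quads (the fourth term names the graph), querying only that graph, and removing it again. The graph and resource URIs are hypothetical; `GRAPH` and `DROP GRAPH` are standard SPARQL 1.1 syntax, and whether a particular store supports the update form is not stated in the slides.

```python
def to_nquad(subject, predicate, obj, graph):
    """Serialize one statement as an N-Quads line inside a named graph."""
    return "<%s> <%s> <%s> <%s> ." % (subject, predicate, obj, graph)


def select_from_graph(graph, limit=10):
    """SPARQL query restricted to a single named graph."""
    return "SELECT ?s ?p ?o WHERE { GRAPH <%s> { ?s ?p ?o } } LIMIT %d" % (graph, limit)


def drop_graph(graph):
    """SPARQL 1.1 Update request that removes the whole named graph."""
    return "DROP GRAPH <%s>" % graph


GRAPH = "http://example.org/graph/dbpedia-links"  # hypothetical graph name
quad = to_nquad("http://example.org/a", "http://example.org/p", "http://example.org/b", GRAPH)
```

Because every link set lives in its own graph, a faulty matching run can be withdrawn with one `drop_graph` request without touching the rest of the store, and the graph name itself records where the links came from.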
  • 26. What we achieved: 12,000 "sure" links to 4,000 DBpedia resources => 4,000 new "Work" levels (21,000 discarded links); average size of a bundle: 3; links to Freebase: 3,000; 0.1% enrichment
  • 27. What we achieved: 5,500 links to 400 Project Gutenberg resources (full texts in different formats) => 0.05% enrichment; 1,200,000 links to the work level of the Open Library => 12.5% enrichment
  • 28. What we achieved. Sir Tim Berners-Lee. Source of the picture: http://www.w3.org/DesignIssues/LinkedData.html
  • 29. LOW-HANGING FRUIT. Picture: Kai Schreiber, "Reiche Ernte", 7 August 2005, via Flickr, CC BY-SA 2.0
  • 30. What we achieved. DBpedia example: "Die Heilige Johanna der Schlachthöfe"
  • 31.
  • 32.
  • 33.
  • 34. What we achieved. Open Library example: "With reference to reference"
  • 35.
  • 36. Linking example: LODUM
  • 37. Integration into the catalog: What is allowed? What should be integrated, and what not? Human-readable presentation of the links/URIs; (some) data should be indexed locally (e.g. to make it searchable); ...
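For the human-readable presentation of links, one simple fallback is to derive a display label from the URI itself when no label has been fetched or indexed locally. This is a minimal sketch with a hypothetical function name; a real integration would normally prefer the label supplied by the linked data set.

```python
from urllib.parse import urlsplit, unquote


def label_from_uri(uri):
    """Derive a readable label from the fragment or last path segment of a URI."""
    parts = urlsplit(uri)
    segment = parts.fragment or parts.path.rstrip("/").rsplit("/", 1)[-1]
    # Percent-decode and turn wiki-style underscores into spaces;
    # fall back to the full URI if nothing usable is left.
    return unquote(segment).replace("_", " ") or uri


label = label_from_uri(
    "http://de.dbpedia.org/resource/Die_heilige_Johanna_der_Schlachth%C3%B6fe"
)
```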
  • 38. Overview: Catalog enrichment (Definition, Technique, Matching, Linking); Implementation demo; Conclusion
  • 39. Implementation demo
  • 40. Implementation demo
  • 41. Overview: Catalog enrichment (Definition, Technique, Matching, Linking); Implementation demo; Conclusion
  • 42.
  • 43. Source of the picture: http://www.flickr.com/photos/library_of_congress/4037490394/
  • 44. Conclusion: Everything that's possible with LOD could also be achieved without LOD. It's just easier with LOD.
  • 45. LOD - Definition "linked": Ad astra? Ad data! To boldly go where no data has gone before. Source of the picture: http://hubblesite.org/gallery/album/star/pr2006050d
  • 46. Open source: Culturegraph: http://sourceforge.net/projects/culturegraph/; 4store: http://4store.org/; lobid: https://github.com/lobid/; Silk: https://www.assembla.com/spaces/silk
  • 47. Thank you! Pascal Christoph, christoph@hbz-nrw.de, semweb@hbz-nrw.de
  • 48. List of references: - KiM: Empfehlungen zur Öffnung bibliothekarischer Daten: https://wiki.d-nb.de/pages/viewpage.action?pageId=45419980 - Till Kreutzer (2010): Open Data – Freigabe von Daten aus Bibliothekskatalogen: http://www.hbz-nrw.de/dokumentencenter/veroeffentlichungen/open-data-leitfaden.pdf - Adrian Pohl (2010): Open Data im hbz-Verbund. Published in: ProLibris. 3. Preprint: http://www.hbz-nrw.de/dokumentencenter/produkte/lod/aktuell/pohl_2010_open-data.pdf - Tim Berners-Lee's talk on Open Data (2010): http://www.youtube.com/watch?v=3YcZ3Zqk0a8 - Jansen / Christoph: Dynamische Kataloganreicherung auf Basis von Linked Open Data: http://de.slideshare.net/h_jansen/dynamische-kataloganreicherung-auf-basis-von-linked-open-data - Blog post: First results using SILK to link to DBpedia: https://wiki1.hbz-nrw.de/display/SEM/2012/05/03/First+results+using+SILK+to+link+to+DBpedia - Blog post: 1.2 M links to Open Library: https://wiki1.hbz-nrw.de/display/SEM/2012/05/23/1.2+M+links+to+Open+Library - Oliver Flimm (2010): LOD und die Open Library: http://de.slideshare.net/flimm/lod-openlibrary20100512 - Directory of data "thedatahub" aka CKAN: http://www.thedatahub.org/ - 49 bibliographic data sources as LOD: http://thedatahub.org/group/bibliographic?tags=lod