A Unified Approach for Representing Metametadata




                  A Unified Approach for
                Representing Metametadata



                                   Kai Eckert
                             University of Mannheim




                                  Dublin Core 2009
                                 Seoul, South Korea
                                  October 14th 2009


Kai Eckert       Dublin Core 2009, Seoul, South Korea, October 14th 2009   1/22



                                       People

     Kai Eckert
     Mannheim University
     kai@informatik.uni-mannheim.de



     Magnus Pfeffer
     Mannheim University Library
     pfeffer@bib.uni-mannheim.de



     Heiner Stuckenschmidt
     Mannheim University
     heiner@informatik.uni-mannheim.de







              What is this presentation about?


  ● Proposal for "statements about statements", or "metametadata", or
    "provenance on statement level".
  ● That was done before!
  ● Implementation of metametadata by means of RDF Reification.
  ● That was done before! (That's what Reification was made for.)






                              Anything new?


  ● Practical examples based on two scenarios and several use cases.
  ● Both scenarios show (again) the need for metametadata.
  ● Easy implementation based on RDF and SPARQL.
  ● Standardization?







                                   RDF Reification
   ● RDF supports statements about statements by means of Reification,
     literally "objectification" (actually a "subjectification"...).
   ● "The book is written by Goethe" is said by Kai.
     Here "The book" is the subject, "is written by" the predicate,
     and "Goethe" the object of the statement that Kai makes.
   ● How it is done in RDF:
             ex:someID   rdf:type        rdf:Statement .
             ex:someID   rdf:subject     "The book" .
             ex:someID   rdf:predicate   ex:isWrittenBy .
             ex:someID   rdf:object      "Goethe" .
             ex:someID   ex:isSaidBy     "Kai" .
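
The reification pattern above can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: triples are modelled as plain tuples of strings, and the helper name `reify` is an assumption made for this example.

```python
# Minimal sketch: a triple is a (subject, predicate, object) tuple of strings.
# The helper name `reify` is hypothetical, chosen for this illustration.

def reify(statement_id, triple):
    """Return the four reification triples that describe `triple`."""
    subject, predicate, obj = triple
    return [
        (statement_id, "rdf:type", "rdf:Statement"),
        (statement_id, "rdf:subject", subject),
        (statement_id, "rdf:predicate", predicate),
        (statement_id, "rdf:object", obj),
    ]

# "The book is written by Goethe" is said by Kai.
graph = reify("ex:someID", ('"The book"', "ex:isWrittenBy", '"Goethe"'))
# The statement about the statement uses the reification node as its subject.
graph.append(("ex:someID", "ex:isSaidBy", '"Kai"'))

for s, p, o in graph:
    print(s, p, o, ".")
```

The loop prints the same five lines as the slide, one triple per line.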





                           Simplified Presentation

  ● Based on Notation 3 (RDF/N3)

      Subject   Predicate     Object
   1  ex:p123   rdf:type      ex:person
   2  ex:p123   ex:hasName    "Kai Eckert"
   3  ex:p123   ex:worksFor   ex:unima
                 Example 1: A simple RDF example

  ● Identification of statements by the line number:

   4  #1        dc:creator    "Kai Eckert"
      The subject of a statement is a reference to another statement.
      With this notation, we imply a reification.
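
To make the implied reification explicit, the line-number shorthand can be expanded mechanically. The sketch below is an assumption about how such an expansion might look, not the authors' tooling: the function name `expand` and the blank-node labels `_:stmtN` are invented for this example, and references to lines that are themselves `#n` statements are not handled.

```python
# Minimal sketch: expand the "#n" shorthand into explicit reification triples.
# `expand` and the "_:stmtN" node labels are hypothetical names.

def expand(rows):
    """rows: list of (line_number, subject, predicate, object) tuples."""
    by_line = {line: (s, p, o) for line, s, p, o in rows}
    triples = []
    reified = set()  # line numbers that already have a reification node
    for line, s, p, o in rows:
        if not s.startswith("#"):
            triples.append((s, p, o))  # ordinary statement, kept as-is
            continue
        target = int(s[1:])
        node = f"_:stmt{target}"
        if target not in reified:
            ts, tp, to = by_line[target]
            triples += [
                (node, "rdf:type", "rdf:Statement"),
                (node, "rdf:subject", ts),
                (node, "rdf:predicate", tp),
                (node, "rdf:object", to),
            ]
            reified.add(target)
        triples.append((node, p, o))  # the statement about the statement
    return triples

rows = [
    (1, "ex:p123", "rdf:type", "ex:person"),
    (2, "ex:p123", "ex:hasName", '"Kai Eckert"'),
    (3, "ex:p123", "ex:worksFor", "ex:unima"),
    (4, "#1", "dc:creator", '"Kai Eckert"'),
]
triples = expand(rows)
```

The three ordinary statements are kept unchanged, so the original metadata stays readable for applications that ignore the reification.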




                         Need for Metametadata


  ● Metadata are also data, so we need additional data about them:
    metametadata.
  ● Metadata about a whole metadata record, not about single statements:
             –   Who created this metadata record?
             –   When was this record created?
             –   …
  ● This already exists: metadata provenance.





             Statements about (single) statements


  ● Often proposed, but with only vague instructions on how to
    implement them.
  ● Needed if metadata records are created by combining single
    statements from different sources.
  ● Needed for storing arbitrary additional information about single
    statements that cannot easily be represented in the metadata format.






                             Scenario 1: Crosswalks

  ● Crosswalks define rules for how metadata from one schema are
    represented in a different schema.

         MARC field                                                 Dublin Core element
         260$c (Date of publication, distribution, etc.)      →     Date.Created
         522 (Geographic Coverage Note)                       →     Coverage.Spatial
         300$a (Physical Description)                         →     Format.Extent

  ● Problems:
              –   Loss of information
              –   Erroneous crosswalks





               Possibilities for Metametadata
  ● Storage of additional information that would be lost in the
    target format.
  ● Identification of the crosswalk (with its version) and the specific
    rule for every generated statement.

 Which statements are generated by a specific rule?
 Which rule is responsible for a specific (erroneous)
 statement?
 Which data in the originating format was used to
 generate a specific statement?




                          Example 1: Crosswalk Data

       Subject            Predicate        Object
  1    ex:docbase/doc1    dc:title         “Example title”
  2    #1                 ex:rule          16
  3    #1                 ex:crosswalk     3
  4    #1                 ex:origin        MARC:245
  5    ex:docbase/doc2    dc:title         “About finding a title”
  6    #5                 ex:rule          16
  7    #5                 ex:crosswalk     3
  8    #5                 ex:origin        MARC:245
  9    ex:docbase/doc3    dc:title         “Lorem ipsum dolor”
  10   #9                 ex:rule          18
  11   #9                 ex:crosswalk     3
  12   #9                 ex:origin        MARC:245
  13   #9                 ex:origin        MARC:246
  14   ex:docbase/doc4    dc:title         “Consetetur Sadipscing”
  15   #14                ex:rule          19
  16   #14                ex:crosswalk     6
  17   #14                ex:origin        xml:/record/description
                         Example 4: Resulting RDF statements with additional Metametadata





                             Crosswalk Updates
  ● Which statements are generated by a given rule and need to be
    regenerated after an update?


         SELECT ?document ?field ?value WHERE {
           ?t rdf:subject    ?document .
           ?t rdf:predicate  ?field .
           ?t rdf:object     ?value .
           ?t ex:rule        16 .
           ?t ex:crosswalk   3 .
         }



   document            field                                          value
   ex:docbase/doc1     http://www.example.org/dc#title                "Example title"
   ex:docbase/doc2     http://www.example.org/dc#title                "About finding a title"






                         Crosswalk Debugging
  ● Which rule is responsible for a given statement and what was the
    original data?

         SELECT ?crosswalk ?rule ?origin WHERE {
           ?t rdf:subject    <ex:docbase/doc1> .
           ?t rdf:predicate  dc:title .
           ?t rdf:object     "Example title" .
           ?t ex:rule        ?rule .
           ?t ex:crosswalk   ?crosswalk .
           ?t ex:origin      ?origin .
         }


               crosswalk          rule           origin
               3                  16             "MARC:245"
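
Both crosswalk queries (update and debugging) can be reproduced without a SPARQL engine. The following is a rough sketch under stated assumptions: `match` is a naive basic-graph-pattern matcher written for this illustration, `add_title` and the `_:stmtN` node labels are hypothetical, and the data is the crosswalk example from the slides with rule and crosswalk numbers stored as Python integers.

```python
# Minimal sketch of the two crosswalk queries over reified statements.
# `match`, `add_title` and the "_:stmtN" labels are hypothetical names.

def match(triples, patterns):
    """Return every variable binding that satisfies all (s, p, o) patterns.
    Terms that are strings starting with "?" are treated as variables."""
    results = [{}]
    for pattern in patterns:
        extended = []
        for binding in results:
            for triple in triples:
                new = dict(binding)
                for term, value in zip(pattern, triple):
                    if isinstance(term, str) and term.startswith("?"):
                        if new.setdefault(term, value) != value:
                            break  # variable already bound to another value
                    elif term != value:
                        break  # constant does not match
                else:
                    extended.append(new)
        results = extended
    return results

triples = []

def add_title(n, doc, title, rule, crosswalk, origins):
    """Store a dc:title statement plus its reified crosswalk metametadata."""
    node = f"_:stmt{n}"
    triples.append((doc, "dc:title", title))
    triples.extend([
        (node, "rdf:subject", doc),
        (node, "rdf:predicate", "dc:title"),
        (node, "rdf:object", title),
        (node, "ex:rule", rule),
        (node, "ex:crosswalk", crosswalk),
    ])
    triples.extend((node, "ex:origin", origin) for origin in origins)

add_title(1, "ex:docbase/doc1", "Example title", 16, 3, ["MARC:245"])
add_title(5, "ex:docbase/doc2", "About finding a title", 16, 3, ["MARC:245"])
add_title(9, "ex:docbase/doc3", "Lorem ipsum dolor", 18, 3, ["MARC:245", "MARC:246"])
add_title(14, "ex:docbase/doc4", "Consetetur Sadipscing", 19, 6, ["xml:/record/description"])

# Crosswalk update: which statements were generated by rule 16 of crosswalk 3?
affected = match(triples, [
    ("?t", "rdf:subject", "?document"),
    ("?t", "rdf:predicate", "?field"),
    ("?t", "rdf:object", "?value"),
    ("?t", "ex:rule", 16),
    ("?t", "ex:crosswalk", 3),
])
documents = sorted(row["?document"] for row in affected)

# Crosswalk debugging: which rule and origin produced a given statement?
debug = match(triples, [
    ("?t", "rdf:subject", "ex:docbase/doc1"),
    ("?t", "rdf:predicate", "dc:title"),
    ("?t", "rdf:object", "Example title"),
    ("?t", "ex:rule", "?rule"),
    ("?t", "ex:crosswalk", "?crosswalk"),
    ("?t", "ex:origin", "?origin"),
])
```

The matcher is quadratic and only meant to show what the graph patterns select; a real RDF store would evaluate the SPARQL queries directly.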






             Scenario 2: Different Sources for Metadata


  ● Manual indexing is costly.
  ● Many documents are not indexed at all or not searchable:
              –   Journal articles
              –   Externally owned documents
              –   Working papers
              –   Web pages
  ● New sources for metadata?





             New ways for document indexing


  ● Automatic processes
  ● Tagging
  ● (Automatic) mapping of metadata from external sources
  ● Problem: Lack of quality


 How do you integrate data from these different sources
 without compromising retrieval quality?





                    Possibilities for Metametadata

  ● Storage of the source of single statements.
  ● Storage of further source-specific information:
             –   Weighting for automatically generated subject headings.
             –   Number of users who tagged a document with a given tag.
             –   The original subject heading in case of an automatic
                 translation or mapping.

 Can we use this additional information to improve
 document retrieval?



                       Example 2: Subject indexing



     Subject                  Predicate      Object
1    ex:docbase/doc1          dc:subject     ex:thes/sub20
2    #1                       ex:source      ex:sources/autoindex1
3    #1                       ex:rank        0.55
4    ex:docbase/doc1          dc:subject     ex:thes/sub30
5    #4                       ex:source      ex:sources/autoindex1
6    #4                       ex:rank        0.8
7    ex:docbase/doc1          dc:subject     ex:thes/sub30
8    #7                       ex:source      ex:sources/pfeffer
9    #7                       ex:rank        1.0
10   ex:docbase/doc1          dc:subject     ex:thes/sub40
11   #10                      ex:source      ex:sources/pfeffer
12   #10                      ex:rank        1.0
13   ex:sources/autoindex1    ex:type        ex:types/auto
14   ex:sources/pfeffer       ex:type        ex:types/manual
                             Example 7: Subject assignments by different sources





                    Backward compatibility

  ● While there are four assignments of subject headings, the statement
    "ex:docbase/doc1 dc:subject ex:thes/sub30"
    is still one statement, regardless of the number of times you put
    it into your RDF store.
  ● Important for applications that access the RDF data but do not
    handle the RDF reification.
  ● Your metadata remain valid; in particular, there are no duplicates.
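
The point about duplicates follows from an RDF graph being a set of triples. A small sketch, assuming a Python `set` as a stand-in for the RDF store and hypothetical `_:stmtN` node labels, shows that two sources asserting the same subject heading yield one plain statement but two reifications:

```python
# Minimal sketch: an RDF graph behaves like a set of triples.
# `assert_with_source` and the "_:stmtN" labels are hypothetical names.

graph = set()

def assert_with_source(n, triple, source, rank):
    """Add a statement and a separate reification carrying source and rank."""
    node = f"_:stmt{n}"
    s, p, o = triple
    graph.add(triple)  # adding the same triple again has no effect
    graph.update({
        (node, "rdf:subject", s),
        (node, "rdf:predicate", p),
        (node, "rdf:object", o),
        (node, "ex:source", source),
        (node, "ex:rank", rank),
    })

sub30 = ("ex:docbase/doc1", "dc:subject", "ex:thes/sub30")
assert_with_source(4, sub30, "ex:sources/autoindex1", 0.8)
assert_with_source(7, sub30, "ex:sources/pfeffer", 1.0)

# What an application that ignores reification sees:
plain = [t for t in graph if t == sub30]
# What a reification-aware application sees:
reifications = {s for s, p, o in graph if p == "rdf:object" and o == "ex:thes/sub30"}

print(len(plain), len(reifications))  # prints "1 2"
```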






                       Separating the sources

  ● Which statements are made by a specific source (here: Pfeffer)?

      SELECT ?document ?value WHERE {
        ?t rdf:subject    ?document .
        ?t rdf:predicate  dc:subject .
        ?t rdf:object     ?value .
        ?t ex:source      <ex:sources/pfeffer> .
      }



                  document                           value
                  ex:docbase/doc1                    ex:thes/sub30
                  ex:docbase/doc1                    ex:thes/sub40






                             Extended queries
  ● Use all manually created subject headings.
  ● Use all subject headings with a rank > 0.7.
         SELECT DISTINCT ?document ?subject WHERE {
           ?t rdf:subject    ?document .
           ?t rdf:predicate  dc:subject .
           ?t rdf:object     ?subject .
           ?t ex:source      ?source .
           ?source ex:type   ?type .
           ?t ex:rank        ?rank .
           FILTER (
              ?type = <ex:types/manual> || ?rank > 0.7
           )
         }
                   document               subject
                   ex:docbase/doc1        ex:thes/sub30
                   ex:docbase/doc1        ex:thes/sub40
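
The source-based queries can be traced by hand with a short Python sketch. It is an illustration under assumptions, not the authors' implementation: `match` is a naive pattern matcher written for this example, `add_subject` and the `_:stmtN` labels are hypothetical, and the data is taken from the subject indexing example.

```python
# Minimal sketch of the source-separation and extended queries.
# `match`, `add_subject` and the "_:stmtN" labels are hypothetical names.

def match(triples, patterns):
    """Return every variable binding that satisfies all (s, p, o) patterns.
    Terms that are strings starting with "?" are treated as variables."""
    results = [{}]
    for pattern in patterns:
        extended = []
        for binding in results:
            for triple in triples:
                new = dict(binding)
                for term, value in zip(pattern, triple):
                    if isinstance(term, str) and term.startswith("?"):
                        if new.setdefault(term, value) != value:
                            break  # variable already bound to another value
                    elif term != value:
                        break  # constant does not match
                else:
                    extended.append(new)
        results = extended
    return results

triples = [
    ("ex:sources/autoindex1", "ex:type", "ex:types/auto"),
    ("ex:sources/pfeffer", "ex:type", "ex:types/manual"),
]

def add_subject(n, subject, source, rank):
    """Store one reified dc:subject assignment for ex:docbase/doc1."""
    node = f"_:stmt{n}"
    triples.extend([
        (node, "rdf:subject", "ex:docbase/doc1"),
        (node, "rdf:predicate", "dc:subject"),
        (node, "rdf:object", subject),
        (node, "ex:source", source),
        (node, "ex:rank", rank),
    ])

add_subject(1, "ex:thes/sub20", "ex:sources/autoindex1", 0.55)
add_subject(4, "ex:thes/sub30", "ex:sources/autoindex1", 0.8)
add_subject(7, "ex:thes/sub30", "ex:sources/pfeffer", 1.0)
add_subject(10, "ex:thes/sub40", "ex:sources/pfeffer", 1.0)

rows = match(triples, [
    ("?t", "rdf:subject", "?document"),
    ("?t", "rdf:predicate", "dc:subject"),
    ("?t", "rdf:object", "?subject"),
    ("?t", "ex:source", "?source"),
    ("?source", "ex:type", "?type"),
    ("?t", "ex:rank", "?rank"),
])

# Separating the sources: only statements made by Pfeffer.
by_pfeffer = sorted({
    (row["?document"], row["?subject"])
    for row in rows
    if row["?source"] == "ex:sources/pfeffer"
})

# Extended query: FILTER (manual OR rank > 0.7); the set gives DISTINCT.
selected = sorted({
    (row["?document"], row["?subject"])
    for row in rows
    if row["?type"] == "ex:types/manual" or row["?rank"] > 0.7
})
```

Both result lists contain `ex:thes/sub30` and `ex:thes/sub40` for `ex:docbase/doc1`; the automatically assigned `ex:thes/sub20` (rank 0.55) is filtered out.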





                                 Conclusion


  ● Many applications of metametadata in the library field can be
    realized with RDF Reification.
  ● No need to reinvent the wheel: the approach is based on existing
    standards and recommendations.
  ● With SPARQL and SeRQL you can access and make use of the
    additional information.
  ● Many mature products are available, developed by the Semantic Web
    community or by commercial vendors.






                                Standardization?
  ● Would be great for generally accepted use cases.
  ● What are "generally accepted use cases"?
  ● Metametadata we used so far:
             –   A rank for subject headings
             –   Qualifications for sources (automatic, manual,
                 expert, layman, …)
             –   Link to further information about a specific indexing
                 process
             –   Link to original data source
             –   Debugging information

Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 

A Unified Approach for Representing Metametadata

  • 1. A Unified Approach for Representing Metametadata. Kai Eckert, University of Mannheim. Dublin Core 2009, Seoul, South Korea, October 14th 2009.
  • 2. People
      ● Kai Eckert, Mannheim University, kai@informatik.uni-mannheim.de
      ● Magnus Pfeffer, Mannheim University Library, pfeffer@bib.uni-mannheim.de
      ● Heiner Stuckenschmidt, Mannheim University, heiner@informatik.uni-mannheim.de
  • 3. What is this presentation about?
      ● Proposal for "statements about statements" or "metametadata" or "provenance on statement level".
      ● That was done before!
      ● Implementation of metametadata by means of RDF Reification.
      ● That was done before! (That's what Reification was made for.)
  • 4. Anything new?
      ● Practical examples based on two scenarios and several use-cases.
      ● Both scenarios show (again) the need for metametadata.
      ● Easy implementation based on RDF and SPARQL.
      ● Standardization?
  • 5. RDF Reification
      ● RDF supports statements about statements by means of Reification, literally "objectification" (actually a "subjectification"...).
      ● "The book is written by Goethe" is said by Kai. Here "The book" is the subject, "is written by" the predicate, and "Goethe" the object.
      ● How it is done in RDF:
          ex:someID rdf:type rdf:Statement .
          ex:someID rdf:subject "The book" .
          ex:someID rdf:predicate ex:isWrittenBy .
          ex:someID rdf:object "Goethe" .
          ex:someID ex:isSaidBy "Kai" .
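The reification pattern on this slide can be sketched in a few lines of Python. This is only an illustration using plain tuples, not an RDF library; the `Triple` type and the `reify` helper are names assumed for the sketch, and the literals are kept as quoted strings exactly as on the slide.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def reify(statement_id: str, statement: Triple) -> List[Triple]:
    """Describe one statement with the four triples of the RDF reification vocabulary."""
    return [
        Triple(statement_id, "rdf:type", "rdf:Statement"),
        Triple(statement_id, "rdf:subject", statement.subject),
        Triple(statement_id, "rdf:predicate", statement.predicate),
        Triple(statement_id, "rdf:object", statement.obj),
    ]


# The statement from the slide, then the statement about that statement.
book = Triple('"The book"', "ex:isWrittenBy", '"Goethe"')
graph = reify("ex:someID", book)
graph.append(Triple("ex:someID", "ex:isSaidBy", '"Kai"'))

for t in graph:
    print(f"{t.subject} {t.predicate} {t.obj} .")
```

The statement identifier (`ex:someID`) becomes an ordinary subject, so any further metametadata is just one more triple about it.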
  • 6. Simplified Presentation
      ● Based on Notation 3 (RDF/N3):
             Subject    Predicate     Object
           1 ex:p123    rdf:type      ex:person
           2 ex:p123    ex:hasName    "Kai Eckert"
           3 ex:p123    ex:worksFor   ex:unima
          Example 1: A simple RDF example
      ● Identification of statements by the line number:
           4 #1         dc:creator    "Kai Eckert"
      ● The subject of a statement is a reference to another statement. With this notation, we imply a reification.
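To make the implied reification explicit, the line-number shorthand can be expanded mechanically. The sketch below assumes that every `#n` subject points at a plain (non-`#`) row, and it invents blank-node names of the form `_:stmtN`; neither detail comes from the slides.

```python
def expand(rows):
    """Expand numbered rows into plain triples, reifying every row referenced as '#n'."""
    triples = []
    reified = set()
    for line_no in sorted(rows):
        s, p, o = rows[line_no]
        if not s.startswith("#"):
            triples.append((s, p, o))  # ordinary statement, kept as-is
            continue
        target = int(s[1:])
        node = f"_:stmt{target}"  # hypothetical naming scheme for the statement node
        if target not in reified:
            ts, tp, to = rows[target]
            triples.extend([
                (node, "rdf:type", "rdf:Statement"),
                (node, "rdf:subject", ts),
                (node, "rdf:predicate", tp),
                (node, "rdf:object", to),
            ])
            reified.add(target)
        triples.append((node, p, o))  # the statement about the statement
    return triples


ROWS = {
    1: ("ex:p123", "rdf:type", "ex:person"),
    2: ("ex:p123", "ex:hasName", '"Kai Eckert"'),
    3: ("ex:p123", "ex:worksFor", "ex:unima"),
    4: ("#1", "dc:creator", '"Kai Eckert"'),
}

expanded = expand(ROWS)
for triple in expanded:
    print(*triple, ".")
```

The three original statements stay untouched; the reference `#1` only adds the reification triples and the `dc:creator` annotation.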
  • 7. Need for Metametadata
      ● Metadata are also data, so we need additional data about them: Metametadata.
      ● Metadata about a whole metadata record, not for single statements:
          – Who created this metadata record?
          – When was this record created?
          – …
      ● This exists: Metadata Provenance.
  • 8. Statements about (single) statements
      ● Often proposed, but with only vague instructions on how to implement it.
      ● Needed if metadata records are created by combining single statements from different sources.
      ● Needed for storing arbitrary additional information about single statements that cannot easily be represented in the metadata format.
  • 9. Scenario 1: Crosswalks
      ● Crosswalks define rules for how metadata from one schema are represented in a different schema.
          MARC field                                           Dublin Core element
          260$c (Date of publication, distribution, etc.)   →  Date.Created
          522 (Geographic Coverage Note)                    →  Coverage.Spatial
          300$a (Physical Description)                      →  Format.Extent
      ● Problems:
          – Loss of information
          – Erroneous Crosswalks
  • 10. Possibilities for Metametadata
      ● Storage of additional information, which would be lost in the target format.
      ● Identification of Crosswalks with version and the specific rule for every generated statement.
      Which statements are generated by a specific rule?
      Which rule is responsible for a specific (erroneous) statement?
      Which data in the originating format was used to generate a specific statement?
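One way to picture this is a crosswalk that records, for every statement it generates, which rule produced it and from which field. The sketch below reuses the three MARC mappings from the crosswalk table; the rule numbers, the crosswalk id, and the sample record are hypothetical and only serve the illustration.

```python
# Mappings from the crosswalk table; rule numbers and crosswalk id are hypothetical.
CROSSWALK_ID = 3
RULES = {
    1: ("260$c", "Date.Created"),
    2: ("522", "Coverage.Spatial"),
    3: ("300$a", "Format.Extent"),
}


def apply_crosswalk(doc, marc_record):
    """Return (statement, metametadata) pairs for every rule whose MARC field is present."""
    results = []
    for rule_id, (marc_field, dc_element) in RULES.items():
        if marc_field not in marc_record:
            continue
        statement = (doc, dc_element, marc_record[marc_field])
        meta = {
            "ex:rule": rule_id,
            "ex:crosswalk": CROSSWALK_ID,
            "ex:origin": f"MARC:{marc_field}",
        }
        results.append((statement, meta))
    return results


# Hypothetical sample record with two of the three fields filled in.
record = {"260$c": "2009", "300$a": "22 slides"}
results = apply_crosswalk("ex:docbase/doc1", record)
for statement, meta in results:
    print(statement, meta)
```

Keeping the metametadata next to each generated statement is what later allows the three questions above to be answered by a query.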
  • 11. Example 1: Crosswalk Data
              Subject            Predicate       Object
            1 ex:docbase/doc1    dc:title        "Example title"
            2 #1                 ex:rule         16
            3 #1                 ex:crosswalk    3
            4 #1                 ex:origin       MARC:245
            5 ex:docbase/doc2    dc:title        "About finding a title"
            6 #5                 ex:rule         16
            7 #5                 ex:crosswalk    3
            8 #5                 ex:origin       MARC:245
            9 ex:docbase/doc3    dc:title        "Lorem ipsum dolor"
           10 #9                 ex:rule         18
           11 #9                 ex:crosswalk    3
           12 #9                 ex:origin       MARC:245
           13 #9                 ex:origin       MARC:246
           14 ex:docbase/doc4    dc:title        "Consetetur Sadipscing"
           15 #14                ex:rule         19
           16 #14                ex:crosswalk    6
           17 #14                ex:origin       xml:/record/description
          Example 4: Resulting RDF statements with additional Metametadata
  • 12. Crosswalk Updates
      ● Which statements are generated by a given rule and need to be regenerated after an update?
          SELECT ?document ?field ?value
          WHERE {
            ?t rdf:subject ?document .
            ?t rdf:predicate ?field .
            ?t rdf:object ?value .
            ?t ex:rule 16 .
            ?t ex:crosswalk 3 .
          }
      ● Result:
          document           field                              value
          ex:docbase/doc1    http://www.example.org/dc#title    "Example title"
          ex:docbase/doc2    http://www.example.org/dc#title    "About finding a title"
  • 13. Crosswalk Debugging
      ● Which rule is responsible for a given statement and what was the original data?
          SELECT ?crosswalk ?rule ?origin
          WHERE {
            ?t rdf:subject <ex:docbase/doc1> .
            ?t rdf:predicate dc:title .
            ?t rdf:object "Example title" .
            ?t ex:rule ?rule .
            ?t ex:crosswalk ?crosswalk .
            ?t ex:origin ?origin .
          }
      ● Result:
          crosswalk    rule    origin
          3            16      "MARC:245"
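The update and debugging queries can be mimicked without a triple store to see what they match. The following sketch stores the reified crosswalk statements as a set of tuples and answers both questions by intersecting node sets; the helper names and the `_:tN` node identifiers are assumptions of this sketch, and a real deployment would run the SPARQL queries instead.

```python
# (document, title, rule, crosswalk, origins) from the crosswalk example data.
ROWS = [
    ("ex:docbase/doc1", "Example title", 16, 3, ["MARC:245"]),
    ("ex:docbase/doc2", "About finding a title", 16, 3, ["MARC:245"]),
    ("ex:docbase/doc3", "Lorem ipsum dolor", 18, 3, ["MARC:245", "MARC:246"]),
    ("ex:docbase/doc4", "Consetetur Sadipscing", 19, 6, ["xml:/record/description"]),
]


def build_graph(rows):
    """Create the reification triples plus metametadata for every title statement."""
    triples = set()
    for i, (doc, title, rule, crosswalk, origins) in enumerate(rows, start=1):
        node = f"_:t{i}"  # hypothetical statement node
        triples |= {
            (node, "rdf:subject", doc),
            (node, "rdf:predicate", "dc:title"),
            (node, "rdf:object", title),
            (node, "ex:rule", rule),
            (node, "ex:crosswalk", crosswalk),
        }
        triples |= {(node, "ex:origin", origin) for origin in origins}
    return triples


def nodes_with(triples, prop, val):
    return {s for s, p, o in triples if p == prop and o == val}


def values(triples, node, prop):
    return sorted(o for s, p, o in triples if s == node and p == prop)


def statements_for_rule(triples, rule, crosswalk):
    """Crosswalk update: which statements were generated by this rule?"""
    nodes = nodes_with(triples, "ex:rule", rule) & nodes_with(triples, "ex:crosswalk", crosswalk)
    return sorted(
        (values(triples, n, "rdf:subject")[0], values(triples, n, "rdf:object")[0])
        for n in nodes
    )


def explain(triples, doc, title):
    """Crosswalk debugging: which crosswalk, rule and origins produced this statement?"""
    nodes = (
        nodes_with(triples, "rdf:subject", doc)
        & nodes_with(triples, "rdf:predicate", "dc:title")
        & nodes_with(triples, "rdf:object", title)
    )
    return [
        (values(triples, n, "ex:crosswalk")[0], values(triples, n, "ex:rule")[0], values(triples, n, "ex:origin"))
        for n in sorted(nodes)
    ]


graph = build_graph(ROWS)
print(statements_for_rule(graph, 16, 3))
print(explain(graph, "ex:docbase/doc1", "Example title"))
```

Both lookups only ever touch the statement nodes, which is why the plain metadata can stay unchanged while the metametadata grows.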
  • 14. Scenario 2: Different Sources for Metadata
      ● Manual indexing is costly.
      ● Many documents are not indexed at all or not searchable:
          – Journal Articles
          – Externally owned documents
          – Working papers
          – Webpages
      ● New sources for metadata?
  • 15. New ways for document indexing
      ● Automatic processes
      ● Tagging
      ● (Automatic) mapping of metadata from external sources
      ● Problem: Lack of quality
      How do you integrate these data from different sources without compromising the retrieval quality?
  • 16. Possibilities for Metametadata
      ● Storage of the source of single statements.
      ● Storage of further source-specific information:
          – Weighting for automatically generated subject headings.
          – Number of users who tagged a document with a given tag.
          – The original subject heading in case of an automatic translation or mapping.
      Can we use this additional information to improve document retrieval?
  • 17. Example 2: Subject indexing
              Subject                  Predicate     Object
            1 ex:docbase/doc1          dc:subject    ex:thes/sub20
            2 #1                       ex:source     ex:sources/autoindex1
            3 #1                       ex:rank       0.55
            4 ex:docbase/doc1          dc:subject    ex:thes/sub30
            5 #4                       ex:source     ex:sources/autoindex1
            6 #4                       ex:rank       0.8
            7 ex:docbase/doc1          dc:subject    ex:thes/sub30
            8 #7                       ex:source     ex:sources/pfeffer
            9 #7                       ex:rank       1.0
           10 ex:docbase/doc1          dc:subject    ex:thes/sub40
           11 #10                      ex:source     ex:sources/pfeffer
           12 #10                      ex:rank       1.0
           13 ex:sources/autoindex1    ex:type       ex:types/auto
           14 ex:sources/pfeffer       ex:type       ex:types/manual
          Example 7: Subject assignments by different sources
  • 18. Backward compatibility
      ● While there are four assignments for subject headings, the statement "ex:docbase/doc1 dc:subject ex:thes/sub30" is still one statement, regardless of the number of times you put it into your RDF store.
      ● Important for applications that access the RDF data but do not handle the RDF reification.
      ● Your metadata remains valid; in particular, there are no duplicates.
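The set semantics behind this point can be shown with a Python set standing in for the RDF store. This is a sketch under that assumption; the helper `add_assignment` and the node names `_:a` and `_:b` are made up for the illustration.

```python
def add_assignment(graph, node, statement, source, rank):
    """Add a plain statement plus one reification carrying its source and rank."""
    s, p, o = statement
    graph.add(statement)  # adding the same triple a second time changes nothing
    graph.update({
        (node, "rdf:type", "rdf:Statement"),
        (node, "rdf:subject", s),
        (node, "rdf:predicate", p),
        (node, "rdf:object", o),
        (node, "ex:source", source),
        (node, "ex:rank", rank),
    })


graph = set()
stmt = ("ex:docbase/doc1", "dc:subject", "ex:thes/sub30")
add_assignment(graph, "_:a", stmt, "ex:sources/autoindex1", 0.8)
add_assignment(graph, "_:b", stmt, "ex:sources/pfeffer", 1.0)

# What a reification-unaware application sees versus what the metametadata records.
plain = [t for t in graph if t[1] == "dc:subject"]
reifications = [t for t in graph if t[1:] == ("rdf:type", "rdf:Statement")]
print(len(plain), len(reifications))  # prints: 1 2
```

An application that only reads `dc:subject` triples sees a single assignment, while the two statement nodes keep both sources apart.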
  • 19. Separating the sources
      ● Which statements are made by a specific source (here: Pfeffer)?
          SELECT ?document ?value
          WHERE {
            ?t rdf:subject ?document .
            ?t rdf:predicate dc:subject .
            ?t rdf:object ?value .
            ?t ex:source <ex:sources/pfeffer> .
          }
      ● Result:
          document           subject
          ex:docbase/doc1    ex:thes/sub30
          ex:docbase/doc1    ex:thes/sub40
  • 20. Extended queries
      ● Use all manually created subject headings.
      ● Use all subject headings with a rank > 0.7.
          SELECT DISTINCT ?document ?subject
          WHERE {
            ?t rdf:subject ?document .
            ?t rdf:predicate dc:subject .
            ?t rdf:object ?subject .
            ?t ex:source ?source .
            ?source ex:type ?type .
            ?t ex:rank ?rank .
            FILTER ( ?type = <ex:types/manual> || ?rank > 0.7 )
          }
      ● Result:
          document           subject
          ex:docbase/doc1    ex:thes/sub30
          ex:docbase/doc1    ex:thes/sub40
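The effect of the FILTER and of DISTINCT can be checked by hand against the subject indexing example. The sketch below flattens each reified assignment for ex:docbase/doc1 into a (heading, source, rank) record, which is a simplification assumed here and not how a triple store holds the data; the function names are likewise illustrative.

```python
# (subject heading, source, rank) for ex:docbase/doc1, from the subject indexing example.
ASSIGNMENTS = [
    ("ex:thes/sub20", "ex:sources/autoindex1", 0.55),
    ("ex:thes/sub30", "ex:sources/autoindex1", 0.8),
    ("ex:thes/sub30", "ex:sources/pfeffer", 1.0),
    ("ex:thes/sub40", "ex:sources/pfeffer", 1.0),
]
SOURCE_TYPES = {
    "ex:sources/autoindex1": "ex:types/auto",
    "ex:sources/pfeffer": "ex:types/manual",
}


def subjects_from(assignments, source):
    """Headings assigned by one specific source."""
    return sorted({heading for heading, src, _ in assignments if src == source})


def select_subjects(assignments, source_types, min_rank):
    """Headings that are manually assigned or ranked above min_rank, without repeats."""
    selected = {
        heading
        for heading, source, rank in assignments
        if source_types[source] == "ex:types/manual" or rank > min_rank
    }
    return sorted(selected)


print(subjects_from(ASSIGNMENTS, "ex:sources/pfeffer"))
print(select_subjects(ASSIGNMENTS, SOURCE_TYPES, 0.7))
```

ex:thes/sub30 qualifies twice, once by rank and once by its manual source, and the set plays the role of DISTINCT by returning it only once.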
  • 21. Conclusion
      ● Many applications of metametadata in the library field can be realized with RDF Reification.
      ● No need to reinvent the wheel: the approach is based on existing standards and recommendations.
      ● With SPARQL and SeRQL you can access and make use of the additional information.
      ● Many mature products are available, developed by the semantic web community or commercially.
  • 22. Standardization?
      ● Would be great for generally accepted use-cases.
      ● What are "generally accepted use-cases"?
      ● Metametadata we used so far:
          – A rank for subject headings
          – Qualifications for sources (automatic, manual, expert, layman, …)
          – Link to further information about a specific indexing process
          – Link to original data source
          – Debugging information