Methodological Guidelines for
   Publishing Linked Data
                Boris Villazón-Terrazas, Oscar Corcho
      Facultad de Informática, Universidad Politécnica de Madrid
    Campus de Montegancedo sn, 28660 Boadilla del Monte, Madrid
                       http://www.oeg-upm.net
                    {bvillazon,ocorcho}@fi.upm.es
             Phone: 34.91.3366605, Fax: 34.91.3524819
      Slides available at: http://www.slideshare.net/boricles/


Acknowledgements: Asunción Gómez-Pérez, Luis M. Vilches,
Victor Saquicela, Alexander de León, and many others that we
may have omitted.
Work distributed under the license Creative Commons Attribution-
Noncommercial-Share Alike 3.0
Main References

Wood, David (Ed) Linking Government Data - 2011

Methodological Guidelines for Publishing Government Linked Data

Boris Villazón-Terrazas, Luis M. Vilches, Oscar Corcho, Asunción Gómez-Pérez




Best Practices for Publishing Linked Data

W3C Editor’s Draft – Government Linked Data Working Group

Michael Hausenblas, Bernadette Hyland, Boris Villazón-Terrazas

https://dvcs.w3.org/hg/gld/raw-file/bcb72f87b5cc/bp/index.html



Cookbook for Open Government Linked Data

W3C Editor’s Draft – Government Linked Data Working Group

Bernadette Hyland, Boris Villazón-Terrazas, Sarven Capadisli

http://www.w3.org/2011/gld/wiki/Linked_Data_Cookbook
Guidelines for Publishing Linked Data

• The process of publishing Linked Data has an
  iterative incremental life cycle model.



• The guidelines are based on our experience in the production
  of Linked Data in several governmental contexts, and have been
  applied in real case scenarios.




Specification
• Identification and analysis of the data
  sources

• URI design

• Definition of the license




Specification
            Identification and analysis of the data sources

We have to distinguish two cases:

• Open and publish data that government agencies have
  not yet opened up and published
   • Task that may require contacting specific government data
     owners to get access to their legacy data

• Reuse and leverage data already opened up and
  published by government agencies
   • Task to look for these data in public government catalogs
      • Open Government Data
      • datacatalogs.org
      • Open Government Catalog
Specification
           Identification and analysis of the data sources

After we have identified and selected the government data
   sources

• Search and compile all the available data and
  documentation about those resources

• Identify the schema of those resources, including
  conceptual components and their relationships

• Identify the items in the domain, i.e., things whose
  properties and relations are described in the data
  sources
Specification
                    GeoLinkedData – Identification of the data sources

• IGN (National Geographic Institute of Spain)
   • Oracle & MySQL
   • Agreement with the IGN

• INE (National Statistics Institute of Spain)
   • Data sources available in a public data catalog
Specification
           GeoLinkedData – Analysis of the data sources




[Figure: sample of a data source, annotated with Year, Province, and Industry Production Index]
Specification
                                                    URI Design

• Use meaningful URIs, instead of opaque URIs, when
  possible

• Separate TBox (ontology model) from ABox
  (instances) URIs.
   • Base URI
     http://data.gov.bo/
     http://health.data.gov.bo/
   • TBox URIs
     http://data.gov.bo/ontology/{class|property}
   • ABox URIs
     http://data.gov.bo/resource/
     http://data.gov.bo/resource/province/Tiraque
Specification
                                   GeoLinkedData - URI design

• Base URI
  http://linkeddata.es/
  http://geo.linkeddata.es/

• TBox URIs
  http://geo.linkeddata.es/ontology/{concept|property}
  http://geo.linkeddata.es/ontology/Provincia

• ABox URIs
  http://geo.linkeddata.es/resource/{r. type}/{r. name}
  http://geo.linkeddata.es/resource/Provincia/Madrid
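
The URI patterns above can be sketched as two small helper functions. This is only an illustration: the function names are hypothetical, and replacing spaces with underscores in resource names is an assumption, not a rule stated in the slides.

```python
from urllib.parse import quote

BASE_URI = "http://geo.linkeddata.es/"  # base URI taken from the slide


def tbox_uri(term: str) -> str:
    """Build the URI of an ontology term (concept or property)."""
    return f"{BASE_URI}ontology/{quote(term)}"


def abox_uri(resource_type: str, name: str) -> str:
    """Build the URI of an instance: resource/{type}/{name}."""
    # Assumption: spaces become underscores; other characters are percent-encoded.
    local_name = quote(name.strip().replace(" ", "_"))
    return f"{BASE_URI}resource/{quote(resource_type)}/{local_name}"


print(tbox_uri("Provincia"))
print(abox_uri("Provincia", "Madrid"))
```

Keeping the two patterns in separate functions mirrors the TBox/ABox separation: the ontology can change location without touching the instance URIs.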


Specification
                                      Definition of the license

• Several possibilities

   • The UK Open Government License

   • Open Database License

   • Public Domain Dedication and License

   • Open Data Commons Attribution License

   • The Creative Commons Licenses


It is also possible to reuse and apply an existing license
    of the government data sources.
Specification
                                    GeoLinkedData - Definition of the license

• Reusing the original license of the government data
  sources. IGN and INE data sources have their own
  license, similar to the Attribution-Share Alike 2.5 Generic
  License

  http://creativecommons.org/licenses/by-sa/2.5/
Modelling
                                                                            Ontology




•   An ontology is an engineering artifact, which provides:
     •   A set of terms
     •   A set of explicit assumptions regarding the intended meaning of the terms.
           • Almost always including concepts and their classification
           • Almost always including properties between concepts




•   Shared understanding of a domain of interest

•   Ontologies expressed in OWL or RDF(S), both based on RDF




Modelling
                               Reuse available vocabularies



1. Search for suitable vocabularies
   • Linked Open Vocabularies

2. Are there suitable vocabularies?
   • Yes: build the vocabulary by reusing available vocabularies
   • No: …
Modelling
                 Reuse available non-ontological resources

1. Search for suitable non-ontological resources
   • Highly reliable Web sites
   • Domain-related sites
   • Government catalogs

2. Are there suitable resources?
   • Yes: build the vocabulary by transforming available resources
   • No: build the vocabulary from scratch
Modelling
     GeoLinkedData

[Figure: vocabularies combined in the GeoLinkedData ontology network]
• WGS84 Geo Positioning: an RDF vocabulary
• scv:Dimension, scv:Item, scv:Dataset
• Hydrographical phenomena (rivers, lakes, etc.)
• Vocabulary for instants, intervals, durations, etc.
• Names and international code systems for territories and groups
• Ontology for OGC Geography Markup Language

Classes                     33
Object Properties           44
Data Properties            318

http://neon-toolkit.org/
Modelling
     GeoLinkedData




Generation
• Transformation

• Data cleansing

• Linking




Generation
                                                Transformation

• Take the data sources selected in the specification
  activity and transform them to RDF according to the
  vocabulary created in the modelling activity

• Some tools
   • CSV and spreadsheets
      • RDF extension of Google Refine, XLWrap, RDF123, NOR2O
   • RDB
      • D2R Server, ODEMapster, W3C RDB2RDF WG – R2RML
   • XML
      • GRDDL, ReDeFer
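
To make the transformation step concrete, here is a minimal sketch that turns CSV rows into N-Triples using only the Python standard library. It does not reproduce any of the tools listed above. The column names, the sample values, the property URIs, and the `IndustryProductionIndex` resource type are all made up for illustration.

```python
import csv
import io
from urllib.parse import quote

RESOURCE = "http://geo.linkeddata.es/resource/"
ONTOLOGY = "http://geo.linkeddata.es/ontology/"
XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical input: column names and values are invented for this example.
SAMPLE_CSV = """province,year,index
Madrid,2009,98.5
Granada,2009,87.1
"""


def rows_to_ntriples(csv_text: str) -> list:
    """Emit three N-Triples statements per CSV row."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        name, year, value = quote(row["province"]), row["year"], row["index"]
        obs = f"<{RESOURCE}IndustryProductionIndex/{name}_{year}>"
        province = f"<{RESOURCE}Provincia/{name}>"
        triples.append(f"{obs} <{ONTOLOGY}province> {province} .")
        triples.append(f'{obs} <{ONTOLOGY}year> "{year}"^^<{XSD}gYear> .')
        triples.append(f'{obs} <{ONTOLOGY}value> "{value}"^^<{XSD}decimal> .')
    return triples


for line in rows_to_ntriples(SAMPLE_CSV):
    print(line)
```

Each row becomes one observation resource that points at a province URI built with the ABox pattern from the specification activity, so the generated data stays consistent with the URI design.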




Generation
                                  GeoLinkedData - Transformation



[Figure: tools used per data source]
• INE: NOR2O
• IGN: ODEMapster
• IGN (geospatial column): Geometry2RDF
Generation
                                   GeoLinkedData - Transformation
[Figure: data source with Province, Year, and Industry Production Index, transformed with NOR2O]
Generation
                                                      GeoLinkedData - Transformation
•   R2O is an extensible, fully declarative language to describe
    mappings between relational database schemas and ontologies.
•   The ODEMapster processor generates RDF instances from
    relational instances based on the mapping description
    expressed in the R2O document

    www.oeg-upm.net/index.php/en/downloads/9-r2o-odempaster
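
The idea of a declarative mapping driving RDF generation can be sketched as follows. This is not R2O syntax and not how ODEMapster works internally. The `MAPPING` dictionary, the `rio` table, the `Rio` class, and the `nombre` property are hypothetical, and an in-memory SQLite database stands in for the relational source.

```python
import sqlite3

BASE = "http://geo.linkeddata.es/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical declarative mapping: one table to one class, columns to properties.
MAPPING = {
    "table": "rio",
    "class": BASE + "ontology/Rio",
    "id_column": "id",
    "uri_template": BASE + "resource/Rio/{id}",
    "columns": {"nombre": BASE + "ontology/nombre"},
}


def generate_triples(conn, mapping):
    """Read every row of the mapped table and emit (s, p, o) tuples."""
    columns = [mapping["id_column"], *mapping["columns"]]
    query = f"SELECT {', '.join(columns)} FROM {mapping['table']}"
    triples = []
    for row in conn.execute(query):
        record = dict(zip(columns, row))
        subject = mapping["uri_template"].format(id=record[mapping["id_column"]])
        triples.append((subject, RDF_TYPE, mapping["class"]))
        for column, prop in mapping["columns"].items():
            if record[column] is not None:  # skip NULL values
                triples.append((subject, prop, str(record[column])))
    return triples


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rio (id INTEGER PRIMARY KEY, nombre TEXT)")
conn.execute("INSERT INTO rio VALUES (1, 'Arroyo de ejemplo')")
for triple in generate_triples(conn, MAPPING):
    print(triple)
```

Because the mapping is data rather than code, the same processor function can be reused for other tables by supplying a different dictionary.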
Generation
                       GeoLinkedData - Transformation
• Creation of the R2O Mappings




Generation
GeoLinkedData - Transformation


            Excerpt of the R2O document




Generation
                                                      GeoLinkedData - Transformation

• Tool for generating RDF from geometrical information

• The geometry could be available in GML or WKT

• The RDF generated follows our Geometry Model




  http://www.oeg-upm.net/index.php/en/downloads/151-geometry2rdf
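
As a rough illustration of generating RDF from geometry, the sketch below converts a simple WKT point into two N-Triples statements. It does not follow the authors' Geometry Model, which the slides do not detail. It uses the W3C WGS84 `lat`/`long` properties instead, and it assumes the WKT coordinates are ordered as longitude then latitude. The subject URI and the coordinates in the example are invented.

```python
import re

GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"
NUMBER = r"(-?\d+(?:\.\d+)?)"
POINT_RE = re.compile(rf"^\s*POINT\s*\(\s*{NUMBER}\s+{NUMBER}\s*\)\s*$", re.IGNORECASE)


def wkt_point_to_ntriples(subject_uri: str, wkt: str) -> list:
    """Convert 'POINT (x y)' into geo:lat / geo:long statements."""
    match = POINT_RE.match(wkt)
    if match is None:
        raise ValueError(f"not a simple WKT POINT: {wkt!r}")
    lon, lat = match.groups()  # assumption: x is longitude, y is latitude
    return [
        f'<{subject_uri}> <{GEO}lat> "{lat}"^^<{XSD_DOUBLE}> .',
        f'<{subject_uri}> <{GEO}long> "{lon}"^^<{XSD_DOUBLE}> .',
    ]


# Illustrative subject and coordinates only.
for line in wkt_point_to_ntriples("http://example.org/resource/p1", "POINT (-3.7 40.4)"):
    print(line)
```

Lines and polygons, which hydrographical data needs, would require a richer model than two coordinates per resource.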

Generation
 GeoLinkedData - Transformation


                   Oracle SDO_UTIL package




SELECT TO_CHAR(SDO_UTIL.TO_GML311GEOMETRY(geometry))
          AS Gml311Geometry
FROM "BCN200"."BCN200_0301L_RIO" c
WHERE c.Etiqueta='Arroyo'




Generation
GeoLinkedData - Transformation
Generation
                                                      Data Cleansing

• To find possible errors, identified by Hogan et al.
   • HTTP-level issues, such as accessibility and dereferenceability,
     e.g., HTTP URIs return 40x/50x errors
   • reasoning issues, such as namespaces without a vocabulary,
     e.g., rss:item term invented
   • malformed/incompatible datatypes, e.g., “true” as xsd:int


• To fix the identified errors
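
The datatype check mentioned above can be sketched with a few regular expressions. This is a simplified illustration, not the procedure of Hogan et al.: only four XSD datatypes are covered, value ranges are not checked, and the function name `find_malformed` is hypothetical.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# Simplified lexical forms for a few XSD datatypes (ranges are not checked).
LEXICAL_FORMS = {
    XSD + "int": re.compile(r"[+-]?\d+"),
    XSD + "integer": re.compile(r"[+-]?\d+"),
    XSD + "boolean": re.compile(r"true|false|1|0"),
    XSD + "decimal": re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)"),
}


def find_malformed(literals):
    """Return the (value, datatype) pairs whose value does not fit the datatype."""
    problems = []
    for value, datatype in literals:
        pattern = LEXICAL_FORMS.get(datatype)
        # Unknown datatypes are skipped rather than reported.
        if pattern is not None and not pattern.fullmatch(value):
            problems.append((value, datatype))
    return problems


literals = [
    ("true", XSD + "int"),      # the error from the slide
    ("42", XSD + "int"),
    ("true", XSD + "boolean"),
]
print(find_malformed(literals))
```

Only the first literal is reported, since "true" is not a valid lexical form for xsd:int but is valid for xsd:boolean.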
Generation
                            GeoLinkedData – Data Cleansing

• Errors
   • Some resources with the same name were mixed up. For
     example, Granada municipality belongs to Granada
     province, and La Granada municipality belongs to Barcelona
     province.

   • Autonomous communities that have only one province, e.g.,
     Murcia Region, were missing some municipalities, but their
     corresponding provinces, e.g., Murcia Province, have the
     correct number of municipalities.

   • Some hydrographical resources were missing parts of their
     geometrical information.
Generation
                                                                                                          Linking


1. Identify suitable data sets as linking targets
   • http://ckan.net

2. Discover relationships between data items
   • LIMES: http://aksw.org/Projects/limes
   • Silk Framework: http://www4.wiwiss.fu-berlin.de/bizer/silk/

3. Validate the relationships discovered
   • sameAs Validator: http://oegdev.dia.fi.upm.es:8080/sameAs/
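
The discovery step can be illustrated with a naive label comparison. This is not how LIMES or Silk work; it is a minimal sketch using `difflib` from the standard library, with a hypothetical `discover_links` function, an arbitrary threshold of 0.9, and hand-written labels.

```python
from difflib import SequenceMatcher

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def discover_links(source, target, threshold=0.9):
    """source and target map URI -> label; returns candidate sameAs triples."""
    links = []
    for s_uri, s_label in source.items():
        best_uri, best_score = None, 0.0
        for t_uri, t_label in target.items():
            score = similarity(s_label, t_label)
            if score > best_score:
                best_uri, best_score = t_uri, score
        if best_uri is not None and best_score >= threshold:
            links.append((s_uri, OWL_SAME_AS, best_uri))
    return links


source = {"http://geo.linkeddata.es/resource/Provincia/Madrid": "Madrid"}
target = {
    "http://dbpedia.org/resource/Madrid": "Madrid",
    "http://dbpedia.org/resource/Granada": "Granada",
}
for link in discover_links(source, target):
    print(link)
```

Matching on labels alone cannot tell apart resources that share a name, which is why the candidate links still need the validation step.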
Generation
                                                            GeoLinkedData - Linking


[Figure: GeoLinkedData linked to DBpedia and GeoNames]
• GeoLinkedData: http://geo.linkeddata.es/.../Madrid
• DBpedia: http://dbpedia.org/resource/Madrid
• GeoNames: http://sws.geonames.org/6355233/
Generation
                                                GeoLinkedData - Linking




http://oegdev.dia.fi.upm.es:8080/sameAs/
Publication
• Dataset publication

• Metadata publication

• Dataset discovery




Publication
                                           Dataset Publication

• Tools for storing RDF
   • Virtuoso Universal Server, Jena, Sesame, 4Store, YARS,
     OWLIM


• SPARQL endpoint and Linked Data frontend
   • Pubby, Talis Platform, Fuseki




Publication
                                  Metadata Publication

• VoID allows expressing metadata about RDF
  datasets




• Open Provenance Model
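
For the VoID bullet above, a minimal description can be sketched by filling a Turtle template. The `void:` and `dcterms:` terms used here exist in those vocabularies. The dataset URI, the title, and the SPARQL endpoint address in the example are assumptions for illustration. The license URL is the one cited earlier in the slides.

```python
def void_description(dataset_uri, title, sparql_endpoint, uri_space, license_uri):
    """Return a small VoID description of a dataset, serialized as Turtle."""
    return f"""@prefix void: <http://rdfs.org/ns/void#> .
@prefix dcterms: <http://purl.org/dc/terms/> .

<{dataset_uri}> a void:Dataset ;
    dcterms:title "{title}" ;
    dcterms:license <{license_uri}> ;
    void:sparqlEndpoint <{sparql_endpoint}> ;
    void:uriSpace "{uri_space}" .
"""


turtle = void_description(
    dataset_uri="http://geo.linkeddata.es/dataset",      # hypothetical
    title="GeoLinkedData",
    sparql_endpoint="http://geo.linkeddata.es/sparql",   # hypothetical
    uri_space="http://geo.linkeddata.es/resource/",
    license_uri="http://creativecommons.org/licenses/by-sa/2.5/",
)
print(turtle)
```

A real description would usually add statistics such as `void:triples` and link sets to other datasets once those figures are known.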




Publication
                                                                                   Dataset discovery

• Register the dataset into CKAN Registry

• Generate sitemap files for your dataset, by using
  sitemap4rdf

• Submit the sitemap location to Google and Sindice




  http://www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation
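
The sitemap step can be sketched with the standard sitemaps.org format. This is not the output of sitemap4rdf, which may add dataset-specific extensions. The function name `build_sitemap` is hypothetical, and the resource URI is the example from the URI design slide.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(resource_uris):
    """Return a sitemaps.org <urlset> document listing the given URIs."""
    ET.register_namespace("", SITEMAP_NS)  # serialize without a prefix
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for uri in resource_uris:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = uri
    return ET.tostring(urlset, encoding="unicode")


sitemap_xml = build_sitemap([
    "http://geo.linkeddata.es/resource/Provincia/Madrid",
])
print(sitemap_xml)
```

The resulting file is what would be submitted to search engines, which then crawl each listed resource URI.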


Publication
                                               GeoLinkedData – Dataset publication




[Figure: publication architecture]
• Access: HTML, Linked Data, SPARQL
• Linked Data frontend: Pubby 0.3, including provenance support
  http://www4.wiwiss.fu-berlin.de/pubby/
• RDF store: Virtuoso 6.1.0
Publication
GeoLinkedData – Dataset discovery




Exploitation




Streaming resources
Exploitation
                                                                      GeoLinkedData

                      http://oegdev.dia.fi.upm.es/projects/map4rdf/


map4rdf:
   • Google Maps viewer of RDF resources
       • Resources with spatial information
   • Extensible with Google plugins
   • Used in other applications such as Aemet, GoodRelations

[Figure: map4rdf, SPARQL, Triplestore]
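
A client such as a map viewer typically asks the triplestore for resources that carry coordinates. The sketch below only builds the request URL for a SPARQL GET query and sends nothing. The endpoint address is hypothetical, and the query assumes the data uses the WGS84 `lat`/`long` properties, which may not match how map4rdf or GeoLinkedData actually model geometry.

```python
from urllib.parse import urlencode

# Assumption: resources expose coordinates through the WGS84 vocabulary.
QUERY = """PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT ?resource ?lat ?long WHERE {
  ?resource geo:lat ?lat ;
            geo:long ?long .
} LIMIT 100"""


def sparql_request_url(endpoint: str, query: str) -> str:
    """Build a SPARQL protocol GET URL with the query as a parameter."""
    return f"{endpoint}?{urlencode({'query': query})}"


# Hypothetical endpoint address, used for illustration only.
url = sparql_request_url("http://geo.linkeddata.es/sparql", QUERY)
print(url)
```

Fetching that URL with an `Accept` header such as `application/sparql-results+json` would return the bindings a viewer can plot.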
DEMO
http://geo.linkeddata.es/browser

Demo screenshots:
• Provinces
• Capital of Province
• Provinces – Industry Production Index
• Beaches

Weitere ähnliche Inhalte

Was ist angesagt?

Secrets of the catalog remix the remix
Secrets of the catalog remix the remixSecrets of the catalog remix the remix
Secrets of the catalog remix the remixrobin fay
 
Datalift lod2-paris-24032011
Datalift lod2-paris-24032011Datalift lod2-paris-24032011
Datalift lod2-paris-24032011Datalift
 
All About Access Points in RDA
All About Access Points in RDAAll About Access Points in RDA
All About Access Points in RDAShana McDanold
 
Resource Description & Access (RDA)
Resource Description & Access (RDA)Resource Description & Access (RDA)
Resource Description & Access (RDA)Buzz Haughton
 
Revealing Entities From Texts With a Hybrid Approach
Revealing Entities From Texts With a Hybrid ApproachRevealing Entities From Texts With a Hybrid Approach
Revealing Entities From Texts With a Hybrid ApproachJulien PLU
 
The tools of our trade: AACR2/RDA and MARC
The tools of our trade: AACR2/RDA and MARCThe tools of our trade: AACR2/RDA and MARC
The tools of our trade: AACR2/RDA and MARCAnn Chapman
 
GDG Meets U event - Big data & Wikidata - no lies codelab
GDG Meets U event - Big data & Wikidata -  no lies codelabGDG Meets U event - Big data & Wikidata -  no lies codelab
GDG Meets U event - Big data & Wikidata - no lies codelabCAMELIA BOBAN
 
Marc formats : Facilitating sharing of Catalogue Records
Marc formats : Facilitating sharing of Catalogue RecordsMarc formats : Facilitating sharing of Catalogue Records
Marc formats : Facilitating sharing of Catalogue RecordsOtuoma Peter
 

Was ist angesagt? (9)

Secrets of the catalog remix the remix
Secrets of the catalog remix the remixSecrets of the catalog remix the remix
Secrets of the catalog remix the remix
 
A brief history of MARC
A brief history of MARCA brief history of MARC
A brief history of MARC
 
Datalift lod2-paris-24032011
Datalift lod2-paris-24032011Datalift lod2-paris-24032011
Datalift lod2-paris-24032011
 
All About Access Points in RDA
All About Access Points in RDAAll About Access Points in RDA
All About Access Points in RDA
 
Resource Description & Access (RDA)
Resource Description & Access (RDA)Resource Description & Access (RDA)
Resource Description & Access (RDA)
 
Revealing Entities From Texts With a Hybrid Approach
Revealing Entities From Texts With a Hybrid ApproachRevealing Entities From Texts With a Hybrid Approach
Revealing Entities From Texts With a Hybrid Approach
 
The tools of our trade: AACR2/RDA and MARC
The tools of our trade: AACR2/RDA and MARCThe tools of our trade: AACR2/RDA and MARC
The tools of our trade: AACR2/RDA and MARC
 
GDG Meets U event - Big data & Wikidata - no lies codelab
GDG Meets U event - Big data & Wikidata -  no lies codelabGDG Meets U event - Big data & Wikidata -  no lies codelab
GDG Meets U event - Big data & Wikidata - no lies codelab
 
Marc formats : Facilitating sharing of Catalogue Records
Marc formats : Facilitating sharing of Catalogue RecordsMarc formats : Facilitating sharing of Catalogue Records
Marc formats : Facilitating sharing of Catalogue Records
 

Andere mochten auch

SEEMP - Semantic Aspects and Interoperability
SEEMP - Semantic Aspects and InteroperabilitySEEMP - Semantic Aspects and Interoperability
SEEMP - Semantic Aspects and InteroperabilityBoris Villazón-Terrazas
 
Linguistic resources enhanced with geospatial Information
Linguistic resources enhanced with geospatial InformationLinguistic resources enhanced with geospatial Information
Linguistic resources enhanced with geospatial InformationBoris Villazón-Terrazas
 
Methodological Guidelines for Publishing Linked Data
Methodological Guidelines for Publishing Linked DataMethodological Guidelines for Publishing Linked Data
Methodological Guidelines for Publishing Linked DataBoris Villazón-Terrazas
 
Towards a Commons RDF Java library
Towards a Commons RDF Java libraryTowards a Commons RDF Java library
Towards a Commons RDF Java librarySergio Fernández
 
A Method for Reusing and Re-engineering Non-ontological Resources for Buildin...
A Method for Reusing and Re-engineering Non-ontological Resources for Buildin...A Method for Reusing and Re-engineering Non-ontological Resources for Buildin...
A Method for Reusing and Re-engineering Non-ontological Resources for Buildin...Boris Villazón-Terrazas
 

Andere mochten auch (11)

SEEMP - Semantic Aspects and Interoperability
SEEMP - Semantic Aspects and InteroperabilitySEEMP - Semantic Aspects and Interoperability
SEEMP - Semantic Aspects and Interoperability
 
Linguistic resources enhanced with geospatial Information
Linguistic resources enhanced with geospatial InformationLinguistic resources enhanced with geospatial Information
Linguistic resources enhanced with geospatial Information
 
Methodological Guidelines for Publishing Linked Data
Methodological Guidelines for Publishing Linked DataMethodological Guidelines for Publishing Linked Data
Methodological Guidelines for Publishing Linked Data
 
Geolinkeddata 07042011 1
Geolinkeddata 07042011 1Geolinkeddata 07042011 1
Geolinkeddata 07042011 1
 
Yet another SPARQL 1.1 brief introduction
Yet another SPARQL 1.1 brief introductionYet another SPARQL 1.1 brief introduction
Yet another SPARQL 1.1 brief introduction
 
Towards a Commons RDF Java library
Towards a Commons RDF Java libraryTowards a Commons RDF Java library
Towards a Commons RDF Java library
 
A Method for Reusing and Re-engineering Non-ontological Resources for Buildin...
A Method for Reusing and Re-engineering Non-ontological Resources for Buildin...A Method for Reusing and Re-engineering Non-ontological Resources for Buildin...
A Method for Reusing and Re-engineering Non-ontological Resources for Buildin...
 
Sitemap4rdf(v2 boris)
Sitemap4rdf(v2 boris)Sitemap4rdf(v2 boris)
Sitemap4rdf(v2 boris)
 
Ecuadorian Geospatial Linked Data
Ecuadorian Geospatial Linked Data Ecuadorian Geospatial Linked Data
Ecuadorian Geospatial Linked Data
 
iSOCO - Research Lab Brief Introduction
iSOCO - Research Lab Brief IntroductioniSOCO - Research Lab Brief Introduction
iSOCO - Research Lab Brief Introduction
 
Data Shapes and Data Transformations
Data Shapes and Data TransformationsData Shapes and Data Transformations
Data Shapes and Data Transformations
 

Ähnlich wie Methodological Guidelines for Publishing Linked Data

Linked Open Data Alignment and Enrichment Using Bootstrapping Based Techniques
Linked Open Data Alignment and Enrichment Using Bootstrapping Based TechniquesLinked Open Data Alignment and Enrichment Using Bootstrapping Based Techniques
Linked Open Data Alignment and Enrichment Using Bootstrapping Based TechniquesPrateek Jain
 
What is New in W3C land?
What is New in W3C land?What is New in W3C land?
What is New in W3C land?Ivan Herman
 
reegle - a new key portal for open energy data
reegle - a new key portal for open energy datareegle - a new key portal for open energy data
reegle - a new key portal for open energy datareeep
 
Prateek Jain dissertation defense, Kno.e.sis, Wright State University
Prateek Jain dissertation defense, Kno.e.sis, Wright State UniversityPrateek Jain dissertation defense, Kno.e.sis, Wright State University
Prateek Jain dissertation defense, Kno.e.sis, Wright State UniversityPrateek Jain
 
Getting Started with Knowledge Graphs
Getting Started with Knowledge GraphsGetting Started with Knowledge Graphs
Getting Started with Knowledge GraphsPeter Haase
 
Open Bibliographic Data and E-LIS
Open Bibliographic Data and E-LISOpen Bibliographic Data and E-LIS
Open Bibliographic Data and E-LISGiannis Tsakonas
 
Tsakonas-Robbio·Open Bibliographic Data E-Lis
Tsakonas-Robbio·Open Bibliographic Data E-LisTsakonas-Robbio·Open Bibliographic Data E-Lis
Tsakonas-Robbio·Open Bibliographic Data E-LisLIS EPI Meeting
 
Leslie Johnston: Library Big Data Repository Services, Open Repositories 2012
Leslie Johnston: Library Big Data Repository Services, Open Repositories 2012Leslie Johnston: Library Big Data Repository Services, Open Repositories 2012
Leslie Johnston: Library Big Data Repository Services, Open Repositories 2012lljohnston
 
First they have to find it: Getting Open Government Data Discovered and Used
First they have to find it: Getting Open Government Data Discovered and UsedFirst they have to find it: Getting Open Government Data Discovered and Used
First they have to find it: Getting Open Government Data Discovered and UsedRensselaer Polytechnic Institute
 
Linked Open Data in Libraries, Archives & Museums
Linked Open Data in Libraries, Archives & MuseumsLinked Open Data in Libraries, Archives & Museums
Linked Open Data in Libraries, Archives & MuseumsJon Voss
 
IASSIST 2012 - DDI-RDF - Trouble with Triples
IASSIST 2012 - DDI-RDF - Trouble with TriplesIASSIST 2012 - DDI-RDF - Trouble with Triples
IASSIST 2012 - DDI-RDF - Trouble with TriplesDr.-Ing. Thomas Hartmann
 
Knowledge Organization System (KOS) for biodiversity information resources, G...
Knowledge Organization System (KOS) for biodiversity information resources, G...Knowledge Organization System (KOS) for biodiversity information resources, G...
Knowledge Organization System (KOS) for biodiversity information resources, G...Dag Endresen
 
20120718 linkedopendataandnextgenerationsciencemcguinnessesip final
20120718 linkedopendataandnextgenerationsciencemcguinnessesip final20120718 linkedopendataandnextgenerationsciencemcguinnessesip final
20120718 linkedopendataandnextgenerationsciencemcguinnessesip finalDeborah McGuinness
 
IASSIT Kansa Presentation
IASSIT Kansa PresentationIASSIT Kansa Presentation
IASSIT Kansa Presentationekansa
 
DBpedia Mappings Wiki, SMWCon Fall 2013, Berlin
DBpedia Mappings Wiki, SMWCon Fall 2013, BerlinDBpedia Mappings Wiki, SMWCon Fall 2013, Berlin
DBpedia Mappings Wiki, SMWCon Fall 2013, BerlinAnja Jentzsch
 

Ähnlich wie Methodological Guidelines for Publishing Linked Data (20)

PhD Proposal Defense - Prateek Jain
PhD Proposal Defense - Prateek JainPhD Proposal Defense - Prateek Jain
PhD Proposal Defense - Prateek Jain
 
Linked Data
Linked DataLinked Data
Linked Data
 
Linked Open Data Alignment and Enrichment Using Bootstrapping Based Techniques
Linked Open Data Alignment and Enrichment Using Bootstrapping Based TechniquesLinked Open Data Alignment and Enrichment Using Bootstrapping Based Techniques
Linked Open Data Alignment and Enrichment Using Bootstrapping Based Techniques
 
What is New in W3C land?
What is New in W3C land?What is New in W3C land?
What is New in W3C land?
 
reegle - a new key portal for open energy data
reegle - a new key portal for open energy datareegle - a new key portal for open energy data
reegle - a new key portal for open energy data
 
Prateek Jain dissertation defense, Kno.e.sis, Wright State University
Prateek Jain dissertation defense, Kno.e.sis, Wright State UniversityPrateek Jain dissertation defense, Kno.e.sis, Wright State University
Prateek Jain dissertation defense, Kno.e.sis, Wright State University
 
Prateek Jain's Dissertation Defense - Linked Open Data Alignment and Querying
Prateek Jain's Dissertation Defense - Linked Open Data Alignment and QueryingPrateek Jain's Dissertation Defense - Linked Open Data Alignment and Querying
Methodological Guidelines for Publishing Linked Data

  • 1. Methodological Guidelines for Publishing Linked Data. Boris Villazón-Terrazas, Oscar Corcho. Facultad de Informática, Universidad Politécnica de Madrid, Campus de Montegancedo sn, 28660 Boadilla del Monte, Madrid. http://www.oeg-upm.net {bvillazon,ocorcho}@fi.upm.es Phone: 34.91.3366605, Fax: 34.91.3524819. Slides available at: http://www.slideshare.net/boricles/ Acknowledgements: Asunción Gómez-Pérez, Luis M. Vilches, Victor Saquicela, Alexander de León, and many others that we may have omitted. Work distributed under the license Creative Commons Attribution-Noncommercial-Share Alike 3.0
  • 2. Main References
      • Boris Villazón-Terrazas, Luis M. Vilches, Oscar Corcho, Asunción Gómez-Pérez. Methodological Guidelines for Publishing Government Linked Data. In: Wood, David (Ed.), Linking Government Data, 2011.
      • Michael Hausenblas, Bernadette Hyland, Boris Villazón-Terrazas. Best Practices for Publishing Linked Data. W3C Editor’s Draft, Government Linked Data Working Group. https://dvcs.w3.org/hg/gld/raw-file/bcb72f87b5cc/bp/index.html
      • Bernadette Hyland, Boris Villazón-Terrazas, Sarven Capadisli. Cookbook for Open Government Linked Data. W3C Editor’s Draft, Government Linked Data Working Group. http://www.w3.org/2011/gld/wiki/Linked_Data_Cookbook
  • 3. Guidelines for Publishing Linked Data
      • The process of publishing Linked Data follows an iterative, incremental life cycle model.
      • The guidelines are based on our experience producing Linked Data in several governmental contexts and have been applied in real case scenarios.
  • 6. Specification
      • Identification and analysis of the data sources
      • URI design
      • Definition of the license
  • 7. Specification: Identification and analysis of the data sources. We have to distinguish two cases:
      • Open and publish data that government agencies have not yet opened up and published
          • This task may require contacting specific government data owners to get access to their legacy data
      • Reuse and leverage data already opened up and published by government agencies
          • This task consists of looking for these data in public government catalogs: Open Government Data, datacatalogs.org, Open Government Catalog
  • 8. Specification: Identification and analysis of the data sources. After we have identified and selected the government data sources:
      • Search and compile all the available data and documentation about those resources
      • Identify the schema of those resources, including conceptual components and their relationships
      • Identify the items in the domain, i.e., things whose properties and relations are described in the data sources
  • 9. Specification: GeoLinkedData – Identification of the data sources
      • IGN, National Geographic Institute of Spain: agreement with the IGN; Oracle & MySQL
      • INE, National Statistic Institute of Spain: data sources available in a public data catalog
  • 10. Specification: GeoLinkedData – Analysis of the data sources (columns shown: Year, Province, Industry Production Index)
  • 11. Specification: URI Design
      • Use meaningful URIs, instead of opaque URIs, when possible
      • Separate TBox (ontology model) URIs from ABox (instances) URIs
      • Base URI: http://data.gov.bo/, http://health.data.gov.bo/
      • TBox URIs: http://data.gov.bo/ontology/{class|property}
      • ABox URIs: http://data.gov.bo/resource/, e.g., http://data.gov.bo/resource/province/Tiraque
  • 12. Specification: GeoLinkedData - URI design
      • Base URI: http://linkeddata.es/, http://geo.linkeddata.es/
      • TBox URIs: http://geo.linkeddata.es/ontology/{concept|property}, e.g., http://geo.linkeddata.es/ontology/Provincia
      • ABox URIs: http://geo.linkeddata.es/resource/{r. type}/{r. name}, e.g., http://geo.linkeddata.es/resource/Provincia/Madrid
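The URI patterns on the two slides above can be sketched as a pair of small helper functions. This is only an illustration under stated assumptions: the function names `tbox_uri` and `abox_uri` are hypothetical, and replacing spaces with underscores is a choice made for this sketch, not something the slides prescribe.

```python
from urllib.parse import quote

BASE_URI = "http://geo.linkeddata.es/"  # base URI taken from the slides


def tbox_uri(term, base=BASE_URI):
    """Return the URI of an ontology term (class or property)."""
    return f"{base}ontology/{quote(term, safe='')}"


def abox_uri(resource_type, name, base=BASE_URI):
    """Return the URI of an instance, keeping the name human-readable."""
    # Assumption of this sketch: spaces become underscores; any other
    # character not allowed in a path segment is percent-encoded.
    segment = quote(name.strip().replace(" ", "_"), safe="")
    return f"{base}resource/{quote(resource_type, safe='')}/{segment}"


print(tbox_uri("Provincia"))
print(abox_uri("Provincia", "Madrid"))
print(abox_uri("province", "Tiraque", base="http://data.gov.bo/"))
```

Keeping the `ontology/` and `resource/` prefixes in separate functions mirrors the TBox/ABox separation recommended on the slide.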
  • 13. Specification: Definition of the license
      • Several possibilities
          • The UK Open Government License
          • Open Database License
          • Public Domain Dedication and License
          • Open Data Commons Attribution License
          • The Creative Commons Licenses
      • It is also possible to reuse and apply an existing license of the government data sources.
  • 14. Specification: GeoLinkedData - Definition of the license
      • Reusing the original license of the government data sources. The IGN and INE data sources have their own license, similar to the Attribution-Share Alike 2.5 Generic License: http://creativecommons.org/licenses/by-sa/2.5/
  • 16. Modelling: Ontology
      • An ontology is an engineering artifact, which provides:
          • A set of terms
          • A set of explicit assumptions regarding the intended meaning of the terms
          • Almost always including concepts and their classification
          • Almost always including properties between concepts
      • Shared understanding of a domain of interest
      • Ontologies are expressed in OWL or RDF(S), both based on RDF
  • 17. Modelling: Reuse available vocabularies
      • Search for suitable vocabularies (Linked Open Vocabularies)
      • Are there suitable vocabularies?
          • Yes: build the vocabulary by reusing available vocabularies
          • No: …
  • 18. Modelling: Reuse available non-ontological resources
      • Search for suitable non-ontological resources (highly reliable Web sites, domain-related sites, government catalogs)
      • Are there suitable resources?
          • Yes: build the vocabulary by transforming available resources
          • No: build the vocabulary from scratch
  • 19. Modelling: GeoLinkedData. Vocabularies shown on the slide:
      • WGS84 Geo Positioning: an RDF vocabulary
      • scv:Dimension, scv:Item, scv:Dataset
      • Hydrographical phenomena (rivers, lakes, etc.)
      • Vocabulary for instants, intervals, durations, etc.
      • Names and international code systems for territories and groups
      • Ontology for OGC Geography Markup Language
      • Classes: 33; Object Properties: 44; Data Properties: 318
      • http://neon-toolkit.org/
  • 20. Modelling: GeoLinkedData
  • 22. Generation
      • Transformation
      • Data cleansing
      • Linking
  • 23. Generation: Transformation
      • Take the data sources selected in the specification activity and transform them to RDF according to the vocabulary created in the modelling activity
      • Some tools
          • CSV and spreadsheets: RDF extension of Google Refine, XLWrap, RDF123, NOR2O
          • RDB: D2R Server, ODEMapster, W3C RDB2RDF WG (R2RML)
          • XML: GRDDL, ReDeFer
  • 24. Generation: GeoLinkedData - Transformation
      • INE: NOR2O
      • IGN: ODEMapster
      • IGN geospatial column: Geometry2RDF
  • 25. Generation: GeoLinkedData - Transformation with NOR2O (columns shown: Industry Production Index, Year, Province)
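To make the transformation step concrete, here is a minimal sketch that turns tabular rows with the three columns shown above into N-Triples using only the Python standard library. It is a naive stand-in for what tools such as NOR2O automate, not a description of how NOR2O works. The sample values, the `Observation` resource type, and the property names under `ontology/` are hypothetical and chosen only for illustration.

```python
import csv
import io
from urllib.parse import quote

BASE = "http://geo.linkeddata.es/"
XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical sample rows; the figures are invented for illustration.
SAMPLE_CSV = """year,province,industry_production_index
2009,Madrid,98.5
2009,Granada,87.1
"""


def rows_to_ntriples(csv_text):
    """Emit three N-Triples statements per CSV row."""
    triples = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for number, row in enumerate(reader, start=1):
        province_name = quote(row["province"], safe="")
        year = row["year"]
        index = row["industry_production_index"]
        observation = f"<{BASE}resource/Observation/{number}>"
        province = f"<{BASE}resource/Provincia/{province_name}>"
        triples.append(f"{observation} <{BASE}ontology/province> {province} .")
        triples.append(
            f'{observation} <{BASE}ontology/year> "{year}"^^<{XSD}gYear> .'
        )
        triples.append(
            f'{observation} <{BASE}ontology/industryProductionIndex> '
            f'"{index}"^^<{XSD}decimal> .'
        )
    return triples


for line in rows_to_ntriples(SAMPLE_CSV):
    print(line)
```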
  • 26. Generation: GeoLinkedData - Transformation
      • R2O is an extensible, fully declarative language to describe mappings between relational database schemas and ontologies.
      • The ODEMapster processor generates RDF instances from relational instances based on the mapping description expressed in the R2O document.
      • www.oeg-upm.net/index.php/en/downloads/9-r2o-odempaster
  • 27. Generation: GeoLinkedData - Transformation
      • Creation of the R2O mappings
  • 28. Generation: GeoLinkedData - Transformation. Excerpt of the R2O document
  • 29. Generation: GeoLinkedData - Transformation. Geometry2RDF (http://www.oeg-upm.net/index.php/en/downloads/151-geometry2rdf):
      • Tool for generating RDF from geometrical information
      • The geometry can be available in GML or WKT
      • The generated RDF follows our Geometry Model
  • 30. Generation: GeoLinkedData - Transformation. Oracle SDO_UTIL package: SELECT TO_CHAR(SDO_UTIL.TO_GML311GEOMETRY(geometry)) AS Gml311Geometry FROM "BCN200"."BCN200_0301L_RIO" c WHERE c.Etiqueta='Arroyo'
  • 32. Generation: Data Cleansing
      • Find possible errors, such as those identified by Hogan et al.
          • HTTP-level issues, such as accessibility and dereferenceability, e.g., HTTP URIs return 40x/50x errors
          • Reasoning issues, such as a namespace without a vocabulary, e.g., the invented term rss:item
          • Malformed/incompatible datatypes, e.g., “true” as xsd:int
      • Fix the identified errors
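The datatype check listed above can be illustrated with a small validator that flags literals whose lexical form does not match their declared datatype. This is a sketch under assumptions: it covers only three XSD datatypes, checks lexical form but not value range, and the function name `malformed_literals` is hypothetical.

```python
import re

# Lexical forms for a small subset of XSD datatypes (range is not checked).
LEXICAL_FORMS = {
    "xsd:int": re.compile(r"[+-]?\d+"),
    "xsd:boolean": re.compile(r"true|false|1|0"),
    "xsd:decimal": re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)"),
}


def malformed_literals(literals):
    """Return the (value, datatype) pairs whose value does not fit the datatype."""
    errors = []
    for value, datatype in literals:
        pattern = LEXICAL_FORMS.get(datatype)
        # Datatypes unknown to this sketch are skipped rather than reported.
        if pattern is not None and not pattern.fullmatch(value):
            errors.append((value, datatype))
    return errors


sample = [
    ("true", "xsd:int"),      # the error mentioned on the slide
    ("42", "xsd:int"),
    ("true", "xsd:boolean"),
    ("98.5", "xsd:decimal"),
]
print(malformed_literals(sample))
```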
  • 33. Generation: GeoLinkedData – Data Cleansing. Errors found:
      • Some resources with the same name were mixed up. For example, Granada municipality belongs to Granada province, and La Granada municipality belongs to Barcelona province.
      • Autonomous communities that have only one province, e.g., Murcia Region, were missing some municipalities, although their corresponding provinces, e.g., Murcia Province, have the correct number of municipalities.
      • Some hydrographical resources were missing parts of their geometrical information.
  • 34. Generation: Linking
      • Identify suitable data sets as linking targets: http://ckan.net
      • Discover relationships between data items: LIMES (http://aksw.org/Projects/limes), Silk Framework (http://www4.wiwiss.fu-berlin.de/bizer/silk/)
      • Validate the relationships discovered: sameAs Validator (http://oegdev.dia.fi.upm.es:8080/sameAs/)
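The discovery step can be sketched as a label comparison between two datasets that proposes candidate owl:sameAs links. This is a deliberately naive stand-in using `difflib` from the standard library; it does not reproduce the matching strategies of Silk or LIMES. The function name `candidate_links`, the 0.9 threshold, and the `La_Granada` target entry are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def candidate_links(source, target, threshold=0.9):
    """Propose (source URI, target URI, score) triples for similar labels.

    `source` and `target` map resource URIs to human-readable labels.
    """
    links = []
    for s_uri, s_label in source.items():
        for t_uri, t_label in target.items():
            score = SequenceMatcher(None, s_label.lower(), t_label.lower()).ratio()
            if score >= threshold:
                links.append((s_uri, t_uri, round(score, 2)))
    return links


geo = {
    "http://geo.linkeddata.es/resource/Provincia/Madrid": "Madrid",
    "http://geo.linkeddata.es/resource/Provincia/Granada": "Granada",
}
dbpedia = {
    "http://dbpedia.org/resource/Madrid": "Madrid",
    # Hypothetical entry, included to show a near-miss that must not be linked.
    "http://dbpedia.org/resource/La_Granada": "La Granada",
}

for s_uri, t_uri, score in candidate_links(geo, dbpedia):
    print(f"<{s_uri}> <http://www.w3.org/2002/07/owl#sameAs> <{t_uri}> .")
```

A high threshold keeps "Granada" and "La Granada" apart, but string similarity alone cannot guarantee correctness, which is why the validation step follows discovery.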
  • 35. Generation: GeoLinkedData - Linking
      • GeoLinkedData: http://geo.linkeddata.es/.../Madrid
      • DBpedia: http://dbpedia.org/resource/Madrid
      • GeoNames: http://sws.geonames.org/6355233/
  • 36. Generation: GeoLinkedData - Linking. http://oegdev.dia.fi.upm.es:8080/sameAs/
  • 38. Publication
      • Dataset publication
      • Metadata publication
      • Dataset discovery
  • 39. Publication: Dataset Publication
      • Tools for storing RDF: Virtuoso Universal Server, Jena, Sesame, 4Store, YARS, OWLIM
      • SPARQL endpoint and Linked Data frontend: Pubby, Talis Platform, Fuseki
  • 40. Publication: Metadata Publication
      • VoID allows expressing metadata about RDF datasets
      • Open Provenance Model
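As an illustration of the kind of metadata VoID can carry, the sketch below assembles a small VoID description in Turtle as a plain string. The properties used (`void:Dataset`, `void:sparqlEndpoint`, `void:uriSpace`, `dcterms:title`, `dcterms:license`) belong to the VoID and Dublin Core vocabularies. The function name `void_description`, the dataset URI, and the endpoint URL are hypothetical, and the title is assumed to contain no characters that need escaping.

```python
def void_description(dataset_uri, title, license_uri, sparql_endpoint, uri_space):
    """Return a Turtle document describing one dataset with VoID."""
    lines = [
        "@prefix void: <http://rdfs.org/ns/void#> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "",
        f"<{dataset_uri}> a void:Dataset ;",
        f'    dcterms:title "{title}" ;',
        f"    dcterms:license <{license_uri}> ;",
        f"    void:sparqlEndpoint <{sparql_endpoint}> ;",
        f'    void:uriSpace "{uri_space}" .',
    ]
    return "\n".join(lines)


print(
    void_description(
        dataset_uri="http://geo.linkeddata.es/dataset",      # hypothetical
        title="GeoLinkedData",
        license_uri="http://creativecommons.org/licenses/by-sa/2.5/",
        sparql_endpoint="http://geo.linkeddata.es/sparql",   # hypothetical
        uri_space="http://geo.linkeddata.es/resource/",
    )
)
```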
  • 41. Publication: Dataset discovery
      • Register the dataset in the CKAN Registry (http://www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation)
      • Generate sitemap files for your dataset, using sitemap4rdf
      • Submit the sitemap location to Google and Sindice
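For readers unfamiliar with sitemap files, the following sketch builds a plain sitemap in the standard sitemaps.org format from a list of resource URIs. It does not reproduce the output of sitemap4rdf, which may add dataset-specific extensions; the function name `build_sitemap` is hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(resource_uris):
    """Return a sitemap XML string with one <url><loc> entry per URI."""
    # Serialize the sitemap namespace as the default namespace (no prefix).
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for uri in resource_uris:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = uri
    return ET.tostring(urlset, encoding="unicode")


print(build_sitemap(["http://geo.linkeddata.es/resource/Provincia/Madrid"]))
```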
  • 42. Publication: GeoLinkedData – Dataset publication
      • Access: HTML, Linked Data, SPARQL
      • Pubby 0.3, including provenance support (http://www4.wiwiss.fu-berlin.de/pubby/)
      • Virtuoso 6.1.0
  • 46. Exploitation: GeoLinkedData. map4rdf (http://oegdev.dia.fi.upm.es/projects/map4rdf/):
      • Google Maps viewer of RDF resources
      • Resources with spatial information
      • Extensible with Google plugins
      • Used in other applications like Aemet, Goodrelations
      • Diagram: map4rdf, SPARQL, Triplestore
  • 50. Provinces – Industry Production Index