Linked Data:
Enabler of Semantic Web

        2011.06.30




             Sung-Kook Han
             Semantic Technology Lab
             Won Kwang Univ.
         skhan@wku.ac.kr               1
Outline

Introduction to Semantic Technology
Semantic Technology + Web Technology
    • Semantic Web
    • Web 2.0
    • Linked Data



Design and Publication of Linked Data
    • 9 steps towards Linked Open Data




Why Semantic Technology??

           the ways of thinking, cognition…




George Boole: An Investigation of the Laws of Thought (1854)

                      Claude Shannon: 1937 master's thesis,
                      A Symbolic Analysis of Relay and Switching Circuits




   John von Neumann            Kurt Gödel      Alan Turing




Why Semantic Technology??

 Final Goal: Intelligence




Our Computers




Communication

  Human vs. Human




   Human vs. Alien




 Human vs. Computer




Computer vs. Computer




Semantic Technology
 Semantic technology has been a distinct research field for more
  than 40 years.
      Formal Logic (since Russell and Frege)
      Knowledge Representation Systems in AI
      Semantic Networks and ATN (William Woods, 1975)
      DARPA and European Commission programs in information integration
      Development of simple tractable logics
      Relational Algebras and Schemas in Database Systems

 Library Science (classifications, thesauri, taxonomies)

 New challenges of Semantic Technology: Semantic Web
    A massive store of information that computers cannot use
    A way to get around needing the “big data warehouse”
    Another place where “a little semantics can go a long way”...

                                        cf: The Relationship Between Web 2.0 And the Semantic Web - Dr. Mark Greaves, Vulcan, Inc.


Ontology Spectrum
 [Figure: the Ontology Spectrum, ordered from weak semantics to strong semantics]
     Technologies, weakest to strongest: Relational Model, XML; DB Schemas, XML Schema; ER; Extended ER; XTM; RDF/S; UML; DAML+OIL, OWL; Description Logic; First Order Logic; Modal Logic
     Model types and their characteristic relation, weakest to strongest:
         Taxonomy: Is Sub-Classification of
         Thesaurus: Has Narrower Meaning Than
         Conceptual Model: Is Subclass of
         Logical Theory: Is Disjoint Subclass of, with transitivity property
     Interoperability levels, weakest to strongest: Syntactic, Structural, Semantic
     Example taxonomy in the figure: Animal; Mammal, Reptile, Bird; Dog, Cat, Snake; Cocker Spaniel; Lady

 Based on Leo Obrst, The Ontology Spectrum & Semantic Models
Semantic Technology

 Goals: Intelligence, Integration, Interoperability
 Semantic Technology relates machine-processable semantics to digital information resources.
     Machine-processable semantics: ontology, metadata, controlled vocabulary
     Digital information resources: Web resources, services, images, audio/video, documents
Web Technology


 Classic Web: Web of Documents
     HTML as document format
     HTTP URLs as globally unique IDs
     Hyperlinks to connect everything
 Web as a platform
     Programmable APIs and proprietary interfaces
     Mashups based on a fixed set of data sources
 Social Web
     Connecting human beings
 Web of machine-processable data
     Common vocabularies: metadata and ontology
     Query and reasoning
 Web of Services
     Internet of Services
     Internet of Things
Semantic Web

 Standardizations
  Trio of Semantic Web
       Metadata / Ontology: RDF, RDFS, OWL
       Query Language: SPARQL
       Rule Language: RIF (SWRL)
  SKOS, RDFa, GRDDL, WSMO,…
  SOAP/ REST

 Tools and Systems
  Authoring, Reasoning Engines,…
  835 items in Sweet Tools

 Best Practices
    Linked Open Data
    Semantic MediaWiki
    NEPOMUK, SIOC, Garlik
    W3C Semantic Web Use cases


        Sweet Tools: http://www.mkbergman.com/new-version-sweet-tools-sem-web/
        W3C Semantic Web Case Studies and Use Cases: http://www.w3.org/2001/sw/sweo/public/UseCases/
Semantic Applications




Semantic Wave 2008, Industry Roadmap to Web 3.0, Project10X

                                                                  http://www.mkbergman.com/new-version-sweet-tools-sem-web/

Web 2.0

 Reshaping the way we view the Web
    Web as the platform
    Web as the social media
    Web as the collaboration tool
    Web as ……

 Web 2.0 Manifestation
    Openness / Sharing
    Participation / Collaboration

 Web 2.0 Syndrome
    Library 2.0
    Government 2.0
    Enterprise 2.0
    ……

 New Web applications
    wiki, blog, RSS,…

Web 2.0 Developers




Semantic Web Today



                        Major future issues:

                         •   Vocabularies
                         •   Scalability
                         •   Provenance
                         •   Personal Infospheres
                         •   Mobile and Real World Networks




Web 2.0 APIs Today


No Single global space:               Web APIs slice the Web into Walled Gardens.

 • Mashups of APIs are proprietary.
 • No links between data.


          MashUp




  Web      Web      Web
  API      API      API



    A        B        C




                                                        Christian Bizer: Pay-as-you-go Data Integration (21/9/2010)

The Web is Dead??




 http://www.wired.com/magazine/2010/08/ff_webrip/
Long Live the Web !




http://www.scientificamerican.com/article.cfm?id=long-live-the-web
Lessons Learned
 Data is more important than API code.
    Data is the Intel Inside.
    Open data is more important than open source
 Structured data is more valuable than unstructured.
    We should seek to structure our data well.
    Metadata will play a core role in structuring data.
 A little semantics goes a long way.
    Be aware of the usefulness of shallow ontologies, as shown in LOD.
 Linking data and services is essential.
    Link everything.
 Rich user experiences are the key to adoption.
    We should consider mobile computing and personalization.
    Visualize and navigate.



Semantic Web &
Linked Data
Web of Documents
 A global file system of documents (document silos on the
  Web).
 Implicit semantics of content and links
 Designed for human consumption
 Disconnected data




Architecture: Web of Documents

 [Diagram: Web browsers and search engines access HTML documents through HTTP URLs; the documents are connected by hyperlinks (document links) and sit on top of the separate databases DB-A, DB-B and DB-C.]

 Analogy: a global file system
 Designed for: human consumption
 Primary objects: documents
 Links between: documents (or sub-parts of documents)
 Degree of structure in objects: fairly low
 Main usage: search and browsing
 Semantics of content and links: implicit
Machine-Processible Data
 Web of Documents: information resources published as documents (human-processable)
 Web of Data: information resources published as data, for example from databases (machine-processable)

 Open the data silos and get rid of the repository-centric mindset
 Publish data of public interest on the Web
 In a way that other applications can access and interpret the data
 Using common Web technologies
Semantic Web: Web of Data
 The vision of a Semantic Web:
     building a global Web of machine-readable data
     Berners-Lee, Hendler & Lassila, 2001; Marshall & Shipman, 2003

The first step is putting data on the Web in a form that machines can
naturally understand, or converting it to that form. This creates what I call a
Semantic Web - a web of data that can be processed directly or indirectly by
machines. Therefore, while the Semantic Web, or Web of Data, is the goal or
the end result of this process, Linked Data provides the means to reach that
goal. -- Tim Berners-Lee, et al., http://linkeddata.org/docs/ijswis-special-issue, Jan, 2009


 Linked Data Foundation
     can lower the barrier to reuse, integration and application of data from multiple,
      distributed and heterogeneous sources.
     the more sophisticated proposals associated with the Semantic Web vision,
      such as intelligent agents, may become a reality.



Linked Data: Web of Data
 Goal: Web-scale Data Integration
     Alternative to classic data integration systems in order to cope with the growing
      number of data sources.
     Querying across data sources
 Global distributed database
     Extend the Web with a single global data space
     Giant Global Graph (GGG)
 Demonstrate the possibility of the Semantic Web
     By using RDF to publish structured data
     By setting links between data
 [Figure: interlinked RDF data sets forming a single universal information space]
Architecture: Linked Data

 [Diagram: Linked Data browsers, Linked Data mashups and search engines access RDF triples through HTTP URIs; the triples are connected by RDF links (data links) and sit on top of the databases DB-A, DB-B and DB-C.]

 Analogy: a global database
 Designed for: machines first, humans later
 Primary objects: things (or descriptions (data) of things)
 Links between: things
 Degree of structure in (descriptions of) things: high
 Main usage: query, navigation and reasoning
 Semantics of content and links: explicit
Linked Data Principles
Set of best practices for publishing structured data on the Web in accordance with
the general architecture of the Web.

 Use URIs as names for things, not just for documents or homepages.
 Use HTTP URIs so that people can look up those names.
 When someone looks up a URI, provide useful RDF information.
 Include RDF statements that link to other URIs so that they can discover
  related things.
 [Figure: things named by HTTP URIs, each described by RDF triples and connected to other URIs by RDF links]
Linked Open Data


 Community effort to
      publish existing open license datasets as Linked Data on the Web
      interlink things between different data sources
      develop clients that consume Linked Data from the Web
      began early 2007
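
The interlinking step above can be illustrated with a minimal Python sketch. It assumes two tiny data sets given as dictionaries from entity URI to label and emits owl:sameAs links when the normalised labels agree. The example.org URIs, the function names and the label-matching rule are hypothetical; real link discovery uses much richer comparisons.

```python
# Toy link discovery: emit owl:sameAs links between two data sets
# whose entries carry the same normalised label.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def normalise(label):
    """Lower-case and collapse whitespace so trivially different labels match."""
    return " ".join(label.lower().split())


def same_as_links(source, target):
    """source/target map entity URI -> label; returns a sorted list of link triples."""
    by_label = {}
    for uri, label in target.items():
        by_label.setdefault(normalise(label), []).append(uri)
    links = []
    for uri, label in source.items():
        for match in by_label.get(normalise(label), []):
            links.append((uri, OWL_SAME_AS, match))
    return sorted(links)


# Hypothetical example data (the example.org URIs are placeholders).
dataset_a = {
    "http://example.org/a/berlin": "Berlin",
    "http://example.org/a/seoul": "Seoul",
}
dataset_b = {
    "http://example.org/b/city/1": "  berlin ",
    "http://example.org/b/city/2": "Paris",
}
links = same_as_links(dataset_a, dataset_b)
```

Here `links` holds a single triple connecting the two Berlin entries; Seoul and Paris stay unlinked because no label matches.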




LOD Data sets on the Web
 25 billion RDF triples, which are interlinked by around 395 million RDF links (Sep. 2010).




                                      http://richard.cyganiak.de/2007/10/lod/lod-datasets_2010-09-22_colored.svg

Summary: Web of Linked Data
 A global, distributed database built on a simple set of
  standards
    RDF, URI, HTTP
 Explicit semantics of content and links
 Resources are connected by semantic links.
    creating a single global data graph that spans data sources
    enables the discovery of new data sources
 Provides for data co-existence
    Anyone can publish data to the Web of Linked Data
    Data publishers are not constrained in choice of vocabularies with
     which to represent data.
 Designed for computers first, humans later


Data.Gov




Europeana
 Europeana, the European digital library: this European Commission initiative
 encompasses not only libraries but also museums, archives and other holders of cultural
 heritage material.




http://version1.europeana.eu/web/europeana-project


Linked Library Cloud
 Libraries have been producing metadata for ages.
 Libraries (often) produce high-quality metadata.
 Libraries develop and use many metadata standards, such as DC, SKOS,
  BIBO and OAI-ORE, as well as MARC 21, MODS, FRBR, ...
 Integrate library catalogues on a global scale




                                               http://code4lib.org/conference/2010/singer


Linking Open Drug Data
 Linking the various sources of drug data together to answer
  interesting scientific and business questions.
     Survey publicly available data sets about drugs
     Publish and interlink these data sets on the Web
     Explore interesting questions that could be answered if the data sets
      are linked.
 8 million RDF triples, which are interlinked by more than
  370,000 RDF links (as of August 2009)



BBC Semantic Project
 Publish program / music data as RDF/XML or RDFa
 Build semantically linked and annotated web pages about artists and
  singers whose songs are played on BBC radio stations.
 Pages and data are semantically interconnected.




DBpedia Mobile
 Show map with information about nearby locations
 Linked data browser
 GPS + Google Maps + DBpedia + Flickr + Revyu




Attention by Search Engines
 Yahoo!
    crawls Linked Data in its RDFa serialization as well as Microformats
    uses Yahoo SearchMonkey to make search results more useful and visually
     appealing
    provides access to crawled data through the Yahoo BOSS API


 Google
    uses the Social Graph API
    is developing Google Squared and Google Fusion Tables
    acquired Metaweb
        which manages Freebase, a DBpedia/YAGO competitor


           Rich Snippets


Linked Open Commerce




Design and Publication
of
Linked Data
9 Steps to publishing Linked Data




   1. Understand the principles
   2. Set up your infrastructure for Linked Data
   3. Understand your data
   4. Create vocabularies
   5. Choose URIs for things in your data
   6. Triplify data sets
   7. Link to other data sets
   8. Describe your data sets
   9. Publicize your data sets
1. Understand Linked Data
   • Principle
   • Core Stack
   • Data Modeling
Linked Data: Overview
 Benefits of Linked Data: it enables web-scale, distributed data
  publication with web-based discovery mechanisms.
 Linked Data Web Resources are generic real-world data
  objects or entities:
    People, Places, and other physical things
    Abstract concepts (e.g., emotion, notion,…)
    Subject matter (e.g., science, economics, arts,…)
 Linked Data is not just structured data published on the
  Web.
 Linked Data is based on well-established Web standards
 Linked Data adds value: less redundancy, greater
  discoverability, network effects.


Linked Data Principles (TimBL, 2006)

 Use URIs as names for things
    not just for documents
        http://dbpedia.org/resource/ontology
    you are not your homepage
        http://mentalist.com/actor/patrick_jane
 Use HTTP URIs
    globally unique names, distributed ownership
    allows people to look up those names
 Provide useful information in RDF
    when someone looks up a URI
 Include RDF links to other URIs
    to enable discovery of related information
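
The first two principles can be checked mechanically. The following Python sketch, using only the standard library, tests whether a name is an HTTP URI that a client could look up. Accepting `https` alongside `http` is an assumption made here, and the function name is hypothetical.

```python
from urllib.parse import urlsplit


def is_http_uri(name):
    """True if `name` is an absolute http(s) URI with a host part."""
    parts = urlsplit(name)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# Names taken from the slide, plus two counter-examples.
candidates = [
    "http://dbpedia.org/resource/ontology",
    "http://mentalist.com/actor/patrick_jane",
    "urn:isbn:0261102214",  # a URI, but not one that HTTP can dereference
    "patrick_jane",         # a bare local name
]
lookup_ready = [name for name in candidates if is_http_uri(name)]
```

Only the two `http://` names end up in `lookup_ready`. The check says nothing about whether the server actually returns useful RDF, which is what the third principle asks for.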



5 Star rating

★ On the web, open licensed: available on the web (in whatever
format), but with an open license

★★ Machine-readable data: available as machine-readable
structured data (e.g. Excel instead of an image scan of a table)

★★★ Non-proprietary format (e.g. CSV instead of Excel)

★★★★ RDF standards: use open standards from W3C (RDF and SPARQL)
 to identify things, so that people can point at your stuff

★★★★★ Linked RDF: link your data to other people's data to provide
 context
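
The five levels above are cumulative: a data set earns a star only if it also meets every lower level. A minimal Python sketch of that rule follows; the function and parameter names are hypothetical.

```python
def star_rating(open_license_on_web, machine_readable, non_proprietary,
                rdf_standards, linked_to_other_data):
    """Count stars from the bottom up, stopping at the first unmet level."""
    levels = [
        open_license_on_web,   # 1 star
        machine_readable,      # 2 stars
        non_proprietary,       # 3 stars
        rdf_standards,         # 4 stars
        linked_to_other_data,  # 5 stars
    ]
    stars = 0
    for met in levels:
        if not met:
            break
        stars += 1
    return stars


# A CSV file under an open license: open, structured, non-proprietary, no RDF yet.
csv_dataset = star_rating(True, True, True, False, False)
```

With these inputs `csv_dataset` is 3. Setting a higher flag without the lower ones does not raise the rating.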




Linked Data Core Stack




                                                                  http://linkeddata-specs.info/

 RFC 2616 Hypertext Transfer Protocol -- HTTP/1.1
   • Defines HTTP, a generic and stateless application-level protocol for distributed,
     collaborative, hypermedia information systems.

 RFC 3986 Uniform Resource Identifier (URI): Generic Syntax
   • Defines a generic URI syntax and a process for resolving URI references that
     might be in relative form, along with guidelines and security considerations for the use of URIs
     on the Internet.

 RDF Concepts and Abstract Syntax
   • Defines the RDF graph data model and key concepts.

 SPARQL Query Language for RDF
   •   Defines the syntax and semantics of the SPARQL query language for RDF.
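
To give a feel for what a SPARQL engine does with a basic graph pattern, here is a toy Python matcher over an in-memory list of triples. It is not SPARQL and implements none of its syntax. Terms starting with `?` are treated as variables, the `example.org` URIs are placeholders, and the `dbp-prop` property URIs follow the prefix used elsewhere in this deck.

```python
def match(graph, pattern):
    """Return variable bindings for one triple pattern ('?x' marks a variable)."""
    results = []
    for triple in graph:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable repeated within a pattern must bind to the same value.
                if binding.get(term, value) != value:
                    break
                binding[term] = value
            elif term != value:
                break
        else:
            results.append(binding)
    return results


def query(graph, patterns):
    """Join several triple patterns, carrying bindings from one to the next."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for known in bindings:
            bound = tuple(known.get(term, term) for term in pattern)
            for found in match(graph, bound):
                extended.append({**known, **found})
        bindings = extended
    return bindings


BOOK = "http://example.org/isbn/46316"        # placeholder URI
PERSON = "http://example.org/person/tolkien"  # placeholder URI
TITLE = "http://dbpedia.org/property/title"
AUTHOR = "http://dbpedia.org/property/author"
NAME = "http://dbpedia.org/property/name"

graph = [
    (BOOK, TITLE, "The Lord of the Rings"),
    (BOOK, AUTHOR, PERSON),
    (PERSON, NAME, "J.R.R. Tolkien"),
]
rows = query(graph, [("?book", AUTHOR, "?who"), ("?who", NAME, "?name")])
```

The two patterns share `?who`, so the second one is only matched against the author found by the first. A real engine adds indexes, typed literals, filters and much more.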


Core Technology
 Uniform Resource Identifier (URI)
    Names (identifiers) for resources in an open Web environment
 Resource Description Framework (RDF)
    a model for representing metadata on the web
    triple structure
 RDF Schema and OWL
    languages for defining vocabularies
 RDF/XML, N3, Turtle,…
    serialization and de-serialization of RDF triples for exchanging RDF
     data
 Simple Knowledge Organization System (SKOS)
    a language for describing controlled vocabularies
 SPARQL
    a query language and protocol for accessing RDF data via the Web
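
Serialization can be illustrated with N-Triples, the simplest line-based RDF syntax (one triple per line, URIs in angle brackets, literals in double quotes). The Python sketch below is a simplified writer under stated assumptions: objects are tagged as `("uri", value)` or `("literal", value)`, only plain string literals are handled, and the helper names and `example.org` URI are hypothetical.

```python
def escape_literal(text):
    """Escape the characters that may not appear raw inside a quoted literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r")
                .replace("\t", "\\t"))


def to_ntriples(triples):
    """Serialise (subject_uri, predicate_uri, (kind, value)) tuples, one per line."""
    lines = []
    for subject, predicate, (kind, value) in triples:
        if kind == "uri":
            obj = f"<{value}>"
        else:
            obj = f'"{escape_literal(value)}"'
        lines.append(f"<{subject}> <{predicate}> {obj} .")
    return "\n".join(lines)


document = to_ntriples([
    ("http://example.org/isbn/46316",
     "http://dbpedia.org/property/title",
     ("literal", "The Lord of the Rings")),
    ("http://example.org/isbn/46316",
     "http://www.w3.org/2004/02/skos/core#subject",
     ("uri", "http://example.org/category/English_novels")),
])
```

The same triples could equally be written as RDF/XML or Turtle; the graph stays the same and only the surface syntax changes.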


Linked Data Modeling

                     Data Modeling                            Data Linking

               RDF data model to publish              RDF links to interlink data
               structured data on the Web             from different data sources

RDF triple: subject, predicate, and object
  Subject: URI identifying the described resource
  Predicate: URI identifying the relation that exists between subject and object
    predicates are taken from vocabularies, collections of URIs that can be used to represent
     information about a certain domain
  Object: a simple literal value, or the URI of another resource that is related to the subject
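
The triple structure described above can be modelled directly. In this Python sketch a graph is simply a set of triples, so merging two sources is a set union, and the RDF links are the triples whose object is a URI rather than a literal. The class names and the `example.org` URIs are hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class Triple:
    subject: URI
    predicate: URI
    object: Union[URI, Literal]


def outgoing_links(graph, subject):
    """URIs that `subject` points to: the RDF links a client could follow next."""
    return {t.object for t in graph
            if t.subject == subject and isinstance(t.object, URI)}


book = URI("http://example.org/isbn/46316")
publisher = URI("http://example.org/publisher/allen-unwin")
title = URI("http://dbpedia.org/property/title")
published_by = URI("http://dbpedia.org/property/publisher")
city = URI("http://dbpedia.org/property/city")

source_a = {
    Triple(book, title, Literal("The Lord of the Rings")),
    Triple(book, published_by, publisher),
}
source_b = {
    Triple(book, published_by, publisher),  # same statement, second source
    Triple(publisher, city, Literal("London")),
}
merged = source_a | source_b
```

Because both sources use the same URIs, the duplicate statement collapses and `merged` contains three triples. Frozen dataclasses are used so that a URI and a literal with the same text never compare equal.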




Linked Data Model

 [Figure: RDF graph describing a book, with links reaching into other data sets]
     Book resource: http://.../isbn/46316
     Properties shown on the edges: dbp-prop:title, dbp-prop:author, dbp-prop:publisher, skos:subject, dbp-prop:name, foaf:homepage, dbp-prop:city, opencyc:headquarter, fb:creator, fb:street_address
     Values and resources shown as nodes: The Lord of the Rings, English novels, J.R.R. Tolkien, wkp-en:J.R.R.Tolkien, dbpedia:Allen&Unwin, London, fb:guid…..92df7, Marivie, 83 Alexander St
 Flexible graph-based model: RDF graph
 The HTTP protocol brings together identification and retrieval again.
 Deeper into the Web
 URI: global primary key
     skos:subject = http://www.w3.org/2004/02/skos/core#subject
     dbp-prop:title = http://dbpedia.org/property/title
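
The prefixed names on this slide are shorthand for full URIs. A small Python sketch of the expansion and its inverse is given below; only the two prefix mappings stated on the slide are included, and the function names are hypothetical.

```python
PREFIXES = {
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "dbp-prop": "http://dbpedia.org/property/",
}


def expand(name, prefixes=PREFIXES):
    """Turn 'prefix:local' into a full URI; unknown prefixes raise KeyError."""
    prefix, separator, local = name.partition(":")
    if not separator or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {name!r}")
    return prefixes[prefix] + local


def compact(uri, prefixes=PREFIXES):
    """Turn a full URI back into 'prefix:local' when a namespace matches."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            return f"{prefix}:{uri[len(namespace):]}"
    return uri


subject_uri = expand("skos:subject")
title_uri = expand("dbp-prop:title")
```

Both results match the expansions printed on the slide, and `compact` maps them back to the short form.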


2. Setup Infrastructure
    • Basic Infrastructure
    • Systems and Tools




Basic Infrastructure
 [Diagram: layered infrastructure for Linked Data, listed in the order shown]
     RDF Triple Base
         inputs: data/content (extraction, link generation) and DB (conversion)
         storage: triple store, index
         access: SPARQL query engine
         functions: packaging, search, discovery, navigation
     Interface: framework + APIs
     Delivery: Web server (Apache)
     Application: browser, navigator, search
Infrastructure Construction
 Configuration of Web server
    Configure the server to send the correct MIME type: application/rdf+xml
    Code samples for ConNeg (content negotiation) and 303 redirects:
     http://linkeddata.org/tools
    Use cURL (http://curl.haxx.se/) to check the responses of the configured Apache server
    Configure for hash URIs or slash URIs

 Testing your content negotiation
    Install the LiveHTTPHeaders and Modify Headers extensions for
     Firefox
    Try LiveHTTPHeaders against my URI
        http://www.skyhigh.com/id/hong
        do the same with URIs from other data sets
    Modify your headers to ask for application/rdf+xml
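
The content negotiation and 303 redirect behaviour tested above can be sketched as a pure Python function, independent of any web server. This is a simplified illustration: it ignores wildcards such as `*/*`, and the `/data/` and `/page/` paths are hypothetical choices (only the `/id/hong` path comes from the slide).

```python
def preferred_type(accept_header, offered=("text/html", "application/rdf+xml")):
    """Pick the offered media type with the highest q-value in an Accept header."""
    best, best_q = None, 0.0
    for item in accept_header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type, q = parts[0], 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        if media_type in offered and q > best_q:
            best, best_q = media_type, q
    return best or offered[0]  # fall back to HTML when nothing matches


def respond(path, accept_header):
    """Return (status, headers) for a request to a slash URI under /id/."""
    if not path.startswith("/id/"):
        return 404, {}
    name = path[len("/id/"):]
    if preferred_type(accept_header) == "application/rdf+xml":
        location = f"/data/{name}.rdf"   # hypothetical path for the RDF document
    else:
        location = f"/page/{name}.html"  # hypothetical path for the HTML page
    # 303 See Other: the thing itself is not a document, so point to one about it.
    return 303, {"Location": location, "Vary": "Accept"}


rdf_response = respond("/id/hong", "application/rdf+xml")
html_response = respond("/id/hong", "text/html,application/xhtml+xml;q=0.9")
```

The same URI yields a redirect to the RDF document for a data client and to the HTML page for a browser, which is what the header-modifying test on this slide is meant to reveal.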



Supporting Technologies
 Linked Data Browsers
    provide for navigating between data sources and for exploring the dataspace.
    Tabulator Browser (MIT, USA), Marbles (FU Berlin, DE), OpenLink RDF
     Browser (OpenLink, UK), Zitgist RDF Browser (Zitgist, USA), Disco
     Hyperdata Browser (FU Berlin, DE), Fenfire (DERI, Ireland)
 Web of Data Search Engines
    crawl the data space and provide best-effort query answers over crawled data.
    Falcons (IWS, China), Sig.ma (DERI, Ireland), Swoogle (UMBC, USA),
     VisiNav (DERI, Ireland), Watson (Open University, UK), TAP, Sindice




Supporting Technologies
 Describing data set
    discovery and usage of linked datasets
    voiD, Ding
 Registry
    an open registry of data and content packages
    CKAN
 Linking tool
    discovering relationships between data items within different Linked Data
     sources
    Silk
 Mapping tool
    mapping database to RDF triples
    Triplify, D2R Server
 LOD platform
    D2R Server, Virtuoso Universal Server,
     Talis Platform, Pubby, …
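
The idea behind the mapping tools listed above, turning relational rows into RDF triples, can be sketched with Python's built-in `sqlite3` module. This is not how Triplify or D2R Server are configured; it only shows the common convention of row to subject URI, column to predicate URI and cell to literal. The `example.org` namespace, table and function name are hypothetical.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace for minted URIs


def rows_to_triples(conn, table, key_column):
    """Map every row of `table` to (subject, predicate, literal) triples.

    `table` is interpolated into the SQL text, so it must be a trusted name.
    """
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    triples = []
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}/{record[key_column]}"
        for column, value in record.items():
            if column != key_column and value is not None:  # skip key and NULLs
                triples.append((subject, f"{BASE}vocab/{table}#{column}", str(value)))
    return triples


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, city TEXT)")
conn.execute("INSERT INTO book VALUES (46316, 'The Lord of the Rings', NULL)")
triples = rows_to_triples(conn, "book", "id")
```

The single row yields one triple for the title; the NULL city produces nothing, since RDF has no direct counterpart to a missing value.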



3. Understand Data to be published
   • Review about Data to be published
   • Requirement analysis




Review about Data to be published
 What
    think about the key things to be presented in Linked Data
    analysis of data properties
    What vocabularies can be used to describe these?

 Why
    purposes and goals of linked data to be published

 What for
    how to use and apply linked data (use cases)

 How to serve
    Serving Linked Data as Static RDF/XML Files
    Serving Linked Data as RDF Embedded in HTML Files
    Serving RDF and HTML with Custom Server-Side Scripts

    Serving Linked Data from Relational Databases
    Serving Linked Data from RDF Triple Stores
    Serving Linked Data by Wrapping Existing Application or Web APIs




4. Create Vocabularies
    • Vocabulary Creation
    • Common Namespace
    • Definition




Guideline for Vocabulary Creation

 Do not define new vocabularies from scratch; complement existing
  vocabularies with additional terms (in your own namespace) to represent your
  data as required.
 Provide for both humans and machines. Use rdfs:comment for each term
  invented. Always provide a label for each term using the rdfs:label property.
 Make term URIs de-referenceable following the W3C Best Practice Recipes
  for Publishing RDF Vocabularies.
 Make use of other people's terms. Use them directly, or provide
  mappings to them by means of rdfs:subClassOf or rdfs:subPropertyOf.
 State all important information explicitly. For example, state all ranges and
  domains explicitly.
 Do not create over-constrained, brittle models; leave some flexibility for
  growth. Full-featured OWL definitions can have unintended consequences when
  others reuse your terms, so unless you know exactly what you are doing,
  use RDF Schema to define vocabularies.


Potential Ontologies / Vocabularies

 Friend-of-a-Friend (FOAF), vocabulary for describing people.
 Dublin Core (DC) defines general metadata attributes. See also their new
  domains and ranges draft.
 Semantically-Interlinked Online Communities (SIOC), vocabulary for
  representing online communities.
 Description of a Project (DOAP), vocabulary for describing projects.
 Simple Knowledge Organization System (SKOS), vocabulary for
  representing taxonomies and loosely structured knowledge.
 Music Ontology provides terms for describing artists, albums and tracks.
 Review Vocabulary, vocabulary for representing reviews.
 Creative Commons (CC), vocabulary for describing license terms
 Geo, vocabulary for describing geographical locations
 GoodRelations, vocabulary for describing products



Common Namespaces

xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
xmlns:owl="http://www.w3.org/2002/07/owl#"
xmlns:dc="http://purl.org/dc/terms/"
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:vcard="http://www.w3.org/2006/vcard/ns#"
xmlns:dbp="http://dbpedia.org/dbprop/"
xmlns:geo="http://www.geonames.org/ontology#"
xmlns:gr="http://purl.org/goodrelations/v1#"
xmlns:commerce="http://search.yahoo.com/searchmonkey/commerce/"
xmlns:media="http://search.yahoo.com/searchmonkey/media/"
xmlns:cb="http://cb.semsol.org/ns#"


More Common Namespaces:
http://esw.w3.org/TaskForces/CommunityProjects/LinkingOpenData/CommonVocabularies
http://www-958.ibm.com/software/data/cognos/manyeyes/visualizations/100-most-popular-rdf-namespaces




Definition of Vocabulary

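@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# Definition of the class "Lover"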
<http://sites.movie.org/pub/LoveVocabulary#Lover>
         rdf:type rdfs:Class ;
         rdfs:label "Lover"@en ;
         rdfs:label "Liebender"@de ;
         rdfs:comment "A person who loves somebody."@en ;
         rdfs:comment "Eine Person die Jemanden liebt."@de ;
         rdfs:subClassOf foaf:Person .

# Definition of the property "loves"
<http://sites.movie.org/pub/LoveVocabulary#loves>
         rdf:type rdf:Property ;
         rdfs:label "loves"@en ;
         rdfs:label "liebt"@de ;
         rdfs:comment "Relation between a lover and a loved person."@en ;
         rdfs:subPropertyOf foaf:knows ;
         rdfs:domain <http://sites.movie.org/pub/LoveVocabulary#Lover> ;
         rdfs:range foaf:Person .
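
The vocabulary guidelines (label, comment, explicit domain and range) can be checked mechanically. The following is a minimal sketch, assuming the vocabulary is held in a plain Python dictionary; the `VOCABULARY` structure and the `missing_annotations` helper are hypothetical and not part of any RDF library.

```python
# Hypothetical in-memory copy of the LoveVocabulary terms (not an RDF library API).
NS = "http://sites.movie.org/pub/LoveVocabulary#"
VOCABULARY = {
    NS + "Lover": {
        "rdf:type": "rdfs:Class",
        "rdfs:label": ["Lover@en", "Liebender@de"],
        "rdfs:comment": ["A person who loves somebody.@en"],
    },
    NS + "loves": {
        "rdf:type": "rdf:Property",
        "rdfs:label": ["loves@en", "liebt@de"],
        "rdfs:comment": ["Relation between a lover and a loved person.@en"],
        "rdfs:domain": NS + "Lover",
        "rdfs:range": "foaf:Person",
    },
}


def missing_annotations(vocabulary):
    """Return {term URI: [missing properties]} for terms that break the guidelines."""
    problems = {}
    for term, props in vocabulary.items():
        # Every term needs a label and a comment.
        required = ["rdfs:label", "rdfs:comment"]
        # Properties should also state domain and range explicitly.
        if props.get("rdf:type") == "rdf:Property":
            required += ["rdfs:domain", "rdfs:range"]
        missing = [p for p in required if not props.get(p)]
        if missing:
            problems[term] = missing
    return problems


print(missing_annotations(VOCABULARY))
```

An empty result means every term carries the annotations the guidelines ask for.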

Tools for Vocabulary Definition
 Ontology editors
    Protégé:
        an open-source ontology editor with a dedicated OWL plug-in
    Neologism:
        Web-based tool for creating, managing and publishing simple RDFS
         vocabularies.
        open-source and implemented in PHP on top of the Drupal platform.
    TopBraid Composer:
        a powerful commercial modeling environment for developing Semantic
         Web ontologies
    NeOn Toolkit:
        an open-source ontology engineering environment with an extensive set of
         plug-ins.




5. Choose URIs
   •   Resource Identification
   •   Types of URIs
   •   De-Referencing
   •   Common URI Patterns




Resource Identification
 Separation of Identity and Representation
 Identity
     The identity (URI) of an object or entity should be unambiguous and globally unique
 Representation
     On the Web, a URI should provide an unambiguous data access path
 Access
     Reference to abstract (physically inaccessible) objects or entities is only
      achievable via conduit documents that carry representations of entity
      descriptions (which at best are facets of an entire description)

 URI Requirements:
       Keep out of other people's namespaces
       Use a namespace that you control
       Abstract away from implementation details (Short is better…)
       Stable and persistent
       Hash or Slash
       Use common URI patterns


URI
 URI: Uniform Resource Identifier
    http://www.example.com/people/alice
    Does this identify a home page (Web document) or an information object?

 URIs identify people, products, places, ideas and concepts such as
  ontology classes, as well as Web documents (URLs).

 Two approaches
    hash URI
    slash URI

Hash / Slash URI
 Hash URI
    URIs can contain a fragment, a special part that is separated from the
     rest of the URI by a hash symbol (“#”).
        http://www.example.com/products/BiBimBab#this
        http://www.travel.com/nation/Korea/KyungJu#main
    Simply publish a description document containing RDF about the things
     at the base URI (the URI without the fragment).
 Slash URI
    Examples:
        http://www.example.com/products/BiBimBab
        http://www.travel.com/nation/Korea/KyungJu
    You must publish your description document at another, distinct URI.
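
The difference between the two styles shows up in what an HTTP client actually requests. The following is a small sketch using Python's standard `urllib.parse.urldefrag`; the `document_url` helper name is an assumption made for illustration.

```python
from urllib.parse import urldefrag


def document_url(thing_uri):
    """Return the URL an HTTP client requests: the URI with any fragment removed."""
    url, _fragment = urldefrag(thing_uri)
    return url


hash_uri = "http://www.example.com/products/BiBimBab#this"
slash_uri = "http://www.example.com/products/BiBimBab"

# The hash URI and its description document differ only by the fragment.
print(document_url(hash_uri))
# A slash URI is requested unchanged, so the server has to redirect elsewhere.
print(document_url(slash_uri))
```

With a hash URI the client strips the fragment by itself, so the thing and its document get distinct URIs without any server-side redirect.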




hash URI

 Entity (GilDong): http://www.skyhigh.com/person/GilDong#this
 Document describing the entity: http://www.skyhigh.com/person/GilDong
    Metadata: content-type: application/xhtml+xml
    Data: <html xmlns="..."> <head> <title> Our hero… </html>
 Separating identification and naming from representation




slash URI

 Entity (GilDong): http://www.skyhigh.com/person/hero/GilDong/id
 HTML representation: http://www.skyhigh.com/person/hero/GilDong/page
    Metadata: content-type: application/xhtml+xml
    Data: <html xmlns="..."> <head> <title> Our hero… </html>
 RDF representation: http://www.skyhigh.com/person/hero/GilDong/data
    Metadata: content-type: application/rdf+xml
 Separating identification and naming from representation


Slash vs. Hash

 Slash URI
    HTTP redirection (a 30X response) is required in order for resource "identity" to be
     separated from "representation":
    http://www.skyhigh.com/person/hero/GilDong/id (URI of the Entity)
    http://www.skyhigh.com/person/hero/GilDong/page (HTML representation of the
     Entity description)
    http://www.skyhigh.com/person/hero/GilDong/data (RDF representation that
     describes the Entity; the serialization could be Turtle, N3, RDF/XML, etc.)
 Hash URI
    HTTP redirection is not required in order for resource "identity" to be separated from
     "representation":
    http://demo.openlinksw.com/Northwind/Customer/ALFKI#this (URI of an
     Organization Entity)
    http://demo.openlinksw.com/Northwind/Customer/ALFKI (a document: HTML,
     Turtle, N3, or RDF/XML representation of the Entity description)




DeReferencing Hash URI

 Without content negotiation
    ID: http://www.example.com/about#alice
    automatic truncation of fragment
    http://www.example.com/about (RDF)

 With content negotiation
    ID: http://www.example.com/about#alice
    automatic truncation of fragment
    http://www.example.com/about
    content negotiation
        application/rdf+xml wins: http://www.example.com/about.rdf (RDF)
        text/html wins: http://www.example.com/about.html (HTML)



DeReferencing Slash URI

 One generic document
    ID: http://www.example.com/id/alice
    303 redirected to the generic document http://www.example.com/doc/alice
    content negotiation
        application/rdf+xml wins: http://www.example.com/doc/alice.rdf (RDF)
        text/html wins: http://www.example.com/doc/alice.html (HTML)

 Different documents
    ID: http://www.example.com/id/alice
    303 redirected with content negotiation
        application/rdf+xml wins: http://www.example.com/doc/alice.rdf (RDF)
        text/html wins: http://www.example.com/doc/alice.html (HTML)
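
The "different documents" recipe can be sketched as a pure function that maps a request to a response. This is a simplified simulation and not a real server: the `dereference` name, the `/id/` and `/doc/` path layout (taken from the example URIs), and the plain substring check on the Accept header are all assumptions made for illustration.

```python
from typing import Tuple


def dereference(path: str, accept: str) -> Tuple[int, str]:
    """Simulate the server side: return (HTTP status, Location or served path)."""
    if path.startswith("/id/"):
        # A thing URI cannot be served directly: answer 303 See Other and
        # choose the target document from the Accept header.
        name = path[len("/id/"):]
        extension = "rdf" if "application/rdf+xml" in accept else "html"
        return 303, "/doc/" + name + "." + extension
    if path.startswith("/doc/"):
        # Document URIs are ordinary Web documents.
        return 200, path
    return 404, path


print(dereference("/id/alice", "application/rdf+xml"))
print(dereference("/id/alice", "text/html"))
```

A client that follows the 303 then issues a second request for the document URI, which is answered with 200.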



Content Negotiation
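
In content negotiation the client states its preferred media types in the HTTP Accept header and the server picks the best representation it can offer. The following is a minimal sketch of that selection, assuming a server that offers only HTML and RDF/XML; the `pick_media_type` helper is hypothetical and ignores parts of the HTTP specification such as media type parameters other than `q`.

```python
def pick_media_type(accept_header, offered=("text/html", "application/rdf+xml")):
    """Pick the offered media type with the highest q-value, or None if none match."""
    preferences = {}
    for part in accept_header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0]
        if not media_type:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    quality = float(field[2:])
                except ValueError:
                    quality = 0.0
        preferences[media_type] = quality

    def score(media_type):
        # Exact match first, then "type/*", then "*/*".
        if media_type in preferences:
            return preferences[media_type]
        wildcard = media_type.split("/")[0] + "/*"
        if wildcard in preferences:
            return preferences[wildcard]
        return preferences.get("*/*", 0.0)

    best = max(offered, key=score)
    return best if score(best) > 0 else None


print(pick_media_type("application/rdf+xml, text/html;q=0.5"))
print(pick_media_type("text/html,application/xhtml+xml,*/*;q=0.8"))
```

The first header is typical of a Linked Data client and the second of a Web browser.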




Common URI Pattern
http://dbpedia.org/resource/New_York_City            Thing
http://dbpedia.org/data/New_York_City                RDF data
http://dbpedia.org/page/New_York_City                HTML page

http://revyu.com/people/tom                          Thing
http://revyu.com/people/tom/about/rdf                RDF data
http://revyu.com/people/tom/about/html               HTML page

http://www.bbc.co.uk/music/artists/db4624cf#artist   Thing
http://www.bbc.co.uk/music/artists/db4624cf.rdf      RDF data
http://www.bbc.co.uk/music/artists/db4624cf.html     HTML page

http://id.dbpedia.org/Berlin                         Thing
http://data.dbpedia.org/Berlin                       RDF Data
http://page.dbpedia.org/Berlin                       HTML page


http://www4.wiwiss.fu-berlin.de/bookmashup/books/006251587X      ISBN




Choosing URI

  http://www.culture.com/LOD/{class}/{member}
  http://www.culture.com/LOD/{class}/{member}.rdf
  http://www.culture.com/LOD/{class}/{member}.html


 Examples:
    URI of an Organization Entity
      http://demo.openlinksw.com/Northwind/Customer/ALFKI/id
    HTML representation of the Entity description
      http://demo.openlinksw.com/Northwind/Customer/ALFKI/page
    RDF representation that describes the Entity (the serialization could be
     Turtle, N3, RDF/XML, etc.)
      http://demo.openlinksw.com/Northwind/Customer/ALFKI/data
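
The `{class}/{member}` pattern can be turned into a small URI-minting helper. This is a sketch only: the `mint_uris` function and the sample class and member names are assumptions, and replacing spaces with underscores is one possible convention, not a requirement.

```python
from urllib.parse import quote

BASE = "http://www.culture.com/LOD"


def mint_uris(class_name, member):
    """Build the thing, RDF and HTML URIs for one member of a class."""
    # Percent-encode anything that is not safe inside a path segment.
    slug = quote(member.replace(" ", "_"), safe="")
    thing = BASE + "/" + quote(class_name, safe="") + "/" + slug
    return {"thing": thing, "rdf": thing + ".rdf", "html": thing + ".html"}


for kind, uri in mint_uris("temple", "Bulguksa").items():
    print(kind, uri)
```

Keeping the minting logic in one place helps the URIs stay stable and free of implementation details.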




6. Triplify Data Sets
    • Publication Strategies
    • Conversion of Database




Linked Data Publication

Types of data:      Structured Data; Text

Data Preparation:   RDF-izers for CSV, XML, Excel; Entity Extractor (e.g. Calais)

Data storage:       Relational Database; Data Source with API; RDF Store; RDF files

Data Publication:   RDB-to-RDF Wrapper (e.g. D2R); CMS with RDFa Output (e.g. Drupal);
                    Custom Linked Data wrapper; Linked Data Interface (e.g. Pubby);
                    Web Server (e.g. Apache)

Result:             Linked Data on the Web


Publication Strategy

 Strategy
    From unstructured sources
        use NLP, text mining, annotation,…
        OpenCalais, Ontos
    From semi-structured sources
        DBpedia, Linked GeoData, SCOVO,…
        efficient bi-directional synchronization
    From structured sources (relational database)
        Declarative syntax and semantics of data model translation
        RDB2RDF,…




Conversion of Database
Schema
   Books(ID, Author, Title, Publisher, Year)
   Authors(ID, Name, Homepage)
   Publishers(ID, PublisherName, City)

Books
   ID                   Author    Title               Publisher   Year
   ISBN0-00-651409-X    id_xyz    The Glass Palace    id_qpr      2000

Authors
   ID        Name             Homepage
   id_xyz    Ghosh, Amitav    http://www.amitavghosh.com

Publishers
   ID        PublisherName     City
   id_qpr    Harper Collins    London


Conversion of Database




 Tools for mapping RDB to Linked Data
    D2R Server for customizable mappings from relational databases to ontologies
       [Bizer, Cyganiak 06]
    Browser-based tools for defining RDB-to-RDF mappings
       [Zhou, Xu, Chen, Idehen 08]
    Triplify [Auer, Dietzold, Lehmann, Hellmann, Aumueller 09]
    OpenLink Data Spaces [Idehen, Erling 08]
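
The idea behind RDB-to-RDF mapping can be sketched by hand for the Books and Authors tables. This is not how D2R Server or Triplify are configured; it is a plain Python illustration in which the `http://example.org/` base namespace, the choice of Dublin Core and FOAF properties, and the `to_ntriples` helper are all assumptions.

```python
DCTERMS = "http://purl.org/dc/terms/"
FOAF = "http://xmlns.com/foaf/0.1/"
BASE = "http://example.org/"  # hypothetical namespace under the publisher's control

BOOKS = [{"ID": "ISBN0-00-651409-X", "Author": "id_xyz",
          "Title": "The Glass Palace", "Publisher": "id_qpr", "Year": "2000"}]
AUTHORS = [{"ID": "id_xyz", "Name": "Ghosh, Amitav",
            "Homepage": "http://www.amitavghosh.com"}]


def iri(value):
    """Wrap a URI in N-Triples angle brackets."""
    return "<" + value + ">"


def literal(value):
    """Quote a plain literal, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def to_ntriples(books, authors):
    """Map each row to a subject URI and each column to one triple."""
    lines = []
    for row in books:
        subject = iri(BASE + "book/" + row["ID"])
        # The foreign key becomes a link to the author resource.
        author = iri(BASE + "author/" + row["Author"])
        lines.append(f"{subject} {iri(DCTERMS + 'title')} {literal(row['Title'])} .")
        lines.append(f"{subject} {iri(DCTERMS + 'creator')} {author} .")
        lines.append(f"{subject} {iri(DCTERMS + 'date')} {literal(row['Year'])} .")
    for row in authors:
        subject = iri(BASE + "author/" + row["ID"])
        lines.append(f"{subject} {iri(FOAF + 'name')} {literal(row['Name'])} .")
        lines.append(f"{subject} {iri(FOAF + 'homepage')} {iri(row['Homepage'])} .")
    return lines


print("\n".join(to_ntriples(BOOKS, AUTHORS)))
```

Primary keys become resource URIs, foreign keys become links between resources, and the remaining columns become literals.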
RDF Features Best Avoided


 Do not use the full expressivity of the RDF data model.
    Use a subset of the RDF features.
 No blank nodes.
    It is impossible to set external RDF links to a blank node.
 Do not use RDF reification.
    The semantics of reification are unclear, and reified statements are
     cumbersome to query with the SPARQL query language.
    Metadata can be attached to the information resource instead.
 Be careful before using RDF collections or RDF containers.
    They do not work well together with SPARQL.




7. Link to other Data sets
    • Types of Linking
    • Linking manually
    • Automatic generation of Link




Link ! Reuse !!
 Reuse. Do not reinvent the wheel…
    The URIs are de-referenceable.
        For instance, using the DBpedia URI http://dbpedia.org/resource/Doom to
         identify the computer game Doom gives you an extensive description of
         the game including abstracts in 10 different languages and various
         classifications.
    The URIs are already linked to URIs from other data sources.
        For instance, you can navigate from the DBpedia URI
         http://dbpedia.org/resource/Innsbruck to data about Innsbruck provided by
         Geonames and EuroStat.
        Therefore, by using concept URIs from these datasets, you interlink your
         data with a rich and fast-growing network of other data sources.




Types of Linking to other Data Sets
 Relationship Links
    point at related things in other data sources, for instance, other people, places or
     genes.
        <http://www.skyhigh.com/people/GilDong>
         rdf:type foaf:Person ;
         foaf:name "Hong, Gil-Dong" ;
         foaf:based_near <http://dbpedia.org/resource/Seoul> ;
         foaf:topic_interest <http://dbpedia.org/resource/Justice> ;
         foaf:knows <http://dbpedia.org/resource/HalBingDang> .
 Identity Links
    point at URI aliases used by other data sources to identify the same real-world
     object or abstract concept.
     <http://www.skyhigh.com/people/GilDong> <http://www.w3.org/2002/07/owl#sameAs>
      <http://www.korea.org/history/hero> .
 Vocabulary Links
    point to the definitions of related terms in other vocabularies
        <http://www.university.org/terms/professor>
        rdf:type rdfs:Class ;
        rdfs:subClassOf <http://dbpedia.org/ontology/Person> ;
        rdfs:subClassOf <http://sw.opencyc.org/concept/Mx4rvbGdrcN5Y29ycA> ;
        owl:equivalentClass <http://rdf.dictionary.com/entry/facultyMember> .


Link to other Data Sets
 URI aliases
    In an open environment like the Web it often happens that different
     information providers talk about the same non-information resource. As
     they do not know about each other, they introduce different URIs for
     identifying the same real-world object.
         http://dbpedia.org/resource/Berlin
         http://sws.geonames.org/2950159/
    URI aliases provide an important social function to the Web of Data as they
     are de-referenced to different descriptions of the same non-information
     resource and thus allow different views and opinions to be expressed.
    owl:sameAs

 Common Properties
    rdfs:seeAlso, foaf:knows, foaf:based_near, foaf:topic_interest,…

 Two approaches for linking data:
    RDF Links Manually
    Auto-generating RDF Links



RDF Links Manually
 Identify data sets that are suitable linking targets, then manually search
  in these for the URI references you want to link to.
 If a data source doesn't provide a search interface, you can use Linked
  Data browsers like Tabulator or Disco to explore the dataset and find
  the right URIs.

 Useful sites:
       Sindice and Falcons provide indexes to identify candidate URIs for linking.
       CKAN site : a registry of open linked data and projects.
       Uriqr - A URI Search Engine: http://dev.uriqr.com/
       Freebase: http://www.freebase.com

 MOAT: Meaning Of A Tag Framework
     For manually interlinking tags with Semantic Web URIs (such as URIs from
      DBpedia, Geonames … or any knowledge base)

 Remember that data sources might use HTTP-303 redirects to redirect
  clients from URIs identifying non-information resources to URIs
  identifying information resources that describe the non-information
  resources.
Auto-generating RDF Links
 Various approaches
    Pattern-based Algorithms
    Similarity-based Approaches
    Complex property-based Algorithms
         Yves Equivalence Miner: interlinking Jamendo and Musicbrainz.
 Equivalence Mining and Matching Frameworks
    Silk - A Link Discovery Framework for the Web of Data.
         Silk can be run on a single machine or on a Hadoop cluster (for instance
          Amazon EC2).
    LIMES - Link Discovery Framework for Metric Spaces.
         time-efficient and lossless approaches for large-scale link discovery based on
          the characteristics of metric spaces.
    DSNotify - Detecting and Fixing Broken Links in Linked Data Sets
    TopBraid Composer
         a wizard for linking ontology instances to corresponding DBpedia concepts.
    SemMF
         a flexible framework for calculating semantic similarity between objects that
          are represented as arbitrary RDF graphs.


    http://esw.w3.org/TaskForces/CommunityProjects/LinkingOpenData/EquivalenceMining
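
A similarity-based approach can be sketched with nothing more than string comparison of labels. This is an illustration of the general idea and not the algorithm used by Silk, LIMES or any other framework listed above; the `generate_links` helper, the sample labels, the local URI and the 0.9 threshold are assumptions.

```python
from difflib import SequenceMatcher

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def similarity(a, b):
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def generate_links(local, remote, threshold=0.9):
    """Emit one owl:sameAs triple per local resource whose best match is good enough."""
    links = []
    for local_uri, local_label in local.items():
        best_uri, best_score = None, 0.0
        for remote_uri, remote_label in remote.items():
            score = similarity(local_label, remote_label)
            if score > best_score:
                best_uri, best_score = remote_uri, score
        if best_uri is not None and best_score >= threshold:
            links.append(f"<{local_uri}> <{OWL_SAME_AS}> <{best_uri}> .")
    return links


local = {"http://example.org/city/berlin": "berlin"}
remote = {
    "http://dbpedia.org/resource/Berlin": "Berlin",
    "http://dbpedia.org/resource/Bern": "Bern",
}
print("\n".join(generate_links(local, remote)))
```

Label matching alone produces false positives on real data, which is why the frameworks above combine several properties and metrics.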

8. Describe Data Sets
   • Metadata for Description




Publishing Descriptions of a Data set
 Help others discover and index your data
 Apply a license or waiver to your data set

 Metadata about the published linked data set
     authorship of a data set, its currency (i.e., how recently the data set was updated),
      its provenance, and its licensing terms

 Important issues:
     Provenance:
          the ability to track the origin of data
          key component in building trustworthy, reliable applications
          Open Provenance Model
     Licenses vs. Waivers
     Norms : a means for data publishers who waive their legal rights (through
      application of a waiver) to define expectations they have about how the data is used

 Two primary mechanisms
     Semantic Sitemaps: http://sw.deri.org/2007/07/sitemapextension/
     voiD : http://semanticweb.org/wiki/VoiD
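
A voiD description is itself a small RDF document. The sketch below builds one as a Turtle string; the `void_description` helper and every URI passed to it are hypothetical, while `void:Dataset`, `void:sparqlEndpoint`, `void:exampleResource`, `dcterms:title` and `dcterms:license` are terms from the voiD and Dublin Core vocabularies.

```python
def void_description(dataset_uri, title, license_uri, sparql_endpoint, example_resource):
    """Return a minimal voiD description of a data set as Turtle text."""
    # Escape the title so it stays a valid Turtle string literal.
    safe_title = title.replace("\\", "\\\\").replace('"', '\\"')
    return "\n".join([
        "@prefix void: <http://rdfs.org/ns/void#> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "",
        f"<{dataset_uri}> a void:Dataset ;",
        f'    dcterms:title "{safe_title}" ;',
        f"    dcterms:license <{license_uri}> ;",
        f"    void:sparqlEndpoint <{sparql_endpoint}> ;",
        f"    void:exampleResource <{example_resource}> .",
    ])


# All URIs below are placeholders for illustration.
print(void_description(
    "http://www.culture.com/LOD/dataset",
    "Culture LOD",
    "http://creativecommons.org/licenses/by/3.0/",
    "http://www.culture.com/sparql",
    "http://www.culture.com/LOD/temple/Bulguksa",
))
```

Publishing such a file next to the data gives crawlers the license, an access point and a starting resource.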

Description

 Metadata
    Metadata about the published data, such as a URI identifying the author
     and licensing information.

 Description
    Triples of the data set that have the resource's URI as the subject.

 Backlinks
    Triples of the data set that have the resource's URI as the object.
    This is redundant, but it allows browsers and crawlers to traverse links
     in either direction.

 Related descriptions
    Any additional information about related resources, e.g., answering a
     request for information about a book with the author information as well.
    Take a moderate approach and do not overload the description.

 Syntax
    There are various ways to serialize RDF descriptions.
    At least provide RDF descriptions as RDF/XML, which is the only
     official syntax for RDF.
    Additionally provide descriptions in Turtle, TriX, and other syntaxes.


Data Set Description: Example
# Metadata and Licensing Information
<http://dbpedia.org/data/Alec_Empire>
     rdfs:label "RDF description of Alec Empire" ;
     rdf:type foaf:Document ;
     dc:publisher <http://dbpedia.org/resource/DBpedia> ;
     dc:date "2007-07-13"^^xsd:date ;
     dc:rights <http://en.wikipedia.org/wiki/WP:GFDL> .

# The description
<http://dbpedia.org/resource/Alec_Empire>
     foaf:name "Empire, Alec" ;
     rdf:type foaf:Person ;
     rdf:type <http://dbpedia.org/class/yago/musician> ;
     rdfs:comment
            "Alec Empire (born May 2, 1972) is a German musician who is ..."@en ;
     rdfs:comment
            "Alec Empire (eigentlich Alexander Wilke) ist ein deutscher Musiker. ..."@de ;
     dbpedia:genre <http://dbpedia.org/resource/Techno> ;
     dbpedia:associatedActs <http://dbpedia.org/resource/Atari_Teenage_Riot> ;
     foaf:page <http://en.wikipedia.org/wiki/Alec_Empire> ;
     foaf:page <http://dbpedia.org/page/Alec_Empire> ;
     rdfs:isDefinedBy <http://dbpedia.org/data/Alec_Empire> ;
     owl:sameAs <http://zitgist.com/music/artist/d71ba53b-23b0-4870-a429-cce6f345763b> .



Data Set Description: Example
# Backlinks
<http://dbpedia.org/resource/60_Second_Wipeout>
     dbpedia:producer <http://dbpedia.org/resource/Alec_Empire> .
<http://dbpedia.org/resource/Limited_Editions_1990-1994>
     dbpedia:artist <http://dbpedia.org/resource/Alec_Empire> .
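
Description and backlinks differ only in where the resource's URI appears in a triple. The following is a minimal sketch, assuming the data set is available as a list of (subject, predicate, object) tuples; the `describe` helper is hypothetical.

```python
ALEC = "http://dbpedia.org/resource/Alec_Empire"

TRIPLES = [
    (ALEC, "dbpedia:genre", "http://dbpedia.org/resource/Techno"),
    ("http://dbpedia.org/resource/60_Second_Wipeout", "dbpedia:producer", ALEC),
    ("http://dbpedia.org/resource/Limited_Editions_1990-1994", "dbpedia:artist", ALEC),
]


def describe(resource, triples):
    """Split the triples about a resource into its description and its backlinks."""
    description = [t for t in triples if t[0] == resource]  # resource is the subject
    backlinks = [t for t in triples if t[2] == resource]    # resource is the object
    return description, backlinks


description, backlinks = describe(ALEC, TRIPLES)
print(len(description), "description triple(s),", len(backlinks), "backlink(s)")
```

Serving both lists in the same document lets a crawler follow links in either direction.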




9. Publish Data Sets
    • Serialization
    • Linked Data Storage
    • Test and Debugging




Publishing Linked Data
 Serialization of Data

   Publication Method         Advantages                        Disadvantages
   RDF/XML Document           Oldest, best supported            Confusingly like normal XML
   Turtle (N3) Document       Simplest                          Not technically a standard yet
   HTML Document with RDFa    Fits inside HTML, but also RDF    Can get very complicated
   JSON                       Normal JSON, but also RDF         Promising, but still being developed
   GRDDL                      Use the XML you have/want         Needs to download+run XSLT
   SPARQL                     Query Protocol                    Query Protocol


 RDF files shouldn't be larger than, say, a few hundred kilobytes. Break
  them up into several RDF files
 Make sure multiple RDF files are linked to each other through RDF
  triples.
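
Breaking a large dump into smaller files can be sketched for a line-based serialization such as N-Triples, where every line is one complete triple. The `chunk_lines` helper and the byte limit are assumptions; the sketch does not add the linking triples between the resulting files.

```python
def chunk_lines(lines, max_bytes):
    """Group lines into chunks whose UTF-8 size stays at or below max_bytes."""
    chunks, current, size = [], [], 0
    for line in lines:
        line_size = len(line.encode("utf-8")) + 1  # +1 for the newline
        # Start a new chunk when this line would exceed the limit.
        if current and size + line_size > max_bytes:
            chunks.append(current)
            current, size = [], 0
        current.append(line)
        size += line_size
    if current:
        chunks.append(current)
    return chunks


triples = ["<s1> <p> <o> .", "<s2> <p> <o> .", "<s3> <p> <o> ."]
for number, chunk in enumerate(chunk_lines(triples, max_bytes=32), start=1):
    print("file", number, "holds", len(chunk), "triple(s)")
```

A single line larger than the limit still gets a chunk of its own, since a triple cannot be split.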
Examples
RDF/XML
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:db="http://dbpedia.org/resource/">
   <rdf:Description rdf:about="http://dbpedia.org/resource/Massachusetts">
     <db:Governor>
      <rdf:Description rdf:about="http://dbpedia.org/resource/Deval_Patrick" />
     </db:Governor>
     <db:Nickname>Bay State</db:Nickname>
     <db:Capital>
      <rdf:Description rdf:about="http://dbpedia.org/resource/Boston">
         <db:Nickname>Beantown</db:Nickname>
      </rdf:Description>
     </db:Capital>
   </rdf:Description>
  </rdf:RDF>

Turtle
     @prefix db: <http://dbpedia.org/resource/> .

     db:Massachusetts db:Governor db:Deval_Patrick ;
                     db:Nickname "Bay State" ;
                     db:Capital db:Boston .
     db:Boston db:Nickname "Beantown" .
Examples
RDFa
 <?xml version="1.0" encoding="UTF-8"?>
 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"
    "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
 <html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:db="http://dbpedia.org/resource/"
      version="XHTML+RDFa 1.0">
  <head>
    <title>About Massachusetts</title>
  </head>
  <body>
    <div about="http://dbpedia.org/resource/Massachusetts">The
    Massachusetts governor is
      <span rel="db:Governor">
              <span about="http://dbpedia.org/resource/Deval_Patrick">Deval
              Patrick
              </span>,
      </span>
      the nickname is "<span property="db:Nickname">Bay State</span>",
      and the capital
      <span rel="db:Capital">
              <span about="http://dbpedia.org/resource/Boston">
                has the nickname "<span property="db:Nickname">Beantown</span>".
              </span>
      </span>
    </div>
  </body>
 </html>

Examples
RDF-JSON
        { "__iri": "db:Massachusetts",
          "db:Nickname": "Bay State",
          "db:Governor": { "__iri": "db:Deval_Patrick" },
          "db:Capital": { "__iri": "db:Boston",
                          "db:Nickname": "Beantown"
                       },
          "__prefixes": { "db:": "http://dbpedia.org/resource/" }
        }

GRDDL
        <MyDataSet xmlns="http://example.org/my-data-xml-namespace">
         <State>
          <name>Massachusetts</name>
          <governor>Deval_Patrick</governor>
          <nickname>Bay State</nickname>
          <capital>
            <name>Boston</name>
            <nickname>Beantown</nickname>
          </capital>
         </State>
        </MyDataSet>

Linked Data Storage

 RDB to RDF Middleware
    D2R Server
 Native RDF Storage (manage it yourself)
      4Store
      AllegroGraph
      Bigdata
      BigOWLIM
      Jena TDB
      Neo4j
      Sesame
      Virtuoso
 Native RDF Storage (managed)
    Talis Platform
 Pubby
    Linked Data front-end for SPARQL Endpoints
 Paget Framework



Testing and Debugging Linked Data
 To ensure it adheres to the Linked Data principles and best
  practices

 Correctness of URI dereferencing
    Vapour Linked Data Validator at http://idi.fundacionctic.org/vapour
    RDF:Alerts at http://swse.deri.org/RDFAlerts/
    Sindice Inspector at http://inspector.sindice.com/
 manual validation and debugging of Linked Data
    cURL, Firefox browser extensions LiveHTTPHeaders and
     ModifyHeaders
 Technical debugging and validation
    Linked Data browsers can be used for this purpose.
    Tabulator, Marbles, LOD Browser Switch
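
A manual check of the kind done with cURL can also be prepared in Python. The sketch below only builds the request and does not send it; the `build_probe` helper name is an assumption, and `urllib.request.Request` is the standard library class.

```python
import urllib.request


def build_probe(uri, media_type="application/rdf+xml"):
    """Build (but do not send) a HEAD request that asks for one media type."""
    return urllib.request.Request(uri, headers={"Accept": media_type}, method="HEAD")


probe = build_probe("http://dbpedia.org/resource/Berlin")
print(probe.get_method(), probe.full_url)
print("Accept:", probe.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would send it, but `urlopen` follows redirects by default, so seeing the intermediate 303 response requires a custom redirect handler or a tool such as cURL.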



Summary: Linked Data


Semantic Technologies need to go where the data is!
Long live Semantic Technology!
Early adoption of Semantic Technology is king!

Growth in data volumes is very rapid.
Link, Integrate, Reuse
Linked Data is a truly Web-friendly way of publishing data.


Linked Data is the common global data space.
Aim for killer apps of semantic technology…
Catalyst and enabler to make semantic technology real…
Unlimited opportunities ahead…




References
   Keith Alexander, Richard Cyganiak, Michael Hausenblas, and Jun Zhao, Describing linked datasets, In
    Proceedings of the WWW2009 Workshop on Linked Data on the Web, 2009.
   Tim Berners-Lee, Linked Data - Design Issues, 2006, http://www.w3.org/DesignIssues/LinkedData.html.
   Tim Berners-Lee, Giant global graph, http://dig.csail.mit.edu/breadcrumbs/node/215, 2007.
   Christian Bizer, Tom Heath, and Tim Berners-Lee, Linked data - the story so far, Int. J. Semantic Web Inf.
    Syst., 5(3):1–22, 2009.
   Chris Bizer, Richard Cyganiak, and Tom Heath, How to Publish Linked Data on the Web,
    http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/
   W3C Working Draft, Cool URIs for the Semantic Web,
     http://www.w3.org/TR/2008/WD-cooluris-20080321/
   http://data.gov.uk/linked-data
   http://www.w3.org/2001/sw/Specs.html
   Auer, S., Dietzold, S., Lehmann, J., Hellmann, S., and Aumueller, D. (2009). Triplify: lightweight linked
    data publication from relational databases. In Proceedings of the 18th International Conference on World
    Wide Web, WWW 2009, Madrid, Spain, April 20-24, 2009.
   A Survey of current approaches for mapping of relational databases to RDF:
    http://esw.w3.org/topic/Rdb2RdfXG/StateOfTheArt
   Miles et al.: Best Practices Recipes for Publishing RDF Vocabularies, Available at:
    http://www.w3.org/TR/swbp-vocab-pub/




Semantic Technology
           Your World, Your Way


Tutorial kcc-2011

  • 1. Linked Data: Enabler of Semantic Web 2011.06.30 Sung-Kook Han Semantic Technology Lab Won Kwang Univ. skhan@wku.ac.kr 1
  • 2. Outline Introduction to Semantic Technology Semantic Technology + Web Technology • Semantic Web • Web 2.0 • Linked Data Design and Publication of Linked Data • 9 steps towards Linked Open Data skhan@wku.ac.kr 2
  • 3. Why Semantic Technology?? the ways of thinking, cognition… George Boole: An Investigation of the Laws of Thought (1854) Claude Shannon: 1937 master's thesis, A Symbolic Analysis of Relay and Switching Circuits John von Neumann Kurt Gödel Alan Turing skhan@wku.ac.kr 3
  • 4. Why Semantic Technology?? Final Goal: Intelligence skhan@wku.ac.kr 4
  • 5. Our Computers skhan@wku.ac.kr 5
  • 6. Communication Human vs. Human Human vs. Alien Human vs. Computer Computer vs. Computer skhan@wku.ac.kr 6
  • 7. Semantic Technology  Semantic technology has been a distinct research field for more than 40 years.  Formal Logic (since Russell and Frege)  Knowledge Representation Systems in AI  Semantic Networks and ATN (William Woods, 1975)  DARPA and European Commission programs in information integration  Development of simple tractable logics  Relational Algebras and Schemas in Database Systems  Library Science (classifications, thesauri, taxonomies)  New challenges of Semantic Technology: Semantic Web  A massive store of information that computers cannot use  A way to get around needing the “big data warehouse”  Another place where “a little semantics can go a long way”... cf: The Relationship Between Web 2.0 And the Semantic Web - Dr. Mark Greaves, Vulcan, Inc. skhan@wku.ac.kr 7
  • 8. Ontology Spectrum strong semantics Modal Logic has_experience_in works Company First Order Logic Technologies Knowledge Representation Programs Personnel Logical Theory Is Disjoint Subclass Management S1 illusion Agent Natural Language Project am AS Description Logic of with transitivity Program AS AS Department Telecommunication Task Technical Paulnderleez Leo DAML+OIL, OWL property Semantic Director EcDARPA has WISO Interoperability Request Reza Assistant Director Navy Intelligence UML Ann Brad Howard Conceptual Model Is Subclass of RDF/S Semantic Interoperability XTM Extended ER Thesaurus Has Narrower Meaning Than ER DB Schemas, XML Schema Animal Structural Interoperability Taxonomy Mammal Reptile Is Sub-Classification of Bird Relational Snake Dog Cat Model, XML Syntactic Interoperability Cocker Spaniel weak semantics Lady Based on Leo Obrst, The Ontology Spectrum & Semantic Models skhan@wku.ac.kr 8
  • 9. Semantic Technology Intelligence Integration Interoperability Machine-processible Digital Semantics Information Resources Web resources Ontology Services Semantic Image Metadata Audio/Video Technology controlled Documents vocabulary skhan@wku.ac.kr 9
  • 10. Web Technology Web of machine-processible Data Common vocabularies: Metadata and Ontology Query and reasoning Web of Services Classic Web Internet of Services Web of Documents Internet of Things HTML as document format HTTP URLs as globally unique IDs Hyperlinks to connect everything Social Web Connect human-being Web as a platform Programmable APIs and proprietary interfaces Mashups based on a fixed set of data sources skhan@wku.ac.kr 10
  • 11. Semantic Web  Standardizations  Trio of Semantic Web  Metadata / Ontology: RDF, RDFS, OWL  Query Language: SPARQL  Rule Language: RIF (SWRL)  SKOS, RDFa, GRRDL, WSMO,…  SOAP/ REST  Tools and Systems  Authoring, Reasoning Engines,…  835 items in Sweet Tools  Best Practices  Linked Open Data  Semantic MediaWiki  NEPOMUK, SIOC, Garlik  W3C Semantic Web Use cases Sweet Tools: http://www.mkbergman.com/new-version-sweet-tools-sem-web/ W3C Semantic Web Case Studies and Use Cases: http://www.w3.org/2001/sw/sweo/public/UseCases/ skhan@wku.ac.kr 11
  • 12. Semantic Applications Semantic Wave 2008, Industry Roadmap to Web 3.0, Project10X http://www.mkbergman.com/new-version-sweet-tools-sem-web/ skhan@wku.ac.kr 12
  • 13. Web 2.0  Resharpen the way of viewing the Web  Web as the platform  Web as the social media  Web as the collaboration tool  Web as ……  Web 2.0 Manifestation  Openness / Sharing  Participation / Collaboration  Web 2.0 Syndrome  Library 2.0  Government 2.0  Enterprise 2.0  ……  New Web applications  wiki, blog, RSS,… skhan@wku.ac.kr 13
  • 14. Web 2.0 Developers skhan@wku.ac.kr 14
  • 15. Semantic Web Today Major future issues: • Vocabularies • Scalability • Provenance • Personal Infospheres • Mobile and Real World Networks skhan@wku.ac.kr 15
  • 16. Web 2.0 APIs Today No Single global space: Web APIs slice the Web into Walled Gardens. • Mashups of APIs are proprietary. • No links between data. MashUp Web Web Web API API API A B C Christian Bizer: Pay-as-you-go Data Integration (21/9/2010) skhan@wku.ac.kr 16
  • 17. The Web is Dead?? http://www.wired.com/magazine/2010/08/ff_webrip/ skhan@wku.ac.kr 17
  • 18. Long Live the Web ! http://www.scientificamerican.com/article.cfm?id=long-live-the-web skhan@wku.ac.kr 18
  • 19. Lessons Learned  Data is more important than API code.  Data is the Intel Inside.  Open data is more important than open source  Structured data is more valuable than unstructured.  We should seek to structure our data well.  Metadata will play a core role of data structure.  A little semantics goes a long way.  Beware the usefulness of shallow ontology shown in LOD.  Linking data and services are essential.  Link every thing.  Rich user experiences are the key for adaption.  We should consider mobile computing and personalization.  Visualize and navigate. skhan@wku.ac.kr 19
  • 21. Web of Documents  A global file systems of documents (document silos on the Web).  Implicit semantics of content and links  Designed for human consumption  Disconnected data skhan@wku.ac.kr 21
  • 22. Architecture: Web of Documents  Analogy Web Search  a global file system Browsers Engines  Designed for HTTP URL  human consumption  Primary objects  documents HTML HTML HTML  Links between Doc. Doc. Doc.  documents (or sub-parts of)  Degree of structure in objects hyperlink hyperlink document link document link  fairly low  Main Usage  Search and browsing DB-A DB-B DB-C  Semantics of content and links  implicit skhan@wku.ac.kr 22
  • 23. Machine-Processible Data Web of Documents Documents Information Resources Documents Human processible Data Database Machine processible Web of Data  Open the data silos and get rid of repository-centric mindset  Publish data of public interest on the Web  In a way that other applications can access and interpret the data  Using common Web technologies skhan@wku.ac.kr 23
  • 24. Semantic Web: Web of Data  The vision of a Semantic Web:  building a global Web of machine-readable data  Berners-Lee, Hendler & Lassila, 2001; Marshall & Shipman, 2003 The first step is putting data on the Web in a form that machines can naturally understand, or converting it to that form. This creates what I call a Semantic Web - a web of data that can be processed directly or indirectly by machines. Therefore, while the Semantic Web, or Web of Data, is the goal or the end result of this process, Linked Data provides the means to reach that goal. -- Tim Berners-Lee, et al., http://linkeddata.org/docs/ijswis-special-issue, Jan, 2009  Linked Data Foundation  can lower the barrier to reuse, integration and application of data from multiple, distributed and heterogeneous sources.  the more sophisticated proposals associated with the Semantic Web vision, such as intelligent agents, may become a reality. skhan@wku.ac.kr 24
  • 25. Linked Data: Web of Data  Goal: Web-scale Data Integration  Alternative to classic data integration systems in order to cope with growing number of data sources.  Querying across data sources  Global distributed database RDF  Extend the Web with a single global data space  Giant Global Graph (GGG)  Demonstrate the possibility of Semantic Web  By using RDF to publish structured data RDF  By setting links between data single RDF universal information space. RDF RDF RDF skhan@wku.ac.kr 25
  • 26. Architecture: Linked Data  Analogy  a global database Linked Data Linked Data Search  Designed for Browsers Mashup Engines  machines first, humans later HTTP URI  Primary objects  things (or descriptions (data) of things)  Links between RDF RDF RDF  things triples Triples triples  Degree of structure in RDF link RDF link (descriptions of) things data link data link  high DB-A DB-B DB-C  Main usage  query, navigation and reasoning  Semantics of content and links  explicit skhan@wku.ac.kr 26
  • 27. Linked Data Principles Set of best practices for publishing structured data on the Web in accordance with the general architecture of the Web.  Use URIs as names for things.  Use URIs as names for things, not just for documents or homepages  Use HTTP URIs so that people can look up those names.  When someone looks up a URI, provide useful RDF information.  Include RDF statements that link to other URIs so that they can discover related things. URI URI URI URI RDF Link URI RDF triple Information URI HTTP URI URI skhan@wku.ac.kr 27
  • 28. Linked Open Data  Community effort to  publish existing open license datasets as Linked Data on the Web  interlink things between different data sources  develop clients that consume Linked Data from the Web  began early 2007 skhan@wku.ac.kr 28
  • 29. LOD Data sets on the Web  25 billion RDF triples, which are interlinked by around 395 million RDF links (Sep. 2010). http://richard.cyganiak.de/2007/10/lod/lod-datasets_2010-09-22_colored.svg skhan@wku.ac.kr 29
  • 30. Summary: Web of Linked Data  A global, distributed database built on a simple set of standards  RDF, URI, HTTP  Explicit semantics of content and links  Resources are connected by semantic links.  creating a single global data graph that span data sources  enables the discovery of new data sources  Provides for data co-existence  Anyone can publish data to the Web of Linked Data  Data publishers are not constrained in choice of vocabularies with which to represent data.  Designed for computer first, humans later skhan@wku.ac.kr 30
  • 32. Europeana European digital library: Europeana: This European Commission initiative encompasses not only libraries but also museums, archives and other holders of cultural heritage material. http://version1.europeana.eu/web/europeana-project skhan@wku.ac.kr 32
  • 33. Linked Library Cloud  Libraries have been producing metadata for ages.  Libraries (often) produce high- quality metadata.  Library develops many metadata standards such as DC, SKOS, BIBO, OAI-ORE including MARC 21, MODS, FRBR,..  Integrate Library Catalogues on global scale http://code4lib.org/conference/2010/singer skhan@wku.ac.kr 33
  • 34. Linking Open Drug Data  linking the various sources of drug data together to answer interesting scientific and business questions.  Survey publicly available data sets about drugs  Publish and interlink these data sets on the Web  Explore interesting questions that could be answered if the data sets are linked.  8 million RDF triples, which are interlinked by more than 370,000 RDF links (As of August 2009) skhan@wku.ac.kr 34
  • 35. BBC Semantic Project  Publish program / music data as RDF/XML or RDFa  Build semantically linked and annotated web pages about artists and singers whose songs are played on BBC radio stations.  semantically interconnected skhan@wku.ac.kr 35
  • 36. DBpedia Mobile  Show map with information about nearby locations  Linked data browser  GPS + Google Maps + DBpedia + Flickr + Revyu skhan@wku.ac.kr 36
  • 37. Attention by Search Engines  Yahoo!  crawls Linked Data in its RDFa serialization as well as Microformat  Yahoo Search Monkey to make search results more useful and visually appealing  provides access to crawled data through the Yahoo BOSS API  Google  use Social Graph API  is developing Google Squared and Google Fusion Table  merged MetaWeb  manage Freebase, a DBpedia/YAGO competitor Rich Snippets skhan@wku.ac.kr 37
  • 38. Linked Open Commerce skhan@wku.ac.kr 38
  • 40. 9 Steps to publishing Linked Data  Publicize your Data Sets  Describe your Data Sets  Link to other Data Sets  Triplify Data Sets  Choose URIs for Things in your Data  Create Vocabularies  Understand your data  Setup Your Infrastructure for Linked Data  Understand the principles skhan@wku.ac.kr 40
  • 41. 1. Understand Linked Data • Principle • Core Stack • Data Modeling
  • 42. Linked Data: Overview  Benefits of Linked Data Enables web-scale data distributed publication with web-based discovery mechanisms.  Linked Data Web Resources are generic real-world data objects or entities:  People, Places, and other physical things  Abstract concepts (e.g., emotion, notion,…)  Subject matter (e.g., science, economics, arts,…)  Linked Data is not just structured data published on the Web.  Linked Data is based on well-established Web standards  Linked Data adds value: less redundancy, greater discoverability, network effects. skhan@wku.ac.kr 42
  • 43. Linked Data Principles (TimBL, 2006)  Use URIs as names for things  not just for documents  http://dbpedia.org/resource/ontology  you are not your homepage  http://mentalist.com/actor/patrick_jane  Use HTTP URIs  globally unique names, distributed ownership  allows people to look up those names  Provide useful information in RDF  when someone looks up a URI  Include RDF links to other URIs  to enable discovery of related information skhan@wku.ac.kr 43
  • 44. 5 Star rating  ★ On the web, open licensed: available on the web (whatever format), but with an open license  ★★ Machine-readable data: available as machine-readable structured data (e.g. Excel instead of an image scan of a table)  ★★★ Non-proprietary format: as above, in a non-proprietary format (e.g. CSV instead of Excel)  ★★★★ RDF standards: use open standards from W3C (RDF and SPARQL) to identify things, so that people can point at your stuff  ★★★★★ Linked RDF: link your data to other people’s data to provide context
  • 45. Linked Data Core Stack http://linkeddata-specs.info/  RFC 2616 Hypertext Transfer Protocol -- HTTP/1.1 • Defines HTTP, a generic and stateless application-level protocol for distributed, collaborative, hypermedia information systems.  RFC 3986 Uniform Resource Identifier (URI): Generic Syntax • Defines a generic URI syntax and a process for resolving URI references that might be in relative form, along with guidelines and security considerations for the use of URIs on the Internet.  RDF Concepts and Abstract Syntax • Defines the RDF graph data model and key concepts.  SPARQL Query Language for RDF • Defines the syntax and semantics of the SPARQL query language for RDF.
  • 46. Core Technology  Uniform Resource Identifier (URI)  Names (identifiers) for resources in an open Web environment  Resource Description Framework (RDF)  a model for representing metadata on the web  triple structure  RDF Schema and OWL  languages for defining vocabularies  RDF/XML, N3, Turtle,…  serialization and de-serialization of RDF triples for exchanging RDF data  Simple Knowledge Organization System (SKOS)  a language for describing controlled vocabularies  SPARQL  a query language and protocol for accessing RDF data via the Web skhan@wku.ac.kr 46
  • 47. Linked Data Modeling  Data Modeling: RDF data model to publish structured data on the Web  Data Linking: RDF links to interlink data from different data sources  RDF triple: subject, predicate, and object  Subject: URI identifying the described resource  Predicate: the relation that exists between subject and object; predicate URIs come from vocabularies, collections of URIs that can be used to represent information about a certain domain  Object: a simple literal value, or the URI of another resource that is related to the subject
  • 48. Linked Data Model  Flexible graph-based model: RDF graph  The HTTP protocol brings together identification and retrieval again.  Deeper into the Web  URI: global primary key  [Figure: RDF graph around the resource http://.../isbn/46316. Edge labels: dbp-prop:title, skos:subject, dbp-prop:author, dbp-prop:publisher, dbp-prop:name, foaf:homepage, opencyc:headquarter, dbp-prop:city, fb:creator, fb:street_address. Node labels: The Lord of the rings, English novels, J.R.R. Tolkien, wkp-en:J.R.R.Tolkien, dbpedia:Allen&Unwin, London, fb:guid…..92df7, Marivie, 83 Alexander St.]  Prefix examples: skos:subject = http://www.w3.org/2004/02/skos/core#subject; dbp-prop:title = http://dbpedia.org/property/title
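The subject / predicate / object model from the two slides above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Triple` type and `to_ntriples` helper are hypothetical names, the book URI under `www.example.com` is a placeholder (the slide elides the real one), and the Tolkien resource URI is assumed for the example. Only the `dbp-prop:` namespace expansion is taken from the slide.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement (hypothetical helper type for this sketch)."""
    subject: str                  # URI identifying the described resource
    predicate: str                # URI taken from a vocabulary
    obj: str                      # URI of a related resource, or a literal value
    obj_is_literal: bool = False  # True when obj is a plain literal


def to_ntriples(t: Triple) -> str:
    """Serialize one triple as a single N-Triples line."""
    if t.obj_is_literal:
        # Escape backslashes and double quotes inside the literal.
        escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    else:
        obj = f"<{t.obj}>"
    return f"<{t.subject}> <{t.predicate}> {obj} ."


# Placeholder subject URI: an assumption, not the URI used on the slide.
BOOK = "http://www.example.com/isbn/46316"
DBP_PROP = "http://dbpedia.org/property/"  # expansion shown on the slide

triples = [
    # Literal object: the title is a plain string.
    Triple(BOOK, DBP_PROP + "title", "The Lord of the Rings", True),
    # URI object: an RDF link to a resource in another data set (assumed URI).
    Triple(BOOK, DBP_PROP + "author",
           "http://dbpedia.org/resource/J._R._R._Tolkien"),
]
lines = [to_ntriples(t) for t in triples]
```

The second triple is what makes the data "linked": its object is a URI in a different namespace, so a client can follow it to another data source.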
  • 49. 2. Setup Infrastructure • Basic Infrastructure • Systems and Tools
  • 50. Basic Infrastructure  [Figure: architecture diagram. Components shown: Data/Content (DB); extraction, conversion, packaging, link generation; RDF Triple Base (triple store) with index; SPARQL Query Engine; search, discovery, navigation; Interface Framework + APIs; Delivery via Web Server (Apache); Application (browser, navigator, search).]
  • 51. Infrastructure Construction  Configuration of Web server  Configure the server to send the correct MIME types, e.g. application/rdf+xml  Code samples for content negotiation (ConNeg) and 303 redirects: http://linkeddata.org/tools  Use cURL (http://curl.haxx.se/) to check the Apache configuration  Configure for hash URIs or slash URIs  Testing your content negotiation  Install the LiveHTTPHeaders and Modify Headers extensions for Firefox  Try LiveHTTPHeaders against my URI  http://www.skyhigh.com/id/hong  Do the same with URIs from other data sets  Modify your headers to ask for application/rdf+xml
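The "ask for application/rdf+xml" test from the slide above can also be scripted. The following is a minimal sketch using only the Python standard library; the function names `rdf_request` and `fetch_rdf` are hypothetical, and the URI is the example from the slide, which is not assumed to be reachable. Building the request is separated from sending it so that the header logic can be inspected without network access.

```python
import urllib.request


def rdf_request(uri: str) -> urllib.request.Request:
    """Build a GET request that asks the server for RDF/XML."""
    return urllib.request.Request(
        uri, headers={"Accept": "application/rdf+xml"}
    )


def fetch_rdf(uri: str):
    """Send the request and report what the server actually returned.

    urllib follows 303 redirects automatically, so the final URL shows
    which document the identifier URI was redirected to.
    """
    with urllib.request.urlopen(rdf_request(uri)) as resp:
        final_url = resp.geturl()
        content_type = resp.headers.get("Content-Type")
        body = resp.read()
    return final_url, content_type, body


# Example URI from the slide; only the request object is built here.
req = rdf_request("http://www.skyhigh.com/id/hong")
```

Calling `fetch_rdf` against a correctly configured server should yield a `Content-Type` of `application/rdf+xml`; anything else (typically `text/html`) suggests that the MIME type or content negotiation setup still needs work.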
  • 52. Supporting Technologies  Linked Data Browsers  navigate between data sources and explore the dataspace.  Tabulator Browser (MIT, USA), Marbles (FU Berlin, DE), OpenLink RDF Browser (OpenLink, UK), Zitgist RDF Browser (Zitgist, USA), Disco Hyperdata Browser (FU Berlin, DE), Fenfire (DERI, Ireland)  Web of Data Search Engines  crawl the data space and provide best-effort query answers over crawled data.  Falcons (IWS, China), Sig.ma (DERI, Ireland), Swoogle (UMBC, USA), VisiNav (DERI, Ireland), Watson (Open University, UK), TAP, Sindice
  • 53. Supporting Technologies  Describing data set  discovery and usage of linked datasets  voiD, Ding  Registry  an open registry of data and content packages  CKAN  Linking tool  discovering relationships between data items within different Linked Data sources  SILK  Mapping tool  mapping database to RDF triples  Triplify, D2R Server  LOD platform  D2R Server, Virtuoso Universal Server, Talis Platform, Pubby, … skhan@wku.ac.kr 53
  • 54. 3. Understand Data to be published • Review about Data to be published • Requirement analysis
  • 55. Review about Data to be published  What  think about the key things to be presented in Linked Data  analysis of data properties  What vocabularies can be used to describe these?  Why  purposes and goals of linked data to be published  What for  how to use and apply linked data (use cases)  How to serve  Serving Linked Data as Static RDF/XML Files  Serving Linked Data as RDF Embedded in HTML Files  Serving RDF and HTML with Custom Server-Side Scripts  Serving Linked Data from Relational Databases  Serving Linked Data from RDF Triple Stores  Serving Linked Data by Wrapping Existing Application or Web APIs skhan@wku.ac.kr 55
  • 56. 4. Create Vocabularies • Vocabulary Creation • Common Namespace • Definition
  • 57. Guideline for Vocabulary Creation  Do not define new vocabularies from scratch, but complement existing vocabularies with additional terms (in your own namespace) to represent your data as required.  Provide for both humans and machines. Use rdfs:comment for each term invented. Always provide a label for each term using the rdfs:label property.  Make term URIs de-referenceable following the W3C Best Practice Recipes for Publishing RDF Vocabularies.  Make use of other people's terms: use them directly, or provide mappings to them by means of rdfs:subClassOf or rdfs:subPropertyOf.  State all important information explicitly. For example, state all ranges and domains explicitly.  Do not create over-constrained, brittle models; leave some flexibility for growth. Unless you know exactly what you are doing, do not use full-featured OWL; use RDF Schema to define vocabularies.
  • 58. Potential Ontologies / Vocabularies  Friend-of-a-Friend (FOAF), vocabulary for describing people.  Dublin Core (DC) defines general metadata attributes. See also their new domains and ranges draft.  Semantically-Interlinked Online Communities (SIOC), vocabulary for representing online communities.  Description of a Project (DOAP), vocabulary for describing projects.  Simple Knowledge Organization System (SKOS), vocabulary for representing taxonomies and loosely structured knowledge.  Music Ontology provides terms for describing artists, albums and tracks.  Review Vocabulary, vocabulary for representing reviews.  Creative Commons (CC), vocabulary for describing license terms  Geo, vocabulary for describing geographical locations  GoodRelations, vocabulary for describing products skhan@wku.ac.kr 58
  • 59. Common Namespaces
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
      xmlns:owl="http://www.w3.org/2002/07/owl#"
      xmlns:dc="http://purl.org/dc/terms/"
      xmlns:foaf="http://xmlns.com/foaf/0.1/"
      xmlns:vcard="http://www.w3.org/2006/vcard/ns#"
      xmlns:dbp="http://dbpedia.org/dbprop/"
      xmlns:geo="http://www.geonames.org/ontology#"
      xmlns:gr="http://purl.org/goodrelations/v1#"
      xmlns:commerce="http://search.yahoo.com/searchmonkey/commerce/"
      xmlns:media="http://search.yahoo.com/searchmonkey/media/"
      xmlns:cb="http://cb.semsol.org/ns#"
    More Common Namespaces:
      http://esw.w3.org/TaskForces/CommunityProjects/LinkingOpenData/CommonVocabularies
      http://www-958.ibm.com/software/data/cognos/manyeyes/visualizations/100-most-popular-rdf-namespaces
  • 60. Definition of Vocabulary
      # Definition of the class "Lover"
      <http://sites.movie.org/pub/LoveVocabulary#Lover>
          rdf:type rdfs:Class ;
          rdfs:label "Lover"@en ;
          rdfs:label "Liebender"@de ;
          rdfs:comment "A person who loves somebody."@en ;
          rdfs:comment "Eine Person die Jemanden liebt."@de ;
          rdfs:subClassOf foaf:Person .

      # Definition of the property "loves"
      <http://sites.movie.org/pub/LoveVocabulary#loves>
          rdf:type rdf:Property ;
          rdfs:label "loves"@en ;
          rdfs:label "liebt"@de ;
          rdfs:comment "Relation between a lover and a loved person."@en ;
          rdfs:subPropertyOf foaf:knows ;
          rdfs:domain <http://sites.movie.org/pub/LoveVocabulary#Lover> ;
          rdfs:range foaf:Person .
  • 61. Tools for Vocabulary Definition  Ontology editors  Protégé:  an open-source ontology editor with a dedicated OWL plug-in  Neologism:  Web-based tool for creating, managing and publishing simple RDFS vocabularies.  open-source and implemented in PHP on top of the Drupal-platform.  TopBraid Composer:  a powerful commercial modeling environment for developing Semantic Web ontologies  NeOn Toolkit:  an open-source ontology engineering environment with an extensive set of plug-ins. skhan@wku.ac.kr 61
  • 62. 5. Choose URIs • Resource Identification • Types of URIs • De-Referencing • Common URI Patterns
  • 63. Resource Identification  Separation of Identity and Representation  Identity  The identity (URI) of an Object or Entity should be unambiguous and globally unique  Representation  On the Web a URI should provide an unambiguous data access path  Access  Reference to abstract (physically inaccessible) Objects or Entities is only achievable via conduit documents that carry representations of entity descriptions (which at best are facets of an entire description)  URI Requirements:  Keep out of other people's namespaces  Use a namespace that you control  Abstract away from implementation details (Short is better…)  Stable and persistent  Hash or Slash  Use common URI patterns
  • 64. URI  URI: Uniform Resource Identifier  http://www.example.com/people/alice: a home page (Web document, information object)??  URI: identification of people, products, places, ideas and concepts such as ontology classes, including URLs for Web documents  Two approaches: hash URI, slash URI
  • 65. Hash / Slash URI  Hash URI  URIs can contain a fragment, a special part that is separated from the rest of the URI by a hash symbol (“#”).  http://www.example.com/products/BiBimBab#this  http://www.travel.com/nation/Korea/KyungJu#main  simply publish a description document containing RDF about the things at the base URI  Slash URI  examples:  http://www.example.com/products/BiBimBab  http://www.travel.com/nation/Korea/KyungJu  you must publish your description document at another, distinct URI.
  • 66. hash URI  Separating identification and naming from representation  http://www.skyhigh.com/person/GilDong#this identifies the Entity (GilDong)  http://www.skyhigh.com/person/GilDong is the document that represents it  [Figure: the document is shown with Metadata (content-type: application/xhtml+xml) and Data (an XHTML page titled "Our hero…").]
  • 67. slash URI  Separating identification and naming from representation  http://www.skyhigh.com/person/hero/GilDong/id identifies the Entity (GilDong)  http://www.skyhigh.com/person/hero/GilDong/page is a document with Metadata content-type: application/xhtml+xml  http://www.skyhigh.com/person/hero/GilDong/data is a document with Metadata content-type: application/rdf+xml  [Figure: both documents are shown with Metadata and Data panels; the Data panels show a page titled "Our hero…".]
  • 68. Slash vs. Hash  Slash URI  HTTP redirection (30X response) is required in order for resource "Identity" to be separated from "representation":  http://www.skyhigh.com/person/hero/GilDong/id (URI of an Organization Entity)  http://www.skyhigh.com/person/hero/GilDong/page (HTML representation of Entity description)  http://www.skyhigh.com/person/hero/GilDong/data (RDF representation that describes the Entity, which could use a Turtle, N3, RDF/XML, etc. data serialization)  Hash URI  HTTP redirection isn't required in order for resource "Identity" to be separated from "representation":  http://demo.openlinksw.com/Northwind/Customer/ALFKI#this (URI of an Organization Entity)  http://demo.openlinksw.com/Northwind/Customer/ALFKI (a document: HTML, Turtle, N3, or RDF/XML representation of the Entity description)
  • 69. DeReferencing Hash URI
      Without content negotiation:
        http://www.example.com/about#alice (ID)
        -> automatic truncation of fragment
        -> http://www.example.com/about (RDF)
      With content negotiation:
        http://www.example.com/about#alice (ID)
        -> automatic truncation of fragment
        -> http://www.example.com/about, then content negotiation:
           application/rdf+xml wins -> http://www.example.com/about.rdf (RDF)
           text/html wins -> http://www.example.com/about.html (HTML)
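The "automatic truncation of fragment" step happens on the client: the fragment is never sent to the server, so only the document URI is requested. A small sketch with the Python standard library, using the example URI from the slide (the variable names are arbitrary):

```python
from urllib.parse import urldefrag

# Hash URI that identifies the thing (Alice), from the slide.
thing_uri = "http://www.example.com/about#alice"

# urldefrag splits off the fragment; an HTTP client requests only the
# document part and resolves the fragment locally in the returned data.
document_uri, fragment = urldefrag(thing_uri)
```

This is why a hash URI needs no redirect: every thing described in `http://www.example.com/about` is retrieved with one request for that single document.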
  • 70. DeReferencing Slash URI
      One generic document:
        http://www.example.com/id/alice (ID)
        -> 303 redirect
        -> http://www.example.com/doc/alice (generic document), then content negotiation:
           application/rdf+xml wins -> http://www.example.com/doc/alice.rdf (RDF)
           text/html wins -> http://www.example.com/doc/alice.html (HTML)
      Different documents:
        http://www.example.com/id/alice (ID)
        -> 303 redirect with content negotiation:
           application/rdf+xml wins -> http://www.example.com/doc/alice.rdf (RDF)
           text/html wins -> http://www.example.com/doc/alice.html (HTML)
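The "different documents" variant can be sketched as a pure function that picks the redirect target from the `Accept` header. This is a simplified illustration, not the configuration of any particular server: `parse_accept`, `redirect_for`, and the `REPRESENTATIONS` table are hypothetical names, the `/id/` to `/doc/` mapping follows the example URIs on the slide, and falling back to HTML when nothing matches is an assumption.

```python
# Media types this hypothetical server can deliver, with file suffixes.
REPRESENTATIONS = {
    "application/rdf+xml": ".rdf",
    "text/html": ".html",
}


def parse_accept(header: str) -> list:
    """Return acceptable media types, best first (by q-value, then order)."""
    prefs = []
    for position, part in enumerate(header.split(",")):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0
        if q > 0:  # q=0 means "not acceptable"
            prefs.append((-q, position, media_type))
    return [media_type for _, _, media_type in sorted(prefs)]


def redirect_for(id_uri: str, accept: str):
    """Return (status, Location) for a slash URI under /id/."""
    doc_uri = id_uri.replace("/id/", "/doc/", 1)
    for media_type in parse_accept(accept):
        if media_type in REPRESENTATIONS:
            return 303, doc_uri + REPRESENTATIONS[media_type]
    # Assumption: serve HTML when the client names no supported type.
    return 303, doc_uri + ".html"


rdf_target = redirect_for("http://www.example.com/id/alice",
                          "application/rdf+xml")
browser_target = redirect_for(
    "http://www.example.com/id/alice",
    "text/html,application/xhtml+xml;q=0.9,application/rdf+xml;q=0.5",
)
```

The status code is always 303 because the requested URI names a thing, not a document; only the `Location` differs per client.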
  • 71. Content Negotiation
  • 72. Content Negotiation
  • 73. Common URI Pattern
      Thing: http://dbpedia.org/resource/New_York_City; RDF data: http://dbpedia.org/data/New_York_City; HTML page: http://dbpedia.org/page/New_York_City
      Thing: http://revyu.com/people/tom; RDF data: http://revyu.com/people/tom/about/rdf; HTML page: http://revyu.com/people/tom/about/html
      Thing: http://www.bbc.co.uk/music/artists/db4624cf#artist; RDF data: http://www.bbc.co.uk/music/artists/db4624cf.rdf; HTML page: http://www.bbc.co.uk/music/artists/db4624cf.html
      Thing: http://id.dbpedia.org/Berlin; RDF data: http://data.dbpedia.org/Berlin; HTML page: http://page.dbpedia.org/Berlin
      ISBN: http://www4.wiwiss.fu-berlin.de/bookmashup/books/006251587X
  • 74. Choosing URI  http://www.culture.com/LOD/{class}/{member}  http://www.culture.com/LOD/{class}/{member}.rdf  http://www.culture.com/LOD/{class}/{member}.html  Examples:  URI of an Organization Entity: http://demo.openlinksw.com/Northwind/Customer/ALFKI/id  HTML representation of Entity description: http://demo.openlinksw.com/Northwind/Customer/ALFKI/page  RDF representation that describes the Entity (Turtle, N3, RDF/XML, etc. data serialization): http://demo.openlinksw.com/Northwind/Customer/ALFKI/data
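A URI pattern such as `{class}/{member}` is easiest to keep stable when every URI is minted by one function. The sketch below assumes the `http://www.culture.com/LOD` base from the slide; the function name `mint_uris` and the class and member values are made up for illustration.

```python
from urllib.parse import quote

BASE = "http://www.culture.com/LOD"  # base from the slide's pattern


def mint_uris(cls: str, member: str) -> dict:
    """Return the thing URI and its RDF and HTML document URIs."""
    # Percent-encode each path segment so spaces or slashes in a name
    # cannot change the structure of the URI.
    path = f"{BASE}/{quote(cls, safe='')}/{quote(member, safe='')}"
    return {
        "thing": path,           # identifies the real-world thing
        "rdf": path + ".rdf",    # RDF description document
        "html": path + ".html",  # human-readable page
    }


uris = mint_uris("food", "BiBimBab")  # illustrative values only
```

Keeping implementation details (script names, database keys, file extensions on the thing URI) out of the pattern is what lets the URIs stay stable when the backend changes.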
  • 75. 6. Triplify Data Sets • Publication Strategies • Conversion of Database
  • 76. Linked Data Publication  [Figure: publication workflow, shown as layers. Types of data: structured data, text. Data preparation: RDF-izers for CSV, XML, Excel; entity extractor (e.g. Calais). Data storage: relational database, data source with API, RDF store, RDF files. Data publication: RDB-to-RDF wrapper (e.g. D2R), custom wrapper, Linked Data interface (e.g. Pubby), Web server (e.g. Apache), CMS with RDFa output (e.g. Drupal). Result: Linked Data on the Web.]
  • 77. Publication Strategy  Strategy  From unstructured sources  use NLP, text mining, annotation,…  OpenCalais, Ontos  From semi-structured sources  DBpedia, Linked GeoData, SCOVO,…  efficient bi-directional synchronization  From structured sources (relational database)  Declarative syntax and semantics of data model translation  RDB2RDF,…
  • 78. Conversion of Database
      Schema: Books (ID, Author, Title, Publisher, Year); Authors (ID, Name, Homepage); Publishers (ID, PublisherName, City)
      Books
        ID | Author | Title | Publisher | Year
        ISBN0-00-651409-X | id_xyz | The Glass Palace | id_qpr | 2000
      Authors
        ID | Name | Home page
        id_xyz | Ghosh, Amitav | http://www.amitavghosh.com
      Publishers
        ID | Publisher Name | City
        id_qpr | Harper Collins | London
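To make the table-to-triple idea concrete, here is a small hand-rolled sketch using an in-memory SQLite database loaded with the sample rows above. It is not how D2R Server or Triplify work internally; the `http://www.example.com/` namespace, the vocabulary URIs, and the helper names are all assumptions chosen for the example. The mapping rules are the usual ones: one row becomes one subject URI, one column becomes one predicate, and a foreign key becomes a link to another resource.

```python
import sqlite3

BASE = "http://www.example.com/"          # hypothetical data namespace
VOCAB = "http://www.example.com/vocab#"   # hypothetical vocabulary

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Authors (ID TEXT PRIMARY KEY, Name TEXT, Homepage TEXT);
CREATE TABLE Books (ID TEXT PRIMARY KEY, Author TEXT, Title TEXT,
                    Publisher TEXT, Year INTEGER);
INSERT INTO Authors VALUES
    ('id_xyz', 'Ghosh, Amitav', 'http://www.amitavghosh.com');
INSERT INTO Books VALUES
    ('ISBN0-00-651409-X', 'id_xyz', 'The Glass Palace', 'id_qpr', 2000);
""")


def literal(value) -> str:
    """Format a value as a plain N-Triples literal."""
    text = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{text}"'


def book_triples(connection):
    """Yield N-Triples lines for every row of the Books table."""
    rows = connection.execute(
        "SELECT ID, Author, Title, Year FROM Books ORDER BY ID"
    )
    for book_id, author_id, title, year in rows:
        subject = f"<{BASE}book/{book_id}>"  # primary key -> subject URI
        yield f"{subject} <{VOCAB}title> {literal(title)} ."
        yield f"{subject} <{VOCAB}year> {literal(year)} ."
        # Foreign key -> link to the author resource, not a literal.
        yield f"{subject} <{VOCAB}author> <{BASE}author/{author_id}> ."


triples = list(book_triples(conn))
```

In practice the made-up vocabulary would be replaced by existing terms (for example from Dublin Core or FOAF), which is exactly what the declarative mapping files of RDB-to-RDF tools are for.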
  • 79. Conversion of Database  Tools for mapping RDB to Linked Data  D2R Server for customizable mappings from relational databases to ontologies [Bizer, Cyganiak 06]  Browser-based tools for defining RDB-to-RDF mappings [Zhou, Xu, Chen, Idehen 08]  Triplify [Auer, Dietzold, Lehmann, Hellmann, Aumueller 09]  OpenLink Data Spaces [Idehen, Erling 08] skhan@wku.ac.kr 79
  • 80. RDF Features Best Avoided  Do not use the full expressivity of the RDF data model; use a subset of the RDF features.  No blank nodes: it is impossible to set external RDF links to a blank node.  Do not use RDF reification: its semantics are unclear, and reified statements are cumbersome to query with the SPARQL query language.  Metadata can be attached to the information resource instead.  Be careful before using RDF collections or RDF containers: they do not work well together with SPARQL.
  • 81. 7. Link to other Data sets • Types of Linking • Linking manually • Automatic generation of Link
  • 82. Link ! Reuse !!  Reuse. Do not reinvent the wheel…  The URIs are de-referenceable.  For instance, using the DBpedia URI http://dbpedia.org/resource/Doom to identify the computer game Doom gives you an extensive description of the game including abstracts in 10 different languages and various classifications.  The URIs are already linked to URIs from other data sources.  For instance, you can navigate from the DBpedia URI http://dbpedia.org/resource/Innsbruck to data about Innsbruck provided by Geonames and EuroStat.  Therefore, by using concept URIs from these datasets, you interlink your data with a rich and fast-growing network of other data sources.
  • 83. Types of Linking to other Data Sets
     Relationship Links: point at related things in other data sources, for instance, other people, places or genes.
      <http://www.skyhigh.com/people/GilDong>
          rdf:type foaf:Person ;
          foaf:name "Hong, Gil-Dong" ;
          foaf:based_near <http://dbpedia.org/resource/Seoul> ;
          foaf:topic_interest <http://dbpedia.org/resource/Justice> ;
          foaf:knows <http://dbpedia.org/resource/HalBingDang> .
     Identity Links: point at URI aliases used by other data sources to identify the same real-world object or abstract concept.
      <http://www.skyhigh.com/people/GilDong>
          <http://www.w3.org/2002/07/owl#sameAs> <http://www.korea.org/history/hero> .
     Vocabulary Links: point to the definitions of related terms in other vocabularies.
      <http://www.university.org/terms/professor>
          rdf:type rdfs:Class ;
          rdfs:subClassOf <http://dbpedia.org/ontology/Person> ;
          rdfs:subClassOf <http://sw.opencyc.org/concept/Mx4rvbGdrcN5Y29ycA> ;
          owl:equivalentClass <http://rdf.dictionary.com/entry/facultyMember> .
  • 84. Link to other Data Sets  URI aliases  In an open environment like the Web it often happens that different information providers talk about the same non-information resource. As they do not know about each other, they introduce different URIs for identifying the same real-world object.  http://dbpedia.org/resource/Berlin  http://sws.geonames.org/2950159/  URI aliases provide an important social function to the Web of Data as they are de-referenced to different descriptions of the same non-information resource and thus allow different views and opinions to be expressed.  owl:sameAs  Common Properties  rdfs:seeAlso, foaf:knows, foaf:based_near, foaf:topic_interest,…  Two approaches for linking data:  RDF Links Manually  Auto-generating RDF Links skhan@wku.ac.kr 84
  • 85. RDF Links Manually  Find data sets that are suitable linking targets, then manually search them for the URI references you want to link to.  If a data source doesn't provide a search interface, you can use Linked Data browsers like Tabulator or Disco to explore the dataset and find the right URIs.  Useful sites:  Sindice and Falcons provide indexes to identify candidate URIs for linking.  CKAN site: a registry of open linked data and projects.  Uriqr - A URI Search Engine: http://dev.uriqr.com/  Freebase: http://www.freebase.com  MOAT: Meaning Of A Tag Framework  For manually interlinking tags with Semantic Web URIs (such as URIs from DBpedia, Geonames … or any knowledge base)  Remember that data sources might use HTTP 303 redirects to redirect clients from URIs identifying non-information resources to URIs identifying information resources that describe the non-information resources.
  • 86. Auto-generating RDF Links  Various approaches  Pattern-based Algorithms  Similarity-based Approaches  Complex property-based Algorithms  Yves Equivalence Miner: interlinking Jamendo and Musicbrainz.  Equivalence Mining and Matching Frameworks  Silk - A Link Discovery Framework for the Web of Data.  Silk can be run on a single machine or on a Hadoop cluster (for instance Amazon EC2).  LIMES - Link Discovery Framework for Metric Spaces.  time-efficient and lossless approaches for large-scale link discovery based on the characteristics of metric spaces.  DSNotify - Detecting and Fixing Broken Links in Linked Data Sets  TopBraid Composer  a wizard for linking ontology instances to corresponding DBpedia concepts.  SemMF  a flexible framework for calculating semantic similarity between objects that are represented as arbitrary RDF graphs. http://esw.w3.org/TaskForces/CommunityProjects/LinkingOpenData/EquivalenceMining skhan@wku.ac.kr 86
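As a rough illustration of the similarity-based approach listed above, the sketch below compares labels from a local data set with labels from a remote one and proposes `owl:sameAs` links above a threshold. It is a toy stand-in, not the matching logic of Silk or LIMES: the function names, the 0.9 threshold, the `www.example.com` URIs, and the tiny label tables are all assumptions, and real link discovery compares several properties with purpose-built metrics.

```python
from difflib import SequenceMatcher

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def propose_links(local: dict, remote: dict, threshold: float = 0.9) -> list:
    """Propose (local URI, owl:sameAs, remote URI) triples.

    Both arguments map resource URIs to human-readable labels.
    """
    links = []
    for local_uri, local_label in local.items():
        best_uri, best_score = None, 0.0
        for remote_uri, remote_label in remote.items():
            score = similarity(local_label, remote_label)
            if score > best_score:
                best_uri, best_score = remote_uri, score
        # Keep only the best candidate, and only if it is convincing.
        if best_uri is not None and best_score >= threshold:
            links.append((local_uri, OWL_SAME_AS, best_uri))
    return links


# Hypothetical local resources and a small excerpt of remote labels.
local_labels = {
    "http://www.example.com/city/berlin": "Berlin",
    "http://www.example.com/city/atlantis": "Atlantis",
}
remote_labels = {
    "http://dbpedia.org/resource/Berlin": "Berlin",
    "http://dbpedia.org/resource/Seoul": "Seoul",
}
links = propose_links(local_labels, remote_labels)
```

Label matching alone produces false positives for ambiguous names, so proposed links are usually reviewed or confirmed with additional properties before being published.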
  • 87. 8. Describe Data Sets • Metadata for Description
  • 88. Publishing Descriptions of a Data set  Help others discover and index your data  Apply a license or waiver to your data set  Metadata about the published linked data set  authorship of a data set, its currency (i.e., how recently the data set was updated), its provenance, and its licensing terms  Important issues:  Provenance:  the ability to track the origin of data  key component in building trustworthy, reliable applications  Open Provenance Model  Licenses vs. Waivers  Norms: a means for data publishers who waive their legal rights (through application of a waiver) to define expectations they have about how the data is used  Two primary mechanisms  Semantic Sitemaps: http://sw.deri.org/2007/07/sitemapextension/  voiD: http://semanticweb.org/wiki/VoiD
  • 89. Description
      Metadata: Metadata about published data, such as a URI identifying the author and licensing information.
      Description: Triples of the dataset that have the resource's URI as the subject.
      Backlinks: Triples of the dataset that have the resource's URI as the object. This is redundant, but it allows browsers and crawlers to traverse links in either direction.
      Related descriptions: Any additional information about related resources, e.g., returning author information together with information about a book. Take a moderate approach; do not overload the description.
      Syntax: There are various ways to serialize RDF descriptions. At least provide RDF descriptions as RDF/XML, which is the only official syntax for RDF. Additionally provide Turtle, TriX, and other serializations.
  • 90. Data Set Description: Example
      # Metadata and Licensing Information
      <http://dbpedia.org/data/Alec_Empire>
          rdfs:label "RDF description of Alec Empire" ;
          rdf:type foaf:Document ;
          dc:publisher <http://dbpedia.org/resource/DBpedia> ;
          dc:date "2007-07-13"^^xsd:date ;
          dc:rights <http://en.wikipedia.org/wiki/WP:GFDL> .

      # The description
      <http://dbpedia.org/resource/Alec_Empire>
          foaf:name "Empire, Alec" ;
          rdf:type foaf:Person ;
          rdf:type <http://dbpedia.org/class/yago/musician> ;
          rdfs:comment "Alec Empire (born May 2, 1972) is a German musician who is ..."@en ;
          rdfs:comment "Alec Empire (eigentlich Alexander Wilke) ist ein deutscher Musiker. ..."@de ;
          dbpedia:genre <http://dbpedia.org/resource/Techno> ;
          dbpedia:associatedActs <http://dbpedia.org/resource/Atari_Teenage_Riot> ;
          foaf:page <http://en.wikipedia.org/wiki/Alec_Empire> ;
          foaf:page <http://dbpedia.org/page/Alec_Empire> ;
          rdfs:isDefinedBy <http://dbpedia.org/data/Alec_Empire> ;
          owl:sameAs <http://zitgist.com/music/artist/d71ba53b-23b0-4870-a429-cce6f345763b> .
  • 91. Data Set Description: Example
      # Backlinks
      <http://dbpedia.org/resource/60_Second_Wipeout>
          dbpedia:producer <http://dbpedia.org/resource/Alec_Empire> .
      <http://dbpedia.org/resource/Limited_Editions_1990-1994>
          dbpedia:artist <http://dbpedia.org/resource/Alec_Empire> .
  • 92. 9. Publish Data Sets • Serialization • Linked Data Storage • Test and Debugging
  • 93. Publishing Linked Data  Serialization of Data
      Publication Method | Advantages | Disadvantages
      RDF/XML Document | Oldest, best supported | Confusingly like normal XML
      Turtle (N3) Document | Simplest | Not technically a standard yet
      HTML Document with RDFa | Fits inside HTML, but also RDF | Can get very complicated
      JSON | Normal JSON, but also RDF | Promising, but still being developed
      GRDDL | Use the XML you have/want | Needs to download+run XSLT
      SPARQL Query Protocol | |
     RDF files shouldn't be larger than, say, a few hundred kilobytes. Break larger data sets up into several RDF files.
     Make sure multiple RDF files are linked to each other through RDF triples.
  • 94. Examples
      RDF/XML
      <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
               xmlns:db="http://dbpedia.org/resource/">
        <rdf:Description rdf:about="http://dbpedia.org/resource/Massachusetts">
          <db:Governor>
            <rdf:Description rdf:about="http://dbpedia.org/resource/Deval_Patrick" />
          </db:Governor>
          <db:Nickname>Bay State</db:Nickname>
          <db:Capital>
            <rdf:Description rdf:about="http://dbpedia.org/resource/Boston">
              <db:Nickname>Beantown</db:Nickname>
            </rdf:Description>
          </db:Capital>
        </rdf:Description>
      </rdf:RDF>

      Turtle
      @prefix db: <http://dbpedia.org/resource/> .
      db:Massachusetts
          db:Governor db:Deval_Patrick ;
          db:Nickname "Bay State" ;
          db:Capital db:Boston .
      db:Boston db:Nickname "Beantown" .
  • 95. Examples
      RDFa
      <?xml version="1.0" encoding="UTF-8"?>
      <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"
          "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
      <html xmlns="http://www.w3.org/1999/xhtml"
            xmlns:db="http://dbpedia.org/resource/"
            version="XHTML+RDFa 1.0">
        <head>
          <title>About Massachusetts</title>
        </head>
        <body>
          <div about="http://dbpedia.org/resource/Massachusetts">
            The Massachusetts governor is
            <span rel="db:Governor">
              <span about="http://dbpedia.org/resource/Deval_Patrick">Deval Patrick</span>,
            </span>
            the nickname is "<span property="db:Nickname">Bay State</span>",
            and the capital
            <span rel="db:Capital">
              <span about="http://dbpedia.org/resource/Boston">
                has the nickname "<span property="db:Nickname">Beantown</span>".
              </span>
            </span>
          </div>
        </body>
      </html>
  • 96. Examples
      RDF-JSON
      {
        "__iri": "db:Massachusetts",
        "db:Nickname": "Bay State",
        "db:Governor": { "__iri": "db:Deval_Patrick" },
        "db:Capital": { "__iri": "db:Boston", "db:Nickname": "Beantown" },
        "__prefixes": { "db:": "http://dbpedia.org/resource/" }
      }

      GRDDL
      <MyDataSet xmlns="http://example.org/my-data-xml-namespace">
        <State>
          <name>Massachusetts</name>
          <governor>Deval_Patrick</governor>
          <nickname>Bay State</nickname>
          <capital>
            <name>Boston</name>
            <nickname>Beantown</nickname>
          </capital>
        </State>
      </MyDataSet>
  • 97. Linked Data Storage  RDB to RDF Middleware  D2R Server  Native RDF Storage (manage it yourself)  4Store  AllegroGraph  Bigdata  BigOWLIM  Jena TDB  Neo4j  Sesame  Virtuoso  Native RDF Storage (managed)  Talis Platform  Pubby  Linked Data front-end for SPARQL Endpoints  Paget Framework skhan@wku.ac.kr 97
  • 98. Testing and Debugging Linked Data  Test published data to ensure it adheres to the Linked Data principles and best practices  Checking that URIs dereference correctly  Vapour Linked Data Validator at http://idi.fundacionctic.org/vapour  RDF:Alerts at http://swse.deri.org/RDFAlerts/  Sindice Inspector at http://inspector.sindice.com/  Manual validation and debugging of Linked Data  cURL, and the Firefox browser extensions LiveHTTPHeaders and ModifyHeaders  Linked Data browsers can also be used for technical debugging and validation  Tabulator, Marbles, LOD Browser Switch
  • 99. Summary: Linked Data  Semantic Technologies need to go where the data is! Long Live Semantic Technology! Early adoption of Semantic Technology is the king!  Growth in data volumes is very rapid.  Link, Integrate, Reuse  Linked Data is a truly Web-friendly way of publishing data.  Linked Data is the common global data space.  Gun for killer apps of semantic technology…  Catalyst and enabler to make semantic technology real…  Unlimited opportunities ahead…
  • 100. References  Keith Alexander, Richard Cyganiak, Michael Hausenblas, and Jun Zhao, Describing linked datasets, In Proceedings of the WWW2009 Workshop on Linked Data on the Web, 2009.  Tim Berners-Lee, Linked Data - Design Issues, 2006, http://www.w3.org/DesignIssues/LinkedData.html.  Tim Berners-Lee, Giant global graph, http://dig.csail.mit.edu/breadcrumbs/node/215, 2007.  Christian Bizer, Tom Heath, and Tim Berners-Lee, Linked data - the story so far, Int. J. Semantic Web Inf. Syst., 5(3):1–22, 2009.  Chris Bizer, Richard Cyganiak, and Tom Heath, How to Publish Linked Data on the Web, http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/  W3C Working Draft, Cool URIs for the Semantic Web, http://www.w3.org/TR/2008/WD-cooluris-20080321/  http://data.gov.uk/linked-data  http://www.w3.org/2001/sw/Specs.html  Auer, S., Dietzold, S., Lehmann, J., Hellmann, S., and Aumueller, D. (2009). Triplify : lightweight linked data publication from relational databases. In Proceedings of the 17th International Conference on World Wide Web, WWW 2009, Madrid, Spain, April 20-24, 2009  A Survey of current approaches for mapping of relational databases to RDF: http://esw.w3.org/topic/Rdb2RdfXG/StateOfTheArt  Miles et al.: Best Practices Recipes for Publishing RDF Vocabularies, Available at: http://www.w3.org/TR/swbp-vocab-pub/ skhan@wku.ac.kr 100
  • 101. Semantic Technology Your World, Your Way skhan@wku.ac.kr