Semantic web

16 May 2012


  1. Semantic Web
  2. Overview  What is Semantic Web  Semantic Web Vision  Semantic Web Layers  RDF, RDFS, OWL  Tools  GATE  Applications
  3. What is Semantic Web?  Semantic means that the meaning of data can be discovered by computers  "The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation." - Tim Berners-Lee
  4. Definition  The Semantic Web is a project to create a universal medium for information exchange by putting documents with computer-processable meaning (semantics) on the World Wide Web  The Semantic Web extends the Web through the use of standards, markup languages and related processing tools
  5. The aims of Semantic Web  Indexing and retrieving information  Annotation  The Web as an interoperable database  Machine retrieval of data  Web-based services  Discovery of services  Intelligent software agents
  6. Semantic Web Vision  Oriented toward machine-readable resources rather than human-readable  Requires resources to be described so that machines know what they mean  Description in terms of metadata  Use of logic interpretation for inference
  7. Semantic Web Layers
  8. Semantic Web Layers  XML (Extensible Markup Language): the language framework used to define nearly all new languages for interchanging data over the Web  XML Schema: a language used to define the structure of a specific XML language
  9. Semantic Web Layers  RDF (Resource Description Framework): a language used to describe all sorts of information and metadata  RDF Schema: a framework that provides a means to specify basic vocabularies for specific RDF application languages to use
  10. Semantic Web Layers  Ontology: defines vocabularies and establishes the usage of words and terms in the context of a specific vocabulary  Logic and Proof: used to establish the consistency and correctness of data sets and to infer conclusions that aren’t explicitly stated
  11. Semantic Web agents  Metadata will be used to identify and extract information from Web sources.  Ontologies will be used to assist in Web searches, to interpret retrieved information, and to communicate with other agents.  Logic will be used for processing retrieved information and for drawing conclusions.
  12. RDF
  13. RDF • “Resource Description Framework” • RDF is a data model • Originally for describing metadata for web pages • Structured information • Universal, machine-readable data exchange model • Syntax uses XML for serialization • Statements can be modeled with • Resources: an element, a URI, a literal • Properties: a directed relation between two resources • Statements: triples of two resources linked by a property
  14. RDF • Generally, a triple can be viewed as a graph • Both “subject” and “object” are graph nodes • “Properties” are the edges • The XML syntax is only a practical serialization of the graph, not the model itself • Components • URIs – for referencing resources • Literals – data values • Empty nodes (blank nodes) – for talking about something which doesn’t have a name
  15. RDF Example • Subject: URIs and empty nodes • Predicate: URIs ( also called properties) • Object: URIs and empty nodes and literals
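The graph view of triples described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `http://example.org/` URIs, the `_:b0` blank-node label, and the helper names `to_graph` and `nodes` are hypothetical, and URIs and literals are both kept as plain strings for brevity.

```python
from collections import defaultdict

# Hypothetical example.org URIs; literals are plain Python strings here.
EX = "http://example.org/"
triples = [
    (EX + "book1", EX + "title", "War and Peace"),   # object is a literal
    (EX + "book1", EX + "author", "_:b0"),           # "_:b0" stands in for a blank node
    ("_:b0", EX + "name", "Leo Tolstoy"),
]


def to_graph(triples):
    """Group triples into an adjacency map: subject -> [(predicate, object), ...]."""
    graph = defaultdict(list)
    for s, p, o in triples:
        graph[s].append((p, o))
    return dict(graph)


def nodes(triples):
    """Subjects and objects are the graph nodes; predicates only label the edges."""
    return {s for s, _, _ in triples} | {o for _, _, o in triples}


graph = to_graph(triples)
```

A real RDF library would keep URIs, literals and blank nodes as distinct types; a single string type is used here only to keep the sketch short.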
  16. XML syntax for RDF Example
  17. RDF Example
  18. RDF XML Code Example <?xml version="1.0"?> <rdf:RDF xmlns:rdf="" xmlns:dc="" xmlns:exterms=""> <rdf:Description rdf:about=""> <exterms:creation-date>August 16, 1999</exterms:creation-date> <dc:language>en</dc:language> <dc:creator rdf:resource=""/> </rdf:Description> </rdf:RDF>
  19. A simple example  “The book has the title War and Peace”  Graphical RDF statement: [The book] --has the title--> [War and Peace]  RDF in an XML document <?xml version="1.0"?> <rdf:RDF xmlns:rdf="" xmlns:dc=""> <rdf:Description rdf:about=""> <dc:title>War and Peace</dc:title> </rdf:Description> </rdf:RDF>
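As a rough illustration of how such an RDF/XML document maps back to triples, the sketch below reads a "War and Peace" description with Python's standard `xml.etree.ElementTree`. The URIs in the slides are blank, so the ones used here are assumptions: the standard RDF and Dublin Core element namespaces, plus a hypothetical `http://example.org/books/war-and-peace` subject. The helper `simple_triples` is likewise hypothetical and handles only flat `rdf:Description` blocks, not the full RDF/XML syntax.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Assumed URIs: the slide leaves every namespace and rdf:about value empty.
RDF_XML = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/books/war-and-peace">
    <dc:title>War and Peace</dc:title>
  </rdf:Description>
</rdf:RDF>"""


def simple_triples(rdf_xml):
    """Extract (subject, predicate, object) from flat rdf:Description elements."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree tags look like "{namespace}local"; join them into one URI.
            predicate = prop.tag[1:].replace("}", "")
            resource = prop.get(f"{{{RDF_NS}}}resource")
            # rdf:resource gives a URI object; otherwise the element text is a literal.
            obj = resource if resource is not None else (prop.text or "").strip()
            triples.append((subject, predicate, obj))
    return triples


book_triples = simple_triples(RDF_XML)
```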
  20. Ontology  We can express an ontology as: Ontology = <taxonomy, inference rules>, and a taxonomy as: Taxonomy = <{classes}, {relations}>  Ontology languages (RDFS, OWL) have formal foundations that allow us to infer additional (implicit) statements
  21. RDF Schema  Intended to structure RDF resources  RDFS  Set theory – rdfs:Class  Relation – rdf:Property, rdfs:domain, rdfs:range  Hierarchy – rdfs:subClassOf, rdfs:subPropertyOf  Built-in datatypes – xsd:string, xsd:dateTime
  22. RDF & RDFS  RDF is a graphical formalism (+ XML syntax + semantics)  for representing metadata  for describing the semantics of information in a machine-accessible way  RDFS extends RDF with a “schema vocabulary”, e.g.:  Class, Property  type, subClassOf, subPropertyOf  range, domain
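To make the role of the schema vocabulary more concrete, here is a minimal sketch of four RDFS-style inference rules (transitive `subClassOf`, type propagation to superclasses, `domain`, and `range`) applied until nothing new is derived. The class and property names in `facts` are hypothetical, prefixed names are kept as plain strings, and the sketch is not a complete RDFS reasoner (for instance, it does not stop the `range` rule from typing a literal).

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"


def rdfs_closure(triples):
    """Apply four RDFS-style rules repeatedly until no new triple appears."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            if p == SUBCLASS:
                # subClassOf is transitive.
                new |= {(s, SUBCLASS, o2) for s2, p2, o2 in inferred
                        if p2 == SUBCLASS and s2 == o}
                # Instances of the subclass are instances of the superclass.
                new |= {(x, RDF_TYPE, o) for x, p2, c in inferred
                        if p2 == RDF_TYPE and c == s}
            elif p == DOMAIN:
                # Anything used as a subject of property s has type o.
                new |= {(x, RDF_TYPE, o) for x, p2, _ in inferred if p2 == s}
            elif p == RANGE:
                # Anything used as an object of property s has type o.
                new |= {(y, RDF_TYPE, o) for _, p2, y in inferred if p2 == s}
        if new <= inferred:
            return inferred
        inferred |= new


# Hypothetical vocabulary and data.
facts = {
    (":Novel", SUBCLASS, ":Book"),
    (":Book", SUBCLASS, ":Work"),
    (":author", DOMAIN, ":Book"),
    (":author", RANGE, ":Person"),
    (":warAndPeace", ":author", ":tolstoy"),
    (":annaKarenina", RDF_TYPE, ":Novel"),
}
closure = rdfs_closure(facts)
```

With these facts, the closure contains statements that were never asserted, such as `:tolstoy` being a `:Person` and `:annaKarenina` being a `:Work`.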
  23. Limitations of RDF/RDFS  No primitive data types (integer, etc.) of their own: typed values rely on externally defined datatypes such as those of XML Schema, and untyped literals are treated as plain strings.  No standard for expressing characteristics of properties (unique, transitive, inverse, etc.)  No standard for expressing whether enumerations are closed.  No standard for expressing equivalence, disjointness, etc. among properties
  24. OIL and DAML  RDF/RDFS define a framework, but they have limitations. There is a need for new Semantic Web languages that meet the following requirements:  They should be compatible with XML and RDF/RDFS  They should have enough expressive power to fill in the gaps in RDFS  They should provide automated reasoning support  Ontology Inference Layer (OIL) and DARPA Agent Markup Language (DAML) are two important efforts developed to fulfill these requirements.  Their combined efforts formed the DAML+OIL declarative semantic language.
  25. OIL and DAML  DAML+OIL is built on top of RDFS.  It uses RDFS syntax.  It has richer ways to express primitive data types.  DAML+OIL allows other relationships (inverse and transitivity) to be directly expressed.  DAML+OIL provides well-defined semantics, which means that:  The meaning of DAML+OIL statements can be formally specified.  Machine understanding and automated reasoning can be supported.  More expressive power can be provided.
  26. Example: T. Rex is not a herbivore and not a currently living species.  This statement can be expressed in DAML+OIL, but not in RDF/RDFS, since RDF/RDFS cannot express disjointness.  With such expressive power, DAML+OIL supports automated reasoning.  For instance, a software agent can find the “list of all the carnivores that won’t be any threat today” by processing the DAML+OIL data representation of the example above.  RDF/RDFS does not express “is not” relationships and exclusions.
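The T. Rex example can be approximated with a toy model of disjointness. This is not DAML+OIL syntax, only a Python sketch of the reasoning step: if an individual belongs to a class that is declared disjoint with another class, it is known not to belong to that other class. The class names, the individuals, and the helpers `known_not` and `harmless_carnivores` are all hypothetical.

```python
# Declared disjoint class pairs (hypothetical ontology).
DISJOINT = {
    frozenset({"Carnivore", "Herbivore"}),
    frozenset({"Living", "Extinct"}),
}

# Asserted class memberships (hypothetical data).
TYPES = {
    "T. Rex": {"Carnivore", "Extinct"},
    "Lion": {"Carnivore", "Living"},
    "Triceratops": {"Herbivore", "Extinct"},
}


def known_not(individual, cls):
    """True if some asserted class of the individual is declared disjoint with cls."""
    return any(frozenset({c, cls}) in DISJOINT for c in TYPES[individual])


def harmless_carnivores():
    """Carnivores that are provably not living, i.e. no threat today."""
    return sorted(
        name for name, classes in TYPES.items()
        if "Carnivore" in classes and known_not(name, "Living")
    )
```

Without the disjointness declarations, nothing in the data would rule out T. Rex also being a herbivore or a living species, which is the gap in RDF/RDFS that the slide points at.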
  27. OWL
  28. Web Ontology Language = OWL  OWL is an extra layer, a bit like RDFS  own namespace, own terms  it relies on RDF Schemas  It is a separate recommendation  actually… there is a 2004 version of OWL (“OWL 1”)  and there is an update (“OWL 2”) published in 2009
  29. OWL: Web Ontology Language  OWL is a vocabulary extension of RDF and is derived from the DAML+OIL Web Ontology Language.  OWL  Description Logic  Class, Thing, Nothing  DatatypeProperty, ObjectProperty, AnnotationProperty, …  Class  oneOf, disjointWith, unionOf, complementOf, intersectionOf, …  Restriction, onProperty, cardinality, hasValue, …  Property  inverseOf, TransitiveProperty, SymmetricProperty  FunctionalProperty, InverseFunctionalProperty  Equality – equivalentClass, sameAs, differentFrom, …  Ontology annotation – Ontology, imports, versionInfo
  30. Term equivalences  For classes:  owl:equivalentClass: two classes have the same individuals  owl:disjointWith: no individuals in common  For properties:  owl:equivalentProperty  remember the a:author vs. f:auteur?  owl:propertyDisjointWith
  31. Term equivalences  For individuals:  owl:sameAs: two URIs refer to the same concept (“individual”)  owl:differentFrom: negation of owl:sameAs
  32. Example owl:equivalentProperty a:author f:auteur owl:equivalentClass a:Novel f:Roman
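One way to picture what these equivalences buy is to let data written with one vocabulary be read with the other. The sketch below copies each triple across declared equivalences in a single pass. The function names and sample data are hypothetical, and chains of equivalences (A ≡ B, B ≡ C) are deliberately not followed.

```python
RDF_TYPE = "rdf:type"


def _symmetric_index(pairs):
    """Map each term to the set of terms declared equivalent to it."""
    index = {}
    for left, right in pairs:
        index.setdefault(left, set()).add(right)
        index.setdefault(right, set()).add(left)
    return index


def expand_equivalences(triples, equivalent_properties=(), equivalent_classes=()):
    """Add the triples implied by equivalentProperty / equivalentClass declarations."""
    prop_eq = _symmetric_index(equivalent_properties)
    class_eq = _symmetric_index(equivalent_classes)
    facts = set(triples)
    for s, p, o in triples:
        # A statement made with one property also holds for its equivalents.
        for q in prop_eq.get(p, ()):
            facts.add((s, q, o))
        # Membership in one class implies membership in its equivalents.
        if p == RDF_TYPE:
            for c in class_eq.get(o, ()):
                facts.add((s, RDF_TYPE, c))
    return facts


# Hypothetical data written with the French vocabulary.
data = [
    ("<book1>", "f:auteur", "<tolstoi>"),
    ("<book1>", RDF_TYPE, "f:Roman"),
]
merged = expand_equivalences(
    data,
    equivalent_properties=[("a:author", "f:auteur")],
    equivalent_classes=[("a:Novel", "f:Roman")],
)
```

After expansion, a query phrased in terms of `a:author` or `a:Novel` also finds the French-vocabulary data.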
  33. Property characterization  In OWL, one can characterize the behavior of properties (symmetric, transitive, functional, reflexive, inverse functional…)  One property can be defined as the “inverse” of another
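A small sketch can show what three of these characterizations (symmetric, transitive, inverse) imply for the data. The property names and the function `apply_property_axioms` are hypothetical, and this is a naive fixed-point loop rather than how an OWL reasoner is actually implemented.

```python
def apply_property_axioms(triples, symmetric=(), transitive=(), inverse_of=None):
    """Derive triples implied by symmetric, transitive and inverse properties."""
    inverse_of = dict(inverse_of or {})
    # owl:inverseOf works in both directions.
    inverse_of.update({q: p for p, q in list(inverse_of.items())})
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            if p in symmetric:
                new.add((o, p, s))
            if p in inverse_of:
                new.add((o, inverse_of[p], s))
            if p in transitive:
                # Chain s -p-> o -p-> o2 into s -p-> o2.
                new |= {(s, p, o2) for s2, p2, o2 in facts if p2 == p and s2 == o}
        if new <= facts:
            return facts
        facts |= new


# Hypothetical data and axioms.
result = apply_property_axioms(
    {
        (":a", ":ancestorOf", ":b"),
        (":b", ":ancestorOf", ":c"),
        (":x", ":marriedTo", ":y"),
        (":tolstoy", ":wrote", ":warAndPeace"),
    },
    symmetric={":marriedTo"},
    transitive={":ancestorOf"},
    inverse_of={":wrote": ":writtenBy"},
)
```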
  34. What this means is…  If the following holds in our triples: :email rdf:type owl:InverseFunctionalProperty. <A> :email "mailto:a@b.c". <B> :email "mailto:a@b.c".
  35. What this means is…  If the following holds in our triples: :email rdf:type owl:InverseFunctionalProperty. <A> :email "mailto:a@b.c". <B> :email "mailto:a@b.c". then, processed through OWL, the following holds, too: <A> owl:sameAs <B>.
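The inference above can be sketched as follows: group subjects by the value of each inverse functional property, and report every pair of subjects that share a value as `owl:sameAs` candidates. The function name `same_as_from_ifp` and the third subject `<C>` are hypothetical additions for illustration.

```python
from collections import defaultdict
from itertools import combinations


def same_as_from_ifp(triples, inverse_functional):
    """Return pairs of subjects that share a value of an inverse functional property."""
    subjects_by_value = defaultdict(set)
    for s, p, o in triples:
        if p in inverse_functional:
            subjects_by_value[(p, o)].add(s)
    pairs = set()
    for subjects in subjects_by_value.values():
        # Every two subjects with the same (property, value) denote the same individual.
        pairs.update(combinations(sorted(subjects), 2))
    return pairs


triples = [
    ("<A>", ":email", "mailto:a@b.c"),
    ("<B>", ":email", "mailto:a@b.c"),
    ("<C>", ":email", "mailto:c@d.e"),   # hypothetical extra subject with a different value
]
same = same_as_from_ifp(triples, inverse_functional={":email"})
```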
  36. Keys “if two persons have the same emails and the same homepages then they are identical”  Identification is based on the identical values of two properties  The rule applies to persons only
  37. Previous rule in OWL :Person rdf:type owl:Class; owl:hasKey (:email :homepage) .
  38. What it means is… If: <A> rdf:type :Person ; :email "mailto:a@b.c"; :homepage "". <B> rdf:type :Person ; :email "mailto:a@b.c"; :homepage "". then, processed through OWL, the following holds, too: <A> owl:sameAs <B>.
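A key can be sketched in the same spirit: two members of the class are identified only if they share a value for every key property. The homepage URLs are left blank in the slides, so the `http://example.org/...` values below are assumptions, as are the extra subjects `<C>` and `<D>` and the function name `same_as_from_key`.

```python
from itertools import combinations


def same_as_from_key(triples, cls, key_properties):
    """Pairs of members of cls that share a value for every key property."""
    members = sorted({s for s, p, o in triples if p == "rdf:type" and o == cls})

    def values(subject, prop):
        return {o for s, p, o in triples if s == subject and p == prop}

    return {
        (a, b)
        for a, b in combinations(members, 2)
        if all(values(a, k) & values(b, k) for k in key_properties)
    }


people = [
    ("<A>", "rdf:type", ":Person"),
    ("<A>", ":email", "mailto:a@b.c"),
    ("<A>", ":homepage", "http://example.org/home"),
    ("<B>", "rdf:type", ":Person"),
    ("<B>", ":email", "mailto:a@b.c"),
    ("<B>", ":homepage", "http://example.org/home"),
    # Same email but a different homepage: the key does not match.
    ("<C>", "rdf:type", ":Person"),
    ("<C>", ":email", "mailto:a@b.c"),
    ("<C>", ":homepage", "http://example.org/other"),
    # Matching values but not typed as :Person: the key does not apply.
    ("<D>", ":email", "mailto:a@b.c"),
    ("<D>", ":homepage", "http://example.org/home"),
]
identical = same_as_from_key(people, ":Person", [":email", ":homepage"])
```

Unlike an inverse functional property, the key is scoped to one class and requires all listed properties to agree, which is why `<C>` and `<D>` are not merged with `<A>`.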
  39. Classes in OWL  In RDFS, you can subclass existing classes… that’s all  In OWL, you can construct classes from existing ones:  enumerate its content  through intersection, union, complement  etc
  40. Enumerate class content :Currency rdf:type owl:Class; owl:oneOf (:€ :£ :$).  I.e., the class consists of exactly those individuals and nothing else
  41. Union of classes :Novel rdf:type owl:Class. :Short_Story rdf:type owl:Class. :Poetry rdf:type owl:Class. :Literature rdf:type owl:Class; owl:unionOf (:Novel :Short_Story :Poetry).  Other possibilities: owl:complementOf, owl:intersectionOf, …
  42. For example… If: :Novel rdf:type owl:Class. :Short_Story rdf:type owl:Class. :Poetry rdf:type owl:Class. :Literature rdf:type owl:Class; owl:unionOf (:Novel :Short_Story :Poetry). <myWork> rdf:type :Novel . then the following holds, too: <myWork> rdf:type :Literature .
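The union example can be sketched as a one-directional rule: membership in any part implies membership in the union class. The function name `infer_union_membership` and the dictionary layout are hypothetical.

```python
RDF_TYPE = "rdf:type"


def infer_union_membership(triples, unions):
    """unions maps a union class to the classes it is the union of."""
    facts = set(triples)
    for union_class, parts in unions.items():
        for s, p, o in triples:
            # An instance of any part is an instance of the union.
            if p == RDF_TYPE and o in parts:
                facts.add((s, RDF_TYPE, union_class))
    return facts


unions = {":Literature": [":Novel", ":Short_Story", ":Poetry"]}
works = infer_union_membership([("<myWork>", RDF_TYPE, ":Novel")], unions)
```

The reverse direction is weaker: knowing that something is `:Literature` only says it belongs to at least one of the parts, which cannot be written down as a single new triple.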
  43. What we have so far…  The OWL features listed so far are already fairly powerful  E.g., various databases can be linked via owl:sameAs, functional or inverse functional properties, etc.  Many inferred relationships can be found using a traditional rule engine
  44. The most used Semantic Web Tools  RDF Gateway: runs both a Web application server and a database designed to handle RDF content  Jena: a Java API for RDF  Smore: Semantic Markup, Ontology and RDF Editor  Drive: a C# API that parses and validates RDF documents.
  45. General Architecture for Text Engineering (GATE)
  46. What is GATE?  An architecture: a macro-level organisational picture for language engineering (LE) software systems.  A framework: for programmers, GATE is an object-oriented class library that implements the architecture.  A development environment: for language engineers, computational linguists et al., GATE is a graphical development environment bundled with a set of tools for doing e.g. Information Extraction.  Some free components, and wrappers for other people's components.  Tools for: evaluation; visualise/edit; persistence; IR; IE; dialogue; ontologies; etc.
  47. Where did GATE come from? A number of researchers in the early to mid-1990s (e.g. in TIPSTER) identified the following trends and needs: • An increasing trend towards multi-site collaborative projects • The role of engineering in scalable, reusable, and portable HLT solutions • Support for large data, in multiple media, languages, formats, and locations • Lowering the cost of creating new language processing components • Promoting quantitative evaluation metrics via tools and a level playing field History: • 1996 – 2002: GATE version 1, proof of concept • March 2002: version 2, rewritten in Java, component based, more users • Fall 2003: new development cycle
  48. Applications  Swoogle  DBpedia  Flickr  PhotoStuff
  49. Swoogle • Swoogle is a crawler-based indexing and retrieval system for the Semantic Web • Swoogle crawls and discovers documents written in RDF and OWL • Swoogle classifies a Semantic Web Document (SWD) as either: • Semantic Web Ontology (SWO) – defines new terms • Semantic Web Database (SWDB) – makes assertions about individuals