Intro to the semantic web (for libraries)

  1. 1. Introduction to the Semantic Web Robin Fay @georgiawebgurl robin fay @georgiawebgurl 2013
  2. 2. Objectives: By the end of this session, you’ll have an understanding of the basic principles and terminology of: – the Semantic Web – linked data – library data in semantic web space • BIBFRAME • RDA • FRBR
  3. 3. The web as we know it (and think of it) links together documents (HTML, PDF, dynamic documents created from databases, etc.)
  4. 4. In brief: Types of metadata: • Descriptive • Structural • Administrative • Many forms of metadata include elements of each of these; which ones depends on the schema. • A schema is a set of rules covering the elements and requirements for coding. Examples of common schemas in the library world include Dublin Core, TEI, EAD, and others. Examples of schemas in the semantic web include Dublin Core, FOAF (Friend of a Friend), and many others.
  5. 5. •Much library metadata is highly structured and created by trained professionals. In the library world, MARC has been a long-term standard. While it can be rigid, its structured nature can make it easier to crosswalk and harvest into other databases. •SEO (Search Engine Optimization) is a common term in the web world; these experts assign descriptive and administrative (usually copyright) metadata to websites; their goal is generally a higher ranking in search results. Given that search engine algorithms change regularly, SEO is a highly dynamic field, which can lead to inconsistencies in metadata application, making it harder for databases and search engines to harvest. •In a nutshell, most library metadata has rules and standards; metadata in the web world is often (but not always) more flexible. The Semantic Web will need to manage (and make sense of!) all of these types of metadata.
  6. 6. •At its core, the semantic web comprises: o a set of design principles, o collaborative working groups, o and a variety of enabling technologies. •Some elements of the semantic web are expressed as prospective future possibilities that are yet to be implemented or realized AND •Other elements of the semantic web are expressed in formal specifications -- (wikipedia, 2009) Robin Fay, robinfay.net 2009/10
  7. 7. •Semantic web and web metadata frequently come from outside the library community, working in parallel with it or sometimes at odds with it. •Metadata in libraries takes a wide variety of forms; one of the most common metadata schemas is MARC. •MARC is formatted using ISBD punctuation; the content of what goes into a record is controlled by our cataloging rules (such as RDA). RDA can be applied using different metadata schemas, although for now many libraries are still in a MARC-based world.
  8. 8. robin fay @georgiawebgurl 2013
  9. 9. •RDF = Resource Description Framework •RDFS = Resource Description Framework Schema (RDF Schema) •OWL = Web Ontology Language •URI = Uniform Resource Identifier (think of a unique identifier; URLs are one kind of URI) Many terms associated with the Semantic Web are used in, or based upon, the information architecture, database, information science, and library science fields: controlled vocabularies, structural elements, etc.
  10. 10. RDF = Resource Description Framework • is a general-purpose language for representing information on the Web (a metadata data model) • is a W3C specification • is a conceptual description • is based upon making statements about web resources (triples) • is often written as XML (RDF/XML), although RDF itself is a data model rather than an XML format • We can express RDA in RDF • Think sentence structure: • subject – predicate (verb) – object • My dog eats dogfood.
  11. 11. So, we have the framework, but how do we apply it? RDFS = Resource Description Framework Schema o A schema is: – an outline: a schematic or preliminary plan – a structure described in a formal language supported by the database management system; in a relational database (such as MySQL), the schema defines the tables, the fields in each table, and the relationships between fields and tables – a description of the structure and rules a document must satisfy for an XML document type – http://tinyurl.com/yj442vr (define: schema -- google) – Dublin Core is a schema
  12. 12. Very simple Dublin Core in RDF. We can combine schemas.
  13. 13. OWL = Web Ontology Language •invented to link ontologies, which are classification systems •attempts to define objects and their relationships •comes in different “flavors” •“interpreted as a set of "individuals" and a set of "property assertions" which relate these individuals to each other” (wikipedia 2009) •not a requirement •Sounds familiar to catalogers, right?
  14. 14. robin fay @georgiawebgurl 2013
  15. 15. Because this is data driven, we can query it using SPARQL, a standard query language. We’ll talk more about SPARQL later…
  16. 16. •Linked data is: “about using the Web to connect related data that wasn't previously linked, or using the Web to lower the barriers to linking data.” •Think: related records, series records, authority files •Libraries already link data. •Projects such as the NYT Linked Open Data project and the Virtual International Authority File (VIAF) project are sources of controlled vocabularies. •Verified digital identity accounts such as openID and claimID can be used to differentiate names.
  17. 17. • What are linked data and open data? o Linked data is about reusing data o We already do some linked data in our library catalogs and even in our daily lives o The link in a bibliographic record (like an authority record link) is linking data behavior o A link that we share with our friends on Facebook is linked data (of sorts) • Linked data is a link to a record/data/content that can then be utilized in some way • Open data is data that is available to be used in some way with no barriers to access (licensing, etc.)
  18. 18. Basic principles of linked data: It keeps us from having to re-enter or copy information, making our data: • reusable • easy to correct (correct one record instead of multiples) • efficient • and potentially useful to others. It can build relationships in different ways, allowing us to create temporary collections (a user could organize their search results in a way that makes sense to them) or more permanent ones (collocating ALL works by a particular author more easily; pulling together photographs more easily).
  19. 19. • Advantages (reusable data, potential to provide and build relationships, discoverability) • How library data fits into linked data o FRBR (a bibliographic FRAMEWORK which is more semantic by nature); RDA (metadata rules which are not tied to an encoding format such as MARC but can work with web standards like XML and RDF); IRs and CMSs like Drupal, which have semantic web capabilities • RDA expressed as RDFa
  20. 20. The RDF Triple: conceptual. [Figure: examples of a predicate/verb, such as “same as” and “author of”]
  21. 21. Tim Berners-Lee’s Four Rules: 1. Use URIs as names for things 2. Use HTTP URIs so that people can look up those names 3. When someone looks up a URI, provide useful information, using the standards 4. Include links to other URIs, so they can discover more things. URI = Uniform Resource Identifier
  22. 22. What can linked data do for libraries? • URIs create methods for classifying that can be used (linked to!) by others • The Library of Congress has released LCSH as linked data, and OCLC has a modified version of LCSH called FAST as linked data • Linked data is flexible enough to express entity-relationship models such as FRBR/FRAD • Sharing data across different databases (ILS, ERMS, IRs, local databases, etc.) means potentially more consistent data, collocation across resources, and users who can easily find resources regardless of source
  23. 23. Our data in a semantic viewpoint SOURCE: Getting triples from records: the role of ISBD http://www.slideshare.net/scottishlibraries/isbd-record2triples
  24. 24. Our data in a semantic view. [Figure annotations: “Bib”: record ID as subject; field role and relationship; can map to a record such as VIAF] SOURCE: Getting triples from records: the role of ISBD http://www.slideshare.net/scottishlibraries/isbd-record2triples
  25. 25. How cataloging is changing: a changing library and WEB landscape • Automation and new technologies • The web has changed • Large-scale bibliographic databases • Cooperative cataloging • Administrative desire to decrease costs • Greater variety of media in library collections (electronic!) • User expectations and needs • FRBR is our data model: semantic web friendly!
  26. 26. • FRBR will give us a way to group things in different ways, building relationships between data by WEMI (Work, Expression, Manifestation, Item) • WEMI is a hierarchy from the abstract to the actual thing owned by a library (the, well… item!) • Work and Expression can be somewhat conceptual, with lots of discussion going on; however, you can loosely think of a Work as a concept or idea which is Expressed (think of the act of creation or a performance) onto/into a physical format (which can be digital), aka a Manifestation, of which the library has a copy (Item).
  27. 27. Entity-Relationship Model (new way of storing & organizing data) • Database design model • Entity: a thing with an identity – Entities have attributes (characteristics) • Relationships – Between different entities at different levels • Provides for organization of records in a database – “clustering” • Conceptual model of abstract concepts
  28. 28. FRBR and RDA properties element sets
  29. 29. “User Tasks”: How do catalog users • Find • Identify • Select • Obtain … the resources they want?
  30. 30. Group 1 Entities (WEMI Hierarchy) • Work: a distinct intellectual or artistic creation • Expression: intellectual or artistic realization of a work • Manifestation: physical embodiment of an expression of a work • Item: single exemplar of a manifestation
  31. 31. The FRBR Entity Relationship Model
  32. 32. RDA Controlled Vocabularies • Closed: content type, media type, carrier type, mode of issuance ... and more. • Open: frequency, type of recording, language of expression, form of musical notation, relationship designators (app. I-K) ... and more.
  33. 33. RDA, FRBR, and MARC • RDA is our set of metadata rules for describing our content. • FRBR is our semantic web friendly data model. • Currently we use MARC to format our data, but we need something better. • Linked data can be the mechanism, but what about the actual records?
  34. 34. RDA, FRBR, and MARC • Bibliographic records are structured in MARC (MAchine-Readable Cataloging), which is an encoding format rather than a programming language. MARC and AACR2 have been working together for a long time, which means that compromises and workarounds have sometimes been made. This will be true for RDA, too. • MARC is a mixture of controlled access points (series, name authority, and subject headings) and free text (e.g., contents notes). This provides flexibility and structure, but more free text = less precision in searching = more work for systems to return relevant results.
  36. 36. RDA, FRBR, and MARC • MARC existed before AACR2. MARC was developed in the 1960s, before most digital technology existed: the web as we know it, ebooks, and Google did not exist. • Most current catalog systems use MARC, but there are other metadata schemas and encoding formats. • Although many systems have not fully utilized all of the fields and functionalities of MARC, it is reaching the end of its lifespan. • The next-generation systems cannot develop as MARC-only systems; we need more.
  37. 37. RDA, FRBR, and MARC • Our future systems will probably not use MARC, but some kind of semantic web friendly schema. • The Library of Congress has started a project called the Bibliographic Framework Transition Initiative (BIBFRAME). • Why? • We need something that is more flexible, is not flat in file structure, and works with a semantic framework. • We need something that works better with different metadata schemas. • This new framework will provide us with enormous functionality in our catalogs and allow us to fully use RDA. It will allow us to move forward into the semantic web world.
  38. 38. Linking data in catalogs • We have some relationships within our library catalog via the bibliographic data: bib-holding-item (a way to keep all of the parts of a particular thing together) • Bib to authority: series, subject headings (a bib record having linking field(s) to one or more other records) • Authority records: records that are not visible to the public, but that provide the linking points to our bib records and guide the user through variations of the name or title, etc.
  39. 39. Resources • LODLAM: http://lodlam.net/ • LODLAM Challenge: http://summit2013.lodlam.net/ • LODLAM Zotero Group (webliography of good stuff): https://www.zotero.org/groups/lod-lam • GLAMLOD: https://groups.google.com/group/glamlod • LC Bibliographic Framework Transition Initiative: http://www.loc.gov/marc/transition/ • LITA library linked data interest group: http://connect.ala.org/node/142470 • Use Case Tool: http://obd.jisc.ac.uk/navigate • Getting triples from records: the role of ISBD: http://www.slideshare.net/scottishlibraries/isbd-record2triples • FRBR Display Tool: http://www.loc.gov/marc/marc-functional-analysis/tool.html • Understanding FRBR: http://www.loc.gov/cds/downloads/FRBR.PDF • More materials at http://www.delicious.com/georgiawebgurl/metadata_presentation_como • Making the Digital Connection: Linked Data and Libraries
  40. 40. Resources • RDA Toolkit (online): http://www.rdatoolkit.org • LC PCC PS (free .pdf downloads): http://www.loc.gov/catdir/cpso/RDAtest/rda_LC- PCC PS.html
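
The subject-predicate-object structure described on slides 10 and 20, and the idea of querying it from slide 15, can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not real RDF tooling: triples are plain tuples, every `example.org` URI is made up for illustration, and `match` is a hypothetical helper that imitates only the simplest kind of SPARQL triple pattern. The Dublin Core and FOAF property URIs are the published terms, used together to echo the "we can combine schemas" point on slide 12.

```python
# Published property URIs from two schemas (Dublin Core and FOAF).
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

# Hypothetical resource URIs, invented for this sketch.
BOOK = "http://example.org/book/dracula"
AUTHOR = "http://example.org/person/stoker"

# Each statement is one (subject, predicate, object) triple.
triples = [
    (BOOK, DC_TITLE, "Dracula"),
    (BOOK, DC_CREATOR, AUTHOR),
    (AUTHOR, FOAF_NAME, "Bram Stoker"),
]


def match(store, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in store
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


# "Who created this book?" is a pattern with the object left open.
creators = [o for (_, _, o) in match(triples, subject=BOOK, predicate=DC_CREATOR)]

# Follow the link: the creator is itself a URI we can ask more about.
names = [o for c in creators for (_, _, o) in match(triples, subject=c, predicate=FOAF_NAME)]
print(creators, names)
```

Because the creator is stored as a URI rather than a copied string, the second lookup can follow it to further statements, which is the "hook" behaviour the notes on URIs describe.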

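The WEMI hierarchy from slides 26 and 30 can also be sketched as nested records. This is an illustrative sketch only: the class and field names (`carrier`, `form`, `barcode`) and the sample holdings are assumptions made for the example, not part of FRBR or RDA.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    barcode: str  # a single copy held by a library


@dataclass
class Manifestation:
    carrier: str  # the physical (or digital) embodiment
    items: list = field(default_factory=list)


@dataclass
class Expression:
    form: str  # one realization of the work
    manifestations: list = field(default_factory=list)


@dataclass
class Work:
    title: str  # the abstract intellectual or artistic creation
    expressions: list = field(default_factory=list)


def items_of(work):
    """Walk Work -> Expression -> Manifestation -> Item and collect every item."""
    return [
        item
        for expression in work.expressions
        for manifestation in expression.manifestations
        for item in manifestation.items
    ]


# Hypothetical holdings for one work in several formats.
dracula = Work("Dracula", [
    Expression("English text", [
        Manifestation("print volume", [Item("b001"), Item("b002")]),
        Manifestation("ebook", [Item("e001")]),
    ]),
    Expression("spoken-word recording", [
        Manifestation("audio CD", [Item("a001")]),
    ]),
])

barcodes = [item.barcode for item in items_of(dracula)]
print(barcodes)
```

Starting from the work and walking down is what lets a catalog gather every copy of a title regardless of format.
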
Editor's notes

  • For each part a URI.
  • URIs are kind of like a hook – they allow us to connect things together.
  • Elaine Svenonius (?) posited that “navigate” (finding works related to a given work by generalization, association, and aggregation…) should be added, but it is not official. Find: resources corresponding to the user’s search criteria. Identify: confirm that the resource described corresponds to that sought, or distinguish between two or more resources with similar characteristics. Select: a resource appropriate to the user’s needs. Obtain: acquire or access the resource (RDA chap. 4 (7 pp.) on acquisition and access, includes URL).
  • One of the characteristics of RDA that lets it work better in a linked data environment; many of these are registered in the online registry of RDA element sets and values (vocabularies) at http://metadataregistry.org/
  • One of the characteristics of RDA that lets it work better in a linked data environment; many of these are registered in the online registry of RDA element sets and values (vocabularies) at http://metadataregistry.org/ [Show exercises, explain, solicit questions, etc.]
  • What a FRBRized catalog should give us is better searching tools that let us see editions more easily and see related titles in different media (e.g., making it easier to find the work “Dracula” regardless of its physical format, its manifestation). Since FRBR is an entity-relationship data model that fits well with the semantic web, it will also enable us to have better, more robust, more semantic-web-like search tools (like our catalogs). FRBR also influenced RDA and FRSAD (Functional Requirements for Subject Authority Data).
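
The "correct one record instead of multiples" principle from slide 18, and the bib-to-authority links from slide 38, can be illustrated with a small sketch. Everything here is an assumption made for the example: the `example.org` authority URI, the dictionary layout, and the `display` helper are hypothetical and do not reflect any particular ILS.

```python
# Hypothetical authority file keyed by URI.
AUTH_STOKER = "http://example.org/authority/n1"
authorities = {
    AUTH_STOKER: {"preferred": "Stoker, Bram", "variants": ["Stoker, Abraham"]},
}

# Bib records store the link to the authority, not a copy of the name.
bibs = [
    {"title": "Dracula", "creator": AUTH_STOKER},
    {"title": "The Jewel of Seven Stars", "creator": AUTH_STOKER},
]


def display(bib):
    """Resolve the authority link at display time."""
    return f"{bib['title']} / {authorities[bib['creator']]['preferred']}"


before = [display(b) for b in bibs]

# One correction in the authority record...
authorities[AUTH_STOKER]["preferred"] = "Stoker, Bram, 1847-1912"

# ...shows up in every bib record that links to it.
after = [display(b) for b in bibs]
print(before)
print(after)
```

The bib records themselves are never edited; only the linked record changes, which is the reuse benefit the slides describe.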
