NISO/DCMI Webinar: International Bibliographic Standards, Linked Data, and the Impact on Library Cataloging

The International Federation of Library Associations and Institutions (IFLA) is responsible for the development and maintenance of International Standard Bibliographic Description (ISBD), UNIMARC, and the "Functional Requirements" family for bibliographic records (FRBR), authority data (FRAD), and subject authority data (FRSAD). ISBD underpins the MARC family of formats used by libraries world-wide for many millions of catalog records, while FRBR is a relatively new model optimized for users and the digital environment. These metadata models, schemas, and content rules are now being expressed in the Resource Description Framework language for use in the Semantic Web.

This webinar provides a general update on the work being undertaken. It describes the development of an Application Profile for ISBD to specify the sequence, repeatability, and mandatory status of its elements. It discusses issues involved in deriving linked data from legacy catalogue records based on monolithic and multi-part schemas following ISBD and FRBR, such as the duplication which arises from copy cataloging and FRBRization. The webinar provides practical examples of deriving high-quality linked data from the vast numbers of records created by libraries, and demonstrates how a shift of focus from records to linked-data triples can provide more efficient and effective user-centered resource discovery services.

  1. International Bibliographic Standards, Linked Data, and the Impact on Library Cataloging
     Gordon Dunsire
     A NISO/DCMI Webinar
     24 August 2011
  2. Abstract
     The International Federation of Library Associations and Institutions (IFLA) is responsible for the development and maintenance of International Standard Bibliographic Description (ISBD), UNIMARC, and the "Functional Requirements" family for bibliographic records (FRBR), authority data (FRAD), and subject authority data (FRSAD). ISBD underpins the MARC family of formats used by libraries world-wide for many millions of catalog records, while FRBR is a relatively new model optimized for users and the digital environment. These metadata models, schemas, and content rules are now being expressed in the Resource Description Framework language for use in the Semantic Web.
     This webinar provides a general update on the work being undertaken. It describes the development of an Application Profile for ISBD to specify the sequence, repeatability, and mandatory status of its elements. It discusses issues involved in deriving linked data from legacy catalogue records based on monolithic and multi-part schemas following ISBD and FRBR, such as the duplication which arises from copy cataloging and FRBRization. The webinar provides practical examples of deriving high-quality linked data from the vast numbers of records created by libraries, and demonstrates how a shift of focus from records to linked-data triples can provide more efficient and effective user-centered resource discovery services.
  3. IFLA standards
     - RDF representations of standards for "universal" bibliographic control are being developed
     - "FR" (Functional Requirements) family of models
       - For Bibliographic Records (FRBR)
       - For Authority Data (FRAD)
       - For Subject Authority Data (FRSAD)
     - International Standard Bibliographic Description (ISBD)
       - Record structure and content
     - UNIMARC
       - Encoding for ISBD records (Bibliographic) and FRAD (Authorities)
  4. Representation in RDF
     - Entities => RDF classes
       - E.g. FRBR "Person"
     - Attributes, tags, (sub)fields, relationships => RDF properties
       - E.g. ISBD "title proper"
       - E.g. UNIMARC "200 $a" (title proper)
       - E.g. FRBR "title of the manifestation"
     - Controlled term values => SKOS vocabularies
       - E.g. ISBD Area 0 (content and media type)
  5. FR family
     - Each model has its own namespace
       - To reflect historical development
       - Re-using earlier RDF elements
     - Consolidated model under development
       - Being informed by analysis of RDF representation
     - FRBR RDF published
       - FRBRer (entity-relationship) ontology
         - Namespace elements plus OWL
       - FRBRoo (object-oriented)
         - Extension of CIDOC Conceptual Reference Model
     - FRAD and FRSAD imminent
       - tba
  6. ISBD
     - Element set and vocabularies for content and media types
       - Namespace now published
     - DC Application Profile in development
       - Models the ISBD record
       - What properties (fields)
       - Mandatory? Repeatable?
       - Aggregated statements
         - Sub-elements and punctuation
  7. ISBD AP snippet
        <!-- Area 0 is mandatory and non-repeatable -->
        <StatementTemplate ID="hasContentFormAndMediaTypeArea" minOccurs="1" maxOccurs="1" type="nonliteral">
          <Property>http://iflastandards.info/ns/isbd/elements/P1158</Property>
          <!-- Area 0 is an aggregated statement with SES -->
          <NonLiteralConstraint descriptionTemplateRef="DThasContentFormAndMediaTypeArea">
            <ValueStringConstraint>
              <SyntaxEncodingScheme>http://iflastandards.info/ns/isbd/elements/C2003</SyntaxEncodingScheme>
            </ValueStringConstraint>
          </NonLiteralConstraint>
        </StatementTemplate>
  8. UNIMARC
     - Proposal for RDF representation made at IFLA 2011
       - http://conference.ifla.org/sites/default/files/files/papers/ifla77/187-dunsire-en.pdf
     - Outcome of discussions with Permanent UNIMARC Committee
       - tba
  9. Other library standards in RDF (1)
     - RDA: resource description and access
       - Content standard based on FR models
       - Refines the FR properties
       - Many more controlled vocabularies than AACR
     - MODS/MADS (Metadata Object/Authority Description Schema)
       - Metadata structure based on MARC21
       - RDF representation just beginning ...
  10. Other library standards in RDF (2)
     - BIBO: Bibliographic Ontology
       - Classes and properties for citations and bibliographic references
     - DCMI Metadata Terms (Dublin Core)
       - High-level common-denominator classes and properties for memory institution metadata
     - Lots of controlled vocabularies
       - LCSH, DDC summaries, RDA vocabularies, etc.
  11. From record to triples (in 9 stages)
     - Very large numbers of records
       - Catalogue records, finding aids, etc.
       - 300 million; 1 billion?
     - High quality metadata
       - In comparison with other communities
     - Each record may generate many triples
       - 30 "raw" triples (no inferences) per MARC record?
     - Very, very large numbers of triples
       - Billions? Trillions?
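
The scale estimate on this slide can be checked with simple arithmetic. The sketch below uses only the slide's own guesses (300 million to 1 billion records, about 30 raw triples per record); the function name is illustrative.

```python
# Rough scale estimate using the figures quoted on the slide.
TRIPLES_PER_RECORD = 30  # the slide's guess for "raw" triples per MARC record


def estimate_triples(record_count: int, per_record: int = TRIPLES_PER_RECORD) -> int:
    """Return the number of raw triples for a given number of records."""
    return record_count * per_record


low = estimate_triples(300_000_000)      # lower record estimate
high = estimate_triples(1_000_000_000)   # upper record estimate
print(f"{low:,} to {high:,} raw triples")
```

On these assumptions the raw triples alone run to billions; trillions would require more triples per record or the addition of inferred statements.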
  12. 1. Take a record

     | Field/attribute | Value |
     |---|---|
     | Record ID | 54321 |
     | Title | Museum archives: an introduction |
     | Author | Wythe, Deborah |
     | Date | 2004 |
     | LCSH | Museum archives |
     | Media/GMD | Electronic |
     | Content form | Text |
  13. 2. Disaggregate to single statements

     | Record | Attribute | Value |
     |---|---|---|
     | 54321 | (has) title | Museum archives: an introduction |
     | 54321 | (has) author | Wythe, Deborah |
     | 54321 | (has) date | 2004 |
     | 54321 | (has) LCSH | Museum archives |
     | 54321 | (has) media type | Electronic |
     | 54321 | (has) content form | Text |
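
A minimal sketch of the disaggregation step, assuming the record is held as a simple mapping of field names to values. The dictionary layout and the function name `disaggregate` are illustrative choices, not part of the webinar.

```python
# The example record from stage 1, held as a plain dictionary (an assumption;
# a real catalogue record would usually be MARC or similar).
record = {
    "Record ID": "54321",
    "Title": "Museum archives: an introduction",
    "Author": "Wythe, Deborah",
    "Date": "2004",
    "LCSH": "Museum archives",
    "Media/GMD": "Electronic",
    "Content form": "Text",
}


def disaggregate(record, id_field="Record ID"):
    """Split one record into (record id, attribute, value) statements."""
    record_id = record[id_field]
    return [
        (record_id, attribute, value)
        for attribute, value in record.items()
        if attribute != id_field  # the identifier becomes the subject, not a statement
    ]


statements = disaggregate(record)
for statement in statements:
    print(statement)
```

Each statement now stands alone: it carries the record identifier with it, so it no longer depends on its position inside the record.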
  14. 3. Create URI for record
     - Must be unique, so 54321 no good on its own
     - http URIs are a good thing (W3C)
     - So add record ID to a unique http domain
       - E.g. http://MyLibraryX.com (unique to the library)
       - + 54321
     - http://MyLibraryX.com/54321
       - (or http://MyLibraryX.com#54321)
     - This is not a URL!
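
A short sketch of minting a record URI from a library-specific base and a local record ID. The base `http://MyLibraryX.com` comes from the slide; the function name and the `use_fragment` switch are illustrative.

```python
BASE = "http://MyLibraryX.com"  # example domain used on the slide


def mint_record_uri(record_id: str, base: str = BASE, use_fragment: bool = False) -> str:
    """Combine a unique base with a local record ID to form a globally unique URI."""
    separator = "#" if use_fragment else "/"
    return f"{base.rstrip('/')}{separator}{record_id}"


print(mint_record_uri("54321"))                     # slash form
print(mint_record_uri("54321", use_fragment=True))  # hash form
```

The result identifies the record; nothing here assumes that the URI can be fetched as a web page.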
  15. 4. Replace record ID with URI

     | URI | Attribute | Value |
     |---|---|---|
     | mlx:54321 | (has) title | Museum archives: an introduction |
     | mlx:54321 | (has) author | Wythe, Deborah |
     | mlx:54321 | (has) date | 2004 |
     | mlx:54321 | (has) LCSH | Museum archives |
     | mlx:54321 | (has) media type | Electronic |
     | mlx:54321 | (has) content form | Text |

     "mlx" = qname (xmlns) = shorthand for "http://MyLibraryX.com/"
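
The qname shorthand can be illustrated with a small prefix table. This is a sketch only: the `mlx` binding is the one given on the slide, while the helper names are hypothetical.

```python
# Prefix table; "mlx" is the shorthand defined on the slide.
PREFIXES = {"mlx": "http://MyLibraryX.com/"}


def expand_qname(qname: str, prefixes=PREFIXES) -> str:
    """Turn a prefixed name such as 'mlx:54321' into a full URI."""
    prefix, _, local = qname.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"unknown prefix in {qname!r}")
    return prefixes[prefix] + local


def shorten_uri(uri: str, prefixes=PREFIXES) -> str:
    """Turn a full URI back into a prefixed name when a known namespace matches."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            return f"{prefix}:{uri[len(namespace):]}"
    return uri  # no matching namespace: leave the URI unchanged


print(expand_qname("mlx:54321"))
print(shorten_uri("http://MyLibraryX.com/54321"))
```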
  16. 5. Find URIs for attributes
     - Attributes are modelled as RDF properties (predicates) in "element set" namespaces
       - E.g. Dublin Core terms (dct); ISBD (isbd); FRBR (frbrer); RDA (rdaxxx); Bibliographic Ontology (bibo); etc.
     - Choose a namespace, find property with same (or closest) "meaning" (e.g. definition) as attribute
       - Nearest property minimises loss of information
     - Get URI for property
     - If no suitable property, choose another namespace
       - Properties do not have to come from single namespace
       - Match and mix!
  17. 5 (cont). Find URI for title
     - http://purl.org/dc/terms/title (dct:title)
     - http://iflastandards.info/ns/isbd/elements/P1014 (isbd:P1014)
       - hasTitleProper
     - http://RDVocab.info/Elements/titleProper (rdaGR1:titleProper)
  18. 5 (cont). Find URI for author
     - dct:creator
     - rdarole:author
     - (isbd does not cover "headings")
  19. 5 (cont). Find URI for date
     - dct:date
     - isbd:P1018
       - hasDateOfPublicationProductionDistribution
     - rdaGr1:dateOfPublication
  20. 5 (cont). Find URI for LCSH
     - LCSH is a subject vocabulary
       - Controlled terms
     - So attribute is really "subject"
       - And the term itself is the value
     - dct:subject
  21. 5 (cont). Find URI for media type
     - Assuming record uses new ISBD Area 0 ...
     - isbd:P1003
       - hasMediaType
  22. 5 (cont). Find URI for content form
     - Assuming record uses new ISBD Area 0 ...
     - isbd:P1001
       - hasContentForm
  23. 6. Replace attributes with URIs

     | Record URI | Property URI | Value |
     |---|---|---|
     | mlx:54321 | isbd:P1014 | Museum archives: an introduction |
     | mlx:54321 | rdarole:author | Wythe, Deborah |
     | mlx:54321 | isbd:P1018 | 2004 |
     | mlx:54321 | dct:subject | Museum archives |
     | mlx:54321 | isbd:P1003 | Electronic |
     | mlx:54321 | isbd:P1001 | Text |
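
Stage 6 amounts to a lookup from local attribute labels to the properties chosen in stage 5. The sketch below assumes the statements are held as simple tuples; the mapping reproduces the choices shown on the slides, and the function name is illustrative. Attributes with no match are reported and left for a decision about which other namespace to try.

```python
# Properties chosen on the slides for each local attribute label.
PROPERTY_FOR_ATTRIBUTE = {
    "title": "isbd:P1014",
    "author": "rdarole:author",
    "date": "isbd:P1018",
    "LCSH": "dct:subject",
    "media type": "isbd:P1003",
    "content form": "isbd:P1001",
}

statements = [
    ("mlx:54321", "title", "Museum archives: an introduction"),
    ("mlx:54321", "author", "Wythe, Deborah"),
    ("mlx:54321", "date", "2004"),
    ("mlx:54321", "LCSH", "Museum archives"),
    ("mlx:54321", "media type", "Electronic"),
    ("mlx:54321", "content form", "Text"),
]


def map_attributes(statements, mapping):
    """Replace attribute labels with property names; collect labels with no match."""
    mapped, unmapped = [], []
    for subject, attribute, value in statements:
        prop = mapping.get(attribute)
        if prop is None:
            unmapped.append((subject, attribute, value))
        else:
            mapped.append((subject, prop, value))
    return mapped, unmapped


mapped, unmapped = map_attributes(statements, PROPERTY_FOR_ATTRIBUTE)
for row in mapped:
    print(row)
```

Because the mapping is an ordinary table, properties from several namespaces can be mixed freely, as the slides recommend.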
  24. 7. Find URIs for values
     - If object of a triple is a URI, it can link to the subject of another triple with the same URI
       - Linked data!
     - Values from controlled vocabularies may have URIs
       - Possible vocabularies: author, subject, ISBD Area 0
       - NOT: title, date
     - For author: Virtual International Authority File (VIAF)
     - For LCSH: Library of Congress Authorities & Vocabularies
     - For ISBD Area 0: Open Metadata Registry
  25. 7 (cont). Find URI for author
     - Author: Wythe, Deborah
     - VIAF: http://www.viaf.org/
     - viaf:31899419/#Wythe,+Deborah
  26. 7 (cont). Find URI for subject (LCSH)
     - LCSH: Museum archives
     - LoC: http://id.loc.gov/authorities/
     - lcsh:/sh85088707#concept
  27. 7 (cont). Find URIs for ISBD Area 0
     - Media type: Electronic
       - ISBD media type
       - isbdmt:T1002
     - Content form: Text
       - ISBD Content form
       - isbdcf:T1009
  28. 8. Replace values with URIs

     | subject | predicate | object |
     |---|---|---|
     | mlx:54321 | isbd:P1014 | "Museum archives: an introduction" |
     | mlx:54321 | rdarole:author | viaf:31899419/#Wythe,+Deborah |
     | mlx:54321 | isbd:P1018 | "2004" |
     | mlx:54321 | dct:subject | lcsh:/sh85088707#concept |
     | mlx:54321 | isbd:P1003 | isbdmt:T1002 |
     | mlx:54321 | isbd:P1001 | isbdcf:T1009 |
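
One way to sketch stage 8 is a lookup keyed on property and value, so that only values from controlled vocabularies are replaced and free-text values such as title and date stay as literals. The identifiers are the ones shown on the slides; the table layout, the function name, and the boolean flag marking an object as a URI are assumptions made for illustration.

```python
# (property, value) pairs that have a controlled-vocabulary URI on the slides.
VALUE_URIS = {
    ("rdarole:author", "Wythe, Deborah"): "viaf:31899419/#Wythe,+Deborah",
    ("dct:subject", "Museum archives"): "lcsh:/sh85088707#concept",
    ("isbd:P1003", "Electronic"): "isbdmt:T1002",
    ("isbd:P1001", "Text"): "isbdcf:T1009",
}

triples = [
    ("mlx:54321", "isbd:P1014", "Museum archives: an introduction"),
    ("mlx:54321", "rdarole:author", "Wythe, Deborah"),
    ("mlx:54321", "isbd:P1018", "2004"),
    ("mlx:54321", "dct:subject", "Museum archives"),
    ("mlx:54321", "isbd:P1003", "Electronic"),
    ("mlx:54321", "isbd:P1001", "Text"),
]


def link_values(triples, value_uris):
    """Return (subject, predicate, object, object_is_uri) for each input triple."""
    linked = []
    for subject, predicate, value in triples:
        uri = value_uris.get((predicate, value))
        if uri is None:
            linked.append((subject, predicate, value, False))  # stays a literal
        else:
            linked.append((subject, predicate, uri, True))     # now a link
    return linked


linked = link_values(triples, VALUE_URIS)
for row in linked:
    print(row)
```

Keying the lookup on the property as well as the value avoids replacing a title that happens to match a subject term.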
  29. 9. Publish triples (linked data)
     mlx:54321 | isbd:P1014 | "Museum archives: an introduction"
     mlx:54321 | rdarole:author | viaf:31899419/#Wythe,+Deborah
     mlx:54321 | isbd:P1018 | "2004"
     mlx:54321 | dct:subject | lcsh:/sh85088707#concept
     mlx:54321 | isbd:P1003 | isbdmt:T1002
     mlx:54321 | isbd:P1001 | isbdcf:T1009
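
Publishing usually means serialising the triples in a standard syntax. The sketch below writes three of the triples as N-Triples lines using only the Python standard library. It is restricted to prefixes whose namespaces appear on the slides; the `lcsh` namespace is written without its trailing slash because the slide's local part already begins with one (an assumption about how the shorthand is meant to expand). Function names are illustrative.

```python
# Namespace bindings taken from the slides (see the note on "lcsh" above).
PREFIXES = {
    "mlx": "http://MyLibraryX.com/",
    "isbd": "http://iflastandards.info/ns/isbd/elements/",
    "dct": "http://purl.org/dc/terms/",
    "lcsh": "http://id.loc.gov/authorities",
}

# (subject, predicate, object, object_is_uri)
TRIPLES = [
    ("mlx:54321", "isbd:P1014", "Museum archives: an introduction", False),
    ("mlx:54321", "isbd:P1018", "2004", False),
    ("mlx:54321", "dct:subject", "lcsh:/sh85088707#concept", True),
]


def expand(qname: str) -> str:
    """Expand a prefixed name into a full URI using PREFIXES."""
    prefix, _, local = qname.partition(":")
    return PREFIXES[prefix] + local


def escape_literal(text: str) -> str:
    """Escape the characters that N-Triples requires to be escaped in literals."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(triples):
    """Render each triple as one N-Triples line."""
    lines = []
    for subject, predicate, obj, obj_is_uri in triples:
        rendered = f"<{expand(obj)}>" if obj_is_uri else f'"{escape_literal(obj)}"'
        lines.append(f"<{expand(subject)}> <{expand(predicate)}> {rendered} .")
    return lines


ntriples = to_ntriples(TRIPLES)
print("\n".join(ntriples))
```

In practice an RDF library would normally handle serialisation; the hand-written version is shown only to make the structure of a published triple visible.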
  30. Linked data chains
     mlx:54321 | dct:subject | lcsh:/sh85088707#concept
     lcsh:/sh85088707#concept | skos:related | rameau:XXX
     rameau:XXX | frbrer:isSubjectOf | mly:98765
     mly:98765 | rda:titleOfTheWork | "Managing archives in museums"
     rameau:XXX | skos:prefLabel | "archives du musée"
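
The chain works because the object of one triple reappears as the subject of another. A minimal traversal sketch, using the triples from this slide held as plain tuples (`rameau:XXX` is the slide's own placeholder); the function name is hypothetical.

```python
from collections import defaultdict, deque

TRIPLES = [
    ("mlx:54321", "dct:subject", "lcsh:/sh85088707#concept"),
    ("lcsh:/sh85088707#concept", "skos:related", "rameau:XXX"),
    ("rameau:XXX", "frbrer:isSubjectOf", "mly:98765"),
    ("mly:98765", "rda:titleOfTheWork", "Managing archives in museums"),
    ("rameau:XXX", "skos:prefLabel", "archives du musée"),
]


def follow_links(triples, start):
    """Collect every triple reachable from `start` by following object-to-subject links."""
    by_subject = defaultdict(list)
    for triple in triples:
        by_subject[triple[0]].append(triple)

    seen = {start}
    queue = deque([start])
    reached = []
    while queue:
        node = queue.popleft()
        for subject, predicate, obj in by_subject.get(node, []):
            reached.append((subject, predicate, obj))
            # Only objects that are themselves subjects can extend the chain;
            # literal values end it.
            if obj in by_subject and obj not in seen:
                seen.add(obj)
                queue.append(obj)
    return reached


reached = follow_links(TRIPLES, "mlx:54321")
for triple in reached:
    print(triple)
```

Starting from the first library's record, the traversal arrives at a work described by a second library and at a French label for the related subject, without either library having recorded that information itself.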
  31. Linked data cluster = "record"
     mlx:54321 | isbd:P1014 | "Museum archives: an introduction"
     mlx:54321 | rdarole:author | viaf:31899419/#Wythe,+Deborah
     mlx:54321 | isbd:P1018 | "2004"
     mlx:54321 | dct:subject | lcsh:/sh85088707#concept
     mlx:54321 | isbd:P1003 | isbdmt:T1002
     mlx:54321 | isbd:P1001 | isbdcf:T1009
  32. Duplication and legacy records
     - Many copies of legacy records
       - Copied and amended for local use
     - Danger of minting multiple URIs for the same resource
     - National bibliographic agencies have significant role to play
       - As memory/cultural institutions
       - The linked-data memory/culture of a nation
  33. FRBRization
     - FRBR splits record into four functional parts
       - User-centred functions
     - Subject of a FRBR triple is one of the parts, not the resource as a whole
       - But subject of ISBD triple is the resource as a whole
     - Class collisions can be avoided by using unbounded (no domain or range) versions of properties
  34. A short history of the evolution of the library catalogue record
  35. In the beginning ... the catalogue card
     Lee, T. B.
     Cataloguing has a future. - Audio disc (Spoken word). - Donated by the author.
     1. Metadata
  36. From flat-file record ... to relational record
     - Bibliographic description
       - Author: Lee, T. B.
       - Title: Cataloguing has a future
       - Content type: Spoken word
       - Carrier type: Audio disc
       - Subject: Metadata
       - Provenance: Donated by the author
     - Name authority
       - Name:
       - Biography: ...
     - Subject authority
       - Term:
       - Definition: ...
  37. From flat-file description ... to FRBR record
     - Bibliographic description
       - Author:
       - Title: Cataloguing has a future
       - Content type: Spoken word
       - Carrier type: Audio disc
       - Subject:
       - Provenance: Donated by the author
     - Name authority
       - Name: Lee, T. B.
       - Biography: ...
     - Subject authority
       - Term: Metadata
       - Definition: ...
     - Work
       - Author:
       - Subject:
     - Expression
       - Content type: Spoken word
     - Manifestation
     - Item
  38. From FRBR record ... to extinction!
     [Diagram: the FRBR record is broken up into linked boxes. Work (Author, Subject); Expression (Content type); Manifestation (Title: Cataloguing has a future; Carrier type); Item (Donor; Provenance: Donated by the author); Name authority (Name: Lee, T. B.); Subject authority (Term: Metadata); RDA content type (Term: Spoken word); RDA carrier type (Term: Audio disc); Amazon/Publisher (Title).]
  39. Where is the record?
     - Implicit, not explicit
     - Everywhere and nowhere
     - A semantic Web will allow machines to create the record just-in-time
       - We will not have to maintain records just-in-case
     - The user will have control over the presentation
       - I want to see an archive or library or museum or Amazon or Google or Flickr or ? display
     - And by avoiding duplication, we can all get on with describing new stuff ...
  40. The hyperdimensional (Tardis) card
     [Diagram: the catalogue card "Lee, T. B. Cataloguing has a future. - Audio disc (Spoken word). - Donated by the author. 1. Metadata", surrounded by four sources: W3C Library, Audio shop, Spoken word archive, Lee Museum.]
     "TARDIS four port USB hub, for office-bound Time Lords: Open a time vortex on your desk" (Pocket-lint)
  41. Metadata focus
     - Shift of focus of metadata creation, maintenance, storage, preservation (by professionals, amateurs, machines)
       - From: Record
       - To: Statement(s) = triple(s)
     - But metadata display ...
       - ... aggregates triples (from multiple sources) to create records on the fly
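
Creating a record on the fly can be sketched as grouping triples by subject across several sources. The two source lists below are hypothetical splits of the example triples used earlier in the webinar, and the function name is illustrative.

```python
# Two hypothetical sources holding statements about the same resource.
library_triples = [
    ("mlx:54321", "isbd:P1014", "Museum archives: an introduction"),
    ("mlx:54321", "isbd:P1018", "2004"),
]
other_triples = [
    ("mlx:54321", "dct:subject", "lcsh:/sh85088707#concept"),
    ("mlx:54321", "isbd:P1018", "2004"),            # duplicate of a library statement
    ("mly:98765", "rda:titleOfTheWork", "Managing archives in museums"),
]


def build_record(subject, *sources):
    """Aggregate all statements about `subject` into a property-to-values mapping."""
    record = {}
    for source in sources:
        for s, predicate, obj in source:
            if s != subject:
                continue
            values = record.setdefault(predicate, [])
            if obj not in values:  # drop statements repeated across sources
                values.append(obj)
    return record


record = build_record("mlx:54321", library_triples, other_triples)
for predicate, values in record.items():
    print(predicate, values)
```

The record exists only for as long as the display needs it; what is stored and maintained is the individual statements.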
