NISO Virtual Conference: BIBFRAME & Real World Applications of Linked Bibliographic Data
http://www.niso.org/news/events/2016/virtual_conference/jun15_virtualconf/
June 15, 2016
Opening Keynote: Landscape and Current Status of BIBFRAME and Related Initiatives
LODLAM Landscape NOTES
“The LODLAM Landscape”
SLIDE 1: The LODLAM Landscape
Subtitle: BIBFRAME & Other Linked Data Initiatives
SLIDE 2: Why LODLAM?
First, a definition:
LODLAM: linked open data in libraries, archives, and museums
Libraries – libraries, archives, museums (e.g. cultural heritage institutions)
Why is Linked Data important?
Libraries – trusted repositories of information; the data we have is rich and deep
Linked data – how we can share that information on the web; making things more discoverable and accessible
So what’s the problem?! We’ve got this data, so why isn’t it just thrown out there?
Our data is siloed in MARC
MARC
we’ve been making it do something it wasn’t designed to do for 20+ years now
designed purely to transmit data for printing; NOT for being indexed, searched, manipulated, etc.
MARC is contextual (hello, ISBD) – machines don’t do context, they are dumb and literal
Henriette Avram – one of my heroes (sheroes) – BUT MARC is not a storage/retrieval format; it is for transmission and display only
BIBFRAME is one effort to break that silo.
SLIDE 3: Terms and Definitions
Semantic web: W3C: “The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries”
Tim Berners-Lee – defined the semantic web/linked data
Semantic web is powered by Linked Data
Linked Data is structured data that is machine readable/actionable
SLIDE 4: [CATS]
Of course, we all know the internet (aka the web) is actually made of a series of tubes filled with CATS, not data.
Sources
Subversive Cross Stitch
Pusheen.com
Nyan Cat
Threadless – “Voltron” of cats
SLIDE 5: [xkcd]
Well, OK. It’s not really cats in tubes. The web is actually a bunch of standards – that keep proliferating – but must be interoperable
http://xkcd.com/927/
Alt text: charging issue solved. Wait, is it: mini-USB? Micro-USB?
SLIDE 6: (Semantic Web)
Semantic web – web of linked data
powered by rules/standards and technologies that provide structure to data
*Various (most commonly used/heard) standards and organizations that power the Semantic Web; most are used as part of or by linked data
W3C: World Wide Web Consortium
URI: Uniform Resource Identifier (a URL is one type of URI; URNs are another)
HTTP: Hypertext Transfer Protocol – allows a URI to be actionable (“linkable” or “clickable”)
XML/HTML: mark-up languages
Microdata: nest metadata within existing web page content (embedded in HTML)
JSON: JavaScript Object Notation – data format for transmitting data objects (attribute-value pairs);
JSON-LD – method of encoding linked data using JSON
RDF: Resource Description Framework – conceptual description or modeling of information used in web resources; multiple RDF specifications (entity-relationship); most other semantic web standards are built on or use RDF
Turtle: Terse RDF Triple Language
SPARQL: semantic query language for RDF data; accessed through an endpoint
SKOS: Simple Knowledge Organization System (for controlled vocabulary representation)
OWL: Web Ontology Language (knowledge representation language for ontologies (vocabularies/taxonomies))
FOAF: “friend of a friend” (ontology for describing persons)
Schema.org: Bing/Google/Yahoo - schemas for structured data markup on web pages
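
To make JSON-LD from the list above a little more concrete, here is a minimal Python sketch that builds a small book description with Schema.org terms and serializes it with the standard json module. The title, author name, and example.org identifier are made up for illustration; only the JSON-LD keywords (@context, @id, @type) and the Schema.org terms (Book, Person, name, author) are real.

```python
import json

# Hypothetical record: the title, author, and example.org URI are placeholders.
book = {
    "@context": "https://schema.org",
    "@id": "http://example.org/book/1",
    "@type": "Book",
    "name": "An Example Title",
    "author": {
        "@type": "Person",
        "name": "Example Author",
    },
}


def to_json_ld(record: dict) -> str:
    """Serialize a JSON-LD document as indented JSON text."""
    return json.dumps(record, indent=2)


if __name__ == "__main__":
    print(to_json_ld(book))
```

Because JSON-LD is plain JSON, any JSON tool can read it; the @context is what tells a linked data processor how to interpret "name" and "author".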
SLIDE 7: (Semantic) Web: LODLAM Subset
Other common terms and organizations within the LODLAM community:
BIBFRAME: data model for bibliographic description using linked data principles
Blacklight: discovery interface for Solr index;
o http://projectblacklight.org/
Solr: open source standalone enterprise search server; powerful indexing
o http://lucene.apache.org/solr/
Fedora: Flexible Extensible Digital Object Repository Architecture; open source architecture for storing, managing, and accessing digital content in the form of digital objects;
o http://fedora-commons.org/
Hydra: repository solution; “Hydra is an ecosystem of components that lets institutions deploy robust and durable digital repositories (the body) supporting multiple “heads”: fully-featured digital asset management applications and tailored workflows. Its principle platforms are the Fedora Commons repository software, Solr, Ruby on Rails and Blacklight”;
o http://projecthydra.org/
ORCID: open, non-profit, community-driven effort to create and maintain a registry of unique researcher identifiers;
o http://orcid.org/
VIVO: member-supported, open source software and an ontology for representing scholarship;
o http://vivoweb.org/
PLUS existing controlled vocabularies such as LCSH, LC Classification, Dewey, etc.
SLIDE 8: Linked Data
Powers the semantic web
STRUCTURED data
can be queried, linked to/from, and integrated
builds on existing web technologies
machine (not human) friendly – actionable by machine
uses controlled vocabularies
structured data stored in various interoperable “containers” (mark-up, schemas, etc.)
Structure of linked data: triples:
a subject, a predicate, and an object – defines relationship between two things (entity-relationship model)
Best case: each element of a triple is a URI, then each element can connect with other elements
URI – dereferenceable (actionable/clickable)
Controlled vocabularies and identities are “things” represented by URIs – multiple sources will “connect through” one URI, providing the “web of data”
Things: represented by URIs; they are actionable and connect to other things/strings
Strings: text; also called literals; “dead ends” in the semantic web/linked data
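
The things-versus-strings distinction can be sketched in a few lines of Python. This is only an illustration: the example.org URIs are placeholders, the is_thing check is a deliberately crude assumption (anything starting with http:// or https:// counts as a URI), and the two predicates are real Dublin Core terms.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def is_thing(value: str) -> bool:
    """Crude assumption: a value that looks like an HTTP(S) URI is a 'thing'."""
    return value.startswith(("http://", "https://"))


# example.org URIs are placeholders; the Dublin Core predicates are real terms.
triples = [
    Triple("http://example.org/book/1",
           "http://purl.org/dc/terms/creator",
           "http://example.org/person/42"),
    Triple("http://example.org/book/1",
           "http://purl.org/dc/terms/title",
           "An Example Title"),
]


def dead_ends(data):
    """Return the triples whose object is a literal and cannot be followed further."""
    return [t for t in data if not is_thing(t.obj)]
```

Here the creator triple points at another URI that could be followed to more data, while the title triple ends in a string.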
SLIDE 9: Triples and Quads
TRIPLE
a subject, a predicate, and an object – defines relationship between two things (entity-relationship model)
QUAD
a subject, a predicate, an object, and context
Allows relationships to have attributes – or information about the relationship
Assertion: one way to represent “trust” or “authenticate” the relationship; “who” or “what” established this relationship?
ALL of the pieces of both TRIPLES and QUADS can be URIs; any piece that is not a URI is a “dead-end” or literal/string
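
As a rough sketch of how the fourth element records who asserted a relationship, the Python below stores two competing statements about the same book and filters them by context. All example.org URIs, including the two "source" contexts, are hypothetical.

```python
from typing import NamedTuple


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    context: str  # who or what asserted the relationship


# All example.org URIs are hypothetical placeholders.
quads = [
    Quad("http://example.org/book/1",
         "http://purl.org/dc/terms/creator",
         "http://example.org/person/42",
         "http://example.org/source/library-a"),
    Quad("http://example.org/book/1",
         "http://purl.org/dc/terms/creator",
         "http://example.org/person/99",
         "http://example.org/source/library-b"),
]


def asserted_by(data, context):
    """Keep only statements made in the given context, returned as plain triples."""
    return [(q.subject, q.predicate, q.obj) for q in data if q.context == context]
```

A consumer that trusts only library-a can call asserted_by(quads, "http://example.org/source/library-a") and ignore the conflicting statement from library-b.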
SLIDE 10: Linked Open Data [mug]
Linked data is not guaranteed to be OPEN; you can have linked data in a closed network
However, OPEN is preferred
Why?
Tim Berners-Lee principles of linked data
1. Use URIs to name (identify) things.
2. Use HTTP URIs so that these things can be looked up (interpreted, "dereferenced").
3. Provide useful information about what a name identifies when it's looked up, using open standards such as RDF, SPARQL, etc.
4. Refer to other things using their HTTP URI-based names when publishing data on the Web – e.g. link to other URIs so more things can be discovered.
5. OPEN data/content
But what makes it OPEN?
OPEN data – “star” system – goal is 5 stars
1. Available on the web (whatever format) but with an open license, to be Open Data
2. Available as machine-readable structured data (e.g. Excel instead of image scan of a table)
3. as (2) plus non-proprietary format (e.g. CSV instead of Excel)
4. Use open standards from W3C (RDF and SPARQL) to identify things, so that people can point at your stuff
5. Link your data to other people’s data to provide context
Source: W3C (https://www.w3.org/DesignIssues/LinkedData.html)
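
Since each star builds on the one before it, the rating can be read as a cumulative checklist. The Python sketch below is one way to express that; the function and parameter names are invented for illustration and are not part of any W3C tool.

```python
def open_data_stars(on_web_open_license: bool,
                    machine_readable: bool,
                    non_proprietary: bool,
                    uses_w3c_standards: bool,
                    linked_to_other_data: bool) -> int:
    """Count stars in order; each star requires all of the earlier ones."""
    criteria = (on_web_open_license, machine_readable, non_proprietary,
                uses_w3c_standards, linked_to_other_data)
    stars = 0
    for criterion in criteria:
        if not criterion:
            break  # a missed step caps the rating, whatever comes after it
        stars += 1
    return stars
```

For example, a CSV file published on the web under an open license, but without URIs or links, would rate three stars: open_data_stars(True, True, True, False, False).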
SLIDE 11: Linked Open Data Cloud
http://lod-cloud.net/
Aug. 2014
CC-BY-SA license
Representation of the Linked Open Data cloud – still growing as more data sets are published
Each node is a dataset that can be queried, linked to, and extracted from
Center of the hub is DBpedia
connects other data sources – interlinking published datasets
Upper right – coded green – library data
id.loc.gov
VIAF
WorldCat
national libraries – French, German, and more
geographic data
SLIDE 12: BIBFRAME
Bibliographic Framework Initiative
Initiative defined in 2011
Source: https://www.loc.gov/bibframe/faqs/
Releasing our data from the MARC silo, but beyond that too
SLIDE 13:
INDEPENDENT of any specific description standard (e.g. RDA, or Resource Description and Access)
Is it replacing MARC? Yes, but also going beyond MARC
Focused on RELATIONSHIPS
Can look at it as: “blowing up” a MARC record into its discrete elements and defining their relationship with each other and beyond (connect with elements not traditionally included)
Classes: define a resource (under BIBFRAME)
Properties: further describe a resource
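
The “blowing up” idea can be sketched as a toy Python conversion from a flattened MARC-like record into triples. Everything here is an assumption for illustration: the record is a simplified dictionary rather than real MARC, the values are placeholders, and the field-to-predicate mapping (Dublin Core and Schema.org terms) is not the Library of Congress conversion.

```python
# A toy, flattened stand-in for a MARC record: (tag, subfield code) -> value.
# Real MARC has indicators, repeatable fields, and many more tags.
marc_record = {
    ("245", "a"): "An Example Title",
    ("100", "a"): "Author, Example",
    ("020", "a"): "0000000000",
}

# Assumed mapping for illustration only; not the Library of Congress conversion.
FIELD_TO_PREDICATE = {
    ("245", "a"): "http://purl.org/dc/terms/title",
    ("100", "a"): "http://purl.org/dc/terms/creator",
    ("020", "a"): "https://schema.org/isbn",
}


def blow_up(record, subject_uri):
    """Turn each mapped field into one (subject, predicate, object) triple."""
    triples = []
    for key, value in record.items():
        predicate = FIELD_TO_PREDICATE.get(key)
        if predicate is not None:  # skip fields the toy mapping does not cover
            triples.append((subject_uri, predicate, value))
    return triples
```

Each field becomes its own statement about the resource, so it can be linked or enriched independently of the rest of the record.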
SLIDE 14: BIBFRAME 2.0 Model
Work – conceptual essence
Instance – embodiment of a work (one work can have multiple instances)
Item – copy of the instance – local holdings
Agents – people, organizations, jurisdictions, etc. associated with a work
Subjects – “aboutness” of a work
Events – occurrences, the recording of which may be the content of a work
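
The Work / Instance / Item layering above can be sketched as triples in Python. The class and property names (Work, Instance, Item, instanceOf, itemOf) come from the BIBFRAME 2.0 vocabulary namespace; the example.org identifiers and the describe helper are hypothetical.

```python
BF = "http://id.loc.gov/ontologies/bibframe/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def describe(work_uri, instance_to_items):
    """Build triples linking items to instances and instances to one work."""
    triples = [(work_uri, RDF_TYPE, BF + "Work")]
    for instance_uri, item_uris in instance_to_items.items():
        triples.append((instance_uri, RDF_TYPE, BF + "Instance"))
        triples.append((instance_uri, BF + "instanceOf", work_uri))
        for item_uri in item_uris:
            triples.append((item_uri, RDF_TYPE, BF + "Item"))
            triples.append((item_uri, BF + "itemOf", instance_uri))
    return triples


# Placeholder identifiers: one work, a print instance with two local copies,
# and an ebook instance with no local items.
holdings = {
    "http://example.org/instance/print": [
        "http://example.org/item/copy-1",
        "http://example.org/item/copy-2",
    ],
    "http://example.org/instance/ebook": [],
}
graph = describe("http://example.org/work/1", holdings)
```

One work fans out to multiple instances, and each instance to its local copies, which is the relationship structure a single flat MARC record does not express directly.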
SLIDE 15: All the Linked Data Activity!
Review the current initiatives/activities
definitions and the active participants;
more detail about the various activities will be included in the presentations today
NOT an exhaustive list
Currently largely independent of each other
Why? No shared triple store; infrastructure is a barrier for larger cooperative participation
LODLAM: linked open data in libraries, archives, and museums
o “informal, borderless network of enthusiasts, technicians, professionals and any number of other people who are interested in or working with Linked Open Data pertaining to galleries, libraries, archives, and museums”
o http://lodlam.net/
BIBFRAME: initiative for bibliographic data
o Library of Congress
o https://www.loc.gov/bibframe/ (bibframe.org now points to the LOC site)
BIBFRAME Lite: modular and layered vocabulary approach using BIBFRAME vocabulary
o core set of classes/properties as a scaffolding; build on it with other vocabularies (Library, Archive, Rare Materials, etc.)
o National Library of Medicine; George Washington U; Zepheira
o http://www.bibfra.me/
LD4PE: Linked Data for Professional Education
o educate the educators; building an Exploratorium of learning resources and defined competencies
o University of WA; Kent State U; Dublin Core Metadata Initiative (DCMI); Sungkyunkwan University (Korea); OCLC; Elsevier; Synaptica
o http://wiki.dublincore.org/index.php/Pet/ld4pe
LD4L: Linked Data for Libraries
o 2014-2016 Mellon Grant ($1 million)
o Cornell U Library; Harvard Library; Stanford U Libraries
o https://wiki.duraspace.org/pages/viewpage.action?pageId=41354028
LD4L Labs: continuation of LD4L
o “$1.5 million dollar grant from the Andrew W. Mellon Foundation, Linked Data for Libraries: LD4L Labs is a collaboration of Cornell, Harvard, Iowa, and Stanford to continue to advance the use and usefulness of linked data in libraries”
o https://wiki.duraspace.org/display/ld4l/LD4L+Labs
LD4P: Linked Data for Production
o LC; Columbia; Cornell; Harvard; Princeton; Stanford
o Multiple projects - each of the 6 core members contributes projects
o https://wiki.duraspace.org/pages/viewpage.action?pageId=74515029
o http://www.loc.gov/aba/pcc/documents/PCC-LD4P.docx
BIBFLOW: “Reinventing Cataloging: Models for the Future of Library Operations”
o focus is on workflows
o UC Davis Library; Zepheira
o https://www.lib.ucdavis.edu/bibflow/
LibHub: converting library MARC data to BIBFRAME and linked data formats; publishing and hosting the resulting content
o Zepheira
o http://www.libhub.org/
CLDI: Canadian Linked Data Initiative
o U of Toronto; McGill U; Université de Montréal; U of Alberta; U of British Columbia; Library & Archives Canada; Bibliothèque et Archives nationales du Québec; Canadiana
o https://connect.library.utoronto.ca/display/U5LD/Canadian+Linked+Data+Initiative+Home
LC Linked Data Service
o provides access to commonly found standards and vocabularies promulgated by the Library of Congress
o http://id.loc.gov/
OCLC: multiple initiatives (W3C Linked Data Platform; BIBFRAME; Schema.org; Schema.org Extend W3C Group; OCLC Works)
o WorldCat in Schema.org
o FAST: Faceted Application of Subject Terminology; http://fast.oclc.org/
o VIAF: Virtual International Authority File; http://viaf.org/
o WorldCat Entities (or Works); https://www.oclc.org/developer/develop/linked-data/worldcat-entities/worldcat-work-entity.en.html
o https://www.oclc.org/en-US/data.html
o https://www.oclc.org/developer/develop/linked-data.en.html
ADDENDUM:
Library.Link Network
o Zepheira
o “Seeding the Web with Library locations, services, and content.”
o http://library.link/
Linked Data Collaboration Program
o Ex Libris
o Jan. 2016 press release; involves 30+ institutions
o http://www.exlibrisgroup.com/category/LinkedDataDiscussionPaper
o http://www.exlibrisgroup.com/files/Publications/LinkedDataattheServiceofLibraries.pdf
IGeLU/ELUNA Special Interest Group on Linked Open Data
o “achieve essential linked open data features in all Ex Libris products where appropriate, both from the data publishing, the data consuming and the data integration perspective.”
o http://igelu.org/special-interests/lod
BLUEcloud Visibility
o SirsiDynix
o service to extract MARC records and transform into BIBFRAME; enhanced with geographic data
o http://www.sirsidynix.com/products/bluecloud-visibility
SLIDE 16: [business cat image]
Questions?
“business cat” meme
http://es.memegenerator.net/Business-Cat
SLIDE 17: [xkcd]
https://xkcd.com/262/
ALT TEXT: “hey, at least I ran out of staples”