The Stanford Workshop focused on creating plans to expedite a shift in how knowledge and information resources are managed and discovered through linked data. The goal was to identify capabilities and design new tools, processes, and systems that move beyond current metadata practices to link related resources and provide improved navigation and discovery through open feedback. A number of organizations from around the world participated in the workshop to discuss these issues.
the slogan & logo
a) “it’s the web, stupid!” is pretty long in the tooth these days
b) nevertheless, tension between dogma & web-driven functionality for linked data does exist
c) this offers a gentle reminder of this particular workshop’s focus
checklist of the participating institutions and organizations ... 25 people altogether
a) library-centric by design
b) other initiatives & events are addressing a wider range of resources & agencies (this JISC session, Europeana, LOD_LAM in SF, early June)
c) yeast added to the mix: research, corporate, and non-profit content technology agencies
here’s where you get to do a bit of work ... I won’t read these
a) objectives for the workshop: fundable plans for creating tools, processes, and vehicles to expedite a disruptive paradigm shift
catch phrase for the effort as a whole: less talking, more doing ... doing at web scale (pace!)
CLIR funded creation of a linked-data survey ... state of affairs ... to set a framework and provide a means for participants to begin work with relatively similar definitions of “linked data”
screen shot of the survey’s outline (note headings ... pace!)
drilling down ... extant metadata (note headings ... pace -- )
looking at a specific subheading
which takes one to the content ... checklists of content with URLs
1st version published to the Workshop participants
final version (with updates from the workshop & ensuing developments in the linked data arena) will be public on the CLIR site by end of summer ... ever changing landscape: schema.org, W3C Library Linked Data Incubator Group
David / Richard will hear when the web site is publicly accessible
some additional work for you ... first is Josh Greenberg’s telling synopsis
and Mike Bergman’s thoughts on provenance & co-referencing ... worth consideration relative to cultural heritage efforts & objectives
Stefano Mazzocchi’s tale on linking ... take note of the date and give some thought to this snippet ... in terms of what’s playing out in the dialogs among proponents & advocates for varied flavors of structured data ... linked data in the W3C canon, RDF, RDFa, rich snippets, schema.org, Facebook’s Open Graph Protocol, Freebase’s GraphD-driven topics
... speaks for itself
this checklist surfaced near the Workshop’s mid-point
a) procedurally the workshop was shaped by an evolving agenda
b) said agenda driven by the products of four 6-person workgroups
c) cross pollination among workgroups provided by alternating plenary ... workgroup ... plenary ... workgroup sessions
it’s a “prioritized” list of the issues coming separately from the workgroups ... it’s ordered via a simple voting process, each participant having 7 votes ... any number of votes could go to any issue (1 – 7) -- it’s a rough hewn working document, a quick snapshot that then went back into the workgroup process
1. co-reference, reconciliation ... of URIs
3. killer apps ... tension between cultural heritage built ... emergent via web scale entrepreneurs
11. feedback, metrics ... what’s the value-add
12. user seduction!
13. workflow ... gross simplifications
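The voting rule described above (7 votes per participant, spread freely across issues) can be sketched roughly as follows. This is only an illustration: the ballots, participant names, and function names are made up, and the assumption that a participant may leave some votes unspent is mine, not the Workshop’s.

```python
from collections import Counter

VOTES_PER_PARTICIPANT = 7  # the Workshop's rule: 7 votes each


def tally(ballots):
    """Sum votes per issue. Each ballot maps issue -> votes (1 to 7)."""
    totals = Counter()
    for participant, ballot in ballots.items():
        if any(votes < 1 for votes in ballot.values()):
            raise ValueError(f"{participant}: each listed issue needs at least 1 vote")
        # Assumption: unspent votes are allowed, overspending is not.
        if sum(ballot.values()) > VOTES_PER_PARTICIPANT:
            raise ValueError(f"{participant}: more than {VOTES_PER_PARTICIPANT} votes cast")
        totals.update(ballot)
    return totals


def prioritize(ballots):
    """Return (issue, votes) pairs, most votes first; ties broken by issue name."""
    return sorted(tally(ballots).items(), key=lambda item: (-item[1], item[0]))


# Hypothetical ballots, for illustration only.
ballots = {
    "participant_a": {"co-reference": 4, "killer apps": 3},
    "participant_b": {"co-reference": 7},
    "participant_c": {"workflow": 2, "killer apps": 2, "feedback, metrics": 3},
}
ranking = prioritize(ballots)
```

Letting a participant pile all 7 votes on one issue is what makes the resulting order a rough measure of intensity as well as breadth of concern.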
sketch of the chief components of a linked-data creation pipeline
MARC / MODS library data was the specific use case ... considerable detail developed in the working documents
note:
1) co-references captured early pay big dividends later (less URI bloat)
2) comparison of machine vs. human ID of co-references: 80 / 20 – ideal !! mebbe as low as 60 / 40 in many cases ... need efficient human resource contribution: crowdsourcing, Freebase reconciliation pipelines
note also:
1) revisions that go back thru the whole pipeline can be expensive ... think about transforming MARC to linked data then pushing revisions back into MARC ... highly dysfunctional impedance (granularity) mismatch
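To make the machine vs. human split concrete, here is a minimal sketch of one reconciliation step: confident string matches are linked automatically and everything else lands in a human review queue. It is not the pipeline from the working documents. The authority list, the example.org URIs, the 0.9 threshold, and the use of Python’s difflib for matching are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def reconcile(labels, authority, threshold=0.9):
    """Split labels into machine-matched URIs and a human review queue."""
    matched, review = {}, []
    for label in labels:
        best_uri, best_score = None, 0.0
        for known_label, uri in authority.items():
            score = similarity(label, known_label)
            if score > best_score:
                best_uri, best_score = uri, score
        if best_score >= threshold:
            matched[label] = best_uri  # confident enough: link automatically
        else:
            # keep the best guess so a human reviewer has a starting point
            review.append((label, best_uri, round(best_score, 2)))
    return matched, review


# Hypothetical authority file and incoming headings (example.org URIs are placeholders).
authority = {
    "Twain, Mark, 1835-1910": "http://example.org/person/1",
    "Austen, Jane, 1775-1817": "http://example.org/person/2",
}
labels = [
    "Twain, Mark, 1835-1910",
    "twain, mark, 1835-1910",
    "Austen, Jane",
    "Clemens, Samuel",
    "Austen, Jane, 1775-1817.",
]
matched, review = reconcile(labels, authority)
machine_share = len(matched) / len(labels)  # fraction handled without a human
```

Raising the threshold shifts work toward the human queue and lowering it risks false co-references, which is where crowdsourcing or a Freebase-style reconciliation service would come in.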
how soon can the published canon of linked data become the metadata resource of record ?? !! in the Workshop’s parlance, what can be done to make the change ... and can that change be incremental
a 10,000 foot snapshot of a work plan for a specific project
a) large pools of journal citations
b) highly constrained workflows in extant systems with narrow flexibility in staff resources (ranging from little to none)
c) using a rough project planning outline, where do the ISSUES fit into the planning and workflow of such a project
sketches of: use cases, data modeling, expectations for production workflows, maintenance and distribution
bottom line IS and REMAINS metrics ... feedback, reporting, reward systems
VALUE ADD ... for consumers
VALUE ACCRUED ... for linked data publishers
where’s the convincing elevator speech for VALUE ADD / ACCRUED via linked data by the cultural heritage community ?? !!
another take on putting linked data to work across the many levels of expertise found in GLAM environments ... a checklist of tasks / events / considerations / planning nodes
placed in a matrix with 3 levels of organizational maturity: novice, journeyman, master (oops ... we left out apprentice ...)
at the nexus of each matrix junction, institutions could find reference implementations that address the issues and functions organizations need to move forward with their linked data projects
OK ... the unspoken issues much discussed at the Workshop !!
URIs
change in basic culture ... flat, string-based records throughout the cultural heritage community -- conversion of data hard -- conversion of culture ... DOABLE ?!
co-references and provenance inseparable ... trust comes from source of assertions ... how choices about sameAs were made ... AND NOT MADE
crucial: value ADD and value ACCRUED
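One way to picture “co-references and provenance inseparable” is to store every sameAs decision, including the ones considered and rejected, together with who made it and how. The sketch below assumes a simple in-memory record. The class, field names, source names, and example.org URIs are hypothetical; only the owl:sameAs property IRI is standard.

```python
from dataclasses import dataclass

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


@dataclass(frozen=True)
class CoReferenceDecision:
    subject: str       # URI in our own data
    candidate: str     # URI it was compared against
    accepted: bool     # False records a sameAs that was considered and NOT made
    asserted_by: str   # source of the assertion: where trust comes from
    method: str        # how the choice was made, e.g. "string match", "curator"


def to_ntriples(decisions):
    """Emit owl:sameAs triples for accepted decisions only."""
    return [
        f"<{d.subject}> <{OWL_SAME_AS}> <{d.candidate}> ."
        for d in decisions
        if d.accepted
    ]


def from_trusted(decisions, trusted_sources):
    """Keep accepted decisions whose source the consumer chooses to trust."""
    return [d for d in decisions if d.accepted and d.asserted_by in trusted_sources]


# Hypothetical decisions, for illustration only.
decisions = [
    CoReferenceDecision("http://example.org/person/1", "http://example.com/auth/twain",
                        True, "cataloging-dept", "curator"),
    CoReferenceDecision("http://example.org/person/1", "http://example.com/auth/clemens-jr",
                        False, "cataloging-dept", "curator"),
    CoReferenceDecision("http://example.org/person/2", "http://example.com/auth/austen",
                        True, "batch-matcher", "string match"),
]
triples = to_ntriples(decisions)
curated = from_trusted(decisions, {"cataloging-dept"})
```

Keeping the rejected candidate on record means the same match does not have to be re-argued the next time an automated matcher proposes it.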
an array of VALUE assessments from Europeana’s mid-June workshop
increased relevance ... an apparent winner ... metrics for same?
new customers & public mission
data enhancement
things to keep in mind ... recurrent themes during the week at Stanford
co-references / reconciliation is more web emergent and less policy driven -- in the terms of one of the groups at the workshop: schema last ... data first
Wendy Hall talked with the Stanford Libraries’ staff last summer (2010) ... she recounted her experience with the emerging web, from the perspective of one who developed a well structured, standards-based means of providing access to a massive archive of historical papers at Southampton ... in the end, the web worked better than what she’d created ... her message to the Stanford folk: scruffy works
speaks for itself
what’s coming from the Workshop
Public version of the CLIR survey
Documents from the workshop
A small number of proposals (over a relatively short period of time)