Linked Data Cookbook for US Government Agencies by Bernadette Hyland, 3 Round Stones, Inc. and W3C Government Linked Data co-chair.
Presented at Semantic Technology Conference Dec 2011, Washington DC
Linked Data Cookbook for Government Agencies, SemTech East, Washington DC 1-Dec-2011
1. A Linked Data Cookbook
for Government Agencies
Semantic Technology Conference, Washington DC
01-Dec-2011 8:30AM
Bernadette Hyland
CEO, 3 Round Stones &
co-chair W3C Government Linked Data Working Group
bhyland@3roundstones.com
Twitter @BernHyland
Monday, November 28, 11
2. • Linked Data is about
publishing and
consuming data using
international data
standards
• Based on 20 year old
idea
• Goal is to solve
organizational issues
related to data silos,
requirements for faster
data integration and an
environment of reduced
IT budgets
3. Linking Government Data
• 42 contributors
• ...from 8 countries
• 10 chapters
• Publication date:
November 2011
4. Agenda
• Why publishing Linked Open Data matters
• What governments are doing today
• How government use of Open Standards &
Open Source Software saves lives and money
• Social contract as a government publisher
• Next steps
5. Two sides of the
Open Government Coin
Short and long term public interests
#1 Increasing transparency and helping with informed civic engagement
#2 Data sharing for informed research, policy &
regulation
My talk today focuses on #2
6. Why should we Care?
• Reducing data silos has long been discussed ...
• Linked Data, based on international data exchange
standards, avoids vendor lock-in
• Reduces the need to create & maintain data silos
• Encourages private and public partnerships
• Sows the seeds for economic growth from the top down
and bottom up
7. ACCEPTABLE ROI FOR IT
[Pie chart. Legend: 6 months, 12 months, 18 months, 24 months, more than 24 months. Shares shown: 4%, 17%, 13%, 16%, 49%.]
19. Where is Open Source deployed?
International Standards and Open Source are the reason
• The Web has become the most extensible, robust
information network ever created
• US Dept of Defense is a big customer of commercially
supported Open Source software
• US Army cites Open Source as saving lives and hundreds of
millions of dollars.
• 100k instances deployed in missile defense systems &
armored personnel carriers
20. In 3 brief years ...
• Starting in 2008, a few heads of state directed open
government data to be published on the Web ...
• Three months ago (September 2011), Presidents
Obama (USA) and Rousseff (Brazil) endorsed the
Open Government Partnership, along with
7 other nations
• Each launched their government’s National Plans
during the meeting of the UN General Assembly
21. World changing phenomenon
• Using a Linked Data approach, we can begin to
address data silos and interoperability using
data exchange standards
• We can combine information sources
• The W3C has defined standards that enable
interoperability and allow us to freely move
data
23. What is next?
• We’re already seeing signs of things to
come.
• Structured data on the Web is becoming
mainstream.
24. Government Linked
Data Working Group
• Started June 2011; runs to May 2013
• Chartered to provide standards & develop standards
track documents to help all governments share
their data as high quality (“5 star”) Linked
Data
• 39 participants from 25 organizations
• 50% in non-US locations
26. Deliverables
Community Directory
Best Practices for Publishing Linked Data
• Procurement, vocabulary selection, URI construction,
versioning, stability, legacy data issues
• Cookbook for Linked Open Data
Standard Vocabularies
• Metadata, Statistical “Cube” Data, People,
Organizational structures
27. Beta: http://dir.w3.org
email support@3roundstones.com for login to
add your organization’s details
37. Preparation
1. Leverage what exists
• Request a copy of the logical and physical model of the
database(s)
• Obtain data extracts (i.e., databases and/or spreadsheets)
or create data in a way that can be replicated.
38. Model the data
2. Model data without context to allow for
reuse and easier merging of data sets
• Traditional DBAs organize data for specified
Web services or applications.
• With LD, application logic does not drive the
data schema, concepts, etc.
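One way to picture context-free modeling: if every fact is stored as a subject-predicate-object statement, merging two data sets reduces to a set union. The sketch below is illustrative only. The example.gov URIs, the permit number, and the facility data are invented; the rdfs:label and WGS84 latitude property URIs are real vocabulary terms.

```python
# Each fact is a (subject, predicate, object) triple; a data set is a set of triples.
# All example.gov URIs and all values below are hypothetical.
FACILITY = "http://example.gov/id/facility/42"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
GEO_LAT = "http://www.w3.org/2003/01/geo/wgs84_pos#lat"

permits_dataset = {
    (FACILITY, RDFS_LABEL, "Riverside Treatment Plant"),
    (FACILITY, "http://example.gov/def/permitNumber", "P-0017"),
}

locations_dataset = {
    (FACILITY, RDFS_LABEL, "Riverside Treatment Plant"),
    (FACILITY, GEO_LAT, "38.9"),
}


def merge(*datasets):
    """Merge data sets by set union; identical statements collapse into one."""
    merged = set()
    for dataset in datasets:
        merged |= dataset
    return merged


def describe(graph, subject):
    """Return every (predicate, object) pair known about a subject, sorted."""
    return sorted((p, o) for s, p, o in graph if s == subject)


merged = merge(permits_dataset, locations_dataset)
for predicate, obj in describe(merged, FACILITY):
    print(predicate, obj)
```

Because neither data set was shaped around a particular application, the shared label statement simply deduplicates and the permit and location facts end up attached to the same subject.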
39. Model the data
3. Look for real-world objects of interest (e.g.,
people, places, things, locations) and
model them.
• Investigate how others are already modeling
similar or related data.
• Look for duplication and normalize the data
• Use common sense to decide whether or
not to make a link
40. Model the data ...
4. Connect data from different sources and
authoritative vocabularies (see list of popular
vocabularies below).
• Use URIs as names for your
objects
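Naming objects with URIs can be as simple as combining a stable base with a readable slug. A minimal sketch, assuming a hypothetical base URI (http://example.gov/id/) and a naming pattern invented for illustration; a real agency would choose its own pattern as part of its URI construction policy:

```python
import re
from urllib.parse import quote

BASE = "http://example.gov/id/"  # hypothetical base URI for identifiers


def slugify(label):
    """Lowercase a label and replace runs of non-alphanumerics with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-")


def mint_uri(kind, label):
    """Build a URI of the form <BASE><kind>/<slug> for a real-world object."""
    return BASE + quote(kind) + "/" + quote(slugify(label))


print(mint_uri("agency", "Environmental Protection Agency"))
# prints http://example.gov/id/agency/environmental-protection-agency
```

Slugs derived from labels are readable but break if the label changes, so an opaque stable key (such as an existing record ID) may be the safer input to the same function.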
41. Model the data ...
• Put aside immediate needs of any
application
• Don’t think about how an application will
use your data
• Do think about time and how the data will
change over time.
42. Convert, Publish & Maintain
5. Write a script or process to convert the
data set repeatedly
6. Publish to the Web and announce it! (more
details shortly)
7. Maintenance strategy (more details in the
social contract at the end)
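A repeatable conversion script might look like the sketch below, which turns CSV rows into N-Triples using only the Python standard library. It is an illustration under stated assumptions, not the process used in the talk: the base URIs, column names, and sample rows are invented, and the code assumes that IDs and column names are already safe to use inside a URI.

```python
import csv
import io

BASE = "http://example.gov/id/facility/"  # hypothetical identifier base
VOCAB = "http://example.gov/def/"         # hypothetical vocabulary base


def escape_literal(value):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def csv_to_ntriples(csv_text, id_column="id"):
    """Convert CSV rows to sorted N-Triples lines, one triple per non-empty cell."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = "<" + BASE + row[id_column] + ">"
        for column, value in row.items():
            if column == id_column or not value:
                continue  # skip the key column and empty cells
            lines.append('%s <%s%s> "%s" .'
                         % (subject, VOCAB, column, escape_literal(value)))
    return sorted(lines)  # sorted output keeps repeated runs easy to diff


sample = "id,name,state\n42,Riverside Plant,VA\n43,Hilltop Depot,\n"
for line in csv_to_ntriples(sample):
    print(line)
```

Sorting the output means that rerunning the script on an unchanged extract yields an identical file, which makes each new conversion easy to compare against the last one.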
43. Take the plunge ... Be forgiving
• Simplistic data models can still be useful
• Better to make progress with something
rather than do nothing because we cannot
be comprehensive and complete
44. Take an iterative approach
1. Review of modeling decisions
2. Review vocabularies chosen and developed
3. Modify/update data conversion scripts
4. Do a maintenance walk-through with real use cases
5. Show how to explore data with SPARQL and
visualizations
6. Discuss a persistent identifier strategy (think PURLs)
46. Linked Data Management System
Callimachus (kəlĭm'əkəs) is a framework for data-driven
applications based on Linked Data principles.
Callimachus allows Web authors to quickly and easily create
semantically-enabled Web applications.
47. Web 2.0 developers can create data-driven applications
with templates in hours
Triples up & down (no MySQL under the covers)
Wiki editing of content
Access control
Collaboration via Web
Change tracking (history)
Page/form Templates
58. Join the Community
Callimachus has benefited from 2+ years of corporate support
We’re using it for real world Web applications in environmental
protection, finance and publishing
Open Source project
Visit callimachusproject.org
59. What we covered today
• Why government authorities are publishing information as
Linked Open Data
• The process for converting data into RDF
• Using Open Standards and Open Source to publish
Open Data
• Note: Commercial support & products are
critical for government publishing & consumption of Open
Data
• Announcing agency Open Data & your social contract
60. Further Reading
http://linkeddatabook.com/editions/1.0/
http://3roundstones.com/linking-enterprise-data/
http://3roundstones.com/linking-government-data/
http://www.linkeddatadeveloper.com/
61. Recommended talk
Thursday, 1-Dec 2011 @ 9:30
by Michael Pendleton &
David G. Smith, US EPA
LINKED GOVERNMENT
DATA:
ENVIRONMENTAL
PROTECTION PERSPECTIVES