1. The CSO Open Data Experience
Databank and Dissemination,
Central Statistics Office, Cork, Ireland
Eoin MacCuirc eoin.mccuirc@cso.ie (00 353 21) 453 5504
2. The Tower of Babel
“If as one people speaking the same language they have begun to do this, then nothing they plan to do will be impossible for them. Come, let us go down and confuse their language so they will not understand each other.”
3. Ireland — Central Statistics Office
“The collection, compilation, extraction and dissemination for statistical purposes of information relating to economic, social and general activities and conditions in the State”
6. The Web of Things – The Internet of Things
http://semanticweb.com/34702/
The Internet of Things is coming, but it needs a semantic backbone to
flourish. With some 25 billion devices expected to be connected to the
Internet by 2015 and 50 billion by 2020, providing interoperability among
the things on the IoT “is one of the most fundamental requirements to
support object addressing, tracking, and discovery as well as information
representation, storage, and exchange.” So write the authors of Semantics
for the Internet of Things: Early Progress and Back to the Future, Payam
Barnaghi and Wei Wang, Centre for Communication Systems Research,
University of Surrey, Guildford, UK and Cory Henson, Kno.e.sis – Ohio
Center of Excellence in Knowledge-enabled Computing.
“The suite of technologies developed in the Semantic Web … such as
ontologies, semantic annotation, Linked Data and semantic Web services
… can be used as principal solutions for the purpose of realizing the IoT,”
they state. “Defining an ontology and using semantic descriptions for data
will make it interoperable for users and stakeholders that share and use
the same ontology.”
7. Tim Berners-Lee – Inventor of the Web
“In an extreme view, the world can be seen as only connections, nothing else. We think of a dictionary as the repository of meaning, but it defines words only in terms of other words. I liked the idea that a piece of information is really defined only by what it's related to, and how it's related. There really is little else to meaning. The structure is everything. There are billions of neurons in our brains, but what are neurons? Just cells. The brain has no knowledge until connections are made between neurons. All that we know, all that we are, comes from the way our neurons are connected.”
8. Linked Open Data cloud
http://lod-cloud.net/
Media
Government
Geo
Publications
User-generated
Life sciences
Cross-domain
9. How open is the data? - Linked Open Data star scheme
Tim Berners-Lee suggested a 5-star deployment scheme for Linked Open Data, and Ed Summers provided a nice rendering of it. In the following, examples are given for each level. The example data used throughout is 'the temperature forecast for Galway, Ireland for the next 3 days':
★ make your stuff available on the Web (whatever format) under an open license (example 1)
★★ make it available as structured data (e.g., Excel instead of image scan of a table) (example 2)
★★★ use non-proprietary formats (e.g., CSV instead of Excel) (example 3)
★★★★ use URIs to identify things, so that people can point at your stuff (example 4)
★★★★★ link your data to other data to provide context (example 5)
http://lab.linkeddata.deri.ie/2010/star-scheme-by-example/
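The step from three stars to five can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the forecast figures are invented, the `example.org` URIs and the `VOCAB` property names are hypothetical, and the DBpedia URI is assumed to identify Galway. The only non-hypothetical identifier is the standard `owl:sameAs` property.

```python
import csv
import io

# Hypothetical 3-star data: an invented forecast published as plain CSV.
CSV_TEXT = """day,max_temp_c
2012-01-01,9
2012-01-02,8
2012-01-03,10
"""

# Hypothetical identifiers, used for illustration only.
BASE = "http://example.org/forecast/galway/"
VOCAB = "http://example.org/vocab#"
PLACE = "http://example.org/place/galway"
# Assumed to be the DBpedia identifier for Galway.
EXTERNAL_PLACE = "http://dbpedia.org/resource/Galway"
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def to_triples(csv_text):
    """Turn each CSV row into (subject, predicate, object) triples."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # 4 stars: every forecast gets its own URI that others can point at.
        subject = BASE + row["day"]
        triples.append((subject, VOCAB + "place", PLACE))
        triples.append((subject, VOCAB + "maxTempC", int(row["max_temp_c"])))
    # 5 stars: link the local place URI to another dataset for context.
    triples.append((PLACE, SAME_AS, EXTERNAL_PLACE))
    return triples


triples = to_triples(CSV_TEXT)
for triple in triples:
    print(triple)
```

A real publication would serialise these triples as RDF with a dedicated library; plain tuples are used here to keep the sketch dependency-free.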
10. URI – Uniform Resource Identifier
give the thing a name and an address
The following picture shows the desired relationships between a resource
and its representing documents:
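One common way to realise this relationship, described in the W3C "Cool URIs for the Semantic Web" note, is to give the thing itself one URI and to redirect (HTTP 303) to a document about it, chosen by the client's Accept header. The sketch below only simulates that decision; all URIs are hypothetical `example.org` addresses, the `negotiate` function is an invented helper, and q-values in the Accept header are ignored for simplicity.

```python
# Hypothetical URIs: one for the thing itself, one per document describing it.
RESOURCE = "http://example.org/id/galway"
DOCUMENTS = {
    "text/html": "http://example.org/doc/galway.html",
    "application/rdf+xml": "http://example.org/data/galway.rdf",
}


def negotiate(resource_uri, accept_header):
    """Return (status, location) for a request to the resource URI."""
    if resource_uri != RESOURCE:
        return 404, None
    # Walk the media types in the order the client lists them.
    # Simplification: q-values (";q=0.8") are stripped and not ranked.
    for part in accept_header.split(","):
        media_type = part.split(";")[0].strip()
        if media_type in DOCUMENTS:
            return 303, DOCUMENTS[media_type]
    # Nothing matched: fall back to the human-readable page.
    return 303, DOCUMENTS["text/html"]


print(negotiate(RESOURCE, "application/rdf+xml"))
print(negotiate(RESOURCE, "text/html,application/xhtml+xml"))
```

The point of the design is that the resource URI never has to change, even if the documents describing it are moved or gain new formats.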
11. Tim’s cool URIs
Cool URIs don't change
What makes a cool URI?
A cool URI is one which does not change.
What sorts of URI change?
URIs don't change: people change them.
It is the duty of a Webmaster to allocate URIs which you will be able to stand by in 2 years, in 20 years, in 200 years. This needs thought, and organization, and commitment.
12. Where is the CSO with all this?
• Publishes data on www.cso.ie
• Maintains a statistics portal www.statcentral.ie
• Publishes data in JSON-stat format through an API (Beta) http://www.cso.ie/webserviceclient/
• One of the first NSIs in the world to upload census data as linked open data –
data.cso.ie – Census 2011 http://data.cso.ie/
• Involved in the EU Open Cube project http://opencube-project.eu/
• Hosts data for other government departments http://www.cso.ie/en/databases/
• Actively engaged with the Irish Open Government Partnership
http://www.ogpireland.ie/
• Organises the apps4gaps competition www.apps4gaps.ie
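To give a feel for what consuming JSON-stat data involves, here is a minimal sketch that flattens a dataset into records. It assumes the JSON-stat 2.0 dataset layout (`id`, `size`, `dimension`, `value`, with values stored so that the last dimension varies fastest) and handles only a dense `value` array. The sample document, its dimension names and its figures are invented; it is not output from the CSO service, whose exact response shape may differ.

```python
import json
from itertools import product

# Hypothetical JSON-stat 2.0 style dataset; all figures are made up.
SAMPLE = json.loads("""
{
  "version": "2.0",
  "class": "dataset",
  "id": ["county", "year"],
  "size": [2, 2],
  "dimension": {
    "county": {"category": {"index": ["cork", "galway"]}},
    "year": {"category": {"index": {"2006": 0, "2011": 1}}}
  },
  "value": [10, 11, 20, 21]
}
""")


def category_ids(dimension):
    """Return category ids in position order (index may be a list or a dict)."""
    index = dimension["category"]["index"]
    if isinstance(index, dict):
        return sorted(index, key=index.get)
    return list(index)


def to_records(dataset):
    """Pair every combination of categories with its value."""
    dims = dataset["id"]
    axes = [category_ids(dataset["dimension"][d]) for d in dims]
    # itertools.product also varies the last axis fastest, matching the
    # assumed row-major order of the "value" array.
    return [
        dict(zip(dims, combo), value=value)
        for combo, value in zip(product(*axes), dataset["value"])
    ]


records = to_records(SAMPLE)
for record in records:
    print(record)
```

A sparse `value` object, labels, units and status flags are left out of this sketch.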
19. CSO goals (independent from OpenCube)
• Own the data.cso.ie process and technology
– Enable in-house maintenance, changes, etc.
• Publish StatBank* data as Linked Open Data
– Ongoing publication process
– Adhering to the release schedule is critical
– Publish data that are regularly updated (monthly, quarterly, annual) as linked open data (the Census 2011 data are static)
• Deploy tools that enable analytics and exploitation of linked data
– Both internally and externally
*StatBank is the CSO's published time series database (PC-Axis)
22. Building capacity – Using the data
• Liaison Groups
• Seminar Series (Administrative Data and Business Statistics)
• Oireachtas Briefings
• Social Media (Twitter, YouTube, Facebook)
• Education Outreach (CensusAtSchool, John Hooper Medal for Statistics, IPA Diploma in Official Statistics)
• Visualisations and Engagement (Key Economic Indicators, Infographics, Exploristica)
23. The Tower of Babel
“If as one people speaking the same language they have begun to do this, then nothing they plan to do will be impossible for them. Come, let us go down and confuse their language so they will not understand each other.”
Genesis 11:6: “If as one people speaking the same language they have begun to do this, then nothing they plan to do will be impossible for them.”
The data deluge
The data deluge refers to the situation where the sheer volume of new data being generated is overwhelming the capacity of institutions to manage it and researchers to make use of it.
The Semantic Web is a collaborative movement led by the international standards body, the World Wide Web Consortium (W3C).[1] The standard promotes common data formats on the World Wide Web. By encouraging the inclusion of semantic content in web pages, the Semantic Web aims to convert the current web, dominated by unstructured and semi-structured documents, into a "web of data". The Semantic Web stack builds on the W3C's Resource Description Framework (RDF).[2]
According to the W3C, "The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries."[2] The term was coined by Tim Berners-Lee for a web of data that can be processed by machines.[3]
http://semanticweb.com/34702/
Over 200 datasets with 30 billion data items and 400 million links
http://5stardata.info/
In computing, a uniform resource identifier (URI) is a string of characters used to identify a web resource by name, by location, or both. Such identification enables interaction with representations of the web resource over a network (typically the World Wide Web) using specific protocols. Schemes specifying a concrete syntax and associated protocols define each URI.
A URI should not be confused with its two subclasses, the uniform resource locator (URL) and the uniform resource name (URN).
URIs can be classified as locators (URLs), as names (URNs), or as both. A uniform resource name (URN) functions like a person's name, while a uniform resource locator (URL) resembles that person's street address. In other words: the URN defines an item's identity, while the URL provides a method for finding it.
A URL is a URI that, in addition to identifying a web resource, specifies the means of acting upon or obtaining the representation: providing both the primary access mechanism, and the network "location".
A URN is a URI that identifies a resource by name, in a particular namespace. One can use a URN to talk about a resource without implying its location or how to access it.
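The name/address distinction can be checked mechanically from the URI scheme. The sketch below uses Python's standard `urllib.parse`; the `classify` helper is invented for illustration and is deliberately crude: it treats the `urn` scheme as a name and every other scheme as a locator, which glosses over schemes that do not really locate anything.

```python
from urllib.parse import urlparse


def classify(uri):
    """Roughly label a URI as a URN (name) or a URL (locator) by its scheme."""
    scheme = urlparse(uri).scheme.lower()
    if not scheme:
        return "not an absolute URI"
    if scheme == "urn":
        return "URN"
    # Simplification: any other scheme is treated as a locator.
    return "URL"


for uri in ("urn:isbn:0451450523", "http://data.cso.ie/", "data.cso.ie"):
    print(uri, "->", classify(uri))
```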
http://www.w3.org/TR/cooluris/