Presented in March 2012 at the OECD Statistical Information System Collaboration Community (SIS-CC) Workshop in Paris.
Video available at http://youtu.be/ZldhNV3Qt6c
The digitization of information exchange has led many industries to define standards for the business-to-business conversations between key partners in the value chain. The agencies involved in statistical production are no exception: they need to agree on standards for exchanging data and metadata among themselves. However, before these standards have been fully adopted, new needs have arisen that stress the importance of machine-readable formats for the reuse of public sector information. Open data initiatives have usually found a strategic ally in statistical offices, because timeliness, punctuality and accessibility are part of the code of practice in official statistics. This has increased the need for standards not only for data exchange between organizations specializing in statistical production but also for dissemination to third parties. The presentation tries to address the requirements that dissemination standards should meet in this new context.
Standards for statistical data dissemination: a wish list
1. Standards for statistical data dissemination
a wish list
Xavier Badosa (@badosa)
Statistical Institute of Catalonia
OECD Statistical Information System
Collaboration Community (SIS-CC)
Workshop 2012. Paris, 12-14 March
38. “these APIs are simple enough for weekend
hackers to build interesting projects on, and
(…) easy to implement even on mobile devices
and in almost any programming language.”
Anil Dash
Expert Labs
http://dashes.com/anil/2009/12/the-twitter-api-is-finished.html
39. KISS principle
40. develop for the weakest link
41. provide a useful service
42. get close to the end users
43. langs / APIs ~ browsers / websites
45. 2 APIs
REST API 100 x 1.5 = 150 pp.
Streaming API 6 x 3 = 18 pp.
< 200 pp.
47. The Fifteen Minute Rule
A person of reasonable
ability should be able to
get from zero to ‘Hello World’
in fifteen minutes.
Michael E. Driscoll
Dataspore, Metamarkets
64. Exchange
SDMX: amazingly flexible
Shared environment
65. Exchange
SDMX: amazingly complex
Shared environment
66. Exchange
SDMX: amazingly complex
SDMX-ML is a meta-language
67. Dissemination
TypePad, WordPress
existing Twitter clients
68. The Metcalfe-Bray Law
The value of a markup language is proportional
approximately to the square of the number of
different software implementations that can
process it.
Tim Bray
Google
Dissemination
TypePad, WordPress
existing Twitter clients
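Read as a formula, the law might be sketched as follows. The symbols are labels introduced here for illustration, not Bray's own notation: V is the value of the markup language, n the number of different software implementations that can process it, and k an unspecified constant. Under that reading, doubling the number of implementations roughly quadruples the value.

```latex
V(n) \approx k\,n^{2}
\quad\Longrightarrow\quad
\frac{V(2n)}{V(n)} \approx \frac{k\,(2n)^{2}}{k\,n^{2}} = 4
```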
69. The Metcalfe-Bray Law
Dissemination
Stats provider, Stats provider
existing clients
70. Are we so special that we can’t benefit
from existing libraries and clients?
Dissemination
Stats provider, Stats provider
existing clients
71. Are we so special that we can’t benefit
from existing libraries and clients?
Special things
Common things
89. Are we so special that we can’t benefit
from existing libraries and clients?
SDMX?
Special things
Common things
90. Are we so special that we can’t benefit
from existing libraries and clients?
SDMX?
Special things
Simplified SDMX?
Common things
91. A Complex System That Works Is Invariably
Found To Have Evolved From A Simple System
That Worked.
“Gall’s Law”
Systemantics
How Systems Work and Especially How They Fail
John Gall
92. The parallel proposition also appears to be true:
A Complex System Designed From Scratch
Never Works And Cannot Be Made To Work.
You Have To Start Over, Beginning With A
Working Simple System.
“Gall’s Law”
Systemantics
How Systems Work and Especially How They Fail
John Gall
94. Open APIs and the Semantic Web (John Musser, ProgrammableWeb)
http://www.slideshare.net/jmusser/j-musser-semtechjun2011
95. “REST”
RESTish
Pragmatic REST
REST-inspired
Open APIs and the Semantic Web (John Musser, ProgrammableWeb)
http://www.slideshare.net/jmusser/j-musser-semtechjun2011
111. a “data” format
Natural to programmers
“JSON shines as a programming language-independent
representation of typical programming language
data structures.”
James Clark
Technical lead for the W3C XML activity which
developed XML 1.0 Recommendation

"KeyFamilyRef" : "ALFS_SUMTAB" ,
"Series" : {
    "SeriesKey" : {
        "location" : "AUS" ,
        "subject" : "YGTT01L1_ST" ,
        "frequency" : "A"
    },
    "Attributes" : {
        "time_format" : "P1Y"
    },
    "Obs" : {
        "time": [
            "2000","2001","2002","2003",
            "2004","2005","2006","2007",
            "2008","2009","2010"
        ],
        "value": [
            19153,19413,19651,19895,
            20127,20395,20698,21072,
            21499,21955,22342
        ]
    }
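To illustrate the "natural to programmers" point, here is a minimal Python sketch of how a client might consume the fragment shown on this slide. It rests on two assumptions that are not in the slide: the fragment is wrapped in outer braces and the "Series" object is closed so that the text parses as one JSON object, and the "time" and "value" arrays are read as parallel lists. The variable names are illustrative only.

```python
import json

# The slide's fragment, wrapped in outer braces and with "Series" closed
# (both are assumptions made here so that it parses as a single object).
raw = """
{
  "KeyFamilyRef": "ALFS_SUMTAB",
  "Series": {
    "SeriesKey": {"location": "AUS", "subject": "YGTT01L1_ST", "frequency": "A"},
    "Attributes": {"time_format": "P1Y"},
    "Obs": {
      "time": ["2000", "2001", "2002", "2003", "2004", "2005",
               "2006", "2007", "2008", "2009", "2010"],
      "value": [19153, 19413, 19651, 19895, 20127, 20395,
                20698, 21072, 21499, 21955, 22342]
    }
  }
}
"""

# One call turns the text into ordinary dicts and lists.
doc = json.loads(raw)
obs = doc["Series"]["Obs"]

# Treat "time" and "value" as parallel arrays and pair them up.
by_year = dict(zip(obs["time"], obs["value"]))

key = doc["Series"]["SeriesKey"]
print(key["location"], key["subject"], by_year["2010"])
```

No schema, parser configuration or domain-specific library is involved: the standard `json` module yields plain dictionaries and lists, which is the property the James Clark quote describes.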
169. I have a dream for the Web [in which computers] become
capable of analyzing all the data on the Web – the content,
links, and transactions between people and computers. A
‘Semantic Web’, which should make this possible, has yet to
emerge, but when it does, the day-to-day mechanisms of
trade, bureaucracy and our daily lives will be handled by
machines talking to machines. The ‘intelligent agents’ people
have touted for ages will finally materialize.
Tim Berners-Lee
Director of the W3C, 1999
176. Statistical “Cube” Data. The group will produce a
vocabulary, compatible with SDMX, for expressing
some kinds of statistical data. This need not be as
expressive as all of SDMX, but may provide a subset
as in the RDF Data Cube vocabulary. It may also include ways to
annotate data to indicate its assumptions and comparability.
179. The Mainstream
Relevance “Law”
The mainstream
relevance of a
communication
environment is
proportional to
the quantity of
rubbish in that
environment.
183. Open APIs and the Semantic Web (John Musser, ProgrammableWeb)
http://www.slideshare.net/jmusser/j-musser-semtechjun2011
187. For some web developers the need to understand
the RDF data model and associated serializations and
query language (SPARQL) has proved a barrier to
adoption of linked data. This project seeks to develop
APIs, data formats and supporting tools to overcome
this barrier. Including, but not limited to, accessing
linked data via a developer-friendly JSON format.
237. Standards for statistical data dissemination
a wish list
Xavier Badosa (@badosa)
Statistical Institute of Catalonia
OECD Statistical Information System
Collaboration Community (SIS-CC)
Workshop 2012. Paris, 12-14 March
Thank you
238. Dan Taylor
borman818 / Daniel Borman
Christian Cable
Ian Muttoo
Donald Macleod
Lushbunny
shaggy359
Wikimedia Commons
http://en.wikipedia.org/wiki/File:IBM_PC_5150.jpg
http://en.wikipedia.org/wiki/File:Asimo_look_new_design.jpg
Richard Cyganiak