A combination of automatic, real-time data feeds from government agencies and new Web 2.0 "data visualization" tools can increase cooperation and operating efficiency within government, improve the quality of policy debate, and encourage people to become actively involved in offering ideas to improve government. NOTE: best viewed in full-screen mode to read notes.
Gov Transformation Through Public Data
1. Let my data go!
from locked file cabinets to
government transformation
W. David Stephenson
Stephenson Strategies
3. Where’s my data??
Remember the final scene in “Raiders of the Lost Ark,” when the Ark of the Covenant was moved to a government warehouse? When that happened, you
knew it would never be seen again.
That’s what seems to happen with a lot of government data. We pay taxes to collect them. Our activities and lives are their raw material. They determine
whether many of us get more government benefits and which states and communities get grants.
But once they’re collected, most citizens -- and a lot of government employees for that matter -- don’t have a clue where government data are stored or how
they’re used. Even worse, that robs us of important tools that could improve government’s performance and cut its operating costs.
It often seems as if these data are condemned to remain meaningless numbers locked in obscure databases within unknown agencies. Where the heck are
they?
4. Fast forward to 2008. Lo and behold, in the latest Indiana Jones sequel, Indy retrieves the Ark!
In my book, that’s an omen that you can’t keep things hidden forever!
Similarly, closely controlled and long-lost government data are being liberated by growing demand from watchdog groups, the media, and the electorate
for transparency.
Releasing data and enabling more people to discuss it is a big deal. As Ellen Miller of the Sunlight Foundation said:
“Data and the accessibility of data are key to understanding how Washington really works. It's information that's crucial for citizens to have in
order to participate in their democracy. We need to know, in real time, things like who's lobbying whom, how lawmakers spend our money, who our
elected officials meet with and about what, who contributes to their campaigns, what interventions they make with regulatory agencies, and
information about lawmakers' personal investments...”
5. But more than shedding light on how government operates, taking reams of data and portraying them visually -- what’s called data visualization or
sometimes data graphics -- can have a far more wide-reaching impact.
As Edward Tufte, generally acknowledged as the leading thinker on data graphics, puts it, even the most skilled statisticians often find representing data
visually is the most insightful way of making sense of them:
“Modern data graphics can do much more than simply substitute for small statistical tables. At their best, graphics are instruments
for reasoning about quantitative information. Often the most effective way to describe, explore and summarize a set of numbers -- even a
very large set -- is to look at pictures of those numbers. Furthermore, of all methods for analyzing and communicating statistical
information, well-designed data graphics are usually the simplest and at the same time the most powerful.”
When that concept is applied to government, whether to improve agencies’ internal functioning or, more innovatively, to spark public involvement with
these data-driven graphics, the benefits can be dramatic.
For example, this is a Google mashup Jon Udell whipped up in a few hours to highlight pothole complaints to the DC Department of Public Works, and track
-- on a real-time basis -- the repairs’ status.
Yes, you could find that information in a chart, but who would take the time to sift through pages of records in hopes of possibly finding the one or two that
applied to their neighborhood? By contrast, if you saw this map, and lived near one of the pointers, wouldn’t curiosity compel you to click on it? Wouldn’t the fact
that it includes not only information about where the pothole is and when the complaint was made, but also the repair status TODAY, both fascinate you -- and
provoke you to call the DPW if it’s now 3 months later and the map shows the repair still hasn’t been made?
A simple map can be the impetus for citizen awareness – and greater agency accountability.
6. A look at several other data graphics will give you an appreciation of their potential.
Some visualizations combine various databases to illustrate convergence, contrasts or possible causality.
This example is Neighborhood Knowledge Los Angeles, a collaboration between UCLA and community activists. Its motto is “neighborhood improvement and
recovery is not just for the experts.”
They’re working to adapt the Neighborhood Early Warning System, which seeks to identify early signs that a neighborhood may be declining, into a
mechanism to monitor areas in danger. This is an important example of the value of data visualization, because the system combines data on 7 “problem
indicators” (including code violations, property tax delinquencies, and fire records) that might have otherwise remained isolated. When you see a single block
where many of the danger signs are repeated, that’s a signal to city officials to intervene with coordinated services to halt the decline.
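The idea of combining separate "problem indicator" datasets by block can be sketched in a few lines of Python. This is only an illustration, not the actual Neighborhood Early Warning System: the block ids, the indicator names, the sample records, and the threshold of three distinct indicators are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical sample: (block_id, indicator) pairs drawn from separate agency datasets.
RECORDS = [
    ("block-12", "code_violation"),
    ("block-12", "tax_delinquency"),
    ("block-12", "fire_record"),
    ("block-40", "code_violation"),
    ("block-40", "code_violation"),
    ("block-77", "tax_delinquency"),
]


def blocks_at_risk(records, min_indicators=3):
    """Return block ids that show at least `min_indicators` distinct indicators."""
    seen = defaultdict(set)
    for block_id, indicator in records:
        # A set ignores repeats, so two code violations count as one indicator.
        seen[block_id].add(indicator)
    return sorted(b for b, inds in seen.items() if len(inds) >= min_indicators)


if __name__ == "__main__":
    print(blocks_at_risk(RECORDS))
```

Counting distinct indicators rather than raw records reflects the point made above: it is the repetition of different danger signs on one block, not the volume of any single one, that signals the need for coordinated services.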
7. Other mashups of data and maps put data in geographic context.
For example, this one shows where illegal billboards have been erected in Toronto and summarizes their legal histories. Illegalsigns.ca, by the way, also
illustrates another interesting variation on citizen use of data graphics. While some are done by major organizations, many result from the passions of single
individuals, in this case, billboard opponent Rami Tabello.
8. Still other data graphics give context to global issues, which can seem so massive and complex that many of us shy away from trying to understand, let alone
to influence them.
None are as eye-catching and informative as the visualizations on issues facing developing nations created by the Gapminder Foundation using its
innovative, animated Trendalyzer software. Google now offers Trendalyzer for general use under the Motion Chart name. This static screengrab can't do justice to
the powerful additional understanding gained when you view one of Gapminder’s animated trend diagrams.
9. “
… put together big enough and diverse
enough groups of people & ask them to
make decisions affecting [the] general
interest, [and] that group's decisions will,
over time, be intellectually superior to the
isolated individual, no matter how smart or
well-informed he is.
”
-- The Wisdom of Crowds
Equally important, web-based data visualization tools may also include a variety of community-building Web 2.0 tools, including search, topic hubs, tags,
and discussion areas. They make it easy to focus many individuals’ and groups’ attention on a policy issue, increasing the chance that new insights will emerge
precisely because of the interplay of so many perspectives.
As James Surowiecki wrote in “The Wisdom of Crowds,” “… put together big enough and diverse enough groups of people & ask them to make decisions
affecting matters of general interest, [and] that group's decisions will, over time, be intellectually superior to the isolated individual, no matter how smart or
well-informed he is.”
10. 1st: release the data
Motivated, technologically sophisticated individuals can create informative data visualizations the hard way, by “scraping” data from governmental web
sites.
However, now that it is so simple to create data feeds that are generated automatically as new data are added, there’s little rationale not to do so.
In fact, Princeton researchers recently released a paper making a startling assertion: “Rather than struggling, as it currently does, to design sites that meet
each end-user need, we argue that the executive branch should focus on creating a simple, reliable and publicly accessible infrastructure that exposes the
underlying data.”
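To show how little is involved in generating a feed automatically from records an agency already holds, here is a minimal Python sketch that builds an RSS 2.0 document with the standard library. The function name `build_rss`, the record fields, and the example.org URL are hypothetical, chosen only for illustration; a real agency feed would add dates, links, and whatever fields its data require.

```python
import xml.etree.ElementTree as ET


def build_rss(title, link, description, records):
    """Build an RSS 2.0 document (as a string) from a list of record dicts."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for rec in records:
        # One <item> per record; rerunning this as records arrive refreshes the feed.
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = rec["title"]
        ET.SubElement(item, "description").text = rec["description"]
        ET.SubElement(item, "guid").text = rec["id"]
    return ET.tostring(rss, encoding="unicode")


# Hypothetical sample records, standing in for rows in an agency database.
SAMPLE_RECORDS = [
    {"id": "req-1", "title": "Pothole complaint", "description": "Status: open"},
    {"id": "req-2", "title": "Streetlight outage", "description": "Status: repaired"},
]

if __name__ == "__main__":
    print(build_rss("Service requests (sample)",
                    "https://example.org/feeds/service-requests",
                    "Illustrative feed of service requests",
                    SAMPLE_RECORDS))
```

Because the feed is rebuilt from the underlying records, anyone consuming it gets the same data the agency sees, with no scraping of web pages required.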
11. A number of federal and state agencies now publish a variety of RSS and other feeds.
By far, the most exciting model is the District of Columbia’s Citywide Data Warehouse, created as part of the city’s operational reform efforts.
It provides real-time RSS, XML, Atom, TXT and ESRI feeds -- among others. The feeds are drawn from more than 150 data sets, ranging from the
all-important crime reports to pothole complaints to the DPW.
The city has realized benefits including:
• Improved coordination among city agencies
• More value from the mashups because the data is available on a real-time basis rather than scraped from historical records
• A wider variety of uses because of more types of feed formats and a greater range of data.
D.C. has set a high standard: who will top it?
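As a rough sketch of what a citizen or manager could do with such a feed, the Python below reads an RSS document of pothole complaints and lists the ones still open after 90 days. The sample feed is invented for the example, and the use of the `category` element to carry repair status is an assumption; an actual feed may expose status in a different field.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Hypothetical sample feed; the status-in-<category> convention is an assumption.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Pothole complaints (sample)</title>
<item><title>Pothole at 1st St and A St</title>
<pubDate>Mon, 07 Jan 2008 09:00:00 +0000</pubDate>
<category>open</category></item>
<item><title>Pothole at 9th St and K St</title>
<pubDate>Tue, 01 Apr 2008 09:00:00 +0000</pubDate>
<category>open</category></item>
<item><title>Pothole at 3rd St and C St</title>
<pubDate>Wed, 02 Jan 2008 09:00:00 +0000</pubDate>
<category>repaired</category></item>
</channel></rss>"""


def overdue_complaints(feed_xml, now, max_age_days=90):
    """Return titles of complaints still open more than `max_age_days` after filing."""
    root = ET.fromstring(feed_xml)
    limit = timedelta(days=max_age_days)
    overdue = []
    for item in root.findall("./channel/item"):
        filed = parsedate_to_datetime(item.findtext("pubDate"))
        if item.findtext("category") == "open" and now - filed > limit:
            overdue.append(item.findtext("title"))
    return overdue


if __name__ == "__main__":
    today = datetime(2008, 4, 15, tzinfo=timezone.utc)
    print(overdue_complaints(SAMPLE_FEED, today))
```

The same loop could feed a map layer instead of a printed list; the point is that a real-time feed makes the repair status checkable by anyone, at any time.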
12. 2nd: visualize data
The second major component of a public data project is to help people find simple-to-use ways to portray the data visually. A growing range of new
visualization tools is readily available on the web. Several of the commercial sites now offer secure versions, making it simple for agencies to add visualization
services behind the firewall.
13. As with many e-gov initiatives, data graphics can capitalize on existing private-sector sites that help build understanding of the concept, create user-friendly
tools to facilitate it, and provide community-building features such as topic hubs, search, tagging and discussion forums.
The creators of IBM’s Many Eyes say it’s:
“… a bet on the power of human visual intelligence to find patterns. Our goal is to ‘democratize’ visualization and to enable a new social kind of data
analysis …. All of us ... are passionate about the potential of data visualization to spark insight. It is that magical moment we live for: an unwieldy, unyielding
data set is transformed into an image on the screen, and suddenly the user can perceive an unexpected pattern.
As visualization designers we have witnessed and experienced many of those wondrous sparks. But in recent years, we have become acutely aware that
the visualizations and the sparks they generate, take on new value in a social setting. Visualization is a catalyst for discussion and collective insight about
data .... When we share it and discuss it, we understand it in new ways.”
This particular visualization was the first one that I personally created, to help understand patterns in the Department of Homeland Security's disbursement
of funds under its Urban Area Security Initiative program. The simple-to-understand directions allowed me to upload the data and create the visualization in a
matter of minutes.
14. With similar passion for the social benefits of free exchange of data, Swivel’s founders say their
“... mission is to liberate the world's data and make it useful so new insights can be discovered and shared …. the world's most important data have been
completely neglected. … When people, business leaders and politicians cannot access the facts in an engaging way they'll just ignore them. And when the
facts are ignored citizens, communities and investors lose. Without accountability to the facts our world gets worse. We believe data is most valuable when
it's out in the open where everyone can see it, debate it, have fun, and share new insights. Swivel is applying the power of the Web to data so that life gets
better.”
On this particular screengrab, you’ll notice several key Swivel and Many Eyes features applicable to governmental use:
• other users are free to comment -- and challenge the interpretations
• users can create tags (as well as “community tags”) to categorize the graphs
• and it’s also easy to find related graphs and/or to share them with others.
Swivel makes it particularly easy to add your graph to a blog.
15. Yeah, but ...
Certainly, there are concerns that must be addressed before an agency launches a data feed and visualization initiative. Fortunately, there are sound
solutions to most of them.
One concern is that amateurs will just confuse issues, to which Jon Udell responds: “Those who don’t cite data will be laughed at. Those who do cite data but
interpret it incorrectly will be corrected. Those who do great work will develop reputations that are discoverable and measurable.”
Others worry that release of data will intrude on privacy. Sadly, there is already a lot of personal information available on the web, not to mention data theft
and inadvertent disclosure by government agencies. The data privacy issue must be addressed on a comprehensive basis, and shouldn’t be given as the justification
for denying transparent government and data feeds.
Still others worry that releasing and combining bad data will only compound problems. Give me a break! Bad data must be cleaned up under any
circumstances.
16. Transparency begins at home
The concept of releasing data to the general public is understandably downright scary to many in government!
So here’s a great way to ease into data feeds and data visualization: follow the District of Columbia's lead, and apply the same strategy behind the firewall
first.
After all, your own employees may be struggling with incompatible databases, or may need to reach across agency “silos” to see if there might be synergies
between programs. Employees from another agency may also be able to provide new insights simply because of their differing experiences and perspectives.
Also, as more young workers, who have never known life without the Web, join governmental workforces, they’ll naturally ask why tools they’ve used can’t be
used in government. A data graphics project can empower them and tap their expertise.
Launching a behind-the-firewall data visualization site requires the same components as a public site, allowing agencies to test and improve the parts
before any kind of public exposure – not to mention reaping the benefits of data feeds and data graphics:
• clean up data, and establish common formats for feeds: XML, RSS, Atom and geospatial feeds such as KML. Whenever possible, release the feeds on a
real-time basis, so they can serve more effectively as management tools rather than simply as a way to analyze historical data.
• create a single web site for the project, including the feeds and demonstrations that show how the process works, then aggregate all of the mashups and
visualizations that result.
• encourage users to create and use tags, so that individual graphics can be clustered, compared, and searched.
• have agency management commit to monitor the site for new ideas and/or signs of problems with the agency’s performance uncovered by visualizations.
When possible, act on them, and publicly thank those who make the effort.
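The first step in the list above, cleaning up data and settling on common formats, might look something like the following Python sketch. The agency names, field names, and date formats are hypothetical; the sketch simply assumes that two agencies store the same kind of record under different column names and date conventions, and maps both onto one shared schema with ISO 8601 dates.

```python
from datetime import datetime

# Hypothetical per-agency mappings from local field names to a shared schema.
FIELD_MAP = {
    "public_works": {"ReqDate": "date", "Addr": "address", "Type": "category"},
    "housing": {"filed_on": "date", "location": "address", "kind": "category"},
}

# Hypothetical date formats used by each agency's database.
DATE_FORMATS = {"public_works": "%m/%d/%Y", "housing": "%Y-%m-%d"}


def normalize(agency, record):
    """Map one agency-specific record onto the shared schema with ISO dates."""
    out = {"agency": agency}
    for src, dst in FIELD_MAP[agency].items():
        value = record[src].strip()  # drop stray whitespace from data entry
        if dst == "date":
            value = datetime.strptime(value, DATE_FORMATS[agency]).date().isoformat()
        out[dst] = value
    return out


if __name__ == "__main__":
    print(normalize("public_works",
                    {"ReqDate": "03/05/2008", "Addr": " 100 Main St ", "Type": "pothole"}))
    print(normalize("housing",
                    {"filed_on": "2008-03-06", "location": "100 Main St", "kind": "code violation"}))
```

Once records from different agencies share field names and date formats, they can be published through the same feeds and combined in the same mashups without further hand-editing.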
Experimenting with transparent government on the inside lets government agencies
• learn more about the approach
• encourage inter-agency cooperation
• clean up data streams
• create their own data visualizations and information mash-ups
17. Let 1,000 mashups bloom!
Once an agency has done this behind-the-scenes work and realized value from an internal data visualization program, the prospect of a parallel set of public
data feeds and a data visualization site is less worrisome.
Of course there are obstacles and risks with transparent government. However, realistically, there’s little choice about whether to implement it: as the public
becomes more at ease with already-available Web 2.0 participation tools and sees the benefits of ad hoc projects such as illegalsigns.ca, they will do more and
more data visualizations whether or not agencies facilitate them. If that’s the case, why not also reap the benefits of growing public understanding and insight?
18. The payoff: transformation!
Outweighing the obstacles are the benefits, particularly in an era in which public faith and participation in government must be rebuilt. When more
employees within agencies and the public at large become comfortable with data and its interpretation, the potential benefits include:
• more informed policy debate, grounded in fact, rather than rhetoric
• consensus building: as people from a variety of perspectives and political ideologies engage in discussion around these facts and their interpretation,
there is at least the possibility of dialogue
• better legislation: with more public debate before legislation is enacted, potential pitfalls that might otherwise only emerge once a program was
operating (with more serious and costly ramifications) would be debated and addressed during the legislative process
• greater transparency and less corruption: when data on government operations, campaign finance, etc. are public, it is harder to conceal
corruption. It becomes easier, for example, to see possible correlation between contributions to an elected official and votes that might favor that contributor.
• greater accountability: robust discussion of program statistics can make government operations more accountable, improving performance (this is
often referred to as sousveillance, in which the public monitors delivery of services).
• optimizing program efficiency and reducing costs: when data are easily available, especially for a wide range of government programs dealing
with the same issue, target populations or geographic areas, it becomes easier to identify potential synergies and/or overlaps and redundancies. This can result
in both more effective services and reduced costs.
• new perspectives, especially when “the wisdom of crowds” emerges.
Who would have believed that dry data -- with a healthy dose of Web 2.0 magic -- could become the engine to involve the public
in governmental transformation!
19. For more information on transparent
government and how to create it:
Stephenson Strategies
335 Main Street, Medfield, MA 02052
(617) 314-7858
D.Stephenson@stephensonstrategies.com