O’Reilly Strata Conference
Making Data Work

Feb 28 - Mar 1
Santa Clara, CA

Michelle Li
Conference Overview

         • 3 days of workshops, lectures, keynotes, startup showcase and a mini
           Maker Faire
         • Developers, data scientists, data analysts, and other data professionals
           including researchers, designers, journalists
         • Session tracks: Data Science, Deep Data, Business &
           Industry, Hadoop and Big Data (Applied & Tech), Domain Data, and
           Visualization & Interface




© 2011 Oculus Info Inc.                        2
Evening events included:
            • Mini Maker Faire – showcase of innovative data-related hardware, apps, and robots
            • Data Crush: Where Wine and Data Meet – wine tasting event where participants provided
                feedback data that was compiled and analyzed to extrapolate behavioural trends and
                factors influencing their responses
            • Startup Showcase – live demo program and competition for 10 finalist startups and
                early-stage companies to demonstrate their innovations to judges, investors,
                entrepreneurs, and journalists

Who goes to a conference about data?


            Over 2000 attendees from various organizations:

                          Microsoft   Digg
                          Google      Groupon
                          Apple       PayPal
                          Netflix     Infochimps
                          IBM         Tableau
                          Oracle      VMware
                          LinkedIn    Guardian News
                          Facebook    The Seattle Times
                          Twitter     MIT Media Lab
                          Amazon      …


Data is the New Oil




         Source: http://www.house.gov/apps/list/press/tx08_brady/71509_hc_chart.html




Materialize Data Into New Services




Google Insights: “Infographic”




Google Insights: “Infographic” vs “Big Data”




Session Overviews

         o Data visualization
            • how we communicate information
            • visual analysis and principles for designing effective
               data views
            • design process and visualization tools for presenting
               data
         o Data Journalism
            • creating data stories to share information socially
         o Democratization of Data
            • data for the common good




Noah Iliinsky, Complex Diagrams
                Jock Mackinlay, Tableau

                DESIGNING DATA VISUALIZATIONS


Data Visualization



                 The representation and
                 presentation of data that
                 exploits our visual
                 perception abilities in order
                 to amplify cognition


Science of Visualization

         o Humans are slow at mental math,
           but we’re faster when using the
           world around us (slide example: 34 x 72)
         o Human perception is powerful, but
           perception can be aided and
           augmented by visual prompts
         o Finding patterns is key to
           information visualization
            • We have a flexible pattern
               finder coupled with an
               adaptive decision-making
               mechanism




Visualization Makes Data Accessible

         Allows us to easily see trends and patterns




Leverage the Amazing Abilities of Our Eyes and Brain

          Preattentive features:
          length, width, size, colour, closure, number, intersection, contrast, tilt,
          curvature, etc.




Faster Access to Actionable Insights
         Difficult to compare 15+ tire models with different characteristics
            • rim diameters, various widths, various features, price, special features

         Chart allows customer to focus on appropriate tires based on 3 axes of data:
            • desired rim size
            • tire width
            • toughness/quickness




         Source: http://www.rivbike.com/Tires-Pumps-Patches-s/52.htm
         Source: http://complexdiagrams.com/2009/03/tire-chart/




Allows Access to Huge Amounts of Data

         Gapminder: public health data on a massive global scale
         Understand data through stories




                                                                  Source: gapminder.org




Visualization for Exploration


          LinkedIn Maps




Visualization for Explanation




Visualizing Data

         Data has properties
               • categorical, quantifiable, geographic, binary
               • continuous, non-continuous, ordered
               • timeline




Define Knowledge Before Structure
         Donut charts: aesthetically pleasing but not very functional in these cases.
         Good: individual donuts give an at-a-glance view of relative share of the total market



         Chart #1

         • Comparing series of donut charts is
           meaningless
         • Shows time series data over 7 donuts


         Chart #2

         • Too many wedges
         • Many of the wedges are similarly sized
         • Non-standard sort




                                                    Source: http://litmus.com/blog/email-client-market-share-infograph/email-client-market-stats-1000



Use Defaults

         Time series data is
         usually best shown in a
         line graph

         Shows sequential
         changes more easily
         than comparing wedges
         between donuts

         Line graph shows trends
         more clearly
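
The "use defaults" advice can be sketched as a small lookup. This is only an illustration: the `kind` labels and the wedge cut-off are assumptions made for the example, not rules stated in the talk.

```python
def default_chart(kind: str, n_categories: int = 0) -> str:
    """Suggest a default chart type for a dataset.

    `kind` values and MAX_WEDGES are illustrative assumptions.
    """
    MAX_WEDGES = 5  # assumed cut-off for "too many wedges"
    if kind == "time_series":
        # Sequential change reads most easily as a line.
        return "line"
    if kind == "part_of_whole" and n_categories <= MAX_WEDGES:
        # A single donut gives a quick glance at relative share.
        return "donut"
    # Everything else falls back to a simple bar graph.
    return "bar"


print(default_chart("time_series"))
print(default_chart("part_of_whole", n_categories=12))
```

With these assumed rules, a time series maps to a line graph, and a share breakdown with many similar wedges falls back to a bar graph.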




Simple bar graph, but it’s much easier
                          to extract knowledge from it




Unless your data is periodic, don’t put your data in a periodic table




                                    Chronological timeline
                                                Family tree
                          Influence of different controllers
                                       Meaningful context


Encoding Well




                 Position is everything.
                 Colour is hard.
                                       /Moritz Stefaner




Position is Everything




Position is Everything




Colour is Difficult
         Colour can be used effectively
         in information display
           • Naturally codes attributes of
             objects
           • Not naturally ordered in our
             brain

         Excellent for labelling and
         categorization
          • Works well for heat
            maps/temperature and
            categorization

         Poor for displaying
         shape, rank, order, detail or
         space
          • Not effective for
            quantitative data




Colour is Difficult




Retinal Properties

         o      Jacques Bertin identified that every
                visualization is made up of basic
                components
         o      Each component has different expressive
                power
         o      Each works best only in some conditions
         o      6 basic variables:
                size, value, texture, colour, orientation, shape

         o      Jock Mackinlay applied these same
                principles to automatically construct
                visualizations out of data




                                  Four dimensions of data shown effectively in a traditional
                                  scatter plot generated by computers

                                  Diagram shows how each visual component works best in each
                                  case and how to use them.



Appropriate Encodings




                                      http://complexdiagrams.com/properties




Fabien Girardin, Near Future Laboratory

                SKETCHING WITH DATA


Napoleon


                          “A good sketch is better than a long speech.”
                          (“Un bon croquis vaut mieux qu’un long discours.”)




Network Data




Urban Demos

                               ‘Urban demos’ reveal how the city
                               lives through its data. The City of
                               Geneva visualized digital traces
                               created from cellular network
                               activity.

                               They reflect mobility in a city or a
                               street and reveal insights about a
                               city that are of importance from
                               an economic and political
                               perspective.




Digital Traces




Process




Innovate With Data - Sketch




Sketching With Data

                                Sketch: to think, to make an idea
                                tangible (and observe its different
                                dimensions and implications), to tell
                                stories, to share discoveries

                                A rough version of a creative
                                work, made to assist in reaching
                                a coherent result

                                Key values of sketching:
                                    •    share common language
                                    •    qualify results
                                    •    explore ideas




Sketch To Share A Common Language

         Sets a common language among the different actors of the project for how they understand the
         data and how the data can be used

         Project: explore novel services for mobile phone operators using aggregated cellular
         network activity




               Network Engineer’s view of the data         Product Manager’s view of the same data


Sketch To Share A Common Language




This is an early sketch to show the
data they were trying to
transform, which reveals the
quality of the data to measure
mobility and density of activity on
the network
Sketch To Qualify Results

         Project: Controlling hyper-congestion at the Louvre to create an enjoyable
         visiting experience

         Hyper-congestion refers to the situation in which the number of visitors in a space
         negatively affects the quality of their visiting experience and their security.




Sketch To Qualify Results
         o      Used network of sensors over 10 days around critical areas to collect empirical data on flows
                and densities of visitors in key areas
         o      Measured occupancy levels, visiting times, and centrality of trails
         o      Field experts (security guards) helped contextualize data and early results through sketches
         o      These results can influence the remodeling of areas and the deployment of information kiosks
                and help evaluate strategies and policies to control hyper-congestion




Defining Measures of Hyper-Congestion

        • Measures provided insights and revealed symptoms of hyper-congestion, but
          were insufficient to describe the cause of the issue
             • how to qualify how people walk, etc.
        • Sketches were produced after each data collection period: visualized
          information about visiting sequences, travel times, and how long visitors stayed in
          each room
        • Used sketches to discuss with people in the field, who provided qualitative
          evidence to contextualize and qualify results and explain detected irregularities




Defining Measures of Hyper-Congestion




          Network data tells A
          story, not THE story

Sketch To Explore Ideas
         Project: Explore the role of a retail bank (BBVA) in smart cities in the near future
         Explored opportunities for innovative services to exploit data in the domains of
         distribution strategies, audience profiling and social navigation




Sketch To Explore Ideas


                                        Created multiple prototypes to
                                        explore opportunities for innovative
                                        BBVA internal and external services

                                        Project participants were able to
                                        explore and interrogate the data
                                        from multiple perspectives

                                        Use of the dashboard helped
                                        participants develop specific
                                        scenarios involving services and
                                        products that a bank could take
                                        advantage of




Interactive Sketching Tool: Quadrigram




      Data manipulation and visualization
      environment using a visual
      programming language

      Modular, node-based interface for
      designing data flows, linking data
      resources to operators, controls and
      visualizations

      WYSIWYG interface designed for iterative
      exploration and explanation, allowing
      us to generate new questions and
      provide answers with data



Access, Manipulate, Analyze and Visualize




   Real-time traffic information

   Five representations of a single data set:
   1. Table visualizer (rows & columns)
   2. Network visualization to see
         relationships between points
   3. Geodata to view points on a map and
         see context
   4. Real-time data to visualize traffic
         moving at different velocities
   5. Temporal data

Access, Manipulate, Analyze and Visualize




     Data as living material
OPEN DATA & DATA JOURNALISM


Data Journalism

         • Data is changing journalism in several ways
            • New ways of visualizing complexity
            • Provide real answers, based on evidence rather than assertion
            • Democratization of tools and data platforms to help people understand
              information and share stories
            • Bigger datasets about really small things
                 o Allows you to search data
                 o Make complex maps really quickly
            • Crowdsourcing
                 o Aggregated input from the public is powerful for disaster response
                 o Accurately depicts dynamic situations
         • Open data means open data journalism
            • Governments are increasingly publishing their data repositories for
              other people to access and use



Japanese Geiger Maps

      Using Pachube to aggregate geiger counter
      readings from various data sources


      • Geiger counter – readings for
        Tsunami/Fukushima facility
      • Government was releasing information only
        once per day in PDF format – only numerals;
        nothing about what they mean
      • Pachube community created tutorials, then
        collected and aggregated measurements
        from various sources and hooked them up to
        the web
      • Suddenly 2000 feeds/minute across Japan
      • People took data and built applications to
        represent data in terms of health
        consequences and change from background
        radiation
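
As a rough illustration of expressing readings as a change from background radiation, the sketch below averages each feed and reports it as a multiple of a baseline. The feed names, the readings, and the background value are all hypothetical; this is not how the Pachube applications were implemented.

```python
from statistics import fmean

# Assumed background dose rate in microsieverts per hour. A real
# application would use a measured local baseline instead.
BACKGROUND_USV_H = 0.05


def times_background(readings_usv_h, background=BACKGROUND_USV_H):
    """Average one feed's readings and express them as a multiple of background."""
    return round(fmean(readings_usv_h) / background, 2)


# Hypothetical feeds keyed by an invented feed name.
feeds = {
    "feed_a": [0.10, 0.20],
    "feed_b": [0.05, 0.05],
}

summary = {name: times_background(values) for name, values in feeds.items()}
print(summary)
```

Reporting a multiple of background, rather than a bare number, addresses the complaint that the official figures said nothing about what they meant.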

                                         http://japan.failedrobot.com/



Winds of Fukushima

           Android App: took your geolocation, wind direction and nearby radiation monitors to
           infer where radiation may peak next
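
One way to combine a location, a wind direction, and a monitor position is to check whether the user lies downwind of a monitor. The sketch below is an assumption about how such a check could work, not the app's actual method; the coordinates, the tolerance angle, and the function names are invented for the example.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial compass bearing in degrees from point 1 to point 2."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    x = math.sin(dlon) * math.cos(phi2)
    y = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(x, y)) % 360


def is_downwind(monitor, user, wind_from_deg, tolerance_deg=30):
    """True if `user` lies roughly where the wind carries air from `monitor`.

    `wind_from_deg` follows the meteorological convention (the direction
    the wind comes from); `tolerance_deg` is an assumed cone half-width.
    """
    blows_toward = (wind_from_deg + 180) % 360
    to_user = bearing_deg(*monitor, *user)
    # Smallest angle between the two compass directions.
    diff = abs((to_user - blows_toward + 180) % 360 - 180)
    return diff <= tolerance_deg


# Hypothetical positions: the user is due north of the monitor.
monitor = (37.0, 141.0)
user = (37.5, 141.0)
print(is_downwind(monitor, user, wind_from_deg=180))  # wind from the south
print(is_downwind(monitor, user, wind_from_deg=0))    # wind from the north
```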




       Android app: Winds of Fukushima




After the tsunami and
         earthquakes, Toyota
         and Honda shared their
         data to map out usable
         roads




Crowdsourcing Datasets


                                       Understand trends of the
                                       data set

                                       Help find anomalies

                                       People measured things that
                                       might not be measured by
                                       the official network

                                       Public visibility and
                                       accountability: get people
                                       with different domain
                                       expertise to talk about the
                                       data




Simon Rogers, Guardian

                THE CRAFT OF DATA JOURNALISM


Behind the Scenes at the Guardian Datablog

                              Datablog started off as a small blog offering full datasets
                              behind their stories and now publishes hundreds of raw
                              datasets, data visualizations and data analyses


                              Process
                              o Locate the data or receive it from various sources (e.g.
                                breaking news stories, government data, journalists’
                                research)
                              o Examine the data: transform for quality/purpose, tidy
                                up, consolidate
                              o Perform calculations and statistical inquiries to see
                                whether there is a story
                              o Output a story, graphic or visualization
                                   • Excel/Google charts for small line graphs and pies
                                   • Google Fusion Tables for maps
                                   • Internal dev team produce the more
                                     sophisticated graphics
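
The locate, examine, calculate, and output steps above can be sketched with the Python standard library. The sample data, column names, and function names are hypothetical and only illustrate the shape of the process, not the Datablog's actual tooling.

```python
import csv
import io

# Hypothetical raw data with the kind of mess a tidy-up step removes:
# stray whitespace, thousands separators, and a blank row.
RAW = """department, spend
 Health ,"1,200"
Education,800

Health,300
"""


def load_rows(text):
    """Locate/receive: parse CSV text into dictionaries."""
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    return list(reader)


def tidy(rows):
    """Examine: normalise names and convert amounts to integers."""
    return [
        {
            "department": row["department"].strip(),
            "spend": int(row["spend"].replace(",", "")),
        }
        for row in rows
    ]


def totals(rows):
    """Calculate: sum spending per department."""
    result = {}
    for row in rows:
        result[row["department"]] = result.get(row["department"], 0) + row["spend"]
    return result


# Output: a small ranked table, largest total first.
by_department = totals(tidy(load_rows(RAW)))
for name, spend in sorted(by_department.items(), key=lambda item: -item[1]):
    print(f"{name}: {spend}")
```

Keeping each step as its own function mirrors the process list: the tidy-up can change without touching the calculation or the output.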



The First Guardian Data Journalism:
                               May 5, 1821

                                • Contained a table of data: a list of
                                  schools in Manchester and
                                  Salford, with the number of students
                                  at each and the average annual
                                  spending
                                     i.e. how many pupils received free
                                         education and how many poor
                                         children there were in the city
                                • Official statistics were collected by only
                                  4 clergymen, which resulted in
                                  inaccurate and faulty data
                                • Leaked by a source identified as
                                  “NH”, the data caused a huge
                                  sensation
                                • Revealed that 25 000 children were
                                  receiving free education instead of the
                                  8 000 that was officially estimated


                                • Using data to show the true state of
                                  affairs to help fight for a decent
                                  education system
Public spending by the UK’s central government departments 2010-2011




Becoming Data Providers




Exploring the Data

     170 spreadsheets of
     government spending data

     Guardian created a
     spending data explorer
     application

     Designed to make it easier
     for people to search and
     download key data

     Simple analysis has already
     been done: combined
     spending for each
     department into single
     spreadsheets




Wikileaks Afghanistan War Logs


                                          Wikileaks log of every IED attack
                                          with co-ordinates from 2004-2009

                                          Soldiers are good at entering data:
                                          locations where soldiers died in
                                          Afghanistan, including date, what
                                          happened, number of
                                          casualties, and summaries




Bigger Datasets Of Smaller Things:
Every IED attack from 2004-2009




Crowdsourcing Experiment: MP Expense Scandal
          • Big release of MPs’ documented expense claims – 458,000+ documents
          • The Guardian developed a crowdsourcing application in 5 days
                 • Within 10 minutes of the launch, 323 people were using the application to go through the
                   documents
                 • In the first half hour, more than 2000 pages had been reviewed
          • Each receipt filed by an MP was converted into an image for the public to review
          • Reviewers were asked to determine and detail what entries there were on a page and flag
            them as unimportant, interesting, “interesting but known” or worthy of investigation
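
A minimal sketch of how crowdsourced flags like these could be tallied per page follows. The page IDs, the `investigate` label, and the vote threshold are assumptions for illustration; the Guardian's actual application is not described at this level of detail.

```python
from collections import Counter

# The four review outcomes named on the slide; "investigate" is a
# shortened label chosen for this example.
FLAGS = ("unimportant", "interesting", "interesting but known", "investigate")


def tally(reviews):
    """Count flags per page from (page_id, flag) review pairs."""
    counts = {}
    for page_id, flag in reviews:
        if flag not in FLAGS:
            raise ValueError(f"unknown flag: {flag}")
        counts.setdefault(page_id, Counter())[flag] += 1
    return counts


def pages_to_investigate(counts, min_votes=2):
    """Pages where at least `min_votes` reviewers chose 'investigate'."""
    return sorted(
        page for page, flags in counts.items() if flags["investigate"] >= min_votes
    )


# Hypothetical reviews from several members of the public.
reviews = [
    ("page-1", "investigate"),
    ("page-1", "investigate"),
    ("page-1", "interesting"),
    ("page-2", "unimportant"),
    ("page-3", "investigate"),
]

print(pages_to_investigate(tally(reviews)))
```

Requiring more than one vote is one simple guard against a single reviewer's mistake sending journalists to the wrong page.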




        http://mps-expenses.guardian.co.uk/




What Was Revealed…

         • Douglas Hogg, Conservative MP for
           Sleaford and North Hykeham, charged
           £2,115 to have the moat cleared at his
           Lincolnshire estate and claimed bills for
           a "mole man".
         • Sir Peter Viggers, Tory MP for
           Gosport, claimed £1,645 for a floating
           "duck island" in the garden of his
           Hampshire home as part of £32,000 of
           gardening expenses over three years.
         • Jacqui Smith, the former home
           secretary, claimed £10 for two adult
           films which were accessed by her
           husband at her constituency home.
         • Tony Blair claimed almost £7000 for
           roof repairs two days before leaving
           office and standing down as MP.


London Riots




            Instant data journalism: filling the hole of knowledge for anyone
            wanting to know what was happening where

            • Collected key reported incidents from as many sources as possible
            • Compiled a list of every incident where there was a verified
              report, then mapped it with Google Fusion Tables
            • Allowed people to download the data behind it – possibly
              the simplest but most popular thing they did


Reading the Riots

         o Project took a look at the riots as
           experienced by those who were
           there
         o A specially-recruited team
           interviewed around 270 people
           about the riots and why they had
           been involved




England Riots: Was Poverty A Factor?




‘Riot Commute’

       • Data from 1,100 individuals’ magistrates’ court
         records that included postcodes for defendants’
         home and offence locations

       • 70% of those accused of riot-related
         crimes travelled from outside their area

       • Riots occurred in the city centre, but
         accused rioters lived in outlying districts

       • Travelled an average of 2.2 miles from
         home to the riot offence site

       • Transport mapping specialists modelled
         the most likely routes from home to
         offence
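
An average home-to-offence distance like the one above can be estimated once postcodes have been turned into coordinates. The sketch below uses straight-line (great-circle) distance rather than the modelled routes, and the coordinate pairs are hypothetical, so it illustrates the calculation only.

```python
import math
from statistics import fmean

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# Hypothetical (home, offence) coordinate pairs, assumed to come from
# geocoding the postcodes in the court records.
cases = [
    ((51.50, -0.10), (51.53, -0.10)),
    ((51.45, -0.05), (51.45, -0.05)),
]

distances = [haversine_miles(*home, *offence) for home, offence in cases]
mean_distance = fmean(distances)
print(f"average distance: {mean_distance:.1f} miles")
```

Straight-line distance understates real travel, which is one reason route modelling by transport specialists adds value.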


How Riot Rumours Spread On Twitter

       • Many people, including the PM and acting head of the Metropolitan police, blamed
         Twitter for spreading the disorder
       • Analysis of 2.6 million riot-related tweets suggested a different conclusion: the
         network was able to collectively dispel and clarify false information
       • Picked a subset of more than 10 000 tweets concerning 7 key rumours that emerged
         during the riots




NBS8053 Introduction 2012NBS8053 Introduction 2012
NBS8053 Introduction 2012
 

Kürzlich hochgeladen

CALL ON ➥8923113531 🔝Call Girls Aminabad Lucknow best Night Fun service
CALL ON ➥8923113531 🔝Call Girls Aminabad Lucknow best Night Fun serviceCALL ON ➥8923113531 🔝Call Girls Aminabad Lucknow best Night Fun service
CALL ON ➥8923113531 🔝Call Girls Aminabad Lucknow best Night Fun serviceanilsa9823
 
Brookefield Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Brookefield Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Brookefield Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Brookefield Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
Government polytechnic college-1.pptxabcd
Government polytechnic college-1.pptxabcdGovernment polytechnic college-1.pptxabcd
Government polytechnic college-1.pptxabcdshivubhavv
 
Booking open Available Pune Call Girls Kirkatwadi 6297143586 Call Hot Indian...
Booking open Available Pune Call Girls Kirkatwadi  6297143586 Call Hot Indian...Booking open Available Pune Call Girls Kirkatwadi  6297143586 Call Hot Indian...
Booking open Available Pune Call Girls Kirkatwadi 6297143586 Call Hot Indian...Call Girls in Nagpur High Profile
 
RT Nagar Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Bang...
RT Nagar Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Bang...RT Nagar Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Bang...
RT Nagar Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Bang...amitlee9823
 
VIP Model Call Girls Kalyani Nagar ( Pune ) Call ON 8005736733 Starting From ...
VIP Model Call Girls Kalyani Nagar ( Pune ) Call ON 8005736733 Starting From ...VIP Model Call Girls Kalyani Nagar ( Pune ) Call ON 8005736733 Starting From ...
VIP Model Call Girls Kalyani Nagar ( Pune ) Call ON 8005736733 Starting From ...SUHANI PANDEY
 
Pastel Portfolio _ by Slidesgo.pptx. Xxx
Pastel Portfolio _ by Slidesgo.pptx. XxxPastel Portfolio _ by Slidesgo.pptx. Xxx
Pastel Portfolio _ by Slidesgo.pptx. XxxSegundoManuelFaichin1
 
Pooja 9892124323, Call girls Services and Mumbai Escort Service Near Hotel Gi...
Pooja 9892124323, Call girls Services and Mumbai Escort Service Near Hotel Gi...Pooja 9892124323, Call girls Services and Mumbai Escort Service Near Hotel Gi...
Pooja 9892124323, Call girls Services and Mumbai Escort Service Near Hotel Gi...Pooja Nehwal
 
The_Canvas_of_Creative_Mastery_Newsletter_April_2024_Version.pdf
The_Canvas_of_Creative_Mastery_Newsletter_April_2024_Version.pdfThe_Canvas_of_Creative_Mastery_Newsletter_April_2024_Version.pdf
The_Canvas_of_Creative_Mastery_Newsletter_April_2024_Version.pdfAmirYakdi
 
Top Rated Pune Call Girls Saswad ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...
Top Rated  Pune Call Girls Saswad ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...Top Rated  Pune Call Girls Saswad ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...
Top Rated Pune Call Girls Saswad ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...Call Girls in Nagpur High Profile
 
AMBER GRAIN EMBROIDERY | Growing folklore elements | Root-based materials, w...
AMBER GRAIN EMBROIDERY | Growing folklore elements |  Root-based materials, w...AMBER GRAIN EMBROIDERY | Growing folklore elements |  Root-based materials, w...
AMBER GRAIN EMBROIDERY | Growing folklore elements | Root-based materials, w...BarusRa
 
call girls in Vasundhra (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝...
call girls in Vasundhra (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝...call girls in Vasundhra (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝...
call girls in Vasundhra (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝...Delhi Call girls
 
Design Inspiration for College by Slidesgo.pptx
Design Inspiration for College by Slidesgo.pptxDesign Inspiration for College by Slidesgo.pptx
Design Inspiration for College by Slidesgo.pptxTusharBahuguna2
 
Best VIP Call Girls Noida Sector 47 Call Me: 8448380779
Best VIP Call Girls Noida Sector 47 Call Me: 8448380779Best VIP Call Girls Noida Sector 47 Call Me: 8448380779
Best VIP Call Girls Noida Sector 47 Call Me: 8448380779Delhi Call girls
 
Best VIP Call Girls Noida Sector 44 Call Me: 8448380779
Best VIP Call Girls Noida Sector 44 Call Me: 8448380779Best VIP Call Girls Noida Sector 44 Call Me: 8448380779
Best VIP Call Girls Noida Sector 44 Call Me: 8448380779Delhi Call girls
 
infant assessment fdbbdbdddinal ppt.pptx
infant assessment fdbbdbdddinal ppt.pptxinfant assessment fdbbdbdddinal ppt.pptx
infant assessment fdbbdbdddinal ppt.pptxsuhanimunjal27
 
Escorts Service Nagavara ☎ 7737669865☎ Book Your One night Stand (Bangalore)
Escorts Service Nagavara ☎ 7737669865☎ Book Your One night Stand (Bangalore)Escorts Service Nagavara ☎ 7737669865☎ Book Your One night Stand (Bangalore)
Escorts Service Nagavara ☎ 7737669865☎ Book Your One night Stand (Bangalore)amitlee9823
 
CALL ON ➥8923113531 🔝Call Girls Kalyanpur Lucknow best Female service 🧵
CALL ON ➥8923113531 🔝Call Girls Kalyanpur Lucknow best Female service  🧵CALL ON ➥8923113531 🔝Call Girls Kalyanpur Lucknow best Female service  🧵
CALL ON ➥8923113531 🔝Call Girls Kalyanpur Lucknow best Female service 🧵anilsa9823
 

Kürzlich hochgeladen (20)

CALL ON ➥8923113531 🔝Call Girls Aminabad Lucknow best Night Fun service
CALL ON ➥8923113531 🔝Call Girls Aminabad Lucknow best Night Fun serviceCALL ON ➥8923113531 🔝Call Girls Aminabad Lucknow best Night Fun service
CALL ON ➥8923113531 🔝Call Girls Aminabad Lucknow best Night Fun service
 
Brookefield Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Brookefield Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Brookefield Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Brookefield Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Government polytechnic college-1.pptxabcd
Government polytechnic college-1.pptxabcdGovernment polytechnic college-1.pptxabcd
Government polytechnic college-1.pptxabcd
 
Booking open Available Pune Call Girls Kirkatwadi 6297143586 Call Hot Indian...
Booking open Available Pune Call Girls Kirkatwadi  6297143586 Call Hot Indian...Booking open Available Pune Call Girls Kirkatwadi  6297143586 Call Hot Indian...
Booking open Available Pune Call Girls Kirkatwadi 6297143586 Call Hot Indian...
 
RT Nagar Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Bang...
RT Nagar Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Bang...RT Nagar Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Bang...
RT Nagar Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Bang...
 
VIP Model Call Girls Kalyani Nagar ( Pune ) Call ON 8005736733 Starting From ...
VIP Model Call Girls Kalyani Nagar ( Pune ) Call ON 8005736733 Starting From ...VIP Model Call Girls Kalyani Nagar ( Pune ) Call ON 8005736733 Starting From ...
VIP Model Call Girls Kalyani Nagar ( Pune ) Call ON 8005736733 Starting From ...
 
Pastel Portfolio _ by Slidesgo.pptx. Xxx
Pastel Portfolio _ by Slidesgo.pptx. XxxPastel Portfolio _ by Slidesgo.pptx. Xxx
Pastel Portfolio _ by Slidesgo.pptx. Xxx
 
Pooja 9892124323, Call girls Services and Mumbai Escort Service Near Hotel Gi...
Pooja 9892124323, Call girls Services and Mumbai Escort Service Near Hotel Gi...Pooja 9892124323, Call girls Services and Mumbai Escort Service Near Hotel Gi...
Pooja 9892124323, Call girls Services and Mumbai Escort Service Near Hotel Gi...
 
The_Canvas_of_Creative_Mastery_Newsletter_April_2024_Version.pdf
The_Canvas_of_Creative_Mastery_Newsletter_April_2024_Version.pdfThe_Canvas_of_Creative_Mastery_Newsletter_April_2024_Version.pdf
The_Canvas_of_Creative_Mastery_Newsletter_April_2024_Version.pdf
 
Top Rated Pune Call Girls Saswad ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...
Top Rated  Pune Call Girls Saswad ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...Top Rated  Pune Call Girls Saswad ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...
Top Rated Pune Call Girls Saswad ⟟ 6297143586 ⟟ Call Me For Genuine Sex Serv...
 
AMBER GRAIN EMBROIDERY | Growing folklore elements | Root-based materials, w...
AMBER GRAIN EMBROIDERY | Growing folklore elements |  Root-based materials, w...AMBER GRAIN EMBROIDERY | Growing folklore elements |  Root-based materials, w...
AMBER GRAIN EMBROIDERY | Growing folklore elements | Root-based materials, w...
 
call girls in Vasundhra (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝...
call girls in Vasundhra (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝...call girls in Vasundhra (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝...
call girls in Vasundhra (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝...
 
Design Inspiration for College by Slidesgo.pptx
Design Inspiration for College by Slidesgo.pptxDesign Inspiration for College by Slidesgo.pptx
Design Inspiration for College by Slidesgo.pptx
 
Best VIP Call Girls Noida Sector 47 Call Me: 8448380779
Best VIP Call Girls Noida Sector 47 Call Me: 8448380779Best VIP Call Girls Noida Sector 47 Call Me: 8448380779
Best VIP Call Girls Noida Sector 47 Call Me: 8448380779
 
Call Girls Service Mukherjee Nagar @9999965857 Delhi 🫦 No Advance VVIP 🍎 SER...
Call Girls Service Mukherjee Nagar @9999965857 Delhi 🫦 No Advance  VVIP 🍎 SER...Call Girls Service Mukherjee Nagar @9999965857 Delhi 🫦 No Advance  VVIP 🍎 SER...
Call Girls Service Mukherjee Nagar @9999965857 Delhi 🫦 No Advance VVIP 🍎 SER...
 
Best VIP Call Girls Noida Sector 44 Call Me: 8448380779
Best VIP Call Girls Noida Sector 44 Call Me: 8448380779Best VIP Call Girls Noida Sector 44 Call Me: 8448380779
Best VIP Call Girls Noida Sector 44 Call Me: 8448380779
 
infant assessment fdbbdbdddinal ppt.pptx
infant assessment fdbbdbdddinal ppt.pptxinfant assessment fdbbdbdddinal ppt.pptx
infant assessment fdbbdbdddinal ppt.pptx
 
Escorts Service Nagavara ☎ 7737669865☎ Book Your One night Stand (Bangalore)
Escorts Service Nagavara ☎ 7737669865☎ Book Your One night Stand (Bangalore)Escorts Service Nagavara ☎ 7737669865☎ Book Your One night Stand (Bangalore)
Escorts Service Nagavara ☎ 7737669865☎ Book Your One night Stand (Bangalore)
 
CALL ON ➥8923113531 🔝Call Girls Kalyanpur Lucknow best Female service 🧵
CALL ON ➥8923113531 🔝Call Girls Kalyanpur Lucknow best Female service  🧵CALL ON ➥8923113531 🔝Call Girls Kalyanpur Lucknow best Female service  🧵
CALL ON ➥8923113531 🔝Call Girls Kalyanpur Lucknow best Female service 🧵
 
young call girls in Vivek Vihar🔝 9953056974 🔝 Delhi escort Service
young call girls in Vivek Vihar🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Vivek Vihar🔝 9953056974 🔝 Delhi escort Service
young call girls in Vivek Vihar🔝 9953056974 🔝 Delhi escort Service
 

O'Reilly Strata Conference Making Data Work

  • 1. O’Reilly Strata Conference: Making Data Work. Feb 28 - Mar 1, Santa Clara, CA. Michelle Li
  • 2. Conference Overview • 3 days of workshops, lectures, keynotes, a startup showcase and a mini Maker Faire • Developers, data scientists, data analysts, and other data professionals including researchers, designers and journalists • 5 different session tracks: Data Science, Deep Data, Business & Industry, Hadoop and Big Data (Applied & Tech), Domain Data, and Visualization & Interface
  • 3. Evening events included: • Mini Maker Faire – showcase of innovative data-related hardware, apps, and robots • Data Crush: Where Wine and Data Meet – wine tasting event where participants provided feedback data that was compiled and analyzed to extrapolate behavioural trends and the factors influencing their responses • Startup Showcase – live demo program and competition for 10 finalist startups and early-stage companies to demonstrate their innovations to judges, investors, entrepreneurs and journalists
  • 4. Who goes to a conference about data? Over 2000 attendees from various organizations: Microsoft, Google, Apple, Netflix, IBM, Oracle, LinkedIn, Facebook, Twitter, Amazon, Digg, Groupon, PayPal, Infochimps, Tableau, VMware, Guardian News, The Seattle Times, MIT Media Lab, …
  • 5.
  • 6. Data is the New Oil. Source: http://www.house.gov/apps/list/press/tx08_brady/71509_hc_chart.html
  • 7. Materialize Data Into New Services
  • 8. Google Insights: “Infographic”
  • 9. Google Insights: “Infographic” vs “Big Data”
  • 10. Session Overviews o Data Visualization • how we communicate information • visual analysis and principles for designing effective data views • design process and visualization tools for presenting data o Data Journalism • creating data stories to share information socially o Democratization of Data • data for the common good
  • 11. Noah Iliinsky, Complex Diagrams; Jock Mackinlay, Tableau: DESIGNING DATA VISUALIZATIONS
  • 12. Data Visualization: the representation and presentation of data that exploits our visual perception abilities in order to amplify cognition
  • 13. Science of Visualization o Humans are slow at mental math (figure: 34 x 72), but we’re faster when using the world around us o Human perception is powerful, but perception can be aided and augmented by visual prompts o Finding patterns is key to information visualization • We have a flexible pattern finder coupled with an adaptive decision-making mechanism
  • 14. Visualization Makes Data Accessible: allows us to easily see trends and patterns
  • 15. Leverage the Amazing Abilities of Our Eyes and Brain. Preattentive features: length, width, size, colour, closure, number, intersection, contrast, tilt, curvature, etc.
  • 16. Faster Access to Actionable Insights. Left: difficult to compare 15+ tire models with different characteristics • rim diameters, various widths, various features, price, special features (Source: http://www.rivbike.com/Tires-Pumps-Patches-s/52.htm). Right: chart allows the customer to focus on appropriate tires based on 3 axes of data: • desired rim size • tire width • toughness/quickness (Source: http://complexdiagrams.com/2009/03/tire-chart/)
  • 17. Allows Access to Huge Amounts of Data. GapMinder: public health data on a massive global scale. Understand data through stories. Source: gapminder.org
  • 18. Visualization for Exploration: LinkedIn Maps
  • 19. Visualization for Explanation
  • 20. Visualizing Data. Data has properties: • categorical, quantifiable, geographic, binary • continuous, non-continuous, ordered • timeline
  • 21. Define Knowledge Before Structure. Donut charts: aesthetically pleasing but not very functional in these cases. Good: individual donuts are good for an at-a-glance view of relative share of the total market. Chart #1 • Comparing a series of donut charts is meaningless • Shows time series data over 7 donuts. Chart #2 • Too many wedges • Many of the wedges are similarly sized • Non-standard sort. Source: http://litmus.com/blog/email-client-market-share-infograph/email-client-market-stats-1000
  • 22. Use Defaults. Time series data is usually best shown in a line graph. It shows sequential changes more easily than comparing wedges between donuts, and it shows trends more clearly.
  • 23. Simple bar graph, but it’s much easier to extract knowledge from it
  • 24. Unless your data is periodic, don’t put your data in a periodic table • Chronological timeline • Family tree • Influence of different controllers • Meaningful context
  • 25. Encoding Well. “Position is everything. Colour is hard.” / Moritz Stefaner
  • 26. Position is Everything
  • 27. Position is Everything
  • 28. Colour is Difficult. Colour can be used effectively in information display • Naturally codes attributes of objects • Not naturally ordered in our brain. Excellent for labelling and categorization • Works well for heat maps/temperature and categorization. Poor for displaying shape, rank, order, detail or space • Not effective for quantitative data
  • 29. Colour is Difficult
  • 30. Retinal Properties o Jacques Bertin identified that every visualization is made up of basic components o Each component has different expressive power o Each works best only in some conditions o 6 basic variables: size, value, texture, colour, orientation, shape o Jock Mackinlay applied these same principles to automatically construct visualizations out of data. Left caption: four dimensions of data shown effectively in a traditional scatter plot generated by computers. Right caption: diagram shows how each visual component works best in each case and how to use them.
  • 31. Appropriate Encodings: http://complexdiagrams.com/properties
  • 32. Fabien Girardin, Near Future Laboratory: SKETCHING WITH DATA
  • 33. Napoleon: “Un bon croquis vaut mieux qu’un long discours.” (“A good sketch is better than a long speech.”)
  • 34. Network Data
  • 35. Urban Demos. ‘Urban demos’ reveal how the city lives through its data. The City of Geneva visualized digital traces created from cellular network activity. They reflect mobility in a city or a street and reveal insights about a city that are of importance from an economic and political perspective.
  • 36. Digital Traces
  • 37. Process
  • 38. Innovate With Data - Sketch
  • 39. Sketching With Data. Sketch: to think, to make an idea tangible (and observe its different dimensions and implications), to tell stories, to share discoveries. A rough version of a creative work, made to assist in reaching a coherent result. Key values of sketching: • share a common language • qualify results • explore ideas
  • 40. Sketch To Share A Common Language. Sets a common language among the different actors of the project for how they understand the data and how the data can be used. Project: explore novel services for mobile phone operators using aggregated cellular network activity. Network Engineer’s view of the data; Product Manager’s view of the same data.
  • 41. Sketch To Share A Common Language. This is an early sketch to show the data they were trying to transform, which reveals the quality of the data for measuring mobility and density of activity on the network.
  • 42. Sketch To Qualify Results. Project: controlling hyper-congestion at le Louvre to create an enjoyable visiting experience. Hyper-congestion refers to the situation in which the number of visitors in a space negatively affects the quality of their visiting experience and their security.
  • 43. Sketch To Qualify Results o Used a network of sensors over 10 days around critical areas to collect empirical data on flows and densities of visitors in key areas o Measured occupancy levels, visiting times, and centrality of trails o Field experts (security guards) helped contextualize data and early results through sketches o These results can influence the remodeling of areas and the deployment of information kiosks, and help evaluate strategies and policies to control hyper-congestion
  • 44. Defining Measures of Hyper-Congestion • Measures provided insights and revealed symptoms of hyper-congestion, but were insufficient to describe the cause of the issue • how to qualify how people walk, etc. • Sketches were produced after each data collection period: visualized information about visiting sequences, travel times, and how long visitors stayed in each room • Used sketches to discuss with people in the field, who provided qualitative evidence to contextualize and qualify results and explain detected irregularities
  • 45. Defining Measures of Hyper-Congestion. Network data tells A story, not THE story.
  • 46. Sketch To Explore Ideas. Project: explore the role of a retail bank, BBVA, in smart cities in the near future. Explored opportunities for innovative services to exploit data in the domains of distribution strategies, audience profiling and social navigation.
  • 47. Sketch To Explore Ideas. Created multiple prototypes to explore opportunities for innovative BBVA internal and external services. Project participants were able to explore and interrogate the data from multiple perspectives. Use of the dashboard helped participants develop specific scenarios involving services and products that a bank could take advantage of.
  • 48. Interactive Sketching Tool: Quadrigram. Data manipulation and visualization environment using a visual programming language. Modular, node-based interface for designing data flows, linking data resources to operators, controls and visualizations. WYSIWYG interface designed for iterative exploration and explanation, allowing us to generate new questions and provide answers with data.
  • 49. Access, Manipulate, Analyze and Visualize. Real-time traffic information. Five representations of a single data set: 1. Table visualizer (rows & columns) 2. Network visualization to see relationships between points 3. Geodata to view points on a map to see context 4. Data in real time visualizes traffic moving at different velocities 5. Temporal data
  • 50. Access, Manipulate, Analyze and Visualize. Data as living material.
  • 51. OPEN DATA & DATA JOURNALISM
  • 52. Data Journalism • Data is changing journalism in several ways • New ways of visualizing complexity • Provide real answers, based on evidence rather than assertion • Democratization of tools and data platforms to help people understand information and share stories • Bigger datasets about really small things o Allows you to search data o Make complex maps really quickly • Crowdsourcing o Aggregated input from the public is powerful for disaster response o Accurately depicts dynamic situations • Open data means open data journalism • Governments are increasingly publishing their data repositories for other people to access and use
  • 53. Japanese Geiger Maps. Using Pachube to aggregate Geiger counter readings from various data sources • Geiger counter readings for the tsunami/Fukushima facility • Government was releasing information only once per day in PDF format: only numerals, nothing about what they mean • Pachube community created tutorials, then collected and aggregated measurements from various sources and hooked them up to the web • Suddenly 2000 feeds/minute across Japan • People took data and built applications to represent data in terms of health consequences and change from background radiation. http://japan.failedrobot.com/
  • 54. Winds of Fukushima. Android app: took your geolocation, wind direction and nearby radiation monitors to infer where radiation may peak next.
  • 55. After the tsunami and earthquakes, Toyota and Honda shared their data to map out usable roads
  • 56. Crowdsourcing Datasets • Understand trends in the data set • Help find anomalies • People measured things that might not be measured by the official network • Public visibility and accountability: get people with different domain expertise to talk about the data
  • 57. Simon Rogers, Guardian: THE CRAFT OF DATA JOURNALISM
  • 58. Behind the Scenes at the Guardian Datablog. Datablog started off as a small blog offering full datasets behind their stories and now publishes hundreds of raw datasets, data visualizations and data analyses. Process o Locate the data or receive it from various sources (e.g. breaking news stories, government data, journalists’ research) o Examine the data: transform for quality/purpose, tidy up, consolidate o Perform calculations and statistical inquiries to see whether there is a story o Output a story, graphic or visualization • Excel/Google charts for small line graphs and pies • Google Fusion Tables for maps • Internal dev team produces the more sophisticated graphics
  • 59. The First Guardian Data Journalism: May 5, 1821 • Contained a table of data: a list of schools in Manchester and Salford, with the number of students at each and the average annual spending, i.e. how many pupils received free education and how many poor children there were in the city • Official statistics were collected by only 4 clergymen, which resulted in inaccurate and faulty data • Leaked by a source identified as “NH”, the data caused a huge sensation • Revealed that 25 000 children were receiving free education instead of the 8 000 that was officially estimated • Using data to show the true state of affairs to help fight for a decent education system
  • 60. Public spending by the UK’s central government departments 2010-2011
  • 61.
  • 62. Becoming Data Providers
  • 63. Exploring the Data. 170 spreadsheets of government spending data. Guardian created a spending data explorer application, designed to make it easier for people to search and download key data. Simple analysis has already been done: combined spending for each department into single spreadsheets.
  • 64. Wikileaks Afghanistan War Logs. Wikileaks log of every IED attack with co-ordinates from 2004-2009. Soldiers are good at entering data: locations of where soldiers died in Afghanistan, including date, what happened, number of casualties, and summaries.
  • 65. Bigger Datasets Of Smaller Things: Every IED attack from 2004-2009
  • 66. Crowdsourcing Experiment: MP Expense Scandal • Big release of MPs’ documented expense claims – 458,000+ documents • The Guardian developed a crowdsourcing application in 5 days • Within 10 minutes of the launch, 323 people were using the application to go through the documents • In the first half hour, more than 2000 pages had been reviewed • Each receipt filed by an MP was converted into an image for the public to review • Users reviewing were asked to determine and detail what entries there were on a page and flag them as unimportant, interesting, “interesting but known” or worthy of investigation. http://mps-expenses.guardian.co.uk/
  • 67.
  • 68. What Was Revealed… • Douglas Hogg, Conservative MP for Sleaford and North Hykeham, charged £2,115 to have the moat cleared at his Lincolnshire estate and claimed bills for a "mole man". • Sir Peter Viggers, Tory MP for Gosport, claimed £1,645 for a floating "duck island" in the garden of his Hampshire home as part of £32,000 of gardening expenses over three years. • Jacqui Smith, the former home secretary, claimed £10 for two adult films which were accessed by her husband at her constituency home. • Tony Blair claimed almost £7000 for roof repairs two days before leaving office and standing down as MP.
  • 69. London Riots. Instant data journalism: filling the hole of knowledge for anyone wanting to know what was happening where • Collected key reported incidents from as many sources as possible • Compiled a list of every incident where there was a verified report, then mapped it with Google Fusion Tables • Allowed people to download the data behind it – possibly the simplest but most popular thing they did
  • 70. Reading the Riots o Project took a look at the riots as experienced by those who were there o A specially recruited team interviewed around 270 people about the riots and why they had been involved
  • 71. England Riots: Was Poverty A Factor?
  • 72. ‘Riot Commute’ • Data from 1,100 individuals’ magistrates’ court records that included postcodes for defendants’ home and offence locations • 70% of those accused of riot-related crimes travelled from outside their area • Riots occurred in the city centre, but accused rioters lived in outlying districts • Travelled an average of 2.2 miles from home to the riot offence site • Transport mapping specialists modelled the most likely routes from home to offence
  • 73. How Riot Rumours Spread On Twitter • Many people, including the PM and acting head of the Metropolitan police, blamed Twitter for spreading the disorder • Analysis of 2.6 million riot-related tweets suggested a different conclusion: the network was able to collectively dispel and clarify false information • Picked a subset of more than 10 000 tweets concerning 7 key rumours that emerged during the riots
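
The two headline numbers on the ‘Riot Commute’ slide (the share of defendants who travelled from outside their area, and the average distance from home to offence site) are simple aggregates once each postcode has been geocoded. The following is only a minimal sketch under that assumption: the record fields, the sample coordinates and the helper names `haversine_miles` and `commute_summary` are hypothetical and are not the Guardian's data or method. It uses straight-line (great-circle) distance, whereas the slide mentions modelled travel routes.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, used for great-circle distance


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def commute_summary(records):
    """Return (mean home-to-offence distance in miles, share of records from outside the area)."""
    distances = [haversine_miles(*r["home"], *r["offence"]) for r in records]
    outside = sum(1 for r in records if r["home_area"] != r["offence_area"])
    return sum(distances) / len(distances), outside / len(records)


# Made-up records for illustration only (not real court data).
records = [
    {"home": (51.5500, -0.1000), "offence": (51.5150, -0.1000),
     "home_area": "A", "offence_area": "B"},
    {"home": (51.5150, -0.1000), "offence": (51.5150, -0.1000),
     "home_area": "B", "offence_area": "B"},
]

mean_miles, share_outside = commute_summary(records)
print(f"mean distance: {mean_miles:.2f} miles, from outside area: {share_outside:.0%}")
```

What counts as "outside their area" is a definitional choice; here it is reduced to comparing two area labels per record.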

Editor's notes

  1. Strata in Santa Clara, California has gathered over 2,000 developers, journalists and data scientists in one place to discuss data - big and small - at what has become the data event of the year. Oh and we're there too. See where the data enthusiasts came from, what they want to talk about - and how much data they process
  2. Locate untapped sources. Refine data rather than just selling it. For instance, the analysis of georeferenced photos you have seen previously has led to the production of a new layer of information for navigation systems. Research Challenge on Visualization: http://www.w3.org/2012/06/pmod/visualization.pdf. Introduction and definition: As the Google CEO Eric Schmidt pointed out in 2010, as much information is now created in the world every two days as was created from the appearance of man until 2003. This is due to the explosion in computing techniques, which led to the generation of a tremendous amount of data that is stored on the internet and processed in IT systems all over the world. In fact, as predicted by CISCO, by 2015 the annual global IP traffic will reach 966 Exabytes (10^18 bytes), nearly a Zettabyte (10^21 bytes), increasing fourfold from about 900 Petabytes (10^15 bytes) back in 2000 and around 2,500 Petabytes in 2010. But data is not only stored on the internet; it is also held in an exponentially increasing number of IT infrastructures.
  3. Materialize data into new services or into new ‘data products’. Some examples of new technologies for data collection are: web logs; RFID; sensor networks; social networks; social data (due to the social data revolution); Internet text and documents; Internet search indexing; call detail records; astronomy, atmospheric science, genomics, biogeochemical and biological data; military surveillance; medical records; photography archives; video archives; large-scale eCommerce. To manage this huge amount of data in human-computer interaction, there is a need to distil the most important information and present it in a humanly understandable and comprehensive way. This is where visualisation comes in: it is a way to interpret and translate data from computer-understandable formats to human ones by employing graphical models, charts, graphs and other images that are conventional for humans. In a sense we can define visualisation as any technique for creating images, diagrams, or animations to communicate a message or an idea. Since the beginning of human history, visualisation has been an effective way to communicate both abstract and concrete ideas. ------------------------ http://www.livework.co.uk/articles/data-is-the-new-oil-part-1-business-information Data, whilst valuable, is a commodity. This is where the process of refinement comes in. We need to refine the data into services. And these services need to meet the needs and issues of the businesses that information providers hope to sell to. Data owners need to think about how to use their data to help fix their customers’ challenges rather than focusing on the number of data sets they can sell. We use information about location, weather, traffic conditions in ways that help us make decisions and fit well into our lives. We all know that information can be live, dynamic and personal to our life context.
If data providers do not adopt this kind of Service Thinking then they will be superseded by more agile providers or by Google themselves. The opportunity is there for information businesses to significantly add value to their data assets by treating the provision of information as a service. --------------------------- http://ana.blogs.com/maestros/2006/11/data_is_the_new.html “Data is just like crude. It’s valuable, but if unrefined it cannot really be used. It has to be changed into gas, plastic, chemicals, etc., to create a valuable entity that drives profitable activity; so must data be broken down, analyzed for it to have value.” --------------------------- http://www.forbes.com/sites/perryrotella/2012/04/02/is-data-the-new-oil/ According to IBM, the digital universe will grow to eight zettabytes by 2015. The real impetus is the potential insights we can derive from this new, vast, and growing natural resource. If data is the next big thing, then companies need to think about a new business model that exploits this valuable resource.
  4. Big data is data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn’t fit the strictures of your database architectures. To gain value from this data, you must choose an alternative way to process it. Today’s commodity hardware, cloud architectures and open source software bring big data processing into the reach of the less well-resourced. Big data processing is eminently feasible for even the small garage startups, who can cheaply rent server time in the cloud. The value of big data to an organization falls into two categories: analytical use, and enabling new products. Big data analytics can reveal insights hidden previously by data too costly to process, such as peer influence among customers, revealed by analyzing shoppers’ transactions, social and geographical data. Being able to process every item of data in reasonable time removes the troublesome need for sampling and promotes an investigative approach to data, in contrast to the somewhat static nature of running predetermined reports.
  5. Noah Iliinsky’s Designing Data Visualizations. Author of Beautiful Visualization and O’Reilly’s Designing Data Visualizations. Noah Iliinsky, of Complex Diagrams and Designing Data Visualizations, takes our focus from the clear and factual to good storytelling. While data has its properties that need to be honored, he places equal emphasis on knowing your audience and being able to state exactly what it is you want to convey. In terms of design advice, Iliinsky is slightly less explicit about established rules. He borrows a quote from Moritz Stefaner, that "position is everything, color is difficult." No one wants to see arbitrarily chosen, confusing color schemes, but that is no reason to shy away from color completely. Jock Mackinlay’s The Science of Visualization (Tableau)
  6. Goal: make important information pop out so that it is presented effectively. Take advantage of the human visual system and its ability to compare.
  7. http://www.infovis-wiki.net/index.php/Preattentive_processing http://www.csc.ncsu.edu/faculty/healey/PP/index.html
  8. “The representation and presentation of data that exploits our visual perception abilities in order to amplify cognition” http://complexdiagrams.com/2009/03/tire-chart/ Toughness axis (vertical) isn’t well-defined/ordered: “burly” vs “svelte” gives an idea but is intentionally ambiguous (loose categorical grouping). Rim sizes are preattentively differentiable. Price and special features are not included at this level of use. Other ideas: filter by rim size (and price), use icons, reduce grid lines (nominal categories?)
  9. Hans Rosling: TEDTalks “Myths about the developing world” (2006)
  10. When you don’t yet have a story to tell. Each color corresponds to a different group within the professional network, which can be labeled by the user. The graph should allow users to recognize connections that share mutual contacts, or identify areas that might be underrepresented. Zoomable interface. Select a node to see highlighted nodes that are mutual connections.
  11. http://qph.cf.quoracdn.net/main-qimg-40df8574b885918dde4c2496025a323f Use visuals to think. Experience is active and involves people trying to answer questions. Task: “question answering”
  12. Visual properties don’t help us compare the share of each client
  13. Use defaults: timelines for timeseries, maps for geographic data
  14.  This just takes technology and pours it into a periodic table-shaped box. Timelines are great — it’s a really powerful axis, that time axis, because you can see where there are clumps and trends. Pour it into a box like [the periodic table] and you get none of that.
  15. Timeline is obvious. Placement is key. See departure and arrival times and flight duration in relation to one another. Time bar across the top has both time zones listed. Sort order (ranked): “agony” filter? “Agony” is a combination of price, time of day, number of stopovers. That’s the one you want! That’s really smart.
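The “agony” ranking described in note 15 can be sketched as a single composite score used as a sort key. This is only an illustration: the note names the three ingredients (price, time of day, number of stopovers) but not how they are combined, so the weights, the "unsociable hours" rule and the sample flights below are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Flight:
    name: str
    price: float         # ticket price
    departure_hour: int  # local departure hour, 0-23
    stops: int           # number of stopovers


def agony(flight, off_hours_cost=60.0, stop_cost=50.0):
    """Hypothetical composite score: lower means a less painful trip."""
    # Assumption: departures before 08:00 or from 20:00 onwards are unpleasant.
    off_hours = 0 if 8 <= flight.departure_hour < 20 else 1
    return flight.price + off_hours_cost * off_hours + stop_cost * flight.stops


# Made-up flights, purely for illustration.
flights = [
    Flight("A", price=300.0, departure_hour=10, stops=0),
    Flight("B", price=220.0, departure_hour=5, stops=2),
    Flight("C", price=260.0, departure_hour=14, stops=1),
]

# Rank by agony rather than by price alone.
ranked_names = [f.name for f in sorted(flights, key=agony)]
print(ranked_names)
```

With these assumed weights the cheapest flight (B) ranks last, which is the point of sorting on a combined score instead of a single column.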
  16. Axes give you information for free: about targets, when searching (think grouping)
  17. The top image is an example of poor use of colour to represent sea elevation and land topology. The hues have no natural order and simply disrupt the reading. The bottom map uses natural colours (blue for ocean and brown for land). It shows ordering and depth/height using varied levels of saturation and luminance.
  18. Color is meaningful
  19. http://fellinlovewithdata.com/guides/the-hidden-legacy-of-bertin-and-the-semiology-of-graphics http://mkt.tableausoftware.com/downloads/designing-great-visualizations.pdf
  20. What does data tell us about ourselves and the places (cities, streets, buildings) we live in? A researcher and engineer in the domains of user experience and data science. Investigates the interplay between people and data.
  21. “A good sketch is better than a long speech.”
  22. We have been focusing on a specific type of data that we call ‘network data’. Network data are the byproducts of our interactions with digital infrastructures, as nicely animated here by our friend Timo Arnall in his project ‘Wireless in the world’ (http://www.nearfield.org/2010/06/new-film-wireless-in-the-world-2). Practically, we have been materializing information from pretty much anything that is networked in our cities: cellphones, cars, shared bikes, digital cameras, credit cards, ... Video: making invisible wireless technologies visible, in order to better understand and communicate with and about them. Here we are creating communicative material that uses dashed-line abstractions to visualise the presence of wireless technologies in the everyday environment. What if we could see every field produced by an Oyster card or NFC-enabled mobile phone, for instance?
  24. http://villevivante.ch/ Based on this conclusion the City of Geneva decided to take up the challenge of visualizing these digital traces created by our mobile phones. The objective of this installation is to make this data visible and allow you to explore these streams of connected people around the city, in their everyday life.
  25. Cumulative activity of the city per hour and per day. Size and brightness indicate aggregate activity at that hour. ----------------------------- Every mobile phone permanently leaves digital traces while interacting with the mobile infrastructure. Geneva generates approximately 15 million connections from 2 million phone calls per day. These 'digital traces' offer new insights about the city, which are of great interest from both an economic and a political perspective: an innovation opportunity for new citizen services like traffic jam detectors or nightlife buzz indicators; public administration can evaluate urban planning strategies; businesses can gain insights into how popular certain districts are and during what time periods; and the traces reveal information that is invisible in traditional visualization techniques such as cartography.
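The per-hour, per-day aggregation behind note 25 can be sketched as a simple count of connections in each (weekday, hour) bucket, scaled to a 0-1 value that could drive marker size or brightness. The timestamp format and the sample records are assumptions; the notes do not say how the Geneva data was actually stored or processed.

```python
from collections import Counter
from datetime import datetime

# Hypothetical connection log: one ISO timestamp per network connection.
connections = [
    "2012-02-27T08:15:00",
    "2012-02-27T08:40:00",
    "2012-02-27T18:05:00",
    "2012-02-28T08:30:00",
]


def activity_by_day_hour(timestamps):
    """Count connections per (weekday, hour); weekday 0 is Monday."""
    counts = Counter()
    for ts in timestamps:
        t = datetime.fromisoformat(ts)
        counts[(t.weekday(), t.hour)] += 1
    return counts


def relative_intensity(counts):
    """Scale counts to 0..1 so the busiest bucket gets the largest marker."""
    busiest = max(counts.values())
    return {bucket: n / busiest for bucket, n in counts.items()}


counts = activity_by_day_hour(connections)
intensity = relative_intensity(counts)
print(intensity)
```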
  26. The process of innovating with (network) data demands several clear steps, each with its own set of questions and answers: from data access and collection techniques, which feed data to obfuscation algorithms and big data management systems, which are in turn interrogated by basic data mining operations or advanced statistical inquiries. Information visualization techniques are then used to build evidence and indicators that are used to interrogate the data further. Innovate with data: iterate through process, métiers, sketch, sketch and sketch. The process involves multiple practices and skills, from engineering to statistics, design, strategy planning, product management and law.
  27. Sketches with the data at hand at each step. We use these sketches to answer some questions, which in turn generate new interrogations for the next phase.
  28. Sketching is not a new practice in creative activity. Sketching has been widely used to innovate in drawing, painting and architecture, all domains related to visualization and communication. For instance, Le Corbusier, who changed the face of architecture, was famous for sketching while presenting his projects and ideas: “Through visual artifacts, architects can transform, manipulate, and develop architectural concepts in anticipation of future construction. It may, in fact, be through this alteration that architectural ideas find form”
  29. The project gathered multiple practices, from a network engineer who helped access the data to a product manager who had to transform insights into product scenarios. Engineer's view of the data: a network of cells that distribute phone conversations. Product manager's view: sees the data through customers and their interactions. The ability to quickly sketch an interactive system is a way to develop a common language amongst varied stakeholders. It allows them to focus on tangible opportunities for products or services that are hidden within their data.
  30. Produced a sketch to show the data we were trying to transform, for instance revealing the quality of the data for measuring mobility and the type of information that could be extracted (here, mobility and density of activity on the network).
  31. In this project, we first helped the Louvre formulate its needs for measuring occupancy levels and flows. We created an inventory of the available datasets, both internal and external, in partnership with sensor network providers. We then considered the complementarity of the information to define indicators that help facility managers, museologists and architects evaluate their strategies. We helped them design novel strategies to control hyper-congestion and ensure a good visiting experience.
  32. So far, administrators of the museum only had a partial understanding of the problem based on observations and surveys. Used BitCarrier to collect empirical data on flows and densities of visitors in key areas. Based on the measures of occupancy levels, visiting times, and centrality of trails, we developed a solution that measures the influence of hyper-congestion on the visiting experience in the most popular rooms of the museum. These results can influence the remodeling of areas and the deployment of information kiosks and help evaluate strategies and policies to control hyper-congestion.
  33. Limitations of quants: how to qualify how people walk, etc. Doors were closed because the crowds became too large. So we used our sketches to confront our measures and indicators with people in the field. Their *qualitative evidence* helped contextualize and qualify the early results as well as explain the detected irregularities. This qualitative view reinforced the quantitative observations and consolidated the overall knowledge on hyper-congestion. In other words, network data tell a story, not THE story.
  34. Limitations of quants: how to qualify how people walk, etc. Doors were closed because the crowds became too large. People in the field have the experience to help contextualize the data and early results. So we used our sketches to confront our measures and indicators with people in the field. Their *qualitative evidence* helped contextualize and qualify the early results as well as explain the detected irregularities. This qualitative view reinforced the quantitative observations and consolidated the overall knowledge on hyper-congestion. In other words, network data tell a story, not THE story.
  35. Explore new roles for banks in the smart cities of the near future: need. We used maps (see examples) and an interactive proof of concept to provoke the exploration of opportunities for innovative BBVA internal and external services. This investigation process led us to co-create opportunities to exploit data in the domains of distribution strategies, audience profiling and social navigation.
  36. New perspectives for innovative services. This investigation process led us to co-create opportunities to exploit data in the domains of distribution strategies, audience profiling and social navigation. As part of our consulting work, we sketched a pretty advanced dashboard for participants of the project to explore and interrogate their data with fresh perspectives (here, a mix of social network and credit card activity in Madrid). The use of the dashboard helped the participants craft and tune indicators that qualify the space (e.g. the streets of a city) based on its business activity. This experience was used to develop specific scenarios involving services and products that a bank could take advantage of. The multiple perspectives extracted from the use of exploratory data visualizations are crucial to quickly answer some basic questions and provoke many better ones, which generate new interrogations for the next phase.
  37. Quadrigram is an online platform with a visual programming language that can be used to gather data and generate meaning through data processing and information visualization. It has a modular interface for designing information flows, linking data resources to operators, controls and visualization methods within a node-based GUI that displays the structure of your process. These modules form a data flow when you link them together. Each time you modify a module, the update is propagated throughout the flow. Access, manipulate, analyze and visualize. Freely explore multiple dimensions of a single dataset, each time generating a set of questions and answers. Additionally, such tools reduce the prototyping time necessary to sketch interactive visualizations that allow the different stakeholders of an organization to take an active part in the design of services or products.
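The idea in note 37 that linked modules form a data flow, and that a change to one module propagates downstream, can be sketched in a few lines. This is a generic illustration of push-based propagation, not Quadrigram's actual implementation or API; the `Module` class and the example modules are hypothetical.

```python
class Module:
    """A node in a tiny data flow: recomputes from its inputs and notifies
    everything linked downstream. Hypothetical, for illustration only."""

    def __init__(self, name, func=None, inputs=()):
        self.name = name
        self.func = func            # None for a pure data source
        self.inputs = list(inputs)
        self.downstream = []
        self.value = None
        for upstream in self.inputs:
            upstream.downstream.append(self)  # linking forms the flow

    def set_value(self, value):
        """Modify a source module and push the change through the flow."""
        self.value = value
        self._notify()

    def refresh(self):
        self.value = self.func(*(m.value for m in self.inputs))
        self._notify()

    def _notify(self):
        for module in self.downstream:
            module.refresh()


# data source -> operator -> "visualization" input
readings = Module("readings")
ordered = Module("sort", sorted, [readings])
top_two = Module("top two", lambda xs: xs[-2:], [ordered])

readings.set_value([3, 1, 2])
first = top_two.value
readings.set_value([10, 4, 7, 1])  # one change updates every linked module
second = top_two.value
print(first, second)
```

A real tool would also need to handle modules with several upstream paths without recomputing them twice; this sketch keeps to a simple chain.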
  38. Real-time traffic information: their sensor networks measure the quantity and speed of the traffic in key areas of a city. Exploratory data analysis approach to create an interactive application. Five representations of a single data set: table visualizer (rows and columns); network visualization to see relationships between points; geodata to view points on a map in context, including the trajectory of traffic in a single slice of time; data in real time, with incoming up-to-the-second data showing the motion of traffic between points moving at different velocities (data as a living material); temporal data: temperature data.
  40. This example shows how multiple interrelated perspectives on the same data (temporal bar charts, quadrifications, maps, and scatter plots) can create a powerful tool that permits us to explore the activities of a company by project, sector, location, and profitability. This application collects and analyses the sentiment expressed in real time on Twitter. The results show the positive and negative polarities with respect to a word you define. So, we have seen that our world produces a new type of data, network data, that is now treated as a material. There are both processes and tools that help innovate with this evolution. From our experience, there is value in sketching with data, in the same way as strategists, innovators and world changers have used sketches in the past.
  41. Visualization is one of the most advanced fields in policy modeling, being able to foster the design of more effective and efficient policies, as well as to make sense of large datasets, such as those provided as open government data. In fact the huge increase in data availability is also due to the so called "open data" movement, characterized by the fact that all across Europe and the US, governments are increasingly publishing their data repositories for other people to access and use it.
  42. This map visualizes crowd-sourced radiation Geiger counter readings from across Japan. Click on the labels to get more information on the source of each reading. The number of locations fluctuates due to the validity of the data feeds. There are approximately 185 feeds from the official Japanese government source MEXT and the rest are from other sources such as the Tokyo hackspace, universities, local councils and concerned individuals.
  43. http://cpstiers.opencityapps.org/
  44. Simon Rogers is editor of The Guardian Data Blog (www.guardian.co.uk/data, @datastore) an online data resource which publishes hundreds of raw datasets and encourages its users to visualise and analyse them. He is also a news editor on the Guardian, working with the graphics team to visualise and interpret huge datasets.
  45. Simon Rogers is editor of The Guardian Data Blog (www.guardian.co.uk/data, @datastore) an online data resource which publishes hundreds of raw datasets and encourages its users to visualise and analyse them. He is also a news editor on the Guardian, working with the graphics team to visualise and interpret huge datasets. Manually pick out data from PDF to extract specific information.
  46. The tools we have to analyse the data may have changed; that motivation has stayed exactly the same. How all the spending fits together: see department cuts and which programmes received big increases (nuclear, defence). Most comprehensive atlas of public spending available. Every year, every government department publishes an annual report which includes breakdowns of spending. Manually pick out data from PDF to extract specific information.
  47. The data itself covers over 194,000 individual transactions, payments to suppliers and bills covered by government departments in the first five months of the life of the Coalition. There's lots excluded, though: the NHS, benefit payments, spending by quangos, information removed for "national security" and personally confidential reports. It's about £80bn of an annual spend of £670bn. We figured 170 spreadsheets is too much for most people to browse, so Guardian lead software architect Matthew Wall has built this useful spending data explorer app. It's designed to make it easier for you to search and download the key data you're interested in. We may even have done some of the analysis you're looking for already. We've combined spending for each department into single spreadsheets. Here's what you can find: • Sheet 1: Every item for the department • Sheet 2: Detailed breakdown of type of spending • Sheet 3: Broader breakdown into fewer areas • Sheet 4: Every supplier listed in alphabetical order and by size (watch out on this one for different spellings of the same supplier)
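The warning in note 47 about different spellings of the same supplier suggests normalising names before totalling payments. The following is a minimal sketch under assumed rules (lower-casing, stripping punctuation and company suffixes); the supplier names and amounts are made up, and it is not the Guardian's actual cleaning process.

```python
import re
from collections import defaultdict


def normalise(name):
    """Reduce a supplier name to a comparison key (assumed rules only)."""
    key = name.lower()
    key = re.sub(r"[^a-z0-9 ]", " ", key)              # drop punctuation
    key = re.sub(r"\b(ltd|limited|plc)\b", " ", key)   # drop company suffixes
    return " ".join(key.split())                       # collapse whitespace


def total_by_supplier(rows):
    """Sum payment amounts per normalised supplier name."""
    totals = defaultdict(float)
    for supplier, amount in rows:
        totals[normalise(supplier)] += amount
    return dict(totals)


# Hypothetical rows: (supplier as spelled in the spreadsheet, amount in pounds)
rows = [
    ("Acme Ltd", 1000.0),
    ("ACME Limited", 250.0),
    ("Acme Ltd.", 50.0),
    ("Bolt PLC", 400.0),
]

totals = total_by_supplier(rows)
print(totals)
```

Simple rules like these will not catch genuine misspellings, so any merged totals would still need a manual check.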
  48. Soldiers are good at entering data – locations where soldiers died in Afghanistan (date, what happened, number of casualties, summaries)
  49. Interactive map displaying the region using WikiLeaks war log data. WikiLeaks: every IED attack, with co-ordinates, 2004-2009
  50. Made it more interesting and rewarding for people: asked people to do smaller tasks with a reduced amount of data. “Zooniverse”: citizen science project to transcribe documents, visually classify images, categorize, etc. Added recognition for users: keep track of task assignments, see progress. Reward for work: identification from journalists / editorial feedback. Allowed users to skip over uninteresting docs, which led to users reviewing more docs on average. Ability to view data about your own MP.
  51. BlackoutGate: massive cover-up of MPs’ expenses after the Commons authorities released hundreds of thousands of claims documents and receipts with huge sections of detail blacked out. Belief that publication would be in breach of the Data Protection Act. http://www.guardian.co.uk/politics/2009/jun/18/mps-expenses-censorship-black-out
  52. http://storify.com/smfrogers/making-a-map-together http://www.guardian.co.uk/uk/datablog/2012/apr/12/deprivation-poverty-london
  53. First time such a major attempt had been made to forensically examine the motivations behind a riot since the work in Detroit in 1967. Gathered qualitative data from the interviews and quantitative responses to a set of questions. UK riots: every verified incident. Collected key reported incidents from as many sources as possible. Raw data in Google spreadsheets: approx. time, date, place, location details, local authority, what happened, source. Mapped with Google Fusion Tables.
  54. England riots: suspects mapped and poverty mapped