StatMine – prototype
Exploring official statistics
Martijn Tennekes, Edwin de Jonge, Jan van der Laan & Jessica
Statistics Netherlands (CBS)
Visweek 2013
StatMine, statistical goldmine
Edwin de Jonge (@edwindjonge)
Jan van der Laan, Jessica Solcer
Statistics Netherlands / CBS
Dutch Information Visualisation Event 2014, June 19, 2014
StatMine 0.2
Statistics Netherlands / CBS
- Creates and publishes official statistics on economics,
demographics, health care and others.
- Since 1899
- Website: www.cbs.nl
- Online DB: http://statline.cbs.nl (since 1997)
Why StatMine?
– Online StatLine contains more than one billion (10^9) facts
‐ Policy makers
‐ Journalists
‐ Citizens
‐ Enterprises
‐ Economists
‐ Social scientists
‐ Historians
‐ etc.
Problem 1
Numbers ≠ Information
1. Numbers ≠ Information
We know from a user study that:
1. Many interesting patterns in StatLine are not spotted by
users
2. Many important topics in StatLine are scattered across
multiple tables
H1: Data analysis = Data insight
H1. Data insight
Goal of StatMine 0.1 was to provide more insight into StatLine numbers by
• Presenting these facts visually and interactively
• We tested this successfully on 4 “difficult” StatLine tables.
Bar chart
- compare
Line chart
- development
Bubble/scatter chart
- correlation
Mosaic chart
- structure
an exploration of dissemination data: StatMine
Chart type – bar chart
Small multiples?
Demo
StatMine 0.1 Results
Tested on 25 users:
Findings:
- Test persons think that visualizing data adds
value (small multiples)
- Data owners look at their data differently
- They want this tool to check their data before
publication.
Problem 2:
Fragmented Information
2. Fragmented information
Most information in StatLine is fragmented:
‐ Energy consumption in relation to economic growth
‐ Perceived public safety in relation to registered crime
– Users currently need to look into multiple tables and combine the information by hand.
2. Merge data!
H2. Table joining
Goal of StatMine 0.2: create more insight by:
- Letting users combine tables
- Condition: the tables share at least one column/data dimension.
- Tested on a small set of tables.
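The joining condition above can be sketched in a few lines of Python. This is only an illustration of joining two tables on a shared dimension, not StatMine's actual implementation; the function `join_tables`, the column names, and all values are hypothetical and made up for the example.

```python
from collections import defaultdict


def join_tables(left, right, shared):
    """Inner-join two lists of row dicts on the dimension columns in `shared`."""
    # Index the right-hand table by its values on the shared dimensions.
    index = defaultdict(list)
    for row in right:
        index[tuple(row[col] for col in shared)].append(row)

    joined = []
    for row in left:
        key = tuple(row[col] for col in shared)
        # Only rows whose shared-dimension values occur in both tables survive.
        for match in index.get(key, []):
            joined.append({**row, **match})
    return joined


# Made-up illustrative values, not real StatLine figures.
energy = [
    {"year": 2010, "energy_pj": 100.0},
    {"year": 2011, "energy_pj": 98.0},
    {"year": 2012, "energy_pj": 97.0},
]
growth = [
    {"year": 2011, "gdp_growth_pct": 1.5},
    {"year": 2012, "gdp_growth_pct": -1.0},
    {"year": 2013, "gdp_growth_pct": -0.5},
]

combined = join_tables(energy, growth, shared=["year"])
for row in combined:
    print(row)
```

Only the years present in both tables remain, which is why the tables need at least one dimension in common before they can be shown in one chart.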
StatMine 0.2 Results
Test persons: 20 internal, 40 external (policy
makers, journalists).
Findings:
- External users are enthusiastic about the visual possibilities of StatMine
- Joining of data fills a user need.
Problem 3
Statistical numbers are uncertain
H3. Confidence intervals
– All facts from Statistics Netherlands have a confidence interval
– European Statistics Code of Practice (12.2):
‐ “sampling and non sampling errors should be
systematically documented”
Goal of StatMine 0.3:
Investigate how uncertainty in numbers can be presented understandably to users.
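As a rough illustration of what a chart needs in order to show uncertainty, the sketch below turns point estimates into lower and upper bounds. It assumes a normal approximation at the 95% level (z = 1.96) and a known standard error per estimate; the slides do not say how the intervals are actually computed, and the function name and data are hypothetical.

```python
def confidence_interval(estimate, standard_error, z=1.96):
    """Return (lower, upper) for a normal-approximation interval.

    z=1.96 corresponds to a 95% level; this is an assumption of the sketch.
    """
    margin = z * standard_error
    return estimate - margin, estimate + margin


# Hypothetical series: (year, point estimate, standard error).
series = [(2011, 50.0, 2.0), (2012, 52.0, 2.5), (2013, 49.0, 1.0)]

bands = []
for year, estimate, se in series:
    lower, upper = confidence_interval(estimate, se)
    # Each entry holds what a line chart band or bar chart error bar would draw.
    bands.append((year, lower, estimate, upper))
    print(f"{year}: {estimate:.1f} [{lower:.2f}, {upper:.2f}]")
```

The resulting `bands` list carries the three values per period (lower bound, estimate, upper bound) that a band around a line or an error bar on a bar would be drawn from.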
StatMine 0.3
Restricted to:
‐ How do users interpret CIs? And how does that affect the interpretation of facts?
‐ Do users need CIs?
Assumption:
‐ A test data set of point estimates with CIs is available
User test CIs
User test (100+) with synthetic data shows that:
‐ CIs improve the validity of user statements (they are more correct)
StatMine 0.3
– Prototype StatMine 0.3:
‐ Shows uncertainty in line charts and bar charts
‐ Tested on 25 test persons.
Line charts with uncertainty
Bar charts with uncertainty
StatMine 0.4
– Built on the CBS open data API
– Will be public
– Currently in beta test, ETA 2014 Q3
Questions?