StatMine – prototype
Exploring official statistics
Martijn Tennekes, Edwin de Jonge, Jan van der Laan & Jessica
Statistics Netherlands (CBS)
Visweek 2013
StatMine, statistical goldmine
Edwin de Jonge (@edwindjonge)
Jan van der Laan, Jessica Solcer
Statistics Netherlands / CBS
Dutch Information Visualisation Event 2014, June 19, 2014
StatMine 0.2
Statistics Netherlands / CBS
- Creates and publishes official statistics on the economy, demographics, health care, and other topics.
- Since 1899
- Website: www.cbs.nl
- Online DB: http://statline.cbs.nl (since 1997)
Why StatMine?
– Online StatLine contains more than one billion (10⁹) facts
‐ Policy makers
‐ Journalists
‐ Citizens
‐ Enterprises
‐ Economists
‐ Social scientists
‐ Historians
‐ etc.
Problem 1
Numbers ≠ Information
1. Numbers ≠ Information
We know from a user study that:
1. Many interesting patterns in StatLine are not spotted by
users
2. Many important topics in StatLine are scattered across
multiple tables
H1: Data analysis = Data insight
H1. Data insight
The goal of StatMine 0.1 was to provide more insight into StatLine numbers by presenting these facts visually and interactively.
• We tested this successfully on 4 “difficult” StatLine tables.
Bar chart
- compare
Line chart
- development
Bubble/scatter chart
- correlation
Mosaic chart
- structure
An exploration of dissemination data: StatMine
Chart type – bar chart
Small multiples?
Demo
StatMine 0.1 Results
Tested on 25 users.
Findings:
- Test participants think that visualizing data adds value (small multiples)
- Data owners look at their data differently
- They want to use this tool to check their data before publication.
Problem 2:
Fragmented Information
2. Fragmented information
Most information in StatLine is fragmented:
‐ Energy consumption in relation to economic growth
‐ Perceived public safety in relation to registered crime
– Users currently need to look in multiple tables and combine the information by hand.
2. Merge data!
H2. Table joining
Goal of StatMine 0.2: create more insight by:
- Letting users combine tables
- Condition: the tables share at least one column (data dimension).
- Tested on a small set of tables.
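The joining condition above can be illustrated with a minimal sketch. It assumes tables are plain lists of rows and that each table declares its dimensions; the function names and the figures are made up for illustration and do not come from StatMine or StatLine.

```python
from typing import Dict, List, Sequence

Row = Dict[str, object]


def shared_dimensions(dims_a: Sequence[str], dims_b: Sequence[str]) -> List[str]:
    """Return the dimensions both tables have, in the order of the first table."""
    return [d for d in dims_a if d in dims_b]


def join_tables(a: List[Row], dims_a: Sequence[str],
                b: List[Row], dims_b: Sequence[str]) -> List[Row]:
    """Inner-join two tables on every dimension they share."""
    keys = shared_dimensions(dims_a, dims_b)
    if not keys:
        # The condition from the slide: at least one shared dimension.
        raise ValueError("tables share no dimension and cannot be combined")

    # Index the second table by its values on the shared dimensions.
    index: Dict[tuple, List[Row]] = {}
    for row in b:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)

    joined: List[Row] = []
    for row in a:
        for match in index.get(tuple(row[k] for k in keys), []):
            joined.append({**match, **row})
    return joined


# Made-up illustrative figures, not StatLine data.
energy = [
    {"year": 2010, "energy_index": 100},
    {"year": 2011, "energy_index": 96},
]
growth = [
    {"year": 2010, "growth_pct": 1.5},
    {"year": 2011, "growth_pct": 1.0},
    {"year": 2012, "growth_pct": -1.0},
]

combined = join_tables(energy, ["year"], growth, ["year"])
for row in combined:
    print(row)
```

Only years present in both tables survive the join, so the combined rows can be plotted directly, for example as a bubble/scatter chart of one measure against the other.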
StatMine 0.2 Results
Test participants: 20 internal, 40 external (policy makers, journalists).
Findings:
- External users are enthusiastic about the visual possibilities of StatMine
- Joining data fills a user need.
Problem 3
Statistical numbers are uncertain
H3. Confidence intervals
– All facts published by Statistics Netherlands have a confidence interval
– European Statistics Code of Practice (12.2):
‐ “sampling and non-sampling errors should be systematically documented”
Goal of StatMine 0.3:
Investigate how uncertainty in numbers can be presented understandably to users.
StatMine 0.3
Restricted to:
‐ How do users interpret CIs, and how does that affect their interpretation of facts?
‐ Do users need CIs?
Assumption:
‐ The test data set consists of point estimates with a CI available
User test CIs
A user test (100+ participants) with synthetic data shows that:
‐ CIs improve the validity of user statements (the statements are more often correct)
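As an illustration of how a CI can change a user statement, the sketch below compares two estimates and only claims a difference when their intervals do not overlap. This is a conservative rule of thumb chosen for the example, not a formal significance test and not the procedure used in the user test; the class, the function names, and the values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    label: str
    value: float   # point estimate
    lower: float   # lower bound of the confidence interval
    upper: float   # upper bound of the confidence interval


def clearly_higher(a: Estimate, b: Estimate) -> bool:
    """True only if a's whole interval lies above b's whole interval."""
    return a.lower > b.upper


def describe(a: Estimate, b: Estimate) -> str:
    """Phrase a comparison that respects the uncertainty of both estimates."""
    if clearly_higher(a, b):
        return f"{a.label} is higher than {b.label}"
    if clearly_higher(b, a):
        return f"{b.label} is higher than {a.label}"
    return f"no clear difference between {a.label} and {b.label}"


# Synthetic values, for illustration only.
region_a = Estimate("A", 10.0, 9.0, 11.0)
region_b = Estimate("B", 10.5, 9.5, 11.5)
region_c = Estimate("C", 13.0, 12.0, 14.0)

print(describe(region_a, region_b))
print(describe(region_a, region_c))
```

Without the intervals, a reader would likely state that B is higher than A; with them, that statement is no longer supported by the data.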
StatMine 0.3
– Prototype StatMine 0.3:
‐ Shows uncertainty in line charts and bar charts
‐ Tested on 25 test participants.
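A chart can only show uncertainty if each point carries a lower and an upper bound. The sketch below prepares such bounds, assuming a standard error is available for every estimate and that a normal approximation is acceptable. Both are assumptions made for this example: the slides only state that a CI is available for each point estimate, and the function names are hypothetical.

```python
from statistics import NormalDist
from typing import List, Sequence, Tuple


def confidence_interval(estimate: float, std_error: float,
                        level: float = 0.95) -> Tuple[float, float]:
    """Symmetric interval under a normal approximation (an assumption)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for level=0.95
    margin = z * std_error
    return estimate - margin, estimate + margin


def uncertainty_band(series: Sequence[Tuple[str, float, float]],
                     level: float = 0.95) -> List[Tuple[str, float, float, float]]:
    """Turn (period, estimate, std_error) rows into (period, lower, estimate, upper)."""
    band = []
    for period, estimate, std_error in series:
        lower, upper = confidence_interval(estimate, std_error, level)
        band.append((period, lower, estimate, upper))
    return band


# Synthetic series, for illustration only.
series = [("2011", 100.0, 5.0), ("2012", 104.0, 5.0), ("2013", 103.0, 8.0)]
for period, lower, estimate, upper in uncertainty_band(series):
    print(f"{period}: {estimate:.1f} [{lower:.1f}, {upper:.1f}]")
```

In a line chart the lower and upper values would bound a shaded band around the line; in a bar chart they would become the ends of an error bar on each bar.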
Line charts with uncertainty
Bar charts with uncertainty
StatMine 0.4
– Built on the CBS open data API
– Will be public
– Currently in beta test, ETA 2014 Q3
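For readers who want to try the open data route themselves, here is a minimal sketch of building a query URL. It assumes the CBS service is an OData endpoint at the base URL below and exposes a `TypedDataSet` resource per table. The slides confirm neither detail, so check both against the CBS documentation. The table id and column names are placeholders, and the code only builds a URL without contacting the service.

```python
from typing import Optional, Sequence
from urllib.parse import quote, urlencode

# Assumed base URL of the CBS OData service (not stated in the slides).
BASE_URL = "https://opendata.cbs.nl/ODataApi/odata"


def typed_dataset_url(table_id: str,
                      select: Optional[Sequence[str]] = None,
                      filter_expr: Optional[str] = None,
                      top: Optional[int] = None,
                      base_url: str = BASE_URL) -> str:
    """Build an OData query URL for one table using standard query options."""
    params = {}
    if select:
        params["$select"] = ",".join(select)   # columns to return
    if filter_expr:
        params["$filter"] = filter_expr         # row filter expression
    if top is not None:
        params["$top"] = top                    # maximum number of rows
    url = f"{base_url}/{quote(table_id)}/TypedDataSet"
    if params:
        # Keep "$" and "," readable; percent-encode everything else.
        url += "?" + urlencode(params, safe="$,", quote_via=quote)
    return url


# "TABLE_ID", "Period" and "Value" are hypothetical placeholders.
print(typed_dataset_url("TABLE_ID", select=["Period", "Value"], top=5))
```

Keeping URL construction in one small function makes it easy to swap the base URL or resource name if the public API differs from these assumptions.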
Questions?