Tool for converting and linking statistical datasets
to a cloud of interconnected historical datasets.
QB’er - Demonstration
Ashkan Ashkpour, IISH – CLARIAH WP4
07-10-2016
GOAL OF THIS PRESENTATION
From CSV files and structured statistical data to (harmonized) interlinked data on the Web
Data → Tooling → Interlinked datasets on the Web
PROBLEM - Today’s Workflow
• Gather and enter own data
• Find data on multiple repositories
• Download
• Clean and reshape
• Merge
• Clean and reshape…
• Analyse
PROBLEM
Disconnected data and efforts
We keep repeating ourselves, redoing the same work on the same datasets
Comparability across time and datasets
https://blog.gaijinpot.com/knowledge-sharing-economy/
LOSS OF...
Provenance
Cleaning efforts (sometimes up to 60% of the work)
Valuable mappings (discarding time-consuming prior work)
Expert decisions
Discoverability
SOLUTION: INTEGRATE DISSIMILAR DATA IN FLEXIBLE AND ACCOUNTABLE WAYS
HARMONIZATION AND RDF
What we want is harmonization by way of:
Standardization and classification
A flexible approach that still provides accountability
QB’ER
Empower individual researchers to:
Code and harmonize individual datasets according to best practices of the
community (e.g. HISCO, SDMX, World Bank) or against their colleagues’ code lists
Share their own code lists with fellow researchers
Align code lists across datasets
Publish their standards-compliant datasets on a Structured Data Hub
Collaborative growing of a graph of interconnected datasets
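A minimal sketch of the coding step listed above, assuming a code list is simply a lookup from raw source values to codes. The `CODE_LIST` entries, the `occ:` codes and the `CodedValue` type are hypothetical placeholders for illustration; they are not real HISCO codes and not QB’er's internal model.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical code list: normalized raw value -> code.
# The codes are placeholders, NOT real HISCO codes.
CODE_LIST = {
    "timmerman": "occ:carpenter",
    "timmerm.": "occ:carpenter",
    "dienstbode": "occ:servant",
}


@dataclass(frozen=True)
class CodedValue:
    original: str          # value exactly as it appears in the source
    code: Optional[str]    # None when no mapping exists yet


def code_values(values, code_list):
    """Attach a code to each value without overwriting the original."""
    coded = []
    for value in values:
        key = value.strip().lower()  # light normalization for the lookup only
        coded.append(CodedValue(original=value, code=code_list.get(key)))
    return coded


def unmapped(coded):
    """Return the original values that still need an expert decision."""
    return [c.original for c in coded if c.code is None]


coded = code_values(["Timmerman", "timmerm.", "smid"], CODE_LIST)
```

Because the original string travels with its code, a later researcher can revisit or replace the mapping without going back to the raw file.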
INPUT
DEMO EXAMPLE
Nieuwkomers in de Utrechtse volkstelling van 1829 en 1839 (Newcomers in the Utrecht censuses of 1829 and 1839)
http://hdl.handle.net/10622/KMAJLE
Utrecht 1829
Utrecht 1839
Variables
Values
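The variables and values view can be approximated with a short sketch that reads a CSV and counts how often each value occurs per variable. The sample rows and column names below are invented for illustration and are not taken from the Utrecht datasets; the function name is likewise an assumption.

```python
import csv
import io
from collections import Counter

# Invented sample rows; not from the Utrecht census datasets.
SAMPLE_CSV = """name,occupation,birthplace
A,timmerman,Utrecht
B,dienstbode,Amersfoort
C,timmerman,Utrecht
"""


def value_frequencies(csv_text):
    """Return {variable name: Counter of its values} for a CSV string."""
    reader = csv.DictReader(io.StringIO(csv_text))
    freqs = {name: Counter() for name in (reader.fieldnames or [])}
    for row in reader:
        for name in freqs:
            freqs[name][row[name]] += 1
    return freqs


freqs = value_frequencies(SAMPLE_CSV)
for variable, counts in freqs.items():
    print(variable, dict(counts))
```

Listing values by frequency helps decide which values to map first, since a handful of frequent values often covers most rows.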
DEMO
QB’er Demonstration Video
TO CONCLUDE…
• Generic, domain-independent tool
• Uploading of a dataset and extraction of variables and value frequencies
• Mapping of variable values to codes (while preserving the originals!)
• Publishing of dataset structure as Linked Data
• Align codes and identifiers across datasets
• Provenance of all assertions to the SDH traceable to time and person
• Crowd-based production of code lists and mappings
• Sharing and reuse of other people’s work (standing on the shoulders of giants)
• No disposable research
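As a rough sketch of what publishing a dataset structure as Linked Data could look like, the example below emits N-Triples using terms from the W3C RDF Data Cube (QB) vocabulary. That vocabulary is an assumption here, suggested by the tool's name; the `http://example.org/` namespace, the URI layout and the dataset identifier are hypothetical and do not reflect QB’er's actual output.

```python
from urllib.parse import quote

QB = "http://purl.org/linked-data/cube#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/"  # hypothetical namespace


def structure_triples(dataset_id, variables):
    """Describe a dataset and its variables as (subject, predicate, object) URIs."""
    ds = f"{BASE}dataset/{quote(dataset_id, safe='')}"
    dsd = f"{ds}/structure"
    triples = [
        (ds, RDF_TYPE, QB + "DataSet"),
        (ds, QB + "structure", dsd),
        (dsd, RDF_TYPE, QB + "DataStructureDefinition"),
    ]
    for var in variables:
        name = quote(var, safe="")  # keep the URI valid for names with spaces
        dim = f"{BASE}dimension/{name}"
        comp = f"{dsd}/component/{name}"
        triples += [
            (dsd, QB + "component", comp),
            (comp, RDF_TYPE, QB + "ComponentSpecification"),
            (comp, QB + "dimension", dim),
            (dim, RDF_TYPE, QB + "DimensionProperty"),
        ]
    return triples


def to_ntriples(triples):
    """Serialize URI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


triples = structure_triples("utrecht-1829", ["occupation", "birth place"])
print(to_ntriples(triples))
```

Only the structure is described here; observations, code lists and provenance statements would need further triples.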
QUESTIONS?

 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Development
 
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfInnovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdf
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data Streams
 

QB'er demonstration

  • 1. QB’er - Demonstration: Tool for converting and linking statistical datasets to a cloud of interconnected historical datasets. Ashkan Ashkpour, IISH – CLARIAH WP4, 07-10-2016
  • 2. GOAL OF THIS PRESENTATION: From CSV files and structured statistical data to (harmonized) interlinked data on the Web (Data, Tooling, Interlinked Datasets on the Web)
  • 3. PROBLEM - Today’s Workflow: gather and enter own data; find data on multiple repositories; download; clean and reshape; merge; clean and reshape again; analyse
  • 4. PROBLEM: Disconnected data and efforts. We keep repeating the same work on the same datasets. Comparability across time and datasets.
  • 6. LOSS OF: provenance; cleaning efforts (sometimes up to 60% of the work); valuable mappings (discarding time-consuming prior work); expert decisions; discoverability
  • 7. SOLUTION: INTEGRATE DISSIMILAR DATA IN FLEXIBLE AND ACCOUNTABLE WAYS
  • 8. HARMONIZATION AND RDF: What we want is harmonization by way of standardization and classification: a flexible approach while providing accountability
  • 9.
  • 10.
  • 11.
  • 12. QB’ER: Empower individual researchers to: code and harmonize individual datasets according to best practices of the community (e.g. HISCO, SDMX, World Bank, etc.) or against those of their colleagues; share their own code lists with fellow researchers; align code lists across datasets; publish their standards-compliant datasets on a Structured Data Hub; collaboratively grow a graph of interconnected datasets
  • 13. INPUT
  • 14. INPUT
  • 15. INPUT
  • 16. INPUT
  • 17.
  • 18. DEMO EXAMPLE: Nieuwkomers in de Utrechtse volkstelling van 1829 en 1839 (Newcomers in the Utrecht censuses of 1829 and 1839) http://hdl.handle.net/10622/KMAJLE
  • 22. TO CONCLUDE: generic, domain-independent tool; uploading of a dataset and extraction of variables and value frequencies; mapping of variable values to codes (while preserving the originals!); publishing of dataset structure as Linked Data; aligning codes and identifiers across datasets; provenance of all assertions to the SDH traceable to time and person; crowd-based production of code lists and mappings; sharing and reuse of other people’s work (standing on the shoulders of giants); no disposable research
  • 23. QUESTIONS? QB’er - Demonstration, Ashkan Ashkpour – CLARIAH WP4, 07-10-2016
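
Two of the steps listed in the conclusion, extracting value frequencies for a variable and mapping values to codes while preserving the originals, can be illustrated with a small sketch. This is not QB'er's actual implementation: the sample CSV, the column names, the `EX-…` placeholder codes (not real HISCO codes), the helper names `value_frequencies` and `map_values`, and the lowercase/strip normalization are all assumptions made for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data; column names and values are invented for illustration.
SAMPLE_CSV = """person_id,occupation,year
1,boer,1829
2,Boer,1829
3,timmerman,1839
4,boer,1839
5,onbekend,1839
"""

# Hypothetical code list: placeholder codes, not real HISCO codes.
CODE_LIST = {"boer": "EX-001", "timmerman": "EX-002"}


def value_frequencies(rows, variable):
    """Count how often each raw value occurs for one variable (column)."""
    return Counter(row[variable] for row in rows)


def map_values(rows, variable, code_list):
    """Add a '<variable>_code' field next to the original value.

    The original value is kept as-is; values without a mapping get None.
    Matching on the stripped, lowercased value is an assumption of this sketch.
    """
    mapped = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are not modified
        key = row[variable].strip().lower()
        new_row[variable + "_code"] = code_list.get(key)
        mapped.append(new_row)
    return mapped


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
frequencies = value_frequencies(rows, "occupation")
coded_rows = map_values(rows, "occupation", CODE_LIST)

for row in coded_rows:
    print(row["occupation"], "->", row["occupation_code"])
```

Storing the code in a separate field rather than overwriting the value means the source spelling stays available, so a mapping can later be reviewed or replaced without going back to the raw file.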