HIGH-VALUE DATASETS
FROM PUBLICATION TO IMPACT
Elena Simperl
@esimperl
National Open Data Conference
December 3, 2020
How do people search, make sense of, and use
open data?
HUMAN DATA INTERACTION
FRAMEWORKS, METHODS, TOOLS
Frameworks and models
Methods and guidance
Tools
Analysis
HIGH-VALUE DATASETS
UNDERSTANDING USE THROUGH BEHAVIOUR ANALYSIS
TO PROVIDE GUIDANCE TO PUBLISHERS
Open government data portals
• Search logs
• Data requests
Data science platforms
• Activity logs
2018 STUDY
ANALYSIS OF LOGS AND REQUESTS
Four national open government data portals, 2.2 million
queries from 2013 to 2016, 1500 data requests.
Data search is a work-related activity.
Shorter queries that include time and location, at varying
levels of granularity.
Explorative search, using keywords and filters.
Native and external queries topically different.
Data requests describe the data through boundaries and
restrictions on location, time, data type, granularity.
Kacprzak, E., Koesten, L., Ibáñez, L.D., Blount, T., Tennison, J. and Simperl, E., 2018. Characterising dataset search—An analysis of search logs and data requests. Journal of Web Semantics.
2020 STUDY
ANALYSIS OF LOGS
844,343 user sessions
(April 18 to June 20)
Characterising Dataset Search on the European Data Portal: An Analysis of Search Logs. LD Ibáñez, E Kacprzak, L Koesten, E Simperl. European Data Portal, Analytical Report 18, 2020
TOOLS TO FIND DATA
The EDP is used to find datasets, but it is not the only tool
people use. 60% of sessions arrive at the EDP from the
web.
Changes in portal design impact traffic (and user
experience).
Covid-19 datasets were in high demand in 2020.
A large majority of users visit only one section of the
portal at a time.
Content on the site needs to be better interlinked (both
data and other pages). When links exist, people use them.
SEARCH APPROACHES AND AFFORDANCES
Filters are important: 60% of native sessions use
filters-only; 15-20% use keywords and filters.
Common search strategies: single-filter; keywords
first, then one or more filters.
Popular filters are country and category. Less so:
format, license.
Keyword queries are short; they make less use of time,
format and data attributes, and more use of location.
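As an illustration of how shares like the ones on this slide could be derived from session logs, here is a minimal Python sketch. It is not the method used in the report: the event format (a list of `(action, value)` tuples), the action names `"keyword"` and `"filter"`, and the sample sessions are all hypothetical toy data.

```python
from collections import Counter


def classify_session(events):
    """Label a session by the search actions it contains.

    `events` is a list of (action, value) tuples. The action names
    "keyword" and "filter" are assumptions, not the portal's log schema.
    """
    kinds = {action for action, _ in events if action in ("keyword", "filter")}
    if not kinds:
        return "no-search"
    if kinds == {"filter"}:
        return "filters-only"
    if kinds == {"keyword"}:
        return "keywords-only"
    return "keywords+filters"


def strategy_shares(sessions):
    """Return the fraction of sessions that fall under each strategy label."""
    if not sessions:
        return {}
    counts = Counter(classify_session(events) for events in sessions)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Toy sessions, invented for illustration only.
sessions = [
    [("filter", "country=IE")],
    [("filter", "category=health"), ("filter", "country=DE")],
    [("keyword", "covid"), ("filter", "country=IT")],
    [("keyword", "air quality")],
]
shares = strategy_shares(sessions)
```

A label per session keeps the analysis simple; a study of strategies such as "keywords first, then filters" would also need the order of events, which this sketch ignores.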
SUCCESS IN DATASET SEARCH
20-40% of native queries and 8-25% of
external queries are successful.
Keywords + filters seem to work better. This might
also be a proxy for seasoned users.
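The slide does not say how success was measured. The Python sketch below assumes, purely for illustration, that a query counts as successful when a dataset page is opened afterwards; that proxy, the field names `origin`, `strategy` and `viewed_dataset`, and the sample records are all hypothetical and not taken from the report.

```python
from collections import defaultdict


def success_rates(queries):
    """Return the share of successful queries per (origin, strategy) group.

    Each query is a dict with hypothetical keys:
      origin         - "native" or "external"
      strategy       - e.g. "keywords-only" or "keywords+filters"
      viewed_dataset - True if a dataset page was opened afterwards
                       (an assumed proxy for success)
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for query in queries:
        key = (query["origin"], query["strategy"])
        totals[key] += 1
        if query["viewed_dataset"]:
            hits[key] += 1
    return {key: hits[key] / totals[key] for key in totals}


# Toy records, invented for illustration only (not results from the study).
queries = [
    {"origin": "native", "strategy": "keywords+filters", "viewed_dataset": True},
    {"origin": "native", "strategy": "keywords+filters", "viewed_dataset": False},
    {"origin": "native", "strategy": "keywords-only", "viewed_dataset": True},
    {"origin": "native", "strategy": "keywords-only", "viewed_dataset": False},
    {"origin": "native", "strategy": "keywords-only", "viewed_dataset": False},
    {"origin": "native", "strategy": "keywords-only", "viewed_dataset": False},
    {"origin": "external", "strategy": "keywords-only", "viewed_dataset": False},
    {"origin": "external", "strategy": "keywords-only", "viewed_dataset": False},
]
rates = success_rates(queries)
```

Grouping by both origin and strategy makes it possible to compare native with external queries and keyword-only with keyword-plus-filter searches in one pass.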
IMPLICATIONS FOR PUBLISHERS
SEO strategy
Filter affordances
Granular location data
Dataset retrieval
Dataset preview pages
Links between content and datasets
WHAT’S NEXT
Studies on users and their
information needs.
Granular activity data captured
and shared by portals for new
studies.
Portals publish lots of data. They
now need to do more to become
data communities.
PUBLICATIONS
Talking Datasets: understanding data sensemaking behaviours. L Koesten, K Gregory, P Groth, E Simperl. Currently under review at the International Journal of Human-Computer Studies, 2020.
Everything You Always Wanted to Know about a Dataset: Studies in Data Summarisation. L Koesten, E Simperl, E Kacprzak, T Blount, J Tennison. International Journal of Human-Computer Studies, 2019.
Collaborative Practices with Structured Data: Do Tools Support what Users Need? L Koesten, E Kacprzak, E Simperl, J Tennison. ACM CHI Conference on Human Factors in Computing Systems, CHI 2019.
Dataset search: a survey. A Chapman, E Simperl, L Koesten, G Konstantinidis, LD Ibáñez, E Kacprzak, P Groth. The International Journal on Very Large Data Bases, 2019.
Characterising dataset search: An analysis of search logs and data requests. E Kacprzak, L Koesten, LD Ibáñez, T Blount, J Tennison, E Simperl. Journal of Web Semantics, 2018.
Characterising Dataset Search on the European Data Portal: An Analysis of Search Logs. LD Ibáñez, E Kacprzak, L Koesten, E Simperl. European Data Portal, Analytical Report 18, 2020.
The Trials and Tribulations of Working with Structured Data - a Study on Information Seeking Behaviour. L Koesten, E Kacprzak, J Tennison, E Simperl. Proceedings of ACM CHI Conference on Human Factors in Computing Systems, CHI 2017.
Dataset Reuse: Toward Translating Principles to Practice. L Koesten, P Vougiouklis, E Simperl, P Groth. Patterns, 2020.
Pie Chart or Pizza: Identifying Chart Types and Their Virality on Twitter. P Vougiouklis, L Carr, E Simperl. Proceedings of the International AAAI Conference on Web and Social Media, 2020.

Statistics notes ,it includes mean to index numbersStatistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numberssuginr1
 

Kürzlich hochgeladen (20)

Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Ser...
Top Call Girls in Balaghat  9332606886Call Girls Advance Cash On Delivery Ser...Top Call Girls in Balaghat  9332606886Call Girls Advance Cash On Delivery Ser...
Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Ser...
 
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
 
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
 
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for Research
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
 
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With OrangePredicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 
Introduction to Statistics Presentation.pptx
Introduction to Statistics Presentation.pptxIntroduction to Statistics Presentation.pptx
Introduction to Statistics Presentation.pptx
 
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
 
Diamond Harbour \ Russian Call Girls Kolkata | Book 8005736733 Extreme Naught...
Diamond Harbour \ Russian Call Girls Kolkata | Book 8005736733 Extreme Naught...Diamond Harbour \ Russian Call Girls Kolkata | Book 8005736733 Extreme Naught...
Diamond Harbour \ Russian Call Girls Kolkata | Book 8005736733 Extreme Naught...
 
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
 
Statistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbersStatistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbers
 

High-value datasets: from publication to impact

  • 1. HIGH-VALUE DATASETS FROM PUBLICATION TO IMPACT Elena Simperl @esimperl National Open Data Conference December 3, 2020
  • 2. How do people search, make sense of, and use open data?
  • 4. HUMAN DATA INTERACTION FRAMEWORKS, METHODS, TOOLS Frameworks and models
  • 5. HUMAN DATA INTERACTION FRAMEWORKS, METHODS, TOOLS Methods and guidance
  • 6. HUMAN DATA INTERACTION FRAMEWORKS, METHODS, TOOLS Tools
  • 7. HUMAN DATA INTERACTION FRAMEWORKS, METHODS, TOOLS Analysis
  • 8. HIGH-VALUE DATASETS UNDERSTANDING USE THROUGH BEHAVIOUR ANALYSIS TO PROVIDE GUIDANCE TO PUBLISHERS Open government data portals • Search logs • Data requests Data science platforms • Activity logs
  • 9. 2018 STUDY ANALYSIS OF LOGS AND REQUESTS Four national open government data portals, 2.2 million queries from 2013 to 2016, 1500 data requests. Data search is a work-related activity. Shorter queries, include time and location, with varying levels of granularity. Explorative search, using keywords and filters. Native and external queries topically different. Data requests describe the data through boundaries and restrictions on location, time, data type, granularity. Kacprzak, E., Koesten, L., Ibáñez, L.D., Blount, T., Tennison, J. and Simperl, E., 2018. Characterising dataset search—An analysis of search logs and data requests. Journal of Web Semantics.
  • 10. 2020 STUDY ANALYSIS OF LOGS 844,343 user sessions (April 18 to June 20) Characterising Dataset Search on the European Data Portal: An Analysis of Search Logs. LD Ibáñez, E Kacprzak, L Koesten, E Simperl. European Data Portal, Analytical Report 18, 2020
  • 11. TOOLS TO FIND DATA EDP is used to find datasets, but it is not the only tool people use. 60% of sessions arrive at the EDP from the web. Changes in portal design impact traffic (and user experience). Covid-19 datasets were in high demand in 2020. A large majority of users visit only one section of the portal at a time. Content on the site needs to be better interlinked (both data and other pages). When links exist, people use them.
  • 12. SEARCH APPROACHES AND AFFORDANCES Filters are important: 60% of native sessions use filters-only; 15-20% use keywords and filters. Common search strategies: single-filter; keywords first, then one or more filters. Popular filters are country and category. Less so: format, license. Keyword queries are short, less use of time, format and data attributes, more use of location.
  • 13. SUCCESS IN DATASET SEARCH 20-40% of native queries and 8-25% of external queries are successful. Keywords + filters seem to work better. Might also be a proxy for seasoned users.
  • 14. IMPLICATIONS FOR PUBLISHERS SEO strategy Filter affordances Granular location data Dataset retrieval Dataset preview pages Links between content and datasets
  • 15. WHAT’S NEXT Studies on users and their information needs. Granular activity data captured and shared by portals for new studies. Portals publish lots of data. They now need to do more to become data communities.
  • 16. PUBLICATIONS
      • Talking Datasets: understanding data sensemaking behaviours. L Koesten, K Gregory, P Groth, E Simperl. Currently under review at the International Journal of Human-Computer Studies, 2020.
      • Everything You Always Wanted to Know about a Dataset: Studies in Data Summarisation. L Koesten, E Simperl, E Kacprzak, T Blount, J Tennison. International Journal of Human-Computer Studies, 2019.
      • Collaborative Practices with Structured Data: Do Tools Support what Users Need? L Koesten, E Kacprzak, E Simperl, J Tennison. ACM CHI Conference on Human Factors in Computing Systems, CHI 2019.
      • Dataset search: a survey. A Chapman, E Simperl, L Koesten, G Konstantinidis, LD Ibáñez, E Kacprzak, P Groth. The International Journal on Very Large Data Bases, 2019.
      • Characterising dataset search: An analysis of search logs and data requests. E Kacprzak, L Koesten, LD Ibáñez, T Blount, J Tennison, E Simperl. Journal of Web Semantics, 2018.
      • Characterising Dataset Search on the European Data Portal: An Analysis of Search Logs. LD Ibáñez, E Kacprzak, L Koesten, E Simperl. European Data Portal, Analytical Report 18, 2020.
      • The Trials and Tribulations of Working with Structured Data: a Study on Information Seeking Behaviour. L Koesten, E Kacprzak, J Tennison, E Simperl. Proceedings of ACM CHI Conference on Human Factors in Computing Systems, CHI 2017.
      • Dataset Reuse: Toward Translating Principles to Practice. L Koesten, P Vougiouklis, E Simperl, P Groth. Patterns, 2020.
      • Pie Chart or Pizza: Identifying Chart Types and Their Virality on Twitter. P Vougiouklis, L Carr, E Simperl. Proceedings of the International AAAI Conference on Web and Social Media, 2020.