1. Improving the discovery of European Historic Newspapers
Rossitza Atanassova, British Library
@RossiAtanassova
IFLA Newspapers, Lyon, 20 August 2014
2. This project is partially funded under the ICT Policy Support Programme (ICT PSP) as part of the Competitiveness and Innovation Framework Programme by the European Community http://ec.europa.eu/ict_psp
Europeana Newspapers is making historic newspaper pages searchable
http://vimeo.com/100313926
3. Project outcomes
•Content in 22 languages, ranging from the 17th to the 20th century
•10 million pages of full text
•Article-level records and named entities for 2 million pages
•Aggregation of up to 18 million pages
•Aggregation of metadata for up to an additional 19 million pages
•Cross-searchable newspapers interface at The European Library
•http://www.theeuropeanlibrary.org/tel4/newspapers
•Issue-level metadata via Europeana http://www.europeana.eu/
4. Statistics
Currently one can search through:
•full text for over 2 million pages
•metadata records relating to over 1 million issues (links to source libraries)
6. Search and browse options
7. Display options
•Metadata, full-text and full zoomable images
•Metadata, full-text and static images (full size or snippets)
•Metadata and full-text
•Metadata
8. Usability testing
•Remote 60-minute test sessions in April 2014
•Conducted by User Vision, Edinburgh
•12 participants from 5 countries with a professional or strong personal research interest in the content
•6 task scenarios
•Pre- and post-test questionnaires
•User Vision Report at http://www.europeana-newspapers.eu/usability-testing-results-for-our-historic-newspapers-browser/
9. Task success and ease of use ratings
Images in Alan Blackwood, The European Library Newspaper Archive – Usability Testing, 16/04/2014
10. User response to the interface
•“Strong positive reaction to the availability of the archive”
•“Aggregated view of content from many sources highly valued”
•“Basic search functionalities worked well”
•Presentation of images and image navigation controls are appreciated, as is the display of OCRed text
•Browsing content via the geographical map is popular
•Identified issues with design and functionality: facets, results, navigation
•Further expectations: print, download, saved searches
11. Before and after
12. Changes to landing page
•Prominent browse and advanced options
•‘Discover’ tab for browse options page
•‘This day in history’ allows users to scroll through all relevant issues
13. Changes to browsing options
•Search by issue date modified to include a text input box for the year with auto-suggestions
•Select title from an alphabetical index
•Geographical map of Europe is bigger and uses a better colour palette to indicate the number of issues
14. Changes to results pages
•Sort by relevance, descending date and ascending date
•Configure number of items per page (10-100)
•Further recommendations: controls to navigate between results, a ‘back to search results’ button and a search input box to allow modification of search terms
15. Faceted search and newspaper source page
17. Integration of the viewer into the Europeana portal
18. Next steps with the browser
•Second usability test in September
•Final version by end of 2014
•Add OCR correction functionality
•Allow access via API
•Further integration of the newspapers viewer within Europeana
19. Research practices and expectations
•Participants in the usability test have well-established research practices and higher expectations of the site’s functionality
•Preference for search over browsing
•Greater control over search results
•Multiple layers of search through facets
•Would like to search by subject area and historical period
•User account to save search histories
•Download and print options
•New content notifications and feedback submission option
20. Researchers’ interest in the Europeana Newspapers archive
•Interdisciplinary source of information
•Mass digitised content
•Pan-European cross-searchable archive
•Transnational comparative studies
•Text mining for multilingual content
•Computational analysis and visualisation of the data
21. What researchers value
“I see enormous value in an archive that breaks down national boundaries automatically, where I can search for content from a range of countries…” – Bob Nicholson
“The difference lies not just in access but in the conversion of a massive amount of print into a searchable resource … This holds the potential to make connections across newspapers in ways previously unimaginable.” – Matt Rubery
“Now software allows us to work with millions of pages. By combining words and expressions, machines uncover patterns that we never even suspected were there …” – Professor Toine Pieters
22. Digital Humanities approaches to digitised newspaper archives
•‘Asymmetrical Encounters: E-Humanity Approaches to Reference Cultures in Europe, 1815-1992’
•The project will apply multilingual text mining techniques to long runs of digitised newspapers and other textual materials
23. The Victorian Meme Machine project
•Partnership between Bob Nicholson, Edge Hill University and British Library Labs
•Extract Victorian jokes from 19th century British newspapers
•Crowdsource transcriptions
•Algorithms to pair text with images
•Share and re-use memes
https://www.youtube.com/watch?v=FN1ZSAz2vMg
24. Europeana Newspapers Information Days
25. Final workshop “Newspapers in Europe & the Digital Agenda for Europe”
•British Library, 29-30 September 2014
•The value of digitised historic newspapers
•How to overcome the barriers to improving access to digitised historic newspapers
•Policy makers, researchers, librarians, cultural heritage professionals and newspaper publishers
26. Thank you! For more information visit www.europeana-newspapers.eu