
Neudecker who-cares-about-yesterday’s-news-–-use-cases-and-requirements-for-newspaper-digitization-slides

Who cares about yesterday's news? Use cases and requirements for newspaper digitization. Presentation held at IFLA News Media Conference 2016, 20-22 April, Hamburg, Germany.

Published in: Technology

  1. Who cares about yesterday's news? Use cases and requirements for newspaper digitization
     Clemens Neudecker, Staatsbibliothek zu Berlin, Europeana Newspapers, @cneudecker
     IFLA International News Media Conference, Hamburg, 20-22 April 2016
  2. Topics
     • Current state of newspaper digitization
       – Europeana Newspapers Survey
       – ICON Comparative Analysis
     • Exemplary use cases
       – Digital Humanities / Text Mining
       – Creative Industries / Apps
       – Industry / Family History
     • Requirements and best practices
  3. Europeana Newspapers Survey
     • Europeana Newspapers survey (2012): 47 respondents from European libraries
     • Most EU countries have (national/major) newspaper digitization programmes in place
     • Approx. 130,000,000 pages already digitized
     • 87% of respondents offer access to their newspaper collection free of charge
  4. ICON Comparative Analysis
     • ICON Comparative Analysis (2015)
     • (Awareness of) newspaper digitization mostly limited to Western countries (US-UK-EU)
     • The vast majority of digital newspapers have been produced from microfilm (cost-efficiency)
     • Estimated 30,000 titles digitized in US-UK-EU, approximately 45,000 titles worldwide
       → Lack of material in languages other than English
  5. Representation of Absence
     • Scale of what is still left to digitize is mind-boggling
       ...only about 0.001% done in Europe
  6. “Copyright cliff of death”
  7. Use cases
  8. Example use cases: 1
     • Digital Humanities / Text & Data Mining
       – Broad interest in societal, cultural developments
       – Newspapers cover “daily life”, events that do not make it into the history textbooks
       – OCR/full-text almost always a requirement
       – For text mining, large quantities of data can be more important than the quality of the OCR
       – Prefer API or bulk download over search & browse
       – See also http://www.europeana-newspapers.eu/category/interviews-with-researchers/
  9. viraltexts.org
  10. Elegant Elephant
  11. Example use cases: 2
     • Creative industries / Apps
       – Unfamiliar but intriguing uses
       – Potential to reach out to novel audiences
       – Not necessarily commercial interest
       – Almost exclusively require API
       – Serendipity effect
       – Tracing the use:
         Trove: http://trovespace.webfactional.com/traces/
         NDNP: http://www.loc.gov/ndnp/extras/#reuse
  12. hierwashetnieuws.nl
  13. Example use cases: 3
     • Commercial / Family History
       – Aim to identify individuals within articles, obituaries
       – Benefit greatly from Named Entity Recognition
       – Huge volunteer base for crowd-sourcing
  14. familysearch.com
  15. Interactive Newspaper Desk
  16. Summary: Requirements
     • Interest in digital newspapers is as diverse as the newspaper content
     • OCR is nearly always a must-have
     • NER can enhance some use cases greatly
     • Access should be as open as possible
     • APIs provide a means for third parties to create additional outreach and exposure
  17. Summary: Best Practices
     • Make available a critical mass through cost-efficient microfilm digitization
     • Always provide OCR and be transparent about the quality
     • Open access to the content is not a threat but can help create unforeseeable exposure and added value through creative reuse
     • Work with the public!
  18. “The coolest thing to do with your data will be thought of by someone else”
     Jo Walsh & Rufus Pollock: The Many Minds Principle
  19. Thank you for your attention! Questions?
     Clemens Neudecker, Staatsbibliothek zu Berlin, Europeana Newspapers, @cneudecker
