The document summarizes projects that have used the British Library's cultural heritage digital collections and data in creative ways. It describes projects such as using digitized newspapers to mine verse from the 18th century, tagging over 1 million images on Flickr Commons, and artistic works generated from the digital collections including memes, maps, and machine learning experiments. The British Library aims to accelerate human imagination by making cultural heritage widely and openly available and supporting innovative uses of the data.
1. 1@imagineUCSD @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/2qqj6V
British Library Labs
Using the British Library’s Cultural Heritage Data
Mahendra Mahey
1445 – 1500, 24-25 November 2016,
Second Session Part II: Imagination and Speculative Cultures
Accelerating Human Imagination, A workshop with the Arthur C. Clarke Center for Human Imagination,
UC San Diego and the University of Liverpool in London,
33 Finsbury Square, London, EC2A 1AG, UK.
https://goo.gl/2qqj6V
3.
Perspective on
Accelerating Human Imagination
• Getting Researchers, Artists, Entrepreneurs, Educators and
General Public using the British Library’s Cultural Heritage
Digital Collections and Data in creative, innovative and
imaginative ways!
• Why and how are we doing that?
• What are the challenges and how are we overcoming them?
• What are people actually doing as a result?
4.
http://www.bl.uk/projects/british-library-labs
Funded by the Andrew W. Mellon Foundation
6.
Competition
Awards
Projects
Tell us your ideas of what to do with our digital content
Show us what you have already done with our digital
content in research, artistic, commercial and learning and
teaching categories
Talk to us about working on collaborative projects
7.
Cultural Heritage Datasets
Datasets about our collections
Bibliographic datasets relating to our published
and archival holdings
Datasets for content mining
Content suitable for use in text and data mining
research
Datasets for image analysis
Image collections suitable for large-scale image-
analysis-based research
Datasets from UK Web Archive
Data and API services available for accessing UK
Web Archive
Digital mapping
Geospatial data, cartographic applications, digital
aerial photography and scanned historic map
materials https://data.bl.uk
Launched November 7, 2016
Discussion list: http://www.jiscmail.ac.uk/CULTURAL-HERITAGE-DATASETS
8.
The Magic of Openness!
• By opening collections up we are creating the possibility to
have them used in ways only restricted by human
imagination.
• Need to work hard to tell people about our Digital
Collections and Data especially if not easy to find, creating
serendipity and opportunities for use!
• Give plenty of examples to inspire use!
• Support and celebrate the use!
9.
Labs Engagement 2016
• 18 institutions visited
• 5000 miles travelled
• 50 presentations & 25 workshops
• 900 researchers / artists / entrepreneurs /
educators
• 400 expressions of interest
• 40 researchers, artists, entrepreneurs &
educators supported
• 60TB of data via post
• 9TB of data via data.bl.uk (Nov 16)
• Over half a billion views on BL Flickr
Commons since launch in Dec 2013
It’s hard work!
11.
Why and how are we doing this?
• Working closely with and listening to those who want to use
our digital collections and data for their work and helping to
build services, tools and processes to support them
• We can learn how we are and should be supporting them.
– Is the access to digital collections we provide sufficient?
– Do we have the right tools?
– Do we provide the right support?
– Where are the gaps between what they want and what we
can give?
– How do we build the bridges to overcome them?
– Many more reasons…
12.
Some Lessons Learned and Challenges
so far…
• Everything starts from a conversation (external and internal)!
• Need to have several conversations with several stakeholders and tap
into their tacit knowledge that isn’t always written down (esp internal).
• It’s hard work at the beginning!
• Expectations change when researchers actually see the data, systems
and experience the ‘culture’ of the organisation.
• We tend to work with researchers who can be ‘flexible’ with their research
questions and are willing to embrace challenges.
• Misunderstandings often arise because of jargon & different meanings of words.
• Embrace dirty data, it may never be perfect!
13.
Some Lessons Learned and Challenges
so far…(2)
• Many researchers have the domain knowledge but lack the technical
skills to use Digital Research methods. Should they be teamed up with
those that have problems that need solving (Computing) or get trained?
• Identifying / bridging gaps for researchers to use data, help them
‘navigate’ through the Library to get the data they want (sometimes).
• Huge appetite to use digital content & data (e.g. Flickr Commons stats).
• Start small and simple, but think big!
• Create and embrace serendipity, stimulate the imagination, work fast,
give it energy.
• Learn the lessons, tell the positive stories and move on!
• Fail faster (don’t be afraid), small experiments, reject perfectionism.
15.
Finding things in messy data
Mrs Folly
• Clean up manually
• Get ‘ground truth’
• Write code to find things
reliably in it automatically
• Try code on messy content
• Tweak if necessary
Mrs Folly
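The workflow on this slide (manual clean-up, ground truth, code, test, tweak) can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used by any Labs project: the sample lines, the hand-labelled ground truth and the two regular expressions are all invented for the example.

```python
import re

# Hypothetical OCR lines; a real corpus would be far larger and messier.
OCR_LINES = [
    "A meeting was held on the 12th of May, 1842 at the Town Hall.",
    "The price of corn rose again this week.",
    "On Jvne 3, 1848 the committee assembled.",  # OCR error: 'Jvne'
    "Advertisement: 1000 yards of fine cloth.",
]

# 'Ground truth' from manual clean-up: indices of lines that really mention a year.
GROUND_TRUTH = {0, 2}


def find_year_lines(lines, pattern):
    """Return the indices of the lines that match the pattern."""
    regex = re.compile(pattern)
    return {i for i, line in enumerate(lines) if regex.search(line)}


def precision_recall(found, truth):
    """Compare what the code found with the hand-made ground truth."""
    true_positives = len(found & truth)
    precision = true_positives / len(found) if found else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# First attempt: any four-digit number (also catches '1000 yards').
loose = find_year_lines(OCR_LINES, r"\b\d{4}\b")
# Tweaked attempt: only four-digit numbers starting with 18.
tight = find_year_lines(OCR_LINES, r"\b18\d{2}\b")

print("loose:", precision_recall(loose, GROUND_TRUTH))
print("tight:", precision_recall(tight, GROUND_TRUTH))
```

Comparing precision and recall before and after each tweak shows whether a change to the pattern actually helped on the messy content.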
16.
http://victorianhumour.tumblr.com
Victorian Meme Machine (2014)
https://goo.gl/HMqDt3
Bob Nicholson
http://victorianhumour.tumblr.com/
Bob Nicholson interviewed on
BBC Radio 4 Making History Programme:
http://goo.gl/fmV9ep
And telling jokes to the public:
http://goo.gl/xIDRhz
https://www.youtube.com/watch?v=-GRgj7Q5OM0
Rob Walker, Victorian Mother-in-law Jokes
Victorian Comedy Night, 7 Nov 2016
17.
Katrina Navickas (2015)
Political Meetings Mapper
http://politicalmeetingsmapper.co.uk
https://goo.gl/Qq78Oa
Labs Symposium 2015
https://goo.gl/BSA3be
Interview 2015
The Chartist Newspaper
http://goo.gl/vOLSnH
Chartist Monster Meeting
Chartists Re-enactment London
18.
Black Abolitionist Performances & their Presence
in Britain (2016) – Hannah-Rose Murray
Frederick
Douglass
Ellen
Craft
Josiah
Henson
Ida B
Wells
A Performance by
Joe Williams &
Martelle Edinborough
http://frederickdouglassinbritain.com/
19.
Data-mining verse in 18th
Century newspapers
BL Labs Project 16-17, Jennifer Batt
https://goo.gl/5Akthd
20.
What can 65,000
books tell us?
Image: Artwork by Alicia Martin
Just one open digital collection
21.
Worked better for female faces than men’s
Press
http://mechanicalcurator.tumblr.com
Posts image every 30 minutes
http://www.flickr.com/photos/britishlibrary/
1,020,418 images
need tagging!
Creative uses of images
Face recognition
Mechanical Curator
http://goo.gl/qPPgxX
Flickr
Snipping out images
from 65,000 Digitised Books*
>600,000,000 views
>15,500,000 tags
https://goo.gl/FgZ4HM
Work @ BL by Ben O’Steen, Labs
and Digital Research Team
*Matt Prior - http://goo.gl/j29Tnx
Since Dec 2013
22.
Tagging a million images
Iterative Crowdsourcing
http://goo.gl/j6fxac
Cardiff University’s
Lost Visions Project
http://www.metadatagames.org/
Metadata Games
James Heald
Mario Klingemann
Chico 45
Use computational methods
Human Tagger
Top British Library Flickr Commons Taggers
http://goo.gl/8SkfM1
Machine Learning
Search Engine
& Google Image
search
23.
Special Jury’s Prize (2015)
James Heald – Wikimedia and Map work
https://goo.gl/WYZCB2
http://goo.gl/HNQq5e
https://goo.gl/VPgffL
https://commons.wikimedia.org/
https://goo.gl/djtm1b
Labs Symposium (2015)
Geotagging maps
54,000 Maps
24.
Adam Crymble (2015)
Crowdsource Arcade
What if crowd sourcing
looked like this?
http://goo.gl/LBfJ4W
http://goo.gl/OH9pOZ
https://goo.gl/7z0j8p
30 mins talk
Labs Symposium (2015)
https://goo.gl/SSRsdd
5 min interview (2015)
http://goo.gl/0APpE8
Game Jam
25.
SherlockNet: Competition Winner 2016
Karen Wang, Luda Zhao and Brian Do
Using Convolutional Neural Networks to Automatically Tag and Caption
the British Library Flickr Commons 1 million Image Collection
Classify into one of 12 categories
>15 million tags added
(total now 15.5 million overall)
>100,000 experimental captions
bit.ly/sherlocknet
Pooled surrounding
Optical Character Recognised
text on page from similar images
Used Microsoft COCO (photographs) &
British Museum Prints and Drawings
collections as training sets.
Tags Captions
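The idea of pooling OCR text from similar images to suggest tags can be illustrated with a small Python sketch. It is not SherlockNet's actual code: the feature vectors below are hand-made stand-ins for what a convolutional neural network would produce, and the image names, OCR snippets, stopword list and the `suggest_tags` helper are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical image features (stand-ins for CNN output) and nearby OCR text.
FEATURES = {
    "img_a": [0.90, 0.10, 0.00],
    "img_b": [0.80, 0.20, 0.10],
    "img_c": [0.00, 0.10, 0.90],
    "img_d": [0.85, 0.15, 0.05],
}
OCR_TEXT = {
    "img_a": "the castle on the hill",
    "img_b": "ruins of the castle",
    "img_c": "a ship in the harbour",
    "img_d": "castle gate and tower",
}
STOPWORDS = frozenset({"the", "of", "a", "and", "in", "on"})


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def suggest_tags(target, features, ocr_text, k=2, n_tags=3):
    """Pool OCR words from the target image and its k most similar images."""
    others = [name for name in features if name != target]
    nearest = sorted(
        others,
        key=lambda name: cosine(features[target], features[name]),
        reverse=True,
    )[:k]
    counts = Counter()
    for name in [target] + nearest:
        for word in ocr_text[name].lower().split():
            if word not in STOPWORDS:
                counts[word] += 1
    return [word for word, _ in counts.most_common(n_tags)]


print(suggest_tags("img_a", FEATURES, OCR_TEXT))
```

Words that recur in the text around several visually similar images rise to the top, which is why pooling can produce a usable tag even when the text beside a single image is sparse or badly recognised.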
26.
Artistic / Creative Works
http://goo.gl/dM8ieA
Mario Klingemann (2015)
http://www.crossroadsofcuriosity.com
David Normal 2014 and 2015
https://www.youtube.com/watch?v=-GRgj7Q5OM0
Rob Walker 2014
http://goo.gl/bNxGZZ
Kris Hoffman (2016)
https://goo.gl/QilqqT
Jiayi Chong 2016
Ling Low 2016
https://www.youtube.com/watch?v=bcOP1E5bRE0
https://www.facebook.com/RealmlandStory/
Paul Rand Pierce 2016
A Hat on the Ground
Spells trouble
Tragic Looking Women
44 Men who Look 44
(Notice the direction of the faces)
27.
Mario Klingemann 2016
https://www.youtube.com/watch?v=xgnxnmqnR7Y
Google Arts and Culture Lab – Experiments with Machine Learning
https://artsexperiments.withgoogle.com/
28.
More BL Labs ideas for inspiration!
http://labs.bl.uk/Ideas+for+Labs http://labs.bl.uk/Other+Uses+of+Collections
25 Seconds (68 Words)
My name is Mahendra Mahey and I work on a project called British Library Labs. We are based at the British Library in London, in the Digital Scholarship department and we work closely with the Digital Research team there. It’s been running for three years now and is funded by the Andrew W. Mellon Foundation.
33 Seconds (100 Words)
In a nutshell the project encourages researchers, artists, entrepreneurs, educators and anyone else,
<Click>
to ‘experiment’ with our digital collections and data. We are particularly interested in those who have questions which focus on the potential to find and create NEW things through access to the digital content. For example, computational techniques make it possible to ask a question across thousands of digitised books or newspapers, which would not be feasible using manual methods. Let’s look at a clear example.
<Click>
17 Seconds (53 Words)
<Click>The British Library is one of the largest libraries in the world <Click> with an estimated 180 million physical items, of which only a small proportion has been digitised. <Click>We estimate this is around 1-2%, but no one really knows exactly how much. However, more and more items are being stored as ‘born digital’, such as the UK Web Archive<Click>
Sharing our data
Coming up with ideas
Listening and learning
21 Seconds (65 Words)
Katrina Navickas was particularly interested in the <Click>Chartist movement, a group campaigning for the vote for working people. <Click>They were the biggest popular movement for democracy in 19th century British history, as this early picture of a huge monster meeting at Kennington Common shows. <Click>She wanted to use a combination of manual and computational methods to explore our Digitised Newspapers, find out when and where the Chartists met and plot the meetings on a map, <Click>hopefully unearthing new history.
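As a rough illustration of the computational half of that approach, the Python sketch below pulls a place and a date out of newspaper-style sentences so that they could later be geocoded and plotted. It is not the Political Meetings Mapper code: the sample snippets, the `MEETING_PATTERN` expression and the `extract_meetings` helper are invented for this example, and real OCR text would need far more tolerant matching plus manual checking.

```python
import re
from datetime import datetime

# Hypothetical pattern: "meeting at <Place> on <day> <Month> <year>".
MEETING_PATTERN = re.compile(
    r"meeting at (?P<place>[A-Z][A-Za-z ]+?) on (?P<date>\d{1,2} [A-Z][a-z]+ \d{4})"
)


def extract_meetings(snippets):
    """Return (place, date) pairs for every snippet that names a meeting."""
    meetings = []
    for text in snippets:
        match = MEETING_PATTERN.search(text)
        if match is None:
            continue
        try:
            when = datetime.strptime(match.group("date"), "%d %B %Y").date()
        except ValueError:
            # The date looked plausible but did not parse, e.g. an OCR error
            # in the month name; leave it for manual review.
            continue
        meetings.append((match.group("place"), when))
    return meetings


# Invented sample text, not quotations from the newspapers.
SNIPPETS = [
    "A great meeting at Kennington Common on 10 April 1848 was announced.",
    "Corn prices rose sharply in the northern markets.",
    "Notice of a meeting at Market Square on 5 Jvne 1841.",
]

meetings = extract_meetings(SNIPPETS)
print(meetings)
```

The third snippet is skipped because the garbled month does not parse, which is the kind of case the manual half of the method would pick up.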
75 seconds
The work of Labs is really about a number of stories, stories about digital collections and about researchers wanting to ask fascinating research questions about them. Let’s now tell you a story about one collection and the intended and unintended consequences of working with it.
The Library digitised 65,000 17th to 19th century books from our collections a few years ago (around 2.7% of the physical total in that period). You can view them from our catalogue or read them on your <click>iPad via the Historical Books app developed by BiblioLabs. We also captured 22 million individual page images, along with full-text scans of these images, all of which contain an untold quantity of useful data such as names of people, places, historical events and dates.
So the question became then, what next? What can 65,000 books tell us?
The Mechanical Curator posts small illustrations taken almost at random from the digitised book corpus to a Tumblr blog.
This experiment with undirected engagement was a by-product of work to uncover the hidden wealth of illustrations within the digitised pages.
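The selection step of such a random poster can be sketched as follows. This is a guess at the general shape only, not the Mechanical Curator's actual code: the illustration records, the size threshold and the `pick_illustration` helper are hypothetical, and the step that would publish the chosen image to a blog is left out.

```python
import random

# Hypothetical records for illustrations snipped out of page images.
ILLUSTRATIONS = [
    {"book": "book_001", "page": 12, "width": 300, "height": 200},
    {"book": "book_001", "page": 87, "width": 1800, "height": 2400},
    {"book": "book_042", "page": 5, "width": 250, "height": 250},
    {"book": "book_317", "page": 140, "width": 400, "height": 150},
]


def pick_illustration(illustrations, max_area, rng):
    """Pick one small illustration at random, or None if there are none."""
    small = [
        item for item in illustrations
        if item["width"] * item["height"] <= max_area
    ]
    return rng.choice(small) if small else None


rng = random.Random(2013)  # seeded here only to make the sketch repeatable
choice = pick_illustration(ILLUSTRATIONS, max_area=100_000, rng=rng)
print(choice)
```

A scheduler calling this function every 30 minutes and posting the result would reproduce the behaviour described on the slide.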
18 Seconds (56 Words)
Indexing the BL 1 million & Mapping the Maps was led by James Heald in collaboration with others. <Click>They produced an index of 1 million 'Mechanical Curator collection' images on <Click>Wikimedia Commons from a collection of largely un-described images. <Click>This led to the discovery of 50,000 maps within the collection, partly through a map-tag-a-thon. <Click>These are now being geo-referenced. <Click>
27 Seconds (82 Words)
Adam Crymble <Click>wanted to harness the fun of playing games on arcade machines to help with crowdsourcing the tagging of un-described images. He particularly wanted to engage a younger audience in crowdsourcing. <Click>On the right you can see a replica 1980s arcade machine we built <Click>and on the bottom left some tagging games that were developed through a ‘Games Jam’ for the machine. <Click>Let’s take a closer look at two of the games…<Click>