1. http://labs.bl.uk @BL_Labs @DH_OU #bldigital labs@bl.uk
http://www.bl.uk/projects/british-library-labs
29th February 2016 – BL Labs Roadshow 2016
Presentation at the Open University, Milton Keynes
Funded by the Andrew W. Mellon Foundation
4.
Digital research methods
http://labs.bl.uk/Launch+Event (has some examples from researchers)
Corpus analysis tools
Text Mining
Visualisations
Location based searching
Geotagging
Annotation
Natural Language Processing
Using Application Programming Interfaces for datasets e.g. Metadata, Images
Transcribing
Crowdsourcing / Human Computation
5.
One of the Largest Libraries in the World
180 million* items
1-2%* digitised
* estimate
6.
http://www.bl.uk/subjects/digital-scholarship
http://labs.bl.uk/Digital+Collections
Soon… http://data.bl.uk
Mini Network Attached Storage (NAS) device guide:
http://goo.gl/E8aRyQ
In 20 years’ time…
8.
•Submit ideas by 11 April 2016.
•Two finalists announced late May 2016.
•Residency June – October 2016.
•Up to £3600 in financial support, plus technical, curatorial and other support.
•Showcase @ Symposium, Monday 7 Nov 16.
•Winner £3000 & Runner up £1000!
Competition
9.
•Those who have already been using our digital
content in interesting and innovative ways.
•Submit projects (previous and new) by
5 September 2016.
•Artistic, Commercial, Research and Learning /
Teaching categories.
•Winners announced @Symposium 7 Nov 16.
•£500 Winner & £100 Runner Up.
Awards
10.
Projects & Ideas
•Ideas change once you try to access, examine
and use the data!
•Talk to us about working on potential ideas /
projects.
11.
2013
Pieter Francois
Dan Norton
2014
Desmond Schmidt
Bob Nicholson
2015
Katrina Navickas
Adam Crymble
Dina Malkova
Mario Klingemann
Spatial Humanities Project at
Lancaster University
James Heald
Who and Why?
Please refer to the Winners’ Handout
14.
Chartists Un-Covered
History in London
Chartists’ London Walking Tour, 21 Sep 15
The Red Lion Pub, Soho: Chartists Re-enactment
Chartists’ Meeting Locations in London
Chartists’ Meetings Heatmap in London
16.
Adam Crymble (2015)
Crowdsource Arcade
What if crowd sourcing
looked like this?
http://goo.gl/LBfJ4W
Game Jam - http://goo.gl/OH9pOZ
30 mins talk
Labs Symposium (2015)
https://goo.gl/7z0j8p
5 min interview (2015)
https://goo.gl/SSRsdd
http://goo.gl/0APpE8
17.
Art Treachery – Januz Druz
Tag Attack – Antonio Padial
Art Treachery and Tag Attack
21.
http://www.lancaster.ac.uk/fass/projects/spatialhum.wordpress/
Labs Symposium 2015: https://goo.gl/ZCU56a
Research (2015)
Spatial Humanities: Lancaster University
Combining Text and
Geographic Information
http://goo.gl/yZ3xCJ
Investigating geographical representation of disease in digitised 19th Century newspapers
22.
Special Jury’s Prize (2015)
James Heald – Wikimedia and Map work
https://goo.gl/WYZCB2
http://goo.gl/HNQq5e
https://goo.gl/VPgffL
https://commons.wikimedia.org/
Labs Symposium (2015)
https://goo.gl/djtm1b
26.
What can 65,000
books tell us?
Image: Artwork by Alicia Martin
Just one digital collection
27.
Snipping out images
from 65,000 Digitised Books
Face recognition
Mechanical Curator
Flickr
Worked better for female faces
than men’s
Press
http://mechanicalcurator.tumblr.com
>380,000,000 views
> 500,000 tags
http://www.flickr.com/photos/britishlibrary/
1,020,418 images
need tagging!
Creative uses of images
http://goo.gl/qPPgxX
28.
Tagging a million images
Iterative Crowdsourcing
http://www.metadatagames.org/ http://goo.gl/j6fxac
Cardiff University’s
Lost Visions Project
James Heald
Mario Klingemann
Chico 45
Use computational methods
Human Tagger
Top British Library Flickr Commons Taggers
30.
Finding one image
on Flickr
Finding many more!
Make collages
Make 4 paintings
Exhibit light boxes at Burning Man 2014 in Nevada, USA
Work with Labs & British Library to install light boxes in London in 2015
37.
Language Problems
Library uses different terms than
researchers use
•Access
•Collections
•Content
38.
Data Problems [stuff goes in here]
•Metadata isn’t as clean as many think it is
•Many metadata records have square brackets to indicate inferred information
•Code to plot when a record had square brackets by Ben O’Steen
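A minimal sketch of how the square-bracket check could look. This is not Ben O’Steen’s code: the sample records, the `RECORDS` layout and the function names are all assumptions made for illustration. It counts, per publication year, how many titles contain bracketed (inferred) text and renders the counts as a plain-text bar chart.

```python
import re
from collections import Counter

# Hypothetical sample records: (publication year, title as catalogued).
# Real catalogue exports will have a different shape.
RECORDS = [
    (1850, "A History of [London]"),
    (1850, "Poems"),
    (1851, "[Travels in the East]"),
    (1851, "Sermons [by a country clergyman]"),
    (1852, "Letters"),
]

# Matches any text enclosed in square brackets, e.g. "[London]".
BRACKETED = re.compile(r"\[[^\]]*\]")


def bracket_counts_by_year(records):
    """Return {year: (records_with_brackets, total_records)}, sorted by year."""
    totals = Counter()
    bracketed = Counter()
    for year, title in records:
        totals[year] += 1
        if BRACKETED.search(title):
            bracketed[year] += 1
    return {year: (bracketed[year], totals[year]) for year in sorted(totals)}


def text_plot(counts):
    """Render one line per year: the year, a bar of '#', then hits/total."""
    return [
        f"{year} {'#' * hits} {hits}/{total}"
        for year, (hits, total) in counts.items()
    ]


if __name__ == "__main__":
    for line in text_plot(bracket_counts_by_year(RECORDS)):
        print(line)
```

The same counts could be passed to a plotting library instead of `text_plot`; the text version just keeps the sketch free of dependencies.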
41.
Training / Teaming up with Experts?
•Many researchers lack the technical skills to use
Digital Research Methods but have the domain
knowledge
•Should support be more focused on training?
•There are plenty of computational experts looking
for problems to solve
•Should they be teamed up with those who have problems that need solving?
45.
Supporting
Digital Experiments better
BL Labs GitHub Site
Re-OCRing Newspapers
Flickr API
BL Explore – Search Catalogue
Python Code
46.
Lessons…
•Huge appetite to use BL digital content & data
(see Flickr Commons stats later).
•Identifying / bridging gaps for researchers to
use BL data.
•Labs can help researchers navigate through
the Library to get the data they want.
47.
Perfection vs Imperfection
•If we focus too much on perfection
we will never get anything done!
•Fear of failure seen as a negative
thing.
•Just don’t be scared to try
experiments!
48.
Jimmy Wales, Founder of Wikipedia
Fail faster
Small experiments!
https://goo.gl/Vlv3Yu
Let yourself reboot!
49.
Finally…
•Examine and use our data, and talk to us about your ideas and projects!
•Consider entering the Competition and
Awards!
•You never know, you might…
51.
Accessing the Mini NAS
Mini Network Attached Storage (NAS) device guide:
http://goo.gl/E8aRyQ
Find the ‘opendata’ wireless
access point and join it.
The passphrase is ‘opendata’
Accessing data folders
Username: guest
Password: guest
Or use ftp://10.0.0.1/
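For anyone who prefers a script to a file browser, the FTP route above can be sketched with Python’s standard `ftplib`. This is a sketch under assumptions: it presumes your machine has already joined the ‘opendata’ access point, that the guest login shown above also applies to FTP, and the function name and `ftp_factory` parameter are hypothetical (the latter exists only so the function can be exercised without a real NAS).

```python
from ftplib import FTP


def list_nas_folders(host="10.0.0.1", user="guest", password="guest",
                     timeout=10, ftp_factory=FTP):
    """Log in to the NAS over FTP and return the names in its top-level directory.

    ftp_factory is a hypothetical hook for testing; by default the real
    ftplib.FTP class is used, which connects as soon as it is given a host.
    """
    ftp = ftp_factory(host, timeout=timeout)
    try:
        ftp.login(user=user, passwd=password)
        return ftp.nlst()
    finally:
        # Always close the control connection, even if login or listing fails.
        ftp.quit()


if __name__ == "__main__":
    for name in list_nas_folders():
        print(name)
```

Run as a script while connected to the NAS network, it prints whatever folder names the device reports.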
53.
What’s on the Mini NAS?
• British National Bibliography – 3.5 million records (2 GB)
• 90,000 Playbills – 1602–1902 (60 GB)
• ALTO XML (includes METS and MODS) for OCR of 65,000 volumes, 22 million pages mostly from the 19th Century (909 GB)
• 1 million images snipped from books put on Flickr (667 GB)
• 70,000 tagged images (170 GB)
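To give a feel for the ALTO XML files, here is a minimal sketch of pulling the OCR text out of one page. ALTO stores each recognised word in the `CONTENT` attribute of a `String` element inside a `TextLine`. The sample page below is invented for illustration, and the namespace differs between ALTO versions, so the sketch matches on local tag names rather than a fixed namespace. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented sample page; real files on the NAS are far larger and may use
# a different ALTO namespace version.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine>
      <String CONTENT="The"/><SP/><String CONTENT="Chartists"/>
    </TextLine>
    <TextLine>
      <String CONTENT="met"/><SP/><String CONTENT="here."/>
    </TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def alto_lines(xml_text):
    """Return the OCR text of an ALTO document as a list of lines."""
    root = ET.fromstring(xml_text)
    lines = []
    for element in root.iter():
        if local_name(element.tag) == "TextLine":
            words = [
                child.get("CONTENT", "")
                for child in element
                if local_name(child.tag) == "String"
            ]
            lines.append(" ".join(words))
    return lines


if __name__ == "__main__":
    print("\n".join(alto_lines(SAMPLE_ALTO)))
```

For a file on disk, read it first (for example with `pathlib.Path(path).read_text(encoding="utf-8")`) and pass the text to `alto_lines`.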
55.
Ideas Lab
•Get into groups of 2-6
•Read the instructions in the Ideas Lab Pack
•We are around to help and advise.
•Enjoy it and have fun!
25 Seconds (68 Words)
My name is Mahendra Mahey and I work on a project called British Library Labs. We are based at the British Library in London, in the Digital Scholarship department, and we work closely with the Digital Research team there; Aquiles Alencar Brayner from that team is here today. The project has been running for three years now and is funded by the Andrew W. Mellon Foundation.
33 Seconds (100 Words)
In a nutshell the project encourages researchers, artists, entrepreneurs, educators and anyone else,
<Click>
to ‘experiment’ with our digital collections and data. We are particularly interested in those who have questions which focus on the potential to find and create NEW things through access to the digital content. For example, computational techniques make it possible to ask a question across thousands of digitised books or newspapers, which would not be feasible using manual methods. Let’s look at a clear example.
<Click>
Adam Crymble was doing his PhD research on Distant Reading at King’s College. He won the PhD Comics competition to explain his thesis in 2 minutes.
Examples like this will hopefully INSPIRE YOU to use the British Library’s digital content in some way in your work by showing some of what others have done.
17 Seconds (53 Words)
<Click>The British Library is one of the largest libraries in the world <Click> with an estimated 180 million physical items, of which only a small proportion has been digitised. <Click>We estimate this is around 1-2%, but no one really knows exactly how much. However, more and more items are being stored as ‘born digital’, such as the UK Web Archive.<Click>
35 Seconds overall
We have created collection guides detailing some of these digital collections <Click>on our Digital Scholarship site.
<Click>and some on the Labs site.
<Click> Soon data.bl.uk will be the place where people can directly access some of the digital collections we have available.
<Click> Today we have brought data with us; see the guide on how to access it and the printouts on the tables.
A pause for thought and reflection, however: digital is just a current technology to deliver information. Perhaps in years to come <Click>we won’t be using the word ‘Digital’ <Click>in front of the word ‘Scholarship’. It will just be ‘Scholarship’; digital technology will be part of the EVERYDAY process of research. Anyway, back to the present.
6 Seconds (20 Words)
So <Click> ‘how’ do we try and engage those who might be interested in the BL’s digital collections and data? <Click>
41 Seconds (123 Words)
One way is by running an annual competition which is open to the world! All you have to do is
<Click>submit an idea by 11 April 2016.
<Click>The two finalists will be announced in late May <Click>and they will work with us in residence between June and October,
<Click>where they will get up to £3600 financial support, together with technical, curatorial and other types of support.
<Click>The winners will showcase their work and receive their prizes at our symposium on Monday 7th of November.
<Click>£3000 will be awarded to the winner and £1000 for the runner up.
15 Seconds (45 Words)
The next way we try to engage those interested in using our digital content is through our Awards,
<Click>these recognise work already carried out using our digital content. <Click>The deadline for this year is 5 September. You can submit previous and new projects <Click>in one of four categories: Artistic, Commercial, Research and Teaching & Learning. <Click> Winners will be announced on Monday 7th of November
<Click> where each category winner will receive £500, with £100 for the runners-up.
8 Seconds (24 Words)
The final way to engage with our digital collections and data is to simply examine and use our data. We have learnt that ideas usually change once people have done this. Talk to us about projects or ideas you would like to work on, whether it’s for the competition, awards or something else.<Click>
21 Seconds (63 Words) (LEAVE as Automatic)
The library is learning WHO wants to use our digital content and most importantly WHY? What you can see are just the winners of our competition and awards for the last 3 years. There are so many more people who have been engaging with the Labs. I will give a flavour of some of the work carried out and later we will talk about this engagement in more detail. <Click>
7 Seconds (21 Words)
So focusing back on the competition, let’s look at a few examples.
21 Seconds (65 Words)
Katrina Navickas was particularly interested in the <Click>Chartist Movement, a group campaigning for the vote for working people. <Click>They were the biggest popular movement for democracy in 19th century British history, as this early picture of a huge ‘monster meeting’ at Kennington Common shows. <Click>She wanted to use a combination of manual and computational methods to explore our Digitised Newspapers to find out when and where they met and plot the meetings on a map, <Click>hopefully unearthing new history.
33 Seconds (101 Words)
Katrina’s previous research has primarily focussed on the North of England. <Click>She was surprised to learn that many Chartists’ meetings were held in London. See the map of Chartists’ meetings in London and the heat map of where they met most. <Click>The map at the bottom left shows the route of a walking tour she organised in London visiting sites where the Chartists met. <Click>The photo at the bottom right is a historical re-enactment of a Chartist meeting that took place in the Red Lion pub. Let’s take a peek at what happened…
54 Seconds
Video from https://www.youtube.com/watch?v=0lx0CL_dsQs
From 2.13 – 3.07 – 54 seconds
<Click>
From 1.18 – 3.09 – Longer clip – 1 minute 51 seconds
27 Seconds (82 Words)
Adam Crymble <Click>wanted to harness the power of playing fun games on arcade machines to help with crowdsourcing the tagging of un-described images. He particularly wanted to engage a younger audience in crowdsourcing. <Click>On the right you can see a replica 1980s arcade machine we built <Click>and on the bottom left some tagging games that were developed for the machine through a ‘Game Jam’. <Click>Let’s take a closer look at two of the games… <Click>
79 Seconds Video Clip
We are close to installing the machine at the National Video Arcade in Nottingham to see how successful the games will be. If you’re interested in having the machine in your institution, please contact us.
https://www.youtube.com/watch?v=xoCgHo2rwN4 (Switch on Subtitles)
1.47 – 3.06 1 min 19 seconds
<Click>
From 1.47 to 2.28 – Art Treachery – 41 seconds
From 2.28 to 3.06 – Tag Attack – 38 seconds
Total for both clips – 1 min 19 seconds
9 Seconds (25 Words)
Now on to our Awards, these recognise work *already* carried out using our digital content. Last year’s categories were Artistic, Entrepreneurial and Research. <Click>
37 Seconds (112 Words)
The artistic winner was Mario Klingemann, otherwise known as ‘Quasimondo’. He tries to use computers to generate art or do clever and interesting things such as finding images. He worked a lot with a collection of un-described images largely from the 19th Century. <Click> Here you can see a picture of 44 men, found algorithmically, who look around 44. <Click>Notice how the eyes of the faces change from left to right. <Click>Bottom left is an attempt to use code to find images of <Click> ‘Tragic looking women’ and <Click>top right is an attempt to create computer art by snipping bits of images together computationally.
26 Seconds (78 Words)
Dina Malkova was the winner of the Commercial category. <Click>Inspired by a small digitised fragment of an <Click>illustration from the original handwritten manuscript of Alice’s Adventures Under Ground, <Click>Dina made handmade and bespoke bow ties and cufflinks. <Click>You can still buy these items in the Alice pop up shop in London and of course online on Etsy.
12 Seconds (37 Words)
<Click>The research winners were a Spatial Humanities group of researchers from Lancaster University <Click>who focussed on analysing digitised newspapers to establish when and where diseases were mentioned in the Victorian Era and <Click>plotting them on a map to look for patterns.<Click><Click>
18 Seconds (56 Words)
Indexing the BL 1 million & Mapping the Maps was led by James Heald in collaboration with others. <Click>They produced an index of 1 million 'Mechanical Curator collection' images on <Click>Wikimedia Commons from a collection of largely un-described images. <Click>This led to the discovery of 50,000 maps within the collection, partly through a map-tag-a-thon. <Click>These are now being geo-referenced. <Click>
75 seconds
The work of Labs is really about a number of stories, stories about digital collections and about researchers wanting to ask fascinating research questions about them. Let’s now tell you a story about one collection and the intended and unintended consequences of working with it.
The Library digitised 65,000 17th to 19th century books from our collections a few years ago (around 2.7% of the physical total in that period). You can view them from our catalogue or read them on your <click>iPad via the Historical Books app developed by BiblioLabs. We also captured 22 million individual page images, along with full text scans of these images, all of which contain an untold quantity of useful data such as names of people, places, historical events and dates.
So the question became then, what next? What can 65,000 books tell us?
The Mechanical Curator posts small illustrations taken almost at random from the digitised book corpus to a Tumblr blog.
This experiment with undirected engagement was a by-product of work to uncover the hidden wealth of illustrations within the digitised pages.
Play from 4m 50 seconds to 5m 19 seconds
6 Seconds (18 Words)
Just to inspire you, I couldn’t resist showing you the animating of some British Library images, using Creature Software by Kestrel Moon, developed by a former PIXAR animator.
Let’s look at the finished work!
16 Seconds Video Clip
https://goo.gl/QilqqT
1.27- 1.43 – 16 seconds
85 seconds
<click>The British Library faces many challenges in providing access to our digital collections!
<click> Sometimes digital content is only available onsite due to license restrictions,
<click>or even only on a specific computer in a reading room! Technically there are very few reasons why digital content can’t be online
<click> though it might be too big or hasn’t been transferred from other digital storage media.
<click>Sometimes access is through a paywall. Finally,
<click>some content is in the happy sunny place, online, open and freely available.
The real reasons why there are challenges to accessing digital content are of course human. They require different approaches from the Library and may often involve an honest, open dialogue and negotiation with the publishers.
The Labs project has tried to address this problem by creating a ‘residency model’ for researchers to work intensively with a digital collection on-site, so as not to infringe access conditions. I will say more about this later.
5 Seconds (15 Words)
<Click> Why is the Library doing this? Well there are many reasons, but essentially it is about…
20 Seconds (62 Words)
<Click>Labs is learning important lessons on how we are supporting researchers who want to experiment with our digital content using digital methods. <Click>We are learning what we are doing right.<Click>Understanding what researchers want, <Click>learning if we provide the appropriate services, tools and resources to support them. Trying to understand where the gaps are <Click>and what we should be doing in the future. <Click>
28 Seconds (86 Words)
We have learned many lessons. I will touch on a few briefly here. <Click>There is a tremendous appetite from researchers, artists, entrepreneurs and others who want to use our digital content/data (see our Flickr Commons Image statistics later). <Click> We are identifying and bridging gaps for researchers to access BL data <Click>and helping researchers navigate through the Library’s systems and processes to get to it. <Click>At our first roadshow, student Alison Pope suggested that BL Labs acts like a human API (or access point) connecting people to the BL’s digital data.
20 Seconds (62 Words)
The Labs is a place where we do many small experiments quickly. Most importantly it’s where it’s OK to make mistakes and learn from them. Fail faster and fail better! Perhaps the advice of Jimmy Wales, founder of Wikipedia, can sum up what we have learned time and time again.<Click>
40 Seconds
Video Clip
http://www.bbc.co.uk/news/business-34808495
20 Seconds (61 Words)
<Click>Examine and use our data and talk to us about your ideas and projects.<Click>Consider entering the Competition and Awards <Click>You never know YOU might….
9 Seconds (28 Words)
A tweet from Professor Melissa Terras from University College London. <Click>