Slides for presentation at the British Fashion Council, Teatum Jones and British Library event on Monday 23 October 2017, at 1415 BST, British Library Knowledge Centre, Bronte Room, London, NW1 2DB
3. 3
@mahendra_mahey @BL_Labs @BL_DigiSchol
mahendra.mahey@bl.uk & labs@bl.uk
Competition: Tell us your ideas of what to do with our digital content
Awards: Show us what you have already done with our digital content in research, artistic, commercial and learning and teaching categories
Projects: Talk to us about working on collaborative projects
4. 4
The British Library
Inside the British Library
Space for 1200 readers, around 400,000 visitors per year
Building 37 uses low oxygen and robots
Reading room and delivery to London
Document Supply and Storage at Boston Spa
Stockton-on-Tees
Authors’ right to payment each time their books are borrowed from public libraries.
St Pancras, London, UK
Many books are stored 4 storeys below the building
UK Legal Deposit Library – Reference only
5. 5
Collections – not just books!
> 180*million items
> 0.8* m serial titles
> 8* m stamps
> 14* m books
> 6* m sound recordings
> 4* m maps
> 1.6* m musical scores
> 0.3* m manuscripts
> 60* m patents
King’s Library *Estimates
7. 7
Born digital
Data all around us!
Knowledge Quarter London
80 knowledge organisations within 1 mile radius of
Kings Cross, http://www.knowledgequarter.london
http://www.turing.ac.uk (Headquartered at the British Library)
UK Web Archive and e-legal deposit (2013)
http://www.webarchive.org.uk/ukwa/
13. 13
Playbills, Books, Newspapers
Digital collections and Datasets
Book Records
http://bnb.data.bl.uk
http://sounds.bl.uk
http://dml.city.ac.uk/
Music (Recordings & Sheet) & Sounds
http://goo.gl/frSMJt
Broadcast News (TV and Radio)
http://goo.gl/cwThHw
http://goo.gl/pBkisZ
http://goo.gl/E8aRyQ
Data
Images, Manuscripts & Maps
http://www.qdl.qa/
Qatar Digital Library
http://idp.bl.uk/
International
Dunhuang
Project
Maps
http://www.bl.uk/maps/
Hebrew Manuscripts
http://goo.gl/4sbCp9
Flickr &
Wikimedia Commons
https://goo.gl/LZRmaZ
14. 14
Finding Open Cultural Heritage Datasets
Collection Guides (182 as of 23/10/17)
https://www.bl.uk/collection-guides/ (includes some digital)
Digital Collection Guides
(120 as of 23/10/17)
http://labs.bl.uk/Digital+Collections
Datasets
Datasets about our collections
Bibliographic datasets relating to our published and
archival holdings
Datasets for content mining
Content suitable for use in text and data mining research
Datasets for image analysis
Image collections suitable for large-scale image-analysis-based research
Datasets from UK Web Archive
Data and API services available for accessing UK Web
Archive
Digital mapping
Geospatial data, cartographic applications, digital aerial
photography and scanned historic map materials
https://data.bl.uk
Download collections as zip files
NOT all discoverable via Google!
16. 16
The Story of the Digital Collection…
Digital
Collection
Curator
Who paid for the digitisation?
Who did the digitisation?
Technology used
Born digital?
Published
Unpublished
Where is it?
Can it still be accessed?
Generates income
Reputational risk in using?
Legalities
Politics when digitised
Personalities involved
Surprises (e.g. gaps)
Descriptive information
Old format not supported
What media was the digitisation done from?
Is there any background documentation?
No Descriptive information
Inconsistent descriptive information
Still there?
Tip! It is good to know the background of a digital collection if you want to use it for research and draw conclusions…
18. 18
Coverage in the collection?
• Download entire list of books with descriptions
• https://goo.gl/HqPQMS (Excel Spreadsheet)
(Health warning over 65,000 rows!)
• 1789 to 1876
• Subjects include:
– Philosophy
– Poetry
– History
– Literature
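One way to check coverage is to count titles per decade. The following is a minimal sketch, assuming the spreadsheet has been exported to CSV and holds the publication year in one column. The column name `date` and the sample rows are hypothetical, not taken from the actual file.

```python
import csv
import io
import re
from collections import Counter

def titles_per_decade(csv_text, year_column="date"):
    """Count rows per decade, skipping rows with no recognisable year."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Catalogue dates are often messy (e.g. "[1795?]"), so pull out
        # the first four-digit year rather than parsing the whole field.
        match = re.search(r"\b(1[5-9]\d\d)\b", row.get(year_column) or "")
        if match:
            decade = int(match.group(1)) // 10 * 10
            counts[decade] += 1
    return dict(sorted(counts.items()))

# Hypothetical sample rows, for illustration only.
sample = "title,date\nA,1789\nB,1791\nC,[1795?]\nD,unknown\nE,1876\n"
print(titles_per_decade(sample))
```

Rows without a usable year are skipped silently here; counting them separately would show how much of the collection lacks a date.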
35. 35
One major problem!
•We haven’t described or identified them
•How will we find them later?
(We will come back to this later, it affects you!)
•How can we do that with 1 million images?
•Try a few experiments!
38. 38
The Mechanical Curator
Snipped image posted
almost randomly
every hour…
on a Tumblr blog
One of our early followers was…
Ben O’Steen, 30 September 2013
Has a slight ‘mood’…
once image published,
tries to find 8 similar images
e.g. ‘slanty’, ‘circular’ etc.
& then gets ‘bored’
follow…
@MechCuratorBot
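The behaviour described on this slide (a random pick, a run of up to 8 similar images, then ‘boredom’) can be sketched roughly as below. This is an illustration only, not the Mechanical Curator’s actual code. The single ‘mood’ label per image is a hypothetical stand-in for whatever visual similarity measure the real bot uses.

```python
import random

def curate(images, n_posts, run_length=8, seed=None):
    """Return a posting order: a random pick, then up to `run_length`
    images sharing its 'mood', then a fresh random pick ('bored')."""
    rng = random.Random(seed)
    remaining = dict(images)              # image id -> mood label
    posts = []
    while remaining and len(posts) < n_posts:
        first = rng.choice(sorted(remaining))
        mood = remaining.pop(first)
        posts.append(first)
        # Look for not-yet-posted images with the same mood.
        similar = [i for i in sorted(remaining) if remaining[i] == mood]
        rng.shuffle(similar)
        for image_id in similar[:run_length]:
            if len(posts) >= n_posts:
                break
            posts.append(image_id)
            del remaining[image_id]
    return posts

# Hypothetical image ids and mood labels.
images = {f"img{i}": ("slanty" if i % 2 else "circular") for i in range(30)}
print(curate(images, 12, seed=1))
```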
40. 40
British Library Flickr Commons
Why Flickr Commons?
• Free!
• Each image has its own unique web page, easy to share
• Can Tag images
• Has Application Programming Interface (API)
Late August 2013
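As an illustration of what the API allows, the sketch below builds a request URL for Flickr’s `flickr.photos.search` method, filtered by account and tag. It only constructs the URL and makes no network call. The API key and account id are placeholders you would need to supply yourself.

```python
from urllib.parse import urlencode

FLICKR_REST = "https://api.flickr.com/services/rest/"

def search_url(api_key, user_id, tags, per_page=10):
    """Build a flickr.photos.search request for one account's tagged photos."""
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,          # placeholder: your own Flickr API key
        "user_id": user_id,          # placeholder: the account's Flickr id
        "tags": ",".join(tags),      # Flickr expects a comma-delimited list
        "per_page": per_page,
        "format": "json",
        "nojsoncallback": 1,         # plain JSON rather than JSONP
    }
    return FLICKR_REST + "?" + urlencode(params)

print(search_url("YOUR_API_KEY", "ACCOUNT_ID", ["map"]))
```

Opening the resulting URL with a valid key should return a JSON list of matching photos, which is the starting point for computational tagging.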
41. 41
Worked better for female faces than men’s
Press
http://mechanicalcurator.tumblr.com
Posts image every 30 minutes
http://www.flickr.com/photos/britishlibrary/
1,020,418 images
need tagging!
Creative uses of images
Face recognition
Algorithms based on photos
Mechanical Curator
with an algorithmic brain
(Circles, Squares and Slanty etc)
http://goo.gl/qPPgxX
Wikimedia
Flickr Commons
Individual URL & API
Snipping out images
from 65,000 Digitised Books*
>600,000,000* views
>20,000,000* tags
https://goo.gl/FgZ4HM
Work @ BL by Ben O’Steen, Labs
and Digital Research Team
*Matt Prior - http://goo.gl/j29Tnx
Since Dec 2013
Tumblr
*Estimates
43. 43
Tagging a million images
Iterative Crowdsourcing
http://goo.gl/j6fxac
Cardiff University’s
Lost Visions Project
http://www.metadatagames.org/
Metadata Games
James Heald
Mario Klingemann
Chico 45
Use computational methods
Using Flickr API
Human Tagger
Top British Library Flickr Commons Taggers
18 hard core taggers
How to reward this ‘small group’ and keep it motivated?
Average for ‘crowd’ is 1 tag per person
What kind of ‘task’ can this ‘crowd’ do?
Mobile games for ‘Ships’, ‘Covers’ and ‘Portraits’ Interface for tagging
44. 44
Adam Crymble (2015)
Crowdsource Arcade
http://goo.gl/LBfJ4W
http://goo.gl/OH9pOZ
https://goo.gl/7z0j8p
30 mins talk
Labs Symposium (2015)
https://goo.gl/SSRsdd
5 min interview (2015)
http://goo.gl/0APpE8
Game Jam
Using Arcade Games
to help Tag images
‘Art Treachery’ and ‘Tag Attack’
45. 45
Special Jury’s Prize (2015)
James Heald – Wikimedia and Map work
https://goo.gl/WYZCB2
http://goo.gl/HNQq5e
https://goo.gl/VPgffL
https://commons.wikimedia.org/
https://goo.gl/djtm1b
Labs Symposium (2015)
Geotagging maps
54,000 Maps
Found in Flickr 1 million
Human & Computational Tagging
& Community engagement
Geo-referencing work
https://www.bl.uk/georeferencer
46. 46
SherlockNet: Competition Winner 2016
Karen Wang, Luda Zhao and Brian Do
Using Convolutional Neural Networks to Automatically Tag and Caption
the British Library Flickr Commons 1 million Image Collection
12 categories
>15 million tags added
>100,000 captions
bit.ly/sherlocknet
Pooled surrounding
OCR text on page
from similar images
Used Microsoft COCO (photographs) &
British Museum Prints and Drawings
collections as training sets.
Tags Captions
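The idea of pooling OCR text from similar images can be sketched as below. This is a simplified illustration, not SherlockNet’s implementation. It assumes the similar images have already been found by some other step (for example neural network features) and that the OCR text near each one is available as a string. The stopword list and sample texts are hypothetical.

```python
from collections import Counter

STOPWORDS = {"the", "of", "and", "a", "in", "to"}

def pooled_tags(neighbour_texts, top_n=3):
    """Pool the OCR text found near similar images and return the
    most frequent non-stopword terms as candidate tags."""
    counts = Counter()
    for text in neighbour_texts:
        # Use a set so each similar image votes once per word.
        words = {w.strip(".,;:!?").lower() for w in text.split()}
        counts.update(w for w in words if w and w not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]

# Hypothetical OCR snippets from pages around three similar images.
texts = [
    "The church of St Mary",
    "A view of the old church",
    "The church and the river",
]
print(pooled_tags(texts, top_n=1))
```

Words that recur across many similar images are more likely to describe what those images have in common than words from any single page.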
47. 47
Artistic / Creative Works
http://goo.gl/dM8ieA
Mario Klingemann (2015)
Code Artist / Curator
https://www.youtube.com/watch?v=Q3SBxO34Zlc
David Normal 2014 and 2015
Collages/Paintings & Lightboxes
http://goo.gl/bNxGZZ
Kris Hoffman (2016)
Animation for Fashion Week 2016
https://goo.gl/QilqqT
Jiayi Chong 2016 - Animation tool
https://www.facebook.com/RealmlandStory/
Paul Rand Pierce 2016
Graphic Novel on Facebook
A Hat on the Ground Spells trouble
Tragic Looking Women
44 Men who Look 44
(Notice the direction the faces look)
48. 48
Hey there Young Sailor!
Ling Low 2016 – Hey there Young Sailor
https://www.youtube.com/watch?v=bcOP1E5bRE0
VIMEO.COM/SWEETANDLOWFILMS
@SWEETNLOWFILMS ON INSTAGRAM
@SWEETNLOWLING ON TWITTER
The Impatient Sisters
49. 49
Imaginary Cities – BL Labs Project 16-17
Michael Takeo Magruder
https://goo.gl/4ARwTy
An artistic exploration seeking to create provocative fictional cityscapes for the Information Age
from the British Library’s digital collection of historic urban maps
52. 52
British Library Flickr Commons
https://www.flickr.com/photos/britishlibrary/
Kind of a precursor to Instagram
Used for photo sharing
Flickr Commons has items from
Galleries, Libraries, Archives and Museums (GLAM)
(Mostly Public Domain)
54. 54
Getting an account on Flickr
•Get a Flickr / Yahoo account
(https://login.yahoo.com/account/create)
•You can then tag, organise favourites, make
your own albums and galleries from Flickr
images online or uploaded
•You get 1TB for free!
•You could reference your own Flickr account
for the competition?
58. 58
Flickr Albums
Curated by the British Library – specifically Nora McGregor
She works with the public to add images or create new ones!
440 Albums as of 23/10/17 – Mostly Maps!
140 seconds
The British Library is the national library of the UK and one of the largest research libraries in the world. The Library moved to a new purpose-built building in 1997 <click>, the largest of its kind built in the UK in the 20th century. Many frequently used items are stored 5 storeys below the main building at St Pancras in London, and many might not know that part of the building is meant to look like a ship on a journey of discovery! <click> <click to switch off>
The building can seat 1,200 researchers at any one time across 5 reading rooms.
<click>Medium- and long-term requested items are held at Boston Spa in Yorkshire in a low-oxygen warehouse, where robots retrieve items. In total, the library has 625 km of shelving, growing by 12 km every year.
Whilst we acquire items through purchase or gifts, much of the collection has been built up through legal deposit. That is, by law, a copy of every UK and Ireland print publication must be given to the British Library by its publishers. Around 3 million items are added per year. In 2013, legal deposit was extended to cover non-print material, so by law we now take in digitally published items as well. That means regular mass crawls of the entire UK web domain as well as ebooks, ejournals etc.
85 seconds
The picture you can see is inside the main building in London. It’s the King’s Library, King George the Third’s personal library! It is sometimes known as the ‘stack’. I walk past it every day and I sometimes forget that the collections the British Library has are truly staggering! We currently estimate them to exceed <click>150 million items, representing every age of written civilisation and every known language. Our archives now contain the earliest surviving printed book in the world, the Diamond Sutra, written in Chinese and dating from 868 AD…
So some big numbers…
Over …<click>14 million books
<click>60 million patents
<click>8 million stamps
<click>4 million maps
<click>3 million sound recordings
<click>1.6 million music scores
<click>over 0.3 million manuscripts
<click>0.8 million serial titles (which are of course made up of many, many volumes/editions); this is where a lot of our content is, just in case you thought the numbers didn’t add up!
6 Seconds (20 Words)
So <Click> ‘how’ do we try and engage those who might be interested in the BL’s digital collections and data? <Click>
17 Seconds (53 Words)
<Click>The British Library is one of the largest libraries in the world <Click> with an estimated 180 million physical items, of which only a small proportion has been digitised. <Click>We estimate this at around 1-2%, but no one really knows exactly how much. However, more and more items are being stored as ‘born digital’, such as the UK Web Archive<Click>
Have balance of Multimedia
Broadcast news and radio, sounds (Save our Sounds)
Books and newspapers
Images
BNB
Qatar Digital library
Hebrew manuscripts
Watch out for the gunner and skunk as they will make an appearance again!
Posts small illustrations taken almost at random from the digitised book corpus to a Tumblr blog.
This experiment with undirected engagement was a by-product of work to uncover the hidden wealth of illustrations within the digitised pages.
27 Seconds (82 Words)
Adam Crymble <Click>wanted to harness the fun of playing games on arcade machines to help crowdsource the tagging of undescribed images. He particularly wanted to engage a younger audience in crowdsourcing. <Click>On the right you can see a replica 1980s arcade machine we built <Click>and on the bottom left some tagging games that were developed for the machine through a ‘Game Jam’. <Click> Let’s take a closer look at two of the games…<Click>
18 Seconds (56 Words)
Indexing the BL 1 million & Mapping the Maps was led by James Heald in collaboration with others. <Click>They produced an index of the 1 million 'Mechanical Curator collection' images on <Click>Wikimedia Commons from a collection of largely undescribed images. <Click>This led to finding 50,000 maps within the collection, partly through a map tag-a-thon. <Click>These are now being geo-referenced. <Click>
26 Seconds (78 Words)
Dina Malkova was the winner of the Commercial category. <Click>Inspired by a small digitised fragment of an <Click>illustration from the original handwritten manuscript of Alice’s Adventures Under Ground, <Click>Dina made handmade and bespoke bow ties and cufflinks. <Click>You can still buy these items in the Alice pop-up shop in London and of course online on Etsy.