Crowdsourcing your cultural heritage collections: considerations when choosing a platform


This talk was given at the Visual Resources Association conference on March 13, 2015. The moderator was Trish Rose-Sandler, and the speakers were Robert Guralnick, Gaurav Vaidya, and Trish Rose-Sandler. Notes from the talk are visible when downloaded.


Crowdsourcing your cultural heritage
collections: considerations when
choosing a platform
Robert Guralnick, University of Florida
Gaurav Vaidya, University of Colorado, Boulder
Trish Rose-Sandler, Missouri Botanical Garden
Image credit: Opensourceway
https://www.flickr.com/photos/opensourceway/4370250237/
March 13 2015 Visual Resources Association conference

www.metadatagames.org

Factors to consider when choosing a platform
• Fit for purpose
• Size of user community
• Copyright restrictions
• Analytics
• User engagement
• System interoperability

Flickr as both an image sharing
and crowdsourcing platform: the
Biodiversity Heritage Library
experience
Trish Rose-Sandler, Missouri Botanical Garden

BHL portal

BHL Book Viewer

BHL Crowdsourcing of image descriptions
BHL images available since 2011
BHL images available since 2011
BHL images available since last week!

BHL’s latest crowdsourcing venture

Science Gossip UI

Flickr basics
• Image hosting site created in 2004
• Acquired by Yahoo in 2005
• 87 million registered members and 3.5 million new images uploaded daily (Mar ‘13)
• Hosts 6 billion images (Aug ‘11)
http://en.wikipedia.org/wiki/Flickr

BHL Flickr stream

Internet Archive Book Images stream

How to get the word out?

Flickr machine tags
BHL asked folks to tag scientific and common names as machine tags
Takes the form namespace:predicate=value
Examples
taxonomy:binomial=Aegotheles savesi
taxonomy:common=owl

Flickr machine tags: searching and re-use

Getting data out of Flickr
Via APIs
• Flickr limits API calls to 3600 images per hour
• 24 hrs to extract tags for 90k images
• Can use multiple API keys to get around Flickr limitations
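As a rough check on the figures in this slide, the sketch below estimates how long an export takes under an hourly request limit. It assumes one API call per image and that additional API keys scale throughput linearly; both are simplifying assumptions, and the function name `extraction_hours` is purely illustrative.

```python
def extraction_hours(n_images, calls_per_hour=3600, n_keys=1):
    """Estimate the hours needed to fetch tags for n_images.

    Assumes one API call per image and that each extra key adds a
    full calls_per_hour of capacity (an assumption, not a Flickr guarantee).
    """
    if n_images < 0 or calls_per_hour <= 0 or n_keys <= 0:
        raise ValueError("inputs must be positive")
    return n_images / (calls_per_hour * n_keys)


# 90k images with a single key at 3600 calls per hour.
print(extraction_hours(90_000))
# The same export spread over four hypothetical keys.
print(extraction_hours(90_000, n_keys=4))
```

With a single key the estimate comes out at 25 hours, which is in line with the roughly 24 hrs reported on the slide.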

Flickr: success or failure for crowdsourcing?
Google Analytics
18% of images in BHL Flickr stream have at least 1 tag added by users
Total views = 88 million views of BHL images in last 4 yrs
Science Gossip since Mar 6 2015: 140k images classified by 330 users. Huge success!

Editor's notes

Crowdsourcing as a method for gathering and transcribing information about cultural heritage objects has been used effectively over the past decade to improve access to these collections. Large numbers of the public can be harnessed to accomplish a task too large for institutional staff to complete. It's also a great way to build deeper connections to the content we produce, by letting the public generate conversations around the content and make it more accessible.
I borrowed this image from opensource.com, from an article on their site by Chris Grams called “2 reasons why the term ‘crowdsourcing’ bugs me” (http://opensource.com/business/10/1/2-reasons-why-term-crowdsourcing-bugs-me?sc_cid=70160000000IDmjAAG). I like the image because it shows that there are two mindsets when it comes to crowdsourcing. The first is what Grams calls the “manufacturing mindset”, or factory model, where the goal of crowdsourcing is simply to get a task done cheaper and faster. The illustration on the left side reflects that mindset: the crowd is simply there to serve a single individual or organization. The second he describes as a meritocracy, “where the best ideas win”, and less of a socialist or collective approach. I would agree with him that some crowdsourcing projects use peer-to-peer verification, where an idea is not accepted until a consensus forms. Yet others take the collective approach, where every idea is considered valid. I think the illustration on the right is more reflective of this mindset and demonstrates how everyone benefits from and feeds off the ideas of others in a collective crowdsourcing project.
There are lots of crowdsourcing applications out there that tackle all sorts of problems, from fundraising to folding protein strings. For this talk we'll focus on those that are designed for sharing and describing images and that have been used fairly widely in the cultural heritage community. Some examples include Flickr, Wikimedia Commons, Zooniverse, and Metadata Games.
Flickr was set up as an image sharing platform in 2004, and Flickr Commons (FC) was developed in 2008 to “provide access to publicly held photo archives”. The Library of Congress (LC) was the first cultural heritage institution to join FC, and many others have followed suit, such as the US National Archives, the Powerhouse Museum, and the New York Public Library (NYPL). Most of these institutions have posted small numbers of images (LC 23k, NYPL 2,500, National Archives 12k). The British Library came along in Dec 2013 and uploaded over 1 million images to FC, one of the largest batches of images added to FC since it began. The Internet Archive followed a year later with over 2.5 million images.
Wikimedia Commons (WC) was launched in Sept 2004 to create a central repository for media files, so that they could be uploaded once and then referenced as many times as needed in different Wikimedia projects. Many galleries, libraries, archives and museums (GLAMs) wanted to bulk upload their content to WC, but there was no tool to do it. The GLAMwikiToolset project was a European initiative to make that easier. With this bulk upload tool, many GLAMs are now sharing their content in WC.
Zooniverse: what began in 2007 as a science-focused crowdsourcing platform for one project, Galaxy Zoo, now has over two dozen projects covering a wide variety of domains, including climate, nature, and humanities.
Metadata Games is a site designed by a game design department at Dartmouth called Tiltfactor. It combines crowdsourcing of metadata with gaming and is used with over 44 collections represented at 10 institutions, including the British Library, Boston Public Library, and DPLA.
Fit for purpose: In other words, what is the platform primarily designed to do? Both Zooniverse and Metadata Games were designed first and foremost as crowdsourcing applications and therefore have functionality better suited to this purpose, e.g. “peer to peer verification” [http://www.tiltfactor.org/wp-content/uploads2/tiltfactor_citizenArchivistsAtPlay_digra2013.pdf]. This is where each image has more than one user looking at and tagging it, which then allows a score to be assigned to a word or phrase to assess its accuracy. This can help with quality control, which is a concern for some organizations venturing into using the public as catalogers.
Size of user community: It's important to consider the size of the community using the platform. What type of exposure will your content get, and how many people could potentially want to tag your content? As of 2014, Zooniverse had 1 million registered volunteers. Wikipedia is the 6th most visited website in the world. As of 2013 Flickr had 87 million registered members, and as of August 2011 the site reported that it was hosting more than 6 billion images.
Copyright restrictions: Does the application impose any copyright restrictions? WC only accepts images that are either in the public domain or to which you, as the copyright holder, are willing to apply a Creative Commons license without any commercial restrictions. This is because one of the core values of Wikimedia is that content shared there should be free to use, reuse, and change without permission.
Analytics: Does the site provide any stats about folks interacting with your data? E.g. number of images tagged, number of tags per image, and when folks are tagging most (month, day, year).
User engagement: Does the platform have tools for users to interact with the data and for you to interact with the volunteers? This is where a platform like Zooniverse really stands out: it has a blog and talk pages. Incentives: give volunteers feedback on how many items they've classified (some projects have badges).
System interoperability: data input (bulk uploads), data output (exports, querying APIs).
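The peer-to-peer verification idea mentioned under fit for purpose can be sketched in a few lines. This is only an illustration of agreement-based scoring under simple assumptions (a tag's score is the share of users who supplied it, and the 0.5 threshold is arbitrary); it is not how Zooniverse or Metadata Games actually compute scores, and the function names are hypothetical.

```python
from collections import Counter


def score_tags(tags_by_user):
    """Return {normalized tag: share of users who supplied it} for one image."""
    counts = Counter()
    for tags in tags_by_user.values():
        # Count each tag at most once per user, ignoring case and stray spaces.
        counts.update({t.strip().lower() for t in tags if t.strip()})
    n_users = len(tags_by_user)
    if n_users == 0:
        return {}
    return {tag: n / n_users for tag, n in counts.items()}


def accepted_tags(tags_by_user, threshold=0.5):
    """Keep only the tags that at least `threshold` of the users agree on."""
    scores = score_tags(tags_by_user)
    return sorted(tag for tag, score in scores.items() if score >= threshold)


# Three hypothetical volunteers tagging the same image.
example = {
    "user_a": ["Owl", "bird"],
    "user_b": ["owl", "night"],
    "user_c": ["owl", "bird "],
}
print(accepted_tags(example))
```

A tag supplied by only one of the three volunteers falls below the threshold and is held back, which is the quality-control effect described above.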
With these factors in mind, let's begin with our first speaker. Rob is a biodiversity scientist and informatician whose focus is on documenting the pace of global change and its impacts on wildlife. Much of the data he uses for his work comes from natural history collections and citizen science naturalists. He has been deeply involved in ecological and biodiversity informatics initiatives to increase the quality, availability and utility of such datasets at the global scale. His particular informatics interest is building web-based tools to enhance discovery and curate the content of natural history collections data; this will be the key topic he discusses today.
Gaurav is a graduate student at the University of Colorado Boulder, where he studies how often taxonomists update catalogues of species definitions and what that means for our understanding of global biodiversity. This is a convenient excuse for him to read old books about cranky taxonomists and to revel in the thrill of discovering something new for the first time. He's been an editor on Wikipedia since 2002 and, although he barely edits any pages himself, he still thinks it's the greatest thing since sliced bread.
Trish Rose-Sandler has over two decades of experience working in libraries, archives and museums. Since 2010, she's worked at the Missouri Botanical Garden in St Louis, where she provides data management assistance for the Biodiversity Heritage Library and is principal investigator for two BHL grant-funded projects: Art of Life and Purposeful Gaming. She has been a VRA member for the past decade and is finishing up a 4-year run as co-chair of the VRA Core Oversight Committee.
I’ve talked about BHL with this community at past conferences, but for anyone who is not familiar, let me give you a brief overview. BHL is a consortium of natural history libraries and museums that collaborate to digitize their historic literature and make it available for free online as part of a global biodiversity commons. We have digitized over 45 million pages of text about plants and animals.
This is our portal, where our content can be searched, at biodiversitylibrary.org.
Here is the viewer for navigating through books and journals once you’ve identified an item.
While many people are aware of BHL’s rich textual resources, fewer are aware of our hidden natural history illustrations. We estimate we have millions of images in the books, but we have not had descriptive metadata that allows them to be searched. To address this challenge we have been identifying pages with images, both manually and automatically through the development of algorithms, and pushing them out to crowdsourcing environments.
We’ve used multiple crowdsourcing platforms for sharing and describing BHL images, including Wikimedia Commons, as Gaurav talked about, Flickr, and most recently Zooniverse.
I just want to plug the Zooniverse site before I jump into my discussion of Flickr, because it just went live last week and we’re really excited about the level of participation we’re seeing. It's called Science Gossip and can be found at sciencegossip.org. Our Zooniverse opportunity came about because BHL partnered with another project, Constructing Scientific Communities, based in the UK, which investigates Victorian citizen science periodicals. BHL contains lots of periodicals from this period. By serving up pages of those BHL journals within Zooniverse and asking the public to help us identify the images' content and creators, we can better understand the range of individuals who made science through their images. This is the first Zooniverse project where citizen scientists are both the researchers and the subject of the research.
Here is the UI for Science Gossip, which as you can see is pretty different from the UI for Notes from Nature and shows how customizable the platform is for the needs of different materials. We are asking folks to help us tag contributors such as illustrators or engravers, add species, record handwritten transcriptions, and add keywords about the general subject matter. We would love to have the VRA community participate in Science Gossip, so please do have a look. Today I’ll focus my talk on Flickr, since that is the platform we’ve been using the longest for both sharing images and crowdsourcing.
Many of you are probably very familiar with this platform, but here are some basic stats. Flickr is an image hosting site created in 2004 and acquired by Yahoo in 2005. As of March 2013 Flickr had 87 million registered members and 3.5 million new images uploaded daily. As of Aug 2011 the site reported it was hosting 6 billion images (http://en.wikipedia.org/wiki/Flickr).
Stream created in 2011 (https://www.flickr.com/photos/biodivlibrary): over 95k images, manually curated sets, full-page plates.
We are also part of the Internet Archive Book Images stream in Flickr. Thanks to the work of researcher Kalev Leetaru and developers at Smithsonian Libraries (SIL), Missouri Botanical Garden (MBG), and the Internet Archive (IA), over 1 million images from BHL are being added to the IA's Book Images Flickr stream. This work began in the summer of 2014, when Leetaru extracted over 14 million images from 2 million IA public domain books and pushed them to the Flickr Commons. BHL images are a subset of this collection because, as a digitization partner for BHL, IA not only scans many of BHL’s books and journals but also hosts all of its content at the Internet Archive as a mirror of the content found at the BHL portal. The URL is on the screen but rather long, so the easiest way to find this content is to search on Flickr for the term bookcollectionbiodiversity (all one word): https://www.flickr.com/search/?tags=bookcollectionbiodiversity
Staff identify books or journals that are heavily illustrated with full-plate pages and upload the item as a set. We created a script that somewhat automates the upload process, so that bibliographic metadata does not have to be added by hand. Metadata in this stream contains basic bibliographic information about the source from which the image came (in our case books or journals); a URL to the page within the BHL portal, because we want the public not only to view our images in Flickr but also to visit our site (promotional tool); and copyright status (BHL only uploads public domain content). We also upload the BHL page ID, the URL encoded in a DC id, and subject keywords pulled from the MARC records (in this case: catalog, flowers, gardening, seeds). All of the photos you upload can be tagged by anyone, as long as you give permission in your account settings under privacy and permissions.
Crowdsourcing is much more successful when you get the word out on an ongoing basis. BHL does regular blog and Facebook posts, tweets, and even some Flickr tagging parties. We ask Flickr users to help us tag scientific and common names for species.
If you are not familiar with machine tags, they are basically a way to not only add a term to an image but also specify the type of term it is. This allows machines to read and understand them better. They take the form namespace:predicate=value, e.g. taxonomy:binomial=Aegotheles savesi or taxonomy:common=owl. You can develop your own set of machine tags and ask people to use them, or reuse some that already exist on Flickr. The taxonomy:binomial tag was one that already existed on Flickr.
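To make the namespace:predicate=value form concrete, here is a minimal parsing sketch. The regular expression is an assumption about what counts as a valid namespace and predicate (a letter followed by letters, digits or underscores), not Flickr's official rule, and `parse_machine_tag` is a hypothetical helper name.

```python
import re

# Assumed shape: namespace:predicate=value, with a non-empty value.
MACHINE_TAG = re.compile(
    r"^([A-Za-z][A-Za-z0-9_]*):([A-Za-z][A-Za-z0-9_]*)=(.+)$"
)


def parse_machine_tag(tag):
    """Split a machine tag into (namespace, predicate, value).

    Returns None for plain tags such as "owl" that do not follow the form.
    """
    match = MACHINE_TAG.match(tag.strip())
    if match is None:
        return None
    namespace, predicate, value = match.groups()
    # Namespace and predicate are lowercased for comparison; the value is kept as typed.
    return namespace.lower(), predicate.lower(), value.strip()


for tag in ["taxonomy:binomial=Aegotheles savesi", "taxonomy:common=owl", "owl"]:
    print(tag, "->", parse_machine_tag(tag))
```

Keeping the value untouched apart from trimming whitespace preserves the capitalization of scientific names, which matters when matching them against species lists.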
Tagging scientific names allows not only searching by people but also searching by machines. Machine tagging of species has allowed the Encyclopedia of Life, a BHL partner, to pull those images into its platform and affiliate them with its species pages. Quality of tags? As of Sept 2014, over 22,000 of these tags have been added to 14,000 BHL images. 75% are machine tags (e.g. taxonomy:binomial=Eurystomus glaucurus, geo:country=Australia); the rest are just values: common names, illustrators, geographic locations. We haven’t come across any tags that are irrelevant or offensive so far.
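One way to picture the "searching by machines" point is a small index from scientific name to the images that carry it. The sketch below is an assumption about how a consumer of these tags might group them; it does not describe how the Encyclopedia of Life actually harvests images, and the photo IDs and function name are made up for illustration.

```python
from collections import defaultdict

BINOMIAL_PREFIX = "taxonomy:binomial="


def index_by_species(tags_by_photo):
    """Map each scientific name to the list of photo IDs tagged with it."""
    index = defaultdict(list)
    for photo_id, tags in tags_by_photo.items():
        for tag in tags:
            # Only taxonomy:binomial machine tags contribute to the index.
            if tag.lower().startswith(BINOMIAL_PREFIX):
                species = tag[len(BINOMIAL_PREFIX):].strip()
                if species:
                    index[species].append(photo_id)
    return dict(index)


# Hypothetical photo IDs with a mix of machine tags and plain values.
photos = {
    "photo-1": ["taxonomy:binomial=Eurystomus glaucurus", "geo:country=Australia"],
    "photo-2": ["owl", "taxonomy:binomial=Aegotheles savesi"],
    "photo-3": ["taxonomy:binomial=Eurystomus glaucurus"],
}
print(index_by_species(photos))
```

Plain values such as "owl" are skipped here because, without a namespace and predicate, a program cannot tell whether the word is a common name, an illustrator, or a place.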
I already mentioned the data import a bit and the script we created. Many cultural institutions will want to bring that metadata back into their local system for searching, which is what BHL wants to do. Exporting data is done via APIs: slow, but the only way to get data out. We have just begun experimenting with exporting via the APIs. Flickr limits how many requests can be made to its APIs per hour (3600). With 90k images in this stream, it takes us about 24 hrs to extract it all. We have heard of others setting up multiple API keys to get around Flickr’s limits, so we’ll probably need to go that route as we begin extracting data from the IA stream, which is well over 1 million images; otherwise it would take us about 12 days to extract everything.
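An export that stays within an hourly request limit can be sketched as a simple throttled loop. This is a minimal illustration, not BHL's script: `fetch_tags` is a hypothetical callable standing in for whatever Flickr API request you use, and spacing the calls evenly across the hour is just one assumed way of respecting the limit.

```python
import time


def export_tags(photo_ids, fetch_tags, max_per_hour=3600,
                sleep=time.sleep, clock=time.monotonic):
    """Call fetch_tags(photo_id) for each ID, at most max_per_hour times per hour.

    sleep and clock are injectable so the loop can be exercised without waiting.
    """
    min_interval = 3600.0 / max_per_hour  # seconds between consecutive calls
    results = {}
    last_call = None
    for photo_id in photo_ids:
        if last_call is not None:
            # Wait out whatever is left of the interval since the previous call.
            wait = min_interval - (clock() - last_call)
            if wait > 0:
                sleep(wait)
        last_call = clock()
        results[photo_id] = fetch_tags(photo_id)
    return results


# Demo with a stand-in fetcher and a recorded (not real) sleep.
fake_store = {"1": ["taxonomy:common=owl"], "2": []}
recorded_waits = []
exported = export_tags(list(fake_store), fake_store.get, sleep=recorded_waits.append)
print(exported, len(recorded_waits))
```

Passing the sleep function in as a parameter keeps the demo instant while the default behaviour still pauses between real requests.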
So how do you gauge whether a platform is succeeding or failing? For Flickr, you can query the API, pull out the tags and assess them, as we have done. You can also look at the Google Analytics for the site; we track those monthly. For the BHL Flickr stream there are 17k images with one or more tags added by users (18% of all our images). Many of those images can have more than one tag, but the analytics don’t tell us the total number of tags per image or how many individual users have added tags. To judge success you can also look at other factors, such as total views of your content (since 2011 we’ve had 88 million views of BHL content on Flickr). As a point of comparison with Zooniverse: since we went live on March 6th we’ve had 140k images classified by 330 users, a huge success! I attribute this difference to “fit for purpose”. Zooniverse is designed for crowdsourcing, has an active community of classifiers, and offers lots of ways for users to engage with images through talk etc. Flickr has no way for users to interact with content other than to tag or fav it, and no way for content providers to interact with taggers. So success, yes, but over a much longer period of time than Zooniverse.
