The Art of Life project



SLRLN March 2013 St Louis MO   Trish Rose-Sandler, Missouri Botanical Garden   Art of Life project
What is Art of Life?

• Full title - The Art of Life: Data Mining and Crowdsourcing the
  Identification and Description of Natural History Illustrations
  from the Biodiversity Heritage Library (BHL)
• Grant given to Missouri Botanical Garden in St Louis
• Funded by National Endowment for the Humanities
• Runs May 2012-April 2014

What is BHL?

 • A consortium of natural history and botanical libraries and
   research institutions
 • An open access digital library for historic biodiversity
   literature
 • An open data repository of taxonomic names and
   bibliographic information

Member Institutions
    •    Academy of Natural Sciences Library and Archives
    •    American Museum of Natural History Library
    •    California Academy of Sciences Library
    •    Cornell University Library
    •    The Field Museum Library
    •    Harvard University Botany Libraries
    •    Harvard University, Ernst Mayr Library of the Museum of Comparative Zoology
    •    Library of Congress
    •    Marine Biological Laboratory / Woods Hole Oceanographic Institution Library
    •    Missouri Botanical Garden Library
    •    Natural History Museum, London, Library & Archives
    •    The New York Botanical Garden
    •    Royal Botanic Gardens, Kew, Library & Archives
    •    Smithsonian Institution Libraries
    •    United States Geological Survey Libraries
BHL FT Staff



BHL Global



BHL Browse




BHL Search




BHL Scientific Names




BHL Book viewer




BHL copyright and licensing
   Public Domain Content Files: Public Domain

   Copyrighted Content Files

   Metadata, OCR, Scientific Names


BHL provides data via:
                      •        APIs
                      •        Data exports
                      •        OpenURL
                      •        OAI-PMH
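
As one illustration of the harvesting route, the sketch below builds an OAI-PMH ListRecords request and pulls Dublin Core titles out of a response. The verb and metadataPrefix parameters and the XML namespaces come from the OAI-PMH 2.0 standard; the endpoint URL is an assumption to be checked against BHL's developer documentation, and the sample response is hand-written so the example runs without network access.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Assumed endpoint; confirm the current URL in BHL's developer documentation.
BASE_URL = "https://www.biodiversitylibrary.org/oai"

# Dublin Core element namespace used by the standard oai_dc metadata format.
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        # Optional selective harvesting by datestamp (YYYY-MM-DD).
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


def extract_titles(response_xml):
    """Return the text of every dc:title element in an OAI-PMH response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.iter(DC_NS + "title")]


# Hand-written sample response, trimmed to the parts used here.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:item/1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Foreign finches in captivity</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url(BASE_URL))
    print(extract_titles(SAMPLE))
```

A real harvester would fetch the URL, parse the response the same way, and follow the resumptionToken element until the server stops returning one.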



Reuse of BHL data

BioStor (http://biostor.org/), a website and web service by Rod Page, provides
tools for extracting, annotating, and visualising information on literature from
BHL. In this example, Rod has identified articles found in the Proceedings of
the United States National Museum.


Reuse of BHL data

Ryan Schenk is using publication dates of works in BHL to build histograms of
the number of publications per year for specific species. In this example, the
Guinea Pig (http://synynyms.no.de/)
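
The publications-per-year idea can be sketched in a few lines. This is not Ryan Schenk's code; the record layout (a dict with a "year" key) and the sample records are hypothetical, standing in for whatever a BHL name search returns.

```python
from collections import Counter


def publications_per_year(records):
    """Count records per publication year, skipping records with no year."""
    counts = Counter(r["year"] for r in records if r.get("year") is not None)
    return dict(sorted(counts.items()))


def text_histogram(counts):
    """Render the counts as one line per year, one '#' per publication."""
    return "\n".join(f"{year} {'#' * n}" for year, n in counts.items())


# Hypothetical records for works mentioning one species.
records = [
    {"title": "Work A", "year": 1850},
    {"title": "Work B", "year": 1850},
    {"title": "Work C", "year": 1872},
    {"title": "Work D", "year": None},  # undated work, ignored
]

if __name__ == "__main__":
    print(text_histogram(publications_per_year(records)))
```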


Why the need for Art of Life?

      Problem statement: users want access to images, but access
      is currently limited to scrolling page by page or viewing a
      selection of images in Flickr. Images are not searchable
      by content (e.g. corn, Zea mays).



5 Primary Objectives of Art of Life

     Objective 1: Define an appropriate metadata schema for natural history illustrations

     Objective 2: Build software tools to automatically identify illustrations in the BHL corpus

     Objective 3: Enhance existing tools to enable the initial sorting, viewing, and editing of these
     identified visual resources.

     Objective 4: Integrate tagging applications to enable a community of users to edit descriptive
     metadata for the illustrations

     Objective 5: Integrate the descriptive metadata generated by users back into the BHL portal,
     both for access and preservation


Current status of Art of Life
 • Development of the algorithm is about 90% complete and will
   be done by April 2013
 • Draft schema for describing natural history illustrations
   available for public review: http://tinyurl.com/9hm7nsb
 • Classifier tool: reusing Macaw, an existing BHL tool developed by Joel
   Richard (http://code.google.com/p/macaw-book-metadata-tool/)


4 algorithms that have been tested
 • Analysis of the scanning metadata, aka picture block data,
   contained within the OCR files generated as part of the
   Internet Archive scanning process.
 • Application of a technique that evaluates the vertical
   composition of a page by applying contrast and scaling
   transformations.
 • Analysis of the image compression ratio for scanned pages.
 • Analysis of color properties of scanned pages.
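
To make the compression-ratio idea concrete, here is a minimal sketch, not the project's algorithm. It assumes that illustrated pages compress less well than mostly white text pages, uses zlib as a stand-in for whatever codec the scans actually use, and picks an arbitrary threshold; the two synthetic pages are invented test data.

```python
import random
import zlib


def compression_ratio(pixels: bytes) -> float:
    """Raw size divided by zlib-compressed size; higher means more compressible."""
    return len(pixels) / len(zlib.compress(pixels))


def looks_illustrated(pixels: bytes, threshold: float = 20.0) -> bool:
    """Flag a page as a likely illustration when it compresses poorly.

    The threshold is an illustrative guess, not a tuned value.
    """
    return compression_ratio(pixels) < threshold


def synthetic_text_page(width=200, height=300) -> bytes:
    """Grayscale page: white background with regular dark 'text' rows."""
    rows = []
    for y in range(height):
        if y % 10 < 2:  # two dark-dotted rows out of every ten
            rows.append(bytes(0 if x % 4 == 0 else 255 for x in range(width)))
        else:
            rows.append(bytes([255] * width))
    return b"".join(rows)


def synthetic_plate(width=200, height=300, seed=0) -> bytes:
    """Grayscale page filled with noise, standing in for a dense illustration."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(width * height))


if __name__ == "__main__":
    for name, page in [("text", synthetic_text_page()), ("plate", synthetic_plate())]:
        print(name, round(compression_ratio(page), 1), looks_illustrated(page))
```

On real scans the threshold would have to be calibrated against pages already known to contain illustrations, which is what a review interface for the algorithm's output makes possible.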

UI for Review of Algorithm




Art of Life Schema
 Needs to support three objectives:
   (1) to enable the discovery, description and use of the
   identified images by artists, biologists, humanities scholars,
   librarians, and educators;
   (2) to make BHL’s metadata and images available to other
   platforms; and
   (3) to import crowdsourced metadata generated in other
   platforms back into BHL.

Schema landscape review
       – VRA Core 4.0 (borrowed 9 elements)
       – LIDO
       – Darwin Core (borrowed 2 elements)
       – Dublin Core



ART OF LIFE SCHEMA ELEMENTS (red = required)

                 Title
                 Type
                 Date
                 Copyright
                 Source
                 Agent
                 Subjects
                 Description
                 Inscription

Example of illustration described using Art of Life schema

   Title          Stictospiza formosa
   Type           Paintings
   Date           Publication: 1898
   Agent          Author: Arthur G. Butler (1844-1925)
                  Illustrator: F.W. Frohawk (1861-1946)
   Description    A pair of finches with green and yellow bodies resting on reeds
   Subjects       Scientific name: Amandava formosa (Latham, 1790)
                  Vernacular name: Green Avadavat or Green Munia
                  Accepted name: Amandava formosa (Latham, 1790)
                  Birds, finches
   Inscriptions   bottom center: Green Amaduvade Waxbill (Stictospiza formosa)
   Source         Butler, Arthur Gardiner. Foreign finches in captivity. Hull and London: Brumby and
                  Clarke, limited, 1889 (2nd edition). This image comes from the Biodiversity Heritage
                  Library, and is available online at biodiversitylibrary.org/page/17195895
   Rights         Public domain
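
For exchange with other platforms, a record like this one could be carried as a simple key-value structure. The sketch below is one possible encoding, not a format defined by the project: the nested layout is an assumption, and the labels "Inscriptions" and "Rights" from the example are mapped onto the schema elements Inscription and Copyright as a guess.

```python
import json

# The nine element names listed for the Art of Life schema.
SCHEMA_ELEMENTS = [
    "Title", "Type", "Date", "Copyright", "Source",
    "Agent", "Subjects", "Description", "Inscription",
]

# Values copied from the example record; the nesting is an assumed layout.
record = {
    "Title": "Stictospiza formosa",
    "Type": "Paintings",
    "Date": {"Publication": "1898"},
    "Agent": {
        "Author": "Arthur G. Butler (1844-1925)",
        "Illustrator": "F.W. Frohawk (1861-1946)",
    },
    "Description": "A pair of finches with green and yellow bodies resting on reeds",
    "Subjects": {
        "Scientific name": "Amandava formosa (Latham, 1790)",
        "Vernacular name": "Green Avadavat or Green Munia",
        "Accepted name": "Amandava formosa (Latham, 1790)",
        "Keywords": ["Birds", "finches"],
    },
    "Inscription": {
        "bottom center": "Green Amaduvade Waxbill (Stictospiza formosa)",
    },
    "Source": (
        "Butler, Arthur Gardiner. Foreign finches in captivity. "
        "Hull and London: Brumby and Clarke, limited, 1889 (2nd edition). "
        "biodiversitylibrary.org/page/17195895"
    ),
    "Copyright": "Public domain",
}


def missing_elements(rec):
    """List schema elements that the record does not fill in."""
    return [name for name in SCHEMA_ELEMENTS if name not in rec]


def to_json(rec):
    """Serialize the record for exchange with another platform."""
    return json.dumps(rec, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    print(missing_elements(record))
    print(to_json(record))
```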


How will this project benefit libraries?
 • Significant resource of natural history images that will be made openly
   accessible and reusable.
 • Useful to varied audiences: artists, biologists, humanities scholars
   (particularly historians of science), librarians, and those working in
   education and outreach; in short, anyone who uses images in their
   research and teaching.
 • Algorithm will be made available and can be used on any text collections
   with OCR output.
 • Schema can be applied to other image collections that contain a large
     number of natural history illustrations.

Thanks to Art of Life team!
 PI
           Trish Rose-Sandler, Missouri Botanical Garden
 Algorithm development
           Ed Bachta, Charlie Moad, Kyle Jaebker, Indianapolis Museum of Art
 Schema development
           Gaurav Vaidya and Robert Guralnick, University of Colorado, Boulder
           William Ulate, Missouri Botanical Garden
 Programming
           Mike Lichtenberg, Missouri Botanical Garden
 Consultants
           Doug Holland, Missouri Botanical Garden; Chris Freeland, Washington University
           (former PI for Art of Life)

Interested? Here’s how you can help
 • We welcome your feedback on the schema! http://tinyurl.com/9hm7nsb
 • If you know of scholars or other users who are interested in these types
   of images and would be willing to take part in our survey or in
   a brief focus group about the schema, please have them contact me:
   trish.rose-sandler@mobot.org
 • I would love to talk with others about their experiences with
   crowdsourcing metadata, particularly if you’ve used Flickr or Wikimedia
   Commons



For more info
http://biodivlib.wikispaces.com/Art+of+Life

Contact:
Twitter: @trosesandler
trish.rose-sandler@mobot.org


How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 

The Art of Life project

  • 1. The Art of Life project SLRLN March 2013 St Louis MO Trish Rose-Sandler, Missouri Botanical Garden Art of Life project
  • 2. What is Art of Life? • Full title - The Art of Life: Data Mining and Crowdsourcing the Identification and Description of Natural History Illustrations from the Biodiversity Heritage Library (BHL) • Grant given to Missouri Botanical Garden in St Louis • Funded by National Endowment for the Humanities • Runs May 2012-April 2014
  • 3. What is BHL? • A consortium of natural history, botanical libraries and research institutions • An open access digital library for historic biodiversity literature • An open data repository of taxonomic names and bibliographic information
  • 4. Member Institutions • Academy of Natural Sciences Library and Archives • American Museum of Natural History Library • California Academy of Sciences Library • Cornell University Library • The Field Museum Library • Harvard University Botany Libraries • Harvard University, Ernst Mayr Library of the Museum of Comparative Zoology • Library of Congress • Marine Biological Laboratory / Woods Hole Oceanographic Institution Library • Missouri Botanical Garden Library • Natural History Museum, London, Library & Archives • The New York Botanical Garden • Royal Botanic Gardens, Kew, Library & Archives • Smithsonian Institution Libraries • United States Geological Survey Libraries
  • 5. BHL FT Staff
  • 6. BHL Global
  • 7. (no text on this slide)
  • 8. BHL Browse
  • 9. BHL Search
  • 10. BHL Scientific Names
  • 11. BHL Book viewer
  • 12. BHL copyright and licensing Public Domain Content Files Public Domain Copyrighted Content Files Metadata, OCR, Scientific Names
  • 13. BHL provides data via: • APIs • Data exports • OpenURL • OAI-PMH
  • 14. Reuse of BHL data The website and webservice BioStor by Rod Page provides tools for extracting, annotating, and visualising information on literature from BHL (http://biostor.org/). In this example, Rod has identified articles found in the Proceedings of the United States National Museum.
  • 15. Reuse of BHL data Ryan Schenk is using publication dates of works in BHL to build histograms of the number of publications-per-year for specific species. In this example, the Guinea Pig (http://synynyms.no.de/)
  • 16. (no text on this slide)
  • 17. Why the need for Art of Life? Problem statement – users want access to images, but access is limited to page-by-page scrolling or viewing a selection of images in Flickr; images are not searchable by content (e.g. corn, Zea mays)
  • 18. (no text on this slide)
  • 19. 5 Primary Objectives of Art of Life • Objective 1: Define an appropriate metadata schema for natural history illustrations • Objective 2: Build software tools to automatically identify illustrations in the BHL corpus • Objective 3: Enhance existing tools to enable the initial sorting, viewing, and editing of these identified visual resources • Objective 4: Integrate tagging applications to enable a community of users to edit descriptive metadata for the illustrations • Objective 5: Integrate the descriptive metadata generated by users back into the BHL portal, both for access and preservation
  • 20. (no text on this slide)
  • 21. Current status of Art of Life • Development of the algorithm is about 90% complete and will be done by April 2013 • Draft schema for describing natural history illustrations available for public review http://tinyurl.com/9hm7nsb • Classifier tool – reusing an existing BHL tool developed by Joel Richard called Macaw http://code.google.com/p/macaw-book-metadata-tool/
  • 22. 4 algorithms that have been tested • Analysis of the scanning metadata, aka picture block data, contained within the OCR files generated as part of the Internet Archive scanning process. • Application of a technique that evaluates the vertical composition of a page by applying contrast and scaling transformations. • Analysis of the image compression ratio for scanned pages. • Analysis of color properties of scanned pages.
  • 23. UI for Review of Algorithm
  • 24. Art of Life Schema Needs to support three objectives: (1) to enable the discovery, description and use of the identified images by artists, biologists, humanities scholars, librarians, and educators; (2) to make BHL’s metadata and images available to other platforms; and (3) to import crowdsourced metadata generated in other platforms back into BHL.
  • 25. Schema landscape review – VRA Core 4.0 (borrowed 9 elements) – LIDO – Darwin Core (borrowed 2 elements) – Dublin Core
  • 26. ART OF LIFE SCHEMA ELEMENTS (red = required): Title, Type, Date, Copyright, Source, Agent, Subjects, Description, Inscription
  • 27. Example of illustration described using Art of Life schema
      Title: Stictospiza formosa
      Type: Paintings
      Date: Publication: 1898
      Agent: Author: Arthur G. Butler (1844-1925); Illustrator: F.W. Frohawk (1861-1946)
      Description: A pair of finches with green and yellow bodies resting on reeds
      Subjects: Scientific name: Amandava formosa (Latham, 1790); Vernacular Name: Green Avadavat or Green Munia; Accepted Name: Amandava formosa (Latham, 1790); Birds, finches
      Inscriptions: bottom center: Green Amaduvade Waxbill (Stictospiza formosa)
      Source: Butler, Arthur Gardiner. Foreign finches in captivity. Hull and London: Brumby and Clarke, limited, 1889 (2nd edition). This image comes from the Biodiversity Heritage Library, and is available online at biodiversitylibrary.org/page/17195895
      Rights: Public domain
  • 28. How will this project benefit libraries? • Significant resource of natural history images that will be made openly accessible and reusable. • Useful to varying audiences: artists, biologists, humanities scholars, particularly historians of science; librarians, education and outreach. Anyone who uses images in their research and teaching. • Algorithm will be made available and can be used on any text collections with OCR output. • Schema can be applied to other image collections that contain a large number of natural history illustrations.
  • 29. Thanks to Art of Life team! • PI: Trish Rose-Sandler, Missouri Botanical Garden • Algorithm development: Ed Bachta, Charlie Moad, Kyle Jaebker, Indianapolis Museum of Art • Schema development: Gaurav Vaidya and Robert Guralnick, University of Colorado, Boulder; William Ulate, Missouri Botanical Garden • Programming: Mike Lichtenberg, Missouri Botanical Garden • Consultants: Doug Holland, Missouri Botanical Garden; Chris Freeland, Washington University (former PI for Art of Life)
  • 30. Interested? Here’s how you can help • We welcome your feedback on the schema! http://tinyurl.com/9hm7nsb • If you know of scholars and users who would be interested in these types of images and would be willing to participate in our survey or a brief focus group about the schema, please have them contact me at trish.rose-sandler@mobot.org • Would love to talk with other folks about their experiences with crowdsourcing of metadata, particularly if you’ve used Flickr or Wikimedia Commons
  • 31. For more info http://biodivlib.wikispaces.com/Art+of+Life Contact: tweet @trosesandler, trish.rose-sandler@mobot.org
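
Slide 22 lists "analysis of the image compression ratio for scanned pages" as one of the four approaches tested. The slides do not say how that analysis was implemented, so the following is only a minimal sketch of the general idea, under the assumption (not stated in the slides) that a page carrying an illustration has more tonal variation and therefore compresses less well than a mostly blank text page. The function names, the use of zlib, and the 0.5 threshold are all hypothetical choices for illustration, not the project's algorithm.

```python
import random
import zlib


def compression_ratio(pixels: bytes) -> float:
    """Return compressed size divided by raw size (lower means more compressible)."""
    if not pixels:
        raise ValueError("page has no pixel data")
    return len(zlib.compress(pixels)) / len(pixels)


def likely_illustrated(pixels: bytes, threshold: float = 0.5) -> bool:
    """Flag a page whose pixel data compresses poorly.

    The threshold is a hypothetical value chosen for this sketch only;
    a real threshold would have to be tuned against hand-labelled pages.
    """
    return compression_ratio(pixels) > threshold


# Synthetic stand-ins for scanned pages (8-bit grayscale, 100 x 100 pixels).
blank_page = bytes([255]) * 10_000  # uniform white, highly compressible
rng = random.Random(0)              # fixed seed so the example is repeatable
busy_page = bytes(rng.randrange(256) for _ in range(10_000))  # noisy, hard to compress

for name, page in (("blank", blank_page), ("busy", busy_page)):
    print(name, round(compression_ratio(page), 3), likely_illustrated(page))
```

Real scans sit somewhere between these two synthetic extremes, which is one reason a single signal like this would normally be combined with others, such as the OCR picture-block data the slide also mentions.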

Editor's Notes

  1. In 2006 libraries from botanical gardens and natural history museums in the US and UK came together and decided to digitize all of the public domain literature in their collections and place it online to make it more widely accessible. One of our primary audiences is taxonomists, who use BHL to find the first occurrence of a name for a species in the historic literature and to track how that name has changed over time.
  2. Today there are 15 institutions that are BHL members and that contribute content to the repository. As you can see from the list, it's made up of some of the largest and best-known natural history and botanical garden libraries, as well as some libraries that are much broader in scope but contribute their biodiversity-specific materials to the repository. The most recent member to join is the Library of Congress.
  3. The BHL has 6 full-time staff located at the Smithsonian Libraries and the Missouri Botanical Garden. Program Director, Martin Kalfatovic; Program Manager, Grace Costantino; and Collections Coordinator, Bianca Crowley are all based at the Smithsonian Libraries. Technical Director, William Ulate; Programmer, Mike Lichtenberg; and Data Analyst, Trish Rose-Sandler are based at the Missouri Botanical Garden. We also have contributions from staff at the member institutions, which allow a certain percentage of their staff time to go to BHL (when we tried to quantify the time spent by these part-time staff, it came out to a little over 16 FTEs from the member institutions).
  4. What began as a consortium of libraries in the US and UK is now increasingly a global effort: there are BHL nodes in China, Australia, Egypt, Europe, Brazil and soon Africa! William also heads up our global coordination efforts. Each global partner maintains its own portal specialized to the needs of its users, but we work to share content and technology across nodes and try not to duplicate digitization efforts so we can maximize limited scanning funding.
  5. The URL for the portal is biodiversitylibrary.org. Here is a sneak peek of our new UI that will go live on Monday. This new UI is a collaboration between BHL US and BHL Australia, who have been maintaining separate portals but decided, based on feedback from BHL users, that we should merge the 2 sites. We did a usability study of the 2 sites in 2011, and feedback indicated that users wanted the look and feel of the Australian site but with the functionality of the US/UK site. After 6 years of digitization we have a critical mass of content: over 57 thousand titles, 108 thousand volumes, and almost 40 million pages of text.
  6. You can browse BHL by Title, Author, Date or Collection
  7. In our advanced search you can search on the title of a book, journal or article. You can search on Authors, Subjects and Scientific Names.
  8. Scientific names are a particularly critical access point for our taxonomists, who need to verify the first mention of a species in the published literature, as this will help them validate which is the accepted name. In this case searching on Zea mays brings back all of the books and journals that mention Zea mays anywhere within the text. You can see it was first mentioned in 1797. Our scientific name searching is enabled by taking the OCR output from each page of digitized text and running it through TaxonFinder, a taxonomic name recognition algorithm that is maintained by uBio and which we call via web services.
  9. Probably the most significant change to our UI was to the book viewer. Users now have the ability to scroll multiple pages at once and to go directly to sections of the book like chapters and articles. When viewing a single page in the viewer you can see all the scientific names found on that page. The OCR can also be viewed side by side with the image of the page.
  10. Because one of the founding principles of BHL was to make biodiversity literature available for open access and responsible use as part of a global biodiversity commons, we seek to provide data in a variety of forms for others to harvest and re-use. One of the keys to making data harvestable is clearly stating for your users what your copyright and licensing terms are, so that they know what they can and cannot do with your data. Because most of our books and journals are historic literature published between the 1450s and 1923, they fall within the public domain: users may reuse them for either commercial or non-commercial purposes without permission. We do have some copyrighted material that we’ve digitized with the permission of the publisher. Those we provide under a Creative Commons Attribution/Non-Commercial/Share Alike License. Our metadata (which includes catalog records and scientific names) is available under a CC0 1.0 Universal (CC0 1.0) Public Domain Dedication license, which essentially says we are the creators of the data and want to dedicate it to the public domain. This allows anyone to reuse, modify, repurpose, and distribute the metadata for all purposes, commercial and non-commercial, with no need to ask for permission. This follows in the footsteps of other libraries such as The British Library, Europeana, the University of Michigan Library, and Harvard, who have adopted it for their online catalog data.
  11. Therefore we provide data in a variety of exports. We encourage digital library aggregators to incorporate our records into their portals and library catalogs. We provide data in ways that it can be mined and recontextualized.
  12. Besides the traditional harvesting of our metadata records into other portals such as OCLC WorldCat and the Digital Public Library of America, our content has been mined and recontextualized for a variety of interesting purposes. The website and webservice BioStor by Rod Page provides tools for extracting, annotating, and visualising information on literature from BHL (http://biostor.org/). In this example, Rod has identified articles found in the Proceedings of the United States National Museum. We are now incorporating Rod’s articles back into the BHL portal and they will be searchable in the new UI. So here’s an example of how making your data openly accessible can result in real benefits back to your institution and users.
  13. Ryan Schenk is using publication dates of works in BHL to build histograms of the number of publications-per-year for specific species. In this example, the Guinea Pig (http://synynyms.no.de/)
  14. Another key to making data harvestable is promotion. People won’t know your data is out there and can be mined unless you advertise and really get the word out. We also make extensive use of social media to let users know about new content, contextualize the existing content, interact with users and in general let users know we are a dynamic and constantly growing repository. We do this via our Blog, Twitter, Facebook and Pinterest.
  15. All of this background was provided to give context for the need for the Art of Life project. Problem statement: Art of Life evolved out of a need in the BHL that was expressed by our users. We had a critical mass of content online, and BHL users knew there were amazing images within the BHL pages, but there was no easy way to find them other than opening up a BHL book or volume and scrolling through page by page to find illustrations. There is no descriptive metadata attached to an illustration that would tell you the content of the image, the date it was created, or who was involved in its creation.
  16. One way we’ve tried to address this need is by pushing selected images to Flickr. We have created a BHL account in Flickr and pushed over 63,000 images so far, but this is a very manual process that takes considerable staff time. We estimate that we have millions of illustrations within BHL, so this manual process does not scale well.
  17. This is the Art of Life workflow diagram, which identifies the 4 stages the illustrations will go through as they move through the workflow: Extract, Classify, Describe, and Share. The Extract stage is where BHL pages will be run through an algorithm to identify which pages contain illustrations, whether they be full plates or only a section of the page. This algorithm is being developed by our partners on the project from the Indianapolis Museum of Art Lab. At the Classify stage, the pages with illustrations will be tagged by Art of Life staff as being one or several broad types such as drawing/painting, photograph, diagram and even map. The tool we will use for this is an existing tool created at the Smithsonian Libraries called Macaw, which BHL uses for books scanned outside of its traditional workflow. For the Describe stage, the illustrations will be pushed into platforms such as Flickr and Wikimedia Commons, where both the general public and specialists can describe them in much greater detail, such as adding a title, creator, date (if different from date of publication), and subjects. Wikimedia Commons is where the schema can play a role. Because Wikimedia allows you to create templates, we can provide guidance to taggers on what information to record and how to record it. In the Share stage, the metadata for the illustrations will be reingested back into the BHL portal for searching there. And of course we want to be able to preserve any contributed metadata from external platforms. We also want to broaden the audience for these illustrations because we believe they have a wide appeal to artists, biologists, humanities scholars, particularly historians of science; librarians, education and outreach. Many of these audiences don’t know about BHL and won’t go to the BHL platform looking for the content, so we want to push the illustrations out to environments where they already are: Encyclopedia of Life, ARTstor, and even iTunes.
  18. IMA developed a UI for us to review the results of each algorithm. This screenshot shows results only from the OCR files (ABBYY) and contrast, the 2 algorithms we found to be most accurate. It's showing you all the pages in a book, how many actual illustrations there are, and how many true/false positives. The count of actual illustrations is based on a gold standard test set that we developed by manually identifying which pages have images, so the algorithm can always be compared against the test set to come up with a percentage of accuracy.
  19.  
  20. We ended up choosing most of the schema elements from the VRA Core because its elements and attributes were most closely aligned with the types of information we felt were important to record, and also because its model of works linking to one or more images fit nicely with BHL pages often containing one or more illustrations on a single page. The only thing the VRA Core lacked was a way to record an acceptedName and CommonName for a species. VRA Core has a subject attribute type of scientificName, but taxonomists would be interested in knowing the multiple names by which species are known. Darwin Core was able to fulfill this need, and so we borrowed 2 elements from that schema.
  21. Here are the elements chosen (this is still in draft form)
  22. Here is an illustration described using the schema
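
Note 18 describes comparing each algorithm's output against a manually built gold standard test set and reporting true/false positives and a percentage of accuracy. As a rough sketch of what that comparison involves (the function name, page numbers, and returned keys below are invented for illustration and are not taken from the IMA review tool), the counts can be derived from two sets of page numbers:

```python
def evaluate(predicted: set[int], gold: set[int], all_pages: set[int]) -> dict[str, float]:
    """Compare pages flagged by an algorithm with a hand-labelled gold standard.

    predicted: pages the algorithm says contain illustrations
    gold:      pages a person confirmed contain illustrations
    all_pages: every page in the book
    """
    if not all_pages:
        raise ValueError("book has no pages")
    if not (predicted <= all_pages and gold <= all_pages):
        raise ValueError("predicted and gold pages must belong to the book")

    true_pos = len(predicted & gold)    # flagged and really illustrated
    false_pos = len(predicted - gold)   # flagged but not illustrated
    false_neg = len(gold - predicted)   # illustrated but missed
    true_neg = len(all_pages) - true_pos - false_pos - false_neg

    return {
        "true_positives": true_pos,
        "false_positives": false_pos,
        "false_negatives": false_neg,
        "accuracy": (true_pos + true_neg) / len(all_pages),
        "precision": true_pos / (true_pos + false_pos) if predicted else 0.0,
        "recall": true_pos / (true_pos + false_neg) if gold else 0.0,
    }


# Hypothetical 10-page book: illustrations really appear on pages 2, 5 and 9,
# while the algorithm flagged pages 2, 5 and 7.
book = set(range(1, 11))
result = evaluate(predicted={2, 5, 7}, gold={2, 5, 9}, all_pages=book)
print(result)
```

Because most pages in a scanned book carry no illustration, plain accuracy can look high even when many illustrations are missed, which is why the sketch also reports precision and recall alongside it.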