Botanists and Annotations:
use cases and their relevance
for the larger scientific community
William Ulate
Trish Rose-Sandler
Center for Biodiversity Informatics
Missouri Botanical Garden
Jun. 2018
Where do we come from?
Why are we here?
I Annotate Conference, Berlin (2016)
The uptake of web annotation could be sufficiently
moved forward by tackling three key issues:
1) interoperability
2) domain use cases
3) user centered design
Darwin Virtual Library (2011)
Charles Darwin’s Library is a digital edition and virtual
reconstruction of the surviving books owned by Charles
Darwin.
In 1908, Charles Darwin’s son, Francis, transferred what he
called the ‘Darwin Library’ to the Botany School at
Cambridge University.
‘The chief interest of the Darwin books lies in the pencil notes
scribbled on their pages, or written on scraps of paper and
pinned to the last page.’ – Francis Darwin
Darwin read to gather evidence, to explore and define the
research possibilities of his evolutionary ideas, and to gauge
reactions to his own publications.
What this digital reconstruction of the Darwin Library delivers is
the ability to retrace and reduplicate Darwin’s reading of a
wealth of materials.
https://www.biodiversitylibrary.org/collection/darwinlibrary
https://www.biodiversitylibrary.org/page/34074923
https://biodiversitylibrary.org/page/34074986
https://did3.jiscinvolve.org/wp/projects/mining-biodiversity/
Mining Biodiversity
• Transform BHL into a next-generation social
digital library
• A multi-disciplinary approach
– Text Mining
– Machine learning
– History of Science
– Environmental History & Studies
– Library and Information Science
– Social Media
This project was made possible in part by the Institute of Museum and Library Services [LG-00-14-04-0032-14].
http://miningbiodiversity.com
Enhancements to BHL
What’s wrong with
keyword-based search: Polysemy
• Ambiguity!
Boxwood: a historic place in Alabama, or the North American term for
plants in the Buxaceae family?
California bay: a hardwood tree, or a location?
What’s wrong with
keyword-based search: Synonymy
Campanula portenschlagiana Schult.
Campanula affinis Rchb. ex Nyman
Campanula muralis Port. ex A. DC.
Semantic metadata generation
• Entity types
– species
– location
– habitat
– anatomical parts
– qualities
– persons
– temporal expressions
• Association types
– observation
– habitation
– nutrition
– trait
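One way to picture the entity and association types listed above is as a small data model. The sketch below is only illustrative: the class names, the character-offset fields and the example sentence are assumptions made for this example, not the project's actual schema.

```python
from dataclasses import dataclass
from enum import Enum


class EntityType(Enum):
    SPECIES = "species"
    LOCATION = "location"
    HABITAT = "habitat"
    ANATOMICAL_PART = "anatomical part"
    QUALITY = "quality"
    PERSON = "person"
    TEMPORAL_EXPRESSION = "temporal expression"


class AssociationType(Enum):
    OBSERVATION = "observation"
    HABITATION = "habitation"
    NUTRITION = "nutrition"
    TRAIT = "trait"


@dataclass(frozen=True)
class Entity:
    text: str          # surface form as it appears in the page text
    type: EntityType
    start: int         # character offset where the mention starts
    end: int           # character offset just past the mention


@dataclass(frozen=True)
class Association:
    type: AssociationType
    source: Entity
    target: Entity


def find_entity(page_text: str, mention: str, type_: EntityType) -> Entity:
    """Locate the first occurrence of `mention` and wrap it as an Entity."""
    start = page_text.index(mention)
    return Entity(mention, type_, start, start + len(mention))


# Made-up example sentence (hypothetical, not taken from the corpus).
page = "Leptinotarsa decemlineata feeds on the leaves of Solanum tuberosum."
beetle = find_entity(page, "Leptinotarsa decemlineata", EntityType.SPECIES)
plant = find_entity(page, "Solanum tuberosum", EntityType.SPECIES)
feeding = Association(AssociationType.NUTRITION, beetle, plant)
```

Storing offsets rather than only the matched string keeps each annotation tied to a specific place on the page, which is what a colour-coded viewer would need.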
Examples of semantic metadata
(annotations)
• Observation
• Habitation
• Nutrition
• Trait
Text mining-based approach
[Workflow diagram. Labels: seed documents; unlabelled documents; learn semantics; annotate; annotator/curator; validate; feedback; store; search index]
Validation interface
Enhanced document viewing
• Page in PDF/image format
• OCR-corrected text with colour-coded annotations
Text Annotation Use Cases
Annotator Use Case: I am a contributing
participant, adding or curating annotations in
the Biodiversity Digital Library.
Searcher Use Case: I am a user of the
Biodiversity Digital Library, searching for content
that is indexed by annotations.
Admin Use Case
Annotator Use Case
• Add an annotation by selecting text
• Conveniently select an appropriate annotation (autocomplete, dropdown menu)
• “Cross out” an annotation (e.g., a homonym) and toggle showing it.
• Modify which text is selected and/or change the annotation term associated with
my own or a pre-existing annotation.
• Confirm or agree with an existing annotation.
• Show a measure of certainty on an annotation: either a count of how many people
agree, or just “Confirmed” versus “Still in need of review”.
• Easily browse existing annotations in a document (using the tab or next button)
• Browse annotations filtered by their status (confirmed, crossed out, review)
• Find documents by annotation status.
• Find documents that interest me (combine the solution above with search or
filter by other document metadata: keyword, title, author, etc.)
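The status-related items in this list (confirm, cross out, browse by status) could be modelled roughly as follows. This is a minimal sketch under stated assumptions: the `Annotation` class, the `threshold` of two confirmations and the sample data are all hypothetical, chosen only to show how a "Confirmed" versus "Still in need of review" flag might be derived from an agreement count.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    REVIEW = "review"
    CONFIRMED = "confirmed"
    CROSSED_OUT = "crossed out"


@dataclass
class Annotation:
    selected_text: str                 # the text the annotator selected
    term: str                          # the annotation term attached to it
    confirmed_by: set = field(default_factory=set)
    crossed_out: bool = False

    def confirm(self, user: str) -> None:
        """Record that `user` agrees; a user is counted only once."""
        self.confirmed_by.add(user)

    def status(self, threshold: int = 2) -> Status:
        """Derive the status; `threshold` is an assumed cut-off."""
        if self.crossed_out:
            return Status.CROSSED_OUT
        if len(self.confirmed_by) >= threshold:
            return Status.CONFIRMED
        return Status.REVIEW


def filter_by_status(annotations, status, threshold=2):
    """Return the annotations whose derived status equals `status`."""
    return [a for a in annotations if a.status(threshold) is status]


# Hypothetical sample data.
annotations = [
    Annotation("Boxwood", "Buxaceae"),
    Annotation("Boxwood", "place name"),
    Annotation("California bay", "location"),
]
annotations[0].confirm("user_a")
annotations[0].confirm("user_b")
annotations[1].crossed_out = True
confirmed = filter_by_status(annotations, Status.CONFIRMED)
```

Keeping the set of confirming users, rather than a bare counter, supports both display options named above: the count and the simple confirmed/review label.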
Searcher Use Case
• Discover annotation terms to search by (autocomplete, drop down menu,
browsable tree of terms)
• Navigate to locations in documents from my search (search results show
truncated text found and a link to the location of the annotated text)
• Download search results (several columns: annotation term; the chunk of text
containing the annotation; URL to the location of the annotated text)
• Search for documents containing combinations of terms
• Search for combinations of terms in proximity to each other in the text.
• Search for facts based on semantic combinations or relative positions of terms
(e.g., “Leptinotarsa” “feed on” ?)
• Retrieve search results for associated terms. Asking for water bodies should
return rivers, bays, lakes, seas, etc. Asking for butterflies should return all the
Lepidoptera species.
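The last item, retrieving results for associated terms, amounts to expanding a broad term into its narrower terms before searching. A minimal sketch, assuming a hypothetical `NARROWER` mapping and a hypothetical annotation index from term to document ids (neither comes from the project):

```python
def expand_term(term, narrower):
    """Return `term` followed by all of its transitively narrower terms."""
    seen = []
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.append(current)
        # Reverse so that narrower terms are visited in their listed order.
        stack.extend(reversed(narrower.get(current, [])))
    return seen


def find_documents(term, narrower, index):
    """Collect documents annotated with `term` or any narrower term."""
    hits = []
    for expanded in expand_term(term, narrower):
        for doc_id in index.get(expanded, []):
            if doc_id not in hits:
                hits.append(doc_id)
    return hits


# Hypothetical hierarchy, using the water-body example from the slide.
NARROWER = {"water body": ["river", "bay", "lake", "sea"]}

# Hypothetical index: annotation term -> documents containing it.
INDEX = {"river": ["doc1"], "sea": ["doc2", "doc1"], "forest": ["doc3"]}

water_terms = expand_term("water body", NARROWER)
water_docs = find_documents("water body", NARROWER, INDEX)
```

Here `water_docs` contains the documents annotated with "river" or "sea" even though no document carries the literal term "water body".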
Test your hypothesis with real Use Cases
Enhanced searching of content
• Faceted search
• Automatically generated questions
• Time-sensitive search
Search by facets
Opisthoproctus soleatus reported between 1840 and 1950
filtered by Habitat, Morphology and
Reproduction annotations.
• Taxonomy (73)
• Geography (18)
• Habitat (61)
• Traits (57)
– Morphology (20)
– Feeding (35)
– Reproduction (10)
• Publication (73)
– Journal (21)
– Author (63)
– Collection (10)
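Facet counts like the ones above can be produced by tallying annotation values over the current result set. The sketch below is an assumption-laden illustration: the record layout, the field names `year`, `trait` and `habitat`, and the three sample records are invented, and the real system may compute its counts quite differently.

```python
from collections import Counter


def facet_counts(records, facet):
    """Count how many records carry each value of `facet`."""
    counts = Counter()
    for record in records:
        # set() so a value repeated within one record is counted once.
        for value in set(record.get(facet, [])):
            counts[value] += 1
    return counts


def filter_records(records, **selected):
    """Keep records that carry every selected facet value."""
    return [
        r for r in records
        if all(value in r.get(facet, []) for facet, value in selected.items())
    ]


# Hypothetical annotated passages mentioning one species.
records = [
    {"id": 1, "year": 1850, "trait": ["Morphology"], "habitat": ["deep sea"]},
    {"id": 2, "year": 1900, "trait": ["Feeding", "Morphology"], "habitat": []},
    {"id": 3, "year": 1975, "trait": ["Reproduction"], "habitat": ["deep sea"]},
]

# Time-sensitive part of the query: reported between 1840 and 1950.
in_range = [r for r in records if 1840 <= r["year"] <= 1950]
trait_counts = facet_counts(in_range, "trait")
feeding_only = filter_records(in_range, trait="Feeding")
```

Recomputing the counts after each filter is what lets the numbers beside each facet shrink as the user narrows the search.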
Automatically generated questions
Opisthoproctus soleatus reported between 1840 and 1950
filtered by Habitat, Morphology and
Reproduction annotations.
Respondent feedback: no strong sentiment on whether this functionality is definitely useful.
• Very relevant to their work (50%)
• Can see how it could be useful, but not currently (50%)
Ask a question
- Which species taxa are related to Opisthoproctus soleatus?
- In which geographical locations can I find Opisthoproctus soleatus?
- What other species are co-located with Opisthoproctus soleatus?
- In which environments does Opisthoproctus soleatus live?
- What other species are in the same habitat as Opisthoproctus soleatus?
- What are the characteristics of Opisthoproctus soleatus?
- What other species share the same characteristics as Opisthoproctus soleatus?
Searching by subject-verb-object
Leptinotarsa feeds on ? reported between 1840 and 1950
Respondent feedback: they can see how the graph-based
visualization of results could be useful,
but not for their current purposes.
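A subject-verb-object search with an open slot, such as "Leptinotarsa feeds on ?", can be thought of as pattern matching over extracted facts. The following is a minimal sketch: the `Fact` tuple, the `query` helper and the sample facts are hypothetical and stand in for whatever the text-mining pipeline actually stores.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    subject: str
    predicate: str
    obj: str
    year: int   # year of the publication reporting the fact


def query(facts, subject=None, predicate=None, obj=None, years=None):
    """Return facts matching the pattern; None means 'any value'."""
    results = []
    for fact in facts:
        if subject is not None and fact.subject != subject:
            continue
        if predicate is not None and fact.predicate != predicate:
            continue
        if obj is not None and fact.obj != obj:
            continue
        if years is not None and not (years[0] <= fact.year <= years[1]):
            continue
        results.append(fact)
    return results


# Hypothetical extracted facts.
FACTS = [
    Fact("Leptinotarsa", "feeds on", "Solanum", 1875),
    Fact("Leptinotarsa", "feeds on", "Solanum", 1990),
    Fact("Leptinotarsa", "observed in", "Colorado", 1880),
]

# "Leptinotarsa feeds on ?" reported between 1840 and 1950.
answers = [f.obj for f in query(FACTS, "Leptinotarsa", "feeds on", years=(1840, 1950))]
```

Leaving a different slot open turns the same call into a different question, for example `query(FACTS, predicate="feeds on", obj="Solanum")` to ask what feeds on a given plant.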
Searching for directly associated concepts
I’m looking for Taxa/Geographic locations/Habitats/Traits
directly associated with Eltanin reported between 1840 and 1950.
Respondent feedback: no strong indication of whether this feature is definitely wanted.
• Very relevant to what they are doing (50%)
• Might be useful, but not for their current purposes (40%)
Searching for indirectly associated concepts
I’m looking for Taxa indirectly associated with tarsier
via Geographic locations reported between 1840 and 1950.
Respondent feedback:
• Can see its benefits, but not for what they are currently doing (50%)
• Will definitely be useful (26%)
Use Cases
1. Finding the original description (taxonomic research).
2. Finding host plants, for example (ecological research).
3. Finding illustrations and plates.
4. Finding taxon name usage instances (taxonomic
treatment, nomenclatural act).
5. Capturing spelling variants (orthographic variants).
6. Marking errors on versions of OCR/transcribed text.
7. Exposing semantic metadata (as a SPARQL endpoint).
8. Accessing search functionality through APIs.
9. Allowing users to highlight in text (keywords).
10. Allowing users to annotate concepts if incorrectly
recognized or missed.
Application to Query Expansion
• an interface for searching documents using a
species name as a query
• query is automatically expanded by retrieving
synonyms/semantically related names from
the term inventory
• documents mentioning all of the names in the
expanded query are returned
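The expansion step described above can be sketched in a few lines. The names in `INVENTORY` are the ones shown on the synonymy slide; everything else (the function names, the sample documents, and the choice to treat the expanded names as alternatives, so a document matches if it mentions any of them) is an assumption made for illustration.

```python
def expand_query(name, inventory):
    """Return the query name followed by its related names, without duplicates."""
    expanded = [name]
    for related in inventory.get(name, []):
        if related not in expanded:
            expanded.append(related)
    return expanded


def search(documents, names):
    """Return ids of documents mentioning any of `names` (assumed OR semantics)."""
    lowered = [n.lower() for n in names]
    return [
        doc_id
        for doc_id, text in documents.items()
        if any(n in text.lower() for n in lowered)
    ]


# Term inventory entry built from the names on the synonymy slide.
INVENTORY = {
    "Campanula portenschlagiana": ["Campanula muralis", "Campanula affinis"],
}

# Hypothetical documents.
DOCUMENTS = {
    "doc1": "Campanula muralis grows on old walls.",
    "doc2": "Notes on Campanula portenschlagiana in cultivation.",
    "doc3": "A list of Buxaceae.",
}

names = expand_query("Campanula portenschlagiana", INVENTORY)
hits = search(DOCUMENTS, names)
```

Without expansion, a search for the query name alone would miss `doc1`, which uses only a synonym.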
Term Inventory
• compilation of species names (flowering
plants, mammals, birds)
• acts as a thesaurus, as each name is linked to
its synonyms as well as other semantically
related names
• “semantic relatedness”: defined in terms
of a contextual similarity measure, computed
over the entire Digital Library corpus
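The slide does not say which contextual similarity measure is used. As one common possibility, and purely as an assumption, the sketch below compares terms by the cosine similarity of their context-word counts. The window size, the whitespace tokenizer, the single-word terms and the toy sentences are all simplifications invented for this example.

```python
import math
from collections import Counter


def context_vector(term, sentences, window=2):
    """Count words appearing within `window` tokens of each mention of `term`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, token in enumerate(tokens):
            if token == term:
                left = tokens[max(0, i - window):i]
                right = tokens[i + 1:i + 1 + window]
                counts.update(left + right)
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Toy corpus (hypothetical).
SENTENCES = [
    "the boxwood shrub grows in shade",
    "the buxus shrub grows in shade",
    "the river flows to the sea",
]

boxwood = context_vector("boxwood", SENTENCES)
buxus = context_vector("buxus", SENTENCES)
river = context_vector("river", SENTENCES)
```

In this toy corpus "boxwood" and "buxus" occur in identical contexts, so they score higher against each other than either does against "river"; over a large corpus the same idea would surface names that are used in similar ways.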
Magnoliopsida species (common) names
CHOICE 1 | CHOICE 2 | CHOICE 3
Phaseolus multiflorus | Garden pea | Argemone alba
Citrus nobilis | Sweetheart | Arabis perfoliata
Spergularia marina | Aster pauciflorus | Mimosa
Canavalia ensiformis | Physic nut | Mung bean
Chrysanthemum inodorum | Guilandina bonducella | Tilia parvifolia
Fraxinus pubescens | Arabidopsis thaliana | Pulsatilla vulgaris
Symphoricarpos orbiculatus | Turritis glabra | Medick
Sorbus domestica | Lespedeza reticulata | Hypericum galioides
Haematoxylon campechianum | Scaevola lobelia | Alliaria petiolata
Real Use Cases
"Collected by who?
Zambia 1934.....
Stuck again!!
@KewDC"
Dr. Sandra Knapp
(@SandyKnapp)
Mar. 11, 2016
Real Use Cases
May 25, 2016:
On the etymology of the word "elephant" and
the origins of the word "tamarind", the "Indian
date"
Sketches of the natural history of Ceylon : with narratives and anecdotes
illustrative of the habits and instincts of the mammalia, birds, reptiles,
fishes, insects, etc. including a monograph of the elephant
https://twitter.com/WUlate/status/734805482536198144
Disqus
• Annotation functionality was made available as a trial within the
portal from December of 2015 through June 2016 as part of the
IMLS-funded Mining Biodiversity project.
• A social commenting tool that allowed users to add comments to
individual pages in a book and follow users and discussions about
those books.
• The following tasks were carried out:
1. Created Requirements document to outline the commenting tool
needs and how Disqus achieved them.
2. Coordinated with Disqus development staff to determine how best
to implement Disqus to meet those needs.
3. Tool built and implemented in Portal.
4. Extensive testing of the feature before launching the tool.
5. Developed User Tutorials and Outreach Content to announce the
feature to the public and provide training for its use.
Disqus
• In 6 months, 188 individual annotations were received
and stored in the Disqus repository.
• The tool was discontinued within BHL because it was
proprietary and would not have served well as a
long-term, scalable solution: customizations were
limited, and annotations were stored on Disqus servers
rather than BHL servers.
• The trial demonstrated a desire from users to actively
engage in the annotation process within a digital
library interface.
• Citizen scientists and librarians were among the most
active profiles in generating annotations.
• The International Plant Names Index (IPNI) is a database of the
names and associated basic bibliographical details of seed plants,
ferns and lycophytes.
• Its goal is to eliminate the need for repeated reference to primary
sources for basic bibliographic information about plant names.
• The data are freely available and are gradually being standardized
and checked.
• IPNI is a dynamic resource, depending on direct contributions by all
members of the botanical community.
http://www.ipni.org
Why Botanists?
Botanico-Periodicum-Huntianum (1968)
Worldwide bibliography of periodicals
• 12,000 titles (45 languages)
• title abbreviations
• cross-referenced to other published
abbreviations and complete titles
• details of volumation and duration
• and other basic bibliographic data.
BPH-2 (2004)
Periodicals with Botanical Content
Second edition of B-P-H
Alphabetical title list (1665 – 2002)
33,000 titles from around the world
Agriculture, Agronomy, Bacteriology,
Biology, Biotechnology,
Botanical Bibliography and History,
Conservation, Ecology,
Environmental Science, Floriculture,
Forestry, Fruit growing,
Genetics and Plant breeding,
Geography, Horticulture,
Hydrobiology and Limnology,
Immunology and Toxicology,
Medical Mycology,
Microbiology and Microscopy,
Molecular biology, Palaeontology,
Pharmacology and Pharmacognosy,
Plant pathology and Vegetable crops, etc.
B-P-H/Supplementum (1991)
• 25,000 title entries arranged by title
• key to entries in both volumes.
• Citation abbreviations for all titles
• improved cross-referencing
• expanded thesaurus of title words
and their abbreviation equivalents
• included periodicals dealing with
biotechnology, molecular biology,
environmental studies and
conservation.
Landscape Review
New Media Consortium
• Horizon Report Library Edition
Few examples of adoption within Libraries
Except for:
• Australia’s Trove and
• Europeana Sounds Project
Lack of Available tools? No
Consumers As Creators
Planning Grant
From: May 2018
To: Apr 2019
#ConsumersAsCreators
Purpose
Analyze Web annotation needs of the
botanical community and develop a
prototype of how those needs may
be met within a digital library
platform
Results from this project will be useful to the
following audiences:
• Librarians looking to improve their virtual library by
enabling users to add value to their content.
• Botanists who want to enhance the corpus of their
digital library collection by augmenting knowledge
through the annotations provided.
• Developers who want to choose a tool to enable
annotations in their online solutions, particularly within
digital library platforms.
Deliverables:
• Needs Analysis Report with prioritized list of annotation
needs for users of a botanical virtual library.
• Feasibility Study with the evaluation of four open source
existing annotation tools based on their potential to
address the needs identified in the Analysis Report
• Proof of concept prototype installed within a virtual
library to demonstrate the functional capacity of one of
the evaluated tools
• Outcomes Assessment with next step recommendations
to propose a full-scale project adopting an annotation
tool as part of a virtual library.
Needs analysis report
Using a case research approach,
• Interview 10 users of a botanical virtual
library from 5 separate institutions
• Answers will be analyzed and classified
by user type, purpose and function
Feasibility study
Four existing annotation tools will be thoroughly
evaluated against the needs analysis in order to
develop a feasibility study for how they could satisfy
botanists’ needs
digilib
Proof of concept prototype
RERUM will be integrated within a digital
library platform as proof-of-concept
Outcomes assessment and next steps
• Identify requisites, best practices, and
further developments for research project
• Identify appropriate partners
Interested in joining us?
Contact:
Trish Rose-Sandler trish.rose-sandler@mobot.org
William Ulate william.ulate@mobot.org

Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
module for grade 9 for distance learning
module for grade 9 for distance learningmodule for grade 9 for distance learning
module for grade 9 for distance learning
 
pumpkin fruit fly, water melon fruit fly, cucumber fruit fly
pumpkin fruit fly, water melon fruit fly, cucumber fruit flypumpkin fruit fly, water melon fruit fly, cucumber fruit fly
pumpkin fruit fly, water melon fruit fly, cucumber fruit fly
 
Factory Acceptance Test( FAT).pptx .
Factory Acceptance Test( FAT).pptx       .Factory Acceptance Test( FAT).pptx       .
Factory Acceptance Test( FAT).pptx .
 
Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptx
 
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptxCOST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
 
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
 

Botanists and annotations: use cases and their relevance for the larger scientific community

  • 1. Botanists and Annotations: use cases and their relevance for the larger scientific community William Ulate Trish Rose-Sandler Center for Biodiversity Informatics Missouri Botanical Garden Jun. 2018
  • 2. Where do we come from? Why are we here? I Annotate Conference, Berlin (2016) The uptake of web annotation could be sufficiently moved forward by tackling three key issues: 1) interoperability 2) domain use cases 3) user-centered design
  • 3. Darwin Virtual Library (2011) Charles Darwin’s Library is a digital edition and virtual reconstruction of the surviving books owned by Charles Darwin. In 1908, Charles Darwin’s son, Francis, transferred what he called the ‘Darwin Library’ to the Botany School at Cambridge University. ‘The chief interest of the Darwin books lies in the pencil notes scribbled on their pages, or written on scraps of paper and pinned to the last page.’ – Francis Darwin. Darwin read to gather evidence, to explore and define the research possibilities of his evolutionary ideas, and to gauge reactions to his own publications. What this digital reconstruction of the Darwin Library delivers is the ability to retrace and reduplicate Darwin’s reading of a wealth of materials. https://www.biodiversitylibrary.org/collection/darwinlibrary
  • 7. Mining Biodiversity • Transform BHL into a next-generation social digital library • A multi-disciplinary approach – Text Mining – Machine learning – History of Science – Environmental History & Studies – Library and Information Science – Social Media This project was made possible in part by the Institute of Museum and Library Services [LG-00-14-04-0032-14]. http://miningbiodiversity.com
  • 9. What’s wrong with keyword-based search: Polysemy • Ambiguity! Boxwood: a historic place in Alabama? The North American term for plants in the Buxaceae family? California bay: a hardwood tree? A location?
  • 10. What’s wrong with keyword-based search: Synonymy Campanula portenschlagiana Schult.; Campanula affinis Rchb. ex Nyman; Campanula muralis Port. ex A. DC.
  • 12. Semantic metadata generation • Entity types – species – location – habitat – anatomical parts – qualities – persons – temporal expressions • Association types – observation – habitation – nutrition – trait
  • 13. Examples of semantic metadata (annotations) • Observation • Habitation
  • 14. Examples of semantic metadata (annotations) • Nutrition • Trait
  • 15. Text mining-based approach [Workflow diagram; box and arrow labels: Seed documents, Unlabelled documents, Learn semantics, Annotate, Annotator/Curator, Validate, Feedback, Store, Search index]
  • 17. Enhanced document viewing Page in PDF/image format OCR-corrected text with colour-coded annotations
  • 18. Text Annotation Use Cases Annotator Use Case: I am a contributing participant, adding or curating annotations in the Biodiversity Digital Library. Searcher Use Case: I am a user of the Biodiversity Digital Library, searching for content that is indexed by annotations. Admin Use Case
  • 19. Annotator Use Case • Add an annotation by selecting text • Conveniently select an appropriate annotation (autocomplete, dropdown menu) • “Cross out” an annotation (e.g., a homonym) and toggle showing it • Modify which text is selected and/or change the annotation term associated with my own or a pre-existing annotation • Confirm or agree with an existing annotation • Show a measure of certainty on an annotation, either a count of how many people agree, or just “Confirmed” versus “Still in need of review” • Easily browse existing annotations in a document (using the tab or next button) • Browse annotations filtered by their status (confirmed, crossed out, review) • Find documents by annotation status • Find documents that interest me (combine the solution above with search or filter by other document metadata: keyword, title, author, etc.)
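The annotator needs above (cross out, confirm, show certainty, filter by status) can be pictured as a small record type. This is a minimal sketch only: the class, its field names, and the two-agreement threshold for "confirmed" are assumptions made for illustration, not part of the project's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """One annotation on a span of page text (all field names are hypothetical)."""
    text: str                   # the selected text
    term: str                   # the annotation term, e.g. "species"
    crossed_out: bool = False   # rejected, e.g. because the match is a homonym
    agreed_by: set = field(default_factory=set)  # users who confirmed it

    def status(self, threshold: int = 2) -> str:
        """Derive a status; the threshold of 2 agreements is an assumption."""
        if self.crossed_out:
            return "crossed out"
        return "confirmed" if len(self.agreed_by) >= threshold else "review"


def filter_by_status(annotations, status):
    """Return the annotations whose derived status equals `status`."""
    return [a for a in annotations if a.status() == status]
```

Storing who agreed, rather than a bare counter, supports both displays named on the slide: the count of agreeing users and the simpler "Confirmed" versus "Still in need of review" label.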
  • 20. Searcher Use Case • Discover annotation terms to search by (autocomplete, drop-down menu, browsable tree of terms) • Navigate to locations in documents from my search (search results show truncated text found and a link to the location of the annotated text) • Download search results (several columns: annotation term; the chunk of text containing the annotation; URL to the location of the annotated text) • Search for documents containing combinations of terms • Search for combinations of terms in proximity to each other in the text • Search for facts based on semantic combinations or relative positions of terms (e.g., “Leptinotarsa” “feed on” ?) • Retrieve search results for associated terms: asking for water bodies should return rivers, bays, lakes, seas, etc.; asking for butterflies should return all the Lepidoptera species.
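The last searcher need, retrieving results for associated terms, amounts to expanding a query term into everything beneath it in a term hierarchy. The sketch below assumes a hypothetical mapping from each broader term to its narrower terms; the "water body" entries come from the slide, while the nested example under "sea" is invented for illustration.

```python
# Hypothetical hierarchy: each broader term maps to its narrower terms.
NARROWER = {
    "water body": ["river", "bay", "lake", "sea"],
    "sea": ["inland sea"],  # invented entry to show nesting
}


def expand_term(term, narrower):
    """Return the term followed by all of its descendants, depth-first."""
    expanded, stack, seen = [], [term], set()
    while stack:
        current = stack.pop()
        if current in seen:   # guards against cycles in the hierarchy
            continue
        seen.add(current)
        expanded.append(current)
        # Reverse so that children are visited in their listed order.
        stack.extend(reversed(narrower.get(current, [])))
    return expanded


terms = expand_term("water body", NARROWER)
```

A search for "water body" would then be run against every entry in `terms` rather than against the literal phrase alone.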
  • 21. Test your hypothesis with real Use Cases
  • 22. Enhanced searching of content: faceted search; automatically generated questions; time-sensitive search
  • 23. Search by facets Opisthoproctus soleatus reported between 1840 and 1950, filtered by Habitat, Morphology and Reproduction annotations. • Taxonomy (73) • Geography (18) • Habitat (61) • Traits (57) - Morphology (20) - Feeding (35) - Reproduction (10) • Publication (73) - Journal (21) - Author (63) - Collection (10)
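The facet list with counts in parentheses can be produced by filtering the matching records and then tallying each facet value. The following is a rough sketch under stated assumptions: the record layout, the function names, and the sample records are all invented for illustration, and the annotation filter is read as "any of the requested types", which is only one possible interpretation of the slide.

```python
from collections import Counter

# Invented sample records; the real index structure is not described in the slides.
RECORDS = [
    {"taxon": "Opisthoproctus soleatus", "year": 1888, "annotations": ["Habitat", "Morphology"]},
    {"taxon": "Opisthoproctus soleatus", "year": 1935, "annotations": ["Feeding"]},
    {"taxon": "Opisthoproctus soleatus", "year": 1975, "annotations": ["Habitat"]},
    {"taxon": "Leptinotarsa", "year": 1900, "annotations": ["Habitat"]},
]


def search_records(records, taxon, start_year, end_year, annotation_types):
    """Keep records for the taxon, within the year range, carrying any wanted annotation type."""
    wanted = set(annotation_types)
    return [
        r for r in records
        if r["taxon"] == taxon
        and start_year <= r["year"] <= end_year
        and wanted & set(r["annotations"])
    ]


def facet_counts(records, facet):
    """Count how many records carry each value of a list-valued facet."""
    counts = Counter()
    for record in records:
        counts.update(set(record.get(facet, [])))  # count each value once per record
    return counts


hits = search_records(RECORDS, "Opisthoproctus soleatus", 1840, 1950,
                      ["Habitat", "Morphology", "Reproduction"])
counts = facet_counts(hits, "annotations")
```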
  • 24. Automatically generated questions Opisthoproctus soleatus reported between 1840 and 1950, filtered by Habitat, Morphology and Reproduction annotations. Ask a question: - Which species taxa are related to Opisthoproctus soleatus? - In which geographical locations can I find Opisthoproctus soleatus? - What other species are co-located with Opisthoproctus soleatus? - In which environments does Opisthoproctus soleatus live? - What other species are in the same habitat as Opisthoproctus soleatus? - What are the characteristics of Opisthoproctus soleatus? - What other species share the same characteristics as Opisthoproctus soleatus? Respondent feedback: there is no strong sentiment on whether this functionality is definitely useful; “this is very relevant to their work” (50%); “I can see how it can be useful but not currently” (50%).
  • 25. Searching by subject-verb-object Leptinotarsa feeds on ? reported between 1840 and 1950. Respondent feedback: they can see how the graph-based visualization of results can be useful, but not for their current purposes.
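A subject-verb-object query such as "Leptinotarsa feeds on ?" can be read as a pattern over stored triples in which the unknown slot is a wildcard. This is a minimal sketch, assuming the extracted facts are available as plain (subject, verb, object) tuples; the function name and the sample triples are illustrative and do not come from the project's data.

```python
# Illustrative triples only; not taken from the project's index.
TRIPLES = [
    ("Leptinotarsa", "feeds on", "Solanum"),
    ("tarsier", "feeds on", "insects"),
    ("tarsier", "lives in", "forest"),
]


def match_triples(triples, subject=None, verb=None, obj=None):
    """Return triples matching the pattern; None plays the role of the '?' slot."""
    return [
        t for t in triples
        if (subject is None or t[0] == subject)
        and (verb is None or t[1] == verb)
        and (obj is None or t[2] == obj)
    ]


answers = [t[2] for t in match_triples(TRIPLES, subject="Leptinotarsa", verb="feeds on")]
```

Leaving a different slot open answers a different question with the same function, for example `match_triples(TRIPLES, verb="feeds on", obj="insects")` for "what feeds on insects?".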
  • 26. Searching for directly associated concepts I’m looking for Taxa/Geographic locations/Habitats/Traits directly associated with Eltanin reported between 1840 and 1950. Respondent feedback: this is very relevant to what they are doing (50%); it might be useful but not for their current purposes (40%); there is no strong indication of whether this feature is definitely wanted by our respondents.
  • 27. Searching for indirectly associated concepts I’m looking for Taxa indirectly associated with tarsier via Geographic locations reported between 1840 and 1950. Respondent feedback: they can see its benefits but not for what they are currently doing (50%); it will be definitely useful (26%).
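The difference between direct and indirect association can be sketched as one hop versus two hops in a graph of typed concepts. Everything here is an assumption made for illustration: concepts are modelled as (name, type) tuples, associations as undirected pairs, and the sample pairs are invented rather than drawn from the corpus.

```python
from collections import defaultdict

# Invented associations between (name, type) concepts.
ASSOCIATIONS = [
    (("tarsier", "taxon"), ("Borneo", "location")),
    (("orangutan", "taxon"), ("Borneo", "location")),
    (("tarsier", "taxon"), ("forest", "habitat")),
]


def build_index(associations):
    """Map every concept to the set of concepts it is directly associated with."""
    neighbours = defaultdict(set)
    for a, b in associations:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return neighbours


def direct(neighbours, concept, wanted_type):
    """Concepts of `wanted_type` one hop away from `concept`."""
    return {n for n in neighbours.get(concept, set()) if n[1] == wanted_type}


def indirect(neighbours, concept, via_type, wanted_type):
    """Concepts of `wanted_type` reached through an intermediate concept of `via_type`."""
    found = set()
    for middle in direct(neighbours, concept, via_type):
        found |= direct(neighbours, middle, wanted_type)
    found.discard(concept)  # the starting concept is not its own associate
    return found


index = build_index(ASSOCIATIONS)
taxa_via_location = indirect(index, ("tarsier", "taxon"), "location", "taxon")
```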
  • 28. Use Cases 1. Finding the original description (taxonomic research). 2. Finding host plants, for example (ecological research). 3. Finding illustrations and plates. 4. Finding taxon name usage instances (taxonomic treatment, nomenclatural act). 5. Capturing spelling variants (orthographic variants). 6. Marking errors on versions of OCR/transcribed text. 7. Exposing semantic metadata (as a SPARQL endpoint). 8. Accessing search functionalities through APIs. 9. Allowing users to highlight in text (keywords). 10. Allowing users to annotate concepts if incorrectly recognized or missed.
  • 29. Application to Query Expansion • an interface for searching documents using a species name as a query • query is automatically expanded by retrieving synonyms/semantically related names from the term inventory • documents mentioning all of the names in the expanded query are returned
  • 30. Term Inventory • compilation of species names (flowering plants, mammals, birds) • acts as a thesaurus, as each name is linked to its synonyms as well as other semantically related names • “semantic relatedness”: defined in terms of a contextual similarity measure, computed over the entire Digital Library corpus
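Query expansion against a term inventory can be sketched as a thesaurus lookup followed by a document match. This is an illustrative sketch only: the inventory layout, the function names, and the sample pages are assumptions, the synonyms are the ones shown on the Synonymy slide, and the `require_all` flag exists because the slide's wording ("all of the names") can be read either strictly or as "any of the names".

```python
# Hypothetical inventory layout; synonyms taken from the Synonymy slide.
TERM_INVENTORY = {
    "Campanula portenschlagiana": {
        "synonyms": ["Campanula affinis", "Campanula muralis"],
        "related": [],  # semantically related names would go here
    },
}

# Invented sample pages.
DOCUMENTS = {
    "p1": "Campanula muralis grows on old walls.",
    "p2": "Notes on Buxus sempervirens.",
    "p3": "Campanula portenschlagiana in cultivation.",
}


def expand_query(name, inventory):
    """Return the query name plus its synonyms and related names, without repeats."""
    entry = inventory.get(name, {})
    expanded = [name]
    for key in ("synonyms", "related"):
        for other in entry.get(key, []):
            if other not in expanded:
                expanded.append(other)
    return expanded


def find_documents(documents, names, require_all=False):
    """Return ids of documents mentioning any (or, if require_all, every) name."""
    combine = all if require_all else any
    return [
        doc_id for doc_id, text in documents.items()
        if combine(n.lower() in text.lower() for n in names)
    ]


names = expand_query("Campanula portenschlagiana", TERM_INVENTORY)
hits = find_documents(DOCUMENTS, names)
```

With the expansion, page `p1` is found even though it never uses the accepted name, which is the synonymy problem the earlier slide describes.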
  • 32. Magnoliopsida species (common) names. Column headings: CHOICE 1, CHOICE 2, CHOICE 3. Names listed: Phaseolus multiflorus; Garden pea; Argemone alba; Citrus nobilis; Sweetheart; Arabis perfoliata; Spergularia marina; Aster pauciflorus; Mimosa; Canavalia ensiformis; Physic nut; Mung bean; Chrysanthemum inodorum; Guilandina bonducella; Tilia parvifolia; Fraxinus pubescens; Arabidopsis thaliana; Pulsatilla vulgaris; Symphoricarpos orbiculatus; Turritis glabra; Medick; Sorbus domestica; Lespedeza reticulata; Hypericum galioides; Haematoxylon campechianum; Scaevola lobelia; Alliaria petiolata
  • 33. Real Use Cases "Collected by who? Zambia 1934..... Stuck again!! @KewDC" Dr. Sandra Knapp (@SandyKnapp), Mar. 11, 2016
  • 35. Real Use Cases May 25, 2016: On the etymology of the word "elephant" and the origins of the word "tamarind", the "Indian date". Sketches of the natural history of Ceylon : with narratives and anecdotes illustrative of the habits and instincts of the mammalia, birds, reptiles, fishes, insects, etc. including a monograph of the elephant https://twitter.com/WUlate/status/734805482536198144
  • 36. Disqus • Annotation functionality was made available as a trial within the portal from December of 2015 through June 2016 as part of the IMLS-funded Mining Biodiversity project. • A social commenting tool that allowed users to add comments to individual pages in a book and follow users and discussions about those books. • The following tasks were carried out: 1. Created Requirements document to outline the commenting tool needs and how Disqus achieved them. 2. Coordinated with Disqus development staff to determine how best to implement Disqus to meet those needs. 3. Tool built and implemented in Portal. 4. Extensive testing of the feature before launching the tool. 5. Developed User Tutorials and Outreach Content to announce the feature to the public and provide training for its use.
  • 38. Disqus • In 6 months, 188 individual annotations were received and stored in the Disqus repository. • The tool was discontinued within BHL for three reasons: it is proprietary and would not have served well as a long-term, scalable solution; customizations to the tool were limited; and annotations were stored on Disqus servers rather than BHL servers. • The trial demonstrated a desire from users to actively engage in the annotation process within a digital library interface. • Citizen scientists and librarians were among the most active profiles in generating annotations.
  • 40. Why Botanists? • The International Plant Names Index (IPNI) is a database of the names and associated basic bibliographical details of seed plants, ferns and lycophytes. • Its goal is to eliminate the need for repeated reference to primary sources for basic bibliographic information about plant names. • The data are freely available and are gradually being standardized and checked. • IPNI is a dynamic resource, depending on direct contributions by all members of the botanical community. http://www.ipni.org
  • 41. Why Botanists? Botanico-Periodicum-Huntianum (1968) Worldwide bibliography of periodicals • 12,000 titles (45 languages) • title abbreviations • cross-referenced to other published abbreviations and complete titles • details of volumation and duration • and other basic bibliographic data. BPH-2 (2004) Periodicals with Botanical Content Second edition of B-P-H Alphabetical title list (1665 – 2002) 33,000 titles from around the world Agriculture, Agronomy, Bacteriology, Biology, Biotechnology, Botanical Bibliography and History, Conservation, Ecology, Environmental Science, Floriculture, Forestry, Fruit growing, Genetics and Plant breeding, Geography, Horticulture, Hydrobiology and Limnology, Immunology and Toxicology, Medical Mycology, Microbiology and Microscopy, Molecular biology, Palaeontology, Pharmacology and Pharmacognosy, Plant pathology and Vegetable crops, etc. B-P-H/Supplementum (1991) • 25,000 title entries arranged by title • key to entries in both volumes. • Citation abbreviations for all titles • improved cross-referencing • expanded thesaurus of title words and their abbreviation equivalents • included periodicals dealing with biotechnology, molecular biology, environmental studies and conservation.
  • 42. Landscape Review New Media Consortium • Horizon Report Library Edition Few examples of adoption within Libraries Except for: • Australia’s Trove and • Europeana Sounds Project Lack of Available tools? No
  • 43. Consumers As Creators Planning Grant From: May 2018 To: Apr 2019 #ConsumersAsCreators
  • 44. Purpose Analyze Web annotation needs of the botanical community and develop a prototype of how those needs may be met within a digital library platform
  • 45. Results from this project will be useful to the following audiences: • Librarians looking to improve their virtual library by enabling users to add value to their content. • Botanists who want to enhance the corpus of their digital library collection by augmenting knowledge through the annotations provided. • Developers who want to choose a tool to enable annotations in their online solutions, particularly within digital library platforms.
  • 46. Deliverables: • Needs Analysis Report with a prioritized list of annotation needs for users of a botanical virtual library. • Feasibility Study evaluating four existing open-source annotation tools on their potential to address the needs identified in the Needs Analysis Report. • Proof-of-concept prototype installed within a virtual library to demonstrate the functional capacity of one of the evaluated tools. • Outcomes Assessment with next-step recommendations to propose a full-scale project adopting an annotation tool as part of a virtual library.
  • 47. Needs analysis report Using a case research approach: • Interview 10 users of a botanical virtual library from 5 separate institutions • Answers will be analyzed and classified by user type, purpose and function
  • 48. Feasibility study Four existing annotation tools will be thoroughly evaluated against the needs analysis in order to develop a feasibility study for how they could satisfy botanists’ needs digilib
  • 49. Proof of concept prototype RERUM will be integrated within a digital library platform as proof-of-concept
  • 50. Outcomes assessment and next steps • Identify requisites, best practices, and further developments for research project • Identify appropriate partners
  • 51. Interested in joining us? Contact: Trish Rose-Sandler trish.rose-sandler@mobot.org William Ulate william.ulate@mobot.org