Encyclopedia of Life: Applying Concepts from Amazon and LEGO to Biodiversity Informatics

Encyclopedia of Life
eol.org
@eol
Cynthia Parr
@cydparr
GSC15 23 April 2013

How EOL works
EOL
Crowds
Harvest
Third party applications

EOL
Plinian Core
DwC description
SPM infoitem
using Dublin Core & Audubon Core for other metadata
Darwin Core Archive flat files as transport mechanism
Sharing process adds semantics to content objects
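
To make the "flat files as transport mechanism" idea concrete, here is a minimal sketch of reading the core table of a Darwin Core Archive, which is a zip file holding a meta.xml descriptor plus delimited text files. The archive is built in memory so the example is self-contained. The file name taxon.csv, the taxon IDs, and the species names are hypothetical placeholders, and this is not a description of EOL's own harvesting code.

```python
import csv
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by Darwin Core Archive meta.xml descriptors.
DWC_TEXT_NS = "{http://rs.tdwg.org/dwc/text/}"

# Hypothetical descriptor: one core table of taxa with two mapped columns.
META_XML = """<?xml version="1.0" encoding="UTF-8"?>
<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core rowType="http://rs.tdwg.org/dwc/terms/Taxon"
        fieldsTerminatedBy="," ignoreHeaderLines="1">
    <files><location>taxon.csv</location></files>
    <id index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/scientificName"/>
    <field index="2" term="http://rs.tdwg.org/dwc/terms/taxonRank"/>
  </core>
</archive>
"""

# Placeholder rows, not real records.
TAXON_CSV = (
    "taxonID,scientificName,taxonRank\n"
    "t1,Example species one,species\n"
    "t2,Example species two,species\n"
)


def build_archive() -> bytes:
    """Zip the descriptor and the data file into an in-memory archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("meta.xml", META_XML)
        archive.writestr("taxon.csv", TAXON_CSV)
    return buffer.getvalue()


def read_core(archive_bytes: bytes) -> list:
    """Return core rows as dicts keyed by the short Darwin Core term name."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        meta = ET.fromstring(archive.read("meta.xml"))
        core = meta.find(f"{DWC_TEXT_NS}core")
        location = core.find(f"{DWC_TEXT_NS}files/{DWC_TEXT_NS}location").text
        delimiter = core.get("fieldsTerminatedBy", ",")
        skip = int(core.get("ignoreHeaderLines", "0"))
        # Map column index -> name, taken from the descriptor rather than
        # from the header line of the data file.
        columns = {int(core.find(f"{DWC_TEXT_NS}id").get("index")): "id"}
        for field in core.findall(f"{DWC_TEXT_NS}field"):
            columns[int(field.get("index"))] = field.get("term").rsplit("/", 1)[-1]
        text = archive.read(location).decode("utf-8")
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))[skip:]
    return [{name: row[i] for i, name in columns.items()} for row in rows]


records = read_core(build_archive())
```

The point of the design is that meta.xml, not the column headers, ties each column to a shared term, which is one way the sharing process can add semantics to otherwise plain flat files.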

EOL Today
Key Milestones in 2013
1.1 million species pages
240+ content providers
3 million unique visitors
from 223 countries &
territories

EOL
GBIF
NCBI
with Anne Bowser, University of Maryland
EOL connects hubs

BioNames
Rod Page
Ryan Schenk
iphylo.blogspot.com

Anatolia Zooarchaeology Case Study
led by Alexandria Archive Institute
Research goals and outcomes:
– Improve archaeological data
collection / documentation practices
– Better understanding of gaps (spatial
and temporal)
– Integrated biometrics show complex
patterns (introduction of domestic
and continued use of wild animals
by region)
– Aligning data to EOL taxon identifiers
helps draw out patterns in relative
proportion of taxa over time and
space across many assemblages
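
As a rough illustration of the last outcome, the sketch below maps local specimen labels to shared taxon identifiers and then computes the relative proportion of each taxon per site and period. The identifiers ("taxon:sheep", "taxon:goat"), the site and period names, and the specimen rows are all hypothetical placeholders, not real EOL identifiers or project data.

```python
from collections import Counter, defaultdict

# Hypothetical lookup from local labels (lowercased) to a shared taxon identifier.
LABEL_TO_TAXON = {
    "sheep": "taxon:sheep",
    "ovis aries": "taxon:sheep",
    "goat": "taxon:goat",
    "capra hircus": "taxon:goat",
}


def relative_proportions(specimens):
    """Return {(site, period): {taxon_id: share of specimens}}."""
    counts = defaultdict(Counter)
    for site, period, label in specimens:
        taxon = LABEL_TO_TAXON[label.strip().lower()]
        counts[(site, period)][taxon] += 1
    return {
        key: {taxon: n / sum(counter.values()) for taxon, n in counter.items()}
        for key, counter in counts.items()
    }


# Placeholder specimen rows: (site, period, label as recorded locally).
specimens = [
    ("Site 1", "Period I", "Sheep"),
    ("Site 1", "Period I", "Ovis aries"),
    ("Site 1", "Period I", "goat"),
    ("Site 2", "Period I", "Capra hircus"),
]

proportions = relative_proportions(specimens)
```

Without the shared identifier, "Sheep" and "Ovis aries" would be counted as two different taxa and the proportions would not be comparable across assemblages.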

EOL Computable Data Challenge
1. 14 different sites
2. 34+ zooarchaeologists
3. Decoding, cleanup, metadata documentation
4. 220,000+ specimens
5. 450 entities linked to 143 EOL taxon concepts
6. Anatomical entities linked to Uberon.org
7. Biometrics linked to measurement ontology
8. Collaborative analysis
http://opencontext.org/

[Bar chart: number of text objects (x-axis, 0 to 800,000) by subject of text object (y-axis). Subjects listed: Distribution, MolecularBiology, Multiple topics, TypeInformation, Habitat, ConservationStatus, Threats, Morphology, Conservation, Management, Trends, Size, Associations, Uses, TrophicStrategy, Cyclicity & Life Cycle, PopulationBiology, Reproduction, Migration, Taxonomy, LifeExpectancy, Identification, Behaviour, Ecology, Diseases]

Promote NLP text mining,
crowdsourcing, standardizing
• Species Interaction Datasets—Integration,
Visualization, and Analysis (Poelen and Mungall)
• Discovering EnvO habitat terms in EOL contents
(Pafilis)
• Altitude Specificity of Flower Coloration (Wright)
• Crowd-sourced data to examine morphological
impacts of extinction risk in ray-finned fishes
(Chang)
• Macroecological patterns in butterfly-hostplant
associations (Ferrer-Paris)

EOL GloBI
Global Biotic Interactions
Challenge: Species interaction datasets are mostly
buried in flat files & custom formats.
Plan: Build infrastructure for normalizing and aggregating
species interaction datasets and make them accessible
through flat files (Darwin Core Archive), web services,
and semantic web endpoints (SPARQL).
Eventually: Publish biotic interaction ontology re-using
existing ontologies, re-integrate with EOL
Enable semantic interoperability to allow for cross-functional
analysis (e.g. How does a parasite regulate gene
expression of host?)
Poelen, Mungall, Simons, Reiz
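
The normalize-and-aggregate plan above can be sketched roughly as follows: records from differently shaped source files are mapped onto one common record type, and identical interactions are then counted across datasets. The column names, the verb vocabulary, and the taxon names are hypothetical; only the preysOn term comes from these slides. This is not GloBI's actual implementation.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    source_taxon: str
    interaction_type: str
    target_taxon: str


# Hypothetical mapping from free-text verbs to one shared interaction term.
INTERACTION_TERMS = {
    "eats": "preysOn",
    "preys on": "preysOn",
    "predator of": "preysOn",
}


def clean_name(name: str) -> str:
    """Collapse stray whitespace in a taxon name."""
    return " ".join(name.split())


def normalize(record: dict, mapping: dict) -> Interaction:
    """Map one source record onto the common shape using a per-dataset column mapping."""
    verb = record[mapping["type"]].strip().lower()
    return Interaction(
        source_taxon=clean_name(record[mapping["source"]]),
        interaction_type=INTERACTION_TERMS[verb],
        target_taxon=clean_name(record[mapping["target"]]),
    )


def aggregate(datasets) -> Counter:
    """Count how often each normalized interaction occurs across all datasets."""
    counts = Counter()
    for records, mapping in datasets:
        for record in records:
            counts[normalize(record, mapping)] += 1
    return counts


# Two placeholder datasets with different column names and wording.
dataset_a = [{"predator": "Taxon A", "verb": "eats", "prey": "Taxon  B"}]
dataset_b = [
    {"consumer": " Taxon A", "relation": "Preys on", "resource": "Taxon B"},
    {"consumer": "Taxon C", "relation": "eats", "resource": "Taxon B"},
]

counts = aggregate([
    (dataset_a, {"source": "predator", "type": "verb", "target": "prey"}),
    (dataset_b, {"source": "consumer", "type": "relation", "target": "resource"}),
])
```

Once every dataset is in the same shape, exporting to flat files or serving the records through a web API becomes a question of serialization rather than per-dataset parsing.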

http://globalbioticinteractions.wordpress.com/
14 datasets containing 25k
taxa, 422k interactions,
for 3k locations
alpha version of ingestion,
normalization, aggregation
alpha version of web API
alpha version of data
exports
Dr. Katy Börner led
Information Visualization
MOOC

Easy access to analyzable trait data
“Are blue organisms more common in high altitudes?”
“How can I predict vulnerability to climate change based
on life history characteristics?”
“What organisms should I collect to fill in gaps in genome
quality tissue collections?”
• Look for data type, download for all taxa
• Create a collection of taxa, download all data
• Use Reol: an R interface to EOL (Banbury, O'Meara)
http://barbbanbury.info/barbbanbury/Reol.html
• Find more specialized data repositories

Adding traits to EOL
Funded: Marine focus
<scientific name> <hasAvgBodyMass in g> <value>
<scientific name> <preysOn> <scientific name>
Harvest and display on data tab
Add high-level semantics from coarse SPM ontology
Downloads, fancy searching
Machine access
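
One way to picture the two statement patterns above is as subject / predicate / value triples that can be filtered by predicate and written out as a flat file for download. The sketch below assumes a simple in-memory list; the predicate spelling hasAvgBodyMass_g, the taxon names, and the numbers are hypothetical placeholders rather than EOL data or the planned schema.

```python
from typing import NamedTuple, Union


class TraitStatement(NamedTuple):
    subject: str              # scientific name
    predicate: str            # trait or relationship
    value: Union[float, str]  # a measurement, or another scientific name


# Placeholder statements following the two patterns on the slide.
statements = [
    TraitStatement("Taxon A", "hasAvgBodyMass_g", 1200.0),
    TraitStatement("Taxon B", "hasAvgBodyMass_g", 35.5),
    TraitStatement("Taxon A", "preysOn", "Taxon B"),
]


def values_for(statements, predicate):
    """Return {subject: value} for every statement with the given predicate."""
    return {s.subject: s.value for s in statements if s.predicate == predicate}


def to_tsv(statements) -> str:
    """Serialize statements as tab-separated text with a header row."""
    lines = ["subject\tpredicate\tvalue"]
    lines += [f"{s.subject}\t{s.predicate}\t{s.value}" for s in statements]
    return "\n".join(lines) + "\n"


body_mass = values_for(statements, "hasAvgBodyMass_g")
export = to_tsv(statements)
```

Keeping measurements and taxon-to-taxon relationships in the same triple shape means a single download or search path can serve both kinds of data.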

INSDC
900,000 species
4,000 genomes
60 million DNA sequence records
How are these related to traits?
Next step: TraitBank

Thanks
Funding & other contributions
Sloan Foundation
Smithsonian Institution
David Rubenstein
Marine Biological Laboratory
Harvard University
Our content partners
Thousands of individual
contributors, and hundreds of
volunteer curators
Image credits
Jenny from Taipei
University of Birmingham
Cynthia Parr
Chief Scientist @eol
@cydparr parrc@si.edu
GloBI: Jorrit Poelen (lead/software), Chris Mungall
(ontologies), James Simons (biologist) and Robert
Reiz (software). Datasets shared by: Peter D.
Roopnarine, Rachel Hertog, Carlos García-Robledo,
James Simons, Jenny L. Wrast, C. Barnes,
International Council for the Exploration of the Sea
(ICES), Jose R. Ferrer Paris, Senol Akin, Malcolm
Storey (BioInfo.org.uk), Ivy E. Baremore, Joel Sachs
(SPIRE), Colt W. Cook, David A. Blewett
Alexandria Archive: Sarah Kansa, Eric
Kansa, 34 other zooarchaeologists
BioNames: Rod Page, Ryan Schenk
MOOC: Katy Börner, Twy Bethard,
Andrew Miles, Mattia Della Libera
