Encyclopedia of Life
eol.org
@eol
Cynthia Parr
@cydparr
GSC15 23 April 2013
A webpage for every species
How EOL works
EOL
Crowds
Harvest
Third party applications
EOL
Plinian Core
DwC
description
SPM
infoitem
using Dublin Core & Audubon Core for other metadata
Darwin Core Archive flat files as transport mechanism
Sharing process adds semantics to content objects
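
A minimal sketch of how a Darwin Core Archive can serve as a transport mechanism: the archive carries a meta.xml descriptor that maps each column of a flat file to a term URI, so a reader can attach meaning to otherwise plain columns. The descriptor and data below are hand-written placeholders for illustration, not EOL content, and the function name read_core is an assumption. A real archive is a zip file; this sketch skips the unzip step and works on strings.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Namespace used by Darwin Core Archive descriptors (meta.xml).
DWC_TEXT_NS = {"dwc": "http://rs.tdwg.org/dwc/text/"}

# Hypothetical descriptor and data file, written by hand for this example.
META_XML = """<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core rowType="http://rs.tdwg.org/dwc/terms/Taxon"
        fieldsTerminatedBy="," ignoreHeaderLines="1">
    <files><location>taxa.csv</location></files>
    <id index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/scientificName"/>
    <field index="2" term="http://purl.org/dc/terms/description"/>
  </core>
</archive>"""

TAXA_CSV = "id,name,text\n1,Ursus arctos,Brown bear\n2,Panthera leo,Lion\n"


def read_core(meta_xml, data_text):
    """Return core rows as dicts keyed by term URI, as the descriptor dictates."""
    core = ET.fromstring(meta_xml).find("dwc:core", DWC_TEXT_NS)
    delimiter = core.get("fieldsTerminatedBy", ",")
    skip = int(core.get("ignoreHeaderLines", "0"))
    # Column index -> term URI, taken from the <field> elements.
    columns = {
        int(field.get("index")): field.get("term")
        for field in core.findall("dwc:field", DWC_TEXT_NS)
    }
    id_index = int(core.find("dwc:id", DWC_TEXT_NS).get("index"))

    rows = []
    reader = csv.reader(io.StringIO(data_text), delimiter=delimiter)
    for line_number, cells in enumerate(reader):
        if line_number < skip or not cells:
            continue  # skip header lines and blank lines
        row = {"id": cells[id_index]}
        row.update({term: cells[i] for i, term in columns.items()})
        rows.append(row)
    return rows


rows = read_core(META_XML, TAXA_CSV)
for row in rows:
    print(row)
```

The flat file stays simple to produce, while the term URIs in the descriptor are what add semantics to each column.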
EOL Today
Key Milestones in 2013
1.1 million species pages
240+ content providers
3 million unique visitors from 223 countries & territories
EOL
GBIF
NCBI
with Anne Bowser, University of Maryland
EOL connects hubs
BioNames
Rod Page
Ryan Schenk
iphylo.blogspot.com
Anatolia Zooarchaeology Case Study
led by Alexandria Archive Institute
Research goals and outcomes:
– Improve archaeological data collection / documentation practices
– Better understanding of gaps (spatial and temporal)
– Integrated biometrics show complex patterns (introduction of domestic and continued use of wild animals by region)
– Aligning data to EOL taxon identifiers helps draw out patterns in relative proportion of taxa over time and space across many assemblages
EOL Computable Data Challenge
1. 14 different sites
2. 34+ zooarchaeologists
3. Decoding, cleanup, metadata documentation
4. 220,000+ specimens
5. 450 entities linked to 143 EOL taxon concepts
6. Anatomical entities linked to Uberon.org
7. Biometrics linked to measurement ontology
8. Collaborative analysis
http://opencontext.org/
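
The alignment step in this case study can be sketched as follows: labels recorded differently at each site are mapped to one shared taxon identifier, after which relative proportions of taxa become comparable across assemblages. Everything in the sketch is assumed for illustration: the identifiers are placeholders rather than real EOL taxon IDs, and the specimen records are invented.

```python
from collections import Counter, defaultdict

# Hypothetical mapping from site-specific labels to shared taxon identifiers.
# The identifiers are placeholders, not real EOL taxon concept IDs.
LOCAL_TO_TAXON = {
    "sheep": "taxon:ovis-aries",
    "Ovis aries": "taxon:ovis-aries",
    "O. aries": "taxon:ovis-aries",
    "goat": "taxon:capra-hircus",
    "Capra hircus": "taxon:capra-hircus",
}

# Invented specimen records: (site, label as recorded by the analyst).
SPECIMENS = [
    ("site A", "sheep"),
    ("site A", "O. aries"),
    ("site A", "goat"),
    ("site B", "Ovis aries"),
    ("site B", "Capra hircus"),
    ("site B", "Capra hircus"),
    ("site B", "unidentified"),
]


def taxon_proportions(specimens, mapping):
    """Per site, return the share of mapped specimens belonging to each taxon."""
    counts = defaultdict(Counter)
    for site, label in specimens:
        taxon = mapping.get(label)
        if taxon is None:
            continue  # labels without a mapping are left out of the totals
        counts[site][taxon] += 1
    return {
        site: {taxon: n / sum(by_taxon.values()) for taxon, n in by_taxon.items()}
        for site, by_taxon in counts.items()
    }


proportions = taxon_proportions(SPECIMENS, LOCAL_TO_TAXON)
for site, shares in proportions.items():
    print(site, shares)
```

Without the mapping, "sheep", "Ovis aries" and "O. aries" would be counted as three separate categories and the sites could not be compared.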
[Bar chart: number of text objects (axis from 0 to 800,000) by subject of text object. Subjects shown: Distribution, MolecularBiology, Multiple topics, TypeInformation, Habitat, ConservationStatus, Threats, Morphology, Conservation, Management, Trends, Size, Associations, Uses, TrophicStrategy, Cyclicity & Life Cycle, PopulationBiology, Reproduction, Migration, Taxonomy, LifeExpectancy, Identification, Behaviour, Ecology, Diseases]
Promote NLP text mining, crowdsourcing, standardizing
• Species Interaction Datasets: Integration, Visualization, and Analysis (Poelen and Mungall)
• Discovering EnvO habitat terms in EOL contents (Pafilis)
• Altitude Specificity of Flower Coloration (Wright)
• Crowd-sourced data to examine morphological impacts of extinction risk in ray-finned fishes (Chang)
• Macroecological patterns in butterfly-hostplant associations (Ferrer-Paris)
EOL GloBI
Global Biotic Interactions
Challenge: Species interaction datasets are mostly buried in flat files & custom formats.
Plan: Build infrastructure for normalizing and aggregating species interaction datasets and make them accessible through flat files (Darwin Core Archive), web services, and semantic web endpoints (SPARQL).
Eventually: Publish biotic interaction ontology re-using existing ontologies, re-integrate with EOL.
Enable semantic interoperability to allow for cross-functional analysis (e.g. How does a parasite regulate gene expression of host?)
Poelen, Mungall, Simons, Reiz
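
The normalizing and aggregating plan can be illustrated with a small sketch: two datasets in different custom formats are converted to one record shape (source taxon, interaction type, target taxon) and then counted together. This is not GloBI's actual code. The formats, the vocabulary mapping, the function names and the example records are all assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical mapping of dataset-specific wording to one shared vocabulary.
INTERACTION_TERMS = {
    "eats": "preysOn",
    "preys on": "preysOn",
    "parasite of": "parasiteOf",
}


def from_diet_table(rows):
    """Assumed format 1: dicts with 'predator' and 'prey' columns."""
    for row in rows:
        yield (row["predator"].strip(), "preysOn", row["prey"].strip())


def from_sentences(lines):
    """Assumed format 2: 'source | verb | target' strings."""
    for line in lines:
        source, verb, target = (part.strip() for part in line.split("|"))
        yield (source, INTERACTION_TERMS[verb.lower()], target)


def aggregate(*streams):
    """Count how often each normalized interaction occurs across all datasets."""
    counts = Counter()
    for stream in streams:
        counts.update(stream)
    return counts


# Invented example records, not taken from any shared dataset.
diet_rows = [{"predator": "Carcharhinus leucas ", "prey": "Mugil cephalus"}]
sentences = [
    "Carcharhinus leucas | eats | Mugil cephalus",
    "Sciaenops ocellatus | Preys on | Callinectes sapidus",
]

interactions = aggregate(from_diet_table(diet_rows), from_sentences(sentences))
for record, n in interactions.items():
    print(record, n)
```

Once every dataset yields the same record shape, exporting to flat files or serving the records through a web API becomes a matter of serialization.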
http://globalbioticinteractions.wordpress.com/
14 datasets containing 25k taxa, 422k interactions, for 3k locations
alpha version of ingestion, normalization, aggregation
alpha version of web API
alpha version of data exports
Dr. Katy Börner led Information Visualization MOOC
Easy access to analyzable trait data
“Are blue organisms more common in high altitudes?”
“How can I predict vulnerability to climate change based on life history characteristics?”
“What organisms should I collect to fill in gaps in genome quality tissue collections?”
• Look for data type, download for all taxa
• Create a collection of taxa, download all data
• Use Reol: an R interface to EOL (Banbury, O'Meara)
http://barbbanbury.info/barbbanbury/Reol.html
• Find more specialized data repositories
Adding traits to EOL
Funded: Marine focus
<scientific name> <hasAvgBodyMass in g> <value>
<scientific name> <preysOn> <scientific name>
Harvest and display on data tab
Add high-level semantics from coarse SPM ontology
Downloads, fancy searching
Machine access
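
The two statement patterns on this slide can be sketched as plain subject, predicate, object triples. The sketch below is only an illustration of the idea, not EOL's storage design: the class and function names are assumptions, the values are made up, and the extra source field is added so that each statement can point back to a dataset for credit.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # scientific name
    predicate: str  # attribute or relationship
    obj: str        # value, or another scientific name
    source: str     # dataset the statement came from


# Made-up statements; the values and dataset names are placeholders.
TRIPLES = [
    Triple("Gadus morhua", "hasAvgBodyMass in g", "3000", "hypothetical dataset 1"),
    Triple("Gadus morhua", "preysOn", "Clupea harengus", "hypothetical dataset 2"),
    Triple("Clupea harengus", "hasAvgBodyMass in g", "200", "hypothetical dataset 1"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every pattern slot that is not None."""
    return [
        t
        for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


prey_of_cod = [
    t.obj for t in match(TRIPLES, subject="Gadus morhua", predicate="preysOn")
]
body_masses = {
    t.subject: float(t.obj)
    for t in match(TRIPLES, predicate="hasAvgBodyMass in g")
}
print(prey_of_cod)
print(body_masses)
```

Leaving a slot empty in the pattern is what makes downloads by data type ("all body masses") and by taxon ("everything about one species") the same operation.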
INSDC
900,000 species
4,000 genomes
60 million DNA sequence records
How are these related to traits?
Next step: TraitBank
Thanks
Funding & other contributions
Sloan Foundation
Smithsonian Institution
David Rubenstein
Marine Biological Laboratory
Harvard University
Our content partners
Thousands of individual contributors, and hundreds of volunteer curators
Image credits
Jenny from Taipei
University of Birmingham
Cynthia Parr
Chief Scientist @eol
@cydparr parrc@si.edu
GloBI: Jorrit Poelen (lead/software), Chris Mungall (ontologies), James Simons (biologist) and Robert Reiz (software). Datasets shared by: Peter D. Roopnarine, Rachel Hertog, Carlos García-Robledo, James Simons, Jenny L. Wrast, C. Barnes, International Council for the Exploration of the Sea (ICES), Jose R. Ferrer Paris, Senol Akin, Malcolm Storey (BioInfo.org.uk), Ivy E. Baremore, Joel Sachs (SPIRE), Colt W. Cook, David A. Blewett
Alexandria Archive: Sarah Kansa, Eric Kansa, 34 other zooarchaeologists
BioNames: Rod Page, Ryan Schenk
MOOC: Katy Börner, Twy Bethard, Andrew Miles, Mattia Della Libera


Encyclopedia of Life: Applying Concepts from Amazon and LEGO to Biodiversity Informatics


Editor's notes

  1. This is a very different kind of talk because I am not focusing on metagenomics or microbes per se. I would like to introduce what we are doing at the Encyclopedia of Life in hopes that we can soon bridge the gap between those studies and studies on macroorganism diversity.
  2. We have a working infrastructure as well as more than 200 partners. We harvest and sort text and multimedia by topic and by species and put it on our pages. Curation plus user-added content from the crowds is added to the mix. This is fed back to providers, giving them traffic, quality control on their own content, and new content for them to use. And we are already seeing spinoff products. We make it easy for developers, and everything is either public domain or CC-licensed so it can be re-used.
  3. As this is a meeting about standards, I thought I would mention some of the standards we are using.
  4. We now have over a million pages with content, some of it even in other languages such as Arabic, Spanish, and Chinese. And we are getting traffic mostly from the general public, from all over the world.
  5. There are strong links between taxa represented in NCBI’s databases and others. Each dot here represents a project with a database holding some sort of biological data. Chiefly, the links between these databases are based on taxonomic names and so EOL has mapped every name and their identifiers in each of these hubs to bring the data together.
  6. One of the benefits is that we can support third-party projects where linking and visualizing via names is critical. This is BioNames, a project by Rod Page & Ryan Schenk. They have visualized the taxonomic concepts in this family of bats; where there are no images, there are obvious gaps in the Encyclopedia of Life. There is a timeline showing when species were described, a sample classification and distribution map, and links to some of the foundational literature. Essentially, they are re-organizing EOL data to suit their own use cases for taxonomists, and bringing in additional data not yet available on EOL.
  7. Here is what archaeologists are doing with EOL
  8. The removal of objects is now forbidden in most countries and at many sites in the US. As a result, data collection methods have changed from description of a physical object accessible in the US to a full surrogate for an object that might be re-buried in the ground. Data collection has increased as the collection of objects has decreased. Still, individual systems of data collection (see examples on the right) have emerged, which have developed over time, are handed down from mentors, and contain some technological adoption, particularly the adoption of Excel spreadsheets over relational databases. In all of our interviews there was no reference to existing guides on archaeological documentation, such as those of the Archaeology Data Service (UK) or DANS (Netherlands).
  9. Most of our 5.4 million content objects are text blobs and here are the subjects of that text. Most often, our text objects are about distribution. But there are many other subjects involved including essays that include multiple subjects.
  10. In the Information Visualization MOOC (Massive Open Online Course) led by Dr. Katy Börner of Indiana University, students Twy Bethard (United States), Andrew Miles (United Kingdom), Edward Kok (Netherlands) and Mattia Della Libera (Italy) used GloBI data to create an insightful visualization of spatial marine food webs in the Gulf of Mexico.
  11. In the next year and a half we are tackling these challenges with funding from the Sloan Foundation. We are starting with marine data. In the most simplistic view, we’ll be storing triples, each part of which can be linked to a definition so that the meaning is clearly defined. There might be five different ways to define an attribute like “body length” and we should be able to handle them all without losing the distinction. Of course we’ll also make sure each triple links back to a dataset and all the appropriate credits. This data will be organized on a data tab, perhaps sorted out into the 35 or so “topics” that we currently have text chapters for, like size or reproduction, and we will also allow powerful downloading and searching capability. Finally, we’ll be setting up ways for other applications to grab the data and do interesting things with it. This semantic web technology isn’t new, but the way we’ll be using it with EOL is new.
  12. We are serving building blocks, but actually not quite like LEGO, because we are not one source that mass produces everything.
  13. We are more like Amazon Marketplace, because we are an infrastructure that providers (i.e. merchants) can plug into to share their data with others.
  14. We are in the midst of a genomics revolution. The cost to generate a full genome sequence is dropping more or less daily. What is all this genetic information DOING? How does it relate to what we can see and measure about organisms, their phenotypes, or their traits? How do these genes interact with the environment to result in both normal and abnormal development of traits, not just for lab-dwelling species like rats, but across the tree of life? How do evolutionary changes in DNA make a difference in the lives of organisms? TraitBank, which is not yet funded, would enable us to scale up and manage all kinds of trait data about all organisms.