Towards a Simple, Standards-Compliant, and
Generic Phylogenetic Database Module
Hilmar Lapp and Todd Vision
National Evolutionary Synthesis Center (NESCent)
Rich diversity of online
   data repositories
Most data is not online
Syst. Biol. Data Archive
Clark J.R. et al. (2008) A Comparative Study in
Ancestral Range Reconstruction Methods: Retracing
the Uncertain Histories of Insular Lineages.
Systematic Biology 57(5):693-707
Little standards support
Accelerating knowledge
dissemination: A Story
• Jane and her lab have accumulated molecular
  data to resolve the phylogeny of a certain clade
  of frogs, many of which are endangered species.

• Her group assembles a multiple alignment and
  reconstructs the phylogeny using a variety of
  methods, some developed by her lab, resulting
  in thousands of trees.

• The results show overwhelming support for
  several new branch points. The results are
  interesting and solid enough to be useful for
  others working on those species.
• Jane downloads and installs PhyloDOM, a
  freely available open source software package.
  The software creates a database and Jane uses
  the programs that come with it to import all
  her data.

• As a result, Jane’s lab now has a web interface
  to her results that others can use to query for
  novel topologies and to explore her data.

• Her lab also updates the database from their
  ongoing work, and uses it to add provenance
  data and links to protocols, publications, and
  taxonomic concepts.
• Other researchers easily download and
  integrate her results in their own analyses.

• Even where Jane used new methods, other
  software understands the meaning of the
  metadata and can take advantage of it.

• Before long, her results appear in data
  aggregators such as iSpecies, EOL, or
  Scratchpads, along with those from other labs.

• Jane herself uses the LifeMap widget to map
  her trees onto geo-coordinates and to link
  branches to ecological and biodiversity
  parameters of the respective areas.
How to get there?
[Diagram: layered software stack, listed from top to bottom]
• Embeddable Tools (PhyloWidget, GBrowse
  TreeWidget); Client-based Query Interfaces;
  Data Aggregators, Mash-up Applications

• Data and other services API (PhyloWS)
  supporting exchange standards (NeXML, CDAO)

• Middleware: Query & Persistence Management;
  Data Management Tools

• Phylogenetic Database supporting ontologies
  and arbitrary metadata (PhyloDB / BioSQL);
  Topology-oriented Queries; Precompute Query
  Optimization

• Language binding for database model (BioPerl,
  Biojava, Biopython, Bioruby); Data loading
  tools (BioSQL)

• Parser libraries for data and semantics
  standards (NeXML, CDAO)

• Phylogenetic Trees (Gene, Species);
  Taxonomies; Character Data; Metadata
  (Evolutionary, Biodiversity, Computational);
  ITIS, NCBI Taxonomies; Molecular Data
  (Sequences, Annotation); Ontologies
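
One possible reading of the precomputation idea in the stack above is to store every ancestor-descendant pair of a tree ahead of time, so that topology-oriented queries such as "most recent common ancestor" become simple lookups. The sketch below is an illustration only: the toy tree, the function names, and the choice to measure distance as a count of edges are assumptions and are not taken from the slides.

```python
# Hypothetical toy tree as (parent, child) pairs; not data from the talk.
EDGES = [
    ("root", "A"),
    ("root", "B"),
    ("A", "frog1"),
    ("A", "frog2"),
    ("B", "frog3"),
]


def precompute_paths(edges):
    """Return {(ancestor, descendant): distance} for all such pairs.

    Distance is the number of edges between the two nodes (an assumption);
    each node is also paired with itself at distance 0.
    """
    parent = {child: par for par, child in edges}
    nodes = set(parent) | set(parent.values())
    paths = {}
    for node in nodes:
        ancestor, distance = node, 0
        # Walk up towards the root, recording one path per ancestor.
        while ancestor is not None:
            paths[(ancestor, node)] = distance
            ancestor = parent.get(ancestor)
            distance += 1
    return paths


def common_ancestor(paths, a, b):
    """Most recent common ancestor: the shared ancestor closest to a."""
    ancestors_of_a = {anc: d for (anc, desc), d in paths.items() if desc == a}
    ancestors_of_b = {anc for (anc, desc) in paths if desc == b}
    shared = [(d, anc) for anc, d in ancestors_of_a.items() if anc in ancestors_of_b]
    return min(shared)[1] if shared else None


PATHS = precompute_paths(EDGES)
print(common_ancestor(PATHS, "frog1", "frog2"))
print(common_ancestor(PATHS, "frog1", "frog3"))
```

The trade-off in this sketch is the usual one for precomputation: the path table grows with tree depth, but each query afterwards needs no tree traversal.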
Achieving the Vision:
Coordinated & open development,
nurturing & harnessing existing efforts
Database:
PhyloDB module
[Diagram: tables of the PhyloDB schema and their attributes]
• Tree: Name, Identifier, Is_Rooted
• Node: Label, Left_Idx, Right_Idx
• Edge
• Node_Path: distance
• Tree_Root: Is_Alternate, Significance
• Tree_Qualifier_Value: Value, Rank
• Node_Qualifier_Value: Value, Rank
• Edge_Qualifier_Value: Value, Rank
• Node_Taxon: Rank
• Node_Bioentry: Rank
• Tree_Dbxref, Node_Dbxref
• Other tables shown: Taxon, Bioentry, Biodatabase,
  Term, Ontology, Dbxref
Syntax: NeXML
Semantics: CDAO




http://www.evolutionaryontology.org
Service API: PhyloWS
http://evoinfo.nescent.org/PhyloWS
Embeddable tools:
Community-owned,
 reusable software
Nurturing the community
Phyloinformatics
Hackathon, Dec 2006
• James Estill (U. Georgia):
  “A Perl-based Command Line Interface to a
  Topological Query Application for BioSQL in Support
  of High Throughput Classification and Analysis of LTR
  Retrotransposons in Plant Genomes”
Acknowledgments
• Phyloinformatics Hackathon participants

• BioHackathon 2008 participants

• EvoInformatics Working Group participants

• Google Summer of Code Students:
  Jamie Estill

• Sponsors & support:
  NESCent; BioSynC; TDWG; DBCLS, CBRC (Japan)

Towards a Simple, Standards-Compliant, and Generic Phylogenetic Database

  • 1. Towards a Simple, Standards Compliant, and Generic Phylogenetic Database Module Hilmar Lapp and Todd Vision National Evolutionary Synthesis Center (NESCent)
  • 2. Rich diversity of online data repositories
  • 3. Most data is not online Syst. Biol. Data Archive Clark J.R. et al. (2008) A Comparative Study in Ancestral Range Reconstruction Methods: Retracing the Uncertain Histories of Insular Lineages. Systematic Biology,57:5,693-707
  • 5. Accelerating knowledge dissemination: A Story • Jane and her lab have accumulated molecular data to resolve the phylogeny of a certain clade of frogs, many of which are endangered species. • Her group assembles a multiple alignment and reconstructs the phylogeny using a variety of methods, some developed by her lab, resulting in 1000s of trees. • The results show overwhelming support for several new branch points. The results are interesting and solid enough to be useful for others working on those species.
  • 9. • Jane downloads and installs PhyloDOM, a freely available open source software package. The software creates a database, and Jane uses the programs that come with it to import all her data. • As a result, Jane’s lab now has a web interface to her results that others can use to query for novel topologies and to explore her data. • Her lab also updates the database from their ongoing work, and uses it to add provenance data and links to protocols, publications, and taxonomic concepts.
  • 13. • Other researchers easily download and integrate her results in their own analyses. • Even where Jane used new methods, other software understands the meaning of the metadata and can take advantage of it. • Before long, her results appear in data aggregators such as iSpecies, EOL, or Scratchpads, along with those from other labs. • Jane herself uses the LifeMap widget to map her trees onto geo-coordinates and to link branches to ecological and biodiversity parameters of the respective areas.
  • 18. How to get there? [Architecture diagram; components listed from top to bottom] • Embeddable tools (PhyloWidget, GBrowse TreeWidget); client-based query interfaces; data aggregators and mash-up applications • Data and other services API (PhyloWS) supporting exchange standards (NeXML, CDAO) • Middleware: query and persistence management; data management tools • Phylogenetic database (PhyloDB / BioSQL) supporting ontologies and arbitrary metadata; topology-oriented queries; precomputed query optimization; molecular data (sequences, annotation) in BioSQL • Language bindings for the database model (BioPerl, Biojava, Biopython, Bioruby); data loading tools; parser libraries for data and semantics standards (NeXML, CDAO) • Data: phylogenetic trees (gene, species); character data; taxonomies (ITIS, NCBI); metadata; ontologies (evolutionary, biodiversity, computational)
  • 19. Achieving the Vision: Coordinated & open development, nurturing & harnessing existing efforts
  • 20. Database: PhyloDB module [Schema diagram; tables with the attributes shown] • Tree (Name, Identifier, Is_Rooted) • Tree_Root (Is_Alternate, Significance) • Tree_Qualifier_Value (Value, Rank) • Tree_Dbxref • Node (Label, Left_Idx, Right_Idx) • Node_Qualifier_Value (Value, Rank) • Node_Path (Distance) • Node_Taxon (Rank) • Node_Bioentry (Rank) • Node_Dbxref • Edge • Edge_Qualifier_Value (Value, Rank) • Linked BioSQL tables: Taxon, Term, Bioentry, Dbxref, Biodatabase, Ontology
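
The Left_Idx / Right_Idx attributes on Node and the Distance attribute on Node_Path suggest two common ways of making tree topology queryable in plain SQL: nested-set numbering and a precomputed ancestor-descendant path table. The following is a minimal sketch of how such queries could look, not the actual PhyloDB schema. It uses an in-memory SQLite database, and all key columns (`node_id`, `tree_id`, `parent_node_id`, `child_node_id`), the toy tree, and the helper functions are assumptions made for illustration.

```python
import sqlite3

# Hypothetical, heavily simplified subset of the tables named on the slide.
# Key columns are assumptions; only label, left_idx, right_idx, distance,
# name and is_rooted correspond to attributes shown in the diagram.
SCHEMA = """
CREATE TABLE tree (tree_id INTEGER PRIMARY KEY, name TEXT, is_rooted INTEGER);
CREATE TABLE node (node_id INTEGER PRIMARY KEY, tree_id INTEGER, label TEXT,
                   left_idx INTEGER, right_idx INTEGER);
CREATE TABLE edge (edge_id INTEGER PRIMARY KEY, parent_node_id INTEGER,
                   child_node_id INTEGER);
CREATE TABLE node_path (parent_node_id INTEGER, child_node_id INTEGER,
                        distance INTEGER);
"""

# Toy tree ((A,B)AB,C)root given as (node_id, label, parent_id).
NODES = [(1, "root", None), (2, "AB", 1), (3, "A", 2), (4, "B", 2), (5, "C", 1)]


def build(conn):
    """Load the toy tree, then precompute nested-set indexes and paths."""
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO tree VALUES (1, 'toy', 1)")
    children = {}
    for node_id, label, parent in NODES:
        conn.execute(
            "INSERT INTO node (node_id, tree_id, label) VALUES (?, 1, ?)",
            (node_id, label))
        if parent is not None:
            conn.execute(
                "INSERT INTO edge (parent_node_id, child_node_id) VALUES (?, ?)",
                (parent, node_id))
            children.setdefault(parent, []).append(node_id)

    counter = 0

    def walk(node_id, ancestors):
        # Depth-first traversal: left_idx on entry, right_idx on exit.
        nonlocal counter
        counter += 1
        left = counter
        # One node_path row per (proper ancestor, node) pair.
        for dist, anc in enumerate(reversed(ancestors), start=1):
            conn.execute("INSERT INTO node_path VALUES (?, ?, ?)",
                         (anc, node_id, dist))
        for child in children.get(node_id, []):
            walk(child, ancestors + [node_id])
        counter += 1
        conn.execute(
            "UPDATE node SET left_idx = ?, right_idx = ? WHERE node_id = ?",
            (left, counter, node_id))

    walk(1, [])


def descendants(conn, label):
    """All nodes below `label`, found with a single range comparison."""
    rows = conn.execute(
        "SELECT d.label FROM node a JOIN node d ON d.tree_id = a.tree_id "
        "AND d.left_idx > a.left_idx AND d.right_idx < a.right_idx "
        "WHERE a.label = ? ORDER BY d.left_idx", (label,))
    return [r[0] for r in rows]


def last_common_ancestor(conn, label_a, label_b):
    """Closest shared proper ancestor of two nodes, via the path table."""
    row = conn.execute(
        "SELECT anc.label FROM node_path pa "
        "JOIN node_path pb ON pb.parent_node_id = pa.parent_node_id "
        "JOIN node a ON a.node_id = pa.child_node_id "
        "JOIN node b ON b.node_id = pb.child_node_id "
        "JOIN node anc ON anc.node_id = pa.parent_node_id "
        "WHERE a.label = ? AND b.label = ? "
        "ORDER BY pa.distance LIMIT 1", (label_a, label_b)).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build(conn)
    print(descendants(conn, "AB"))
    print(last_common_ancestor(conn, "A", "C"))
```

Both lookups avoid recursion at query time: the cost of walking the tree is paid once when the data is loaded. This sketch stores only proper ancestors in the path table, so it is suited to queries between tips rather than between a node and its own ancestor.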
  • 29. • James Estill (U. Georgia): “A Perl-based Command Line Interface to a Topological Query Application for BioSQL in Support of High Throughput Classification and Analysis of LTR Retrotransposons in Plant Genomes”
  • 30. Acknowledgments • Phyloinformatics Hackathon participants • BioHackathon 2008 participants • EvoInformatics Working Group participants • Google Summer of Code Students: Jamie Estill • Sponsors & support: NESCent, BioSynC, TDWG, DBCLS, CBRC (Japan)