Phyloinformatics Workshop Edinburgh 2007




iPhy: tools for collation and analysis of phylogenomic data

Martin Jones and Mark Blaxter
[Figure: tree of eukaryote groups with the root position marked. Legible group labels include opisthokonts, amoebozoa, cercozoa, alveolates, heterokonts, discicristates and excavates.]




1: Forests of trees, and loads of kindling

2: Organising principles

3: iPhy design

4: iPhy deployment

5: Nameless taxa & endless forms




1: Forests of trees, and loads of kindling

Phylogenetics is a growth area.
The raw materials (sequences)
      are being added at a startling rate.
Tree databases are also growing
      (both in number and size).


So how does a lab worker bee keep up?
iPhy tools for collation and analysis of phylogenomic data. M Blaxter
[Figure: Metazoan phyla, sequences per phylum (10/05/2006). Chart on a log-scale axis running from 1 to 100,000,000, with one entry for each of: Porifera, Placozoa, Buddenbrockia, Myxozoa, Mesozoa, Ctenophora, Cnidaria, Micrognathozoa, Cycliophora, Acoelomorpha, Gnathostomulida, Seisonidea, Rotifera, Gastrotricha, Sipuncula, Nemertea, Mollusca, Entoprocta, Bryozoa, Brachiopoda, Pogonophora, Echiura, Annelida, Platyhelminthes, Nematomorpha, Nematoda, Kinorhyncha, Acanthocephala, Priapulida, Tardigrada, Onychophora, Arthropoda, Xenoturbellida, Enteropneusta, Hemichordata, Echinodermata, Chordata, Chaetognatha.]
[Figure: Metazoan phyla, species per phylum (10/05/2006). Chart on a log-scale axis running from 1 to 10,000,000, with one entry per phylum, Porifera through Chaetognatha.]
[Figure: Metazoan phyla, sequences per species (10/05/2006). Chart on a log-scale axis running from 0.1 to 1000, with one entry per phylum, Porifera through Chaetognatha.]




1: Forests of trees, and loads of kindling

Phylogenetics is a growth area.
The raw materials (sequences)
      are being added at a startling rate.
Tree databases are also growing
      (both in number and size).


So how does a lab worker bee keep up?
from Rod Page, “Towards a Taxonomically Intelligent Phylogenetic Database”
[Figure: cumulative number of molecular phylogenies and of TreeBASE studies (y axis, 0 to 7000) by year (x axis, 1980 to 2000)]




Two modes of data acquisition

(a) wet lab - compute lab synergy
     explicitly source the sequences needed
     preformed ideas of
         the best taxa to sample
         the best genes to sample

[this is the source of most phylogenetic data]




Two modes of data acquisition

(a) wet lab - compute lab synergy

(b) magpie surfing / tree surgery
    using phyloinformatic tools
      to discover the set of available
      genes AND taxa
      to address a particular problem




2: Organising principles

On average …

• more data are better
      more taxa
      more genes

• multiple methods are better




2: Organising principles

• assess all relevant taxa
• assess all relevant sequence
while the NCBI taxonomy
isn’t the best in the world,
at least every sequence
is attached to a taxon,
and TAX_IDs are unique
The Edinburgh EST analysis Pipeline

(trace2dbest)
    Process raw sequence traces
    Trim off vector & low quality
(CLOBB)
    Cluster into putative gene objects
    Predict consensus sequence
(prot4EST)
    Predict translation reading frame
    Generate protein translation
(annot8r)
    Annotate using BLAST, GOtcha, PSort, Pfam, SigPep, KEGG
(PartiGene)
    Collate information in relational database
NEMBASE3 http://www.nematodes.org/
  The web portal to NEMBASE3
                                  Mark Blaxter, James Wasmuth,
                                       Ann Hedley & Ralf Schmid
                                         University of Edinburgh,
                               Institute of Evolutionary Biology,
                                          Edinburgh UK EH9 3JT
                                        mark.blaxter@ed.ac.uk
Collectors’ curve of nematode protein families
[Figure: number of families (y axis, 0 to 50000) against total number of proteins (x axis, 0 to 150000). Species labelled on the plot: Caenorhabditis elegans, Ancylostoma caninum, Strongyloides stercoralis, Meloidogyne incognita, Brugia malayi, Trichinella spiralis. Additional labels: A, B, C.]
Earliest origins of nematode protein families
[Figure: tree of NEMATODA with protein family counts on branches and tips. Clades shown:
  Dorylaimia (Clade I): Dorylaimida, Trichinellida
  Rhabditida:
    Spirurina (Clade III): Ascaridomorpha, Spiruromorpha
    Tylenchina (Clade IV): Panagrolaimomorpha, Tylenchomorpha, Cephalobomorpha
    Rhabditina (Clade V): Strongyloidea, Rhabditoidea, Diplogasteromorpha]




2: Organising principles

• assess all relevant taxa
• assess all relevant sequence

• store aligned sequences locally
• output ‘slices’ of data in analysis-ready formats
many taxa, missing data
[Figure: matrix of genes a to i (columns) against taxa 1 to 9 (rows), with gaps where data are missing]
Generating a slice that
• maximises taxonomic coverage
• maximises present data/minimises missing data

[Figure: the resulting slice, genes a, b, e, f, g, i (columns) against taxa 1, 3, 7, 9 (rows)]
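
One way to picture slice generation is as a search over a presence/absence matrix. The sketch below is only an illustration: the matrix is hypothetical (the actual pattern of missing data on the slide is not known), the function name `best_complete_slice` is made up here, and "best" is taken to mean the largest block of taxa and genes with no missing cells. It tries every gene subset, so it suits small examples only.

```python
from itertools import combinations

# Hypothetical presence data: taxon -> set of genes with sequence available.
# The pattern is invented for illustration; it is not the matrix from the slide.
PRESENT = {
    1: set("abefgi"),
    2: set("acd"),
    3: set("abcefgi"),
    4: set("dh"),
    5: set("bch"),
    6: set("adg"),
    7: set("abefghi"),
    8: set("ce"),
    9: set("abdefgi"),
}

def best_complete_slice(present, min_taxa=2):
    """Return (taxa, genes) for the largest block with no missing cells."""
    all_genes = sorted(set().union(*present.values()))
    best = ([], [], 0)
    for k in range(1, len(all_genes) + 1):
        for genes in combinations(all_genes, k):
            wanted = set(genes)
            # Keep only the taxa that have every gene in this subset.
            taxa = [t for t, have in sorted(present.items()) if wanted <= have]
            area = len(taxa) * len(genes)
            if len(taxa) >= min_taxa and area > best[2]:
                best = (taxa, list(genes), area)
    return best[0], best[1]

taxa, genes = best_complete_slice(PRESENT)
print("taxa :", taxa)
print("genes:", "".join(genes))
```

A complete block of this kind corresponds to a biclique in the bipartite taxon/gene graph. Allowing a tolerated fraction of missing cells would be a natural relaxation of the same search.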




2: Organising principles

• assess all relevant taxa
• assess all relevant sequence

• store aligned sequences locally
• output ‘slices’ of data in analysis-ready formats

• store trees locally
• store alternative taxonomic systems
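[Figure: metazoan trees contrasted: one from complete genome sequences, one including neglected taxa ESTs (Philippe et al.). Taxa shown: Platyhelminthes, Annelida, Mollusca, Tardigrada, Nematoda, Arthropoda, Vertebrata, Urochordata, Cephalochordata, Echinodermata, Ctenophora, Cnidaria, Choanoflagellata, Fungi. Nodes are labelled P and C on the complete-genome side and L, E and D on the EST side.]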




3: iPhy design
[Diagram, built up over several slides]

Inputs:
  sequence   alignment   TreeFam   TreeBASE   user tree   systematic

Processing of sequence and alignment inputs, to
  * identify relevant sequences and store locally
  * associate sequences and taxa
Processing of TreeFam and TreeBASE inputs, to
  * identify relevant sequences and store locally
  * capture tree data
  * reconcile tree nodes with existing systems
Processing of user tree and systematic inputs, to
  * capture tree data
  * reconcile tree nodes with existing systems

Components around the iPhy database:

Alignment Cycle
  POA, tranAlign
Orthologue Inference Engine
  TreeFam, Ortho-MCL
Dataset Exploration Tools
  Slice Selector (maximal bicliques)
  Tree Comparer
Phylogenetics Cycle
  PhyML, MrBayes, PAUP, ...

Output: Publication Quality Analyses (trees & alignments)
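
The Alignment Cycle pairs nucleotide and protein data (AGGCT alongside PheTyr). Assuming tranAlign here refers to the EMBOSS `tranalign` approach, in which gaps from an aligned protein are copied into the matching coding sequence, the core step might look like the sketch below. The function name `thread_codons` is hypothetical, and the sketch does not check that the nucleotides actually translate to the given residues.

```python
def thread_codons(aligned_protein, cds):
    """Copy the gaps of an aligned protein into its coding sequence.

    aligned_protein: one-letter residues with '-' for gaps.
    cds: the ungapped coding sequence, three bases per residue.
    """
    residues = aligned_protein.replace("-", "")
    if len(cds) != 3 * len(residues):
        raise ValueError("CDS length must be three times the residue count")
    # Hand out codons in order, one per non-gap column.
    codons = (cds[i:i + 3] for i in range(0, len(cds), 3))
    return "".join("---" if aa == "-" else next(codons) for aa in aligned_protein)

# Phe-Tyr with one gap column between the two residues.
print(thread_codons("F-Y", "TTTTAT"))
```

Aligning at the protein level and threading codons back keeps gaps in multiples of three, so codon positions remain intact for later partitioning.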




4: iPhy deployment

version 0.1: ‘TaxMan’
TaxMan: a taxonomic database manager
Martin Jones* and Mark Blaxter
BMC Bioinformatics 2006, 7:536 (Software, Open Access)    doi:10.1186/1471-2105-7-536

Address: Institute of Evolutionary Biology, King's Buildings, Ashworth Laboratories, West Mains Road, Edinburgh EH9 3JT, UK
Email: Martin Jones* - martin.jones@ed.ac.uk; Mark Blaxter - mark.blaxter@ed.ac.uk
* Corresponding author

Received: 11 October 2006; Accepted: 18 December 2006; Published: 18 December 2006
This article is available from: http://www.biomedcentral.com/1471-2105/7/536
© 2006 Jones and Blaxter; licensee BioMed Central Ltd.




4: iPhy deployment

version 0.1: ‘TaxMan’

TaxMan automates assembly of large
sequence datasets for chosen taxa
TaxMan automates generation of aligned
sequence sets for chosen genes




4: iPhy deployment

version 0.1: ‘TaxMan’

TaxMan simplifies selection of taxa for
analysis
  e.g. given a gene set, choosing one species per family
         (choosing the species with the least missing data)

  e.g. given a taxon set, choosing the genes
         (choosing genes with less than a given % missing data)

  e.g. generating custom defined alignments
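
The first two selection examples can be sketched in a few lines. This is not TaxMan's code: the species names, family names, gene lists and both function names below are hypothetical, chosen only to show the two selection rules.

```python
# Hypothetical records: species -> (family, set of genes with data).
SPECIES = {
    "sp_A1": ("FamilyA", {"18S", "28S", "COI"}),
    "sp_A2": ("FamilyA", {"18S"}),
    "sp_B1": ("FamilyB", {"18S", "COI"}),
    "sp_B2": ("FamilyB", {"18S", "28S", "COI", "H3"}),
    "sp_C1": ("FamilyC", {"28S", "H3"}),
}

def one_species_per_family(species, gene_set):
    """Given a gene set, keep the species with least missing data per family."""
    chosen = {}
    for name in sorted(species):
        family, have = species[name]
        missing = len(gene_set - have)
        if family not in chosen or missing < chosen[family][1]:
            chosen[family] = (name, missing)
    return {family: name for family, (name, _) in chosen.items()}

def genes_below_missing(species, taxon_set, max_missing_pct):
    """Given a taxon set, keep genes with less than max_missing_pct % missing."""
    all_genes = set().union(*(species[t][1] for t in taxon_set))
    keep = []
    for gene in sorted(all_genes):
        absent = sum(1 for t in taxon_set if gene not in species[t][1])
        if 100.0 * absent / len(taxon_set) < max_missing_pct:
            keep.append(gene)
    return keep

GENES = {"18S", "28S", "COI", "H3"}
picked = one_species_per_family(SPECIES, GENES)
print(picked)
print(genes_below_missing(SPECIES, sorted(picked.values()), 30))
```

Ties between species are broken alphabetically here; a real tool would more likely use sequence length or quality.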




4: iPhy deployment

version 0.1: ‘TaxMan’

TaxMan simplifies analysis by exporting
formatted alignments (NEXUS)
  of nucleotides
      (with codon positions and genes as defined partitions)

  of amino acids
      (with genes as defined partitions)
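
As an illustration of such an export, the sketch below concatenates per-gene nucleotide alignments and writes a NEXUS file with one `charset` per gene and one per codon position. It is a minimal sketch, not TaxMan's exporter: the function name, taxon names and sequences are hypothetical, and it assumes each gene alignment starts at codon position 1 and has a length that is a multiple of three.

```python
def nexus_with_partitions(alignments, gene_order):
    """alignments: gene -> {taxon: aligned nucleotide string}."""
    taxa = sorted({t for g in gene_order for t in alignments[g]})
    rows = {t: "" for t in taxa}
    charsets = []
    start = 1
    for gene in gene_order:
        length = len(next(iter(alignments[gene].values())))
        for t in taxa:
            # Taxa lacking this gene are padded with the missing symbol.
            rows[t] += alignments[gene].get(t, "?" * length)
        end = start + length - 1
        charsets.append(f"  charset {gene} = {start}-{end};")
        for pos in range(3):
            # "\3" means every third site, i.e. one codon position.
            charsets.append(f"  charset {gene}_pos{pos + 1} = {start + pos}-{end}\\3;")
        start = end + 1
    width = max(len(t) for t in taxa)
    lines = ["#NEXUS", "begin data;",
             f"  dimensions ntax={len(taxa)} nchar={start - 1};",
             "  format datatype=dna missing=? gap=-;",
             "  matrix"]
    lines += [f"  {t.ljust(width)}  {rows[t]}" for t in taxa]
    lines += ["  ;", "end;", "begin sets;"] + charsets + ["end;"]
    return "\n".join(lines)

# Hypothetical two-gene data set with one taxon missing from each gene.
ALIGNMENTS = {
    "geneA": {"taxon1": "ATGGCT", "taxon2": "ATGGCA"},
    "geneB": {"taxon1": "ATGAAATTT", "taxon3": "ATGAAGTTC"},
}
nexus = nexus_with_partitions(ALIGNMENTS, ["geneA", "geneB"])
print(nexus)
```

For amino acid alignments the same layout applies with `datatype=protein` and only the per-gene charsets.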




4: iPhy deployment

version 0.1: ‘TaxMan’

TaxMan simplifies post-phylogenetic analysis
by
  saving trees
    (with links to the original data)
  saving analytical metadata
    (algorithm, parameters, settings)
  saving tree statistics
    (bootstraps, branch lengths)
Lophotrochozoa
● 70,000 annotated sequences
● 630,000 EST sequences
● 21 genes (mt + 18S, 28S, actin, H3, WG, EF1A)
● 53,000 sequences extracted
● 17,000 aligned consensus sequences
● 8,700 species represented
● One day for data collection, one for alignment
Molecular Phylogenetics and Evolution 43 (2007) 583–595
www.elsevier.com/locate/ympev

The effect of model choice on phylogenetic inference using
mitochondrial sequence data: Lessons from the scorpions

Martin Jones a,*, Benjamin Gantenbein b, Victor Fet c, Mark Blaxter a
a Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JT, UK
b AO Research Institute, Clavadelerstrasse 8, Davos Platz CH-7270, Switzerland
c Department of Biological Sciences, Marshall University, Huntington, WV 25755-2510, USA

Received 25 April 2006; revised 14 November 2006; accepted 14 November 2006
Available online 29 November 2006




5: Nameless taxa & endless forms
"... endless forms
     most beautiful
and most wonderful
         have been,
     and are being,
            evolved"

      (Darwin 1859)
http://www.nematodes.org/NeglectedGenomes/ARTHROPODA/Chelicerata.html
Metazoan species per phylum
[Figure: bar chart of species number per phylum on a log scale from 1 to 100,000,000. Phyla shown: Choanoflagellida, Porifera, Placozoa, Cnidaria, Ctenophora, Acoela, Mesozoa, Myxozoa, Nematoda, Nematomorpha, Loricifera, Kinorhyncha, Priapulida, Onychophora, Arthropoda, Tardigrada, Gastrotricha, Nemertea, Myzostomida, Gnathostomulida, Cycliophora, Platyhelminthes, Acanthocephala, Rotifera, Chaetognatha, Sipunculida, Bryozoa, Brachiopoda, Entoprocta, Annelida, Pogonophora, Echiura, Mollusca, Hemichordata, Echinodermata, Chordata.]
organism-size curve (Eukaryotes)
[Figure: number of individuals (log scale, from few through lots to squillions) against size of organism (log scale: minuscule, tiny, just visible, small, big), with regions marked FOOD ITEMS and POSSIBLE PREDATORS]
Sourhope farm
NERC "Soil Biodiversity and Ecosystem Function" Programme Study Site

120 m x 75 m of raw Scottish upland grass

13 000 000 000 nematodes
MAN IS BVT A WORM
Marine Nematode Barcodes
[Figure: tree of barcode sequences from marine nematodes, scale bar 5 changes. Tips are specimen codes (for example 1034ED) tagged with collection site: Fyne1, Fyne2, Gullane or Orkney. Site labels Gullane, Loch Fyne and Orkney appear alongside the counts 1, 10, 2, 10, 4, 11, 2.]
                                                                           1103ED Fyne2


                                                                                                                                  12
                                                                            1110ED Fyne2


Loch Fyne
                                                                                1114ED Fyne2
                                                                             1125ED Fyne2
                                                                              1131ED Fyne2

                                Gullane                                        1101ED Fyne2
                                                                              1102ED Fyne2
                                                                              1112ED Fyne2
                                                                              1116ED Fyne2
                                                                              1106ED Fyne2
                                                                             1104ED Fyne2
                                                                             1132ED Fyne2




                                                                                                                  51              Orkney




5: Nameless taxa & endless forms


MOTU
Molecular Operational Taxonomic Units

motu
1. to cut; to snap off
   motu-á te hau, the fishing line snapped off
2. to engrave, to inscribe
   letters or pictures in stone or in wood, like the motu mo rogorogo, inscriptions for recitation in lines called kohau.
3. islet
   some names of islets: Motu Motiro Hiva, Motu Nui, Motu Iti, Motu Kaokao, Motu Tapu, Motu Marotiri, Motu Kau, Motu Tavake, Motu Tautara, Motu Ko Hepa Ko Maihori, Motu Hava.

MOTU
specimen-based surveys
  CBoL Barcode of Life (CO1)
anonymous, specimen-free surveys
  environmental sampling
  bulk community DNA
  millions of sequences


~1.2 million described species
~10-100 million species in reality

Thus, most ‘species’ will never be formally named.

How do we incorporate these myriad
‘nameless taxa’ into our systems?




Martin Jones: TaxMan, iPhy & chelicerate evolution
Robin Floyd & Jenna Mann: MOTU and barcoding
Ralf Schmid, James Wasmuth & Ann Hedley: PartiGene & EST analysis

Comparing Sidecar-less Service Mesh from Cilium and Istio
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 Workshop
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)
 
PicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer ServicePicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer Service
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7
 
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability Adventure
 
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdf
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
 

iPhy tools for collation and analysis of phylogenomic data. M Blaxter

  • 1. Phyloinformatics Workshop Edinburgh 2007. iPhy: tools for collation and analysis of phylogenomic data. Martin Jones and Mark Blaxter
  • 2. [Figure: tree of eukaryotes. Legible group labels include cercozoa, alveolates, foraminiferans, diatoms, oomycetes, cryptophytes, haptophytes, dictyostelid slime molds, plasmodial slime molds, lobose amoebae, oxymonads, discicristates, opisthokonts and excavates; the root is marked.]
  • 3. 1: Forests of trees, and loads of kindling; 2: Organising principles; 3: iPhy design; 4: iPhy deployment; 5: Nameless taxa & endless forms
  • 4. 1: Forests of trees, and loads of kindling. Phylogenetics is a growth area. The raw materials (sequences) are being added at a startling rate. Tree databases are also growing (both in number and size). So how does a lab worker bee keep up?
  • 7. [Figure: Metazoan phyla, sequences per phylum (10/05/2006). Log-scale axis from 1 to 100,000,000. Phyla shown: Porifera, Placozoa, Buddenbrockia, Myxozoa, Mesozoa, Ctenophora, Cnidaria, Micrognathozoa, Cycliophora, Acoelomorpha, Gnathostomulida, Seisonidea, Rotifera, Gastrotricha, Sipuncula, Nemertea, Mollusca, Entoprocta, Bryozoa, Brachiopoda, Pogonophora, Echiura, Annelida, Platyhelminthes, Nematomorpha, Nematoda, Kinorhyncha, Acanthocephala, Priapulida, Tardigrada, Onychophora, Arthropoda, Xenoturbellida, Enteropneusta, Hemichordata, Echinodermata, Chordata, Chaetognatha.]
  • 8. [Figure: Metazoan phyla, species per phylum (10/05/2006). Log-scale axis from 1 to 10,000,000; same phyla as slide 7.]
  • 9. [Figure: Metazoan phyla, sequences per species (10/05/2006). Log-scale axis from 0.1 to 1,000; same phyla as slide 7.]
  • 10. 1: Forests of trees, and loads of kindling. Phylogenetics is a growth area. The raw materials (sequences) are being added at a startling rate. Tree databases are also growing (both in number and size). So how does a lab worker bee keep up?
  • 11. [Figure, from Rod Page, “Towards a Taxonomically Intelligent Phylogenetic Database”: cumulative number of molecular phylogenies and of TreeBASE studies by year. x-axis: year, 1980 to 2000; y-axis: cumulative number, 0 to 7000.]
  • 13. Two modes of data acquisition. (a) Wet lab - compute lab synergy: explicitly source the sequences needed, with preformed ideas of the best taxa to sample and the best genes to sample. [This is the source of most phylogenetic data.]
  • 14. Two modes of data acquisition. (a) Wet lab - compute lab synergy. (b) Magpie surfing / tree surgery: using phyloinformatic tools to discover the set of available genes AND taxa to address a particular problem.
  • 15. 2: Organising principles. On average … • more data are better (more taxa, more genes) • multiple methods are better
  • 16. 2: Organising principles • assess all relevant taxa • assess all relevant sequence
  • 17. While the NCBI taxonomy isn’t the best in the world, at least every sequence is attached to a taxon, and TAX_IDs are unique.
  • 18. The Edinburgh EST analysis Pipeline. (trace2dbest) Process raw sequence traces; trim off vector & low quality. (CLOBB) Cluster into putative gene objects; predict consensus sequence. (prot4EST) Predict translation reading frame; generate protein translation. (annot8r) Annotate using BLAST, GOtcha, PSort, Pfam, SigPep, KEGG. (PartiGene) Collate information in relational database.
  • 19. NEMBASE3 http://www.nematodes.org/ The web portal to NEMBASE3 Mark Blaxter, James Wasmuth, Ann Hedley & Ralf Schmid University of Edinburgh, Institute of Evolutionary Biology, Edinburgh UK EH9 3JT mark.blaxter@ed.ac.uk
  • 20. NEMBASE3 http://www.nematodes.org/ [Figure: collectors’ curve of nematode protein families. x-axis: total number of proteins (0 to 150,000); y-axis: number of families (0 to 50,000). Species marked along the curve: Trichinella spiralis, Brugia malayi, Meloidogyne incognita, Strongyloides stercoralis, Ancylostoma caninum, Caenorhabditis elegans. Regions labelled A, B and C.]
  • 21. NEMBASE3 http://www.nematodes.org/ [Figure: earliest origins of nematode protein families, shown as counts on a tree of Nematoda. Groups with the counts given in parentheses on the slide: Strongyloidea (6120), Rhabditoidea (3674), Diplogasteromorpha (1356), Panagrolaimomorpha (2678), Rhabditida (11213), Ascaridomorpha (3695), Spiruromorpha (5188), Dorylaimida (1610), Trichinellida (2571); also labelled: Tylenchomorpha, Cephalobomorpha. Clades: Rhabditina (Clade V), Tylenchina (Clade IV), Spirurina (Clade III), Dorylaimia (Clade I).]
  • 23. 2: Organising principles • assess all relevant taxa • assess all relevant sequence • store aligned sequences locally • output ‘slices’ of data in analysis-ready formats
  • 24. Many taxa, missing data. [Figure: gene-by-taxon data matrix; genes a to i across, taxa 1 to 9 down.]
  • 25. Generating a slice that • maximises taxonomic coverage • maximises present data/minimises missing data. [Figure: the slice retains genes a, b, e, f, g, i and taxa 1, 3, 7, 9.]
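
The idea of cutting a dense slice out of a sparse gene-by-taxon matrix can be illustrated with a small greedy routine. This is only a sketch under stated assumptions, not the method used in iPhy: the `presence` mapping (taxon to the set of genes it has data for), the function names and the `min_completeness` threshold are all hypothetical. The routine repeatedly drops whichever taxon or gene has the largest fraction of missing cells, and on a tie it drops the gene so that taxonomic coverage is kept.

```python
from typing import Dict, List, Set, Tuple


def completeness(presence: Dict[str, Set[str]], taxa: List[str], genes: List[str]) -> float:
    """Fraction of (taxon, gene) cells in the slice that hold data."""
    cells = len(taxa) * len(genes)
    if cells == 0:
        return 0.0
    filled = sum(1 for t in taxa for g in genes if g in presence[t])
    return filled / cells


def greedy_slice(presence: Dict[str, Set[str]],
                 min_completeness: float = 1.0) -> Tuple[List[str], List[str]]:
    """Drop the sparsest taxon or gene until the slice is dense enough."""
    taxa = sorted(presence)
    genes = sorted(set().union(*presence.values()))
    while taxa and genes and completeness(presence, taxa, genes) < min_completeness:
        # Fraction of missing cells for a taxon (row) or a gene (column).
        def taxon_missing(t: str) -> float:
            return sum(g not in presence[t] for g in genes) / len(genes)

        def gene_missing(g: str) -> float:
            return sum(g not in presence[t] for t in taxa) / len(taxa)

        worst_taxon = max(taxa, key=taxon_missing)
        worst_gene = max(genes, key=gene_missing)
        # On a tie, drop the gene: this favours taxonomic coverage.
        if gene_missing(worst_gene) >= taxon_missing(worst_taxon):
            genes.remove(worst_gene)
        else:
            taxa.remove(worst_taxon)
    return taxa, genes
```

A greedy pass like this is fast but not guaranteed to find the largest possible complete block; it trades optimality for simplicity.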
  • 26. 2: Organising principles • assess all relevant taxa • assess all relevant sequence • store aligned sequences locally • output ‘slices’ of data in analysis-ready formats • store trees locally • store alternative taxonomic systems
  • 27. [Figure: tree of animals and relatives with taxa Platyhelminthes, Annelida, Mollusca, Tardigrada, Nematoda, Arthropoda, Vertebrata, Urochordata, Cephalochordata, Echinodermata, Ctenophora, Cnidaria, Choanoflagellata and Fungi. Column headings: “Complete genome sequences” and “Including neglected taxa ESTs (Philippe et al.)”. Node labels: L, P, E, C, D.]
  • 29. [Diagram: data inputs and processing. Input types: sequence, alignment, TreeFam, TreeBASE, user tree, systematic. Three processing routes: (1) identify relevant sequences and store locally; associate sequences and taxa. (2) identify relevant sequences and store locally; capture tree data; reconcile tree nodes with existing systems. (3) capture tree data; reconcile tree nodes with existing systems.]
  • 30. [Diagram, continued: the processed inputs feed the iPhy database, which is linked to an Alignment Cycle (POA, tranAlign).]
  • 31. [Diagram, continued: adds an Orthologue Inference Engine (TreeFam, Ortho-MCL) linked to the iPhy database.]
  • 32. [Diagram, continued: adds Dataset Exploration Tools (Slice Selecter, using maximal bicliques; Tree Comparer) and a Phylogenetics Cycle (PhyML, MrBayes, PAUP, ...).]
  • 33. [Diagram, continued: trees & alignments from these components lead to Publication Quality Analyses.]
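
The diagram associates the slice selecter with maximal bicliques: in a bipartite graph of taxa and genes, a biclique is a set of taxa that all have data for a set of genes, and it is maximal when no taxon or gene can be added without introducing a missing cell. The following is a brute-force sketch for small matrices only; the `presence` mapping and the function name are hypothetical, and nothing here is taken from the iPhy code. It enumerates gene subsets, so its cost grows exponentially with the number of genes.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Set, Tuple

Biclique = Tuple[FrozenSet[str], FrozenSet[str]]  # (taxa, genes)


def maximal_bicliques(presence: Dict[str, Set[str]]) -> Set[Biclique]:
    """Enumerate every maximal complete (taxa, genes) block with no missing cells."""
    all_genes = sorted(set().union(*presence.values()))
    found: Set[Biclique] = set()
    for size in range(1, len(all_genes) + 1):
        for genes in combinations(all_genes, size):
            wanted = set(genes)
            # All taxa that have data for every gene in this subset.
            taxa = frozenset(t for t, have in presence.items() if wanted <= have)
            if not taxa:
                continue
            # Close the gene set: every gene shared by all of those taxa.
            closed = frozenset(set.intersection(*(set(presence[t]) for t in taxa)))
            found.add((taxa, closed))
    return found
```

Each returned pair is a complete slice; choosing among them (for example, the one with the most taxa, or the most cells) is a separate decision that depends on the question being asked.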
  • 34. 4: iPhy deployment. Version 0.1: ‘TaxMan’
  • 35. BMC Bioinformatics (BioMed Central), Software, Open Access. TaxMan: a taxonomic database manager. Martin Jones* and Mark Blaxter. Address: Institute of Evolutionary Biology, King's Buildings, Ashworth Laboratories, West Mains Road, Edinburgh EH9 3JT, UK. Email: Martin Jones* - martin.jones@ed.ac.uk; Mark Blaxter - mark.blaxter@ed.ac.uk. * Corresponding author. Published: 18 December 2006. Received: 11 October 2006. Accepted: 18 December 2006. BMC Bioinformatics 2006, 7:536. doi:10.1186/1471-2105-7-536. This article is available from: http://www.biomedcentral.com/1471-2105/7/536 © 2006 Jones and Blaxter; licensee BioMed Central Ltd.
  • 36. 4: iPhy deployment. Version 0.1: ‘TaxMan’. TaxMan automates assembly of large sequence datasets for chosen taxa. TaxMan automates generation of aligned sequence sets for chosen genes.
  • 37. 4: iPhy deployment. Version 0.1: ‘TaxMan’. TaxMan simplifies selection of taxa for analysis: e.g. given a gene set, choosing one species per family (choosing the species with the least missing data); e.g. given a taxon set, choosing the genes (choosing genes with less than a given % missing data); e.g. generating custom defined alignments.
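
The two selection rules on this slide can be sketched in a few lines. This is a minimal illustration, not TaxMan's implementation: the data structures (`family_of`, mapping species to family, and `presence`, mapping species to the set of genes with data) and both function names are assumptions made for the example.

```python
from typing import Dict, List, Set


def pick_species_per_family(family_of: Dict[str, str],
                            presence: Dict[str, Set[str]],
                            genes: List[str]) -> Dict[str, str]:
    """For each family, keep the species missing the fewest of the given genes."""
    best: Dict[str, tuple] = {}
    for species in sorted(presence):  # sorted so ties resolve alphabetically
        missing = sum(1 for g in genes if g not in presence[species])
        family = family_of[species]
        if family not in best or missing < best[family][0]:
            best[family] = (missing, species)
    return {family: species for family, (_, species) in best.items()}


def genes_below_missing(presence: Dict[str, Set[str]],
                        taxa: List[str],
                        genes: List[str],
                        max_missing_pct: float) -> List[str]:
    """Keep genes whose percentage of taxa without data is below the threshold."""
    if not taxa:
        return []
    kept = []
    for g in genes:
        missing = sum(1 for t in taxa if g not in presence[t])
        if 100.0 * missing / len(taxa) < max_missing_pct:
            kept.append(g)
    return kept
```

Both functions work on presence/absence only; a fuller version might weigh sequence length or alignment quality, which the slide does not specify.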
  • 38. 4: iPhy deployment. Version 0.1: ‘TaxMan’. TaxMan simplifies analysis by exporting formatted alignments (NEXUS): of nucleotides (with codon positions and genes as defined partitions); of amino acids (with genes as defined partitions).
  • 39. 4: iPhy deployment. Version 0.1: ‘TaxMan’. TaxMan simplifies post-phylogenetic analysis by: saving trees (with links to the original data); saving analytical metadata (algorithm, parameters, settings); saving tree statistics (bootstraps, branch lengths).
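
To make "genes and codon positions as defined partitions" concrete, here is a hedged sketch of writing a concatenated nucleotide alignment as NEXUS with `CHARSET` definitions. It is not TaxMan's exporter. The function name and the `alignments` structure (gene to a mapping of taxon to aligned sequence) are assumptions, as are the simplifications that taxon names contain no spaces and that every gene alignment starts at codon position 1. Taxa lacking a gene are padded with `?`.

```python
from typing import Dict, List


def to_nexus(alignments: Dict[str, Dict[str, str]], gene_order: List[str]) -> str:
    """Concatenate per-gene alignments and describe partitions as NEXUS charsets."""
    lengths: Dict[str, int] = {}
    for gene in gene_order:
        seq_lengths = {len(seq) for seq in alignments[gene].values()}
        if len(seq_lengths) != 1:
            raise ValueError(f"sequences for {gene} are not aligned to one length")
        lengths[gene] = seq_lengths.pop()

    taxa = sorted({t for gene in gene_order for t in alignments[gene]})
    # Pad genes that a taxon lacks with '?' so that every row has equal length.
    rows = {t: "".join(alignments[g].get(t, "?" * lengths[g]) for g in gene_order)
            for t in taxa}

    width = max(len(t) for t in taxa)
    lines = ["#NEXUS", "BEGIN DATA;",
             f"  DIMENSIONS NTAX={len(taxa)} NCHAR={sum(lengths.values())};",
             "  FORMAT DATATYPE=DNA MISSING=? GAP=-;",
             "  MATRIX"]
    lines += [f"  {t.ljust(width)}  {rows[t]}" for t in taxa]
    lines += ["  ;", "END;", "BEGIN SETS;"]

    start = 1
    for gene in gene_order:
        end = start + lengths[gene] - 1
        lines.append(f"  CHARSET {gene} = {start}-{end};")
        for pos in (1, 2, 3):
            # 'a-b\3' selects every third site starting at a.
            lines.append(f"  CHARSET {gene}_pos{pos} = {start + pos - 1}-{end}\\3;")
        start = end + 1
    lines.append("END;")
    return "\n".join(lines) + "\n"
```

For an amino-acid export the same layout would apply with `DATATYPE=PROTEIN` and without the per-codon-position charsets.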
  • 41. Lophotrochozoa • 70,000 annotated sequences • 630,000 EST sequences • 21 genes (mt + 18S 28S actin H3 WG EF1A) • 53,000 sequences extracted • 17,000 aligned consensus sequences • 8,700 species represented • One day for data collection, one for alignment
  • 42. Molecular Phylogenetics and Evolution 43 (2007) 583–595, www.elsevier.com/locate/ympev. The effect of model choice on phylogenetic inference using mitochondrial sequence data: Lessons from the scorpions. Martin Jones (a,*), Benjamin Gantenbein (b), Victor Fet (c), Mark Blaxter (a). (a) Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JT, UK. (b) AO Research Institute, Clavadelerstrasse 8, Davos Platz CH-7270, Switzerland. (c) Department of Biological Sciences, Marshall University, Huntington, WV 25755-2510, USA. Received 25 April 2006; revised 14 November 2006; accepted 14 November 2006. Available online 29 November 2006.
  • 43. 5: Nameless taxa & endless forms
  • 44. "... endless forms most beautiful and most wonderful have been, and are being, evolved" (Darwin 1859)
  • 45. http://www.nematodes.org/NeglectedGenomes/ARTHROPODA/Chelicerata.html
  • 46. [Figure: Metazoan species per phylum. Log-scale axis from 1 to 100,000,000. Groups shown: Choanoflagellida, Porifera, Placozoa, Cnidaria, Ctenophora, Acoela, Mesozoa, Myxozoa, Nematoda, Nematomorpha, Loricifera, Kinorhyncha, Priapulida, Onychophora, Arthropoda, Tardigrada, Gastrotricha, Nemertea, Myzostomida, Gnathostomulida, Cycliophora, Platyhelminthes, Acanthocephala, Rotifera, Chaetognatha, Sipunculida, Bryozoa, Brachiopoda, Entoprocta, Annelida, Pogonophora, Echiura, Mollusca, Hemichordata, Echinodermata, Chordata.]
  • 47. [Figure: organism-size curve for Eukaryotes. x-axis: size of organism (log scale), from minuscule through tiny, just visible and small to big; y-axis: number of individuals (log scale), from few through lots to squillions. Regions labelled POSSIBLE PREDATORS and FOOD ITEMS.]
  • 48. Sourhope farm: NERC "Soil Biodiversity and Ecosystem Function" Programme study site. 120 m x 75 m of raw Scottish upland grass; 13 000 000 000 nematodes.
  • 49. MAN IS BVT A WORM
  • 50. [Figure: tree of marine nematode barcodes. Tips are individual specimens (codes of the form 1034ED) labelled by collection site: Loch Fyne (Fyne1, Fyne2), Gullane and Orkney. Scale bar: 5 changes.]
  • 51. 5: Nameless taxa & endless forms. MOTU: Molecular Operational Taxonomic Units
  • 52. motu: 1. to cut; to snap off: motu-á te hau, the fishing line snapped off. 2. to engrave, to inscribe letters or pictures in stone or in wood, like the motu mo rogorogo, inscriptions for recitation in lines called kohau. 3. islet; some names of islets: Motu Motiro Hiva, Motu Nui, Motu Iti, Motu Kaokao, Motu Tapu, Motu Marotiri, Motu Kau, Motu Tavake, Motu Tautara, Motu Ko Hepa Ko Maihori, Motu Hava.
  • 53. 5: Nameless taxa & endless forms. MOTU • specimen-based surveys: CBoL Barcode of Life (CO1) • anonymous, specimen-free surveys: environmental sampling, bulk community DNA, millions of sequences
  • 54. 5: Nameless taxa & endless forms. ~1.2 million described species; ~10-100 million species in reality. Thus, most ‘species’ will never be formally named.
  • 55. 5: Nameless taxa & endless forms. How do we incorporate these myriad ‘nameless taxa’ into our systems?
  • 56. TaxMan, iPhy & chelicerate evolution: Martin Jones. MOTU and barcoding: Robin Floyd & Jenna Mann. PartiGene & EST analysis: Ralf Schmid, James Wasmuth & Ann Hedley.