Metagenomic tools for the fungal community
Holly Bik, UC Davis
19 October 2012
http://phylosift.wordpress.com

Explicitly Phylogenetic Approaches
[Figure labels: Aligned environmental sequences + Guide Tree; Evolutionary Placement of short reads]

We provide:
•  Support for Paired End (raw) Illumina data
•  Marker gene data for Bacteria, Archaea, Eukaryotes, Viruses
•  Taxonomy assignments based on probability distributions over a reference phylogeny
•  Complement to existing tools – QIIME/VAMPS
  –  Inputs/outputs will be compatible for use with other software tools

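As a rough illustration of taxonomy assignment from a probability distribution over a reference phylogeny, a read placed on several branches can be summarized by adding up the placement probabilities that fall under each taxon. The sketch below is not PhyloSift's implementation. The edge-to-lineage map, the weights, and the function name are all invented for illustration.

```python
from collections import defaultdict

def taxon_probabilities(placements, edge_lineage):
    """Sum placement weights over every taxon in each edge's lineage.

    placements   -- list of (edge_id, weight) pairs for ONE read; the weights
                    are assumed to be normalized placement probabilities
    edge_lineage -- dict mapping edge_id to its lineage, root first
    """
    totals = defaultdict(float)
    for edge_id, weight in placements:
        for taxon in edge_lineage[edge_id]:
            totals[taxon] += weight
    return dict(totals)

# Hypothetical reference edges and one read placed on three of them.
edge_lineage = {
    1: ["Fungi", "Ascomycota", "Saccharomycetes"],
    2: ["Fungi", "Ascomycota", "Sordariomycetes"],
    3: ["Fungi", "Basidiomycota", "Agaricomycetes"],
}
placements = [(1, 0.5), (2, 0.25), (3, 0.25)]

probs = taxon_probabilities(placements, edge_lineage)
for taxon in sorted(probs, key=probs.get, reverse=True):
    print(f"{taxon}: {probs[taxon]:.2f}")
```

In this toy case the read is assigned to Fungi with probability 1.00 and to Ascomycota with 0.75, so a confident call can be made at the phylum level even though no single class is well supported.
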
Markers
•  PMPROK – Dongying Wu's Bac/Arch markers
•  Eukaryotic Orthologs – Parfrey 2011 paper
•  16S/18S rRNA
•  Mitochondria - protein-coding genes
•  Viral Markers – Markov clustering on genomes
•  Codon Subtrees – finer scale taxonomy
•  Extended Markers – plastids, gene families

Reference Marker Genes

The Monkey – Build Marker Packages

Inputs:
•  Mapping File (sequence name, NCBI taxon ID)
•  PD cutoff
•  Alignment File (marker sequences in FASTA format)

Workflow, in order:
•  Execute build_marker mode
•  hmmbuild (ssu-build): Create profile HMMs (or CMs for rRNA data) using input sequences
•  Generate unique IDs for input sequences
•  FastTree: Build tree and collapse topology according to a user-specified PD cutoff (e.g. 99%)
•  Tree Reconciliation: Reconcile NCBI taxonomy IDs with phylogenetic topology
  –  Quantitative metric (minimum Hamming distance) used to match edges between NCBI taxon tree and molecular phylogeny
•  Clean and package new marker genes
•  Built Marker Packages: New marker gene packages placed into shared PhyloSift marker directory
•  Execute index mode
•  Index Marker Database: Indexes the marker databases needed for LAST and Bowtie
  –  Locally indexed marker packages will not interfere with automatic updates to PhyloSift core markers

NOTE: New marker packages are named according to input filenames (e.g. MarkerAlignment.fasta). Core marker data will be overwritten during new marker builds if input files do not have unique names compared to existing PhyloSift markers.

Built PhyloSift Marker package: Tree, HMM profile (CMs for rRNA), Taxon map, Representative sequences, Alignment

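The tree reconciliation step matches edges between the NCBI taxon tree and the molecular phylogeny by minimum Hamming distance. One way to picture this, sketched below, is to encode each edge as the bipartition of leaves it induces and compare bit vectors. The slides do not say how edges are encoded, so the bipartition encoding, the function names, and the toy data are assumptions.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))

def split_distance(a, b):
    """Hamming distance between two bipartitions, ignoring which side is labeled 1."""
    d = hamming(a, b)
    return min(d, len(a) - d)

def match_edges(taxonomy_splits, phylogeny_splits):
    """For each taxonomy edge, return the closest phylogeny edge and its distance."""
    matches = {}
    for taxon, t_split in taxonomy_splits.items():
        best = min(
            phylogeny_splits,
            key=lambda edge: split_distance(t_split, phylogeny_splits[edge]),
        )
        matches[taxon] = (best, split_distance(t_split, phylogeny_splits[best]))
    return matches

# Leaves in a fixed order: A, B, C, D, E. A 1 marks leaves below the edge.
taxonomy_splits = {
    "GenusX": (1, 1, 0, 0, 0),   # taxonomy groups A and B
    "FamilyY": (1, 1, 1, 0, 0),  # taxonomy groups A, B and C
}
phylogeny_splits = {
    "edge1": (1, 1, 0, 0, 0),    # the tree agrees on A + B
    "edge2": (1, 1, 0, 1, 0),    # the tree groups D, not C, with A + B
}

matches = match_edges(taxonomy_splits, phylogeny_splits)
print(matches)
```

Here GenusX maps to edge1 with distance 0 (a perfect match), while FamilyY has no exact counterpart in the tree and falls back to its nearest edge at distance 1.
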
The Kangaroo – Simulation Data

Input: Genome Directory
•  Define the number of genomes to pick (default = 10) and number of reads to generate per file (default = 100,000)

Workflow (execute sim mode), in order:
•  PD on concatenated tree: Determines PD contributions for taxa present in concatenated guide tree in PhyloSift marker directory
•  Select Taxa: Two separate approaches used:
   1.  Select some number of taxa that contribute to PD (user input, default = 10 taxa)
   2.  Sample taxa uniformly without replacement
•  Compute metrics between target and remaining taxa: Calculated metrics include the distance to nearest neighbors, connecting branch lengths, and the number of sampled nodes within various PD units of connecting nodes.
•  Knockout Swaths of Taxa: Workflow plugs into updateDB to remove genomes which have been used to simulate metagenome data, as well as a swath of related taxa.
•  Generated Simulated Reads: Grinder algorithm randomly generates reads from selected genomes, outputs simulated PE-Illumina and 454 datasets
•  Simulation Marker Directory: A new marker directory is created, where simulated genomes have been knocked out from marker packages.

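The two taxon-selection approaches in the simulation workflow can be contrasted with a small sketch. The slides do not specify how PD-contributing taxa are chosen, so the greedy rule below (repeatedly take the leaf that adds the most uncovered branch length) is an assumption, as are the toy tree and all function names.

```python
import random

# Hypothetical rooted guide tree: child -> (parent, branch length).
TREE = {
    "n1": ("root", 0.1),
    "n2": ("root", 0.3),
    "A": ("n1", 0.1),
    "B": ("n1", 0.2),
    "C": ("n2", 0.5),
    "D": ("n2", 0.1),
}
LEAVES = ["A", "B", "C", "D"]

def path_to_root(tree, leaf):
    """Return the nodes whose parent branches lie on the leaf-to-root path."""
    path, node = [], leaf
    while node in tree:
        path.append(node)
        node = tree[node][0]
    return path

def select_by_pd(tree, leaves, k):
    """Greedily pick k leaves, each adding the most uncovered branch length."""
    covered, chosen = set(), []
    for _ in range(k):
        def gain(leaf):
            return sum(tree[n][1] for n in path_to_root(tree, leaf) if n not in covered)
        best = max((leaf for leaf in leaves if leaf not in chosen), key=gain)
        chosen.append(best)
        covered.update(path_to_root(tree, best))
    return chosen

def select_uniform(leaves, k, seed=None):
    """Sample k leaves uniformly without replacement."""
    return random.Random(seed).sample(leaves, k)

pd_pick = select_by_pd(TREE, LEAVES, 2)
uniform_pick = select_uniform(LEAVES, 2, seed=1)
print("PD-based:", pd_pick)
print("Uniform: ", uniform_pick)
```

The PD-based pick favors long, isolated branches (C, then B in this toy tree), whereas uniform sampling ignores branch lengths entirely, which is why comparing both gives a broader test of placement accuracy.
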
DBupdate – Mining new genomes

Inputs: EBI Genomes, Private Genomes, NCBI Genomes, JGI Genomes

Workflow (execute phylosift_dbupdate.pl), in order:
•  Run PhyloSift (search + align): Add new sequences to marker packages
•  Infer Updated Tree: Amino Acid Tree and Nucleotide Tree
•  Prune Tree: A taxa set is selected with a maxPD cutoff of 0.02 and a new tree is inferred
•  Codon Subtrees: PD metric used to split guide tree into smaller subtrees; subsets of taxa are selected such that no branch connecting them has length >0.X for some value of X
•  Update reference sequences with new data: New sequences added at 0.25 PD for amino acid tree; higher PD threshold enables more aggressive searches of reference database, since LAST searching is faster with fewer sequences.
•  Tree Reconciliation: Reconcile NCBI taxonomy IDs with phylogenetic topologies, for both amino acid tree and codon subtrees
•  Package Markers
•  Automated Download to PhyloSift Users: Users' local marker databases are automatically scanned each time PhyloSift is run and any new updates are automatically downloaded if available

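The codon subtree step groups taxa so that no branch connecting members of a group exceeds some length. A minimal sketch of that idea is to cut every branch longer than a threshold and collect the leaves that remain connected. The slides leave the threshold unspecified ("0.X"), so the value 0.1 used here is arbitrary, and the edge list and function name are invented for illustration.

```python
def split_into_subtrees(edges, leaves, max_branch):
    """Group leaves so that no branch on the path between two leaves in the
    same group is longer than max_branch.

    edges -- list of (node_a, node_b, branch_length) describing a tree
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Join nodes across short branches; long branches act as cut points.
    for a, b, length in edges:
        root_a, root_b = find(a), find(b)
        if length <= max_branch and root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for leaf in leaves:
        groups.setdefault(find(leaf), []).append(leaf)
    return sorted(sorted(group) for group in groups.values())

# Hypothetical guide tree with one long internal branch (root to n2).
edges = [
    ("root", "n1", 0.05),
    ("root", "n2", 0.40),
    ("n1", "A", 0.02),
    ("n1", "B", 0.03),
    ("n2", "C", 0.01),
    ("n2", "D", 0.04),
]

subtrees = split_into_subtrees(edges, ["A", "B", "C", "D"], max_branch=0.1)
print(subtrees)
```

Because a tree has exactly one path between any two leaves, leaves that stay connected after the cut are guaranteed to be linked only by branches at or below the threshold.
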
Tree Reconciliation in PhyloSift
[Figure: Environmental Sequences and Named Taxa on a tree, with cases labeled "Great!", "Not Bad", and "Getting Tricky…"]

Tree Placement
Fat Tree - Guppy
[Figure: Marine Metagenome; labeled groups: Chemoautotrophic bacteria – oxidize ammonia into nitrite; Alveolate Protists; Common seawater Archaea]

Tree Placement
Tog Tree - Guppy
[Figure: Marine Metagenome]

Tree Placement
Sing Tree - Guppy
[Figure: Marine Metagenome]

Linking with the Fungal ITS community
•  How does fungal ITS sequence data relate to your project?
  –  PhyloSift has the capability to add any marker gene reference packages that are relevant for specific taxonomic communities
•  What fungal ITS data does your project currently provide?
  –  None – but we do mine other marker genes from fungal genomes
•  What fungal ITS data is your project hoping to provide?
  –  We wouldn't provide data, but can work with users to increase support for fungal analyses

Linking with the Fungal ITS community
•  Is your project involved with curating fungal ITS sequences?
  –  No, but we would curate alignments and marker packages of ITS sequences mined from public databases
•  If so, what curation strategies are being implemented for your project?
  –  Alignment filtering and masking, pruning reference trees
•  What tools for working with fungal ITS sequences does your project currently provide?
  –  None so far – but can be implemented if given a reference dataset (e.g. alignment)

Linking with the Fungal ITS community
•  What tools are you developing / planning to develop?
  –  Current focus is on multisample comparisons
  –  Gene tree reconciliation
  –  Probability distribution over tree topology to delimit OTUs (Phylogenetic OTUs)
•  What framework of fungal taxonomy does your project use?
  –  NCBI-derived taxonomy (because of tree mapping/reconciliation issues)

SATELLITE MEETING
Eukaryotic Metagenomics
March/April 2013
UC Davis

Acknowledgements
UC Davis
•  Jonathan Eisen
•  Aaron Darling
•  Guillaume Jospin
•  Dongying Wu
•  David Coil

PhyloSift Software Development on Github:
https://github.com/gjospin/PhyloSift

Google Group for user support:
https://groups.google.com/d/forum/phylosift

Twitter: @PhyloSift

Fungal ITS meeting presentation
