Computational Tools for
Metagenomics
Surya Saha
Twitter: @SahaSurya / LinkedIn: www.linkedin.com/in/suryasaha/
Magdalen Lindeberg
Plant Pathology & Plant-Microbe Biology
Microbial Friends & Foes, Sep 25, 2012
Impact of Technology on Metagenomics
Temperton, Current Opinion in Microbiology, 2012
Types of “Meta” genomics
16S rRNA survey of bacterial
microbiome
ITS survey of fungal
microbiome
Bellemain, BMC Microbiology, 2010
Slide: Julien Tremblay, JGI
Types of “Meta” genomics
Whole genome shotgun
• Varying complexity of microbial communities
• High coverage sequencing
• Sophisticated informatics
• Host associated metagenomes
– Deep sequencing of host meta-genome
– Bioinformatic screening of host sequences
• Environmental metagenomes
– E.g., soil samples
– Requires very high depth of coverage
– Complicated to assemble
Big picture!!
What users see
What users want!!
16S/ITS community surveys
• Multiple target regions in 16S gene and ITS region
• Comparison of results requires amplification of same region
• Advantages
– Fast survey of large communities
– Mature set of tools and statistics for analysis
– Good for first round survey
• 454 16S tags or pyrotags (~700 bp) have been the preferred method
• Illumina MiSeq runs (2x150 bp, 2x250 bp) are the next workhorses
• Depth of sampling
– 2,000-6,000 reads/sample for simple communities
– 20,000 reads/sample for complex soil metagenomes
16S/ITS issues
• Lack of tools for processing ITS/Fungal microbiome data
sets
– RDP classifier targets only ITS
– No ITS reconstruction tools
• Amplification bias affects accuracy and replication
• Use of short reads prevents disambiguation of similar
strains
• 16S or ITS may not differentiate between similar strains
– Clustering is done at 97%
– Regions may be >99% similar
• Sequencing error inflates number of OTUs
• Chloroplast 16S sequences can get amplified in plant
metagenomes
16S/ITS sequence processing workflow
• Filter for contaminants and low quality reads
• Assemble overlapping reads
• Reduce datasets (clustering)
• Perform taxonomic classification and compute diversity metrics
16S/ITS sequence processing workflow: filter for contaminants and low quality reads
• Quality plots and read trimming
– FastQC
http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
– FASTX
http://hannonlab.cshl.edu/fastx_toolkit/
• Chimera removal
– AmpliconNoise
http://code.google.com/p/ampliconnoise/
– UCHIME
http://www.drive5.com/uchime/
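The read trimming step listed above can be sketched in a few lines of Python. This is only an illustration of the idea, not how FastQC or FASTX work internally. It assumes Phred+33 encoded quality strings, and the function names and thresholds (`trim_3prime`, `min_q`, `min_len`) are hypothetical.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into Phred scores (assumes Phred+33)."""
    return [ord(ch) - offset for ch in quality]


def trim_3prime(seq, quality, min_q=20, min_len=50):
    """Trim low-quality bases from the 3' end of one read.

    Returns (seq, quality) after trimming, or None when the trimmed
    read is shorter than min_len. Thresholds are illustrative only.
    """
    scores = phred_scores(quality)
    end = len(scores)
    # Walk back from the 3' end while the base quality is below min_q.
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    if end < min_len:
        return None
    return seq[:end], quality[:end]
```

Reads that return None would be dropped before merging or clustering; a suitable cutoff depends on the platform and the amplicon length.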
Impact of Sequence Length
Slide: Feng Chen, JGI
16S/ITS sequence processing workflow: assemble overlapping reads
• Merge overlapping paired end reads
– FLASH
http://www.genomics.jhu.edu/software/FLASH/index.shtml
– FastqJoin
http://code.google.com/p/ea-utils/wiki/FastqJoin
– CD-HIT read-linker
http://weizhong-lab.ucsd.edu/cd-hit/wiki/doku.php?id=cd-hit-auxtools-manual
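Merging overlapping paired-end reads can be illustrated with a short Python sketch. It is not the algorithm used by FLASH, FastqJoin or CD-HIT read-linker; it simply tries the longest overlap first and ignores base qualities. The names `merge_pair`, `min_overlap` and `max_mismatch_frac` are assumptions made for this example.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(read1, read2, min_overlap=10, max_mismatch_frac=0.1):
    """Merge a forward read with its reverse mate if they overlap.

    Tries the longest overlap first and accepts the first one whose
    mismatch fraction is within max_mismatch_frac. Returns the merged
    sequence, or None when no acceptable overlap exists.
    """
    mate = reverse_complement(read2)
    longest = min(len(read1), len(mate))
    for olen in range(longest, min_overlap - 1, -1):
        tail, head = read1[-olen:], mate[:olen]
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches <= max_mismatch_frac * olen:
            # Keep read1 through the overlap, then append the rest of the mate.
            return read1 + mate[olen:]
    return None
```

A production tool would also use the quality scores to choose the base call at mismatched positions inside the overlap.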
16S/ITS sequence processing workflow: reduce datasets (clustering)
• Clustering with high stringency
– UCLUST/USEARCH (16S only)
http://www.drive5.com/usearch/
– CD-HIT-OTU (16S only)
http://weizhong-lab.ucsd.edu/cd-hit-otu/
– phylOTU (16S only)
https://github.com/sharpton/PhylOTU
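High-stringency clustering can be sketched as a greedy centroid procedure at 97% identity. This is a simplified illustration, not the UCLUST, CD-HIT-OTU or phylOTU implementation: identity is approximated here from edit distance, which is far too slow for real datasets, and the function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def identity(a, b):
    """Approximate pairwise identity as 1 - edit distance / longer length."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def greedy_otus(seqs, threshold=0.97):
    """Greedy centroid clustering: each read joins the first centroid it
    matches at >= threshold identity, otherwise it seeds a new OTU."""
    otus = []  # list of (centroid, members)
    # Longer reads are considered first so they become centroids.
    for seq in sorted(seqs, key=len, reverse=True):
        for centroid, members in otus:
            if identity(seq, centroid) >= threshold:
                members.append(seq)
                break
        else:
            otus.append((seq, [seq]))
    return otus
```

Because each read is compared only to centroids, a single sequencing error that drops a read below the threshold seeds a new OTU, which is one way errors inflate OTU counts.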
16S/ITS sequence processing workflow: perform taxonomic classification and compute diversity metrics
• Composition based classifiers
– RDP database + classifier
http://rdp.cme.msu.edu/classifier/classifier.jsp
• Homology based classifiers
– ARB + Silva database (16S only)
http://www.arb-home.de/
– GreenGenes database (16S only)
http://greengenes.lbl.gov/cgi-bin/nph-index.cgi
– UNITE database (ITS only)
http://unite.ut.ee/
– FungalITSPipeline (ITS only)
http://www.emerencia.org/fungalitspipeline.html
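A composition-based classifier of the kind listed above can be sketched as a naive Bayes model over k-mer words. The RDP classifier is a naive Bayesian classifier over 8-mers, but the sketch below is not its implementation: it uses a simplified smoothing constant, skips bootstrap confidence estimation, and the taxon labels and function names are hypothetical.

```python
import math
from collections import defaultdict


def kmers(seq, k=8):
    """Return the set of k-mers (words) present in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def train(references, k=8):
    """references: iterable of (taxon, sequence) pairs.

    Returns per-taxon counts of how many reference sequences contain
    each word, plus the number of sequences per taxon.
    """
    word_counts = defaultdict(lambda: defaultdict(int))
    n_seqs = defaultdict(int)
    for taxon, seq in references:
        n_seqs[taxon] += 1
        for word in kmers(seq, k):
            word_counts[taxon][word] += 1
    return word_counts, n_seqs


def classify(query, word_counts, n_seqs, k=8):
    """Assign the query to the taxon with the highest log-likelihood."""
    best_taxon, best_score = None, -math.inf
    for taxon, total in n_seqs.items():
        # Simplified smoothing keeps unseen words from giving log(0).
        score = sum(
            math.log((word_counts[taxon].get(word, 0) + 0.5) / (total + 1))
            for word in kmers(query, k)
        )
        if score > best_score:
            best_taxon, best_score = taxon, score
    return best_taxon
```

Homology-based classifiers instead align each read against a reference database, so their accuracy depends more directly on how well that database samples the community.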
QIIME
• http://www.qiime.org/
• Comprehensive suite of tools
– OTU picking
– Taxonomic classification
– Construction of phylogenetic
trees
– Visualization
– Compute diversity statistics
• Available as Amazon EC2
image
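As an illustration of the diversity statistics mentioned above, the sketch below computes the Shannon index for one sample and the Bray-Curtis dissimilarity between two samples from OTU counts. It is plain Python written for this example and does not use or reproduce the QIIME API; the function names are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over non-zero OTU counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity between two samples.

    Both lists must give counts for the same OTUs in the same order.
    """
    shared = sum(min(a, b) for a, b in zip(counts_a, counts_b))
    return 1.0 - 2.0 * shared / (sum(counts_a) + sum(counts_b))
```

Both measures are sensitive to sequencing depth, so samples are usually compared at a common depth.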
Whole Genome Shotgun (WGS)
Metagenomics
• Better classification with increasing number of complete genomes
• Focus on whole genome based phylogeny (whole
genome phylotyping)
• Advantages
– No amplification bias, unlike 16S/ITS surveys
• Issues
– Poor sampling of fungal diversity
– Assembly of metagenomes is complicated due to
uneven coverage
– Requires high depth of coverage
WGS sequence processing workflow
• Filter for low quality reads
• Assemble reads
• Perform taxonomic classification and compute diversity metrics
WGS sequence processing workflow: filter for low quality reads
• Quality plots and read trimming
– FastQC
http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
– FASTX
http://hannonlab.cshl.edu/fastx_toolkit/
WGS sequence processing workflow: assemble reads
• NGS assembly with uneven depth
– IDBA-UD
http://i.cs.hku.hk/~alse/hkubrg/projects/idba_ud/
– MIRA
http://www.chevreux.org/projects_mira.html
– Velvet / MetaVelvet
http://www.ebi.ac.uk/~zerbino/velvet/
http://metavelvet.dna.bio.keio.ac.jp/
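Why uneven depth complicates assembly can be shown with a toy de Bruijn graph, the representation used by assemblers such as Velvet and IDBA-UD. The sketch below is not taken from any of the tools listed; `kmer_counts`, `de_bruijn_edges` and the `min_count` cutoff are hypothetical names for illustration.

```python
from collections import Counter


def kmer_counts(reads, k=21):
    """Count every k-mer across all reads; counts approximate local depth."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def de_bruijn_edges(counts, min_count=1):
    """Turn k-mers into de Bruijn edges: prefix (k-1)-mer -> suffix (k-1)-mer.

    A single global min_count is what breaks down with uneven depth:
    a cutoff that removes errors in an abundant genome can also remove
    genuine k-mers from a rare one.
    """
    return {(kmer[:-1], kmer[1:]): n
            for kmer, n in counts.items() if n >= min_count}
```

In the toy case of five copies of one read and a single copy of another, raising `min_count` to 2 discards every edge from the low-abundance read, even though it is error free.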
WGS sequence processing workflow: perform taxonomic classification and compute diversity metrics
• Hybrid composition/homology based
classifiers
– FCP
http://kiwi.cs.dal.ca/Software/FCP
– Phymm/PhymmBL
http://www.cbcb.umd.edu/software/phymm/
– AMPHORA2
http://wolbachia.biology.virginia.edu/WuLab/Software.html
– NBC
http://nbc.ece.drexel.edu/
– MEGAN
http://ab.inf.uni-tuebingen.de/software/megan/
WGS sequence processing workflow: perform taxonomic classification and compute diversity metrics
• Web based classifiers
– MG-RAST
http://metagenomics.anl.gov/
– CAMERA
http://camera.calit2.net/
– IMG/M
http://img.jgi.doe.gov/cgi-bin/m/main.cgi
MetaPhlAn
• Unique clade-specific markers for sequenced bacteria and archaea
• 400 genera/4,000 genomes including HMP genomes
• Species level resolution
• MetaPhlAn 2 in the works
– Eukaryotes including Fungi
– Viruses
– Higher coverage of archaea
• Krona and GraPhlAn for visualization of output
• Websites
– https://bitbucket.org/nsegata/metaphlan
– http://huttenhower.sph.harvard.edu/metaphlan
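The idea of estimating community composition from clade-specific markers can be sketched as follows. This is a simplified illustration and not MetaPhlAn's code or exact normalization: it assumes reads have already been mapped to markers, and the function name, input dictionaries and clade labels are hypothetical.

```python
def relative_abundance(marker_hits, marker_lengths):
    """Estimate clade relative abundances from marker read counts.

    marker_hits:    {clade: reads mapped to that clade's markers}
    marker_lengths: {clade: total marker length in bp}
    Read counts are divided by marker length so clades with more or
    longer markers are not over-counted, then scaled to sum to 1.
    """
    coverage = {clade: hits / marker_lengths[clade]
                for clade, hits in marker_hits.items()}
    total = sum(coverage.values())
    if total == 0:
        # No marker hits at all: report zero abundance for every clade.
        return {clade: 0.0 for clade in coverage}
    return {clade: cov / total for clade, cov in coverage.items()}
```

Because the markers are unique to a clade, a read that hits a marker can be assigned without comparing it against whole genomes.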
PhyloSift/pplacer
• Reference database of marker genes
• Places reads on tree of life based on homology to
reference protein
• Integration with metAMOS for pre-assembling next-generation datasets
• Bacterial and Archaeal classification only
• Plant and fungal marker genes are being added
• Websites
– http://phylosift.wordpress.com/
– https://github.com/gjospin/PhyloSift
Real cost of Sequencing!!
Sboner, Genome Biology, 2011
Acknowledgements
Funding
Magdalen Lindeberg
Cornell University
Dave Schneider
USDA-ARS, Ithaca
Citrus greening / Wolbachia (wACP)
Thank you!
Surya Saha ss2489@cornell.edu
Suggestions
• Plan informatics workflow as early as possible
• Incorporate statistics at different stages in the workflow

More Related Content

What's hot

Metagenomics and it’s applications
Metagenomics and it’s applicationsMetagenomics and it’s applications
Metagenomics and it’s applicationsSham Sadiq
 
Introduction to Metagenomics. Applications, Approaches and Tools (Bioinformat...
Introduction to Metagenomics. Applications, Approaches and Tools (Bioinformat...Introduction to Metagenomics. Applications, Approaches and Tools (Bioinformat...
Introduction to Metagenomics. Applications, Approaches and Tools (Bioinformat...VHIR Vall d’Hebron Institut de Recerca
 
Metagenomic analysis
Metagenomic analysisMetagenomic analysis
Metagenomic analysisAnimesh Kumar
 
Introduction to Bioinformatics
Introduction to BioinformaticsIntroduction to Bioinformatics
Introduction to BioinformaticsDenis C. Bauer
 
Microarrays;application
Microarrays;applicationMicroarrays;application
Microarrays;applicationFyzah Bashir
 
Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...
Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...
Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...VHIR Vall d’Hebron Institut de Recerca
 
Single cell RNA sequencing; Methods and applications
Single cell RNA sequencing; Methods and applicationsSingle cell RNA sequencing; Methods and applications
Single cell RNA sequencing; Methods and applicationsfaraharooj
 

What's hot (20)

Metagenomics newer approach in understanding Microbes
Metagenomics newer approach in understanding Microbes  Metagenomics newer approach in understanding Microbes
Metagenomics newer approach in understanding Microbes
 
Metagenomics
MetagenomicsMetagenomics
Metagenomics
 
Metagenomics and it’s applications
Metagenomics and it’s applicationsMetagenomics and it’s applications
Metagenomics and it’s applications
 
Introduction to Metagenomics. Applications, Approaches and Tools (Bioinformat...
Introduction to Metagenomics. Applications, Approaches and Tools (Bioinformat...Introduction to Metagenomics. Applications, Approaches and Tools (Bioinformat...
Introduction to Metagenomics. Applications, Approaches and Tools (Bioinformat...
 
Metagenomic analysis
Metagenomic analysisMetagenomic analysis
Metagenomic analysis
 
Metagenomics
MetagenomicsMetagenomics
Metagenomics
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 
Jan2016 pac bio giab
Jan2016 pac bio giabJan2016 pac bio giab
Jan2016 pac bio giab
 
RNA-Seq
RNA-SeqRNA-Seq
RNA-Seq
 
Comparative genomics
Comparative genomicsComparative genomics
Comparative genomics
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 
Introduction to Bioinformatics
Introduction to BioinformaticsIntroduction to Bioinformatics
Introduction to Bioinformatics
 
Illumina Sequencing
Illumina SequencingIllumina Sequencing
Illumina Sequencing
 
Microarrays;application
Microarrays;applicationMicroarrays;application
Microarrays;application
 
Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...
Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...
Introduction to RNA-seq and RNA-seq Data Analysis (UEB-UAT Bioinformatics Cou...
 
BACTERIAL GENOME SEQUENCING PROJECT
BACTERIAL GENOME SEQUENCING PROJECTBACTERIAL GENOME SEQUENCING PROJECT
BACTERIAL GENOME SEQUENCING PROJECT
 
Genomic databases
Genomic databasesGenomic databases
Genomic databases
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 
Single cell RNA sequencing; Methods and applications
Single cell RNA sequencing; Methods and applicationsSingle cell RNA sequencing; Methods and applications
Single cell RNA sequencing; Methods and applications
 
Metagenomics
MetagenomicsMetagenomics
Metagenomics
 

Viewers also liked

QIAseq Technologies for Metagenomics and Microbiome NGS Library Prep
QIAseq Technologies for Metagenomics and Microbiome NGS Library PrepQIAseq Technologies for Metagenomics and Microbiome NGS Library Prep
QIAseq Technologies for Metagenomics and Microbiome NGS Library PrepQIAGEN
 
Next-generation sequencing from 2005 to 2020
Next-generation sequencing from 2005 to 2020Next-generation sequencing from 2005 to 2020
Next-generation sequencing from 2005 to 2020Christian Frech
 
Aug2013 Heidi Rehm integrating large scale sequencing into clinical practice
Aug2013 Heidi Rehm integrating large scale sequencing into clinical practiceAug2013 Heidi Rehm integrating large scale sequencing into clinical practice
Aug2013 Heidi Rehm integrating large scale sequencing into clinical practiceGenomeInABottle
 
Building bioinformatics resources for the global community
Building bioinformatics resources for the global communityBuilding bioinformatics resources for the global community
Building bioinformatics resources for the global communityExternalEvents
 
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...nist-spin
 
Plant genome sequencing and crop improvement
Plant genome sequencing and crop improvementPlant genome sequencing and crop improvement
Plant genome sequencing and crop improvementRagavendran Abbai
 
Errors and Limitaions of Next Generation Sequencing
Errors and Limitaions of Next Generation SequencingErrors and Limitaions of Next Generation Sequencing
Errors and Limitaions of Next Generation SequencingNixon Mendez
 
Bioinformática Introdução (Basic NGS)
Bioinformática Introdução (Basic NGS)Bioinformática Introdução (Basic NGS)
Bioinformática Introdução (Basic NGS)Renato Puga
 
Speeding up sequencing: Sequencing in an hour enables sample to answer in a w...
Speeding up sequencing: Sequencing in an hour enables sample to answer in a w...Speeding up sequencing: Sequencing in an hour enables sample to answer in a w...
Speeding up sequencing: Sequencing in an hour enables sample to answer in a w...Thermo Fisher Scientific
 
transforming clinical microbiology by next generation sequencing
transforming clinical microbiology by next generation sequencingtransforming clinical microbiology by next generation sequencing
transforming clinical microbiology by next generation sequencingPathKind Labs
 
NGS overview
NGS overviewNGS overview
NGS overviewAllSeq
 
NGS Applications I (UEB-UAT Bioinformatics Course - Session 2.1.2 - VHIR, Bar...
NGS Applications I (UEB-UAT Bioinformatics Course - Session 2.1.2 - VHIR, Bar...NGS Applications I (UEB-UAT Bioinformatics Course - Session 2.1.2 - VHIR, Bar...
NGS Applications I (UEB-UAT Bioinformatics Course - Session 2.1.2 - VHIR, Bar...VHIR Vall d’Hebron Institut de Recerca
 
NGx Sequencing 101-platforms
NGx Sequencing 101-platformsNGx Sequencing 101-platforms
NGx Sequencing 101-platformsAllSeq
 
Case studies of HTS / NGS applications
Case studies of HTS / NGS applicationsCase studies of HTS / NGS applications
Case studies of HTS / NGS applicationsrjorton
 

Viewers also liked (20)

Introduction to Metagenomics Data Analysis - UEB-VHIR - 2013
Introduction to Metagenomics Data Analysis - UEB-VHIR - 2013Introduction to Metagenomics Data Analysis - UEB-VHIR - 2013
Introduction to Metagenomics Data Analysis - UEB-VHIR - 2013
 
QIAseq Technologies for Metagenomics and Microbiome NGS Library Prep
QIAseq Technologies for Metagenomics and Microbiome NGS Library PrepQIAseq Technologies for Metagenomics and Microbiome NGS Library Prep
QIAseq Technologies for Metagenomics and Microbiome NGS Library Prep
 
Next-generation sequencing from 2005 to 2020
Next-generation sequencing from 2005 to 2020Next-generation sequencing from 2005 to 2020
Next-generation sequencing from 2005 to 2020
 
Introduction to next generation sequencing
Introduction to next generation sequencingIntroduction to next generation sequencing
Introduction to next generation sequencing
 
Ngs intro_v6_public
 Ngs intro_v6_public Ngs intro_v6_public
Ngs intro_v6_public
 
Aug2013 Heidi Rehm integrating large scale sequencing into clinical practice
Aug2013 Heidi Rehm integrating large scale sequencing into clinical practiceAug2013 Heidi Rehm integrating large scale sequencing into clinical practice
Aug2013 Heidi Rehm integrating large scale sequencing into clinical practice
 
Building bioinformatics resources for the global community
Building bioinformatics resources for the global communityBuilding bioinformatics resources for the global community
Building bioinformatics resources for the global community
 
EU PathoNGenTraceConsortium:cgMLST Evolvement and Challenges for Harmonization
EU PathoNGenTraceConsortium:cgMLST Evolvement and Challenges for HarmonizationEU PathoNGenTraceConsortium:cgMLST Evolvement and Challenges for Harmonization
EU PathoNGenTraceConsortium:cgMLST Evolvement and Challenges for Harmonization
 
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...
Next Generation Sequencing for Identification and Subtyping of Foodborne Pat...
 
Plant genome sequencing and crop improvement
Plant genome sequencing and crop improvementPlant genome sequencing and crop improvement
Plant genome sequencing and crop improvement
 
Errors and Limitaions of Next Generation Sequencing
Errors and Limitaions of Next Generation SequencingErrors and Limitaions of Next Generation Sequencing
Errors and Limitaions of Next Generation Sequencing
 
Bioinformática Introdução (Basic NGS)
Bioinformática Introdução (Basic NGS)Bioinformática Introdução (Basic NGS)
Bioinformática Introdução (Basic NGS)
 
Rossen eccmid2015v1.5
Rossen eccmid2015v1.5Rossen eccmid2015v1.5
Rossen eccmid2015v1.5
 
Speeding up sequencing: Sequencing in an hour enables sample to answer in a w...
Speeding up sequencing: Sequencing in an hour enables sample to answer in a w...Speeding up sequencing: Sequencing in an hour enables sample to answer in a w...
Speeding up sequencing: Sequencing in an hour enables sample to answer in a w...
 
transforming clinical microbiology by next generation sequencing
transforming clinical microbiology by next generation sequencingtransforming clinical microbiology by next generation sequencing
transforming clinical microbiology by next generation sequencing
 
Ngs presentation
Ngs presentationNgs presentation
Ngs presentation
 
NGS overview
NGS overviewNGS overview
NGS overview
 
NGS Applications I (UEB-UAT Bioinformatics Course - Session 2.1.2 - VHIR, Bar...
NGS Applications I (UEB-UAT Bioinformatics Course - Session 2.1.2 - VHIR, Bar...NGS Applications I (UEB-UAT Bioinformatics Course - Session 2.1.2 - VHIR, Bar...
NGS Applications I (UEB-UAT Bioinformatics Course - Session 2.1.2 - VHIR, Bar...
 
NGx Sequencing 101-platforms
NGx Sequencing 101-platformsNGx Sequencing 101-platforms
NGx Sequencing 101-platforms
 
Case studies of HTS / NGS applications
Case studies of HTS / NGS applicationsCase studies of HTS / NGS applications
Case studies of HTS / NGS applications
 

Similar to Tools for Metagenomics with 16S/ITS and Whole Genome Shotgun Sequences

Data sharing - Data management - The SysMO-SEEK Story
Data sharing - Data management - The SysMO-SEEK StoryData sharing - Data management - The SysMO-SEEK Story
Data sharing - Data management - The SysMO-SEEK StoryResearch Information Network
 
Data management, data sharing: the SysMO-SEEK Story
Data management, data sharing: the SysMO-SEEK StoryData management, data sharing: the SysMO-SEEK Story
Data management, data sharing: the SysMO-SEEK StoryCarole Goble
 
Overview of the commonly used sequencing platforms, bioinformatic search tool...
Overview of the commonly used sequencing platforms, bioinformatic search tool...Overview of the commonly used sequencing platforms, bioinformatic search tool...
Overview of the commonly used sequencing platforms, bioinformatic search tool...OECD Environment
 
GLBIO/CCBC Metagenomics Workshop
GLBIO/CCBC Metagenomics WorkshopGLBIO/CCBC Metagenomics Workshop
GLBIO/CCBC Metagenomics WorkshopMorgan Langille
 
Giab for jax long read 190917
Giab for jax long read 190917Giab for jax long read 190917
Giab for jax long read 190917GenomeInABottle
 
Designing a community resource - Sandra Orchard
Designing a community resource - Sandra OrchardDesigning a community resource - Sandra Orchard
Designing a community resource - Sandra OrchardEMBL-ABR
 
Genome in a bottle for ashg grc giab workshop 181016
Genome in a bottle for ashg grc giab workshop 181016Genome in a bottle for ashg grc giab workshop 181016
Genome in a bottle for ashg grc giab workshop 181016GenomeInABottle
 
ECCMID 2015 Meet-The-Expert: Bioinformatics Tools
ECCMID 2015 Meet-The-Expert: Bioinformatics ToolsECCMID 2015 Meet-The-Expert: Bioinformatics Tools
ECCMID 2015 Meet-The-Expert: Bioinformatics ToolsNick Loman
 
GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511
GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511
GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511GenomeInABottle
 
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...GenomeInABottle
 
Talk by J. Eisen for NZ Computational Genomics meeting
Talk by J. Eisen for NZ Computational Genomics meetingTalk by J. Eisen for NZ Computational Genomics meeting
Talk by J. Eisen for NZ Computational Genomics meetingJonathan Eisen
 
Whole exome sequencing data analysis.pptx
Whole exome sequencing data analysis.pptxWhole exome sequencing data analysis.pptx
Whole exome sequencing data analysis.pptxHaibo Liu
 
Bioinformatics_1_ChenS.pptx
Bioinformatics_1_ChenS.pptxBioinformatics_1_ChenS.pptx
Bioinformatics_1_ChenS.pptxxRowlet
 
Genomic Big Data Management, Integration and Mining - Emanuel Weitschek
Genomic Big Data Management, Integration and Mining - Emanuel WeitschekGenomic Big Data Management, Integration and Mining - Emanuel Weitschek
Genomic Big Data Management, Integration and Mining - Emanuel WeitschekData Driven Innovation
 
16S MVRSION at Washington University
16S MVRSION at Washington University16S MVRSION at Washington University
16S MVRSION at Washington UniversitySeth Crosby
 
Phylogeny-driven approaches to microbial & microbiome studies: talk by Jonath...
Phylogeny-driven approaches to microbial & microbiome studies: talk by Jonath...Phylogeny-driven approaches to microbial & microbiome studies: talk by Jonath...
Phylogeny-driven approaches to microbial & microbiome studies: talk by Jonath...Jonathan Eisen
 
ICABR presentation falck zepeda et al june 2016 abrev
ICABR presentation falck zepeda et al june 2016 abrevICABR presentation falck zepeda et al june 2016 abrev
ICABR presentation falck zepeda et al june 2016 abrevjfalck
 
High Throughput Sequencing Technologies: What We Can Know
High Throughput Sequencing Technologies: What We Can KnowHigh Throughput Sequencing Technologies: What We Can Know
High Throughput Sequencing Technologies: What We Can KnowBrian Krueger
 
Using VarSeq to Improve Variant Analysis Research Workflows
Using VarSeq to Improve Variant Analysis Research WorkflowsUsing VarSeq to Improve Variant Analysis Research Workflows
Using VarSeq to Improve Variant Analysis Research WorkflowsDelaina Hawkins
 

Similar to Tools for Metagenomics with 16S/ITS and Whole Genome Shotgun Sequences (20)

Data sharing - Data management - The SysMO-SEEK Story
Data sharing - Data management - The SysMO-SEEK StoryData sharing - Data management - The SysMO-SEEK Story
Data sharing - Data management - The SysMO-SEEK Story
 
Data management, data sharing: the SysMO-SEEK Story
Data management, data sharing: the SysMO-SEEK StoryData management, data sharing: the SysMO-SEEK Story
Data management, data sharing: the SysMO-SEEK Story
 
Overview of the commonly used sequencing platforms, bioinformatic search tool...
Overview of the commonly used sequencing platforms, bioinformatic search tool...Overview of the commonly used sequencing platforms, bioinformatic search tool...
Overview of the commonly used sequencing platforms, bioinformatic search tool...
 
GLBIO/CCBC Metagenomics Workshop
GLBIO/CCBC Metagenomics WorkshopGLBIO/CCBC Metagenomics Workshop
GLBIO/CCBC Metagenomics Workshop
 
Variant analysis and whole exome sequencing
Variant analysis and whole exome sequencingVariant analysis and whole exome sequencing
Variant analysis and whole exome sequencing
 
Giab for jax long read 190917
Giab for jax long read 190917Giab for jax long read 190917
Giab for jax long read 190917
 
Designing a community resource - Sandra Orchard
Designing a community resource - Sandra OrchardDesigning a community resource - Sandra Orchard
Designing a community resource - Sandra Orchard
 
Genome in a bottle for ashg grc giab workshop 181016
Genome in a bottle for ashg grc giab workshop 181016Genome in a bottle for ashg grc giab workshop 181016
Genome in a bottle for ashg grc giab workshop 181016
 
ECCMID 2015 Meet-The-Expert: Bioinformatics Tools
ECCMID 2015 Meet-The-Expert: Bioinformatics ToolsECCMID 2015 Meet-The-Expert: Bioinformatics Tools
ECCMID 2015 Meet-The-Expert: Bioinformatics Tools
 
GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511
GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511
GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511
 
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
Genome in a Bottle - Towards new benchmarks for the “dark matter” of the huma...
 
Talk by J. Eisen for NZ Computational Genomics meeting
Talk by J. Eisen for NZ Computational Genomics meetingTalk by J. Eisen for NZ Computational Genomics meeting
Talk by J. Eisen for NZ Computational Genomics meeting
 
Whole exome sequencing data analysis.pptx
Whole exome sequencing data analysis.pptxWhole exome sequencing data analysis.pptx
Whole exome sequencing data analysis.pptx
 
Bioinformatics_1_ChenS.pptx
Bioinformatics_1_ChenS.pptxBioinformatics_1_ChenS.pptx
Bioinformatics_1_ChenS.pptx
 
Genomic Big Data Management, Integration and Mining - Emanuel Weitschek
Genomic Big Data Management, Integration and Mining - Emanuel WeitschekGenomic Big Data Management, Integration and Mining - Emanuel Weitschek
Genomic Big Data Management, Integration and Mining - Emanuel Weitschek
 
16S MVRSION at Washington University
16S MVRSION at Washington University16S MVRSION at Washington University
16S MVRSION at Washington University
 
Phylogeny-driven approaches to microbial & microbiome studies: talk by Jonath...
Phylogeny-driven approaches to microbial & microbiome studies: talk by Jonath...Phylogeny-driven approaches to microbial & microbiome studies: talk by Jonath...
Phylogeny-driven approaches to microbial & microbiome studies: talk by Jonath...
 
ICABR presentation falck zepeda et al june 2016 abrev
ICABR presentation falck zepeda et al june 2016 abrevICABR presentation falck zepeda et al june 2016 abrev
ICABR presentation falck zepeda et al june 2016 abrev
 
High Throughput Sequencing Technologies: What We Can Know
High Throughput Sequencing Technologies: What We Can KnowHigh Throughput Sequencing Technologies: What We Can Know
High Throughput Sequencing Technologies: What We Can Know
 
Using VarSeq to Improve Variant Analysis Research Workflows
Using VarSeq to Improve Variant Analysis Research WorkflowsUsing VarSeq to Improve Variant Analysis Research Workflows
Using VarSeq to Improve Variant Analysis Research Workflows
 

More from Surya Saha

An open access resource portal for arthropod vectors and agricultural pathosy...
An open access resource portal for arthropod vectors and agricultural pathosy...An open access resource portal for arthropod vectors and agricultural pathosy...
An open access resource portal for arthropod vectors and agricultural pathosy...Surya Saha
 
Functional annotation of invertebrate genomes
Functional annotation of invertebrate genomesFunctional annotation of invertebrate genomes
Functional annotation of invertebrate genomesSurya Saha
 
Saha UC Davis Plant Pathology seminar Infrastructure for battling the Citrus ...
Saha UC Davis Plant Pathology seminar Infrastructure for battling the Citrus ...Saha UC Davis Plant Pathology seminar Infrastructure for battling the Citrus ...
Saha UC Davis Plant Pathology seminar Infrastructure for battling the Citrus ...Surya Saha
 
Updates on Citrusgreening.org database from USDA NIFA project meeting
Updates on Citrusgreening.org database from USDA NIFA project meetingUpdates on Citrusgreening.org database from USDA NIFA project meeting
Updates on Citrusgreening.org database from USDA NIFA project meetingSurya Saha
 
Updates on the ACP v3 genome and annotation from USDA NIFA project meeting
Updates on the ACP v3 genome and annotation from USDA NIFA project meetingUpdates on the ACP v3 genome and annotation from USDA NIFA project meeting
Updates on the ACP v3 genome and annotation from USDA NIFA project meetingSurya Saha
 
AgriVectors: A Data and Systems Resource for Arthropod Vectors of Plant Diseases
AgriVectors: A Data and Systems Resource for Arthropod Vectors of Plant DiseasesAgriVectors: A Data and Systems Resource for Arthropod Vectors of Plant Diseases
AgriVectors: A Data and Systems Resource for Arthropod Vectors of Plant DiseasesSurya Saha
 
Visualization of insect vector-plant pathogen interactions in the citrus gree...
Visualization of insect vector-plant pathogen interactions in the citrus gree...Visualization of insect vector-plant pathogen interactions in the citrus gree...
Visualization of insect vector-plant pathogen interactions in the citrus gree...Surya Saha
 
Deciphering the genome of Diaphorina citri to develop solutions for the citru...
Tools for Metagenomics with 16S/ITS and Whole Genome Shotgun Sequences

  • 1. Computational Tools for Metagenomics Surya Saha Twitter: @SahaSurya / LinkedIn: www.linkedin.com/in/suryasaha/ Magdalen Lindeberg Plant Pathology & Plant-Microbe Biology Microbial Friends & Foes, Sep 25, 2012
  • 2. Temperton, Current Opinion in Microbiology, 2012 Impact of Technology on Metagenomics
  • 3. Types of “Meta” genomics
      • 16S rRNA survey of bacterial microbiome
      • ITS survey of fungal microbiome
      Bellemain, BMC Microbiology 2010 / Slide: Julien Tremblay, JGI
  • 4. Types of “Meta” genomics: Whole genome shotgun
      • Varying complexity of microbial communities
      • High coverage sequencing
      • Sophisticated informatics
      • Host-associated metagenomes
          – Deep sequencing of host meta-genome
          – Bioinformatic screening of host sequences
      • Environmental metagenomes
          – e.g. soil samples
          – Requires very high depth of coverage
          – Complicated to assemble
  • 7. Big picture!! What users see What users want!!
  • 8. 16S/ITS community surveys
      • Multiple target regions in 16S gene and ITS region
      • Comparison of results requires amplification of the same region
      • Advantages
          – Fast survey of large communities
          – Mature set of tools and statistics for analysis
          – Good for first-round survey
      • 454 16S tags or pyrotags (~700 bp) have been the preferred method
      • Illumina MiSeq runs (2x150 bp, 2x250 bp) are the next workhorses
      • Depth of sampling
          – 2-6000 reads/sample for simple communities
          – 20000 reads/sample for complex soil metagenomes
  • 9. 16S/ITS issues
      • Lack of tools for processing ITS/fungal microbiome data sets
          – RDP classifier targets only ITS
          – No ITS reconstruction tools
      • Amplification bias affects accuracy and replication
      • Use of short reads prevents disambiguation of similar strains
      • 16S or ITS may not differentiate between similar strains
          – Clustering is done at 97%
          – Regions may be >99% similar
      • Sequencing error inflates number of OTUs
      • Chloroplast 16S sequences can get amplified in plant metagenomes
  • 10. 16S/ITS sequence processing workflow
      • Filter for contaminants and low quality reads
      • Assemble overlapping reads
      • Reduce datasets (clustering)
      • Perform taxonomic classification and compute diversity metrics
  • 11. 16S/ITS sequence processing workflow: filter for contaminants and low quality reads
      • Quality plots and read trimming
          – FastQC http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
          – FASTX http://hannonlab.cshl.edu/fastx_toolkit/
      • Chimera removal
          – AmpliconNoise http://code.google.com/p/ampliconnoise/
          – UCHIME http://www.drive5.com/uchime/
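The read-trimming step on this slide can be illustrated with a minimal sketch. It assumes Phred+33 encoded FASTQ quality strings and trims low-quality bases from the 3' end only; the function names and the thresholds (Q20, 50 bp) are illustrative assumptions, not the behaviour of FastQC or the FASTX toolkit.

```python
def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_line]


def trim_3prime(sequence, quality_line, min_quality=20, offset=33):
    """Remove trailing bases whose quality is below min_quality."""
    scores = phred_scores(quality_line, offset)
    end = len(scores)
    # Walk back from the 3' end until a base meets the threshold.
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    return sequence[:end], quality_line[:end]


def filter_reads(records, min_quality=20, min_length=50):
    """Trim each (name, sequence, quality) record; drop reads left too short."""
    kept = []
    for name, seq, qual in records:
        seq, qual = trim_3prime(seq, qual, min_quality)
        if len(seq) >= min_length:
            kept.append((name, seq, qual))
    return kept
```

Real pipelines usually add sliding-window trimming, adapter removal and chimera checks on top of this; the sketch only shows the basic quality cut.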
  • 12. Impact of Sequence Length Slide: Feng Chen, JGI
  • 13. 16S/ITS sequence processing workflow: assemble overlapping reads
      • Merge overlapping paired end reads
          – FLASH http://www.genomics.jhu.edu/software/FLASH/index.shtml
          – FastqJoin http://code.google.com/p/ea-utils/wiki/FastqJoin
          – CD-HIT read-linker http://weizhong-lab.ucsd.edu/cd-hit/wiki/doku.php?id=cd-hit-auxtools-manual
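The idea behind merging overlapping paired-end reads can be sketched as follows. This is a simplified illustration, not the scoring scheme used by FLASH, FastqJoin or CD-HIT: it reverse-complements the second read, tries overlaps from longest to shortest, and accepts the first one within a mismatch budget. The function names and the default parameters are assumptions made for the example.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(forward, reverse, min_overlap=10, max_mismatch_rate=0.1):
    """Merge a read pair into one fragment, or return None if no overlap is acceptable.

    The reverse read is reverse-complemented so both reads are on the same
    strand; the 3' end of the forward read is then compared with the 5' end
    of the reoriented reverse read.
    """
    rc = reverse_complement(reverse)
    longest = min(len(forward), len(rc))
    for overlap in range(longest, min_overlap - 1, -1):
        tail = forward[-overlap:]
        head = rc[:overlap]
        mismatches = sum(1 for a, b in zip(tail, head) if a != b)
        if mismatches <= max_mismatch_rate * overlap:
            # Keep the forward read and append the non-overlapping remainder.
            return forward + rc[overlap:]
    return None
```

Production tools also use base qualities to resolve mismatches inside the overlap, which this sketch ignores.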
  • 14. 16S/ITS sequence processing workflow: reduce datasets (clustering)
      • Clustering with high stringency
          – UCLUST/USEARCH (16S only) http://www.drive5.com/usearch/
          – CD-HIT-OTU (16S only) http://weizhong-lab.ucsd.edu/cd-hit-otu/
          – phylOTU (16S only) https://github.com/sharpton/PhylOTU
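Clustering at a 97% identity threshold can be sketched with a greedy, centroid-based pass. This is a toy illustration under strong assumptions: the reads are already aligned to the same length, identity is a simple per-position match fraction, and the most abundant sequences are used as centroids first. It is not the UCLUST, CD-HIT-OTU or phylOTU algorithm, and the function names are made up for the example.

```python
from collections import Counter


def identity(a, b):
    """Fraction of matching positions between two equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def greedy_otu_clusters(reads, threshold=0.97):
    """Return {centroid sequence: read count} using greedy clustering.

    Unique sequences are visited from most to least abundant. Each one joins
    the first existing centroid it matches at >= threshold, otherwise it
    becomes the centroid of a new OTU.
    """
    counts = Counter(reads)
    centroids = []
    clusters = {}
    for seq, n in counts.most_common():
        for centroid in centroids:
            if identity(seq, centroid) >= threshold:
                clusters[centroid] += n
                break
        else:
            centroids.append(seq)
            clusters[seq] = n
    return clusters
```

With a 100 bp amplicon, a variant with two substitutions (98% identity) collapses into the same OTU while one with five substitutions (95%) starts a new OTU, which is how sequencing errors above the threshold can inflate OTU counts.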
  • 15. 16S/ITS sequence processing workflow: perform taxonomic classification and compute diversity metrics
      • Composition based classifiers
          – RDP database + classifier http://rdp.cme.msu.edu/classifier/classifier.jsp
      • Homology based classifiers
          – ARB + Silva database (16S only) http://www.arb-home.de/
          – GreenGenes database (16S only) http://greengenes.lbl.gov/cgi-bin/nph-index.cgi
          – UNITE database (ITS only) http://unite.ut.ee/
          – FungalITSPipeline (ITS only) http://www.emerencia.org/fungalitspipeline.html
  • 16. QIIME
      • http://www.qiime.org/
      • Comprehensive suite of tools
          – OTU picking
          – Taxonomic classification
          – Construction of phylogenetic trees
          – Visualization
          – Compute diversity statistics
      • Available as Amazon EC2 image
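As an illustration of the diversity statistics mentioned above, the sketch below computes three common alpha-diversity measures from a list of per-OTU read counts: the Shannon index (natural log), the Simpson index in its 1 - sum(p^2) form, and the bias-corrected Chao1 richness estimate. The function names are assumptions for this example and do not correspond to QIIME commands.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over OTUs with non-zero counts."""
    total = sum(counts)
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


def simpson_index(counts):
    """Simpson diversity 1 - sum(p^2); higher values mean a more even community."""
    total = sum(counts)
    return 1.0 - sum((n / total) ** 2 for n in counts)


def chao1(counts):
    """Bias-corrected Chao1 estimate: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of OTUs observed exactly once and twice.
    """
    observed = sum(1 for n in counts if n > 0)
    singletons = sum(1 for n in counts if n == 1)
    doubletons = sum(1 for n in counts if n == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))
```

Because these values depend on sequencing depth, samples are usually rarefied to a common number of reads before they are compared.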
  • 17. Whole Genome Shotgun (WGS) Metagenomics
      • Better classification with increasing number of complete genomes
      • Focus on whole genome based phylogeny (whole genome phylotyping)
      • Advantages
          – No amplification bias like in 16S/ITS
      • Issues
          – Poor sampling of fungal diversity
          – Assembly of metagenomes is complicated due to uneven coverage
          – Requires high depth of coverage
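A back-of-the-envelope calculation shows why depth and uneven coverage matter for WGS metagenomes. The sketch assumes that reads are sampled in proportion to each genome's share of the DNA in the sample, so mean depth for one genome is (reads from that genome x read length) / genome size. The function names and the numbers in the usage note are illustrative assumptions, not values from the slides.

```python
import math


def expected_coverage(total_reads, read_length, genome_size, read_fraction):
    """Mean depth for one genome given the fraction of reads that come from it."""
    reads_from_genome = total_reads * read_fraction
    return reads_from_genome * read_length / genome_size


def reads_needed(target_coverage, read_length, genome_size, read_fraction):
    """Total reads required so that one genome reaches target_coverage on average."""
    reads_from_genome = target_coverage * genome_size / read_length
    return math.ceil(reads_from_genome / read_fraction)
```

For example, under these assumptions 10 million 100 bp reads give about 2x depth for a 5 Mb genome that contributes 1% of the reads, while a genome contributing 50% of the reads in the same run would sit at about 100x. That spread is the uneven coverage that assemblers for metagenomes have to handle.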
  • 18. WGS sequence processing workflow
      • Filter for low quality reads
      • Assemble reads
      • Perform taxonomic classification and compute diversity metrics
  • 19. WGS sequence processing workflow: filter for low quality reads
      • Quality plots and read trimming
          – FastQC http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
          – FASTX http://hannonlab.cshl.edu/fastx_toolkit/
  • 20. WGS sequence processing workflow: assemble reads
      • NGS assembly with uneven depth
          – IDBA-UD http://i.cs.hku.hk/~alse/hkubrg/projects/idba_ud/
          – MIRA http://www.chevreux.org/projects_mira.html
          – Velvet http://www.ebi.ac.uk/~zerbino/velvet/
          – MetaVelvet http://metavelvet.dna.bio.keio.ac.jp/
  • 21. WGS sequence processing workflow: perform taxonomic classification and compute diversity metrics
      • Hybrid composition/homology based classifiers
          – FCP http://kiwi.cs.dal.ca/Software/FCP
          – Phymm/PhymmBL http://www.cbcb.umd.edu/software/phymm/
          – AMPHORA2 http://wolbachia.biology.virginia.edu/WuLab/Software.html
          – NBC http://nbc.ece.drexel.edu/
          – MEGAN http://ab.inf.uni-tuebingen.de/software/megan/
  • 22. WGS sequence processing workflow: perform taxonomic classification and compute diversity metrics
      • Web based classifiers
          – MG-RAST http://metagenomics.anl.gov/
          – CAMERA http://camera.calit2.net/
          – IMG/M http://img.jgi.doe.gov/cgi-bin/m/main.cgi
  • 23. MetaPhlAn
      • Unique clade-specific markers for sequenced bacteria and archaea
      • 400 genera / 4000 genomes, including HMP genomes
      • Species-level resolution
      • MetaPhlAn 2 in the works
          – Eukaryotes including fungi
          – Viruses
          – Higher coverage of archaea
      • Krona and GraPhlAn for visualization of output
      • Websites
          – https://bitbucket.org/nsegata/metaphlan
          – http://huttenhower.sph.harvard.edu/metaphlan
  • 24. PhyloSift/pplacer
      • Reference database of marker genes
      • Places reads on tree of life based on homology to reference protein
      • Integration with metAMOS for pre-assembling next-generation datasets
      • Bacterial and archaeal classification only
      • Plant and fungi marker genes are being added
      • Websites
          – http://phylosift.wordpress.com/
          – https://github.com/gjospin/PhyloSift
  • 25. Real cost of Sequencing!! Sboner, Genome Biology, 2011
  • 26. Acknowledgements Funding Magdalen Lindeberg Cornell University Dave Schneider USDA-ARS, Ithaca Citrus greening / Wolbachia (wACP)
  • 27. Thank you!
      Surya Saha, ss2489@cornell.edu
      • Suggestions
          – Plan informatics workflow as early as possible
          – Incorporate statistics at different stages in the workflow