This document outlines plans for multi-site sequencing studies to generate standardized human and bacterial genome sequencing datasets. Samples include a human trio, bacterial isolates, and mixtures, which will be sequenced in triplicate across three sites on various platforms including Illumina HiSeq X Ten, HiSeq 4000, HiSeq 2500, NextSeq 500, Life Tech Ion Proton, Ion S5, Pacific Biosciences, Oxford Nanopore, and others. The goals are to measure intra- and inter-lab variation, sequencing performance at GC extremes, and establish molecular standards for assessing sequencing methods in DNA, RNA, and metagenomics. Data will be analyzed by a team to benchmark tools and published by October 2017.
Cross-Kingdom Standards in Genomics, Epigenomics and Metagenomics
1. Cross-Kingdom Standards in Genomics,
Epigenomics and Metagenomics
(for this world and maybe others)
Christopher E. Mason, Ph.D.
Associate Professor
Department of Physiology and Biophysics &
The Institute for Computational Biomedicine (ICB),
Meyer Cancer Center, Feil Family Brain and Mind Research Institute,
at Weill Cornell Medicine,
Fellow of the Information Society Project, Yale Law School
June 29th, 2017
8. Molecular Standards for Assessing Library, Sequencing, and Analysis Methods in DNA, RNA, and metagenomics
Acronym | Group | Type | Agency/Group | Web site(s) for Consortiums, Data Sets, Methods, and/or Materials
genome/epigenome
GIAB | Genome in a Bottle | DNA and cells | NIST | https://sites.stanford.edu/abms/giab
Nex-StoCT | Next-generation Sequencing: Standardization of Clinical Testing (Nex-StoCT) II | DNA | CDC | http://www.cdc.gov/ophss/csels/dlpss/Genetic_Testing_Quality_Practices/ngsqp.html
GeT-RM | Genetic Testing Reference Materials Coordination Program | DNA | CDC | http://wwwn.cdc.gov/clia/Resources/GetRM/default.aspx
RSBP | Registry of Standard Biological Parts | DNA | iGEM | http://parts.igem.org/Main_Page
transcriptome/epitranscriptome
MAQC / MAQC2 | Microarray Quality Control Consortium | RNA | FDA | http://www.fda.gov/ScienceResearch/BioinformaticsTools/MicroarrayQualityControlProject/ ; http://www.nature.com/nbt/focus/maqc/index.html
SEQC / SEQC2 | Sequencing Quality Control Consortium | DNA, RNA | FDA | http://www.fda.gov/ScienceResearch/BioinformaticsTools/MicroarrayQualityControlProject/ ; http://www.nature.com/nbt/collections/seqc/index.html
ABRF-NGS | Association of Biomolecular Resource Facilities (ABRF) Next-generation Sequencing | RNA | ABRF | http://www.abrf.org/index.cfm/group.show/NextGenerationSequencing%28NGS%29.75.htm ; http://www.biotech.cornell.edu/news/abrf-next-generation-sequencing-study-webinar
GEUVADIS | Genetic European Variation in Health and Disease | RNA | EU | http://www.geuvadis.org
ERCC | External RNA Control Consortium | RNA | NIST | http://www.nist.gov/mml/bbd/ercc.cfm ; https://www.lifetechnologies.com/order/catalog/product/4456740
ERCC2 | External RNA Control Consortium 2 | RNA | NIST | http://www.nist.gov/mml/bbd/ercc2.cfm
SIRV | Spike-In RNA Variant Mixes | RNA | Lexogen | https://www.lexogen.com/sirvsrelease/
metagenome/metatranscriptome
MBQC | Microbiome Quality Control Consortium | meta | MBQC | www.mbqc.org
IMMSA | International Metagenomics and Microbiome Standards Consortium | meta | NIST | http://www.nist.gov/mml/bbd/microbial_metrology/immsa-mission-statement.cfm
IHMS | International Human Microbiome Standards | meta | meta | www.microbiome-standards.org/
BiOMICs | Bio-OMICS mixed kingdom DNA standard | meta and cells | Zymo | http://www.zymobiomics.com/
ATCC | International Metagenomics and Microbiome Standards Consortium | meta | ATCC | http://www.atcc.org/products/all/CCL-186.aspx
BEI | International Human Microbiome Standards | meta | NIAID | https://www.beiresources.org/Catalog/otherProducts/HM-782D.aspx
EMP | Earth Microbiome Project | meta | EMP | http://earthmicrobiome.org/
XMP | eXtreme Microbiome Project | meta | XMP | http://extrememicrobiome.org/
MGRG | Metagenomics Research Group | meta | ABRF | http://blog.abrf.org/
MetaSUB | International Metagenomics and Metadesign of Subways and Urban Biomes | meta | ABRF | http://www.metasub.org
10. Testing and benchmarking for RNA standards
(FDA’s SEQC and ABRF-NGS study)
RNA-seq Standards
Li, Tighe et al., Nature Biotechnology, Sept. 2014
SEQC Consortium, Nature Biotechnology, Sept. 2014
Li, Łabaj, Zumbo, et al., Nature Biotechnology, Sept. 2014
http://www.nature.com/nbt/collections/seqc/index.html
11. Even with >12 billion reads, more genes
appear and are annotation/tool dependent.
http://www.nature.com/nbt/focus/seqc/index.html
12. Phase 2 DNA Samples: human
NIST Reference Human Genomes (Personal Genome Project)
Samples: A (maternal), B (paternal), C (son), C2 (son, Coriell), C2f
Library preps: Reference DNA, TruSeq PCR-free 350; FFPE, TruSeq Nano; FFPE, TruSeq PCR-free
13. Phase 2 DNA Samples: bacterial
Reference bacterial genomes: Ste, Eco, Pflu, pool
%GC: 28 (Ste), 47 (Eco), 72 (Pflu)
Library prep: TruSeq PCR-free 550
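The bacterial references are chosen to span GC composition (28%, 47%, and 72% on the slide). As a minimal sketch of how that quantity is computed for any sequence, the helper below counts G and C among unambiguous bases. The function name `gc_percent` and the choice to ignore ambiguous bases such as N are illustrative assumptions, not part of the study protocol.

```python
def gc_percent(sequence: str) -> float:
    """Return the GC content of a DNA sequence as a percentage.

    Assumption (not from the slides): only A, C, G, T are counted;
    ambiguous bases such as N are ignored in the denominator.
    """
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


if __name__ == "__main__":
    # Toy sequences only; real genomes would be read from FASTA files.
    for name, seq in [("low_gc", "ATATATGC"), ("high_gc", "GCGCGCAT")]:
        print(name, gc_percent(seq))
```

Applied per read or per genomic window, the same calculation lets coverage be compared across the low-GC and high-GC references.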
15. Lab 1
Compare NIST and Coriell stock cell culture genomes
Evaluate Coriell cell culture as an FFPE reference material
HiSeq X Ten, 2x150
1 flow cell
7 libraries
Library kits: TruSeq PCR Free and TruSeq Nano, 350 bp inserts
16. Generate standardized human genome sequencing datasets
Measure intra- and inter-lab variation
Measure sequencing performance at GC composition extremes
MiSeq v3, 2x300: Lab 1, Lab 2, Lab 3; 3 flow cells; 36 libraries
HiSeq 2500 v3 Rapid Run, 2x250: Lab 1, Lab 2, Lab 3; 6 flow cells; 45 libraries
Library kit: TruSeq PCR Free, 550 bp inserts for bacteria, 350 bp for sample C
17. Life Technologies
Samples
NIST Reference Human Genomes (Personal Genome Project): A (maternal), B (paternal), C (son), C2
Reference bacterial genomes: Ste, Hah, Mil, pool (%GC: 28, 47, 72)
Library preps: Reference DNA, AmpliSeq Exome; Ion Xpress Plus Fragment Library
19. Pacific Biosciences
Measure sequencing performance at GC composition extremes
Measure intra- and inter-lab variation
Platforms: RS II, Sequel
Sites: Lab 1, Lab 2, Lab 3
21. Samples and Platforms – All tested in triplicate across three distinct sites
Platform | Human DNA | Bacterial DNA
Illumina HiSeq X Ten | A, B, C, C2, C2f | -
Illumina HiSeq 4000 | A, B, C | -
Illumina HiSeq 2500 v4 1T | A, B, C | -
Illumina HiSeq 2500 v3 Rapid Run | C | Ste, Eco, Mil, P
Illumina NextSeq 500 High Output | C | -
Illumina MiSeq | - | Ste, Eco, Mil, P
Life Tech Proton | A, B, C exomes | Ste, Eco, Mil, P
Life Tech S5 | A, B, C exomes | Ste, Eco, Mil, P
Life Tech PGM | - | Ste, Eco, Mil, P
Pacific Biosciences | - | Ste, Eco, Mil, P
Oxford Nanopore | - | Ste, Eco, Mil, P
Human Trio: A (maternal), B (paternal), C (son), C2 (son, Coriell)
Bacterial Isolates and Mixture: Ste, Eco, Pflu, pool
22. Sequencing summary
• 286/307 libraries have been sequenced
• Completion date for all data collection will be August 2017
– First data is posted
– Submit manuscripts by October
• Data is being analyzed by a team of 25 bioinformatics specialists
– most are members of ABRF-NGS and GBIRG
– some are outside of ABRF
39. Dynamic, the Gut Is.
Measure it carefully, we must.
Aaron Del Duca
40. Ongoing efforts to reduce variance
(or embrace it when helpful)
Aaron Del Duca
41. 16S rRNA is only a part of the erudition
Lan Y, Rosen G, Hershberg R. “Marker genes that are less conserved in their sequences are useful for predicting genome-wide similarity levels between closely related prokaryotic strains.” Microbiome. 2016.
“16S rRNA predicts genome-wide levels of similarity very well for distantly related prokaryotes, but not for closely related ones.”
42. Escherichia/Shigella lineage is poorly defined by 16S
[Figure panels: Average Amino Acid Identity (AAI); 16S rRNA]
43. Metagenomics can expand the microbiome
to query across kingdoms
Data Type | 16S | 18S | ITS | Shotgun
Taxonomic Classification | Yes | Yes | Yes | Yes
Prokaryotes | Yes | No | No | Yes
Archaea | Yes | No | No | Yes
Eukaryotes | No | Yes | Yes | Yes
Parasites | No | Yes | No | Yes
Plasmids | No | No | No | Yes
Phages | No | No | No | Yes
Human Ancestry | No | No | No | Yes
Biosynthetic Gene Clusters | No | No | No | Yes
Antimicrobial Resistance (AMR) Markers | No | No | No | Yes
Kingdom Specificity | Yes | Yes | Yes | No
Approximate Raw Cost / Sample | $100 | $100 | $125 | $300
From https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5359768/
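To make the trade-off in this table concrete, here is a small sketch that encodes the Yes/No rows and the approximate per-sample costs, then searches for the cheapest set of data types that covers a list of targets. The names `DETECTS`, `COST_USD`, and `cheapest_plan` are hypothetical, and treating costs as simply additive across assays is an assumption made for illustration.

```python
from itertools import combinations

DATA_TYPES = ("16S", "18S", "ITS", "Shotgun")

# Approximate raw cost per sample (USD), as listed in the table.
COST_USD = {"16S": 100, "18S": 100, "ITS": 125, "Shotgun": 300}

# Which data types are marked "Yes" for each target in the table.
DETECTS = {
    "Prokaryotes": ("16S", "Shotgun"),
    "Archaea": ("16S", "Shotgun"),
    "Eukaryotes": ("18S", "ITS", "Shotgun"),
    "Parasites": ("18S", "Shotgun"),
    "Plasmids": ("Shotgun",),
    "Phages": ("Shotgun",),
    "Human Ancestry": ("Shotgun",),
    "Biosynthetic Gene Clusters": ("Shotgun",),
    "Antimicrobial Resistance (AMR) Markers": ("Shotgun",),
}


def cheapest_plan(targets):
    """Return (data_types, total_cost) for the cheapest covering combination.

    Assumption: running several assays costs the sum of their per-sample costs.
    """
    best = None
    for size in range(1, len(DATA_TYPES) + 1):
        for combo in combinations(DATA_TYPES, size):
            covered = all(
                any(dt in DETECTS[target] for dt in combo) for target in targets
            )
            if covered:
                cost = sum(COST_USD[dt] for dt in combo)
                if best is None or cost < best[1]:
                    best = (combo, cost)
    return best


if __name__ == "__main__":
    print(cheapest_plan(["Prokaryotes", "Eukaryotes"]))
    print(cheapest_plan(["Prokaryotes", "Phages"]))
```

Under these assumptions, marker-gene assays stay cheaper for kingdom-level surveys, while any target in the lower rows (plasmids, phages, AMR markers) forces shotgun sequencing.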
44. For complex metagenomic samples,
we see similar challenges as SEQC
[Figure: number of species vs. millions of reads; abundance (MetaPhlAn). Credit: Elizabeth Hénaff]
55. • Sequins are synthetic DNA standards that ‘mirror’ and match the sequencing,
assembly and alignment of microbe genomes.
• Their synthetic sequence allows them to be added directly to a user’s DNA sample
prior to library preparation and sequencing, and thereby act as internal reference
controls.
www.sequins.xyz
56. Sequins can be analyzed as internal controls throughout the NGS workflow:
Diagnostic performance – assess the sensitivity and specificity for
detecting pathogens in a sample.
Quantitative accuracy – measure quantitative performance of an NGS
library, and the impact of sequence coverage on analysis (see over).
Normalization – sequins can act as scaling factors to normalize between
multiple samples for more accurate comparisons.
Quality control and troubleshooting – calibrate and optimize library
preparation, sequencing and analysis steps.
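The normalization use case can be sketched as follows. This is a minimal illustration, assuming the same amount of spike-in was added to every sample, so that differences in total spike-in read counts reflect technical variation. The function names and the choice of the mean spike-in count as the reference are assumptions for this example, not the sequins software's method.

```python
def spike_in_scale_factors(spike_counts: dict[str, float]) -> dict[str, float]:
    """Per-sample scaling factors from total spike-in read counts.

    Assumption: equal spike-in input per sample; the reference level is the
    mean spike-in count across samples (an illustrative choice).
    """
    if any(count <= 0 for count in spike_counts.values()):
        raise ValueError("every sample needs a positive spike-in count")
    reference = sum(spike_counts.values()) / len(spike_counts)
    return {sample: reference / count for sample, count in spike_counts.items()}


def normalize(counts: dict[str, dict[str, float]],
              factors: dict[str, float]) -> dict[str, dict[str, float]]:
    """Multiply each sample's feature counts by that sample's scaling factor."""
    return {
        sample: {feature: value * factors[sample] for feature, value in features.items()}
        for sample, features in counts.items()
    }


if __name__ == "__main__":
    # Toy numbers: sample s2 yielded twice as many spike-in reads as s1.
    factors = spike_in_scale_factors({"s1": 100, "s2": 200})
    print(normalize({"s1": {"taxonA": 10}, "s2": {"taxonA": 20}}, factors))
```

In the toy data, taxonA has the same normalized count in both samples, which is the behavior expected when the apparent two-fold difference comes only from sequencing depth.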
57. • Metaquins are titrated across a 10^5-fold concentration range to form a quantitative
ladder.
• This ladder can be used to assess quantitative accuracy, sensitivity limits and the
impact of sequencing coverage on de novo assembly.
• Alternative mixtures can be used to assess fold-change differences between samples.
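One way to read such a ladder is to regress observed abundance on known input concentration in log10 space: a slope near 1 suggests proportional quantification, and the lowest concentration still detected gives a rough sensitivity limit. The sketch below assumes this log-log least-squares approach, with hypothetical function names and made-up toy values; it is not the published analysis.

```python
from math import log10


def fit_ladder(known, observed):
    """Least-squares fit of log10(observed) on log10(known).

    Points with zero observed abundance are skipped (not detected).
    Returns (slope, intercept, r_squared).
    """
    pairs = [(log10(k), log10(o)) for k, o in zip(known, observed) if k > 0 and o > 0]
    if len(pairs) < 2:
        raise ValueError("need at least two detected ladder points")
    xs, ys = zip(*pairs)
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    if sxx == 0:
        raise ValueError("known concentrations must not all be equal")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return slope, intercept, r_squared


def detection_limit(known, observed):
    """Lowest known concentration with a non-zero observation, or None."""
    detected = [k for k, o in zip(known, observed) if o > 0]
    return min(detected) if detected else None


if __name__ == "__main__":
    # Toy ladder spanning a 10^5-fold range; the lowest point drops out.
    known = [1, 10, 100, 1_000, 10_000, 100_000]
    observed = [0, 21, 190, 2_050, 19_800, 201_000]
    print(fit_ladder(known, observed))
    print(detection_limit(known, observed))
```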
61. Genetics and Epigenetics of
Anti-microbial Resistance (AMR)
http://gcgh.grandchallenges.org/grant/global-distribution-and-epigenetic-stratification-antimicrobial-resistance
63. Open, Transparent, Global Collaboration
3 Goals:
1. Geospatial Metagenomic and Forensic Maps
2. Anti-microbial resistance (AMR) marker tracking
3. New Biosynthetic Gene Clusters (BGCs); new drugs
www.metasub.org
89. StuckOnU
MetaSUB metagenomics research comes to the ABRF 2017 in San Diego!
Our research study investigates the microbiome and DNA of your cell phone, as part
of a global study on the genomics of our world’s cities.
121. Flight data shows very good accuracy (89-92%) for 2D reads
Plus, good read accuracy (76-79%) for 1D reads
for the template/complement measures.
[Figure: flight data, read accuracy (% of reads)]
123. The first genome sequenced and
assembled from beyond-Earth reads
http://biorxiv.org/content/early/2016/09/27/077651
125. Direct Detection of Methylation on PacBio
Approach: Kinetic detection of methylated bases during SMRT DNA sequencing
Example: N6-methyladenosine (mA)
[Figure: two panels of fluorescence intensity (a.u.) vs. time (s), spanning 70.5-74.5 s and 104.5-108.5 s, with the called bases labeled along each trace and an mA position marked]
Flusberg et al, 2010
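Kinetic detection relies on the polymerase slowing at modified bases, which lengthens the interpulse duration (IPD) before the next incorporation. A minimal sketch of that idea compares mean IPD per position between a native sample and an unmodified control. The function names, the control-based comparison, and the 2.0 threshold are illustrative assumptions, not the published method or any vendor software API.

```python
from statistics import mean


def ipd_ratios(native: dict[int, list[float]],
               control: dict[int, list[float]]) -> dict[int, float]:
    """Mean native IPD divided by mean control IPD, per template position.

    Only positions observed in both data sets are reported.
    """
    ratios = {}
    for position, native_ipds in native.items():
        control_ipds = control.get(position)
        if not native_ipds or not control_ipds:
            continue
        control_mean = mean(control_ipds)
        if control_mean > 0:
            ratios[position] = mean(native_ipds) / control_mean
    return ratios


def candidate_modified_positions(ratios: dict[int, float],
                                 threshold: float = 2.0) -> list[int]:
    """Positions whose IPD ratio meets a (hypothetical) threshold."""
    return sorted(pos for pos, ratio in ratios.items() if ratio >= threshold)


if __name__ == "__main__":
    # Toy IPDs in seconds; position 2 is slowed in the native sample.
    native = {1: [0.2, 0.2], 2: [1.0, 1.0]}
    control = {1: [0.2, 0.2], 2: [0.25, 0.25]}
    print(candidate_modified_positions(ipd_ratios(native, control)))
```

Real data would need many reads per position and a statistical test rather than a fixed cutoff; the fixed threshold here only keeps the example short.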
139. Conclusions
• Data sets are made and ready for most of the FDA’s SEQC and ABRF-NGS DNA studies.
• Epigenome QC (EpiQC) group is also testing the same human and metagenome samples for base modifications.
• Metagenomics benchmarking shows striking differences in default pipelines, even with similar database sizes and coverage.
• Sequencing experiments can now be planned for space flight. Maybe Mars.
140. Deep Gratitude to Many People:
Illumina
Gary Schroth
Marc Van Oene
Univ. Chicago
Yoav Gilad
FDA/SEQC/Fudan Univ.
Leming Shi
NIH/UDP/NCBI
Jean & Danielle Thierry-Mieg
Baylor
Jeff Rogers
MSKCC
Danwei Huangfu
Christina Leslie
Ross Levine
Alex Kentsis
HudsonAlpha
Shawn Levy
Mason Lab
Ebrahim Afshinnekoo
Sofia Ahsanuddin
Noah Alexander
Pradeep Ambrose
Daniela Bezdan
Marjan Bozinoski
Dhruva Chandramohan
Chou Chou
David Danko
Tim Donahoe
Jonathan Foox
Elizabeth Hénaff
Matthew MacKay
Alexa McIntyre
Cem Meyden
Niamh O’Hara
Lenore Pipes
Jake Reed
Heba Shabaan
Delia Tomoiaga
Priyanka Vijay
David Westfall
Cornell/WCM
Scott Blanchard
Selina Chen-Kiang
Olivier Elemento
Samie Jaffrey
Ari Melnick
Margaret Ross
Epigenomics Core
Duke
Stacy Horner
Nandan Gokhale
Icahn/MSSM
Eric Schadt
Andrew Kasarskis
Joel Dudley
Ali Bashir
Bobby Sebra
ABRF
George Grills
Scott Tighe
Don Baldwin
Miami
Maria E Figueroa
AMNH
George Amato
Mark Sidall
@mason_lab
NYU
Martin Blaser
Jane Carlton
Julia Maritz
Chris Park
MIT Media Lab
Kevin Slavin
Devora Najjar
Regina Flores
Rockefeller
Jeanne Garbarino
Charles Rice
NASA
Aaron Burton
Sarah Castro-Wallace
Kate Rubins
Graham Scott
Craig Kundrot
Jackson Labs
Sheng Li
UVA
Francine Garrett-Bakelman