Imputation and de novo variant discovery from
low-pass whole genome sequencing data for
cost-effective and scalable trait mapping
Joe Pickrell
@joe_pickrell | joe@gencove.com
What technology to use to measure genotypes?
WHOLE GENOME SEQUENCING: O($1000)
PROS: comprehensive
CONS: expensive, often overkill
EXOME SEQUENCING: O($100)
PROS: comprehensive in exons
CONS: completely misses non-coding variation
SNP ARRAY: O($10)
PROS: cost-effective, well-tested
CONS: no new variant discovery (e.g. rare or population-specific); overall genome coverage is sacrificed for cost-effectiveness
What technology to use to measure genotypes?
LOW-PASS SEQUENCING: O($10)
Sequencing technologies allow for new variant discovery and high discovery power across the genome.
SNP ARRAY: O($10)
PROS: cost-effective, well-tested
CONS: no new variant discovery (e.g. rare or population-specific); overall genome coverage is sacrificed for cost-effectiveness
What is low-pass sequencing?
Shotgun sequence a human genome to (usually) 0.4x or 1x coverage, and use computational methods to ‘fill in’ anything we missed.
INTUITION
0.4x coverage = one sequencing read at ~30M SNPs
Genotyping array = excellent measurement of 0.5M SNPs
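The intuition above can be checked with a back-of-envelope calculation. The sketch below assumes read depth at each site follows a Poisson distribution with mean equal to the coverage; that model is an assumption made here, not something stated on the slide.

```python
import math


def fraction_covered(mean_depth: float) -> float:
    """Probability that a site gets at least one read, assuming Poisson depth."""
    return 1.0 - math.exp(-mean_depth)


# Compare low-pass depths with a typical deep-sequencing depth.
for depth in (0.4, 1.0, 30.0):
    share = fraction_covered(depth)
    print(f"{depth:>5}x: {share:.1%} of sites have at least one read")
```

Under this model roughly a third of sites receive a read at 0.4x, so a sparse but genome-wide sample of sites is observed in every individual, in contrast to the fixed set of sites on an array.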
Why now?
2018: nominal price/Mb of sequence is <$0.01
The challenging part of low-pass sequencing is not sequencing per se
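To see why the raw sequencing is no longer the expensive part, the quoted price can be turned into a per-sample figure. This is a rough sketch: the genome size of about 3,100 Mb is an approximation assumed here, and the price is the upper bound from the slide.

```python
GENOME_SIZE_MB = 3_100   # assumed approximate human genome size, in megabases
PRICE_PER_MB_USD = 0.01  # upper bound on the nominal price quoted above


def sequencing_cost_usd(coverage: float,
                        genome_mb: float = GENOME_SIZE_MB,
                        price_per_mb: float = PRICE_PER_MB_USD) -> float:
    """Nominal cost of the sequence data alone (no library prep, no analysis)."""
    return coverage * genome_mb * price_per_mb


for coverage in (0.4, 1.0, 30.0):
    cost = sequencing_cost_usd(coverage)
    print(f"{coverage:>5}x coverage: at most about ${cost:,.2f} of sequence")
```

At 0.4x to 1x this comes to tens of dollars at most, which is the same order of magnitude as a SNP array.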
The challenges in low-pass sequencing are in sample preparation and analysis
1. The cost of commercial library prep kits or outsourcing is higher (sometimes considerably higher) than the cost of sequencing.
2. Going from a fastq file of low-pass sequences to genetic variant calls is non-trivial, and no standard software exists.
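As an illustration of why the analysis step is non-trivial, here is a minimal sketch of genotype inference at a single site from a handful of reads. It assumes a simple binomial read model with a fixed error rate and a Hardy-Weinberg prior from a population allele frequency; the function names and the 1% error rate are hypothetical, and this is not the pipeline used in the talk. Real imputation additionally borrows information from neighbouring sites through a haplotype reference panel.

```python
from math import comb


def genotype_likelihoods(ref_reads: int, alt_reads: int, error_rate: float = 0.01):
    """P(reads | genotype) for 0, 1 and 2 copies of the alternate allele."""
    depth = ref_reads + alt_reads
    likelihoods = []
    for alt_copies in (0, 1, 2):
        # Chance that a single read shows the alternate allele, allowing for errors.
        p_alt = (alt_copies / 2) * (1 - error_rate) + (1 - alt_copies / 2) * error_rate
        likelihoods.append(
            comb(depth, alt_reads) * p_alt ** alt_reads * (1 - p_alt) ** ref_reads
        )
    return likelihoods


def genotype_posteriors(ref_reads: int, alt_reads: int, alt_freq: float,
                        error_rate: float = 0.01):
    """Combine the read likelihoods with a Hardy-Weinberg prior."""
    priors = [(1 - alt_freq) ** 2, 2 * alt_freq * (1 - alt_freq), alt_freq ** 2]
    weighted = [
        lik * prior
        for lik, prior in zip(genotype_likelihoods(ref_reads, alt_reads, error_rate), priors)
    ]
    total = sum(weighted)
    return [w / total for w in weighted]


# A single alternate-allele read at a site with 10% alternate allele frequency.
print(genotype_posteriors(ref_reads=0, alt_reads=1, alt_freq=0.1))
```

With one read the posterior stays spread across genotypes, which is why a single low-pass sample cannot be called site by site and needs population-level information to be useful.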
How does imputation from low-pass sequencing compare to imputation from arrays?
1. Divide the 1000 Genomes dataset in two
2. Simulate low-pass sequencing (or genotyping on a few commonly used commercial arrays) from one half, and impute using the other half as the reference
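The simulation step above can be sketched as follows. This is an illustrative toy, assuming Poisson read depth per site and a fixed per-read error rate; the function names, the error rate, and the input encoding (0, 1 or 2 alternate alleles per site) are assumptions made here rather than details from the talk.

```python
import math
import random


def poisson(rng: random.Random, mean: float) -> int:
    """Draw a Poisson variate with Knuth's method (adequate for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_low_pass(genotypes, coverage: float, error_rate: float = 0.01, seed: int = 0):
    """Return (ref_reads, alt_reads) per site for one individual's true genotypes."""
    rng = random.Random(seed)
    reads = []
    for alt_copies in genotypes:
        depth = poisson(rng, coverage)
        # Probability that any one read at this site shows the alternate allele.
        p_alt = (alt_copies / 2) * (1 - error_rate) + (1 - alt_copies / 2) * error_rate
        alt = sum(rng.random() < p_alt for _ in range(depth))
        reads.append((depth - alt, alt))
    return reads


true_genotypes = [0, 1, 2, 0, 0, 1, 0, 2, 1, 0]
print(simulate_low_pass(true_genotypes, coverage=0.4))
```

Most sites end up with zero reads at 0.4x, so the held-out half of the dataset has to supply the haplotype information used to impute them.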
In an African population, low-pass sequencing increases
effective power by ~50-100%
In a European population, low-pass sequencing increases
effective power by ~10-20%
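The slides do not spell out how effective power is computed. One common convention, assumed here, is that testing an imputed variant with imputation accuracy r² in N samples is roughly equivalent to testing the true genotype in N·r² samples. Under that convention a relative gain can be computed as below; the r² values in the example are hypothetical and are not results from the talk.

```python
def effective_sample_size(n_samples: int, imputation_r2: float) -> float:
    """Approximate effective sample size for association at an imputed variant."""
    return n_samples * imputation_r2


def relative_power_gain(r2_new: float, r2_baseline: float) -> float:
    """Fractional increase in effective sample size when r2 improves."""
    return r2_new / r2_baseline - 1.0


# Hypothetical example: r2 of 0.6 from an array versus 0.9 from low-pass data.
print(effective_sample_size(10_000, 0.6))
print(effective_sample_size(10_000, 0.9))
print(f"{relative_power_gain(0.9, 0.6):.0%} gain")
```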
How does imputation from low-pass sequencing compare to imputation from arrays?
So far, these results are all from simulations.
What about in practice?
How does imputation from low-pass sequencing compare to imputation from arrays?
79 European-ancestry individuals sequenced to ~1x coverage
Downsampled to 0.4x, 0.6x, 0.8x
Genotyped on the Affymetrix Axiom Biobank Precision Medicine Research Array (around 800k SNPs)
Collaboration with Charlie Cox, GSK
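Downsampling from ~1x to a lower coverage amounts to keeping each read with probability target/original. The sketch below shows the idea on a list of read identifiers; the function name and the in-memory list are illustrative assumptions, and the slides do not say which tool was actually used.

```python
import random


def downsample(reads, original_coverage: float, target_coverage: float, seed: int = 0):
    """Keep each read independently with probability target/original."""
    if not 0 < target_coverage <= original_coverage:
        raise ValueError("target coverage must be positive and no higher than the original")
    keep_probability = target_coverage / original_coverage
    rng = random.Random(seed)
    return [read for read in reads if rng.random() < keep_probability]


reads = [f"read_{i}" for i in range(100_000)]
for target in (0.4, 0.6, 0.8):
    kept = downsample(reads, original_coverage=1.0, target_coverage=target)
    print(f"{target}x: kept {len(kept) / len(reads):.1%} of reads")
```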
High concordance between genotyping array and imputed low-pass genome sequences
Concordance at non-reference genotypes
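A minimal sketch of how such a concordance figure might be computed, assuming genotypes are coded as 0, 1 or 2 alternate alleles (None for missing) and that the array is treated as truth. The exact definition used for the plot is not given on the slide; restricting to sites where the array genotype is non-reference is one common choice and is an assumption here.

```python
def non_reference_concordance(array_genotypes, imputed_genotypes) -> float:
    """Share of non-reference array genotypes that the imputed call reproduces."""
    matches = 0
    compared = 0
    for truth, call in zip(array_genotypes, imputed_genotypes):
        if truth is None or call is None:
            continue  # skip sites missing in either data set
        if truth == 0:
            continue  # homozygous reference on the array: not counted
        compared += 1
        matches += truth == call
    if compared == 0:
        raise ValueError("no non-reference genotypes to compare")
    return matches / compared


array_calls = [0, 1, 2, 1, None]
imputed_calls = [0, 1, 1, 1, 2]
print(non_reference_concordance(array_calls, imputed_calls))
```

Excluding homozygous-reference sites keeps the metric from being inflated by the large majority of sites where every method trivially agrees.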
Low-pass sequencing increases power relative to the
PMR array
Can low-pass sequencing be used
to discover variants?
Ignore genotype data, call variants
from the sequencing reads alone
What fraction of polymorphic
variants are identified?
Can low-pass sequencing be used to discover variants?
With 1x sequencing, variants present in >10 copies are discovered.
The absolute number of copies of the variant is more relevant than the frequency per se; in massive samples, one could profile extremely rare variants.
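The dependence on copy number rather than frequency can be illustrated with a simple model. The sketch assumes Poisson read depth, that each copy of the variant is covered at half the per-genome coverage, and that a variant counts as discovered once it is seen in at least two reads across the whole sample; the two-read threshold and the function names are assumptions made here, and sequencing errors are ignored.

```python
import math


def discovery_probability(copies: int, coverage: float, min_alt_reads: int = 2) -> float:
    """P(at least min_alt_reads reads carry the variant) across the whole sample."""
    # Each copy sits on one of two haplotypes, so it gets coverage / 2 reads on average.
    mean_alt_reads = copies * coverage / 2
    p_too_few = sum(
        math.exp(-mean_alt_reads) * mean_alt_reads ** k / math.factorial(k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_too_few


def allele_frequency(copies: int, n_individuals: int) -> float:
    """Frequency of a variant present in the given number of copies."""
    return copies / (2 * n_individuals)


for copies in (2, 5, 10, 20):
    print(f"{copies:>2} copies at 1x: P(discovered) = {discovery_probability(copies, 1.0):.3f}")

# The same 10 copies correspond to very different frequencies as the sample grows.
for n in (100, 10_000, 1_000_000):
    print(f"N = {n:>9,}: 10 copies is a frequency of {allele_frequency(10, n):.1e}")
```

In this model the discovery probability depends only on the product of copies and coverage, so a fixed copy count is equally discoverable whether it represents a common variant in a small sample or a very rare one in a large sample.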
Summary
Low-pass sequencing increases association power by 10-100% compared to commonly used genotyping arrays, particularly in non-European populations
Low-pass sequencing allows for discovery of new/rare variants, particularly at large sample sizes
Additional applications: combining low-pass sequencing with exon capture allows for joint clinical assays of rare and common variation
THANKS!
Tomaz Berisa
Kaja Wasik
Maria Vazquez
Charlie Cox
Dana Fraser
Karen King
Joe Pickrell | @joe_pickrell | joe@gencove.com
Gencove-GSK results
