A Retrospective Analysis of Exome Sequencing Cases Using the GenePool™ Genomics Platform

Director, Scientific Applications, Partnerships and Product Strategy at Station X
26 Oct 2017

Antoaneta Vladimirova², Tod Klingler², Richard Goold², Erik G. Puffenberger¹
¹The Clinic for Special Children, Strasburg, PA; ²Station X, 185 Berry Street, Suite 2001, San Francisco, CA
Sections: Abstract, Materials and Methods, Conclusions, References, For further information
Figure 4. Summary table of all affected probands and their phenotypes, along with the results of the GenePool genomic analysis and the putative candidate genes identified. The candidate genes are split into categories (homozygous recessive, de novo, autosomal dominant, and compound heterozygous) based on the potential mode of inheritance in each case. Gene symbols in bold represent the most likely candidates based on a combination of genomic analysis, family medical history assessment, and the diseases and HPO terms known to be associated with the candidate genes in CGD or the literature. Underlined gene symbols represent full concordance between the CSC results and the GenePool gene candidates.
Figure 1. Analysis and results of Family #14. The proband and family members were analyzed in GenePool. Figure 1A shows the clinical data gathered for family #14. Other than the proband and an affected brother, who present with skeletal dysplasia, scoliosis, ASD, cleft palate, etc., the family members are unaffected. Figure 1B shows the Analysis Designer setup in GenePool, where the project, analysis type, and sample groups are selected along with the desired analysis parameters. Figure 1C shows the two top candidate variants, associated with two CGD diseases. Figure 1D shows a variation distribution analysis of the two top candidates across all family members. Both variants, in the SH3TC2 and SLC26A2 genes, are homozygous in the affected individuals, while the unaffected family members are heterozygous.
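The segregation pattern described for Figure 1D (homozygous in affected individuals, heterozygous in unaffected relatives) can be expressed as a simple check. The sketch below is illustrative only and is not GenePool's implementation; the function name, the genotype encoding (count of alternate alleles), and the example pedigree are all assumptions made for this example.

```python
def fits_recessive_model(alt_allele_counts, affected):
    """Check a variant against a simple, fully penetrant recessive model.

    alt_allele_counts: dict mapping family member -> number of alternate
        alleles (0 = reference, 1 = heterozygous, 2 = homozygous).
    affected: set of family members who show the phenotype.

    Returns True when every affected member is homozygous for the variant
    and no unaffected member is.
    """
    # Without at least one genotyped affected member the model is untestable.
    if not affected or not all(m in alt_allele_counts for m in affected):
        return False
    return all(
        (count == 2) if member in affected else (count < 2)
        for member, count in alt_allele_counts.items()
    )


# Hypothetical pedigree (not data from the poster).
affected = {"proband", "brother"}
segregating = {"proband": 2, "brother": 2, "mother": 1, "father": 1, "sister": 1}
not_segregating = {"proband": 2, "brother": 1, "mother": 1, "father": 1, "sister": 1}

print(fits_recessive_model(segregating, affected))      # True
print(fits_recessive_model(not_segregating, affected))  # False
```

Treating unaffected relatives as "anything but homozygous" is a deliberate simplification: it accepts both carriers and non-carriers, which is what a recessive model predicts.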
Materials and Methods
Subjects of Amish or Mennonite descent from multiple families were chosen for next-generation DNA testing if they presented to CSC with clinical signs of an underlying genetic lesion and remained without a diagnosis following standard biochemical and genetic investigations (e.g., metabolic testing, targeted gene sequencing, cytogenetic or lower-density molecular karyotyping). The study was approved by the Lancaster General Hospital institutional review board. All probands (or their parents) received pre-test counseling to explain the goals, process, timing, and limitations of microarray and exome testing. All subjects consented in writing to participate on behalf of themselves or their children. Prior to molecular testing, every proband underwent detailed phenotyping by one of three CSC clinicians. The process included pre- and perinatal history, a record of illnesses and hospitalizations, and an annotated medical problem list with HPO [1] classifications.
Probands and their family members were exome-sequenced through the Regeneron Genetics Center (RGC). Briefly, 1 µg of high-quality genomic DNA was exome-captured using the NimbleGen VCRome SeqCap 2.1 reagent; captured libraries were sequenced on the Illumina HiSeq 2500 platform using v4 chemistry. Exome sequencing was performed such that >85% of the bases were covered at 20x or greater. Raw sequence reads were mapped and aligned to the GRCh37/hg19 human genome reference assembly using standard bioinformatics algorithms (BWA/GATK). Called variants were filtered based on standard quality metrics: minimum read depth (>10), genotype quality (>30), and allelic balance (>20%). The resulting VCF files were uploaded to the GenePool [2] cloud-based genomics platform for subsequent analysis via DNAnexus [3] integration. Clinical data such as age, gender, family relation, and associated HPO terms were uploaded to GenePool along with the molecular data and used in integrative analyses.
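The quality filter quoted above (read depth >10, genotype quality >30, allelic balance >20%) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the RGC or GATK pipeline: the record layout, field names, and gene labels are hypothetical, and allelic balance is assumed here to mean the fraction of reads supporting the alternate allele.

```python
from dataclasses import dataclass


@dataclass
class VariantCall:
    """One genotype call. Field names are hypothetical, not a VCF or GenePool schema."""
    gene: str
    read_depth: int        # total reads covering the site
    genotype_quality: int  # Phred-scaled genotype quality
    alt_reads: int         # reads supporting the alternate allele


def allelic_balance(call):
    """Fraction of reads supporting the alternate allele (assumed definition)."""
    return call.alt_reads / call.read_depth if call.read_depth else 0.0


def passes_qc(call, min_depth=10, min_gq=30, min_balance=0.20):
    """Apply the three thresholds as strict inequalities, as written in the text."""
    return (
        call.read_depth > min_depth
        and call.genotype_quality > min_gq
        and allelic_balance(call) > min_balance
    )


# Made-up calls for illustration only.
calls = [
    VariantCall("GENE_OK", read_depth=42, genotype_quality=99, alt_reads=40),
    VariantCall("GENE_SHALLOW", read_depth=8, genotype_quality=60, alt_reads=4),
    VariantCall("GENE_SKEWED", read_depth=50, genotype_quality=99, alt_reads=5),
]

kept = [c.gene for c in calls if passes_qc(c)]
print(kept)  # ['GENE_OK']
```

Keeping the thresholds as keyword arguments makes it easy to rerun the same filter with other cutoffs, such as the slightly different criteria applied later inside GenePool.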
Only variants passing standard quality criteria were further analyzed in GenePool (coverage>=10, quality>=30, variant frequency>=10%). All variants generated through the standard trio analysis, variation comparison, or variant distribution workflows were analyzed. Variants were filtered to exclude previously determined CSC “common” variants. Subsequently, SnpEff [4] annotations were used to prioritize variants of high and moderate impact. Additionally, allele frequencies from the 1000 Genomes Project [5], the Exome Sequencing Project [6] (ESP), and the Exome Aggregation Consortium [7] (ExAC), specifically those for populations of European descent (AF>1% for homozygous recessive and AF>=0 for de novo variants), were applied to prioritize variants further. The Clinical Genomics Database [8] (CGD) and ClinVar [9] disease annotations were also used to identify the most likely candidates related to the proband phenotypes and HPO terms. Allele frequencies in all unaffected or likely unaffected individuals were also calculated and used for variant prioritization. Disease Ontology [10] terms and the genes associated with each disease term, as curated in GenePool, were also applied to identify relevant variant candidates. In addition to the trio analysis, variant profile, and variant distribution workflows, we applied variation comparison analysis in cases where families were very large and groups of affected and unaffected individuals could be identified. Variants were also analyzed with the “gene pivot” functionality in GenePool to rapidly identify compound heterozygous scenarios.
Figure 2. Analysis and results of Family #75. The proband and family members were analyzed in GenePool. Figure 2A shows the clinical data gathered for family #75. Other than the proband, who presents with anxiety, aggression, OCD, autism, intellectual disability, epilepsy, etc., the family members are unaffected.
Figure 2B shows the prioritized variant results in GenePool after removing the “common” variants and filtering for de novo variants, allele frequencies, and the Disease Ontology term “epilepsy syndrome”, resulting in two top candidates. Interactive pie chart widgets allow results to be filtered dynamically and visually. Figure 2C shows the CDH2 variant, which is present in heterozygous form in the affected proband but not in any of the unaffected family members.
Conclusions
•  The GenePool cloud-based genomics software platform was successfully applied in a retrospective CSC exome analysis project to store, manage, and analyze genomic data from over two dozen probands and their families, addressing a variety of undiagnosed medical conditions and their genetic underpinnings.
•  For each proband, one or more causative variants and candidate genes were identified that support the mode of inheritance of the condition within the family and the clinical data collected for both the proband and the family members.
•  CSC and GenePool genomic results demonstrate a high level of concordance.
•  A variety of integrated workflows in GenePool, such as trio analysis, variation profile, variation cohort comparison, and variation distribution, allowed a quick, intuitive, and efficient process for analyzing each family and identifying a short list of candidate variants and genes spanning homozygous, de novo, autosomal dominant, and compound heterozygous modes of inheritance.
•  The ability to integrate clinical information with the molecular data in GenePool was critical for segmenting the family members based on their phenotypes and conditions, and for the efficiency of the analysis.
•  Multiple annotations for variants, genes, and diseases in GenePool were instrumental in streamlining variant prioritization and interpretation.
•  The GenePool platform served as an efficient tool for the analysis and identification of putative causative variants to facilitate diagnosis and optimize patient management.
For further information
•  More information on Station X and the GenePool platform can be obtained at http://www.stationxinc.com.
•  For more information on this poster, please contact antoaneta@stationxinc.com.
•  Follow us on Twitter @StationXInc.
References
1. Human Phenotype Ontology: http://human-phenotype-ontology.github.io
2. GenePool by Station X, Inc.: http://www.stationxinc.com/
3. DNAnexus: https://www.dnanexus.com
4. SnpEff: http://snpeff.sourceforge.net
5. 1000 Genomes Project: http://www.internationalgenome.org
6. Exome Sequencing Project: https://esp.gs.washington.edu/drupal/
7. Exome Aggregation Consortium: http://exac.broadinstitute.org
8. Clinical Genomics Database: https://research.nhgri.nih.gov/CGD/
9. ClinVar Database: http://www.ncbi.nlm.nih.gov/clinvar/
10. Human Disease Ontology: http://www.obofoundry.org/ontology/doid.html
Abstract
The Clinic for Special Children (CSC) is a rural pediatric non-profit medical practice serving uninsured Amish and Mennonite (Plain) children with genetic disorders.
The clinic strives to identify genetic causes of childhood disability and disease and uses modern genetic technologies to diagnose and treat patients. Whole exome sequencing (WES) and data analysis, in conjunction with deep phenotyping, have enabled the scientific community to achieve great success in identifying the molecular bases of disease. The CSC has also used these technologies successfully over the past several years. The CSC employs a diagnostic pipeline for new patients that involves detailed phenotyping, targeted mutation detection, chromosomal microarray analysis, and exome sequencing in order to generate a molecular diagnosis for the patient. Owing to a deep knowledge of segregating mutations in the Plain populations, nearly 50% of all new patients receive a diagnosis through targeted mutation detection, while roughly 3% have diagnostic copy number changes. Of the remaining patients, our diagnostic yield for clinical exomes is approximately 49%. We present a validation study of solved WES cases from the CSC in which we demonstrate the ability to efficiently identify putative causative variants in GenePool, a cloud-based platform for the analysis of genomics data. We used the built-in analytical workflows for trio analysis and the pipelines designed for population-size cohort analyses. The latter analyses compared groups of affected and unaffected individuals. We used GenePool’s interactive visualization filters with the comprehensive library of annotations to quickly narrow the list of potential causative variants to a small, highly relevant set, and validated our results. GenePool allowed us to efficiently screen for pathogenic variants associated with autosomal recessive and de novo dominant phenotypes, as well as with more complex genetic diseases.
Rapid diagnosis is crucial to optimal patient outcomes, and GenePool addresses a critical part of this process by enabling the analysis and identification of a small set of putative pathogenic variants in a short time frame. In this study, we found high concordance between GenePool variant prioritization and the prior ad hoc manual prioritization. The study we present was conducted in specific regional founder populations, but it provides important lessons for WES studies in non-founder populations.
Results
Figure 3. Analysis and results of Family #76. The proband and family members were analyzed in GenePool. Figure 3A shows the clinical data gathered for family #76. The proband presents with decreased fetal movement, neonatal hypotonia, global delays, ADD, relative macrocephaly, triangular facies, narrow forehead, flat profile, and small mouth, and his brothers show similar symptoms. In contrast, the sister of the proband is not affected, suggesting an X-linked condition. Figure 3B shows the prioritized variant results in GenePool after removing the “common” variants and filtering for allele frequencies and associated CGD diseases. Figure 3C shows a variant distribution analysis of the whole family, in which the HUWE1 variant is hemizygous in the affected proband and brothers and heterozygous in the unaffected sister and mother. The unaffected father does not carry the HUWE1 variant.
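The genotype pattern described for Figure 3C (hemizygous affected males, heterozygous unaffected females, and an unaffected father without the variant) is what a simple X-linked recessive model predicts. The sketch below encodes that check; it assumes full penetrance and a recessive model, neither of which is stated in the poster, and the class, function, and pedigree are hypothetical rather than GenePool features.

```python
from dataclasses import dataclass


@dataclass
class Member:
    """A family member; all fields are illustrative assumptions."""
    name: str
    sex: str          # "M" or "F"
    affected: bool
    alt_alleles: int  # copies of the variant on X: 0-1 for males, 0-2 for females


def fits_x_linked_recessive(family):
    """Return True if every genotype matches a fully penetrant X-linked recessive model."""
    for m in family:
        if m.sex == "M":
            # Males carry a single X, so one copy (hemizygous) implies affected.
            expected_affected = m.alt_alleles == 1
        else:
            # Females are affected only when homozygous; heterozygotes are carriers.
            expected_affected = m.alt_alleles == 2
        if expected_affected != m.affected:
            return False
    return True


# Hypothetical pedigree mirroring the pattern described in the caption.
family = [
    Member("proband", "M", affected=True, alt_alleles=1),
    Member("brother", "M", affected=True, alt_alleles=1),
    Member("sister", "F", affected=False, alt_alleles=1),
    Member("mother", "F", affected=False, alt_alleles=1),
    Member("father", "M", affected=False, alt_alleles=0),
]

print(fits_x_linked_recessive(family))  # True
```

A variant carried by the unaffected father would fail this check, which is why his genotype is informative even though he has no phenotype.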