Gene hunting strategies

  1. Pedigree-Based Methods
     • Positional cloning: identification of a gene for a particular disease based on its location in the genome, determined by a collection of methods including linkage analysis, genomic (physical) mapping, and bioinformatics.
     • Founder gene approach: makes use of the loss of, or limited, genetic diversity that occurs when a small group of individuals from a genetically diverse population is studied.
     Pedigree-Independent Methods
     • Candidate gene approach: tests for associations between genetic variations within pre-specified genes of interest and phenotypes or disease states.
     • Genome-wide association studies: examination of many common genetic variants in different individuals to see whether any variant is associated with the disease phenotype.

     | Sr No | Name | Web Address | Reference |
     |-------|------|-------------|-----------|
     | 1 | T1Dbase | http://www.t1dbase.org | (Hulbert et al. 2007) |
     | 2 | COSMIC | http://www.sanger.ac.uk/genetics/CGP/cosmic/ | (Forbes et al. 2008) |
     | 3 | The European Genome-Phenome Archive | https://www.ebi.ac.uk/ega/ | (Church et al. 2010) |
     | 4 | ModSNP | modsnp.expasy.org/ | (Yip et al. 2004) |
     | 5 | SwissVar | http://swissvar.expasy.org/ | (Mottaz et al. 2010) |
     | 6 | HGMD | http://www.hgmd.cf.ac.uk/ac/index.php | (Stenson et al. 2003) |
     | 7 | Catalog of published Genome Wide Association Studies (NHGRI) | http://www.genome.gov/gwastudies/ | (Gong et al. 2011) |
  2. Scope of a Genetic Association Study
     • Candidate gene
       ◦ Known functional variants
       ◦ Variants with unknown function in exons, introns, regulatory regions
     • Linkage candidate region
       ◦ Functional variants, or those with unknown function in candidate genes
       ◦ More general coverage of the region using many markers
     • Genome-wide
       ◦ Test for association with hundreds of thousands (or millions) of SNPs spread across the entire genome
  3. Background
     • There are two main types of genetic association studies:
       ◦ population-based case–control studies
       ◦ family-based studies
     • Studies can be hypothesis-driven (e.g., candidate gene, CG) or run without a prior hypothesis (e.g., GWAS).
     • Population-based (defined here as non-family-based) case–control studies have become the most popular design for finding the common polymorphisms thought to underlie complex traits (the 'common disease, common variant' hypothesis).
  4. Candidate genes (CGs)
     • Target genes with a previously reported role in the trait in question.
     • Cost-effective when the focus is on a few genes.
     • Only a small number of markers is needed to capture the most common variation.
     • Candidate genes can be selected from biological pathways that harbor other previously associated risk loci.
  5. Goals
     • Use bioinformatics databases to:
       ◦ Determine basic properties of genes
       ◦ Identify common genetic variants in and around genes
       ◦ Characterize genetic variants in terms of frequency and functionality
  6. Possible Stages in Candidate-Gene Study Design
     • Select a candidate system
       ◦ Knowledge of the biology of the phenotype
     • Select candidate genes in the system
       ◦ 1. Expert opinion
       ◦ 2. Literature search
       ◦ 3. Pathway analysis
       ◦ 4. (Positional)
     • Select genetic variants in the candidate genes
       ◦ 1. Literature search
       ◦ 2. Bioinformatic databases
       ◦ 3. SNP tagging
  7. UCSC Genome Browser (http://genome.ucsc.edu/)
     For a gene of interest:
     • Determine basic properties:
       ◦ Location
       ◦ Size, number of exons
     • Identify genetic variants:
       ◦ SNPs, indels, STRs
  8. GWAS: Genome-Wide Association Studies
     • Studies of genetic variation across the entire human genome.
     • Designed to identify associations between genetic markers and observable traits, or the presence/absence of a disease or condition.
     • The associated markers often have only a modest effect.
  9. Complex Traits: Multifactorial Inheritance
     • Examples
       ◦ Some cancers
       ◦ Schizophrenia
       ◦ Type 1 diabetes
       ◦ Type 2 diabetes
       ◦ Hypertension
       ◦ Alzheimer disease
       ◦ Rheumatoid arthritis
       ◦ Inflammatory bowel disease
       ◦ Asthma
     • Diagram: genetic variants and non-genetic factors both contribute to the trait.
  10. Genetic Association Studies
      • Short-term goal: identify genetic variants that explain differences in phenotype among individuals in a study population.
        ◦ Qualitative: disease status, presence/absence of a congenital defect
        ◦ Quantitative: blood glucose levels, % body fat
      • If an association is found, further study can follow to:
        ◦ Understand the mechanism of action and disease etiology in individuals
        ◦ Characterize relevance and/or impact in the more general population
      • Long-term goal: inform the process of identifying and delivering better prevention and treatment strategies.
  11. Steps: Specify the case definition
      • Consider the literature for a consensus definition of the disease of interest.
      • Following standard diagnostic guidelines allows other groups to replicate initial findings more easily, though it is not always the most powerful approach for initial gene detection.
      • If a consensus definition does not exist, consider all the evidence and decide on a specific definition that optimizes biological and clinical relevance.
  12. Determine whether the disease is heritable
      • Decide from all available evidence in familial aggregation studies whether there is sufficient evidence that the disease of interest is heritable.
      • Concordance rate: presence of the same trait in both members of a pair of twins.
      • If the heritability of a disease or subphenotype appears to be low (<20%) and the disease is common, very large sample sizes (in excess of 5,000 cases and 5,000 controls) will likely be required to find predisposing genetic variants using a population-based approach.
      Control selection
      • Controls should be matched to cases on age, gender and ethnicity.
  13. Catalog of GWAS Studies http://www.genome.gov/26525384
  14. Catalog of GWAS Studies http://www.genome.gov/26525384
  15. Manolio et al., J Clin Invest 2008
  16. Catalog of GWAS Studies http://www.genome.gov/26525384
  17. Genetic association studies (candidate gene or GWAS)
      • Direct genotyping: the actual causal polymorphism is typed.
      • Indirect genotyping: nearby genetic markers that are highly correlated with the causal polymorphism are typed. This takes advantage of the correlation between SNPs, called linkage disequilibrium (LD).
      Hirschhorn & Daly, Nat Rev Genet 2005
  18. Genome-wide association studies (GWAS)
  19. Examples of Multistage Designs in Genome-wide Association Studies. Pearson, T. A. et al. JAMA 2008;299:1335-1344
  20. GWAS Microarray (Affymetrix, http://www.affymetrix.com)
      • Assays ~0.7 to 5M SNPs (the number keeps increasing)
  21. Genotype calls (figure panels: good calls vs. bad calls)
  22. Quality control
      • Quality control refers to the procedures used to evaluate the genotyping performance of the samples and the genotyping array.
      • Because input DNA can degrade, plates can be mis-loaded and genotyping chips can fail to hybridize, it is important to review the performance of the samples before any definitive downstream analysis of the genotypes.
      • The process of calling genotypes is not error-free. It is therefore vital to identify and exclude SNPs with potentially high rates of missingness or erroneous genotypes.
  23. Sample quality control
      • The extent of missing genotypes and the heterozygosity of a sample are useful indicators of a poorly genotyped sample.
      • Samples with anomalously high rates for either of these two measures are often excluded from the outset.
      • High rates of missingness generally imply hybridization problems, which may be caused by faulty arrays or poor-quality DNA.
      • Excess heterozygosity can indicate sample contamination.
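The two per-sample measures above can be sketched as follows. This is a minimal illustration, not a prescribed pipeline: the genotype coding (0/1/2 copies of one allele, `None` for a missing call), the function name `sample_qc`, the 3% missingness cutoff and the 3-standard-deviation heterozygosity rule are all assumptions chosen for the example.

```python
from statistics import mean, pstdev


def sample_qc(genotypes, max_missing=0.03, het_sd=3.0):
    """Per-sample missingness and heterozygosity (illustrative thresholds).

    genotypes: dict mapping sample id -> list of calls, where each call is
    0, 1 or 2 (copies of one allele) or None for a missing genotype.
    Returns (stats, flagged): stats maps id -> (missing_rate, het_rate),
    flagged maps id -> list of reasons the sample looks problematic.
    """
    stats = {}
    for sid, calls in genotypes.items():
        called = [g for g in calls if g is not None]
        missing_rate = 1 - len(called) / len(calls)
        # Heterozygosity: share of called genotypes that are heterozygous.
        het_rate = (sum(1 for g in called if g == 1) / len(called)
                    if called else float("nan"))
        stats[sid] = (missing_rate, het_rate)

    # Compare each sample's heterozygosity with the cohort distribution.
    het_values = [h for _, h in stats.values() if h == h]  # drop NaN
    mu = mean(het_values) if het_values else 0.0
    sd = pstdev(het_values) if het_values else 0.0

    flagged = {}
    for sid, (miss, het) in stats.items():
        reasons = []
        if miss > max_missing:
            reasons.append("missingness")
        if sd > 0 and abs(het - mu) > het_sd * sd:
            reasons.append("heterozygosity")
        if reasons:
            flagged[sid] = reasons
    return stats, flagged


if __name__ == "__main__":
    demo = {
        "s1": [0, 1, 2, 0],
        "s2": [None, None, 1, 0],  # half of the calls are missing
        "s3": [0, 1, 0, 2],
    }
    print(sample_qc(demo))
```

In a real study the cutoffs would be chosen by inspecting the cohort-wide distributions of both measures rather than fixed in advance.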
  24. Sample quality control
      • Large-scale studies can unintentionally include related samples or accidentally duplicated samples.
      • Such cryptic relatedness is easy to infer by measuring allele sharing between samples.
      • Typically, the sample in each related pair with the fewest missing genotypes is retained in the study.
      • In family-based studies, the authenticity of the pedigree relationships can be checked by calculating the extent of Mendelian inconsistency (e.g., with the PedCheck software); inconsistent samples are excluded.
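A rough sketch of the allele-sharing check described above: it computes the proportion of alleles shared identical-by-state (IBS) for every pair of samples and, for pairs above a cutoff, drops the sample with more missing genotypes. The 0/1/2/`None` genotype coding, the function names and the 0.95 cutoff are assumptions for illustration; dedicated tools typically estimate identity-by-descent on a pruned SNP set instead of using raw IBS.

```python
from itertools import combinations


def ibs_proportion(a, b):
    """Proportion of alleles shared identical-by-state between two samples.

    a, b: equal-length lists of genotype calls coded 0/1/2, None if missing.
    SNPs where either call is missing are skipped.
    """
    shared = total = 0
    for g1, g2 in zip(a, b):
        if g1 is None or g2 is None:
            continue
        shared += 2 - abs(g1 - g2)  # 2, 1 or 0 alleles shared at this SNP
        total += 2
    return shared / total if total else float("nan")


def samples_to_drop(genotypes, threshold=0.95):
    """Return ids to exclude so that no highly similar pair remains.

    For each pair at or above the (illustrative) threshold, the sample
    with the fewest missing genotypes is kept, as on the slide.
    """
    missing = {sid: sum(g is None for g in calls)
               for sid, calls in genotypes.items()}
    drop = set()
    for s1, s2 in combinations(sorted(genotypes), 2):
        if s1 in drop or s2 in drop:
            continue
        if ibs_proportion(genotypes[s1], genotypes[s2]) >= threshold:
            drop.add(s1 if missing[s1] > missing[s2] else s2)
    return drop


if __name__ == "__main__":
    demo = {
        "a": [0, 1, 2, 1, 0, 2],
        "b": [0, 1, 2, 1, None, 2],  # duplicate of "a" with one missing call
        "c": [2, 0, 0, 1, 2, 0],
    }
    print(samples_to_drop(demo))
```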
  25. SNP quality control
      • Remove SNPs with a low call rate (e.g., <97%); the call rate is the proportion of genotypes actually called by the software.
      • Remove SNPs and individuals that have too much missing data.
      • Test each SNP for departure from Hardy-Weinberg equilibrium (e.g., with a chi-squared test).
      • Remove SNPs with a very low minor allele frequency.
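The three SNP-level filters above can be sketched in a few lines. The 97% call-rate cutoff comes from the slide; the minor-allele-frequency cutoff (1%), the Hardy-Weinberg p-value cutoff (1e-6), the genotype coding and the function names are assumptions made for this example. The Hardy-Weinberg check is the simple one-degree-of-freedom chi-squared test, which is only approximate when genotype counts are small.

```python
import math


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-squared test (1 df) for Hardy-Weinberg equilibrium.

    Takes the three observed genotype counts and returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of the first allele
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of chi-squared with 1 df: erfc(sqrt(x / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))


def snp_qc(calls, min_call_rate=0.97, min_maf=0.01, hwe_alpha=1e-6):
    """Apply call-rate, MAF and HWE filters to one SNP.

    calls: genotype calls across samples, coded 0/1/2, None if missing.
    Returns a dict with the metrics and an overall 'passes' flag.
    """
    called = [g for g in calls if g is not None]
    call_rate = len(called) / len(calls)
    if not called:
        return {"call_rate": 0.0, "maf": None, "hwe_p": None, "passes": False}
    counts = [called.count(k) for k in (0, 1, 2)]
    freq = (2 * counts[2] + counts[1]) / (2 * len(called))
    maf = min(freq, 1 - freq)
    _, hwe_p = hwe_chi2(*counts)
    passes = (call_rate >= min_call_rate and maf >= min_maf
              and hwe_p >= hwe_alpha)
    return {"call_rate": call_rate, "maf": maf, "hwe_p": hwe_p,
            "passes": passes}


if __name__ == "__main__":
    print(snp_qc([0] * 25 + [1] * 50 + [2] * 25))
```

In case-control data the Hardy-Weinberg filter is often applied to controls only, since a true disease association can itself distort genotype proportions in cases.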
  26. Population structure
      • Population structure refers to the genetic differences that exist between individuals from different groups, populations or geographical regions.
      • There are a number of established statistical strategies for detecting population structure. One commonly used in genome-wide studies is genomic control (GC), which estimates the degree of inflation of the test statistic.
      • Figure: a representation of how differences in genotypic (or allelic) frequencies across different populations can introduce false signals of association.
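Genomic control, as mentioned above, summarizes inflation with a single factor, usually written λ: the median of the observed 1-df chi-squared association statistics divided by the median expected under the null (about 0.455). The sketch below assumes the per-SNP statistics have already been computed; the function names are illustrative.

```python
from statistics import median

# Median of a chi-squared distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Genomic inflation factor: observed median / expected null median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def gc_correct(chi2_stats):
    """Divide every statistic by lambda; never inflate when lambda < 1."""
    lam = max(genomic_inflation(chi2_stats), 1.0)
    return [x / lam for x in chi2_stats]


if __name__ == "__main__":
    stats = [0.2, 2 * CHI2_1DF_MEDIAN, 3.0]
    print(genomic_inflation(stats), gc_correct(stats))
```

A λ close to 1 suggests little inflation; values well above 1 point to population structure, cryptic relatedness or technical artifacts.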
  27. Selection of Markers for Association Studies
      • The human genome consists of over 3 billion base pairs and has about 28,000 genes.
      • Individuals are identical for ~99.5% of their sequence, with the small remaining part variable to differing extents.
      • Could this variation have a role in explaining differences in genetic susceptibility to disease?
      • Approach: compare variation between diseased (case) and healthy (control) individuals from the same population.
      • If the frequency of a variant at a specific locus is >1%, it is said to be a polymorphism.
      • The most common class of polymorphisms is SNPs, which comprise ~90% of all human variation.
      • Other types are larger blocks of sequence variation (mini-/micro-satellites) and indels.
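One simple way to compare a variant between cases and controls, as described above, is an allelic chi-squared test on the 2x2 table of allele counts, reported together with an odds ratio. This is a sketch of one common test, not the only option; the function name, the genotype-count input format and the example counts are made up for illustration.

```python
import math


def allelic_test(case_counts, control_counts):
    """Allelic association test for one biallelic SNP.

    Each argument is a tuple of genotype counts (n_AA, n_Aa, n_aa).
    Returns (chi2, p_value, odds_ratio) for allele 'a' versus allele 'A'.
    """
    def allele_counts(genotype_counts):
        n_AA, n_Aa, n_aa = genotype_counts
        return 2 * n_aa + n_Aa, 2 * n_AA + n_Aa  # (count of a, count of A)

    a, b = allele_counts(case_counts)     # cases: a alleles, A alleles
    c, d = allele_counts(control_counts)  # controls: a alleles, A alleles
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("SNP is monomorphic or a group is empty")
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / margins
    # Survival function of chi-squared with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return chi2, p_value, odds_ratio


if __name__ == "__main__":
    # Hypothetical counts: 100 cases and 100 controls.
    print(allelic_test((30, 50, 20), (50, 40, 10)))
```

Because a genome-wide scan repeats this test at hundreds of thousands of SNPs, a p-value from a single test has to be judged against a much stricter significance threshold than 0.05.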
  28. Linkage disequilibrium (LD)
      • LD is the non-random association of alleles at two or more loci, which may or may not be on the same chromosome.
      • Which SNPs are in LD? dbSNP holds about 10 million SNPs.
      • The HapMap project determined tag SNPs, which can be used as proxies for other SNPs in LD with them, reducing the number of markers that need to be examined.
      • SNP-SNP association, or linkage disequilibrium, is fundamental to our ability to sample the whole genome with relatively few SNPs.
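To make the idea of SNP-SNP correlation concrete, the sketch below computes the standard pairwise LD measures D, D' and r² from the four haplotype counts at two biallelic SNPs. It assumes phased haplotype counts are available (with unphased genotypes they would first have to be estimated); the function name and the example counts are illustrative.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Pairwise LD between two biallelic SNPs from haplotype counts.

    n_AB etc. are counts of the four haplotypes (alleles A/a at the first
    SNP, B/b at the second). Returns (D, D_prime, r_squared).
    """
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n

    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("at least one SNP is monomorphic")

    # D: deviation of the observed haplotype frequency from independence.
    d = p_AB - p_A * p_B
    r_squared = d * d / denom

    # D' rescales |D| by its maximum possible value given allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max
    return d, d_prime, r_squared


if __name__ == "__main__":
    print(ld_stats(50, 0, 0, 50))    # alleles always travel together
    print(ld_stats(25, 25, 25, 25))  # alleles are independent
```

An r² near 1 means one SNP is an almost perfect proxy for the other, which is the property tag SNP selection relies on; a cutoff such as r² ≥ 0.8 is a common convention rather than a fixed rule.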
  29. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls (Nature, Vol 447, 7 June 2007), using the Affymetrix GeneChip 500K Mapping Array Set
  30. TaqMan Assay Process
  31. TaqMan assay system and mechanism of action
      • One of the best methods for SNP genotyping
      • Robust, reliable and very easy to prepare
      • Can be done in a 384-well plate
      • Very low genotyping error rate
      • The reaction can be run on a regular thermal cycler, but a real-time PCR detection system is necessary to scan the plates
  32. Output of TaqMan Assay
  33. Thanks…..