BioSMACK - Linux Live CD for GWAS

  1. BioSMACK: a Linux Live CD for Analysis of Genome-Wide Association
  2. BioSMACK: a Linux Live CD for Analysis of GWA. IEEE BIBM 2010 Workshop?
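  3. Osong, Guangzhou, Hong Kong
  4. University, BGI, airport, accommodation, Hong Kong Island, Hung Hom Station
  5. Northern Han, HapMap Chinese, Singapore Chinese, Hong Kong Chinese; 121 samples - Chinese University of Hong Kong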
  6. What is a Genome-Wide Association Study?
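  7. Timeline figure (1990-2011) of genomics milestones. Events as labeled on the slide (their year positions are not recoverable from the transcript):
     • HGP started; HGP completed
     • Illumina founded; Affymetrix/Illumina SNP chips developed
     • HapMap completed (2002~); KHapMap completed (2003~)
     • KNIH (Korea National Institute of Health) Genome Center established; joined the KNIH Genome Center
     • First GWAS: age-related macular degeneration, Science
     • Illumina, Infinium whole-genome genotyping (100,000 markers)
     • 23andMe founded
     • Science, Breakthrough of the Year: Human Genetic Variation
     • KARE project started; KAREBrowser developed
     • Whole genomes of Venter, Watson, Yang Huanming, and Dr. Seong-Jin Kim completed
     • 1000 Genomes Project started
     • PGP-10 data released
     • Seoul National University, Korean whole-genome paper in Nature
     • Nature Genetics: Korean GWAS results published
     • Science: migration routes of Koreans
     • Venter publishes a paper on DTC services
     • Genome Research Foundation launches the Korean Genome Project
     • Public personal genomes released (e.g. genomeunzipped)
     • 904 published GWAS for 165 traits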
  8. (1) Analyzing millions of genotypes requires substantial computing power and highly skilled specialists who can handle large data sets and multi-step analyses
     (2) Various software packages (e.g. PLINK, Eigensoft, STRUCTURE, and SnpMatrix) have been developed for GWAS
     (3) Researchers often run into problems when compiling and installing this software and when configuring environment parameters and library dependencies
  9. Problem solved??
  10. What is a Linux Live CD?
  11. • Linux is a free, open-source operating system
      • Many GWA software packages support Linux
      • A Linux live CD is a customized Linux system that boots from a CD or USB flash drive
  12. • Developers can build a Linux live CD for their own purposes (e.g. biology, chemistry, physics, games)
      • For biological data analysis: BioLinux, Open Discovery, GRIMP, BioConductorBuntu, and PhyLIS
      • GWAS methods are developing rapidly, so there is a need for a Live CD focused on GWAS
  13. How is BioSMACK implemented?
  14. • Based on open-source software (free to use and redistribute under the GNU General Public License)
      • Based on the Ubuntu Linux distribution (v5.5)
      • Ubuntu is among the most popular Linux distributions
      • GWA software is pre-compiled, installed, and configured
      • Command-line and Java Swing-based GUI for running GWA software
      • User manual and example data are also included
  15. • Call genotypes from genome-wide SNP chips
      • Convert raw genotype data to PLINK binary format
      • Detect population stratification
      • Run association analysis using PLINK
      • Estimate the genotypes of SNPs that were not observed in the GWAS (imputation)
      • Run meta-analysis in two-sample comparisons
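  16. • Included software: PLINK, SnpMatrix, EIGENSTRAT, STRUCTURE, RMETA, METAL, IMPUTE, MACH
      • HTML-based manual (labels from the slide figure): table of contents, command descriptions, example data, run commands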
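
The conversion and association steps in the workflow above can be sketched as two PLINK invocations. This is a minimal sketch, not BioSMACK's own scripts: it assumes the genotypes have already been exported as PLINK text files (PED/MAP), and the prefixes `mystudy` and `mystudy_bin` are hypothetical. By default the function only builds and returns the commands, so it can be tried without PLINK installed.

```python
import shutil
import subprocess


def plink_convert_cmd(text_prefix, bed_prefix):
    """Command that converts PED/MAP text files to PLINK binary (BED/BIM/FAM)."""
    return ["plink", "--file", text_prefix, "--make-bed", "--out", bed_prefix]


def plink_assoc_cmd(bed_prefix, out_prefix):
    """Command that runs PLINK's basic case/control association test."""
    return ["plink", "--bfile", bed_prefix, "--assoc", "--out", out_prefix]


def run_pipeline(text_prefix, work_prefix, dry_run=True):
    """Build both commands; execute them only if dry_run is False and plink is on PATH."""
    commands = [
        plink_convert_cmd(text_prefix, work_prefix),
        plink_assoc_cmd(work_prefix, work_prefix + "_assoc"),
    ]
    if not dry_run and shutil.which("plink") is not None:
        for cmd in commands:
            # check=True stops the pipeline if a step fails.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical file prefixes, printed as shell-style command lines.
    for cmd in run_pipeline("mystudy", "mystudy_bin"):
        print(" ".join(cmd))
```

Keeping the commands as argument lists rather than one shell string avoids quoting problems when file prefixes contain spaces.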
  17. How to install BioSMACK?
  18. (1) Download the BioSMACK ISO image file (about 1 GB), freely available at ksnp.cdc.go.kr/biosmack
      (2) Make a CD/DVD or a USB flash drive from the ISO image
      (3) Either install to the hard disk (erasing the previous operating system) or run without installing (boot from the CD/USB flash drive without making changes to the underlying operating system)
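
Because the image is about 1 GB, it may be worth checking the download before writing it to a CD or USB drive. The sketch below computes a SHA-256 digest in fixed-size chunks so the whole file never has to fit in memory. The slides do not mention a published checksum, so the expected value and the helper names `sha256_of_file` and `iso_matches` are assumptions for illustration only.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it chunk by chunk."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def iso_matches(path, expected_hex):
    """Compare a file's digest with an expected hex string (hypothetical value)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

If a reference digest were available, `iso_matches("biosmack.iso", reference_digest)` would return `True` for an intact download; the file name here is a placeholder.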
  19. Results and Future Work
  20. (1) Useful for education and for simple on-the-fly analysis without installation or configuration
      (2) BioSMACK was used on various laptops and netbooks at the 5th workshop of the Asian Institute in Statistical Genetics and Genomics
      (3) A fully functional research environment for GWAS can be set up on any computer within a couple of hours
  21. (1) Cloud computing: computing with resources acquired on demand
      (2) Cluster computing: support for parallel jobs through a job scheduler (e.g. Sun Grid Engine, OpenPBS, Torque)
      (3) Parallel software: high-performance, parallel, on-demand software for GWAS
      Planned: a BioSMACK AMI (Amazon Machine Image) for cloud computing, and parallel job scripts for HPC
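
One way a parallel job script for a scheduler could look is a Sun Grid Engine array job that runs one PLINK association test per chromosome. The slides do not show the planned scripts, so this is only an illustrative sketch: the job name `gwas_assoc`, the file prefixes, and the choice to split by autosome (1-22) are assumptions.

```python
def sge_array_script(bed_prefix, out_prefix, first_chr=1, last_chr=22):
    """Return the text of an SGE array-job script with one task per chromosome."""
    plink_line = (
        "plink --bfile {bed} --chr ${{SGE_TASK_ID}} --assoc "
        "--out {out}_chr${{SGE_TASK_ID}}"
    ).format(bed=bed_prefix, out=out_prefix)
    lines = [
        "#!/bin/bash",
        "#$ -N gwas_assoc",  # job name (hypothetical)
        "#$ -cwd",           # run in the submission directory
        "#$ -t {0}-{1}".format(first_chr, last_chr),  # array task range
        "# The scheduler sets SGE_TASK_ID to the index of each array task.",
        plink_line,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(sge_array_script("mystudy_bin", "results"), end="")
```

The generated text could be saved to a file and submitted with `qsub`; each task writes its own output prefix, so the per-chromosome results do not overwrite one another.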