Keynote given at BOSC, 2010.
Does the hype surrounding cloud match the reality?
Can we use clouds to solve the problems in provisioning IT services to support next-generation sequencing?
85. Gene Finding: DNA; HMM prediction; alignment with known proteins; alignment with fragments recovered in vivo; alignment with other genes and other species.
123. Compute architecture: CPUs on a fat network with a POSIX global filesystem, a batch scheduler and a data-store VS CPUs on a thin network, each with local storage, Hadoop/S3 and a data-store.
133. Past Collaborations: Data shared between four sequencing centres and a central Sequencing Centre + DCC.
134. Future Collaborations: Collaborations are short term: 18 months to 3 years. Sequencing Centre 1, Sequencing Centre 2A, Sequencing Centre 2B and Sequencing Centre 3, with federated access.
136. Genomics Data: Data size per genome: intensities / raw data (2 TB); sequence + quality data (500 GB); alignments (200 GB); variation data (1 GB); individual features (3 MB). Data type: unstructured data (flat files) through to structured data (databases). Users: sequencing informatics specialists through to clinical researchers and non-informaticians.
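The per-genome sizes on slide 136 span about six orders of magnitude, which is easier to see with the arithmetic written out. The following is a minimal sketch, not part of the talk: it assumes decimal units (1 TB = 10^12 bytes), and the names `SIZES_PER_GENOME`, `reduction_factors` and `total_bytes` are hypothetical, chosen only for illustration.

```python
# Per-genome data sizes as listed on the slide.
# Assumption: decimal units (1 MB = 10**6 bytes, and so on).
MB = 10**6
GB = 10**9
TB = 10**12

SIZES_PER_GENOME = {
    "Intensities / raw data": 2 * TB,
    "Sequence + quality data": 500 * GB,
    "Alignments": 200 * GB,
    "Variation data": 1 * GB,
    "Individual features": 3 * MB,
}


def reduction_factors(sizes):
    """Return how many times smaller each tier is than the largest tier."""
    largest = max(sizes.values())
    return {name: largest / size for name, size in sizes.items()}


def total_bytes(sizes, genomes=1):
    """Return the total bytes needed to keep every tier for `genomes` genomes."""
    return genomes * sum(sizes.values())


if __name__ == "__main__":
    factors = reduction_factors(SIZES_PER_GENOME)
    for name, size in SIZES_PER_GENOME.items():
        print(f"{name}: {size / GB:g} GB ({factors[name]:,.0f}x smaller than raw)")
    print(f"All tiers, one genome: {total_bytes(SIZES_PER_GENOME) / TB:.3f} TB")
```

Under these assumptions, keeping every tier for a single genome needs roughly 2.7 TB, almost all of it raw intensities, while the variation data that most downstream users need is about 2,000 times smaller than the raw data.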