1. Cloud BioLinux: pre-configured and on-demand computing for genomics without institutional, geographic or economic boundaries
Ntino Krampis, PhD
JCVI-NIAID-UL workshop, South Africa, 2011
9. Problem 2: many commonly used bioinformatics tools are difficult to install;
10. they are usually available only as source code and need technical expertise
Acquiring the sequence data is only the first step
12. we are all using the cloud: Gmail, Google Docs, Yahoo! Mail, Facebook; you store and access data on a remote computer
13. cloud computers are rented pay-as-you-go from service providers such as Amazon Elastic Compute Cloud (EC2)
Solving problem 1: computational capacity on the cloud
16. used by companies that need additional computers without investing in hardware
17. physical locations: US East / West regions, EU, Singapore, Japan
18. researchers work in the closest location, then distribute results world-wide
19. democratizes access to computing resources regardless of institutional, economic or national boundaries
750 hours free for new users: http://aws.amazon.com/free/
Additional services besides computing and storage: http://aws.amazon.com
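As a rough sanity check on the 750 free hours quoted above, the arithmetic below converts instance-hours into days of continuous use. This is only a sketch: the function name is ours, and the actual terms of the free offer (which instance types qualify, and over what period) are set by Amazon and should be checked at the URL above.

```python
FREE_HOURS = 750  # figure quoted on the slide


def days_of_continuous_use(free_hours, instances=1):
    """Days that `instances` VMs can run around the clock within `free_hours`."""
    if instances < 1:
        raise ValueError("instances must be at least 1")
    return free_hours / (24 * instances)


print(days_of_continuous_use(FREE_HOURS))     # one VM running non-stop
print(days_of_continuous_use(FREE_HOURS, 3))  # three VMs sharing the allowance
```

For a single VM this works out to a little over 31 days, that is, roughly one month of uninterrupted use.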
21. a virtual machine (VM) is uploaded to the cloud and runs on on-demand computing capacity from the EC2 cloud service
22. can be accessed world-wide through a desktop / laptop computer with Internet access
23. removes the need for local computing infrastructure at each laboratory
How does cloud computing work?
[Figure: local desktop computers connect over the Internet to VMs running on the remote Amazon EC2 cloud computing service]
25. Cloud BioLinux offers a VM on the cloud with 100+ pre-installed and configured bioinformatics tools
26. sequence analysis, de novo assembly, annotation, phylogeny, molecular modeling, gene expression
27. a researcher can launch a practically unlimited number of VMs for large-scale data analysis
Solving problem 2: Cloud BioLinux
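One simple way to use several VMs for a large analysis is to give each VM its own share of the input files. The sketch below deals files out round-robin; it is an illustration rather than part of Cloud BioLinux, and both the function name and the sample file names are hypothetical.

```python
def split_across_vms(items, n_vms):
    """Deal `items` out round-robin into `n_vms` batches, one batch per VM."""
    if n_vms < 1:
        raise ValueError("n_vms must be at least 1")
    batches = [[] for _ in range(n_vms)]
    for index, item in enumerate(items):
        batches[index % n_vms].append(item)
    return batches


# Hypothetical input files; the names are illustrative only.
samples = [f"sample_{i:02d}.fastq" for i in range(1, 8)]
for vm_number, batch in enumerate(split_across_vms(samples, 3), start=1):
    print(f"VM {vm_number}: {batch}")
```

Round-robin keeps the batch sizes within one file of each other, which is adequate when the files are of similar size; very uneven inputs would call for balancing by file size instead.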
28. sign in to the Amazon EC2 cloud control console: http://aws.amazon.com/console
Username: [email_address]
Password: SAcloud!
Starting our tutorial: using the cloud
29. Launch Cloud BioLinux through the EC2 cloud console: click the Launch Instance button
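The console step above also has a programmatic counterpart. The sketch below uses boto3, the current AWS SDK for Python, which postdates this 2011 workshop and is our assumption, not the tutorial's method. The AMI ID, instance type and region shown are placeholders, and the `launch` function needs boto3 installed and AWS credentials configured.

```python
def build_launch_request(ami_id, instance_type, count=1):
    """Assemble keyword arguments for the EC2 RunInstances call."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "MinCount": count,
        "MaxCount": count,
    }


def launch(ami_id, instance_type, count=1, region="us-east-1"):
    """Start `count` instances and return their instance IDs."""
    import boto3  # imported here so the rest of the sketch runs without it

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.run_instances(
        **build_launch_request(ami_id, instance_type, count)
    )
    return [instance["InstanceId"] for instance in response["Instances"]]


# Placeholder values: substitute the real Cloud BioLinux AMI ID
# and an instance type suited to the analysis.
print(build_launch_request("ami-00000000", "t2.micro"))
```

Calling `launch(...)` with a real AMI ID starts billable instances, so the example only prints the request it would send.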
40. Amazon EC2 education-research grants: http://aws.amazon.com/education/
Any questions before we get to the exercises?
42. Connecting remotely to Cloud BioLinux
click the NX client icon on your computer's desktop:
A. paste the instance's DNS name in the “Host” box
B. select “Unix”, “Gnome”, and the remote desktop size
C. Login: “ubuntu” is the default user, “workshop” is the password we set
57. save and share the Virtual Machine (VM) containing your analysis results with a collaborator
storage costs: $0.10 / GB / month
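To make the storage price concrete, the sketch below multiplies the quoted rate by image size and duration. The 20 GB image size is a made-up example, and the rate is the one stated on the slide, which may no longer match current pricing.

```python
RATE_USD_PER_GB_MONTH = 0.10  # rate quoted on the slide


def storage_cost_usd(size_gb, months=1, rate=RATE_USD_PER_GB_MONTH):
    """Cost in US dollars of keeping `size_gb` gigabytes for `months` months."""
    if size_gb < 0 or months < 0:
        raise ValueError("size_gb and months must be non-negative")
    return round(size_gb * months * rate, 2)


print(storage_cost_usd(20))      # hypothetical 20 GB VM image, one month
print(storage_cost_usd(20, 12))  # the same image kept for a year
```

At this rate the hypothetical 20 GB image costs 2 dollars a month, or 24 dollars a year.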
58. authorize access to the VM: public or for certain users
other researchers can access the VM with all the software, data and analysis results directly on the cloud
Cloud BioLinux: whole system snapshot exchange
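The two sharing modes above (public, or restricted to certain users) can be sketched with boto3, assuming the VM has been saved as an Amazon Machine Image (AMI). This is our assumption, not the workshop's procedure; the account ID and region are placeholders, and `share_image` needs boto3 and configured AWS credentials.

```python
def launch_permission(user_ids=None, public=False):
    """Build the LaunchPermission payload for EC2 ModifyImageAttribute."""
    if public:
        return {"Add": [{"Group": "all"}]}
    if not user_ids:
        raise ValueError("give at least one AWS account ID, or set public=True")
    return {"Add": [{"UserId": user_id} for user_id in user_ids]}


def share_image(ami_id, user_ids=None, public=False, region="us-east-1"):
    """Grant launch permission on `ami_id` to the given accounts or to everyone."""
    import boto3  # imported here so the rest of the sketch runs without it

    ec2 = boto3.client("ec2", region_name=region)
    ec2.modify_image_attribute(
        ImageId=ami_id,
        LaunchPermission=launch_permission(user_ids, public),
    )


# Placeholder 12-digit account ID, for illustration only.
print(launch_permission(user_ids=["123456789012"]))
print(launch_permission(public=True))
```

Sharing with specific account IDs keeps the image private to named collaborators, while the public option exposes the software, data and results to any AWS user.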
59. Acknowledgments & Credits
Brad Chapman, Tim Booth, Bela Tiwari, Dawn Field: Cloud BioLinux development
Deepak Singh and AWS: compute credits on EC2 supporting initial development
J. Craig Venter Inst.: sponsorship / time allowed to work on this project
D. Gomez, E. Navarro, J. Shao, I. Singh, D. Edwards, M. Stout: JCVI tech innovation
Members of the Cloud BioLinux community: Enis Afgan, Michael Heuer, Richard Holland, Mark Jensen, Dave Messina, Steffen Möller, Roman Valls
Thank you!