Cloud BioLinux: Pre-configured Bioinformatics
                    Computing for the Genomics Community



                                         Ntino Krampis
                                  Asst. Professor - Informatics
                                    J. Craig Venter Institute

                                       kkrampis@jcvi.org
                          http://www.jcvi.org/cms/about/bios/kkrampis/



Tuesday, November 6, 12
J. Craig Venter Institute (JCVI)

                •     Human Microbiome
                      Project (Nelson et al. Science
                      2010; 328: 994–99)


                •     NIH funded, launched in
                      2008, $115 million

                •     metagenomic sequencing
                      of microbial genomes
                      from the human body

                •     sequence everything in
                      sample, use informatics to
                      separate genomes

J. Craig Venter Institute

            •     Global Ocean Survey
                  (first publication, Venter et al.
                  Science 2004; 304: 66-74)


            •     metagenomic sequencing
                  of microbes from oceans
                  around the world

            •     Darwin’s route ?

            •     Numbers: HMP > 2 mil.
                  new proteins, GOS > 1.2




Big Data and sequencing



     •     JCVI sequencing facility:
           454, Solexa, HiSeq, and
           IonTorrent on the way

     •     Processed data: size
           information content

     •     But... look at SOLiD 3

                                          Source: http://www.politigenomics.com/next-generation-sequencing-informatics




JCVI: sequencing and computing
                                      infrastructure

                •         “big” sequencing needs
                          large-scale informatics

                •         ~1000 node Grid Engine
                          cluster

                •         research with Hadoop /
                          MapReduce, and a small
                          private cloud

                •         50+ bioinformaticians and
                          software developers



A new paradigm:
                          Low-cost, bench-top sequencers

              •      GS Junior (454), MiSeq (Illumina)

              •      complete sequencing of
                     bacterial, viral, fungal genomes

              •      RNA-seq (gene expression),
                     ChIP-seq (protein-DNA interactions),
                     gene variant discovery

              •      sequencing as a standard
                     technique in basic genetics
                     research - like PCR ?



Will smaller academic labs become the long tail of sequencing?

[Figure: long-tail plot. Vertical axis: amount of sequencing. Horizontal axis: number of labs. Head of the plot: "sequencing factories" (JCVI, Broad Inst., Washington Univ., Inst. of Genome Sciences). Tail: small academic labs with bench-top sequencers.]

Sequencers shipped without clusters

                    •     Problem A : sequence
                          analysis requires
                          computational capacity

                    •     genome assembly, BLAST,
                          gene finders - annotation

                    •     Problem B: bioinformatics
                          tools need software
                          engineering expertise

                    •     Unix/Linux operating
                          systems, maintaining
                          software libraries,
                          compiling source code

Each lab builds a cluster?

                    •     need additional funds to
                          buy the hardware

                    •     funds for personnel to
                          maintain the cluster and
                          software

                    •     duplication of effort
                          across labs

                    •     sub-optimal utilization of
                          the hardware



Centralized bioinformatics services

                    •     Bioinformatic Resource
                          Centers, e.g. GSCID

                    •     bioinformatic services
                          usually coupled with
                          sequencing of a genome

                    •     provide mostly data access
                          to external PIs

                    •     cannot support every
                          lab with a sequencer



Problem A : sequence analysis requires
                            computational capacity

                   •      Amazon Elastic Compute
                          Cloud (EC2), pay-by-the-
                          hour computing

                   •      cloud servers cost
                          $0.085 - $2 per hour

                   •      max capacity 64GB RAM /
                          8 CPU (can boot
                          hundreds of servers)

                   •      world-wide data centers

                   •      750 hours free for new users:
                          aws.amazon.com/free/

                   •      free compute for teaching:
                          aws.amazon.com/grants/
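
The hourly prices quoted above translate into a run cost with simple arithmetic. The following is a minimal sketch, assuming a hypothetical workload of 10 servers for 24 hours and assuming that partial hours are billed as full hours; the function name `estimate_cost` is illustrative and not part of any Amazon tool.

```python
import math

# Hourly price range quoted on the slide (USD per server-hour).
LOW_RATE = 0.085
HIGH_RATE = 2.00


def estimate_cost(servers: int, hours: float, rate: float) -> float:
    """Return the pay-by-the-hour cost of running `servers` VMs for `hours`.

    Assumption: a partial hour is billed as a full hour, so hours are rounded up.
    """
    if servers < 0 or hours < 0 or rate < 0:
        raise ValueError("servers, hours and rate must be non-negative")
    billed_hours = math.ceil(hours)
    return round(servers * billed_hours * rate, 2)


# Hypothetical workload: 10 servers for a 24-hour analysis run.
low = estimate_cost(10, 24, LOW_RATE)
high = estimate_cost(10, 24, HIGH_RATE)
print(f"estimated cost: ${low:.2f} to ${high:.2f}")
```

The spread between the two figures shows why the choice of server size matters more than the number of hours for a small lab budget.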



Cloud Computing and Virtualization


                    •     OS, software and data,
                          pre-installed in Virtual
                          Machine (VM)

                    •     cloud provider: hardware
                          and virtualization layer

                    •     VM is a full-featured
                          server in a single file

                    •     VM transfer on private
                          cloud

                                                     Credit: VMware Inc.



Problem B: bioinformatics tools need
                         software engineering expertise

            •     VM with pre-installed software
                  on the cloud

            •     avoid compiling source code, or
                  other software dependencies

            •     rent computational capacity, on
                  a pay as you go basis

            •     run the VM on the closest
                  Amazon data center




Solving Problems A & B :
                                       Cloud BioLinux

                    •     Cloud BioLinux: publicly
                          accessible VM on EC2

                    •     100+ pre-installed
                          bioinformatics tools

                    •     remote desktop for non-
                          command line experts

                    •     you can create a cluster with
                          Cloud BioLinux - CloudMan

Krampis K, Booth T, Chapman B, Tiwari B, Bicak M, Field D, Nelson K.
Cloud BioLinux: pre-configured and on-demand bioinformatics computing for the genomics community.
BMC Bioinformatics. 2012 Mar 19; 13: 42.



Accessing Cloud BioLinux




                           http://aws.amazon.com/console

Launch through the EC2 cloud console




Amazon EC2 VM launch wizard



                                       cloudbiolinux.org




Cloud BioLinux desktop
                              remote connection
        tinyurl.com/bootcloud1   tinyurl.com/bootcloud2




Data exchange on the cloud
                                VM snapshots




Cloud computing research at JCVI

                    •     open-source cloud
                          platforms, fully compatible
                          with Amazon EC2

                    •     active funding, NIAID viral
                          genomics pipeline on cloud

                    •     end-to-end, sequence to
                          assembly, annotation,
                          visualization via Galaxy

                    •     run on Amazon, private
                          cloud, or desktop


Scriptable Cloud Infrastructures




              •      Fabric framework

              •      Cloud BioLinux VM
                     configuration in plain text

              •      high-level configuration,
                     software groups

              •      each group lists individual
                     bioinformatics tools
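
The idea of a plain-text configuration made of software groups can be sketched as follows. This is an illustration only: the `group: tool tool` text format, the group names and the function names are assumptions made for this example, not the actual Cloud BioLinux configuration files.

```python
# Hypothetical plain-text configuration: one software group per line.
CONFIG_TEXT = """\
# group name: tools in the group (illustrative format)
assembly: velvet abyss
alignment: blast bowtie
visualization: artemis
"""


def parse_groups(text):
    """Parse 'group: tool tool ...' lines into a dict of group -> tool list."""
    groups = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        name, _, tools = line.partition(":")
        groups[name.strip()] = tools.split()
    return groups


def select_tools(groups, wanted):
    """Return the tools of the wanted groups, in order, without duplicates."""
    tools = []
    for name in wanted:
        for tool in groups[name]:
            if tool not in tools:
                tools.append(tool)
    return tools


groups = parse_groups(CONFIG_TEXT)
print(select_tools(groups, ["assembly", "alignment"]))
```

Keeping the groups in a text file means a lab can switch a whole category of tools on or off by editing one line.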


              •      Python Fabric leverages
                     Linux packages (APT
                     repositories)

              •      mix and match software
                     from repositories

              •      share VM configuration as
                     source code

              •      clone across clouds
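
Installing software from APT repositories comes down to running a small number of shell commands on the VM. The sketch below only builds those command strings with the Python standard library and executes nothing. It does not use the Fabric API, the package names are placeholders, and `apt_install_commands` is a hypothetical helper.

```python
import shlex


def apt_install_commands(packages):
    """Return shell commands that refresh APT metadata and install packages.

    Duplicates are dropped and names are sorted so the output is reproducible.
    """
    if not packages:
        return []
    quoted = " ".join(shlex.quote(p) for p in sorted(set(packages)))
    return ["sudo apt-get update", f"sudo apt-get install -y {quoted}"]


# Placeholder package names, for illustration only.
commands = apt_install_commands(["velvet", "blast2", "velvet"])
for cmd in commands:
    print(cmd)
```

Because the package list is ordinary source code, it can be shared and versioned, which is what makes a VM configuration reproducible across clouds.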


Scalable Data Analysis


            •     Cloud BioLinux + Cloudman

            •     dual role : Master / Worker

            •     Cloud BioLinux VM, has
                  Cloudman scripts that start
                  more copies of itself

            •     Grid Engine (SGE) cluster

            •     http://usecloudman.org/

Afgan, E., Chapman, B. et al. (2012). Using Cloud Computing Infrastructure with CloudBioLinux, CloudMan, and Galaxy. Current Protocols in Bioinformatics, 11-9.
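
The master/worker arrangement described above can be illustrated with a toy scheduler. This is a conceptual sketch only: on a real cluster Grid Engine does the scheduling and CloudMan starts the workers. The file names, the jobs-per-worker figure and both function names are assumptions made for this example.

```python
import math


def workers_needed(n_jobs, jobs_per_worker):
    """Return how many worker VMs the master should start (at least one)."""
    if jobs_per_worker < 1:
        raise ValueError("jobs_per_worker must be at least 1")
    return max(1, math.ceil(n_jobs / jobs_per_worker))


def assign_jobs(jobs, n_workers):
    """Distribute jobs round-robin; returns one job list per worker."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    buckets = [[] for _ in range(n_workers)]
    for index, job in enumerate(jobs):
        buckets[index % n_workers].append(job)
    return buckets


# Hypothetical input files, one analysis job per file.
jobs = [f"sample{i}.fastq" for i in range(1, 6)]
n_workers = workers_needed(len(jobs), jobs_per_worker=2)
plan = assign_jobs(jobs, n_workers)
for worker_id, worker_jobs in enumerate(plan, start=1):
    print(f"worker {worker_id}: {worker_jobs}")
```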



Goodies with Cloud BioLinux




From sequencer to the cloud




                                                 credit:
                                         basespace.illumina.com




Acknowledgments

                    •     Cloud BioLinux community:
                          Brad Chapman, Enis Afgan, Tim
                          Booth, Mesude Bicak, Dawn Field

                    •     JCVI collaborators: Alex Richter,
                          Ravi Sanka, Andrey Tovchigrechko,
                          Johannes Goll, Karen Nelson, Bill
                          Nierman, JCVI IT support

                    •     NIAID for funding:
                          Maria Giovani, Punam Mathur

                    •     Links: cloudbiolinux.org,
                          groups.google.com/group/cloudbiolinux,
                          tinyurl.com/cloudboot1, tinyurl.com/cloudboot2,
                          slideshare.com/agbiotec

                    •     Contact: kkrampis@jcvi.org




                                              Thank you !
Tuesday, November 6, 12

More Related Content

What's hot

Call for non-coding mRNA resource
Call for non-coding mRNA resourceCall for non-coding mRNA resource
Call for non-coding mRNA resourceMatthias Harbers
 
White Paper: Next-Generation Genome Sequencing Using EMC Isilon Scale-Out NAS...
White Paper: Next-Generation Genome Sequencing Using EMC Isilon Scale-Out NAS...White Paper: Next-Generation Genome Sequencing Using EMC Isilon Scale-Out NAS...
White Paper: Next-Generation Genome Sequencing Using EMC Isilon Scale-Out NAS...EMC
 
2011 jeroen vanhoudt_ngs
2011 jeroen vanhoudt_ngs2011 jeroen vanhoudt_ngs
2011 jeroen vanhoudt_ngsDin Apellidos
 
DNA analysis on your laptop: Spot the differences
DNA analysis on your laptop: Spot the differencesDNA analysis on your laptop: Spot the differences
DNA analysis on your laptop: Spot the differencesBarbera van Schaik
 
Galaxy dna-seq-variant calling-presentationandpractical_gent_april-2016
Galaxy dna-seq-variant calling-presentationandpractical_gent_april-2016Galaxy dna-seq-variant calling-presentationandpractical_gent_april-2016
Galaxy dna-seq-variant calling-presentationandpractical_gent_april-2016Prof. Wim Van Criekinge
 
Jan2016 dnanexus giab uses andrew carroll
Jan2016 dnanexus giab uses andrew carrollJan2016 dnanexus giab uses andrew carroll
Jan2016 dnanexus giab uses andrew carrollGenomeInABottle
 
Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...
Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...
Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...QIAGEN
 
How to cluster and sequence an ngs library (james hadfield160416)
How to cluster and sequence an ngs library (james hadfield160416)How to cluster and sequence an ngs library (james hadfield160416)
How to cluster and sequence an ngs library (james hadfield160416)James Hadfield
 
NGS overview
NGS overviewNGS overview
NGS overviewAllSeq
 
Introduction to Galaxy and RNA-Seq
Introduction to Galaxy and RNA-SeqIntroduction to Galaxy and RNA-Seq
Introduction to Galaxy and RNA-SeqEnis Afgan
 
A Tovchigrechko - MGTAXA: a toolkit and webserver for predicting taxonomy of ...
A Tovchigrechko - MGTAXA: a toolkit and webserver for predicting taxonomy of ...A Tovchigrechko - MGTAXA: a toolkit and webserver for predicting taxonomy of ...
A Tovchigrechko - MGTAXA: a toolkit and webserver for predicting taxonomy of ...Jan Aerts
 
NGx Sequencing 101-platforms
NGx Sequencing 101-platformsNGx Sequencing 101-platforms
NGx Sequencing 101-platformsAllSeq
 
Analyzing the exome—focusing your NGS analysis with high performance target c...
Analyzing the exome—focusing your NGS analysis with high performance target c...Analyzing the exome—focusing your NGS analysis with high performance target c...
Analyzing the exome—focusing your NGS analysis with high performance target c...Integrated DNA Technologies
 
NGS Targeted Enrichment Technology in Cancer Research: NGS Tech Overview Webi...
NGS Targeted Enrichment Technology in Cancer Research: NGS Tech Overview Webi...NGS Targeted Enrichment Technology in Cancer Research: NGS Tech Overview Webi...
NGS Targeted Enrichment Technology in Cancer Research: NGS Tech Overview Webi...QIAGEN
 
2013 pag-equine-workshop
2013 pag-equine-workshop2013 pag-equine-workshop
2013 pag-equine-workshopc.titus.brown
 
Aug2015 analysis team spiral genetics
Aug2015 analysis team spiral geneticsAug2015 analysis team spiral genetics
Aug2015 analysis team spiral geneticsGenomeInABottle
 
Erlang Cache
Erlang CacheErlang Cache
Erlang Cacheice j
 

What's hot (20)

Call for non-coding mRNA resource
Call for non-coding mRNA resourceCall for non-coding mRNA resource
Call for non-coding mRNA resource
 
White Paper: Next-Generation Genome Sequencing Using EMC Isilon Scale-Out NAS...
White Paper: Next-Generation Genome Sequencing Using EMC Isilon Scale-Out NAS...White Paper: Next-Generation Genome Sequencing Using EMC Isilon Scale-Out NAS...
White Paper: Next-Generation Genome Sequencing Using EMC Isilon Scale-Out NAS...
 
2011 jeroen vanhoudt_ngs
2011 jeroen vanhoudt_ngs2011 jeroen vanhoudt_ngs
2011 jeroen vanhoudt_ngs
 
DNA analysis on your laptop: Spot the differences
DNA analysis on your laptop: Spot the differencesDNA analysis on your laptop: Spot the differences
DNA analysis on your laptop: Spot the differences
 
Galaxy dna-seq-variant calling-presentationandpractical_gent_april-2016
Galaxy dna-seq-variant calling-presentationandpractical_gent_april-2016Galaxy dna-seq-variant calling-presentationandpractical_gent_april-2016
Galaxy dna-seq-variant calling-presentationandpractical_gent_april-2016
 
Jan2016 dnanexus giab uses andrew carroll
Jan2016 dnanexus giab uses andrew carrollJan2016 dnanexus giab uses andrew carroll
Jan2016 dnanexus giab uses andrew carroll
 
Biotech autumn2012-02-ngs2
Biotech autumn2012-02-ngs2Biotech autumn2012-02-ngs2
Biotech autumn2012-02-ngs2
 
Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...
Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...
Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overvi...
 
How to cluster and sequence an ngs library (james hadfield160416)
How to cluster and sequence an ngs library (james hadfield160416)How to cluster and sequence an ngs library (james hadfield160416)
How to cluster and sequence an ngs library (james hadfield160416)
 
NGS overview
NGS overviewNGS overview
NGS overview
 
Introduction to Galaxy and RNA-Seq
Introduction to Galaxy and RNA-SeqIntroduction to Galaxy and RNA-Seq
Introduction to Galaxy and RNA-Seq
 
A Tovchigrechko - MGTAXA: a toolkit and webserver for predicting taxonomy of ...
A Tovchigrechko - MGTAXA: a toolkit and webserver for predicting taxonomy of ...A Tovchigrechko - MGTAXA: a toolkit and webserver for predicting taxonomy of ...
A Tovchigrechko - MGTAXA: a toolkit and webserver for predicting taxonomy of ...
 
NGx Sequencing 101-platforms
NGx Sequencing 101-platformsNGx Sequencing 101-platforms
NGx Sequencing 101-platforms
 
Analyzing the exome—focusing your NGS analysis with high performance target c...
Analyzing the exome—focusing your NGS analysis with high performance target c...Analyzing the exome—focusing your NGS analysis with high performance target c...
Analyzing the exome—focusing your NGS analysis with high performance target c...
 
Jan2016 pac bio giab
Jan2016 pac bio giabJan2016 pac bio giab
Jan2016 pac bio giab
 
2015 pag-metagenome
2015 pag-metagenome2015 pag-metagenome
2015 pag-metagenome
 
NGS Targeted Enrichment Technology in Cancer Research: NGS Tech Overview Webi...
NGS Targeted Enrichment Technology in Cancer Research: NGS Tech Overview Webi...NGS Targeted Enrichment Technology in Cancer Research: NGS Tech Overview Webi...
NGS Targeted Enrichment Technology in Cancer Research: NGS Tech Overview Webi...
 
2013 pag-equine-workshop
2013 pag-equine-workshop2013 pag-equine-workshop
2013 pag-equine-workshop
 
Aug2015 analysis team spiral genetics
Aug2015 analysis team spiral geneticsAug2015 analysis team spiral genetics
Aug2015 analysis team spiral genetics
 
Erlang Cache
Erlang CacheErlang Cache
Erlang Cache
 

Viewers also liked

TV Channels of the Future - AtticTV Pte Ltd
TV Channels of the Future - AtticTV Pte LtdTV Channels of the Future - AtticTV Pte Ltd
TV Channels of the Future - AtticTV Pte LtdJohnson Goh
 
55 ways to get more energy
55 ways to get more energy55 ways to get more energy
55 ways to get more energyHome
 
William Kosar What Every Budget Officer Should Know_Rwanda
William Kosar What Every Budget Officer Should Know_RwandaWilliam Kosar What Every Budget Officer Should Know_Rwanda
William Kosar What Every Budget Officer Should Know_RwandaWilliam Kosar
 
Lapcodex Aviones 1
Lapcodex Aviones 1Lapcodex Aviones 1
Lapcodex Aviones 1guest671e5e0
 
The Sociology of Nothingness: Challenges of Big Data
The Sociology of Nothingness: Challenges of Big DataThe Sociology of Nothingness: Challenges of Big Data
The Sociology of Nothingness: Challenges of Big DataEugen Glavan
 
20121119 Csusm Business Br
20121119 Csusm Business Br20121119 Csusm Business Br
20121119 Csusm Business BrSteveScheibe
 
Wicked Wiki
Wicked WikiWicked Wiki
Wicked Wikijactlc
 
Meet a geek - Marius Deak
Meet a geek - Marius DeakMeet a geek - Marius Deak
Meet a geek - Marius DeakGeekMeet
 
Fairgrounds Proposal
Fairgrounds ProposalFairgrounds Proposal
Fairgrounds Proposalguest12a2146
 
Webinar: How to Conduct Unmoderated Remote Usability Testing
Webinar: How to Conduct Unmoderated Remote Usability TestingWebinar: How to Conduct Unmoderated Remote Usability Testing
Webinar: How to Conduct Unmoderated Remote Usability TestingUserZoom
 
Rdash demo
Rdash demoRdash demo
Rdash demoJobs2web
 
Representing chemicals using OWL, Description Graphs and Rules
Representing chemicals using OWL, Description Graphs and RulesRepresenting chemicals using OWL, Description Graphs and Rules
Representing chemicals using OWL, Description Graphs and RulesMichel Dumontier
 
MNCs Presentation
MNCs PresentationMNCs Presentation
MNCs Presentationmd11mn
 
New Libertarian Manifesto
New Libertarian ManifestoNew Libertarian Manifesto
New Libertarian Manifestoguest12a2146
 
Formtech Composites Composite Material Substitution In Formula 1 Polymer ...
Formtech Composites   Composite Material Substitution In Formula 1   Polymer ...Formtech Composites   Composite Material Substitution In Formula 1   Polymer ...
Formtech Composites Composite Material Substitution In Formula 1 Polymer ...presspley
 
Generell presentasjon
Generell presentasjonGenerell presentasjon
Generell presentasjonGlenn Melby
 
(Online) Censorship in Southeast Asia | #rp15
(Online) Censorship in Southeast Asia | #rp15(Online) Censorship in Southeast Asia | #rp15
(Online) Censorship in Southeast Asia | #rp15Sascha Funk
 
Molecular symmetry and specialization of atomic connectivity by class-based r...
Molecular symmetry and specialization of atomic connectivity by class-based r...Molecular symmetry and specialization of atomic connectivity by class-based r...
Molecular symmetry and specialization of atomic connectivity by class-based r...Michel Dumontier
 

Viewers also liked (20)

TV Channels of the Future - AtticTV Pte Ltd
TV Channels of the Future - AtticTV Pte LtdTV Channels of the Future - AtticTV Pte Ltd
TV Channels of the Future - AtticTV Pte Ltd
 
Yoshida thesis
Yoshida thesisYoshida thesis
Yoshida thesis
 
55 ways to get more energy
55 ways to get more energy55 ways to get more energy
55 ways to get more energy
 
Lourenza
LourenzaLourenza
Lourenza
 
William Kosar What Every Budget Officer Should Know_Rwanda
William Kosar What Every Budget Officer Should Know_RwandaWilliam Kosar What Every Budget Officer Should Know_Rwanda
William Kosar What Every Budget Officer Should Know_Rwanda
 
Lapcodex Aviones 1
Lapcodex Aviones 1Lapcodex Aviones 1
Lapcodex Aviones 1
 
The Sociology of Nothingness: Challenges of Big Data
The Sociology of Nothingness: Challenges of Big DataThe Sociology of Nothingness: Challenges of Big Data
The Sociology of Nothingness: Challenges of Big Data
 
20121119 Csusm Business Br
20121119 Csusm Business Br20121119 Csusm Business Br
20121119 Csusm Business Br
 
Wicked Wiki
Wicked WikiWicked Wiki
Wicked Wiki
 
Meet a geek - Marius Deak
Meet a geek - Marius DeakMeet a geek - Marius Deak
Meet a geek - Marius Deak
 
Fairgrounds Proposal
Fairgrounds ProposalFairgrounds Proposal
Fairgrounds Proposal
 
Webinar: How to Conduct Unmoderated Remote Usability Testing
Webinar: How to Conduct Unmoderated Remote Usability TestingWebinar: How to Conduct Unmoderated Remote Usability Testing
Webinar: How to Conduct Unmoderated Remote Usability Testing
 
Rdash demo
Rdash demoRdash demo
Rdash demo
 
Representing chemicals using OWL, Description Graphs and Rules
Representing chemicals using OWL, Description Graphs and RulesRepresenting chemicals using OWL, Description Graphs and Rules
Representing chemicals using OWL, Description Graphs and Rules
 
MNCs Presentation
MNCs PresentationMNCs Presentation
MNCs Presentation
 
New Libertarian Manifesto
New Libertarian ManifestoNew Libertarian Manifesto
New Libertarian Manifesto
 
Formtech Composites Composite Material Substitution In Formula 1 Polymer ...
Formtech Composites   Composite Material Substitution In Formula 1   Polymer ...Formtech Composites   Composite Material Substitution In Formula 1   Polymer ...
Formtech Composites Composite Material Substitution In Formula 1 Polymer ...
 
Generell presentasjon
Generell presentasjonGenerell presentasjon
Generell presentasjon
 
(Online) Censorship in Southeast Asia | #rp15
(Online) Censorship in Southeast Asia | #rp15(Online) Censorship in Southeast Asia | #rp15
(Online) Censorship in Southeast Asia | #rp15
 
Molecular symmetry and specialization of atomic connectivity by class-based r...
Molecular symmetry and specialization of atomic connectivity by class-based r...Molecular symmetry and specialization of atomic connectivity by class-based r...
Molecular symmetry and specialization of atomic connectivity by class-based r...
 

Similar to Ntino Cloud BioLinux Barcelona Spain 2012

HPC lab projects
HPC lab projectsHPC lab projects
HPC lab projectsJason Riedy
 
Big data solution for ngs data analysis
Big data solution for ngs data analysisBig data solution for ngs data analysis
Big data solution for ngs data analysisYun Lung Li
 
USENIX FAST2010参加報告
USENIX FAST2010参加報告USENIX FAST2010参加報告
USENIX FAST2010参加報告Ryousei Takano
 
Big Process for Big Data @ NASA
Big Process for Big Data @ NASABig Process for Big Data @ NASA
Big Process for Big Data @ NASAIan Foster
 
Hadoop for Bioinformatics: Building a Scalable Variant Store
Hadoop for Bioinformatics: Building a Scalable Variant StoreHadoop for Bioinformatics: Building a Scalable Variant Store
Hadoop for Bioinformatics: Building a Scalable Variant StoreUri Laserson
 
Docker in Open Science Data Analysis Challenges by Bruce Hoff
Docker in Open Science Data Analysis Challenges by Bruce HoffDocker in Open Science Data Analysis Challenges by Bruce Hoff
Docker in Open Science Data Analysis Challenges by Bruce HoffDocker, Inc.
 
E Afgan - Zero to a bioinformatics analysis platform in four minutes
E Afgan - Zero to a bioinformatics analysis platform in four minutesE Afgan - Zero to a bioinformatics analysis platform in four minutes
E Afgan - Zero to a bioinformatics analysis platform in four minutesJan Aerts
 
San diego-supercomputing-sc17-user-group
San diego-supercomputing-sc17-user-groupSan diego-supercomputing-sc17-user-group
San diego-supercomputing-sc17-user-groupinside-BigData.com
 
Whole Genome Sequencing - Data Processing and QC at SciLifeLab NGI
Whole Genome Sequencing - Data Processing and QC at SciLifeLab NGIWhole Genome Sequencing - Data Processing and QC at SciLifeLab NGI
Whole Genome Sequencing - Data Processing and QC at SciLifeLab NGIPhil Ewels
 
Foundations for the Future of Science
Foundations for the Future of ScienceFoundations for the Future of Science
Foundations for the Future of ScienceGlobus
 
Bionimbus - Northwestern CGI Workshop 4-21-2011
Bionimbus - Northwestern CGI Workshop 4-21-2011Bionimbus - Northwestern CGI Workshop 4-21-2011
Bionimbus - Northwestern CGI Workshop 4-21-2011Robert Grossman
 
2015 04 bio it world
2015 04 bio it world2015 04 bio it world
2015 04 bio it worldChris Dwan
 
Doing Research in the Cloud - NIH Workshop Dennis Gannon
Doing Research in the Cloud - NIH Workshop Dennis GannonDoing Research in the Cloud - NIH Workshop Dennis Gannon
Doing Research in the Cloud - NIH Workshop Dennis GannonMicrosoft Azure for Research
 
CLC bio presentation at 5th SFAF 6/3/2010
CLC bio presentation at 5th SFAF 6/3/2010CLC bio presentation at 5th SFAF 6/3/2010
CLC bio presentation at 5th SFAF 6/3/2010Saul Kravitz
 
March 2013 Bioinformatics Working Group
March 2013 Bioinformatics Working GroupMarch 2013 Bioinformatics Working Group
March 2013 Bioinformatics Working GroupGenomeInABottle
 
CLIMB System Introduction Talk - CLIMB Launch
CLIMB System Introduction Talk - CLIMB LaunchCLIMB System Introduction Talk - CLIMB Launch
CLIMB System Introduction Talk - CLIMB LaunchTom Connor
 
Global bigdata conf_01282013
Global bigdata conf_01282013Global bigdata conf_01282013
Global bigdata conf_01282013HPCC Systems
 

Similar to Ntino Cloud BioLinux Barcelona Spain 2012 (20)

HPC lab projects
HPC lab projectsHPC lab projects
HPC lab projects
 
Big data solution for ngs data analysis
Big data solution for ngs data analysisBig data solution for ngs data analysis
Big data solution for ngs data analysis
 
USENIX FAST2010参加報告
USENIX FAST2010参加報告USENIX FAST2010参加報告
USENIX FAST2010参加報告
 
HiPipe Professional
HiPipe ProfessionalHiPipe Professional
HiPipe Professional
 
Beyond the Science Gateway
Beyond the Science GatewayBeyond the Science Gateway
Beyond the Science Gateway
 
Big Process for Big Data @ NASA
Big Process for Big Data @ NASABig Process for Big Data @ NASA
Big Process for Big Data @ NASA
 
Hadoop for Bioinformatics: Building a Scalable Variant Store
Hadoop for Bioinformatics: Building a Scalable Variant StoreHadoop for Bioinformatics: Building a Scalable Variant Store
Hadoop for Bioinformatics: Building a Scalable Variant Store
 
Docker in Open Science Data Analysis Challenges by Bruce Hoff
Docker in Open Science Data Analysis Challenges by Bruce HoffDocker in Open Science Data Analysis Challenges by Bruce Hoff
Docker in Open Science Data Analysis Challenges by Bruce Hoff
 

Ntino Cloud BioLinux Barcelona Spain 2012

  • 1. Cloud BioLinux: Pre-configured Bioinformatics Computing for the Genomics Community Ntino Krampis Asst. Professor - Informatics J. Craig Venter Institute kkrampis@jcvi.org http://www.jcvi.org/cms/about/bios/kkrampis/ Tuesday, November 6, 12
  • 2. J. Craig Venter Institute (JCVI) • Human Microbiome Project (Nelson et al. Science 2010; 328: 994–99) • NIH funded, launched in 2008, $115 million • metagenomic sequencing of microbial genomes from the human body • sequence everything in sample, use informatics to separate genomes
  • 3. J. Craig Venter Institute • Global Ocean Survey (first publication, Venter et al. Science 2004; 304: 66-74) • metagenomic sequencing of microbes from oceans around the world • Darwin’s route? • Numbers: HMP > 2 mil. new proteins, GOS > 1.2
  • 4. Big Data and sequencing • JCVI sequencing facility: 454, Solexa, HiSeq, and IonTorrent on the way • Processed data: size information content • But... look at SOLiD 3 • Source: http://www.politigenomics.com/next-generation-sequencing-informatics
  • 5. JCVI: sequencing and computing infrastructure • “big” sequencing needs large-scale informatics • ~1000 node Grid Engine cluster • research with Hadoop / MapReduce, and a small private cloud • 50+ bioinformaticians and software developers
  • 6. A new paradigm: Low-cost, bench-top sequencers • GS Junior (454), MiSeq (Illumina) • complete sequencing of bacterial, viral, fungal genomes • RNA-seq (gene expression), ChIP-seq (protein interactions), gene variant discovery • sequencing as a standard technique in basic genetics research, like PCR?
  • 7. Will smaller academic labs become the long tail of sequencing? [Figure: long-tail plot; y-axis: amount of sequencing; x-axis: number of labs; head of the curve: “sequencing factories” (JCVI, Broad Inst., Washington Univ., Inst. of Genome Sciences); tail: small academic labs with bench-top sequencers]
  • 8. Sequencers shipped without clusters • Problem A: sequence analysis requires computational capacity • genome assembly, BLAST, gene finders, annotation • Problem B: bioinformatics tools need software engineering expertise • Unix/Linux operating systems, maintaining software libraries, compiling source code
  • 9. Each lab builds a cluster? • need additional funds to buy the hardware • funds for personnel to maintain the cluster and software • duplication of effort across labs • sub-optimal utilization of the hardware
  • 10. Centralized bioinformatics services • Bioinformatic Resource Centers, e.g. GSCID • bioinformatic services usually coupled with sequencing of a genome • provide mostly data access to external PIs • cannot provide support to every lab with a sequencer
  • 11. Problem A: sequence analysis requires computational capacity • Amazon Elastic Compute Cloud (EC2), pay-by-the-hour computing • cloud servers cost $0.085 - $2 per hour • max capacity 64GB RAM / 8 CPU (can boot hundreds of servers) • world-wide data centers • 750 hours free for new users: aws.amazon.com/free/ • free compute for teaching: aws.amazon.com/grants/
  • 12. Cloud Computing and Virtualization • OS, software and data, pre-installed in a Virtual Machine (VM) • cloud provider: hardware and virtualization layer • VM is a full-featured server in a single file • VM transfer on private cloud • Credit: VMware Inc.
  • 13. Problem B: bioinformatics tools need software engineering expertise • VM with pre-installed software on the cloud • avoid compiling source code, or other software dependencies • rent computational capacity, on a pay-as-you-go basis • run the VM on the closest Amazon data center
  • 14. Solving Problems A & B: Cloud BioLinux • Cloud BioLinux: publicly accessible VM on EC2 • 100+ pre-installed bioinformatics tools • remote desktop for non-command-line experts • you can create a cluster with Cloud BioLinux via CloudMan • Krampis K, Booth T, Chapman B, Tiwari B, Bicak M, Field D, Nelson K. Cloud BioLinux: pre-configured and on-demand bioinformatics computing for the genomics community. BMC Bioinformatics. 2012 Mar 19; 13: 42.
  • 15. Accessing Cloud BioLinux http://aws.amazon.com/console
  • 16. Launch through the EC2 cloud console
  • 17. Amazon EC2 VM launch wizard cloudbiolinux.org
  • 19. Cloud BioLinux desktop remote connection tinyurl.com/bootcloud1 tinyurl.com/bootcloud2
  • 22. Data exchange on the cloud VM snapshots
  • 23. Cloud computing research at JCVI • open-source cloud platforms, fully compatible with Amazon EC2 • active funding, NIAID viral genomics pipeline on cloud • end-to-end: sequence to assembly, annotation, visualization via Galaxy • run on Amazon, private cloud, or desktop
  • 24. Scriptable Cloud Infrastructures: Fabric framework • Cloud BioLinux VM configuration in plain text • high-level configuration, software groups • each group lists individual bioinformatics tools
  • 25. Scriptable Cloud Infrastructures • Python Fabric leverages Linux packages (APT repositories) • mix and match software from repositories • share VM configuration as source code • clone across clouds • Krampis K, Booth T, Chapman B, Tiwari B, Bicak M, Field D, Nelson K. Cloud BioLinux: pre-configured and on-demand bioinformatics computing for the genomics community. BMC Bioinformatics. 2012 Mar 19; 13: 42.
  • 26. Scalable Data Analysis • Cloud BioLinux + CloudMan • dual role: Master / Worker • Cloud BioLinux VM has CloudMan scripts that start more copies of itself • Grid Engine (SGE) cluster • http://usecloudman.org/ • Afgan, E., Chapman, B. et al. (2012). Using Cloud Computing Infrastructure with CloudBioLinux, CloudMan, and Galaxy. Current Protocols in Bioinformatics, 11-9.
  • 27. Goodies with Cloud BioLinux
  • 28. Goodies with Cloud BioLinux
  • 29. From sequencer to the cloud credit: basespace.illumina.com
  • 30. Acknowledgments • Cloud BioLinux community: Brad Chapman, Enis Afgan, Tim Booth, Mesude Bicak, Dawn Field • JCVI collaborators: Alex Richter, Ravi Sanka, Andrey Tovichgrechko, Johannes Goll, Karen Nelson, Bill Nierman, JCVI IT support • NIAID and for funding: Maria Giovani, Punam Mathur • Links: cloudbiolinux.org, groups.google.com/group/cloudbiolinux, tinyurl.com/cloudboot1, tinyurl.com/cloudboot2, slideshare.com/agbiotec • Contact: kkrampis@jcvi.org • Thank you!
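
Slides 24 and 25 describe the VM configuration as plain text organized into software groups, each listing individual tools that are installed from Linux package repositories. The idea can be sketched roughly as follows. This is an illustration only, not the Cloud BioLinux file format or its Fabric scripts: the group names, the SOFTWARE_GROUPS mapping and both helper functions are hypothetical, and the package names are assumed to match what a Debian/Ubuntu repository provides.

```python
# Hypothetical software groups; names and contents are illustrative
# assumptions, not the actual Cloud BioLinux configuration.
SOFTWARE_GROUPS = {
    "alignment": ["bwa", "ncbi-blast+"],
    "assembly": ["velvet"],
    "utilities": ["samtools", "emboss"],
}


def packages_for(groups, catalog=SOFTWARE_GROUPS):
    """Expand group names into a sorted, de-duplicated package list."""
    unknown = [g for g in groups if g not in catalog]
    if unknown:
        raise KeyError(f"unknown software group(s): {unknown}")
    return sorted({pkg for g in groups for pkg in catalog[g]})


def apt_install_command(groups, catalog=SOFTWARE_GROUPS):
    """Build the apt-get argument list that would install the selected groups."""
    return ["apt-get", "install", "-y", *packages_for(groups, catalog)]


if __name__ == "__main__":
    # Only prints the command; nothing is installed.
    print(" ".join(apt_install_command(["alignment", "assembly"])))
```

Keeping the group-to-package mapping as plain data is what would let a configuration be shared as source code and replayed on another cloud, as slide 25 suggests; a tool such as Fabric would then run the resulting command on the remote VM.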