Pasteur Institute – Mobyle Developers Workshop




28 September 2012




                      Jennifer Dommer, HPC Web Developer
                      Alex Levitsky, HPC Infrastructure Team Lead
                      NIAID OCICB Bioinformatics & Computational
                        Biosciences Branch (BCBB)
Outline

 What is HPC Web?
 Project Goals and Background (5 min.)
 HPC Web Design (10 min.)
 Use of the Mobyle Framework in HPC Web (10 min.)
 HPC Web Video Demos (15 min.)
 BMID and BMPS Overview
 HPC Web Next Steps
 Questions/Discussion (10 min.)




What is HPC Web?
 Web application developed by National Institute of Allergy
  and Infectious Diseases (NIAID) Bioinformatics and
  Computational Biosciences Branch (BCBB)
 HPC Web Team:
  • Alex Levitsky, HPC Infrastructure Team Lead
  • Vivek Gopalan, Former HPC Infrastructure Team Lead
  • Jennifer Dommer, Software Developer
  • Jie Li, Former Software Developer
  • Ramandeep Kaur, Software Developer
  • Karlynn Noble, Designer/Communications
  • Darrell Hurt, Mariam Quinones, Andrew Oler, Vijay
    Nagarajan, Xavier Ambroggio, Kurt Wollenberg, Mike
    Dolan, Burke Squires, Maarten Leerkes, Subject Matter
    Experts
  • Nick Weber, Project Manager
  • Tram Huyen, Project Sponsor
What is HPC Web?

 Web interface to NIAID High Performance Computing
  (HPC) cluster
 Leverages Mobyle framework for job submission, data
  management, and pipeline creation




NIAID HPC Cluster Configuration




Project Goals

 Democratize access to high performance computing
  resources
  • Allow non-command-line-savvy bench researchers
    to access sophisticated computational tools and
    infrastructure for their high-throughput research data
 Provide capabilities to:
  • Engage an interactive user community
  • Access, manage, and share HPC files through an
    intuitive web interface
  • Run, track progress, and re-run jobs using simple
    web forms and interfaces
  • Create simple, automated analysis pipelines
Project Background
 2010
  • NIAID HPC infrastructure established
   – Small cluster of ~5 nodes, 30 cores
 • Late 2010 HPC Web v1 released
   – Static content about how to use HPC resources, which
     applications were installed, and how to use them
   – Frameworks established, including integration of Mobyle
   – Simple functionality for requesting accounts and support,
     viewing cluster status, engaging with community, etc.
   – Integrated with custom UCSC Track Manager application
 2011
  • HPC Web phase II development began
   – Cluster had grown from 5 to nearly 40 nodes, from 30 to nearly
     400 cores
   – Project scope to include job submission, data management, and
     pipeline creation from web
Project Background (continued)
 2012
  • Cluster continuing to grow (now ~50 nodes, 600+ cores,
    GPU- and InfiniBand-enabled)
  • Approximately 750 TB data, with plans in place to
    expand data storage and implement hierarchical storage
    management / archiving mechanisms to support future
    growth
  • HPC Web Phase II released in May 2012
   – ~20 applications with Mobyle interfaces, for a total of ~60 forms
     for job submission (including sub-packages for applications,
     e.g., tools within SAMtools suite)
   – Limited number of standardized workflow templates
       E.g., RNA-seq-single-sample-mapping, which maps RNA-seq
        reads to a reference genome using TopHat, then passes the
        alignment file to (1) Cufflinks, to assemble transcripts and
        quantify expression, and (2) SAMtools, to index the
        alignment file
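
The RNA-seq template described above fans out from one mapping step to two downstream steps. As a rough sketch only (the step names are hypothetical labels, and this is not how BMPS stores or runs its templates), the dependency structure could be expressed and ordered like this:

```python
from graphlib import TopologicalSorter

# Hypothetical step names. Each step maps to the set of steps it depends on.
pipeline = {
    "tophat_map_reads": set(),                    # maps reads to the reference
    "cufflinks_assemble": {"tophat_map_reads"},   # consumes the alignment file
    "samtools_index": {"tophat_map_reads"},       # consumes the alignment file
}


def execution_order(steps):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(steps).static_order())


order = execution_order(pipeline)
print(order)
```

The mapping step always comes first; the two downstream steps do not depend on each other, so a scheduler would be free to run them in either order or in parallel.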
HPC Web Server




[Diagram: HPC Web job submission implementation schema using Mobyle.
 Components shown: Authorization module, Mobyle library, DRMAA library,
 SGE submit host, SGE compute nodes, and Storage (/Shared folder,
 /group folder, /application folder). The connections between
 components are labeled "Apache user" (hpcwebadm).]
HPC Web Mobyle Job Management Interface




               Let's focus on the job bl2seq.T11045404625893
Mobyle job results page for bl2seq.T11045404625893




              BLAST result obtained from server
SGE accounting details for job bl2seq.T11045404625893


 Job runs using SGE
 DRMAA library is used
  for job submission from
  Mobyle
 Job runs as the Apache user
 We could show any of
  these parameters in the
  HPC Web interface
  • Start time
  • Queue time
  • End time
  • CPU time
                qacct command for the job
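
To illustrate how the parameters listed above might be pulled out of SGE accounting data, here is a minimal sketch that parses `qacct`-style text. The sample record is invented for illustration (the host name, job number, and all times are assumptions, not output from the NIAID cluster), and the helper names `parse_qacct` and `summarize` are hypothetical.

```python
from datetime import datetime

# Invented sample in the key/value layout that `qacct -j <job>` prints.
SAMPLE_QACCT = """\
==============================================================
qname        all.q
hostname     node01
owner        hpcwebadm
jobname      bl2seq.T11045404625893
jobnumber    12345
qsub_time    Fri Sep 28 10:00:00 2012
start_time   Fri Sep 28 10:00:05 2012
end_time     Fri Sep 28 10:00:20 2012
cpu          12.340
"""

TIME_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_qacct(text):
    """Turn one accounting record into a dict of field name -> raw string."""
    record = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("="):
            continue  # skip blank lines and the separator row
        key, _, value = line.partition(" ")
        record[key] = value.strip()
    return record


def summarize(record):
    """Derive the four values named on the slide from a parsed record."""
    submitted = datetime.strptime(record["qsub_time"], TIME_FORMAT)
    started = datetime.strptime(record["start_time"], TIME_FORMAT)
    ended = datetime.strptime(record["end_time"], TIME_FORMAT)
    return {
        "start_time": started,
        "queue_seconds": (started - submitted).total_seconds(),
        "end_time": ended,
        "cpu_seconds": float(record["cpu"]),
    }


summary = summarize(parse_qacct(SAMPLE_QACCT))
print(summary)
```

Queue time is computed here as the gap between submission and start; a real deployment would need to confirm the timestamp format its SGE version emits before relying on `TIME_FORMAT`.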
HPC Web Video Demos

 Navigating the HPC Web interface:
  • http://www.youtube.com/watch?feature=player_embedded&v=cxxALr5PGlY
 Using My File Manager in HPC Web:
  • http://www.youtube.com/watch?feature=player_embedded&v=9K8h2l28S2Y
 Submitting jobs to the cluster from HPC Web:
  • http://www.youtube.com/watch?feature=player_embedded&v=9K8h2l28S2Y




BCBB Mobyle Interface Designer (BMID)

 A web-based GUI for creating Mobyle XML using
  drag-and-drop options and wizards
 Eliminates the need to manually generate XML,
  aiming to facilitate community creation of interfaces
  and minimize development “bottlenecks”
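
As a sketch of what "generating the XML for the user" can look like, the snippet below builds a small interface description from plain form data. The element names and nesting are simplified and illustrative; they are an assumption, not the actual Mobyle schema and not BMID's implementation, and the parameter names are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_interface_xml(name, command, parameters):
    """Build a simplified, illustrative program description as an XML string."""
    program = ET.Element("program")
    head = ET.SubElement(program, "head")
    ET.SubElement(head, "name").text = name
    ET.SubElement(head, "command").text = command
    params = ET.SubElement(program, "parameters")
    for p in parameters:
        param = ET.SubElement(params, "parameter")
        ET.SubElement(param, "name").text = p["name"]
        ET.SubElement(param, "prompt").text = p["prompt"]
        ET.SubElement(param, "datatype").text = p["datatype"]
    return ET.tostring(program, encoding="unicode")


# Hypothetical form data, as a designer GUI might collect it.
xml_text = build_interface_xml(
    name="bl2seq",
    command="bl2seq",
    parameters=[
        {"name": "first_sequence", "prompt": "First sequence", "datatype": "Sequence"},
        {"name": "second_sequence", "prompt": "Second sequence", "datatype": "Sequence"},
    ],
)
print(xml_text)
```

Building the tree programmatically rather than concatenating strings means special characters are escaped automatically, which is one reason a GUI-driven generator can reduce hand-editing errors.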




Mobyle Framework: Command-line Application to Web Application




BCBB Mobyle Pipeline System (BMPS)

 Leverages Mobyle framework to string applications
  together such that the output of one process becomes
  the input of the next
 Simplifies analysis by automating a standard set of
  procedures that may have previously required manual
  processing
 Enables sharing of useful/novel pipelines among
  users
 Facilitates QC analysis by making it easy to iteratively
  tweak one or a few parameters of an application
  within a saved pipeline and validate results
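
The two ideas above, chaining outputs to inputs and re-running a saved pipeline with one parameter changed, can be sketched as follows. Everything here is an assumption for illustration: the tool names, the parameter names, and the `fake_runner` stand-in, which only records what would be executed and does not call Mobyle or the cluster.

```python
import copy

# Hypothetical saved pipeline: an ordered list of tools with their parameters.
saved_pipeline = [
    {"tool": "trim", "params": {"min_quality": 20}},
    {"tool": "map", "params": {"max_mismatches": 2}},
]


def with_override(pipeline, tool, **changes):
    """Return a copy of the pipeline with one tool's parameters changed."""
    variant = copy.deepcopy(pipeline)
    for step in variant:
        if step["tool"] == tool:
            step["params"].update(changes)
    return variant


def run(pipeline, first_input, runner):
    """Feed each step's output to the next step as its input."""
    data = first_input
    for step in pipeline:
        data = runner(step["tool"], step["params"], data)
    return data


def fake_runner(tool, params, data):
    """Stand-in for real job submission: append a trace of the step."""
    return f"{data}|{tool}{sorted(params.items())}"


variant = with_override(saved_pipeline, "map", max_mismatches=3)
print(run(saved_pipeline, "reads.fastq", fake_runner))
print(run(variant, "reads.fastq", fake_runner))
```

Copying the saved pipeline before applying the override keeps the original intact, so the baseline and the tweaked run can be compared side by side.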


Example BMPS Template




Other BMPS template examples
available in HPC Web:

• ChIP-seq-with-control
• Map-reads-and-index
• Fastq-quality-boxplot

Next Steps in HPC Web Development

 Continued development of web forms, especially for
  NGS and structural biology applications
 BMID interface enhancements
 BMPS/Pipeline system enhancements, including
  additional templates
 Integration with Mobyle2 framework




Feature Request Considerations

 Workflow template sharing between HPC users
 Data sharing with non-HPC account holders, including
  those outside NIH
 Ability for users to create their own application
  interfaces using BCBB Mobyle Interface Designer
  (BMID), and share interfaces with others




Discussion

 Comments/Questions?




Thank You!

 For more information, please contact:




Reference Slides




HPC Web System Architecture




[Diagram: HPC Web system architecture. Components shown:
 • Client: JavaScript-enabled browser; GWT, GWT-DND, and GWT-Incubator;
   Ajax libraries (only during development)
 • Web server/SGE submit host: Apache web server, Tomcat web server,
   CXF web services library, Jespa library, Mobyle framework,
   DRMAA library, SGE; Python and Java
 • SGE worker: SGE; bio applications (TopHat, Bowtie, SSAHA, etc.)
 • Enterprise storage
 • LDAP server
 • Collab SharePoint site
 • Labeled data exchanges: JSON object, SOAP object]
