HIGH PERFORMANCE COMPUTING
TECHNOLOGY COMPASS 2012/13

Automotive · Aerospace · Engineering · Life Sciences · CAE · CAD ·
Simulation · Risk Analysis · Price Modelling · High Throughput
Computing · Big Data Analytics
TECHNOLOGY COMPASS
TABLE OF CONTENTS AND INTRODUCTION

HIGH PERFORMANCE COMPUTING ............................................ 4
Performance Turns Into Productivity ................................... 6
Flexible deployment with xCAT ......................................... 8

CLUSTER MANAGEMENT MADE EASY ......................................... 12
Bright Cluster Manager ............................................... 14

INTELLIGENT HPC WORKLOAD MANAGEMENT .................................. 28
Moab HPC Suite – Enterprise Edition .................................. 30
New in Moab 7.0 ...................................................... 34
Moab HPC Suite – Basic Edition ....................................... 37
Moab HPC Suite – Grid Option ......................................... 43

NICE ENGINEFRAME ..................................................... 50
A technical portal for remote visualization .......................... 52
Application highlights ............................................... 54
Desktop Cloud Virtualization ......................................... 57
Remote Visualization ................................................. 58

INTEL CLUSTER READY .................................................. 62
A Quality Standard for HPC Clusters .................................. 64
Intel Cluster Ready builds HPC Momentum .............................. 69
The transtec Benchmarking Center ..................................... 73

WINDOWS HPC SERVER 2008 R2 ........................................... 74
Elements of the Microsoft HPC Solution ............................... 76
Deployment, system management, and monitoring ........................ 78
Job scheduling ....................................................... 80
Service-oriented architecture ........................................ 82
Networking and MPI ................................................... 85
Microsoft Office Excel support ....................................... 88

PARALLEL NFS ......................................................... 90
The New Standard for HPC Storage ..................................... 92
What's new in NFS 4.1? ............................................... 94
Panasas HPC Storage .................................................. 99

NVIDIA GPU COMPUTING ................................................ 110
The CUDA Architecture ............................................... 112
Codename “Fermi” .................................................... 116
Introducing NVIDIA Parallel Nsight .................................. 122
QLogic TrueScale InfiniBand and GPUs ................................ 126

INFINIBAND .......................................................... 130
High-speed interconnects ............................................ 132
Top 10 Reasons to Use QLogic TrueScale InfiniBand ................... 136
Intel MPI Library 4.0 Performance ................................... 139
InfiniBand Fabric Suite (IFS) – What's New in Version 6.0 ........... 141

PARSTREAM ........................................................... 144
Big Data Analytics .................................................. 146

GLOSSARY ............................................................ 156
MORE THAN 30 YEARS OF EXPERIENCE IN SCIENTIFIC COMPUTING
1980 marked the beginning of a decade in which numerous startups
were created, some of which later transformed into big players in
the IT market. Technical innovations brought dramatic changes to
the nascent computer market. In Tübingen, close to one of Germany's
prime and oldest universities, transtec was founded.

In the early days, transtec focused on reselling DEC computers and
peripherals, delivering high-performance workstations to university
institutes and research facilities. In 1987, SUN/Sparc and storage
solutions broadened the portfolio, enhanced by IBM/RS6000 products
in 1991. These were the typical workstations and server systems for
high performance computing at the time, used by the majority of
researchers worldwide.

In the late 90s, transtec was one of the first companies to offer
highly customized HPC cluster solutions based on standard Intel
architecture servers, some of which entered the TOP500 list of the
world's fastest computing systems.

Thus, given this background and history, it is fair to say that
transtec looks back on more than 30 years' experience in scientific
computing; our track record shows nearly 500 HPC installations.
With this experience, we know exactly what customers' demands are
and how to meet them. High performance and ease of management –
this is what customers require today. HPC systems are certainly
required to peak-perform, as their name indicates, but that is not
enough: they must also be easy to handle. Unwieldy design and
operational complexity must be avoided, or at least hidden from
administrators and particularly from users of HPC systems.

transtec HPC solutions deliver ease of management, both in the
Linux and Windows worlds, even where the customer's environment is
highly heterogeneous. Even the dynamic provisioning of HPC
resources as needed does not pose any problem, further leading to
maximal utilization of the cluster.

transtec HPC solutions use the latest and most innovative
technology. Their superior performance goes hand in hand with
energy efficiency, as you would expect from any leading-edge IT
solution. We regard these as basic characteristics of all our
solutions.

This brochure focusses on where transtec HPC solutions excel. To
name a few: Bright Cluster Manager as the technology leader for
unified HPC cluster management; the leading-edge Moab HPC Suite for
job and workload management; Intel Cluster Ready certification as
an independent quality standard for our systems; and Panasas HPC
storage systems for the highest performance and best scalability
required of an HPC storage system. Again, with these components,
usability and ease of management are central issues that are
addressed. Also, being an NVIDIA Tesla Preferred Provider, transtec
is able to provide customers with well-designed, extremely powerful
solutions for Tesla GPU computing. QLogic's InfiniBand Fabric Suite
makes managing a large InfiniBand fabric easier than ever before.
transtec masterfully combines excellent and well-chosen components
into a fine-tuned, customer-specific, and thoroughly designed HPC
solution.

Last but not least, your decision for a transtec HPC solution means
you opt for the most intensive customer care and the best service
in HPC. Our experts will be glad to bring in their expertise and
support to assist you at any stage, from HPC design to daily
cluster operations, to HPC Cloud Services.

Have fun reading the transtec HPC Compass 2012/13!
HIGH PERFORMANCE
COMPUTING
PERFORMANCE
TURNS INTO
PRODUCTIVITY
High Performance Computing (HPC) has been with us from the very
beginning of the computer era. High-performance computers were
built to solve numerous problems which the “human computers” could
not handle. The term HPC just hadn’t been coined yet. More importantly,
some of the early principles have changed fundamentally.


HPC systems in the early days were much different from those we see
today. First, we saw enormous mainframes from large computer manu-
facturers, including a proprietary operating system and job management
system. Second, at universities and research institutes, workstations
made inroads and scientists carried out calculations on their dedicated
Unix or VMS workstations. In either case, if you needed more computing
power, you scaled up, i.e. you bought a bigger machine.


Today the term High-Performance Computing has gained a fundamen-
tally new meaning. HPC is now perceived as a way to tackle complex
mathematical, scientific or engineering problems. The integration of
industry standard, “off-the-shelf” server hardware into HPC clusters
facilitates the construction of computer networks of a power that no
single system could ever achieve on its own. The new paradigm for
parallelization is scaling out.


Computer-supported simulation of realistic processes (so-called
Computer Aided Engineering – CAE) has established itself as a third
key pillar in the field of science and research, alongside theory
and experimentation. It is nowadays inconceivable that an aircraft
manufacturer or a Formula One racing team would operate without
using simulation software. And scientific calculations, such as in
the fields of astrophysics, medicine, pharmaceuticals and
bio-informatics, will to a large extent be dependent on
supercomputers in the future. Software manufacturers long ago
recognized the benefit of high-performance computers based on
powerful standard servers and ported their programs to them
accordingly.

The main advantage of scale-out supercomputers is just that: they
are infinitely scalable, at least in principle. Since they are based
on standard hardware components, such a supercomputer can be
charged with more power whenever the computational capacity of the
system is no longer sufficient, simply by adding further nodes of
the same kind. A cumbersome switch to a different technology can be
avoided in most cases.

The primary rationale for using HPC clusters is to grow, to scale
out computing capacity as far as necessary. To reach that goal, an
HPC cluster returns most of the investment when it is continuously
fed with computing problems. The secondary reason for building
scale-out supercomputers is to maximize the utilization of the
system.

  “transtec HPC solutions are meant to provide customers with
  unparalleled ease-of-management and ease-of-use. Apart from that,
  deciding for a transtec HPC solution means deciding for the most
  intensive customer care and the best service imaginable.”

  Dr. Oliver Tennert, Director Technology Management & HPC Solutions




VARIATIONS ON THE THEME: MPP AND SMP
Parallel computations exist in two major variants today.
Applications running in parallel on multiple compute nodes are
frequently so-called Massively Parallel Processing (MPP)
applications. MPP indicates that the individual processes each
utilize exclusive memory areas. This means that such jobs are
predestined to be computed in parallel, distributed across the
nodes in a cluster. The individual processes can thus utilize the
separate resources of their respective node – especially the RAM,
the CPU power and the disk I/O.

Communication between the individual processes is implemented in a
standardized way through the MPI software interface (Message
Passing Interface), which abstracts the underlying network
connections between the nodes away from the processes. However, the
MPI standard (current version 2.0) merely requires source code
compatibility, not binary compatibility, so an off-the-shelf
application usually needs specific versions of MPI libraries in
order to run. Examples of MPI implementations are OpenMPI, MPICH2,
MVAPICH2, Intel MPI or – for Windows clusters – MS-MPI.

If the individual processes engage in a large amount of
communication, the response time of the network (latency) becomes
important. Latency in a Gigabit Ethernet or a 10GE network is
typically around 10 µs. High-speed interconnects such as InfiniBand
reduce latency by a factor of 10, down to as low as 1 µs. Therefore,
high-speed interconnects can greatly speed up total processing.

The other frequently used variant is SMP applications. SMP, in this
HPC context, stands for Shared Memory Processing. It involves the
use of shared memory areas, the specific implementation of which
depends on the choice of the underlying operating system.
Consequently, SMP jobs generally run only on a single node, where
they can in turn be multi-threaded and thus be parallelized across
the number of CPUs per node. For many HPC applications, both the
MPP and the SMP variant can be chosen.

Many applications are not inherently suitable for parallel
execution. In such a case, there is no communication between the
individual compute nodes, and therefore no need for a high-speed
network between them; nevertheless, multiple computing jobs can be
run simultaneously and sequentially on each individual node,
depending on the number of CPUs. In order to ensure optimum
computing performance for these applications, it must be examined
how many CPUs and cores deliver the optimum performance. We find
applications of this sequential type of work typically in the
fields of data analysis or Monte-Carlo simulations.
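Real MPP codes use one of the MPI implementations named above, but the MPP-versus-SMP distinction itself can be sketched with nothing more than Python's standard library: separate processes with private memory that exchange explicit messages (the MPP model), versus threads operating on one shared address space (the SMP model). This is an illustrative sketch only, not MPI code; all function names are our own.

```python
# MPP-style: two processes with private memory, communicating only
# through explicit messages (here a Pipe stands in for the network).
# SMP-style: two threads writing directly into one shared list.
from multiprocessing import Process, Pipe
from threading import Thread

def worker(conn):
    data = conn.recv()            # "message" arrives from the other process
    conn.send(sum(data))          # result goes back as another message
    conn.close()

def mpp_sum(data):
    parent, child = Pipe()
    p = Process(target=worker, args=(child,))
    p.start()
    parent.send(data)             # no shared memory: data is serialized
    result = parent.recv()
    p.join()
    return result

def smp_sum(data):
    partial = [0, 0]              # shared memory, visible to both threads
    def add(idx, chunk):
        partial[idx] = sum(chunk) # threads write into the shared list
    mid = len(data) // 2
    threads = [Thread(target=add, args=(0, data[:mid])),
               Thread(target=add, args=(1, data[mid:]))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partial)

if __name__ == "__main__":
    nums = list(range(100))
    print(mpp_sum(nums), smp_sum(nums))   # both compute the same sum
```

The MPP variant pays a serialization and transport cost for every exchange, which is exactly why network latency matters for communication-heavy jobs; the SMP variant avoids that cost but is confined to a single node.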




                                                                    7
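Such a sequential, communication-free workload can be sketched in a few lines of Python: each worker in a process pool handles its share of a Monte-Carlo sampling job independently, and only the final counts are combined at the end. The π-estimation task and all names here are illustrative stand-ins chosen by us, not examples from the brochure.

```python
# Embarrassingly parallel Monte-Carlo: estimate pi by sampling random
# points in the unit square. Each worker runs with no communication
# at all until the single final reduction.
import random
from multiprocessing import Pool

def hits(args):
    seed, n = args
    rng = random.Random(seed)     # independent random stream per worker
    return sum(rng.random() ** 2 + rng.random() ** 2 <= 1.0
               for _ in range(n))

def estimate_pi(total=400_000, workers=4):
    n = total // workers
    with Pool(workers) as pool:   # one task per "node"/CPU, no cross-talk
        inside = sum(pool.map(hits, [(seed, n) for seed in range(workers)]))
    return 4.0 * inside / total

if __name__ == "__main__":
    print(estimate_pi())          # prints a value close to 3.14159
```

Because the workers never talk to each other, doubling the number of CPUs roughly halves the runtime; this is the class of job for which an expensive high-speed interconnect buys nothing.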
HIGH PERFORMANCE COMPUTING
FLEXIBLE DEPLOYMENT WITH XCAT




                                    xCAT as a Powerful and Flexible Deployment Tool
                                    xCAT (Extreme Cluster Administration Tool) is an open source
                                    toolkit for the deployment and low-level administration of HPC
                                    cluster environments, small as well as large ones.


                                    xCAT provides simple commands for hardware control, node dis-
                                    covery, the collection of MAC addresses, and node deploy-
                                    ment with (diskful) or without (diskless) local installation. The
                                    cluster configuration is stored in a relational database. Node
                                    groups for different operating system images can be defined.
                                    Also, user-specific scripts can be executed automatically at
                                    installation time.


                                    xCAT Provides the Following Low-Level Administrative Features:
                                    - Remote console support
                                    - Parallel remote shell and remote copy commands
                                    - Plugins for various monitoring tools like Ganglia or Nagios
                                    - Hardware control commands for node discovery, collecting
                                      MAC addresses, remote power switching and resetting of nodes




- Automatic configuration of syslog, remote shell, DNS, DHCP,
  and ntp within the cluster
- Extensive documentation and man pages

For cluster monitoring, we install and configure the open source
tool Ganglia or the even more powerful open source solution Nagios,
according to the customer's preferences and requirements.

Local Installation or Diskless Installation
We offer a diskful or a diskless installation of the cluster nodes.
A diskless installation means the operating system is hosted
partially within the main memory; larger parts may or may not be
included via NFS or other means. This approach allows for deploying
large numbers of nodes very efficiently, and the cluster is up and
running within a very short timescale. Updating the cluster is
equally efficient: only the boot image has to be updated and the
nodes rebooted, after which they run a new kernel or even a new
operating system. Moreover, with this approach, partitioning the
cluster can also be done very efficiently, either for testing
purposes or for allocating different cluster partitions to
different users or applications.

Development Tools, Middleware, and Applications
Depending on the application, optimization strategy, or underlying
architecture, different compilers lead to code of very different
performance. Moreover, different, mainly commercial, applications
require different MPI implementations. And even when the code is
self-developed, developers often prefer one MPI implementation over
another.

According to the customer's wishes, we install various compilers,
MPI middleware, as well as job management systems like Parastation,
Grid Engine, Torque/Maui, or the very powerful Moab HPC Suite for
high-level cluster management.
SERVICES AND CUSTOMER CARE FROM A TO Z

[Diagram: the transtec HPC service cycle – individual presales
consulting → benchmarking of different systems → application-,
customer-, and site-specific sizing of the HPC solution → burn-in
tests of systems → software & OS installation → application
installation → onsite hardware assembly → integration into the
customer's environment → customer training → maintenance, support
& managed services → continual improvement.]
HPC @ TRANSTEC: SERVICES AND CUSTOMER CARE FROM A TO Z
transtec AG has over 30 years of experience in scientific computing
and is one of the earliest manufacturers of HPC clusters. For
nearly a decade, transtec has delivered highly customized High
Performance clusters based on standard components to academic and
industry customers across Europe, with all the high quality
standards and the customer-centric approach that transtec is well
known for.

Every transtec HPC solution is more than just a rack full of
hardware – it is a comprehensive solution with everything the HPC
user, owner, and operator need.

In the early stages of any customer's HPC project, transtec experts
provide extensive and detailed consulting to the customer – they
benefit from expertise and experience. Consulting is followed by
benchmarking of different systems with either specifically crafted
customer code or generally accepted benchmarking routines; this
aids customers in sizing and devising the optimal and detailed HPC
configuration.

Each and every piece of HPC hardware that leaves our factory
undergoes a burn-in procedure of 24 hours or more if necessary. We
make sure that any hardware shipped meets our and our customers'
quality requirements. transtec HPC solutions are turnkey solutions.
By default, a transtec HPC cluster has everything installed and
configured – from hardware and operating system to important
middleware components like cluster management and developer tools,
and the customer's production applications. Onsite delivery means
onsite integration into the customer's production environment, be
it establishing network connectivity to the corporate network, or
setting up software and configuration parts.

transtec HPC clusters are ready-to-run systems – we deliver, you
turn the key, the system delivers high performance. Every HPC
project entails a transfer to production: IT operation processes
and policies apply to the new HPC system. Effectively, IT personnel
is trained hands-on, introduced to hardware components and
software, with all operational aspects of configuration management.

transtec services do not stop when the implementation project ends.
Beyond the transfer to production, transtec takes care. transtec
offers a variety of support and service options, tailored to the
customer's needs. When you are in need of a new installation, a
major reconfiguration or an update of your solution, transtec is
able to support your staff and, if you lack the resources for
maintaining the cluster yourself, maintain the HPC solution for
you. From Professional Services to Managed Services for daily
operations and required service levels, transtec will be your
complete HPC service and solution provider. transtec's high
standards of performance, reliability and dependability assure
your productivity and complete satisfaction.

transtec's HPC Managed Services offer customers the possibility of
having the complete management and administration of the HPC
cluster handled by transtec service specialists, in an
ITIL-compliant way. Moreover, transtec's HPC on Demand services
provide access to HPC resources whenever customers need them, for
example because they do not have the possibility of owning and
running an HPC cluster themselves, due to lacking infrastructure,
know-how, or admin staff.
CLUSTER
MANAGEMENT
MADE EASY
Bright Cluster Manager removes the complexity from the
installation, management and use of HPC clusters, without
compromising performance or capability. With Bright Cluster
Manager, an administrator can easily install, use and manage
multiple clusters simultaneously, without the need for expert
knowledge of Linux or HPC.




CLUSTER MANAGEMENT MADE EASY
BRIGHT CLUSTER MANAGER

A UNIFIED APPROACH
Other cluster management offerings take a “toolkit” approach in
which a Linux distribution is combined with many third-party tools
for provisioning, monitoring, alerting, etc.

This approach has critical limitations: those separate tools were
not designed to work together, were not designed for HPC, and were
not designed to scale. Furthermore, each of the tools has its own
interface (mostly command-line based), and each has its own daemons
and databases. Countless hours of scripting and testing by highly
skilled people are required to get the tools to work for a specific
cluster, and much of that work goes undocumented.

Bright Cluster Manager takes a much more fundamental, integrated
and unified approach. It was designed and written from the ground
up for straightforward, efficient, comprehensive cluster
management. It has a single lightweight daemon, a central database
for all monitoring and configuration data, and a single CLI and GUI
for all cluster management functionality. This approach makes
Bright Cluster Manager extremely easy to use, scalable, secure and
reliable, complete, flexible, and easy to maintain and support.

[Screenshot: The cluster installer takes the administrator through
the installation process and offers advanced options such as
“Express” and “Remote”.]

[Screenshot: By selecting a cluster node in the tree on the left and
the Tasks tab on the right, the administrator can execute a number
of powerful tasks on that node with just a single mouse click.]

EASE OF INSTALLATION
Bright Cluster Manager is easy to install. Typically, system
administrators can install and test a fully functional cluster from
“bare metal” in less than an hour. Configuration choices made
during the installation can be modified afterwards. Multiple
installation modes are available, including unattended and remote
modes. Cluster nodes can be automatically identified based on
switch ports rather than MAC addresses, improving the speed and
reliability of installation as well as subsequent maintenance.




                                                                    14
EASE OF USE
Bright Cluster Manager is easy to use. System administrators have two options: the intuitive Cluster Management Graphical User Interface (CMGUI) and the powerful Cluster Management Shell (CMSH). The CMGUI is a standalone desktop application that provides a single system view for managing all hardware and software aspects of the cluster through a single point of control. Administrative functions are streamlined as all tasks are performed through one intuitive, visual interface. Multiple clusters can be managed simultaneously. The CMGUI runs on Linux, Windows and MacOS (coming soon) and can be extended using plugins. The CMSH provides practically the same functionality as the CMGUI, but via a command-line interface. The CMSH can be used both interactively and in batch mode via scripts. Either way, system administrators have unprecedented flexibility and control over their clusters.

CLUSTER METRICS, SUCH AS GPU AND CPU TEMPERATURES, FAN SPEEDS AND NETWORK STATISTICS, CAN BE VISUALIZED BY SIMPLY DRAGGING AND DROPPING THEM FROM THE LIST ON THE LEFT INTO A GRAPHING WINDOW ON THE RIGHT. MULTIPLE METRICS CAN BE COMBINED IN ONE GRAPH AND GRAPHS CAN BE ZOOMED INTO. GRAPH LAYOUT AND COLORS CAN BE TAILORED TO YOUR REQUIREMENTS.
SUPPORT FOR LINUX AND WINDOWS
Bright Cluster Manager is based on Linux and is available with a choice of pre-integrated, pre-configured and optimized Linux distributions, including SUSE Linux Enterprise Server, Red Hat Enterprise Linux, CentOS and Scientific Linux. Dual-boot installations with Windows HPC Server are supported as well, allowing nodes to boot from either the Bright-managed Linux head node or the Windows-managed head node.

THE STATUS OF CLUSTER NODES, SWITCHES AND OTHER HARDWARE, AS WELL AS UP TO SIX METRICS, CAN BE VISUALIZED IN THE RACKVIEW. A ZOOM-OUT OPTION IS AVAILABLE FOR CLUSTERS WITH MANY RACKS.

THE OVERVIEW TAB PROVIDES INSTANT, HIGH-LEVEL INSIGHT INTO THE STATUS OF THE CLUSTER.

EXTENSIVE DEVELOPMENT ENVIRONMENT
Bright Cluster Manager provides an extensive HPC development environment for both serial and parallel applications, including the following (some optional):
• Compilers, including full suites from GNU, Intel, AMD and Portland Group
• Debuggers and profilers, including the GNU debugger and profiler, TAU, TotalView, Allinea DDT and Allinea OPT
• GPU libraries, including CUDA and OpenCL
• MPI libraries, including OpenMPI, MPICH, MPICH2, MPICH-MX, MPICH2-MX, MVAPICH and MVAPICH2; all cross-compiled with the compilers installed on Bright Cluster Manager, and optimized for high-speed interconnects such as InfiniBand and Myrinet
• Mathematical libraries, including ACML, FFTW, GMP, GotoBLAS, MKL and ScaLAPACK
• Other libraries, including Global Arrays, HDF5, IPP, TBB, NetCDF and PETSc

THE PARALLEL SHELL ALLOWS FOR SIMULTANEOUS EXECUTION OF COMMANDS OR SCRIPTS ACROSS NODE GROUPS OR ACROSS THE ENTIRE CLUSTER.

Bright Cluster Manager also provides Environment Modules to make it easy to maintain multiple versions of compilers, libraries and applications for different users on the cluster, without creating compatibility conflicts. Each Environment Module file contains the information needed to configure the shell for an application, and automatically sets these variables correctly for the particular application when it is loaded. Bright Cluster Manager includes many preconfigured module files for many scenarios, such as combinations of compilers, mathematical and MPI libraries.

POWERFUL IMAGE MANAGEMENT AND PROVISIONING
Bright Cluster Manager features sophisticated software image management and provisioning capability. A virtually unlimited number of images can be created and assigned to as many different categories of nodes as required. Default or custom Linux kernels can be assigned to individual images. Incremental changes to images can be deployed to live nodes without rebooting or re-installation.

The provisioning system propagates only changes to the images, minimizing time and impact on system performance and availability. Provisioning capability can be assigned to any number of nodes on-the-fly, for maximum flexibility and scalability. Bright Cluster Manager can also provision over InfiniBand and to RAM disk.

COMPREHENSIVE MONITORING
With Bright Cluster Manager, system administrators can collect, monitor, visualize and analyze a comprehensive set of metrics. Practically all software and hardware metrics available to the Linux kernel, and all hardware management interface metrics (IPMI, iLO, etc.), are sampled.
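Each Environment Module file described above is, concretely, a small Tcl script interpreted by the `module` command. The sketch below uses the standard modulefile directives (`prepend-path`, `setenv`); the application, version and install paths are purely illustrative:

```tcl
#%Module1.0
## Hypothetical modulefile for an MPI build; paths are illustrative.
proc ModulesHelp { } {
    puts stderr "Configures the shell for Open MPI built with GCC"
}

set root /cm/shared/apps/openmpi/gcc/64/1.4.2

prepend-path PATH            $root/bin
prepend-path LD_LIBRARY_PATH $root/lib
setenv       MPI_HOME        $root
```

A user would activate such a file with `module load` and swap versions with `module switch`, without editing any shell startup files; unloading the module undoes the path changes, which is what prevents compatibility conflicts between versions.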
HIGH PERFORMANCE MEETS EFFICIENCY
Initially, massively parallel systems constitute a challenge to both administrators and users. They are complex beasts. Anyone building HPC clusters will need to tame the beast, master the complexity and present users and administrators with an easy-to-use, easy-to-manage system landscape.

Leading HPC solution providers such as transtec achieve this goal. They hide the complexity of HPC under the hood and match high performance with efficiency and ease-of-use for both users and administrators. The “P” in “HPC” gains a double meaning: “Performance” plus “Productivity”.

Cluster and workload management software like Moab HPC Suite, Bright Cluster Manager or QLogic IFS provide the means to master and hide the inherent complexity of HPC systems. For administrators and users, HPC clusters are presented as single, large machines with many different tuning parameters. The software also provides a unified view of existing clusters whenever the customer adds unified management as a requirement, at any point after the first installation. Thus, daily routine tasks such as job management, user management, and queue partitioning and management can be performed easily with either graphical or web-based tools, without any advanced scripting skills or technical expertise required from the administrator or user.
THE BRIGHT ADVANTAGE
Bright Cluster Manager offers many advantages that lead to improved productivity, uptime, scalability, performance and security, while reducing total cost of ownership.

Rapid Productivity Gains
• Easy to learn and use, with an intuitive GUI
• Quick installation: from bare metal to a cluster ready to use, in less than an hour
• Fast, flexible provisioning: incremental, live, disk-full, disk-less, provisioning over InfiniBand, auto node discovery
• Comprehensive monitoring: on-the-fly graphs, rackview, multiple clusters, custom metrics
• Powerful automation: thresholds, alerts, actions
• Complete GPU support: NVIDIA, AMD ATI, CUDA, OpenCL
• On-demand SMP: instant ScaleMP virtual SMP deployment
• Powerful cluster management shell and SOAP API for automating tasks and creating custom capabilities
• Seamless integration with leading workload managers: PBS Pro, Moab, Maui, SLURM, Grid Engine, Torque, LSF
• Integrated (parallel) application development environment
• Easy maintenance: automatically update your cluster from Linux and Bright Computing repositories
• Web-based user portal

Maximum Uptime
• Unattended, robust head node failover to spare head node
• Powerful cluster automation functionality allows preemptive actions based on monitoring thresholds
• Comprehensive cluster monitoring and health checking framework, including automatic sidelining of unhealthy nodes to prevent job failure

Scalability from Deskside to TOP500
• Off-loadable provisioning for maximum scalability
• Proven on some of the world’s largest clusters

Minimum Overhead/Maximum Performance
• Single lightweight daemon drives all functionality
• Daemon heavily optimized to minimize effect on operating system and applications
• Single database stores all metric and configuration data

Top Security
• Automated security and other updates from key-signed repositories
• Encrypted external and internal communications (optional)
• X.509v3 certificate-based public-key authentication
• Role-based access control and complete audit trail
• Firewalls and secure LDAP
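The “Powerful automation: thresholds, alerts, actions” capability listed above accepts any Linux command or script as a custom action, alongside the predefined ones. The sketch below assumes a rule that hands the node name and the sampled GPU temperature to the script as arguments (the real calling convention depends on how the rule is configured); the threshold and messages are illustrative:

```shell
#!/bin/sh
# Hypothetical custom automation action for a GPU-temperature rule.
# Assumption: the rule invokes this script as <action> <node> <tempC>.

THRESHOLD=85

act_on_temperature() {
    node="$1"
    temp="$2"
    if [ "$temp" -gt "$THRESHOLD" ]; then
        # Replace the echo with a real action: power the GPU unit
        # down, call an SMS gateway, open a ticket, etc.
        echo "ALERT: GPU on $node at ${temp}C (threshold ${THRESHOLD}C)"
        return 1
    fi
    echo "OK: $node at ${temp}C"
}

# Example invocations; a live rule would supply measured values:
act_on_temperature node001 92 || true   # takes the alert branch
act_on_temperature node002 61           # healthy
```

Because the action is an ordinary script, it can be version-controlled and tested outside the cluster before being attached to a rule.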
Examples include CPU and GPU temperatures, fan speeds, switches, hard disk SMART information, system load, memory utilization, network statistics, storage metrics, power systems statistics, and workload management statistics. Custom metrics can also easily be defined.

Metric sampling is done very efficiently – in one process, or out-of-band where possible. System administrators have full flexibility over how and when metrics are sampled, and historic data can be consolidated over time to save disk space.

THE AUTOMATION CONFIGURATION WIZARD GUIDES THE SYSTEM ADMINISTRATOR THROUGH THE STEPS OF DEFINING A RULE: SELECTING METRICS, DEFINING THRESHOLDS AND SPECIFYING ACTIONS.

CLUSTER MANAGEMENT AUTOMATION
Cluster management automation takes preemptive actions when predetermined system thresholds are exceeded, saving time and preventing hardware damage. System thresholds can be configured on any of the available metrics. The built-in configuration wizard guides the system administrator through the steps of defining a rule: selecting metrics, defining thresholds and specifying actions. For example, a temperature threshold for GPUs can be established that results in the system automatically shutting down an overheated GPU unit and sending an SMS message to the system administrator’s mobile phone. Several predefined actions are available, but any Linux command or script can be configured as an action.

EXAMPLE GRAPHS THAT VISUALIZE METRICS ON A GPU CLUSTER.

COMPREHENSIVE GPU MANAGEMENT
Bright Cluster Manager radically reduces the time and effort of managing GPUs, and fully integrates these devices into the single view of the overall system. Bright includes powerful GPU management and monitoring capability that leverages functionality in NVIDIA Tesla GPUs. System administrators can easily assume maximum control of the GPUs and gain instant and time-based status insight. In addition to the standard cluster management capabilities, Bright Cluster Manager monitors the full range of GPU metrics, including:
• GPU temperature, fan speed, utilization
• GPU exclusivity, compute, display, persistence mode
• GPU memory utilization, ECC statistics
• Unit fan speed, serial number, temperature, power usage, voltages and currents, LED status, firmware
• Board serial, driver version, PCI info

Beyond metrics, Bright Cluster Manager features built-in support for GPU computing with CUDA and OpenCL libraries. Switching between current and previous versions of CUDA and OpenCL has also been made easy.

MULTI-TASKING VIA PARALLEL SHELL
The parallel shell allows simultaneous execution of multiple commands and scripts across the cluster as a whole, or across easily definable groups of nodes. Output from the executed commands is displayed in a convenient way with variable levels of verbosity. Running commands and scripts can be killed easily if necessary. The parallel shell is available through both the CMGUI and the CMSH.

INTEGRATED WORKLOAD MANAGEMENT
Bright Cluster Manager is integrated with a wide selection of free and commercial workload managers. This integration
provides a number of benefits:
• The selected workload manager gets automatically installed and configured
• Many workload manager metrics are monitored
• The GUI provides a user-friendly interface for configuring, monitoring and managing the selected workload manager
• The CMSH and the SOAP API provide direct and powerful access to a number of workload manager commands and metrics

WORKLOAD MANAGEMENT QUEUES CAN BE VIEWED AND CONFIGURED FROM THE GUI, WITHOUT THE NEED FOR WORKLOAD MANAGEMENT EXPERTISE.

CREATING AND DISMANTLING A VIRTUAL SMP NODE CAN BE ACHIEVED WITH JUST A FEW CLICKS WITHIN THE GUI OR A SINGLE COMMAND IN THE CLUSTER MANAGEMENT SHELL.
• Reliable workload manager failover is properly configured
• The workload manager is continuously made aware of the health state of nodes (see section on Health Checking)

The following user-selectable workload managers are tightly integrated with Bright Cluster Manager:
• PBS Pro, Moab, Maui, LSF
• SLURM, Grid Engine, Torque

Alternatively, Lava, LoadLeveler or other workload managers can be installed on top of Bright Cluster Manager.

INTEGRATED SMP SUPPORT
Bright Cluster Manager – Advanced Edition dynamically aggregates multiple cluster nodes into a single virtual SMP node, using ScaleMP’s Versatile SMP™ (vSMP) architecture. Creating and dismantling a virtual SMP node can be achieved with just a few clicks within the CMGUI. Virtual SMP nodes can also be launched and dismantled automatically using the scripting capabilities of the CMSH. In Bright Cluster Manager a virtual SMP node behaves like any other node, enabling transparent, on-the-fly provisioning, configuration, monitoring and management of virtual SMP nodes as part of the overall system management.

MAXIMUM UPTIME WITH HEAD NODE FAILOVER
Bright Cluster Manager – Advanced Edition allows two head nodes to be configured in active-active failover mode. Both head nodes are on active duty, but if one fails, the other takes over all tasks seamlessly.

MAXIMUM UPTIME WITH HEALTH CHECKING
Bright Cluster Manager – Advanced Edition includes a powerful cluster health checking framework that maximizes system uptime. It continually checks multiple health indicators for all hardware and software components and proactively initiates corrective actions. It can also automatically perform a series of standard and user-defined tests just before starting a new job, to ensure successful execution. Examples of corrective actions include autonomous bypass of faulty nodes, automatic job requeuing to avoid queue flushing, and process “jailing” to allocate, track, trace and flush completed user processes. The health checking framework ensures the highest job throughput, the best overall cluster efficiency and the lowest administration overhead.

WEB-BASED USER PORTAL
The web-based user portal provides read-only access to essential cluster information, including a general overview of the cluster status, node hardware and software properties, workload manager statistics and user-customizable graphs. The User Portal can easily be customized and expanded using PHP and the SOAP API.

USER AND GROUP MANAGEMENT
Users can be added to the cluster through the CMGUI or the CMSH. Bright Cluster Manager comes with a pre-configured LDAP database, but an external LDAP service or alternative authentication system can be used instead.

ROLE-BASED ACCESS CONTROL AND AUDITING
Bright Cluster Manager’s role-based access control mechanism allows administrator privileges to be defined on a per-role basis.
Administrator actions can be audited using an audit file which stores all their write actions.

THE WEB-BASED USER PORTAL PROVIDES READ-ONLY ACCESS TO ESSENTIAL CLUSTER INFORMATION, INCLUDING A GENERAL OVERVIEW OF THE CLUSTER STATUS, NODE HARDWARE AND SOFTWARE PROPERTIES, WORKLOAD MANAGER STATISTICS AND USER-CUSTOMIZABLE GRAPHS.

“The building blocks for transtec HPC solutions must be chosen according to our goals of ease-of-management and ease-of-use. With Bright Cluster Manager, we are happy to have the technology leader at hand, meeting these requirements, and our customers value that.”

Armin Jäger, HPC Solution Engineer

TOP CLUSTER SECURITY
Bright Cluster Manager offers an unprecedented level of security that can easily be tailored to local requirements. Security features include:
• Automated security and other updates from key-signed Linux and Bright Computing repositories
• Encrypted internal and external communications
• X.509v3 certificate-based public-key authentication to the cluster management infrastructure
• Role-based access control and complete audit trail
• Firewalls and secure LDAP
• Secure shell access

MULTI-CLUSTER CAPABILITY
Bright Cluster Manager is ideal for organizations that need to manage multiple clusters, either in one or in multiple locations. Capabilities include:
• All cluster management and monitoring functionality available for all clusters through one GUI
• Selecting any set of configurations in one cluster and exporting them to any or all other clusters with a few mouse clicks
• Making node images available to other clusters

BRIGHT CLUSTER MANAGER CAN MANAGE MULTIPLE CLUSTERS SIMULTANEOUSLY. THIS OVERVIEW SHOWS CLUSTERS IN OSLO, ABU DHABI AND HOUSTON, ALL MANAGED THROUGH ONE GUI.

STANDARD AND ADVANCED EDITIONS
Bright Cluster Manager is available in two editions: Standard and Advanced. The table below lists the differences. You can easily upgrade from the Standard to the Advanced Edition as your cluster grows in size or complexity.

DOCUMENTATION AND SERVICES
A comprehensive system administrator manual and user manual are included in PDF format. Customized training and professional services are available. Services include various levels of support, installation services and consultancy.

CLUSTER HEALTH CHECKS CAN BE VISUALIZED IN THE RACKVIEW. THIS SCREENSHOT SHOWS THAT GPU UNIT 41 FAILS A HEALTH CHECK CALLED “ALLFANSRUNNING”.
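A user-defined health check such as the “AllFansRunning” check mentioned in the screenshot caption is conceptually just a script whose exit status marks the node healthy or unhealthy. The sketch below assumes that convention (exit 0 = pass, non-zero = fail, with the framework then sidelining the node); the fan count and messages are illustrative, and a real check would read the values via IPMI or the GPU unit rather than take them as an argument:

```shell
#!/bin/sh
# Hypothetical user-defined health check in the style of "AllFansRunning".
# Assumption: the framework runs the script and reads the exit status:
# 0 = healthy, non-zero = sideline the node / requeue its jobs.

EXPECTED_FANS=4

check_all_fans_running() {
    running="$1"   # fans currently reporting a non-zero speed
    if [ "$running" -lt "$EXPECTED_FANS" ]; then
        echo "FAIL AllFansRunning: only $running of $EXPECTED_FANS fans running"
        return 1
    fi
    echo "PASS AllFansRunning: $running of $EXPECTED_FANS fans running"
}

# Example invocations; a live check would query the hardware:
check_all_fans_running 3 || true   # node would be sidelined
check_all_fans_running 4           # node is healthy
```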
FEATURE                           STANDARD   ADVANCED

Choice of Linux distributions        x           x
Intel Cluster Ready                  x           x
Cluster Management GUI               x           x
Cluster Management Shell             x           x
Web-Based User Portal                x           x
SOAP API                             x           x
Node Provisioning                    x           x
Node Identification                   x           x
Cluster Monitoring                   x           x
Cluster Automation                   x           x
User Management                      x           x
Parallel Shell                       x           x
Workload Manager Integration         x           x
Cluster Security                     x           x
Compilers                            x           x
Debuggers & Profilers                 x           x
MPI Libraries                        x           x
Mathematical Libraries               x           x
Environment Modules                  x           x
NVIDIA CUDA & OpenCL                 x           x
GPU Management & Monitoring          x           x
ScaleMP Management & Monitoring      -           x
Redundant Failover Head Nodes        -           x
Cluster Health Checking              -           x
Off-loadable Provisioning            -           x
Suggested Number of Nodes          4–128     129–10,000+

Multi-Cluster Management             -           x
Standard Support                     x           x
Premium Support                   Optional   Optional




INTELLIGENT
HPC WORKLOAD
MANAGEMENT
While all HPC systems face challenges in workload demand,
resource complexity, and scale, enterprise HPC systems face
more stringent challenges and expectations. Enterprise HPC
systems must meet mission-critical and priority HPC workload
demands for commercial businesses and business-oriented
research and academic organizations. They have complex SLAs
and priorities to balance. Their HPC workloads directly impact
the revenue, product delivery, and organizational objectives
of their organizations.




INTELLIGENT                                          MOAB HPC SUITE
                                                     Moab is the most powerful intelligence engine for policy-based,
HPC WORKLOAD MANAGEMENT                              predictive scheduling across workloads and resources. Moab
MOAB HPC SUITE – ENTERPRISE EDITION                  HPC Suite accelerates results delivery and maximize utiliza-
                                                     tion while simplifying workload management across complex,
                                                     heterogeneous cluster environments. The Moab HPC Suite
                                                     products leverage the multi-dimensional policies in Moab to
                                                     continually model and monitor workloads, resources, SLAs,
                                                     and priorities to optimize workload output. And these policies
                                                     utilize the unique Moab management abstraction layer that
                                                     integrates data across heterogeneous resources and resource
                                                     managers to maximize control as you automate workload man-
                                                     agement actions.


Managing the World’s Top Systems, Ready to Manage Yours
Moab manages the world’s largest and most complex HPC environments, including 40% of the top 10 supercomputing systems, nearly 40% of the top 25, and 36% of the compute cores in the top 100 systems, based on rankings from www.Top500.org. So you know it is battle-tested and ready to efficiently and intelligently manage the complexities of your environment.

“With Moab HPC Suite, we can meet very demanding customers’ requirements regarding unified management of heterogeneous cluster environments and grid management, and provide them with flexible and powerful configuration and reporting options. Our customers value that highly.”
Thomas Gebert, HPC Solution Architect

MOAB HPC SUITE – ENTERPRISE EDITION
Moab HPC Suite – Enterprise Edition provides enterprise-ready HPC workload management that self-optimizes productivity, workload uptime, and the meeting of SLAs and business priorities for HPC systems and HPC cloud. It uses the battle-tested and patented Moab intelligence engine to automate the mission-critical workload priorities of enterprise HPC systems. Enterprise customers benefit from a single integrated product that brings




together key enterprise HPC capabilities, implementation, training, and 24x7 support services to speed the realization of benefits from their HPC system for their business. Moab HPC Suite – Enterprise Edition delivers:
 Productivity acceleration
 Uptime automation
 Auto-SLA enforcement
 Grid- and cloud-ready HPC management

Designed to Solve Enterprise HPC Challenges
While all HPC systems face challenges in workload and resource complexity, scale and demand, enterprise HPC systems face more stringent challenges and expectations. Enterprise HPC systems must meet mission-critical and priority HPC workload demands for commercial businesses and business-oriented research and academic organizations. These organizations have complex SLAs and priorities to balance. And their HPC workloads directly impact the revenue, product delivery, and organizational objectives of their organizations.
Enterprise HPC organizations must eliminate job delays and failures. They are also seeking to improve resource utilization and workload management efficiency across multiple heterogeneous systems. To maximize user productivity, they must make it easier for users to access and use HPC resources, and even expand to other clusters or HPC cloud to better handle workload demand and surges.

BENEFITS
Moab HPC Suite – Enterprise Edition offers key benefits to reduce costs, improve service performance, and accelerate the productivity of enterprise HPC systems. These benefits drive the achievement of business objectives and outcomes that depend on the results the enterprise HPC systems deliver. Moab HPC Suite – Enterprise Edition delivers:

Productivity acceleration to get more results faster and at a lower cost
Moab HPC Suite – Enterprise Edition gets more results delivered faster from HPC resources to lower costs while accelerating overall system, user and administrator productivity. Moab provides the unmatched scalability, 90-99 percent utilization, and fast and simple job submission that are required to maximize productivity in enterprise HPC organizations. The Moab intelligence engine optimizes workload scheduling and orchestrates resource provisioning and management to maximize workload speed and quantity. It also unifies workload management across heterogeneous resources, resource managers and even multiple clusters to reduce management complexity and costs.

Uptime automation to ensure workload completes successfully
HPC job and resource failures in enterprise HPC systems lead to delayed results and missed organizational opportunities and objectives. Moab HPC Suite – Enterprise Edition intelligently automates workload and resource uptime in HPC systems to ensure that workload completes reliably and avoids these failures.

Auto-SLA enforcement to consistently meet service guarantees and business priorities
Moab HPC Suite – Enterprise Edition uses the powerful Moab intelligence engine to optimally schedule and dynamically adjust workload to consistently meet service level agreements (SLAs), guarantees, and business priorities. This automatically




INTELLIGENT
HPC WORKLOAD MANAGEMENT
MOAB HPC SUITE – ENTERPRISE EDITION

ensures that the right workloads are completed at the optimal times, taking into account the complex mix of departments, priorities and SLAs to be balanced.

Grid- and Cloud-ready HPC management to more efficiently manage and meet workload demand
The benefits of a traditional HPC environment can be extended to more efficiently manage and meet workload and resource demand by sharing workload across multiple clusters through the grid management and HPC cloud management capabilities provided in Moab HPC Suite – Enterprise Edition.

CAPABILITIES
Moab HPC Suite – Enterprise Edition brings together key enterprise HPC capabilities into a single integrated product that self-optimizes the productivity, workload uptime, and meeting of SLAs and priorities for HPC systems and HPC cloud.

Productivity acceleration capabilities deliver more results faster, lower costs, and increase resource, user and administrator productivity
 Massive scalability accelerates job response and throughput, including support for high throughput computing
 Workload-optimized allocation policies and provisioning get more results out of existing heterogeneous resources to reduce costs
 Workload unification across heterogeneous clusters maximizes resource availability for workloads and administration efficiency by managing workload as one cluster
 Simplified HPC submission and control for both users and administrators with job arrays, templates, self-service submission




portal and administrator dashboard
 Optimized intelligent scheduling that packs workloads and backfills around priority jobs and reservations while balancing SLAs to efficiently use all available resources
 Advanced scheduling and management of GPGPUs for jobs to maximize their utilization, including auto-detection, policy-based GPGPU scheduling and GPGPU metrics reporting
 Workload-aware auto-power management reduces energy use and costs by 30-40 percent with intelligent workload consolidation and auto-power management

Uptime automation capabilities ensure workload completes successfully and reliably, avoiding failures and missed organizational opportunities and objectives
 Intelligent resource placement prevents job failures with granular resource modeling that ensures all workload requirements are met while avoiding at-risk resources
 Auto-response to incidents and events maximizes job and system uptime with configurable actions for pre-failure conditions, amber alerts, or other metrics and monitors
 Workload-aware maintenance scheduling helps maintain a stable HPC system without disrupting workload productivity
 Real-world services expertise ensures fast time to value and system uptime with an included package of implementation, training, and 24x7 remote support services

Auto-SLA enforcement schedules and adjusts workload to consistently meet service guarantees and business priorities so the right workloads are completed at the optimal times
 Department budget enforcement schedules resources in line with resource sharing agreements and budgets (i.e. usage limits, usage reports, etc.)
 SLA and priority policies ensure the highest priority workloads are processed first (i.e. quality of service, hierarchical priority weighting, dynamic fairshare policies, etc.)
 Continuous plus future scheduling ensures priorities and guarantees are proactively met as conditions and workload levels change (i.e. future reservations, priorities, and pre-emption)

Grid- and cloud-ready HPC management extends the benefits of your traditional HPC environment to more efficiently manage workload and better meet workload demand
 Pay-for-use showback and chargeback capabilities track actual resource usage with flexible chargeback options and reporting by user or department
 Manage and share workload across multiple remote clusters to meet growing workload demand or surges with the single self-service portal and intelligence engine with purchase of Moab HPC Suite – Grid Option

ARCHITECTURE
Moab HPC Suite – Enterprise Edition is architected to integrate on top of your existing job resource managers and other types of resource managers in your environment. It provides policy-based scheduling and management of workloads as well as resource allocation and provisioning orchestration. The Moab intelligence engine makes complex scheduling and management decisions based on all of the data it integrates from the various resource managers and then orchestrates the job and management actions through those resource managers. It does this without requiring any additional agents. This makes it the ideal choice to integrate with existing and new systems
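The pay-for-use accounting model above reduces to simple metering arithmetic. The following is a hedged sketch (hypothetical department names, rates, and data shapes for illustration only, not Moab Accounting Manager code) of how per-job core-hours roll up into a chargeback report by department:

```python
# Hypothetical sketch of showback/chargeback accounting: meter
# core-hours per job and roll them up by department with a rate.
# Departments and rates are illustrative assumptions.
from collections import defaultdict

RATE_PER_CORE_HOUR = {"physics": 0.05, "biology": 0.04}  # assumed rates

def chargeback(jobs):
    """jobs: iterable of (department, cores, wall_hours) tuples."""
    usage = defaultdict(float)
    for dept, cores, hours in jobs:
        usage[dept] += cores * hours          # core-hours consumed
    return {d: round(h * RATE_PER_CORE_HOUR[d], 2) for d, h in usage.items()}

jobs = [("physics", 64, 2.0), ("physics", 16, 1.5), ("biology", 128, 0.5)]
print(chargeback(jobs))  # {'physics': 7.6, 'biology': 2.56}
```

A real deployment would pull the usage records from the resource manager's accounting logs rather than an in-memory list; the rollup-by-hierarchy idea is the same.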




NEW IN MOAB 7.0

NEW MOAB HPC SUITE 7.0
The new Moab HPC Suite 7.0 releases deliver continued breakthrough advancements in scalability, reliability, and job array management to accelerate system productivity, as well as extended database support. Here is a look at the new capabilities and the value they offer customers:

TORQUE Resource Manager Scalability and Reliability Advancements for Petaflop and Beyond
As part of the Moab HPC Suite 7.0 releases, the TORQUE 4.0 resource manager features scalability and reliability advancements to fully exploit Moab scalability. These advancements maximize your use of increasing hardware capabilities and enable you to meet growing HPC user needs. Key advancements in TORQUE 4.0 for Moab HPC Suite 7.0 include:

 The new Job Radix enables you to efficiently run jobs that span tens of thousands or even hundreds of thousands of nodes. Each MOM daemon now cascades job communication with multiple other MOM daemons simultaneously to reduce the job start-up process time to a small fraction of what it would normally take across a large number of nodes. The Job Radix eliminates lost jobs and job start-up bottlenecks caused by having all nodes’ MOM daemons communicate with only one head MOM node. This saves critical minutes on job start-up time and allows for higher job throughput.
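The benefit of cascading MOM communication can be illustrated with a back-of-the-envelope model (an illustration only, not TORQUE code): with a fan-out radix of r, a job message reaches N nodes in roughly log_r(N) relay levels instead of one head MOM handling N direct sessions.

```python
# Toy model of tree fan-out vs. a single head node contacting
# every MOM directly. Illustrative only; the actual TORQUE radix
# mechanics are more involved.

def fanout_levels(nodes, radix):
    """Relay levels needed to reach `nodes` MOMs when each MOM
    forwards the job message to `radix` other MOMs."""
    levels, reached = 0, 1
    while reached < nodes:
        reached *= radix
        levels += 1
    return levels

# Flat model: one head MOM opens 100,000 sessions.
# Radix tree with fan-out 10: only a handful of relay levels.
print(fanout_levels(100_000, 10))  # 5
```

This is why start-up time grows roughly logarithmically with node count under a radix tree, instead of linearly with a single coordinating head node.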




 New MOM daemon communication hierarchy increases the number of nodes supported and reduces the overhead of cluster status updates by distributing communication across multiple nodes instead of a single TORQUE head node. This makes status updates more efficient and enables faster scheduling and responsiveness.
 New multi-threading improves response and reliability, allowing for instant feedback to user requests as well as the ability to continue work even if some processes linger.
 Improved network communications, with all UDP-based communication replaced by TCP to make data transfers from node to node more reliable.

Job Array Auto-Cancellation Policies Improve System Productivity
Moab HPC Suite 7.0 improves system productivity with new job array auto-cancellation policies that cancel the remaining sub-jobs in an array once the solution is found in the array results. This frees up resources, which would otherwise be running irrelevant jobs, to run other jobs in the queue sooner. The job array auto-cancellation policies allow you to set auto-cancellation of sub-jobs based on the first or any instance of result success or failure, or on specific exit codes.

Extended Database Support Now Includes PostgreSQL and Oracle in Addition to MySQL
The extended database support in Moab HPC Suite 7.0 enables customers to use ODBC-compliant PostgreSQL and Oracle databases in addition to MySQL. This provides customers the flexibility to use the database that best meets their needs or is the standard for their system.

New Moab Web Services Provide Easier Standard Integration and Customization
New Moab Web Services provide easier standard integration and customization for a customer’s environment, such as integration with existing user portals, plug-ins of resource managers for rich data integration, and script integration. Customers now have a standard interface to Moab with REST APIs.

Simplified Self-Service and Admin Dashboard Portal Experience
Moab HPC Suite 7.0 features an enhanced self-service and admin dashboard portal with simplified “click-based” job submission for end users as well as new visual cluster dashboard views of nodes, jobs, and reservations for more efficient management. The new Visual Cluster dashboard provides administrators and users with views of their cluster resources that are easily filtered by almost any factor, including id, name, IP address, state, power, pending actions, reservations, load, memory, processors, etc. Users can also quickly filter and view their jobs by name, state, user, group, account, wall clock requested, memory requested, start date/time, submit date/time, etc. One-click drill-downs provide additional details and options for management actions.

Resource Usage Accounting Flexibility
Moab HPC Suite 7.0 includes more flexible resource usage accounting options that enable administrators to easily duplicate custom organizational hierarchies such as organizations, groups, projects, business units, cost centers, etc. in the Moab Accounting Manager usage budgets and charging structure. This ensures resource usage is budgeted, tracked, and reported or charged back for in the most useful way for admins and their customer groups and users.
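The job array auto-cancellation policy described above can be modeled in a few lines (a toy simulation with a hypothetical interface, not Moab policy syntax): sub-jobs complete in order until one reports success, and the remaining siblings are cancelled to free their resources.

```python
# Toy model of job-array auto-cancellation on first success.
# Purely illustrative; real policies are configured in Moab
# (e.g. on success, failure, or specific exit codes), not via
# this hypothetical function.

def run_array(sub_job_results):
    """sub_job_results: list of exit codes in completion order.
    Returns (completed, cancelled) counts under a
    cancel-remaining-on-first-success (exit code 0) policy."""
    completed = 0
    for code in sub_job_results:
        completed += 1
        if code == 0:                      # solution found
            break
    cancelled = len(sub_job_results) - completed
    return completed, cancelled

# 100 sub-jobs; the 7th finds the solution, the other 93 are cancelled.
print(run_array([1] * 6 + [0] + [1] * 93))  # (7, 93)
```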




as well as to manage your HPC system as it grows and expands in the future.

Moab HPC Suite – Enterprise Edition includes the patented Moab intelligence engine that enables it to integrate with and automate management across existing heterogeneous environments to optimize management and workload efficiency. This unique intelligence engine includes:
 Industry-leading multi-dimensional policies that automate the complex real-time decisions and actions for scheduling workload and allocating and adapting resources. These multi-dimensional policies can model and consider the workload requirements, resource attributes and affinities, SLAs and priorities to enable more complex and efficient decisions to be automated.
 Real-time and predictive future environment scheduling that drives more accurate and efficient decisions and service guarantees, as it can proactively adjust scheduling and resource allocations as it projects the impact of workload and resource condition changes.
 Open & flexible management abstraction layer that lets you integrate the data and orchestrate workload actions across the chaos of complex heterogeneous cluster environments and management middleware to maximize workload control, automation, and optimization.

COMPONENTS
Moab HPC Suite – Enterprise Edition includes the following integrated products and technologies for a complete HPC workload management solution:
 Moab Workload Manager: Patented multi-dimensional




intelligence engine that automates the complex decisions and orchestrates policy-based workload placement and scheduling as well as resource allocation, provisioning and energy management
 Moab Cluster Manager: Graphical desktop administrator application for managing, configuring, monitoring, and reporting for Moab-managed clusters
 Moab Viewpoint: Web-based user self-service job submission and management portal and administrator dashboard portal
 Moab Accounting Manager: HPC resource use budgeting and accounting tool that enforces resource sharing agreements and limits based on departmental budgets and provides showback and chargeback reporting for resource usage
 Moab Services Manager: Integration interfaces to resource managers and third-party tools

Moab HPC Suite – Enterprise Edition is also integrated with TORQUE, which is available as a free download on AdaptiveComputing.com. TORQUE is an open-source job/resource manager that provides continually updated information regarding the state of nodes and workload status. Adaptive Computing is the custodian of the TORQUE project and is actively developing the code base in cooperation with the TORQUE community to provide state-of-the-art resource management. Each Moab HPC Suite product subscription includes support for the Moab HPC Suite as well as TORQUE, if you choose to use TORQUE as the job/resource manager for your cluster.

MOAB HPC SUITE – BASIC EDITION
Moab HPC Suite – Basic Edition is a multi-dimensional policy-based workload management system that accelerates and automates the scheduling, managing, monitoring, and reporting of HPC workloads on massive-scale, multi-technology installations. The Moab HPC Suite – Basic Edition patented multi-dimensional decision engine accelerates both the decisions and orchestration of workload across the ideal combination of diverse resources, including specialized resources like GPGPUs. The speed and accuracy of the decisions and scheduling automation optimizes workload throughput and resource utilization so more work is accomplished in less time with existing resources to control costs and increase the value of HPC investments.

Moab HPC Suite – Basic Edition enables you to address pressing HPC challenges, including:
 Delays to workload start and end times slowing results
 Inconsistent delivery on service guarantees and SLA commitments
 Under-utilization of resources
 How to efficiently manage workload across heterogeneous and hybrid systems of GPGPUs, hardware, and middleware
 How to simplify job submission & management for users and administrators to maximize productivity

Moab HPC Suite – Basic Edition acts as the “brain” of an HPC system to accelerate and automate complex decision-making processes. The patented decision engine is capable of making the complex multi-dimensional policy-based decisions needed to schedule workload to optimize job speed, job success and resource utilization. Moab HPC Suite – Basic Edition integrates decision-making data from and automates actions through your system’s existing mix of resource managers. This enables all the dimensions




of real-time granular resource attributes and state as well as the timing of current and future resource commitments to be factored into more efficient and accurate scheduling and allocation decisions. It also dramatically simplifies the management tasks and processes across these complex, heterogeneous environments. Moab works with many of the major resource management and industry-standard resource monitoring tools, covering mixed hardware, network, storage and licenses.
Moab HPC Suite – Basic Edition policies are also able to factor in organizational priorities and complexities when scheduling workload and allocating resources. Moab ensures workload is processed according to organizational priorities and commitments and that resources are shared fairly across users, groups and even multiple organizations. This enables organizations to automatically enforce service guarantees and effectively manage organizational complexities with simple policy-based settings.

BENEFITS
Moab HPC Suite – Basic Edition drives more ROI and results from your HPC environment, including:
 Improved job response times and job throughput with a workload decision engine that accelerates complex workload scheduling decisions to enable faster job start times and high throughput computing
 Optimized resource utilization to 90-99 percent with multi-dimensional and predictive workload scheduling to accomplish more with your existing resources
 Automated enforcement of service guarantees, priorities, and resource sharing agreements across users, groups, and projects
 Increased productivity by simplifying HPC use, access, and




control for both users and administrators with job arrays, job templates, optional user portal, and GUI administrator management and monitoring tool
 Streamlined job turnaround and reduced administrative burden by unifying and automating workload tasks and resource processes across diverse resources and mixed-system environments, including GPGPUs
 Provides a scalable workload management architecture that can manage peta-scale and beyond, is grid-ready, compatible with existing infrastructure, and extensible to manage your environment as it grows and evolves

CAPABILITIES
Moab HPC Suite – Basic Edition accelerates workload processing with a patented multi-dimensional decision engine that self-optimizes workload placement, resource utilization and results output while ensuring organizational priorities are met across the users and groups leveraging the HPC environment.

Policy-driven scheduling intelligently places workload on the optimal set of diverse resources to maximize job throughput and success as well as utilization and the meeting of workload and group priorities
 Priority, SLA and resource sharing policies ensure the highest priority workloads are processed first and resources are shared fairly across users and groups, such as quality of service, hierarchical priority weighting, and fairshare targets, limits and weights policies
 Allocation policies optimize resource utilization and prevent job failures with granular resource modeling and scheduling, affinity- and node topology-based placement
 Backfill job scheduling speeds job throughput and maximizes utilization by scheduling smaller or less demanding jobs as they can fit around priority jobs and reservations to use all available resources
 Security policies control which users and groups can access which resources
 Checkpointing

Real-time and predictive scheduling ensure job priorities and guarantees are proactively met as conditions and workload levels change
 Advanced reservations guarantee that jobs run when required
 Maintenance reservations reserve resources for planned future maintenance to avoid disruption to business workloads
 Predictive scheduling enables the future workload schedule to be continually forecasted and adjusted along with resource allocations to adapt to changes in conditions and new job and reservation requests

Advanced scheduling and management of GPGPUs for jobs to maximize their utilization
 Automatic detection and management of GPGPUs in the environment to eliminate manual configuration and make them immediately available for scheduling
 Exclusively allocate and schedule GPGPUs on a per-job basis
 Policy-based management & scheduling using GPGPU metrics
 Quick access to statistics on GPGPU utilization and key metrics for optimal management and issue diagnosis, such as error counts, temperature, fan speed, and memory
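The backfill policy described above can be sketched as a simple feasibility check (a deliberately simplified model, not Moab's scheduler): a lower-priority job may start ahead of its queue position only if it fits in the currently idle nodes and is guaranteed to finish before the highest-priority job's reserved start time.

```python
# Simplified backfill feasibility check. Illustrative only; a real
# scheduler also considers walltime estimates, multiple reservations,
# and node topology.

def can_backfill(job, idle_nodes, now, reservation_start):
    """job: (nodes_needed, walltime). Backfill only if the job fits
    in the idle nodes AND ends before the priority reservation."""
    nodes_needed, walltime = job
    return nodes_needed <= idle_nodes and now + walltime <= reservation_start

# 8 nodes idle until a large priority job's reservation starts at t=10.
print(can_backfill((4, 6), idle_nodes=8, now=0, reservation_start=10))   # True
print(can_backfill((4, 12), idle_nodes=8, now=0, reservation_start=10))  # False
```

The key property is that backfilled jobs never delay the priority job's guaranteed start, which is why backfill raises utilization without breaking SLAs.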




INTELLIGENT                           Easier submission, management, and control of job arrays im-
                                      prove user productivity and job throughput efficiency
HPC WORKLOAD MANAGEMENT                Users can easily submit thousands of sub-jobs with a single
MOAB HPC SUITE – BASIC EDITION          job submission with an array index differentiating each array
                                        sub-job
                                       Job array usage limit policies enforce number of job maxi-
                                        mums by credentials or class
                                       Simplified reporting and management of job arrays for end
                                        users filters jobs to summarize, track and manage at the
                                        master job level


Scalable job performance for large-scale, extreme-scale, and high-throughput computing environments
- Efficiently manages the submission and scheduling of hundreds of thousands of queued jobs to support high-throughput computing
- Fast scheduler response to user commands during scheduling, so users and administrators get the real-time job information they need
- Fast job throughput gets results started and delivered sooner and keeps resource utilization high



An open and flexible management abstraction layer integrates easily with, and automates management across, existing heterogeneous resources and middleware to improve management efficiency
- Rich data integration and aggregation let you set powerful, multi-dimensional policies based on the existing real-time resource data monitored, without adding any new agents
- Heterogeneous resource allocation and management for workloads across mixed hardware and specialty resources such as

  • 2. TECHNOLOGY COMPASS — TABLE OF CONTENTS AND INTRODUCTION

HIGH PERFORMANCE COMPUTING .......... 4
 Performance Turns Into Productivity .......... 6
 Flexible deployment with xCAT .......... 8
CLUSTER MANAGEMENT MADE EASY .......... 12
 Bright Cluster Manager .......... 14
INTELLIGENT HPC WORKLOAD MANAGEMENT .......... 28
 Moab HPC Suite – Enterprise Edition .......... 30
 New in Moab 7.0 .......... 34
 Moab HPC Suite – Basic Edition .......... 37
 Moab HPC Suite – Grid Option .......... 43
NICE ENGINEFRAME .......... 50
 A technical portal for remote visualization .......... 52
 Application highlights .......... 54
 Desktop Cloud Virtualization .......... 57
 Remote Visualization .......... 58
INTEL CLUSTER READY .......... 62
 A Quality Standard for HPC Clusters .......... 64
 Intel Cluster Ready builds HPC Momentum .......... 69
 The transtec Benchmarking Center .......... 73
WINDOWS HPC SERVER 2008 R2 .......... 74
 Elements of the Microsoft HPC Solution .......... 76
 Deployment, system management, and monitoring .......... 78
 Job scheduling .......... 80
 Service-oriented architecture .......... 82
 Networking and MPI .......... 85
 Microsoft Office Excel support .......... 88
PARALLEL NFS .......... 90
 The New Standard for HPC Storage .......... 92
 What's new in NFS 4.1? .......... 94
 Panasas HPC Storage .......... 99
NVIDIA GPU COMPUTING .......... 110
 The CUDA Architecture .......... 112
 Codename “Fermi” .......... 116
 Introducing NVIDIA Parallel Nsight .......... 122
 QLogic TrueScale InfiniBand and GPUs .......... 126
INFINIBAND .......... 130
 High-speed interconnects .......... 132
 Top 10 Reasons to Use QLogic TrueScale InfiniBand .......... 136
 Intel MPI Library 4.0 Performance .......... 139
 InfiniBand Fabric Suite (IFS) – What’s New in Version 6.0 .......... 141
PARSTREAM .......... 144
 Big Data Analytics .......... 146
GLOSSARY .......... 156
  • 3. MORE THAN 30 YEARS OF EXPERIENCE IN SCIENTIFIC COMPUTING

1980 marked the beginning of a decade in which numerous startups were created, some of which later transformed into big players in the IT market. Technical innovations brought dramatic changes to the nascent computer market. In Tübingen, close to one of Germany’s prime and oldest universities, transtec was founded.

In the early days, transtec focused on reselling DEC computers and peripherals, delivering high-performance workstations to university institutes and research facilities. In 1987, SUN/Sparc and storage solutions broadened the portfolio, enhanced by IBM/RS6000 products in 1991. These were the typical workstations and server systems for high performance computing at the time, used by the majority of researchers worldwide.

In the late 90s, transtec was one of the first companies to offer highly customized HPC cluster solutions based on standard Intel architecture servers, some of which entered the TOP500 list of the world’s fastest computing systems.

Thus, given this background and history, it is fair to say that transtec looks back on more than 30 years’ experience in scientific computing; our track record shows nearly 500 HPC installations. With this experience, we know exactly what customers’ demands are and how to meet them. High performance and ease of management – this is what customers require today. HPC systems are certainly required to peak-perform, as their name indicates, but that is not enough: they must also be easy to handle. Unwieldy design and operational complexity must be avoided or at least hidden from administrators and particularly from users of HPC computer systems.

transtec HPC solutions deliver ease of management, both in the Linux and Windows worlds, and even where the customer’s environment is of a highly heterogeneous nature. Even the dynamic provisioning of HPC resources as needed does not constitute any problem, further leading to maximal utilization of the cluster.

transtec HPC solutions use the latest and most innovative technology. Their superior performance goes hand in hand with energy efficiency, as you would expect from any leading-edge IT solution. We regard these as basic characteristics.

This brochure focuses on where transtec HPC solutions excel. To name a few: Bright Cluster Manager as the technology leader for unified HPC cluster management, leading-edge Moab HPC Suite for job and workload management, Intel Cluster Ready certification as an independent quality standard for our systems, and Panasas HPC storage systems for the highest performance and best scalability required of an HPC storage system. Again, with these components, usability and ease of management are central issues that are addressed. Also, being an NVIDIA Tesla Preferred Provider, transtec is able to provide customers with well-designed, extremely powerful solutions for Tesla GPU computing. QLogic’s InfiniBand Fabric Suite makes managing a large InfiniBand fabric easier than ever before – transtec masterfully combines excellent and well-chosen components into a fine-tuned, customer-specific, and thoroughly designed HPC solution.

Last but not least, your decision for a transtec HPC solution means you opt for the most intensive customer care and the best service in HPC. Our experts will be glad to bring in their expertise and support to assist you at any stage, from HPC design to daily cluster operations, to HPC Cloud Services.

Have fun reading the transtec HPC Compass 2012/13!
  • 5. High Performance Computing (HPC) has been with us from the very beginning of the computer era. High-performance computers were built to solve numerous problems which the “human computers” could not handle. The term HPC just hadn’t been coined yet. More importantly, some of the early principles have changed fundamentally.

HPC systems in the early days were much different from those we see today. First, we saw enormous mainframes from large computer manufacturers, including a proprietary operating system and job management system. Second, at universities and research institutes, workstations made inroads and scientists carried out calculations on their dedicated Unix or VMS workstations. In either case, if you needed more computing power, you scaled up, i.e. you bought a bigger machine.

Today the term High-Performance Computing has gained a fundamentally new meaning. HPC is now perceived as a way to tackle complex mathematical, scientific or engineering problems. The integration of industry-standard, “off-the-shelf” server hardware into HPC clusters facilitates the construction of computer networks of a power that no single system could ever achieve. The new paradigm for parallelization is scaling out.
  • 6. HIGH PERFORMANCE COMPUTING — PERFORMANCE TURNS INTO PRODUCTIVITY

Computer-supported simulation of realistic processes (so-called Computer Aided Engineering – CAE) has established itself as a third key pillar in the field of science and research, alongside theory and experimentation. It is nowadays inconceivable that an aircraft manufacturer or a Formula One racing team would operate without using simulation software. And scientific calculations, such as in the fields of astrophysics, medicine, pharmaceuticals and bio-informatics, will to a large extent be dependent on supercomputers in the future. Software manufacturers long ago recognized the benefit of high-performance computers based on powerful standard servers and ported their programs to them accordingly.

The main advantage of scale-out supercomputers is just that: they are infinitely scalable, at least in principle. Since they are based on standard hardware components, such a supercomputer can be charged with more power whenever the computational capacity of the system is no longer sufficient, simply by adding additional nodes of the same kind. A cumbersome switch to a different technology can be avoided in most cases.

The primary rationale for using HPC clusters is to grow, to scale out computing capacity as far as necessary. To reach that goal, an HPC cluster returns most of the investment when it is continuously fed with computing problems. The secondary reason for building scale-out supercomputers is to maximize the utilization of the system.

“transtec HPC solutions are meant to provide customers with unparalleled ease-of-management and ease-of-use. Apart from that, deciding for a transtec HPC solution means deciding for the most intensive customer care and the best service imaginable.”
– Dr. Oliver Tennert, Director Technology Management & HPC Solutions
  • 7. If the individual processes engage in a large amount of communication, the response time of the network (latency) becomes important. Latency in a Gigabit Ethernet or a 10GE network is typically around 10 µs. High-speed interconnects such as InfiniBand reduce latency by a factor of 10, down to as low as 1 µs. Therefore, high-speed interconnects can greatly speed up total processing.

VARIATIONS ON THE THEME: MPP AND SMP
Parallel computations exist in two major variants today. Applications running in parallel on multiple compute nodes are frequently so-called Massively Parallel Processing (MPP) applications. MPP indicates that the individual processes each utilize exclusive memory areas. This means that such jobs are predestined to be computed in parallel, distributed across the nodes in a cluster. The individual processes can thus utilize the separate resources of their respective node – especially the RAM, the CPU power and the disk I/O.

Communication between the individual processes is implemented in a standardized way through the MPI software interface (Message Passing Interface), which abstracts the underlying network connections between the nodes from the processes. However, the MPI standard (current version 2.0) merely requires source code compatibility, not binary compatibility, so an off-the-shelf application usually needs specific versions of MPI libraries in order to run. Examples of MPI implementations are OpenMPI, MPICH2, MVAPICH2, Intel MPI or – for Windows clusters – MS-MPI.

The other frequently used variant is called SMP applications. SMP, in this HPC context, stands for Shared Memory Processing. It involves the use of shared memory areas, the specific implementation of which depends on the underlying operating system. Consequently, SMP jobs generally only run on a single node, where they can in turn be multi-threaded and thus be parallelized across the number of CPUs per node. For many HPC applications, both the MPP and the SMP variant can be chosen.

Many applications are not inherently suitable for parallel execution. In such a case, there is no communication between the individual compute nodes, and therefore no need for a high-speed network between them; nevertheless, multiple computing jobs can be run simultaneously and sequentially on each individual node, depending on the number of CPUs. In order to ensure optimum computing performance for these applications, it must be examined how many CPUs and cores deliver the optimum performance. We typically find applications with this sequential type of work in the fields of data analysis or Monte-Carlo simulations.
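The MPP/SMP distinction above can be illustrated with a small sketch using only Python's standard library: message passing between processes with private memory on the one hand, a shared memory area inside one multi-threaded process on the other. This is an illustrative analogy, not production HPC code — a real MPP application would use an MPI library such as OpenMPI, and the `Pipe` here merely stands in for MPI send/receive:

```python
# MPP-style: worker processes with private address spaces exchange
# results via explicit messages (a Pipe stands in for MPI send/recv).
# SMP-style: threads of a single process write into one shared list.

import threading
from multiprocessing import Pipe, Process

def mpp_worker(conn, chunk):
    # Private memory: the parent sees nothing until we send a message.
    conn.send(sum(chunk))
    conn.close()

def mpp_sum(data, nprocs=4):
    parents, procs = [], []
    for i in range(nprocs):
        parent, child = Pipe()
        p = Process(target=mpp_worker, args=(child, data[i::nprocs]))
        p.start()
        parents.append(parent)
        procs.append(p)
    total = sum(parent.recv() for parent in parents)  # message passing
    for p in procs:
        p.join()
    return total

def smp_sum(data, nthreads=4):
    results = [0] * nthreads          # the shared memory area
    def worker(i):
        results[i] = sum(data[i::nthreads])
    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(nthreads)]
    for t in threads: t.start()
    for t in threads: t.join()
    return sum(results)

if __name__ == "__main__":
    data = list(range(1000))
    assert mpp_sum(data) == smp_sum(data) == sum(data)
```

Both variants compute the same result; the difference lies in where the data lives — which is exactly why MPP jobs can spread across cluster nodes while SMP jobs stay on one node.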
  • 8. HIGH PERFORMANCE COMPUTING — FLEXIBLE DEPLOYMENT WITH XCAT

xCAT as a Powerful and Flexible Deployment Tool
xCAT (Extreme Cluster Administration Tool) is an open source toolkit for the deployment and low-level administration of HPC cluster environments, small as well as large ones. xCAT provides simple commands for hardware control, node discovery, the collection of MAC addresses, and node deployment with (diskful) or without (diskless) local installation. The cluster configuration is stored in a relational database. Node groups for different operating system images can be defined. Also, user-specific scripts can be executed automatically at installation time.

xCAT Provides the Following Low-Level Administrative Features
 Remote console support
 Parallel remote shell and remote copy commands
 Plugins for various monitoring tools like Ganglia or Nagios
 Hardware control commands for node discovery, collecting MAC addresses, remote power switching and resetting of nodes
  • 9.  Automatic configuration of syslog, remote shell, DNS, DHCP, and ntp within the cluster
 Extensive documentation and man pages

For cluster monitoring, we install and configure the open source tool Ganglia or the even more powerful open source solution Nagios, according to the customer’s preferences and requirements.

Local Installation or Diskless Installation
We offer a diskful or a diskless installation of the cluster nodes. A diskless installation means the operating system is hosted partially within the main memory; larger parts may or may not be included via NFS or other means. This approach allows for deploying large numbers of nodes very efficiently, and the cluster is up and running within a very small timescale. Also, updating the cluster can be done in a very efficient way: only the boot image has to be updated, and the nodes have to be rebooted. After this, the nodes run either a new kernel or even a new operating system. Moreover, with this approach, partitioning the cluster can also be done very efficiently, either for testing purposes, or for allocating different cluster partitions to different users or applications.

Development Tools, Middleware, and Applications
Depending on the application, optimization strategy, or underlying architecture, different compilers lead to code of very different performance. Moreover, different, mainly commercial, applications require different MPI implementations. And even when the code is self-developed, developers often prefer one MPI implementation over another.

According to the customer’s wishes, we install various compilers, MPI middleware, as well as job management systems like Parastation, Grid Engine, Torque/Maui, or the very powerful Moab HPC Suite for high-level cluster management.
  • 10. SERVICES AND CUSTOMER CARE FROM A TO Z

[Diagram: the transtec HPC service cycle – presales consulting; application-, customer-, and site-specific sizing of the HPC solution; benchmarking of different systems; burn-in tests of systems; software & OS installation; application installation; onsite hardware assembly; integration into the customer’s environment; customer training; maintenance, support & managed services; continual improvement.]
  • 11. HPC @ TRANSTEC: SERVICES AND CUSTOMER CARE FROM A TO Z
transtec AG has over 30 years of experience in scientific computing and is one of the earliest manufacturers of HPC clusters. For nearly a decade, transtec has delivered highly customized High Performance clusters based on standard components to academic and industry customers across Europe, with all the high quality standards and the customer-centric approach that transtec is well known for.

Every transtec HPC solution is more than just a rack full of hardware – it is a comprehensive solution with everything the HPC user, owner, and operator need.

In the early stages of any customer’s HPC project, transtec experts provide extensive and detailed consulting to the customer – they benefit from expertise and experience. Consulting is followed by benchmarking of different systems with either specifically crafted customer code or generally accepted benchmarking routines; this aids customers in sizing and devising the optimal and detailed HPC configuration.

Each and every piece of HPC hardware that leaves our factory undergoes a burn-in procedure of 24 hours or more if necessary. We make sure that any hardware shipped meets our and our customers’ quality requirements. transtec HPC solutions are turnkey solutions. By default, a transtec HPC cluster has everything installed and configured – from hardware and operating system to important middleware components like cluster management or developer tools and the customer’s production applications. Onsite delivery means onsite integration into the customer’s production environment, be it establishing network connectivity to the corporate network, or setting up software and configuration parts.

transtec HPC clusters are ready-to-run systems – we deliver, you turn the key, the system delivers high performance. Every HPC project entails transfer to production: IT operation processes and policies apply to the new HPC system. Effectively, IT personnel is trained hands-on, introduced to hardware components and software, with all operational aspects of configuration management.

transtec services do not stop when the implementation project ends. Beyond transfer to production, transtec takes care. transtec offers a variety of support and service options, tailored to the customer’s needs. When you are in need of a new installation, a major reconfiguration or an update of your solution – transtec is able to support your staff and, if you lack the resources for maintaining the cluster yourself, maintain the HPC solution for you. From Professional Services to Managed Services for daily operations and required service levels, transtec will be your complete HPC service and solution provider. transtec’s high standards of performance, reliability and dependability assure your productivity and complete satisfaction.

transtec’s HPC Managed Services offer customers the possibility of having the complete management and administration of the HPC cluster handled by transtec service specialists, in an ITIL-compliant way. Moreover, transtec’s HPC on Demand services provide access to HPC resources whenever customers need them, for example because they do not have the possibility of owning and running an HPC cluster themselves, due to lacking infrastructure, know-how, or admin staff.
  • 13. Bright Cluster Manager removes the complexity from the installation, management and use of HPC clusters, without compromising performance or capability. With Bright Cluster Manager, an administrator can easily install, use and manage multiple clusters simultaneously, without the need for expert knowledge of Linux or HPC.
  • 14. CLUSTER MANAGEMENT MADE EASY — BRIGHT CLUSTER MANAGER

A UNIFIED APPROACH
Other cluster management offerings take a “toolkit” approach in which a Linux distribution is combined with many third-party tools for provisioning, monitoring, alerting, etc. This approach has critical limitations because those separate tools were not designed to work together, were not designed for HPC, and were not designed to scale. Furthermore, each of the tools has its own interface (mostly command-line based), and each has its own daemons and databases. Countless hours of scripting and testing from highly skilled people are required to get the tools to work for a specific cluster, and much of it goes undocumented.

Bright Cluster Manager takes a much more fundamental, integrated and unified approach. It was designed and written from the ground up for straightforward, efficient, comprehensive cluster management. It has a single lightweight daemon, a central database for all monitoring and configuration data, and a single CLI and GUI for all cluster management functionality. This approach makes Bright Cluster Manager extremely easy to use, scalable, secure and reliable, complete, flexible, and easy to maintain and support.

EASE OF INSTALLATION
Bright Cluster Manager is easy to install. Typically, system administrators can install and test a fully functional cluster from “bare metal” in less than an hour. Configuration choices made during the installation can be modified afterwards. Multiple installation modes are available, including unattended and remote modes. Cluster nodes can be automatically identified based on switch ports rather than MAC addresses, improving speed and reliability of installation, as well as subsequent maintenance.

[Figure: The cluster installer takes the administrator through the installation process and offers advanced options such as “Express” and “Remote”.]
[Figure: By selecting a cluster node in the tree on the left and the Tasks tab on the right, the administrator can execute a number of powerful tasks on that node with just a single mouse click.]
  • 15. EASE OF USE
Bright Cluster Manager is easy to use. System administrators have two options: the intuitive Cluster Management Graphical User Interface (CMGUI) and the powerful Cluster Management Shell (CMSH). The CMGUI is a standalone desktop application that provides a single system view for managing all hardware and software aspects of the cluster through a single point of control. Administrative functions are streamlined as all tasks are performed through one intuitive, visual interface. Multiple clusters can be managed simultaneously. The CMGUI runs on Linux, Windows and MacOS (coming soon) and can be extended using plugins. The CMSH provides practically the same functionality as the CMGUI, but via a command-line interface. The CMSH can be used both interactively and in batch mode via scripts. Either way, system administrators now have unprecedented flexibility and control over their clusters.

[Figure: Cluster metrics, such as GPU and CPU temperatures, fan speeds and network statistics, can be visualized by simply dragging and dropping them from the list on the left into a graphing window on the right. Multiple metrics can be combined in one graph, and graphs can be zoomed into. Graph layout and colors can be tailored to your requirements.]
  • 16. SUPPORT FOR LINUX AND WINDOWS
Bright Cluster Manager is based on Linux and is available with a choice of pre-integrated, pre-configured and optimized Linux distributions, including SUSE Linux Enterprise Server, Red Hat Enterprise Linux, CentOS and Scientific Linux. Dual-boot installations with Windows HPC Server are supported as well, allowing nodes to either boot from the Bright-managed Linux head node, or the Windows-managed head node.

[Figure: The status of cluster nodes, switches, other hardware, as well as up to six metrics, can be visualized in the Rackview. A zoom-out option is available for clusters with many racks.]
[Figure: The Overview tab provides instant, high-level insight into the status of the cluster.]

EXTENSIVE DEVELOPMENT ENVIRONMENT
Bright Cluster Manager provides an extensive HPC development environment for both serial and parallel applications, including the following (some optional):
  • 17.  Compilers, including full suites from GNU, Intel, AMD and Portland Group
 Debuggers and profilers, including the GNU debugger and profiler, TAU, TotalView, Allinea DDT and Allinea OPT
 GPU libraries, including CUDA and OpenCL
 MPI libraries, including OpenMPI, MPICH, MPICH2, MPICH-MX, MPICH2-MX, MVAPICH and MVAPICH2; all cross-compiled with the compilers installed on Bright Cluster Manager, and optimized for high-speed interconnects such as InfiniBand and Myrinet
 Mathematical libraries, including ACML, FFTW, GMP, GotoBLAS, MKL and ScaLAPACK
 Other libraries, including Global Arrays, HDF5, IIPP, TBB, NetCDF and PETSc

Bright Cluster Manager also provides Environment Modules to make it easy to maintain multiple versions of compilers, libraries and applications for different users on the cluster, without creating compatibility conflicts. Each Environment Module file contains the information needed to configure the shell for an application, and automatically sets these variables correctly for the particular application when it is loaded. Bright Cluster Manager includes many preconfigured module files for many scenarios, such as combinations of compilers, mathematical and MPI libraries.

[Figure: The parallel shell allows for simultaneous execution of commands or scripts across node groups or across the entire cluster.]

POWERFUL IMAGE MANAGEMENT AND PROVISIONING
Bright Cluster Manager features sophisticated software image management and provisioning capability. A virtually unlimited number of images can be created and assigned to as many different categories of nodes as required. Default or custom Linux kernels can be assigned to individual images. Incremental changes to images can be deployed to live nodes without rebooting or re-installation. The provisioning system propagates only changes to the images, minimizing time and impact on system performance and availability. Provisioning capability can be assigned to any number of nodes on-the-fly, for maximum flexibility and scalability. Bright Cluster Manager can also provision over InfiniBand and to RAM disk.

COMPREHENSIVE MONITORING
With Bright Cluster Manager, system administrators can collect, monitor, visualize and analyze a comprehensive set of metrics. Practically all software and hardware metrics available to the Linux kernel, and all hardware management interface metrics (IPMI, iLO, etc.), are sampled.
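The incremental propagation idea behind such image management — transfer only what changed, rather than re-copying the whole image — can be sketched in a few lines. This is an illustrative sketch only, not Bright Cluster Manager code; `sync_image` and its digest-based comparison are hypothetical stand-ins for the real provisioning mechanism:

```python
# Compare a golden software image with a node's copy and transfer only
# the files whose content differs. The saving of an incremental update
# is exactly the files this function decides to skip.

import hashlib
import shutil
from pathlib import Path

def file_digest(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def sync_image(image_dir: str, node_dir: str) -> list:
    """Copy only changed or missing files from image_dir to node_dir.

    Returns the relative paths actually transferred, so the caller can
    see how little data an incremental update moves.
    """
    image, node = Path(image_dir), Path(node_dir)
    transferred = []
    for src in image.rglob("*"):
        if not src.is_file():
            continue
        rel = src.relative_to(image)
        dst = node / rel
        if dst.exists() and file_digest(dst) == file_digest(src):
            continue  # content unchanged -- skip the transfer
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        transferred.append(str(rel))
    return sorted(transferred)
```

A real provisioning system additionally handles deletions, permissions and device files, and moves the data over the network (e.g. with rsync-style delta transfer); the sketch shows only the core “changed files only” decision.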
  • 18. HIGH PERFORMANCE MEETS EFFICIENCY
Initially, massively parallel systems constitute a challenge to both administrators and users. They are complex beasts. Anyone building HPC clusters will need to tame the beast, master the complexity and present users and administrators with an easy-to-use, easy-to-manage system landscape. Leading HPC solution providers such as transtec achieve this goal. They hide the complexity of HPC under the hood and match high performance with efficiency and ease-of-use for both users and administrators. The “P” in “HPC” gains a double meaning: “Performance” plus “Productivity”.

Cluster and workload management software like Moab HPC Suite, Bright Cluster Manager or QLogic IFS provides the means to master and hide the inherent complexity of HPC systems. To administrators and users, HPC clusters are presented as single, large machines, with many different tuning parameters. The software also provides a unified view of existing clusters whenever unified management is added as a requirement by the customer, at any point in time after the first installation. Thus, daily routine tasks such as job management, user management, and queue partitioning and management can be performed easily with either graphical or web-based tools, without any advanced scripting skills or technical expertise required from the administrator or user.
  • 19. THE BRIGHT ADVANTAGE
Bright Cluster Manager offers many advantages that lead to improved productivity, uptime, scalability, performance and security, while reducing total cost of ownership.

Rapid Productivity Gains
 Easy to learn and use, with an intuitive GUI
 Quick installation: from bare metal to a cluster ready to use, in less than an hour
 Fast, flexible provisioning: incremental, live, disk-full, disk-less, provisioning over InfiniBand, auto node discovery
 Comprehensive monitoring: on-the-fly graphs, rackview, multiple clusters, custom metrics
 Powerful automation: thresholds, alerts, actions
 Complete GPU support: NVIDIA, AMD ATI, CUDA, OpenCL
 On-demand SMP: instant ScaleMP virtual SMP deployment
 Powerful cluster management shell and SOAP API for automating tasks and creating custom capabilities
 Seamless integration with leading workload managers: PBS Pro, Moab, Maui, SLURM, Grid Engine, Torque, LSF
 Integrated (parallel) application development environment
 Easy maintenance: automatically update your cluster from Linux and Bright Computing repositories
 Web-based user portal

Maximum Uptime
 Unattended, robust head node failover to spare head node
 Powerful cluster automation functionality allows preemptive actions based on monitoring thresholds
 Comprehensive cluster monitoring and health checking framework, including automatic sidelining of unhealthy nodes to prevent job failure

Scalability from Deskside to TOP500
 Off-loadable provisioning for maximum scalability
 Proven on some of the world’s largest clusters

Minimum Overhead/Maximum Performance
 Single lightweight daemon drives all functionality
 Daemon heavily optimized to minimize effect on operating system and applications
 Single database stores all metric and configuration data

Top Security
 Automated security and other updates from key-signed repositories
 Encrypted external and internal communications (optional)
 X.509v3 certificate-based public-key authentication
 Role-based access control and complete audit trail
 Firewalls and secure LDAP
  • 20. Examples include CPU and GPU temperatures, fan speeds, switches, hard disk SMART information, system load, memory utilization, network statistics, storage metrics, power systems statistics, and workload management statistics. Custom metrics can also easily be defined. Metric sampling is done very efficiently – in one process, or out-of-band where possible. System administrators have full flexibility over how and when metrics are sampled, and historic data can be consolidated over time to save disk space.

CLUSTER MANAGEMENT AUTOMATION
Cluster management automation takes preemptive actions when predetermined system thresholds are exceeded, saving time and preventing hardware damage. System thresholds can be configured on any of the available metrics. The built-in configuration wizard guides the system administrator through the steps of defining a rule: selecting metrics, defining thresholds and specifying actions. For example, a temperature threshold for GPUs can be established that results in the system automatically shutting down an overheated GPU unit and sending an SMS message to the system administrator’s mobile phone. Several predefined actions are available, but any Linux command or script can be configured as an action.

[Figure: The automation configuration wizard guides the system administrator through the steps of defining a rule: selecting metrics, defining thresholds and specifying actions.]
[Figure: Example graphs that visualize metrics on a GPU cluster.]

COMPREHENSIVE GPU MANAGEMENT
Bright Cluster Manager radically reduces the time and effort of managing GPUs, and fully integrates these devices into the single view of the overall system. Bright includes powerful GPU management and monitoring capability that leverages functionality in NVIDIA Tesla GPUs. System administrators can easily assume maximum control of the GPUs and gain instant and time-based status insight. In addition to the standard cluster management capabilities, Bright Cluster Manager monitors the full range of GPU metrics, including:
 GPU temperature, fan speed, utilization
 GPU exclusivity, compute, display, persistence mode
 GPU memory utilization, ECC statistics
 Unit fan speed, serial number, temperature, power usage, voltages and currents, LED status, firmware
 Board serial, driver version, PCI info

Beyond metrics, Bright Cluster Manager features built-in support for GPU computing with CUDA and OpenCL libraries.

MULTI-TASKING VIA PARALLEL SHELL
The parallel shell allows simultaneous execution of multiple commands and scripts across the cluster as a whole, or across easily definable groups of nodes. Output from the executed commands is displayed in a convenient way with variable levels of verbosity. Running commands and scripts can be killed easily if necessary. The parallel shell is available through both the CMGUI and the CMSH.
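The essence of such a parallel shell — run one command on a whole node group concurrently and collect per-node output — can be sketched in a few lines. This is an illustrative sketch, not the CMSH implementation; the local `sh -c` dispatch merely stands in for the ssh- or daemon-based remote execution a real cluster tool would use, and the `{node}` placeholder is an assumption of this example:

```python
# Run one command "on" every node of a group at the same time and
# gather each node's exit code and output.

import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_on_node(node: str, command: str) -> tuple:
    # Hypothetical dispatch: executed locally so the sketch stays
    # self-contained; a real tool would reach the node over the network.
    proc = subprocess.run(["sh", "-c", command.replace("{node}", node)],
                          capture_output=True, text=True, timeout=30)
    return node, proc.returncode, proc.stdout.strip()

def parallel_shell(nodes, command, max_workers=8):
    """Execute `command` across the whole node group concurrently."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda n: run_on_node(n, command), nodes))
    return {node: (rc, out) for node, rc, out in results}

if __name__ == "__main__":
    group = [f"node{i:03d}" for i in range(1, 4)]
    for node, (rc, out) in parallel_shell(group, "echo hello from {node}").items():
        print(f"{node}: rc={rc} {out}")
```

The thread pool bounds how many nodes are contacted at once, which is the same knob a production parallel shell exposes to avoid overwhelming the management network.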
Switching between current and previous versions of CUDA and OpenCL has also been made easy.

INTEGRATED WORKLOAD MANAGEMENT
Bright Cluster Manager is integrated with a wide selection of free and commercial workload managers. This integration
CLUSTER MANAGEMENT MADE EASY
BRIGHT CLUSTER MANAGER

provides a number of benefits:
 The selected workload manager gets automatically installed and configured
 Many workload manager metrics are monitored
 The GUI provides a user-friendly interface for configuring, monitoring and managing the selected workload manager
 The CMSH and the SOAP API provide direct and powerful access to a number of workload manager commands and metrics

WORKLOAD MANAGEMENT QUEUES CAN BE VIEWED AND CONFIGURED FROM THE GUI, WITHOUT THE NEED FOR WORKLOAD MANAGEMENT EXPERTISE.

CREATING AND DISMANTLING A VIRTUAL SMP NODE CAN BE ACHIEVED WITH JUST A FEW CLICKS WITHIN THE GUI OR A SINGLE COMMAND IN THE CLUSTER MANAGEMENT SHELL.
 Reliable workload manager failover is properly configured
 The workload manager is continuously made aware of the health state of nodes (see section on Health Checking)

The following user-selectable workload managers are tightly integrated with Bright Cluster Manager:
 PBS Pro, Moab, Maui, LSF
 SLURM, Grid Engine, Torque

Alternatively, Lava, LoadLeveler or other workload managers can be installed on top of Bright Cluster Manager.

INTEGRATED SMP SUPPORT
Bright Cluster Manager – Advanced Edition dynamically aggregates multiple cluster nodes into a single virtual SMP node, using ScaleMP's Versatile SMP™ (vSMP) architecture. Creating and dismantling a virtual SMP node can be achieved with just a few clicks within the CMGUI. Virtual SMP nodes can also be launched and dismantled automatically using the scripting capabilities of the CMSH. In Bright Cluster Manager a virtual SMP node behaves like any other node, enabling transparent, on-the-fly provisioning, configuration, monitoring and management of virtual SMP nodes as part of the overall system management.

MAXIMUM UPTIME WITH HEAD NODE FAILOVER
Bright Cluster Manager – Advanced Edition allows two head nodes to be configured in active-active failover mode. Both head nodes are on active duty, but if one fails, the other takes over all tasks, seamlessly.

MAXIMUM UPTIME WITH HEALTH CHECKING
Bright Cluster Manager – Advanced Edition includes a powerful cluster health checking framework that maximizes system uptime. It continually checks multiple health indicators for all hardware and software components and proactively initiates corrective actions. It can also automatically perform a series of standard and user-defined tests just before starting a new job, to ensure a successful execution. Examples of corrective actions include autonomous bypass of faulty nodes, automatic job requeuing to avoid queue flushing, and process "jailing" to allocate, track, trace and flush completed user processes. The health checking framework ensures the highest job throughput, the best overall cluster efficiency and the lowest administration overhead.

WEB-BASED USER PORTAL
The web-based user portal provides read-only access to essential cluster information, including a general overview of the cluster status, node hardware and software properties, workload manager statistics and user-customizable graphs. The User Portal can easily be customized and expanded using PHP and the SOAP API.

USER AND GROUP MANAGEMENT
Users can be added to the cluster through the CMGUI or the CMSH. Bright Cluster Manager comes with a pre-configured LDAP database, but an external LDAP service, or alternative authentication system, can be used instead.

ROLE-BASED ACCESS CONTROL AND AUDITING
Bright Cluster Manager's role-based access control mechanism allows administrator privileges to be defined on a per-role basis.
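The pre-job health-check pattern described above can be sketched as follows. This is an illustrative sketch, not Bright Cluster Manager code; the check names (a fan check and a disk-space check) and the node records are hypothetical.

```python
# Sketch of a pre-job health checking pass: run every check on each node
# and bypass nodes that fail, so the job is only placed on healthy ones.
# Illustrative only; not actual Bright Cluster Manager code.
def all_fans_running(node):
    # Hypothetical check; a real framework would query hardware sensors.
    return node.get("fans_ok", True)


def disk_space_ok(node):
    # Hypothetical check: require a minimum amount of free scratch space.
    return node.get("free_disk_gb", 0) >= 10


HEALTH_CHECKS = [all_fans_running, disk_space_ok]


def healthy_nodes(nodes):
    """Return names of nodes that pass every check; failing nodes are bypassed."""
    passed = []
    for node in nodes:
        if all(check(node) for check in HEALTH_CHECKS):
            passed.append(node["name"])
    return passed


cluster = [
    {"name": "node001", "fans_ok": True,  "free_disk_gb": 120},
    {"name": "node002", "fans_ok": False, "free_disk_gb": 200},  # faulty fan
    {"name": "node003", "fans_ok": True,  "free_disk_gb": 5},    # low disk
]
```

Running `healthy_nodes(cluster)` here would keep only node001; the other two would be bypassed rather than allowed to fail the job mid-run, which is the essence of the corrective-action approach described above.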
Administrator actions can be audited using an audit file which stores all their write actions.

TOP CLUSTER SECURITY
Bright Cluster Manager offers an unprecedented level of security that can easily be tailored to local requirements. Security features include:
 Automated security and other updates from key-signed Linux and Bright Computing repositories
 Encrypted internal and external communications
 X.509v3 certificate based public-key authentication to the cluster management infrastructure

THE WEB-BASED USER PORTAL PROVIDES READ-ONLY ACCESS TO ESSENTIAL CLUSTER INFORMATION, INCLUDING A GENERAL OVERVIEW OF THE CLUSTER STATUS, NODE HARDWARE AND SOFTWARE PROPERTIES, WORKLOAD MANAGER STATISTICS AND USER-CUSTOMIZABLE GRAPHS.

"The building blocks for transtec HPC solutions must be chosen according to our goals: ease-of-management and ease-of-use. With Bright Cluster Manager, we are happy to have the technology leader at hand, meeting these requirements, and our customers value that."
Armin Jäger, HPC Solution Engineer
 Role-based access control and complete audit trail
 Firewalls and secure LDAP
 Secure shell access

MULTI-CLUSTER CAPABILITY
Bright Cluster Manager is ideal for organizations that need to manage multiple clusters, either in one or in multiple locations. Capabilities include:
 All cluster management and monitoring functionality available for all clusters through one GUI
 Selecting any set of configurations in one cluster and exporting them to any or all other clusters with a few mouse clicks
 Making node images available to other clusters

STANDARD AND ADVANCED EDITIONS
Bright Cluster Manager is available in two editions: Standard and Advanced. The table on this page lists the differences. You can easily upgrade from the Standard to the Advanced Edition as your cluster grows in size or complexity.

DOCUMENTATION AND SERVICES
A comprehensive system administrator manual and user manual are included in PDF format. Customized training and professional services are available. Services include various levels of support, installation services and consultancy.

BRIGHT CLUSTER MANAGER CAN MANAGE MULTIPLE CLUSTERS SIMULTANEOUSLY. THIS OVERVIEW SHOWS CLUSTERS IN OSLO, ABU DHABI AND HOUSTON, ALL MANAGED THROUGH ONE GUI.

CLUSTER HEALTH CHECKS CAN BE VISUALIZED IN THE RACKVIEW. THIS SCREENSHOT SHOWS THAT GPU UNIT 41 FAILS A HEALTH CHECK CALLED "ALLFANSRUNNING".
FEATURE                              STANDARD    ADVANCED
Choice of Linux distributions        x           x
Intel Cluster Ready                  x           x
Cluster Management GUI               x           x
Cluster Management Shell             x           x
Web-Based User Portal                x           x
SOAP API                             x           x
Node Provisioning                    x           x
Node Identification                  x           x
Cluster Monitoring                   x           x
Cluster Automation                   x           x
User Management                      x           x
Parallel Shell                       x           x
Workload Manager Integration         x           x
Cluster Security                     x           x
Compilers                            x           x
Debuggers & Profilers                x           x
MPI Libraries                        x           x
Mathematical Libraries               x           x
Environment Modules                  x           x
NVIDIA CUDA & OpenCL                 x           x
GPU Management & Monitoring          x           x
ScaleMP Management & Monitoring      -           x
Redundant Failover Head Nodes        -           x
Cluster Health Checking              -           x
Off-loadable Provisioning            -           x
Suggested Number of Nodes            4–128       129–10,000+
Multi-Cluster Management             -           x
Standard Support                     x           x
Premium Support                      Optional    Optional
While all HPC systems face challenges in workload demand, resource complexity, and scale, enterprise HPC systems face more stringent challenges and expectations. Enterprise HPC systems must meet mission-critical and priority HPC workload demands for commercial businesses and business-oriented research and academic organizations. They have complex SLAs and priorities to balance. Their HPC workloads directly impact the revenue, product delivery, and organizational objectives of their organizations.
INTELLIGENT HPC WORKLOAD MANAGEMENT
MOAB HPC SUITE – ENTERPRISE EDITION

MOAB HPC SUITE
Moab is the most powerful intelligence engine for policy-based, predictive scheduling across workloads and resources. Moab HPC Suite accelerates results delivery and maximizes utilization while simplifying workload management across complex, heterogeneous cluster environments. The Moab HPC Suite products leverage the multi-dimensional policies in Moab to continually model and monitor workloads, resources, SLAs, and priorities to optimize workload output. These policies utilize the unique Moab management abstraction layer that integrates data across heterogeneous resources and resource managers to maximize control as you automate workload management actions.

Managing the World's Top Systems, Ready to Manage Yours
Moab manages the largest, most scale-intensive and complex HPC environments in the world, including 40% of the top 10 supercomputing systems, nearly 40% of the top 25, and 36% of the compute cores in the top 100 systems, based on rankings from www.Top500.org. So you know it is battle-tested and ready to efficiently and intelligently manage the complexities of your environment.

MOAB HPC SUITE – ENTERPRISE EDITION
Moab HPC Suite – Enterprise Edition provides enterprise-ready HPC workload management that self-optimizes the productivity, workload uptime and meeting of SLAs and business priorities for HPC systems and HPC cloud. It uses the battle-tested and patented Moab intelligence engine to automate the mission-critical workload priorities of enterprise HPC systems. Enterprise customers benefit from a single integrated product that brings

"With Moab HPC Suite, we can meet very demanding customers' requirements as regards unified management of heterogeneous cluster environments and grid management, and provide them with flexible and powerful configuration and reporting options. Our customers value that highly."
Thomas Gebert, HPC Solution Architect
together key enterprise HPC capabilities, implementation, training, and 24x7 support services to speed the realization of benefits from their HPC system for their business. Moab HPC Suite – Enterprise Edition delivers:
 Productivity acceleration
 Uptime automation
 Auto-SLA enforcement
 Grid- and cloud-ready HPC management

Designed to Solve Enterprise HPC Challenges
While all HPC systems face challenges in workload and resource complexity, scale and demand, enterprise HPC systems face more stringent challenges and expectations. Enterprise HPC systems must meet mission-critical and priority HPC workload demands for commercial businesses and business-oriented research and academic organizations. These organizations have complex SLAs and priorities to balance. And their HPC workloads directly impact the revenue, product delivery, and organizational objectives of their organizations. Enterprise HPC organizations must eliminate job delays and failures. They are also seeking to improve resource utilization and workload management efficiency across multiple heterogeneous systems. To maximize user productivity, they are required to make it easier to access and use HPC resources for users and even expand to other clusters or HPC cloud to better handle workload demand and surges.

BENEFITS
Moab HPC Suite – Enterprise Edition offers key benefits to reduce costs, improve service performance, and accelerate the productivity of enterprise HPC systems. These benefits drive the achievement of business objectives and outcomes that depend on the results the enterprise HPC systems deliver. Moab HPC Suite – Enterprise Edition delivers:

Productivity acceleration to get more results faster and at a lower cost
Moab HPC Suite – Enterprise Edition gets more results delivered faster from HPC resources to lower costs while accelerating overall system, user and administrator productivity. Moab provides the unmatched scalability, 90-99 percent utilization, and fast and simple job submission that is required to maximize productivity in enterprise HPC organizations. The Moab intelligence engine optimizes workload scheduling and orchestrates resource provisioning and management to maximize workload speed and quantity. It also unifies workload management across heterogeneous resources, resource managers and even multiple clusters to reduce management complexity and costs.

Uptime automation to ensure workload completes successfully
HPC job and resource failures in enterprise HPC systems lead to delayed results and missed organizational opportunities and objectives. Moab HPC Suite – Enterprise Edition intelligently automates workload and resource uptime in HPC systems to ensure that workload completes reliably and avoids these failures.

Auto-SLA enforcement to consistently meet service guarantees and business priorities
Moab HPC Suite – Enterprise Edition uses the powerful Moab intelligence engine to optimally schedule and dynamically adjust workload to consistently meet service level agreements (SLAs), guarantees, and business priorities. This automatically
ensures that the right workloads are completed at the optimal times, taking into account the complex mix of departments, priorities and SLAs to be balanced.

Grid- and Cloud-ready HPC management to more efficiently manage and meet workload demand
The benefits of a traditional HPC environment can be extended to more efficiently manage and meet workload and resource demand by sharing workload across multiple clusters through grid management and the HPC cloud management capabilities provided in Moab HPC Suite – Enterprise Edition.

CAPABILITIES
Moab HPC Suite – Enterprise Edition brings together key enterprise HPC capabilities into a single integrated product that self-optimizes the productivity, workload uptime, and meeting of SLAs and priorities for HPC systems and HPC Cloud.

Productivity acceleration capabilities deliver more results faster, lower costs, and increase resource, user and administrator productivity
 Massive scalability accelerates job response and throughput, including support for high throughput computing
 Workload-optimized allocation policies and provisioning get more results out of existing heterogeneous resources to reduce costs
 Workload unification across heterogeneous clusters maximizes resource availability for workloads and administration efficiency by managing workload as one cluster
 Simplified HPC submission and control for both users and administrators with job arrays, templates, self-service submission
portal and administrator dashboard
 Optimized intelligent scheduling that packs workloads and backfills around priority jobs and reservations while balancing SLAs to efficiently use all available resources
 Advanced scheduling and management of GPGPUs for jobs to maximize their utilization, including auto-detection, policy-based GPGPU scheduling and GPGPU metrics reporting
 Workload-aware auto-power management reduces energy use and costs by 30-40 percent with intelligent workload consolidation and auto-power management

Uptime automation capabilities ensure workload completes successfully and reliably, avoiding failures and missed organizational opportunities and objectives
 Intelligent resource placement prevents job failures with granular resource modeling that ensures all workload requirements are met while avoiding at-risk resources
 Auto-response to incidents and events maximizes job and system uptime with configurable actions to pre-failure conditions, amber alerts, or other metrics and monitors
 Workload-aware maintenance scheduling helps maintain a stable HPC system without disrupting workload productivity
 Real-world services expertise ensures fast time to value and system uptime with included package of implementation, training, and 24x7 remote support services

Auto-SLA enforcement schedules and adjusts workload to consistently meet service guarantees and business priorities so the right workloads are completed at the optimal times
 Department budget enforcement schedules resources in line with resource sharing agreements and budgets (i.e. usage limits, usage reports, etc.)
 SLA and priority policies ensure the highest priority workloads are processed first (i.e. quality of service, hierarchical priority weighting, dynamic fairshare policies, etc.)
 Continuous plus future scheduling ensures priorities and guarantees are proactively met as conditions and workload levels change (i.e. future reservations, priorities, and pre-emption)

Grid- and cloud-ready HPC management extends the benefits of your traditional HPC environment to more efficiently manage workload and better meet workload demand
 Pay-for-use showback and chargeback capabilities track actual resource usage with flexible chargeback options and reporting by user or department
 Manage and share workload across multiple remote clusters to meet growing workload demand or surges with the single self-service portal and intelligence engine with purchase of Moab HPC Suite - Grid Option

ARCHITECTURE
Moab HPC Suite - Enterprise Edition is architected to integrate on top of your existing job resource managers and other types of resource managers in your environment. It provides policy-based scheduling and management of workloads as well as resource allocation and provisioning orchestration. The Moab intelligence engine makes complex scheduling and management decisions based on all of the data it integrates from the various resource managers and then orchestrates the job and management actions through those resource managers. It does this without requiring any additional agents. This makes it the ideal choice to integrate with existing and new systems
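Hierarchical priority weighting of the kind listed above can be sketched as a weighted sum over policy dimensions. The dimensions and weights below are hypothetical, illustrative stand-ins, not Moab's actual priority formula or configuration syntax.

```python
# Illustrative sketch of multi-dimensional, policy-based job prioritization:
# each dimension (quality of service, fairshare standing, time in queue)
# contributes a weighted term to an overall priority score.
# The weights and fields are hypothetical, not Moab's actual formula.
WEIGHTS = {"qos": 1000, "fairshare": 100, "queue_hours": 10}


def priority(job: dict) -> int:
    """Higher score = scheduled earlier."""
    return (WEIGHTS["qos"] * job["qos_level"]
            + WEIGHTS["fairshare"] * job["fairshare_delta"]
            + WEIGHTS["queue_hours"] * job["hours_waiting"])


jobs = [
    {"name": "batch",    "qos_level": 0, "fairshare_delta": 1, "hours_waiting": 12},
    {"name": "critical", "qos_level": 2, "fairshare_delta": 0, "hours_waiting": 1},
]
ordered = sorted(jobs, key=priority, reverse=True)
```

Because every dimension is just another weighted term, an administrator can tune the balance between business priority (QoS), fairness across groups, and waiting time by adjusting the weights rather than rewriting scheduling logic.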
INTELLIGENT HPC WORKLOAD MANAGEMENT
NEW IN MOAB 7.0

NEW MOAB HPC SUITE 7.0
The new Moab HPC Suite 7.0 releases deliver continued breakthrough advancements in scalability, reliability, and job array management to accelerate system productivity, as well as extended database support. Here is a look at the new capabilities and the value they offer customers:

TORQUE Resource Manager Scalability and Reliability Advancements for Petaflop and Beyond
As part of the Moab HPC Suite 7.0 releases, the TORQUE 4.0 resource manager features scalability and reliability advancements to fully exploit Moab scalability. These advancements maximize your use of increasing hardware capabilities and enable you to meet growing HPC user needs. Key advancements in TORQUE 4.0 for Moab HPC Suite 7.0 include:
 The new Job Radix enables you to efficiently run jobs that span tens of thousands or even hundreds of thousands of nodes. Each MOM daemon now cascades job communication with multiple other MOM daemons simultaneously to reduce the job start-up process time to a small fraction of what it would normally take across a large number of nodes. The Job Radix eliminates lost jobs and job start-up bottlenecks caused by having all nodes' MOM daemons communicating with only one head MOM node. This saves critical minutes on job start-up process time and allows for higher job throughput.
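The effect of cascading communication can be illustrated with a back-of-the-envelope model: with a fan-out (radix) of k, the number of communication rounds needed to reach N nodes grows roughly logarithmically rather than linearly. The model below is a simplification for illustration, not TORQUE's actual protocol.

```python
# Simplified model: one coordinator contacting N nodes directly needs N
# sequential messages; a cascade where every informed daemon forwards to
# k others reaches all nodes in roughly log_k(N) rounds.
def cascade_rounds(n_nodes: int, radix: int) -> int:
    """Rounds for a k-ary cascade to cover n_nodes (coordinator starts informed)."""
    informed, rounds = 1, 0
    while informed < n_nodes:
        informed += informed * radix  # every informed daemon contacts `radix` more
        rounds += 1
    return rounds


# With a radix of 10, reaching 100,000 nodes takes only a handful of rounds,
# versus 100,000 sequential contacts from a single head MOM node.
rounds = cascade_rounds(100_000, 10)
```

This is why fanning out start-up traffic across many MOM daemons, rather than funneling it through one head node, turns minutes of job start-up time into a small fraction of that.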
 New MOM daemon communication hierarchy increases the number of nodes supported and reduces the overhead of cluster status updates by distributing communication across multiple nodes instead of a single TORQUE head node. This makes status updates more efficient and scheduling faster and more responsive.
 New multi-threading improves response and reliability, allowing for instant feedback to user requests as well as the ability to continue work even if some processes linger.
 Improved network communications, with all UDP-based communication replaced with TCP to make data transfers from node to node more reliable.

Job Array Auto-Cancellation Policies Improve System Productivity
Moab HPC Suite 7.0 improves system productivity with new job array auto-cancellation policies that cancel remaining sub-jobs in an array once the solution is found in the array results. This frees up resources, which would otherwise be running irrelevant jobs, to run other jobs in the queue more quickly. The job array auto-cancellation policies allow you to set auto-cancellations of sub-jobs based on the first or any instance of results success or failure, or on specific exit codes.

Extended Database Support Now Includes PostgreSQL and Oracle in Addition to MySQL
The extended database support in Moab HPC Suite 7.0 enables customers to use ODBC-compliant PostgreSQL and Oracle databases in addition to MySQL. This provides customers the flexibility to use the database that best meets their needs or is the standard for their system.

New Moab Web Services Provide Easier Standard Integration and Customization
New Moab Web Services provide easier standard integration and customization for a customer's environment, such as integration with existing user portals, plug-ins of resource managers for rich data integration, and script integration. Customers now have a standard interface to Moab with REST APIs.

Simplified Self-Service and Admin Dashboard Portal Experience
Moab HPC Suite 7.0 features an enhanced self-service and admin dashboard portal with simplified "click-based" job submission for end users as well as new visual cluster dashboard views of nodes, jobs, and reservations for more efficient management. The new Visual Cluster dashboard provides administrators and users views of their cluster resources that are easily filtered by almost any factor, including id, name, IP address, state, power, pending actions, reservations, load, memory, processors, etc. Users can also quickly filter and view their jobs by name, state, user, group, account, wall clock requested, memory requested, start date/time, submit date/time, etc. One-click drill-downs provide additional details and options for management actions.

Resource Usage Accounting Flexibility
Moab HPC Suite 7.0 includes more flexible resource usage accounting options that enable administrators to easily duplicate custom organizational hierarchies such as organization, groups, projects, business units, cost centers etc. in the Moab Accounting Manager usage budgets and charging structure. This ensures resource usage is budgeted, tracked, and reported or charged back for in the most useful way to admins and their customer groups and users.
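The cancel-on-first-success policy can be sketched as follows; the job states and policy flag are hypothetical illustrations, not Moab configuration syntax.

```python
# Sketch of a job-array auto-cancellation policy: once one sub-job
# succeeds (exit code 0), the remaining sub-jobs are cancelled so
# their resources can go to other queued work. Illustrative only.
def run_array(sub_jobs, cancel_on_success=True):
    """sub_jobs: list of (job_id, exit_code) in completion order.

    Returns (completed, cancelled): sub-jobs that ran, and sub-job ids
    cancelled after the first success.
    """
    completed, cancelled = [], []
    remaining = list(sub_jobs)
    while remaining:
        job_id, exit_code = remaining.pop(0)
        completed.append((job_id, exit_code))
        if cancel_on_success and exit_code == 0:
            cancelled = [jid for jid, _ in remaining]  # free these resources
            break
    return completed, cancelled


# Sub-jobs 1 and 2 fail, sub-job 3 finds the solution; 4 and 5 never run.
done, freed = run_array([(1, 1), (2, 1), (3, 0), (4, 0), (5, 1)])
```

The same skeleton extends to the other trigger variants mentioned above: cancelling on the first failure, or matching a specific exit code, is just a different condition in place of `exit_code == 0`.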
as well as to manage your HPC system as it grows and expands in the future.

Moab HPC Suite – Enterprise Edition includes the patented Moab intelligence engine that enables it to integrate with and automate management across existing heterogeneous environments to optimize management and workload efficiency. This unique intelligence engine includes:
 Industry-leading multi-dimensional policies that automate the complex real-time decisions and actions for scheduling workload and allocating and adapting resources. These multi-dimensional policies can model and consider the workload requirements, resource attributes and affinities, SLAs and priorities to enable more complex and efficient decisions to be automated.
 Real-time and predictive future environment scheduling that drives more accurate and efficient decisions and service guarantees, as it can proactively adjust scheduling and resource allocations as it projects the impact of workload and resource condition changes.
 Open and flexible management abstraction layer that lets you integrate the data and orchestrate workload actions across the chaos of complex heterogeneous cluster environments and management middleware to maximize workload control, automation, and optimization.

COMPONENTS
Moab HPC Suite – Enterprise Edition includes the following integrated products and technologies for a complete HPC workload management solution:
 Moab Workload Manager: Patented multi-dimensional
intelligence engine that automates the complex decisions and orchestrates policy-based workload placement and scheduling as well as resource allocation, provisioning and energy management
 Moab Cluster Manager: Graphical desktop administrator application for managing, configuring, monitoring, and reporting for Moab-managed clusters
 Moab Viewpoint: Web-based user self-service job submission and management portal and administrator dashboard portal
 Moab Accounting Manager: HPC resource use budgeting and accounting tool that enforces resource sharing agreements and limits based on departmental budgets and provides showback and chargeback reporting for resource usage
 Moab Services Manager: Integration interfaces to resource managers and third-party tools

Moab HPC Suite – Enterprise Edition is also integrated with TORQUE, which is available as a free download on AdaptiveComputing.com. TORQUE is an open-source job/resource manager that provides continually updated information regarding the state of nodes and workload status. Adaptive Computing is the custodian of the TORQUE project and is actively developing the code base in cooperation with the TORQUE community to provide state-of-the-art resource management. Each Moab HPC Suite product subscription includes support for the Moab HPC Suite as well as TORQUE, if you choose to use TORQUE as the job/resource manager for your cluster.

MOAB HPC SUITE – BASIC EDITION
Moab HPC Suite – Basic Edition is a multi-dimensional policy-based workload management system that accelerates and automates the scheduling, managing, monitoring, and reporting of HPC workloads on massive-scale, multi-technology installations. The Moab HPC Suite – Basic Edition patented multi-dimensional decision engine accelerates both the decisions and orchestration of workload across the ideal combination of diverse resources, including specialized resources like GPGPUs. The speed and accuracy of the decisions and scheduling automation optimize workload throughput and resource utilization, so more work is accomplished in less time with existing resources to control costs and increase the value of HPC investments.

Moab HPC Suite – Basic Edition enables you to address pressing HPC challenges, including:
 Delays to workload start and end times slowing results
 Inconsistent delivery on service guarantees and SLA commitments
 Under-utilization of resources
 How to efficiently manage workload across heterogeneous and hybrid systems of GPGPUs, hardware, and middleware
 How to simplify job submission and management for users and administrators to maximize productivity

Moab HPC Suite – Basic Edition acts as the "brain" of an HPC system to accelerate and automate complex decision-making processes. The patented decision engine is capable of making the complex multi-dimensional policy-based decisions needed to schedule workload to optimize job speed, job success and resource utilization. Moab HPC Suite – Basic Edition integrates decision-making data from, and automates actions through, your system's existing mix of resource managers. This enables all the dimensions
of real-time granular resource attributes and state, as well as the timing of current and future resource commitments, to be factored into more efficient and accurate scheduling and allocation decisions. It also dramatically simplifies the management tasks and processes across these complex, heterogeneous environments. Moab works with many of the major resource management and industry-standard resource monitoring tools covering mixed hardware, network, storage and licenses.

Moab HPC Suite – Basic Edition policies are also able to factor in organizational priorities and complexities when scheduling workload and allocating resources. Moab ensures workload is processed according to organizational priorities and commitments and that resources are shared fairly across users, groups and even multiple organizations. This enables organizations to automatically enforce service guarantees and effectively manage organizational complexities with simple policy-based settings.

BENEFITS
Moab HPC Suite – Basic Edition drives more ROI and results from your HPC environment, including:
 Improved job response times and job throughput with a workload decision engine that accelerates complex workload scheduling decisions to enable faster job start times and high throughput computing
 Optimized resource utilization to 90-99 percent with multi-dimensional and predictive workload scheduling to accomplish more with your existing resources
 Automated enforcement of service guarantees, priorities, and resource sharing agreements across users, groups, and projects
 Increased productivity by simplifying HPC use, access, and
control for both users and administrators with job arrays, job templates, optional user portal, and GUI administrator management and monitoring tool
 Streamlined job turnaround and reduced administrative burden by unifying and automating workload tasks and resource processes across diverse resources and mixed-system environments, including GPGPUs
 A scalable workload management architecture that can manage peta-scale and beyond, is grid-ready, compatible with existing infrastructure, and extensible to manage your environment as it grows and evolves

CAPABILITIES
Moab HPC Suite – Basic Edition accelerates workload processing with a patented multi-dimensional decision engine that self-optimizes workload placement, resource utilization and results output while ensuring organizational priorities are met across the users and groups leveraging the HPC environment.

Policy-driven scheduling intelligently places workload on an optimal set of diverse resources to maximize job throughput and success as well as utilization and the meeting of workload and group priorities
 Priority, SLA and resource sharing policies ensure the highest priority workloads are processed first and resources are shared fairly across users and groups, such as quality of service, hierarchical priority weighting, and fairshare targets, limits and weights policies
 Allocation policies optimize resource utilization and prevent job failures with granular resource modeling and scheduling, affinity- and node topology-based placement
 Backfill job scheduling speeds job throughput and maximizes utilization by scheduling smaller or less demanding jobs as they can fit around priority jobs and reservations to use all available resources
 Security policies control which users and groups can access which resources
 Checkpointing

Real-time and predictive scheduling ensure job priorities and guarantees are proactively met as conditions and workload levels change
 Advanced reservations guarantee that jobs run when required
 Maintenance reservations reserve resources for planned future maintenance to avoid disruption to business workloads
 Predictive scheduling enables the future workload schedule to be continually forecasted and adjusted, along with resource allocations, to adapt to changes in conditions and new job and reservation requests

Advanced scheduling and management of GPGPUs for jobs to maximize their utilization
 Automatic detection and management of GPGPUs in the environment to eliminate manual configuration and make them immediately available for scheduling
 Exclusive allocation and scheduling of GPGPUs on a per-job basis
 Policy-based management and scheduling using GPGPU metrics
 Quick access to statistics on GPGPU utilization and key metrics for optimal management and issue diagnosis, such as error counts, temperature, fan speed, and memory
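Backfill as described above can be sketched with a toy model: a smaller job may start ahead of its queue position only if it finishes before the reserved start time of the highest-priority waiting job. This is a simplified single-resource illustration, not Moab's actual scheduling algorithm.

```python
# Toy backfill model on a single resource: the top-priority job holds a
# reservation at time `reserved_start`; shorter queued jobs may run now
# only if they complete before that reservation. Illustrative only.
def backfill(queue, now, reserved_start):
    """queue: list of (name, runtime). Return names allowed to start now."""
    started = []
    t = now
    for name, runtime in queue:
        if t + runtime <= reserved_start:  # fits in the gap before the reservation
            started.append(name)
            t += runtime
    return started


# Reservation for the priority job at t=10; the two short jobs fit in
# the gap, while the 9-hour job would delay the reservation and must wait.
picked = backfill([("short_a", 4), ("long_b", 9), ("short_c", 5)],
                  now=0, reserved_start=10)
```

The key property, visible in the guard condition, is that backfilled jobs never push back the reserved start of the priority job; they only soak up resource time that would otherwise sit idle.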
Easier submission, management, and control of job arrays improve user productivity and job throughput efficiency
 Users can easily submit thousands of sub-jobs with a single job submission, with an array index differentiating each array sub-job
 Job array usage limit policies enforce job-count maximums by credentials or class
 Simplified reporting and management of job arrays for end users filters jobs to summarize, track and manage at the master job level

Scalable job performance for large-scale, extreme-scale, and high-throughput computing environments
 Efficiently manages the submission and scheduling of hundreds of thousands of queued job submissions to support high throughput computing
 Fast scheduler response to user commands while scheduling, so users and administrators get the real-time job information they need
 Fast job throughput rate to get results started and delivered faster and keep utilization of resources up

Open and flexible management abstraction layer easily integrates with and automates management across existing heterogeneous resources and middleware to improve management efficiency
 Rich data integration and aggregation enables you to set powerful, multi-dimensional policies based on the existing real-time resource data monitored, without adding any new agents
 Heterogeneous resource allocation and management for workloads across mixed hardware, specialty resources such as