The University of Leicester consolidated its high performance computing (HPC) capabilities into a private cloud infrastructure with Panasas ActiveStor, a high performance parallel storage system running over 10GbE and InfiniBand. The centralized supercomputer, Advanced Leicester Information and Computational Environment (ALICE), allows multi-disciplinary researchers to run computationally intensive research projects quickly and efficiently while resulting in significant cost savings for the university.
University of Leicester and the ALICE Supercomputer
Background
Dr. Chris Rudge had big challenges to overcome when he was tasked with creating a central supercomputing facility at The University of Leicester, one of the first universities in the UK to centralize its high performance computing (HPC) systems inside a private cloud infrastructure. Most academics were used to owning their own HPC capabilities, or sharing departmental facilities between small groups of colleagues. To succeed, Dr. Rudge had to deliver a system that was better than the custom HPC clusters scattered across the campus if departmental users were to accept the new centralized HPC facility.

In challenging economic times, the new system would have to deliver significant cost savings over many individual HPC storage systems. Another major consideration was that an increasing number of HPC resource users did not work in traditional "computation heavy" disciplines like physics, astronomy, chemistry and engineering. The new system therefore had to scale out performance on a single system, but also be reliable and easy to use, particularly for academics who did not have a background in using HPC clusters.

Previously, Dr. Rudge was responsible for the largest HPC system at the university, which was dedicated to the Physics & Astronomy Department. He was selected to manage the project because he understood the challenges of running a small HPC system within a single department.

"Academic departments typically struggle to secure research funding to employ skilled IT support staff simply to administer their HPC systems," commented Rudge. "Without dedicated support staff, HPC systems do not usually function well and researchers find it difficult to make best use of these expensive resources."

By centralizing HPC support, Dr. Rudge assembled a team of six to manage the new system, the Advanced Leicester Information and Computational Environment, dubbed ALICE. The large team dramatically hastened the process of designing, procuring and installing the HPC infrastructure, and the final system was delivered on time and on budget, something that individual departments often fail to achieve with smaller systems.

"HPC clusters have developed to a level where they are now professional products which we should expect to 'just work.' The complete solution chosen by the University of Leicester does exactly that with best of breed storage from Panasas. Ease of management allows researchers to optimize their use of the system without having to spend valuable time simply keeping the system running. This is very important on an HPC system which supports a broad range of research across a diverse set of academic disciplines."
-- Dr. Chris Rudge, University of Leicester

The Solution

By consolidating its high performance storage and compute budget into a centralized private cloud, the university avoided the cost of over-provisioning for peak workloads, as is typical within individual departments. The facility is far bigger than would be possible for any individual departmental system, allowing researchers to test new ideas with smaller problem sizes, and then to do a small number of large runs that leverage all of ALICE's processing cores. The team worked with a leading HPC server provider to create a system with 256 compute nodes, two login nodes and two management nodes, all connected by a fast InfiniBand network fabric. Each computational node offers a pair of
quad-core 2.67GHz Intel Xeon X5550 CPUs, providing 2048 CPU cores for running jobs. The popular 64-bit Scientific Linux, a version of Red Hat Enterprise Linux, was chosen as the operating system.

Perhaps one of the most important considerations was selecting the accompanying data storage system. The optimal solution would need to quickly deliver data to the powerful computer system while providing the flexibility to meet the diverse needs of many researchers. A high performance storage system was critical: Dr. Rudge knew that while the compute nodes would have very similar performance regardless of supplier, storage performance could differ materially. The need to store data for HPC projects from all university departments in a global namespace required a solution that delivered all this, but at a very competitive cost.
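The 2048-core figure follows directly from the node specification quoted above; a quick arithmetic check, using only numbers given in the text:

```python
# Core count for ALICE as described: 256 compute nodes, each with a pair
# of quad-core Intel Xeon X5550 CPUs.
compute_nodes = 256
sockets_per_node = 2      # "a pair of ... CPUs"
cores_per_socket = 4      # quad-core X5550

total_cores = compute_nodes * sockets_per_node * cores_per_socket
print(total_cores)  # 2048, matching the figure in the text
```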
Panasas® ActiveStor™ Parallel Storage
"Panasas storage performance and manageability were the key reasons that the University of Leicester chose Panasas for ALICE. HPC clusters have largely standardized on the use of x64 CPUs with dual socket nodes and InfiniBand interconnect. The type of storage and its integration into the complete system was seen as one of the few areas where there would be differences between suppliers. Selecting the wrong product could have potentially crippled the system: however fast the CPUs are, they are rendered useless if there is no data to process. ALICE is the largest HPC system ever purchased by the University, and benchmarking during the tender process demonstrated clearly that Panasas offered the best performance, both on a per node basis and also without degradation of the aggregate performance with access from multiple nodes."
-- Dr. Chris Rudge, University of Leicester

The team developed a suite of benchmarks during the tender process, including one designed to stress the storage and the network infrastructure linking the compute nodes to the storage. This critical benchmark was designed to run with varying degrees of parallelism and on different numbers of nodes, ranging from running one job on a single node to running eight jobs on each of 32 nodes. The benchmark would find the limits of bandwidth per node, as well as identify any degradation in performance as the number of jobs per node and the number of nodes running the benchmark increased. Two storage solutions utilizing parallel file systems were shortlisted for benchmarking: Panasas ActiveStor, a high performance parallel storage system running over 10GbE and InfiniBand, and an open source Lustre-based solution employing commodity hardware. At first glance the Lustre system seemed to offer the greatest flexibility and lowest cost, but further testing proved it would not hold its own on the performance or simplicity fronts.

As soon as the benchmark was scaled across several nodes, Panasas ActiveStor demonstrated dramatically higher parallel file system performance than the Lustre-based system. Dr. Rudge knew that although he could not predict how many jobs would run on the cluster at a time, he had seen an increasing number of data mining research tasks, jobs that would likely replicate the demands of the benchmark running over many nodes.
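The sweep described here, from one job on a single node up to eight jobs on each of 32 nodes, can be sketched as a small driver script. This is an illustrative reconstruction, not the university's actual benchmark harness: the doubling node schedule, the `run_io_bench` command name, and the bandwidth figure in the usage example are assumptions.

```python
# Sketch of a storage benchmark sweep: scale from one I/O job on a single
# node up to eight jobs on each of 32 nodes, so that per-node bandwidth can
# be tracked for degradation as concurrency grows. Launcher command is a
# hypothetical placeholder.
from itertools import product

NODE_COUNTS = [1, 2, 4, 8, 16, 32]   # assumed doubling schedule up to 32 nodes
JOBS_PER_NODE = [1, 2, 4, 8]         # up to eight jobs per dual-socket node

def sweep_schedule():
    """Yield (nodes, jobs_per_node, total_jobs) for every benchmark run."""
    for nodes, jobs in product(NODE_COUNTS, JOBS_PER_NODE):
        yield nodes, jobs, nodes * jobs

def per_node_bandwidth(aggregate_mb_s, nodes):
    """Per-node bandwidth; a flat value as nodes grow means no degradation."""
    return aggregate_mb_s / nodes

if __name__ == "__main__":
    for nodes, jobs, total in sweep_schedule():
        # Hand each configuration to the cluster's launcher (placeholder):
        print(f"run_io_bench --nodes {nodes} --jobs-per-node {jobs}  # {total} streams")
```

For example, if a 32-node run measured an aggregate of 3200 MB/s, `per_node_bandwidth(3200.0, 32)` reports 100 MB/s per node; comparing that against the single-node result is exactly the degradation check the tender benchmark was designed to make.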
Simple to use storage management was another consideration. Dr. Rudge was keen to ensure that the storage solution selected would allow ALICE to
1-888-panasas www.panasas.com