The document summarizes several projects undertaken by the HPC Lab, including developing software and algorithms for graph analysis on emerging platforms (CASS-MT), genome assembly (GALAXY), and RNA structure prediction (GTFold). It also mentions projects involving graph benchmarks (Graph500), dynamic graph packages for Intel platforms (STING), and phylogenetics research on the IBM Blue Waters supercomputer (PetaApps).
HPC lab projects
1. HPC Lab
David A. Bader, E. Jason Riedy, Henning Meyerhenke, (horde of students...)
2. HPC Lab Projects
• UHPC (DARPA)
– Echelon: Extreme-scale Compute Hierarchies with Efficient Locality-Optimized Nodes
– CHASM: Challenge Applications and Scalable Metrics for Ubiquitous High Performance Computing
• GTFOLD (NIH): Combinatorial and Computational Methods for the Analysis, Prediction, and Design of Viral RNA Structures
• PETA-APPS (NSF): Petascale Simulation for Understanding Whole-Genome Evolution
• Graph500 (Sandia): Establish benchmarks for high-performance data-intensive computations on parallel, shared-memory platforms
• STING (Intel): An open-source dynamic graph package for Intel platforms
• CASS-MT (DoD): Graph Analytics for Streaming Data on Emerging Platforms
• GALAXY (NIH, PI Dr. J. Taylor, Emory): Dynamically Scaling Parallel Execution for Cloud-based Bioinformatics
3. HPC Lab Projects
And yet more...
• Burton (NSF): Develop software and algorithmic infrastructure for massively multithreaded architectures
• Dynamic Graph Data Structures in X10 (IBM): Develop and evaluate graph data structures in X10
• I/UCRC Center for Hybrid and Multicore Productivity Research, CHMPR (NSF)
4. Ubiquitous High Performance Computing (DARPA): Echelon
Overall goal: develop highly parallel, security-enabled, power-efficient processing systems, supporting ease of programming, with resilient execution through all failure modes and intrusion attacks
Architectural Drivers:
Energy Efficiency
Security and Dependability
Programmability
Program Objectives:
One PFLOPS, single cabinet including self-contained cooling
50 GFLOPS/W (equivalent to 20 pJ/FLOP)
Total cabinet power budget 57 kW, including processing resources, storage, and cooling
Security embedded at all system levels
Parallel, efficient execution models
Highly programmable parallel systems
Scalable systems – from terascale to petascale
David A. Bader (CSE), Echelon Leadership Team
“NVIDIA-Led Team Receives $25 Million Contract From DARPA to Develop High-Performance GPU Computing Systems” – MarketWatch
Echelon: Extreme-scale Compute Hierarchies with Efficient Locality-Optimized Nodes
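The power targets above are mutually consistent; a quick back-of-the-envelope check (a sketch of the arithmetic, not taken from the slides):

```python
# Arithmetic check on the quoted UHPC targets: 1 PFLOPS at 50 GFLOPS/W.
pflops = 1e15            # sustained FLOP/s per cabinet
gflops_per_watt = 50e9   # efficiency target, FLOP/s per watt

compute_watts = pflops / gflops_per_watt       # power for compute alone
picojoules_per_flop = 1e12 / gflops_per_watt   # energy per FLOP, in pJ

print(compute_watts, picojoules_per_flop)  # 20000.0 20.0
```

So compute alone needs 20 kW, matching the 20 pJ/FLOP figure and leaving the remainder of the 57 kW cabinet budget for storage and cooling.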
5. Ubiquitous High Performance Computing (DARPA): CHASM
Overall goal: develop highly parallel, security-enabled, power-efficient processing systems, supporting ease of programming, with resilient execution through all failure modes and intrusion attacks
Architectural Drivers:
New architectures require new benchmarks
Evaluating usability requires applications
Existing metrics do not encompass all UHPC goals
Program Objectives:
Develop applications, benchmarks, and metrics
Drive UHPC development
Support performance analysis of UHPC systems
Dan Campbell, GTRI, co-PI
CHASM: Challenge Applications and Scalable Metrics for Ubiquitous High Performance Computing
6. GTFold (NIH): RNA Secondary Structure Prediction
Program Goals: accurate structure of large viruses such as:
• Influenza
• HIV
• Polio
• Tobacco Mosaic
• Hanta
FACULTY
Christine Heitsch (Mathematics)
David A. Bader
Steve Harvey (Biology)
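GTfold implements full thermodynamic nearest-neighbor models; as a rough illustration of the underlying dynamic-programming idea, here is the classic Nussinov base-pair maximization (a textbook simplification, not GTfold's actual algorithm):

```python
# Nussinov-style DP: maximize the number of nested base pairs in an RNA
# sequence, with a minimum hairpin-loop length. A toy sketch only.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def nussinov_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs, hairpin loops >= min_loop."""
    n = len(seq)
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):          # interval length j - i
        for i in range(n - span):
            j = i + span
            best = dp[i + 1][j]                  # case: i left unpaired
            for k in range(i + min_loop + 1, j + 1):
                if (seq[i], seq[k]) in PAIRS:    # case: i pairs with k
                    right = dp[k + 1][j] if k + 1 <= j else 0
                    best = max(best, 1 + dp[i + 1][k - 1] + right)
            dp[i][j] = best
    return dp[0][n - 1]

print(nussinov_pairs("GGGAAAUCC"))  # 3 (a small stem-loop)
```

The O(n^3) recurrence is the same shape that thermodynamic folders fill in, which is why large viral genomes make parallelization worthwhile.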
7. PetaApps (NSF): Phylogenetics Research on IBM Blue Waters
As part of the IBM PERCS team, we designed the IBM Blue Waters supercomputer that will sustain petascale performance on our applications, under the DARPA High Productivity Computing Systems program.
• GRAPPA: Genome Rearrangements Analysis under Parsimony and other Phylogenetic Algorithms
• Freely available, open source, GNU GPL
• Already used by other computational phylogeny groups: Caprara, Pevzner, LANL, FBI, Smithsonian Institution, Aventis, GlaxoSmithKline, PharmCos.
• Gene-order Phylogeny Reconstruction
• Breakpoint Median
• Inversion Median
• Over one-billion-fold speedup from previous codes
• Parallelism scales linearly with the number of processors
FACULTY
David A. Bader, CSE
www.phylo.org
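The breakpoint median mentioned above builds on the breakpoint distance between gene orders; a toy illustration of that distance (GRAPPA itself is highly tuned C code; this sketch only shows the underlying measure):

```python
# Breakpoint distance between two signed, circular gene orders: count the
# gene adjacencies of one order that do not survive in the other.

def adjacencies(perm):
    """Ordered adjacency pairs of a signed circular gene order."""
    ext = perm + [perm[0]]
    return {(ext[i], ext[i + 1]) for i in range(len(perm))}

def breakpoints(p, q):
    """Adjacencies of p absent from q (the circle may be read either way)."""
    aq = adjacencies(q)
    aq |= {(-b, -a) for (a, b) in aq}  # same circle traversed in reverse
    return sum(1 for adj in adjacencies(p) if adj not in aq)

# Inverting the segment (2, 3) creates exactly two breakpoints:
print(breakpoints([1, 2, 3, 4, 5], [1, -3, -2, 4, 5]))  # 2
```

A breakpoint median seeks a gene order minimizing the total breakpoint distance to three given orders, the NP-hard kernel inside GRAPPA's search.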
8. Graph500 (SNL): Exploration of shared-memory graph benchmarks
• Establish benchmarks for high-performance data-intensive computations on parallel, shared-memory platforms.
• NOT LINPACK!
• Spec, reference implementations at http://graph500.org
• Ranking debuted at SC10
• Press: IEEE Spectrum, Computerworld, HPCWire, MIT Tech. Review, EE Times, slashdot, etc.
Problem-size classes:
Class        Problem Size
Toy (10)     17 GiB
Mini (11)    140 GiB
Small (12)   1.1 TiB
Medium (13)  18 TiB
Large (14)   140 TiB
Huge (15)    1.1 PiB
[Figure: example graphs. Image sources: Nexus (Facebook application); Giot et al., “A Protein Interaction Map of Drosophila melanogaster”, Science 302, 1722–1736, 2003.]
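The class sizes in the table follow from the benchmark's scale parameter (2^scale vertices). A sketch of the arithmetic, assuming the standard Graph500 edge factor of 16, edges stored as two 8-byte endpoints, and the quoted sizes read in decimal units; the scale values per class are my assumption:

```python
# Estimate the edge-list footprint of each Graph500 problem class.
EDGE_FACTOR = 16        # edges per vertex in the generated graph
BYTES_PER_EDGE = 2 * 8  # two 8-byte vertex IDs per edge

# Assumed scale (log2 of vertex count) for each class shown in the table.
classes = {"Toy": 26, "Mini": 29, "Small": 32, "Medium": 36, "Large": 39, "Huge": 42}

def edge_list_bytes(scale):
    """Bytes needed to hold the edge list of a graph with 2**scale vertices."""
    return (2 ** scale) * EDGE_FACTOR * BYTES_PER_EDGE

for name, scale in classes.items():
    print(f"{name:7s} scale={scale}: {edge_list_bytes(scale) / 1e9:,.0f} GB")
```

Under these assumptions the estimates reproduce the table (e.g. Toy: 2^26 vertices → about 17 GB of edge data).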
10. CASS-MT: Center for Adaptive Supercomputing Software
• DoD-sponsored, launched July 2008
• Pacific Northwest National Laboratory (lead)
– Georgia Tech, Sandia, WA State, Delaware
• The newest breed of supercomputers have hardware set up not just for speed, but also to better tackle large networks of seemingly random data. And now, a multi-institutional group of researchers has been awarded more than $12M to develop software for these supercomputers. Applications include anywhere complex webs of information can be found: from internet security and power grid stability to complex biological networks.
12. GALAXY (NIH, PI Dr. J. Taylor, Emory): Dynamically Scaling Parallel Execution for Cloud-based Bioinformatics
Parallel Genome Sequence Assembly
Next-generation sequencing experiments produce a large number of short base-pair strings (reads)
Task: assemble (concatenate) reads appropriately into larger substrings (contigs)
Two main assembly approaches, both graph-based (de Bruijn vs. overlap/string graph)
Objectives: improve running time and ultimately also assembly accuracy
Approach:
Use overlap/string graph for higher accuracy
Parallelism to reduce running time
Compression to reduce memory consumption
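The overlap-based approach can be illustrated with a toy greedy assembler (a sketch only; the project's actual assemblers use far more sophisticated string-graph construction and parallel data structures):

```python
# Toy overlap-based assembly: repeatedly merge the pair of reads with the
# longest suffix-prefix overlap until no overlap remains. Illustrative only;
# real assemblers build an overlap/string graph instead of greedy merging.

def overlap(a, b, min_len=3):
    """Length of the longest suffix of a equal to a prefix of b (>= min_len)."""
    for l in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:l]):
            return l
    return 0

def greedy_assemble(reads, min_len=3):
    """Merge reads by best overlap; returns the resulting contigs."""
    reads = list(reads)
    while len(reads) > 1:
        best_len, bi, bj = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    l = overlap(a, b, min_len)
                    if l > best_len:
                        best_len, bi, bj = l, i, j
        if best_len == 0:
            break  # no overlaps left; remaining reads are separate contigs
        merged = reads[bi] + reads[bj][best_len:]
        reads = [r for k, r in enumerate(reads) if k not in (bi, bj)] + [merged]
    return reads

# Four overlapping reads assemble into a single contig:
print(greedy_assemble(["ATTAGACCTG", "CCTGCCGGAA", "AGACCTGCCG", "GCCGGAATAC"]))
```

The all-pairs overlap step is the expensive part, which is exactly where parallelism and compressed indexing pay off in real assemblers.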
13. Pasqual: New memory-efficient, parallel, fast sequence assembler
Experimental Results: Memory Usage and Running Time
● Pasqual: our parallel (shared-memory, OpenMP) sequence assembler
● Run on a commodity server (8 cores, 16 hyperthreads)
● Memory usage reduced to ca. 50% for large data sets
● Running time compared to sequential assemblers: 24 to 325 times faster!
● Biologists can assemble larger data sets faster