GROMACS Molecular Dynamics on GPU

GROMACS 4.6 Pre-Beta and 4.6 Beta

Benefits of GPU Accelerated Computing
Faster than CPU-only systems in all tests

Large performance boost with a marginal price increase

Energy usage cut by more than half

GPUs scale well within a node and across multiple nodes

K20 GPU is our fastest and lowest-power high-performance GPU yet

Try GPU accelerated GROMACS for free – www.nvidia.com/GPUTestDrive

Great Scaling in Small Systems
Benchmark system: RNase in water with 16,816 atoms in truncated dodecahedron box
Running GROMACS 4.6 pre-beta with CUDA 4.1
Each blue node (CPU only) contains 1x Intel X5550 CPU (95W TDP, 4 Cores per CPU)
Each green node (with GPU) contains 1x Intel X5550 CPU (95W TDP, 4 Cores per CPU) and 1x NVIDIA M2090 (225W TDP per GPU)
[Chart: Nanoseconds / Day vs. Number of Nodes (1, 2, 3), CPU Only vs. With GPU. Bar labels: 8.36, 13.01, 21.68 ns/day; speedup labels: 3.7x, 3.6x, 3.2x]

Get up to 3.7x performance compared to CPU-only nodes
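
The bar labels on this slide can be turned into a rough scaling estimate. The sketch below assumes that 8.36, 13.01 and 21.68 ns/day are the with-GPU results for 1, 2 and 3 nodes, and that 3.7x, 3.6x and 3.2x are the matching speedups over CPU-only nodes. That pairing is read off the chart layout and is not stated on the slide. The function names are illustrative only.

```python
# Values read from the chart labels. ASSUMPTION: the pairing of each value
# with a node count is inferred from the chart layout, not stated in text.
gpu_ns_per_day = {1: 8.36, 2: 13.01, 3: 21.68}
speedup_vs_cpu = {1: 3.7, 2: 3.6, 3: 3.2}


def implied_cpu_ns_per_day(gpu_rate, speedup):
    """CPU-only throughput implied by a GPU result and its speedup label."""
    return gpu_rate / speedup


def scaling_efficiency(rates):
    """Throughput on n nodes divided by n times the single-node throughput."""
    base = rates[1]
    return {n: rate / (n * base) for n, rate in rates.items()}


efficiency = scaling_efficiency(gpu_ns_per_day)
for nodes in sorted(gpu_ns_per_day):
    cpu_rate = implied_cpu_ns_per_day(gpu_ns_per_day[nodes], speedup_vs_cpu[nodes])
    print(
        f"{nodes} node(s): GPU {gpu_ns_per_day[nodes]:.2f} ns/day, "
        f"implied CPU-only {cpu_rate:.2f} ns/day, "
        f"GPU scaling efficiency {efficiency[nodes]:.0%}"
    )
```

The implied CPU-only figures are back-calculated from rounded speedup labels, so they are approximate.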

Additional Strong Scaling on Larger System
128K Water Molecules
Running GROMACS 4.6 pre-beta with CUDA 4.1
Each blue node (CPU only) contains 1x Intel X5670 (95W TDP, 6 Cores per CPU)
Each green node (with GPU) contains 1x Intel X5670 (95W TDP, 6 Cores per CPU) and 1x NVIDIA M2070 (225W TDP per GPU)
[Chart: Nanoseconds / Day vs. Number of Nodes (8, 16, 32, 64, 128), CPU Only vs. With GPU. Speedup labels: 3.1x, 2.8x, 2x]

Up to 128 nodes, NVIDIA GPU-accelerated nodes deliver 2-3x performance
when compared to CPU-only nodes

Replace 3 Nodes with 2 GPUs
ADH in Water (134K Atoms)
Running GROMACS 4.6 pre-beta with CUDA 4.1
The blue node contains 2x Intel X5550 CPUs (95W TDP, 4 Cores, $1000 per CPU)
The green node contains 2x Intel X5550 CPUs (95W TDP, 4 Cores, $1000 per CPU) and 2x NVIDIA M2090s as the GPU (225W TDP, $2000 per GPU)
[Chart: Nanoseconds/Day and Cost. 4 CPU Nodes: 6.7 ns/day, $8,000. GPU node: 8.36 ns/day, $6,500]

Save thousands of dollars and perform 25% faster
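
The arithmetic behind this claim can be checked with a short sketch. It assumes that 6.7 ns/day and $8,000 belong to the 4 CPU nodes and that 8.36 ns/day and $6,500 belong to the GPU node, which is the pairing implied by the slide's "25% faster" caption. The variable and function names are illustrative.

```python
# Figures taken from the chart labels. ASSUMPTION: 6.7 ns/day and $8,000
# belong to the 4 CPU nodes; 8.36 ns/day and $6,500 to the GPU node.
cpu_cluster = {"ns_per_day": 6.7, "cost_usd": 8000}
gpu_node = {"ns_per_day": 8.36, "cost_usd": 6500}


def relative_gain(new, old):
    """Fractional improvement of `new` over `old` (0.25 means 25% higher)."""
    return new / old - 1.0


def cost_per_ns_per_day(system):
    """Hardware dollars spent per unit of simulation throughput."""
    return system["cost_usd"] / system["ns_per_day"]


speed_gain = relative_gain(gpu_node["ns_per_day"], cpu_cluster["ns_per_day"])
cost_saving = cpu_cluster["cost_usd"] - gpu_node["cost_usd"]

print(f"Throughput gain: {speed_gain:.1%}")
print(f"Hardware cost difference: ${cost_saving}")
print(f"CPU nodes: ${cost_per_ns_per_day(cpu_cluster):.0f} per ns/day")
print(f"GPU node:  ${cost_per_ns_per_day(gpu_node):.0f} per ns/day")
```

Cost per unit of throughput combines the two bars into one number, which is often easier to compare across configurations than speed and price separately.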

Greener Science
ADH in Water (134K Atoms)
Running GROMACS 4.6 with CUDA 4.1
The blue nodes contain 2x Intel X5550 CPUs (95W TDP, 4 Cores per CPU)
The green node contains 2x Intel X5550 CPUs (95W TDP, 4 Cores per CPU) and 2x NVIDIA M2090 GPUs (225W TDP per GPU)
Energy Expended = Power x Time
[Chart: Energy Expended (KiloJoules Consumed), lower is better. 4 Nodes (760 Watts) vs. 1 Node + 2x M2090 (640 Watts)]

In simulating each nanosecond, the GPU-accelerated system uses 33% less energy
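
The slide's relation Energy Expended = Power x Time can be worked through for one simulated nanosecond. The sketch below uses the power figures from this slide (760 W and 640 W) and assumes the throughput figures from the "Replace 3 Nodes with 2 GPUs" slide (6.7 and 8.36 ns/day) apply here as well, since both slides use the ADH benchmark. This slide does not state its throughput, so that pairing is an assumption.

```python
SECONDS_PER_DAY = 86_400


def energy_kj_per_ns(power_watts, ns_per_day):
    """Energy = Power x Time, for the wall-clock time needed to simulate 1 ns."""
    seconds_per_ns = SECONDS_PER_DAY / ns_per_day
    return power_watts * seconds_per_ns / 1000.0  # joules -> kilojoules


# Power values are from this slide. ASSUMPTION: the ns/day values are
# borrowed from the earlier cost-comparison slide (same ADH benchmark).
cpu_energy = energy_kj_per_ns(power_watts=760, ns_per_day=6.7)
gpu_energy = energy_kj_per_ns(power_watts=640, ns_per_day=8.36)
saving = 1.0 - gpu_energy / cpu_energy

print(f"4 CPU nodes:        {cpu_energy:,.0f} kJ per simulated ns")
print(f"1 node + 2x M2090:  {gpu_energy:,.0f} kJ per simulated ns")
print(f"Energy saved:       {saving:.1%}")
```

Under these assumptions the saving comes out at roughly a third, in line with the slide's figure, and both energy values fall within the chart's 0 to 12000 kJ axis range.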

The Power of Kepler
RNase Solvated Protein 24k Atoms
Running GROMACS version 4.6 beta
The grey nodes contain 1 or 2 E5-2687W CPUs (150W each, 8 Cores per CPU) and 1 or 2 NVIDIA M2090s.
The green nodes contain 1 or 2 E5-2687W CPUs (8 Cores per CPU) and 1 or 2 NVIDIA K20X GPUs (235W each).
[Chart: M2090 vs. K20X for 1 CPU + 1 GPU, 1 CPU + 2 GPU, 2 CPU + 1 GPU, and 2 CPU + 2 GPU]

Upgrading an M2090 to a K20X increases performance 10-45%
Ribonuclease

K20X – Fast
RNase Solvated Protein 24k Atoms
Running GROMACS version 4.6 beta
The blue nodes contain 1 or 2 E5-2687W CPUs (150W each, 8 Cores per CPU).
The green nodes contain 1 or 2 E5-2687W CPUs (8 Cores per CPU) and 1 or 2 NVIDIA K20X GPUs (235W each).
[Chart: Nanoseconds / Day for 1 CPU and 2 CPUs, CPU Only vs. With 1 K20X]

Adding a K20X increases performance by up to 3x
Ribonuclease

K20X, the Fastest Yet
192K Water Molecules
Running GROMACS version 4.6-beta2 and CUDA 5.0.35
The blue node contains 2 E5-2687W CPUs (150W each, 8 Cores per CPU).
The green nodes contain 2 E5-2687W CPUs (8 Cores per CPU) and 1 or 2 NVIDIA K20X GPUs (235W each).
[Chart: Nanoseconds / Day for CPU, CPU + K20X, and CPU + 2x K20X]

Using K20X nodes increases performance by 2.5x
Water

Recommended GPU Node Configuration for GROMACS Computational Chemistry
Workstation or Single Node Configuration
# of CPU sockets: 2
Cores per CPU socket: 6+
CPU speed (GHz): 2.66+
System memory per socket (GB): 32
GPUs: Kepler K10, K20, K20X; Fermi M2090, M2075, C2075
# of GPUs per CPU socket: 1x. Kepler-based GPUs (K20X, K20 or K10) need fast Sandy Bridge or perhaps the very fastest Westmeres, or high-end AMD Opterons
GPU memory preference (GB): 6
GPU to CPU connection: PCIe 2.0 or higher
Server storage: 500 GB or higher
Network configuration: Gemini, InfiniBand

Scale to multiple nodes with same single node configuration
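
As a rough illustration, the recommendations above can be written as a checklist and compared against a candidate node. Everything in this sketch other than the numbers from the table is hypothetical: the `NodeSpec` class, its field names, and the choice to treat each numeric entry as a minimum (the table marks only some entries with "+" or "or higher").

```python
from dataclasses import dataclass, fields


@dataclass
class NodeSpec:
    """Hypothetical description of a single GPU node."""
    cpu_sockets: int
    cores_per_socket: int
    cpu_ghz: float
    memory_gb_per_socket: int
    gpus_per_socket: int
    gpu_memory_gb: int
    pcie_generation: float
    storage_gb: int


# Numbers from the recommendation table. ASSUMPTION: each is treated as a
# minimum, except socket and GPU counts, which are matched exactly.
RECOMMENDED = NodeSpec(
    cpu_sockets=2, cores_per_socket=6, cpu_ghz=2.66, memory_gb_per_socket=32,
    gpus_per_socket=1, gpu_memory_gb=6, pcie_generation=2.0, storage_gb=500,
)
EXACT_FIELDS = {"cpu_sockets", "gpus_per_socket"}


def shortfalls(node, reference=RECOMMENDED):
    """List the fields where `node` does not meet the recommendation."""
    issues = []
    for field in fields(NodeSpec):
        have = getattr(node, field.name)
        want = getattr(reference, field.name)
        if field.name in EXACT_FIELDS:
            if have != want:
                issues.append(f"{field.name}: {have} != {want}")
        elif have < want:
            issues.append(f"{field.name}: {have} < {want}")
    return issues


candidate = NodeSpec(2, 4, 2.4, 32, 1, 6, 2.0, 500)
print(shortfalls(candidate))
```

An empty list means the candidate meets every entry. The check does not cover GPU model or network type, which the table lists as named options rather than numbers.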

GPU Test Drive
Experience GPU Acceleration
For Computational Chemistry
Researchers, Biophysicists

Preconfigured with Molecular
Dynamics Apps

Remotely Hosted GPU Servers

Free & Easy – Sign up, Log in and
See Results

www.nvidia.com/gputestdrive