In this deck from the 2018 Swiss HPC Conference, Karsten Kutzer from Lenovo presents: Energy Efficiency and Water-Cool-Technology Innovations.
"This session will discuss why water cooling is becoming more and more important for HPC data centers, Lenovo’s series of innovations in the area of direct water-cooled systems, and ways to re-use “waste heat” created by HPC systems."
Watch the video: https://wp.me/p3RLHQ-iDl
Learn more:
http://lenovo.com
and
http://www.hpcadvisorycouncil.com/events/2018/swiss-workshop/agenda.php
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
2. Why care about Power and Cooling?
• Increasing electricity cost
• Performance/power relation
• Application diversity
• Waste heat reuse
• Data center limitations
Leading the Industry in Energy Aware HPC
4. Application Diversity
• CPU-bound BQCD case
– Node runs at full power
– CPU delivers full performance while running at full power
• Memory-bound BQCD case
– Node still runs at full power
– CPU delivers less performance while still running at full power
[Charts: DC node power and per-package CPU/RAM power (W) over time for the CPU-bound and memory-bound cases; legend: DC node [W], CPU pkg 0 [W], RAM pkg 0 [W], CPU pkg 1 [W], RAM pkg 1 [W]]
Turbo ON: 157 GFlops (CPU-bound case) vs. Turbo ON: 65 GFlops (memory-bound case)
SD650 with two 8168 sockets and 6 x 16GB DIMMs; room temp = 21°C, inlet water = 45°C, 1.5 lpm/tray
How much energy do we waste on non-CPU-bound applications?
5. Waste Energy Reuse – ERE
[Pictures: energy waste, direct reuse, indirect reuse]
How much energy do we waste by not using the system heat?
Pictures: Leibniz Supercomputing Centre
6. Energy Aware HPC
• Best CPU choice with max TDP supported
• Best performance, fully utilizing the system
• Best TCO/performance for maximized ROI
• Best use of limited data center capacities
• Best carbon footprint for eco-responsible HPC
7. The Three Pillars
Leading the Industry in Energy Aware HPC
Pillars: Hardware, Software, Infrastructure
10. Lenovo Cooling Technologies

Air Cooled (choose for the broadest choice of customizable options; PUE ~2.0 – 1.5, ERE ~2.0 – 1.5):
• Standard air flow with internal fans, cooled by the room climatization
• Broadest choice of configurable options supported
• Relatively inefficient cooling

Air Cooled w/ Rear Door Heat Exchanger (choose for increased energy efficiency with broad choice; PUE ~1.4 – 1.2, ERE ~1.4 – 1.2):
• Air cooled, but heat removed with an RDHX through chilled water
• Retains high flexibility
• Enables extremely tight rack placement
• Potentially room neutral

Direct Water Cooled (choose for max performance and high energy efficiency; PUE <= 1.1, ERE <= 1.1):
• Most heat removed by an onboard water loop with up to 50°C water temperature
• Supports the highest-TDP CPUs at the densest footprint
• Higher performance
• Free cooling
11. Return on Investment for DWC vs RDHx
• New data centers: water cooling has immediate payback.
• For existing air-cooled data centers, the payback period depends strongly on the electricity rate (a sketch follows the chart).
[Chart: payback period of DWC vs RDHx at electricity rates of $0.06/kWh, $0.12/kWh, and $0.20/kWh]
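To make the rate dependence concrete, here is a minimal payback sketch. Every number in it except the three electricity rates from the chart (IT load, PUE values, CAPEX premium) is an illustrative assumption, not a Lenovo figure.

```python
# Hypothetical payback estimate for direct water cooling (DWC) vs. an
# air-cooled baseline. All inputs below except the three electricity
# rates from the chart are illustrative assumptions.

IT_LOAD_KW = 500             # assumed IT load of the cluster
PUE_AIR, PUE_DWC = 1.4, 1.1  # assumed PUE before/after (cf. slide 10 ranges)
CAPEX_PREMIUM = 250_000      # assumed extra cost of the DWC plumbing ($)
HOURS_PER_YEAR = 8760

def payback_years(rate_per_kwh: float) -> float:
    """Years until the DWC premium is repaid by avoided facility energy."""
    saved_kw = IT_LOAD_KW * (PUE_AIR - PUE_DWC)   # facility power avoided
    saved_per_year = saved_kw * HOURS_PER_YEAR * rate_per_kwh
    return CAPEX_PREMIUM / saved_per_year

for rate in (0.06, 0.12, 0.20):                   # rates from the chart
    print(f"${rate:.2f}/kWh -> payback in {payback_years(rate):.1f} years")
```

Doubling the rate halves the payback time, which is why the slide singles out the electricity rate as the deciding factor for retrofits.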
12. Rear Door Heat Exchanger
• Up to 27°C cold water cooling
• Up to 100% heat removal efficiency at 30kW
• No moving parts or power required
• Tens of thousands of nodes installed base
[Timeline: long ago, 2009, 2010]
14. Lenovo RDHx2 – Typical Environment
15. Direct "Hot" Water Cooling
[Timeline: 2012, 2014, 2018]
• >24,000 nodes globally
• Up to 50°C hot water cooling
• Up to 90% heat removal efficiency
• World-record energy reuse effectiveness
• 30+ patents on the market-leading design
17. ThinkSystem SD650 – Top-Down View
[Annotated photo: water inlet *) at 50°C and water outlet at 60°C, power board, CPUs, 6 DIMMs per CPU, 2 AEP per CPU, x16 PCIe slot, disk drive, M.2 slot]
Two nodes share a tray and a water loop.
*) Inlet water temperature 50°C with configuration limitations (45°C without configuration limitations)
18. SD650 Improved Node Water Cooling Architecture
• Focus on maximizing efficiency for high (up to 50°C) inlet water temperatures
• Device cooling optimized by minimizing water-to-device temperature differences (a worked estimate follows the diagram):
– dT CPU < ~0.1 K/W
– dT Memory < ~1 K/W
– dT Network < ~1 K/W
• Direct water cooling of processors, memory, voltage regulation devices, and IO devices (network and disk)
• The water circuit traverses all critical components to optimize cooling.
[Diagram: disk, conductive plate, memory, and water channels of the node water loop]
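Those water-to-device resistances make steady-state device temperatures easy to estimate as T_device ≈ T_water + R_th × P. A minimal sketch, assuming 45°C inlet water, the 8168's 205W rated TDP (Intel's public figure), and an illustrative ~6W per DIMM:

```python
# Steady-state estimate T_device ~= T_water + R_th * P, using the
# water-to-device thermal resistances from the slide. The 205 W CPU
# power is the Xeon 8168's rated TDP; the 6 W DIMM power is an assumption.

T_WATER = 45.0  # °C, hot-water inlet

def device_temp(r_th_k_per_w: float, power_w: float) -> float:
    """Approximate device temperature for a given thermal resistance and power."""
    return T_WATER + r_th_k_per_w * power_w

print(f"CPU:  ~{device_temp(0.1, 205):.0f} °C")  # 45 + 0.1 * 205 ~= 66 °C
print(f"DIMM: ~{device_temp(1.0, 6):.0f} °C")    # 45 + 1.0 * 6 = 51 °C
```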
19. HPL Temperature & Frequency on SD650 with 8168
PL2 (short-term RAPL limit) is 1.2 x TDP; PL1 (long-term RAPL limit) is TDP.
[Chart phases: non-AVX instructions, AVX instructions, non-AVX instructions]
SD650 with two 8168 sockets and 12 x 16GB DIMMs; room temp = 21°C, inlet water = 40°C, 1.5 lpm/tray
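In numbers: the 8168's rated TDP is 205W (Intel's public figure, not stated on the slide), so the two RAPL limits work out as below.

```python
# RAPL package power limits as described on the slide:
# PL1 (long-term) = TDP, PL2 (short-term) = 1.2 x TDP.
# The 205 W TDP of the Xeon 8168 is assumed from Intel's public spec.

TDP_W = 205
pl1 = TDP_W           # sustained limit
pl2 = 1.2 * TDP_W     # short-term burst limit
print(f"PL1 = {pl1} W, PL2 = {pl2:.0f} W")  # PL1 = 205 W, PL2 = 246 W
```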
20. Performance Optimization
• ThinkSystem SD530 – Standard Performance: ~2.15 TeraFlop/s sustained HPL w/ SKL 6148 20C 2.4GHz 150W
• ThinkSystem SD650 – High Performance Mode: ~2.34 TeraFlop/s sustained HPL w/ SKL 6148 20C 2.4GHz 150W
System | Turbo | HPL [GF] | AC node [W] | DC node [W] | CPU temp [°C]
SD530 | OFF | 2152.7 | 400.1 | 368.0 | 81.8
SD530 | ON | 2147.2 | 400.4 | 368.3 | 82.1
SD650 | OFF | 2342.0 | 472.5 | 434.7 | 36.8
SD650 | ON | 2333.4 | 473.2 | 435.4 | 36.9
SD530 and SD650 with two 6148 sockets and 12 x 16GB DIMMs; room temp = 21°C, inlet water = 18°C, 1.5 lpm/tray
SD650 vs SD530: roughly +9% sustained HPL at +18% node power (derived below)
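Reading the table as GFlop/s per watt makes the trade-off explicit; the quoted deltas fall out of the same numbers. A small sketch over the Turbo OFF rows:

```python
# Energy efficiency implied by the HPL table above: GFlop/s per watt
# of DC node power, plus the relative deltas quoted on the slide.

runs = {  # Turbo OFF rows: (HPL GFlop/s, DC node W, AC node W)
    "SD530": (2152.7, 368.0, 400.1),
    "SD650": (2342.0, 434.7, 472.5),
}
for name, (gflops, dc_w, _) in runs.items():
    print(f"{name}: {gflops / dc_w:.2f} GFlop/s per W (DC)")

perf_gain = runs["SD650"][0] / runs["SD530"][0] - 1    # ~ +9 %
power_gain = runs["SD650"][2] / runs["SD530"][2] - 1   # ~ +18 %
print(f"+{perf_gain:.0%} HPL at +{power_gain:.0%} AC node power")
```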
23. SD650 – DC Power Sampling/Reporting Frequency
• AC power at chassis level (through the FPC)
– with xCAT
– with ipmi
• DC power and energy at node level (through the XCC)
– with the hw_usage library
– with ipmi
– with RAPL
– with Allinea
– with LSF or LEAR
[Diagram: sampling paths and rates. Conventional path: RAPL energy MSRs on CPU/memory read at 1KHz, NM/ME at 10Hz, HSC with a 200Hz sensor and 1Hz reporting, a 500Hz meter, and the XCC/BMC reporting at 1Hz to high-level software. New for the ThinkSystem SD650: an FPGA path with a 10KHz sensor and 100Hz reporting of HSC node power through the XCC/BMC to high-level software.]
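Of the node-level readers listed above, RAPL is the one available from any Linux shell: the kernel's powercap interface exposes the package energy MSR as a cumulative microjoule counter. A minimal sketch (standard powercap paths, typically needs root; this is the software path only, not the SD650's XCC/FPGA metering):

```python
# Sampling average package power via the Linux powercap RAPL interface,
# which exposes a cumulative energy counter in microjoules per domain.

import time

RAPL = "/sys/class/powercap/intel-rapl:0"  # package 0 domain

def read_uj(path: str) -> int:
    with open(path) as f:
        return int(f.read())

def avg_package_power(interval_s: float = 1.0) -> float:
    """Average package-0 power (W) over one interval, wraparound-safe."""
    max_range = read_uj(f"{RAPL}/max_energy_range_uj")
    e0 = read_uj(f"{RAPL}/energy_uj")
    time.sleep(interval_s)
    e1 = read_uj(f"{RAPL}/energy_uj")
    delta_uj = (e1 - e0) % max_range       # counter wraps at max_range
    return delta_uj / 1e6 / interval_s     # µJ -> J -> W

print(f"CPU pkg 0: {avg_package_power():.1f} W")
```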
24. SD650 – Advanced Accuracy for Power and Energy
• Node DC power readings
– Better than or equal to +/-3% power reading accuracy, down to the node's minimum active power (~40-50W DC)
– Power granularity <= 100mW
– At least 100Hz update rate for node power readings
• Node DC energy meter
– Accumulator for energy in joules (~10 weeks until meter overflow)
[Diagram: bulk 12V to node 12V path. An Rsense shunt is read by an INA226 (used for metering) into an FPGA FIFO that the XCC queries via an ipmi raw OEM command: high accuracy, fast sampling. In parallel, an SN1405006 feeds the ME (Node Manager, used for capping), maintaining compatibility with Node Manager.]
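The "~10 weeks until meter overflow" figure is consistent with a 32-bit accumulator counting whole joules, which is our assumption here since the slide does not state the counter width:

```python
# Sanity check of "~10 weeks until meter overflow", assuming (our
# assumption, not stated on the slide) a 32-bit accumulator counting
# whole joules and a node drawing ~700 W DC flat out.

SECONDS_PER_WEEK = 7 * 24 * 3600
counter_max_j = 2**32        # assumed 32-bit joule counter
node_power_w = 700           # assumed worst-case sustained DC draw

weeks = counter_max_j / node_power_w / SECONDS_PER_WEEK
print(f"overflow after ~{weeks:.1f} weeks")  # ~10.1 weeks
```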
26. Energy Aware Runtime: Motivation
• Power and energy have become critical constraints for HPC systems
• Performance and power consumption of parallel applications depend on:
– architectural parameters
– runtime node configuration
– application characteristics
– input data
• Picking the "best" frequency manually:
– is difficult and time consuming (it costs resources, and then power), and the result is not reusable
– the answer may change over time
– the answer may change between nodes
Manual workflow: configure the application for architecture X → execute with N frequencies, calculating time and energy → select the optimal frequency (a brute-force sketch follows).
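A sketch of that manual sweep; run_at() is a hypothetical helper standing in for a complete application run at a pinned frequency, and the toy cost model exists only to make the example runnable:

```python
# Brute-force "manual" frequency selection from the workflow above.
# run_at(f) stands in for a full application run pinned at frequency f,
# returning (runtime_s, energy_j) -- one complete, costly run per candidate.

from typing import Callable, Iterable, Tuple

def best_frequency(freqs_ghz: Iterable[float],
                   run_at: Callable[[float], Tuple[float, float]],
                   metric: str = "energy") -> float:
    """Run once per frequency; return the one minimizing time or energy."""
    results = {f: run_at(f) for f in freqs_ghz}
    idx = 1 if metric == "energy" else 0
    return min(results, key=lambda f: results[f][idx])

def toy_run_at(f_ghz: float) -> Tuple[float, float]:
    """Invented cost model: frequency-bound runtime, superlinear power."""
    runtime = 100.0 / f_ghz          # seconds
    power = 40 + 30 * f_ghz ** 2     # watts
    return runtime, runtime * power  # (time, energy)

print(best_frequency([2.0, 2.4, 2.7, 3.0], toy_run_at))  # -> 2.0 (lowest energy)
```

Every candidate frequency costs a full run, and the winner holds only for this input, node, and architecture; this is exactly the cost EAR is designed to remove.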
27. EAR – Automatic and Dynamic CPU Frequency
• Architecture characterization
• Application characterization
– Outer loop detection (DPD)
– Application signature computation (CPI, GBS, POWER, TIME)
• Performance and power projection
• User/system policy definition for frequency selection, with thresholds (see the sketch below):
– MINIMIZE_ENERGY_TO_SOLUTION
- Goal: save energy by reducing frequency (with potential performance degradation)
- The performance degradation is limited by a MAX_PERFORMANCE_DEGRADATION threshold
– MINIMIZE_TIME_TO_SOLUTION
- Goal: reduce time by increasing frequency (with a potential energy increase)
- A MIN_PERFORMANCE_EFFICIENCY_GAIN threshold prevents applications that do not scale with frequency from consuming more energy for nothing
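A minimal sketch of the two policies, assuming per-frequency projections of time and power are already available. The dict layout, default thresholds, and all numbers are illustrative; EAR's real models and thresholds are more elaborate:

```python
# Two EAR-style policy rules over projected (time, power) per frequency.
# proj maps frequency (GHz) -> {"time": seconds, "power": watts}.

def minimize_energy_to_solution(proj, f_nominal, max_degradation=0.10):
    """Lowest-energy frequency whose slowdown stays within the threshold."""
    t_ref = proj[f_nominal]["time"]
    ok = [f for f in proj if proj[f]["time"] <= t_ref * (1 + max_degradation)]
    return min(ok, key=lambda f: proj[f]["time"] * proj[f]["power"])  # E = t * P

def minimize_time_to_solution(proj, f_nominal, min_efficiency_gain=0.75):
    """Raise frequency only while performance keeps scaling with it."""
    best = f_nominal
    for f in sorted(f for f in proj if f > f_nominal):
        speedup = proj[best]["time"] / proj[f]["time"] - 1
        freq_gain = f / best - 1
        if speedup < min_efficiency_gain * freq_gain:
            break            # app stopped scaling: more MHz just burns energy
        best = f
    return best

proj = {2.0: {"time": 120.0, "power": 300.0},
        2.4: {"time": 105.0, "power": 340.0},
        2.7: {"time": 101.0, "power": 380.0}}
print(minimize_energy_to_solution(proj, 2.4))  # -> 2.4 (2.0 degrades > 10 %)
print(minimize_time_to_solution(proj, 2.4))    # -> 2.4 (2.7 barely helps)
```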
28. EAR – Functional Overview
• Learning phase (at EAR installation *): kernel execution → coefficients computation → coefficients database
• Execution phase (loaded with the application): Dynamic Pattern Detection (DPD) detects the outer loop → power and performance metrics are computed for the outer loop → the energy policy reads them and calculates the optimal frequency → the CPU frequency is set
*) or every time the cluster configuration is modified (more memory per node, new processors, ...)
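To show how the phases connect, here is an illustrative projection step: the learning phase's per-frequency coefficients turn a measured signature into predicted time and power, which the policy functions sketched earlier then consume. The linear model shape and every number below are invented for the sketch:

```python
# Illustrative projection step: learned coefficients map the measured
# application signature (CPI, GB/s, power) to predicted time and power
# at other frequencies. Model form and all values are invented.

def project(signature, coeff):
    """Predict time and power at one target frequency from the signature."""
    time = coeff["a"] * signature["cpi"] + coeff["b"] * signature["gbs"] + coeff["c"]
    power = coeff["d"] * signature["power"] + coeff["e"]
    return {"time": time, "power": power}

signature = {"cpi": 0.9, "gbs": 45.0, "power": 310.0}  # measured on the outer loop
coefficients = {                                       # from the learning phase
    2.0: {"a": 60.0, "b": 0.4, "c": 10.0, "d": 0.80, "e": 20.0},
    2.7: {"a": 45.0, "b": 0.4, "c": 8.0, "d": 1.10, "e": 25.0},
}
projections = {f: project(signature, c) for f, c in coefficients.items()}
print(projections)  # this dict feeds the policy functions sketched earlier
```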
31. PUE, ITUE, TUE and ERE
• Power Usage Effectiveness (PUE) measures how much of the power a data center draws is not used for computing.
– It is the ratio of total facility power to the power delivered to the computing equipment.
– It does not account for how effectively a server uses the power it receives.
– The ideal value is 1.0.
• IT Usage Effectiveness (ITUE) measures how much of the power a system draws is not used for computing.
– It is the ratio of the power of the IT equipment to the power of the computing components.
– Multiplied by PUE it gives the Total-Power Usage Effectiveness (TUE).
– The ideal value is 1.0.
• Energy Reuse Effectiveness (ERE) accounts for reuse of the power dissipated by the computer.
– It is the ratio of total facility power, minus reused power, to the power delivered to the computing equipment.
– An ideal ERE is 0.0; with no reuse, ERE = PUE.
PUE = Total Facility Power / Total IT Power
ITUE = Total IT Power / Total Compute Power
ERE = (Total Facility Power − Total Reuse Power) / Total IT Power
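A worked example of the three metrics plus TUE; the power numbers are illustrative, not measurements:

```python
# The metrics above on illustrative numbers (not measurements).

facility_kw = 1150.0  # total facility power (IT + cooling + losses)
it_kw = 1000.0        # power delivered to the IT equipment
compute_kw = 850.0    # power reaching the computing components
reuse_kw = 300.0      # waste heat put to use (e.g. adsorption chilling)

pue = facility_kw / it_kw                # 1.15
itue = it_kw / compute_kw                # ~1.18
tue = pue * itue                         # ~1.35
ere = (facility_kw - reuse_kw) / it_kw   # 0.85 (< 1 thanks to reuse)
print(f"PUE={pue:.2f} ITUE={itue:.2f} TUE={tue:.2f} ERE={ere:.2f}")
```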
32. Lenovo Cooling Technologies

Air Cooled (choose for the broadest choice of customizable options; PUE ~2.0 – 1.5, ERE ~2.0 – 1.5):
• Standard air flow with internal fans, cooled by the room climatization
• Broadest choice of configurable options supported
• Relatively inefficient cooling

Air Cooled w/ Rear Door Heat Exchanger (choose for increased energy efficiency with broad choice; PUE ~1.4 – 1.2, ERE ~1.4 – 1.2):
• Air cooled, but heat removed with an RDHX through chilled water
• Retains high flexibility
• Enables extremely tight rack placement
• Potentially room neutral

Direct Water Cooled (choose for max performance and high energy efficiency; PUE <= 1.1, ERE <= 1.1):
• Most heat removed by an onboard water loop with up to 50°C water temperature
• Supports the highest-TDP CPUs at the densest footprint
• Higher performance
• Free cooling

Direct Water Cooled w/ Adsorption Chilling (choose for max performance and max energy efficiency; PUE <= 1.1, ERE < 1):
• Waste heat reused to generate cooling for non-DWC components
• Retains the highest TDP, footprint and performance
• Potentially all system heat covered through DWC
33. Value of Direct Water Cooling with Free Cooling
• Reduced noise level in the data center
• Reduced server power consumption
– Lower processor power consumption (~5%)
– No fans per node (~4%)
• Reduced cooling power consumption
– At 45°C, free cooling all year long (~25%)
• Energy-aware scheduling
– Only CPU-bound jobs get max frequency (~5%)
• CAPEX savings
– Fewer conventional chillers for the computing system
Total energy savings: 35-40% (a rough check follows).
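As a rough check of the quoted band (our arithmetic, not a calculation from the deck), the itemized savings land in the 35-40% range whether they are summed or compounded:

```python
# Combining the itemized savings from the slide, additively and
# compounded; both land inside the quoted 35-40 % band.

parts = {"processor power": 0.05, "no node fans": 0.04,
         "free cooling at 45C": 0.25, "energy-aware scheduling": 0.05}

additive = sum(parts.values())          # 0.39
remaining = 1.0
for s in parts.values():
    remaining *= 1.0 - s
compounded = 1.0 - remaining            # ~0.35

print(f"additive ~{additive:.0%}, compounded ~{compounded:.0%}")
```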
34. Adsorption Chilling
The method of using solid materials for cooling via evaporation.
• An adsorption chiller consists of two identical vacuum containers, each containing two heat exchangers and water:
– the adsorber (desorber), coated with the adsorbent (e.g. zeolite)
– the evaporator (condenser), where water evaporates and condenses
• The adsorption process has two phases:
– In the adsorption phase, the water on the evaporator is taken in by the coated material in the adsorber. Through that evaporation, the evaporator and the water flowing through it cool down, while the adsorber fills with water vapor and heats the water flowing through it. When the adsorber is saturated, the process is reversed.
– In the desorption phase, hot water is passed through the adsorber, which now acts as a desorber: it drives out the water vapor and dispenses it to the evaporator, which acts as a condenser and condenses the vapor back to water. The process reverses again once the adsorber is emptied.
[Diagram, module 1 (desorber/condenser): desorption driven by hot water from the compute racks (52°/46°C) and condensation rejecting heat to cooling water for the hybrid cooling tower (26°/32°C). Module 2 (adsorber/evaporator): adsorption rejecting heat to cooling water for the hybrid cooling tower (26°/32°C) and evaporation producing chilled water for storage etc. racks (23°/20°C).]
35. Value of Direct Water Cooling with Adsorption Chiller
• Reduced noise level in the data center
• Maximum-TDP CPU choice
• Reduced server power consumption
– Lower processor power consumption (~5%)
– No fans per node (~4%)
• Reduced cooling power consumption
– At 50°C, free cooling all year long (~25%)
– Heat reuse generates 600kW of cooling capacity (>5%)
• Energy Aware Runtime
– Frequency optimization during runtime (~5%)
• CAPEX savings
– Fewer conventional chillers for the computing system
Total energy savings: 40-50%