In this deck, Ronald P. Luijten from IBM Research in Zurich presents: DOME 64-bit μDataCenter.
"I like to call it a datacenter in a shoebox. With the combination of power and energy efficiency, we believe the microserver will be of interest beyond the DOME project, particularly for cloud data centers and Big Data analytics applications."
The microserver team has designed and demonstrated a prototype 64-bit microserver using a PowerPC-based chip from Freescale Semiconductor running Fedora Linux and IBM DB2. At 133 × 55 mm², the microserver contains all of the essential functions of today’s servers, which are 4 to 10 times larger. Not only is the microserver compact, it is also very energy-efficient.
Watch the video: http://wp.me/p3RLHQ-gJM
Learn more: https://www.zurich.ibm.com/microserver/
4. SKA (Square Kilometer Array) to measure Big Bang
Picture source: NZZ, March 2014
Timeline (time since the Big Bang): 0, 10^-32 s, 10^-6 s, 0.01 s, 3 min, 380,000 years, 13.8 billion years
Ronald P. Luijten / April 2017 4
Milestones shown: Big Bang; Inflation; Protons created; Start of nucleosynthesis through fusion; End of nucleosynthesis; Modern Universe
5. SKA: What is it?
Top 500: Sum=123 PFlops. 2GFlops/watt.
100x Flops of Sum! ~ 7GWh
Antenna types and frequency bands:
- ~3000 dishes, 3GHz-10GHz
- ~0.5M antennae, 0.5GHz-1.7GHz
- ~0.5M antennae, 0.07GHz-0.45GHz
Sample rates:
1. 10^9 samples/second × 0.5M antennae: 0.5 × 10^15 samples/sec
2. 3.5 × 10^9 samples/second × 0.5M antennae: 1.7 × 10^15 samples/sec
3. 2 × 10^10 samples/second × 3K antennae: 6 × 10^13 samples/sec
Sum = 2 × 10^15 samples/second @ 86400 seconds/day: 170 × 10^18 (Exa) samples/day.
Assume 10-12x reduction @antenna: 14 Exabytes/day (minimum).
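The estimate on this slide can be reproduced with a few lines of arithmetic. The sketch below is an illustration, not part of the original deck: the band labels, the variable names, and the one-byte-per-sample conversion are assumptions made here to get from samples to bytes. It also shows that the slide rounds the summed rate down to 2 × 10^15 before multiplying, which is what gives roughly 14 exabytes per day at a 12x reduction.

```python
# Rough SKA data-rate estimate using the figures quoted on the slide.
SECONDS_PER_DAY = 86_400

# (label, samples per second per antenna, number of antennae)
# The labels are illustrative; the numbers are the slide's.
arrays = [
    ("low-frequency antennae", 1e9, 0.5e6),
    ("mid-frequency antennae", 3.5e9, 0.5e6),
    ("dishes", 2e10, 3e3),
]

# Exact sum of the three per-array rates (samples/second).
exact_rate = sum(rate * count for _, rate, count in arrays)

# The slide rounds the sum to 2e15 samples/second before continuing.
rounded_rate = 2e15

REDUCTION = 12          # upper end of the slide's "10-12x reduction @antenna"
BYTES_PER_SAMPLE = 1    # assumption: the slide equates samples and bytes

def exabytes_per_day(samples_per_second: float) -> float:
    """Convert a sample rate into exabytes/day after the on-antenna reduction."""
    samples_per_day = samples_per_second * SECONDS_PER_DAY
    return samples_per_day * BYTES_PER_SAMPLE / REDUCTION / 1e18

print(f"exact sum:   {exact_rate:.3e} samples/s")
print(f"rounded sum: {exabytes_per_day(rounded_rate):.1f} EB/day")
print(f"exact sum:   {exabytes_per_day(exact_rate):.1f} EB/day")
```

With the unrounded sum the same calculation comes out somewhat higher, which is consistent with the slide calling 14 exabytes/day a minimum.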
7. Dome Project:
Research Streams…
- System Analysis
- Data & Streaming
- Sustainable (Green) Computing
- Nanophotonics
- Computing, Transport, Storage
…are mapped to research projects:
- Algorithms & Machines
- Nanophotonics
- Real-Time Communications
- New Algorithms
- Microservers
- Accelerators
- Access Patterns
…plus an open user platform:
- Student projects
- Events
- Research Collaboration
33M€ 5-year Research Project: 76 IBM PY (32 in NL); 50 ASTRON PY
8. Major SKA elements & DOME
Signal chain: Stations, Central Signal Processor (CSP), Science Data Processor (SDP), Archive
- Beamforming at stations
- Interferometry, correlation of station beams
- Reconstruction of sky image
DOME projects:
Algorithms and Machines (P1)
Access Patterns (P2)
Nanophotonics (P3)
Microservers (P4)
Accelerators (P5)
New Algorithms (P6)
Real-Time Communications (P7)
9. Definition
µDataCenter:
• Ultra-compact self-contained DataCenter using MicroServers
• 64 bit, Server-class computing (ECC on DRAM and caches)
• Ethernet networking
• Storage
• Hot-water cooling, air cooling with 4x less density
• High performance
• Best-of-Breed energy-efficiency
• Competitive cost
• Commodity and standards based
• ‘Appliance’
Allows deployment in space-constrained locations
Edge DataCenter for IoT
The integration of compute, storage, networking, power and cooling into an ultra-compact form factor
11. The Economist, Technology Quarterly, 12 March 2016
Moore’s law: the reality
13. Global chip wiring vs compute energy
Relative global chip interconnect versus computation energy in % (computation energy includes local wiring):
Node     Interconnect vs. computation energy
130nm    42.9%
90nm     50.0%
45nm     82.9%
32nm     111.4%
22nm     147.3%
14nm     206.2%
Source noted on slide: fig. 3; Borkar 2013, fig. 9 (for computation only)
The chart plots 90nm through 14nm on a 0-250% axis.
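The percentages on this slide are easier to judge with two derived numbers: the node at which global interconnect energy first exceeds computation energy, and how much the ratio grows between nodes. The following is a small sketch using the ratio column as extracted from the slide; the dictionary and function names are illustrative only.

```python
# Global interconnect energy relative to computation energy (%), per node,
# taken from the last data column of the slide.
ratio_percent = {
    "130nm": 42.85714,
    "90nm": 50.0,
    "45nm": 82.85714,
    "32nm": 111.3636,
    "22nm": 147.2603,
    "14nm": 206.1856,
}

def growth(from_node: str, to_node: str) -> float:
    """Factor by which the interconnect/computation ratio grows between nodes."""
    return ratio_percent[to_node] / ratio_percent[from_node]

# First node (in the slide's order) where interconnect energy exceeds
# computation energy, i.e. where the ratio passes 100%.
crossover = next(node for node, pct in ratio_percent.items() if pct > 100)

print(f"interconnect energy exceeds computation energy from {crossover} on")
print(f"ratio grows {growth('90nm', '14nm'):.2f}x from 90nm to 14nm")
```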
14. CMOS scaling eras
K. Rupp et al., 2015
- Era of Dennard (constant energy density) scaling
- Non-Dennard scaling
- Communication-energy-dominated scaling
17. Definition
µServer:
The integration of an entire server node motherboard* into a single microchip, except DRAM, NOR boot flash and power conversion logic.
Size comparison shown: 305mm × 245mm versus 139mm × 62mm
This does NOT imply low performance!
* no graphics
19. Indirect Hot-water cooling
Figure labels:
- Standard 240-pin DDR3 DIMM board
- SoC (lid removed)
- Dual-use Cu: cooling and power distribution
- DIMM connector replaced with high-speed SPD08
- Cooling plate over circuit board
- Integrated heat pipes
- Dimensions shown: 133 mm, 139 mm, 30 mm, 61.5 mm
20. What we get
32-way carrier “BB2”
(8 nodes populated in this picture)
12V power supply
Cooling rails
21. View from above
Server nodes
Power node
Storage node
10 GbE Switch
QSFP cages
Water In/Out
Cooling Rails
22. DOME compute node board diagram
Diagram blocks:
- T4240 SoC
- 3 × 16GB DRAM, 72bit, 1866 MT/s each
- PSoC
- 1Gbit SPI flash
- Power converter
- Interfaces: USB, JTAG, Serial, I2C, 4 x 10 GbE, PCIe x8, 2 x SATA
- Power figures shown: 1V / 40A, 12V / 2.5A
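The three DRAM channels at 1866 MT/s set the peak memory bandwidth of one compute node. The sketch below works that out; it assumes each 72-bit channel carries 64 data bits plus 8 ECC bits, which is the usual layout for ECC DIMMs but is not stated on the slide. Variable names are illustrative.

```python
# Peak DRAM bandwidth of one DOME compute node, from the board diagram figures.
CHANNELS = 3                      # three DRAM blocks on the diagram
TRANSFERS_PER_SECOND = 1866e6     # 1866 MT/s per channel
DATA_BITS_PER_TRANSFER = 64       # assumption: 72-bit bus = 64 data + 8 ECC bits

bytes_per_transfer = DATA_BITS_PER_TRANSFER // 8
peak_gb_per_s = CHANNELS * TRANSFERS_PER_SECOND * bytes_per_transfer / 1e9

print(f"peak memory bandwidth per node: {peak_gb_per_s:.1f} GB/s")
```

This is a theoretical peak; sustained bandwidth depends on the access pattern and is lower in practice.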
23. DOME compute node board diagram
PSoC collapses 7 functions into a small chip to save area, power and cost:
1. On/off and power-up sequencing of voltage domains
2. Monitor power supply voltages / currents
3. Provide µServer boot configuration (I2C)
4. JTAG debug + HW performance counter access
5. Serial port forwarding over USB (Linux console)
6. Temperature monitoring and protection
7. Management interface and control (version management, MAC address assignment, etc.)
24. DOME Compute Node Options
Node board size (both options): 139 mm × 61.5 mm
T4240ZMS (T4240 SoC):
- Process / power: 28nm Bulk, 43W TDP
- ISA: ppc64, 24 core, 1.8GHz, e6500
- DRAM: 24 GB, 3 channel, DDR3, 72bit ECC
- I/O: 4x 10GbE, PCI x8, 2 SATA, USB, µSD
LS2088ZMS (LS2088 SoC):
- Process / power: 28nm Bulk, 35W TDP
- ISA: ARMv8, 8 core, 2+GHz, A72
- DRAM: 32 GB, 2 channel, DDR4, 72bit ECC
- I/O: 6x 10GbE, PCI x4,x2,x1,x1, 2 SATA, USB, µSD
25. DOME Accelerator Node
PCI- and/or Network-Attached FPGA module
FPGA
Xilinx® Kintex® UltraScale™
Five device options
- XCKU025 (downgrade)
- XCKU035 (downgrade)
- XCKU040 (downgrade)
- XCKU060 (default)
- XCKU095 (upgrade)
Memory (DDR4)
16 GB total (default)
- x2 banks of 8GB x72
- 2400 MT/s, w/ ECC
32 GB total (option)
- x2 banks of 16GB x72
- 2400 MT/s, w/ ECC
Flash
1 Gb x 16 (default)
- Multi-boot support
- Encryption support
2 Gb x16 (option)
Reconfigurable accelerator module (FPGA):
- Connected through the Ethernet network without any host interaction
- Up to 1024 cards fit into a single 19” by 42U rack
- FPGA: Xilinx® Kintex® UltraScale™ with two independent DDR4 memory channels (8–16GB each)
- Top edge extension connector with 128 Gbps of bandwidth over 8 lanes
- Daughter card and I/O connectors for plugging in an I/O mezzanine
- 6 x 10 GbE, PCIe3 x8, 2 x SATA3
Status: In bringup
26. Industry I/O interface board
IoT daughter board to FPGA module:
- Attached to the mezzanine connector
- Provides various relevant IoT interface standards
- Seamlessly embedded in the DOME IoT Edge Compute platform
- Standard DOME interface (Ethernet, PCI, SATA)
IoT daughter-card interfaces (interface: count, notes):
- USB 2.0 host: 2x
- Optocoupler in: 4 (100Mbps, Avago)
- Optocoupler out: 4 (ditto)
- LVDS: 7 pairs (for ADC, etc.)
- CAN: 2
- Output level shifter: 18 (programmable output level, 1Mbps)
- Input level shifter: 12 (programmable input level, 10Mbps)
- Isolated USB low-speed host: 1
- MIPI PHY: 1+1
- Serial (RS232, RS485, etc.): 2, or 4 without handshake
Note: In addition to the above, the IoT daughter card can be connected to the FPGA via 8 lanes of PCIe3
27. 32-way carrier board
Board labels: Storage, Switch, Power, Compute, Cooling (only left rail shown)
Dimensions shown: 520mm, 200mm
32-way Carrier:
– compute node (32x):
32 ppc or 32 ARM or (16 ppc + 16 ARM)
– 64 port Ethernet switch
– 32x 10 GbE to compute nodes
– 8x 40GbE external links
Expect ~1TFlop/s linpack w/ T4240 nodes
2 Carriers in 2U rack unit:
– 64 Compute nodes with total 1536 cores
– 1536 Gbyte DRAM
– 16x 40GbE
– 64 TB storage
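The totals on this slide follow from the per-node figures on the "DOME Compute Node Options" slide. The sketch below recomputes them; the class and function names are illustrative, and the switch port budget assumes each 40GbE uplink occupies four 10G switch ports, which the deck does not state explicitly.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    """Per-node figures from the compute node options slide."""
    name: str
    cores: int
    dram_gb: int

T4240ZMS = Node("T4240ZMS", cores=24, dram_gb=24)
LS2088ZMS = Node("LS2088ZMS", cores=8, dram_gb=32)

NODES_PER_CARRIER = 32
CARRIERS_PER_2U = 2
UPLINKS_40G_PER_CARRIER = 8

def totals_2u(node: Node) -> dict:
    """Aggregate figures for two fully populated carriers in one 2U rack unit."""
    nodes = NODES_PER_CARRIER * CARRIERS_PER_2U
    return {
        "nodes": nodes,
        "cores": nodes * node.cores,
        "dram_gb": nodes * node.dram_gb,
        "uplinks_40g": CARRIERS_PER_2U * UPLINKS_40G_PER_CARRIER,
    }

# Assumption: one 10GbE port per compute node plus four 10G ports per 40GbE uplink.
switch_ports_used = NODES_PER_CARRIER * 1 + UPLINKS_40G_PER_CARRIER * 4

print(totals_2u(T4240ZMS))
print(totals_2u(LS2088ZMS))
print("switch ports used per carrier:", switch_ports_used)
```

For T4240 nodes this reproduces the 1536 cores, 1536 Gbyte DRAM and 16x 40GbE quoted above, and under the stated assumption the port count matches the 64-port switch.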
28. 32-way carrier structure
Switch with 8x 40G uplinks; general purpose nodes in four groups (labeled 1-8, 9-16, 17-23 and 24-32 on the slide), each group with a storage node (S)
Legend:
N = General Purpose Node (compute, accelerator)
S = Storage node (8x mSATA)
P = Power node (DRAM + I/O supplies)
M = Management node (T4240 w/ IPMI)
Connections: 10 GbE, 1 GbE management, management bus, SATA, supply bus
30. 32-way carrier network topology
T4240
module
32 way carrier
FM6000 switch
32x 10 GbE internal connectivity from switch
8 x 40GbE external connectivity (QSFP+)
Green links optionally connect to other 32way carrier
Short electrical links on carrier board (copper backplane standard 10GBASE-KR)
MAC-to-MAC Ethernet links eliminate PHY chips:
- 128 PHYs on server nodes
- 32 PHYs on switch node
31. Currently in bringup (April 2017)
Water-cooled bringup (from right to left):
- SATA carrier (MM node)
- USB hub module
- Power node
- T4240 management node
- Storage node
- 8 T4240 server nodes
- Switch node
34. Key Features DOME µDataCenter
2x Operations per Joule compared to energy-efficient Xeon E3-1230Lv3 (SpecBench)
20x denser with watercooling (5x with aircooling)
No moving parts (drives, fans)
Highest system memory bandwidth density: 159GB/s/Liter (peak)
Value:
• Density + Energy-Efficiency + commodity components + standards
• minimal component count
– SoC, PSoC, System partitioning
• Packaging, power and cooling
• Connector definition
35. Product version being finished
The “edge-of-IOT” microDataCenter is being productized – 64 servers in 2U
Market introduction planned summer 2017
rendering of two BB2 carriers in 2U rack unit
36. µDataCenter plans
• Finish ARMv8 server board
• Finish FPGA board
• Obtain funding to build GPU board + Xeon-D board
• Bring µDataCenter to market
• Product launch: Summer 2017
• H2020 proposal for next step in packaging integration:
– use high performance SoC die based on ARMv8
– package with 3D packaged DRAM
– chip carrier technology in size of DOME node cards, but thicker
Figure labels: ZRL prototype; 3D packaged DRAM
37. Application Areas
• Managing unstructured data for Industry 4.0
• Smarter Cities: Carbon Emissions, Traffic Flow & Noise
• Computational Musicology
• Processing petabytes of data from the Big Bang
• Industry 4.0 • Internet of Things • Aerospace • Vehicles
CeBIT ‘16 live demo
38. Trends, Conclusions
Making it small really works to improve energy-efficiency
- SoC removes many chip crossings
- short distance
- Save power in unexpected places (PHY, DRAM)
- PSoC eliminates many components
- Water cooling reduces power consumption even further
The future scaling roadmap is in ultra-dense packaging
Big Data changing workloads
IOT distributed DataCenters
39. Links
SKA: http://www.skatelescope.org
DOME: http://www.dome-exascale.nl
µServer: http://www.zurich.ibm.com/microserver
T4240 system: http://swissdutch.ch:6999
Wikipedia: https://en.wikipedia.org/wiki/Microserver
Twitter: https://twitter.com/ronaldgadget
Videos:
Impossible µServer: http://t.co/4vEkEVEazO
Innovators Dilemma: http://youtu.be/imweQe8NgnI
DOME T4240 Fedora: http://youtu.be/D6da5DqcyQk
Paper: 4.4: Energy-Efficient Microserver Based on a 12-Core 1.8GHz 188K-CoreMark 28nm Bulk CMOS 64b SoC for Big-Data Applications with 159GB/s/L Memory Bandwidth System Density
41. Acknowledgements
This work is the result of many people's efforts
• Andreas Doering, IBM ZRL
• Matteo Cossale, IBM ZRL
• Stephan Paredes, IBM ZRL
• Francois Abel, IBM ZRL
• Beat Weiss, IBM ZRL
• Peter v. Ackeren, NXP
• Ed Swarthout, NXP Austin
• Dac Pham, (formerly NXP Austin)
• Yvonne Chan, IBM Toronto
• Alessandro Curioni, IBM ZRL
• Ton Engbersen, IBM ZRL
• James Nigel, FSL
• Boris Bialek, IBM Toronto
• Marco de Vos, Astron NL
• And many more remain unnamed….
Companies: NXP Austin, Belgium & Germany; IBM worldwide; Transfer – NL
Dutch government for the DOME grant
42. Questions???
PS. I like lightweight things
µServer website: www.swissdutch.ch