© Copyright IBM Corporation 2016. Technical University/Symposia materials may not be reproduced in whole or in part without the prior written permission of IBM.
Enabling POWER 8 advanced features on Linux
Sébastien Chabrolles
Julien Limodin
Fabrice Moyen
PowerSystem Linux Center
IBM Montpellier
IBM Systems Technical Events | ibm.com/training/events
POWER8 Hardware Accelerators
On-Chip Accelerators (NX):
- Symmetric crypto
- Compression engine
- Random number generator
- One NX complex per chip
- A given NX can access all memory in the SMP
- A given NX can be accessed by any core
- Can be accessed via a PowerVM hypervisor call
In-Core Accelerators:
- Symmetric crypto
- Private per core
- Leverage the Vector Unit (VMX)
- Direct access for guest/VM (including KVM)
IBM POWER8:
- 12 cores per socket (from 3 to 4 GHz)
- 8 HW threads / core (SMT technology)
- Large cache (96 MB: 8 MB / core)
- High memory bandwidth (~200 GB/s)
1. Transparent Memory Compression
2. -
3. Power8 Split-Core
Enable POWER 8 advanced features on Linux
Transparent Memory Compression
Transparent Memory Compression is a feature provided by the operating system (kernel) that
dynamically compresses process memory without the process being aware of it.
PowerVM with AIX offers this functionality via AME (Active Memory Expansion).
Unfortunately, AME does not exist for Linux.
Linux has an alternative solution named zswap.
Zswap is a feature that hooks into the read and write sides of the swap code and acts as a
compressed cache for pages going to and from the swap device.
Like AME, zswap can use the Power NX compression accelerator (842) to improve
compression performance.
But unlike AME, zswap has some restrictions:
A paging device is needed, with enough space to store uncompressed data.
but still the real one.
Application processes must allow being swapped out.
P8 NX (on-chip) block diagram
Second-generation Nest Accelerator complex*:
- Encryption engine
- Random number generator
- Two 842 compression / decompression engines
- Proprietary IBM Research algorithm
- SRAM-based dictionary compression
- Used by AME
- Good compression ratio at high bandwidth
- 106% of LZO on 190+ benchmarks
- 158% of compression ratio of software DEFLATE with FHT on Canterbury corpus
- Only available via PowerVM or bare-metal Linux.
-chip accelerators for cryptography and active
IBM J. Res. & Dev., vol. 57, no. 4, Nov./Dec. 2013.
[Block diagram. Labels: on-chip SMP interconnect interface, DMA controller, channels 0 to 3, two 842 engines, RNG, two AES/SHA engines with I/O buffers (IOB), ingress and egress arrays, 16 B and 32 B data paths, 2-to-1 clock region.]
Zswap !
For that, we will use a well-known Java benchmark (SPECjbb) and run it several times
while increasing the JVM heap size.
Test system: 1 POWER8 core, 10 GB physical memory, Ubuntu 16.04
JVM heap size: from 9 GB to 18 GB
Test runs:
1- Baseline test with zswap deactivated
2- Test with zswap and software compression (default)
3- Test with zswap and Power HW compression (842)
Memory Over-Allocation test with SPECjbb2005 (BaseLine)
[Chart: SPECjbb2005 performance and memory over-allocation, 1 P8 core SMT8, 10 GB memory, zswap off. X axis: JVM heap size (9 to 18 GB); Y axis: % bops vs nominal. The memory over-commitment region is marked.]
Under memory over-commitment, performance falls to 10% of nominal due to memory thrashing.
SWAP / Paging Activity
System Memory
Swap device
1- Swap Out / Page Out
When the memory is full, a process
(LRUD) scans memory and moves the
device.
Asynchronous background task => No impact on
2- Swap In / Page In
When a page fault occurs and pages are
located in the paging device, those pages
must be moved back to memory.
As physical disks are much slower
=> THIS HURTS PERFORMANCE!
Swap out
Swap in
Memory Over-Allocation test with SPECjbb2005 (Swap I/O)
[Chart: swap I/O activity, SPECjbb2005 memory over-allocation, 1 P8 core SMT8, 10 GB memory, zswap off. X axis: JVM heap size (9 to 18 GB); Y axis: swap I/O (MB/s). The memory over-commitment region is marked.]
A single SAS disk is used as the swap device.
It reaches its limit at ~100 MB/s (50% read).
In the memory thrashing case, the non-deterministic latency and
performance degradation that I/O introduces could be fatal to your
An I/O storm could even prevent you from connecting to your system or starting any
We need a way to smooth out this I/O storm and performance cliff as
memory demand meets memory capacity.
Zswap!
ZSWAP requirements
1. Zswap has been directly available in the Linux kernel since v3.11:
- RedHat 7, CentOS 7, Fedora 19
- Suse 12
- Ubuntu 14.04
Enable zswap at boot time by adding the option zswap.enabled=1 to your boot loader configuration.
2. Power NX (on-chip) acceleration (842) is only available for PowerVM and bare-metal Linux.
- Not available today for PowerKVM guests
- cat /proc/device-tree/ibm,platform-facilities/ibm,compression-v1/status should return okay
- Note: Ubuntu needs kernel 4.2 or above to get access to the Power NX hardware (starting with Ubuntu 15.10)
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1488495
Enable zswap HW compression with zswap.compressor=842 in your boot loader configuration.
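Both requirements can be checked from a running system. The following is a minimal sketch, assuming a POSIX shell; the helper names cmdline_has and report are hypothetical, and the device-tree path is the one quoted on this slide.

```shell
#!/bin/sh
# cmdline_has PARAM [CMDLINE]
# Succeeds if PARAM appears as a whole word in the kernel command line.
# CMDLINE defaults to the content of /proc/cmdline (hypothetical helper).
cmdline_has() {
    param=$1
    cmdline=${2-$(cat /proc/cmdline 2>/dev/null || true)}
    case " $cmdline " in
        *" $param "*) return 0 ;;
        *) return 1 ;;
    esac
}

# report PARAM: print whether PARAM is set on the running kernel.
report() {
    if cmdline_has "$1"; then
        echo "$1: set"
    else
        echo "$1: not set"
    fi
}

report zswap.enabled=1
report zswap.compressor=842

# NX 842 availability, using the device-tree node quoted on the slide.
nx=/proc/device-tree/ibm,platform-facilities/ibm,compression-v1/status
if [ -r "$nx" ]; then
    # Device-tree strings are NUL-terminated, so strip the NUL byte.
    echo "NX 842 status: $(tr -d '\0' < "$nx")"
else
    echo "NX 842 status: node not present"
fi
```

Passing the command line as a second argument makes the helper usable on a saved boot configuration as well as on the live system.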
Enabling POWER HW compression engine (842) with zswap
RedHat:
1- Enable zswap with the 842 compressor at boot time.
    vi /etc/sysconfig/grub
    Add zswap.enabled=1 zswap.compressor=842 to GRUB_CMDLINE_LINUX
2- Regenerate your grub.cfg file.
    grub2-mkconfig > /boot/grub2/grub.cfg
3- Add the 842 kernel module to your ramdisk.
    echo 842 > /etc/modules-load.d/842.conf
    dracut -f
4- Reboot and verify with dmesg | grep zswap
    [ 1.064790] zswap: loaded using pool 842/zbud
Enabling POWER HW compression engine (842) with zswap
Ubuntu:
1- Enable zswap with the 842 compressor at boot time.
    vi /etc/default/grub
    Add zswap.enabled=1 zswap.compressor=842 to GRUB_CMDLINE_LINUX
2- Regenerate your grub.cfg file.
    update-grub
3- Add the 842 kernel module to your ramdisk.
    echo 842 > /etc/modules-load.d/842.conf
    vi /usr/share/initramfs-tools/hooks/842
    Add the following lines:
        #!/bin/sh -e
        PREREQS=""
        case $1 in
            prereqs) echo "${PREREQS}"; exit 0;;
        esac
        . /usr/share/initramfs-tools/hook-functions
        force_load 842
    chmod +x /usr/share/initramfs-tools/hooks/842
    update-initramfs -u
4- Reboot and verify with dmesg | grep zswap
    [ 1.064790] zswap: loaded using pool 842/zbud
Zswap parameters and monitoring
Zswap parameters are located in /sys/module/zswap/parameters
You can change:
- compressor : [ lzo or 842 ] default lzo
  Compression algorithm to use
- enabled : [ Y or N ]
  Enable zswap
- max_pool_percent : [ 1 to 100 ] default 20
  Compressed pool size limit (in % of RAM)
- zpool : [ zbud or zsmalloc ] default zbud
  Compressed pool allocator.
  zbud : - stores 2 pages in one slot (compression ratio 2:1)
         - evicts the oldest pages to disk when full
  zsmalloc : - can store more pages per slot than zbud (compression ratio ~ 3:1)
             - but unlike zbud, redirects new allocations to the paging device when full
               (does not recycle old pages).
You can monitor zswap activity by looking at the counters located in /sys/kernel/debug/zswap
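The counters can be turned into an effective compression ratio. This is a sketch only: it assumes the debugfs counters are named stored_pages and pool_total_size, as in kernels of this period, and that debugfs is mounted and readable by root. The function name zswap_ratio_x100 is hypothetical.

```shell
#!/bin/sh
# zswap_ratio_x100 STORED_PAGES POOL_BYTES PAGE_BYTES
# Prints the compression ratio multiplied by 100 (integer arithmetic):
# uncompressed bytes held by zswap / bytes used by the compressed pool.
zswap_ratio_x100() {
    stored_pages=$1
    pool_bytes=$2
    page_bytes=$3
    if [ "$pool_bytes" -eq 0 ]; then
        echo 0                      # empty pool: nothing compressed yet
        return 0
    fi
    echo $(( stored_pages * page_bytes * 100 / pool_bytes ))
}

dbg=/sys/kernel/debug/zswap         # assumed counter location (debugfs)
if [ -r "$dbg/stored_pages" ] && [ -r "$dbg/pool_total_size" ]; then
    # Show the current parameters, one "file:value" pair per line.
    grep -H . /sys/module/zswap/parameters/* 2>/dev/null
    zswap_ratio_x100 "$(cat "$dbg/stored_pages")" \
                     "$(cat "$dbg/pool_total_size")" \
                     "$(getconf PAGESIZE)"
else
    echo "zswap debugfs counters not readable"
fi
```

A result of 200 corresponds to the 2:1 ratio quoted for zbud; the page size is read with getconf because Linux on Power commonly uses 64 KB pages.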
zswap
1- Compress/Uncompress
(zbud by default).
Scan/compress uses extra CPU cycles, but when a
page fault occurs, it is much faster to get pages
from the compressed pool in memory than from disk.
2- Swap Out / Page Out
When the compressed zpool is full, zbud
moves the oldest compressed pages to the
swap device.
3- Swap In / Page In
When a page fault occurs and pages are
located in the paging device, those pages
must be moved back to memory.
THIS HURTS PERFORMANCE!
[Diagram: uncompressed memory, zpool (zbud), swap device.]
ZSWAP Memory Over-Allocation test with SPECjbb2005
[Chart: testing zswap (zbud) with SPECjbb2005, 1 P8 core SMT8, 10 GB memory, max_pool_percent=40. Series: zswap off, zswap 842 (HW). X axis: JVM heap size (9 to 18 GB); Y axis: % bops vs nominal. Marked regions: memory over-commitment, zpool over-commitment.]
75% of nominal performance at 140% memory.
ZSWAP HW vs Soft. compression
[Chart: testing zswap (zbud) with SPECjbb2005, 1 P8 core SMT8, 10 GB memory, max_pool_percent=40. Series: zswap off, zswap lzo, zswap 842 (HW). X axis: JVM heap size (9 to 18 GB); Y axis: % bops vs nominal. Marked regions: memory over-commitment, zpool over-commitment. Annotation: x1.5.]
ZSWAP Memory Over-Allocation test with SPECjbb2005
[Chart: testing zswap (zbud) with SPECjbb2005, 1 P8 core SMT8, 10 GB memory, max_pool_percent=40. Series: zswap 842 (HW). X axis: JVM heap size (9 to 18 GB); Y axis: % bops vs nominal. Marked regions: memory over-commitment, zpool over-commitment. Zones numbered 1, 2 and 3 are marked on the chart.]
Case 1 : Zswap with Memory not Over-Committed
Enough memory available for the application
No/little swap I/O occurring
Zswap is idle (no CPU overhead)
=> You can use almost all the memory before zswap
starts working
100% memory used (uncompressed)
100% CPU user
Best performance for the application
[Diagram: memory used (uncompressed), free memory, swap device.]
Case 2 : Zswap with Memory Over-Committed
Application needs more memory than available
Zswap starts working, compressing pages in and out of the zpool.
The zpool is growing
No/little swap I/O occurring
Below nominal performance due to memory scanning and unmapping.
Compression/decompression is offloaded to NX 842
25% CPU system due to page scanning
75% of nominal performance on a CPU-bound application (worst case)
[Diagram: memory used (uncompressed), zpool (zbud), swap device.]
Zswap with 842(HW) vs LZO(Soft)
Zswap HW compression 842
10GB RAM, 14GB Java Heap Size
25% of System CPU (overhead) due to
memory page scanning.
Compression offloaded to NX 842
75% of nominal performance
Zswap Soft. Compression LZO
10GB RAM, 14GB Java Heap Size
50% of system CPU (overhead) due to
memory page scanning and compression
50% of nominal performance
50% better CPU usage with POWER HW compression
ZSWAP Memory Over-Allocation (Swap IO activity)
[Chart: testing zswap (zbud) with SPECjbb2005, 1 P8 core SMT8, 10 GB memory, max_pool_percent=40. Series: zswap off, zswap on. X axis: JVM heap size (9 to 18 GB); Y axis: swap I/O (kB/s). Marked regions: memory over-commitment, zpool over-commitment. Zones numbered 1, 2 and 3 are marked on the chart.]
Little or no paging when running
Case 3 : Zswap with Memory Over-Committed and Zpool Full
Swap device
Memory Used
(uncompressed)
Application needs more memory than available
Zswap is working, compressing pages in/out zpool
Zpool reaches max_pool_percent limit (compress
pool is full). Need to free some space in Zpool
=> Swapping in/out !!! Performance degradation
Zpool (zbud) FULL
ZSWAP
max_pool_percent=40
75% CPU wait I/O; only 10 % CPU user
10% of nominal performance due to waiting
for pages on swap device (swap in)
SWAP IN/OUT
Zswap Conclusion
Zswap is not AME, but it can really help to reduce the impact of paging activity and secure
your production system at no cost and with no penalty:
Power8 NX 842 compression engines are available for PowerVM and bare-metal Linux
No impact when memory demand is below the installed RAM capacity.
Can maintain your system at 75% performance in the 100% CPU case (the worst scenario) and
Zswap zbud: x1.4 memory expansion ratio (with max_pool_percent=40)
Need more? Then you can try zswap with the zsmalloc allocator.
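The x1.4 figure is consistent with simple arithmetic: memory outside the pool counts once, and memory inside the pool counts once per compressed page it holds. The sketch below assumes a full pool and the per-allocator ratios quoted in this deck (2:1 for zbud, about 3:1 for zsmalloc); the function name effective_percent is hypothetical.

```shell
#!/bin/sh
# effective_percent MAX_POOL_PERCENT RATIO
# Effective memory as a percentage of installed RAM, assuming the pool is
# full and every page in it compresses at RATIO:1.
#   effective = (100 - pool%) + pool% * ratio
effective_percent() {
    pool=$1
    ratio=$2
    echo $(( (100 - pool) + pool * ratio ))
}

echo "zbud     (2:1), pool 40%: $(effective_percent 40 2)% of RAM"
echo "zsmalloc (3:1), pool 40%: $(effective_percent 40 3)% of RAM"
```

With max_pool_percent=40 this gives 140% for zbud and 180% for zsmalloc. Real workloads will deviate, since compressibility varies from page to page.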
Zswap with Zsmalloc compress pool (vs zbud)
1- Compress/Uncompress
Scan/compress uses extra CPU cycles, but when a
page fault occurs, it is much faster to get pages
from the compressed pool in memory than from disk.
2- Swap In / Out
But compared to zbud, zsmalloc does not have a
page replacement algorithm. When the zpool is full,
paging out occurs directly from main
memory to the paging device.
Zsmalloc can store more pages per
slot than zbud (3:1 measured),
resulting in a higher memory
[Diagram: uncompressed memory, zpool (zsmalloc), swap device.]
ZSWAP (zsmalloc) Memory Over-Allocation test with SPECjbb2005
[Chart: testing zswap (zbud vs zsmalloc) with SPECjbb2005, 1 P8 core SMT8, 10 GB memory, max_pool_percent=40. Series: zswap off, zswap 842 (HW), zswap zsmalloc 842 (HW). X axis: JVM heap size (9 to 33 GB); Y axis: % bops vs nominal. Marked: memory over-commitment, zpool (zbud) limit, zpool (zsmalloc) limit.]
75% of nominal performance at x1.8 memory size
50% of nominal performance at x2 memory size
Monitor Zswap (zsmalloc) activity on 10GB VM with Grafana
[Screenshot: Grafana dashboard, annotated at 10 GB, 15 GB, 20 GB, 25 GB, 30 GB, 35 GB and 40 GB.]
1. Transparent Memory Compression
2. -
3. Power8 Split-Core
Enable POWER 8 advanced features on Linux
Symmetric vs Asymmetric encryption
Symmetric encryption (AES):
- FAST/simple operation
- Secret key must be distributed
- Optimized by Power8
Asymmetric encryption:
- SLOW/complex operation
- Private key never distributed
- Used to send the AES secret key
- Not optimized by Power8
Anatomy of an SSL/HTTPS request
Client browser <-> Server
SSL handshake (executed only once):
- Asymmetric encryption
- Secret key exchange
Data exchange:
- Symmetric encryption
The majority of the exchange uses symmetric encryption.
POWER8 Hardware Accelerators
On-Chip Accelerators (NX):
- Symmetric crypto: AES, SHA
- True random number generator
- Must be used through a hypervisor call for guest/VM
- Better single-thread performance, larger bandwidth
- Symmetric crypto currently not available for PowerKVM guests
In-Core Accelerators:
- Symmetric crypto: AES, SHA
- Cyclic Redundancy Check
- Private per core
- Leverage the Vector Unit (VMX)
- Direct access for guest/VM
IBM POWER8:
- 12 cores per socket (from 3 to 4 GHz)
- 8 HW threads / core (SMT technology)
- Large cache (96 MB: 8 MB / core)
- High memory bandwidth (~200 GB/s)
AES Symmetric Cryptography / SHA Hash Engine
AES Key lengths: 128b,192b,256b
Combination AES-SHA / SHA-AES supported
Move the data once to encrypt/decrypt and/then authenticate
I/O buffer (IOB) provides function
8.9Gbps throughput per engine for AES 128 CBC Encrypt at 2.4GHz, 256B message
7Gbps engine throughput for SHA-512 at 2.4GHz, 256B message
Supports byte aligned source and target data buffers, scatter/gather
AES modes supported
Electronic Codebook (ECB)
Cipher Block Chaining (CBC)
Counter (CTR)
Counter with CBC-MAC (CCM)
Galois Counter Mode (GCM)
XCBC-MAC-96 (XMAC)
Hash mode supported
SHA1
SHA2 SHA-256
SHA2 SHA-512
Keyed-hash MAC (HMAC)
MD5
POWER8 Hardware Encryption
Source: Performance Characteristics of the POWER8 Processor, Alex Mericas, IBM Corporation

Algorithm   POWER7+ On-Chip   POWER8 On-Chip   POWER8 In-Core
AES-GCM     X                 X                X
AES-CTR     X                 X                X
AES-CBC     X                 X                X
AES-ECB     X                 X                X
SHA-256     X                 X                X
SHA-512     X                 X                X
RNG         X                 X
CRC                                            X

Cycles per byte (1 core and in-core crypto):
Algorithm     POWER7+ (SW)   POWER8 (HW) single thread   POWER8 (HW) multi thread
SHA-512       35             10.7 (x3)                   2.6 (x13)
AES-128-ENC   17             4 (x4)                      0.8 (x21)
AES-256-ENC   21             5.5 (x3.8)                  1.1 (x19)

- On-Chip Hardware Accelerators were introduced with POWER7+; POWER8 has the same accelerators.
- Offload encryption for OS-based large messages (encrypted file systems, etc.)
- On a virtualized system, access to On-Chip (NX) Hardware Accelerators must be made through a hypervisor call.
- In-Core acceleration is directly accessible to virtualized guests (no hypervisor call needed).
includes user-mode instructions to accelerate common algorithms
Linux on Power hypervisor compatibility matrix
Accelerator   Feature             Baremetal   PowerVM guest   PowerKVM guest
On-chip       Compression (842)
On-chip       AES
On-chip       RNG
In-core       AES
In-core       SHA
In-core       CRC
P8 Hardware Encryption Acceleration
Combination of on-chip accelerators for CPU offload with larger blocks of encryption work, and
in-core instructions for small data sizes.
Exploitation available transparently under OS services and APIs
[Diagram: crypto software stack across hardware, kernel, user space and applications. Labels: on-chip crypto, in-core crypto, random number generation, physical TPM, hypervisor H_COP calls, /dev/random, /dev/urandom, IPsec, TCP/IP, encrypted file system, cryptographic library in C, standard library, standard crypto APIs, GSKit, OpenSSL 1.0.2 libcrypto, custom application use/libs, key generation (strong keys), encrypted data in flight, encrypted data at rest. Markers show where acceleration can be exploited.]
How to enable the in-core crypto accelerator:
In Java, starting with IBM Java 7.1, AES is accelerated with the POWER8 in-core AES instructions by
specifying -Dcom.ibm.crypto.provider.doAESInHardware=true on the JVM command line.
OpenSSL 1.0.2 and later uses the P8 in-core VMX instructions and optimizations for AES/SHA.
All applications based on this version of OpenSSL will benefit from P8 encryption acceleration.
- Ubuntu: OpenSSL 1.0.2 in Ubuntu 15.10 and 16.04
- RedHat: still on OpenSSL 1.0.1 => crypto not accelerated
- Fedora 23: OpenSSL 1.0.2
- Suse 12, OpenSuse 13: still on OpenSSL 1.0.1 => crypto not accelerated
What can you do if you do not have OpenSSL 1.0.2?
Code recompilation with « Advance Toolchain (v9) »
« Advance Toolchain » is a gcc-based toolchain (provided by IBM for free) that provides
POWER-optimized libraries (like libcrypto).
You can then enable HW crypto acceleration for your application even if your Linux distribution
does not provide the latest libcrypto (OpenSSL 1.0.2).
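Whether a given system already ships a recent enough OpenSSL can be checked from the command line. The sketch below assumes the usual "OpenSSL 1.0.2g  1 Mar 2016" output format of openssl version and GNU sort with the -V option; the function name openssl_at_least is hypothetical.

```shell
#!/bin/sh
# openssl_at_least WANTED "VERSION STRING"
# Succeeds if the version found in a string such as
# "OpenSSL 1.0.2g  1 Mar 2016" is >= WANTED.
# Relies on GNU sort -V for version ordering (assumption).
openssl_at_least() {
    wanted=$1
    # The version number is the second whitespace-separated field.
    have=$(printf '%s\n' "$2" | awk '{print $2}')
    lowest=$(printf '%s\n%s\n' "$wanted" "$have" | sort -V | head -n 1)
    [ "$lowest" = "$wanted" ]
}

if command -v openssl >/dev/null 2>&1; then
    v=$(openssl version)
    if openssl_at_least 1.0.2 "$v"; then
        echo "$v: 1.0.2 or later"
    else
        echo "$v: older than 1.0.2"
    fi
else
    echo "openssl not installed"
fi
```

To compare throughput between two builds, openssl speed -evp aes-128-gcm can be run against each of them.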
IBM Advance Toolchain for PowerLinux
URLs:
IBM Advance Toolchain for PowerLinux Documentation
Improving performance with IBM Advance Toolchain for PowerLinux
Description:
The IBM Advance Toolchain for PowerLinux is a set of open source development tools and
runtime libraries which allows users to take leading edge advantage of IBM's latest POWER
hardware features on Linux.
Over time, these libraries and latest compiler technologies are integrated into the shipping
distributions.
However, the IBM Advance Toolchain for PowerLinux contains the latest tested and
supported GNU Compiler Collection (GCC) compiler versions, tailored for Power systems, and
packaged together with an expanding set of processor-tuned libraries, allowing you to take
advantage of the latest technology without waiting.
GCC Compiler
Example of Apache and wget compiled with Advance Toolchain (1/3)
Idea was to recompile Apache and wget with Advance Toolchain to use the Power8 HW in-core
cryptography in order to improve the performance.
Recompile on PowerLinux:
Get source code of Apache and wget from community
Install Advance Toolchain AT9
Recompile out-of-the-box with the following flags, no source code changes at all required.
export CFLAGS="-O3 -m64 -mcpu=power8 -mtune=power8"
export PATH=/opt/at9.0/bin/:$PATH
Configure, make and make install
Simple test: download a 10 GB file with wget from the Apache web server (httpd) in HTTPS, over the loopback interface.
Example of Advance Toolchain with Apache and wget (2/3)
Standard Apache and wget
provided by the repo
Transfer done in 3m10s
Compiled Apache and wget
with Advance Toolchain
Transfer done in 23s
Example of Advance Toolchain with Apache and wget (3/3)
Profiling shows that the AT version is
using the P8-accelerated versions of
ghash and aes.
[Profiles shown: standard build vs Advance Toolchain build.]
Example 2 : J2EE Application benchmark (DayTrader application)
60% better CPU Utilisation with Power in-core encryption
[Charts: CPU utilisation with P8 HW crypto vs without P8 HW crypto.]
1. Transparent Memory Compression
2. -
3. Power8 Split-Core
Enable POWER 8 advanced features on Linux
Enabling SMT on PowerKVM guests (1/2)
How do we schedule guest VCPUs onto physical CPU cores?
- Introduce the notion of "virtual core" (vcore)
- VCPUs are allocated to vcores before being dispatched by the PowerKVM host to a real core.
- By default, 1 vcpu = 1 vcore
- Can be modified to x vcpus = 1 vcore to enable SMT.
Example: PowerKVM with 2 P8 cores
- Guest1, 2 vcpus. Default: 2 vcores, 1 thread each. Both vcores run.
- Guest2, 4 vcpus. Manually defined: 1 vcore, 4 threads (guest2.xml):
<vcpu>4</vcpu>
<cpu>
<topology sockets='1' cores='1' threads='4'/>
</cpu>
- Guest2 must WAIT: no free core is available, so its vcore cannot be dispatched and waits for the next dispatch (time sharing).
An SMT level other than 1 will slow down guest dispatching.
Enabling SMT on PowerKVM guests (2/2)
To configure a KVM guest, the number of VCPUs must be set to the
product of sockets, cores and threads per core assigned to the guest, and the number of threads per
core must be explicitly set.
vcpu = sockets x cores x threads
For example, when using libvirt, you can configure a guest with the following settings to
get a guest with SMT=8 and 2 cores (16 total vcpus):
<vcpu>16</vcpu>
<cpu>
<topology sockets='1' cores='2' threads='8'/>
</cpu>
With that configuration, the guest OS will be able to enable SMT=8 (default) and use the 16
threads across the two assigned cores.
This also allows the guest to dynamically control the SMT level directly from the OS
(ppc64_cpu --smt=x)
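The vcpu count must always equal the product of the three topology values, so it is convenient to derive both from the same inputs. A minimal sketch, with a hypothetical helper named topology_xml that prints the two libvirt elements used on this slide:

```shell
#!/bin/sh
# topology_xml SOCKETS CORES THREADS
# Prints the <vcpu> and <cpu><topology> elements for a libvirt domain,
# with vcpu = sockets x cores x threads.
topology_xml() {
    sockets=$1
    cores=$2
    threads=$3
    vcpus=$(( sockets * cores * threads ))
    printf '<vcpu>%d</vcpu>\n' "$vcpus"
    printf '<cpu>\n'
    printf "  <topology sockets='%d' cores='%d' threads='%d'/>\n" \
        "$sockets" "$cores" "$threads"
    printf '</cpu>\n'
}

# The SMT=8, 2-core guest from the slide.
topology_xml 1 2 8
```

The output can be pasted into the domain definition, for example through virsh edit.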
Enabling SMT topology with Kimchi on PowerKVM 3.1
Default guest SMT mode is 1 VCPU/vcore
- Inefficient use of resources in whole-core mode (1 thread/core)
- Often chosen by users who are not familiar with POWER
- Often chosen by management agents (e.g. OpenStack)
- Setting topology is too complex in big cloud environments
Up to now, the default core-split mode was whole-core
- Good for single-thread performance
- Allows users to run SMT1, SMT2, SMT4 and SMT8 guests
- Hits over-commitment early, especially with SMT1 guests:
  with a 20-core P8, at most 20 vcpus are dispatched in parallel by default.
PowerKVM 3.1 addresses these points with 2 features:
1. (Sub)core sharing (piggybacking)
2. Dynamic multi-threading (split-core)
[Diagrams: PowerKVM with 2 P8 cores, running Guest 1 (2 vcpus) alone, then Guest 1 and Guest 2 together.]
PowerKVM Micro-Threading (Split-Core)
No split-core:
- 1 full core available with up to 8 parallel threads
- Only 1 guest running at a time
- (The only mode available on PowerVM)
Split-core by 2:
- 2 sub-cores available, each with up to 4 parallel threads
- Up to 2 guests running at a time
Split-core by 4:
- 4 sub-cores available, each with up to 2 parallel threads
- Up to 4 guests running at a time
[Diagram: one IBM Power8 core shown whole, split into 2 sub-cores, and split into 4 sub-cores.]
PowerKVM Micro-Threading (Split-Core)
A Power8 core runs 8 threads.
All threads share the MMU(1) context, and therefore must be in the same partition.
Guests in single-thread (SMT1) mode cannot use the full core capacity.
PowerKVM introduces the possibility to split a Power8 core into 2 or 4 subcores: Micro-Threading
(static in PowerKVM 2.1, dynamic in PowerKVM 3.1)
Each subcore has its own MMU(1) and can be dispatched independently to a different guest (VM).
Micro-Threading benefits:
- Better CPU resource usage
- More virtual machines per core
- Reduces over-commitment overhead (context switch)
Micro-Threading limitations:
- Guest SMT is limited to 2 or 4, depending on the split-core level (half core, quarter core)
- All threads are running in SMT8 mode (lower single-thread performance)
(1) The MMU (Memory Management Unit) is a hardware memory decoder
that maps virtual addresses to physical addresses.
[Diagrams: on a full core, VM1 to VM4 time-share the 8 threads (thr1 to thr8), with context switching (hypervisor overhead) between them; with split-core, VM1 to VM4 run at the same time on subcore1 to subcore4.]
PowerKVM 3.1 Dynamic Micro-Threading (SubCores)
With PowerKVM 3.1, the hypervisor may dynamically choose to split each core by two or by
four in order to match vcpu needs with the available hardware resources.
Example 1: PowerKVM 3.1 host with 1 P8 core. Splitting by 2 is optimum.
Guest1, 2 vcpus, manually defined as 1 vcore, 2 threads:
<topology sockets='1' cores='1' threads='2'/>
Guest2, 4 vcpus, manually defined as 1 vcore, 4 threads:
<topology sockets='1' cores='1' threads='4'/>
Example 2: PowerKVM 3.1 host with 1 P8 core. Splitting by 4 is optimum.
Guest1, Guest2 and Guest3, 2 vcpus each, manually defined as 1 vcore, 2 threads:
<topology sockets='1' cores='1' threads='2'/>
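The two cases on this slide can be reproduced with a small heuristic. This is only a toy sketch, not the algorithm the PowerKVM hypervisor actually uses. It assumes one 8-thread core, one vcore per guest, and no piggybacking. `choose_split` is a hypothetical name.

```python
THREADS_PER_CORE = 8
SPLIT_LEVELS = (1, 2, 4)  # sub-cores per core, smallest split first


def choose_split(guest_threads):
    """Pick a split level for one core shared by single-vcore guests.

    guest_threads: threads per vcore for each guest, e.g. [2, 4].
    A level is feasible if the widest guest still fits in one sub-core.
    Among feasible levels, take the smallest that gives every guest its
    own sub-core; if none does, fall back to the largest feasible level.
    """
    if not guest_threads:
        raise ValueError("at least one guest is required")
    widest = max(guest_threads)
    if widest > THREADS_PER_CORE:
        raise ValueError("a vcore cannot have more threads than a core")
    feasible = [s for s in SPLIT_LEVELS if THREADS_PER_CORE // s >= widest]
    for split in feasible:
        if split >= len(guest_threads):
            return split
    return feasible[-1]


print(choose_split([2, 4]))     # Guest1 with 2 threads, Guest2 with 4 threads
print(choose_split([2, 2, 2]))  # three guests with 2 threads each
```

With these inputs the heuristic picks a split by 2 for the first case and a split by 4 for the second, matching the slide.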
To set the level of subcoring manually and statically, run on the PowerKVM host:
ppc64_cpu --subcores-per-core # Get number of subcores per core
ppc64_cpu --subcores-per-core=X # Set subcores per core to X (1, 2 or 4)
ppc64_cpu --threads-per-core # Get threads per core
(Changing the setting requires all VMs to be offline.)
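When these commands are scripted, it can help to validate the requested value before touching the host. The sketch below only builds the command lines, using the three `ppc64_cpu` options listed above, and does not run anything. The helper names are made up for this illustration.

```python
import shlex

VALID_SUBCORES = (1, 2, 4)  # values accepted for X according to the slide


def get_subcores_cmd():
    """Command that reports the current number of sub-cores per core."""
    return ["ppc64_cpu", "--subcores-per-core"]


def set_subcores_cmd(subcores):
    """Command that sets sub-cores per core; rejects values other than 1, 2, 4."""
    if subcores not in VALID_SUBCORES:
        raise ValueError(f"subcores must be one of {VALID_SUBCORES}, got {subcores}")
    return ["ppc64_cpu", f"--subcores-per-core={subcores}"]


def get_threads_cmd():
    """Command that reports the number of threads per core."""
    return ["ppc64_cpu", "--threads-per-core"]


# Dry run: print the commands instead of executing them.
for cmd in (get_subcores_cmd(), set_subcores_cmd(4), get_threads_cmd()):
    print(shlex.join(cmd))
```

On a real PowerKVM host, with all VMs shut down, the same lists could be passed to `subprocess.run`. That step is left out here on purpose.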
PowerKVM 3.1
Micro-Threading (Subcore) DEMO
PowerKVM 3.1 Dynamic Micro-Threading (SubCores) DEMO
The demonstration is done with:
4 guests (virtual machines), all pinned onto one single core of a 20-core S822L Power8 server.
PowerKVM 3.1 virtualization.
Each guest is defined with a manual topology of 1 vcore and 2 threads.
Figure: PowerKVM 3.1 host with 1 P8 core running guests split1, split2, split3 and split4, 2 vcpus each, manually defined as 1 vcore, 2 threads:
<topology sockets='1' cores='1' threads='2'/>
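In a libvirt domain definition, the topology element sits inside the guest's `<cpu>` element and its attribute values are quoted. A minimal sketch of generating that fragment for the demo guests is shown here, using only Python's standard library. The function name `cpu_topology_xml` is an assumption for this example, and the surrounding `<domain>` definition is omitted.

```python
import xml.etree.ElementTree as ET


def cpu_topology_xml(sockets, cores, threads):
    """Build the <cpu><topology .../></cpu> fragment of a libvirt domain."""
    cpu = ET.Element("cpu")
    ET.SubElement(
        cpu,
        "topology",
        sockets=str(sockets),
        cores=str(cores),
        threads=str(threads),
    )
    return ET.tostring(cpu, encoding="unicode")


def vcpu_count(sockets, cores, threads):
    """Number of vcpus implied by a topology."""
    return sockets * cores * threads


# The demo guests: 1 socket, 1 vcore, 2 threads each.
for name in ("split1", "split2", "split3", "split4"):
    print(name, vcpu_count(1, 1, 2), "vcpus:", cpu_topology_xml(1, 1, 2))
```

Each demo guest therefore ends up with 1 x 1 x 2 = 2 vcpus.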
PowerKVM 3.1 Dynamic Micro-Threading (SubCores) DEMO
(guest topology is 1 vcore, 2 threads)
Figure: three charts of core threads (1 to 8) against time slices, showing which of the guests split1, split2, split3 and split4 is dispatched in each slice:
No Micro-Threading allowed
Micro-Threading with 2 sub-cores max
Micro-Threading with 4 sub-cores max
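The three modes of the demo can be approximated with a toy dispatch model. It is a sketch under simplifying assumptions, not a reproduction of the measured charts. It assumes four guests of 2 threads each pinned on one 8-thread core, at most one guest per sub-core in a time slice, and no piggybacking. `schedule` and `busy_threads` are hypothetical helper names.

```python
GUESTS = ["split1", "split2", "split3", "split4"]
THREADS_PER_GUEST = 2   # each guest is 1 vcore with 2 threads
THREADS_PER_CORE = 8


def schedule(guests, subcores):
    """Group guests into time slices, at most `subcores` guests per slice."""
    if subcores not in (1, 2, 4):
        raise ValueError("subcores must be 1, 2 or 4")
    return [guests[i:i + subcores] for i in range(0, len(guests), subcores)]


def busy_threads(time_slice):
    """Hardware threads doing guest work during one time slice."""
    return len(time_slice) * THREADS_PER_GUEST


for subcores in (1, 2, 4):
    slices = schedule(GUESTS, subcores)
    print(f"{subcores} sub-core(s): {len(slices)} time slice(s) to run every guest once")
    for number, time_slice in enumerate(slices, start=1):
        print(f"  slice {number}: {', '.join(time_slice)} "
              f"({busy_threads(time_slice)} of {THREADS_PER_CORE} threads busy)")
```

In this model, running every guest once takes four time slices without micro-threading, two with 2 sub-cores and one with 4 sub-cores, and the number of busy threads per slice rises from 2 to 4 to 8.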
400 VMs on a (small) 20-core S822LC?
Thanks to split-core (and piggybacking), even 400 VMs on a small
but nevertheless powerful IBM S822LC is OK (even if definitely extreme).
Guest = 2 vcpus, default topology: 2 vcores, 1 thread each
Figure: pgbench PostgreSQL workload (tps) against number of VMs, showing the PowerKVM 3.1 split-core benefits. Chart annotations:
No need to split (thanks to piggybacking) with 20 VMs
Split-core helps optimize core utilization
Almost like PowerKVM 2.1 (piggybacking not available with pKVM 2.1)
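To give a sense of how extreme 400 VMs is, the over-commitment can be worked out from the figures on this slide (20 cores, 2 vcpus per guest) and the 8 hardware threads per POWER8 core. This is plain arithmetic, not a measurement. The helper name is invented for the example.

```python
CORES = 20
THREADS_PER_CORE = 8
VCPUS_PER_VM = 2          # default topology: 2 vcores, 1 thread each


def overcommit(vms):
    """Return (total vcpus, vcpus per physical core, vcpus per hardware thread)."""
    total_vcpus = vms * VCPUS_PER_VM
    per_core = total_vcpus / CORES
    per_thread = total_vcpus / (CORES * THREADS_PER_CORE)
    return total_vcpus, per_core, per_thread


for vms in (20, 400):
    total, per_core, per_thread = overcommit(vms)
    print(f"{vms} VMs: {total} vcpus, {per_core:g} per core, "
          f"{per_thread:g} per hardware thread")
```

With 20 VMs there are 2 single-thread vcores per physical core. With 400 VMs there are 800 vcpus for 160 hardware threads, a 5:1 over-commitment even when every thread of every sub-core is in use.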


  • 6. 5 IBM Systems Technical Events | ibm.com/training/events © Copyright IBM Corporation 2016. Technical University/Symposia materials may not be reproduced in whole or in part without the prior written permission of IBM. Zswap ! For that, we will use a well known Java Benchmark (SPECjbb), run it several time while increasing the JVM Heap-Size. 1 core POWER8 10GB Mem Ubuntu 16.04 10 GB Phys. Mem JVM Heap-Size 9GB 10 GB 18 GB SPECjbb 1- Baseline Test with Zswap deactivated 2- Test with zswap and software compression (default) 3- Test with zswap and Power HW compression (842)
  • 7. 6 IBM Systems Technical Events | ibm.com/training/events © Copyright IBM Corporation 2016. Technical University/Symposia materials may not be reproduced in whole or in part without the prior written permission of IBM. Memory Over-Allocation test with SPECjbb2005 (BaseLine) 0 20 40 60 80 100 120 9 10 11 12 13 14 15 16 17 18 %bopsvsnominal JVM Heap Size SPECjbb2005 performance and Memory Over-Allocation 1 P8 core SMT8 10GB Mem zswap off Memory Over-commitment 10% of nominal performance due to Memory thrashing)
  • 8. SWAP / Paging Activity
      [Diagram: system memory and swap device, with swap-out and swap-in arrows.]
      1- Swap Out / Page Out
         When the memory is full, a process (LRUD) scans memory and moves pages to the swap device.
         Asynchronous background task => no impact on performance.
      2- Swap In / Page In
         When a page fault occurs and the pages are located on the paging device, those pages must be moved back to memory.
         As physical disks are much slower than memory => THIS HURTS PERFORMANCE !!!
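The swap-in and swap-out traffic described on this slide can be observed through the `pswpin` and `pswpout` counters in `/proc/vmstat`. The following is a minimal sketch, not part of the original slides: the helper name `swap_counters` and its optional file argument are illustrative assumptions.

```shell
#!/bin/sh
# Print the cumulative number of pages swapped in and out since boot.
# swap_counters is a hypothetical helper; the optional file argument exists
# only so the function can be tried against a saved copy of /proc/vmstat.
swap_counters() {
    awk '$1 == "pswpin"  { in_pages = $2 }
         $1 == "pswpout" { out_pages = $2 }
         END { printf "swap_in_pages=%d swap_out_pages=%d\n", in_pages, out_pages }' \
        "${1:-/proc/vmstat}"
}

# On a live system, call it twice a few seconds apart and compare the values:
# swap_counters; sleep 5; swap_counters
```

A growing `swap_in_pages` value while the application is running is the signal that corresponds to the "this hurts performance" case.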
  • 9. Memory Over-Allocation test with SPECjbb2005 (Swap I/O)
      [Chart: swap I/O activity, SPECjbb2005 memory over-allocation. X axis: JVM heap size (9 to 18); Y axis: swap I/O (MB/s). 1 P8 core SMT8, 10 GB memory, zswap off.]
      In the memory over-commitment zone, the single SAS disk used as swap device reaches its limit at ~100 MB/s (50% read).
  • 10. In the memory thrashing case, the non-deterministic latency and performance degradation that I/O introduces could be fatal to your ... The I/O storm could even prevent you from connecting to your system or starting any ...
      We need a way to smooth out this I/O storm and performance cliff as memory demand meets memory capacity.
      Zswap!
  • 11. ZSWAP requirements
      1. Zswap has been directly available in the Linux kernel since v3.11:
         - RedHat 7, CentOS 7, Fedora 19
         - Suse 12
         - Ubuntu 14.04
         Enable zswap at boot time by adding the option zswap.enabled=1 in your boot loader.
      2. Power NX (on-chip) acceleration (842) is only available for PowerVM and bare-metal Linux. It is not available today for PowerKVM guests.
         cat /proc/device-tree/ibm,platform-facilities/ibm,compression-v1/status
         should return okay.
         Note: Ubuntu needs kernel 4.2 or above to get access to the Power NX hardware (starting with Ubuntu 15.10): https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1488495
         Enable zswap HW compression with zswap.compressor=842 in your boot loader.
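The two requirements above can be checked from a shell before touching the boot loader. This is a hedged sketch: the function names are hypothetical, and testing for the `/sys/module/zswap` directory is an assumption about how a kernel with built-in zswap exposes it. The device-tree path is the one given on the slide.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the running kernel exposes zswap.
# Assumes zswap appears under /sys/module/zswap when it is built in.
zswap_present() {
    [ -d "${1:-/sys/module/zswap}" ]
}

# Hypothetical helper: succeeds when the NX 842 status file contains "okay".
# Device-tree strings are NUL-terminated, so the NUL byte is stripped first.
nx842_available() {
    status_file=${1:-/proc/device-tree/ibm,platform-facilities/ibm,compression-v1/status}
    [ -r "$status_file" ] && tr -d '\0' < "$status_file" | grep -q '^okay'
}

if zswap_present; then echo "zswap: present"; else echo "zswap: not found"; fi
if nx842_available; then echo "NX 842: available"; else echo "NX 842: not available"; fi
```

On a PowerKVM guest the second check is expected to report "not available", in line with the restriction stated above.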
  • 12. Enabling POWER HW compression engine (842) with zswap
      RedHat:
      1- Enable zswap with the 842 compressor at boot time.
           vi /etc/sysconfig/grub
         Add zswap.enabled=1 zswap.compressor=842 to GRUB_CMDLINE_LINUX.
      2- Regenerate your grub.cfg file.
           grub2-mkconfig > /boot/grub2/grub.cfg
      3- Add the 842 kernel modules to your ramdisk.
           echo 842 > /etc/modules-load.d/842.conf
           dracut -f
      4- Reboot and verify with:
           dmesg | grep zswap
           [ 1.064790] zswap: loaded using pool 842/zbud
  • 13. Enabling POWER HW compression engine (842) with zswap
      Ubuntu:
      1- Enable zswap with the 842 compressor at boot time.
           vi /etc/default/grub
         Add zswap.enabled=1 zswap.compressor=842 to GRUB_CMDLINE_LINUX.
      2- Regenerate your grub.cfg file.
           update-grub
      3- Add the 842 kernel modules to your ramdisk.
           echo 842 > /etc/modules-load.d/842.conf
           vi /usr/share/initramfs-tools/hooks/842
         Add the following lines:
           #!/bin/sh -e
           PREREQS=""
           case $1 in
               prereqs) echo "${PREREQS}"; exit 0;;
           esac
           . /usr/share/initramfs-tools/hook-functions
           force_load 842
         Make the hook executable and rebuild the initramfs:
           chmod +x /usr/share/initramfs-tools/hooks/842
           update-initramfs -u
      4- Reboot and verify with:
           dmesg | grep zswap
           [ 1.064790] zswap: loaded using pool 842/zbud
  • 14. Zswap parameters and monitoring
      Zswap parameters are located in /sys/module/zswap/parameters. You can change:
      - compressor : [ lzo or 842 ], default lzo. Compressor algorithm to use.
      - enabled : [ Y or N ]. Enable zswap.
      - max_pool_percent : [ 1 to 100 ], default 20. Compressed pool size limit (in % of RAM).
      - zpool : [ zbud or zsmalloc ], default zbud. Compression pool algorithm.
      Zbud:
      - stores 2 pages in one slot (compression ratio 2:1)
      - evicts the oldest pages to disk when full
      Zsmalloc:
      - can store more pages per slot than zbud (compression ratio ~3:1)
      - but unlike zbud, redirects new allocations to the paging device when full (does not recycle old pages).
      You can monitor zswap activity by looking at the counters located in /sys/kernel/debug/zswap.
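As an illustration of the monitoring step, the sketch below derives an approximate compression ratio from two debugfs counters. It is not from the slides. It assumes the counters are named `stored_pages` (pages currently held) and `pool_total_size` (pool size in bytes), as in mainline kernels of that period, and that debugfs is mounted and readable, which normally requires root. The function name `zswap_ratio` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: estimate the zswap compression ratio.
# $1: debugfs directory (default /sys/kernel/debug/zswap)
# $2: page size in bytes (default taken from getconf)
zswap_ratio() (
    dir=${1:-/sys/kernel/debug/zswap}
    page_size=${2:-$(getconf PAGESIZE)}
    stored=$(cat "$dir/stored_pages") || exit 1
    pool=$(cat "$dir/pool_total_size") || exit 1
    if [ "$pool" -eq 0 ]; then
        echo "zswap pool is empty"
        exit 0
    fi
    # Integer arithmetic scaled by 100 to keep two decimals without bc.
    ratio=$(( stored * page_size * 100 / pool ))
    printf 'compression ratio: %d.%02d:1\n' $(( ratio / 100 )) $(( ratio % 100 ))
)

# On a live system (as root):
# grep . /sys/module/zswap/parameters/*   # show the current parameters
# zswap_ratio                             # show the estimated ratio
```

With zbud the ratio should stay at or below 2:1, since at most two compressed pages share one slot.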
  • 15. zswap
      [Diagram: uncompressed memory, zpool (zbud) managed by ZSWAP, and swap device.]
      1- Compress/Uncompress (zbud by default)
         Scanning and compressing use extra CPU cycles, but when a page fault occurs, it is much faster to get pages from the compressed pool in memory than from disk.
      2- Swap Out / Page Out
         When the compressed zpool is full, zbud moves the oldest compressed pages to the swap device.
      3- Swap In / Page In
         When a page fault occurs and the pages are located on the paging device, those pages must be moved back to memory. THIS HURTS PERFORMANCE !!!
  • 16. ZSWAP Memory Over-Allocation test with SPECjbb2005
      [Chart: testing zswap (zbud) with SPECjbb2005. X axis: JVM heap size (9 to 18); Y axis: % bops vs nominal. 1 P8 core SMT8, 10 GB memory, max_pool_percent=40. Series: zswap off, zswap 842 (HW). Zones: memory over-commitment, zpool over-commitment.]
      75% of nominal performance at 140% memory.
  • 17. ZSWAP HW vs Soft. compression
      [Chart: testing zswap (zbud) with SPECjbb2005. X axis: JVM heap size (9 to 18); Y axis: % bops vs nominal. 1 P8 core SMT8, 10 GB memory, max_pool_percent=40. Series: zswap off, zswap lzo, zswap 842 (HW). Zones: memory over-commitment, zpool over-commitment. Annotation: x1.5.]
  • 18. ZSWAP Memory Over-Allocation test with SPECjbb2005
      [Chart: testing zswap (zbud) with SPECjbb2005. X axis: JVM heap size (9 to 18); Y axis: % bops vs nominal. 1 P8 core SMT8, 10 GB memory, max_pool_percent=40. Series: zswap 842 (HW). Three zones labeled 1, 2 and 3, including memory over-commitment and zpool over-commitment.]
  • 19. Case 1 : Zswap with Memory not Over-Committed
      [Diagram: memory used (uncompressed), free memory, swap device.]
      - Enough memory available for the application
      - No or little swap I/O occurring
      - Zswap is idle (no CPU overhead)
      => You can use almost all the memory before zswap starts working.
      100% memory used (uncompressed), 100% CPU user: best performance for the application.
  • 20. Case 2 : Zswap with Memory Over-Committed
      [Diagram: memory used (uncompressed), zpool (zbud) managed by ZSWAP, swap device.]
      - The application needs more memory than available.
      - Zswap starts working, compressing pages in and out of the zpool. The zpool is growing.
      - No or little swap I/O occurring
      - Below nominal performance due to memory scanning and unmapping. Compression and decompression are offloaded to NX 842.
      25% CPU system due to page scanning; 75% of nominal performance on a CPU-bound application (worst case).
  • 21. Zswap with 842 (HW) vs LZO (Soft)
      Zswap HW compression (842), 10 GB RAM, 14 GB Java heap size:
      - 25% of system CPU (overhead) due to memory page scanning; compression is offloaded to NX 842
      - 75% of nominal performance
      Zswap software compression (LZO), 10 GB RAM, 14 GB Java heap size:
      - 50% of system CPU (overhead) due to memory page scanning and compression
      - 50% of nominal performance
      => 50% better CPU usage with POWER HW compression
  • 22. ZSWAP Memory Over-Allocation (Swap I/O activity)
      [Chart: testing zswap (zbud) with SPECjbb2005. X axis: JVM heap size (9 to 18); Y axis: swap I/O (kB/s). 1 P8 core SMT8, 10 GB memory, max_pool_percent=40. Series: zswap off, zswap on. Three zones labeled 1, 2 and 3, including memory over-commitment and zpool over-commitment.]
      Little or no paging when running ...
  • 23. Case 3 : Zswap with Memory Over-Committed and Zpool Full
      [Diagram: memory used (uncompressed), zpool (zbud) FULL with max_pool_percent=40, swap device with swap in/out.]
      - The application needs more memory than available.
      - Zswap is working, compressing pages in and out of the zpool.
      - The zpool reaches the max_pool_percent limit (the compressed pool is full).
      - Space must be freed in the zpool => swapping in/out !!! Performance degradation.
      75% CPU wait I/O, only 10% CPU user; 10% of nominal performance due to waiting for pages on the swap device (swap in).
  • 24. Zswap Conclusion
      Zswap is not AME, but it can really help to reduce the impact of paging activity and secure your production system at no cost and with no penalty:
      - The Power8 NX 842 compression engines are available for PowerVM and bare-metal Linux.
      - No impact when memory demand is below the installed RAM capacity.
      - Can maintain your system at 75% of nominal performance in the 100% CPU case (the worst scenario).
      - Zswap with zbud gives a x1.4 memory expansion ratio (with max_pool_percent=40).
      Need more? Then you can try zswap with the zsmalloc allocator.
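The x1.4 figure can be recovered with simple arithmetic. The following is a simplified model and an assumption of this sketch, not a formula from the slides: the uncompressed share of RAM holds its own size, and the compressed pool holds its share multiplied by the number of pages stored per slot. The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: effective memory capacity as a percentage of RAM.
# $1: max_pool_percent
# $2: pages stored per pool page (2 for zbud, about 3 for zsmalloc)
effective_memory_percent() {
    echo $(( (100 - $1) + $1 * $2 ))
}

effective_memory_percent 40 2   # zbud:     60 + 40*2 = 140
effective_memory_percent 40 3   # zsmalloc: 60 + 40*3 = 180
```

Under this model, max_pool_percent=40 gives 140% with zbud, matching the x1.4 ratio above, and 180% with a 3:1 zsmalloc pool.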
  • 25. Zswap with Zsmalloc compress pool (vs zbud)
      [Diagram: uncompressed memory, zpool (zsmalloc) managed by ZSWAP, swap device.]
      1- Compress/Uncompress
         Scanning and compressing use extra CPU cycles, but when a page fault occurs, it is much faster to get pages from the compressed pool in memory than from disk.
      2- Swap In / Out
         Compared to zbud, zsmalloc has no page replacement algorithm. When the zpool is full, paging out occurs directly from main memory to the paging device.
      Zsmalloc can store more pages per slot than zbud (3:1 measured), resulting in a higher memory expansion ratio.
  • 26. ZSWAP (zsmalloc) Memory Over-Allocation test with SPECjbb2005
      [Chart: testing zswap (zbud vs zsmalloc) with SPECjbb2005. X axis: JVM heap size (9 to 33); Y axis: % bops vs nominal. 1 P8 core SMT8, 10 GB memory, max_pool_percent=40. Series: zswap off, zswap zsmalloc 842 (HW), zswap 842 (HW). Markers: memory over-commitment, zpool (zbud) limit, zpool (zsmalloc) limit, x2.]
      With zsmalloc: 75% of nominal performance at x1.8 memory size; 50% of nominal performance at x2 memory size.
  • 27. Monitor Zswap (zsmalloc) activity on 10GB VM with Grafana
      [Grafana screenshot; scale from 10 GB to 40 GB in 5 GB steps.]
  • 28. Enable POWER 8 advanced features on Linux
      1. Transparent Memory Compression
      2. -
      3. Power8 Split-Core
  • 29. Symmetric vs Asymmetric encryption
      Asymmetric encryption:
      - SLOW/complex operation
      - Private key never distributed
      - Used to send the AES secret key
      - Not optimized by Power8
      Symmetric encryption (AES):
      - FAST/simple operation
      - Secret key must be distributed
      - Optimized by Power8
  • 30. Anatomy of an SSL/HTTPS request
      [Diagram: client browser and server.]
      - SSL handshake: executed only once, asymmetric encryption, secret key exchange
      - Data exchange: symmetric encryption
      The majority of the exchange uses symmetric encryption.
  • 31. POWER8 Hardware Accelerator
      On-Chip Accelerators (NX):
      - Symmetric crypto: AES, SHA
      - True random number generator
      - Must be accessed through a hypervisor call for guests/VMs
      - Better single-thread performance, larger bandwidth
      - Symmetric crypto currently not available for PowerKVM guests
      In-Core Accelerators:
      - Symmetric crypto: AES, SHA
      - Cyclic Redundancy Check
      - Private per core
      - Leverage the Vector Unit (VMX)
      - Direct access for guests/VMs
      IBM POWER8:
      - 12 cores per socket (from 3 to 4 GHz)
      - 8 HW threads per core (SMT technology)
      - Large cache (96 MB : 8 MB / core)
      - High memory bandwidth (~200 GB/s)
  • 32. AES Symmetric Cryptography / SHA Hash Engine
      - AES key lengths: 128b, 192b, 256b
      - Combination AES-SHA / SHA-AES supported: move the data once to encrypt/decrypt and/then authenticate; the I/O buffer (IOB) provides this function
      - 8.9 Gbps throughput per engine for AES-128 CBC encrypt at 2.4 GHz, 256B message
      - 7 Gbps engine throughput for SHA-512 at 2.4 GHz, 256B message
      - Supports byte-aligned source and target data buffers, scatter/gather
      AES modes supported:
      - Electronic Codebook (ECB)
      - Cipher Block Chaining (CBC)
      - Counter (CTR)
      - Counter with CBC-MAC (CCM)
      - Galois Counter Mode (GCM)
      - XCBC-MAC-96 (XMAC)
      Hash modes supported:
      - SHA1
      - SHA2 SHA-256
      - SHA2 SHA-512
      - Keyed-hash MAC (HMAC)
      - MD5
  • 33. POWER8 Hardware Encryption
      Source: Performance Characteristics of the POWER8 Processor, Alex Mericas, IBM Corporation

      Algorithm   POWER7+ On-Chip   POWER8 On-Chip   POWER8 In-Core
      AES-GCM     X                 X                X
      AES-CTR     X                 X                X
      AES-CBC     X                 X                X
      AES-ECB     X                 X                X
      SHA-256     X                 X                X
      SHA-512     X                 X                X
      RNG         X                 X
      CRC                                            X

      Cycles per byte (1 core and in-core crypto):

      Algorithm     POWER7+ (SW)   POWER8 (HW) single thread   POWER8 (HW) multi thread
      SHA-512       35             10.7 (x3)                   2.6 (x13)
      AES-128-ENC   17             4 (x4)                      0.8 (x21)
      AES-256-ENC   21             5.5 (x3.8)                  1.1 (x19)

      On-Chip Hardware Accelerators:
      - Introduced with POWER7+; POWER8 has the same accelerators
      - Offload encryption for OS-based large messages (encrypted file systems, etc.)
      - On a virtualized system, access to the On-Chip (NX) Hardware Accelerators must be made through a hypervisor call.
      In-Core acceleration:
      - Directly accessible to virtualized guests (no hypervisor call needed)
      - Includes user-mode instructions to accelerate common algorithms
  • 34. Linux on Power hypervisor compatibility matrix
      [Table. Columns: Accelerator, Features, Baremetal, PowerVM guest, PowerKVM guest. Rows: On-chip: Compression (842), AES, RNG. In-core: AES, SHA, CRC. The support marks in the cells are not preserved in this transcript.]
  • 35. P8 Hardware Encryption Acceleration
      Combination of on-chip accelerators, for CPU offload with larger blocks of encryption work, and in-core instructions, for small data sizes. Exploitation is available transparently under OS services and APIs.
      [Layer diagram, with markers showing where acceleration can be exploited.
       Hardware: on-chip crypto, in-core crypto, random number generation, physical TPM.
       Hypervisor: H_COP calls.
       Kernel: /dev/random, /dev/urandom, IPsec, TCP/IP, encrypted file system.
       User space: cryptographic library in C, GSKit, standard library, OpenSSL 1.0.2 libcrypto, standard crypto APIs, custom application use/libs.
       Uses: strong keys, key generation, encrypted data in flight, encrypted data at rest.]
  • 36. How to enable the in-core crypto accelerator
      Java: starting with IBM Java 7.1, AES is accelerated with the POWER8 in-core AES instructions by specifying -Dcom.ibm.crypto.provider.doAESInHardware=true on the JVM command line.
      OpenSSL: version 1.0.2 and later uses the VMX in-core P8 instructions and optimizations for AES/SHA. All applications based on this version of OpenSSL benefit from P8 encryption acceleration.
      - Ubuntu: OpenSSL 1.0.2 in Ubuntu 15.10 and 16.04
      - RedHat: still on OpenSSL 1.0.1 => crypto not accelerated
      - Fedora 23: OpenSSL 1.0.2
      - Suse 12, OpenSuse 13: still on OpenSSL 1.0.1 => crypto not accelerated
      What can you do if you do not have OpenSSL 1.0.2?
      Recompile your code with the « Advance Toolchain (v9) ». The « Advance Toolchain » is a gcc-based compiler (provided by IBM for free) that provides POWER-optimized libraries (such as libcrypto). You can then enable HW crypto acceleration for your application even if your Linux distribution does not provide the latest libcrypto (OpenSSL 1.0.2).
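To see which case a given system falls into, the OpenSSL version string can be compared against 1.0.2. This is a rough sketch, not from the slides: the function name is hypothetical, and a version check is only an indication, since a distribution may backport or disable the POWER8 code paths.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the version string is 1.0.2 or later.
# $1: output of "openssl version", e.g. "OpenSSL 1.0.2g  1 Mar 2016"
openssl_is_102_or_later() (
    ver=$(printf '%s\n' "$1" | awk '{ print $2 }')
    ver=${ver%%[!0-9.]*}              # strip letter suffix: 1.0.2g -> 1.0.2
    old_ifs=$IFS; IFS=.
    set -- $ver                       # split into major, minor, patch
    IFS=$old_ifs
    major=${1:-0}; minor=${2:-0}; patch=${3:-0}
    [ "$major" -gt 1 ] && exit 0
    [ "$major" -eq 1 ] && [ "$minor" -gt 0 ] && exit 0
    [ "$major" -eq 1 ] && [ "$minor" -eq 0 ] && [ "$patch" -ge 2 ]
)

# On a live system:
# if openssl_is_102_or_later "$(openssl version)"; then
#     echo "libcrypto should use the P8 in-core instructions"
# else
#     echo "consider rebuilding with the Advance Toolchain"
# fi
# openssl speed -evp aes-128-cbc     # measure the actual AES throughput
```

Comparing `openssl speed` results between the distribution build and an Advance Toolchain build is a more direct confirmation than the version number alone.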
  • 37. IBM Advance Toolchain for PowerLinux
      URLs:
      - IBM Advance Toolchain for PowerLinux Documentation
      - Improving performance with IBM Advance Toolchain for PowerLinux
      Description: The IBM Advance Toolchain for PowerLinux is a set of open source development tools and runtime libraries which allows users to take leading edge advantage of IBM's latest POWER hardware features on Linux. Over time, these libraries and latest compiler technologies are integrated into the shipping distributions. However, the IBM Advance Toolchain for PowerLinux contains the latest tested and supported GNU Compiler Collection (GCC) compiler versions, tailored for Power systems, and packaged together with an expanding set of processor-tuned libraries, allowing you to take advantage of the latest technology without waiting.
  • 38. Example of Apache and wget compiled with Advance Toolchain (1/3)
      The idea was to recompile Apache and wget with the Advance Toolchain to use the Power8 HW in-core cryptography in order to improve performance.
      Recompile on PowerLinux:
      - Get the source code of Apache and wget from the community.
      - Install Advance Toolchain AT9.
      - Recompile out of the box with the following flags; no source code changes at all are required.
          export CFLAGS="-O3 -m64 -mcpu=power8 -mtune=power8"
          export PATH=/opt/at9.0/bin/:$PATH
      - Configure, make and make install.
      Simple test: download a 10 GB file with wget from the Apache web server over HTTPS.
      [Diagram: Apache (httpd) serving a 10 GB file to wget over SSL on the loopback interface.]
  • 39. Example of Advance Toolchain with Apache and wget (2/3)
      - Standard Apache and wget provided by the repo: transfer done in 3m10s
      - Apache and wget compiled with Advance Toolchain: transfer done in 23s
  • 40. Example of Advance Toolchain with Apache and wget (3/3)
      [Profiling screenshots: standard build vs Advance Toolchain build.]
      Profiling shows that the AT version uses the P8-accelerated versions of ghash and aes.
  • 41. Example 2 : J2EE Application benchmark (DayTrader application)
      [Charts: CPU utilisation without P8 HW crypto vs with P8 HW crypto.]
      60% better CPU utilisation with Power in-core encryption.
  • 42. Enable POWER 8 advanced features on Linux
      1. Transparent Memory Compression
      2. -
      3. Power8 Split-Core
  • 43. Enabling SMT on PowerKVM guests (1/2)
      How do we schedule guest VCPUs onto physical CPU cores?
      - PowerKVM introduces the notion of "virtual core" (vcore).
      - VCPUs are allocated to vcores before being dispatched by the PowerKVM host to a real core.
      - By default, 1 vcpu = 1 vcore.
      - This can be changed to x vcpus = 1 vcore to enable SMT.
      [Diagram: PowerKVM with 2 P8 cores.
       Guest1, 2 vcpus, default topology: 2 vcores, 1 thread each. Both vcores run.
       Guest2, 4 vcpus, manually defined: 1 vcore, 4 threads. WAIT: no free core available, the vcore cannot be dispatched and waits for the next dispatch (time sharing).]
      guest2.xml:
          <vcpu>4</vcpu>
          <cpu>
            <topology sockets='1' cores='1' threads='4'/>
          </cpu>
      An SMT level different from 1 will slow down guest dispatching.
  • 44. Enabling SMT on PowerKVM guests (2/2)
      To configure a KVM guest, the number of VCPUs must be set to the product of the cores and the threads per core assigned to the guest, and the number of threads per core must be explicitly set.
          vcpu = sockets x cores x threads
      For example, when using libvirt, you can use the following settings to get a guest with SMT=8 and 2 cores (16 vcpus in total):
          <vcpu>16</vcpu>
          <cpu>
            <topology sockets='1' cores='2' threads='8'/>
          </cpu>
      With that configuration, the guest OS is able to enable SMT=8 (default) and use the 16 threads across the two assigned cores. This also allows the guest to dynamically control the SMT level directly from the OS (ppc64_cpu --smt=x).
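The rule vcpu = sockets x cores x threads can be applied mechanically when preparing guest definitions. The helper below is a hypothetical sketch: it prints only the two libvirt elements shown on the slide, not a complete domain definition.

```shell
#!/bin/sh
# Hypothetical helper: print the <vcpu> and <cpu> topology elements for a guest.
# $1: sockets, $2: cores, $3: threads per core
guest_topology_xml() {
    vcpus=$(( $1 * $2 * $3 ))
    printf "<vcpu>%d</vcpu>\n<cpu>\n  <topology sockets='%d' cores='%d' threads='%d'/>\n</cpu>\n" \
        "$vcpus" "$1" "$2" "$3"
}

# 1 socket, 2 cores, SMT8: 16 vcpus, as in the example above.
guest_topology_xml 1 2 8
```

Generating both values from the same three inputs avoids the mismatch between the vcpu count and the topology that would otherwise have to be checked by hand.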
  • 45. Enabling SMT topology with Kimchi on PowerKVM 3.1
  • 46. Default guest SMT mode is 1 VCPU/vcore
      - Inefficient use of resources in whole-core mode (1 thread/core)
      - Often chosen by users who are not familiar with POWER
      - Often chosen by management agents (e.g. OpenStack)
      - Setting the topology is too complex in a big cloud environment
      Up to now, the default core-split mode was whole-core:
      - Good for single-thread performance
      - Allows users to run SMT1, SMT2, SMT4 and SMT8 guests
      - Hits over-commitment early, especially with SMT1 guests: with a 20-core P8, at most 20 vcpus are dispatched in parallel by default.
      PowerKVM 3.1 addresses these points with 2 features:
      1. (Sub)core sharing (piggybacking)
      2. Dynamic multi-threading (split-core)
      [Diagrams: PowerKVM with 2 P8 cores, running the vcpus of Guest 1 and Guest 2.]
  • 47. PowerKVM Micro-Threading (Split-Core)
      [Diagram: one core of an IBM Power8 chip, whole, split in 2 and split in 4.]
      - No split-core: 1 full core available with up to 8 parallel threads. Only 1 guest running at a time (the only mode available on PowerVM).
      - Split-core by 2: 2 sub-cores available, each with up to 4 parallel threads. Up to 2 guests running at a time.
      - Split-core by 4: 4 sub-cores available, each with up to 2 parallel threads. Up to 4 guests running at a time.
  • 48. PowerKVM Micro-Threading (Split-Core)
      Power8 is an 8-thread processor. All threads share the MMU(1) context and therefore must be in the same partition. Guests in single-thread (SMT 1) mode cannot use the full core capacity.
      [Diagram, full core: VM1, VM2, VM3 and VM4 take turns over time on threads 1 to 8, with context switching (hypervisor overhead) between them.]
      PowerKVM introduces the possibility to split a Power8 core into 2 or 4 subcores: Micro-Threading (static in PowerKVM 2.1, dynamic in PowerKVM 3.1). Each subcore has its own MMU(1) and can be dispatched independently to a different guest (VM).
      [Diagram, split core: VM1 to VM4 run at the same time, one per subcore (subcore1 to subcore4), over threads 1 to 8.]
      Micro-Threading benefits:
      - Better usage of CPU resources
      - More virtual machines per core
      - Reduces over-commitment overhead (context switches)
      Micro-Threading limitations:
      - Guest SMT is limited to 2 or 4, depending on the split-core level (half core, quarter core)
      - All threads are running in SMT8 mode (lower single-thread performance)
      (1) The MMU (Memory Management Unit) is a hardware memory decoder that maps virtual addresses to physical addresses.
  • 49. PowerKVM 3.1 Dynamic Micro-Threading (SubCores)
      With PowerKVM 3.1, the hypervisor may dynamically choose to split each core by two or by four in order to match the vcpu needs to the available hardware resources.
      Example 1, PowerKVM 3 with 1 P8 core: splitting by 2 is optimum.
      - Guest1, 2 vcpus, manually defined as 1 vcore, 2 threads: <topology sockets='1' cores='1' threads='2'/>
      - Guest2, 4 vcpus, manually defined as 1 vcore, 4 threads: <topology sockets='1' cores='1' threads='4'/>
      Example 2, PowerKVM 3 with 1 P8 core: splitting by 4 is optimum.
      - Guest1, 2 vcpus, manually defined as 1 vcore, 2 threads: <topology sockets='1' cores='1' threads='2'/>
      - Guest2, 2 vcpus, manually defined as 1 vcore, 2 threads: <topology sockets='1' cores='1' threads='2'/>
      - Guest3, 2 vcpus, manually defined as 1 vcore, 2 threads: <topology sockets='1' cores='1' threads='2'/>
      To set the subcore level manually and statically, use at the PowerKVM host level (all VMs must be offline):
          ppc64_cpu --subcores-per-core      # Get the number of subcores per core
          ppc64_cpu --subcores-per-core=X    # Set subcores per core to X (1, 2 or 4)
          ppc64_cpu --threads-per-core       # Get threads per core
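The two examples above follow a simple rule of thumb: a core split by N leaves 8/N threads per subcore, so the widest guest limits how far the core can be split. The sketch below encodes that rule for illustration only. It is an assumption of this example, not the actual PowerKVM scheduling algorithm, and the function name `best_split` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: largest split level (4, 2 or 1) whose subcores can
# still hold the widest guest. Arguments: threads per vcore of each guest.
best_split() (
    max=1
    for t in "$@"; do
        [ "$t" -gt "$max" ] && max=$t
    done
    for split in 4 2 1; do
        # A POWER8 core has 8 threads; each subcore gets 8/split of them.
        if [ $(( 8 / split )) -ge "$max" ]; then
            echo "$split"
            exit 0
        fi
    done
    exit 1   # a guest wider than 8 threads does not fit on one core
)

best_split 2 4     # guests with 2 and 4 threads: split by 2
best_split 2 2 2   # three guests with 2 threads:  split by 4
```

The value printed is the one that would be passed to `ppc64_cpu --subcores-per-core=X` when setting the level statically.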
  • 50. PowerKVM 3.1 Micro-Threading (Subcore) DEMO
  • 51. PowerKVM 3.1 Dynamic Micro-Threading (SubCores) DEMO
      The demonstration is done with:
      - 4 guests (virtual machines), all pinned onto one single core of a 20-core S822L Power8 server
      - PowerKVM 3.1 virtualization
      - Each guest defined with a manual topology of 1 vcore and 2 threads
      [Diagram: PowerKVM 3 with 1 P8 core. Guests split1, split2, split3 and split4, each with 2 vcpus and <topology sockets='1' cores='1' threads='2'/>.]
  • 52. PowerKVM 3.1 Dynamic Micro-Threading (SubCores) DEMO (guest topology is 1 vcore, 2 threads)
      [Three timing diagrams showing how guests split1 to split4 occupy core threads 1 to 8 in each time slice: no Micro-Threading allowed; Micro-Threading with 2 sub-cores max; Micro-Threading with 4 sub-cores max.]
  • 53. 400 VMs on a (small) S822LC 20-cores ?
      Thanks to split-core (and piggybacking), even 400 VMs on a small but nevertheless powerful IBM S822LC is OK (even if definitely extreme).
      Guest = 2 vcpus, default topology: 2 vcores, 1 thread.
      [Chart: pgbench PostgreSQL workload (tps) vs number of VMs, showing the PowerKVM 3.1 split-core benefits. Annotations: no need to split with 20 VMs, thanks to piggybacking; split-core helps optimize core utilization; without split-core the result is almost like PowerKVM 2.1 (piggybacking not available with PowerKVM 2.1).]