SIMD Instructions
outside and inside
Oracle 12c
Laurent Léturgez – 2015
ABOUT ME
 Oracle Consultant since 2001
 Former developer (C, Java, perl, PL/SQL)
 Blogger since 2004
 http://laurent.leturgez.free.fr (in French, discontinued)
 http://laurent-leturgez.com
 Twitter : @lleturgez
 OCM 11g
Agenda
 SIMD Instructions, outside Oracle 12c
 What is a SIMD instruction ?
 Will my application use SIMD ?
 Raw Performance
 SIMD Instructions, inside Oracle 12c
 How SIMD instructions are used inside Oracle 12c
 Tracing SIMD in Oracle 12c
Caveats
 Most of the topics are from
 My own research
 My past life as a developer
 Some of the topics are about internals, so:
 Analysis and conclusion may be incomplete
 Future versions of Oracle may change the features
 Tests were run with Oracle 12.1.0.2, Oracle Enterprise Linux 7.1, VMware Fusion 7 (and VirtualBox)
Before we start …
 Some fundamentals (from Dennis Yurichev’s book)
 CPU register : […] The easiest way to understand a register is to
think of it as an untyped temporary variable. Imagine if you were
working with a high-level PL (programming language) and could only use
eight 32-bit (or 64-bit) variables. Yet a lot can be done using just these!
 Instruction : A primitive CPU command. The simplest examples
include: moving data between registers, working with memory and
arithmetic primitives. As a rule, each CPU has its own instruction set
architecture (ISA).
 Assembly language : Mnemonic code and some extensions like
macros which are intended to make a programmer’s life easier.
http://beginners.re/Reverse_Engineering_for_Beginners-en.pdf
SIMD instructions … outside
Oracle 12c
 SIMD stands for Single Instruction Multiple Data
 Process multiple data
 In one CPU instruction
 Based on
 Specific registers
 Specific CPU instructions and sets of instructions
 Not Oracle specific
 CPU Architecture specific
 Intel
 IBM (Altivec)
 Sparc (VIS)
 This presentation is mainly about Intel architecture
SIMD instructions … outside
Oracle 12c
 What is a SIMD register ?
 It’s a CPU register
 Wider than traditional registers (RDI, RSI, R8, R9 etc.)
 128 up to 512 bits wide
 Holds several data elements at once
SIMD instructions … outside
Oracle 12c
 Scalar operation
 an array of 4 integers {1,2,3,4}
 add 1 to each value
[Figure: each value is loaded from RAM into a register, 1 is added, and the result is saved back to RAM, one value at a time]
LOAD, ADD, SAVE for each value: 4 LOAD, 4 ADD, 4 SAVE
SIMD instructions … outside
Oracle 12c
 SIMD operation
 an array of 4 integers {1,2,3,4}
 add 1 to each value
[Figure: the four values {1,2,3,4} are loaded from RAM into SIMD Reg1, SIMD Reg2 holds {1,1,1,1}, one ADD puts {2,3,4,5} into SIMD Reg3, and the result is saved back to RAM]
LOAD, ADD, SAVE: one of each for the four values
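The two approaches above can be sketched in C with the Intel intrinsics. This is a minimal illustration, not the code from the presentation: the function names `add_one_scalar` and `add_one_sse2` are made up here, and the SIMD version assumes an x86-64 compiler (where SSE2 is part of the baseline) and an element count that is a multiple of 4.

```c
#include <emmintrin.h>  /* SSE2 intrinsics (128-bit XMM registers) */
#include <stdint.h>

/* Scalar version: one load, one add and one store per element. */
void add_one_scalar(int32_t *v, int n)
{
    for (int i = 0; i < n; i++)
        v[i] += 1;
}

/* SSE2 version: four 32-bit integers are processed per instruction.
 * Assumption for this sketch: n is a multiple of 4. */
void add_one_sse2(int32_t *v, int n)
{
    const __m128i ones = _mm_set1_epi32(1);                    /* {1,1,1,1} */
    for (int i = 0; i < n; i += 4) {
        __m128i x = _mm_loadu_si128((const __m128i *)(v + i)); /* LOAD */
        x = _mm_add_epi32(x, ones);                            /* ADD  */
        _mm_storeu_si128((__m128i *)(v + i), x);               /* SAVE */
    }
}
```

Called on the array {1,2,3,4}, both functions leave {2,3,4,5} in place; the SSE2 version gets there with a single loop iteration.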
SIMD instructions … outside
Oracle 12c
Instruction set   MMX          SSE            SSE2/SSE3/SSSE3/SSE4    AVX/AVX2                AVX3 or AVX512
Register size     64 bits      128 bits       128 bits                256 bits                512 bits
# Registers       8            8              16                      16                      32
Register name     MM0 to MM7   XMM0 to XMM7   XMM0 to XMM15           YMM0 to YMM15           ZMM0 to ZMM31
Processors        Pentium II   Pentium III    Pentium IV to Nehalem   Sandy Bridge - Haswell  Skylake
Other:
 SSE: only four 32-bit single-precision floating-point numbers
 SSE2/SSE3/SSSE3/SSE4: usage expansion (two 64-bit double-precision values, four 32-bit integers and up to sixteen 8-bit bytes)
 AVX/AVX2: three-operand instructions (non-destructive): A+B=C rather than A=A+B; alignment requirements relaxed
SIMD instructions … outside
Oracle 12c
 Intel API (C/C++) : Intel Intrinsics Guide
https://software.intel.com/sites/landingpage/IntrinsicsGuide/
 Sample code:
https://app.box.com/simdSampleC-2015
Will my application use SIMD registers
and instructions ?
 It depends on :
 Hardware
 Consult the processor datasheets to see which instruction set extensions
are supported (if any)
 http://ark.intel.com/#@Processors
 Hypervisor
 Some (old) hypervisors do not support modern extensions
 VirtualBox versions <5.0 don’t support SSE4, AVX and AVX2
 Hyper-V on W2008R2-SP1 needs patch for specific processors to
support AVX
 It depends on the Operating System
AVX (256 bits) is supported from
 Linux Kernel >= 2.6.30
 Redhat EL5 : 2.6.18
 Oracle EL5 w/UEK : 2.6.32
AVX needs xsave kernel parameter
 Solaris 10 upd 10 and Solaris 11
 Windows 2008 R2 SP1
Will my application use SIMD registers
and instructions ?
 It depends on the compiler
 GCC
 > 4.6 for AVX support
 Use of specific switches (-msse2, -msse4.1, -msse4.2, -mavx,
-mavx2 …)
 Intel C/C++ Compiler (ICC)
 > 11.1 for AVX Support and > 13.0 for AVX2 support
 Use of specific switches (-xsse4.2, -xavx, -xcore-avx2 …)
 Beware of optimization switches (-O1,-O2, -O3)
 More … disassemble (if you are allowed to)
 Registers
 Assembler instructions
Will my application use SIMD registers
and instructions ?
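On the hardware side, a program can also ask the CPU at run time which extensions it reports. The sketch below assumes GCC or Clang on x86, whose `__builtin_cpu_init` and `__builtin_cpu_supports` builtins are used; the function name `best_simd_extension` is made up for this example, and the check says nothing about what the hypervisor or the compiler switches allow.

```c
/* Returns the name of the widest SIMD extension the running CPU reports,
 * among those discussed in this presentation.
 * Assumption: GCC or Clang targeting x86 (compiler builtins, no header needed). */
const char *best_simd_extension(void)
{
    __builtin_cpu_init();  /* initialise the CPU feature data used below */
    if (__builtin_cpu_supports("avx2"))   return "AVX2";
    if (__builtin_cpu_supports("avx"))    return "AVX";
    if (__builtin_cpu_supports("sse4.2")) return "SSE4.2";
    if (__builtin_cpu_supports("sse2"))   return "SSE2";
    return "none";
}
```

A caller would typically print the result once at start-up, for example with `puts(best_simd_extension());`, and pick a code path accordingly.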
Raw performance
 Based on a C program
 CPU used: Haswell microarchitecture (Core i7-4960HQ), AVX/AVX2 enabled
 3 tests: no SIMD, SSE4, AVX
 Input: one array containing 1 million values
 Goal: add 1 to each value; the pass over the 1 million values is
repeated 4k, 8k, 16k and 32k times
 CPU time (s) = f(#rows)
“Quick and Dirty” sample code available here:
https://app.box.com/s/ibmnbblpho4xtbeq2x8ir60nrk37208v
RAW Performance (CPU) for SIMD Instructions: CPU time (sec)
Rows processed    NO SIMD   SSE4 (XMM registers)   AVX (YMM registers)
4096 M. rows      10.35     3.3                    1.96
8192 M. rows      20.46     6.81                   3.51
16384 M. rows     42.35     13.73                  7.23
32768 M. rows     85.64     25.58                  15.15
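A measurement of this kind can be approximated with a small timing harness. This is only a sketch under assumptions, not the author's "quick and dirty" program: `inc_scalar` and `cpu_seconds` are hypothetical names, and CPU time is taken with the standard `clock()` function. Note that at higher optimization levels a compiler may vectorize the scalar loop by itself, which would blur the comparison.

```c
#include <stdint.h>
#include <time.h>

/* Any kernel with this signature can be timed (scalar, SSE4, AVX ...). */
typedef void (*add_fn)(int32_t *v, int n);

/* Scalar kernel used as the "no SIMD" baseline in this sketch. */
void inc_scalar(int32_t *v, int n)
{
    for (int i = 0; i < n; i++)
        v[i] += 1;
}

/* Runs `passes` full passes of fn over the n values of v and
 * returns the CPU time consumed, in seconds. */
double cpu_seconds(add_fn fn, int32_t *v, int n, int passes)
{
    clock_t start = clock();
    for (int p = 0; p < passes; p++)
        fn(v, n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

With an array of 1 million values and 4096 passes, `cpu_seconds(inc_scalar, values, 1000000, 4096)` corresponds to the first point of the "NO SIMD" series; a SIMD kernel can be passed in the same way.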
SIMD instructions … inside
Oracle 12c
 In Memory Data Structure
 In Memory Compression Unit :
IMCU
 IMCU is the unit of column store
allocation
 Target size is 1M rows
(controlled by _inmemory_imcu_target_rows)
 One IMCU can contain more than
one column
 Each column in one IMCU is a
column unit (CU)
SIMD instructions … inside
Oracle 12c
 In memory column store storage indexes
 For each column unit, min and max values are maintained in
a storage index
 Storage Indexes provide CU pruning
 Information about CU available in GV$IM_COL_CU
(Undocumented. See Bug ID 19361690)
[Figure: IMCU pruning]
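The pruning idea can be illustrated with a few lines of C. This is a conceptual sketch only, not Oracle's implementation: the `cu_summary` structure and both function names are hypothetical, and only an equality predicate on an integer column is considered.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical storage index entry: min and max of one column unit (CU). */
typedef struct {
    int32_t min;
    int32_t max;
} cu_summary;

/* True when no row of the CU can satisfy "col = value". */
bool cu_can_be_pruned_eq(const cu_summary *cu, int32_t value)
{
    return value < cu->min || value > cu->max;
}

/* Counts the CUs that still have to be scanned for "col = value". */
int cus_to_scan_eq(const cu_summary *cus, int n, int32_t value)
{
    int remaining = 0;
    for (int i = 0; i < n; i++)
        if (!cu_can_be_pruned_eq(&cus[i], value))
            remaining++;
    return remaining;
}
```

With three CUs covering the ranges 1-10, 11-20 and 21-30, a lookup of the value 14 leaves one CU to scan. If the same data is loaded unsorted and each CU spans roughly 1-30, none of them can be skipped.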
SIMD instructions … inside
Oracle 12c
 The way your data is sorted matters for best IMCU pruning
SIMD instructions … inside
Oracle 12c
 SIMD extensions are used together with In Memory storage indexes
for efficient filtering
1. IM Storage Indexes prune IMCUs
2. SIMD instructions then apply the filter predicates efficiently
[Figure: IMCU pruning, then filtering with SIMD on a Prod-id column holding the values 10, 10, 14, 14, 10]
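Filtering a column with SIMD amounts to comparing several values against the predicate in one instruction. The sketch below shows the principle with SSE2 intrinsics on four 32-bit values; it is an illustration only, not one of Oracle's kdzk functions, and the name `match_mask4_eq` is made up for the example.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Evaluates "col[i] == value" for four consecutive 32-bit values.
 * Returns a 4-bit mask: bit i is set when col[i] matches. */
int match_mask4_eq(const int32_t *col, int32_t value)
{
    __m128i data = _mm_loadu_si128((const __m128i *)col); /* 4 column values */
    __m128i key  = _mm_set1_epi32(value);                 /* {value x 4}     */
    __m128i eq   = _mm_cmpeq_epi32(data, key);            /* all-ones lane on match */
    /* Collect the top bit of each 32-bit lane into an integer mask. */
    return _mm_movemask_ps(_mm_castsi128_ps(eq));
}
```

For the Prod-id values {10, 10, 14, 14} and the predicate `prod_id = 14`, the function returns 12 (binary 1100): the third and fourth rows match.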
SIMD instructions … inside
Oracle 12c
 Oracle 12c uses specific libraries for SIMD (and compression)
 Located in $ORACLE_HOME/lib
 libshpksse4212.so for SSE4.2 extensions
Compiled with ICC v12 with the specific -xsse4.2 switch
 libshpkavx12.so for AVX extensions
Compiled with ICC v12 with the specific -xavx switch
 libshpkavx212.so for AVX2 extensions
Not fully implemented yet (only 8 functions implemented)
No ICC AVX2 switch used because ICC v12 doesn’t support AVX2
 Thanks Tanel Pöder
SIMD instructions … inside
Oracle 12c
 Oracle SIMD related functions
 Located in kdzk kernel module (HPK)
 Part of Advanced Compression library (ADVCMP)
 Easily tracked with systemtap
SIMD instructions … inside
Oracle 12c
 How does Oracle use SIMD extensions ?
It depends on many parameters
 OS Level : /proc/cpuinfo
 AVX and AVX2 support
 SSE4 Support only
SIMD instructions … inside
Oracle 12c
 Which library am I using ?
 pmap
 AVX support
 SSE4 support
SIMD instructions … inside
Oracle 12c
 Which compiler options have been used ?
 Read “comment” section in ELF
 Read the corresponding compiler documentation
[oracle@oel7 conf]$ readelf -p .comment $ORACLE_HOME/lib/libshpkavx12.so |
> egrep -i 'intel|gcc' | egrep 'xavx|mavx'
[ 2c] -?comment:Intel(R) C Intel(R) 64 Compiler XE for applications running on
Intel(R) 64, Version 12.0 Build 20120731
…/…
-DNTEV_USE_EPOLL -DNET_USE_LDAP -xavx
SIMD instructions … inside
Oracle 12c
 How are SIMD registers used by Oracle ?
 GDB
 To get the call stack (backtrace)
 To set breakpoints on interesting functions
 To view register contents (traditional and SIMD)
 “info registers” for traditional registers
 “info all-registers” for all registers (SIMD registers included)
 (gdb) print $ymmX.<format>
Format can be v8_float, v4_double, v32_int8, v16_int16, v8_int32,
v4_int64, or v2_int128
SIMD instructions … inside
Oracle 12c
In red: register contents that have been modified
In blue: the upper half (128 bits) of the SIMD registers, which is empty
SIMD instructions … inside
Oracle 12c
 Oracle IM can use AVX or SSE4 extensions for SIMD
operations
 When AVX is used
It uses only 128 bits of the 256-bit-wide registers
• AVX adds new register-state through the 256-bit wide YMM
register file
• Explicit operating system support is required to properly save
and restore AVX's expanded registers between context
switches
• Without this, only AVX 128-bit is supported
SIMD instructions … inside
Oracle 12c
The culprit
 Oracle 12.1.0.2 is supported from EL5 onwards
 EL5 Redhat Kernel is 2.6.18 and this flag (xsave) is
supported from 2.6.30 kernels
 For compatibility reasons, Oracle has to compile
its code on 2.6.18 kernels
Tracing SIMD in Oracle 12c
 Oradebug has 2 components related to IM
Tracing SIMD in Oracle 12c
 Interesting components to trace for SIMD
and/or IMCU Pruning are :
 IM_optimizer
Gives information about CBO calculation related to
IM
 ADVCMP_DECOMP.*
ADVCMP_DECOMP_HPK : SIMD functions
ADVCMP_DECOMP_PCODE : Portable Code
Machine (usually comparison functions and results)
Tracing SIMD in Oracle 12c
 IM_optimizer
 Information available in trace file
 IMCU Pruning ratio
 CU decompression costing (per IMCU)
 Predicate evaluation costing (per row)
 Statement has to be parsed to get results
Tracing SIMD in Oracle 12c
select prod_id,cust_id,time_id from laurent.s_capa_high where amount_sold=20;
Tracing SIMD in Oracle 12c
 This information is available in CBO trace file (10053 or SQL_costing
event)
Tracing SIMD in Oracle 12c
 ADVCMP_DECOMP
 ADVCMP_DECOMP_HPK
 Information is available in the trace file (for each IMCU
processed)
 Used library and function
 Number of rows and counting algorithm
 Processing rate (comparison and decompression if relevant)
 But nothing on the results of the processing 
Tracing SIMD in Oracle 12c
 ADVCMP_DECOMP
 ADVCMP_DECOMP_HPK
 Gives information about SIMD function usage and filtering (after
IMCU pruning)
 Example: inmemory table with NO MEMCOMPRESS or DML
compression
Tracing SIMD in Oracle 12c
 ADVCMP_DECOMP
 ADVCMP_DECOMP_HPK
 Example: inmemory compressed table
 SIMD instructions are used only in the kdzk_eq_dict functions
Tracing SIMD in Oracle 12c
 My thoughts about compression/decompression
 NO MEMCOMPRESS / COMPRESS FOR DML
 kdzk*dynp* functions (ex: kdzk_eq_dynp_16bit, kdzk_le_dynp_32bit
etc.)
 FOR QUERY LOW / QUERY HIGH
 Dictionary Encoding (LZW ?) : kdzk_*dict* functions (ex:
kdzk_eq_dict_7bit, kdzk_le_dict_4bit etc.)
 Run Length Encoding: kdzk_burst_rle* functions (ex:
kdzk_burst_rle_8bit, kdzk_burst_rle_16bit …)
 Bit packing compression: kdzk*fixed* functions (ex:
kdzk_ge_lt_fixed_32bit, kdzk_lt_fixed_8bit …)
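To make the dictionary encoding case more concrete, here is a conceptual sketch of an equality filter on a dictionary-encoded column. It does not reproduce the kdzk_*dict* functions: the names `dict_lookup` and `count_eq_dict` and the 8-bit code width are assumptions chosen for illustration. The point is that the predicate value is translated once, after which only small fixed-width codes are compared, which is the kind of loop that lends itself to SIMD.

```c
#include <stdint.h>

/* Returns the dictionary code of value, or -1 if the value is absent. */
int dict_lookup(const int32_t *dict, int dict_size, int32_t value)
{
    for (int i = 0; i < dict_size; i++)
        if (dict[i] == value)
            return i;
    return -1;
}

/* Counts the rows matching "col = value" on a dictionary-encoded column.
 * codes[i] is the dictionary code stored for row i (8-bit codes assumed). */
int count_eq_dict(const int32_t *dict, int dict_size,
                  const uint8_t *codes, int n_rows, int32_t value)
{
    int code = dict_lookup(dict, dict_size, value);
    if (code < 0)
        return 0;  /* value not in the dictionary: no row can match */

    int matches = 0;
    for (int i = 0; i < n_rows; i++)
        if (codes[i] == (uint8_t)code)
            matches++;
    return matches;
}
```

With the dictionary {10, 14} and the codes {0, 0, 1, 1, 0}, the predicate `= 14` matches 2 rows and `= 10` matches 3.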
Tracing SIMD in Oracle 12c
 My thoughts about compression/decompression
 FOR CAPACITY LOW
 FOR QUERY LOW + additional proprietary compression (OZIP)
 Functions: ozip_decode_dict*, kdzk_ozip_decode* (Ex:
kdzk_ozip_decode_dydi, ozip_decode_dict_9_bit etc.)
 FOR CAPACITY HIGH
 FOR QUERY HIGH + heavyweight compression algorithm
 Compression/decompression method depends on:
 Datatype
 Column Compression Unit size
 Column contents
leturgezl@gmail.com
http://laurent-leturgez.com
@lleturgez

Weitere ähnliche Inhalte

Was ist angesagt?

Tuning SQL for Oracle Exadata: The Good, The Bad, and The Ugly Tuning SQL fo...
 Tuning SQL for Oracle Exadata: The Good, The Bad, and The Ugly Tuning SQL fo... Tuning SQL for Oracle Exadata: The Good, The Bad, and The Ugly Tuning SQL fo...
Tuning SQL for Oracle Exadata: The Good, The Bad, and The Ugly Tuning SQL fo...
Enkitec
 
OOUG - Oracle Performance Tuning with AAS
OOUG - Oracle Performance Tuning with AASOOUG - Oracle Performance Tuning with AAS
OOUG - Oracle Performance Tuning with AAS
Kyle Hailey
 
Drilling Deep Into Exadata Performance
Drilling Deep Into Exadata PerformanceDrilling Deep Into Exadata Performance
Drilling Deep Into Exadata Performance
Enkitec
 

Was ist angesagt? (20)

Ash masters : advanced ash analytics on Oracle
Ash masters : advanced ash analytics on Oracle Ash masters : advanced ash analytics on Oracle
Ash masters : advanced ash analytics on Oracle
 
How oracle 12c flexes its muscles against oracle 11g r2 final
How oracle 12c flexes its muscles against oracle 11g r2 finalHow oracle 12c flexes its muscles against oracle 11g r2 final
How oracle 12c flexes its muscles against oracle 11g r2 final
 
DBA Commands and Concepts That Every Developer Should Know
DBA Commands and Concepts That Every Developer Should KnowDBA Commands and Concepts That Every Developer Should Know
DBA Commands and Concepts That Every Developer Should Know
 
Best practices for_large_oracle_apps_r12_implementations
Best practices for_large_oracle_apps_r12_implementationsBest practices for_large_oracle_apps_r12_implementations
Best practices for_large_oracle_apps_r12_implementations
 
Tanel Poder - Scripts and Tools short
Tanel Poder - Scripts and Tools shortTanel Poder - Scripts and Tools short
Tanel Poder - Scripts and Tools short
 
Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 1
Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 1Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 1
Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 1
 
Indexing in Exadata
Indexing in ExadataIndexing in Exadata
Indexing in Exadata
 
Create your oracle_apps_r12_lab_with_less_than_us1000
Create your oracle_apps_r12_lab_with_less_than_us1000Create your oracle_apps_r12_lab_with_less_than_us1000
Create your oracle_apps_r12_lab_with_less_than_us1000
 
Tuning SQL for Oracle Exadata: The Good, The Bad, and The Ugly Tuning SQL fo...
 Tuning SQL for Oracle Exadata: The Good, The Bad, and The Ugly Tuning SQL fo... Tuning SQL for Oracle Exadata: The Good, The Bad, and The Ugly Tuning SQL fo...
Tuning SQL for Oracle Exadata: The Good, The Bad, and The Ugly Tuning SQL fo...
 
Mini Session - Using GDB for Profiling
Mini Session - Using GDB for ProfilingMini Session - Using GDB for Profiling
Mini Session - Using GDB for Profiling
 
GLOC 2014 NEOOUG - Oracle Database 12c New Features
GLOC 2014 NEOOUG - Oracle Database 12c New FeaturesGLOC 2014 NEOOUG - Oracle Database 12c New Features
GLOC 2014 NEOOUG - Oracle Database 12c New Features
 
OOUG - Oracle Performance Tuning with AAS
OOUG - Oracle Performance Tuning with AASOOUG - Oracle Performance Tuning with AAS
OOUG - Oracle Performance Tuning with AAS
 
Tanel Poder - Performance stories from Exadata Migrations
Tanel Poder - Performance stories from Exadata MigrationsTanel Poder - Performance stories from Exadata Migrations
Tanel Poder - Performance stories from Exadata Migrations
 
Cloug Troubleshooting Oracle 11g Rac 101 Tips And Tricks
Cloug Troubleshooting Oracle 11g Rac 101 Tips And TricksCloug Troubleshooting Oracle 11g Rac 101 Tips And Tricks
Cloug Troubleshooting Oracle 11g Rac 101 Tips And Tricks
 
SQL in the Hybrid World
SQL in the Hybrid WorldSQL in the Hybrid World
SQL in the Hybrid World
 
Oracle Exadata Performance: Latest Improvements and Less Known Features
Oracle Exadata Performance: Latest Improvements and Less Known FeaturesOracle Exadata Performance: Latest Improvements and Less Known Features
Oracle Exadata Performance: Latest Improvements and Less Known Features
 
Drilling Deep Into Exadata Performance
Drilling Deep Into Exadata PerformanceDrilling Deep Into Exadata Performance
Drilling Deep Into Exadata Performance
 
In Search of Plan Stability - Part 1
In Search of Plan Stability - Part 1In Search of Plan Stability - Part 1
In Search of Plan Stability - Part 1
 
Crack the complexity of oracle applications r12 workload v2
Crack the complexity of oracle applications r12 workload v2Crack the complexity of oracle applications r12 workload v2
Crack the complexity of oracle applications r12 workload v2
 
Think Exa!
Think Exa!Think Exa!
Think Exa!
 

Andere mochten auch

Oracle 12c r1 installation on solaris 11.1
Oracle 12c r1 installation on solaris 11.1Oracle 12c r1 installation on solaris 11.1
Oracle 12c r1 installation on solaris 11.1
Laurent Leturgez
 
Oracle 12c in memory en action
Oracle 12c in memory en actionOracle 12c in memory en action
Oracle 12c in memory en action
Laurent Leturgez
 

Andere mochten auch (20)

Single instruction multiple data
Single instruction multiple dataSingle instruction multiple data
Single instruction multiple data
 
UKOUG
UKOUG UKOUG
UKOUG
 
Simd programming introduction
Simd programming introductionSimd programming introduction
Simd programming introduction
 
Hanganalyze presentation
Hanganalyze presentationHanganalyze presentation
Hanganalyze presentation
 
Oracle 12c r1 installation on solaris 11.1
Oracle 12c r1 installation on solaris 11.1Oracle 12c r1 installation on solaris 11.1
Oracle 12c r1 installation on solaris 11.1
 
Oracle 12c in memory en action
Oracle 12c in memory en actionOracle 12c in memory en action
Oracle 12c in memory en action
 
Oracle Database In-Memory Option in Action
Oracle Database In-Memory Option in ActionOracle Database In-Memory Option in Action
Oracle Database In-Memory Option in Action
 
Oracle Database Entrance Ceremony - Touchdown
Oracle Database Entrance Ceremony - TouchdownOracle Database Entrance Ceremony - Touchdown
Oracle Database Entrance Ceremony - Touchdown
 
HBase Status Report - Hadoop Summit Europe 2014
HBase Status Report - Hadoop Summit Europe 2014HBase Status Report - Hadoop Summit Europe 2014
HBase Status Report - Hadoop Summit Europe 2014
 
Modern Linux Performance Tools for Application Troubleshooting
Modern Linux Performance Tools for Application TroubleshootingModern Linux Performance Tools for Application Troubleshooting
Modern Linux Performance Tools for Application Troubleshooting
 
Data Pipelines in Hadoop - SAP Meetup in Tel Aviv
Data Pipelines in Hadoop - SAP Meetup in Tel Aviv Data Pipelines in Hadoop - SAP Meetup in Tel Aviv
Data Pipelines in Hadoop - SAP Meetup in Tel Aviv
 
Non-Stop Hadoop for Hortonworks
Non-Stop Hadoop for Hortonworks Non-Stop Hadoop for Hortonworks
Non-Stop Hadoop for Hortonworks
 
SQL-on-Hadoop Tutorial
SQL-on-Hadoop TutorialSQL-on-Hadoop Tutorial
SQL-on-Hadoop Tutorial
 
Connecting Hadoop and Oracle
Connecting Hadoop and OracleConnecting Hadoop and Oracle
Connecting Hadoop and Oracle
 
Parallel Processors (SIMD)
Parallel Processors (SIMD) Parallel Processors (SIMD)
Parallel Processors (SIMD)
 
SQL Monitoring in Oracle Database 12c
SQL Monitoring in Oracle Database 12cSQL Monitoring in Oracle Database 12c
SQL Monitoring in Oracle Database 12c
 
Large-scale social media analysis with Hadoop
Large-scale social media analysis with HadoopLarge-scale social media analysis with Hadoop
Large-scale social media analysis with Hadoop
 
SimD
SimDSimD
SimD
 
Hadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryHadoop Backup and Disaster Recovery
Hadoop Backup and Disaster Recovery
 
Hadoop introduction , Why and What is Hadoop ?
Hadoop introduction , Why and What is  Hadoop ?Hadoop introduction , Why and What is  Hadoop ?
Hadoop introduction , Why and What is Hadoop ?
 

Ähnlich wie Ukoug15 SIMD outside and inside Oracle 12c (12.1.0.2)

SIMD inside and outside oracle 12c
SIMD inside and outside oracle 12cSIMD inside and outside oracle 12c
SIMD inside and outside oracle 12c
Laurent Leturgez
 
My seminar new 28
My seminar new 28My seminar new 28
My seminar new 28
rajeshkvdn
 
“Programming Vision Pipelines on AMD’s AI Engines,” a Presentation from AMD
“Programming Vision Pipelines on AMD’s AI Engines,” a Presentation from AMD“Programming Vision Pipelines on AMD’s AI Engines,” a Presentation from AMD
“Programming Vision Pipelines on AMD’s AI Engines,” a Presentation from AMD
Edge AI and Vision Alliance
 
20081114 Friday Food iLabt Bart Joris
20081114 Friday Food iLabt Bart Joris20081114 Friday Food iLabt Bart Joris
20081114 Friday Food iLabt Bart Joris
imec.archive
 
HKG15-300: Art's Quick Compiler: An unofficial overview
HKG15-300: Art's Quick Compiler: An unofficial overviewHKG15-300: Art's Quick Compiler: An unofficial overview
HKG15-300: Art's Quick Compiler: An unofficial overview
Linaro
 

Ähnlich wie Ukoug15 SIMD outside and inside Oracle 12c (12.1.0.2) (20)

SIMD inside and outside Oracle 12c In Memory
SIMD inside and outside Oracle 12c In MemorySIMD inside and outside Oracle 12c In Memory
SIMD inside and outside Oracle 12c In Memory
 
SIMD inside and outside oracle 12c
SIMD inside and outside oracle 12cSIMD inside and outside oracle 12c
SIMD inside and outside oracle 12c
 
Joel Falcou, Boost.SIMD
Joel Falcou, Boost.SIMDJoel Falcou, Boost.SIMD
Joel Falcou, Boost.SIMD
 
Introduction to Blackfin BF532 DSP
Introduction to Blackfin BF532 DSPIntroduction to Blackfin BF532 DSP
Introduction to Blackfin BF532 DSP
 
Something about SSE and beyond
Something about SSE and beyondSomething about SSE and beyond
Something about SSE and beyond
 
Crypto Performance on ARM Cortex-M Processors
Crypto Performance on ARM Cortex-M ProcessorsCrypto Performance on ARM Cortex-M Processors
Crypto Performance on ARM Cortex-M Processors
 
My seminar new 28
My seminar new 28My seminar new 28
My seminar new 28
 
Andes RISC-V processor solutions
Andes RISC-V processor solutionsAndes RISC-V processor solutions
Andes RISC-V processor solutions
 
16-bit Microprocessor Design (2005)
16-bit Microprocessor Design (2005)16-bit Microprocessor Design (2005)
16-bit Microprocessor Design (2005)
 
“Programming Vision Pipelines on AMD’s AI Engines,” a Presentation from AMD
“Programming Vision Pipelines on AMD’s AI Engines,” a Presentation from AMD“Programming Vision Pipelines on AMD’s AI Engines,” a Presentation from AMD
“Programming Vision Pipelines on AMD’s AI Engines,” a Presentation from AMD
 
DSP Processor.pptx
DSP Processor.pptxDSP Processor.pptx
DSP Processor.pptx
 
Introduction to FPGA, VHDL
Introduction to FPGA, VHDL  Introduction to FPGA, VHDL
Introduction to FPGA, VHDL
 
Advance Microcontroller AVR
Advance Microcontroller AVRAdvance Microcontroller AVR
Advance Microcontroller AVR
 
Moving NEON to 64 bits
Moving NEON to 64 bitsMoving NEON to 64 bits
Moving NEON to 64 bits
 
20081114 Friday Food iLabt Bart Joris
20081114 Friday Food iLabt Bart Joris20081114 Friday Food iLabt Bart Joris
20081114 Friday Food iLabt Bart Joris
 
HKG15-300: Art's Quick Compiler: An unofficial overview
HKG15-300: Art's Quick Compiler: An unofficial overviewHKG15-300: Art's Quick Compiler: An unofficial overview
HKG15-300: Art's Quick Compiler: An unofficial overview
 
Challenges in GPU compilers
Challenges in GPU compilersChallenges in GPU compilers
Challenges in GPU compilers
 
Introduction2_PIC.ppt
Introduction2_PIC.pptIntroduction2_PIC.ppt
Introduction2_PIC.ppt
 
Arm architecture overview
Arm architecture overviewArm architecture overview
Arm architecture overview
 
Training report on embedded sys_AVR
Training report on embedded sys_AVRTraining report on embedded sys_AVR
Training report on embedded sys_AVR
 

Kürzlich hochgeladen

CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
anilsa9823
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
mohitmore19
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
Health
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
anilsa9823
 

Kürzlich hochgeladen (20)

Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 

Ukoug15 SIMD outside and inside Oracle 12c (12.1.0.2)

  • 1. SIMD Instructions outside and inside Oracle 12c Laurent Léturgez – 2015
  • 2. ABOUT ME  Oracle Consultant since 2001  Former developer (C, Java, perl, PL/SQL)  Blogger since 2004  http://laurent.leturgez.free.fr (In french and discontinued)  http://laurent-leturgez.com  Twitter : @lleturgez  OCM 11g
  • 3. Agenda
      - SIMD instructions, outside Oracle 12c
          - What is a SIMD instruction?
          - Will my application use SIMD?
          - Raw performance
      - SIMD instructions, inside Oracle 12c
          - How SIMD instructions are used inside Oracle 12c
          - Tracing SIMD in Oracle 12c
  • 4. Caveats
      - Most of the topics come from
          - My own research
          - My past life as a developer
      - Some of the topics are about internals, so:
          - Analysis and conclusions may be incomplete
          - Future versions of Oracle may change the features
      - Tests were done with Oracle 12.1.0.2, Oracle Enterprise Linux 7.1, VMware Fusion 7 (and VirtualBox)
  • 5. Before we start …
      - Some fundamentals (from Dennis Yurichev's book)
          - CPU register: […] The easiest way to understand a register is to think of it as an untyped temporary variable. Imagine if you were working with a high-level PL [programming language] and could only use eight 32-bit (or 64-bit) variables. Yet a lot can be done using just these!
          - Instruction: A primitive CPU command. The simplest examples include: moving data between registers, working with memory and arithmetic primitives. As a rule, each CPU has its own instruction set architecture (ISA).
          - Assembly language: Mnemonic code and some extensions like macros which are intended to make a programmer's life easier.
      - http://beginners.re/Reverse_Engineering_for_Beginners-en.pdf
  • 7. SIMD instructions … outside Oracle 12c
      - SIMD stands for Single Instruction Multiple Data
          - Processes multiple data items
          - In one CPU instruction
      - Based on
          - Specific registers
          - Specific CPU instructions and sets of instructions
      - Not Oracle specific
      - CPU architecture specific
          - Intel
          - IBM (AltiVec)
          - SPARC (VIS)
      - This presentation is mainly about the Intel architecture
  • 8. SIMD instructions … outside Oracle 12c
      - What is a SIMD register?
          - It's a CPU register
          - Wider than traditional registers (RDI, RSI, R8, R9 etc.)
          - From 128 up to 512 bits wide
          - Holds several data elements at once
  • 9. SIMD instructions … outside Oracle 12c
      - Scalar operation
          - An array of 4 integers {1,2,3,4}
          - Add 1 to each value
      - [Diagram: each value is loaded from RAM into a CPU register, incremented, and saved back to RAM, one element at a time: 4 LOAD, 4 ADD, 4 SAVE]
  • 10. SIMD instructions … outside Oracle 12c
      - SIMD operation
          - An array of 4 integers {1,2,3,4}
          - Add 1 to each value
      - [Diagram: the four values {1,2,3,4} are loaded from RAM into one SIMD register, added to a SIMD register holding {1,1,1,1}, and the result {2,3,4,5} is saved back to RAM: one LOAD, one ADD, one SAVE]
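The LOAD / ADD / SAVE sequence of the SIMD slide can be sketched in C with SSE2 intrinsics. This is only an illustration, not the author's sample code: the function name `add_one` is hypothetical, and a scalar fallback is included for compilers that do not target SSE2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics, 128-bit XMM registers */
#endif

/* Adds 1 to every element of v. n must be a multiple of 4. */
void add_one(int32_t *v, size_t n)
{
    assert(n % 4 == 0);
#if defined(__SSE2__)
    const __m128i ones = _mm_set1_epi32(1);  /* register holding {1,1,1,1} */
    for (size_t i = 0; i < n; i += 4) {
        __m128i x = _mm_loadu_si128((const __m128i *)(v + i)); /* LOAD 4 values */
        x = _mm_add_epi32(x, ones);                            /* ADD 4 lanes at once */
        _mm_storeu_si128((__m128i *)(v + i), x);               /* SAVE 4 values */
    }
#else
    /* Scalar fallback: one element per iteration. */
    for (size_t i = 0; i < n; i++)
        v[i] += 1;
#endif
}
```

With the array {1,2,3,4} from the slide, a single loop iteration handles all four values, where the scalar version needs four.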
  • 11. SIMD instructions … outside Oracle 12c
      - MMX: 64-bit registers, 8 registers (MM0 to MM7), Pentium II
      - SSE: 128-bit registers, 8 registers (XMM0 to XMM7), Pentium III. Only four 32-bit single-precision floating point numbers
      - SSE2/SSE3/SSSE3/SSE4: 128-bit registers, 16 registers (XMM0 to XMM15), Pentium IV to Nehalem. Usage expansion (two 64-bit double-precision numbers, four 32-bit integers, and up to sixteen 8-bit bytes)
      - AVX/AVX2: 256-bit registers, 16 registers (YMM0 to YMM15), Sandy Bridge to Haswell. Three-operand instructions (non-destructive): A+B=C rather than A=A+B. Alignment requirements relaxed
      - AVX3 or AVX512: 512-bit registers, 32 registers (ZMM0 to ZMM31), Skylake
  • 12. SIMD instructions … outside Oracle 12c
      - Intel API (C/C++): Intel Intrinsics Guide https://software.intel.com/sites/landingpage/IntrinsicsGuide/
      - Sample code: https://app.box.com/simdSampleC-2015
  • 14. Will my application use SIMD registers and instructions?
      - It depends on:
      - Hardware
          - Consult the processor datasheet to see which instruction set extensions are supported (if any)
          - http://ark.intel.com/#@Processors
      - Hypervisor
          - Some (old) hypervisors do not support modern extensions
          - VirtualBox versions < 5.0 don't support SSE4, AVX and AVX2
          - Hyper-V on W2008R2-SP1 needs a patch for specific processors to support AVX
  • 15. Will my application use SIMD registers and instructions?
      - It depends on the operating system. AVX (256 bits) is supported from:
      - Linux kernel >= 2.6.30 (AVX needs the xsave kernel parameter)
          - Red Hat EL5: 2.6.18
          - Oracle EL5 w/UEK: 2.6.32
      - Solaris 10 update 10 and Solaris 11
      - Windows 2008 R2 SP1
  • 16. Will my application use SIMD registers and instructions?
      - It depends on the compiler
      - GCC
          - > 4.6 for AVX support
          - Use of specific switches (-msse2, -msse4.1, -msse4.2, -mavx, -mavx2 …)
      - Intel C/C++ Compiler (ICC)
          - > 11.1 for AVX support and > 13.0 for AVX2 support
          - Use of specific switches (-xsse4.2, -xavx, -xcore-avx2 …)
      - Beware of optimization switches (-O1, -O2, -O3)
      - More … disassemble (if you are allowed to)
          - Registers
          - Assembler instructions
  • 18. Raw performance
      - Based on a C program
      - CPU used: Haswell microarchitecture (Core i7-4960HQ), AVX/AVX2 enabled
      - 3 tests: no SIMD, SSE4, AVX
      - Input: one array containing 1 million values
      - Goal: add 1 to each value, with the pass over the million values repeated 4k, 8k, 16k and 32k times
      - CPU time (s) = f(#rows)
      - "Quick and dirty" sample code available here: https://app.box.com/s/ibmnbblpho4xtbeq2x8ir60nrk37208v
  • 19. Raw performance
      - [Chart: raw performance (CPU) for SIMD instructions, CPU time in seconds by number of rows processed]
      - No SIMD: 10.35 s (4096 M rows), 20.46 s (8192 M rows), 42.35 s (16384 M rows), 85.64 s (32768 M rows)
      - SSE4 (XMM registers): 3.3 s, 6.81 s, 13.73 s, 25.58 s
      - AVX (YMM registers): 1.96 s, 3.51 s, 7.23 s, 15.15 s
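To read the chart figures as speed-ups, the CPU time of the non-SIMD run can be divided by the CPU time of the SIMD run. The helper below is a hypothetical illustration of that arithmetic and is not part of the author's benchmark.

```c
#include <assert.h>

/* Speed-up of an optimized run relative to a baseline, both in CPU seconds. */
double speedup(double baseline_s, double optimized_s)
{
    assert(optimized_s > 0.0);
    return baseline_s / optimized_s;
}
```

For the 32768 M rows run, `speedup(85.64, 25.58)` is roughly 3.3 for SSE4 and `speedup(85.64, 15.15)` is roughly 5.7 for AVX.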
  • 21. SIMD instructions … inside Oracle 12c
      - In-Memory data structure
          - In-Memory Compression Unit: IMCU
          - The IMCU is the unit of column store allocation
          - Target size is 1M rows (controlled by _inmemory_imcu_target_rows)
          - One IMCU can contain more than one column
          - Each column in one IMCU is a column unit (CU)
  • 22. SIMD instructions … inside Oracle 12c
      - In-Memory column store storage indexes
          - For each column unit, min and max values are maintained in a storage index
          - Storage indexes provide CU pruning
          - Information about CUs is available in GV$IM_COL_CU (undocumented, see Bug ID 19361690)
      - [Figure: IMCU pruning]
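The min/max pruning idea can be sketched as follows. This is a simplified model, not Oracle's implementation: the `cu_minmax` structure and both function names are hypothetical, and only an equality predicate is handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-column-unit summary: smallest and largest stored value. */
struct cu_minmax {
    int64_t min;
    int64_t max;
};

/* Returns 1 if the column unit may contain `value`, 0 if it can be pruned. */
int cu_may_contain(const struct cu_minmax *cu, int64_t value)
{
    assert(cu->min <= cu->max);
    return value >= cu->min && value <= cu->max;
}

/* Counts the column units that still have to be scanned for `col = value`. */
size_t cus_to_scan(const struct cu_minmax *cus, size_t n, int64_t value)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++)
        kept += (size_t)cu_may_contain(&cus[i], value);
    return kept;
}
```

With non-overlapping ranges such as {1..10}, {11..20}, {21..30}, a lookup for 14 keeps one unit out of three. If every unit spans 1..30, nothing is pruned, which is why the order in which data is loaded matters.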
  • 23. SIMD instructions … inside Oracle 12c
      - The way your data is sorted matters for best IMCU pruning
  • 24. SIMD instructions … inside Oracle 12c
      - SIMD extensions are used together with In-Memory storage indexes for efficient filtering
          1. IM storage indexes do IMCU pruning
          2. SIMD instructions apply filter predicates efficiently
      - [Figure: IMCU pruning, then filtering with SIMD on a Prod-id column holding the values 10, 10, 14, 14, 10]
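The filtering step can be illustrated with an SSE2 equality comparison that tests four column values per instruction. This is a sketch under assumptions, not Oracle's kdzk code: the function name `count_equal` is hypothetical, values are plain 32-bit integers, and a scalar fallback is provided when SSE2 is not available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics */
#endif

/* Counts rows where col[i] == key. n must be a multiple of 4. */
size_t count_equal(const int32_t *col, size_t n, int32_t key)
{
    assert(n % 4 == 0);
    size_t hits = 0;
#if defined(__SSE2__)
    const __m128i k = _mm_set1_epi32(key);   /* key broadcast to 4 lanes */
    for (size_t i = 0; i < n; i += 4) {
        __m128i x  = _mm_loadu_si128((const __m128i *)(col + i));
        __m128i eq = _mm_cmpeq_epi32(x, k);  /* all-ones lane where equal */
        /* One bit per 32-bit lane, taken from the top bit of each lane. */
        int mask = _mm_movemask_ps(_mm_castsi128_ps(eq));
        for (; mask; mask &= mask - 1)       /* count the set bits */
            hits++;
    }
#else
    /* Scalar fallback: one comparison per row. */
    for (size_t i = 0; i < n; i++)
        hits += (size_t)(col[i] == key);
#endif
    return hits;
}
```

A real column scan would typically return the matching row positions rather than a count. The count keeps the example short.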
  • 25. SIMD instructions … inside Oracle 12c
      - Oracle 12c uses specific libraries for SIMD (and compression)
      - Located in $ORACLE_HOME/lib
          - libshpksse4212.so for SSE4.2 extensions. Compiled with ICC v12 with the specific -xsse4.2 switch
          - libshpkavx12.so for AVX extensions. Compiled with ICC v12 with the specific -xavx switch
          - libshpkavx212.so for AVX2 extensions. Not yet fully implemented (only 8 functions implemented). No ICC AVX2 switch used because ICC v12 doesn't support AVX2
      - Thanks to Tanel Pöder
  • 26. SIMD instructions … inside Oracle 12c
      - Oracle SIMD-related functions
          - Located in the kdzk kernel module (HPK)
          - Part of the Advanced Compression library (ADVCMP)
          - Easily tracked with SystemTap
  • 27. SIMD instructions … inside Oracle 12c
      - How does Oracle use SIMD extensions? It depends on many parameters
      - OS level: /proc/cpuinfo
          - AVX and AVX2 support
          - SSE4 support only
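The /proc/cpuinfo check can be sketched as a whole-word search in the text of the "flags" line. The helper name `has_cpu_flag` is hypothetical. The flag names used in the usage note (`sse4_2`, `avx`, `avx2`) are the ones the Linux kernel reports.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if `flag` appears as a whole space-separated word in `flags`,
 * for example the text of a "flags" line read from /proc/cpuinfo. */
int has_cpu_flag(const char *flags, const char *flag)
{
    assert(flags != NULL && flag != NULL);
    size_t len = strlen(flag);
    if (len == 0)
        return 0;
    for (const char *p = strstr(flags, flag); p != NULL; p = strstr(p + 1, flag)) {
        int starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        int ends   = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';
        if (starts && ends)
            return 1;  /* whole word, so "avx" does not match "avx2" */
    }
    return 0;
}
```

A caller would read /proc/cpuinfo line by line, pick a line starting with "flags", and test it for `sse4_2`, `avx` and `avx2`. Whole-word matching matters because `avx` is a prefix of `avx2`.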
  • 28. SIMD instructions … inside Oracle 12c
      - Which library am I using?
      - pmap
          - AVX support
          - SSE4 support
  • 29. SIMD instructions … inside Oracle 12c
      - Which compiler options have been used?
          - Read the "comment" section of the ELF file
          - Read the corresponding compiler documentation
      - Command:
          [oracle@oel7 conf]$ readelf -p .comment $ORACLE_HOME/lib/libshpkavx12.so | egrep -i 'intel|gcc' | egrep 'xavx|mavx'
      - Output:
          [ 2c] -?comment:Intel(R) C Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 12.0 Build 20120731 …/… -DNTEV_USE_EPOLL -DNET_USE_LDAP -xavx
  • 30. SIMD instructions … inside Oracle 12c
      - How are SIMD registers used by Oracle?
      - GDB
          - To get the call stack (backtrace)
          - To set breakpoints on interesting functions
          - To view register contents (traditional and SIMD)
          - "info registers" for traditional registers
          - "info all-registers" for all registers (SIMD registers included)
          - (gdb) print $ymmX.<format>, where the format can be v8_float, v4_double, v32_int8, v16_int16, v8_int32, v4_int64, or v2_int128
  • 31. SIMD instructions … inside Oracle 12c
      - In red, register content has been modified
      - In blue, the second part of the SIMD registers (128 bits) is empty
  • 32. SIMD instructions … inside Oracle 12c
      - Oracle IM can use AVX or SSE4 extensions for SIMD operations
      - When AVX is used, it uses only 128 bits of the 256-bit wide registers
          - AVX adds new register state through the 256-bit wide YMM register file
          - Explicit operating system support is required to properly save and restore AVX's expanded registers between context switches
          - Without this, only AVX 128-bit is supported
  • 33. SIMD instructions … inside Oracle 12c: the culprit
      - Oracle 12.1.0.2 is supported from EL5 onwards
      - The EL5 Red Hat kernel is 2.6.18, and this flag (xsave) is supported from 2.6.30 kernels
      - For compatibility reasons, Oracle has to compile its code on 2.6.18 kernels
  • 35. Tracing SIMD in Oracle 12c
      - Oradebug has 2 components related to IM
  • 36. Tracing SIMD in Oracle 12c
      - Interesting components to trace for SIMD and/or IMCU pruning are:
          - IM_optimizer: gives information about CBO calculations related to IM
          - ADVCMP_DECOMP.*
              - ADVCMP_DECOMP_HPK: SIMD functions
              - ADVCMP_DECOMP_PCODE: Portable Code Machine (usually comparison functions and results)
  • 37. Tracing SIMD in Oracle 12c
      - IM_optimizer
          - Information available in the trace file
              - IMCU pruning ratio
              - CU decompression costing (per IMCU)
              - Predicate evaluation costing (per row)
          - The statement has to be parsed to get results
  • 38. Tracing SIMD in Oracle 12c
      - Traced statement:
          select prod_id, cust_id, time_id from laurent.s_capa_high where amount_sold = 20;
  • 39. Tracing SIMD in Oracle 12c
      - This information is available in the CBO trace file (10053 or SQL_costing event)
  • 40. Tracing SIMD in Oracle 12c
      - ADVCMP_DECOMP
          - ADVCMP_DECOMP_HPK
          - Information is available in the trace file (for each IMCU processed)
              - Library and function used
              - Number of rows and counting algorithm
              - Processing rate (comparison and decompression if relevant)
              - But nothing on the results of the processing
  • 41. Tracing SIMD in Oracle 12c
      - ADVCMP_DECOMP
          - ADVCMP_DECOMP_HPK
          - Gives information about SIMD function usage and filtering (after IMCU pruning)
          - Example: in-memory table with NO MEMCOMPRESS or DML compression
  • 42. Tracing SIMD in Oracle 12c
      - ADVCMP_DECOMP
          - ADVCMP_DECOMP_HPK
          - Example: in-memory compressed table
          - SIMD instructions are used only in the kdzk_eq_dict functions
  • 43. Tracing SIMD in Oracle 12c
      - My thoughts about compression/decompression
      - NO MEMCOMPRESS / COMPRESS FOR DML
          - kdzk*dynp* functions (ex: kdzk_eq_dynp_16bit, kdzk_le_dynp_32bit etc.)
      - FOR QUERY LOW / QUERY HIGH
          - Dictionary encoding (LZW?): kdzk_*dict* functions (ex: kdzk_eq_dict_7bit, kdzk_le_dict_4bit etc.)
          - Run-length encoding: kdzk_burst_rle* functions (ex: kdzk_burst_rle_8bit, kdzk_burst_rle_16bit …)
          - Bit-packing compression: kdzk*fixed* functions (ex: kdzk_ge_lt_fixed_32bit, kdzk_lt_fixed_8bit …)
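To make the dictionary encoding idea concrete, the sketch below evaluates an equality predicate on small integer codes instead of decoded values. It is a generic illustration and makes no claim about how the kdzk_*dict* functions work internally: all names are hypothetical, and the codes are stored in a full byte rather than being bit-packed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Looks up `value` in a dictionary of distinct values.
 * Returns its code (its index), or -1 if the value is absent. */
int dict_code(const int32_t *dict, size_t dict_len, int32_t value)
{
    for (size_t c = 0; c < dict_len; c++)
        if (dict[c] == value)
            return (int)c;
    return -1;
}

/* Counts rows equal to `value` by comparing 8-bit codes, not decoded values. */
size_t count_eq_dict(const int32_t *dict, size_t dict_len,
                     const uint8_t *codes, size_t n, int32_t value)
{
    assert(dict_len <= 256);   /* every code must fit in 8 bits */
    int code = dict_code(dict, dict_len, value);
    if (code < 0)
        return 0;              /* value not in dictionary: no row can match */
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        hits += (size_t)(codes[i] == (uint8_t)code);
    return hits;
}
```

Because the codes are narrower than the original values, more of them fit into one SIMD register, which is one reason dictionary encoding and SIMD filtering combine well.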
  • 44. Tracing SIMD in Oracle 12c
      - My thoughts about compression/decompression
      - FOR CAPACITY LOW
          - FOR QUERY LOW + additional proprietary compression (OZIP)
          - Functions: ozip_decode_dict*, kdzk_ozip_decode* (ex: kdzk_ozip_decode_dydi, ozip_decode_dict_9_bit etc.)
      - FOR CAPACITY HIGH
          - FOR QUERY HIGH + heavyweight compression algorithm
      - The compression/decompression method depends on:
          - Datatype
          - Column compression unit size
          - Column contents

Editor's notes

  1. 12 instructions
  2. 3 instructions
  3. AVX adds new register state through the 256-bit wide YMM register file, so explicit operating system support is required to properly save and restore AVX's expanded registers between context switches; without this, only AVX 128-bit is supported.
  4. ICC switches
       - Optimization: https://software.intel.com/en-us/articles/step-by-step-optimizing-with-intel-c-compiler
       - SIMD (very interesting): https://software.intel.com/en-us/articles/performance-tools-for-software-developers-intel-compiler-options-for-sse-generation-and-processor-specific-optimizations
     GCC switches
       - Optimization: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
       - SIMD: https://gcc.gnu.org/onlinedocs/gcc-4.9.3/gcc/i386-and-x86-64-Options.html#i386-and-x86-64-Options
  5. Actual size depends on the row size and the compression factor. Updated by a background process. Triggered by IMCO. W00x: processes that populate the IM column store. Contains a list of rowids.
  6. Depends on how data is sorted inside the extents, because loading data into an IMCU reads table extents sequentially.
  7. More than 1400 functions are implemented in the AVX and SSE4.2 libraries. -xavx (unlike -mavx) enables specific optimizations.
  8. HPK : High Performance Compression ?
  9. /proc/cpuinfo gives information that depends on the hardware, the kernel and its options, and the hypervisor (if any). On other operating systems, use tools that call the CPUID function and read the EAX, EBX, ECX and EDX registers.
  10. ELF : Executable and Linking Format
  11. Decompression costing: columns used in filter predicates + columns in the SELECT list. Predicate cost evaluation: /!\ cumulative values.
  12. Costs generated by columns in the SELECT clause are not reported in the 10053 event trace file; only those of columns in the filter predicate are.
  13. DML compression: dictionary compression?
  14. SIMD extensions are user