The document summarizes performance testing of database virtualization using Delphix. It describes:
1) Benchmarking OLTP and DSS workloads on original vs virtualized databases, finding similar performance.
2) Testing 2 concurrent original databases vs 2 virtualized databases sharing blocks, again with similar results.
3) Tools for monitoring database, storage, and network performance, including scripts for Oracle I/O profiling (oramon.sh) and for benchmarking disk and network throughput (fio.sh and netio).
3. Problem
[Diagram: the production database feeds a first copy, which in turn is cloned for reports, QA and UAT, and developers]
• CERN - European Organization for Nuclear Research
  • 145 TB database
  • 75 TB growth each year
  • Dozens of developers want copies
9. III. Allocate on Write a) NetApp
[Diagram: production database LUNs on a NetApp Filer are copied by SnapMirror to a second NetApp Filer; Snapshot Manager for Oracle takes snapshots there, and FlexClone presents file-system-level clones (Clone 1 through Clone 4) to Targets A, B, and C]
10. III. Allocate on Write b) ZFS
[Diagram: an RMAN copy of the physical production database is written to an NFS mount on a ZFS Storage Appliance; snapshots of that copy are cloned and presented to Target A over NFS]
Oracle ZFS Appliance + RMAN
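For illustration, the same flow can be sketched with stock RMAN and ZFS commands. The pool, dataset, and path names below are hypothetical, and the ZFS Storage Appliance automates these steps through its own interface (see the Oracle cloning paper linked at the end of this deck); this is only a minimal sketch of the snapshot/clone mechanism.

rman target /
RMAN> backup as copy database format '/net/zfssa/export/rman_copy/%U';   # image copy onto the appliance's NFS share

zfs snapshot pool0/rman_copy@baseline             # snapshot the RMAN copy
zfs clone pool0/rman_copy@baseline pool0/clone1   # allocate-on-write clone for Target A
zfs set sharenfs=on pool0/clone1                  # present the clone over NFS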
11. Review: Part I
1. Full Cloning
2. Thin Provision
I. clonedb
II. Copy on Write
III. Allocate on Write
a) NetApp (also EMC VNX)
b) ZFS
c) DxFS
3. Database Virtualization
SMU
Delphix
13. Virtualization Layer
• The SMU (Delphix) runs on x86 hardware and allocates on write
• Storage underneath can be any type
• It could even be NetApp, but NetApp is not automated and, AFAIK, NetApp doesn't share blocks in memory
14. One time backup of source database
[Diagram: Delphix takes a one-time backup of the production instance's database and file system through RMAN APIs]
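For reference, the hand-rolled equivalent of this initial link is an RMAN level 0 image copy; the destination below is a hypothetical NFS mount exported by the Delphix appliance, and Delphix drives this through the RMAN APIs rather than a script like this.

rman target /
RMAN> backup as copy incremental level 0 database
      format '/mnt/dsource/%U' tag 'initial_link';    # one-time full image copy of the source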
16. Incremental forever change collection
[Diagram: production instance, database, and file system]
• Changes are collected automatically, forever
• Data older than the retention window is freed
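Delphix automates this collection, but the closest stock-RMAN analogue is the incrementally updated backup pattern below (tag and path reuse the hypothetical names from the sketch above): each run ships only changed blocks and then rolls them into the standing copy.

RMAN> backup incremental level 1 for recover of copy with tag 'initial_link'
      database format '/mnt/dsource/%U';              # ship only changed blocks
RMAN> recover copy of database with tag 'initial_link';  # merge them into the standing copy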
17. Typical Architecture
[Diagram: Production, Development, QA, and UAT each run their own instance with a full copy of the database on their own file system]
18. Clones share duplicate blocks
[Diagram: the production instance keeps the source database on its own file system over Fibre Channel; the Development, QA, and UAT instances mount clone copies (vDatabases) over NFS, and those clones share duplicate blocks]
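The target instances simply see a vDatabase as datafiles on an NFS mount. For reference only (the options below are the commonly recommended Oracle-over-NFS mount options, not taken from this deck, and the hostname and paths are hypothetical), a Linux target mount looks something like:

mount -t nfs -o rw,bg,hard,nointr,tcp,vers=3,timeo=600,rsize=32768,wsize=32768,actimeo=0 \
      delphix-vdb:/vdb1/datafiles /u02/oradata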
29. IBM 3690, 256 GB RAM, VMware ESX 5.1
• Delphix VM: 192 GB RAM, 4 vCPU
• Linux source VM: 20 GB RAM, 4 vCPU
1. Link to the source database over the RMAN API (the copy is compressed by 1/3 on average)
30. IBM 3690, 256 GB RAM, VMware ESX 5.1
• Delphix VM: 192 GB RAM, 4 vCPU
• Linux source VM (original database): 20 GB RAM, 4 vCPU
• Linux target VM: 20 GB RAM, 4 vCPU
1. Provision a "virtual database" on the target Linux machine
31. Benchmark setup ready
• IBM 3690, 256 GB RAM, VMware ESX 5.1
• Delphix VM: 192 GB RAM, 4 vCPU
• Linux source VM: 20 GB RAM, 4 vCPU (run the "physical" benchmark against the source database)
• Linux target VM: 20 GB RAM, 4 vCPU (run the "virtual" benchmark against the target virtual database)
32. charbench
 -cs 172.16.101.237:1521:ibm1  # machine:port:SID
 -dt thin                      # driver
 -u soe                        # username
 -p soe                        # password
 -uc 100                       # user count
 -min 10                       # min think time (ms)
 -max 200                      # max think time (ms)
 -rt 0:1                       # run time (hh:mm)
 -a                            # run automatically
 -v users,tpm,tps              # statistics to collect
http://dominicgiles.com/commandline.html
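Assembled on one line, the run used here looks like this (values taken from the slide above):

./charbench -cs 172.16.101.237:1521:ibm1 -dt thin -u soe -p soe -uc 100 -min 10 -max 200 -rt 0:1 -a -v users,tpm,tps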
35. OLTP physical vs virtual, warm cache
[Chart: Transactions Per Minute (TPM) vs. number of Users for the physical and virtual databases]
36. Part Two: 2 physical vs 2 virtual
• IBM 3690, 256 GB RAM, VMware ESX 5.1; Delphix VM 192 GB RAM
• 2 Linux source VMs (20 GB each) and 2 Linux target VMs (20 GB each)
• 2 source databases
• 2 virtual databases that share the same common blocks
40. Problems
• swingbench connections time out
  rm /dev/random
  ln -s /dev/urandom /dev/random
• couldn't connect via listener
  service iptables stop
  chkconfig iptables off
  iptables -F
  service iptables save
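A less invasive alternative to replacing /dev/random (not from this deck, and assuming you can edit the java invocation inside the swingbench launcher script) is to point the JVM's SecureRandom at urandom instead:

java -Djava.security.egd=file:/dev/./urandom ...   # flag appended to the existing charbench/swingbench java command line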
55. Wireshark: analyze TCP dumps
• yum install wireshark
• wireshark + perl
  - find common NFS requests seen by the NFS client and the NFS server
  - display times for the NFS client, the NFS server, and the delta between them
https://github.com/khailey/tcpdump/blob/master/parsetcp.pl
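To produce the capture files parsed on the next slides, a trace can be taken on each end while the benchmark runs (NFS traffic is on TCP port 2049). The interface name is a placeholder, and the exact parsetcp.pl invocation should be checked against the script at the link above:

tcpdump -i eth0 -s 0 -w client.cap port 2049   # on the Linux NFS client
snoop -o nfs_server.cap port 2049              # on a Solaris-based NFS server (use tcpdump on Linux)
perl parsetcp.pl nfs_server.cap                # then parse each capture (invocation assumed)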
56. Parsing nfs server trace: nfs_server.cap
type avg ms count
READ : 44.60, 7731
Parsing client trace: client.cap
type avg ms count
READ : 46.54, 15282
==================== MATCHED DATA ============
READ
type avg ms
server : 48.39,
client : 49.42,
diff : 1.03,
Processed 9647 packets (Matched: 5624 Missed: 4023)
57. Parsing NFS server trace: nfs_server.cap
type avg ms count
READ : 1.17, 9042
Parsing client trace: client.cap
type avg ms count
READ : 1.49, 21984
==================== MATCHED DATA ============
READ
type avg ms count
server : 1.03
client : 1.49
diff : 0.46
58. Oracle on Oracle latency data

Layer              Linux     Solaris source   Tool          What it measures
Oracle             58 ms     47 ms            oramon.sh     "db file sequential read" wait (basically a timing of "pread" for 8k random reads)
NFS client (TCP)   1.5 ms    45 ms            tcpparse.sh   TCP trace: tcpdump on Linux, snoop on Solaris
Network            0.5 ms    1 ms             delta         client TCP time minus server TCP time
NFS server (TCP)   1 ms      44 ms            tcpparse.sh   TCP trace (snoop)
NFS server         0.1 ms    2 ms             DTrace        nfs:::op-read-start / op-read-done
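The DTrace timing in the last row can be reproduced on the Solaris/illumos NFS server with a one-liner along these lines. This is a minimal sketch: the slide names the nfs:::op-read-start/op-read-done probes, stock Solaris exposes them through the nfsv3/nfsv4 providers, and on a busy server the requests may need to be keyed by transaction id rather than by thread.

dtrace -n '
nfsv3:::op-read-start { self->ts = timestamp; }          /* read request arrives */
nfsv3:::op-read-done /self->ts/ {
    @["NFS server read latency (ns)"] = quantize(timestamp - self->ts);
    self->ts = 0;                                        /* histogram prints on Ctrl-C */
}'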
59. Issues: LINUX rpc queue
On Linux, in /etc/sysctl.conf set
  sunrpc.tcp_slot_table_entries = 128
then run
  sysctl -p
then check the setting with
  sysctl -A | grep sunrpc
NFS partitions will have to be unmounted and remounted.
Not persistent across reboot.
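A common way to make the slot-table setting survive a reboot (not shown in the deck; the file name below is just a convention) is to set it as a module option so it is applied whenever the sunrpc module loads:

echo "options sunrpc tcp_slot_table_entries=128" > /etc/modprobe.d/sunrpc.conf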
60. Issues: Solaris NFS Server threads
sharectl get -p servers nfs      # show the current maximum number of NFS server threads
sharectl set -p servers=512 nfs  # raise the maximum to 512
svcadm refresh nfs/server        # make the NFS server service pick up the change
66. Memory Location vs Price vs Perf

Location        Memory     Price     Speed      Notes
Host            1000 GB    $32K      < 1 us     offloads the SAN
Virtual layer   200 GB     $6K       < 500 us   offloads the SAN; shared disk; fast clone
SAN             1000 GB    $1000K    < 100 us

72% of all Delphix customers are on databases of 1 TB or below.
For those databases the buffer cache represents about 0.5% of the database size, i.e. roughly 5 GB.
67. Leverage new solid state storage more efficiently
[Diagram: the same VMware ESX 5.1 / IBM 3690 (256 GB RAM) setup with the Delphix VM (192 GB RAM), two Linux source VMs (20 GB each), and two Linux target VMs (20 GB each); the virtual databases take up a smaller space]
Prod is critical for the business. The performance of prod is the highest priority. Protect prod from any extra load.
The fastest query is the query not run.
Performance issues. Single point in time.
Oracle Database Cloning Solution Using Oracle Recovery Manager and Sun ZFS Storage Appliance: http://www.oracle.com/technetwork/articles/systems-hardware-architecture/cloning-solution-353626.pdf
Database virtualization is to the data tier what VMware is to the compute tier. On the compute tier, VMware allows the same hardware to be shared by multiple machines. On the data tier, virtualization allows the same datafiles to be shared by multiple clones, allowing almost instantaneous creation of new copies of databases with almost no disk footprint.
250 PDBs x 200 GB = 50 TB. EMC sells 1 GB for about $1,000; Dell sells 32 GB for about $1,000. A terabyte of RAM on a Dell costs around $32,000, while a terabyte of RAM on a VMAX 40k costs around $1,000,000.
Most of swingbench's parameters can be changed from the command line. That is to say, the swingconfig.xml file (or the other example files in the sample directory) can be used as a template for a run, and each run's parameters can be modified from the command line. The -h option lists the command line options:

[dgiles@macbook-2 bin]$ ./charbench -h
usage: parameters:
 -D <variable=value>  use value for given environment variable
 -a                   run automatically
 -be <stopafter>      end recording statistics after. Value is in the form hh:mm
 -bs <startafter>     start recording statistics after. Value is in the form hh:mm
 -c <filename>        specify config file
 -co <hostname>       specify/override coordinator in configuration file
 -com <comment>       specify comment for this benchmark run (in double quotes)
 -cpuloc <hostname>   specify/override location of the cpu monitor
 -cs <connectstring>  override connect string in configuration file
 -debug               turn on debug output
 -di <shortname(s)>   disable transaction(s) by short name, comma separated
 -dt <drivertype>     override driver type in configuration file (thin, oci, ttdirect, ttclient)
 -en <shortname(s)>   enable transaction(s) by short name, comma separated
 -h,--help            print this message
 -i                   run interactively (default)
 -ld <milliseconds>   specify/override the logon delay (milliseconds)
 -max <milliseconds>  override maximum think time in configuration file
 -min <milliseconds>  override minimum think time in configuration file
 -p <password>        override password in configuration file
 -r <filename>        specify results file
 -rr                  specify/override refresh rate for charts in secs
 -rt <runtime>        specify/override run time for the benchmark. Value is in the form hh:mm
 -s                   run silent
 -u <username>        override username in configuration file
 -uc <number>         override user count in configuration file
 -v <options>         display run statistics (vmstat/sar like output); options include (comma separated, no spaces): trans|cpu|disk|dml|tpm|tps|users

The following examples show how this functionality can be used.

Example 1.
$ ./swingbench -cs //localhost/DOM102 -dt thin
Will start swingbench using the local config file (swingconfig.xml) but overriding its connect string and driver type. All other values in the file will be used.

Example 2.
$ ./swingbench -c sample/ccconfig.xml -cs //localhost/DOM102 -dt thin
Will start swingbench using the config file sample/ccconfig.xml and overriding its connect string and driver type. All other values in the file will be used.

Example 3.
$ ./minibench -c sample/soeconfig.xml -cs //localhost/DOM102 -dt thin -uc 50 -min 0 -max 100 -a
Will start minibench (a lighter-weight frontend) using the config file sample/soeconfig.xml and overriding its connect string and driver type. It also overrides the user count and think times. The "-a" option starts the run without any user interaction.

Example 4.
$ ./charbench -c sample/soeconfig.xml -cs //localhost/DOM102 -dt thin -cpuloc oraclelinux -uc 20 -min 0 -max 100 -a -v users,tpm,tps,cpu
Author  : Dominic Giles
Version : 2.3.0.344
Results will be written to results.xml.
Time        Users  TPM  TPS  User  System  Wait  Idle
5:08:19 PM  0      0    0    0     0       0     0
5:08:21 PM  3      0    0    4     4       3     89
5:08:22 PM  8      0    0    4     4       3     89
5:08:23 PM  12     0    0    4     4       3     89
5:08:24 PM  16     0    0    8     43      0     49
5:08:25 PM  20     0    0    8     43      0     49
5:08:26 PM  20     2    2    8     43      0     49
5:08:27 PM  20     29   27   8     43      0     49
5:08:28 PM  20     49   20   53    34      1     12
Will start charbench (a character-based version of swingbench) using the config file sample/soeconfig.xml and overriding its connect string and driver type. It also overrides the user count and think times. The "-a" option starts the run without any user interaction. This example also connects to the cpumonitor (started previously). It uses the -v option to continually display cpu load information.

Example 5.
$ ./minibench -c sample/soeconfig.xml -cs //localhost/DOM102 -cpuloc localhost -co localhost
Will start minibench using the config file sample/soeconfig.xml and overriding its connect string. It also specifies a cpu monitor started locally on the machine and attaches to a coordinator process also started on the local machine.

Example 6.
$ ./minibench -c sample/soeconfig.xml -cs //localhost/DOM102 -cpuloc localhost -rt 1:30
Will start minibench using the config file sample/soeconfig.xml and overriding its connect string. It also specifies a cpu monitor started locally on the machine. The "-rt" parameter tells swingbench to run for 1 hour 30 minutes and then stop.

Example 7.
$ ./coordinator -g
$ ssh -f node1 'cd swingbench/bin; ./cpumonitor'
$ ssh -f node2 'cd swingbench/bin; ./cpumonitor'
$ ssh -f node3 'cd swingbench/bin; ./cpumonitor'
$ ssh -f node4 'cd swingbench/bin; ./cpumonitor'
$ ./minibench -cs //node1/RAC1 -cpuloc node1 -co localhost &
$ ./minibench -cs //node2/RAC2 -cpuloc node2 -co localhost &
$ ./minibench -cs //node3/RAC3 -cpuloc node3 -co localhost &
$ ./minibench -cs //node4/RAC4 -cpuloc node4 -co localhost &
$ ./clusteroverview

In 2.3 the load generators can use the additional command line option -g to specify which load generation group they are in, i.e.
$ ./minibench -cs //node1/RAC1 -cpuloc node1 -co localhost -g group1 &

This collection of commands will first start a coordinator in graphical mode on the local machine. The next 4 commands secure-shell to the 4 nodes of a cluster and start a cpumonitor on each (swingbench needs to be installed on each of them). The following commands start four load generators with the minibench front end, each referencing the cpumonitor started on its database instance; they also attach to the local coordinator. Finally the last command starts clusteroverview (currently its configuration needs to be specified in its config file). It is possible to stop all of the load generators and the coordinator with the following command:
$ ./coordinator -stop
One Last Thing: http://www.dadbm.com/wp-content/uploads/2013/01/12c_pluggable_database_vs_separate_database.png
http://www.emc.com/collateral/emcwsca/master-price-list.pdf These prices appear on pages 897/898: a storage engine for the VMAX 40k with 256 GB RAM is around $393,000, and a storage engine for the VMAX 40k with 48 GB RAM is around $200,000. So the cost of RAM here is $193,000 / 208 GB, or about $927 a gigabyte. That seems like a good deal for EMC, as Dell sells 32 GB RAM DIMMs for just over $1,000. So a terabyte of RAM on a Dell costs around $32,000, and a terabyte of RAM on a VMAX 40k costs around $1,000,000.
2) Most DBs have a buffer cache that is less than 0.5% (not 5%, 0.5%) of the datafile size.