2. VMWARE AND AMD
Through deep collaboration, VMware and AMD are delivering robust
virtualization solutions to support our customers’ business needs
2003 – VMware demos software at AMD
Opteron™ processor launch
2004 – VMware launches 64-bit support for
AMD processors with GSX Server and
Workstation products
2004 – VMware launches ESX 2.1.1 with
support for AMD Opteron™ processors
2007 – AMD CEO gives keynote at VMworld
and VMware releases ESX 3.5 with AMD-V RVI
support
2009 – VMware launches vSphere 4 with
support for AMD Opteron™ processors
2 | NUMA Performance Considerations in VMware VSphere ™ – VMWorld 2012 | August 2012 | Public
3. THE CLOUD DATA CENTER IS AMD’S FOCUS SERVER
MARKET CAPTURING THE INFLECTION POINT
AMD has strong momentum with the latest
AMD Opteron™ processor products
─ Focused on execution for our customers
AMD is making strategic investments, through
technology partnerships and acquisition, in the
server market addressing the Cloud data center
Colfax/AMD Cloudera Certification*
SGI/AMD Cloud Reference Platform**
The Cloud is the fastest growing segment in the
server market
Customers focusing on low-power, energy-efficient data centers
A BETTER OPTION FOR THE HYPER-EFFICIENT, VIRTUALIZED, CLOUD-READY WORLD
*http://www.amd.com/us/press-releases/Pages/amdannouncescloudera-2012june14.aspx
**http://www.sgi.com/partners/technology/amd.html
Sources: Customer interviews, AMD Internal forecasts
4. AMD OPTERON™ PROCESSOR PLATFORMS
A WIDE RANGE OF PLATFORM CHOICES TO MEET BOTH STANDARDIZED AND
CUSTOMIZED ENVIRONMENTS
AMD Opteron™ 6000 Series Platform
─ Performance-per-watt and Expandability for 2P/4P
─ Standard Platforms
─ Traditional Rack/Tower/Blade
AMD Opteron™ 4000 Series Platform
─ Highly Energy Efficient and Cost-optimized for 1P/2P
─ Custom, purpose-driven Twins/Container/"Skinless" Scale Out
─ Low cost SMB servers
AMD Opteron™ 3000 Series Platform
─ Price-optimized cost-effective infrastructure for 1P servers
─ Custom, purpose-driven low power systems
─ Low cost, dedicated hosting and small business servers
5. AMD OPTERON™ PROCESSOR OUTSTANDING SERVER
LEADERSHIP
AMD Opteron™ 6200 Series Processors
Performance Records
Best 2P TPC-C® database performance
Best 2P TPC-C database price/performance
Best 2P SAP two tier performance
Best blade VMmark virtualization performance
Best 4P blade power/performance efficiency
AMD Opteron™ processors power 32% of the
world's 50 fastest supercomputers*
*http://www.top500.org/list/2012/06/100
VISIT www.amd.com/benchmarks
7. AMD OPTERON™ PROCESSOR ADVANTAGES FOR
VIRTUALIZATION
Virtualization
Microsoft Hyper-V®
VMware vSphere
Microsoft RemoteFX
VMware View
Xen, KVM
Customer Requirements:
Core density
High memory addressability
Large L3 cache
Cost efficiency
Lower Cost per VM¹ No Matter How You Calculate It
AMD Opteron™ 6200 Series-based Servers vs. Intel Xeon E5-2600 Series-based Servers
77% less: Cost of SPECvirt®_sc2010 server configuration divided by its VM score
68% less: Assigned 2GB RAM to each core and took server price divided by the number of cores
44% less: Server price divided by a set number of VMs
[Chart legend: AMD, Intel]
¹ See backup slide #1
8. LOWER COST PER VM, LOWER COST PER RACK*
30% Lower Cost per VM
Save $130,000 per Rack (42U – 21 x 2P/2U)
[Chart omitted: Cost per VM for VMmark Configurations (Hardware w/vSphere™ Enterprise Plus), lower is better; axis $0.00 to $800.00; groups 30 VMs and 40 VMs]
Series:
Dell PE R715, AMD Opteron™ Model 6284SE
HP DL385 Gen8, AMD Opteron™ Model 6284SE
Dell PE R720, Intel Xeon™ Model E5-2690
HP DL380p Gen8, Intel Xeon™ Model E5-2690
Evaluation based on hardware configurations used to achieve published VMmark 2.1 scores
VMmark executes a set of diverse workloads and provides a coarse-grain measure of consolidation capacity
VMmark scores are based on running a large number of VMs at near 100% with at least 2 host systems
http://www.vmware.com/a/vmmark/
2 x AMD Opteron™ processors Model 6284SE Dell PE R715 server, 128GB (8x16 GB DDR3-1600) memory, 1x7.2K 500GB SATA HD, base warranty
2 x AMD Opteron™ processors Model 6284SE in HP DL385 Gen8 server, 128GB (8x16 GB DDR3-1600) memory, 1x15K 450GB SAS HD, base warranty
2 x Intel Xeon E5-2690 processors in HP DL380p Gen 8 server, 256GB (16x16 GB DDR3-1600) memory, 1x15K 450GB SAS HD, base warranty
2 x Intel Xeon E5-2690 processors in Dell PE R720 server, 256GB (16x16 GB DDR3-1600) memory, 1x7.2K 500GB SATA HD, base warranty
System prices as of 8/5/2012 http://www.hp.com http://www.dell.com
9. SAME PERFORMANCE, LOWER COST PER VM
[Charts omitted: Virtualized Ecommerce Workload, "Operations per Minute per VM" (axis 0 to 5,000) and "Cost per VM (lower is better)" (axis $0 to $2,000), plotted at 2, 8, 16, and 32 VMs]
Series: HP DL385 Gen8, 2X AMD Opteron™ Model 6274; HP DL380p Gen8, 2X Intel Xeon™ Model E5-2665
Test run on VMware ESX with DVD Store, an open source performance tool that
includes web and database servers to emulate an ecommerce environment
Servers running an increasing number of VMs at a standard utilization rate (~25%)¹⁻²
AMD-based server costs 14% less than the Intel-based server³
¹ 2 x AMD Opteron™ processors Model 6274 in HP DL385 Gen8 server, 96GB (12x8 GB DDR3-1333) memory, 1x15K 500GB SAS HD, base warranty
² 2 x Intel Xeon E5-2665 processors in HP DL380p Gen 8 server, 96GB (12x8 GB DDR3-1333) memory, 1x15K 500GB SAS HD, base warranty
³ System prices as of 7/5/2012 http://www.hp.com
10. BETTER PERFORMANCE WITH LARGER NUMBER OF
VMS, LOWER COST PER VM
[Chart omitted: Virtualized Ecommerce Workload, Operations per Minute (axis 200,000 to 400,000)]
VM counts and utilization: 18 VMs at 60%, 22 VMs at 75%, 28 VMs at 92%, 32 VMs at 100%, 38 VMs at 100%
Series: Intel Xeon™ Model E5-2660, AMD Opteron™ Model 6278
Test run with DVD Store, an open source performance tool that includes web and
database servers to emulate an ecommerce environment
Servers running an increasing number of VMs and utilization rate¹⁻²
AMD-based server costs 14% less than the Intel-based server³
¹ 2 x AMD Opteron™ processors Model 6278 in server, 256GB (16x16GB DDR3-1600) memory, 2x15K 146GB SAS HD, base warranty
² 2 x Intel Xeon E5-2660 processors in HP DL380p Gen 8 server, 256GB (16x16GB DDR3-1600) memory, 2x15K 146GB SAS HD, base warranty
³ System prices as of 7/28/2012 http://www.hp.com and http://www.dell.com
11. AMD OPTERON™ PROCESSOR DELIVERS PERFORMANCE
THAT DRIVES THE CLOUD
Web / Cloud
Windows® Azure
LAMP Stack
Java
Hadoop
OpenStack
Customer Requirements:
Power efficiency
Core scalability
Node and thread density
High throughput
Unmatched Power Efficiency¹
Lowest x86 watts/core in the industry: 5.3W for AMD Opteron™ 6200 Series and 4.375W for AMD Opteron™ 4200 Series
Lower price / performance / watt / square foot²
Up to 13% lower for better value
[Charts omitted: two bar charts (axes 0 to 12 and 0% to 100%) comparing an Intel Xeon E5-2600 Series-based server with an AMD Opteron™ 6200 Series-based server]
¹, ² See backup slide #2
12. THE RIGHT PRICE FOR HADOOP CLUSTERS
[Chart omitted: TeraSort OPS/Min/$ (higher is better)]
AMD-based Colfax 8 node cluster doing performance testing recently received Cloudera CDH3 certification
40% lower cost than competitive systems
Lower CapEx to get up and running and expand production-level clusters
Power consumption between clusters comparable
Colfax CX2274-NA, AMD Opteron™ 6274¹
$8,211 per system³
$59,102 for 8 node cluster
$124,142 for rack (42U)
Colfax CX2265i-X5, Intel Xeon E5-2665²
$12,084 per system³
$96,672 for 8 node cluster
$203,112 for rack (42U)
¹ 2 x AMD Opteron™ processors Model 6274 in Colfax CX2274-NA server, 96GB (12x8 GB DDR3-1333) memory, 1x15K 300GB SAS HD, base warranty
² 2 x Intel Xeon E5-2665 processors in Colfax CX2265i-X5 server, 96GB (12x8 GB DDR3-1333) memory, 1x15K 300GB SAS HD, base warranty
³ System prices as of 7/20/2012 http://www.colfax-intl.com/
13. VIRTUALIZATION PERFORMANCE CONSIDERATIONS*
General Recommendations – Typical 'Best Practices'
– Enable network TOE/LSO/LRO, MSI-X, VMware NetQueue/MS VMQ, RSS features
  Can double network/storage performance
– Use fastest DRAM possible
– Consider SSD for 'hot' processes and files
– Test with "Large Pages", Jumbo Frames
  Set netPktHeapMinSize=32 and netPktHeapMaxSize=128
– Test with VMDirectPath (Passthrough) Enabled in ESX/ESXi
  Disables vMotion, so ensure it's not required
– NIC Teaming helps balance load across adapters
– Check that IRQ is not shared between Fast & Slow devices
  Ex: Disable CD, Floppy, USB support if not required
*Performance Best Practices for VMware vSphere™ 5.0
*VMware View ™ Administration
*vSphere ™ Monitoring and Performance
14. AMD OPTERON™ PROCESSOR VIRTUALIZATION PERFORMANCE
CONSIDERATIONS
For AMD Opteron™ Processor & Benchmarking
– Craft vSMPs to fit within the number of cores on die and memory size on the processor
  "sched.cpu.vsmpConsolidate=true" encourages the NUMA scheduler to keep sibling vCPUs on the same node (die)
– Do not oversize VMs (see above): vCPUs need to be in close synchronicity in virtual time (TSC)
– Ensure NUMA, AMD-V, and IOMMU (HWMMU) are Enabled in server BIOS
– Ensure AMD-V/RVI/Vi is Enabled in VM Properties (see next slide)
– Leverage 1 VM/Core
  Turn OFF NUMA Rebalance and Page Migration
  This is Not CPU/Memory Affinity
– Ensure Power Management is Disabled in the BIOS, Hypervisor, and the Guest OS (Humor Me…)
  Control Panel in Windows
  Scaling Governor in Linux
– On AMD Opteron™ 6200/4200, enable AMD Turbo Core (C6) & HPC Mode
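The first recommendation above (size each vSMP VM so it fits within the cores and memory of one die) can be sketched as a simple check. This is an illustrative sketch only: the `NumaNode` and `VmSpec` types, the `fits_in_node` helper, and the example figures (8 cores and 32 GB per node) are assumptions made for illustration, not values from the slides and not part of any VMware API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NumaNode:
    """Resources local to one NUMA node (one die). Hypothetical type."""
    cores: int
    memory_gb: float


@dataclass(frozen=True)
class VmSpec:
    """Requested size of one virtual machine. Hypothetical type."""
    vcpus: int
    memory_gb: float


def fits_in_node(vm: VmSpec, node: NumaNode) -> bool:
    """Return True if the VM's vCPUs and memory both fit within one node."""
    return vm.vcpus <= node.cores and vm.memory_gb <= node.memory_gb


if __name__ == "__main__":
    # Assumed example: 8 cores and 32 GB of memory local to each node.
    node = NumaNode(cores=8, memory_gb=32)
    for vm in (VmSpec(vcpus=4, memory_gb=16), VmSpec(vcpus=12, memory_gb=16)):
        verdict = "fits in one node" if fits_in_node(vm, node) else "spans nodes"
        print(f"{vm.vcpus} vCPU / {vm.memory_gb} GB: {verdict}")
```

A VM that fails this check would have to span nodes, which is the situation the sizing advice is trying to avoid.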
15. ENABLING AMD-V™ TECHNOLOGY IN VM PROPERTIES
• Explicitly enable the third option vs. 'Automatic'
18. BACKUP SLIDE #1
The price per VM range is established based on three different calculation methods: 1) server price divided by a set number of VMs, 2) server price
divided by the number of cores assuming a 1VM per core load model, and 3) cost of SPECvirt®_sc2010 server configuration divided by its VM score.
(1) Server price divided by a set number of VMs: HP ProLiant DL385 G7 with 2 x AMD Opteron™ processor Model 6282 SE (1ku price $1019) with
32GB RAM, 146 GB 15K hdd, DVD, and 3yr base warranty is $5,143 as of 4/2/12 at www.hp.com. HP ProLiant DL380 G8 with 2 x Intel Xeon processor
Model E5-2690 with 32GB RAM, 146 GB 15K hdd, DVD, and 3yr base warranty is $9,127 as of 4/2/12 at www.hp.com. This method yields a 44%
lower cost per VM. (2) Server price divided by the number of cores assuming a 1VM per core load model: Based on a 1 VM/core model, AMD
Opteron™ 6200 Series-based servers have up to 16 cores per processor. Intel Xeon E5-2600-based servers have up to 8 cores per processor as of
4/6/12 at www.intc.com/pricelist.cfm. Using the pricing and configurations above and assuming 1VM per core, the price/VM for the HP ProLiant DL385
G7 (32 cores and 32 VMs) is $158 and the price/VM for the HP ProLiant DL380 G8 (16 cores and 16VMs) is $570. This method yields a 72% lower
cost per VM. (3) Cost of SPECvirt®_sc2010 server configuration divided by its VM score:
SPEC and SPECvirt are registered trademarks of the Standard Performance Evaluation Corporation. The results for AMD Opteron™ processor Model
6282 SE and Intel Xeon E5-2690 are based upon results published on www.spec.org/virt_sc2010/results as of 4/10/12. The comparison is based on the
best performing two-socket servers using AMD Opteron™ processor Model 6282 SE and Intel Xeon processor Model E5-2690, operating at each
processor's default frequency. For the latest SPECvirt®_sc2010 results, visit www.spec.org/virt_sc2010/results. 1570 @ 102 VMs using HP ProLiant
DL385 G7, 2 x AMD Opteron 6282 SE, 256 GB (16 x 16 GB PC3L-10600R at 1333 MHz), VMware ESXi
5.0.0, http://www.spec.org/virt_sc2010/results/res2011q4/virt_sc2010-20111018-00038-perf.html. The price of an HP ProLiant DL385 G7, 2 x AMD
Opteron 6282 SE, 256 GB (16 x 16 GB PC3L-10600R at 1333 MHz) with 146GB 15k hard drive, and 3yr base warranty is $11,721 as of 4/10/12 at
www.hp.com. The price per VM is $115.
2158 @ 132 VMs using IBM x3650 M4, 2 x Intel Xeon E5-2690, 512 GB (16 x 32 GB, 4Rx4 1.35V PC3L-10600 CL9 ECC DDR3 1333MHz LP
LRDIMM), Red Hat Enterprise Linux 6.2, http://www.spec.org/virt_sc2010/results/res2012q1/virt_sc2010-20120207-00042-perf.html. While it was not
possible to price the exact configuration, the price of an IBM x3650 M4 using 2 x Intel Xeon E5-2690 and 504GB RAM (32GB x 10, 16GB x
11, 8GB), 146GB 10k hard drive, and base warranty is $65,575 as of 4/10/12 at www.ibm.com. The price per VM is $497. This method yields a 77%
lower cost per VM. SVR-163
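The arithmetic behind the three methods above can be sketched as follows. The helper names `price_per_vm` and `percent_lower` are hypothetical and chosen for illustration; all prices and VM counts are the figures stated in the text above, and method 2 uses the per-VM prices as quoted rather than recomputing them.

```python
def price_per_vm(server_price: float, vm_count: int) -> float:
    """Server price divided by the number of VMs it hosts."""
    if vm_count <= 0:
        raise ValueError("vm_count must be positive")
    return server_price / vm_count


def percent_lower(ours: float, theirs: float) -> float:
    """How much lower `ours` is than `theirs`, as a percentage of `theirs`."""
    return (1.0 - ours / theirs) * 100.0


if __name__ == "__main__":
    # Method 1: the same set number of VMs on both servers, so the ratio of
    # server prices equals the ratio of cost per VM.
    print("Method 1:", round(percent_lower(5143, 9127)), "% lower")

    # Method 2: per-VM prices as quoted in the text (1 VM per core).
    print("Method 2:", round(percent_lower(158, 570)), "% lower")

    # Method 3: configuration price divided by the VM count of the
    # published SPECvirt_sc2010 result.
    amd = price_per_vm(11721, 102)
    intel = price_per_vm(65575, 132)
    print("Method 3:", round(percent_lower(amd, intel)), "% lower")
```

Keeping the two helpers separate makes it easy to swap in current prices or a different VM count without touching the comparison itself.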
19. BACKUP SLIDE #2
¹Based on AMD Opteron 6200 Series processor with 16 cores at 85W TDP (5.3125W/core) versus lowest wattage, highest core Intel Xeon processor
with 8 cores at 70W TDP (8.75W/core) according to www.intel.com as of 3/14/12. As of March 16, 2012 AMD Opteron™ processor Models 4200 EE
have the lowest known power per core of any x86 server processor, at 35W TDP (35W/8 = 4.375W/core). Intel's lowest power per core server
processor, Intel Xeon E5-2650L, is 70W TDP (70W/8 = 8.75W/core). See www.intc.com/pricelist.cfm as of 3/16/12. Previous record held by AMD
Opteron processor Models 4100 EE at 35W TDP / 6 cores = 5.83 W/core. SVR-58
²Price/performance/watt/square foot is calculated by the price of a server divided by the SPECpower®_ssj2008 average performance to power ratio
divided by the square footage of a standard rack. Using the HP ProLiant DL385 G7, the price/performance/watt/sq ft equals $0.25. HP ProLiant DL385
G7 with 2 x AMD Opteron™ processor Model 6276 with 32GB RAM, 72 GB 15K hdd, DVD, and 3yr base warranty is $4,443 as of 4/9/12 at
www.hp.com. Using the HP ProLiant DL380 G8, the price/SPECpower®_ssj2008 average performance to power ratio/sq ft equals $0.29. HP ProLiant
DL380 G8 with 2 x Intel Xeon processor Model E5-2680 with 32GB RAM, 72 GB 15K hdd, DVD, and 3yr base warranty is $8,127 as of 4/9/12 at
www.hp.com. The square footage of a standard rack is 6 sq ft (2 ft x 3 ft). SPEC and SPECpower are registered trademarks of the Standard Performance
Evaluation Corporation. The results for AMD Opteron™ processor Model 6276 and Intel Xeon processor Model E5-2680 reflect results published on
http://www.spec.org/cpu2006/results as of 4/9/12. The SPECpower®_ssj2008 overall ssj_ops/watt for AMD Opteron™ processor Model 6276 is 2,968
using IBM System x3755 M3, 2 x AMD Opteron 6276 (2.30 GHz), 64GB (16 x 4096 MB), Microsoft Windows Server 2008 Enterprise Edition
x64, Service Pack 1, http://www.spec.org/power_ssj2008/results/res2011q4/power_ssj2008-20111111-00409.html. The SPECpower®_ssj2008 overall
ssj_ops/watt for Intel Xeon processor Model E5-2680 is 4,708 using IBM System x3500 M4, 2 x Intel Xeon E5-2680, 24GB (8 x 2048 MB), Microsoft
Windows Server 2008 Enterprise Edition x64, R2 SP1, http://www.spec.org/power_ssj2008/results/res2012q1/power_ssj2008-20120305-00428.html.
For the latest SPECpower_ssj2008 results, visit http://www.spec.org/power_ssj2008/results. SVR-162
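The two calculations in this backup slide can be sketched as follows. The function names `watts_per_core` and `price_per_perf_watt_sqft` are hypothetical, introduced only for illustration; the TDP, core count, price, ssj_ops/watt, and rack footprint figures are the ones quoted in the text above.

```python
def watts_per_core(tdp_watts: float, cores: int) -> float:
    """TDP divided by core count."""
    return tdp_watts / cores


def price_per_perf_watt_sqft(
    server_price: float, ssj_ops_per_watt: float, rack_sqft: float = 6.0
) -> float:
    """Server price divided by ssj_ops/watt, then by rack footprint in sq ft."""
    return server_price / ssj_ops_per_watt / rack_sqft


if __name__ == "__main__":
    # Footnote 1: power per core.
    print("Opteron 6200, 16 cores at 85W:", watts_per_core(85, 16), "W/core")
    print("Opteron 4200 EE, 8 cores at 35W:", watts_per_core(35, 8), "W/core")
    print("Xeon E5-2650L, 8 cores at 70W:", watts_per_core(70, 8), "W/core")

    # Footnote 2: price / performance / watt / square foot.
    amd = price_per_perf_watt_sqft(4443, 2968)
    intel = price_per_perf_watt_sqft(8127, 4708)
    print(f"AMD:   ${amd:.2f}")
    print(f"Intel: ${intel:.2f}")
    print(f"AMD is {100 * (1 - amd / intel):.0f}% lower")
```

The rack footprint is a default parameter so the same function can be reused for a different floor-space assumption.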
Editor's notes
Original legal approval – San Marino Platform Launch 2010
Let's look at where this new processor and associated platform fit in.
We've shown you a lot of information on our targeted server strategy in the last year.
AMD recognized the traditional three platforms were no longer serving the market well.
Made decision to provide ~75% 2P volume segment of market with two very targeted options:
6000 series launched in March for maximum performance-per-watt in 2 and 4P
4000 series launching now for 1 and 2P
Competition has only a one size fits all solution, de-features the most popular price/power/performance bands
Originally approved for Financial Analyst Day 2012
Virtualization is all about cost efficiency and density
Slide is good for existing customers and shows price/performance comparison of configurations with published VMmark scores
Cloud benchmarks/data
Lowest power per core at 4.375W
Highest node density and thread density with 16 cores
Java – 26% better performance