Performance Tuning a Cloud Application 
Shane Gibson 
Sr. Principal Infrastructure Architect 
Cloud Platform Engineering
2 
Agenda 
• About Symantec and Me 
• Key Value as a Service 
• The Pesky Problem 
• Resolving “The Pesky Problem” 
• Performance Tuning Recommendations 
• Summary 
• Q&A
3 
About Symantec and Me
4 
The Symantec Team 
• Cloud Platform Engineering 
– We are building a consolidated cloud platform that provides infrastructure and 
platform services for next generation Symantec products and services 
– Starting small, but scaling to tens of thousands of nodes across multiple DCs 
– Cool technologies in use: OpenStack, Hadoop, Storm, Cassandra, MagnetoDB 
– Strong commitment to provide back to Open Source communities 
• Shane Gibson 
– Served 4 years in USMC as a computer geek (mainframes and Unix) 
– Unix/Linux SysAdmin, System Architect, Network Architect, Security Architect 
– Now Cloud Infrastructure Architect for CPE group at Symantec
5 
Key Value as a Service 
(the “cloud” application)
6 
Key Value as a Service: General Architecture 
• MagnetoDB is a key value store with OpenStack REST and AWS 
DynamoDB API compatibility 
• Uses a “pluggable” backend storage capability 
• Composite service made up of: 
– MagnetoDB front-end API and Streaming service 
– Cassandra for back end, Key Value based storage 
– OpenStack Keystone 
– AMQP Messaging Bus (e.g., RabbitMQ, Qpid, ZeroMQ) 
– Load Balancing capabilities (Hardware or LBaaS)
7 
Key Value as a Service: MagnetoDB 
– API Services Layer 
• Data API 
• Streaming API 
• Monitoring API 
• AWS DynamoDB API 
– Keystone and Notifications 
integrations 
– MagnetoDB Database Driver 
• Cassandra
9 
Key Value as a Service: Cassandra 
– Database storage engine 
– Massively linearly scalable 
– Highly available w/ no SPoF 
– Other features: 
• tunable consistency 
• key-value data model 
• ring topology 
• predictable high performance 
and fault tolerance 
• Rack and Datacenter awareness
11 
Key Value as a Service: Other Stuff 
– Need a load balancing layer of 
some sort 
• LBaaS or hardware 
– Keystone service 
– AMQP service 
• RabbitMQ
15 
Key Value as a Service: Putting it all Together
16 
The Pesky Problem
17 
The Pesky Problem: Deployed on Bare Metal 
• Initial deployment of KVaaS service on bare metal nodes 
• Co-located the MagnetoDB API service on the same nodes as Cassandra 
– MagnetoDB CPU-heavy profile vs. Cassandra disk I/O-heavy profile 
• Cassandra directly managing the disks via JBOD (good!) 
• MagnetoDB likes lots of CPU; it had direct access to 32 (HT) CPUs 
– Please don’t get me started on a HyperThread CPU count rant :) 
• KVaaS team performance expectation set from this experience!
18 
The Pesky Problem: Moved to OpenStack Nova 
• KVaaS service migrated to a “stock” OpenStack Nova cluster 
• Nova Compute nodes set with RAID 10 ephemeral disks 
• OpenContrail used for SDN configuration 
• Performance for each VM Guest roughly 66% of bare metal 
• KVaaS team was unhappy 
19 
The Pesky Problem: Moved to OpenStack Nova, cont. 
Performance comparison of “list_tables”: 
• bare metal: 250 RPS / HT Core* 
• virtualized: 165 RPS / HT Core* 
* results averaged by core since test beds were different
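The "roughly 66%" figure follows directly from the two per-core numbers on the slides. A minimal sketch of that arithmetic, assuming throughput is normalized by dividing total requests per second by the hyperthreaded core count (the helper names are illustrative, not from the talk):

```python
# Per-core throughput figures quoted on the slides (requests per second per HT core).
BARE_METAL_RPS_PER_CORE = 250
VIRTUALIZED_RPS_PER_CORE = 165


def per_core_rps(total_rps: float, ht_cores: int) -> float:
    """Normalize a test bed's total throughput by its hyperthreaded core count."""
    if ht_cores <= 0:
        raise ValueError("ht_cores must be positive")
    return total_rps / ht_cores


def relative_performance(virtualized: float, bare_metal: float) -> float:
    """Return virtualized throughput as a fraction of bare metal throughput."""
    return virtualized / bare_metal


ratio = relative_performance(VIRTUALIZED_RPS_PER_CORE, BARE_METAL_RPS_PER_CORE)
print(f"virtualized is {ratio:.0%} of bare metal")
```

Normalizing per core is what makes the comparison meaningful here, since the two test beds had different hardware.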
20 
The Pesky Problem: The Goal 
• Deploy our KVaaS service … as a flexible and scalable solution 
• Ability to use OpenStack APIs to manage the service 
• Cloud provider-run KVaaS service or tenant-managed service 
• Initial deployment planned for OpenStack Nova platform 
– Not a containerization service … 
– Though … considering it … 
• Easier auto-scaling, better service packing, flexibility, etc. 
• Explore mixed MagnetoDB/Cassandra vs. separated services
21 
Resolving “The Pesky Problem”
22 
Resolving the “Pesky Problem”: Approach 
• Baseline the test environment 
– Bare metal deployment and test 
– Mimics the original deployment characteristics 
• Deploy OpenStack Nova – Install KVaaS services 
• Performance tune each component 
– Linux OS and Hardware configuration 
– KVM Hypervisor/Nova Compute performance tuning 
– MagnetoDB/Cassandra performance tuning
23 
Resolving the “Pesky Problem”: Testing Tools 
• Linux OS and Hardware 
– perf, openssl speed, iostat, iozone, iperf, dd (yes, really!), dtrace 
• KVM Hypervisor/Nova Compute 
– kvm_stat, kvmtrace, perf stat -e 'kvm:*', specvirt 
• MagnetoDB/Cassandra 
– magnetodb-test-bench, jstat, cstar_perf, cassandra-stress 
• General Test Suite 
– Phoronix Test Suite
24 
Resolving the “Pesky Problem”: Test Architecture
25 
Resolving the “Pesky Problem”: Test Bench
26 
Performance Tuning 
Recommendations
27 
Performance Tuning Results: Linux OS and Hardware 
Recommendations: 
Host: 
• vhost_net, transparent_hugepages, high_res_timer, hpet, compaction, ksm, cgroups 
• task scheduling tweaks (CFS) 
• Filesystem mount options (noatime, nodiratime, relatime) 
• Tune wmem and rmem buffers !!! 
• Elevator I/O Scheduler = deadline 
Guest: 
• vhost_net or virtio_net, virtio_blk, virtio_balloon, virtio_pci 
• Paravirtualization ! 
• Disable system perf. gathering – get info from host hypervisor tools 
• Elevator scheduler to “noop” 
• Give guests as much memory as you can (FS cache!)
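To confirm which elevator is actually in effect ("deadline" on the host, "noop" in the guest), the kernel exposes the choice in sysfs, with the active scheduler shown in square brackets. A small sketch of reading it, assuming the standard /sys/block/<device>/queue/scheduler layout; the device name is whatever your system uses, and newer kernels may list names such as "none" and "mq-deadline" instead:

```python
from pathlib import Path


def parse_active_scheduler(sysfs_text: str) -> str:
    """Return the scheduler marked active (in square brackets) in sysfs output.

    Example input: "noop [deadline] cfq".
    """
    for token in sysfs_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active scheduler found in {sysfs_text!r}")


def read_active_scheduler(device: str) -> str:
    """Read the active scheduler for a block device such as 'sda' (example name)."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return parse_active_scheduler(path.read_text())


print(parse_active_scheduler("noop [deadline] cfq"))  # prints "deadline"
```

Checking the value after boot, on both host and guest, guards against a tuning that was set once by hand and silently lost on restart.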
28 
Performance Tuning Results: Linux OS and Hardware 
Measured gains called out against the recommendations above: 
• 7-10x 
• 30% 
• 10% less latency 
• 8x throughput 
• 2x throughput
29 
Performance Tuning Results: KVM /Nova Compute 
Recommendations: 
Host: 
• tweak Transparent Huge Pages 
• bubble up raw devices if possible 
(warning: migration/portability) 
• multi-queue virtio-net 
• SR-IOV if can dedicate NIC 
(warning: see bubble up warning!) 
Guest: 
• qcow2 or raw for guest file backing 
• disk partition alignment is still very 
important 
• preallocate metadata (qcow2) 
• fallocate entire guest image if can 
(qcow2, lose oversubscribe ability) 
• set VM swappiness to zero 
• Async. I/O set to “native”
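As an illustration of the "preallocate metadata (qcow2)" item, qemu-img accepts a preallocation option when creating an image. The sketch below only builds the command line rather than running it; the function name, image path, and size are placeholders, not values from the talk:

```python
from typing import List


def qcow2_create_command(
    image_path: str, size: str, preallocate_metadata: bool = True
) -> List[str]:
    """Build (but do not run) a qemu-img command that creates a qcow2 image."""
    command = ["qemu-img", "create", "-f", "qcow2"]
    if preallocate_metadata:
        # Allocate qcow2 metadata up front instead of growing it on demand.
        command += ["-o", "preallocation=metadata"]
    command += [image_path, size]
    return command


# The path and size below are placeholders.
print(" ".join(qcow2_create_command("/var/lib/images/guest.qcow2", "40G")))
```

Returning an argument list keeps the sketch easy to inspect and to pass to subprocess.run later, once the values have been reviewed for your environment.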
30 
Performance Tuning Results: KVM /Nova Compute 
Measured gains called out against the recommendations above: 
• 2 to 15% gain 
• ~ 10% gain 
• 40+% gain w/ Host + Guest 
• 8% gain in TPM
31 
Performance Tuning Results: MagnetoDB/Cassandra 
Recommendations: 
• disk: vm.dirty_ratio & vm.dirty_background_ratio – increasing cache may help write workloads that have ordered writes, or writes in bursty chunks 
• “CommitLogDirectory” and “DataFileDirectories” on separate devices for write performance improvement 
• GC tuning of Java heap/new gen – significant latency decreases 
• Tune Bloom Filters, Data Caches, and Compaction 
• Use compression for similar “column families”
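One quick sanity check for the separate-devices recommendation is to compare the device IDs of the two directories. A minimal sketch, assuming example paths (in current cassandra.yaml files the corresponding settings are, to my knowledge, commitlog_directory and data_file_directories):

```python
import os


def on_separate_devices(path_a: str, path_b: str) -> bool:
    """Return True if the two paths live on filesystems with different device IDs."""
    return os.stat(path_a).st_dev != os.stat(path_b).st_dev


# Both paths are placeholders; substitute the directories from your own configuration.
commitlog_dir = "/var/lib/cassandra/commitlog"
data_dir = "/var/lib/cassandra/data"
if os.path.isdir(commitlog_dir) and os.path.isdir(data_dir):
    print("separate devices:", on_separate_devices(commitlog_dir, data_dir))
```

Note that different device IDs only prove different filesystems; two partitions on one physical disk would also pass, so this is a first check rather than proof of separate spindles.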
32 
Performance Tuning Results: MagnetoDB/Cassandra 
Measured gains called out against the recommendations above: 
• 10x pages 
• 25-35% read perf. 
• 5-10% write gains
33 
Summary
34 
Summary: Notes 
• “clouds” are best composed of small services that can be 
independently combined, tuned, and scaled 
• human expectations in the transition from bare metal to cloud 
need to be reset 
• an iterative step-by-step approach is best 
– Test … Tune … Test … Tune … ! 
• lots of complex pieces in a cloud application
35 
Summary: Notes (continued) 
• Compose your services as individual building blocks 
• Tune each component/service independently 
• Then tune the whole system 
• Automation is critical to iterative test/tune strategies!! 
• Performance tuning is absolutely worth the investment 
• Knowing your workloads is still (maybe even more?) critical
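The "Test … Tune … Test … Tune" cycle lends itself to automation. The following is a hypothetical sketch, not the tooling used in this case study: each tuning is a pair of apply/revert callables, a benchmark returns a score where higher is better, and a change is kept only if the score improves. The simulated numbers exist purely to exercise the loop.

```python
from typing import Callable, Dict, List, Tuple

Change = Callable[[], None]


def iterate_tunings(
    measure: Callable[[], float],
    tunings: Dict[str, Tuple[Change, Change]],
) -> List[str]:
    """Apply each tuning in turn and keep it only if the measurement improves.

    measure -- runs the benchmark and returns a score (higher is better)
    tunings -- name -> (apply, revert) callables, one change each
    Returns the names of the tunings that were kept.
    """
    kept = []
    best = measure()  # baseline before any change
    for name, (apply, revert) in tunings.items():
        apply()
        score = measure()
        if score > best:
            best = score
            kept.append(name)
        else:
            revert()  # undo changes that did not help
    return kept


# Simulated system: the numbers are invented purely to exercise the loop.
state = {"rps": 100.0}


def bump(delta: float) -> Change:
    def change() -> None:
        state["rps"] += delta
    return change


tunings = {
    "helpful-change": (bump(+20.0), bump(-20.0)),
    "harmful-change": (bump(-5.0), bump(+5.0)),
}
kept = iterate_tunings(lambda: state["rps"], tunings)
print(kept)
```

Testing one change at a time keeps each result attributable, which matches the advice to tune each component independently before tuning the whole system.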
36 
Questions and 
(hopefully?) Answers 
Let’s talk…
Thank you! 
Copyright © 2014 Symantec Corporation. All rights reserved. Symantec and the Symantec Logo are trademarks or registered trademarks of Symantec Corporation or its 
affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners. 
This document is provided for informational purposes only and is not intended as advertising. All warranties relating to the information in this document, either express or 
implied, are disclaimed to the maximum extent allowed by law. The information in this document is subject to change without notice. 
37 
Shane Gibson 
shane_gibson@symantec.com

Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 

Recently uploaded (20)

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

Performance Tuning a Cloud Application: A Real World Case Study

  • 1. Performance Tuning a Cloud Application
      Shane Gibson, Sr. Principal Infrastructure Architect, Cloud Platform Engineering
  • 2. Agenda
      • About Symantec and Me
      • Key Value as a Service
      • The Pesky Problem
      • Resolving “The Pesky Problem”
      • Performance Tuning Recommendations
      • Summary
      • Q&A
  • 4. The Symantec Team
      • Cloud Platform Engineering
          – We are building a consolidated cloud platform that provides infrastructure and platform services for next-generation Symantec products and services
          – Starting small, but scaling to tens of thousands of nodes across multiple DCs
          – Cool technologies in use: OpenStack, Hadoop, Storm, Cassandra, MagnetoDB
          – Strong commitment to giving back to Open Source communities
      • Shane Gibson
          – Served 4 years in the USMC as a computer geek (mainframes and Unix)
          – Unix/Linux SysAdmin, System Architect, Network Architect, Security Architect
          – Now Cloud Infrastructure Architect for the CPE group at Symantec
  • 5. Key Value as a Service (the “cloud” application)
  • 6. Key Value as a Service: General Architecture
      • MagnetoDB is a key-value store with OpenStack REST and AWS DynamoDB API compatibility
      • Uses a “pluggable” backend storage capability
      • Composite service made up of:
          – MagnetoDB front-end API and Streaming service
          – Cassandra for back-end, key-value-based storage
          – OpenStack Keystone
          – AMQP messaging bus (e.g. RabbitMQ, Qpid, ZeroMQ)
          – Load balancing capabilities (hardware or LBaaS)
  • 7. Key Value as a Service: MagnetoDB
      – API Services Layer
          • Data API
          • Streaming API
          • Monitoring API
          • AWS DynamoDB API
      – Keystone and Notifications integrations
      – MagnetoDB Database Driver
          • Cassandra
  • 9. Key Value as a Service: Cassandra
      – Database storage engine
      – Massively linearly scalable
      – Highly available with no SPoF
      – Other features:
          • tunable consistency
          • key-value data model
          • ring topology
          • predictable high performance and fault tolerance
          • rack and datacenter awareness
  • 11. Key Value as a Service: Other Stuff
      – Need a load balancing layer of some sort
          • LBaaS or hardware
      – Keystone service
      – AMQP service
          • RabbitMQ
  • 15. Key Value as a Service: Putting it all Together
  • 16. The Pesky Problem
  • 17. The Pesky Problem: Deployed on Bare Metal
      • Initial deployment of the KVaaS service on bare metal nodes
      • MagnetoDB API service mixed on the same nodes as Cassandra
          – MagnetoDB's CPU profile vs. Cassandra's disk I/O profile
      • Cassandra directly managing the disks via JBOD (good!)
      • MagnetoDB likes lots of CPU, with direct access to 32 (HT) CPUs
          – Please don’t start me on a HyperThread CPU count rant ☺
      • The KVaaS team's performance expectations were set by this experience!
  • 18. The Pesky Problem: Moved to OpenStack Nova
      • KVaaS service migrated to a “stock” OpenStack Nova cluster
      • Nova Compute nodes set up with RAID 10 ephemeral disks
      • OpenContrail used for SDN configuration
      • Performance of each VM guest roughly 66% of bare metal
      • The KVaaS team was unhappy ☹
  • 19. The Pesky Problem: Moved to OpenStack Nova, cont.
      Performance comparison of “list_tables”:
      • bare metal: 250 RPS / HT core* ☺
      • virtualized: 165 RPS / HT core* ☹
      * results averaged by core since the test beds were different
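
The "roughly 66% of bare metal" figure follows directly from the two per-core numbers on this slide. A minimal sketch of that arithmetic is below; the function names and the total-RPS and core-count inputs used to illustrate per-core normalization are hypothetical and not from the talk.

```python
def rps_per_core(total_rps: float, ht_cores: int) -> float:
    """Normalize a test bed's throughput by its hyper-threaded core count."""
    if ht_cores <= 0:
        raise ValueError("ht_cores must be positive")
    return total_rps / ht_cores


def relative_throughput(virtualized: float, bare_metal: float) -> float:
    """Fraction of bare metal per-core throughput achieved by the guest."""
    if bare_metal <= 0:
        raise ValueError("bare_metal must be positive")
    return virtualized / bare_metal


# Per-core figures taken from the slide.
BARE_METAL_RPS_PER_CORE = 250.0
VIRTUALIZED_RPS_PER_CORE = 165.0

ratio = relative_throughput(VIRTUALIZED_RPS_PER_CORE, BARE_METAL_RPS_PER_CORE)
print(f"virtualized / bare metal = {ratio:.0%}")

# Hypothetical test bed: 8000 RPS measured on a 32-thread node.
example_per_core = rps_per_core(8000.0, 32)
```

Normalizing per core is what makes the comparison meaningful when, as the footnote says, the two test beds differ in size.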
  • 20. The Pesky Problem: The Goal
      • Deploy our KVaaS service as a flexible and scalable solution
      • Ability to use OpenStack APIs to manage the service
      • Cloud-provider-run KVaaS service or tenant-managed service
      • Initial deployment planned for the OpenStack Nova platform
          – Not a containerization service …
          – Though … considering it …
      • Easier auto-scaling, better service packing, flexibility, etc.
      • Explore mixed MagnetoDB/Cassandra vs. separated services
  • 21. Resolving “The Pesky Problem”
  • 22. Resolving the “Pesky Problem”: Approach
      • Baseline the test environment
          – Bare metal deployment and test
          – Mimics the original deployment characteristics
      • Deploy OpenStack Nova
          – Install KVaaS services
      • Performance tune each component
          – Linux OS and hardware configuration
          – KVM hypervisor/Nova Compute performance tuning
          – MagnetoDB/Cassandra performance tuning
  • 23. Resolving the “Pesky Problem”: Testing Tools
      • Linux OS and Hardware
          – perf, openssl speed, iostat, iozone, iperf, dd (yes, really!), dtrace
      • KVM Hypervisor/Nova Compute
          – kvm_stat, kvmtrace, perf stat -e 'kvm:*', SPECvirt
      • MagnetoDB/Cassandra
          – magnetodb-test-bench, jstat, cstar_perf, cassandra-stress
      • General Test Suite
          – Phoronix Test Suite
  • 24. Resolving the “Pesky Problem”: Test Architecture
  • 25. Resolving the “Pesky Problem”: Test Bench
  • 26. Performance Tuning Recommendations
  • 28. Performance Tuning Results: Linux OS and Hardware
      Recommendations:
      • Host:
          – vhost_net, transparent_hugepages, high_res_timer, hpet, compaction, ksm, cgroups
          – Task scheduling tweaks (CFS)
          – Filesystem mount options (noatime, nodiratime, relatime)
          – Tune wmem and rmem buffers!!!
          – Elevator I/O scheduler = deadline
      • Guest:
          – vhost_net or virtio_net, virtio_blk, virtio_balloon, virtio_pci
          – Paravirtualization!
          – Disable system performance gathering; get the information from the host hypervisor tools
          – Elevator scheduler set to “noop”
          – Give guests as much memory as you can (FS cache!)
      Gains called out on the slide (not tied to individual items in this transcript): 7-10x; 30%; 10% less latency; 8x throughput; 2x throughput
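
Two of the host items above lend themselves to a quick check. This sketch assumes the usual sysfs convention in which `/sys/block/<dev>/queue/scheduler` lists the available elevators with the active one in square brackets. For the rmem/wmem buffers it uses the bandwidth-delay product as a sizing rule of thumb, which is an assumption of this example and not something stated in the talk. Function names and sample values are hypothetical.

```python
import re


def active_scheduler(sysfs_text: str) -> str:
    """Return the active elevator from a sysfs scheduler line.

    The kernel marks the active scheduler with brackets,
    e.g. "noop [deadline] cfq".
    """
    match = re.search(r"\[([^\]]+)\]", sysfs_text)
    if match is None:
        raise ValueError(f"no active scheduler marked in {sysfs_text!r}")
    return match.group(1)


def bdp_bytes(bandwidth_bits_per_s: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes: a common starting point
    (an assumption here) for maximum socket buffer sizes."""
    if bandwidth_bits_per_s <= 0 or rtt_ms <= 0:
        raise ValueError("bandwidth and RTT must be positive")
    # bits/s * ms / 1000 gives bits in flight; / 8 converts to bytes.
    return bandwidth_bits_per_s * rtt_ms // 8000


# Sample inputs (hypothetical): a host already on deadline,
# and a 10 Gbit/s link with a 1 ms round trip.
host_elevator = active_scheduler("noop [deadline] cfq")
buffer_hint = bdp_bytes(10_000_000_000, 1)
print(host_elevator, buffer_hint)
```

On a real host the input string would come from reading the sysfs file for each block device; measured results should decide the final buffer sizes.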
  • 30. Performance Tuning Results: KVM/Nova Compute
      Recommendations:
      • Host:
          – Tweak Transparent Huge Pages
          – Bubble up raw devices if possible (warning: migration/portability)
          – Multi-queue virtio-net
          – SR-IOV if you can dedicate a NIC (warning: see the bubble-up warning!)
      • Guest:
          – qcow2 or raw for guest file backing
          – Disk partition alignment is still very important
          – Preallocate metadata (qcow2)
          – fallocate the entire guest image if you can (qcow2; you lose the ability to oversubscribe)
          – Set VM swappiness to zero
          – Async I/O set to “native”
      Gains called out on the slide (not tied to individual items in this transcript): 2 to 15% gain; ~10% gain; 40+% gain with Host + Guest; 8% gain in TPM
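
As an illustration of the guest image items, the sketch below builds a `qemu-img create` command line with qcow2 metadata preallocation and a libvirt disk `<driver>` element with `io='native'`. The image path and size are hypothetical, and pairing `io='native'` with `cache='none'` is an assumption of this example; the slide does not mention a cache mode. Nothing is executed; the code only assembles the command and the XML fragment.

```python
import xml.etree.ElementTree as ET
from typing import List


def qemu_img_create_argv(path: str, size: str,
                         preallocation: str = "metadata") -> List[str]:
    """Build a qemu-img command that creates a preallocated qcow2 image.

    preallocation="metadata" preallocates only qcow2 metadata;
    "falloc" reserves the full image with fallocate.
    """
    if preallocation not in {"off", "metadata", "falloc", "full"}:
        raise ValueError(f"unknown preallocation mode: {preallocation}")
    return ["qemu-img", "create", "-f", "qcow2",
            "-o", f"preallocation={preallocation}", path, size]


def libvirt_disk_driver(image_format: str = "qcow2") -> str:
    """Return a libvirt <driver> element requesting native async I/O."""
    driver = ET.Element("driver", {
        "name": "qemu",
        "type": image_format,
        "cache": "none",   # assumption: not stated on the slide
        "io": "native",
    })
    return ET.tostring(driver, encoding="unicode")


# Hypothetical image path and size.
argv = qemu_img_create_argv("/var/lib/images/kvaas-guest.qcow2", "40G")
driver_xml = libvirt_disk_driver()
print(" ".join(argv))
print(driver_xml)
```

Switching `preallocation` to `"falloc"` corresponds to the "fallocate the entire guest image" item, at the cost of thin provisioning.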
  • 32. Performance Tuning Results: MagnetoDB/Cassandra
      Recommendations:
      • Disk: vm.dirty_ratio and vm.dirty_background_ratio; increasing the cache may help write workloads that have ordered writes or writes in bursty chunks
      • “CommitLogDirectory” and “DataFileDirectories” on separate devices for write performance improvement
      • GC tuning of Java heap/new gen: significant latency decreases
      • Tune Bloom filters, data caches, and compaction
      • Use compression for similar “column families”
      Gains called out on the slide (not tied to individual items in this transcript): 10x pages; 25-35% read perf.; 5-10% write gains
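
To make the two `vm.dirty_*` knobs concrete, here is a small sketch that renders a sysctl fragment and estimates the resulting thresholds in bytes. The ratio values and the memory size are hypothetical, not the values used in the talk. The kernel applies the percentages to available (free plus reclaimable) memory, not total RAM, so the byte figures are only an approximation.

```python
from typing import Dict


def dirty_thresholds_bytes(available_bytes: int, dirty_ratio: int,
                           dirty_background_ratio: int) -> Dict[str, int]:
    """Approximate dirty-page thresholds for a given amount of available memory.

    background: background writeback starts above this much dirty data.
    blocking:   writers are throttled above this much dirty data.
    """
    if not 0 < dirty_background_ratio < dirty_ratio <= 100:
        raise ValueError("need 0 < background ratio < dirty ratio <= 100")
    return {
        "background": available_bytes * dirty_background_ratio // 100,
        "blocking": available_bytes * dirty_ratio // 100,
    }


def sysctl_fragment(dirty_ratio: int, dirty_background_ratio: int) -> str:
    """Render the two settings in sysctl.conf syntax."""
    return (f"vm.dirty_background_ratio = {dirty_background_ratio}\n"
            f"vm.dirty_ratio = {dirty_ratio}\n")


# Hypothetical values for illustration only.
thresholds = dirty_thresholds_bytes(100_000, dirty_ratio=40,
                                    dirty_background_ratio=10)
fragment = sysctl_fragment(40, 10)
print(thresholds)
print(fragment, end="")
```

Larger ratios let more writes accumulate in the page cache before flushing, which is the effect the slide describes for bursty or ordered write workloads; they also increase the amount of data at risk on a crash.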
  • 34. Summary: Notes
      • “Clouds” are best composed of small services that can be independently combined, tuned, and scaled
      • Human expectations in the transition from bare metal to cloud need to be reset
      • An iterative, step-by-step approach is best
          – Test … Tune … Test … Tune …!
      • Lots of complex pieces in a cloud application
  • 35. Summary: Notes (continued)
      • Compose your services as individual building blocks
      • Tune each component/service independently
      • Then tune the whole system
      • Automation is critical to iterative test/tune strategies!!
      • Performance tuning is absolutely worth the investment
      • Knowing your workloads is still (maybe even more?) critical
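
The "Test … Tune … Test … Tune" loop and the point about automation can be sketched as a simple greedy search: apply one candidate setting at a time, re-run the benchmark, and keep the change only if the score improves. Everything here is hypothetical, including the setting names, the scores, and the stand-in benchmark; a real harness would deploy the configuration and run an actual load test.

```python
from typing import Callable, Dict, Iterable, Tuple

Config = Dict[str, str]


def tune_iteratively(baseline: Config,
                     candidates: Iterable[Tuple[str, str]],
                     benchmark: Callable[[Config], float]) -> Tuple[Config, float]:
    """Greedy test/tune loop: keep a change only when the benchmark improves."""
    best_config = dict(baseline)
    best_score = benchmark(best_config)          # Test the baseline first.
    for key, value in candidates:
        trial = dict(best_config)
        trial[key] = value                       # Tune one setting.
        score = benchmark(trial)                 # Test again.
        if score > best_score:
            best_config, best_score = trial, score
    return best_config, best_score


def fake_benchmark(config: Config) -> float:
    """Stand-in for a real load test; returns a made-up RPS-per-core score."""
    score = 165.0
    if config.get("io_scheduler") == "deadline":
        score += 20.0
    if config.get("thp") == "always":
        score -= 5.0
    return score


final_config, final_score = tune_iteratively(
    {"io_scheduler": "cfq", "thp": "madvise"},
    [("io_scheduler", "deadline"), ("thp", "always")],
    fake_benchmark,
)
print(final_config, final_score)
```

Changing one setting per iteration keeps each measured gain attributable to a single change, which matches the advice to tune each component independently before tuning the whole system.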
  • 36. Questions and (hopefully?) Answers
      Let’s talk…
  • 37. Thank you!
      Shane Gibson, shane_gibson@symantec.com
      Copyright © 2014 Symantec Corporation. All rights reserved. Symantec and the Symantec Logo are trademarks or registered trademarks of Symantec Corporation or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners. This document is provided for informational purposes only and is not intended as advertising. All warranties relating to the information in this document, either express or implied, are disclaimed to the maximum extent allowed by law. The information in this document is subject to change without notice.

Editor's Notes

  1. timing: 01 16:41
  2. timing: 03 16:43
  3. timing: 05 16:45
  4. timing: 07 16:47
  5. timing: 09 16:49
  6. timing: 09 16:49
  7. timing: 11 16:51
  8. timing: 11 16:51
  9. timing: 13 16:53
  10. timing: 13 16:53
  11. timing: 13 16:53
  12. timing: 13 16:53
  13. timing: 15 16:55
  14. timing: 17 16:57
  15. timing: 19 16:59
  16. timing: 19 16:59
  17. timing: 21 17:01
  18. timing: 23 17:03
  19. timing: 25 17:05
  20. timing: 27 17:07
  21. timing: 29 17:09
  22. timing: 31 17:11
  23. timing: 31 17:11
  24. timing: 33 17:13
  25. timing: 33 17:13
  26. timing: 35 17:15
  27. timing: 35 17:15
  28. timing: 37 17:17
  29. timing: 39 17:19