Although the idea of doing performance testing throughout the software lifecycle sounds simple enough, as soon as you try to combine the concepts of “always testing” (in dev, pre-prod, and production) with “limited time and resources” and throw in the word “comprehensive,” the challenges can be monumental. Quickly the “how” of it emerges as the most important question—and one worth focusing on. Brad Stoner tackles this topic by explaining how he has been able to solve this seemingly impossible puzzle by applying various approaches such as early and often, learning when to say no, and seriously, I did say no—and more. Brad shares concrete examples of how he has successfully implemented full lifecycle performance testing at several companies. Join Brad to learn what performance tests to run at each development and delivery stage—from a simple load profile on a single server to full-scale soak tests over several days.
Comprehensive Performance Testing: From Early Dev to Live Production
1. T15 Performance Testing 10/6/16 13:30
Comprehensive Performance Testing: From Early Dev to Live Production
Presented by: Brad Stoner, AppDynamics
Brought to you by:
350 Corporate Way, Suite 400, Orange Park, FL 32073
888-268-8770 · 904-278-0524 - info@techwell.com - http://www.starwest.techwell.com/
2. Brad Stoner
Brad Stoner is a Senior Sales Engineer with AppDynamics. In his fourteen years of IT experience, Brad has held roles in performance engineering, systems engineering, and operations management. Previously, Brad managed the load and performance team at H&R Block, where he spent seven years leading his five-person team in pursuit of improved application performance and quality. Brad and his team managed the performance testing process for more than fifty projects annually. He founded Sandbreak Digital Solutions, a consulting company specializing in web application performance testing, web page optimization, front-end optimization, capacity testing, infrastructure validation, and cloud testing.
4. My background
• 7 years @ H&R Block Load and Performance Team
• 5 person team
• 100k + user concurrency
• Tax peak 2nd week after go-live
• 70 applications annually
• Diverse technology stack – including 3rd party
• 2 years @ Neotys – Performance testing software vendor
• Currently Sales Engineer @ AppDynamics
brad.stoner@appdynamics.com
@sandbreak80
5. What is performance testing
In software engineering, performance testing is, in general, a testing practice performed to determine how a system performs in terms of responsiveness and stability under a particular workload. It can also serve to investigate, measure, validate, or verify other quality attributes of the system, such as scalability, reliability, and resource usage.*
• Load Test
• Performance Test
• Stress Test
• Scalability Test
• Capacity Test
• Endurance Test
• Workload Test
• Device, FE, BE, end-to-end
* https://en.wikipedia.org/wiki/Software_performance_testing
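The definitions above can be made concrete with a tiny load-test sketch: spin up a disposable local HTTP server, drive it with a handful of concurrent virtual users, and report latency percentiles. Everything here (the server, thread counts, request counts) is an illustrative assumption, not tooling from the talk.

```python
# Minimal load-test sketch: throwaway local HTTP server + concurrent
# requests + latency percentiles. Numbers are illustrative only.
import http.server
import statistics
import threading
import time
import urllib.request

class QuietHandler(http.server.SimpleHTTPRequestHandler):
    def log_message(self, *args):  # silence per-request logging
        pass

# Start a local server on an ephemeral port so the demo is self-contained.
server = http.server.HTTPServer(("127.0.0.1", 0), QuietHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_address[1]}/"

results = []  # per-request latencies in seconds (list.append is thread-safe)

def virtual_user(n):
    for _ in range(n):
        t0 = time.perf_counter()
        urllib.request.urlopen(url).read()
        results.append(time.perf_counter() - t0)

threads = [threading.Thread(target=virtual_user, args=(10,)) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
server.shutdown()

latencies_ms = sorted(r * 1000 for r in results)
p95 = latencies_ms[int(0.95 * (len(latencies_ms) - 1))]
print(f"requests={len(latencies_ms)} "
      f"median={statistics.median(latencies_ms):.1f}ms p95={p95:.1f}ms")
```

Real tools (LoadRunner, NeoLoad, JMeter, etc.) add think time, ramp-up, pacing, and correlation on top of this same core loop.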
6. Why bother?
Google - Using page speed in site ranking
Facebook - Launches 'lite' mobile app
Amazon - 100ms delay -> $6.79M sales decrease
Recent airline industry outages
7. Legacy performance testing
• Test after QA and right before launch / deployment to prod
• Test entire application in war room
• Complex workloads and use cases
• 3-5 weeks to complete
• 3-5 days to script single use case
• Difficult to pinpoint root cause
• Test high volume, long duration
• Peel-back-the-onion approach – GIANT ONION (more later)
• Test system capacity and scalability
• Code focused – only if needed
• Require code freeze
• Potentially expensive and time-consuming changes
8. Increasing velocity
• Customers want everything faster
• Business demands quicker time to market
• Reduce risk and pain of ‘giant deployments’
• Resolve defects faster at a lower cost
• Stay competitive
• … Performance testing isn't historically fast
9. Main challenges
• Time
• Need to rescript use cases
• Fluid environments – both software and infrastructure
• Sequential workflows
• Test Data management and synthesis
• Complex load profiles
• Multiple user profiles
• Integrations
• Synchronizing builds and functionality
10. Keeping up with Agile / DevOps
• “It takes 2 weeks to script all our use cases and we get releases every 3 days”
• “The application is too difficult to test”
• “We are moving to agile on our legacy waterfall project. How do we get started?”
• “QA will always be the bottleneck”
• “Issues are difficult to reproduce and our environment is unstable”
• “We don’t have visibility into our infrastructure”
• “If we find an issue, it still takes a week to fix”
• “We have no idea what changed in the application or why we are testing it again”
15. Dev / QA
• Front End Optimization (cache, minimize, round trips, content size, compression)
• Code issues (concurrency, locking, blocking, deadlock, single threaded)
• Queue build-up
• Code level performance (method / class)
• Slow responses (functional load)
• Issues with memory allocation
• 3rd party code or frameworks
• Having debug enabled
• JS execution times
• Sync vs async calls
• Unlimited queries (return all rows)
• Caching (code / object)
• Excessive DB queries
• Logging levels
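One item from the list above, excessive DB queries versus caching, can be sketched in a few lines: the same lookup issued once per request versus served from an in-process cache. The table, data, and query counter below are invented for the demo.

```python
# Illustrative sketch of the "excessive DB queries" / caching item:
# re-querying an unchanging value per request vs. caching it once.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE config (key TEXT PRIMARY KEY, value TEXT)")
conn.execute("INSERT INTO config VALUES ('feature_x', 'on')")
query_count = 0  # counts actual round trips to the database

def get_config_uncached(key):
    global query_count
    query_count += 1
    row = conn.execute("SELECT value FROM config WHERE key = ?", (key,))
    return row.fetchone()[0]

_cache = {}
def get_config_cached(key):
    if key not in _cache:           # only hit the DB on a cache miss
        _cache[key] = get_config_uncached(key)
    return _cache[key]

for _ in range(1000):               # simulate 1000 requests
    get_config_uncached("feature_x")
print("uncached queries:", query_count)   # 1000

query_count = 0
for _ in range(1000):
    get_config_cached("feature_x")
print("cached queries:", query_count)     # 1
```

Under load, the difference between 1 and 1000 queries per thousand requests is exactly the kind of issue a short dev-stage load test surfaces cheaply.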
16. Staging / Pre-Prod
• Memory leaks
• Thread exhaustion
• User limits
• Garbage collection (STW)
• Stored procedure inefficiencies
• Missing indexes / schema issues
• DB connection pool issues
• Keep-alive issues
• Data size issues
• Issues with virus scan / security software
• 3rd party integrations
• Internal integrations
• CPU limitations
• Memory limitations
• Configuration issues / default install – Huge!
• Data growth issues
• Connection cleanup
• Using only clean data
• Swappiness
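A minimal sketch of how a soak test can flag the first item on this list, a memory leak: sample a memory metric over the run and fit a least-squares slope; a sustained upward trend is suspicious. The samples and threshold below are simulated assumptions; a real test would read RSS from the servers under test.

```python
# Soak-test leak detection sketch: fit a least-squares slope to
# periodic memory samples and flag a sustained upward trend.
# All figures are simulated for illustration.
def slope(samples):
    """Least-squares slope of samples vs. their index."""
    n = len(samples)
    xs = range(n)
    mx = sum(xs) / n
    my = sum(samples) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, samples))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

healthy = [512, 515, 510, 514, 511, 513, 512, 516]  # MB, flat with noise
leaking = [512, 520, 531, 540, 552, 561, 570, 583]  # MB, steadily climbing

LEAK_THRESHOLD_MB_PER_SAMPLE = 1.0  # assumed tolerance per sample interval
for name, series in [("healthy", healthy), ("leaking", leaking)]:
    s = slope(series)
    verdict = "LEAK?" if s > LEAK_THRESHOLD_MB_PER_SAMPLE else "ok"
    print(f"{name}: slope={s:.2f} MB/sample -> {verdict}")
```

The same trend check applies to connection counts, thread counts, and disk usage over a multi-hour or multi-day endurance run.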
17. Prod / Perf
• Load balancing (active / active, device, VIP)
• Firewall performance issues
• SAN performance issues
• Socket / connection issues
• Bandwidth limitations
• # of servers required
• CDN issues
• Geographic limitations
• Backups causing issues
• Clustering issues / failover issues
• Issues with shared services (AD, SSO)
• Disk performance issues
• Data replication performance issues
• Performance impact of scheduled tasks
• Load balancing and persistence
• Firmware / BIOS issues
• Proxy limitations
• Proxy / edge caching / FE caching
• DDOS / IDS configuration issues
• ISP limitations
• Noisy neighbors - virtualization
• Bad server in farm
• Switch / link configuration
• PDU / power / overheating issues
• OS limitations / tuning
• Disk space issues
• SAN caching
18. Mobile web/app example
[Flow diagram: the pipeline runs from "APIs released / BE functionality" through "Mobile site built", "Mobile site released", "Mobile app built", and "Mobile app released", with a testing activity attached to each stage: Dev testing – APIs; QA testing – API flows; Build automation; Baseline Pre-Prod / Staging – platform; Staging testing – capacity w/ UI and API; Prod / Perf testing (inside firewall) – stability / scalability; Prod / Perf testing (outside firewall) – network / load balancing; plus Front End Optimization and optimizing app chatter and network resources.]
19. What if legacy testing principles were applied?
[Flow diagram: the same pipeline ("APIs released / BE functionality", "Mobile site built", "Mobile site released", "Mobile app built", "Mobile app released"), but with only the late-stage activities remaining: Baseline Pre-Prod / Staging – platform; Staging testing – capacity w/ UI and API; Prod / Perf testing (inside firewall) – stability / scalability; Prod / Perf testing (outside firewall) – network / load balancing; Front End Optimization; and optimizing app chatter and network resources.]
20. What worked?
• App was developed with testability in mind
• Testing earlier in the development of the mobile app
• Back end testing was completed prior to releasing any front end!
• Isolation of back end and front end
• Fast feedback for bad builds (functional, errors, performance)
• Build validation with automation led to faster iterations
• Reusability of test cases between teams and environments
• Baseline infrastructure; then focus on code changes
• Create experiments to isolate defect targets
• Document which defects were caught in each environment; refine
• Using production-like data sets as early as possible!
• Build test cases as close to dev as possible
23. • Pinpoint root cause!
• “It’s slower. Is it the database, or the application server, or the code?”
• Used for initial tuning and build testing
• Ensure visibility into infrastructure – automatically updated
• Get resources out of war rooms
• Remove the need to manually set up monitoring in performance tools
• Compare build performance
• Correlate performance and load
• Eliminate re-testing – some issues are difficult to reproduce (backups, virus scan updates, other issues outside the application)
• Compare performance in different environments where the code is the same – what’s different?
• Pass / fail based on performance – extend performance tools or use less sophisticated performance tools
27. Result of test automation
• Ensure platform capacity and scalability
• Identify and fail poor performing builds automatically
• Performance tests can include functional validations
• Establish and monitor performance trends
• Identify performance issues early
• That’s great… what do we do if we discover a performance issue?
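The "identify and fail poor performing builds automatically" idea can be sketched as a simple CI gate: compare each key transaction's timing against a stored baseline and fail the pipeline if anything regresses past a tolerance. The transactions, timings, and 20% tolerance below are all invented for illustration.

```python
# CI performance-gate sketch: fail the build if any key transaction
# regresses more than TOLERANCE beyond its stored baseline.
# Transactions, timings, and the 20% policy are invented examples.
TOLERANCE = 1.20  # fail if > 20% slower than baseline (assumed policy)

baseline_ms = {"login": 180, "search": 250, "checkout": 400}
current_ms  = {"login": 185, "search": 340, "checkout": 395}

failures = []
for txn, base in baseline_ms.items():
    cur = current_ms[txn]
    if cur > base * TOLERANCE:
        failures.append(f"{txn}: {cur}ms vs baseline {base}ms")

if failures:
    print("BUILD FAILED:")
    for f in failures:
        print(" ", f)
else:
    print("BUILD PASSED")
```

In a real pipeline the baseline would come from the last passing build's results and the gate would return a nonzero exit code, so the build server marks the build red without anyone reading a report.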
34. When to say no
• Not baselining infrastructure (there are costs associated with auto-scaling)
• ‘Green Stamp’
• Empty database testing
• Capacity testing with a fraction of the web / app tier
• Testing 50 VUs and certifying 500 VUs
• 0 think time testing
• 10 minute performance tests
• No pass / fail criteria
• Just tell me if it will break
• … we will add caching later
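Two of these "say no" items, zero think time and certifying 500 VUs from a 50-VU test, come down to Little's Law: concurrent users ≈ throughput × (response time + think time). A quick sketch with made-up figures shows why dropping think time changes the load a VU count implies by an order of magnitude.

```python
# Little's Law sketch: the throughput implied by a virtual-user count
# depends heavily on think time. All figures are illustrative.
def throughput_per_sec(vus, response_s, think_s):
    """Implied request rate for a closed workload of `vus` users."""
    return vus / (response_s + think_s)

vus = 500
response_s = 1.0  # assumed average response time

print("500 VUs, 30s think:", throughput_per_sec(vus, response_s, 30.0), "req/s")  # ~16.1
print("500 VUs, 0s think: ", throughput_per_sec(vus, response_s, 0.0), "req/s")   # 500.0
```

With 30 s of think time, 500 VUs imply roughly 16 req/s; with zero think time the same 500 VUs imply 500 req/s, about 31× the load. A "0 think time" run therefore certifies a very different system than the one users will actually exercise, which is exactly why it belongs on the "say no" list.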