Practical stress testing and tuning of Cocoon applications.
It is best to know, before you go live, whether your new site will stand up to the traffic you expect it to receive.
There is only so much you can tell about the speed and capacity of a web project from browsing it by hand.
This talk offers practical advice, garnered from real-life experience, about how to measure and tune the performance of web applications, using free tools like JMeter, the Open Source load-tester from Apache.
The talk will cover planning your tests, setting up the tools, advice on what data to capture, how to interpret it, and some of the possibilities for tuning your project.
It will help you answer questions like: How many users will it take to break my site? How can I handle more users, faster? How reliable will my site be? How can I measure the effect of my changes during development? How do I compare different implementations of the same functionality? How do I determine the right size for my server infrastructure?
The photo-credit links are missing from the Flash version, here they are:
<ul>
<li><a href="http://www.flickr.com/people/jblndl/">Môsieur J</a> (<a href="http://www.flickr.com/photos/jblndl/414818918/">Untitled</a>)</li>
<li><a href="http://www.flickr.com/people/snype451/">Brian Uhreen</a> (<a href="http://www.flickr.com/photos/snype451/126780256/">Sometimes; you have to take life by the horns...</a>)</li>
<li><a href="http://www.flickr.com/people/pgoyette/">Paul Goyette</a> (<a href="http://www.flickr.com/photos/pgoyette/168267714/">interlocking</a>)</li>
<li><a href="http://www.flickr.com/people/fabiovenni/">Fabio Venni</a> (<a href="http://www.flickr.com/photos/fabiovenni/271175015/">D'oh!</a>)</li>
<li><a href="http://www.flickr.com/people/thespeak/">Eric</a> (<a href="http://www.flickr.com/photos/thespeak/150544582/">3</a>)</li>
<li><a href="http://www.flickr.com/people/woneffe/">woneffe|Cotuit</a> (<a href="http://www.flickr.com/photos/woneffe/76448398/">demolition</a>)</li>
</ul>
Break My Site
1. a talk for CocoonGT 2007 by Jeremy Quinn
Break My Site
practical stress testing and tuning of Cocoon applications
photo credit: Môsieur J
1
This is designed as a beginner’s talk. I am the beginner.
2. scenario
[Graph: "Original" configuration. Page Load (secs), Throughput (pages/min) and Stability %, plotted against Users (1 to 100). Approximately 600 hits per data-point.]
2
My starting point for optimisation
You may be able to see how useless one person clicking in a browser is for testing; the server
never gets loaded.
Disregard the absolute speed values; this was done on my laptop.
3. scenario
[Graph: "Original" configuration, same axes as before. Annotations mark the average seconds per page climbing steeply ("OUCH !!!!") and the throughput in pages per minute collapsing ("EEEK !!!"). Approximately 600 hits per data-point.]
3
The original server configuration, taken to only 30 users, with disastrous effect.
Throughput drops to single figures, page load time climbs sky-high, and stability drops by
nearly half.
From the logs I could see there were not enough threads or memory to cope.
This is badly broken.
4. scenario
[Graph: accumulated optimisations, all panels at the same scale. Two panels, "Original" and "Memory & Thread Optimisation", each plotting Page Load (secs), Throughput (pages/min) and Stability % against Users (1 to 100), with a dashed line marking what's possible. Approximately 600 hits per data-point.]
4
Memory and Thread optimisation has made a tremendous difference.
Now I can easily test up to 100 users, before I could not go past 30.
Page Load time, Throughput etc. now scale linearly with the increase in users.
At some point this configuration will also ‘break’; you need to know where that point is
so that you can plan to avoid it.
5. scenario
[Graph: accumulated optimisations, all panels at the same scale. Three panels: "Original", "Memory & Thread Optimisation" and "Read Optimisation", each plotting Page Load (secs), Throughput (pages/min) and Stability % against Users (1 to 100), with a dashed line marking what's possible. Approximately 600 hits per data-point.]
5
The server is doing slightly less work due to this optimisation
This shows as the gradients flattening out a bit more.
6. scenario
[Graph: accumulated optimisations, all panels at the same scale. Four panels: "Original", "Memory & Thread Optimisation", "Read Optimisation" and "List Optimisation", each plotting Page Load (secs), Throughput (pages/min) and Stability % against Users (1 to 100), with a dashed line marking what's possible. Approximately 600 hits per data-point.]
6
Another optimisation, further reducing the work done on the server.
Flatter curves, better performance.
This gives you an idea of the scale of improvement you may be able to make.
In my case it is quite dramatic!
7. purpose
what do I hope to achieve?
photo credit: Brian Uhreen
7
Load testing can be used to solve different sorts of problems.
You don’t want your site broken by popularity.
Find out how much traffic will make a site break, so you can plan to cope.
8. comparing changes
• code optimisations
• different implementations
• caching strategies
• different datasources
• different server architectures
8
Load testing can be used during development to test changes.
9. test content
• test the content of pages
• test the behaviour of webapps
9
JMeter also has exhaustive tools for crawling and testing content within web pages.
It can also send parameters as POST requests etc. to help you test webapps.
10. gauge your server
• get a specification
• how many of what power of machine?
• choosing and tuning a load-balancer
10
Before you start to work out what servers you need, you have to know what level of traffic the
site is intended to cope with.
This has to come from either the implementation you are replacing or from marketing.
Test the machines you plan to use, so you know what capacity they can handle.
Round-robins with machines of different capacities may work really badly.
11. planning
what am I going to do?
photo credit: Paul Goyette
11
Plan well and hopefully you will get useful data
12. what can I change?
• implementations
• server configuration
• memory
• threads
• system architecture
12
What kind of changes do you want to compare?
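For the memory and thread settings specifically, the knobs look roughly like this. A minimal sketch only: the values below are invented starting points, not recommendations from the talk, and the right numbers can only be found by testing.

```shell
# Illustrative JVM options for the container running the webapp.
# -Xms/-Xmx set the initial/maximum heap size; -verbose:gc logs garbage
# collection, so GC pauses can be matched to spikes on the JMeter graph.
# These values are invented examples; find yours by testing.
JAVA_OPTS="-Xms512m -Xmx1024m -verbose:gc"
export JAVA_OPTS
echo "$JAVA_OPTS"
```

Note that the request-handling thread pool is normally sized in the servlet container's own configuration (for example Jetty's thread-pool settings), not with JVM flags.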
13. what will happen?
• build a mental model
• predict what you expect to happen
• learn to read the data
• test your ideas against reality
13
As your tests run, relate events in the server logs to how the graphs look
14. order of change
• only make one change at a time
• only make one change at a time
• only make one change at a time
• only make one change at a time
14
If there are several changes you can make,
it is important to plan what order you make them in.
15. order of change
• have you got that?
15
If you make more than one change at a time
you don’t know which change had what effect,
which makes it more difficult to build a mental model of what is happening.
16. order of change
• in the most meaningful order
• which depends on your purpose
16
I tested optimisations before playing with threads and memory
For each change, I ran the tests again.
Optimisations may require different ideal thread and memory settings.
17. what tools?
on MacOSX I used :
• JMeter
• Terminal
• Console
• Activity Monitor
• Grab
17
The tools I used to capture and watch real-time performance and issues
18. what tools?
on MacOSX I used :
• JMeter
• Terminal
• Console
• Activity Monitor
• Grab
18
I found it was fine to run JMeter on the server for running low level tests.
It is better to run JMeter on another machine though.
For very high throughput testing, you can aggregate results from many JMeter clients.
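Aggregating results from several clients can be as simple as interleaving their result files by timestamp. A sketch, assuming CSV output with the timestamp in the first column; the file names and the simplified four-column layout are invented for the example.

```shell
# Merge result files from two JMeter client machines into one,
# ordered by timestamp (first CSV column), for combined analysis.
cat > clientA.jtl <<'EOF'
1000,120,home,200
3000,130,home,200
EOF
cat > clientB.jtl <<'EOF'
2000,150,home,200
4000,110,home,200
EOF
# Numeric sort on the timestamp field interleaves the two runs correctly
sort -t, -k1,1n clientA.jtl clientB.jtl > combined.jtl
cat combined.jtl
```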
19. what tools?
on MacOSX I used :
• JMeter
• Terminal
• Console
• Activity Monitor
• Grab
19
I can watch Cocoon in its Terminal window.
See errors as they happen
20. what tools?
on MacOSX I used :
• JMeter
• Terminal
• Console
• Activity Monitor
• Grab
20
I used the Console to filter and watch Jetty’s logs for the repository.
21. what tools?
on MacOSX I used :
• JMeter
• Terminal
• Console
• Activity Monitor
• Grab
21
In Activity Monitor I can track threads, memory and CPU usage.
Here we can see the two Java processes.
The top one is Cocoon, the bottom one the repository.
Here you can see that, due to bottlenecks, the system is being under-utilised.
IMHO both cores should be working fully on a well-tuned system.
22. setup
what do I do to get ready?
photo credit: Fabio Venni
22
getting started
23. make a test
keep it simple, stupid
23
Here is the simplest way to use JMeter
24. start JMeter
24
Starting JMeter from the command-line
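In rough terms the commands look like this; the test-plan file name is an invented example, while `-n`, `-t` and `-l` are JMeter's standard non-GUI, test-plan and results-file flags.

```shell
# From the bin directory of the JMeter install: launch the GUI
./jmeter

# For actual load runs, non-GUI mode puts less load on the client machine:
#   -n  run in non-GUI mode
#   -t  the test plan to execute (file name here is hypothetical)
#   -l  the file to log sample results to
./jmeter -n -t break-my-site.jmx -l results.jtl
```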
25. start a plan
25
Start a new plan, give it a name
26. threads
26
Set up the number of threads you want to test with
27. request
27
Add an HTTP Request to the Thread Group
Set up the server’s IP, port and path
28. recording
28
A graph for plotting results
If you are running many tests, make sure you give everything suitable names and comments
so you know what you were testing.
The test output can be written to file for further analysis.
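Once the output is in a file, even a one-liner can pull out the failures. A sketch, assuming a simplified four-column CSV of timestamp (ms), elapsed (ms), label and response code; real JMeter result files have more columns and a configurable format.

```shell
# Scan a (simplified) JMeter results file for failed samples.
cat > sample.jtl <<'EOF'
1000,120,home,200
1500,140,home,200
2200,310,home,500
EOF
# Print any sample whose response code is not 200
awk -F, '$4 != 200 {print "FAIL at", $1, "code", $4}' sample.jtl
# prints: FAIL at 2200 code 500
```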
29. check everything
• content of datafeeds
• behaviour of pipelines
• outcome of transformations
29
Check every stage, so you know everything is working properly and set up as you expected.
Do this before you start testing.
Otherwise you might not be testing what you think you are.
30. caching
• repository?
• datasource?
• pipeline?
30
Make sure you know the state of all the caching,
so it matches what you want to test.
31. warm-up
• hit your test page(s) with a browser
• last chance for a visual check
• get Cocoon up to speed
31
Warm-up Cocoon.
Hit the pages you want to test in a browser to make sure they work.
32. be local
• clients and server on the same subnet
• avoid possible network issues
32
Be on the server's local subnet.
33. enough clients?
• creating many threads
• creates its own overhead
33
Have enough clients to perform the level of testing you need.
34. quit processes
• everything that is unnecessary
• periodic processes skew your results
34
Letting something like your email client fire off periodic checks for new mail will skew your
results.
35. record your data
• server logs
• JMeter graphs
• JMeter data file (XML or CSV ?)
• screen grabs of specific events
• test it !!!!!
35
Prepare properly, to make sure all of your data will be recorded.
Grab relevant sections of logs.
Grab graphs after a run.
Choose what data to write to a file.
36. running tests
time for a cup of tea?
photo credit: thespeak
36
No, you have lots to do.
37. run-time data
• clear log windows on each run
• keep it organised
• use a file or folder naming convention
37
Record the real-time data you need.
I clear the log windows for each run so that, after the run is finished, I can easily scroll back
looking for exceptions etc.
Use naming conventions for multiple runs.
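One possible convention (the folder layout and names here are my own invention): a timestamped directory per run, with the variable under test encoded in the name.

```shell
# One timestamped folder per run, variable under test in the folder name.
run="results/$(date +%Y%m%d-%H%M%S)-threads30-memopt"
mkdir -p "$run"
: > "$run/results.jtl"   # point JMeter at this file with -l "$run/results.jtl"
ls "$run"
```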
38. watch it run
• part of the learning process
• see where the problems happen
• see what you are fixing
• make the link between cause and effect
• take snapshots
38
Watch the tools as the tests run.
Learn to read the graphs.
39. my screen
39
I can follow errors and timings changing in Cocoon’s output.
Errors and query times changing for the repo, in Jetty’s logs.
Thread, memory and cpu usage in Activity Monitor.
40. interpreting
the results
what does it all mean?
photo credit: woneffe
40
The key for me, to begin to learn how to interpret the results, was to watch the tests live,
relating what appeared on the graph to what was happening (errors, slowdowns, garbage
collection etc.) on the server.
41. reading the graph
• load speed
• throughput
• deviation
• hits
41
NB. Unless you are testing on real hardware, these speeds have no meaning in isolation
but are useful for comparison
42. reading the graph
• load speed
• throughput
• deviation
• hits
42
Page load speeds in milliseconds.
Lower is better
On a stable system, it should go flat
43. reading the graph
• load speed
• throughput
• deviation
• hits
43
Throughput in pages per second.
Higher is better
On a stable system, it should go flat
44. reading the graph
• load speed
• throughput
• deviation
• hits
44
The deviation (variability) of the load speed in milliseconds
Lower is better
On a stable system, it should go flat
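All three of these numbers can be recomputed from the raw samples. A sketch, again assuming a simplified CSV layout of timestamp (ms), elapsed (ms), label and response code; the data is invented for the example.

```shell
# Recompute the graph's numbers from raw samples.
cat > results.jtl <<'EOF'
1000,100,home,200
2000,200,home,200
3000,300,home,200
4000,200,home,200
EOF

# Average load time, and deviation (standard deviation of elapsed times)
awk -F, '{s+=$2; q+=$2*$2; n++}
         END {m=s/n; printf "avg=%.0fms dev=%.0fms\n", m, sqrt(q/n - m*m)}' results.jtl
# prints: avg=200ms dev=71ms

# Throughput: samples per second across the span of the run
awk -F, 'NR==1 {t0=$1} {t1=$1; n++}
         END {printf "throughput=%.1f/sec\n", n*1000/(t1-t0)}' results.jtl
# prints: throughput=1.3/sec
```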
45. reading the graph
• load speed
• throughput
• deviation
• hits
45
The black dots are individual sample times
46. reading the graph
• load speed
• throughput
• deviation
• hits
46
Sometimes JMeter draws data off the graph.
I was using a beta; this may have been fixed in the latest release.
47. chaotic start
• test for long enough
• JMeter ramps threads
• JVMs warm up
• Servlet adds threads
47
Your tests need to run for long enough for you to get meaningful results.
I ran a minimum of 600 hits per test
48. what’s this sawtooth?
• hysteresis?
• randomise requests?
48
You can sometimes see a sawtooth in the throughput.
What causes this?
Is it important?
This is what I think is happening ....
Maybe you could remove the effect by randomising requests.
49. what’s this sawtooth?
processing
• hysteresis?
• randomise requests?
49
Throughput slows as jobs build up and threads process.
50. what’s this sawtooth?
• hysteresis?
• randomise requests?
sending
50
Responses come in spurts as jobs complete.
JMeter issues new requests.
51. what’s this sawtooth?
• hysteresis?
• randomise requests?
51
The change in gradient can tell you how the rate of increase in throughput is changing (?)
52. shi‘appen
52
Learn to relate problems on the server to their effect on the graph.
Big spikes indicate that you have problems.
You may see the effects of:
exceptions
garbage collection
53. good signs
• speed flattening
• throughput flattening
• deviation dropping
• no exceptions ; )
53
When the values on the graph begin to flatten out,
it shows that the system has become stable at that load.
54. more tests
• authorise
• fill in data
• upload files
• wrap tests in logic
• etc. etc.
54
I used the simplest possible configuration in JMeter.
I was only hitting one url.
It does a lot more.
55. more protocols
• JDBC
• LDAP
• FTP
• AJP
• JMS
• POP, IMAP
• etc. etc.
55
You can apply tests to many different parts of your infrastructure.
56. conclusions
• design to cope
• test early, test often
• plan well
• pay attention
• capture everything
• maintain a model
56