Profiling and Tuning a Web Application - The Dirty Details

Profiling & Tuning a Web Application

Kaelen Proctor

About Me
• Joined Achievers in May 2010
• Senior Software Developer @ Achievers
• Professional Experience:
– Java web applications w/Spring + Hibernate
– JavaScript frameworks… without jQuery!
– PHP with CodeIgniter
– MySQL, MySQL, MySQL

Profiling your Web Application

So what if my site isn’t the fastest?
• Response time directly relates to your
business
– In 2007 Amazon determined that a 100ms
increase in load time would cause a 1% drop in
sales
– In 2009 Shopzilla decreased page load time from 6
to 1.2 seconds, which netted a 7-12% conversion
rate increase!
• The slower you serve up pages, the more
frustrated your customers become

What exactly is profiling?
• Profiling is a dynamic analysis of the time
complexity, frequency/duration of function
calls, or memory allocation of a program
• A profiling tool runs this analysis by
instrumenting either the source or
executable, through a variety of techniques
including event hooks or dynamic
recompilation

Goals
• To show off some tools of the profiling trade
• To demonstrate how to use them effectively
and identify the biggest “bang for your buck”
bottlenecks
• To impress upon you the need to integrate
profiling early and continuously into your
development cycle

Agenda
1. Profiling your application
I. What a single request looks like
II. The database (MySQL)
III. Application code (PHP)
IV. In the Browser
2. Maintaining performance at scale
I. Load testing
II. Production monitoring

Before you embark
• What is your performance goal?
• Where is that relative to today?
• What processes are necessary to maintain the
goal?

Profiling for a Web Application
• Web applications are all about speed: how
quickly a response can be sent and become usable
• On the app server, that means understanding
the queries run, third-party libraries, web APIs,
and application code
• Simple can sometimes be best
– We use CodeIgniter (CI) here and its built-in
request profiler is easy to use and extremely
helpful

Enabling the CI profiler
• Drop this line anywhere before the controller
ends:
– $this->output->enable_profiler(true);
• The output code injects the profiling content
just before the closing </body> tag
• Let’s see what it looks like on the demo site: Special K’s
Video Rentals

How do the profile details help?
• Great overview of what is occurring in the
request
• The executed queries are the most important
aspect of a profile
– Identification of long-running or duplicate queries
• Adding timing benchmarks can give a lot of
insight
– Especially if you leverage a lot of 3rd party libraries
or web services
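To illustrate what a timing benchmark around a third-party call might look like, here is a minimal sketch in Python rather than PHP (in CodeIgniter the equivalent would be marks set through its Benchmark class). The names `benchmark`, `timings`, and `call_third_party_service` are hypothetical and exist only for this example.

```python
import time
from contextlib import contextmanager

timings = {}  # label -> accumulated wall-clock seconds


@contextmanager
def benchmark(label):
    """Accumulate the time spent inside the with-block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = timings.get(label, 0.0) + time.perf_counter() - start


def call_third_party_service():
    # Hypothetical stand-in for a slow library or web API call.
    time.sleep(0.01)


with benchmark("third_party_service"):
    call_third_party_service()

# Print the slowest labels first.
for label, seconds in sorted(timings.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label}: {seconds * 1000:.1f} ms")
```

Wrapping each external call in its own label makes it obvious which dependency dominates the request time.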

It’s more of a guideline
• Most likely, you’re profiling on your dev
machine with test data
• No idea how the request will scale
– No competition for resources (e.g. the database)
– Have you profiled all possible configurations?
• Is anyone profiling or even paying attention to
the details?

… In my experience, they aren’t
• No matter how much documentation you
write on how to profile and which tools to use,
it will get dropped in crunch time
• Most developers didn’t even turn the profiler
on

Achievers Performance Header ™
• Leveraging CI’s profiler, we tie the profiling
summary with our performance targets
• Text is colour-coded on a linear scale from green
to red: the further the request is from our
targets, the redder the text
• Expanding the header shows the summarized
performance details
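The colour coding can be sketched roughly as follows. This is not the actual Achievers header code; it is a Python illustration that assumes the scale runs from green at the target to full red at twice the target (the `worst_multiple` parameter is an assumption, as are all function names).

```python
def target_ratio(measured, target, worst_multiple=2.0):
    """Map a measurement to 0.0 (at or under target) .. 1.0 (at or beyond
    worst_multiple times the target). worst_multiple is an assumed cut-off."""
    if target <= 0:
        raise ValueError("target must be positive")
    overshoot = (measured - target) / (target * (worst_multiple - 1.0))
    return max(0.0, min(1.0, overshoot))


def ratio_to_hex(ratio):
    """Linearly interpolate from green (#00ff00) to red (#ff0000)."""
    red = round(255 * ratio)
    green = round(255 * (1.0 - ratio))
    return f"#{red:02x}{green:02x}00"


def header_colour(measured, target):
    """Colour for one metric, e.g. response time in ms against its target."""
    return ratio_to_hex(target_ratio(measured, target))


print(header_colour(150, 200))  # under target: pure green
print(header_colour(400, 200))  # twice the target: pure red
```

The same ratio could also drive the "expand by default" behaviour: any metric with a ratio above zero has passed its target.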

Shoving it in their faces

Now once any target’s threshold is passed, the header
defaults to the expanded view

Knowing is half the battle
• Finding issues early => more time to fix
• Always profiling => instant detection of a
performance-killing change
• But there is balance
– “Premature optimization is the root of all evil”
– Wait until a feature is working before making it
work fast

Database performance is critical
• It is the biggest shared resource your
application contains
• Really slow queries will affect the speed of the
entire database
• Scaling out your DB is not a simple task, so
ensuring it isn’t bogged down is critical

Finding the stragglers
• First you need to identify the slow queries, so
you can:
1. Manually review each query in your code
2. Profile every request and review each executed
query
3. Let MySQL do the work with its slow query log
• Let’s go with option #3

Slow query log
• When on, MySQL logs any query that runs
longer than a threshold # of seconds
• The log contains the total query time, lock
time, and rows examined/sent
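• To enable, add to the MySQL config file:
log-slow-queries=/var/log/mysql/slow-query.log
long_query_time=0.1
• On MySQL 5.6 and later the log-slow-queries option no longer
exists; use slow_query_log=1 and slow_query_log_file instead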
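Each entry in the slow query log starts with comment lines, one of which carries the query time, lock time, and row counts. As a rough sketch of reading that line, the Python below parses one such header; the sample line is made up for illustration and the helper name `parse_stats` is hypothetical.

```python
import re

# Matches the statistics comment line that precedes each logged query.
STATS_LINE = re.compile(
    r"^# Query_time: (?P<query_time>[\d.]+)\s+Lock_time: (?P<lock_time>[\d.]+)\s+"
    r"Rows_sent: (?P<rows_sent>\d+)\s+Rows_examined: (?P<rows_examined>\d+)"
)


def parse_stats(line):
    """Return the timing and row counts from a stats line, or None if it is
    some other kind of line."""
    match = STATS_LINE.match(line)
    if match is None:
        return None
    return {
        "query_time": float(match.group("query_time")),
        "lock_time": float(match.group("lock_time")),
        "rows_sent": int(match.group("rows_sent")),
        "rows_examined": int(match.group("rows_examined")),
    }


# Made-up sample line, not taken from a real log.
SAMPLE = "# Query_time: 1.234567  Lock_time: 0.000123 Rows_sent: 10  Rows_examined: 250000"
stats = parse_stats(SAMPLE)
print(stats)
```

A large gap between rows examined and rows sent, as in this sample, is a common hint that an index is missing.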

pt-query-digest
• http://www.percona.com/doc/percona-toolkit/2.2/pt-query-digest.html
• Reads the slow query log and groups queries
by their structure
• Outputs aggregated statistics on the whole log
as well as for each query
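To give a feel for what "grouping queries by their structure" means, here is a deliberately crude Python sketch. It is not pt-query-digest's actual algorithm; it only assumes that replacing literals with placeholders is enough to make structurally identical queries compare equal. The `fingerprint` and `digest` names and the sample queries are hypothetical.

```python
import re
from collections import defaultdict


def fingerprint(sql):
    """Reduce a query to its rough structure by stripping literal values."""
    s = sql.strip().lower()
    s = re.sub(r"'(?:[^'\\]|\\.)*'", "?", s)               # single-quoted strings
    s = re.sub(r'"(?:[^"\\]|\\.)*"', "?", s)               # double-quoted strings
    s = re.sub(r"\b\d+(?:\.\d+)?\b", "?", s)               # numeric literals
    s = re.sub(r"\(\s*\?(?:\s*,\s*\?)*\s*\)", "(?+)", s)   # collapse value lists
    return re.sub(r"\s+", " ", s)                          # normalize whitespace


def digest(entries):
    """Aggregate (query_time, sql) pairs per fingerprint, slowest group first."""
    groups = defaultdict(lambda: {"count": 0, "total_time": 0.0})
    for query_time, sql in entries:
        group = groups[fingerprint(sql)]
        group["count"] += 1
        group["total_time"] += query_time
    return sorted(groups.items(), key=lambda kv: kv[1]["total_time"], reverse=True)


# Hypothetical log entries: (query time in seconds, SQL text).
entries = [
    (0.5, "SELECT * FROM users WHERE id = 42"),
    (1.5, "select *  from users where id = 7"),
    (0.25, "SELECT name FROM films WHERE title = 'Jaws'"),
]
report = digest(entries)
for shape, totals in report:
    print(f"{totals['total_time']:.2f}s over {totals['count']} calls: {shape}")
```

Sorting by total time rather than by the single slowest run surfaces queries that are individually cheap but executed very often.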

Digesting
• All the Percona tools are Perl scripts, so
execution is fairly straightforward
• Usage (on Unix/Linux):
– pt-query-digest /var/log/mysql/slow-query.log > digest.out
• Options exist for specifying a date range, filtering
queries, writing the results to a DB, etc.

Well, now what?
• Now you have a great starting point for finding
bottlenecks in your DB
• Slow queries - run MySQL EXPLAIN
– Refer to the tech talk by Dr. Aris Zakinthinos
– Most likely it is missing indexes
– De-normalization may be necessary
– Protip: Use your biggest data sets when running
explain

Regurgitation
• Running pt-query-digest once won’t solve all
your database issues
• Tuning your query performance is a never-
ending process
– Teach developers how to use EXPLAIN and
optimize queries
– Weekly reports using pt-query-digest to give
visibility into DB performance

The Devil is in the details
• Callgrind is a language-agnostic command-line
tool that profiles the function calls in your
application (through emulation)
• It generates callgrind files which can contain
the entire call stack, and can be read to
summarize what your app code is doing

XDebug profiling to the rescue
• Awesomely, XDebug writes callgrind files
when profiling is enabled
– This makes generating the grind files trivial in PHP
• Just add to your php.ini:
xdebug.profiler_enable=1
xdebug.profiler_output_dir="/tmp/xdebug"
• On Xdebug 3 these settings were renamed to
xdebug.mode=profile and xdebug.output_dir

Other Grind Visualizers
• KCacheGrind was the original visualizer, which
was ported to Windows as WinCacheGrind
• Regardless, all three visualizers (WebGrind, KCacheGrind,
and WinCacheGrind) aggregate the function
calls into total # of calls and total cost(s)
• WebGrind is limited to a summary table,
whereas the other two can display the full call
tree
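The summary table these visualizers produce boils down to an aggregation like the one sketched below in Python. The call records are hypothetical and already flattened to (function name, self cost); a real callgrind file has a richer format that this sketch does not attempt to parse.

```python
from collections import defaultdict

# Hypothetical flattened call records: (function name, self cost in ms).
calls = [
    ("encode_output", 4.0),
    ("run_query", 12.0),
    ("encode_output", 3.5),
    ("render_view", 1.0),
    ("encode_output", 2.5),
]


def summarize(records):
    """Return (name, call count, total cost) rows, most expensive first."""
    totals = defaultdict(lambda: [0, 0.0])
    for name, cost in records:
        totals[name][0] += 1
        totals[name][1] += cost
    rows = [(name, count, cost) for name, (count, cost) in totals.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


summary = summarize(calls)
for name, count, cost in summary:
    print(f"{name:15} calls={count:3} total={cost:6.1f} ms")
```

Reading both columns together matters: a function can be a bottleneck because each call is slow or because it is called far too often.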

Installing WebGrind
• Installation:
1. Prerequisite: Install XDebug
2. Download the zip from WebGrind’s GitHub
3. Extract the zip to a folder accessible by the webserver
4. Set up a virtual host for WebGrind
5. WebGrind will read from the XDebug profiler
output directory automatically
6. Open in browser and voila!

Are we screwed?
• No! We can fix it, otherwise this wouldn’t be a
good demo =)
• We ran into these performance issues with
the OWASP library late in our security release

Output encoding
• Output encoding is a very expensive task
– Simply put, the OWASP library encodes any non-
alphanumeric character
– It makes no assumptions on the incoming data, so
ends up doing a lot of encoding detection and
normalization before any real output encoding

Digging into OWASP
• How did we make it more efficient?
– First, we installed WebGrind and started looking at
exactly what OWASP was doing
– We identified the functions that were taking too
long or being called too often, and then dove into
the code
– A little elbow grease and trial/error later, we had it
optimized and running smoothly

Opcode caching
• PHP is an interpreted language, so with every
request, the code is read from disk, parsed,
and compiled into opcode before executing
• An opcode cache stores the compiled opcode
so the first three steps are skipped
• Can speed up your application by 2-5 times!
• Options: APC, XCache, Zend Optimizer+
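The idea behind an opcode cache can be shown with a small analogy in Python. This is not how APC is implemented; it only assumes that compiled code can be kept in memory and reused until the source file's modification time changes. The names `load_compiled`, `handle_request`, and `stats` are hypothetical.

```python
import os

_cache = {}               # path -> (mtime in ns, compiled code object)
stats = {"compiles": 0}   # counts how often the slow path actually ran


def load_compiled(path):
    """Return compiled code for `path`, recompiling only if the file changed."""
    mtime = os.stat(path).st_mtime_ns
    cached = _cache.get(path)
    if cached is not None and cached[0] == mtime:
        return cached[1]                      # cache hit: skip read/parse/compile
    with open(path, encoding="utf-8") as source_file:
        source = source_file.read()           # step 1: read from disk
    code = compile(source, path, "exec")      # steps 2 and 3: parse and compile
    stats["compiles"] += 1
    _cache[path] = (mtime, code)
    return code


def handle_request(path):
    """Execute the script for one request and return its `response` variable."""
    scope = {}
    exec(load_compiled(path), scope)
    return scope.get("response")
```

Calling `handle_request` twice on the same unchanged file compiles it once; only the execution step repeats on every request.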

Setting up APC
• http://pecl.php.net/package/APC
• Linux:
– Install w/ PECL: pecl install apc-3.1.9
– Compile the extension yourself
• Windows: Download pre-compiled binary
from http://downloads.php.net/pierre/
• Enable by adding the extension in php.ini
• Sit back and enjoy the performance boost

Keep on Grindin’
• Use WebGrind to summarize what your app
code is doing; find the functions bottlenecking
your application
• Make it second nature to profile your
application code with WebGrind
• But for a quick boost, start using an opcode
cache now and never look back!

Wealth of information
• Improving browser performance is a well-
documented subject
• A simple Google search will return thousands
of results for how to optimize HTML, JavaScript,
CSS, images, HTTP requests, server settings, etc.
• We’ll focus on a couple of easy-to-use tools that
tell you exactly what you should do

Developer tools
• Firebug + WebKit developer tools
• For performance, the most important part of these
tools is the network/timeline tab
• Shows you all resource requests and their
timings, including blocking, waiting, receiving,
and more
• Displays when the DOMContentLoaded and load
events are fired

Yahoo’s YSlow and Google PageSpeed
• Browser extensions for Chrome + Firefox
(sorry, IE)
• Analyze a page request/response and offer
best practices on how to improve
performance
• Yahoo and Google know what they are talking
about; follow the tools’ advice for the biggest
wins in the shortest timeframe

A quick summary
• HTTP
– Reduce # of requests
– Parallelize downloads
– Smaller cookies
• HTML
– Reduce DOM nodes
– Asynchronously load minor content
• CSS
– Minify + concatenate
– Load in the <head>

Cont’d
• JavaScript:
– Minify + concatenate
– Load after CSS (at end of the <body>, if possible)
• Images
– Don’t resize in browser (optimize size for context)
– Use sprites or data URIs
• Server
– Cache headers/ETags
– Gzipping
– Use Content Delivery Networks (CDNs)

Google Speed Tracer
• Chrome extension that shows a timeline of
the internals of the UI thread including HTML
parsing, script callbacks, painting, garbage
collection, and many more
• Resolving issues found by Speed Tracer should
be saved for after implementing all of YSlow
and PageSpeed’s recommendations

Making it simple
1. Run YSlow/PageSpeed
2. Implement their recommendations
3. …..
4. Profit!

The reasons you load test
• A more accurate portrayal of your site’s
average performance, compared to request
profiling and grind files
• Helps locate issues of scale that don’t appear
when testing a single request

Before we dive into the tools
• First, you need to define a basic flow that you
want to measure as your benchmark
• Ex. Login -> Newsfeed -> Catalog -> User
Search -> Recognition -> Logout
• Should contain the most commonly accessed
URLs
• Also nice to have a mix of GET and POST
requests
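A benchmark flow and a very small multi-user run can be sketched as below. This is not JMeter and does not replace it; it is a Python illustration in which the URL paths are hypothetical and `fake_request` stands in for a real HTTP client, so nothing is actually sent over the network.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical flow mixing GET and POST requests; paths are placeholders.
FLOW = [
    ("POST", "/login"),
    ("GET", "/newsfeed"),
    ("GET", "/catalog"),
    ("POST", "/logout"),
]


def run_flow(send_request):
    """Run the flow once and return the latency of each request in seconds."""
    latencies = []
    for method, path in FLOW:
        start = time.perf_counter()
        send_request(method, path)
        latencies.append(time.perf_counter() - start)
    return latencies


def load_test(send_request, users, iterations):
    """Run the flow users * iterations times with `users` concurrent threads."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        futures = [pool.submit(run_flow, send_request) for _ in range(users * iterations)]
        samples = sorted(lat for future in futures for lat in future.result())
    p95_index = max(0, math.ceil(0.95 * len(samples)) - 1)
    return {
        "requests": len(samples),
        "avg": sum(samples) / len(samples),
        "p95": samples[p95_index],
        "max": samples[-1],
    }


def fake_request(method, path):
    # Stand-in for a real HTTP call; sleeps to simulate server time.
    time.sleep(0.001)


report = load_test(fake_request, users=5, iterations=2)
print(report)
```

Raising `users` above the expected average is one way to see how the percentiles move under a spike.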

Choosing a load tester
• Options abound
– JMeter: GUI
– Siege: CMD
– curl-loader: CMD
– WebLOAD: GUI
– Loadimpact.com: SaaS
• We will focus on JMeter since we use it =)

JMeter
• A GUI written in Java for load testing and
benchmarking servers (HTTP, SOAP, JMS, etc.)
• Supports variables in requests, assertions on
responses, cookies, and many aggregate
reports
• Not the most intuitive UI until you get used to
it

When to load test?
• Depends on your dev cycle, but once a
week/sprint is a good starting point
• It’s more important to be consistent!
• If possible, should be part of an automated
build/test suite

Getting ahead
• Gathering metrics on multi-user response times before
going to production is important
• Otherwise you have no idea how your app will
scale to a real user load
• It’s a good idea to load test with
more users (threads) than your average so you
know how well you can handle spikes

Application Performance Management
• Tools that focus on monitoring + managing the
performance, availability, and scalability of an
application
• Some options:
– New Relic - PHP/.NET/Ruby/Java/Python
– Scout - Ruby
– AppDynamics - Java/.NET
– dynaTrace - Java/.NET

The benefits
• Too many to list
– Real-time dashboards
– Application response times
– End-user monitoring (browser times)
– Error reporting
– Alerts for server/performance issues
– Server monitoring
– And more!

New Relic
• SaaS APM platform with a slick web UI
• Free lite version has real-time monitoring,
server monitoring, and error detection, but
only retains data for 24 hours
• Pro version is $150/server/month, but has
many additional features, including full
response traces

Installation for PHP
• Find your IT guru and follow the installation
instructions =)
• New Relic has two components
– A PHP extension
– A daemon process that receives data from the
extension and then transmits it to the cloud
• Plus a JavaScript block to facilitate end-user
monitoring

Real metrics = real insight
• APM tools are the culmination of all the
profiling tools + techniques we’ve seen
• There is no substitute for real, user-driven
performance numbers
• Review the bottlenecks the APM identifies, dig
deeper using all the other tools we learned
about, then watch as your app gets faster and
more responsive

Recap
1. Profiling your application
I. Profiling as an overall health check
II. Digesting the slow query logs to find bottlenecks
III. Grinding your code to find the hidden details
IV. YSlow/PageSpeed and doing what they say
2. Maintaining performance
I. Use JMeter to load test for scalability
II. Monitoring prod to accumulate real metrics

Putting it all together
• Teach developers the importance of profiling
their code; integrate into culture
• Performance must be top of mind/visible
• Profile and load test critical sections before
release; confidence in your code
• Run an APM in production; real, actionable
data on bottlenecks
• Never let up: the war is never over

Work at Achievers

achievers.com/careers

Reach Out!
@achieverstech

facebook.com/AchieversTech
tech@achievers.com
