People love fast web sites, but up until now developers have been focusing on the wrong area. Network (TCP, buffers, routing) performance and backend (web server, database, etc.) performance are important for reducing hardware costs and improving efficiency, but for most pages 80% of the load time is spent on the frontend (HTML, CSS, JavaScript, images, iframes, and others). We will talk about best practices for making web pages faster, provide case studies from top web sites, and introduce the tools we use for researching performance. In addition to learning how to improve web performance, we will also try to gain an understanding of the fundamentals of how the Internet works, including DNS, HTTP, and browsers. This talk was given as part of an educational series called Fog Computing Reading Group at Cisco Advanced Architecture and Research. The content is derived from materials by Steve Souders (Google/Stanford), Collin Jackson (Stanford/CMU) and Daniel Austin (eBay).
2. Agenda
• Overture: Web Performance As Art And Science
• First the Theory…
• …Then the Practice
• Case Study: Before vs. After
• …Making the Web Faster and Smarter
• Slide Credits: Steve Souders (Google/Stanford), Collin Jackson (Stanford/CMU), Daniel Austin (eBay)
12/13/2013
3. State of The ART?
• An Art and a Science
• Very Little Prior Art
• Users Suffer –
• The World Wide Wait isn’t over yet!
• Rapid Change in the Industry
• Browsers
• Devices
• Standards
• The Rest of the World is Catching Up
• Challenges of Global/Local/Mobile/Social Performance
4. Performance Is Response Time
• PERFORMANCE IS RESPONSE TIME
• (It’s not latency, it’s not bandwidth, it’s not queue residence time or queue length or any such thing.)
5. Dimensions of Performance
• Geography/Network location
• Bandwidth/Medium/Transport Type
• Browser/Device Type
• Response time (RT) varies by as much as 50% (Chrome vs. IE)
• Page Composition
• Client-side rendering and execution effects (JS, CSS)
• Network Transport Effects
• # of Connections, CDN Use
6. The MPPC Model
“Multiple Parallel Persistent Connections”
[Figure: the request cycle between the end user and the server. Request initiation by user → DNS/network resolution (t1) → HTTP request → page composition (t2) → HTTP response, payload delivery time (t3) → browser rendering time (t4).]
This entire cycle, steps 1-4, is repeated once for each external reference on the page, so for a given page the total time is:
$T = \sum^{n+1} (\Delta t_1 + \Delta t_2 + \Delta t_3 + \Delta t_4)$
Where n is the number of external page requisites.
• T1: Connection Time
• T2: Server Duration
• T3: Transport Time
• T4: Render Time
7. HTTP Connection Flow
[Figure: client/server timeline of one HTTP exchange. Stages: connection setup (t1), request (t2), response (t3). Intervals shown: handshake time, request transmission time, estimated server processing time, response transmission time; together they make up the client’s perceived response time.]
The more HTTP requests & network roundtrips you require, the slower your site will be: images, CSS, JS, DNS lookups, redirects, # of packets
8. T1 – Making the Connection
$T_1 = T_{DNS} + T_{TCP} + T_{SSL}$
9. T2 – THE SERVER DURATION
[Equations: a queueing derivation of utilization U, throughput X, and average number of requests in the system N_avg; the remaining symbols did not survive extraction.]
• … so $T_2 = N_{avg}/X$ (the response time law)
• Never mind - it’s a constant!
10. T3 – TCP Transport Time
Single Object:
$T_3 = S_z/R + 2\,RTT + T_{idle}$
For persistent parallel connections:
$T_3 = (M+1)\,S_i/R_i + [M/(k N_h)] \cdot 3\,RTT_i + T_{idle}$
… for 1 base HTML page with M objects, with $S_i$ bits, at bandwidth $R_i$, k connections per host, and $N_h$ unique hostnames
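To get a feel for the T3 transport-time formulas on this slide, here is a minimal Python sketch that plugs numbers into them. It assumes every object has the same size, bandwidth, and round-trip time, and it reads the bracketed M/(k·Nh) term as a plain division; both are simplifying assumptions, and the function names and input values are hypothetical.

```python
def single_object_time(size_bits, bandwidth_bps, rtt_s, idle_s=0.0):
    """T3 for one object: size/bandwidth + 2*RTT + idle time."""
    return size_bits / bandwidth_bps + 2 * rtt_s + idle_s


def parallel_persistent_time(m_objects, size_bits, bandwidth_bps, rtt_s,
                             k_per_host, n_hosts, idle_s=0.0):
    """T3 for one base page plus M objects over k*Nh parallel connections.

    Assumption: all objects share one size, bandwidth, and RTT.
    """
    payload = (m_objects + 1) * size_bits / bandwidth_bps
    rounds = m_objects / (k_per_host * n_hosts)
    return payload + rounds * 3 * rtt_s + idle_s


# Hypothetical inputs: 80 kbit objects, 1 Mbit/s link, 50 ms round trip.
print(single_object_time(80_000, 1_000_000, 0.05))
print(parallel_persistent_time(12, 80_000, 1_000_000, 0.05, 6, 1))
```

Doubling the number of hostnames halves only the round-trip term, which is why the formula rewards parallelism most on high-latency links.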
14. Let’s Talk Tools
Site Performance Services
– Gomez
– Keynote
• ‘Wholesale’ Testing
– Statistical data for many page views under different conditions
– Operational testing
– Best for understanding global and network effects
Page Analysis Tools
– YSlow
– WebPageTest.org
– MS Visual Round Trip Analyzer, HTTPWatch, many others
– F12 in your browser
• ‘Retail’ Testing
– One Page or App
– Diagnostic
– Best for functional testing
15. Commercial Testing Services
• Gomez, AlertSite, and Keynote toolsets are similar in many ways
• Synthetic Test Setup
• Test nodes in large datacenters and/or end user’s machines
• Statistical data about response times
• You can do this for yourself on a smaller scale at WebPageTest.org
16. Happy Birthday, YSlow!
• Methodology
– DOM Crawler and Packet Sniffer
– More accurate
– Analyzes components
– Stats view
• Implements the 14 18 22 105 YSlow Rules
– All browsers except IE
– Mobile bookmarklet
– Best tool for page analysis
17. Performance Golden Rule
• 80-90% of the end-user response time is spent on the frontend.
• Let’s start there.
• Browser cache makes a big difference
• Reading resources from cache depends on:
• freshness (expiration date)
• validity (updates since last-modified date)
20. Too Many Requests!
• 80-90% of load time is the frontend
• the frontend time is dominated by HTTP
• HTTP requests growth since 2003: 25 to 50*
• each HTTP request has overhead – even with persistent connections
• reducing HTTP requests has the biggest impact
• bigger benefit for users with higher latency
• parallelization reduces the need for this
* http://www.websiteoptimization.com/speed/tweak/average-web-page/
21. Too Many Requests!
• But...
• is it possible to reduce HTTP requests without reducing richness?
• Yes:
• combine JS, CSS
• image maps
• CSS sprites
• inline images
22. Combine JS and CSS
• multiple scripts => one script
• multiple stylesheets => one stylesheet
• apache module:
• http://code.google.com/p/modconcat/
• YUI Combo Handler
• http://yuiblog.com/blog/2008/07/16/combohandler/
23. Image Maps
<img usemap="#map1" border=0 src="/images/imagemap.gif">
<map name="map1">
<area shape="rect" coords="0,0,31,31" href="home.html">
<area shape="rect" coords="36,0,66,31" href="gifts.html">
<area shape="rect" coords="71,0,101,31" href="cart.html">
<area shape="rect" coords="106,0,136,31" href="settings.html">
<area shape="rect" coords="141,0,171,31" href="help.html">
</map>
• old school; CSS sprites are preferred
• image maps are still useful when x,y coordinates matter, for example, in maps
25. inline images (data: URLs)
• embed the content of an HTTP response in place of a URL
<IMG ALT="Shuriken" SRC="data:image/gif;base64,R0lGODl...wAIlEEADs=">
• if embedded in HTML document, probably not cached => embed in stylesheet instead
• base64 encoding increases total size
• works in IE8 (not IE7 and earlier)
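As a rough illustration of how a data: URL is built and why base64 increases the total size, here is a small Python sketch. The helper name `to_data_url` and the sample payloads are assumptions for illustration only; real image bytes would be read from a file.

```python
import base64


def to_data_url(payload: bytes, mime_type: str) -> str:
    """Embed raw bytes in a data: URL using base64 encoding."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Hypothetical payload: just the 6-byte GIF signature, not a full image.
print(to_data_url(b"GIF89a", "image/gif"))

# base64 emits 4 output characters for every 3 input bytes.
raw = bytes(300)
print(len(raw), "bytes raw ->", len(base64.b64encode(raw)), "bytes encoded")
```

The 4-for-3 expansion is the size cost mentioned above; gzip on the enclosing stylesheet can recover part of it.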
31. Expires header
GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 (…) Gecko/2008070208 Firefox/3.0.1
Accept-Encoding: gzip,deflate
HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Mon, 22 Sep 2008 21:14:35 GMT
Content-Length: 2066
Content-Encoding: gzip
Expires: Mon, 12 Oct 2009 14:57:34 GMT
Cache-Control: max-age=31536000
XmoÛHþÿFÖvã*wØoq...
• Expiration date determines freshness.
• Can also use Cache-Control: max-age
32. Expires vs. max-age
• Expires works in HTTP/1.0, max-age in HTTP/1.1
• Expires is an absolute date:
– 12 Oct 2009 14:57:34 GMT
• max-age is # of seconds until expiration:
– 31536000
• Expires relies on clock synchronization between client and
server for short expirations
• max-age takes precedence over Expires
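The precedence rule above can be sketched in a few lines of Python. This is a simplified illustration, not a full HTTP cache: it ignores directives such as no-cache and the Age header, and the function name `freshness_deadline` is hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def freshness_deadline(headers, response_time):
    """Return when a cached response stops being fresh, or None if unknown."""
    match = re.search(r"max-age=(\d+)", headers.get("Cache-Control", ""))
    if match:
        # max-age is relative to when the response was received,
        # and it takes precedence over Expires.
        return response_time + timedelta(seconds=int(match.group(1)))
    if "Expires" in headers:
        # Expires is an absolute date, so it depends on the client clock.
        return parsedate_to_datetime(headers["Expires"])
    return None


received = datetime(2008, 10, 12, 14, 57, 34, tzinfo=timezone.utc)
headers = {
    "Expires": "Mon, 12 Oct 2009 14:57:34 GMT",
    "Cache-Control": "max-age=31536000",
}
print(freshness_deadline(headers, received))
```

With the header values from the Expires slide, both mechanisms agree on the same deadline one year after the response.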
33. Conditional GET (IMS)
sometime after 3pm PT 9/24/08:
GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 (…) Gecko/2008070208 Firefox/3.0.1
Accept-Encoding: gzip,deflate
If-Modified-Since: Mon, 22 Sep 2008 21:14:35 GMT
HTTP/1.1 304 Not Modified
Content-Type: application/x-javascript
Last-Modified: Mon, 22 Sep 2008 21:14:35 GMT
Content-Length: 2066
Content-Encoding: gzip
Expires: Fri, 26 Sep 2008 22:00:00 GMT
• the server answers 304 Not Modified instead of 200 OK, so the body (XmoÛHþÿFÖvã*wØoq...) is not sent again
• IMS determines validity.
• IMS is used when Reload is pressed.
• ETag and If-None-Match also determine validity.
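The validity check behind a conditional GET can be sketched from the server's side as follows. This is a deliberately simplified Python illustration: it compares a single ETag string and ignores lists, weak validators, and "*", and the function name `conditional_status` is an assumption.

```python
from email.utils import parsedate_to_datetime


def conditional_status(request_headers, last_modified, etag=None):
    """Return 304 if the client's cached copy is still valid, else 200."""
    client_etag = request_headers.get("If-None-Match")
    if client_etag is not None and etag is not None:
        # ETag validation: an exact match means the copy is still valid.
        return 304 if client_etag == etag else 200
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        # IMS validation: unchanged since the client's date means 304.
        unchanged = parsedate_to_datetime(last_modified) <= parsedate_to_datetime(since)
        return 304 if unchanged else 200
    return 200


last_modified = "Mon, 22 Sep 2008 21:14:35 GMT"
print(conditional_status({"If-Modified-Since": last_modified}, last_modified))
```

A 304 still costs a round trip, which is why a far-future expiration date is preferable to relying on validation alone.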
34. Sending Expires (Apache)
• mod_expires
<FilesMatch "\.(gif|jpg|js|css)$">
ExpiresDefault "access plus 1 year"
</FilesMatch>
• sends both Expires and max-age:
Expires: Mon, 12 Oct 2009 14:57:34 GMT
Cache-Control: max-age=31536000
35. Expires in the Wild – 2007
Site                     Images       Scripts    Stylesheets  % with Expires  Median Age
amazon.com               0/62         0/3        0/1          0%              114 days
aol.com                  23/43        6/18       1/1          48%             217 days
cnn.com                  0/138        2/11       0/2          1%              227 days
ebay.com                 16/20        0/7        0/2          55%             140 days
froogle.google.com       1/23         0/1        0/1          4%              454 days
msn.com                  32/35        3/9        1/1          80%             34 days
myspace.com              0/18         0/2        0/2          0%              1 day
wikipedia.org            6/8          2/3        1/1          75%             1 day
yahoo.com                23/23        4/4        1/1          100%            na
youtube.com              0/32         0/7        0/3          0%              26 days
average                  10/40 (25%)  2/5 (38%)  0.5/2 (27%)  12/46 (26%)
March 2007
36. Expires in the Wild – 2008
Site                     Images       Scripts    Stylesheets  % with Expires  Median Age
aol.com                  26/35        13/20      1/1          71%             189 days
ebay.com                 48/48        6/7        2/2          98%             1 day
facebook.com             93/97        20/22      20/20        96%             121 days
google.com/search        1/1          0/1        0/0          50%             1 day
search.live.com/results  6/6          1/1        4/4          100%            na
msn.com                  45/45        3/3        3/3          100%            na
myspace.com              21/21        7/7        4/4          100%            na
en.wikipedia.org/wiki    7/32         5/5        9/9          46%             310 days
yahoo.com                23/23        4/4        1/1          100%            na
youtube.com              8/27         1/1        1/1          34%             unk
average                  28/34 (83%)  6/7 (85%)  5/5 (100%)   38/45 (85%)
October 2008
38. Compression (encoding)
GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 (…) Gecko/2008070208 Firefox/3.0.1
Accept-Encoding: gzip,deflate
HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Mon, 22 Sep 2008 21:14:35 GMT
Content-Length: 2066
Content-Encoding: gzip
XmoÛHþÿFÖvã*wØoq...
• uncompressed, the same response has Content-Length: 6230 and a body that starts with function d(s) {...
• typically reduces size by 70%
– (6230-2066)/6230 = 67%
39. Compression
• Pro:
• smaller transfer size
• Con:
• CPU cycles – on client and server
• Don’t compress resources < 1K
• Don’t compress resources already compressed (images)
• Change server configuration to enable compression for certain MIME types
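The trade-offs on this slide can be sketched as a small decision function in Python. The 1K threshold comes from the slide; the list of compressible MIME types and the function names are assumptions chosen for illustration, not a recommended server configuration.

```python
import gzip

MIN_SIZE = 1024  # from the slide: don't compress resources < 1K
# Assumed list of text-like types; images are left out because
# they are already compressed.
COMPRESSIBLE = (
    "text/",
    "application/javascript",
    "application/x-javascript",
    "application/json",
    "application/xml",
)


def maybe_gzip(body: bytes, content_type: str):
    """Return (body, encoding); encoding is None when left uncompressed."""
    if len(body) < MIN_SIZE or not content_type.startswith(COMPRESSIBLE):
        return body, None
    return gzip.compress(body), "gzip"


def savings(original_size: int, compressed_size: int) -> float:
    """Fraction of bytes saved by compression."""
    return (original_size - compressed_size) / original_size


# The blogger.com script from the earlier slide: 6230 -> 2066 bytes.
print(f"{savings(6230, 2066):.0%}")
```

Skipping small and already-compressed resources avoids spending CPU cycles where the transfer size would barely change.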
40. Gzip: not just for HTML
[Tables: which sites gzip their HTML, scripts, and stylesheets, in two surveys labeled March 2008 and October 2008. Cells are marked x, some, deflate, or na. The two tables were overlaid in extraction, so per-site values are not recoverable.]
• Sites in the first survey: amazon.com, aol.com, cnn.com, ebay.com, froogle.google.com, msn.com, myspace.com, wikipedia.org, yahoo.com, youtube.com
• Sites in the October 2008 survey: aol.com, ebay.com, facebook.com, google.com/search, search.live.com/results, msn.com, myspace.com, en.wikipedia.org/wiki, yahoo.com, youtube.com
• gzip scripts, stylesheets, XML, JSON (not images, Flash, PDF)
42. Progressive Rendering
• progress indicators:*
• reassure the user that the system is working
• convey how much time is left
• provide something to look at
• the web page is the progress indicator
• progressive rendering – draw content as soon as it's available
• stylesheets block progressive rendering in IE, and cause "flash" in Firefox
* Jakob Nielsen, http://www.useit.com/papers/responsetime.html
43. Stylesheets in IE
• in IE, nothing in the page is drawn until all style sheets are
done downloading
• reasoning: parse all rules before drawing any element, avoids
having to redraw
• when stylesheets are at the bottom, there is no progressive
rendering => after a long delay the entire page blasts onto the
screen
44. Fastest feels slowest...
...and slowest feels fastest
stylesheet at bottom: content
finishes downloading sooner,
but rendering starts later
=> feels slower
stylesheet at top: content
finishes downloading later,
but rendering starts sooner
=> feels faster
true in IE 6, 7, 8
45. Flash of Unstyled Content (FOUC)
• in Firefox, elements are drawn even if stylesheets
aren't all downloaded
• reasoning: progressive rendering makes
the page feel faster (most developers
will follow the spec and put their
stylesheets in HEAD?)
• when stylesheets are at the bottom
and they change style of rendered
elements, elements have to be redrawn
=> flash of unstyled content
47. Parallel downloads
• HTTP spec recommends only two connections (parallel downloads) per
hostname
• http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html#sec8.1.4
• results in a stair-step pattern
• general rule: page load time increases for every two resources added
48. More connections
• newer browsers open more connections per hostname
• domain sharding – split resources across multiple domains to increase parallelization
• previous example using two domains
• browser looks at name, not IP address
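The stair-step effect and the benefit of sharding can be estimated with a short Python sketch. It assumes resources of similar size that spread evenly across hostnames, which real pages only approximate; the hostnames are hypothetical, and CRC32 is used only as a convenient stable hash so that a given resource always maps to the same hostname and stays cacheable.

```python
import math
import zlib


def download_rounds(n_resources, connections_per_host=2, n_hosts=1):
    """Number of 'stair steps' needed to fetch all resources."""
    return math.ceil(n_resources / (connections_per_host * n_hosts))


def shard_host(path, hosts):
    """Pick a hostname for a resource, stable across page views."""
    return hosts[zlib.crc32(path.encode("utf-8")) % len(hosts)]


# Hypothetical shard hostnames.
hosts = ["static1.example.com", "static2.example.com"]
print(download_rounds(10))             # one hostname, two connections
print(download_rounds(10, n_hosts=2))  # same page sharded across two
print(shard_host("/images/logo.gif", hosts))
```

Each extra hostname also costs a DNS lookup, so the number of shards is a trade-off rather than a case of more being better.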
50. Scripts block
• Unfortunately, scripts block in two ways
• downloading resources below the script
• rendering elements below the script
• Moving the scripts lower means
less blocking
51. Parallel script loading
• execute scripts in order, but download
them in parallel with other resources
• available in newer browsers
(see browserscope.org)
• IE6 & 7 will be around for years and we have to keep them in mind, so… put scripts at the BOTTOM
55. Inline or external?
• inline: faster, but HTML document is bigger
• external: more HTTP requests, but cached
• variables
• page views per user (per session) => external
• empty cache stats => external
• component re-use across pages => external
• external is typically better
• main exception: home pages
• best of both worlds
• post-onload download
• dynamic inlining
63. Minification
• minification: removing unnecessary characters from code
(comments, white space, etc.)
• obfuscation: minify as well as reduce length of symbol names
(munge)
64. original code
YAHOO.util.CustomEvent =
    function(type, oScope, silent, signature) {
        this.type = type;
        this.scope = oScope || window;
        this.silent = silent;
        this.signature = signature || YAHOO.util.CustomEvent.LIST;
        this.subscribers = [];
        if (!this.silent) {
        }
        var onsubscribeType = "_YUICEOnSubscribe";
        if (type !== onsubscribeType) {
            this.subscribeEvent =
                new YAHOO.util.CustomEvent(onsubscribeType, this, true);
        }
    };
event.js from YUI – http://developer.yahoo.com/yui/
67. obfuscation costs
• obfuscation typically reduces size more, but has some costs
• bugs – symbol munged to "aa", namespace conflict
• maintenance – tag external symbols (eg, API)
• debugging – harder to read in production
68. Minifying CSS
• Savings are typically less compared to JavaScript
• CSS typically has fewer comments and whitespace
• Greater savings from CSS optimizations
• merging identical rules
• abbreviations
• "#660066" => "#606"
• "0px" => "0"
• "background-color:" => "background:"
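A few of these CSS savings can be sketched with regular expressions in Python. This is a naive illustration, not a production minifier: it does not understand strings or url() values, and it covers only comment and whitespace removal plus the hex-color and "0px" abbreviations. Merging identical rules and property shorthands are left out. The function name `minify_css` is hypothetical.

```python
import re

# Six-digit hex colors made of three doubled digits, e.g. #660066.
_HEX_PAIRS = re.compile(r"#([0-9a-fA-F])\1([0-9a-fA-F])\2([0-9a-fA-F])\3\b")
# A bare 0px, but not the tail of 10px or 1.0px.
_ZERO_PX = re.compile(r"(?<![\d.])0px\b")


def _shorten_values(match):
    """Apply abbreviations inside one declaration block only."""
    block = _HEX_PAIRS.sub(r"#\1\2\3", match.group(0))
    return _ZERO_PX.sub("0", block)


def minify_css(css):
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.DOTALL)  # drop comments
    css = re.sub(r"\s+", " ", css)                        # collapse whitespace
    css = re.sub(r"\s*([{};,])\s*", r"\1", css)           # trim around punctuation
    css = re.sub(r":\s+", ":", css)                       # trim after colons
    # Restricting abbreviations to {...} blocks keeps ID selectors
    # such as #aabbcc from being mistaken for colors.
    css = re.sub(r"\{[^{}]*\}", _shorten_values, css)
    return css.strip()


print(minify_css("/* nav */\na {\n  color: #660066;\n  margin: 0px;\n}\n"))
```

Because stylesheets compress well under gzip, the remaining gain from minification is usually modest, which matches the point that CSS savings are smaller than for JavaScript.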
70. 3xx status codes
• "further action needs to be taken by the user agent in order to fulfill the request"
• 300 Multiple Choices (based on Content-Type)
• 301 Moved Permanently
• 302 Moved Temporarily (aka, Found)
• 303 See Other (clarification of 302)
• 304 Not Modified (response for conditional GET request)
• 305 Use Proxy
• 306 (no longer used)
• 307 Temporary Redirect (clarification of 302)
• slide callouts: "most popular", "HTTP/1.1"
• http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3
71. redirect example
Request
GET / HTTP/1.1
Host: astrology.yahoo.com
Response
HTTP/1.1 301 Moved Permanently
Location: http://shine.yahoo.com/astrology/
• Why use redirects?
• prettier URLs
• track traffic
• authentication
72. worst blocker
• inserting a redirect to the HTML document is worse than how stylesheets and scripts block
• all resources in the page are delayed
• the user gets very little feedback (nothing in the page)
• rendering, even the HTML text, is delayed
• 2nd worst – redirecting to a script
73. avoid redirects
• Eliminate the need
• base href or full URLs for resources
• referer tracking
• HTML 5 – A ping and LINK pingback
• CNAMEs
• mod_rewrite
• no autoindex
• Make them cacheable
• 301 with future Expires
• JavaScript & meta refresh with future Expires
76. What is an ETag
• http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.11
• added in HTTP/1.1
• used by clients and servers to validate expired resources
• more flexible than Last-Modified date
• "An entity tag consists of an opaque quoted string"
• "An entity tag MUST be unique across all versions of all entities associated with a particular resource."
77. Apache ETags
"19f1e-7920-4525b037f0440"
"inode-size-timestamp"
• inode – used by filesystems to store file type, owner, group, permissions, etc.
• inode for the same file differs across servers even if file size, timestamp, and directory are the same
• http://stevesouders.com/images/arrow-right-9x13.png
ETag: "21f5315-d4-5d51f0c0"
• http://1.cuzillion.com/images/arrow-right-9x13.png
ETag: "1ee57ec-d4-5d51f0c0"
79. the problem with ETags
• the default ETag syntax in Apache and IIS makes it unlikely that
INM will match across servers, even when the resource is the
same
• probability of an incorrect INM miss:
(n-1)/n where "n" is the number of servers
not an issue if you just have one server
• http://www.apacheweek.com/issues/02-01-18
"can cause an unnecessary performance hit as resources are fetched more often
than is required"
• http://support.microsoft.com/kb/922703
"IIS 6.0 sends a 200 response because it considers the different change numbers to
mean that [the resources] are not the same versions"
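The (n-1)/n figure and the cause of the mismatch can be checked with a short Python sketch. It assumes requests are spread uniformly across servers and that the ETag follows the Apache "inode-size-timestamp" layout shown earlier; the function names are hypothetical.

```python
from fractions import Fraction


def inm_miss_probability(n_servers: int) -> Fraction:
    """Chance that a revalidation lands on a server with a different ETag."""
    if n_servers < 1:
        raise ValueError("need at least one server")
    return Fraction(n_servers - 1, n_servers)


def differs_only_by_inode(etag_a: str, etag_b: str) -> bool:
    """True when size and timestamp match but the inode part does not."""
    inode_a, size_a, mtime_a = etag_a.strip('"').split("-")
    inode_b, size_b, mtime_b = etag_b.strip('"').split("-")
    return inode_a != inode_b and (size_a, mtime_a) == (size_b, mtime_b)


print(inm_miss_probability(1))  # a single server never misses
print(inm_miss_probability(4))
# The two ETags for arrow-right-9x13.png from the Apache ETags slide.
print(differs_only_by_inode('"21f5315-d4-5d51f0c0"', '"1ee57ec-d4-5d51f0c0"'))
```

With four servers, three out of four revalidations would download the full resource even though it has not changed.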
80. the solution for ETags
• if you're not leveraging ETags, turn them off
• reduces size of requests and responses
• reduces outbound traffic from your servers
• increases proxy cache hit rate
82. image optimization: 7 TIPS
1. choose PNG over GIF (lossless)
2. crush your PNGs (lossless)
3. strip needless JPEG metadata (lossless)
4. make all PNGs palette PNGs
5. avoid AlphaImageLoader, try PNG8 or at least _filter
6. crush generated images (lossless)
7. use CSS sprites; stay within the palette number of colors (use PNG8)
86. iframes
• good for:
• embedding content from another web site
• sandboxing 3rd party JavaScript
• asynchronous loading of external scripts
• most expensive DOM element
URL                                            IE7     Firefox 2
http://stevesouders.com/hpws/iframes-none.php  290 ms  250 ms
http://stevesouders.com/hpws/iframes-10.php    410 ms  437 ms
http://stevesouders.com/hpws/iframes-100.php   766 ms  4300 ms
88. Set-Cookie response header
• HTTP/1.1 200 OK
• Set-Cookie: MSNPPAuth=B*eDP3m4...WELr; expires=Wed, 30-Dec-2037 16:00:00 GMT; domain=.live.com; path=/;
• domain, path, and expires in the cookie header
• max size ~4K (varies by browser)
• one header per cookie
• cookie is stored by the client (browser)
• only valid if domain matches current page
89. Cookie request header
• GET /results.aspx?q=flowers HTTP/1.1
• Host: search.live.com
• Cookie: MSNPPAuth=B*eDP3m4...WELr; SRCHUID=V=1&GUID=83F46965E90240739918C1047F88FD26; SRCHUSR=AUTOREDIR=0&GEOVAR=&DOB=20081129; ...
• cookie sent back to server on subsequent requests that match
the domain and path
• all cookies sent in one request header
• "; " delimited
90. Cookie size
Site                     Cookie size (bytes)  Comments
aol.com                  494                  "stay signed in" checked
ebay.com                 1038                 "keep me signed in" checked
facebook.com             990                  "remember me" checked
google.com/search        417                  logged in to iGoogle and YouTube
search.live.com/results  1938                 "remember me" and "remember my password" checked
msn.com                  1063                 logged in thru search.live.com
myspace.com              2027                 "remember me" checked
en.wikipedia.org/wiki    134                  "remember me" checked
yahoo.com                677                  "keep me signed in" checked
youtube.com              597                  also logged in to iGoogle
total size of all cookies
November 2008
91. Cookie impact
cookie size  response time delta
500 bytes    1 ms
1000 bytes   16 ms
1500 bytes   31 ms
2000 bytes   47 ms
2500 bytes   63 ms
3000 bytes   78 ms
• cookies on static resources multiply the delay
• largest packet MTU (Maximum Transmission Unit) for Ethernet: 1500 bytes
92. Live Search cookies sent
• http://search.live.com/results.aspx?q=flowers
• http://search.live.com/.../brand_c.css
• http://search.live.com/.../serp_c.css
• http://search.live.com/.../scopebar2_c.css
• http://search.live.com/.../answerAll_c.css
• http://search.live.com/.../asset4.gif
• http://search.live.com/.../cbcoin.gif
• http://search.live.com/.../main.js
• seven static resources contain the Cookie request header (1938 bytes), even though cookies don't affect the response
• 7 x 1938 bytes = 13,566 bytes, about 13.5K (upstream!)
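The upstream cost of sending cookies with static resources can be worked out in a couple of lines of Python. The sketch below uses the 1938-byte cookie and seven static requests from this slide and the 1500-byte Ethernet MTU from the previous one; it ignores TCP/IP and other header overhead, which is a simplifying assumption, and the function name is hypothetical.

```python
import math

ETHERNET_MTU = 1500  # bytes, from the cookie impact slide


def upstream_cookie_cost(cookie_bytes, static_requests):
    """Return (total upstream bytes, minimum packets per request)."""
    total = cookie_bytes * static_requests
    packets_per_request = math.ceil(cookie_bytes / ETHERNET_MTU)
    return total, packets_per_request


total, packets = upstream_cookie_cost(1938, 7)
print(total, "bytes of cookies sent upstream")
print(packets, "packets needed per request for the cookie alone")
```

Because the cookie alone exceeds one packet, each static request needs at least two upstream packets, on a link that is often the slowest direction for end users.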
94. cookie-free static content
• takeaway: serve static content without cookies
• different domain (rule 2 – use a CDN)
• different path ("/app" versus "/images")
96. The 7 Habits of Exceptional Performance
1. Make Performance a Priority
2. Test, Measure, Test Again
3. Learn about the Tools
4. Balance Performance with Features
5. Track Results Over Time
6. Set Targets
7. Ask Questions; Check It for Yourself!
97. Making the Web Smarter and Faster
• Faster HTTP
• SPDY
• HTTP Speed+Mobility
• Better Browsers
• Chrome has disrupted the market
• HTML5 will drive further evolution
• Moving to Mobile/Device Platforms
• More sensitive to network effects of all kinds
• Application-driven user experience
• Moving away from Hypertext?
Editor's Notes
These are good problems to have.
Oracle Performance Tuning, 2nd Edition, by Peter Corrigan and Mark Gurry: in most systems, performance is response time (RT).
In the next 5 slides, you will see some equations. I am not going into the deep details of those, because some of you here are experts in that area. We are going to move fast and then jump into today’s main topics on frontend engineering.
We will talk about persistent connection standard changes in HTTP 1.1 later in this presentation.