2. Who am I?
• Unix developer since 1985
• Yeah, I’m really that old, I learned Unix on BSD 2.9
• Long time SunOS/Solaris/Linux user
• Mozilla committer (but not active now)
• VP of Apache Traffic Server PMC
• ASF member
• Overall hacker, geek and technology addict
zwoop@apache.org
@zwoop
+lhedstrom
11. Why Cache is King
• The fastest content to serve is data the user
already has locally on their computer/browser
– This is near zero cost and zero latency!
• The speed of light is still a limiting factor
– Reduce the latency -> faster page loads
• Serving out of cache is computationally cheap
– At least compared to e.g. PHP or any other higher
level page generation system
– It’s easy to scale caches horizontally
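Scaling a cache horizontally usually means partitioning the URL space across nodes. A minimal sketch in Python (the node names are made up; real deployments use consistent hashing so that adding a node does not reshuffle every key):

```python
import hashlib

def pick_cache_node(url, nodes):
    """Route a URL to one cache node by hashing the key.

    Each URL deterministically maps to the same node, so every node
    only has to hold a fraction of the working set. This is simple
    modulo partitioning; consistent hashing is the production-grade
    variant.
    """
    digest = hashlib.md5(url.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]

nodes = ["cache-1", "cache-2", "cache-3"]   # hypothetical node names
node = pick_cache_node("http://example.com/index.html", nodes)
print(node)  # deterministic: the same URL always hits the same node
```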
12. Choosing an intermediary
[Venn diagram: SMP scalability and performance · Ease of use · HTTP/1.1 · Extensible · Features]
18. The problem
• You basically cannot buy a computer today
with fewer than 2 CPUs or cores
• Things will only get “worse”!
– Well, really, it’s getting better
• Typical server deployments today have at least
8 – 16 cores
– How many of those can you actually use??
– And are you using them efficiently??
• NUMA turns out to be kind of a bitch…
19. Solution 1: Multi-threading
[Diagram: on a single CPU, threads are time-sliced one at a time; on a dual CPU, threads run in parallel across both CPUs]
20. Problems with multi-threading
• It’s a wee bit difficult to get it right!
http://www.flickr.com/photos/stuartpilbrow/3345896050
21. Problems with multi-threading
"When two trains approach each other at a
crossing, both shall come to a full stop
and neither shall start up again until
the other has gone."
– From Wikipedia: an illogical statute passed by the Kansas legislature.
22. Solution 2: Event Processing
[Diagram: scheduled, network and disk I/O events feed a queue; an event loop dispatches them to handlers (disk handler, HTTP state machine, accept handler), which can in turn generate new events]
23. Problems with Event Processing
• It hates blocking APIs and
calls!
– Hating it back doesn’t help :/
• Still somewhat complicated
• It doesn’t scale on SMP by
itself
24. Where are we at?
            Apache TS       Nginx    Squid    Varnish
Processes   1               1 - <n>  1 - <n>  1
Threads     Based on cores  1        1        Lots
Evented     Yes             Yes      Yes      Yes *)
*) Can use blocking calls, with (large) thread pool
25. Proxy Cache test setup
• AWS Large instances, 2 CPUs
• All on RFC 1918 network (“internal” net)
• 8GB RAM
• Access logging enabled to disk (except on Varnish)
• Software versions
– Linux v3.2.0
– Traffic Server v3.3.1
– Nginx v1.3.9
– Squid v3.2.5
– Varnish v3.0.3
• Minimal configuration changes
• Cache a real (Drupal) site
37. RFC 2616 is not optional!
• Neither is the new BIS revision!
• Understanding HTTP and how it relates to
Proxy and Caching is important
– Or you will get it wrong! I promise.
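As an illustration of why understanding the RFC matters, here is a rough Python sketch of the RFC 2616 §13.2 freshness rules (illustrative only, not any particular proxy’s implementation): max-age, when present, takes precedence over Expires, which is exactly why the Drupal responses on the next slides are cacheable despite their 1978 Expires header.

```python
from email.utils import parsedate_to_datetime

def is_fresh(response_headers, current_age):
    """Rough RFC 2616 §13.2 freshness check (a sketch, not a full implementation).

    max-age in Cache-Control takes precedence over Expires;
    current_age is the response's age in seconds.
    """
    cc = response_headers.get("Cache-Control", "")
    for directive in (d.strip() for d in cc.split(",")):
        if directive.startswith("max-age="):
            return current_age < int(directive.split("=", 1)[1])
    # No max-age: fall back to Expires minus Date.
    expires = response_headers.get("Expires")
    date = response_headers.get("Date")
    if expires and date:
        lifetime = (parsedate_to_datetime(expires)
                    - parsedate_to_datetime(date)).total_seconds()
        return current_age < lifetime
    return False  # no explicit freshness information

# Like the Drupal response: max-age=900 wins over any (ancient) Expires date.
print(is_fresh({"Cache-Control": "public, max-age=900"}, 100))   # True
print(is_fresh({"Cache-Control": "public, max-age=900"}, 1000))  # False
```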
38. How things can go wrong: Vary!
$ curl -D - -o /dev/null -s --compress http://10.118.73.168/
HTTP/1.1 200 OK
Server: nginx/1.3.9
Date: Wed, 12 Dec 2012 18:00:48 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 8051
Connection: keep-alive
X-Powered-By: PHP/5.4.9
X-Drupal-Cache: HIT
Etag: "1355334762-0-gzip"
Content-Language: en
X-Generator: Drupal 7 (http://drupal.org)
Cache-Control: public, max-age=900
Last-Modified: Wed, 12 Dec 2012 17:52:42 +0000
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Vary: Cookie,Accept-Encoding
Content-Encoding: gzip
39. How things can go wrong: Vary!
$ curl -D - -o /dev/null -s http://10.118.73.168/
(Note: this client advertises no gzip support)
HTTP/1.1 200 OK
Server: nginx/1.3.9
Date: Wed, 12 Dec 2012 18:00:57 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 8051
Connection: keep-alive
X-Powered-By: PHP/5.4.9
X-Drupal-Cache: HIT
Etag: "1355334762-0-gzip"
Content-Language: en
X-Generator: Drupal 7 (http://drupal.org)
Cache-Control: public, max-age=900
Last-Modified: Wed, 12 Dec 2012 17:52:42 +0000
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Vary: Cookie,Accept-Encoding
Content-Encoding: gzip
EPIC FAIL!
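What the cache should have done is make the varying request headers part of the cache key. A minimal sketch of Vary-aware key construction (illustrative only, not how any of the tested proxies implement it):

```python
def cache_key(url, request_headers, vary):
    """Build a cache key that honors a response's Vary header.

    Vary: Cookie,Accept-Encoding means a cached entry is only valid
    for requests carrying the same values of those headers, so the
    values must be folded into the key. The failure above is a cache
    ignoring Vary and serving a gzip body to a client that never sent
    Accept-Encoding: gzip. (Header lookup here is case-sensitive for
    brevity; a real cache must normalize header names.)
    """
    parts = [url]
    for name in (h.strip() for h in vary.split(",")):
        parts.append("%s=%s" % (name.lower(), request_headers.get(name, "")))
    return "|".join(parts)

# The two curl requests above get different keys, hence separate entries:
with_gzip = cache_key("http://10.118.73.168/",
                      {"Accept-Encoding": "gzip"}, "Cookie,Accept-Encoding")
without = cache_key("http://10.118.73.168/", {}, "Cookie,Accept-Encoding")
print(with_gzip != without)  # True
```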
40. What type of proxy do you need?
• Of our candidates, only two fully support all
proxy modes!
45. ATS – The good
• Good HTTP/1.1 support, including SSL
• Tunes itself very well to the system / hardware
at hand
• Excellent cache features and performance
– Raw disk cache is fast and resilient
• Extensible plugin APIs, quite a few plugins
• Used and developed by some of the largest
Web companies in the world
46. ATS – The bad
• Load balancing is incredibly lame
• Seen as difficult to set up (I obviously disagree)
• Developer community is still too small
• Code is complicated
– By necessity? Maybe …
47. ATS – The ugly
• Too many configuration files!
• There’s still legacy code that has to be
replaced or removed
• Not a whole lot of commercial support
– But there’s hope (e.g. OmniTI recently announced
packaged support)
48. Nginx – The good
• Easy to understand the code base, and
software architecture
– Lots of plugins available, including SPDY
• Excellent Web and Application server
– E.g. Nginx + fpm (fcgi) + PHP is the
awesome, according to a very reputable source
• Commercial support available from the people
who wrote and know it best. Huge!
49. Nginx – The bad
• Adding extensions implies rebuilding the
binary
• By far the most configuration required “out
of the box” to even do anything remotely
useful
• It does not make good attempts to tune itself
to the system
• No good support for conditional requests
50. Nginx – The ugly
• The cache is a joke! Really
• The protocol support as an HTTP proxy is
rather poor. It fares the worst in the tests, and
can be outright wrong if you are not very
careful
• From docs: “nginx does not handle "Vary"
headers when caching.” Seriously?
51. Squid – The Good
• Has by far the most HTTP features of the
bunch. I mean, by far; nothing even comes
close
• It is also the most HTTP-conformant proxy
today. It has the best scores in the CoAdvisor
tests, by a wide margin
• The features are mature, and used pretty
much everywhere
• Works pretty well out of the box
52. Squid – The Bad
• Old code base
• Cache is not particularly efficient
• Has traditionally been prone to instability
• Complex configurations
– At least IMO, I hate it
53. Squid – The Ugly
• SMP is quite an afterthought
– Duct tape
• Why spend so many years rewriting from v2.x to
v3.x without actually addressing some of the real
problems? Feels like the boat has been missed…
• Not very extensible
– Typically you write external “helper”
processes, similar to fcgi. This is not particularly
flexible, nor powerful (you cannot do everything you’d
want as a helper, so you might have to rewrite the Squid
core)
54. Varnish – The Good
• VCL
• And did I mention VCL? Pure genius!
• Very clever logging mechanism
• ESI is cool, even with its limited subset
– Not unique to Varnish though
• Support from several good commercial
entities
55. Varnish – The Bad
• Letting the kernel do the hard work might
seem like a good idea on paper, but perhaps
not so great in the real world. But let’s not go
into a BSD vs Linux kernel war …
• Persistent caching seems like an afterthought
at best
• No good support for conditional requests
• What impact does “real” logging have on
performance?
56. Varnish – The Ugly
• There are a lot of threads in this puppy!
• No SSL. And presumably, there never will be?
– So what happens with SPDY / HTTP2 ?
• Protocol support is weak, without a massive
amount of VCL.
• And, you probably will need a PhD in VCL!
– There’s a lot of VCL hacking to do to get it to
behave well
57.
58. Summary
• Please understand your problem
– Don’t listen to @zwoop on twitter…
• Performance in itself is rarely a key
differentiator; latency, features and
correctness are
• But most important, use a proxy, preferably a
good one, if you run a serious web server
59. Performance AWS 8KB HTML (gzip)
[Bar chart: throughput (QPS) and time to first response (ms) for ATS 3.3.1, Nginx 1.3.9, Squid 3.2.5, Varnish 3.0.3, Varnish 3.0.3 with “varnishlog -w”, and Nginx 1.3.9 (gzip hack); latency values shown: 7.40, 7.92, 9.20, 12.16, 22.81 and 45.87 ms]
60. If it ain’t broken, don’t fix it
But by all means, make it less sucky!
Worked on Traffic Server both at Yahoo, and at Apache. Before we go on, let’s do a show of hands. How many of you have used or are using a proxy server of some sort? How many of you are or were using Squid? There’s still hope for you.
Traffic Server is obviously not the only HTTP intermediary in the Open Source community. Existing servers include Apache mod_proxy, Squid, NGINX, Varnish and HAProxy. This makes choosing a proxy server an interesting but challenging task. You really need to understand your problem space, your requirements, and any restrictions (like budget). Easy for me to pick, but let’s discuss some of the considerations you should take.
There is a lot of “interesting” information out there: reliable sources telling you how they switched from technology A to technology B, and how much better B is. Take it all with a grain of salt. Netflix’s or Facebook’s problems are not your problems! (Unless you work there.)
Before we go into details of what drives Traffic Server, and how we use it, let me briefly discuss the three most common proxy server configurations. In a forward proxy, the web browser has to be configured manually (or via auto-PAC files etc.) to use a proxy server for all (or some) requests. The browser typically sends the “full” URL as part of the GET request. The forward proxy typically is not required to be configured for “allowed” destination addresses, but can be configured with Access Control Lists or blacklists controlling what requests are allowed, and by whom. A forward proxy is typically allowed to cache content, and a common use case is inside corporate firewalls.
A reverse proxy, aka a web accelerator, does not require the browser to cooperate in any special way. As far as the user (browser) is concerned, it looks like it’s talking to any other HTTP web server on the internet. The reverse proxy server, on the other hand, must be explicitly configured for what traffic it should handle, and how such requests are properly routed to the backend servers (aka origin servers). Just as with a forward proxy, many reverse proxies are configured to cache content locally. A reverse proxy can also help with load balancing and redundancy on the origin servers, and help solve difficult problems like Ajax routing.
An intercepting proxy, also commonly called a transparent proxy, is very similar to a forward proxy, except the client (browser) does not require any special configuration. As far as the user is concerned, the proxying happens completely transparently. A transparent proxy will intercept the HTTP requests, modify them accordingly, and typically “forge” the source IP before forwarding the request to the final destination. Transparent proxies usually also implement traffic filters and monitoring, allowing for strict control of what HTTP traffic passes through the mandatory proxy layer. Typical use cases include ISPs and very strictly controlled corporate firewalls. I’m very excited to announce that as of a few days ago, code for transparent proxying is available in the subversion tree.
For me, there are three important areas to consider when choosing the proxy server (or probably, any other server for that matters): Performance and scalability Features Is it a good product for operations to manage, and for engineers to develop applications for? We’ll discuss these in details, but the goal for Apache Traffic Server is obviously to be smack in the middle of this Venn diagram. We’re not quite there yet.
I decided to only have one represent from Apache, and since it’s my talk, and I’m biased, I picked Apache Traffic Server.
Multithreading allows a process to split itself and run multiple tasks in “parallel”. There is significantly less overhead running threads compared to individual processes, but threads are still not free: they need memory resources, and incur context switches. It’s a known methodology for solving the concurrency problem, and many, many server implementations rely heavily on threads. Modern OSes have good support for threads, and standard libraries are widely available.
Deadlocks, where two threads (or processes) need to acquire the same two resources (e.g. locks), can cause the application to completely stall (unrecoverably). Race conditions can also occur, where the outcome is not deterministic but depends on the timing or scheduling of thread execution. Multithreaded code is difficult to write and “get right”.
Events are scheduled by the event loop, and event handlers execute specific code for specific events. This makes it easier to code for: there’s no risk of deadlocks or race conditions. It can handle a good number of connections (but not unlimited). Squid is a good example of an event driven server.
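A minimal sketch of the event-loop pattern from slide 22, using Python’s selectors module (an echo handler stands in for the HTTP state machine):

```python
import selectors
import socket

sel = selectors.DefaultSelector()

def accept_handler(server_sock):
    """Accept event: register the new connection for read events."""
    conn, _ = server_sock.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, read_handler)

def read_handler(conn):
    """Read event: echo data back (a real proxy would run its HTTP state machine here)."""
    data = conn.recv(4096)
    if data:
        conn.sendall(data)
    else:
        sel.unregister(conn)
        conn.close()

server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen()
server.setblocking(False)
sel.register(server, selectors.EVENT_READ, accept_handler)

# Drive one request through the loop: select() waits for readiness,
# and key.data holds the handler each event is dispatched to.
client = socket.create_connection(server.getsockname())
client.sendall(b"ping")
client.settimeout(0.1)
reply = b""
for _ in range(10):
    for key, _ in sel.select(timeout=0.5):
        key.data(key.fileobj)            # dispatch the event to its handler
    try:
        reply = client.recv(4096)
        break
    except socket.timeout:
        pass
print(reply)
```

Note that a single slow or blocking handler stalls every connection, which is exactly the “hates blocking APIs” problem from slide 23.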
It turns out, “varnishlog -w” is not all that good of an idea…
“If you never fail, you’re not trying hard enough.” But please try to avoid failing on major production systems if you can (no one likes you when you kill all of Yahoo’s DNS).
I wasn’t going to show this, but wth, here it is. This is the additional test with nginx doing gzip compression on the fly. It’s a bad idea …