Varnish - PLNOG 4

A modern HTTP accelerator
for content providers

Leszek Urbański
Trader Media East Competence Center

unabridged version
PLNOG 4 – Warsaw, 2010-03-05

Modern web apps
● you know this picture...

Modern web apps
● a nice setup...

Modern web apps

● ...but the application is still slow.

● you need efficient web caching, before the
traffic hits your app

● CDNs? Hardware? Squid?

Web caching
● CDNs
● expensive
● you are completely dependent on a CDN's service

● hardware
● nice, but...
● $30,000 for one BIG-IP 3600, without redundancy
or support, and with only 50 Mbps of compression
under the standard licence

Web caching
● Squid
● a forward proxy/cache with optional reverse
proxying (HTTP acceleration)

● huge config files full of forward proxy options

● it's slow

● “1970s programming”

Varnish
● a state-of-the-art reverse proxy and cache

● open source, initially developed for a
Norwegian tabloid “Verdens Gang” in 2006

● Poul-Henning Kamp – architect and lead
developer

● Linpro AS

Varnish
● used by TOP100 sites
● Twitter
● Photobucket
● weather.com
● answers.com
● Hulu
● Wikia

● source: Ingvar Hagelund
http://users.linpro.no/ingvar/varnish/stats-2010-01-18.txt

Varnish
● used by only one Alexa TOP100 site in Poland
● Gadu-Gadu

Architecture
● Varnish does not fight the OS kernel!

● uses virtual memory, two main stevedores:
● mmap()
● malloc()

● scales well in SMP environments
● event-based acceptor
● multi-threaded worker model

Architecture
● avoids expensive memory operations
● workers used in the MRU order, session lingering

● a worker has a private set of variables on the stack

● static buffers – reused

● uses jemalloc library. No noticeable difference
with Google's tcmalloc

Architecture
● workspaces
● operate on pointers, do not copy data
● malloc() only for the workspaces

● obj_workspace – per object, for request/response
headers and metadata. Watch out for very large
headers/cookies!
● sess_workspace – per thread, for request
processing
● shm_workspace – for SHM logging

Architecture
● SHM logging
● an mmap()ed file shared by all threads and logging
programs

● logging without syscalls!
memcpy(p + SHMLOG_DATA, t.b, l);
/* or */
vsnprintf((char *)(p + SHMLOG_DATA), mlen + 1, fmt, ap);

Architecture
● object eviction from a LRU list
● the list requires locking for writes

● an object is only moved in the LRU list if it hasn't
been moved for the last lru_interval seconds

● hitpass objects
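
The lru_interval rule above can be illustrated with a small sketch. This is not Varnish's code: the class `LazyLRU` and its method names are hypothetical, and an `OrderedDict` stands in for the locked LRU list. It only shows that a hit on a recently moved object skips the list update, and with it the write lock.

```python
from collections import OrderedDict


class LazyLRU:
    """Illustrative LRU list that reorders an entry at most once per lru_interval."""

    def __init__(self, lru_interval):
        self.lru_interval = lru_interval
        # key -> time of the last move; the oldest entry comes first.
        self.order = OrderedDict()

    def insert(self, key, now):
        self.order[key] = now

    def touch(self, key, now):
        """Record a cache hit; return True only if the list was actually changed."""
        if now - self.order[key] < self.lru_interval:
            return False  # moved recently: no list write, so no lock needed
        self.order.move_to_end(key)
        self.order[key] = now
        return True

    def evict_candidate(self):
        """Return the key that would be evicted next (the least recently moved)."""
        return next(iter(self.order))
```

With an interval of 2 seconds, a hit 1 second after insertion leaves the order untouched; a hit 3 seconds after insertion moves the object to the fresh end of the list.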

Architecture
● efficient object purging - “ban list”
● need to purge 200,000 objects from the cache
without overloading the server?

● Varnish keeps a list of purges
● every object is tested against the list, but only if
requested by a client
● if it matches, it is refreshed from a backend
● its “last tested against” pointer is updated
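
The lazy evaluation described above can be sketched as follows. This is a simplified model, not Varnish's implementation: `Cache`, `CachedObject` and `bans_seen` are hypothetical names, bans match on the URL only, and a list index plays the role of the “last tested against” pointer.

```python
import re
from dataclasses import dataclass


@dataclass
class CachedObject:
    body: str
    bans_seen: int  # how many bans this object has already been tested against


class Cache:
    """Illustrative cache with a lazily evaluated ban list."""

    def __init__(self, fetch):
        self.fetch = fetch    # callable url -> body, stands in for the backend
        self.bans = []        # append-only list of compiled URL patterns
        self.objects = {}

    def ban(self, pattern):
        # Adding a ban is cheap: no cached object is touched here.
        self.bans.append(re.compile(pattern))

    def get(self, url):
        obj = self.objects.get(url)
        if obj is not None:
            # Test only against bans added since this object was last checked.
            new_bans = self.bans[obj.bans_seen:]
            if any(ban.search(url) for ban in new_bans):
                obj = None                      # banned: refresh from the backend
            else:
                obj.bans_seen = len(self.bans)  # update "last tested against"
        if obj is None:
            obj = CachedObject(self.fetch(url), bans_seen=len(self.bans))
            self.objects[url] = obj
        return obj.body
```

Banning 200,000 objects therefore costs one list append; the matching work is spread over later client requests.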

Architecture
● results?
● microsecond-level response for cached objects

● good even for static content

● performance limit currently unknown :-)
● 75,000 reqs/s achieved at TMECC
● 143,000 reqs/s achieved by Kristian from Redpill-Linpro

Architecture
● serving a request from cache:
<... futex resumed> ) = 0 <0.629910>
futex(0x7f2a577fe2e8, FUTEX_WAKE_PRIVATE, 1) = 0 <0.000011>
ioctl(9, FIONBIO, [0]) = 0 <0.000011>
read(9, "GET /logo.png HTTP/1.0\r\n (...)", 8191) = 177 <0.000016>
clock_gettime(CLOCK_REALTIME, {1265632945, 828835974}) = 0 <0.000011>
writev(9, [{"HTTP/1.1"..., 8}, (...) 12912}], 32) = 13227 <0.000039>
close(9) = 0 <0.000019>
futex(0x44884bf4, FUTEX_WAIT_PRIVATE, 239, NULL <unfinished ...>

● 10 system calls, 4 for clock

Features
● run-time management and reconfiguration
$ telnet localhost 6082
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
vcl.list
200 23
active 7 boot

vcl.load new1 /etc/varnish/default.vcl
200 13
VCL compiled.
vcl.use new1
200 0

Features
● comprehensive logging and management
● varnishadm
● varnishlog
● varnishncsa
● varnishtop
● varnishstat
● varnishhist
● varnishreplay
● varnishtest

Features
● logging examples
● tags
varnishtop -i RxURL

varnishtop -i TxURL

varnishtop -i RxHeader -I '^User-Agent'

varnishlog -c -o ReqStart 10.0.0.1

varnishlog -b -o TxHeader '^X-Forwarded-For: .*10.0.0.1'

Features
● varnishstat – real time statistics
client_conn 87737603 99.74 Client connections accepted
client_req 335496200 381.40 Client requests received
cache_hit 307936704 350.07 Cache hits
cache_hitpass 811746 0.92 Cache hits for pass
backend_conn 12311926 14.00 Backend conn. success
n_object 549675 . N struct object
n_wrk 100 . N worker threads
n_expired 23826372 . N expired objects
n_lru_nuked 0 . N LRU nuked objects
n_wrk_failed 0 0.00 N worker threads not created
s_req 335510357 381.41 Total Requests
s_pass 2947900 3.35 Total pass
s_fetch 27317481 31.05 Total fetch
sma_nbytes 6661407561 . SMA outstanding bytes
sma_balloc 2173616292374 . SMA bytes allocated
sma_bfree 2166954884813 . SMA bytes free
backend_req 27318738 31.06 Backend requests made
esi_parse 0 0.00 Objects ESI parsed (unlock)
esi_errors 0 0.00 ESI parse errors (unlock)
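
A rough way to read these counters, as a sketch: the values below are copied from the output above, while the helper `ratio` is hypothetical. The cache_miss counter is not part of the excerpt, so hits are related to client requests here, which is an assumption and not the usual hit / (hit + miss) definition.

```python
def ratio(part, total):
    """Return part / total as a fraction, or 0.0 when total is zero."""
    return part / total if total else 0.0


# Counter values copied from the varnishstat output above.
counters = {
    "client_req": 335496200,
    "cache_hit": 307936704,
    "backend_req": 27318738,
}

hit_share = ratio(counters["cache_hit"], counters["client_req"])
backend_share = ratio(counters["backend_req"], counters["client_req"])

print(f"hits per client request:             {hit_share:.1%}")
print(f"backend requests per client request: {backend_share:.1%}")
```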

Features
● timing information
830 ReqEnd c 877345549 1233949945.075706005 1233949945.075754881 0.017112017 0.000022888 0.000025988

fields after the tag: type (c), XID, start time, end time,
accept()-processing, processing-delivery, delivery time
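
The ReqEnd record above can be split into named fields with a few lines of code. This is a sketch: the field names are hypothetical labels taken from the slide, and `Decimal` is used because a 64-bit float cannot keep nanosecond resolution on epoch-sized timestamps.

```python
from decimal import Decimal

# Hypothetical field names, following the labels on the slide.
FIELDS = ("xid", "start", "end", "accept_to_processing",
          "processing_to_delivery", "delivery")


def parse_reqend(payload):
    """Split the payload of a ReqEnd record into named fields."""
    values = payload.split()
    if len(values) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(values)))
    record = dict(zip(FIELDS, (Decimal(v) for v in values)))
    record["xid"] = int(record["xid"])
    return record


rec = parse_reqend("877345549 1233949945.075706005 1233949945.075754881 "
                   "0.017112017 0.000022888 0.000025988")
print(rec["end"] - rec["start"])
print(rec["processing_to_delivery"] + rec["delivery"])
```

For this record, end minus start equals processing plus delivery; the accept()-processing interval is counted separately.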

Features
● backend load balancing – directors
● round-robin
● random
● backend health polling – using new connections
● grace
● URL serialization
● IPv6 support

Features
● no forward-proxy support – can be done, but
only with a huge amount of configuration magic

● flexible purging
purge req.http.host == foobar.com && req.url ~ ^/directory/.*$
purge obj.http.Cookie ~ example=true

● ESI support

Features
● Edge Side Includes
● a markup language for dynamic content assembly

● used by Akamai, IBM WebSphere, F5, Varnish

● without ESI: page-level caching decisions

● with ESI: a page can be split into separate blocks
and assembled by the cache server

Features
● Edge Side Includes
● Varnish implements a small subset of ESI
● no compression support yet
● no If-Modified-Since support yet
<esi:include src="/esi/hot_news.html"/>

<esi:remove>
<a href="/something">something</a>
</esi:remove>

Features
● VCL – Varnish Configuration Language
● a domain-specific language
● translated to C and compiled
● dynamically loaded
● similar to C, Perl
● = == ! && || ~ !~
● character escaping like in URLs: %nn
● no user-defined variables, use HTTP headers:
set req.http.something = "";
unset req.http.something;

Features
● "normal" "concatenated" "strings" or
{"string
string
"}

synthetic {"string"};

● if () {} elsif () {}
● no loops
● include "file.vcl";
● regsub(), regsuball()

Features
● user-defined subroutines
sub f {
do_magic;
}

call f;

● no arguments / return values in subs
● return(); exclusive to internal VCL functions
● special variables: now (unix time), client.ip,
server.ip, server.port, server.identity

Features
● ACLs
acl localnet {
    "localhost";
    "10.0.0.0"/24;
    ! "10.0.0.1";
}

if (client.ip ~ localnet) {
do_magic;
}

● security.vcl
● if everything else fails... embedded C!

Features
● embedded C in VCL
● example: syslog logging from VCL (don't :-)
C{
#include <syslog.h>
}C

C{
syslog(LOG_INFO, "Something happened at VCL line XX.");

syslog(LOG_ERR, "Response from backend: XID %s request %s %s \"%s\" %d \"%s\" \"%s\"",
    VRT_r_req_xid(sp), VRT_r_req_request(sp),
    VRT_GetHdr(sp, HDR_REQ, "\005Host:"), VRT_r_req_url(sp),
    VRT_r_obj_status(sp), VRT_r_obj_response(sp),
    VRT_GetHdr(sp, HDR_OBJ, "\011Location:"));
}C

VCL
● request path through VCL
● vcl_recv
● vcl_pipe
● vcl_pass
● vcl_hash
● vcl_{hit,miss}
● vcl_fetch
● vcl_deliver
● http://varnish-cache.org/wiki/VCLExampleDefault
● this graph is oversimplified!

VCL
● vcl_recv
● called at the beginning, after the request has been
received
● possible returns: error, pass, pipe, lookup
● example variables: req.request, req.url, req.proto,
req.backend, req.backend.healthy,
req.http.Header
if (req.http.host == "static.foo.com" && req.url ~ "^/static/.*") {
    set req.backend = cluster1;
} else {
    error 404 "Unknown virtual host";
}

VCL
● vcl_pipe
● called when entering pipe mode
● shifts bytes back and forth (client ↔ backend)
● possible returns: error, pipe
● example variables: bereq.request, bereq.url,
bereq.proto, bereq.http.Header
● timeouts

VCL
● vcl_pass
● called when entering pass mode

● the request is passed to the backend without
caching

● possible returns: error, pass

● example variables: bereq.*

VCL
● vcl_hash
● called on object lookup
● generates a user-configurable object hash
● possible returns: hash
● example variables: req.hash
sub vcl_hash {
    if (req.url ~ "^/content" && req.http.Cookie ~ "adult=true") {
        set req.hash += "adultContent";
        set req.http.X-Adult-Content = "1";
    }
}

VCL
● vcl_hit
● called after lookup when hit
● possible returns: error, pass, deliver
● example variables: obj.hits, obj.ttl
● caveat: do not modify the object here!
● example: adaptive TTLs:
if (req.http.host ~ "^images\.") {
    if (obj.hits > 5 && obj.hits < 10) {
        set obj.ttl = 8h;
    } elsif (obj.hits >= 10) {
        set obj.ttl = 2d;
    }
}

VCL
● vcl_miss
● called after lookup when missed

● possible returns: error, pass, fetch

● example variables: bereq.*

VCL
● vcl_fetch
● called after the object has been fetched
● possible returns: error, pass, deliver, esi
● example variables: obj.hits, obj.proto, obj.status,
obj.response, obj.cacheable, obj.ttl, obj.lastuse
● obj.cacheable means: obj.status is 200, 203, 300,
301, 302, 410 or 404
● forced obj.ttl first set here
● obj.* is called beresp.* in trunk

VCL
● vcl_fetch
● ESI processing takes place here


<esi:include src="/counter.cgi"/>

sub vcl_fetch {
if (req.url ~ "/esi/" || obj.http.X-ESI) {
esi;
set obj.ttl = 1d;
} elsif (req.url == "/counter.cgi") {
set obj.ttl = 1m;
}
}

VCL
● vcl_deliver
● called before delivery to the client
● possible returns: error, deliver
● example variables: resp.proto, resp.status,
resp.response, resp.http.HEADER
● modify headers for the client here
set resp.http.X-Served-By = server.identity;
if (obj.hits > 0) {
    set resp.http.X-Varnish-Hit = "HIT";
    set resp.http.X-Varnish-Hits = obj.hits;
} else {
    set resp.http.X-Varnish-Hit = "MISS";
}

VCL
● vcl_error
● called on errors
● possible returns: deliver
● example variables: req.*, obj.*
● customizing error pages:
sub vcl_error {
    if (req.url ~ "^/MONITOR\.txt$") {
        synthetic {"MONITOR
"};
        deliver;
    }
}

VCL
● restarts
● the “restart” keyword turns the request all the
way back to vcl_recv, available everywhere
sub vcl_fetch {
if (obj.status >= 500) {
restart;
}
}

sub vcl_error {
if (obj.status == 500 && req.restarts < 4) {
restart;
}
}

VCL
● restarts
● the “restart” keyword – you can even try another
data center
sub vcl_recv {
if (req.restarts == 0) {
set req.backend = data_center_1;
} elsif (req.restarts == 1) {
set req.backend = data_center_2;
}
}

VCL
● things to remember
● the req.* data structure is available throughout the VCL
(except in vcl_deliver)
● do not modify objects in vcl_hit (except for TTL)
● if unsure, translate the VCL to C
varnishd -C -f file.vcl

● look for VRT_count(sp, X) for ordering
● if you don't return in a vcl_*, default VCL for that
function is appended

VCL examples
● purging, “the Squid way”
sub vcl_recv {
    if (req.request == "PURGE") {
        if (!client.ip ~ purge) {
            error 405 "Not allowed";
        }
        lookup;
    }
}
sub vcl_hit {
    if (req.request == "PURGE") {
        set obj.ttl = 0s;
        error 200 "Purged";
    }
}
sub vcl_miss {
    if (req.request == "PURGE") {
        error 404 "Not found";
    }
}

VCL examples
● saint mode (trunk only)
● do not send errors to clients
sub vcl_fetch {
if (beresp.status >= 500) {
set beresp.saintmode = 20s;
restart;
}
set beresp.grace = 30m;
}

● saint mode will disable a backend for a specified
period of time
● if all backends are unavailable - grace

VCL examples
● force grace on error
sub vcl_error {
    if (req.restarts == 0) {
        set req.http.X-Serve-Graced = "1";
        restart;
    }
}

sub vcl_recv {
    if (req.http.X-Serve-Graced && req.restarts == 1) {
        set req.backend = dead;
    }
}

● define a “dead” backend with health polling
● when restarted from vcl_error, graced content will
be served

VCL examples
● URL rewriting
if (req.http.host ~ "^(www\.)?foo" && req.url ~ "^/images/") {
    set req.http.host = "images.foo";
    set req.url = regsub(req.url, "^/images/", "/");
}

● redirects (a bit of a hack)
sub vcl_recv {
    if (req.http.host ~ "^(www\.)?foo\.com" && req.http.User-Agent ~
      "iPhone|Nokia|Motorola") {
        error 701 "Moved temporarily";
    }
}
sub vcl_error {
    if (obj.status == 701) {
        set obj.http.Location = "http://m.foo.com/";
        set obj.status = 302;
        deliver;
    }
}

VCL examples
● caching publicly available authorized pages
sub vcl_fetch {
    if (req.http.Authorization && !obj.http.Cache-Control ~ "public") {
        pass;
    }
}

● caching logged in users (be careful!)
● http://varnish-cache.org/wiki/VCLExampleCachingLoggedInUsers
● possible with per-user caching and careful use of
ESI
● separate “(not) logged in” objects from “logged in
as...”

VCL examples
● cookie based hashing
sub vcl_hash {
if (req.http.Cookie ~ "language=esperanto" ) {
set req.hash += "LangEsperanto";
}
}

● result: a separate cached version of the object for
requests with Cookie: language=esperanto;
● extracting the value of a cookie
● nothing more than a regexp
regsub(req.http.Cookie, "^.*?cookie=([^;]*);*.*$", "\1");
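
To see what this pattern does and where it can surprise, here is the same expression tried out with Python's `re` module. This is only a sketch: the helper `cookie_value` is hypothetical, and Python's regex dialect may differ from the engine used by a given Varnish version.

```python
import re


def cookie_value(cookie_header, name):
    """Apply the slide's pattern: replace the whole header with the cookie's value."""
    pattern = r"^.*?" + re.escape(name) + r"=([^;]*);*.*$"
    return re.sub(pattern, r"\1", cookie_header)


print(cookie_value("a=b; cookie=value; c=d", "cookie"))
print(cookie_value("a=b", "cookie"))
print(cookie_value("mycookie=x; cookie=y", "cookie"))
```

Two caveats show up: when the cookie is absent the header comes back unchanged, and a cookie whose name merely ends in the searched name (mycookie) is matched first.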

VCL examples
● serving synthetic responses
if (req.url ~ "^/MONITOR.txt") {
error 200 "OK";
}

● allowing reloads from browsers without purging
if (req.http.Cache-Control ~ "(no-cache|no-store|private)") {
pass;
}

● watch out for nasty bots!
● passing everything for secure URLs
if (req.url ~ "^/secure") {
pass;
}

VCL examples
● normalizing Accept-Encoding headers for
compression
if (req.http.Accept-Encoding) {
if (req.http.Accept-Encoding ~ "gzip") {
set req.http.Accept-Encoding = "gzip";
} elsif (req.http.Accept-Encoding ~ "deflate") {
set req.http.Accept-Encoding = "deflate";
} else {
remove req.http.Accept-Encoding;
}
if (req.url ~ "\.(css|js)$" && req.http.User-Agent ~ "MSIE 6") {
remove req.http.Accept-Encoding;
}
}

Best practices
● RFC 2616!
● not really for reverse proxies...

● Varnish is both a client and a server

● TTLs

Best practices
● object TTL control – headers from backend
● considered in the following order:
● Cache-Control: s-maxage=<relative time>
● Cache-Control: max-age=<relative time>
● Varnish ignores all other Cache-Control headers
(unless told otherwise in VCL)
● Expires: absolute time, requires synced clocks
● Expires is an HTTP/1.0 header
● Varnish will try to compensate for clock skew
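
The header precedence listed above, as a minimal sketch. It is not Varnish's RFC 2616 code: the function name is hypothetical, `default_ttl=120.0` is a placeholder value, and clock-skew compensation and the Age header are left out.

```python
import re
from email.utils import parsedate_to_datetime


def ttl_from_headers(headers, default_ttl=120.0):
    """Pick a TTL in seconds: s-maxage, then max-age, then Expires minus Date."""
    cache_control = headers.get("Cache-Control", "")
    for directive in ("s-maxage", "max-age"):
        match = re.search(r"(?:^|[\s,])" + directive + r"\s*=\s*(\d+)",
                          cache_control, re.IGNORECASE)
        if match:
            return float(match.group(1))
    if "Expires" in headers and "Date" in headers:
        try:
            expires = parsedate_to_datetime(headers["Expires"])
            date = parsedate_to_datetime(headers["Date"])
            # Relative to the backend's own Date, so the two clocks need not agree.
            return max(0.0, (expires - date).total_seconds())
        except (TypeError, ValueError):
            return default_ttl  # unparsable dates: fall back to the default
    return default_ttl


print(ttl_from_headers({"Cache-Control": "max-age=60, s-maxage=300"}))
print(ttl_from_headers({"Date": "Fri, 05 Mar 2010 10:00:00 GMT",
                        "Expires": "Fri, 05 Mar 2010 10:30:00 GMT"}))
```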

Best practices
● object TTL control – VCL
● set obj.ttl = x; - takes precedence over headers
● default_ttl configuration parameter

● Varnish sets the Age: header

● if in doubt, check varnishlog
● TTL tag

Best practices
● TTL tag in varnishlog
509 TTL - 1850178309 RFC 1798 1267393695 1267393694 1267395494 0 0
fields after the XID: source (RFC), TTL, time, Date, Expires, max-age, age

242 TTL c 1416303904 VCL 86400 1267393696
fields after the XID: source (VCL), TTL, time

Best practices
● caching policy
● Last-Modified / If-Modified-Since

● ETag / If-None-Match

● Vary

Best practices
● compression
● Varnish leaves compression up to the backends

● gzip, deflate, none – data set * 3

● Vary: Accept-Encoding

● normalize Accept-Encoding from browsers
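
Why normalization matters with Vary: Accept-Encoding, as a small sketch. The function is a hypothetical Python mirror of the kind of VCL normalization meant here, and the sample header values are made up for illustration. Every distinct header value would otherwise become its own cached variant.

```python
def normalize_accept_encoding(value):
    """Collapse an Accept-Encoding header to "gzip", "deflate" or None."""
    if not value:
        return None
    if "gzip" in value:
        return "gzip"
    if "deflate" in value:
        return "deflate"
    return None


# Hypothetical header values as different browsers might send them.
seen = ["gzip, deflate", "gzip,deflate,sdch", "deflate", "identity", None]

raw_variants = set(seen)
normalized_variants = {normalize_accept_encoding(v) for v in seen}
print(len(raw_variants), "variants before,", len(normalized_variants), "after")
```

After normalization at most three variants remain per object (gzip, deflate, none), which is the “data set * 3” above. Like the substring test it mirrors, this sketch ignores q-values such as gzip;q=0.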

Best practices
● sanitize request headers
● we've had requests coming in to
“http://our.com/http://another.com/.*”
if (req.url ~ "^/?http://") {
    set req.url = regsub(req.url, "^/?http://.*", "/");
}

● cache hit ratio went from 92% to 94%
● normalize vhosts
if (req.http.host ~ "^(www\.)?example\.com") {
    set req.http.host = "example.com";
}

● hit ratio and backend requests: 1% is half of 2%!
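
The arithmetic behind that last point, as a sketch (the function name is hypothetical): what the backends see is the miss ratio, so a small gain in hit ratio is a large relative drop in backend requests.

```python
def backend_load_change(old_hit_ratio, new_hit_ratio):
    """Relative change in backend requests when the hit ratio changes."""
    old_miss = 1.0 - old_hit_ratio
    new_miss = 1.0 - new_hit_ratio
    return (new_miss - old_miss) / old_miss


# 92% -> 94%: misses go from 8% to 6% of requests.
print(f"{backend_load_change(0.92, 0.94):+.0%}")
# 98% -> 99%: misses go from 2% to 1% of requests.
print(f"{backend_load_change(0.98, 0.99):+.0%}")
```

Going from 92% to 94% removes a quarter of the backend requests; going from 98% to 99% removes half of them.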

Best practices
● set Content-Length on the backends
● static files: multiple backends = multiple VFS
caches
● serving large objects a.k.a. My Own YouTube
● objects are fully fetched before delivery
● use pipe
● ranges not supported
● not really suitable for serving video content

Best practices
● forced TTLs
● on heavily loaded sites – force TTLs to a few
seconds on all pages (but pass secure content)

● purging
● entries on the ban list accumulate – consider
forcing expiry by PURGE requests
● when forcing expiry, purge all Vary versions!
● include purging in your application design

Best practices
● debugging
● add an X-Served-By header

● add other headers along the way

● beware of header traffic!

● X-Varnish header

Best practices
● HTTPS – use pound or perlbal
● in vcl_pipe:
set bereq.http.Connection = "close";

● drain connections quickly before restarting
varnishd
sub vcl_recv {
if (req.http.Connection != "close") {
set req.http.Connection = "close";
restart;
}
}

Best practices
● graph everything, ask questions later
● YMMV: what is good for the big guys from
TOP100 may not be as good for you (e.g.
stevedore choice)
● test
● wget --save-headers
● curl -i
● LWP: GET -USsed
● caveat: lwp-request does: “GET http://foo/bar”

Best practices
● if everything else fails...
$ gdb /usr/sbin/varnishd core
GNU gdb 6.8-debian
This GDB was configured as "x86_64-linux-gnu"...
(gdb) bt

(…)

(gdb) frame 3
#3 0x000000000042ef64 in mgt_cli_vlu (priv=0x7fb112813c00,
p=0x7fb1128d3000 "debug.health") at mgt_cli.c:270
270 xxxassert(i == strlen(p));

● don't strip Varnish binaries
● compile with --enable-debugging-symbols --enable-diagnostics

Configuration
● object hash table
● Varnish 2.0: -h classic,N
● N hash buckets – objects / 10
● a prime number

● Varnish trunk: -h critbit
● Patricia Tree
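
One way to pick N for -h classic,N, reading the rule above as "about one bucket per ten objects, rounded up to a prime". This is a sketch under that assumption; both function names are hypothetical.

```python
def is_prime(n):
    """Trial division; fast enough for bucket counts of this size."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def classic_buckets(expected_objects):
    """Smallest prime >= expected_objects / 10 (never below 2)."""
    n = max(2, expected_objects // 10)
    while not is_prime(n):
        n += 1
    return n


# Example: around one thousand cached objects.
print(classic_buckets(1000))
```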

Configuration
● run-time parameters (can be set from CLI)
● obj_workspace=Nbytes (dynamic in trunk) –
headers, per object overhead
● sess_workspace=Nbytes – entire header and all
edits done in VCL, per thread
● shm_workspace=Nbytes – for the log, per thread
● shm_reclen=Nbytes – max SHM log record length
● session_linger=Nms – time before a worker thread
is returned to its pool
● sess_timeout=Ns – persistent session timeout

Configuration
● thread_pools=N – set to the number of CPU
cores
● thread_pool_add_delay=Nms – default may be
too high
● thread_pool_max and _min – a bit confusing
● max – the limit for all thread pools
● min – the limit for one thread pool
● do not set too high

OS environment
● forget about 32-bit
● malloc() better than mmap() for in-memory
cache sets
● also better for larger-than-memory cache sets
on Linux (YMMV)
● Varnish on virtualized guests?
● slight latency difference
● can be an issue for on-line auction sites

OS environment
● Virtualization
● Varnish on a standalone system

OS environment
● Virtualization
● Varnish on a Xen domU with pinned vcpus

OS environment
● I/O related tuning on Linux
● set vm.swappiness to 0
● /var/lib/varnish/$HOSTNAME/_.vsl – the SHM log
● put the SHM log on tmpfs
● anticipatory elevator best on HDDs, noop on SSDs
● use ext2
● noatime
● swap striping
● iSCSI is great for logs

OS environment
● network tuning
● run NTP
● check if your load balancer uses keep-alive
● /proc/sys/net – don't tune if you don't know what
you're doing
● don't use net.ipv4.tcp_tw_reuse
● tcp_tw_recycle is even worse
● normally the socket waits 2 * MSL
● reusing causes problems with NAT routers

New features in trunk
● upcoming release: Varnish 2.1
● persistent storage (without LRU support)
● URL hashing director
● client hashing director
● critbit by default
● saint mode
● obj_workspace allocated dynamically
● req.* in vcl_deliver
● obj.* in vcl_fetch is now beresp.*

Shopping list
● http://varnish-cache.org/wiki/PostTwoShoppingList
● ESI enhancements (304s, gzip, etc.)
● compression support
● streaming in pass / fetch
● Content-Range support
● file upload buffering
● VCL cookie handling (req.cookie.foo)
● custom formats in varnishlog and varnishncsa

Shopping list
● expiry randomization
● “lemming effect”
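
As we read it, the “lemming effect” is many objects with the same TTL expiring at the same moment and hitting the backends together. A generic jitter sketch follows; it is not a Varnish feature or API, and the function name and the 10% spread are assumptions.

```python
import random


def jittered_ttl(base_ttl, spread=0.1, rng=random):
    """Return base_ttl shifted by a random amount within +/- spread (a fraction)."""
    return base_ttl * (1.0 + rng.uniform(-spread, spread))


# Objects fetched together get slightly different expiry times.
rng = random.Random(4)  # fixed seed only to make the illustration repeatable
ttls = [jittered_ttl(300.0, spread=0.1, rng=rng) for _ in range(5)]
print([round(t, 1) for t in ttls])
```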

Support & development
● commercial support offered by Redpill-Linpro

● community support #varnish on irc.linpro.no

● VML – Varnish Moral License
● http://phk.freebsd.dk/VML/
● not a support contract!
● help pay for Varnish development

Sources
● http://varnish-cache.org/
● TMECC VCL configs
● Wikia VCL configs
● http://kristian.blog.linpro.no/
● http://ingvar.blog.linpro.no/
● #varnish
