3. @NYTDevs | developers.nytimes.com
Who I Am
A software architect focusing on server
configuration and resiliency, with
sidelines in DevOps, release
engineering, and testing.
Started as a LAMP developer but has
always been a generalist interested in
all aspects of the data center.
5. Scope of this Presentation
Everything that follows pertains to the use of
Varnish to accelerate serving content on the
<www.nytimes.com> hostname, only.
There are several other Varnish clusters at
NYTimes.com.
7. NYTimes.com: Traffic
<www.nytimes.com> normal daily peak is
~75,000 requests/second – just this hostname.
● primarily APIs
● HTML traffic is ~4,000 req/sec
Traffic spikes up to 4x during a
breaking news event R.I.P. Leonard Nimoy
9. Mission Statement
“Leverage the latest technology in order to
improve the user experience, enhance our
journalism, and provide a more effective
environment for our advertisers.”
Project document
10. Improve the User Experience
Technical goals:
1. 25% improvement in browser load time, minimum.
2. ...
Sounds like a job for page caching!
13. Exception to the Rule
A complete code rewrite (almost). Why?
● < insert usual suspects here >
● Deeply embedded server-side personalization
(includes ads)
Output was simply uncacheable.
14. Never Let a Crisis Go To Waste
☒ (Test|Behavior) Driven Development
☒ Web performance was core from Day 0
☒ Async wherever, whenever
☒ New APIs
☒ CSS: LESS (then), SASS (now)
17. Changing Horses in Midstream
Site functionality that must not break:
● redirects (mobile, registration, et al.)
● user tracking
● web crawler detection
25. Cache Invalidation
Purge is not good enough (in Varnish 3).
PURGE causes cache misses on the highest-traffic content.
Needed cache re(set|build|prime).
26. NYT Homepage
● Must always be in Varnish cache.
● Every article linked to on the homepage
should already be in Varnish cache.
No cache misses = long TTL.
27. But...
Some content changes frequently.
Latest version served in real-time after every
publish action.
Short TTL = more cache misses.
PURGE = more cache misses.
28. Cache Rules Everything Around Me
CREAM: an API to re(set|build|prime) a single
cache entry.
Publish event calls API synchronously.
30. Where We Are Today: Software
~2,300 lines of VCL code
● Minimal inline C
10 VMODs
● std, utils, crashhandler, wurfl, boltsort, queryfilter
● 4 custom
31. Where We Are Today: Traffic
Of the ~4,000 page requests/second to
<www.nytimes.com>:
● ~1,500 now served by Varnish
● ~91% cache hit rate (down from ~96%)
32. Where We Are Today: Performance
Load test: ~3,000 requests/second/server with
current configuration
We could handle a 4x spike with 2 servers
We run 8 servers per data center
33. 8 Servers? Why?!
Because:
● Biggest spike ever was 10x (2012 Election Night)
● 2 hypervisors => even number of server instances
● Takes too long for us to dynamically provision
● We can afford to stay over-provisioned
Yes, this causes extra backend network traffic.
Scaled out for resilience, scaling up for performance.
34. Next Steps for Us
1. Install Varnish Cache Plus 4.
2. Utilize the Varnish Plus tools for monitoring.
3. Replace CREAM with VHA.
Good afternoon.
My name is Adam Falk.
I am one of the software architects for the Web Products Department at the New York Times.
Today I am going to be talking about how we used Varnish as the linchpin of our recent re-architecture.
Now, we only became a Varnish Plus subscriber last year, and so far we have only made use of the support.
We are still running the Open Source version, and have custom solutions that can likely be replaced by tools now available to us via the subscription.
Here is a quick bio of me. I have been at The Times for 9 years, and a software architect for 4.
As you see, I am wearing a few hats.
The title of Software Architect can mean different things to different people and organizations. My duties revolve around configuration and resiliency.
The New York Times Company employs over 400 people dedicated to our digital identity.
This number includes Web Development, Native Mobile, E-Commerce, Content Management, Analytics, Project Management, QA, Operations, and even Newsroom specialties.
As I mentioned to another attendee, we see ourselves as a technology company. Even our printing plant in College Point is a high-tech operation utilizing robotics and automation.
Please note that I am limiting this talk to just one of several Varnish clusters we use to serve the entirety of NYTimes.com.
This will only cover <www.nytimes.com>.
The NYTimes.com website has over 15 million page URLs in the wild, and we create 200 to 300 new pages every day.
This is the journalism: articles, slideshows, interactives, weekly magazine editions, and such--all accompanied by the best photos and graphics we can make. Those account for further millions of URLs, but are not served from <www.nytimes.com>.
Much of that journalism exists online only as image-scans of the printed papers going back to 1851. Replacing them with clean ASCII text is an ongoing, multi-year effort.
Peak traffic to <www.nytimes.com> for a normal day is about 75,000 requests/second. Again, this is just for the www hostname.
An aside: some of you may remember when the Syrian Electronic Army hacked our registrar and changed the nameservers for several hours. They had created a fake, defaced version of our site to serve in its place. Their servers fell over in seconds, and the world never saw it.
Now the bulk of that traffic is public APIs that need the same origin.
Most of our static assets are served by Content Delivery Networks via other hostnames, but not all of them.
A breaking news event can increase that by as much as 4x.
The image I have here is our most recent spike, not quite 2.5x. The news was Leonard Nimoy’s death. Thanks to fake celebrity death spam in social media, people seek out confirmation via obituaries on traditional journalism sites like NYTimes.com.
As a further narrowing of this talk’s scope, this presentation is going to drill down into the handling of the 4,000 HTML requests per second.
Many of you are aware of the new look of NYTimes.com that launched a year and a half ago.
It was not just a redesign, though; it was a rearchitecture.
Senior management recognized the need for it and gave it their full support.
This is a direct quote from the project document.
[READ THE QUOTE]
They also had an informed understanding of what was needed to succeed.
Technical goal #1 was a minimum 25% improvement in browser load time.
Achieving this would involve work at all stages of the browser request, but we immediately knew that server-side page caching was going to be a needed first step.
Thanks to Varnish as well as a modern JavaScript framework (BackboneJS), we achieved a 50% improvement.
We did it by building a completely new multi-cluster system beside the legacy one.
I have omitted the true complexity of the legacy infrastructure from the slide.
It is also a multi-cluster system.
The new infrastructure is smaller and simpler, and has been a complete success.
Almost nothing of the legacy application stack was retained, for several reasons. You know the usual suspects:
massive code debt, antiquated technology, antiquated development processes, etc.
The bottom line was simply that the legacy page rendering framework was based upon architectural assumptions that were obsolete.
Everything was done server-side: login detection, personalization decisions, and--most importantly--advertisement serving.
Nothing was cacheable, and no amount of iterative refactoring was going to change that.
So we left that framework as-is, continuing to adequately do its job in the legacy stack.
The new stack would share nothing.
We did salvage the rendering engine that recursively handles the modules that model our Content Management System's output. That was forked, though, and so is not shared. So: not quite a complete rewrite.
With the aforementioned support of senior management, we now had the freedom to build a system to the best of our abilities.
Testing was almost non-existent for the legacy system. We made it integral to the new process. We now have unit, integration, smoke, system, and end-to-end test suites.
Web performance was no longer an afterthought. New features could be kicked out of builds simply because they were performance dogs.
Our existing APIs suffered from the same problems as our application stack, so they were replaced. All of them utilize Varnish, as well.
I mention our CSS technologies here for two reasons:
First, they both allowed us to manage, through modularization, the complexity of our CSS codebase.
Second, it is a case study in how much better the new architecture is: we switched from LESS to SASS in just one month.
That summarizes what we did and why.
Any questions so far?
I am going to jump across three spotlights and then invite your questions, either now or later in the break room.
The first is a recommendation,
the second is an overview of our custom implementation for cache invalidation.
Lastly, I will show the Varnish stack as-is today.
Even though the initial launch carried only a fraction of site traffic (with more migrated from launch day going forward),
the new Varnish cluster needed to handle a sizeable range of existing web traffic functionality.
That meant that our VCL code was going to be fairly complex at launch, and, of course, would only increase over time.
So I made sure we followed the best practice of coding VCL with emphasis upon readability and maintainability.
We achieved that here with modularity: multiple VCL files each containing a single piece of functionality.
A file may contain only a dozen lines of code, and that is perfectly alright.
You will notice that I have 5 separate files for performing URL redirects, but you can infer each one’s functionality from only the filename.
The include statements in the default.vcl provide explicit control of load order and thus execution order.
They are all added here. It is that simple.
The power is in the clarity. Each Varnish function is executed in the explicit order.
First this vcl_recv…
…Then this one…
And so on for each Varnish function in its turn.
Unneeded functions are omitted, of course.
We achieved the Single Responsibility Principle and the benefits that accrue from it.
No matter the size of your VCL codebase, I cannot recommend this highly enough.
This has paid dividends in every metric I could apply to code.
I want to specifically mention the rapid speed at which new developers to Varnish configuration are able to understand where and how to make their first VCL code changes.
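As a sketch, a modular default.vcl along these lines might read as follows. The filenames here are illustrative, not our actual ones:

```vcl
# default.vcl -- include order is execution order.
# Varnish concatenates identically named subroutines (vcl_recv,
# vcl_fetch, ...) in the order in which the files are included.
include "backends.vcl";
include "redirect_mobile.vcl";
include "redirect_registration.vcl";
include "user_tracking.vcl";
include "crawler_detection.vcl";
include "caching_policy.vcl";
```

Each included file holds one piece of functionality, even if that is only a dozen lines, and the top-level file stays a pure table of contents.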
Any questions at this point?
As a segue into the next section, I give you my favorite programming joke.
So cache invalidation. We are still using Varnish 3.
As the New York Times style guide instructs, I will give you the most important fact first:
We created an event-driven cache reset system rather than use Varnish purge.
Why?
Traffic to NYTimes.com exhibits the usual fat-head, long-tail pattern, based on content age.
As you would expect, we focus on maximizing the performance of the head.
The primary criterion for what is head is the homepage.
It is the ultimate in editorial curation.
It also has Most Popular lists, which helps those get even more traffic.
All of this cascades into our feeds and social media channels.
It must always be in cache, as well as the pages it links to.
So a long cache TTL.
But we have a business requirement.
An article on a breaking story can be updated as often as once a minute.
The newsroom wants every new version immediately live.
A short TTL is not the solution, and neither is cache purge, because they both result in cache misses.
We do not want our visitors having to make the round-trip to the backend server cluster (if we can possibly avoid it).
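For contrast, the conventional Varnish 3 purge handling we decided against looks roughly like this (the ACL contents are illustrative). The problem is built in: the very next request for a purged URL is a guaranteed cache miss that must make that backend round-trip.

```vcl
acl purgers {
    "127.0.0.1";   # illustrative; the real list would be internal hosts
}

sub vcl_recv {
    if (req.request == "PURGE") {
        if (!client.ip ~ purgers) {
            error 405 "Not allowed.";
        }
        return (lookup);
    }
}

sub vcl_hit {
    if (req.request == "PURGE") {
        purge;                 # evict the object; next hit is a miss
        error 200 "Purged.";
    }
}

sub vcl_miss {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}
```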
So we built CREAM.
It is a synchronous RESTful API that was added to our CMS publishing process.
CREAM sends a HTTP request for the just-updated article to every Varnish server (in parallel).
That request includes a secret header that will force a miss in all relevant caching layers for that URL.
The web producer thus incurs the entire performance hit of page regeneration and recaching.
Our users continue to receive the previously cached version until the CREAM request updates the Varnish cache as it returns normally from the backend.
The latest version is now in Varnish cache with zero impact on site visitors.
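On the Varnish side, the mechanism can be sketched in a few lines of Varnish 3 VCL using hash_always_miss. The header name, its value, and the ACL contents here are illustrative, not our actual ones:

```vcl
acl cream_callers {
    "127.0.0.1";   # illustrative; the real list would be the CMS hosts
}

sub vcl_recv {
    if (req.http.X-Cream-Secret) {
        if (client.ip ~ cream_callers &&
            req.http.X-Cream-Secret == "change-me") {
            # Fetch fresh from the backend and insert the response as
            # a new cache object. The old object keeps serving other
            # clients until the new one replaces it -- no eviction,
            # no visitor-facing miss.
            set req.hash_always_miss = true;
        } else {
            # Never trust the header from the outside world.
            unset req.http.X-Cream-Secret;
        }
    }
}
```

This is the key difference from purge: hash_always_miss refreshes the object without first removing it, so only the CREAM request itself pays the regeneration cost.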
I am going to skip asking for questions here, and push on to the end.
So here we are today:
The complex VCL I mentioned earlier comes to 2,300 lines of code.
We have kept the amount of inline C code to a minimum, mostly by just going all the way with full custom VMODs.
We have the vendor recommended VMODs and the usual suspects familiar to many of you, with a few custom VMODs for specialty processing at scale.
Of the previously referenced 4,000 page requests per second, only about 1,500 are on the new architecture today.
All new development must be done using the new architecture, so old content is being migrated in chunks--some being as large as a quarter million URLs at once.
Adding the long tail has begun to have a negative effect upon the cache hit rate. My fellow architects and I have to regularly review the set of cache TTL values in light of this changing reality.
Next is server performance.
Load testing tells us that our current configuration can handle about 3,000 requests per second per server. So I could handle even a 4x spike with only 2 servers.
But I would be a fool to do so. So 3 servers, maybe 4?
Nope. 8!
I imagine little “WTF?” thought bubbles over your heads.
Each of our data centers is built upon a foundation of 2 hypervisors running all virtualized server instances.
So I always need an even number of instances. If a hypervisor should crash, we want to be able to continue serving on the remaining one without any cascading, secondary effects to users or to the now-crisis mode data center.
Furthermore, while I have been talking on and on about a 4x spike, that is not the high water mark for this river.
The 2012 U.S. Presidential Election peaked at 15,000 page requests/second -- 10x.
So now the minimum server count is 5, but the next higher multiple of 4 is 8.
See? I’m not crazy.
As the slide says, we do lack the ability to scale up dynamically in real-time, but that is almost a non-issue since we can simply stay over-provisioned indefinitely.
We also avoid having new servers coming online with empty server caches at exactly the moment when sub-second response times are critical.
It does result in a permanent increase in requests to the backend cluster, but it is equally over-provisioned.
Our near-obsession with resilience drove us to this level of scale-out. Now we are doing the expected scaling up as we migrate those millions of page URLs into this architecture.