Scaling your application as you grow should not mean slower page loads or higher costs. Learn how you can use AWS building blocks such as Amazon ElastiCache and Amazon CloudFront to “cache everything possible” and improve the performance of your application by caching frequently accessed content. This means caching at different layers of the stack: from HTML pages to long-running database queries and search results, from static media content to application objects. And how can caching more actually cost less? Attend this session to find out!
21. Browser Caching
Set max-age or expiry date in the headers.
HTML5 Application Cache.
Helps eliminate network latency.
But… browser cache size is limited.
(e.g., IE 8-50 MB, Chrome < 80 MB, Firefox 50 MB, etc.)
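As a sketch of how those headers might be set at the origin (assuming Apache with mod_expires enabled, as in the Apache header tuning mentioned later in this deck; the one-day lifetimes are illustrative):

```apache
# Far-future expiry for static assets (requires mod_expires to be loaded)
<IfModule mod_expires.c>
    ExpiresActive On
    # Cache images, CSS, and JS for one day; tune per asset volatility
    ExpiresByType image/jpeg             "access plus 1 day"
    ExpiresByType text/css               "access plus 1 day"
    ExpiresByType application/javascript "access plus 1 day"
</IfModule>
```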
23. Time to First Byte
[Sequence diagram: with a 68 ms round-trip time between client and server, the TCP handshake (SYN, SYN/ACK) costs one RTT (68 ms) and the ACK + GET /image.jpg HTTP/1.1 request/response costs another (68 ms), for a time to first byte of 136 ms.]
24. Bring the Bytes Closer to Your Users
[Sequence diagram: the client completes the TCP handshake and sends GET /image.jpg HTTP/1.1 to a nearby CloudFront edge over a 10 ms RTT, saving 68 ms per round trip; CloudFront forwards the request to the origin over an already-established, optimized connection.]
Time to first byte: 20 ms vs. 136 ms
27. Edge Cache
How do you decide what to cache?
• Static or Re-Usable Content
• Customized Content
• On-Demand and Live Video
• Dynamic or Unique Content
28. Cache Static or Re-Usable Content
HTTP/1.0 200 OK
Date: Mon, 19 Mar 2012 12:51:28 GMT
Server: Apache
Last-Modified: Mon, 19 Mar 2012 07:15:25 GMT
Accept-Ranges: bytes
Content-Length: 1918
Cache-Control: max-age=86400
Content-Type: image/jpeg
Vary: Accept-Encoding
Age: 16
X-Cache: Hit from cloudfront
Connection: keep-alive
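Given a response like the one above, downstream caches compute remaining freshness from `Cache-Control: max-age` and `Age`. A minimal sketch of that arithmetic (the function name is illustrative):

```python
def remaining_freshness(max_age: int, age: int) -> int:
    """Seconds a cached response may still be served without revalidation."""
    return max(max_age - age, 0)

# For the response above: Cache-Control: max-age=86400, Age: 16
print(remaining_freshness(86400, 16))  # 86384 seconds (just under 24 hours)
```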
37. Dynamic Content?
Zero TTL – cannot be cached! CloudFront can still help…
TCP/IP optimizations for the network path
Keep-Alive Connections to reduce RTT
SSL Termination close to viewers
POST/PUT upload optimizations
Latency Based Routing
Low prices, same as static content delivery!
38. “We are excited to use CloudFront's new
POST, PUT, PATCH, and DELETE capabilities
to accelerate our RESTful APIs on Amazon
EC2. With these new HTTP methods we can
now take advantage of CloudFront’s global
footprint and optimized connections back to
our origin servers in AWS. Routing our
customers’ API requests via a CloudFront edge
location near them will help improve their
experience by minimizing packet loss and
upload latency. This will help provide a
streamlined experience for our customers.”
- Ilan Rabinovitch, Tech Lead, Site Reliability Engineering
40. You can’t have dessert,
until you have your dinner!
-An Experiment.
https://secure.flickr.com/photos/stephen-oung/6319155216/
41. Experiment Premise
• Start with basic infrastructure.
– Single m1.large EC2 instance running Amazon Linux
– Single non-Multi-AZ m1.xlarge RDS MySQL 5.6 instance
– Apache httpd-2.2.25
– PHP 5.3.27
– Drupal Commerce Kickstart 7.2
– EIP on instance
• Throw a ton of traffic at it.
– 8,000,000 queries, 40 per second, from 4 other instances.
• Profile it.
– New Relic PHP agent
– webpagetest.org
44. Web Cache
Web server or proxy caches sit between your CDN/users and your web tier. They improve cost efficiency by reducing load on your application and database, and can also speed up edge-to-origin delivery for much of your content.
64. Web Cache
• Opt for in-memory caching when possible.
• Pay attention to your cache hit/miss ratios. A poor ratio could be a sign that you need to resize the instances or change the number of nodes in your cache pool.
• Set smart TTLs so that you don’t break new deploys or cache content for too long.
• Be smart about which cookies can bust the cache and which can’t. Don’t serve up other people’s content or stale dynamic pages.
65. Web Cache
How do you decide what to cache?
– All logged out user pages
– Any completely static pages
– Traffic/log analysis
• Look at your web logs/CDN logs
• Find heavily hit pages
• Figure out how often they actually change
• Apply a TTL so that page can be cached
– Even 60 second TTLs could help drastically!
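The per-page TTL idea above can be sketched with a tiny in-memory cache. This is an illustrative stand-in, not a replacement for Memcached or a real web cache:

```python
import time

class TTLCache:
    """Tiny in-memory cache with per-entry TTLs (a sketch, not production code)."""
    def __init__(self):
        self._store = {}  # key -> (value, expires_at)

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.monotonic() + ttl_seconds)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None           # miss
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: treat as a miss
            return None
        return value

cache = TTLCache()
# Even a 60-second TTL absorbs repeat hits on a popular page
cache.set("/popular-page", "<html>...</html>", ttl_seconds=60)
```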
67. Application Cache
Application-level caches hold information such as session data, temporary application data (for example, cart contents), and live aggregations of data feeds.
74. Application Cache
Use Cases:
– Session information
– Temporary data
• Cart info, metadata
– Rate limiting
• Fight abuse of APIs, spamming, functionality abuse
– Counters
• Views, Scores, Leader Boards
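The rate-limiting and counter use cases above are often built on an expiring counter (e.g. an INCR with a TTL). A minimal in-memory sketch of a fixed-window limiter (class and method names are illustrative, not a specific client API):

```python
import time

class FixedWindowLimiter:
    """Fixed-window rate limiter using expiring counters, mimicking what you
    might build on Memcached/Redis INCR + TTL (in-memory sketch)."""
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self._counters = {}  # key -> (count, window_start)

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        count, start = self._counters.get(key, (0, now))
        if now - start >= self.window:   # window expired: start a new one
            count, start = 0, now
        if count >= self.limit:
            return False                 # over limit: reject
        self._counters[key] = (count + 1, start)
        return True

limiter = FixedWindowLimiter(limit=3, window_seconds=60)
results = [limiter.allow("client-1", now=0.0) for _ in range(4)]
print(results)  # [True, True, True, False]
```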
75. Application Cache
How do you decide what to cache?
– Treat data you cache at this tier as loss-tolerant when working with in-memory caches.
– Session information that wouldn’t make sense in cookies or in a true DB.
– Look at the kind of data your application is generating and storing in a DB that it might not need to.
77. Database Cache
Reduce workload on database servers by caching commonly requested information, or any information that does not change frequently (e.g., user info, listing info, product info).
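The usual pattern at this tier is cache-aside: check the cache, fall back to the database on a miss, then populate the cache for subsequent readers. A minimal sketch with stand-in cache and DB objects (all names here are illustrative; in practice the cache would be an ElastiCache Memcached/Redis client and the DB a MySQL query):

```python
class SimpleCache:
    """Dict-backed stand-in for a Memcached/Redis client (TTL ignored here)."""
    def __init__(self):
        self._d = {}
    def get(self, key):
        return self._d.get(key)
    def set(self, key, value, ttl):
        self._d[key] = value

def get_user(user_id, cache, db, ttl=300):
    """Cache-aside read."""
    key = f"user:{user_id}"
    user = cache.get(key)
    if user is not None:
        return user              # cache hit: no DB work
    user = db[user_id]           # cache miss: hit the database
    cache.set(key, user, ttl)    # warm the cache for subsequent readers
    return user

db = {42: {"name": "alice"}}
cache = SimpleCache()
get_user(42, cache, db)          # miss: reads the DB, warms the cache
get_user(42, cache, db)          # hit: served from the cache
```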
78. Database Cache
Popular solutions:
– In-engine query caches
– Memcached
• On dedicated host
• On DB host (built in w/ MySQL 5.6)
– Redis
• On dedicated host
79. Database Cache
A word of caution:
In-engine DB caches are often not recommended, as they can significantly impact database performance. Depending on your workload and dataset, an in-engine query cache may not be a good fit. We recommend caches outside the DB where possible.
83. MySQL 5.6 + Memcached
RDS MySQL supports version
5.6 with integrated Memcached
on the instance:
– Part of the InnoDB engine
– Memcached running as part of MySQL
talks directly to data in InnoDB tables,
essentially turning MySQL into a fast
“key-value store”
– From the opposite viewpoint, adds persistence to Memcached
– Same Memcached API as standalone
https://dev.mysql.com/doc/refman/5.6/en/innodb-memcached-intro.html
84. AWS Marketplace & Partners Can Help
• Customers can find, research, and buy software.
• Simple pricing that aligns with the Amazon EC2 usage model.
• Launch in minutes!
• Marketplace billing is integrated into your AWS account.
• 1,100+ products across 24+ categories.
Learn more at: aws.amazon.com/marketplace
88. Experiment Infrastructure (with cake)
• Added in ElastiCache Memcached
cache.m1.large
• Added in Amazon CloudFront for static content
– Tuned expires and cache headers in Apache
• Added in APC for PHP caching
– Increased memory to 128 MB, no other changes
• Did nothing to the DB
• Drupal memcached & CDN modules
89. Experiment Infrastructure (with cake)
[Diagram: clients reach CloudFront and an Elastic IP on the web/app instance, which talks to an ElastiCache cache node and an Amazon RDS DB instance.]
92. Toronto Star
• Canada’s largest daily newspaper
• Focused on metro Toronto
• 3.3 million monthly unique visitors
• Small in-house digital group, supported by vendors & corporate IT
• Digital group run as a “startup” within the corporate structure
• Tech stack includes Java, PHP,
Ruby, Python
103. Edge – The Onion Skin
• Examples: Akamai, CloudFront
• Static asset caching
• Full-site caching
• Origins
• Behaviors
• thestar.com, wheels.ca, toronto.com, thegridto.com
104. Edge – The Onion Skin
[Architecture diagram: thestar.com production in the US East region. CloudFront fronts an Elastic Load Balancer into a VPC. Availability Zones A, B, and D each run a Dispatcher/Apache tier (Amazon Linux 64-bit, c1.medium) in front of Publish CQ/CRX instances (Amazon Linux 64-bit, c1.xlarge). Author CQ/CRX instances (c1.xlarge) run as a master/standby pair across zones.]
105. [Architecture diagram: repeats the thestar.com production layout from the previous slide: CloudFront → Elastic Load Balancer → VPC, with Dispatcher/Apache tiers (c1.medium) fronting Publish CQ/CRX instances (c1.xlarge), including a standby publish instance.]
106. Whole Site Delivery
• Cache everything possible
• No server-side cookies written; only select pages pass query strings
• Control caching granularly using 19 different rules
• We use a single origin, but Elastic Load Balancing and a multi-tiered, multi-AZ configuration on the backend
• Planning a multi-region DR architecture that will also leverage Amazon CloudFront
107. Reverse Proxy / Web Accelerator
• Examples: Varnish, Nginx, Apache
• Serves “static” content
• Reduces load on app servers
• wheels.ca, mystar, thestar.com, thegridto.com
125. Reverse Proxy / Web Accelerator
Varnish Cache is a web application accelerator also
known as a caching HTTP reverse proxy. You install it
in front of any server that speaks HTTP and configure
it to cache the contents. Varnish Cache is really, really
fast. It typically speeds up delivery with a factor of 300
- 1000x, depending on your architecture.
- varnish-cache.org
126. Reverse Proxy / Web Accelerator
Nginx (pronounced "engine x") is an open
source reverse proxy server for HTTP,
HTTPS, SMTP, POP3, and IMAP protocols,
as well as a load balancer, HTTP cache, and
a web server (origin server). The nginx
project started with a strong focus on high
concurrency, high performance and low
memory usage.
- Wikipedia
127. App-level caching
Memcached
Free & open source, high-performance, distributed
memory object caching system, generic in nature, but
intended for use in speeding up dynamic web
applications by alleviating database load.
Memcached is an in-memory key-value store for small
chunks of arbitrary data (strings, objects) from results of
database calls, API calls, or page rendering.
128. App-level caching
Redis
Redis is an open source, BSD licensed, advanced
key-value store. It is often referred to as a data
structure server since keys can contain strings,
hashes, lists, sets and sorted sets.
– Redis.io
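The leaderboard use case from earlier maps naturally onto Redis sorted sets (ZINCRBY to bump a score, ZREVRANGE to read the top). A plain-Python sketch of those semantics; with Redis, the scores and sorting live server-side:

```python
class Leaderboard:
    """Plain-Python sketch of the Redis sorted-set leaderboard pattern
    (ZINCRBY + ZREVRANGE); class and method names are illustrative."""
    def __init__(self):
        self._scores = {}  # member -> score

    def incr(self, member, amount=1):
        """Like ZINCRBY: add `amount` to the member's score."""
        self._scores[member] = self._scores.get(member, 0) + amount

    def top(self, n):
        """Highest-scoring members first, like ZREVRANGE 0 n-1 WITHSCORES."""
        ranked = sorted(self._scores.items(), key=lambda kv: kv[1], reverse=True)
        return ranked[:n]

board = Leaderboard()
board.incr("alice", 50)
board.incr("bob", 30)
board.incr("alice", 10)
print(board.top(2))  # [('alice', 60), ('bob', 30)]
```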
129. Edge – The Onion Skin
“A content delivery network is a large distributed
system of servers deployed in multiple data
centers across the Internet. The goal of a CDN is
to serve content to end-users with high availability
and high performance.” - Wikipedia
130. Reverse Proxy / Web Accelerator
“A web accelerator is a proxy server that
reduces web site access times. They can be
a self-contained hardware appliance or
installable software.” - Wikipedia