Integrating content delivery networks (CDNs) into your application infrastructure can offer many benefits, including major performance improvements for your applications. Understanding how CDNs perform, especially for your specific use cases, is therefore vital. However, measuring CDN performance is complicated and nuanced, and can result in metric overload and confusion. It is increasingly important to understand measurement techniques, what they are telling you, and how to apply them to your actual content.
2. Why this matters
• Performance is one of the main reasons we use a CDN
• Measurement often used during evaluation phase to compare CDNs
  – Most of what we’ll talk about is in this context
• Seems easy, but isn’t
• Heavily vendor-influenced
  – “Ok Google: define irony!”
3. Goals
• What does the measurement landscape look like
• Share measurement experiences
• Help guide towards good testing plan if/when you decide to do this
8. What we’ll be focusing on
• Only on delivery and not all the other features CDNs provide
• How we measure
• Metrics to measure
• What to measure
• Some gotchas, misconceptions, and common mistakes
12. Synthetic testing
• Usually a large network of test nodes all over the globe
• Highly scalable, can do lots of tests at once
• Many vendors that have this model
  – Examples: Catchpoint, Dynatrace (Gomez), Keynote, Pingdom, etc.
13. Synthetic testing
• Built to do full performance and availability testing
  – Lots of “monitors”, emulating what real users do
  – DNS, Traceroute, Ping, Streaming, Mobile
  – HTTP
    • Object
    • Browser
    • Transactions/Flows
• Tests set up with some frequency to repeatedly test things
  – Aggregates reported
15. Backbone nodes
• Test machines sitting in datacenters all around the globe
• Really good at:
  – Availability and reachability
  – Scale
  – Backend problems
  – Global reach
• Often terrible indicators of raw performance
  – No latency
  – Infinite bandwidth
17. Last mile nodes
• Test machines sitting behind a real home-like internet connection
• Much better at reporting what you can expect from users, but sometimes unreliable
• Also not as dense in deployment
20. RUM
• Use JavaScript to collect timing metrics
• Can collect lots of things through browser APIs
  – Page metrics, asset metrics, user-defined metrics
21. Use test assets
• Use this model to initiate tests in the browser
• Some vendors:
  – Cedexis, TurboBytes, CloudHarmony, more…
  – Usually, this isn’t their business, but the data drives their main business objectives
• You can build this yourself too
22. Use real assets in the page
• Collect timings from actual objects
  – Resource timing
• Vendors
  – SOASTA, New Relic, most synthetic vendors
  – Boomerang (open source)
  – Google Analytics User Timings
23. DATA, DATA, DATA
• For either RUM technique, we need A LOT of data
• Too much variance
  – Most vendors don’t use averages
  – Medians, percentiles, and histograms
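
The point about averages can be illustrated with a small sketch. The sample values are invented for illustration (they are not measurements from any CDN) and the helper name `summarize` is hypothetical. It shows how two slow outliers pull the mean far away from what a typical user sees, while the median and an upper percentile describe the distribution more faithfully.

```python
import statistics

def summarize(samples_ms):
    """Return mean, median, and 95th percentile of timing samples (ms)."""
    ordered = sorted(samples_ms)
    # quantiles(n=100) returns the 99 cut points p1..p99; index 94 is p95.
    percentiles = statistics.quantiles(ordered, n=100, method="inclusive")
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p95": percentiles[94],
    }

# Invented RUM-style samples: mostly ~100 ms, with two slow outliers.
samples = [95, 98, 100, 101, 102, 103, 105, 110, 2500, 4000]
stats = summarize(samples)
print(stats)
```

Here the mean lands several times above the median even though eight of the ten samples are near 100 ms, which is why reporting the median together with a tail percentile is usually more informative than a single average.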
31. [Diagram: exchange between Client and Server showing DNS, TCP, (TLS), HTTP (TTFB), HTTP (Download)]
37. [Timeline figure, axis: Time. Phases in order: DNS, TCP, (TLS), TTFB, Download (TTLB-TTFB)]
• DNS: RTT to DNS server, DNS iterations, DNS caching and TTLs
• TCP: RTT to cache server (CDN footprint & routing algorithms)
• (TLS): RTT to cache server (or RTTs depending on TLS False Start), efficiency of TLS engine
• TTFB: RTT to where the object is stored + storage efficiency (different for requests to origin); lower bound = network RTT
• TTLB-TTFB: Bandwidth, congestion avoidance algorithms (and RTT!)
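
As a rough sketch of how the phases on this timeline can be derived from raw timestamps, the snippet below splits one request into DNS, TCP, TLS, TTFB, and download durations. The field names are hypothetical; they are loosely modeled on the browser Resource Timing attributes, where the connection end time includes the TLS handshake. The numbers in the example are made up.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RequestTimestamps:
    """Timestamps in ms since the request started (hypothetical field names)."""
    dns_start: float
    dns_end: float
    connect_start: float
    connect_end: float          # connection usable (after TLS, if any)
    tls_start: Optional[float]  # None for plain HTTP
    request_sent: float
    first_byte: float
    last_byte: float

def phase_durations(t: RequestTimestamps) -> dict:
    """Split one request into the phases shown on the timeline."""
    if t.tls_start is None:
        tcp = t.connect_end - t.connect_start
        tls = 0.0
    else:
        # TLS negotiation runs from tls_start until the connection is usable.
        tcp = t.tls_start - t.connect_start
        tls = t.connect_end - t.tls_start
    return {
        "dns": t.dns_end - t.dns_start,
        "tcp": tcp,
        "tls": tls,
        "ttfb": t.first_byte - t.request_sent,
        "download": t.last_byte - t.first_byte,  # TTLB - TTFB
    }

# Made-up example: 20 ms DNS, 30 ms TCP, 30 ms TLS, 50 ms TTFB, 60 ms download.
example = RequestTimestamps(
    dns_start=0, dns_end=20,
    connect_start=20, connect_end=80, tls_start=50,
    request_sent=80, first_byte=130, last_byte=190,
)
phases = phase_durations(example)
print(phases)
```

Keeping the phases separate makes it easier to see whether a difference between two CDNs comes from DNS, connection setup, time to first byte, or throughput.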
38. Core object metrics
• Not every request experiences every metric:
  – DNS: once per domain
  – TCP/TLS setup once per connection
  – TTFB/Download for every object (not already in browser cache)
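
A minimal sketch of this counting rule, under simplifying assumptions: each request is described by a hypothetical `(domain, connection_id, from_browser_cache)` tuple, DNS is assumed not to be cached before the page starts loading, and connections are never dropped and re-opened. The domains are placeholders.

```python
from collections import Counter

def count_phase_costs(requests):
    """Count how often each phase is paid for a list of requests.

    Each request is a (domain, connection_id, from_browser_cache) tuple.
    """
    domains = set()
    connections = set()
    counts = Counter()
    for domain, connection_id, from_browser_cache in requests:
        if from_browser_cache:
            continue  # served locally: no network phases at all
        if domain not in domains:
            domains.add(domain)
            counts["dns"] += 1  # DNS: once per domain
        if (domain, connection_id) not in connections:
            connections.add((domain, connection_id))
            counts["tcp_tls"] += 1  # TCP/TLS: once per connection
        counts["ttfb_download"] += 1  # TTFB/download: every fetched object
    return dict(counts)

# Placeholder page: two domains, three connections, one browser-cache hit.
page = [
    ("www.example.com", 1, False),
    ("cdn.example.com", 1, False),
    ("cdn.example.com", 1, False),
    ("cdn.example.com", 2, False),
    ("cdn.example.com", 2, True),
    ("cdn.example.com", 1, False),
]
result = count_phase_costs(page)
print(result)
```

Even on this tiny page the per-object phases are paid more often than DNS or connection setup, and the gap grows with the number of objects on the page.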
43. “I’ll pick an image from my home page, use backbone synthetic tests from all over the world and pick the CDN that has the fastest average time”
“let’s test an asset via RUM on a million page views a day and pick the fastest CDN”
“let’s run webpagetest on both CDNs and go with whichever has a faster page load time”
~$ time curl -v http://…
45. Web application: objects
• Your application should determine what you test:
  – Objects served from the edge
  – Objects served from origin (through CDN)
• If HTML is from origin (through CDN), we must measure it
  – Essential to critical page metrics
47.
• On any page
  – DNS queries only happen a small number of times
  – 6 TCP connections per domain
  – 1 TLS setup per connection
  – Many many many HTTP fetches
• Core metrics
  – TTFB
  – Download (TTLB-TTFB) if important large objects
  – Should have a good idea of DNS/TCP/TLS, but less critical
48. Web application
• If CDN only for static/cacheable objects:
  – One or two representative assets
  – TTFB and maybe download most important
[Diagram: Client, CDN Node]
51. Web application
• If CDN also for whole site (HTML going through CDN)
  – Sample of key HTML pages, delivered from origin
  – TTFB will show efficiency of routing (and connection management) to origin
  – TTLB will show efficiency of delivery
[Diagram: Client, CDN Node, CDN Node, Web Server]
62. What the…???
• We always assume “all things equal”
• Too many factors affect page load time
  – 3rd parties (sometimes varying), content from origin, layout, JS execution, etc.
• Too much variance
Source: httparchive.org
63. To be clear…
• Always use webpagetest (or something like it) to understand your application’s performance profile
• Continue to monitor application performance, and always spot check
• Be extremely careful when using it to compare CDN performance; it can mislead you
  – If using RUM to measure page metrics, with lots of data, things become a little more meaningful (data volume handles variance)
102. Can I serve stale content if necessary? (stale-while-revalidate & stale-if-error)
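
To make the two directives concrete, here is a simplified sketch of the decision a cache might make. It assumes a shared cache that prefers `s-maxage` over `max-age`, and it ignores other directives such as `must-revalidate` and `no-cache`. The function names are hypothetical, and real CDNs differ in whether and how they honor these directives, so treat it as an illustration of the idea rather than any vendor's behavior.

```python
def parse_cache_control(header: str) -> dict:
    """Parse a Cache-Control header into {directive: int or True}."""
    directives = {}
    for part in header.split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, _, value = part.partition("=")
        value = value.strip('"')
        directives[name.strip()] = int(value) if value.isdigit() else True
    return directives

def _seconds(directives: dict, name: str) -> int:
    """Return a directive's numeric value, or 0 if absent or non-numeric."""
    value = directives.get(name)
    # bool is a subclass of int, so exclude True explicitly.
    if isinstance(value, int) and not isinstance(value, bool):
        return value
    return 0

def can_serve(header: str, age_s: int, origin_error: bool = False) -> bool:
    """Decide whether a cached response of the given age may be served.

    origin_error=True models a failed revalidation (simplified).
    """
    d = parse_cache_control(header)
    lifetime = _seconds(d, "s-maxage" if "s-maxage" in d else "max-age")
    if age_s < lifetime:
        return True  # still fresh
    staleness = age_s - lifetime
    window = "stale-if-error" if origin_error else "stale-while-revalidate"
    return staleness <= _seconds(d, window)

header = "max-age=600, stale-while-revalidate=30, stale-if-error=86400"
print(can_serve(header, 500))                     # fresh
print(can_serve(header, 620))                     # stale, inside revalidate window
print(can_serve(header, 700))                     # stale, outside revalidate window
print(can_serve(header, 700, origin_error=True))  # origin failing, inside error window
```

With this header, a response is served normally for 10 minutes, may be served stale for a further 30 seconds while it is refreshed in the background, and may be served stale for up to a day if the origin is failing.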
103. What if I can cache something I didn’t think I could?
104. Key takeaways
• Everything is application-dependent
  – Evaluate how your application works and what impacts performance the most
• Don’t get locked into a single number/metric
• Always know your application performance and bottlenecks
• Be mindful of the bigger picture
• Don’t stop measuring!