Stupid Web Caching Tricks

Velocity 2010 presentation. See more at http://www.mnot.net/blog/Caching/


  1. Stupid Web Caching Tricks. Mark Nottingham / mnot@yahoo-inc.com / mnot@mnot.net / @mnot
  2. foo.yahoo.com
  3. Diagram: the internets, front-end
  4. Diagram: the internets, front-end, services
  5. Diagram: the internets, front-end, caching, services
  6. Simple, right? Well, let's bring it into rotation...
  7. 1              2     3            4            5    6
     1276007531.061 205   192.168.1.16 TCP_MISS/200 9286 GET /details?ticker=ABC
     1276007531.062 205   192.168.1.17 TCP_MISS/200 9287 GET /details?ticker=ABC
     1276007531.064 218   192.168.1.16 TCP_MISS/200 9285 GET /details?ticker=ABC
     1276007531.065 198   192.168.1.17 TCP_MISS/200 9286 GET /details?ticker=ABC
     1276007531.065 215   192.168.1.15 TCP_MISS/200 9286 GET /details?ticker=ABC
     1276007531.068 205   192.168.1.15 TCP_MISS/200 9288 GET /details?ticker=ABC
     1276007531.072 1     192.168.1.17 TCP_HIT/200  9286 GET /details?ticker=ABC
     1276007531.254 398   192.168.1.15 TCP_MISS/200 9288 GET /details?ticker=ABC
     1276007531.261 408   192.168.1.15 TCP_MISS/200 9287 GET /details?ticker=ABC
     1276007531.289 429   192.168.1.17 TCP_MISS/200 9286 GET /details?ticker=ABC
     1276007531.922 852   192.168.1.15 TCP_MISS/504 282  GET /details?ticker=ABC
     1276007532.005 0     192.168.1.15 TCP_HIT/200  9286 GET /details?ticker=ABC
     1276007532.044 987   192.168.1.16 TCP_MISS/504 283  GET /details?ticker=ABC
     1276007532.045 2     192.168.1.16 TCP_HIT/200  9286 GET /details?ticker=ABC
     1276007532.068 1001  192.168.1.17 TCP_MISS/000 0    GET /details?ticker=ABC
     1276007532.072 998   192.168.1.16 TCP_MISS/504 278  GET /details?ticker=ABC
     1276007591.062 60001 192.168.1.16 TCP_MISS/000 0    GET /details?ticker=ABC
     Oops.
  8. 1              2   3            4            5    6
     1276007531.068 205 192.168.1.16 TCP_MISS/200 9286 GET /details?ticker=ABC
     1276007531.068 205 192.168.1.17 TCP_MISS/200 9286 GET /details?ticker=ABC
     1276007531.068 205 192.168.1.17 TCP_MISS/200 9286 GET /details?ticker=ABC
     1276007531.068 205 192.168.1.15 TCP_MISS/200 9286 GET /details?ticker=ABC
     1276007531.068 205 192.168.1.16 TCP_MISS/200 9286 GET /details?ticker=ABC
     1276007531.068 205 192.168.1.15 TCP_MISS/200 9286 GET /details?ticker=ABC
     1276007531.072 1   192.168.1.17 TCP_HIT/200  9287 GET /details?ticker=ABC
     1276007531.072 0   192.168.1.15 TCP_HIT/200  9286 GET /details?ticker=ABC
     1276007531.073 1   192.168.1.16 TCP_HIT/200  9285 GET /details?ticker=ABC
     1276007531.073 0   192.168.1.15 TCP_HIT/200  9285 GET /details?ticker=ABC
     1276007531.074 0   192.168.1.17 TCP_HIT/200  9286 GET /details?ticker=ABC
     1276007531.076 1   192.168.1.16 TCP_HIT/200  9285 GET /details?ticker=ABC
     1276007531.076 0   192.168.1.17 TCP_HIT/200  9287 GET /details?ticker=ABC
     1276007531.077 0   192.168.1.15 TCP_HIT/200  9286 GET /details?ticker=ABC
     1276007531.078 1   192.168.1.15 TCP_HIT/200  9286 GET /details?ticker=ABC
     1276007531.079 1   192.168.1.16 TCP_HIT/200  9285 GET /details?ticker=ABC
     Collapsed Forwarding
  9. in squid2: collapsed_forwarding on
     in squid2.HEAD: collapsed_forwarding_timeout
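
The idea behind the Collapsed Forwarding slides (8 and 9) can be sketched in a few lines: while one request for a URL is in flight to the origin, later requests for the same URL wait for that answer instead of starting their own fetch. This is only an illustration of the technique, not Squid's implementation; `CollapsingFetcher` and the `fetch` callable are hypothetical names, and the sketch coalesces requests without storing the result afterwards.

```python
import threading
from concurrent.futures import Future


class CollapsingFetcher:
    """Coalesce concurrent misses for the same key into one origin fetch."""

    def __init__(self, fetch):
        self._fetch = fetch            # hypothetical origin fetch: key -> body
        self._lock = threading.Lock()
        self._inflight = {}            # key -> Future shared by all waiters

    def get(self, key):
        with self._lock:
            future = self._inflight.get(key)
            leader = future is None
            if leader:
                # First request for this key: it will do the real fetch.
                future = Future()
                self._inflight[key] = future
        if leader:
            try:
                future.set_result(self._fetch(key))
            except Exception as exc:
                # Waiters see the same failure as the leader.
                future.set_exception(exc)
            finally:
                with self._lock:
                    self._inflight.pop(key, None)
        # Followers block here until the leader has an answer.
        return future.result()
```

A real cache would also store the response so that later requests become hits, as in the log on slide 8; here a request that arrives after the fetch has finished simply triggers a new one.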
  10. Diagram: the internets, front-end, caching, SPOF!, services
  11. Diagram: the internets, front-end, caching, services
  12. + good business continuity; more flexible
      - worse hit rate; high load when new caches come online; caches can come out of sync
      answer: Cache Peering
  13. Diagram: the internets, front-end, caching, services
  14. RFC 2186 - Internet Cache Protocol: UDP-based; just the URI; query only; in Squid / Traffic Server
      RFC 2756 - Hyper Text Caching Protocol: UDP-based (option for TCP in spec); includes URI + headers; query, CLR operations; in Squid
  15. Diagram: the internets, front-end, caching, ?!, services
  16. 24 front-end servers x 24 Apache children x 5 pages / second x 8 service requests / page x 10k / service response / 2 cache servers = 11,520 req/sec and roughly 900 Mbits/sec per cache server
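
The load estimate on slide 16 can be reproduced directly. The sketch below assumes "10k" means 10,000 bytes per service response; with that reading the bandwidth comes out at about 920 Mbit/s, which the slide rounds to 900.

```python
front_ends = 24
children_per_front_end = 24
pages_per_second = 5
service_requests_per_page = 8
response_bytes = 10_000   # assumption: "10k" read as 10,000 bytes
cache_servers = 2

# Total service requests per second generated by all front-end children.
total_req_per_sec = (front_ends * children_per_front_end
                     * pages_per_second * service_requests_per_page)

# Load on each cache server if requests are split evenly.
req_per_cache = total_req_per_sec // cache_servers
mbits_per_cache = req_per_cache * response_bytes * 8 / 1_000_000

print(total_req_per_sec, req_per_cache, mbits_per_cache)
```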
  17. Hierarchy. Diagram: the internets, front-end, local caching, proxy caching, services
  18. 1              2   3            4                    5    6
      1276007530.037 0   192.168.1.17 TCP_HIT/200          9286 GET /details?ticker=ABC
      1276007530.057 1   192.168.1.16 TCP_HIT/200          9285 GET /details?ticker=ABC
      1276007530.083 0   192.168.1.17 TCP_HIT/200          9287 GET /details?ticker=ABC
      1276007530.119 0   192.168.1.15 TCP_HIT/200          9286 GET /details?ticker=ABC
      1276007530.141 1   192.168.1.15 TCP_HIT/200          9286 GET /details?ticker=ABC
      1276007530.179 0   192.168.1.16 TCP_HIT/200          9285 GET /details?ticker=ABC
      1276007530.397 205 192.168.1.15 TCP_REFRESH_MISS/200 9285 GET /details?...
      1276007530.401 1   192.168.1.17 TCP_HIT/200          9286 GET /details?ticker=ABC
      1276007530.414 201 192.168.1.17 TCP_REFRESH_MISS/200 9285 GET /details?...
      1276007530.418 1   192.168.1.15 TCP_HIT/200          9285 GET /details?ticker=ABC
      1276007530.434 0   192.168.1.16 TCP_HIT/200          9287 GET /details?ticker=ABC
      1276007530.442 198 192.168.1.17 TCP_REFRESH_MISS/200 9285 GET /details?...
      1276007530.372 0   192.168.1.15 TCP_HIT/200          9285 GET /details?ticker=ABC
      1276007530.494 201 192.168.1.16 TCP_REFRESH_MISS/200 9285 GET /details?...
      1276007530.525 1   192.168.1.17 TCP_HIT/200          9284 GET /details?ticker=ABC
      1276007530.548 201 192.168.1.17 TCP_REFRESH_MISS/200 9285 GET /details?...
      1276007530.563 1   192.168.1.15 TCP_HIT/200          9286 GET /details?ticker=ABC
      1276007530.594 0   192.168.1.16 TCP_HIT/200          9285 GET /details?ticker=ABC
      Content Becomes Stale
  19. Cache-Control: stale-while-revalidate=30
      RFC 5861
      implemented: Squid 2.7
      coming soon: Squid 3.2, Apache Traffic Server
  20. 1              2   3            4                  5    6
      1276007530.037 0   192.168.1.17 TCP_HIT/200        9286 GET /details?ticker=ABC
      1276007530.057 1   192.168.1.16 TCP_HIT/200        9285 GET /details?ticker=ABC
      1276007530.083 0   192.168.1.17 TCP_HIT/200        9287 GET /details?ticker=ABC
      1276007530.119 0   192.168.1.15 TCP_HIT/200        9286 GET /details?ticker=ABC
      1276007530.141 1   192.168.1.15 TCP_HIT/200        9286 GET /details?ticker=ABC
      1276007530.179 0   192.168.1.16 TCP_HIT/200        9285 GET /details?ticker=ABC
      1276007530.192 0   192.168.1.15 TCP_STALE_HIT/200  9285 GET /details?...
      1276007530.213 1   192.168.1.17 TCP_STALE_HIT/200  9286 GET /details?...
      1276007530.243 0   192.168.1.17 TCP_STALE_HIT/200  9285 GET /details?...
      1276007530.294 0   192.168.1.16 TCP_STALE_HIT/200  9287 GET /details?...
      1276007530.347 0   192.168.1.17 TCP_STALE_HIT/200  9285 GET /details?...
      1276007530.384 219 0.0.0.0      TCP_ASYNC_MISS/200 9285 GET /details?...
      1276007530.401 1   192.168.1.17 TCP_HIT/200        9284 GET /details?ticker=ABC
      1276007530.418 1   192.168.1.15 TCP_HIT/200        9286 GET /details?ticker=ABC
      1276007530.434 0   192.168.1.16 TCP_HIT/200        9285 GET /details?ticker=ABC
      stale-while-revalidate
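
The behaviour in the slide 20 log (stale hits served immediately while one asynchronous request refreshes the entry) follows from a simple age check. Below is a minimal sketch of that decision under RFC 5861 semantics; the function names and the returned labels are illustrative, and the `max-age=600` value in the usage example is an assumption, since the slide only shows the `stale-while-revalidate=30` directive.

```python
def parse_cache_control(value):
    """Split a Cache-Control header value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def serve_decision(age, cache_control):
    """Decide how a cache may answer, given the stored response's age in seconds."""
    cc = parse_cache_control(cache_control)
    max_age = int(cc.get("max-age") or 0)
    swr = int(cc.get("stale-while-revalidate") or 0)
    if age < max_age:
        return "fresh"                        # plain hit
    if age < max_age + swr:
        return "serve-stale-and-revalidate"   # answer now, refresh in background
    return "revalidate-before-serving"        # client waits for the origin


header = "max-age=600, stale-while-revalidate=30"
print(serve_decision(10, header), serve_decision(610, header), serve_decision(640, header))
```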
  21. Diagram: the internets, front-end, local caching, proxy caching, services
  22. Cache-Control: stale-if-error=3600
      RFC 5861
      implemented: Squid 2.7
      coming soon: Squid 3.2, Apache Traffic Server
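
For slide 22, a rough sketch of what `stale-if-error` allows: when the origin fails, a cache may fall back to a stored response that is past its freshness lifetime, as long as it is within the extra window. The status codes treated as errors follow RFC 5861; `choose_response`, its parameters, and the returned labels are hypothetical names for this illustration.

```python
ERROR_STATUSES = {500, 502, 503, 504}


def choose_response(origin_status, cached_age, max_age, stale_if_error):
    """Pick what to send when a stale entry needed refreshing.

    origin_status: HTTP status from the origin, or None if it was unreachable.
    cached_age: age in seconds of the stored response, or None if nothing is stored.
    """
    origin_failed = origin_status is None or origin_status in ERROR_STATUSES
    if not origin_failed:
        return "origin"          # the origin answered normally; use its response
    if cached_age is not None and cached_age < max_age + stale_if_error:
        return "stale-cached"    # hide the failure behind the stored copy
    return "error"               # nothing usable; the failure reaches the client


print(choose_response(503, 700, 600, 3600))
```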
  23. Dealing with Aborted Requests
      front-end timeout: 500ms; dropped client connection
      slow service = no cached response; not cached = always slow
      Squid: quick_abort
      Apache Traffic Server: background_fill
      Diagram: front-end, caching, services
  24. Getting an Immediate Answer
      front-end: Cache-Control: only-if-cached; 504 Gateway Timeout
      caching: Cache-Control: max-age=3600, max-stale
      Squid: fetch_only_if_cached_access (soon)
      Diagram: front-end, caching, services
  25. Diagram: the internets, front-end, local caching (cache_peer...round-robin), proxy caching, the internets, services
  26. Diagram: the internets, front-end, local caching (cache_peer...carp), proxy caching, the internets, services
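
The difference between the two peer-selection options on slides 25 and 26 is that round-robin spreads requests for the same URL across all parents, while CARP sends each URL to one parent chosen by hashing. The sketch below illustrates the hashing idea with plain rendezvous (highest-random-weight) hashing over SHA-256. It is not CARP's actual hash function and it ignores peer load factors; the peer names are made up.

```python
import hashlib


def pick_peer(url, peers):
    """Choose the peer with the highest hash score for this URL."""
    def score(peer):
        digest = hashlib.sha256((peer + "|" + url).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")
    return max(peers, key=score)


peers = ["proxy-a", "proxy-b", "proxy-c"]   # hypothetical parent caches
print(pick_peer("/details?ticker=ABC", peers))
```

Because the choice depends only on the URL and the peer names, every local cache picks the same parent for a given URL, and removing a parent only remaps the URLs that were assigned to it.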
  27. Why won't Squid cache that response?
  28. Easy Answers
        request Cache-Control
        response Cache-Control
        authentication
        unfriendly freshness information
        lack of LM/ETag
  29. ...in Squid
        request Cache-Control: ignore-reload
        response Cache-Control: ignore-[no-cache, no-store, must-revalidate, private]
        authentication: ignore-auth
        unfriendly freshness information: override-[expire, lastmod]
        lack of LM/ETag: store-stale
      refresh_pattern . 10 100% 10 [options]
  30. ...in Traffic Server
        request Cache-Control: proxy.config.http.cache.ignore_client_no_cache
        response Cache-Control: proxy.config.http.cache.ignore_server_no_cache
        authentication: proxy.config.http.cache.ignore_authentication
        unfriendly freshness information: proxy.config.http.cache.when_to_revalidate
        lack of LM/ETag: proxy.config.http.cache.required_headers
      dest_domain=example.com method=GET pin-in-cache=2d
  31. Not So Easy: Wandering URLs
        http://srv254.dctr.example.com/foo/image.gif
        http://example.com/thing.xml?uselessToken=abc123
        http://example.com/endPointforEverything
      Diagram: http://a, storeurl_rewrite, http://b, http://a
  32. No Answers*
        non-GET methods
        Protocol Errors
        Vary: *
      *without hacking
  33. Your API will be cached.
  34. Accelerator Caching. Diagram: the internets, proxy caching, services
  35. /people?name=Britney_Spears&page=2
      /people?name=Britney_Spears&page=2&
      /people?NAME=Britney_Spears&page=02
      /people?page=2&name=Britney_Spears
      /people?name=Britney_Spears&page=02
      /people?name=britney_spears&page=2
      /people?name=Britney_Spears&page=2&token=abc
      /people?name=Britney_Spears&page=2&user=jane
      non-canonical URLs = low cache hit rate
  36. <map base="http://example.com/">
        <path seg="images">
          <rewrite path="pix"/>
        </path>
        <path seg="people">
          <query lower_keys="true" sort="true" delete="true">
            <page type="bool"/>
            <name type="lower"/>
          </query>
        </path>
      </map>
      XML format; local in-cache / fetched from site; Director
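
To make the canonicalization on slides 35 and 36 concrete, here is a small sketch that maps all eight URL variants from slide 35 onto one cache key. It is not the XML map format shown on slide 36; the `NORMALIZERS` table and the rules in it (lowercase keys, lowercase `name`, numeric `page`, drop unknown parameters, sort) are assumptions chosen to cover those variants. Dropping parameters such as `user` is only safe when they do not change the response.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Known query parameters and how to normalize their values (illustrative rules).
NORMALIZERS = {
    "name": str.lower,
    "page": lambda value: str(int(value)),   # "02" -> "2"
}


def canonicalize(url):
    """Return a canonical form of the URL to use as the cache key."""
    parts = urlsplit(url)
    params = {}
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        key = key.lower()
        normalize = NORMALIZERS.get(key)
        if normalize is None:
            continue                          # unknown parameter: drop it
        params[key] = normalize(value)
    query = urlencode(sorted(params.items()))
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


print(canonicalize("/people?NAME=Britney_Spears&page=02"))
```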
  37. There are only two hard things in CS: cache invalidation & naming things. (Phil Karlton)
  38. reliability, scalability, immediacy. Choose two. Or maybe one.
  39. RFC 2616: Invalidations after Updates or Deletions
      POST/PUT/DELETE/etc.: Request-URI, Content-Location, Location
      Diagram: the internets, http acceleration, origin server
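
The rule referenced on slide 39 (RFC 2616, section 13.10) can be sketched as a function that lists what a cache should invalidate after forwarding a request: for unsafe methods, the Request-URI plus any `Location` or `Content-Location` in the response, provided they point at the same host. The function name and the plain-dict header representation are simplifications for this example.

```python
from urllib.parse import urljoin, urlsplit

SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}


def uris_to_invalidate(method, request_url, response_headers):
    """Return the set of absolute URLs to invalidate after this exchange."""
    if method.upper() in SAFE_METHODS:
        return set()
    targets = {request_url}
    host = urlsplit(request_url).netloc
    for name in ("Location", "Content-Location"):
        value = response_headers.get(name)
        if not value:
            continue
        absolute = urljoin(request_url, value)   # resolve relative references
        if urlsplit(absolute).netloc == host:    # same-host check
            targets.add(absolute)
    return targets


print(uris_to_invalidate(
    "POST",
    "http://example.com/articles/123/new_comment",
    {"Location": "/articles/123/comments"},
))
```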
  40. Problem 1: Peered Caches. Diagram: the internets, POST/PUT/DELETE/etc., http acceleration, origin server
  41. Sharing Invalidations with HTCP CLR. Diagram: the internets, POST/PUT/DELETE/etc., http acceleration, origin server
  42. Problem 2: Related Responses
      POST /articles/123/new_comment
        /newest_comments
        /articles/123/comments
        /comment_feed
  43. Link: rel=invalidated-by
      POST /articles/123/new_comment
        /newest_comments
          Link: </articles/123/new_comment>; rel="invalidate...
        /articles/123/comments
          Link: </articles/123/new_comment>; rel="invalidated-by"
        /comment_feed
          Link: </articles/123/new_comment>; rel="invalidated-by...
  44. Problem 3: Dynamic Relations
      POST /articles/123/new_comment
        /cat/vuvuzela
        /bob/comments
        /newest_comments
          Link: </articles/123/new_comment>; rel="invalidate...
        /articles/123/comments
          Link: </articles/123/new_comment>; rel="invalidated-by"
        /comment_feed
          Link: </articles/123/new_comment>; rel="invalidated-by...
  45. Link: rel=invalidates
      POST /articles/123/new_comment
      Link: </cat/vuvuzela>; rel="invalidates"
      Link: </bob/comments>; rel="invalidates"
        /cat/vuvuzela
        /bob/comments
        /newest_comments
          Link: </articles/123/new_comment>; rel="invalidate...
        /articles/123/comments
          Link: </articles/123/new_comment>; rel="invalidated-by"
        /comment_feed
          Link: </articles/123/new_comment>; rel="invalidated-by...
  46. “side effect” invalidation + link relations = Linked Cache Invalidation
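
How the two link relations from slides 43 to 46 might combine inside a cache can be sketched as follows. Stored responses register themselves under the URL named in `rel="invalidated-by"`, and the response to an unsafe request can name further victims with `rel="invalidates"`. `LinkedCache`, its methods, and the simplified `Link` header regex are all hypothetical; this is a toy model of the idea, not an implementation of the proposal.

```python
import re

# Simplified Link header matcher: <target>; rel="relation"
LINK_RE = re.compile(r'<([^>]*)>\s*;\s*rel="?([\w-]+)"?')


def links(header_values, rel):
    """Return the targets of all Link headers carrying the given relation."""
    found = []
    for value in header_values:
        for target, link_rel in LINK_RE.findall(value):
            if link_rel == rel:
                found.append(target)
    return found


class LinkedCache:
    def __init__(self):
        self.store = {}            # path -> cached body
        self.invalidated_by = {}   # trigger path -> set of dependent cached paths

    def store_response(self, path, body, link_headers=()):
        self.store[path] = body
        for trigger in links(link_headers, "invalidated-by"):
            self.invalidated_by.setdefault(trigger, set()).add(path)

    def handle_unsafe_request(self, path, response_link_headers=()):
        """Invalidate after a POST/PUT/DELETE to `path`; return what was dropped."""
        stale = {path}
        stale |= self.invalidated_by.get(path, set())
        stale |= set(links(response_link_headers, "invalidates"))
        for victim in stale:
            self.store.pop(victim, None)
        return stale
```

With the slide 45 example, a POST to `/articles/123/new_comment` whose response carries the two `rel="invalidates"` links would drop `/cat/vuvuzela` and `/bob/comments` along with the three pages that declared themselves invalidated by that URL.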
  47. Cache Channels. Diagram: the internets, "What's become stale?", http acceleration, origin server
  48. Cache Channels
        Good for: occasional tight control
        Caveat: ~10-30s lag; not immediate
        Bottleneck: number of events in channel
      Linked Cache Invalidation
        Good for: user-generated content
        Caveat: not 100% reliable
        Bottleneck: complexity of relationships
  49. The whole point of using a Web cache is that you're not writing code.
  50. http://www.squid-cache.org/
      http://trafficserver.apache.org/
      http://www.mnot.net/cache_docs/
      http://redbot.org/
      http://github.com/mnot/
