Load balancing @Tuenti


            Ricardo Bartolomé, Senior Systems Engineer
Some numbers


• +12M users.

• 40 billion pageviews a month.

• 40k req/s on the core site at peak time (1.8 Gbps).

• 10k req/s in the image routing layer (2 Gbps).

• +500 frontend servers
Past


• Linux boxes running LVS and ldirectord.

• DSR strategy for load balancing.

• Frontends used to have an external public IP.

• Double investment in networking gear and its
redundancy.

• SSL balanced across all the frontends.
The (old) big picture

[Diagram: client, LVS, frontends f01, f02 ... fN and External API; arrows labeled "HTTP request" and "HTTP response"; zones labeled "External network" and "Internal network".]
Present


• New hardware. 4+1 LB instead of 10 LB (5+5)

• New load balancing strategy using HAProxy layer 7
capabilities.

• SSL terminated in the load balancers.
The big picture

[Diagram: client, HAProxy, HTTP proxy, frontends f01, f02 ... fN and External API; arrows labeled "HTTP request" and "HTTP response"; zones labeled "External network" and "Internal network".]
Hardware


• Intel Xeon X5677 (4 core, 8 threads @ 3.47GHz)

• 8 gigabit network interfaces (Broadcom NetXtreme
5702 w/ multiqueue support)

• 16 GB of memory
Networking

• 4 links for internal and 4 for external
• Connected to different stack member units
• 4 Gbps theoretical capacity limit per node.

[Diagram: the load balancer linked to member unit 0 and member unit 1 on one side, and to member unit 0 and member unit 1 on the other side.]
Networking

• We tune IRQ SMP affinity to spread IRQs across multiple
cores that share the same L2 cache [1]

• We use ECMP (Equal Cost Multi Path) [2] in our edge routers to
spread traffic across the load balancers.

     ip route 95.131.168.x/32 x.x.x.2
     ip route 95.131.168.x/32 x.x.x.1
     ip route 95.131.168.x/32 x.x.x.3
     ip route 95.131.168.x/32 x.x.x.4

[Diagram: one router in front of four load balancers (lb).]
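
The ECMP idea on this slide can be sketched in a few lines of Python. This is an illustration only: real routers use vendor-specific hash functions, and the names `NEXT_HOPS` and `pick_next_hop` are assumptions made for the example. The point it tries to show is that the choice is made per flow, so every packet of one TCP connection reaches the same load balancer.

```python
import hashlib

# Hypothetical next hops: the four load balancers behind the edge router.
NEXT_HOPS = ["lb1", "lb2", "lb3", "lb4"]


def pick_next_hop(src_ip, src_port, dst_ip, dst_port, proto="tcp"):
    """Map a flow (5-tuple) to one of the equal-cost next hops."""
    flow = f"{proto}|{src_ip}|{src_port}|{dst_ip}|{dst_port}".encode()
    digest = hashlib.md5(flow).digest()
    # Same 5-tuple -> same digest -> same load balancer.
    return NEXT_HOPS[int.from_bytes(digest[:4], "big") % len(NEXT_HOPS)]
```

Because the mapping depends only on the flow identifiers, the load balancers do not need to share connection state for this to work.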
HAProxy: Why?


• Layer7 load balancing: Content inspection,
persistence, slow start, throttling, anti-DoS features,
supervision, content switching, keep-alive, etc.

• Very robust and reliable.

• Designed to be a load balancer.

• Offers fine-grained control over HTTP delivery and status:
response codes, connections per frontend, queued
requests, etc.
HAProxy: Concepts


• Frontend: Section where we listen() for incoming
connections.

• Backend: Pool of servers. We define the balancing algorithm,
configure health checks, etc.

• Listen section: frontend+backend. Useful for TCP.

• Connection != request: One connection can hold
multiple requests (keep-alive). Only the first one is
analyzed, logged and processed.
HAProxy: Health checks


• Standard health check

# Backend section
backend www_farm
    mode http
    balance roundrobin
    option httpchk GET /server_health

    # Servers
    server fe01 x.x.x.1:80 check inter 2s downinter 5s rise 2 fall 3 weight 100
    server fe02 x.x.x.2:80 check inter 2s downinter 5s rise 2 fall 3 weight 100
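
How `rise 2 fall 3` turns individual check results into an UP/DOWN decision can be sketched as a small state machine. This is a simplified model, not HAProxy's code: the class name `ServerHealth` is hypothetical, and check timing (`inter`, `downinter`) is left out.

```python
class ServerHealth:
    """Tracks UP/DOWN state from consecutive check results (rise/fall)."""

    def __init__(self, rise=2, fall=3):
        self.rise = rise      # consecutive successes needed to go UP
        self.fall = fall      # consecutive failures needed to go DOWN
        self.up = True
        self.streak = 0       # consecutive results contradicting the state

    def record(self, check_ok):
        """Feed one check result and return the resulting state (True = UP)."""
        if check_ok == self.up:
            self.streak = 0   # result agrees with the current state
        else:
            self.streak += 1
            needed = self.fall if self.up else self.rise
            if self.streak >= needed:
                self.up = not self.up
                self.streak = 0
        return self.up
```

With the values from the slide, a server is taken out after three failed checks in a row and comes back after two successful ones.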
HAProxy: Health checks


• Observe mode

# Backend section
backend www_farm
    mode http
    balance roundrobin
    option httpchk GET /server_health

    # Servers ("observe" is a per-server setting and requires "check")
    server fe01 x.x.x.1:80 check observe layer7 inter 2s downinter 5s rise 2 fall 3 weight 100
    server fe02 x.x.x.2:80 check observe layer7 inter 2s downinter 5s rise 2 fall 3 weight 100
HAProxy: Persistence


• Cookie

• URI & URI parameter

• Source IP

• Header (e.g. Host header)

• RDP cookie (Anyone using MS Terminal Server?)
HAProxy: Cookie persistence

• Maps requests to a backend server based on a cookie
value. You can issue these cookies from the code and
play with them.

• Ideal for deploying code in stages, or for caching user
data locally.

• If the server becomes unreachable, the traffic is
directed to another server within the same pool.
HAProxy: Cookie persistence


backend www
    mode http
    balance roundrobin
    option redispatch
    cookie mycookie insert maxidle 120 maxlife 900 indirect preserve domain .tuenti.com
    server fe01 1.1.1.1:80 weight 100 cookie 1111
    server fe02 1.1.1.2:80 weight 100 cookie 1112
    server fe03 1.1.1.3:80 weight 100 cookie 1113
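
The decision this configuration expresses can be modelled roughly as follows. It is a sketch under assumptions: `make_picker` and the `is_up` callback are hypothetical names, and the real redispatch happens when a connection to the persisted server fails rather than through a lookup like this one.

```python
import itertools

# Cookie values from the configuration above, mapped to server names.
COOKIE_TO_SERVER = {"1111": "fe01", "1112": "fe02", "1113": "fe03"}


def make_picker(servers):
    """Return a function that picks a server for a request."""
    round_robin = itertools.cycle(servers)

    def pick(cookie_value, is_up):
        # Persistence: honour the cookie while its server is reachable.
        server = COOKIE_TO_SERVER.get(cookie_value)
        if server is not None and is_up(server):
            return server
        # No cookie, unknown cookie, or server down: fall back to
        # round robin over the servers that are still up.
        for _ in range(len(servers)):
            candidate = next(round_robin)
            if is_up(candidate):
                return candidate
        return None  # nothing available

    return pick
```

A request carrying `mycookie=1112` keeps going to fe02; if fe02 is down it lands on one of the other two servers.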
HAProxy: URL persistence


• Especially useful for balancing HTTP caching servers
(e.g. Varnish). Without this feature the cache pool is inefficient.

• URLs are hashed and assigned to a server in the pool
(using a modulo operation). A server always serves the same
object, regardless of which load balancer handles the request.

• Adding, removing or losing servers in the pool is not harmful thanks
to consistent hashing.
HAProxy: URL persistence
         map-based hashing

Server   Objects with 6 servers   Objects with 5 servers
A        1, 7                     1, 6
B        2, 8                     2, 7
C        3, 9                     3, 8
D        4, 10                    4, 9
E        5                        5, 10
F        6

High miss rate. #FAIL
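
The high miss rate of map-based (modulo) hashing can be checked with a short calculation. In this sketch the integer object ids stand in for URL hashes, and the helper names `assign` and `moved_fraction` are made up for the example.

```python
def assign(object_id, servers):
    """Map-based assignment: position = hash modulo pool size.

    The integer object id stands in for the hash of a URL.
    """
    return servers[(object_id - 1) % len(servers)]


def moved_fraction(object_ids, before, after):
    """Fraction of objects whose server differs between two pools."""
    moved = sum(1 for o in object_ids if assign(o, before) != assign(o, after))
    return moved / len(object_ids)


six = ["A", "B", "C", "D", "E", "F"]
five = six[:-1]  # server F leaves the pool

fraction = moved_fraction(range(1, 3001), six, five)
print(f"{fraction:.0%} of objects change server")
```

Going from six servers to five moves about five out of six objects to a different server, so almost the whole cache pool has to be refilled.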
HAProxy: URL persistence
         consistent hashing

Server   Objects
A        1, 7
B        2, 8
C        3, 9
D        4
E        5
F        6

1/6 misses = ~17% miss
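
A minimal hash ring shows why losing one server out of six costs only about one sixth of the cache. This is a generic textbook sketch, not HAProxy's implementation: the `HashRing` class, the use of MD5 and the 100 points per server are all assumptions chosen for the example.

```python
import bisect
import hashlib


def _hash(value):
    """Stable 64-bit hash of a string."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class HashRing:
    """Consistent hashing: each server owns many points on a ring."""

    def __init__(self, servers, points_per_server=100):
        self._ring = sorted(
            (_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(points_per_server)
        )
        self._keys = [point for point, _ in self._ring]

    def lookup(self, url):
        """Return the server owning the first point after the URL's hash."""
        idx = bisect.bisect_right(self._keys, _hash(url)) % len(self._keys)
        return self._ring[idx][1]


urls = [f"/image{i}" for i in range(6000)]
before = HashRing(["A", "B", "C", "D", "E", "F"])
after = HashRing(["A", "B", "C", "D", "E"])  # server F is lost

# Only URLs that lived on F are reassigned; everything else stays put.
moved = [u for u in urls if before.lookup(u) != after.lookup(u)]
```

Every URL in `moved` was previously served by F, which is the property that keeps the miss rate near 1/6 instead of 5/6.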
HAProxy: URL persistence


Our image URLs always look like:
     http://img3.tuenti.net/HyUdrohQQAFnCyjMJ2ekAA
     http://img3.tuenti.net/MdlIdrAOilul8ldcRwD7AdzwAeAdB4AMtgAy

We take the first block of the URI and use it for persistence decisions.

     # balance roundrobin
     balance uri depth 1
     hash-type consistent
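
What `depth 1` selects as the hash input can be approximated in Python. This follows the documented behaviour of `balance uri` (the query string is ignored and one level is counted per slash), but the function `uri_hash_key` is a hypothetical helper, not HAProxy code.

```python
def uri_hash_key(uri, depth=1):
    """Return the part of the URI that would be hashed for a given depth."""
    path = uri.split("?", 1)[0]  # the query string is not part of the key
    slashes = 0
    for i, ch in enumerate(path):
        if ch == "/":
            slashes += 1
            if slashes == depth + 1:
                # Stop at the slash that opens the next directory level.
                return path[:i]
    return path
```

For the image URLs above the whole path is a single block, so the complete identifier is hashed; for a deeper path such as `/a/b/c` only `/a` would be used.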
HAProxy: Content switching and ACLs


• Same frontend, different backends.
• Decide which backend will handle the connection
based on:
    • Layer 7 information (HTTP headers, methods, URI, version,
    status)
    • Layer 4 information (source IP, destination IP, port)
    • Internal HAProxy information (number of backend
    connections, active servers in the backend, etc.)

• Too many options to show them all in this presentation. See the
HAProxy configuration doc under Related links.
HAProxy: Content switching and ACLs


# Frontend section
frontend http
     bind x.x.x.x:80
     mode http
     option forwardfor except 127.0.0.1/8 header X-Forwarded-For

    # Farm content switching
    acl acl-api-uri       path        /api
    acl acl-mobile-site   hdr(host)   -i m.tuenti.com
    acl acl-cdn-service   hdr(host)   -i cdn.tuenti.net

    use_backend               mobile_farm      if acl-mobile-site
    use_backend               api_farm         if acl-api-uri
    use_backend               cdn_farm         if acl-cdn-service

    default_backend      www_farm
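
The way these rules are evaluated can be sketched as a function: the `use_backend` lines are tried in order and the first one whose ACL matches wins. The function name `choose_backend` is made up for this example; the host names and backend names come from the configuration above.

```python
def choose_backend(host, path):
    """Evaluate the use_backend rules in order; the first match wins."""
    host = host.lower()  # "hdr(host) -i" compares case-insensitively
    rules = [
        (host == "m.tuenti.com", "mobile_farm"),
        (path == "/api", "api_farm"),  # "path" is an exact match
        (host == "cdn.tuenti.net", "cdn_farm"),
    ]
    for matched, backend in rules:
        if matched:
            return backend
    return "www_farm"  # default_backend
```

Note that a request for `/api` on the mobile host goes to `mobile_farm`, because that rule is listed first.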
HAProxy: Content switching and ACLs


# Backend section
backend www_farm
    mode http
    balance roundrobin

    # Servers
    server fe01 x.x.x.1:80 weight 100
    server fe02 x.x.x.2:80 weight 100

backend mobile_farm
    mode http
    balance roundrobin

    # Servers
    server mfe01 x.x.x.1:80 weight 100
HAProxy: Content switching and ACLs


# Another example using internal HAProxy information
frontend http
     bind x.x.x.x:80
     mode http
     option forwardfor except 127.0.0.1/8 header X-Forwarded-For

    # Insert 250ms delay if the session rate is over 35k req/s
    acl too_fast fe_sess_rate ge 35000
    tcp-request inspect-delay 250ms
    tcp-request content accept if ! too_fast
    tcp-request content accept if WAIT_END
HAProxy: Content blocking


# Blocking requests based on a header value
frontend http
     bind x.x.x.x:80
     mode http
     option forwardfor except 127.0.0.1/8 header X-Forwarded-For

     # Block requests with negative Content-Length value
     acl invalid-cl hdr_val(content-length) lt 0
     block if invalid-cl
HAProxy: Slow start


# Backend section
backend www_farm
    mode http
    balance roundrobin
    option httpchk GET /server_health

    # Servers
    server fe01 x.x.x.1:80 check inter 2s downinter 5s slowstart 60s rise 2 fall 3 weight 100
    server fe02 x.x.x.2:80 check inter 2s downinter 5s slowstart 60s rise 2 fall 3 weight 100
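
The effect of `slowstart 60s` can be pictured as a ramp on the server's weight after it comes back up. The sketch below assumes a plain linear ramp with a floor of 1; HAProxy's actual stepping and its effect on connection limits are not modelled, and `effective_weight` is a hypothetical helper.

```python
def effective_weight(weight, seconds_since_up, slowstart_seconds):
    """Linear ramp of a server's weight during the slow start window."""
    if slowstart_seconds <= 0 or seconds_since_up >= slowstart_seconds:
        return weight  # ramp finished (or slow start disabled)
    ramped = weight * seconds_since_up / slowstart_seconds
    # Assumption: keep at least weight 1 so the server gets some traffic.
    return max(1, int(ramped))
```

With `weight 100` and `slowstart 60s`, a server that recovered 30 seconds ago would receive roughly half of its normal share in this model.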
HAProxy: Graceful shutdown


# Backend section
backend www_farm
    mode http
    balance roundrobin
    option httpchk GET /server_health
    http-check disable-on-404

    # Servers
    server fe01 x.x.x.1:80 check inter 2s downinter 5s slowstart 60s rise 2 fall 3 weight 100
    server fe02 x.x.x.2:80 check inter 2s downinter 5s slowstart 60s rise 2 fall 3 weight 100
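
How a health check response is interpreted with `http-check disable-on-404` can be summarised as follows. The state names ("up", "drain", "failed") and the function `interpret_check` are labels chosen for this sketch, not HAProxy identifiers.

```python
def interpret_check(status_code, disable_on_404=True):
    """Classify one HTTP health check response."""
    if 200 <= status_code < 400:
        return "up"      # 2xx and 3xx count as a passed check
    if status_code == 404 and disable_on_404:
        # The server stays up for persistent sessions but is taken
        # out of load balancing for new ones.
        return "drain"
    return "failed"      # counts towards the "fall" threshold
```

To drain a frontend before maintenance, the application only has to make `/server_health` return 404; existing persistent users finish their sessions while no new traffic is balanced to it.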
HAProxy: Monitoring


• Traffic through different frontend interfaces. Easy to
aggregate incoming/outgoing traffic.

• Amount of different HTTP response codes

• /proc/net/sockstat
HAProxy: Monitoring


frontend stats1
     mode              http
     bind-process         1
     bind            :8081
     default_backend        haproxy-stats1

backend haproxy-stats1
    bind-process 1
    mode http
    stats enable
    stats refresh 60s
    stats uri /
    stats auth mgmt:password
Client-side load balancing


• When a user logs into the site, the browser loads a
JavaScript API and talks to it.

• The browser communicates with the API, which
uses EasyXDM.

• Using application logic we direct user requests to a
defined farm.
   • A/B testing based on any criteria.
   • Where are you from?
   • How old are you?
Client-side load balancing


'frontend_farm_map' => array(
          1 => 'www1', // x% (Alava)
          2 => 'www4', // y% (Albacete)
          3 => 'www4', // z% (Alicante)
          …
)

'users_using_staging' => array(
    'level' => 'limited',
    'percent' => 10,
)
SSL


• TCP load balancing is not useful for us.

• We deployed stunnel and it worked fine for a while.
• Then we started to suffer contention when accepting new
connections.

• We currently use stud [3] to terminate SSL in our load
balancers.
SSL: Legal issues


• You can't use this SSL termination strategy in a PCI
compliant platform.

• We carry the client IP in X-Forwarded-For headers
so that we can log user IPs, as law enforcement requires.

• We terminate SSL in the load balancer because when balancing
SSL at the TCP level you can't tell the backend the client IP.
stud: The Scalable TLS Unwrapping Daemon


• Supports both SSL and TLS using OpenSSL.

• Uses a process-per-core model.

• Asynchronous I/O using libev.

• Very little overhead per connection.

• Designed for long-living connections.

• Supports PROXY protocol.

• Inter-process communication was added recently [4].
PROXY protocol


• Created by the HAProxy author [5] to safely transport connection
information across multiple layers of NAT or TCP proxies.

• Native support in stud. Patches available for stunnel4.

• We use it so that stud can tell HAProxy the real IP of the
client. HAProxy converts this information into an X-Forwarded-For
header that we can read and store in our application.
PROXY protocol


# stud --ssl -c OPENSSL_CIPHERS -b 127.0.0.1,8888 -f x.x.x.x,443 -n 2 -u stud --write-proxy certificate.pem

frontend http-localhost-proxy-443
    bind 127.0.0.1:8888 accept-proxy
    mode http
    reqadd X-Protocol:\ SSL
    reqadd X-Port:\ 443
    default_backend       www_farm
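
What stud prepends to the connection with `--write-proxy`, and what HAProxy reads with `accept-proxy`, is a single text line in PROXY protocol version 1. The sketch below builds and parses that line for TCP over IPv4 or IPv6; the function names are hypothetical, and the `PROXY UNKNOWN` form and the binary version 2 of the protocol are not handled.

```python
def build_proxy_v1(src_ip, dst_ip, src_port, dst_port):
    """Build a PROXY protocol v1 header for a TCP/IPv4 connection."""
    return f"PROXY TCP4 {src_ip} {dst_ip} {src_port} {dst_port}\r\n".encode("ascii")


def parse_proxy_v1(data):
    """Split a byte stream into (connection info, remaining payload)."""
    line, sep, rest = data.partition(b"\r\n")
    if not sep:
        raise ValueError("incomplete PROXY header")
    fields = line.decode("ascii").split(" ")
    if len(fields) != 6 or fields[0] != "PROXY" or fields[1] not in ("TCP4", "TCP6"):
        raise ValueError("unsupported PROXY header")
    _, family, src_ip, dst_ip, src_port, dst_port = fields
    info = {
        "family": family,
        "src_ip": src_ip,          # the real client address
        "dst_ip": dst_ip,
        "src_port": int(src_port),
        "dst_port": int(dst_port),
    }
    return info, rest
```

The header travels once, at the start of the connection, before any HTTP data, so the receiver learns the client address even though the TCP connection comes from 127.0.0.1.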
REST API


• Not an official feature (yet) [6]



• You can easily communicate to the server via HTTP.

• Awesome for orchestrating your web tier.
Questions?
Related links

• [1]   http://software.intel.com/en-us/articles/improved-linux-smp-scaling-user-directed-processor-affinity/

• [2]   http://en.wikipedia.org/wiki/Equal-cost_multi-path_routing

• [3]   stud repo: https://github.com/bumptech/stud

• [4]   Scaling SSL: http://blog.exceliance.fr/2011/11/07/scaling-out-ssl/

• [5]   PROXY protocol: http://haproxy.1wt.eu/download/1.5/doc/proxy-protocol.txt

• [6]   REST API patch: https://github.com/jbuchbinder/haproxy-forked

• HAProxy configuration doc: http://haproxy.1wt.eu/download/1.5/doc/configuration.txt

Weitere ähnliche Inhalte

Was ist angesagt?

Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
Dvir Volk
 
Kafka Connect: Real-time Data Integration at Scale with Apache Kafka, Ewen Ch...
Kafka Connect: Real-time Data Integration at Scale with Apache Kafka, Ewen Ch...Kafka Connect: Real-time Data Integration at Scale with Apache Kafka, Ewen Ch...
Kafka Connect: Real-time Data Integration at Scale with Apache Kafka, Ewen Ch...
confluent
 
Kafka as your Data Lake - is it Feasible? (Guido Schmutz, Trivadis) Kafka Sum...
Kafka as your Data Lake - is it Feasible? (Guido Schmutz, Trivadis) Kafka Sum...Kafka as your Data Lake - is it Feasible? (Guido Schmutz, Trivadis) Kafka Sum...
Kafka as your Data Lake - is it Feasible? (Guido Schmutz, Trivadis) Kafka Sum...
HostedbyConfluent
 

Was ist angesagt? (20)

Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
Erasure codes and storage tiers on gluster
Erasure codes and storage tiers on glusterErasure codes and storage tiers on gluster
Erasure codes and storage tiers on gluster
 
Kafka Connect: Real-time Data Integration at Scale with Apache Kafka, Ewen Ch...
Kafka Connect: Real-time Data Integration at Scale with Apache Kafka, Ewen Ch...Kafka Connect: Real-time Data Integration at Scale with Apache Kafka, Ewen Ch...
Kafka Connect: Real-time Data Integration at Scale with Apache Kafka, Ewen Ch...
 
Disaster Recovery Plans for Apache Kafka
Disaster Recovery Plans for Apache KafkaDisaster Recovery Plans for Apache Kafka
Disaster Recovery Plans for Apache Kafka
 
Implementing Domain Events with Kafka
Implementing Domain Events with KafkaImplementing Domain Events with Kafka
Implementing Domain Events with Kafka
 
Apache Kafka - Overview
Apache Kafka - OverviewApache Kafka - Overview
Apache Kafka - Overview
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explained
 
Apache Kafka at LinkedIn
Apache Kafka at LinkedInApache Kafka at LinkedIn
Apache Kafka at LinkedIn
 
Apache Kafka Introduction
Apache Kafka IntroductionApache Kafka Introduction
Apache Kafka Introduction
 
Pub/Sub Messaging
Pub/Sub MessagingPub/Sub Messaging
Pub/Sub Messaging
 
Power of the Log: LSM & Append Only Data Structures
Power of the Log: LSM & Append Only Data StructuresPower of the Log: LSM & Append Only Data Structures
Power of the Log: LSM & Append Only Data Structures
 
Kafka Overview
Kafka OverviewKafka Overview
Kafka Overview
 
Kafka presentation
Kafka presentationKafka presentation
Kafka presentation
 
Kafka as your Data Lake - is it Feasible? (Guido Schmutz, Trivadis) Kafka Sum...
Kafka as your Data Lake - is it Feasible? (Guido Schmutz, Trivadis) Kafka Sum...Kafka as your Data Lake - is it Feasible? (Guido Schmutz, Trivadis) Kafka Sum...
Kafka as your Data Lake - is it Feasible? (Guido Schmutz, Trivadis) Kafka Sum...
 
Maximizing Amazon EC2 and Amazon EBS performance
Maximizing Amazon EC2 and Amazon EBS performanceMaximizing Amazon EC2 and Amazon EBS performance
Maximizing Amazon EC2 and Amazon EBS performance
 
Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeper
 
Google Cloud and Data Pipeline Patterns
Google Cloud and Data Pipeline PatternsGoogle Cloud and Data Pipeline Patterns
Google Cloud and Data Pipeline Patterns
 
Apache Kafka vs RabbitMQ: Fit For Purpose / Decision Tree
Apache Kafka vs RabbitMQ: Fit For Purpose / Decision TreeApache Kafka vs RabbitMQ: Fit For Purpose / Decision Tree
Apache Kafka vs RabbitMQ: Fit For Purpose / Decision Tree
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
 
Apache Flink and what it is used for
Apache Flink and what it is used forApache Flink and what it is used for
Apache Flink and what it is used for
 

Andere mochten auch

Tuenti conceptos
Tuenti conceptosTuenti conceptos
Tuenti conceptos
Alex Andray
 
London2011 tuenti
London2011 tuentiLondon2011 tuenti
London2011 tuenti
Juan Varela
 
Abc economist mediareport-final
Abc economist mediareport-finalAbc economist mediareport-final
Abc economist mediareport-final
Juan Varela
 
Socialnetworks
Socialnetworks Socialnetworks
Socialnetworks
eaajm
 

Andere mochten auch (15)

USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a MonthUSENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
 
Scaling Instagram
Scaling InstagramScaling Instagram
Scaling Instagram
 
Chirp 2010: Scaling Twitter
Chirp 2010: Scaling TwitterChirp 2010: Scaling Twitter
Chirp 2010: Scaling Twitter
 
Embracing Open Source: Practice and Experience from Alibaba
Embracing Open Source: Practice and Experience from AlibabaEmbracing Open Source: Practice and Experience from Alibaba
Embracing Open Source: Practice and Experience from Alibaba
 
Tuenti conceptos
Tuenti conceptosTuenti conceptos
Tuenti conceptos
 
Openstack Summit Tokyo 2015 - Building a private cloud to efficiently handle ...
Openstack Summit Tokyo 2015 - Building a private cloud to efficiently handle ...Openstack Summit Tokyo 2015 - Building a private cloud to efficiently handle ...
Openstack Summit Tokyo 2015 - Building a private cloud to efficiently handle ...
 
All About Those User Stories
All About Those User StoriesAll About Those User Stories
All About Those User Stories
 
London2011 tuenti
London2011 tuentiLondon2011 tuenti
London2011 tuenti
 
Abc economist mediareport-final
Abc economist mediareport-finalAbc economist mediareport-final
Abc economist mediareport-final
 
Socialnetworks
Socialnetworks Socialnetworks
Socialnetworks
 
Product design: How to create a product
Product design: How to create a productProduct design: How to create a product
Product design: How to create a product
 
Telefonica Digital 2012
Telefonica Digital 2012Telefonica Digital 2012
Telefonica Digital 2012
 
Analysis of Facebook and Tuenti
Analysis of Facebook and TuentiAnalysis of Facebook and Tuenti
Analysis of Facebook and Tuenti
 
Telefónica Digital – our formula for success
Telefónica Digital – our formula for successTelefónica Digital – our formula for success
Telefónica Digital – our formula for success
 
SREConEurope15 - The evolution of the DHCP infrastructure at Facebook
SREConEurope15 - The evolution of the DHCP infrastructure at FacebookSREConEurope15 - The evolution of the DHCP infrastructure at Facebook
SREConEurope15 - The evolution of the DHCP infrastructure at Facebook
 

Ähnlich wie Load balancing at tuenti

Rooting your internals - Exploiting Internal Network Vulns via the Browser Us...
Rooting your internals - Exploiting Internal Network Vulns via the Browser Us...Rooting your internals - Exploiting Internal Network Vulns via the Browser Us...
Rooting your internals - Exploiting Internal Network Vulns via the Browser Us...
Michele Orru
 

Ähnlich wie Load balancing at tuenti (20)

haproxy-150423120602-conversion-gate01.pdf
haproxy-150423120602-conversion-gate01.pdfhaproxy-150423120602-conversion-gate01.pdf
haproxy-150423120602-conversion-gate01.pdf
 
HAProxy
HAProxy HAProxy
HAProxy
 
slides (PPT)
slides (PPT)slides (PPT)
slides (PPT)
 
A Tale of 2 Systems
A Tale of 2 SystemsA Tale of 2 Systems
A Tale of 2 Systems
 
HA Deployment Architecture with HAProxy and Keepalived
HA Deployment Architecture with HAProxy and KeepalivedHA Deployment Architecture with HAProxy and Keepalived
HA Deployment Architecture with HAProxy and Keepalived
 
Web Server Load Balancer
Web Server Load BalancerWeb Server Load Balancer
Web Server Load Balancer
 
Stream processing on mobile networks
Stream processing on mobile networksStream processing on mobile networks
Stream processing on mobile networks
 
Http - All you need to know
Http - All you need to knowHttp - All you need to know
Http - All you need to know
 
HTTP Acceleration with Varnish
HTTP Acceleration with VarnishHTTP Acceleration with Varnish
HTTP Acceleration with Varnish
 
Web technologies: HTTP
Web technologies: HTTPWeb technologies: HTTP
Web technologies: HTTP
 
Scalable Web Apps
Scalable Web AppsScalable Web Apps
Scalable Web Apps
 
.NET Conf 2022 - Networking in .NET 7
.NET Conf 2022 - Networking in .NET 7.NET Conf 2022 - Networking in .NET 7
.NET Conf 2022 - Networking in .NET 7
 
Before OTD EDU - Introduction
Before OTD EDU - IntroductionBefore OTD EDU - Introduction
Before OTD EDU - Introduction
 
Multi-Layer DDoS Mitigation Strategies
Multi-Layer DDoS Mitigation StrategiesMulti-Layer DDoS Mitigation Strategies
Multi-Layer DDoS Mitigation Strategies
 
The never-ending REST API design debate -- Devoxx France 2016
The never-ending REST API design debate -- Devoxx France 2016The never-ending REST API design debate -- Devoxx France 2016
The never-ending REST API design debate -- Devoxx France 2016
 
Rooting your internals - Exploiting Internal Network Vulns via the Browser Us...
Rooting your internals - Exploiting Internal Network Vulns via the Browser Us...Rooting your internals - Exploiting Internal Network Vulns via the Browser Us...
Rooting your internals - Exploiting Internal Network Vulns via the Browser Us...
 
Denser, cooler, faster, stronger: PHP on ARM microservers
Denser, cooler, faster, stronger: PHP on ARM microserversDenser, cooler, faster, stronger: PHP on ARM microservers
Denser, cooler, faster, stronger: PHP on ARM microservers
 
Managing multi tenant resource toward Hive 2.0
Managing multi tenant resource toward Hive 2.0Managing multi tenant resource toward Hive 2.0
Managing multi tenant resource toward Hive 2.0
 
How To Set Up SQL Load Balancing with HAProxy - Slides
How To Set Up SQL Load Balancing with HAProxy - SlidesHow To Set Up SQL Load Balancing with HAProxy - Slides
How To Set Up SQL Load Balancing with HAProxy - Slides
 
Fastsocket Linxiaofeng
Fastsocket LinxiaofengFastsocket Linxiaofeng
Fastsocket Linxiaofeng
 

Kürzlich hochgeladen

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 

Kürzlich hochgeladen (20)

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Evaluating the top large language models.pdf
Evaluating the top large language models.pdfEvaluating the top large language models.pdf
Evaluating the top large language models.pdf
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 

Load balancing at tuenti

  • 1. Load balancing @Tuenti Ricardo Bartolomé, Senior Systems Engineer
  • 2. Some numbers • +12M users. • 40 billion pageviews a month. • 40k req/s in core site at peak time (1.8 gbps). • 10k req/s in image routing layer (2gbps). • +500 frontend servers
  • 3. Past • Linux boxes running LVS and ldirectord. • DSR strategy for load balancing. • Frontends used to have a external public IP. • Double investment in networking gear and its redundancy. • SSL balanced across all the frontends.
  • 4. The (old) big picture HTTP request client External API HTTP response LVS External network f01 f02 fN Internal network
  • 5. Present • New hardware. 4+1 LB instead of 10 LB (5+5) • New load balancing strategy using HAProxy layer 7 capabilities. • SSL terminated in the load balancers.
  • 6. The big picture HTTP request External client API HTTP response HTTP External network HAProxy proxy Internal network HTTP response f01 f02 fN
  • 7. Hardware • Intel Xeon X5677 (4 core, 8 threads @ 3.47GHz) • 8 gigabit network interfaces (Broadcon NextExtreme 5702 w/ multiqueue support) • 16 GB of memory
  • 8. Networking • 4 links for internal and 4 for external • Connected to different stack member units • 4gbps theorical capacity limit per node. member unit 0 member unit 1 load balancer member unit 0 member unit 1
  • 9. Networking • We tune IRQ SMP affinity for sharding IRQs across multiple cores that share the same L2 cache [1] • We do ECMP (Equal Cost Multi Path) [2] in our edge routers for sharding traffic across the load balancers. ip route 95.131.168.x/32 x.x.x.2 ip route 95.131.168.x/32 x.x.x.1 ip route 95.131.168.x/32 x.x.x.3 ip route 95.131.168.x/32 x.x.x.4 router lb lb lb lb
  • 10. HAProxy: Why? • Layer7 load balancing: Content inspection, persistence, slow start, throttling, anti-DoS features, supervision, content switching, keep-alive, etc. • Very robust and reliable. • Designed to be a load balancer. • Offers high control over HTTP delivery and status: response codes, connections per frontend, queued request, etc.
  • 11. HAProxy: Concepts • Frontend: Section where we listen() for incoming connections. • Backend: Pool of servers. We define algorithm, configure healthy checks, etc. • Listen section: frontend+backend. Useful for TCP. • Connection != request: One connection can hold multiple requests (keep-alive). Only the first one is analyzed, logged and processed.
  • 12. HAProxy: Health checks • Standard health check # Backend section backend www_farm mode http balance roundrobin option httpchk GET /server_health # Servers server fe01 x.x.x.1:80 check inter 2s downinter 5s rise 2 fall 3 weight 100 server fe02 x.x.x.2:80 check inter 2s downinter 5s rise 2 fall 3 weight 100
  • 13. HAProxy: Health checks • Observe mode # Backend section backend www_farm mode http balance roundrobin option httpchk GET /server_health observe layer7 # Servers server fe01 x.x.x.1:80 check inter 2s downinter 5s rise 2 fall 3 weight 100 server fe02 x.x.x.2:80 check inter 2s downinter 5s rise 2 fall 3 weight 100
  • 14. HAProxy: Persistence • Cookie • URI & URI parameter • Source IP • Header (e.g. Host header) • RDP cookie (Anyone using MS Terminal Server?)
  • 15. HAProxy: Cookie persistence • Maps a cookie value to a backend server. You can issue these cookies from the code and play with them. • Ideal for deploying code in stages, or for caching user data locally. • If the server becomes unreachable, the traffic is directed to another server within the same pool.
  • 16. HAProxy: Cookie persistence
      backend www
          mode http
          balance roundrobin
          option redispatch
          cookie mycookie insert maxidle 120 maxlife 900 indirect preserve domain .tuenti.com
          server fe01 1.1.1.1:80 weight 100 cookie 1111
          server fe02 1.1.1.2:80 weight 100 cookie 1112
          server fe03 1.1.1.3:80 weight 100 cookie 1113
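The decision this backend makes can be sketched as follows: stick to the server named by the cookie when it is healthy, otherwise fall back to another server in the pool, which is what `option redispatch` allows. The function name and the round-robin fallback are assumptions made for illustration.

```python
import itertools

# Cookie values and servers from the backend above.
COOKIE_TO_SERVER = {"1111": "fe01", "1112": "fe02", "1113": "fe03"}
_round_robin = itertools.cycle(sorted(COOKIE_TO_SERVER.values()))

def choose_server(cookie_value, healthy):
    """Return the sticky server if it is usable, otherwise redispatch."""
    server = COOKIE_TO_SERVER.get(cookie_value)
    if server in healthy:
        return server
    # No cookie, unknown cookie, or server down: take the next healthy server.
    for candidate in itertools.islice(_round_robin, len(COOKIE_TO_SERVER)):
        if candidate in healthy:
            return candidate
    raise RuntimeError("no healthy server in the pool")
```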
  • 17. HAProxy: URL persistence • Especially interesting for balancing HTTP caching servers (e.g. Varnish). Without this feature the cache pool will be inefficient. • The URLs are hashed and each hash is assigned to a server in the pool. A server will always serve the same object regardless of the load balancer that handles the request. • Adding, removing or losing servers does little harm when consistent hashing is used instead of map-based (modulo) hashing.
  • 18-21. HAProxy: URL persistence, map-based hashing • [Diagram: objects 1-10 spread over servers A-F (A: 1, 7; B: 2, 8; C: 3, 9; D: 4, 10; E: 5; F: 6). With only servers A-E left they are redistributed as A: 1, 6; B: 2, 7; C: 3, 8; D: 4, 9; E: 5, 10.] • High miss rate. #FAIL
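The miss rate of map-based hashing can be checked with a few lines of code. This is a sketch that assumes the object number itself is the hash and that server F leaves the pool, as in the diagram.

```python
def owner(key, servers):
    """Map-based hashing: key index modulo the pool size."""
    return servers[(key - 1) % len(servers)]

def miss_rate(keys, before, after):
    """Fraction of keys whose owner changes when the pool changes."""
    moved = sum(1 for k in keys if owner(k, before) != owner(k, after))
    return moved / len(keys)

six = ["A", "B", "C", "D", "E", "F"]
five = six[:-1]  # server F removed

print(miss_rate(range(1, 11), six, five))  # objects 1..10, as on the slides
print(miss_rate(range(1, 31), six, five))  # a larger key set
```

With ten objects half of them change server. With thirty objects only five keep their server, so the cache pool has to refetch most of its content after losing a single node.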
  • 22-25. HAProxy: URL persistence, consistent hashing • [Diagram: objects 1-9 spread over servers A-F; when one server is lost, only the objects it held move to other servers.] • 1/6 misses = ~17% miss rate
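A minimal consistent-hashing ring shows why only the lost server's objects move. This is a generic textbook sketch, not HAProxy's implementation; the MD5 hash, the 100 virtual nodes per server and the object names are assumptions.

```python
import bisect
import hashlib

def _point(label):
    """Place a label on the ring as a 64-bit integer."""
    return int.from_bytes(hashlib.md5(label.encode()).digest()[:8], "big")

class Ring:
    def __init__(self, servers, vnodes=100):
        # Each server gets several points on the ring to even out the load.
        self.points = sorted(
            (_point(f"{s}#{i}"), s) for s in servers for i in range(vnodes)
        )
        self.hashes = [h for h, _ in self.points]

    def owner(self, key):
        # First ring point clockwise from the key, wrapping around at the end.
        idx = bisect.bisect_left(self.hashes, _point(key)) % len(self.points)
        return self.points[idx][1]

keys = [f"object-{n}" for n in range(1000)]
before = Ring(list("ABCDEF"))
after = Ring(list("ABCDE"))  # server F removed
moved = [k for k in keys if before.owner(k) != after.owner(k)]
print(len(moved) / len(keys))  # share of objects that changed server
```

Every object in `moved` was previously served by F. The points of the other servers stay where they were on the ring, so their objects are untouched.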
  • 26. HAProxy: URL persistence • Our image URLs always look like: http://img3.tuenti.net/HyUdrohQQAFnCyjMJ2ekAA • We can take the first block of the URI and use it for persistence decisions.
      # balance roundrobin
      balance uri depth 1
      hash-type consistent
  • 27. HAProxy: URL persistence • Our image URLs always look like: http://img3.tuenti.net/MdlIdrAOilul8ldcRwD7AdzwAeAdB4AMtgAy • We can take the first block of the URI and use it for persistence decisions.
      # balance roundrobin
      balance uri depth 1
      hash-type consistent
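To make "the first block of the URI" concrete, the sketch below extracts the part of the path that a setting like `balance uri depth 1` would feed into the hash: the query string is dropped and the path is cut after one directory level. It is an approximation written for illustration, not HAProxy's parser, and the `/extra?size=big` suffix in the usage example is hypothetical.

```python
def uri_hash_key(uri, depth=1):
    """Keep the path up to `depth` slash-delimited levels, dropping the query."""
    path = uri.split("?", 1)[0]
    segments = path.lstrip("/").split("/")
    return "/" + "/".join(segments[:depth])

print(uri_hash_key("/HyUdrohQQAFnCyjMJ2ekAA"))
print(uri_hash_key("/HyUdrohQQAFnCyjMJ2ekAA/extra?size=big"))
```

Both calls yield the same key, so both requests would hash to the same cache server.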
  • 28. HAProxy: Content switching and ACLs • Same frontend, different backend. • Decide which backend will handle the connection based on: • Layer 7 information (HTTP headers, methods, URI, version, status) • Layer 4 information (source IP, destination IP, port) • Internal HAProxy information (number of backend connections, active servers in the backend, etc.) • Too many options to show them all in this presentation. [1]
  • 29. HAProxy: Content switching and ACLs
      # Frontend section
      frontend http
          bind x.x.x.x:80
          mode http
          option forwardfor except 127.0.0.1/8 header X-Forwarded-For
          # Farm content switching
          acl acl-api-uri path /api
          acl acl-mobile-site hdr(host) -i m.tuenti.com
          acl acl-cdn-service hdr(host) -i cdn.tuenti.net
          use_backend mobile_farm if acl-mobile-site
          use_backend api_farm if acl-api-uri
          use_backend cdn_farm if acl-cdn-service
          default_backend www_farm
  • 30. HAProxy: Content switching and ACLs
      # Backend section
      backend www_farm
          mode http
          balance roundrobin
          # Servers
          server fe01 x.x.x.1:80 weight 100
          server fe02 x.x.x.2:80 weight 100
      backend mobile_farm
          mode http
          balance roundrobin
          # Servers
          server mfe01 x.x.x.1:80 weight 100
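The content-switching rules from the frontend read as an ordered list in which the first matching `use_backend` wins. A small sketch of that evaluation order follows; the function name and the `www.tuenti.com` host in the usage example are hypothetical.

```python
def choose_backend(host, path):
    """Evaluate the use_backend rules in order; the first match wins."""
    host = host.lower()                   # hdr(host) -i is case-insensitive
    rules = [
        (host == "m.tuenti.com", "mobile_farm"),
        (path == "/api", "api_farm"),     # `path` is an exact match
        (host == "cdn.tuenti.net", "cdn_farm"),
    ]
    for matched, backend in rules:
        if matched:
            return backend
    return "www_farm"                     # default_backend

print(choose_backend("www.tuenti.com", "/api"))
```

Since the mobile rule comes first, a request for `/api` on the mobile host goes to `mobile_farm`, not `api_farm`.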
  • 31. HAProxy: Content switching and ACLs
      # Another example using internal HAProxy information
      frontend http
          bind x.x.x.x:80
          mode http
          option forwardfor except 127.0.0.1/8 header X-Forwarded-For
          # Insert 250ms delay if the session rate is over 35k req/s
          acl too_fast fe_sess_rate ge 35000
          tcp-request inspect-delay 250ms
          tcp-request content accept if ! too_fast
          tcp-request content accept if WAIT_END
  • 32. HAProxy: Content blocking
      frontend http
          bind x.x.x.x:80
          mode http
          option forwardfor except 127.0.0.1/8 header X-Forwarded-For
          # Block requests with negative Content-Length value
          # (lt, not le: a Content-Length of 0 is valid)
          acl invalid-cl hdr_val(content-length) lt 0
          block if invalid-cl
  • 33. HAProxy: Slow start
      # Backend section
      backend www_farm
          mode http
          balance roundrobin
          option httpchk GET /server_health
          # Servers
          server fe01 x.x.x.1:80 check inter 2s downinter 5s slowstart 60s rise 2 fall 3 weight 100
          server fe02 x.x.x.2:80 check inter 2s downinter 5s slowstart 60s rise 2 fall 3 weight 100
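With `slowstart 60s`, a server that has just come back up receives a growing share of traffic until it reaches full speed after 60 seconds. The sketch below models that as a linear ramp of the weight; the exact stepping HAProxy applies may differ, and the function name is hypothetical.

```python
def effective_weight(weight, seconds_up, slowstart=60):
    """Linear ramp from (almost) zero to the configured weight."""
    if seconds_up >= slowstart:
        return weight
    # Never return 0 so the server still receives a trickle of traffic.
    return max(1, weight * seconds_up // slowstart)

for t in (0, 15, 30, 60):
    print(t, effective_weight(100, t))
```

This gives a frontend with cold caches time to warm up before it takes its full share of requests.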
  • 34. HAProxy: Graceful shutdown
      # Backend section
      backend www_farm
          mode http
          balance roundrobin
          option httpchk GET /server_health
          http-check disable-on-404
          # Servers
          server fe01 x.x.x.1:80 check inter 2s downinter 5s slowstart 60s rise 2 fall 3 weight 100
          server fe02 x.x.x.2:80 check inter 2s downinter 5s slowstart 60s rise 2 fall 3 weight 100
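With `http-check disable-on-404`, a server that answers the health check with 404 stops receiving new load-balanced traffic but keeps serving connections that are already persistent to it, which makes a graceful shutdown possible. The mapping can be sketched like this; the state names are labels chosen for the example and rise/fall counting is ignored.

```python
def server_state(status_code, disable_on_404=True):
    """Map a health-check HTTP status to the state the balancer would use."""
    if 200 <= status_code < 400:
        return "UP"      # 2xx and 3xx count as a successful check
    if status_code == 404 and disable_on_404:
        return "DRAIN"   # no new sessions; persistent ones keep flowing
    return "DOWN"

print(server_state(404))
```

To drain a frontend before maintenance, the application only has to make `/server_health` return 404.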
  • 35. HAProxy: Monitoring • Traffic through the different frontend interfaces. Easy to aggregate incoming/outgoing traffic. • Count of each HTTP response code. • /proc/net/sockstat
  • 36. HAProxy: Monitoring
      frontend stats1
          mode http
          bind-process 1
          bind :8081
          default_backend haproxy-stats1
      backend haproxy-stats1
          bind-process 1
          mode http
          stats enable
          stats refresh 60s
          stats uri /
          stats auth mgmt:password
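Socket counters from /proc/net/sockstat are easy to feed into a monitoring system. The sketch below parses the file's name/value layout; the sample values are made up for the example.

```python
def parse_sockstat(text):
    """Turn /proc/net/sockstat content into {protocol: {counter: value}}."""
    stats = {}
    for line in text.strip().splitlines():
        proto, _, rest = line.partition(":")
        fields = rest.split()
        # Fields come in name/value pairs, e.g. "inuse 48 orphan 2".
        stats[proto.strip()] = {
            name: int(value) for name, value in zip(fields[::2], fields[1::2])
        }
    return stats

# Made-up sample in the format the Linux kernel uses.
SAMPLE = """\
sockets: used 320
TCP: inuse 48 orphan 2 tw 115 alloc 60 mem 12
UDP: inuse 4 mem 1
"""

print(parse_sockstat(SAMPLE)["TCP"]["tw"])
```

On a load balancer the same function can be called with the content of the real file, for example `parse_sockstat(open("/proc/net/sockstat").read())`.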
  • 37. Client-side load balancing • When a user logs in to the site, the browser loads a JavaScript API and talks to it. • The API uses EasyXDM for its communication. • Using application logic we route user requests to a defined farm. • A/B testing based on any criteria. • Where are you from? • How old are you?
  • 38. Client-side load balancing
      'frontend_farm_map' => array(
          1 => 'www1', // x% (Alava)
          2 => 'www4', // y% (Albacete)
          3 => 'www4', // z% (Alicante)
          …
      ),
      'users_using_staging' => array(
          'level' => 'limited',
          'percent' => 10,
      ),
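How such a configuration could drive the farm choice is sketched below in Python. The `staging` farm name, the default farm and the `user_id % 100` bucketing are assumptions made for illustration, not the application's actual logic.

```python
FRONTEND_FARM_MAP = {1: "www1", 2: "www4", 3: "www4"}  # province id -> farm
STAGING = {"level": "limited", "percent": 10}

def farm_for_user(user_id, province_id, default_farm="www1"):
    """Pick a farm from application data; everything here is illustrative."""
    # Deterministic bucket 0..99 so a user always lands in the same group.
    if STAGING["level"] == "limited" and user_id % 100 < STAGING["percent"]:
        return "staging"
    return FRONTEND_FARM_MAP.get(province_id, default_farm)

print(farm_for_user(42, 2))
```

Keeping the bucket deterministic means a user does not bounce between staging and production from one request to the next.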
  • 39. SSL • TCP load balancing is not useful for us. • We deployed stunnel and it worked fine for a while. • Then we started to suffer contention when accepting new connections. • We are currently using stud [3] to terminate SSL in our load balancers.
  • 40. SSL: Legal issues • You can’t use this SSL termination strategy in a PCI-compliant platform. • We carry the client IP in X-Forwarded-For headers so that we can log users' IPs, as required by law enforcement. • We terminate SSL in the load balancer because, when balancing SSL at the TCP level, the load balancer can’t tell the backend the client IP.
  • 41. stud: The Scalable TLS Unwrapping Daemon • Supports both SSL and TLS using OpenSSL. • Uses a process-per-core model. • Asynchronous I/O using libev. • Very little overhead per connection. • Designed for long-lived connections. • Supports PROXY protocol. • Inter-process communication was added recently [4].
  • 42. PROXY protocol • Created by the HAProxy author [5] to safely transport connection information across multiple layers of NAT or TCP proxies. • Native support in stud. Patches available for stunnel4. • We use it so that stud can tell HAProxy the real IP of the client. HAProxy converts this information into an X-Forwarded-For header that we can read and store in our application.
  • 43. PROXY protocol
      # stud --ssl -c OPENSSL_CIPHERS -b 127.0.0.1,8888 -f x.x.x.x,443 -n 2 -u stud --write-proxy certificate.pem

      frontend http-localhost-proxy-443
          bind 127.0.0.1:8888 accept-proxy
          mode http
          # Spaces inside the header value must be escaped
          reqadd X-Protocol:\ SSL
          reqadd X-Port:\ 443
          default_backend www_farm
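The header that `--write-proxy` puts in front of each connection is a single human-readable line (version 1 of the protocol). A minimal parser sketch shows what the receiving side gets out of it; the function names are hypothetical and the addresses in the usage example are documentation placeholders.

```python
def parse_proxy_v1(line):
    """Parse a PROXY protocol v1 header line into its fields."""
    if not line.endswith(b"\r\n"):
        raise ValueError("header must end with CRLF")
    parts = line[:-2].decode("ascii").split(" ")
    if len(parts) < 2 or parts[0] != "PROXY":
        raise ValueError("not a PROXY header")
    if parts[1] == "UNKNOWN":
        return {"proto": "UNKNOWN"}
    if parts[1] not in ("TCP4", "TCP6") or len(parts) != 6:
        raise ValueError("malformed PROXY header")
    proto, src, dst, sport, dport = parts[1:]
    return {"proto": proto, "src": src, "dst": dst,
            "src_port": int(sport), "dst_port": int(dport)}

def forwarded_for(existing, client_ip):
    """Append the client IP to an X-Forwarded-For value, if there is one."""
    return f"{existing}, {client_ip}" if existing else client_ip

info = parse_proxy_v1(b"PROXY TCP4 192.0.2.10 192.0.2.1 56324 443\r\n")
print(forwarded_for("", info["src"]))
```

The source address recovered from the header is what ends up in X-Forwarded-For, so the application sees the real client IP even though the TCP connection comes from stud on localhost.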
  • 44. REST API • Not an official feature (yet) [6]. • You can easily communicate with the server via HTTP. • Awesome for orchestrating your web tier.
  • 46. Related links
      [1] http://software.intel.com/en-us/articles/improved-linux-smp-scaling-user-directed-processor-affinity/
      [2] http://en.wikipedia.org/wiki/Equal-cost_multi-path_routing
      [3] stud repo: https://github.com/bumptech/stud
      [4] Scaling SSL: http://blog.exceliance.fr/2011/11/07/scaling-out-ssl/
      [5] PROXY protocol: http://haproxy.1wt.eu/download/1.5/doc/proxy-protocol.txt
      [6] REST API patch: https://github.com/jbuchbinder/haproxy-forked
      HAProxy configuration doc: http://haproxy.1wt.eu/download/1.5/doc/configuration.txt