SlideShare ist ein Scribd-Unternehmen logo
1 von 58
Downloaden Sie, um offline zu lesen
Designing for Scale
    Knut Nesheim @knutin
   Paolo Negri @hungryblank
About this talk

2 developers and erlang
           vs.
  1 million daily users
Social Games
Flash client (game)   HTTP API
Social Games
Flash client


               • Game actions need to be
                 persisted and validated

               • 1 API call every 2 secs
Social Games
                            HTTP API



• @ 1 000 000 daily users
• 5000 HTTP reqs/sec
• more than 90% writes
The hard nut


http://www.flickr.com/photos/mukluk/315409445/
Users we expect
                                        DAU

                       1000000



                        750000
  “Monster World”
       daily users
july - december 2010    500000



                        250000



                             0
                                 July         December
Users we have
                                  DAU



   New game
   daily users
march - june 2011



                    50
                     0
                         march april    may   june
What to do?


1 Simulate users
Simulating users

• Must not be too synthetic (like
  apachebench)
• Must look like a meaningful game session
• Users must come online at a given rate and
  play
Tsung


         •    Multi protocol (HTTP, XMPP) benchmarking tool

         •    Able to test non trivial call sequences

         •    Can actually simulate a scripted gaming session




http://tsung.erlang-projects.org/
Tsung - configuration
       Fixed content                    Dynamic parameter
       <request subst="true">
       <http url="http://server.wooga.com/users/%
       %ts_user_server:get_unique_id%%/resources/column/5/
       row/14?%%_routing_key%%"
       method="POST" contents='{"parameter1":"value1"}'>
       </http>
       </request>



http://tsung.erlang-projects.org/
Tsung - configuration
         • Not something you fancy writing
         • We’re in development, calls change and we
              constantly add new calls
         • A session might contain hundreds of
              requests
         • All the calls must refer to a consistent game
              state

http://tsung.erlang-projects.org/
Tsung - configuration
         • From our ruby test code
         user.resources(:column => 5, :row => 14)

         • Same as
          <request subst="true">
          <http url="http://server.wooga.com/users/%
          %ts_user_server:get_unique_id%%/resources/column/5/
          row/14?%%_routing_key%%"
          method="POST" contents='{"parameter1":"value1"}'>
          </http>
          </request>
http://tsung.erlang-projects.org/
Tsung - configuration

         • Session                    A session is a
          • requests                group of requests

         • Arrival phase            Sessions arrive in
          • duration                  phases with a
                                     specific arrival
          • arrival rate                  rate


http://tsung.erlang-projects.org/
Tsung - setup
            Application                         Benchmarking
                                                   cluster
             app server                             tsung
                                    HTTP reqs      worker
                                                        ssh
             app server
                                                    tsung
                                                   master

             app server

http://tsung.erlang-projects.org/
Tsung

         • Generates ~ 2500 reqs/sec on AWS
              m1.large
         • Flexible but hard to extend
         • Code base rather obscure

http://tsung.erlang-projects.org/
What to do?


2 Collect metrics
Tsung-metrics

         • Tsung collects measures and provides
              reports
         • But these measure include tsung network/
              cpu congestion itself
         • Tsung machines aren’t a good point of view

http://tsung.erlang-projects.org/
HAproxy
Application                         Benchmarking
                                       cluster
app server                              tsung
                        HTTP reqs      worker
              haproxy                       ssh
app server
                                        tsung
                                       master

app server
HAproxy

  “The Reliable, High Performance TCP/
  HTTP Load Balancer”
• Placed in front of http servers
• Load balancing
• Fail over
HAproxy - syslog


• Easy to setup
• Efficient (UDP)
• Provides 5 timings per each request
HAproxy
  • Time to receive request from client
Application                        Benchmarking
                                      cluster
app server                                 tsung
                   haproxy                worker
                                               ssh
app server
                                           tsung
                                          master
HAproxy
    • Time spent in HAproxy queue
Application                     Benchmarking
                                   cluster
app server                           tsung
                 haproxy            worker
                                         ssh
app server
                                     tsung
                                    master
HAproxy
    • Time to connect to the server
Application                       Benchmarking
                                     cluster
app server                             tsung
                  haproxy             worker
                                           ssh
app server
                                       tsung
                                      master
HAproxy
• Time to receive response headers from server
Application                        Benchmarking
                                      cluster
 app server                             tsung
                   haproxy             worker
                                            ssh
 app server
                                        tsung
                                       master
HAproxy
• Total session duration time
Application                     Benchmarking
                                   cluster
 app server                         tsung
                    haproxy        worker
                                        ssh
 app server
                                    tsung
                                   master
HAproxy - syslog

• Application urls identify directly server call
• Application urls are easy to parse
• Processing haproxy syslog gives per call
  metric
What to do?


3 Understand metrics
Reading/aggregating
       metrics

• Python to parse/normalize syslog
• R language to analyze/visualize data
• R language console to interactively explore
  benchmarking results
R is a free software environment for
 statistical computing and graphics.
What you get

• Aggregate performance levels (throughput,
  latency)
• Detailed performance per call type
• Statistical analysis (outliers, trends,
  regression, correlation, frequency, standard
  deviation)
What you get
What to do?


4 go deeper
Digging into the data

• From HAproxy log analisys one call
  emerged as exceptionally slow
• Using eprof we were able to determine
  that most of the time was spent in a redis
  query fetching many keys (MGET)
Tracing erldis query
• More than 60% of runtime is spent
  manipulating the socket
• gen_tcp:recv/2 is the culprit
• But why is it called so many times?
Understanding the
     redis protocol
C: LRANGE mylist 0 2
                       <<"*2rn
s: *2                     $5rn
s: $5                     Hellorn
                          $5rn
s: Hello                  Worldrn">>
s: $5
s: World
Understanding erldis
• recv_value/2 is used in the protocol parser
  to get the next data to parse
A different approach
• Two ways to use gen_tcp: active or passive
• In passive, use gen_tcp:recv to explicitly ask
  for data, blocking
• In active, gen_tcp will send the controlling
  process a message when there is data
• Hybrid: active once
A different approach

• Is active sockets faster?
• Proof-of-concept proved active socket
  faster
• Change erldis or write a new driver?
A different approach

• Radical change => new driver
• Keep Erldis queuing approach
• Think about error handling from the start
• Use active sockets
A different approach
• Active socket, parse partial replies
Circuit breaker
• eredis has a simple circuit breaker for when
  Redis is down/unreachable
• eredis returns immediately to clients if
  connection is down
• Reconnecting is done outside request/
  response handling
• Robust handling of errors
Benchmarking eredis

• Redis driver critical for our application
• Must perform well
• Must be stable
• How do we test this?
Basho bench

• Basho produces the Riak KV store
• Basho build a tool to test KV servers
• Basho bench
• We used Basho bench to test eredis
Basho bench
• Create callback module
Basho bench
• Configuration term-file
Basho bench output
eredis is open source




https://github.com/wooga/eredis
What to do?


5 measure internals
Measure internals

HAproxy point of view is valid but how to
measure internals of our application, while
we are live, without the overhead of
tracing?
Think Basho bench

• Basho bench can benchmark a redis driver
• Redis is very fast, 100K ops/sec
• Basho bench overhead is acceptable
• The code is very simple
Cherry pick ideas from
    Basho Bench
• Creates a histogram of timings on the fly,
  reducing the number of data points
• Dumps to disk every N seconds
• Allows statistical tools to work on already
  aggregated data
• Near real-time, from event to stats in N+5
  seconds
Homegrown stats
• Measures latency from the edges of our
  system (excludes HTTP handling)
• And at interesting points inside the system
• Statistical analysis using R
• Correlate with HAproxy data
• Produces graphs and data specific to our
  application
Homegrown stats
Recap

  Measure:
• From an external point of view (HAproxy)
• At the edge of the system (excluding
  HTTP handling)
• Internals in the single process (eprof)
Recap
  Analyze:
• Aggregated measures
• Statistical properties of measures
 • standard deviation
 • distribution
 • trends
Thanks!

http://www.wooga.com/jobs

knut.nesheim@wooga.com       @knutin
paolo.negri@wooga.com    @hungryblank

Weitere ähnliche Inhalte

Was ist angesagt?

Python Raster Function - Esri Developer Conference - 2015
Python Raster Function - Esri Developer Conference - 2015Python Raster Function - Esri Developer Conference - 2015
Python Raster Function - Esri Developer Conference - 2015akferoz07
 
Introduction to the intermediate Python - v1.1
Introduction to the intermediate Python - v1.1Introduction to the intermediate Python - v1.1
Introduction to the intermediate Python - v1.1Andrei KUCHARAVY
 
Kubernetes at NU.nl (Kubernetes meetup 2019-09-05)
Kubernetes at NU.nl   (Kubernetes meetup 2019-09-05)Kubernetes at NU.nl   (Kubernetes meetup 2019-09-05)
Kubernetes at NU.nl (Kubernetes meetup 2019-09-05)Tibo Beijen
 
Devops at Netflix (re:Invent)
Devops at Netflix (re:Invent)Devops at Netflix (re:Invent)
Devops at Netflix (re:Invent)Jeremy Edberg
 
Using Simplicity to Make Hard Big Data Problems Easy
Using Simplicity to Make Hard Big Data Problems EasyUsing Simplicity to Make Hard Big Data Problems Easy
Using Simplicity to Make Hard Big Data Problems Easynathanmarz
 
Livy: A REST Web Service For Apache Spark
Livy: A REST Web Service For Apache SparkLivy: A REST Web Service For Apache Spark
Livy: A REST Web Service For Apache SparkJen Aman
 
Cloud Connected Devices on a Global Scale (CPN303) | AWS re:Invent 2013
Cloud Connected Devices on a Global Scale (CPN303) | AWS re:Invent 2013Cloud Connected Devices on a Global Scale (CPN303) | AWS re:Invent 2013
Cloud Connected Devices on a Global Scale (CPN303) | AWS re:Invent 2013Amazon Web Services
 
Akka.net versus microsoft orleans
Akka.net versus microsoft orleansAkka.net versus microsoft orleans
Akka.net versus microsoft orleansBill Tulloch
 
Benchmarking at Parse
Benchmarking at ParseBenchmarking at Parse
Benchmarking at ParseTravis Redman
 
Webinar: Queues with RabbitMQ - Lorna Mitchell
Webinar: Queues with RabbitMQ - Lorna MitchellWebinar: Queues with RabbitMQ - Lorna Mitchell
Webinar: Queues with RabbitMQ - Lorna MitchellCodemotion
 
Storm presentation
Storm presentationStorm presentation
Storm presentationShyam Raj
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudyJohn Adams
 
Parallel and Asynchronous Programming - ITProDevConnections 2012 (Greek)
Parallel and Asynchronous Programming -  ITProDevConnections 2012 (Greek)Parallel and Asynchronous Programming -  ITProDevConnections 2012 (Greek)
Parallel and Asynchronous Programming - ITProDevConnections 2012 (Greek)Panagiotis Kanavos
 
CQRS Evolved - CQRS + Akka.NET
CQRS Evolved - CQRS + Akka.NETCQRS Evolved - CQRS + Akka.NET
CQRS Evolved - CQRS + Akka.NETDavid Hoerster
 

Was ist angesagt? (14)

Python Raster Function - Esri Developer Conference - 2015
Python Raster Function - Esri Developer Conference - 2015Python Raster Function - Esri Developer Conference - 2015
Python Raster Function - Esri Developer Conference - 2015
 
Introduction to the intermediate Python - v1.1
Introduction to the intermediate Python - v1.1Introduction to the intermediate Python - v1.1
Introduction to the intermediate Python - v1.1
 
Kubernetes at NU.nl (Kubernetes meetup 2019-09-05)
Kubernetes at NU.nl   (Kubernetes meetup 2019-09-05)Kubernetes at NU.nl   (Kubernetes meetup 2019-09-05)
Kubernetes at NU.nl (Kubernetes meetup 2019-09-05)
 
Devops at Netflix (re:Invent)
Devops at Netflix (re:Invent)Devops at Netflix (re:Invent)
Devops at Netflix (re:Invent)
 
Using Simplicity to Make Hard Big Data Problems Easy
Using Simplicity to Make Hard Big Data Problems EasyUsing Simplicity to Make Hard Big Data Problems Easy
Using Simplicity to Make Hard Big Data Problems Easy
 
Livy: A REST Web Service For Apache Spark
Livy: A REST Web Service For Apache SparkLivy: A REST Web Service For Apache Spark
Livy: A REST Web Service For Apache Spark
 
Cloud Connected Devices on a Global Scale (CPN303) | AWS re:Invent 2013
Cloud Connected Devices on a Global Scale (CPN303) | AWS re:Invent 2013Cloud Connected Devices on a Global Scale (CPN303) | AWS re:Invent 2013
Cloud Connected Devices on a Global Scale (CPN303) | AWS re:Invent 2013
 
Akka.net versus microsoft orleans
Akka.net versus microsoft orleansAkka.net versus microsoft orleans
Akka.net versus microsoft orleans
 
Benchmarking at Parse
Benchmarking at ParseBenchmarking at Parse
Benchmarking at Parse
 
Webinar: Queues with RabbitMQ - Lorna Mitchell
Webinar: Queues with RabbitMQ - Lorna MitchellWebinar: Queues with RabbitMQ - Lorna Mitchell
Webinar: Queues with RabbitMQ - Lorna Mitchell
 
Storm presentation
Storm presentationStorm presentation
Storm presentation
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudy
 
Parallel and Asynchronous Programming - ITProDevConnections 2012 (Greek)
Parallel and Asynchronous Programming -  ITProDevConnections 2012 (Greek)Parallel and Asynchronous Programming -  ITProDevConnections 2012 (Greek)
Parallel and Asynchronous Programming - ITProDevConnections 2012 (Greek)
 
CQRS Evolved - CQRS + Akka.NET
CQRS Evolved - CQRS + Akka.NETCQRS Evolved - CQRS + Akka.NET
CQRS Evolved - CQRS + Akka.NET
 

Andere mochten auch

NoSQL Games_NoSQL Roadshow Berlin
NoSQL Games_NoSQL Roadshow BerlinNoSQL Games_NoSQL Roadshow Berlin
NoSQL Games_NoSQL Roadshow BerlinWooga
 
Erlang as a Cloud Citizen
Erlang as a Cloud CitizenErlang as a Cloud Citizen
Erlang as a Cloud CitizenWooga
 
Getting the Most our of your Tools_FrontEnd DevConf2013_Minsk
Getting the Most our of your Tools_FrontEnd DevConf2013_MinskGetting the Most our of your Tools_FrontEnd DevConf2013_Minsk
Getting the Most our of your Tools_FrontEnd DevConf2013_MinskWooga
 
When Devs Do Ops
When Devs Do OpsWhen Devs Do Ops
When Devs Do OpsWooga
 
Stateful Application Server_JRubyConf13_Lukas Rieder
Stateful Application Server_JRubyConf13_Lukas RiederStateful Application Server_JRubyConf13_Lukas Rieder
Stateful Application Server_JRubyConf13_Lukas RiederWooga
 
JRubyConf2013_Tim Lossen_All your core
JRubyConf2013_Tim Lossen_All your coreJRubyConf2013_Tim Lossen_All your core
JRubyConf2013_Tim Lossen_All your coreWooga
 
Games for the Masses: Scaling Rails to the Extreme
Games for the Masses: Scaling Rails to the ExtremeGames for the Masses: Scaling Rails to the Extreme
Games for the Masses: Scaling Rails to the ExtremeWooga
 
Metrics. Driven. Design. (Developer Conference Hamburg 2012)
Metrics. Driven. Design. (Developer Conference Hamburg 2012)Metrics. Driven. Design. (Developer Conference Hamburg 2012)
Metrics. Driven. Design. (Developer Conference Hamburg 2012)Wooga
 
How to scale a company - game teams at Wooga
How to scale a company - game teams at WoogaHow to scale a company - game teams at Wooga
How to scale a company - game teams at WoogaWooga
 
Event Stream Processing with Kafka (Berlin Buzzwords 2012)
Event Stream Processing with Kafka (Berlin Buzzwords 2012)Event Stream Processing with Kafka (Berlin Buzzwords 2012)
Event Stream Processing with Kafka (Berlin Buzzwords 2012)Wooga
 
2013 04-29-evolution of backend
2013 04-29-evolution of backend2013 04-29-evolution of backend
2013 04-29-evolution of backendWooga
 
Stateful_Application_Server_RuPy 2012_Brno
Stateful_Application_Server_RuPy 2012_BrnoStateful_Application_Server_RuPy 2012_Brno
Stateful_Application_Server_RuPy 2012_BrnoWooga
 
You are not alone - Scaling multiplayer games
You are not alone - Scaling multiplayer gamesYou are not alone - Scaling multiplayer games
You are not alone - Scaling multiplayer gamesWooga
 
More than syntax
More than syntaxMore than syntax
More than syntaxWooga
 
Continuous Integration for iOS (iOS User Group Berlin)
Continuous Integration for iOS (iOS User Group Berlin)Continuous Integration for iOS (iOS User Group Berlin)
Continuous Integration for iOS (iOS User Group Berlin)Wooga
 
Painful success - lessons learned while scaling up
Painful success - lessons learned while scaling upPainful success - lessons learned while scaling up
Painful success - lessons learned while scaling upWooga
 
NoSQL Games
NoSQL GamesNoSQL Games
NoSQL GamesWooga
 
Wooga: Internationality meets Agility @Zutaten 2013
Wooga: Internationality meets Agility @Zutaten 2013Wooga: Internationality meets Agility @Zutaten 2013
Wooga: Internationality meets Agility @Zutaten 2013Wooga
 
Monitoring with Syslog and EventMachine
Monitoring with Syslog and EventMachineMonitoring with Syslog and EventMachine
Monitoring with Syslog and EventMachineWooga
 
Riak at Wooga_Riak Meetup Sept 2013
Riak at Wooga_Riak Meetup Sept 2013Riak at Wooga_Riak Meetup Sept 2013
Riak at Wooga_Riak Meetup Sept 2013Wooga
 

Andere mochten auch (20)

NoSQL Games_NoSQL Roadshow Berlin
NoSQL Games_NoSQL Roadshow BerlinNoSQL Games_NoSQL Roadshow Berlin
NoSQL Games_NoSQL Roadshow Berlin
 
Erlang as a Cloud Citizen
Erlang as a Cloud CitizenErlang as a Cloud Citizen
Erlang as a Cloud Citizen
 
Getting the Most our of your Tools_FrontEnd DevConf2013_Minsk
Getting the Most our of your Tools_FrontEnd DevConf2013_MinskGetting the Most our of your Tools_FrontEnd DevConf2013_Minsk
Getting the Most our of your Tools_FrontEnd DevConf2013_Minsk
 
When Devs Do Ops
When Devs Do OpsWhen Devs Do Ops
When Devs Do Ops
 
Stateful Application Server_JRubyConf13_Lukas Rieder
Stateful Application Server_JRubyConf13_Lukas RiederStateful Application Server_JRubyConf13_Lukas Rieder
Stateful Application Server_JRubyConf13_Lukas Rieder
 
JRubyConf2013_Tim Lossen_All your core
JRubyConf2013_Tim Lossen_All your coreJRubyConf2013_Tim Lossen_All your core
JRubyConf2013_Tim Lossen_All your core
 
Games for the Masses: Scaling Rails to the Extreme
Games for the Masses: Scaling Rails to the ExtremeGames for the Masses: Scaling Rails to the Extreme
Games for the Masses: Scaling Rails to the Extreme
 
Metrics. Driven. Design. (Developer Conference Hamburg 2012)
Metrics. Driven. Design. (Developer Conference Hamburg 2012)Metrics. Driven. Design. (Developer Conference Hamburg 2012)
Metrics. Driven. Design. (Developer Conference Hamburg 2012)
 
How to scale a company - game teams at Wooga
How to scale a company - game teams at WoogaHow to scale a company - game teams at Wooga
How to scale a company - game teams at Wooga
 
Event Stream Processing with Kafka (Berlin Buzzwords 2012)
Event Stream Processing with Kafka (Berlin Buzzwords 2012)Event Stream Processing with Kafka (Berlin Buzzwords 2012)
Event Stream Processing with Kafka (Berlin Buzzwords 2012)
 
2013 04-29-evolution of backend
2013 04-29-evolution of backend2013 04-29-evolution of backend
2013 04-29-evolution of backend
 
Stateful_Application_Server_RuPy 2012_Brno
Stateful_Application_Server_RuPy 2012_BrnoStateful_Application_Server_RuPy 2012_Brno
Stateful_Application_Server_RuPy 2012_Brno
 
You are not alone - Scaling multiplayer games
You are not alone - Scaling multiplayer gamesYou are not alone - Scaling multiplayer games
You are not alone - Scaling multiplayer games
 
More than syntax
More than syntaxMore than syntax
More than syntax
 
Continuous Integration for iOS (iOS User Group Berlin)
Continuous Integration for iOS (iOS User Group Berlin)Continuous Integration for iOS (iOS User Group Berlin)
Continuous Integration for iOS (iOS User Group Berlin)
 
Painful success - lessons learned while scaling up
Painful success - lessons learned while scaling upPainful success - lessons learned while scaling up
Painful success - lessons learned while scaling up
 
NoSQL Games
NoSQL GamesNoSQL Games
NoSQL Games
 
Wooga: Internationality meets Agility @Zutaten 2013
Wooga: Internationality meets Agility @Zutaten 2013Wooga: Internationality meets Agility @Zutaten 2013
Wooga: Internationality meets Agility @Zutaten 2013
 
Monitoring with Syslog and EventMachine
Monitoring with Syslog and EventMachineMonitoring with Syslog and EventMachine
Monitoring with Syslog and EventMachine
 
Riak at Wooga_Riak Meetup Sept 2013
Riak at Wooga_Riak Meetup Sept 2013Riak at Wooga_Riak Meetup Sept 2013
Riak at Wooga_Riak Meetup Sept 2013
 

Ähnlich wie Designing for Scale

Distributed app development with nodejs and zeromq
Distributed app development with nodejs and zeromqDistributed app development with nodejs and zeromq
Distributed app development with nodejs and zeromqRuben Tan
 
Introduction to Apache NiFi And Storm
Introduction to Apache NiFi And StormIntroduction to Apache NiFi And Storm
Introduction to Apache NiFi And StormJungtaek Lim
 
Deploying Apache Flume to enable low-latency analytics
Deploying Apache Flume to enable low-latency analyticsDeploying Apache Flume to enable low-latency analytics
Deploying Apache Flume to enable low-latency analyticsDataWorks Summit
 
How to build a Neutron Plugin (stadium edition)
How to build a Neutron Plugin (stadium edition)How to build a Neutron Plugin (stadium edition)
How to build a Neutron Plugin (stadium edition)Salvatore Orlando
 
How to write a Neutron plugin (stadium edition)
How to write a Neutron plugin (stadium edition)How to write a Neutron plugin (stadium edition)
How to write a Neutron plugin (stadium edition)salv_orlando
 
3.2 Streaming and Messaging
3.2 Streaming and Messaging3.2 Streaming and Messaging
3.2 Streaming and Messaging振东 刘
 
Debugging Microservices - key challenges and techniques - Microservices Odesa...
Debugging Microservices - key challenges and techniques - Microservices Odesa...Debugging Microservices - key challenges and techniques - Microservices Odesa...
Debugging Microservices - key challenges and techniques - Microservices Odesa...Lohika_Odessa_TechTalks
 
Tech talk microservices debugging
Tech talk microservices debuggingTech talk microservices debugging
Tech talk microservices debuggingAndrey Kolodnitsky
 
Push jobs: an orchestration building block for private Chef
Push jobs: an orchestration building block for private ChefPush jobs: an orchestration building block for private Chef
Push jobs: an orchestration building block for private ChefChef Software, Inc.
 

Ähnlich wie Designing for Scale (20)

Distributed app development with nodejs and zeromq
Distributed app development with nodejs and zeromqDistributed app development with nodejs and zeromq
Distributed app development with nodejs and zeromq
 
Scalable Web Apps
Scalable Web AppsScalable Web Apps
Scalable Web Apps
 
Api crash
Api crashApi crash
Api crash
 
Api crash
Api crashApi crash
Api crash
 
Api crash
Api crashApi crash
Api crash
 
Api crash
Api crashApi crash
Api crash
 
Api crash
Api crashApi crash
Api crash
 
Api crash
Api crashApi crash
Api crash
 
Api crash
Api crashApi crash
Api crash
 
REST APIs
REST APIsREST APIs
REST APIs
 
Introduction to Apache NiFi And Storm
Introduction to Apache NiFi And StormIntroduction to Apache NiFi And Storm
Introduction to Apache NiFi And Storm
 
Deploying Apache Flume to enable low-latency analytics
Deploying Apache Flume to enable low-latency analyticsDeploying Apache Flume to enable low-latency analytics
Deploying Apache Flume to enable low-latency analytics
 
How to build a Neutron Plugin (stadium edition)
How to build a Neutron Plugin (stadium edition)How to build a Neutron Plugin (stadium edition)
How to build a Neutron Plugin (stadium edition)
 
How to write a Neutron plugin (stadium edition)
How to write a Neutron plugin (stadium edition)How to write a Neutron plugin (stadium edition)
How to write a Neutron plugin (stadium edition)
 
3.2 Streaming and Messaging
3.2 Streaming and Messaging3.2 Streaming and Messaging
3.2 Streaming and Messaging
 
slides (PPT)
slides (PPT)slides (PPT)
slides (PPT)
 
webservers
webserverswebservers
webservers
 
Debugging Microservices - key challenges and techniques - Microservices Odesa...
Debugging Microservices - key challenges and techniques - Microservices Odesa...Debugging Microservices - key challenges and techniques - Microservices Odesa...
Debugging Microservices - key challenges and techniques - Microservices Odesa...
 
Tech talk microservices debugging
Tech talk microservices debuggingTech talk microservices debugging
Tech talk microservices debugging
 
Push jobs: an orchestration building block for private Chef
Push jobs: an orchestration building block for private ChefPush jobs: an orchestration building block for private Chef
Push jobs: an orchestration building block for private Chef
 

Mehr von Wooga

Story of Warlords: Bringing a turn-based strategy game to mobile
Story of Warlords: Bringing a turn-based strategy game to mobile Story of Warlords: Bringing a turn-based strategy game to mobile
Story of Warlords: Bringing a turn-based strategy game to mobile Wooga
 
Instagram Celebrities: are they the new cats? - Targetsummit Berlin 2015
Instagram Celebrities: are they the new cats? - Targetsummit Berlin 2015Instagram Celebrities: are they the new cats? - Targetsummit Berlin 2015
Instagram Celebrities: are they the new cats? - Targetsummit Berlin 2015Wooga
 
In it for the long haul - How Wooga boosts long-term retention
In it for the long haul - How Wooga boosts long-term retentionIn it for the long haul - How Wooga boosts long-term retention
In it for the long haul - How Wooga boosts long-term retentionWooga
 
Leveling up in localization! - Susan Alma & Dario Quondamstefano
Leveling up in localization! - Susan Alma & Dario QuondamstefanoLeveling up in localization! - Susan Alma & Dario Quondamstefano
Leveling up in localization! - Susan Alma & Dario QuondamstefanoWooga
 
Evoloution of Ideas
Evoloution of IdeasEvoloution of Ideas
Evoloution of IdeasWooga
 
Entitas System Architecture with Unity - Maxim Zaks and Simon Schmid
Entitas System Architecture with Unity - Maxim Zaks and Simon Schmid Entitas System Architecture with Unity - Maxim Zaks and Simon Schmid
Entitas System Architecture with Unity - Maxim Zaks and Simon Schmid Wooga
 
Saying No to the CEO: A Deep Look at Independent Teams - Adam Telfer
Saying No to the CEO: A Deep Look at Independent Teams - Adam TelferSaying No to the CEO: A Deep Look at Independent Teams - Adam Telfer
Saying No to the CEO: A Deep Look at Independent Teams - Adam TelferWooga
 
Innovation dank DevOps (DevOpsCon Berlin 2015)
Innovation dank DevOps (DevOpsCon Berlin 2015)Innovation dank DevOps (DevOpsCon Berlin 2015)
Innovation dank DevOps (DevOpsCon Berlin 2015)Wooga
 
Big Fish, small pond - strategies for surviving in a maturing market - Ed Biden
Big Fish, small pond - strategies for surviving in a maturing market - Ed BidenBig Fish, small pond - strategies for surviving in a maturing market - Ed Biden
Big Fish, small pond - strategies for surviving in a maturing market - Ed BidenWooga
 
Review mining aps2014 berlin
Review mining aps2014 berlinReview mining aps2014 berlin
Review mining aps2014 berlinWooga
 
Riak & Wooga_Geeek2Geeek Meetup2014 Berlin
Riak & Wooga_Geeek2Geeek Meetup2014 BerlinRiak & Wooga_Geeek2Geeek Meetup2014 Berlin
Riak & Wooga_Geeek2Geeek Meetup2014 BerlinWooga
 
Staying in the Game: Game localization practices for the mobile market
Staying in the Game: Game localization practices for the mobile marketStaying in the Game: Game localization practices for the mobile market
Staying in the Game: Game localization practices for the mobile marketWooga
 
Startup Weekend_Makers and Games_Philipp Stelzer
Startup Weekend_Makers and Games_Philipp StelzerStartup Weekend_Makers and Games_Philipp Stelzer
Startup Weekend_Makers and Games_Philipp StelzerWooga
 
DevOps goes Mobile (daho.am)
DevOps goes Mobile (daho.am)DevOps goes Mobile (daho.am)
DevOps goes Mobile (daho.am)Wooga
 
DevOps goes Mobile - Jax 2014 - Jesper Richter-Reichhelm
DevOps goes Mobile - Jax 2014 - Jesper Richter-ReichhelmDevOps goes Mobile - Jax 2014 - Jesper Richter-Reichhelm
DevOps goes Mobile - Jax 2014 - Jesper Richter-ReichhelmWooga
 
CodeFest 2014_Mobile Game Development
CodeFest 2014_Mobile Game DevelopmentCodeFest 2014_Mobile Game Development
CodeFest 2014_Mobile Game DevelopmentWooga
 
Jelly Splash: Puzzling your way to the top of the App Stores - GDC 2014
Jelly Splash: Puzzling your way to the top of the App Stores - GDC 2014Jelly Splash: Puzzling your way to the top of the App Stores - GDC 2014
Jelly Splash: Puzzling your way to the top of the App Stores - GDC 2014Wooga
 
How to hire the best people for your startup-Gitta Blat-Head of People
How to hire the best people for your startup-Gitta Blat-Head of PeopleHow to hire the best people for your startup-Gitta Blat-Head of People
How to hire the best people for your startup-Gitta Blat-Head of PeopleWooga
 
Two Ann(e)s and one Julia_Wooga Lady Power from Berlin_SGA2014
Two Ann(e)s and one Julia_Wooga Lady Power from Berlin_SGA2014Two Ann(e)s and one Julia_Wooga Lady Power from Berlin_SGA2014
Two Ann(e)s and one Julia_Wooga Lady Power from Berlin_SGA2014Wooga
 
Pocket Gamer Connects 2014_The Experience of Entering the Korean Market
Pocket Gamer Connects 2014_The Experience of Entering the Korean MarketPocket Gamer Connects 2014_The Experience of Entering the Korean Market
Pocket Gamer Connects 2014_The Experience of Entering the Korean MarketWooga
 

Mehr von Wooga (20)

Story of Warlords: Bringing a turn-based strategy game to mobile
Story of Warlords: Bringing a turn-based strategy game to mobile Story of Warlords: Bringing a turn-based strategy game to mobile
Story of Warlords: Bringing a turn-based strategy game to mobile
 
Instagram Celebrities: are they the new cats? - Targetsummit Berlin 2015
Instagram Celebrities: are they the new cats? - Targetsummit Berlin 2015Instagram Celebrities: are they the new cats? - Targetsummit Berlin 2015
Instagram Celebrities: are they the new cats? - Targetsummit Berlin 2015
 
In it for the long haul - How Wooga boosts long-term retention
In it for the long haul - How Wooga boosts long-term retentionIn it for the long haul - How Wooga boosts long-term retention
In it for the long haul - How Wooga boosts long-term retention
 
Leveling up in localization! - Susan Alma & Dario Quondamstefano
Leveling up in localization! - Susan Alma & Dario QuondamstefanoLeveling up in localization! - Susan Alma & Dario Quondamstefano
Leveling up in localization! - Susan Alma & Dario Quondamstefano
 
Evoloution of Ideas
Evoloution of IdeasEvoloution of Ideas
Evoloution of Ideas
 
Entitas System Architecture with Unity - Maxim Zaks and Simon Schmid
Entitas System Architecture with Unity - Maxim Zaks and Simon Schmid Entitas System Architecture with Unity - Maxim Zaks and Simon Schmid
Entitas System Architecture with Unity - Maxim Zaks and Simon Schmid
 
Saying No to the CEO: A Deep Look at Independent Teams - Adam Telfer
Saying No to the CEO: A Deep Look at Independent Teams - Adam TelferSaying No to the CEO: A Deep Look at Independent Teams - Adam Telfer
Saying No to the CEO: A Deep Look at Independent Teams - Adam Telfer
 
Innovation dank DevOps (DevOpsCon Berlin 2015)
Innovation dank DevOps (DevOpsCon Berlin 2015)Innovation dank DevOps (DevOpsCon Berlin 2015)
Innovation dank DevOps (DevOpsCon Berlin 2015)
 
Big Fish, small pond - strategies for surviving in a maturing market - Ed Biden
Big Fish, small pond - strategies for surviving in a maturing market - Ed BidenBig Fish, small pond - strategies for surviving in a maturing market - Ed Biden
Big Fish, small pond - strategies for surviving in a maturing market - Ed Biden
 
Review mining aps2014 berlin
Review mining aps2014 berlinReview mining aps2014 berlin
Review mining aps2014 berlin
 
Riak & Wooga_Geeek2Geeek Meetup2014 Berlin
Riak & Wooga_Geeek2Geeek Meetup2014 BerlinRiak & Wooga_Geeek2Geeek Meetup2014 Berlin
Riak & Wooga_Geeek2Geeek Meetup2014 Berlin
 
Staying in the Game: Game localization practices for the mobile market
Staying in the Game: Game localization practices for the mobile marketStaying in the Game: Game localization practices for the mobile market
Staying in the Game: Game localization practices for the mobile market
 
Startup Weekend_Makers and Games_Philipp Stelzer
Startup Weekend_Makers and Games_Philipp StelzerStartup Weekend_Makers and Games_Philipp Stelzer
Startup Weekend_Makers and Games_Philipp Stelzer
 
DevOps goes Mobile (daho.am)
DevOps goes Mobile (daho.am)DevOps goes Mobile (daho.am)
DevOps goes Mobile (daho.am)
 
DevOps goes Mobile - Jax 2014 - Jesper Richter-Reichhelm
DevOps goes Mobile - Jax 2014 - Jesper Richter-ReichhelmDevOps goes Mobile - Jax 2014 - Jesper Richter-Reichhelm
DevOps goes Mobile - Jax 2014 - Jesper Richter-Reichhelm
 
CodeFest 2014_Mobile Game Development
CodeFest 2014_Mobile Game DevelopmentCodeFest 2014_Mobile Game Development
CodeFest 2014_Mobile Game Development
 
Jelly Splash: Puzzling your way to the top of the App Stores - GDC 2014
Jelly Splash: Puzzling your way to the top of the App Stores - GDC 2014Jelly Splash: Puzzling your way to the top of the App Stores - GDC 2014
Jelly Splash: Puzzling your way to the top of the App Stores - GDC 2014
 
How to hire the best people for your startup-Gitta Blat-Head of People
How to hire the best people for your startup-Gitta Blat-Head of PeopleHow to hire the best people for your startup-Gitta Blat-Head of People
How to hire the best people for your startup-Gitta Blat-Head of People
 
Two Ann(e)s and one Julia_Wooga Lady Power from Berlin_SGA2014
Two Ann(e)s and one Julia_Wooga Lady Power from Berlin_SGA2014Two Ann(e)s and one Julia_Wooga Lady Power from Berlin_SGA2014
Two Ann(e)s and one Julia_Wooga Lady Power from Berlin_SGA2014
 
Pocket Gamer Connects 2014_The Experience of Entering the Korean Market
Pocket Gamer Connects 2014_The Experience of Entering the Korean MarketPocket Gamer Connects 2014_The Experience of Entering the Korean Market
Pocket Gamer Connects 2014_The Experience of Entering the Korean Market
 

Kürzlich hochgeladen

"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 

Kürzlich hochgeladen (20)

"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 

Designing for Scale

  • 1. Designing for Scale Knut Nesheim @knutin Paolo Negri @hungryblank
  • 2. About this talk 2 developers and erlang vs. 1 million daily users
  • 3. Social Games Flash client (game) HTTP API
  • 4. Social Games Flash client • Game actions need to be persisted and validated • 1 API call every 2 secs
  • 5. Social Games HTTP API • @ 1 000 000 daily users • 5000 HTTP reqs/sec • more than 90% writes
  • 7. Users we expect DAU 1000000 750000 “Monster World” daily users july - december 2010 500000 250000 0 July December
  • 8. Users we have DAU New game daily users march - june 2011 50 0 march april may june
  • 9. What to do? 1 Simulate users
  • 10. Simulating users • Must not be too synthetic (like apachebench) • Must look like a meaningful game session • Users must come online at a given rate and play
  • 11. Tsung • Multi protocol (HTTP, XMPP) benchmarking tool • Able to test non trivial call sequences • Can actually simulate a scripted gaming session http://tsung.erlang-projects.org/
  • 12. Tsung - configuration Fixed content Dynamic parameter <request subst="true"> <http url="http://server.wooga.com/users/% %ts_user_server:get_unique_id%%/resources/column/5/ row/14?%%_routing_key%%" method="POST" contents='{"parameter1":"value1"}'> </http> </request> http://tsung.erlang-projects.org/
  • 13. Tsung - configuration • Not something you fancy writing • We’re in development, calls change and we constantly add new calls • A session might contain hundreds of requests • All the calls must refer to a consistent game state http://tsung.erlang-projects.org/
  • 14. Tsung - configuration • From our ruby test code user.resources(:column => 5, :row => 14) • Same as <request subst="true"> <http url="http://server.wooga.com/users/% %ts_user_server:get_unique_id%%/resources/column/5/ row/14?%%_routing_key%%" method="POST" contents='{"parameter1":"value1"}'> </http> </request> http://tsung.erlang-projects.org/
  • 15. Tsung - configuration • Session A session is a • requests group of requests • Arrival phase Sessions arrive in • duration phases with a specific arrival • arrival rate rate http://tsung.erlang-projects.org/
  • 16. Tsung - setup Application Benchmarking cluster app server tsung HTTP reqs worker ssh app server tsung master app server http://tsung.erlang-projects.org/
  • 17. Tsung • Generates ~ 2500 reqs/sec on AWS m1.large • Flexible but hard to extend • Code base rather obscure http://tsung.erlang-projects.org/
  • 18. What to do? 2 Collect metrics
  • 19. Tsung-metrics • Tsung collects measures and provides reports • But these measure include tsung network/ cpu congestion itself • Tsung machines aren’t a good point of view http://tsung.erlang-projects.org/
  • 20. HAproxy Application Benchmarking cluster app server tsung HTTP reqs worker haproxy ssh app server tsung master app server
  • 21. HAproxy “The Reliable, High Performance TCP/ HTTP Load Balancer” • Placed in front of http servers • Load balancing • Fail over
  • 22. HAproxy - syslog • Easy to setup • Efficient (UDP) • Provides 5 timings per each request
  • 23. HAproxy • Time to receive request from client Application Benchmarking cluster app server tsung haproxy worker ssh app server tsung master
  • 24. HAproxy • Time spent in HAproxy queue Application Benchmarking cluster app server tsung haproxy worker ssh app server tsung master
  • 25. HAproxy • Time to connect to the server Application Benchmarking cluster app server tsung haproxy worker ssh app server tsung master
  • 26. HAproxy • Time to receive response headers from server Application Benchmarking cluster app server tsung haproxy worker ssh app server tsung master
  • 27. HAproxy • Total session duration time Application Benchmarking cluster app server tsung haproxy worker ssh app server tsung master
  • 28. HAproxy - syslog • Application urls identify directly server call • Application urls are easy to parse • Processing haproxy syslog gives per call metric
  • 29. What to do? 3 Understand metrics
  • 30. Reading/aggregating metrics • Python to parse/normalize syslog • R language to analyze/visualize data • R language console to interactively explore benchmarking results
  • 31. R is a free software environment for statistical computing and graphics.
  • 32. What you get • Aggregate performance levels (throughput, latency) • Detailed performance per call type • Statistical analysis (outliers, trends, regression, correlation, frequency, standard deviation)
  • 34. What to do? 4 go deeper
  • 35. Digging into the data • From HAproxy log analisys one call emerged as exceptionally slow • Using eprof we were able to determine that most of the time was spent in a redis query fetching many keys (MGET)
  • 36. Tracing erldis query • More than 60% of runtime is spent manipulating the socket • gen_tcp:recv/2 is the culprit • But why is it called so many times?
  • 37. Understanding the redis protocol C: LRANGE mylist 0 2 <<"*2rn s: *2 $5rn s: $5 Hellorn $5rn s: Hello Worldrn">> s: $5 s: World
  • 38. Understanding erldis • recv_value/2 is used in the protocol parser to get the next data to parse
  • 39. A different approach • Two ways to use gen_tcp: active or passive • In passive, use gen_tcp:recv to explicitly ask for data, blocking • In active, gen_tcp will send the controlling process a message when there is data • Hybrid: active once
  • 40. A different approach • Is active sockets faster? • Proof-of-concept proved active socket faster • Change erldis or write a new driver?
  • 41. A different approach • Radical change => new driver • Keep Erldis queuing approach • Think about error handling from the start • Use active sockets
  • 42. A different approach • Active socket, parse partial replies
  • 43. Circuit breaker • eredis has a simple circuit breaker for when Redis is down/unreachable • eredis returns immediately to clients if connection is down • Reconnecting is done outside request/ response handling • Robust handling of errors
  • 44. Benchmarking eredis • Redis driver critical for our application • Must perform well • Must be stable • How do we test this?
  • 45. Basho bench • Basho produces the Riak KV store • Basho build a tool to test KV servers • Basho bench • We used Basho bench to test eredis
  • 46. Basho bench • Create callback module
  • 49. eredis is open source https://github.com/wooga/eredis
  • 50. What to do? 5 measure internals
  • 51. Measure internals HAproxy point of view is valid but how to measure internals of our application, while we are live, without the overhead of tracing?
  • 52. Think Basho bench • Basho bench can benchmark a redis driver • Redis is very fast, 100K ops/sec • Basho bench overhead is acceptable • The code is very simple
  • 53. Cherry pick ideas from Basho Bench • Creates a histogram of timings on the fly, reducing the number of data points • Dumps to disk every N seconds • Allows statistical tools to work on already aggregated data • Near real-time, from event to stats in N+5 seconds
  • 54. Homegrown stats • Measures latency from the edges of our system (excludes HTTP handling) • And at interesting points inside the system • Statistical analysis using R • Correlate with HAproxy data • Produces graphs and data specific to our application
  • 56. Recap Measure: • From an external point of view (HAproxy) • At the edge of the system (excluding HTTP handling) • Internals in the single process (eprof)
  • 57. Recap Analyze: • Aggregated measures • Statistical properties of measures • standard deviation • distribution • trends
  • 58. Thanks! http://www.wooga.com/jobs knut.nesheim@wooga.com @knutin paolo.negri@wooga.com @hungryblank