all the little pieces
                         distributed systems with PHP

                              Andrei Zmievski @ Digg
                        Dutch PHP Conference @ Amsterdam



Friday, June 12, 2009
Who is this guy?
                   • Open Source Fellow @ Digg
                   • PHP Core Developer since 1999
                   • Architect of the Unicode/i18n support
                   • Release Manager for PHP 6
                   • Twitter: @a
                   • Beer lover (and brewer)

Why distributed?



                   • Because Moore’s Law will not save you
                   • Despite what DHH says




Share nothing


                   • Your Mom was wrong
                   • No shared data on application servers
                   • Distribute it to shared systems



distribute…


                   • memory (memcached)
                   • storage (mogilefs)
                   • work (gearman)



Building blocks


                   • GLAMMP - have you heard of it?
                   • Gearman + LAMP + Memcached
                   • Throw in Mogile too



memcached
background

                   • created by Danga Interactive
                   • high-performance, distributed
                        memory object caching system

                   • sustains Digg, Facebook, LiveJournal,
                        Yahoo!, and many others

                   • if you aren’t using it, you are crazy


background


                   • Very fast over the network and very
                        easy to set up
                   • Designed to be transient
                   • You still need a database



architecture

                    client                  memcached




                    client                  memcached




                    client                  memcached




architecture


                           memcached




                        slab #4    4104 bytes
                        slab #3    1368 bytes  1368 bytes
                        slab #2    456 bytes  456 bytes  456 bytes
                        slab #1    152 bytes  152 bytes  152 bytes  152 bytes  152 bytes




memory architecture

                   • memory allocated on startup,
                        released on shutdown
                   • variable sized slabs (30+ by default)
                   • each object is stored in the slab most
                        fitting its size

                   • fragmentation can be problematic


memory architecture


                   • items are deleted:
                        • on set
                        • on get, if it’s expired
                        • if slab is full, then use LRU



applications

                   • object cache
                   • output cache
                   • action flood control / rate limiting
                   • simple queue
                   • and much more


PHP clients


                   • a few private ones (Facebook, Yahoo!,
                        etc)
                   • pecl/memcache
                   • pecl/memcached



pecl/memcached

                   • based on libmemcached
                   • released in January 2009
                   • surface API similarity to pecl/memcache
                   • parity with other languages


API
                   • get         • cas
                   • set         • *_by_key
                   • add         • getMulti
                   • replace     • setMulti
                   • delete      • getDelayed / fetch*
                   • append      • callbacks
                   • prepend


consistent hashing
                                IP1



                        IP2-1

                                            IP2-2




                                      IP3
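
The ring above can be sketched in a few lines of plain PHP. This is a minimal illustration of the idea, not the algorithm libmemcached actually uses: the class name HashRing, the method serverFor(), the use of crc32() and the figure of 100 points per server are all assumptions made for the example.

```php
<?php
// Minimal consistent-hash ring. HashRing and serverFor() are illustrative
// names, not part of pecl/memcached or libmemcached.
final class HashRing
{
    /** @var array<int,string> ring position => server name */
    private $ring = [];
    /** @var int[] ring positions in ascending order */
    private $positions = [];

    public function __construct(array $servers, int $pointsPerServer = 100)
    {
        if (!$servers) {
            throw new InvalidArgumentException('at least one server is required');
        }
        foreach ($servers as $server) {
            // Several points per server (like IP2-1 and IP2-2) even out the load.
            for ($i = 1; $i <= $pointsPerServer; $i++) {
                $this->ring[crc32("$server-$i")] = $server;
            }
        }
        ksort($this->ring);
        $this->positions = array_keys($this->ring);
    }

    public function serverFor(string $key): string
    {
        $hash = crc32($key);
        // Walk clockwise to the first point at or after the key's hash.
        foreach ($this->positions as $position) {
            if ($position >= $hash) {
                return $this->ring[$position];
            }
        }
        // Past the last point: wrap around to the start of the ring.
        return $this->ring[$this->positions[0]];
    }
}

$ring = new HashRing(['IP1', 'IP2', 'IP3']);
echo $ring->serverFor('ip_block'), "\n";
```

The property that matters is that removing a server only remaps the keys that lived on it; keys on the remaining servers stay where they were, so most of the cache survives a node failure.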

compare-and-swap (cas)


                   • “check and set”
                   • no update if object changed
                   • relies on CAS token



compare-and-swap (cas)
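      $m = new Memcached();
      $m->addServer('localhost', 11211);

      do {
          // fetch the current list along with its CAS token
          $ips = $m->get('ip_block', null, $cas);

          if ($m->getResultCode() == Memcached::RES_NOTFOUND) {
              // first visitor: create the list; add() fails if another client got there first
              $ips = array($_SERVER['REMOTE_ADDR']);
              $m->add('ip_block', $ips);
          } else {
              // store only if nobody changed the list since our get()
              $ips[] = $_SERVER['REMOTE_ADDR'];
              $m->cas($cas, 'ip_block', $ips);
          }
      // retry until the add() or cas() goes through
      } while ($m->getResultCode() != Memcached::RES_SUCCESS);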

delayed “lazy” fetching


                   • issue request with getDelayed()
                   • do other work
                   • fetch results with fetch() or
                        fetchAll()




binary protocol

                   • performance
                     • every request is parsed
                     • can happen thousands of times a
                          second
                   • extensibility
                        • support more data in the protocol


callbacks


                   • read-through cache callback
                   • if key is not found, invoke callback,
                        save value to memcache and return it




callbacks


                   • result callback
                   • invoked by getDelayed() for every
                        found item

                   • should not call fetch() in this case



buffered writes


                   • queue up write requests
                   • send when a threshold is exceeded or
                        a ‘get’ command is issued




key prefixing


                   • optional prefix prepended to all the
                        keys automatically
                   • allows for namespacing, versioning,
                        etc.




key locality



                   • allows mapping a set of keys to a
                        specific server




multiple serializers


                   • PHP
                   • igbinary
                   • JSON (soon)



future


                   • UDP support
                   • replication
                   • server management (ejection, status
                        callback)




tips & tricks
                   • 32-bit systems with > 4GB memory:
                        memcached -m4096 -p11211
                        memcached -m4096 -p11212
                        memcached -m4096 -p11213




tips & tricks


                   • write-through or write-back cache
                   • Warm up the cache on code push
                   • Version the keys (if necessary)



tips & tricks

                   • Don’t think row-level DB-style
                        caching; think complex objects
                   • Don’t run memcached on your DB
                        server — your DBAs might send you
                        threatening notes

                   • Use multi-get — run things in parallel


delete by namespace
      $ns_key = $memcache->get("foo_namespace_key");
      // if not set, initialize it
      if ($ns_key === false) {
          $ns_key = rand(1, 10000);
          $memcache->set("foo_namespace_key", $ns_key);
      }
      // cleverly use the ns_key
      $my_key = "foo_" . $ns_key . "_12345";
      $my_val = $memcache->get($my_key);
      // to clear the namespace:
      $memcache->increment("foo_namespace_key");




storing lists of data

                   • Store items under indexed keys:
                        comment.12, comment.23, etc
                   • Then store the list of item IDs in
                        another key: comments
                   • To retrieve, fetch comments and then
                        multi-get the comment IDs
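
The list pattern above could look roughly like this. ArrayCache is an in-memory stand-in so the sketch runs without a server; storeComments() and fetchComments() are hypothetical helper names. With pecl/memcached, the same set(), get() and getMulti() calls would go to a Memcached object instead.

```php
<?php
// In-memory stand-in so the sketch runs without a server; with pecl/memcached
// the same set(), get() and getMulti() calls exist on the Memcached class.
final class ArrayCache
{
    private $data = [];

    public function set(string $key, $value): bool
    {
        $this->data[$key] = $value;
        return true;
    }

    public function get(string $key)
    {
        return array_key_exists($key, $this->data) ? $this->data[$key] : false;
    }

    public function getMulti(array $keys): array
    {
        // Only keys that are present come back, as with a real multi-get.
        return array_intersect_key($this->data, array_flip($keys));
    }
}

// Hypothetical helper: store each comment under comment.<id>, plus the ID list.
function storeComments($cache, array $comments): void
{
    foreach ($comments as $id => $comment) {
        $cache->set("comment.$id", $comment);
    }
    $cache->set('comments', array_keys($comments));
}

// Hypothetical helper: one get for the ID list, one multi-get for the items.
function fetchComments($cache): array
{
    $ids = $cache->get('comments');
    if ($ids === false) {
        return [];
    }
    $keys = array_map(function ($id) { return "comment.$id"; }, $ids);
    $found = $cache->getMulti($keys);

    $comments = [];
    foreach ($ids as $id) {
        // An individual item may have been evicted; skip it rather than fail.
        if (array_key_exists("comment.$id", $found)) {
            $comments[$id] = $found["comment.$id"];
        }
    }
    return $comments;
}

$cache = new ArrayCache();
storeComments($cache, [12 => 'First!', 23 => 'Nice talk']);
print_r(fetchComments($cache));
```

Because the ID list and the items expire independently, the reader has to tolerate missing items; here they are simply skipped.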



preventing stampeding



                   • embedded probabilistic timeout
                   • gearman unique task trick
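
The slide does not spell out the probabilistic scheme, so the following is one possible reading of it rather than the speaker's implementation. The value is stored with an embedded "soft" expiry that is earlier than the real memcached expiry, and as that time approaches, each request has a growing chance of regenerating the value early. The function names, the array layout and the linear probability curve are all assumptions.

```php
<?php
// Hypothetical helpers; the talk does not show the exact scheme.
// The value is stored together with a "soft" expiry time that is earlier than
// the real memcached expiry.
function wrapWithSoftExpiry($data, int $now, int $ttl): array
{
    return ['data' => $data, 'soft_expiry' => $now + $ttl];
}

// Decide whether this request should regenerate the value.
// $roll is a random number in [0, 1); it is a parameter so the logic is testable.
function shouldRegenerate(array $entry, int $now, int $window, float $roll): bool
{
    $remaining = $entry['soft_expiry'] - $now;
    if ($remaining <= 0) {
        return true;              // soft expiry has passed: always regenerate
    }
    if ($remaining >= $window) {
        return false;             // still fresh: never regenerate
    }
    // Inside the window the chance grows linearly from 0 to 1.
    $probability = 1 - $remaining / $window;
    return $roll < $probability;
}

$entry = wrapWithSoftExpiry('rendered page', time(), 300);
$roll = mt_rand() / (mt_getrandmax() + 1);
var_dump(shouldRegenerate($entry, time(), 60, $roll));
```

Spreading the refresh over a window means a hot key is usually rebuilt by one early request instead of by every request that arrives in the instant after it expires.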




optimization


                   • watch stats (eviction rate, fill, etc)
                        • getStats()
                        • telnet + “stats” commands
                        • peep (heap inspector)



slabs

                   • Tune slab sizes to your needs:
                        • -f chunk size growth factor (default
                          1.25)

                        • -n minimum space allocated for
                          key+value+flags (default 48)



slabs
         slab class   1: chunk size    104 perslab 10082
         slab class   2: chunk size    136 perslab  7710
         slab class   3: chunk size    176 perslab  5957
         slab class   4: chunk size    224 perslab  4681
         ...
         slab class  38: chunk size 394840 perslab     2
         slab class  39: chunk size 493552 perslab     2



                                     Default: 38 slabs



slabs
                         Most objects: ∼1-2KB, some larger

         slab class   1: chunk size   1048 perslab  1000
         slab class   2: chunk size   1064 perslab   985
         slab class   3: chunk size   1080 perslab   970
         slab class   4: chunk size   1096 perslab   956
         ...
         slab class 198: chunk size   9224 perslab   113
         slab class 199: chunk size   9320 perslab   112

                          memcached -n 1000 -f 1.01
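
To see how the growth factor shapes the classes, the chunk sizes in the two listings can be reproduced with a short calculation. This is a sketch, not memcached's source: slabClasses() is a made-up helper, and the 8-byte alignment and 1 MB slab page are assumptions that happen to match the numbers shown. The first chunk size is taken as an input because it depends on the item header size of the build.

```php
<?php
// Illustrative helper, not part of memcached: reproduce the first few
// chunk sizes from a starting chunk size and a growth factor.
function slabClasses(int $firstChunk, float $factor, int $count, int $pageSize = 1048576): array
{
    $classes = [];
    $size = $firstChunk;
    for ($class = 1; $class <= $count; $class++) {
        // Round up to an 8-byte boundary (assumed alignment).
        if ($size % 8 !== 0) {
            $size += 8 - ($size % 8);
        }
        // A 1 MB page (assumed) holds this many chunks of the class size.
        $classes[$class] = ['chunk' => $size, 'perslab' => intdiv($pageSize, $size)];
        // The next class is the current size scaled by the growth factor.
        $size = (int) floor($size * $factor);
    }
    return $classes;
}

foreach (slabClasses(104, 1.25, 4) as $class => $info) {
    printf("slab class %3d: chunk size %6d perslab %5d\n", $class, $info['chunk'], $info['perslab']);
}
```

A smaller factor such as 1.01 produces many closely spaced classes, which wastes less memory per item when most objects fall in a narrow size range.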


memcached @ digg



ops


                   • memcached on each app server (2GB)
                   • the process is niced to a lower level
                   • separate pool for sessions
                   • 2 servers keep track of cluster health



key prefixes

                   • global key prefix for apc, memcached,
                        etc

                   • each pool has additional, versioned
                        prefix: .sess.2

                   • the key version is incremented on
                        each release

                   • global prefix can invalidate all caches

cache chain

                   • multi-level caching: globals, APC,
                        memcached, etc.
                   • all cache access is through
                        Cache_Chain class
                   • various configurations:
                     • APC ➡ memcached
                     • $GLOBALS ➡ APC
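
Digg's Cache_Chain class is not shown in the slides, so the following is only a guess at the general shape of multi-level caching. CacheChain, ArrayLayer and their methods are hypothetical names, and plain arrays stand in for the real layers such as APC and memcached.

```php
<?php
// Sketch only: every name here is hypothetical. ArrayLayer stands in for a
// real cache layer such as $GLOBALS, APC or memcached.
final class ArrayLayer
{
    private $data = [];

    public function get(string $key)
    {
        return array_key_exists($key, $this->data) ? $this->data[$key] : false;
    }

    public function set(string $key, $value): void
    {
        $this->data[$key] = $value;
    }
}

final class CacheChain
{
    /** @var array layers ordered from fastest to slowest */
    private $layers;

    public function __construct(array $layers)
    {
        $this->layers = array_values($layers);
    }

    public function get(string $key)
    {
        foreach ($this->layers as $depth => $layer) {
            $value = $layer->get($key);
            if ($value !== false) {
                // Copy the hit into the faster layers that missed.
                for ($i = 0; $i < $depth; $i++) {
                    $this->layers[$i]->set($key, $value);
                }
                return $value;
            }
        }
        return false;
    }

    public function set(string $key, $value): void
    {
        // Write to every layer so they stay in step.
        foreach ($this->layers as $layer) {
            $layer->set($key, $value);
        }
    }
}

$local = new ArrayLayer();   // plays the role of APC
$shared = new ArrayLayer();  // plays the role of memcached
$chain = new CacheChain([$local, $shared]);
$shared->set('user.1', 'andrei');
var_dump($chain->get('user.1'));
```

Routing all access through one class means the layer configuration can change per use case without touching the calling code.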

other


                   • large objects (> 1MB)
                   • split on the client side
                   • save the partial keys in a master one
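
Splitting on the client side could be sketched as follows. setLarge(), getLarge(), the ".part.N" key naming and the 1,000,000-byte chunk size are assumptions for illustration, and ArrayCache is an in-memory stand-in for a memcached client so the example runs without a server.

```php
<?php
// In-memory stand-in for a memcached client (get() returns false on a miss).
final class ArrayCache
{
    private $data = [];

    public function set(string $key, $value): bool
    {
        $this->data[$key] = $value;
        return true;
    }

    public function get(string $key)
    {
        return array_key_exists($key, $this->data) ? $this->data[$key] : false;
    }
}

// Hypothetical helper: serialize, cut into chunks below the item size limit,
// and record the partial keys under the master key.
function setLarge($cache, string $key, $value, int $chunkSize = 1000000): void
{
    $partKeys = [];
    foreach (str_split(serialize($value), $chunkSize) as $i => $chunk) {
        $partKeys[] = "$key.part.$i";
        $cache->set("$key.part.$i", $chunk);
    }
    $cache->set($key, $partKeys);
}

// Hypothetical helper: read the master key, then reassemble the parts.
function getLarge($cache, string $key)
{
    $partKeys = $cache->get($key);
    if ($partKeys === false) {
        return false;
    }
    $blob = '';
    foreach ($partKeys as $partKey) {
        $chunk = $cache->get($partKey);
        if ($chunk === false) {
            return false;   // one part was evicted: treat the whole object as a miss
        }
        $blob .= $chunk;
    }
    return unserialize($blob);
}

$cache = new ArrayCache();
setLarge($cache, 'report', str_repeat('x', 2500000));
echo count($cache->get('report')), " parts\n";
```

With a real client, the parts could be read back with a single multi-get instead of one get per part.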



stats




alternatives


                   • in-memory: Tokyo Tyrant, Scalaris
                   • persistent: Hypertable, Cassandra,
                        MemcacheDB

                   • document-oriented: CouchDB



mogile
background

                   • created by Danga Interactive
                   • application-level distributed
                        filesystem

                   • used at Digg, LiveJournal, etc
                   • a form of “cloud caching”
                   • scales very well

background

                   • automatic file replication with custom
                        policies
                   • no single point of failure
                   • flat namespace
                   • local filesystem agnostic
                   • not meant for speed

architecture
                                       node

                            tracker
                                       node


                            tracker
                 app                   node
                              DB


                                       node
                            tracker

                                       node

applications


                   • images
                   • document storage
                   • backing store for certain caches



PHP client



                   • File_Mogile in PEAR
                   • MediaWiki one (not maintained)




Example

      $hosts = array('172.10.1.1', '172.10.1.2');
      $m = new File_Mogile($hosts, 'profiles');
      $m->storeFile('user1234', 'image',
                    '/tmp/image1234.jpg');

      ...

      $paths = $m->getPaths('user1234');




mogile @ digg



mogile @ digg


                   • Wrapper around File_Mogile to cache
                        entries in memcache
                   • fairly standard set-up
                   • trackers run on storage nodes



mogile @ digg

                   • not huge (about 3.5 TB of data)
                   • files are replicated 3x
                   • the user profile images are cached on
                        Netscaler (1.5 GB cache)
                   • mogile cluster load is light


gearman
background


                   • created by Danga Interactive
                   • anagram of “manager”
                   • a system for distributing work
                   • a form of RPC mechanism



background


                   • parallel, asynchronous, scales well
                   • fire and forget, decentralized
                   • avoid tying up Apache processes



background

                   • dispatch function calls to machines
                        that are better suited to do work
                   • do work in parallel
                   • load balance lots of function calls
                   • invoke functions in other languages


architecture

              client                   worker




              client       gearmand    worker




              client                   worker




applications

                   • thumbnail generation
                   • asynchronous logging
                   • cache warm-up
                   • DB jobs, data migration
                   • sending email


servers



                   • Gearman-Server (Perl)
                   • gearmand (C)




clients

                   • Net_Gearman
                        • simplified, pretty stable
                   • pecl/gearman
                        • more powerful, complex, somewhat
                          unstable (under development)



Concepts


                   • Job
                   • Worker
                   • Task
                   • Client



Net_Gearman

                   • Net_Gearman_Job
                   • Net_Gearman_Worker
                   • Net_Gearman_Task
                   • Net_Gearman_Set
                   • Net_Gearman_Client


Echo Job

      class Net_Gearman_Job_Echo extends Net_Gearman_Job_Common
      {
          public function run($arg)
          {
              var_export($arg);
              echo "\n";
          }
      }                                              Echo.php




Reverse Job
      class Net_Gearman_Job_Reverse extends Net_Gearman_Job_Common
      {
          public function run($arg)
          {
              $result = array();
              $n = count($arg);
              $i = 0;
              // loop on the remaining count so falsy values (0, '') don't end it early
              while (count($arg) > 0) {
                  $result[] = array_pop($arg);
                  $i++;
                  $this->status($i, $n);
              }

              return $result;
          }
      }                                            Reverse.php

Worker
      define('NET_GEARMAN_JOB_PATH', './');

      require 'Net/Gearman/Worker.php';

      try {
          $worker = new Net_Gearman_Worker(array('localhost:4730'));
          $worker->addAbility('Reverse');
          $worker->addAbility('Echo');
          $worker->beginWork();
      } catch (Net_Gearman_Exception $e) {
          echo $e->getMessage() . "\n";
          exit;
      }




Client

      require_once 'Net/Gearman/Client.php';

      function complete($job, $handle, $result) {
          echo "$job complete, result: " . var_export($result, true) . "\n";
      }

      function status($job, $handle, $n, $d)
      {
          echo "$n/$d\n";
      }
                                                      continued..


Client


      $client = new Net_Gearman_Client(array('lager:4730'));

      $task = new Net_Gearman_Task('Reverse', range(1,5));
      $task->attachCallback("complete", Net_Gearman_Task::TASK_COMPLETE);
      $task->attachCallback("status", Net_Gearman_Task::TASK_STATUS);

                                                                continued..




Client

      $set = new Net_Gearman_Set();
      $set->addTask($task);

      $client->runSet($set);

      $client->Echo('Mmm... beer');




pecl/gearman



                   • More complex API
                   • Jobs aren’t separated into files




Worker
      $gmworker= new gearman_worker();
      $gmworker->add_server();
      $gmworker->add_function("reverse", "reverse_fn");

      while (1)
      {
        $ret= $gmworker->work();
        if ($ret != GEARMAN_SUCCESS)
          break;
      }

      function reverse_fn($job)
      {
        $workload= $job->workload();
        echo "Received job: " . $job->handle() . "\n";
        echo "Workload: $workload\n";
        $result= strrev($workload);
        echo "Result: $result\n";
        return $result;
      }


Client
      $gmclient= new gearman_client();

      $gmclient->add_server('lager');

      echo "Sending job\n";

      list($ret, $result) = $gmclient->do("reverse", "Hello!");

      if ($ret == GEARMAN_SUCCESS)
        echo "Success: $result\n";




gearman @ digg



gearman @ digg

                   • 400,000 jobs a day
                   • Jobs: crawling, DB job, FB sync,
                        memcache manipulation, Twitter
                        post, IDDB migration, etc.
                   • Each application server has its own
                        Gearman daemon + workers



tips and tricks

                   • you can daemonize the workers easily
                        with daemon or supervisord

                   • run workers in different groups, don’t
                        block on job A waiting on job B

                   • Make workers exit after N jobs to free
                        up memory (supervisord will restart
                        them)


Thrift
background


                   • NOT developed by Danga (Facebook)
                   • cross-language services
                   • RPC-based



background

                   • interface description language
                   • bindings: C++, C#, Cocoa, Erlang,
                        Haskell, Java, OCaml, Perl, PHP,
                        Python, Ruby, Smalltalk
                   • data types: base, structs, constants,
                        services, exceptions



IDL



Demo



Thank You


                        http://gravitonic.com/talks
Friday, June 12, 2009

More Related Content

What's hot

0628阙宏宇
0628阙宏宇0628阙宏宇
0628阙宏宇zhu02
 
Distributed Lock Manager
Distributed Lock ManagerDistributed Lock Manager
Distributed Lock ManagerHao Chen
 
Memcachedb: The Complete Guide
Memcachedb: The Complete GuideMemcachedb: The Complete Guide
Memcachedb: The Complete Guideelliando dias
 
The New MariaDB Offering - MariaDB 10, MaxScale and more
The New MariaDB Offering - MariaDB 10, MaxScale and moreThe New MariaDB Offering - MariaDB 10, MaxScale and more
The New MariaDB Offering - MariaDB 10, MaxScale and moreMariaDB Corporation
 
Connection Pooling in PostgreSQL using pgbouncer
Connection Pooling in PostgreSQL using pgbouncer Connection Pooling in PostgreSQL using pgbouncer
Connection Pooling in PostgreSQL using pgbouncer Sameer Kumar
 
Java IPC and the CLIP library
Java IPC and the CLIP libraryJava IPC and the CLIP library
Java IPC and the CLIP libraryltsllc
 
Toster - Understanding the Rails Web Model and Scalability Options
Toster - Understanding the Rails Web Model and Scalability OptionsToster - Understanding the Rails Web Model and Scalability Options
Toster - Understanding the Rails Web Model and Scalability OptionsFabio Akita
 
Feed Burner Scalability
Feed Burner ScalabilityFeed Burner Scalability
Feed Burner Scalabilitydidip
 
MySQL Parallel Replication: inventory, use-cases and limitations
MySQL Parallel Replication: inventory, use-cases and limitationsMySQL Parallel Replication: inventory, use-cases and limitations
MySQL Parallel Replication: inventory, use-cases and limitationsJean-François Gagné
 
GOTO 2013: Why Zalando trusts in PostgreSQL
GOTO 2013: Why Zalando trusts in PostgreSQLGOTO 2013: Why Zalando trusts in PostgreSQL
GOTO 2013: Why Zalando trusts in PostgreSQLHenning Jacobs
 
Challenges when building high profile editorial sites
Challenges when building high profile editorial sitesChallenges when building high profile editorial sites
Challenges when building high profile editorial sitesYann Malet
 
MariaDB Server on macOS - FOSDEM 2022 MariaDB Devroom
MariaDB Server on macOS -  FOSDEM 2022 MariaDB DevroomMariaDB Server on macOS -  FOSDEM 2022 MariaDB Devroom
MariaDB Server on macOS - FOSDEM 2022 MariaDB DevroomValeriy Kravchuk
 
MySQL Parallel Replication: inventory, use-case and limitations
MySQL Parallel Replication: inventory, use-case and limitationsMySQL Parallel Replication: inventory, use-case and limitations
MySQL Parallel Replication: inventory, use-case and limitationsJean-François Gagné
 
Gofer 200707
Gofer 200707Gofer 200707
Gofer 200707oscon2007
 

What's hot (19)

0628阙宏宇
0628阙宏宇0628阙宏宇
0628阙宏宇
 
Distributed Lock Manager
Distributed Lock ManagerDistributed Lock Manager
Distributed Lock Manager
 
Memcachedb: The Complete Guide
Memcachedb: The Complete GuideMemcachedb: The Complete Guide
Memcachedb: The Complete Guide
 
The New MariaDB Offering - MariaDB 10, MaxScale and more
The New MariaDB Offering - MariaDB 10, MaxScale and moreThe New MariaDB Offering - MariaDB 10, MaxScale and more
The New MariaDB Offering - MariaDB 10, MaxScale and more
 
Connection Pooling in PostgreSQL using pgbouncer
Connection Pooling in PostgreSQL using pgbouncer Connection Pooling in PostgreSQL using pgbouncer
Connection Pooling in PostgreSQL using pgbouncer
 
GUC Tutorial Package (9.0)
GUC Tutorial Package (9.0)GUC Tutorial Package (9.0)
GUC Tutorial Package (9.0)
 
Java IPC and the CLIP library
Java IPC and the CLIP libraryJava IPC and the CLIP library
Java IPC and the CLIP library
 
Toster - Understanding the Rails Web Model and Scalability Options
Toster - Understanding the Rails Web Model and Scalability OptionsToster - Understanding the Rails Web Model and Scalability Options
Toster - Understanding the Rails Web Model and Scalability Options
 
Feed Burner Scalability
Feed Burner ScalabilityFeed Burner Scalability
Feed Burner Scalability
 
MySQL Parallel Replication: inventory, use-cases and limitations
MySQL Parallel Replication: inventory, use-cases and limitationsMySQL Parallel Replication: inventory, use-cases and limitations
MySQL Parallel Replication: inventory, use-cases and limitations
 
GOTO 2013: Why Zalando trusts in PostgreSQL
GOTO 2013: Why Zalando trusts in PostgreSQLGOTO 2013: Why Zalando trusts in PostgreSQL
GOTO 2013: Why Zalando trusts in PostgreSQL
 
Shootout at the PAAS Corral
Shootout at the PAAS CorralShootout at the PAAS Corral
Shootout at the PAAS Corral
 
7 Ways To Crash Postgres
7 Ways To Crash Postgres7 Ways To Crash Postgres
7 Ways To Crash Postgres
 
Shootout at the AWS Corral
Shootout at the AWS CorralShootout at the AWS Corral
Shootout at the AWS Corral
 
Challenges when building high profile editorial sites
Challenges when building high profile editorial sitesChallenges when building high profile editorial sites
Challenges when building high profile editorial sites
 
MariaDB Server on macOS - FOSDEM 2022 MariaDB Devroom
MariaDB Server on macOS -  FOSDEM 2022 MariaDB DevroomMariaDB Server on macOS -  FOSDEM 2022 MariaDB Devroom
MariaDB Server on macOS - FOSDEM 2022 MariaDB Devroom
 
MySQL Parallel Replication: inventory, use-case and limitations
MySQL Parallel Replication: inventory, use-case and limitationsMySQL Parallel Replication: inventory, use-case and limitations
MySQL Parallel Replication: inventory, use-case and limitations
 
Gofer 200707
Gofer 200707Gofer 200707
Gofer 200707
 
50 tips50minutes
50 tips50minutes50 tips50minutes
50 tips50minutes
 

Similar to All The Little Pieces

Catalyst And Chained
Catalyst And ChainedCatalyst And Chained
Catalyst And ChainedJay Shirley
 
Merb The Super Bike Of Frameworks
Merb The Super Bike Of FrameworksMerb The Super Bike Of Frameworks
Merb The Super Bike Of FrameworksRowan Hick
 
No Really, It's All About You
No Really, It's All About YouNo Really, It's All About You
No Really, It's All About YouChris Cornutt
 
Web Standards and Accessibility
Web Standards and AccessibilityWeb Standards and Accessibility
Web Standards and AccessibilityNick DeNardis
 
Groovy, to Infinity and Beyond - Groovy/Grails eXchange 2009
Groovy, to Infinity and Beyond - Groovy/Grails eXchange 2009Groovy, to Infinity and Beyond - Groovy/Grails eXchange 2009
Groovy, to Infinity and Beyond - Groovy/Grails eXchange 2009Guillaume Laforge
 
MacRuby - When objective-c and Ruby meet
MacRuby - When objective-c and Ruby meetMacRuby - When objective-c and Ruby meet
MacRuby - When objective-c and Ruby meetMatt Aimonetti
 
Apache Hadoop Talk at QCon
Apache Hadoop Talk at QConApache Hadoop Talk at QCon
Apache Hadoop Talk at QConCloudera, Inc.
 
GitHub Notable OSS Project
GitHub  Notable OSS ProjectGitHub  Notable OSS Project
GitHub Notable OSS Projectroumia
 
Improving Drupal's Page Loading Performance
Improving Drupal's Page Loading PerformanceImproving Drupal's Page Loading Performance
Improving Drupal's Page Loading PerformanceWim Leers
 
Bay Area Drupal Camp Efficiency
Bay Area Drupal Camp EfficiencyBay Area Drupal Camp Efficiency
Bay Area Drupal Camp Efficiencysmattoon
 
Inside the Atlassian OnDemand Private Cloud
Inside the Atlassian OnDemand Private CloudInside the Atlassian OnDemand Private Cloud
Inside the Atlassian OnDemand Private CloudAtlassian
 
Atlassian - A Different Kind Of Software Company
Atlassian - A Different Kind Of Software CompanyAtlassian - A Different Kind Of Software Company
Atlassian - A Different Kind Of Software CompanyMike Cannon-Brookes
 
Code Stinkers Anonymous
Code Stinkers AnonymousCode Stinkers Anonymous
Code Stinkers AnonymousMark Cornick
 
Cloud, Cache, and Configs
Cloud, Cache, and ConfigsCloud, Cache, and Configs
Cloud, Cache, and ConfigsScott Taylor
 
Scaling Rails with memcached
Scaling Rails with memcachedScaling Rails with memcached
Scaling Rails with memcachedelliando dias
 

Similar to All The Little Pieces (20)

Catalyst And Chained
Catalyst And ChainedCatalyst And Chained
Catalyst And Chained
 
Scaling Django Dc09
Scaling Django Dc09Scaling Django Dc09
Scaling Django Dc09
 
Merb The Super Bike Of Frameworks
Merb The Super Bike Of FrameworksMerb The Super Bike Of Frameworks
Merb The Super Bike Of Frameworks
 
No Really, It's All About You
No Really, It's All About YouNo Really, It's All About You
No Really, It's All About You
 
All The Little Pieces
All The Little PiecesAll The Little Pieces
All The Little Pieces
 
Web Standards and Accessibility
Web Standards and AccessibilityWeb Standards and Accessibility
Web Standards and Accessibility
 
Groovy, to Infinity and Beyond - Groovy/Grails eXchange 2009
Groovy, to Infinity and Beyond - Groovy/Grails eXchange 2009Groovy, to Infinity and Beyond - Groovy/Grails eXchange 2009
Groovy, to Infinity and Beyond - Groovy/Grails eXchange 2009
 
MacRuby - When objective-c and Ruby meet
MacRuby - When objective-c and Ruby meetMacRuby - When objective-c and Ruby meet
MacRuby - When objective-c and Ruby meet
 
Intro To Git
Intro To GitIntro To Git
Intro To Git
 
Apache Hadoop Talk at QCon
Apache Hadoop Talk at QConApache Hadoop Talk at QCon
Apache Hadoop Talk at QCon
 
GitHub Notable OSS Project
GitHub  Notable OSS ProjectGitHub  Notable OSS Project
GitHub Notable OSS Project
 
Improving Drupal's Page Loading Performance
Improving Drupal's Page Loading PerformanceImproving Drupal's Page Loading Performance
Improving Drupal's Page Loading Performance
 

All The Little Pieces

  • 1. all the little pieces distributed systems with PHP Andrei Zmievski @ Digg Dutch PHP Conference @ Amsterdam Friday, June 12, 2009
  • 2. Who is this guy? • Open Source Fellow @ Digg • PHP Core Developer since 1999 • Architect of the Unicode/i18n support • Release Manager for PHP 6 • Twitter: @a • Beer lover (and brewer)
  • 3. Why distributed? • Because Moore’s Law will not save you • Despite what DHH says
  • 4. Share nothing • Your Mom was wrong • No shared data on application servers • Distribute it to shared systems
  • 5. distribute… • memory (memcached) • storage (mogilefs) • work (gearman)
  • 6. Building blocks • GLAMMP - have you heard of it? • Gearman + LAMP + Memcached • Throw in Mogile too
  • 8. background • created by Danga Interactive • high-performance, distributed memory object caching system • sustains Digg, Facebook, LiveJournal, Yahoo!, and many others • if you aren’t using it, you are crazy
  • 9. background • Very fast over the network and very easy to set up • Designed to be transient • You still need a database
  • 10. architecture [diagram: clients and memcached servers]
  • 11. architecture [diagram: memcached]
  • 12. [diagram: slab #1 holds 152-byte chunks, slab #2 holds 456-byte chunks, slab #3 holds 1368-byte chunks, slab #4 holds 4104-byte chunks]
  • 13. memory architecture • memory allocated on startup, released on shutdown • variable sized slabs (30+ by default) • each object is stored in the slab most fitting its size • fragmentation can be problematic
  • 14. memory architecture • items are deleted: • on set • on get, if it’s expired • if slab is full, then use LRU
  • 15. applications • object cache • output cache • action flood control / rate limiting • simple queue • and much more
  • 16. PHP clients • a few private ones (Facebook, Yahoo!, etc) • pecl/memcache • pecl/memcached
  • 17. pecl/memcached • based on libmemcached • released in January 2009 • surface API similarity to pecl/memcache • parity with other languages
  • 18. API • get • set • add • replace • delete • append • prepend • cas • *_by_key • getMulti • setMulti • getDelayed / fetch* • callbacks
  • 19. consistent hashing [diagram: server points IP1, IP2-1, IP2-2, IP3]
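The consistent hashing slide shows servers placed at several points (IP2-1, IP2-2) so that removing one server only moves the keys that lived on it. A minimal pure-PHP sketch of that idea follows; `HashRingSketch`, the `crc32` hash and the replica count are assumptions made for illustration and are not how libmemcached implements it. It assumes PHP 8 on a 64-bit build.

```php
<?php
// Hypothetical illustration class; not part of pecl/memcached.
final class HashRingSketch
{
    /** @var array<int,string> ring position => server */
    private array $ring = [];
    /** @var int[] ring positions in ascending order */
    private array $positions = [];

    public function __construct(array $servers, private int $replicas = 64)
    {
        foreach ($servers as $server) {
            $this->add($server);
        }
    }

    public function add(string $server): void
    {
        // Each server gets several points on the ring ("IP2-1", "IP2-2", ...).
        for ($i = 0; $i < $this->replicas; $i++) {
            $this->ring[crc32("$server-$i")] = $server;
        }
        $this->reindex();
    }

    public function remove(string $server): void
    {
        $this->ring = array_filter($this->ring, fn ($s) => $s !== $server);
        $this->reindex();
    }

    public function lookup(string $key): string
    {
        $hash = crc32($key);
        // Walk clockwise to the first point at or after the key's hash.
        foreach ($this->positions as $pos) {
            if ($pos >= $hash) {
                return $this->ring[$pos];
            }
        }
        // Past the last point: wrap around to the first one.
        return $this->ring[$this->positions[0]];
    }

    private function reindex(): void
    {
        $this->positions = array_keys($this->ring);
        sort($this->positions);
    }
}

$ring = new HashRingSketch(['10.0.0.1:11211', '10.0.0.2:11211', '10.0.0.3:11211']);
echo $ring->lookup('user.42'), "\n";
```

The linear scan in `lookup()` keeps the sketch short; a real client would use a binary search over the sorted positions.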
  • 20. compare-and-swap (cas) • “check and set” • no update if object changed • relies on CAS token
  • 21. compare-and-swap (cas)
        $m = new Memcached();
        $m->addServer('localhost', 11211);
        do {
            $ips = $m->get('ip_block', null, $cas);
            if ($m->getResultCode() == Memcached::RES_NOTFOUND) {
                $ips = array($_SERVER['REMOTE_ADDR']);
                $m->add('ip_block', $ips);
            } else {
                $ips[] = $_SERVER['REMOTE_ADDR'];
                $m->cas($cas, 'ip_block', $ips);
            }
        } while ($m->getResultCode() != Memcached::RES_SUCCESS);
  • 22. delayed “lazy” fetching • issue request with getDelayed() • do other work • fetch results with fetch() or fetchAll()
  • 23. binary protocol • performance • every request is parsed • can happen thousands of times a second • extensibility • support more data in the protocol
  • 24. callbacks • read-through cache callback • if key is not found, invoke callback, save value to memcache and return it
  • 25. callbacks • result callback • invoked by getDelayed() for every found item • should not call fetch() in this case
  • 26. buffered writes • queue up write requests • send when a threshold is exceeded or a ‘get’ command is issued
  • 27. key prefixing • optional prefix prepended to all the keys automatically • allows for namespacing, versioning, etc.
  • 28. key locality • allows mapping a set of keys to a specific server
  • 29. multiple serializers • PHP • igbinary • JSON (soon)
  • 30. future • UDP support • replication • server management (ejection, status callback)
  • 31. tips & tricks • 32-bit systems with > 4GB memory:
        memcached -m4096 -p11211
        memcached -m4096 -p11212
        memcached -m4096 -p11213
  • 32. tips & tricks • write-through or write-back cache • Warm up the cache on code push • Version the keys (if necessary)
  • 33. tips & tricks • Don’t think row-level DB-style caching; think complex objects • Don’t run memcached on your DB server: your DBAs might send you threatening notes • Use multi-get: run things in parallel
  • 34. delete by namespace
        $ns_key = $memcache->get("foo_namespace_key");
        // if not set, initialize it and keep the new value for the lookup below
        if ($ns_key === false) {
            $ns_key = rand(1, 10000);
            $memcache->set("foo_namespace_key", $ns_key);
        }
        // cleverly use the ns_key
        $my_key = "foo_" . $ns_key . "_12345";
        $my_val = $memcache->get($my_key);
        // to clear the namespace:
        $memcache->increment("foo_namespace_key");
  • 35. storing lists of data • Store items under indexed keys: comment.12, comment.23, etc • Then store the list of item IDs in another key: comments • To retrieve, fetch comments and then multi-get the comment IDs
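The list pattern on the slide above can be sketched with a plain PHP array standing in for the cache, so it runs without a memcached server. `storeComments()` and `loadComments()` are hypothetical helpers; against a real client the loop in `loadComments()` would be a single multi-get call.

```php
<?php
// $cache is a plain array used as a stand-in for memcached.
function storeComments(array &$cache, array $comments): void
{
    foreach ($comments as $id => $body) {
        $cache["comment.$id"] = $body;          // one key per item
    }
    $cache['comments'] = array_keys($comments); // index key holding the IDs
}

function loadComments(array $cache): array
{
    $found = [];
    foreach ($cache['comments'] ?? [] as $id) {
        // An item may have been evicted even though its ID is still listed.
        if (isset($cache["comment.$id"])) {
            $found[$id] = $cache["comment.$id"];
        }
    }
    return $found;
}

$cache = [];
storeComments($cache, [12 => 'first!', 23 => 'nice talk']);
print_r(loadComments($cache));
```

Because items and the index expire independently, the reader has to tolerate missing items, which is why `loadComments()` skips IDs it cannot find.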
  • 36. preventing stampeding • embedded probabilistic timeout • gearman unique task trick
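The slide does not spell out the "embedded probabilistic timeout", so the following is one common reading of it and not necessarily the speaker's scheme: store a soft expiry inside the cached value, and as it approaches let each reader refresh the entry with rising probability, so one request usually regenerates it before everyone misses at once. `wrapValue()` and `shouldRefresh()` are hypothetical names, and the linear probability ramp is an assumption.

```php
<?php
// Wrap a value together with its soft expiry time (seconds since epoch).
function wrapValue(mixed $value, int $ttl, int $now): array
{
    return ['value' => $value, 'expires' => $now + $ttl];
}

// Decide whether this reader should regenerate the entry.
// $roll is a random number in [0, 1), passed in so the logic is testable.
function shouldRefresh(array $entry, int $now, int $window, float $roll): bool
{
    $remaining = $entry['expires'] - $now;
    if ($remaining <= 0) {
        return true;                    // already past the soft expiry
    }
    if ($remaining >= $window) {
        return false;                   // still fresh, nobody refreshes
    }
    // Inside the window the chance rises linearly from 0 to 1.
    $probability = 1.0 - $remaining / $window;
    return $roll < $probability;
}

$entry = wrapValue('rendered page', 300, time());
if (shouldRefresh($entry, time(), 60, mt_rand() / (mt_getrandmax() + 1))) {
    echo "regenerate and re-set the entry\n";
}
```

The real cache TTL would be set somewhat longer than the embedded one, so stale data is still served while a single request rebuilds it.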
  • 37. optimization • watch stats (eviction rate, fill, etc) • getStats() • telnet + “stats” commands • peep (heap inspector)
  • 38. slabs • Tune slab sizes to your needs: • -f chunk size growth factor (default 1.25) • -n minimum space allocated for key+value+flags (default 48)
  • 39. slabs
        slab class 1: chunk size 104 perslab 10082
        slab class 2: chunk size 136 perslab 7710
        slab class 3: chunk size 176 perslab 5957
        slab class 4: chunk size 224 perslab 4681
        ...
        slab class 38: chunk size 394840 perslab 2
        slab class 39: chunk size 493552 perslab 2
        Default: 38 slabs
  • 40. slabs • Most objects: ∼1-2KB, some larger
        slab class 1: chunk size 1048 perslab 1000
        slab class 2: chunk size 1064 perslab 985
        slab class 3: chunk size 1080 perslab 970
        slab class 4: chunk size 1096 perslab 956
        ...
        slab class 198: chunk size 9224 perslab 113
        slab class 199: chunk size 9320 perslab 112
        memcached -n 1000 -f 1.01
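The chunk sizes in the two listings above follow a simple progression: each class is the previous size multiplied by the growth factor, truncated to an integer and rounded up to a multiple of 8 bytes, and `perslab` is the number of chunks that fit in a 1 MB page. The sketch below reproduces the first classes of both listings. It assumes the first chunk size (104 or 1048 bytes) is read off the listing instead of being derived from `-n`, and `slabClasses()` is a hypothetical helper, not a memcached API.

```php
<?php
// Returns [class number => ['chunk' => bytes, 'perslab' => chunks per page]].
function slabClasses(int $firstChunk, float $factor, int $count, int $pageSize = 1048576): array
{
    $classes = [];
    $size = $firstChunk;
    for ($i = 1; $i <= $count; $i++) {
        // Round the chunk size up to the next multiple of 8 bytes.
        if ($size % 8 !== 0) {
            $size += 8 - ($size % 8);
        }
        $classes[$i] = ['chunk' => $size, 'perslab' => intdiv($pageSize, $size)];
        // Grow by the -f factor for the next class.
        $size = (int) ($size * $factor);
    }
    return $classes;
}

foreach (slabClasses(104, 1.25, 4) as $n => $c) {
    echo "slab class $n: chunk size {$c['chunk']} perslab {$c['perslab']}\n";
}
```

Calling `slabClasses(1048, 1.01, 4)` gives 1048, 1064, 1080 and 1096, matching the tuned listing: a small factor produces many closely spaced classes, which wastes less memory per item when most objects are of similar size.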
  • 41. memcached @ digg
  • 42. ops • memcached on each app server (2GB) • the process is niced to a lower level • separate pool for sessions • 2 servers keep track of cluster health
  • 43. key prefixes • global key prefix for apc, memcached, etc • each pool has additional, versioned prefix: .sess.2 • the key version is incremented on each release • global prefix can invalidate all caches
  • 44. cache chain • multi-level caching: globals, APC, memcached, etc. • all cache access is through Cache_Chain class • various configurations: • APC ➡ memcached • $GLOBALS ➡ APC
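Digg's Cache_Chain class is not shown in the talk, so the following is only a sketch of the general multi-level idea: ask the fastest layer first, fall through to slower ones, and copy a hit back into the faster layers. All names (`CacheLayer`, `ArrayLayer`, `CacheChainSketch`) are hypothetical, arrays stand in for APC and memcached, and `null` is treated as a miss. It assumes PHP 8.

```php
<?php
interface CacheLayer
{
    public function get(string $key): mixed;              // null means "miss"
    public function set(string $key, mixed $value): void;
}

// In-memory stand-in for a real backend such as APC or memcached.
final class ArrayLayer implements CacheLayer
{
    private array $data = [];

    public function get(string $key): mixed
    {
        return $this->data[$key] ?? null;
    }

    public function set(string $key, mixed $value): void
    {
        $this->data[$key] = $value;
    }
}

final class CacheChainSketch
{
    /** @var CacheLayer[] ordered from fastest to slowest */
    private array $layers;

    public function __construct(CacheLayer ...$layers)
    {
        $this->layers = $layers;
    }

    public function get(string $key): mixed
    {
        foreach ($this->layers as $i => $layer) {
            $value = $layer->get($key);
            if ($value !== null) {
                // Backfill every faster layer that missed.
                for ($j = 0; $j < $i; $j++) {
                    $this->layers[$j]->set($key, $value);
                }
                return $value;
            }
        }
        return null;
    }

    public function set(string $key, mixed $value): void
    {
        foreach ($this->layers as $layer) {
            $layer->set($key, $value);
        }
    }
}

$fast  = new ArrayLayer();   // stands in for APC
$slow  = new ArrayLayer();   // stands in for memcached
$chain = new CacheChainSketch($fast, $slow);
$slow->set('user.42', ['name' => 'andrei']);
$chain->get('user.42');      // found in $slow, copied into $fast
```

Putting all access behind one class is what lets the layer order (APC then memcached, or $GLOBALS then APC) become a configuration choice.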
  • 45. other • large objects (> 1MB) • split on the client side • save the partial keys in a master one
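Client-side splitting of large objects can be sketched as two small functions: one cuts a serialized blob into chunks below the item size limit and records the chunk keys under the master key, the other reassembles it. `splitForCache()`, `joinFromCache()` and the `.part.N` key naming are hypothetical, and a plain array stands in for the cache.

```php
<?php
// Returns the cache entries to store: one per chunk plus the master key,
// whose value is the ordered list of chunk keys.
function splitForCache(string $key, string $blob, int $chunkSize): array
{
    $entries = [];
    $partKeys = [];
    foreach (str_split($blob, $chunkSize) as $i => $chunk) {
        $partKey = "$key.part.$i";
        $entries[$partKey] = $chunk;
        $partKeys[] = $partKey;
    }
    $entries[$key] = $partKeys;
    return $entries;
}

// Reassembles the blob, or returns null if the master key or any chunk is gone.
function joinFromCache(string $key, array $store): ?string
{
    if (!isset($store[$key])) {
        return null;
    }
    $blob = '';
    foreach ($store[$key] as $partKey) {
        if (!isset($store[$partKey])) {
            return null;   // a chunk was evicted, treat the whole object as a miss
        }
        $blob .= $store[$partKey];
    }
    return $blob;
}

$blob  = serialize(range(1, 1000));
$store = splitForCache('report.2009', $blob, 1024);
var_dump(joinFromCache('report.2009', $store) === $blob);
```

Chunks can be evicted independently of the master key, so a missing chunk has to invalidate the whole object instead of returning a truncated value.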
  • 47. alternatives • in-memory: Tokyo Tyrant, Scalaris • persistent: Hypertable, Cassandra, MemcacheDB • document-oriented: CouchDB
  • 49. background • created by Danga Interactive • application-level distributed filesystem • used at Digg, LiveJournal, etc • a form of “cloud caching” • scales very well
  • 50. background • automatic file replication with custom policies • no single point of failure • flat namespace • local filesystem agnostic • not meant for speed
  • 51. architecture [diagram: app, trackers, DB, storage nodes]
  • 52. applications • images • document storage • backing store for certain caches
  • 53. PHP client • File_Mogile in PEAR • MediaWiki one (not maintained)
  • 54. Example
        $hosts = array('172.10.1.1', '172.10.1.2');
        $m = new File_Mogile($hosts, 'profiles');
        $m->storeFile('user1234', 'image', '/tmp/image1234.jpg');
        ...
        $paths = $m->getPaths('user1234');
  • 55. mogile @ digg
  • 56. mogile @ digg • Wrapper around File_Mogile to cache entries in memcache • fairly standard set-up • trackers run on storage nodes
  • 57. mogile @ digg • not huge (about 3.5 TB of data) • files are replicated 3x • the user profile images are cached on Netscaler (1.5 GB cache) • mogile cluster load is light
  • 59. background • created by Danga Interactive • anagram of “manager” • a system for distributing work • a form of RPC mechanism
  • 60. background • parallel, asynchronous, scales well • fire and forget, decentralized • avoid tying up Apache processes
  • 61. background • dispatch function calls to machines that are better suited to do work • do work in parallel • load balance lots of function calls • invoke functions in other languages
  • 62. architecture [diagram: clients, gearmand, workers]
  • 63. applications • thumbnail generation • asynchronous logging • cache warm-up • DB jobs, data migration • sending email
  • 64. servers • Gearman-Server (Perl) • gearmand (C)
  • 65. clients • Net_Gearman • simplified, pretty stable • pecl/gearman • more powerful, complex, somewhat unstable (under development)
  • 66. Concepts • Job • Worker • Task • Client
  • 67. Net_Gearman • Net_Gearman_Job • Net_Gearman_Worker • Net_Gearman_Task • Net_Gearman_Set • Net_Gearman_Client
  • 68. Echo Job (Echo.php)
        class Net_Gearman_Job_Echo extends Net_Gearman_Job_Common
        {
            public function run($arg)
            {
                var_export($arg);
                echo "\n";
            }
        }
  • 69. Reverse Job (Reverse.php)
        class Net_Gearman_Job_Reverse extends Net_Gearman_Job_Common
        {
            public function run($arg)
            {
                $result = array();
                $n = count($arg);
                $i = 0;
                while ($value = array_pop($arg)) {
                    $result[] = $value;
                    $i++;
                    $this->status($i, $n);
                }
                return $result;
            }
        }
  • 70. Worker
        define('NET_GEARMAN_JOB_PATH', './');
        require 'Net/Gearman/Worker.php';
        try {
            $worker = new Net_Gearman_Worker(array('localhost:4730'));
            $worker->addAbility('Reverse');
            $worker->addAbility('Echo');
            $worker->beginWork();
        } catch (Net_Gearman_Exception $e) {
            echo $e->getMessage() . "\n";
            exit;
        }
  • 71. Client
        require_once 'Net/Gearman/Client.php';
        function complete($job, $handle, $result) {
            echo "$job complete, result: " . var_export($result, true) . "\n";
        }
        function status($job, $handle, $n, $d) {
            echo "$n/$d\n";
        }
        continued..
  • 72. Client
        $client = new Net_Gearman_Client(array('lager:4730'));
        $task = new Net_Gearman_Task('Reverse', range(1,5));
        $task->attachCallback("complete", Net_Gearman_Task::TASK_COMPLETE);
        $task->attachCallback("status", Net_Gearman_Task::TASK_STATUS);
        continued..
  • 73. Client
        $set = new Net_Gearman_Set();
        $set->addTask($task);
        $client->runSet($set);
        $client->Echo('Mmm... beer');
  • 74. pecl/gearman • More complex API • Jobs aren’t separated into files
  • 75. Worker
        $gmworker = new gearman_worker();
        $gmworker->add_server();
        $gmworker->add_function("reverse", "reverse_fn");
        while (1) {
            $ret = $gmworker->work();
            if ($ret != GEARMAN_SUCCESS) break;
        }
        function reverse_fn($job) {
            $workload = $job->workload();
            echo "Received job: " . $job->handle() . "\n";
            echo "Workload: $workload\n";
            $result = strrev($workload);
            echo "Result: $result\n";
            return $result;
        }
  • 76. Client
        $gmclient = new gearman_client();
        $gmclient->add_server('lager');
        echo "Sending job\n";
        list($ret, $result) = $gmclient->do("reverse", "Hello!");
        if ($ret == GEARMAN_SUCCESS)
            echo "Success: $result\n";
  • 77. gearman @ digg
  • 78. gearman @ digg • 400,000 jobs a day • Jobs: crawling, DB job, FB sync, memcache manipulation, Twitter post, IDDB migration, etc. • Each application server has its own Gearman daemon + workers
  • 79. tips and tricks • you can daemonize the workers easily with daemon or supervisord • run workers in different groups, don’t block on job A waiting on job B • Make workers exit after N jobs to free up memory (supervisord will restart them)
  • 81. background • NOT developed by Danga (Facebook) • cross-language services • RPC-based
  • 82. background • interface description language • bindings: C++, C#, Cocoa, Erlang, Haskell, Java, OCaml, Perl, PHP, Python, Ruby, Smalltalk • data types: base, structs, constants, services, exceptions
  • 85. Thank You http://gravitonic.com/talks