SlideShare ist ein Scribd-Unternehmen logo
1 von 26
Downloaden Sie, um offline zu lesen
Advanced Task Management in Celery


           Mahendra M
           @mahendra
    https://github.com/mahendra
@mahendra
●   Python developer for 6 years
●   FOSS enthusiast/volunteer for 14 years
    ●   Bangalore LUG and Infosys LUG
    ●   FOSS.in and LinuxBangalore/200x
●   Celery user for 3 years
●   Contributions
    ●   patches, testing new releases
    ●   Zookeeper msg transport for kombu
    ●   Kafka support (in-progress)
Quick Intro to Celery
●   Asynchronous task/job queue
●   Uses distributed message passing
●   Tasks are run asynchronously on worker nodes
●   Results are passed back to the caller (if any)
Overview

                    Worker 1



                    Worker 2
Sender    Msg Q
                        .
                        .
                        .

                    Worker N
Sample Code
from celery.task import task


@task
def add(x, y):
   return x + y


result = add.delay(5,6)
result.get()
Uses of Celery
●   Asynchronous task processing
●   Handling long running / heavy jobs
    ●   Image resizing, video transcode, PDF generation
●   Offloading heavy web backend operations
●   Scheduling tasks to be run at a particular time
    ●   Cron for python
Advanced Uses
●   Task Routing
●   Task retries, timeout and revoking
●   Task Canvas – combining tasks
    ●   Task co-ordination
    ●   Dependencies
    ●   Task trees or graphs
    ●   Batch tasks
    ●   Progress monitoring
●   Tricks
    ●   DB conflict management
Sending tasks to a particular worker

                                  Worker 1
                                 (Windows)

                       windows
                                  Worker 2
                       windows   (Windows)
     Sender    Msg Q
                                     .
                        linux
                                     .
                                     .
                                 Worker N
                                  (Linux)
Routing tasks – Use cases
●   Priority execution
●   Based on hardware capabilities
    ●   Special cards available for video capture
    ●   Making use of GPUs (CUDA)
●   Based on OS (for eg. Playready encryption)
●   Based on location
    ●   Moving compute closer to data (Hadoop-ish)
    ●   Sending tasks to different data centers
●   Sequencing operations (CouchDB conflicts)
Sample Code
from celery.task import task


@task(queue = 'windows')
def drm_encrypt(audio_file, key_phrase):
   ...


r = drm_encrypt.apply_async( args = [afile, key],
                               queue = 'windows' )


#Start celery worker with queues options
$ celery worker -Q windows
Retrying tasks
@task( default_retry_delay = 60,
      max_retries = 3 )
def drm_encrypt(audio_file, key_phrase):
   try:
       playready.encrypt(...)
   except Exception, exc:
       raise drm_encrypt.retry(exc=exc, countdown=5)
Retrying tasks
●   You can specify the number of times a task can
    be retried.
●   The cases for retrying a task must be handled
    within code. Celery will not do it automatically
●   The tasks should be designed to be idempotent
Handling worker failures
@task( acks_late = True )
def drm_encrypt(audio_file, key_phrase):
     try:
          playready.encrypt(...)
     except Exception, exc:
          raise drm_encrypt.retry(exc=exc, countdown=5)



●   This is used where the task must be resend in case of
    worker or node failure
●   The ack message to the message queue is sent after the
    task finishes executing
Worker processes

                                 Worker 1
                                (Windows)

                      windows
                                 Worker 2
                      windows   (Windows)
Sender        Msg Q
                                    .
                       linux
                                    .
                                    .
                                Worker N
                                 (Linux)
                                    Process 1
                                    Process 2

                                    Process N
Worker processes

                                 Worker 1
                                (Windows)

                      windows
                                 Worker 2
                      windows   (Windows)
Sender        Msg Q
                                    .
                       linux
                                    .
                                    .
                                Worker N
                                 (Linux)
                                    Process 1
                                    Process 2

                                    Process N
Worker process
●   In every worker node, celery starts a pool of
    worker processes
●   The number is determined by the concurrency
    setting (or autodetected – for full CPU usage)
●   Each processes can be configured to restart
    after running x number of tasks
    ●   Disabled by default
●   Alternately eventlet can be used instead of
    processes (discuss later)
Revoking tasks
celery.control.revoke( task_id,
                        terminate = False,
                        signal = 'SIGKILL' )
●
    revoke() works by sending a broadcast
    message to all workers
●   If a task has not yet run, workers will keep this
    task_id in memory and ensure that it does not
    run
●   If a task is running, revoke() will not work
    unless terminate = True
Task expiration
task.apply_async( expires = x )
        x can be
        * in seconds
        * a specific datetime()


●   Global time limits can be configured in settings
    ●   Soft time limit – the task receives an exception
        which can be used to cleanup
    ●   Hard time limit – the worker running the task is
        killed and is replaced with another one.
Handling soft time limit
@task()
def drm_encrypt(audio_file, key_phrase):
   Try:
          setup_tmp_files()
           SoftTimeLimitExceeded:

          playready.encrypt(...)
   except SoftTimeLimitExceeded:
          cleanup_tmp_files()
   except Exception, exc:
          raise drm_encrypt.retry(exc=exc, countdown=5)
Task Canvas
●   Chains – Linking one task to another
●   Groups – Execute several tasks in parallel
●   Chord – execute a task after a set of tasks has
    finished
●   Map and starmap – Similar to map() function
●   Chunks – divide an iterable of work into chunks
●   Chunks + Chord/chain can be used for map-
    reduce
                Best shown in a demo
Task trees

[ task 1 ] --- spawns --- [ task 2 ] ---- spawns -->   [ task 2_1 ]
                  |                                    [ task 2_3 ]
                  |
                  +------ [ task 3 ] ---- spawns -->   [ task 3_1 ]
                  |                                    [ task 3_2 ]
                  |
                  +------ [ task 4 ] ---- links ---> [ task 5 ]
                                                         |(spawns)
                                                         |
                                                         |
                          [ task 8 ] <--- links <--- [ task 6 ]
                                                         |(spawns)
                                                     [ task 7 ]
Task Trees
●   Home grown solution (our current approach)
    ●   Use db models and keep track of trees
●   Better approach
    ●   Use celery-tasktree
    ●   http://pypi.python.org/pypi/celery-tasktree
Celery Batches
●   Collect jobs and execute it in a batch.
●   Can be used for stats collection
●   Batch execution is done once
    ●   a configured timeout is reached OR
    ●   a configured number of tasks have been received
●   Useful for reducing n/w and db loads
Celery Batches
from celery.contrib.batches import Batches
@task( base=Batches, flush_every=50, flush_interval=10 )
def collect_stats( requests ):
   items = {}
   for request in requests:
       item_id = request.kwargs['item_id']
       items[ item_id ] = get_obj( item_id )
       items[ item_id ].count += 1
   # Sync to db


collect_stats.delay( item_id = 45 )
collect_stats.delay( item_id = 57 )
Celery monitoring
●   Celery Flower
    https://github.com/mher/flower
●   Django admin monitor
●   Celery jobstatic
    http://pypi.python.org/pypi/jobtastic
Celery deployment
●   Cyme – celery instance manager
    https://github.com/celery/cyme
●   Celery autoscaling
●   Use celery eventlet where required

Weitere ähnliche Inhalte

Was ist angesagt?

Was ist angesagt? (20)

Django Celery
Django Celery Django Celery
Django Celery
 
Top 10 RxJs Operators in Angular
Top 10 RxJs Operators in Angular Top 10 RxJs Operators in Angular
Top 10 RxJs Operators in Angular
 
Asynchronous JavaScript Programming with Callbacks & Promises
Asynchronous JavaScript Programming with Callbacks & PromisesAsynchronous JavaScript Programming with Callbacks & Promises
Asynchronous JavaScript Programming with Callbacks & Promises
 
Prometheus Celery Exporter
Prometheus Celery ExporterPrometheus Celery Exporter
Prometheus Celery Exporter
 
Celery - A Distributed Task Queue
Celery - A Distributed Task QueueCelery - A Distributed Task Queue
Celery - A Distributed Task Queue
 
Introduction to RxJS
Introduction to RxJSIntroduction to RxJS
Introduction to RxJS
 
Angular and The Case for RxJS
Angular and The Case for RxJSAngular and The Case for RxJS
Angular and The Case for RxJS
 
JUnit5 and TestContainers
JUnit5 and TestContainersJUnit5 and TestContainers
JUnit5 and TestContainers
 
Kotlin Coroutines in Practice @ KotlinConf 2018
Kotlin Coroutines in Practice @ KotlinConf 2018Kotlin Coroutines in Practice @ KotlinConf 2018
Kotlin Coroutines in Practice @ KotlinConf 2018
 
Les dessous du framework spring
Les dessous du framework springLes dessous du framework spring
Les dessous du framework spring
 
Data processing with celery and rabbit mq
Data processing with celery and rabbit mqData processing with celery and rabbit mq
Data processing with celery and rabbit mq
 
Kotlin Coroutines. Flow is coming
Kotlin Coroutines. Flow is comingKotlin Coroutines. Flow is coming
Kotlin Coroutines. Flow is coming
 
L'API Collector dans tous ses états
L'API Collector dans tous ses étatsL'API Collector dans tous ses états
L'API Collector dans tous ses états
 
Introduction to Coroutines @ KotlinConf 2017
Introduction to Coroutines @ KotlinConf 2017Introduction to Coroutines @ KotlinConf 2017
Introduction to Coroutines @ KotlinConf 2017
 
Celery with python
Celery with pythonCelery with python
Celery with python
 
RxJS & Angular Reactive Forms @ Codemotion 2019
RxJS & Angular Reactive Forms @ Codemotion 2019RxJS & Angular Reactive Forms @ Codemotion 2019
RxJS & Angular Reactive Forms @ Codemotion 2019
 
Sequelize
SequelizeSequelize
Sequelize
 
An Introduction to JUnit 5 and how to use it with Spring boot tests and Mockito
An Introduction to JUnit 5 and how to use it with Spring boot tests and MockitoAn Introduction to JUnit 5 and how to use it with Spring boot tests and Mockito
An Introduction to JUnit 5 and how to use it with Spring boot tests and Mockito
 
Why Task Queues - ComoRichWeb
Why Task Queues - ComoRichWebWhy Task Queues - ComoRichWeb
Why Task Queues - ComoRichWeb
 
Callbacks, Promises, and Coroutines (oh my!): Asynchronous Programming Patter...
Callbacks, Promises, and Coroutines (oh my!): Asynchronous Programming Patter...Callbacks, Promises, and Coroutines (oh my!): Asynchronous Programming Patter...
Callbacks, Promises, and Coroutines (oh my!): Asynchronous Programming Patter...
 

Ähnlich wie Advanced task management with Celery

Scaling Django with gevent
Scaling Django with geventScaling Django with gevent
Scaling Django with gevent
Mahendra M
 
Programming with Threads in Java
Programming with Threads in JavaProgramming with Threads in Java
Programming with Threads in Java
koji lin
 
Numpy Meetup 07/02/2013
Numpy Meetup 07/02/2013Numpy Meetup 07/02/2013
Numpy Meetup 07/02/2013
Francesco
 

Ähnlich wie Advanced task management with Celery (20)

SWT Tech Sharing: Node.js + Redis
SWT Tech Sharing: Node.js + RedisSWT Tech Sharing: Node.js + Redis
SWT Tech Sharing: Node.js + Redis
 
Apache Spark Internals
Apache Spark InternalsApache Spark Internals
Apache Spark Internals
 
internals
internalsinternals
internals
 
Internals
InternalsInternals
Internals
 
Celery
CeleryCelery
Celery
 
Concurrency in Swift
Concurrency in SwiftConcurrency in Swift
Concurrency in Swift
 
Trevor McDonald - Nagios XI Under The Hood
Trevor McDonald  - Nagios XI Under The HoodTrevor McDonald  - Nagios XI Under The Hood
Trevor McDonald - Nagios XI Under The Hood
 
Dragoncraft Architectural Overview
Dragoncraft Architectural OverviewDragoncraft Architectural Overview
Dragoncraft Architectural Overview
 
Robust C++ Task Systems Through Compile-time Checks
Robust C++ Task Systems Through Compile-time ChecksRobust C++ Task Systems Through Compile-time Checks
Robust C++ Task Systems Through Compile-time Checks
 
Celery introduction
Celery introductionCelery introduction
Celery introduction
 
Batch Processing with Amazon EC2 Container Service
Batch Processing with Amazon EC2 Container ServiceBatch Processing with Amazon EC2 Container Service
Batch Processing with Amazon EC2 Container Service
 
Async and parallel patterns and application design - TechDays2013 NL
Async and parallel patterns and application design - TechDays2013 NLAsync and parallel patterns and application design - TechDays2013 NL
Async and parallel patterns and application design - TechDays2013 NL
 
Troubleshooting MySQL from a MySQL Developer Perspective
Troubleshooting MySQL from a MySQL Developer PerspectiveTroubleshooting MySQL from a MySQL Developer Perspective
Troubleshooting MySQL from a MySQL Developer Perspective
 
NovaProva, a new generation unit test framework for C programs
NovaProva, a new generation unit test framework for C programsNovaProva, a new generation unit test framework for C programs
NovaProva, a new generation unit test framework for C programs
 
Scaling Django with gevent
Scaling Django with geventScaling Django with gevent
Scaling Django with gevent
 
Kernel Recipes 2018 - KernelShark 1.0; What's new and what's coming - Steven ...
Kernel Recipes 2018 - KernelShark 1.0; What's new and what's coming - Steven ...Kernel Recipes 2018 - KernelShark 1.0; What's new and what's coming - Steven ...
Kernel Recipes 2018 - KernelShark 1.0; What's new and what's coming - Steven ...
 
Programming with Threads in Java
Programming with Threads in JavaProgramming with Threads in Java
Programming with Threads in Java
 
Introduction to map reduce
Introduction to map reduceIntroduction to map reduce
Introduction to map reduce
 
Yevhen Tatarynov "From POC to High-Performance .NET applications"
Yevhen Tatarynov "From POC to High-Performance .NET applications"Yevhen Tatarynov "From POC to High-Performance .NET applications"
Yevhen Tatarynov "From POC to High-Performance .NET applications"
 
Numpy Meetup 07/02/2013
Numpy Meetup 07/02/2013Numpy Meetup 07/02/2013
Numpy Meetup 07/02/2013
 

Kürzlich hochgeladen

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Kürzlich hochgeladen (20)

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 

Advanced task management with Celery

  • 1. Advanced Task Management in Celery Mahendra M @mahendra https://github.com/mahendra
  • 2. @mahendra ● Python developer for 6 years ● FOSS enthusiast/volunteer for 14 years ● Bangalore LUG and Infosys LUG ● FOSS.in and LinuxBangalore/200x ● Celery user for 3 years ● Contributions ● patches, testing new releases ● Zookeeper msg transport for kombu ● Kafka support (in-progress)
  • 3. Quick Intro to Celery ● Asynchronous task/job queue ● Uses distributed message passing ● Tasks are run asynchronously on worker nodes ● Results are passed back to the caller (if any)
  • 4. Overview Worker 1 Worker 2 Sender Msg Q . . . Worker N
  • 5. Sample Code from celery.task import task @task def add(x, y): return x + y result = add.delay(5,6) result.get()
  • 6. Uses of Celery ● Asynchronous task processing ● Handling long running / heavy jobs ● Image resizing, video transcode, PDF generation ● Offloading heavy web backend operations ● Scheduling tasks to be run at a particular time ● Cron for python
  • 7. Advanced Uses ● Task Routing ● Task retries, timeout and revoking ● Task Canvas – combining tasks ● Task co-ordination ● Dependencies ● Task trees or graphs ● Batch tasks ● Progress monitoring ● Tricks ● DB conflict management
  • 8. Sending tasks to a particular worker Worker 1 (Windows) windows Worker 2 windows (Windows) Sender Msg Q . linux . . Worker N (Linux)
  • 9. Routing tasks – Use cases ● Priority execution ● Based on hardware capabilities ● Special cards available for video capture ● Making use of GPUs (CUDA) ● Based on OS (for eg. Playready encryption) ● Based on location ● Moving compute closer to data (Hadoop-ish) ● Sending tasks to different data centers ● Sequencing operations (CouchDB conflicts)
  • 10. Sample Code from celery.task import task @task(queue = 'windows') def drm_encrypt(audio_file, key_phrase): ... r = drm_encrypt.apply_async( args = [afile, key], queue = 'windows' ) #Start celery worker with queues options $ celery worker -Q windows
  • 11. Retrying tasks @task( default_retry_delay = 60, max_retries = 3 ) def drm_encrypt(audio_file, key_phrase): try: playready.encrypt(...) except Exception, exc: raise drm_encrypt.retry(exc=exc, countdown=5)
  • 12. Retrying tasks ● You can specify the number of times a task can be retried. ● The cases for retrying a task must be handled within code. Celery will not do it automatically ● The tasks should be designed to be idempotent
  • 13. Handling worker failures @task( acks_late = True ) def drm_encrypt(audio_file, key_phrase): try: playready.encrypt(...) except Exception, exc: raise drm_encrypt.retry(exc=exc, countdown=5) ● This is used where the task must be resend in case of worker or node failure ● The ack message to the message queue is sent after the task finishes executing
  • 14. Worker processes Worker 1 (Windows) windows Worker 2 windows (Windows) Sender Msg Q . linux . . Worker N (Linux) Process 1 Process 2 Process N
  • 15. Worker processes Worker 1 (Windows) windows Worker 2 windows (Windows) Sender Msg Q . linux . . Worker N (Linux) Process 1 Process 2 Process N
  • 16. Worker process ● In every worker node, celery starts a pool of worker processes ● The number is determined by the concurrency setting (or autodetected – for full CPU usage) ● Each processes can be configured to restart after running x number of tasks ● Disabled by default ● Alternately eventlet can be used instead of processes (discuss later)
  • 17. Revoking tasks celery.control.revoke( task_id, terminate = False, signal = 'SIGKILL' ) ● revoke() works by sending a broadcast message to all workers ● If a task has not yet run, workers will keep this task_id in memory and ensure that it does not run ● If a task is running, revoke() will not work unless terminate = True
  • 18. Task expiration task.apply_async( expires = x ) x can be * in seconds * a specific datetime() ● Global time limits can be configured in settings ● Soft time limit – the task receives an exception which can be used to cleanup ● Hard time limit – the worker running the task is killed and is replaced with another one.
  • 19. Handling soft time limit @task() def drm_encrypt(audio_file, key_phrase): Try: setup_tmp_files() SoftTimeLimitExceeded: playready.encrypt(...) except SoftTimeLimitExceeded: cleanup_tmp_files() except Exception, exc: raise drm_encrypt.retry(exc=exc, countdown=5)
  • 20. Task Canvas ● Chains – Linking one task to another ● Groups – Execute several tasks in parallel ● Chord – execute a task after a set of tasks has finished ● Map and starmap – Similar to map() function ● Chunks – divide an iterable of work into chunks ● Chunks + Chord/chain can be used for map- reduce Best shown in a demo
  • 21. Task trees [ task 1 ] --- spawns --- [ task 2 ] ---- spawns --> [ task 2_1 ] | [ task 2_3 ] | +------ [ task 3 ] ---- spawns --> [ task 3_1 ] | [ task 3_2 ] | +------ [ task 4 ] ---- links ---> [ task 5 ] |(spawns) | | [ task 8 ] <--- links <--- [ task 6 ] |(spawns) [ task 7 ]
  • 22. Task Trees ● Home grown solution (our current approach) ● Use db models and keep track of trees ● Better approach ● Use celery-tasktree ● http://pypi.python.org/pypi/celery-tasktree
  • 23. Celery Batches ● Collect jobs and execute it in a batch. ● Can be used for stats collection ● Batch execution is done once ● a configured timeout is reached OR ● a configured number of tasks have been received ● Useful for reducing n/w and db loads
  • 24. Celery Batches from celery.contrib.batches import Batches @task( base=Batches, flush_every=50, flush_interval=10 ) def collect_stats( requests ): items = {} for request in requests: item_id = request.kwargs['item_id'] items[ item_id ] = get_obj( item_id ) items[ item_id ].count += 1 # Sync to db collect_stats.delay( item_id = 45 ) collect_stats.delay( item_id = 57 )
  • 25. Celery monitoring ● Celery Flower https://github.com/mher/flower ● Django admin monitor ● Celery jobstatic http://pypi.python.org/pypi/jobtastic
  • 26. Celery deployment ● Cyme – celery instance manager https://github.com/celery/cyme ● Celery autoscaling ● Use celery eventlet where required