Netflix uses Conductor, an open source microservices orchestrator, to manage complex content processing workflows involving ingestion, encoding, localization, and delivery. Conductor provides visibility, control, and reuse of tasks through a task queuing system and workflow definitions. It has scaled to process millions of workflow executions across Netflix's content platform using a stateless architecture with Dynomite for storage and Dyno-Queues for task distribution.
2. Content Platform Engineering - CPE
〉 Studio In the Cloud
〉 Content Ingest from Studio Partners
〉 Title Setup - Making it live on Netflix.com
〉 Localization
3. CPE - Processes (some of many)
〉 Content Ingest & Delivery
〉 Title Setup
〉 IMF* Deliveries
〉 Encodes and Deployments
〉 Content Quality Control
〉 Content Localization
* IMF - Interoperable Master Format
4. Once Upon A Time ...
〉 Peer to Peer Messaging
〉 Tens of millions of messages per day
〉 Process flows embedded in applications
〉 Lack of control (STOP deployment!)
〉 Lack of visibility into progress
6. Peer to Peer
[Diagram: Application A, Application B, and Application C call one another directly through events / API calls to carry out Request Content, Content Inspection, Result, Encode, and Publish]
7. Peer to Peer
〉 Logical flow is not easily trackable
〉 Modifying steps is not easy (tightly coupled)
〉 Controlling flow is not possible
〉 Reusing tasks is not trivial
11. Conductor - Design Goals
〉 BYO Task (Reuse existing code)
〉 REST/HTTP support
〉 Extensible and Hackable
〉 JSON based DSL to define blueprint
〉 Scale Out Horizontally
〉 Visibility, Traceability & Control
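The "JSON based DSL" goal can be illustrated with a minimal sketch of a blueprint for the content flow used in this deck. The field names (name, version, tasks, taskReferenceName, type, inputParameters) follow Conductor's published workflow definition format as understood here and should be checked against the current documentation; the task names and the contentId input are assumptions made for illustration, not Netflix's actual definitions.

```python
import json

# Hypothetical blueprint for the four-step flow shown in this deck.
# Task names and input parameters are illustrative assumptions.
STEPS = ["request_content", "content_inspection", "encode", "publish"]


def build_blueprint(name, steps):
    """Return a workflow definition as a plain dict, one SIMPLE task per step."""
    return {
        "name": name,
        "description": "Sketch of a content ingest flow",
        "version": 1,
        "tasks": [
            {
                "name": step,
                "taskReferenceName": f"{step}_ref",
                "type": "SIMPLE",  # a task executed by a remote worker
                "inputParameters": {"contentId": "${workflow.input.contentId}"},
            }
            for step in steps
        ],
    }


blueprint = build_blueprint("content_ingest", STEPS)
blueprint_json = json.dumps(blueprint, indent=2)
print(blueprint_json)
```

Because the flow lives in data rather than in application code, reordering or adding a step means editing the blueprint, not redeploying the applications that run the tasks.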
12. Same Flow - New Flavor
Orchestration (Conductor)
Start → Request Content → Content Inspection → Result → Encode → Publish → Stop
Execution
〉 Application A - Task: Request Content
〉 Application B - Task: Content Inspection
〉 Application C - Task: Encode
〉 Application B - Task: Publish
13. Architecture
API
〉 Workflows - Start and manage workflows
〉 Metadata - Define blueprints and tasks
〉 Tasks - Get tasks from queue and execute
SERVICE
〉 Workflow Service
〉 Task Service
〉 Decider Service
〉 Queue Service
STORE
〉 Storage (Dynomite)
〉 Index (Elasticsearch)
14. Scaling up Conductor
〉 Peer-to-Peer - Scale horizontally
〉 Stateless server - state is persisted in database
〉 Storage scalability: Dynomite
〉 Workload scale: Dyno-Queues
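The "stateless server" point can be sketched as follows: because all workflow state lives in the store, any server instance can pick up where another left off. This is a minimal in-memory illustration; the class names SharedStore and StatelessServer are hypothetical, and the dictionary merely stands in for the database.

```python
class SharedStore:
    """Stand-in for the database; in the deck this role is played by Dynomite."""

    def __init__(self):
        self.workflows = {}


class StatelessServer:
    """Holds no workflow state of its own; every call reads and writes the store."""

    def __init__(self, store):
        self.store = store

    def start(self, workflow_id, steps):
        self.store.workflows[workflow_id] = {"steps": list(steps), "next": 0}

    def complete_next(self, workflow_id):
        """Mark the next pending step as done and return its name (None if finished)."""
        state = self.store.workflows[workflow_id]
        if state["next"] >= len(state["steps"]):
            return None
        done = state["steps"][state["next"]]
        state["next"] += 1
        return done


store = SharedStore()
server_a = StatelessServer(store)
server_b = StatelessServer(store)

server_a.start("wf-1", ["request_content", "content_inspection", "encode", "publish"])
first = server_a.complete_next("wf-1")
second = server_b.complete_next("wf-1")  # a different instance continues the same workflow
print(first, second)
```

Adding capacity then amounts to starting more server instances against the same store, which is why the scalability of the storage layer becomes the limiting concern.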
15. Storage
Dynomite
〉 Generic Dynamo implementation (Redis, Memcached)
〉 Multi-datacenter
〉 Highly available
〉 Peer-to-Peer
Elasticsearch
〉 Indexing workflow and task executions
〉 Verbose logging of worker executions
16. Dyno-Queues
〉 Distributed, lock-free queues used by Conductor
〉 OSS
〉 Apache 2.0 License
〉 https://github.com/Netflix/dyno-queues
〉 Delayed Queues
〉 Loose priorities and FIFO
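The queue semantics listed above (delayed visibility, priorities, FIFO among equals) can be illustrated with a small single-process sketch. This is not how Dyno-Queues is implemented; it is a toy model of the behavior, and the DelayedQueue class and its methods are hypothetical names. Time is passed in explicitly so the example is deterministic.

```python
import itertools


class DelayedQueue:
    """In-memory sketch: an item becomes visible only after its delay elapses."""

    def __init__(self):
        self._items = []               # (ready_at, priority, seq, payload)
        self._seq = itertools.count()  # insertion order, used for FIFO tie-breaks

    def push(self, payload, now, delay=0.0, priority=0):
        self._items.append((now + delay, priority, next(self._seq), payload))

    def pop(self, now):
        """Return the ready item with the lowest priority number, FIFO among ties."""
        ready = [item for item in self._items if item[0] <= now]
        if not ready:
            return None
        best = min(ready, key=lambda item: (item[1], item[2]))
        self._items.remove(best)
        return best[3]


q = DelayedQueue()
q.push("encode", now=0.0)
q.push("publish", now=0.0, delay=30.0)  # not visible until t=30
q.push("inspect", now=0.0)

order = [q.pop(now=1.0), q.pop(now=1.0), q.pop(now=1.0), q.pop(now=31.0)]
print(order)
```

A delayed queue of this kind is useful for retries and scheduled polling: a task can be pushed back with a delay instead of being retried immediately.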
33. Execution Flow
Participants
〉 App A, App B, App C
〉 Conductor - Workflow / Task Service, Decider / Queue Service
〉 Storage
〉 Task Queues
Tasks
〉 App A - Task: Request Content
〉 App B - Task: Content Inspection
〉 App C - Task: Encode
〉 App B - Task: Publish
Steps
1. Start content_ingest workflow
2. Get Workflow Definition
3. Schedule Task
4. Put in Queue
5. Poll For task
6. Execute & update task status
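The six steps of the execution flow can be traced in a minimal in-memory sketch. The dictionaries below are hypothetical stand-ins for Storage and Task Queues, and the function names are invented for illustration; they are not Conductor's API.

```python
from collections import deque

# Hypothetical in-memory stand-ins for Storage and Task Queues.
definitions = {
    "content_ingest": ["request_content", "content_inspection", "encode", "publish"]
}
task_queues = {}
executions = {}


def start_workflow(name, workflow_id):
    """Steps 1-4: start, load the definition, schedule the first task, enqueue it."""
    steps = definitions[name]  # 2. get workflow definition
    executions[workflow_id] = {"steps": steps, "index": 0, "completed": []}
    _schedule(workflow_id)     # 3. schedule task


def _schedule(workflow_id):
    run = executions[workflow_id]
    if run["index"] < len(run["steps"]):
        task = run["steps"][run["index"]]
        task_queues.setdefault(task, deque()).append(workflow_id)  # 4. put in queue


def poll(task_type):
    """Step 5: a worker asks for work of one task type."""
    queue = task_queues.get(task_type)
    return queue.popleft() if queue else None


def update_task(workflow_id, task_type, status):
    """Step 6: the worker reports status; the next task is then scheduled."""
    run = executions[workflow_id]
    if status == "COMPLETED":
        run["completed"].append(task_type)
        run["index"] += 1
        _schedule(workflow_id)


start_workflow("content_ingest", "wf-1")  # 1. start content_ingest workflow
for task_type in definitions["content_ingest"]:
    workflow_id = poll(task_type)
    update_task(workflow_id, task_type, "COMPLETED")

print(executions["wf-1"]["completed"])
```

Note that the applications never call one another: each only polls its own queue and reports a status, while the decision about what runs next stays with the orchestrator.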
34. Workers
〉 Orchestrator - Management / Execution Service, with Database and Index
〉 Trigger → Schedule Task → Task Queues
〉 Worker 1, Worker 2, Worker 3 ... Worker n
〉 Workers → Orchestrator (HTTP): Queue Poll
〉 Workers → Orchestrator (HTTP): Update Task Status
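The worker side of this picture (poll the queue, execute, update task status) can be sketched as a loop. FakeConductorClient is a hypothetical in-memory stand-in for the HTTP calls a real worker would make; its method names, the task dictionaries, and the encode handler are assumptions for illustration only.

```python
class FakeConductorClient:
    """Hypothetical stand-in for the HTTP calls a worker makes (poll, status update)."""

    def __init__(self, pending):
        self.pending = list(pending)  # tasks waiting in the queue
        self.updates = []             # (task_id, status, output) tuples sent back

    def poll(self, task_type):
        for i, task in enumerate(self.pending):
            if task["type"] == task_type:
                return self.pending.pop(i)
        return None

    def update(self, task_id, status, output):
        self.updates.append((task_id, status, output))


def run_once(client, task_type, handler):
    """Poll for one task, execute it, and report the outcome. Returns False if idle."""
    task = client.poll(task_type)
    if task is None:
        return False
    try:
        output = handler(task["input"])
        client.update(task["id"], "COMPLETED", output)
    except Exception as exc:  # report the failure instead of crashing the worker
        client.update(task["id"], "FAILED", {"error": str(exc)})
    return True


def encode(task_input):
    if "contentId" not in task_input:
        raise ValueError("contentId is required")
    return {"encoded": task_input["contentId"]}


client = FakeConductorClient([
    {"id": "t1", "type": "encode", "input": {"contentId": "c-42"}},
    {"id": "t2", "type": "encode", "input": {}},
])
while run_once(client, "encode", encode):
    pass
print(client.updates)
```

Since workers only pull work, scaling a task type is a matter of running more copies of the same worker against the same queue.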
38. Conductor @ Netflix
〉 In production for more than a year
〉 ~100 process flows
〉 ~200 tasks / services
〉 Average tasks per workflow: 6
〉 Largest workflow: 48 tasks
〉 ~4 million executions