Opencast Job Dispatching
Greg Logan
gregorydlogan@gmail.com
February 15, 2018
Greg Logan February 15, 2018 1 / 30
Housekeeping
This is going to be a deeply technical talk
If reality seems to be imploding...
Feel free to zone out for a bit
Ask questions
This presentation abuses UML
This is being recorded
Shout questions as you think of them
Opencast Job Dispatching
Overview
Quick review: Services, and how they are registered
Anatomy of a job
How is a job created?
How does a job get dispatched?
What is a workflow? How does it differ from a job?
How is a workflow created?
(Relatively) complete workflow in steps, a descent into madness
Quick Review: Service and Service Registration
Opencast services register themselves with the service registry
This registry is local
The database synchronizes the registrations through the cluster
Local services talk directly to the local service registry
Remote services talk to their remote, which talks to its local registry
The architecture behind all of this was explained in the previous talk
Anatomy of a Job
What is an Opencast Job?
Database object
A representation of a unit of work within Opencast
A way to asynchronously keep track of your operations!
Contains the data for a full operation (e.g., the encode of a stream)
19 fields!
Status
Creating Service Type
Operation
Dispatchable
Job Load
Blocking Job
Blocked By
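The fields named above can be pictured as a small record. This is a hypothetical Python sketch, not Opencast's actual job class: the field names, types, defaults, the status subset, and the "composer" service type are all illustrative assumptions, and only a few of the 19 fields are shown.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    # Illustrative subset only; the real list of states is longer.
    QUEUED = "queued"
    RUNNING = "running"
    FINISHED = "finished"
    FAILED = "failed"


@dataclass
class Job:
    id: int
    status: Status
    creating_service_type: str          # type of the service that created the job
    operation: str                      # e.g. "encode"
    dispatchable: bool = True           # False: handled by the host that created it
    job_load: float = 1.0               # a counter, not a measured hardware cost
    blocking_job: Optional[int] = None  # a job id (exact semantics assumed)
    blocked_by: Optional[int] = None    # a job id (exact semantics assumed)


# "composer" is a placeholder service type for this example.
encode = Job(id=1, status=Status.QUEUED,
             creating_service_type="composer", operation="encode")
print(encode.operation, encode.status.name, encode.job_load)
```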
Job Creation
How is a job created?
A job is created by the service registry (SR) when an operation is started
Each encode generates a job, as does each publish
These jobs may spawn subjobs
An encode nearly always spawns an inspect job
Jobs can block waiting for their children
Jobs can block waiting for resources(*)
An undispatchable job is handled by the host which created it
Ingest
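A toy model of the points above: the registry is the only place jobs are created, a job may spawn subjobs, and a parent blocks while its children are unfinished. This is a minimal in-memory sketch; the class and method names are assumptions made for illustration and do not mirror Opencast's API.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import List, Optional


@dataclass
class Job:
    id: int
    operation: str
    dispatchable: bool
    parent: Optional["Job"] = field(default=None, repr=False)
    children: List["Job"] = field(default_factory=list, repr=False)
    done: bool = False

    def blocked(self) -> bool:
        # A job waits while any of its child jobs is unfinished.
        return any(not child.done for child in self.children)


class ServiceRegistry:
    """Toy registry: every job is created here."""

    def __init__(self) -> None:
        self._ids = count(1)
        self.jobs: List[Job] = []

    def create_job(self, operation: str, parent: Optional[Job] = None,
                   dispatchable: bool = True) -> Job:
        job = Job(next(self._ids), operation, dispatchable, parent)
        if parent is not None:
            parent.children.append(job)
        self.jobs.append(job)
        return job


registry = ServiceRegistry()
encode_job = registry.create_job("encode")
inspect_job = registry.create_job("inspect", parent=encode_job)
before = encode_job.blocked()   # the inspect subjob is still open
inspect_job.done = True
after = encode_job.blocked()    # no unfinished children remain
print(before, after)
```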
Job Dispatching: The basics
Job dispatching
This is where the sausage gets made
This is very simplified from the actual code
Job Dispatching: The (initial) sausage factory
function dispatchJobs(List jobs)
    for all job in jobs do
        serviceType ← job.serviceType
        candidateServices ← getServicesOfType(serviceType)
        serviceId ← dispatchJob(job, candidateServices)

function dispatchJob(Job job, List services)
    for all service in services do
        accepter ← HTTP.POST(job, service)
        if accepter ≠ null then return accepter.id
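The same naive loop as a runnable Python sketch. A service is modelled here as a plain callable that returns its id when it accepts a job and None when it refuses; that stands in for the HTTP POST and is an assumption of this example, as are the dictionary keys and host names.

```python
from typing import Callable, Dict, Iterable, List, Optional

# A service returns its id if it accepts the job, or None if it refuses.
Service = Callable[[dict], Optional[str]]


def dispatch_job(job: dict, services: Iterable[Service]) -> Optional[str]:
    """Offer the job to each service in turn; return the first accepter's id."""
    for service in services:
        accepter_id = service(job)
        if accepter_id is not None:
            return accepter_id
    return None  # nobody accepted, so the job stays undispatched


def dispatch_jobs(jobs: List[dict],
                  services_by_type: Dict[str, List[Service]]) -> Dict[int, Optional[str]]:
    """Try to dispatch every job to a service of the matching type."""
    result = {}
    for job in jobs:
        candidates = services_by_type.get(job["service_type"], [])
        result[job["id"]] = dispatch_job(job, candidates)
    return result


def busy(job: dict) -> Optional[str]:
    return None           # always refuses


def idle(job: dict) -> Optional[str]:
    return "worker-2"     # always accepts


assignments = dispatch_jobs(
    [{"id": 1, "service_type": "composer"}],
    {"composer": [busy, idle]},
)
print(assignments)
```

Note that the first service in the list is always asked first, which is exactly the fairness problem raised next.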
Job Dispatching: Weak Sausages
There are a number of issues here
Service fairness
Service load
Job load
Priority/Failed jobs
Job Dispatching: Service and Job Load
Job Load values
... are not the actual hardware cost to run a job
... are completely arbitrary
... should be thought of as a counter, rather than a load average
Job Dispatching: Service and Job Load
Service Load values
... are the sum of the Jobs currently in the RUNNING state
... do not represent the real load on the system
Job Dispatching: Service and Job Load
So what’s the point of the load value?
Each node/host defines a maximum load for itself
Typically this is equal to the number of processor cores
The node will be assigned at most that much load
Σ(jobs.load) <= node.maxload
If node.maxload = 8
job.load = 2 → 4 jobs
job.load = 4 → 2 jobs
job.load > 4 → 1 job
Job load can be fractional!
Job load can be negative!
Don’t do this...
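The arithmetic above, as a small Python sketch. The function names are made up for this example; the only rule taken from the slide is that the summed load of a node's jobs must stay at or below its maximum load.

```python
def can_accept(running_loads, job_load, max_load):
    """True if adding job_load keeps the node at or under max_load."""
    return sum(running_loads) + job_load <= max_load


def max_concurrent(job_load, max_load):
    """How many identical jobs fit on an otherwise idle node."""
    if job_load <= 0:
        # Zero or negative loads would never fill the node.
        raise ValueError("job_load must be positive")
    return int(max_load // job_load)


print(max_concurrent(2, 8))      # 4
print(max_concurrent(4, 8))      # 2
print(max_concurrent(5, 8))      # 1
print(max_concurrent(0.5, 8))    # 16 (fractional loads are fine)
print(can_accept([4, 2], 2, 8))  # True: 4 + 2 + 2 = 8
print(can_accept([4, 2], 3, 8))  # False: 4 + 2 + 3 = 9
```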
Job Dispatching: Service and Job Load
Aside: Neat Tricks
Specialist nodes
Really really good at one thing
On the specialist node, set that job’s cost to very small (zero?)
Set that job’s cost to greater than node.maxload everywhere else
On the specialist node, set the rest of the costs to greater than node.maxload
That job will only run on that hardware
This can block processing!
Current bug: Cheaper encoding not prioritized (MH-12493)
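A sketch of the specialist-node trick, assuming a node refuses any job whose configured cost exceeds its maximum load. The node names, operations, and cost values below are hypothetical.

```python
# Hypothetical per-node cost tables: node name -> {operation: cost}.
MAX_LOAD = 8
costs = {
    "gpu-node": {"encode": 0.1, "inspect": 9, "publish": 9},
    "worker-1": {"encode": 9, "inspect": 1, "publish": 1},
    "worker-2": {"encode": 9, "inspect": 1, "publish": 1},
}


def eligible_nodes(operation, cost_tables=costs, max_load=MAX_LOAD):
    """Nodes whose configured cost for the operation fits under max_load."""
    return sorted(node for node, table in cost_tables.items()
                  if table[operation] <= max_load)


print(eligible_nodes("encode"))   # ['gpu-node']
print(eligible_nodes("inspect"))  # ['worker-1', 'worker-2']

# Without the specialist node, nothing can take encode jobs any more.
without_gpu = {name: table for name, table in costs.items() if name != "gpu-node"}
print(eligible_nodes("encode", without_gpu))  # []
```

The last call shows how this setup can block processing: if the specialist node is unavailable, its job type has nowhere to go.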
Job Dispatching: Service and Job Load
Taking the safeties off
Each node/host defines a maximum load for itself
If the cost for a job exceeds maxload for all nodes, the job never processes
org.opencastproject.job.load.acceptexceeding
This is true by default
Setting this to false is safe
Set this to false prior to changing job loads
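Before changing job loads, it can help to check for jobs that could never run. The sketch below assumes that nodes refuse any job whose cost exceeds their maximum load and that each operation has one cost cluster-wide; the operation names and numbers are invented for illustration.

```python
def never_runnable(job_costs, node_max_loads):
    """Operations whose cost exceeds the maximum load of every node."""
    biggest = max(node_max_loads.values())
    return sorted(op for op, cost in job_costs.items() if cost > biggest)


job_costs = {"encode": 4, "inspect": 0.5, "transcode-4k": 16}
node_max_loads = {"worker-1": 8, "worker-2": 12}
stuck = never_runnable(job_costs, node_max_loads)
print(stuck)  # ['transcode-4k']
```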
Job Dispatching: Accounting for Load
function mainDispatch( )
    repeat
        jobs ← getAllJobs( )
        dispatchJobs(jobs)
    until shutdown

function dispatchJobs(List jobs)
    for all job in jobs do
        serviceType ← job.serviceType
        candidateServices ← getServicesOfType(serviceType)
        candidateServices ← filterServicesByLoad(candidateServices, job.load)
        serviceId ← dispatchJob(job, candidateServices)
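One way the load filter could look, as a hedged Python sketch. The Service record and its fields are assumptions for this example. Ordering the survivors by relative load is also an addition of this sketch, one simple answer to the fairness issue raised earlier, and is not taken from the slides.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Service:
    host: str
    current_load: float   # summed load of the RUNNING jobs on this host
    max_load: float       # the host's configured maximum load


def filter_services_by_load(services: List[Service], job_load: float) -> List[Service]:
    """Keep services whose host has room for the job, least loaded first."""
    fits = [s for s in services if s.current_load + job_load <= s.max_load]
    return sorted(fits, key=lambda s: s.current_load / s.max_load)


services = [Service("a", 7, 8), Service("b", 2, 8), Service("c", 0, 4)]
candidates = [s.host for s in filter_services_by_load(services, 2)]
print(candidates)  # ['c', 'b']; host "a" would reach 9 > 8
```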
Job Dispatching: Priority
One thing people always want:
How can I make this recording process in front of that one?
This isn’t that
MH-6850
This is for handling undispatchable, failed, and queued jobs
Undispatchable: No service accepted them
Failed: Did not complete successfully
Queued: New jobs
Job Dispatching: Accounting for Priority
function mainDispatch( )
    repeat
        jobs ← getPriorityJobs( )
        dispatchJobs(jobs)
        jobs ← getRestartJobs( )
        dispatchJobs(jobs)
        jobs ← getQueuedJobs( )
        dispatchJobs(jobs)
        jobs ← getAllJobs( )
        dispatchJobs(jobs)
    until shutdown
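The pass structure can be sketched generically: each pass fetches one category of jobs, and earlier passes get first pick of whatever capacity is free. The job names are placeholders, and the fixed number of rounds stands in for "repeat ... until shutdown"; both are assumptions of this example.

```python
def main_dispatch(passes, dispatch_jobs, rounds=1):
    """Run every pass in order, once per round."""
    for _ in range(rounds):              # stand-in for "repeat ... until shutdown"
        for get_jobs in passes:
            dispatch_jobs(get_jobs())


dispatched = []
passes = [
    lambda: ["priority-1"],              # plays the role of getPriorityJobs
    lambda: ["restart-1"],               # plays the role of getRestartJobs
    lambda: ["queued-1", "queued-2"],    # plays the role of getQueuedJobs
]
main_dispatch(passes, dispatched.extend)
print(dispatched)  # ['priority-1', 'restart-1', 'queued-1', 'queued-2']
```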
On to workflows
What is a workflow?
It’s a recording?
It’s a processing run for a recording?
It’s a collection of jobs
It’s a job with some metadata
The Workflow Service
The Workflow Service
Keeps track of all workflows
Organizes the creation of jobs
Organizes the sequence of jobs
Note that this is creation, not execution
The origin point of all work in the system
So how does this work?
Who calls the workflow service?
You do
Created via the admin UI
Created via ingest
You get a WorkflowInstance
Updating the workflow service takes the job ID!
What does this look like?
[Sequence diagram] Participants: User, AdminUI, WorkflowService, ServiceRegistry
Messages, in order: Start; .Start(); .createJob
Wait, what?
Some of you might have noticed that the previous sequence has problems
It just creates a job, then it stops
It does not actually do any processing
That’s because your workflow is a job
Job type: workflow
Job operation: START_WORKFLOW
This gets dispatched just like any other job
What does this look like?
[Sequence diagram] Participants: User, AdminUI, WorkflowService, ServiceRegistry
Messages, in order: Start; .Start(); .createJob(ST_WORKFLOW)
What does this look like?
[Sequence diagram] Participants: ServiceRegistry, WorkflowService
Messages, in order: .createJob(START_WORKFLOW); .process(); .createJob(START_OPERATION)
But wait, there’s more!
It begins
Everything is a job
It’s jobs all the way down
What is START_OPERATION?
It is a Workflow Job
We need to go deeper...
[Sequence diagram] Participants: ServiceRegistry, WorkflowService, SomeService
Messages, in order: .createJob(START_WORKFLOW); .process(); .createJob(START_OPERATION); process
Loop: for each workflow step
And deeper...
[Sequence diagram] Participants: ServiceRegistry, WorkflowService, SomeWOH, SomeService
Messages, in order: .process(); .createJob(); process; .start(); .foo()
Loop: for each workflow step
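The nesting can be simulated with a toy model. Everything here is an assumption made for illustration: the class and method names, the "composer" service, and above all the synchronous dispatch, which replaces the real dispatcher loop so the example fits on one page.

```python
from itertools import count


class ServiceRegistry:
    def __init__(self):
        self._ids = count(1)
        self.log = []        # (job id, job type, operation), in creation order
        self.handlers = {}   # job type -> callable(operation, payload)

    def create_job(self, job_type, operation, payload=None):
        job_id = next(self._ids)
        self.log.append((job_id, job_type, operation))
        # Toy shortcut: dispatch immediately and synchronously.
        self.handlers[job_type](operation, payload)
        return job_id


class WorkflowService:
    def __init__(self, registry, operation_handlers):
        self.registry = registry
        self.operation_handlers = operation_handlers  # step name -> callable
        registry.handlers["workflow"] = self.process

    def start(self, steps):
        # Starting a workflow only creates a job; processing happens later.
        return self.registry.create_job("workflow", "START_WORKFLOW", steps)

    def process(self, operation, payload):
        if operation == "START_WORKFLOW":
            for step in payload:                  # loop: for each workflow step
                self.registry.create_job("workflow", "START_OPERATION", step)
        elif operation == "START_OPERATION":
            self.operation_handlers[payload]()    # the operation handler runs


registry = ServiceRegistry()
registry.handlers["composer"] = lambda operation, payload: None  # does nothing


def encode_handler():
    # The handler asks a service to do the work, which creates yet another job.
    registry.create_job("composer", "encode")


workflows = WorkflowService(registry, {"encode": encode_handler})
workflows.start(["encode"])
for entry in registry.log:
    print(entry)
```

A single one-step workflow produces three jobs in this model: the workflow job, the operation job, and the service job.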
Wrapup
This was a long, complex talk
I hope I was clear
Please ask any questions you might have
This was actually simplified; there are at least two layers missing
Bonus points if you can guess what they are!
Greg Logan February 15, 2018 30 / 30

Weitere ähnliche Inhalte

Kürzlich hochgeladen

The Ultimate Guide to Performance Testing in Low-Code, No-Code Environments (...
The Ultimate Guide to Performance Testing in Low-Code, No-Code Environments (...The Ultimate Guide to Performance Testing in Low-Code, No-Code Environments (...
The Ultimate Guide to Performance Testing in Low-Code, No-Code Environments (...kalichargn70th171
 
Mastering Project Planning with Microsoft Project 2016.pptx
Mastering Project Planning with Microsoft Project 2016.pptxMastering Project Planning with Microsoft Project 2016.pptx
Mastering Project Planning with Microsoft Project 2016.pptxAS Design & AST.
 
Best Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh ITBest Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh ITmanoharjgpsolutions
 
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonLeveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonApplitools
 
eSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolseSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolsosttopstonverter
 
[ CNCF Q1 2024 ] Intro to Continuous Profiling and Grafana Pyroscope.pdf
[ CNCF Q1 2024 ] Intro to Continuous Profiling and Grafana Pyroscope.pdf[ CNCF Q1 2024 ] Intro to Continuous Profiling and Grafana Pyroscope.pdf
[ CNCF Q1 2024 ] Intro to Continuous Profiling and Grafana Pyroscope.pdfSteve Caron
 
Pros and Cons of Selenium In Automation Testing_ A Comprehensive Assessment.pdf
Pros and Cons of Selenium In Automation Testing_ A Comprehensive Assessment.pdfPros and Cons of Selenium In Automation Testing_ A Comprehensive Assessment.pdf
Pros and Cons of Selenium In Automation Testing_ A Comprehensive Assessment.pdfkalichargn70th171
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLionel Briand
 
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfEnhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfRTS corp
 
Strategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsStrategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsJean Silva
 
Advantages of Cargo Cloud Solutions.pptx
Advantages of Cargo Cloud Solutions.pptxAdvantages of Cargo Cloud Solutions.pptx
Advantages of Cargo Cloud Solutions.pptxRTS corp
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics
 
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdfAndrey Devyatkin
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorTier1 app
 
Keeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldKeeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldRoberto Pérez Alcolea
 
Introduction to Firebase Workshop Slides
Introduction to Firebase Workshop SlidesIntroduction to Firebase Workshop Slides
Introduction to Firebase Workshop Slidesvaideheekore1
 
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesAmazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesKrzysztofKkol1
 
Ronisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited CatalogueRonisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited Catalogueitservices996
 
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxThe Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxRTS corp
 
Understanding Plagiarism: Causes, Consequences and Prevention.pptx
Understanding Plagiarism: Causes, Consequences and Prevention.pptxUnderstanding Plagiarism: Causes, Consequences and Prevention.pptx
Understanding Plagiarism: Causes, Consequences and Prevention.pptxSasikiranMarri
 

Kürzlich hochgeladen (20)

The Ultimate Guide to Performance Testing in Low-Code, No-Code Environments (...
The Ultimate Guide to Performance Testing in Low-Code, No-Code Environments (...The Ultimate Guide to Performance Testing in Low-Code, No-Code Environments (...
The Ultimate Guide to Performance Testing in Low-Code, No-Code Environments (...
 
Mastering Project Planning with Microsoft Project 2016.pptx
Mastering Project Planning with Microsoft Project 2016.pptxMastering Project Planning with Microsoft Project 2016.pptx
Mastering Project Planning with Microsoft Project 2016.pptx
 
Best Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh ITBest Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh IT
 
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonLeveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
 
eSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration toolseSoftTools IMAP Backup Software and migration tools
eSoftTools IMAP Backup Software and migration tools
 
[ CNCF Q1 2024 ] Intro to Continuous Profiling and Grafana Pyroscope.pdf
[ CNCF Q1 2024 ] Intro to Continuous Profiling and Grafana Pyroscope.pdf[ CNCF Q1 2024 ] Intro to Continuous Profiling and Grafana Pyroscope.pdf
[ CNCF Q1 2024 ] Intro to Continuous Profiling and Grafana Pyroscope.pdf
 
Pros and Cons of Selenium In Automation Testing_ A Comprehensive Assessment.pdf
Pros and Cons of Selenium In Automation Testing_ A Comprehensive Assessment.pdfPros and Cons of Selenium In Automation Testing_ A Comprehensive Assessment.pdf
Pros and Cons of Selenium In Automation Testing_ A Comprehensive Assessment.pdf
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and Repair
 
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfEnhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
 
Strategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsStrategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero results
 
Advantages of Cargo Cloud Solutions.pptx
Advantages of Cargo Cloud Solutions.pptxAdvantages of Cargo Cloud Solutions.pptx
Advantages of Cargo Cloud Solutions.pptx
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
 
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
2024-04-09 - From Complexity to Clarity - AWS Summit AMS.pdf
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryError
 
Keeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository worldKeeping your build tool updated in a multi repository world
Keeping your build tool updated in a multi repository world
 
Introduction to Firebase Workshop Slides
Introduction to Firebase Workshop SlidesIntroduction to Firebase Workshop Slides
Introduction to Firebase Workshop Slides
 
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilitiesAmazon Bedrock in Action - presentation of the Bedrock's capabilities
Amazon Bedrock in Action - presentation of the Bedrock's capabilities
 
Ronisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited CatalogueRonisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited Catalogue
 
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxThe Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
 
Understanding Plagiarism: Causes, Consequences and Prevention.pptx
Understanding Plagiarism: Causes, Consequences and Prevention.pptxUnderstanding Plagiarism: Causes, Consequences and Prevention.pptx
Understanding Plagiarism: Causes, Consequences and Prevention.pptx
 

Opencast Job Dispatching

  • 1. Opencast Job Dispatching Greg Logan gregorydlogan@gmail.com February 15, 2018 Greg Logan February 15, 2018 1 / 30
  • 2. Housekeeping This is going to be a deeply technical talk Greg Logan February 15, 2018 2 / 30
  • 3. Housekeeping This is going to be a deeply technical talk If reality seems to be imploding... Feel free to zone out for a bit Ask questions Greg Logan February 15, 2018 2 / 30
  • 4. Housekeeping This is going to be a deeply technical talk If reality seems to be imploding... Feel free to zone out for a bit Ask questions This presentation abuses UML Greg Logan February 15, 2018 2 / 30
  • 5. Housekeeping This is going to be a deeply technical talk If reality seems to be imploding... Feel free to zone out for a bit Ask questions This presentation abuses UML This is being recorded Greg Logan February 15, 2018 2 / 30
  • 6. Housekeeping This is going to be a deeply technical talk If reality seems to be imploding... Feel free to zone out for a bit Ask questions This presentation abuses UML This is being recorded Shout questions as you think of them Greg Logan February 15, 2018 2 / 30
  • 7. Opencast Job Dispatching Overview Quick review: Services, and how they are registered Anatomy of a job How is a job created? How does a job get dispatched? What is a workflow? How does it differ from a job? How is a workflow created? (Relatively) complete workflow in steps, a descent into madness Greg Logan February 15, 2018 3 / 30
  • 8. Quick Review: Service and Service Registration Opencast services register themselves with the service registry This registry is local The database synchronizes the registrations through the cluster Local services talk directly to the local service registry Remote services talk to their remote, which talks to its local registry The architecture of how this all works was explained last talk Greg Logan February 15, 2018 4 / 30
  • 9. Anatomy of a Job What is an Opencast Job? Database object Greg Logan February 15, 2018 5 / 30
  • 10. Anatomy of a Job What is an Opencast Job? Database object A representation of a unit of work within Opencast Greg Logan February 15, 2018 5 / 30
  • 11. Anatomy of a Job What is an Opencast Job? Database object A representation of a unit of work within Opencast A way to asynchronously keep track of your operations! Greg Logan February 15, 2018 5 / 30
  • 12. Anatomy of a Job What is an Opencast Job? Database object A representation of a unit of work within Opencast A way to asynchronously keep track of your operations! Contains the data for a full operation (ie, encode of a stream) Greg Logan February 15, 2018 5 / 30
  • 13. Anatomy of a Job
      What is an Opencast Job?
          Database object
          A representation of a unit of work within Opencast
          A way to asynchronously keep track of your operations!
          Contains the data for a full operation (e.g., the encode of a stream)
          19 fields!
              Status
              Creating Service
              Type
              Operation
              Dispatchable
              Job Load
              Blocking Job
              Blocked By
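The fields named on this slide could be modeled roughly as below. This is a minimal Python sketch, not Opencast's actual (Java) job class: the attribute names, types, defaults, and the `FINISHED` state are assumptions made for illustration, and only eight of the 19 fields are shown.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Status(Enum):
    # QUEUED, RUNNING and FAILED are mentioned in the talk; FINISHED is assumed.
    QUEUED = "QUEUED"
    RUNNING = "RUNNING"
    FAILED = "FAILED"
    FINISHED = "FINISHED"


@dataclass
class Job:
    id: int
    creating_service: str           # which service/host created the job
    job_type: str                   # the dispatch pseudocode calls this serviceType
    operation: str                  # e.g. an encode or a publish
    status: Status = Status.QUEUED
    dispatchable: bool = True       # False: handled by the host that created it
    job_load: float = 1.0           # arbitrary counter, not a hardware cost
    blocking_job: Optional[int] = None                   # assumed: id of a related job
    blocked_by: List[int] = field(default_factory=list)  # assumed: ids of related jobs


encode = Job(id=1, creating_service="admin-node", job_type="composer", operation="Encode")
print(encode.status.value, encode.job_load)
```

A new job starts out queued and dispatchable in this sketch; the dispatcher shown later in the talk is what moves it to a service.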
  • 20. Job Creation
      How is a job created?
          A job is created by the service registry (SR) when an operation is started
          Each encode generates a job, as does each publish
          These jobs may spawn subjobs
              An encode nearly always spawns an inspect job
          Jobs can block waiting for their children
          Jobs can block waiting for resources(*)
          An undispatchable job is handled by the host which created it
              Ingest
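The parent/child relationship described here (an encode spawning an inspect job and blocking until it is done) can be sketched as follows. This is an illustrative Python model under stated assumptions, not Opencast code: `ServiceRegistry.create_job`, `is_blocked`, and the `Job` attributes are hypothetical names.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List, Optional


@dataclass
class Job:
    id: int
    operation: str
    parent_id: Optional[int] = None
    children: List[int] = field(default_factory=list)
    finished: bool = False


class ServiceRegistry:
    """Toy registry: the only component that creates jobs."""

    def __init__(self) -> None:
        self._ids = count(1)
        self.jobs: Dict[int, Job] = {}

    def create_job(self, operation: str, parent: Optional[Job] = None) -> Job:
        job = Job(id=next(self._ids), operation=operation,
                  parent_id=parent.id if parent else None)
        self.jobs[job.id] = job
        if parent is not None:
            parent.children.append(job.id)  # record the subjob on its parent
        return job

    def is_blocked(self, job: Job) -> bool:
        # A job blocks while any of its children has not finished.
        return any(not self.jobs[child].finished for child in job.children)


registry = ServiceRegistry()
encode = registry.create_job("encode")
inspect = registry.create_job("inspect", parent=encode)
print(registry.is_blocked(encode))   # blocked while the inspect job runs
inspect.finished = True
print(registry.is_blocked(encode))   # free to continue afterwards
```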
  • 21. Job Dispatching: The basics
      Job dispatching
          This is where the sausage gets made
          This is very simplified from the actual code
  • 22. Job Dispatching: The (initial) sausage factory
      function dispatchJobs(List jobs)
          for all job in jobs do
              serviceType ← job.serviceType
              candidateServices ← getServicesOfType(serviceType)
              serviceId ← dispatchJob(job, candidateServices)
      function dispatchJob(Job job, List services)
          for all service in services do
              accepter ← HTTP.POST(job, service)
              if accepter ≠ null then
                  return accepter.id
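The pseudocode on this slide translates into a small runnable sketch. It is written in Python purely for illustration; the `post` callable stands in for the HTTP POST to a service and, like the dictionary shapes used for jobs and services, is an assumption rather than Opencast's real interface.

```python
from typing import Callable, Dict, List, Optional

Job = Dict[str, object]
Service = Dict[str, object]
# post(job, service) returns the accepting service, or None if it refused the job.
Post = Callable[[Job, Service], Optional[Service]]


def dispatch_job(job: Job, services: List[Service], post: Post) -> Optional[object]:
    """Offer the job to each candidate in turn; return the id of the first accepter."""
    for service in services:
        accepter = post(job, service)
        if accepter is not None:
            return accepter["id"]
    return None  # nobody accepted: the job stays undispatched


def dispatch_jobs(jobs: List[Job], services_by_type: Dict[str, List[Service]],
                  post: Post) -> Dict[object, Optional[object]]:
    """Map each job id to the id of the service that accepted it (or None)."""
    assignments = {}
    for job in jobs:
        candidates = services_by_type.get(job["service_type"], [])
        assignments[job["id"]] = dispatch_job(job, candidates, post)
    return assignments


def only_service_two_accepts(job: Job, service: Service) -> Optional[Service]:
    return service if service["id"] == 2 else None


services = {"composer": [{"id": 1}, {"id": 2}]}
jobs = [{"id": 10, "service_type": "composer"}, {"id": 11, "service_type": "inspection"}]
print(dispatch_jobs(jobs, services, only_service_two_accepts))
```

Note that the first candidate is always asked first, which is where the fairness and load problems on the next slide come from.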
  • 23. Job Dispatching: Weak Sausages
      There are a number of issues here
          Service fairness
          Service load
          Job load
          Priority/Failed jobs
  • 26. Job Dispatching: Service and Job Load
      Job Load values
          ... are not the actual hardware cost to run a job
          ... are completely arbitrary
          ... should be thought of as a counter, rather than a load average
  • 28. Job Dispatching: Service and Job Load
      Service Load values
          ... are the sum of the loads of the Jobs currently in the RUNNING state
          ... do not represent the real load on the system
  • 35. Job Dispatching: Service and Job Load
      So what’s the point of the load value?
          Each node/host defines a maximum load for itself
              Typically this is equal to the number of processor cores
          The node will be assigned at most that much load
              sum(jobs.load) <= node.maxload
          If node.maxload = 8
              job.load = 2 → 4 jobs
              job.load = 4 → 2 jobs
              job.load > 4 → 1 job
          Job load can be fractional!
          Job load can be negative!
              Don’t do this...
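The arithmetic on this slide can be checked with a few lines of Python. The function name `max_identical_jobs` is made up for this sketch, and it assumes every job on the node has the same load and that the limit is enforced strictly.

```python
def max_identical_jobs(job_load: float, node_maxload: float) -> int:
    """How many jobs of one load fit while sum(jobs.load) <= node.maxload."""
    if job_load <= 0:
        # Zero or negative loads would admit an unbounded number of jobs,
        # which is why the talk says not to use negative values.
        raise ValueError("job_load must be positive for this calculation")
    return int(node_maxload // job_load)


for load in (2, 4, 5, 0.5):
    print(f"job.load = {load}: {max_identical_jobs(load, node_maxload=8)} jobs")
```

With `node_maxload=8` this gives 4, 2, 1 and 16 jobs respectively; the last case shows why a fractional load lets many cheap jobs share one node.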
  • 43. Job Dispatching: Service and Job Load
      Aside: Neat Tricks
          Specialist nodes
              Really really good at one thing
              On the specialist node, set that job’s cost to very small (zero?)
              Set that job’s cost to greater than node.maxload everywhere else
              On the specialist node, set the rest of the costs to greater than node.maxload
              That job will only run on that hardware
              This can block processing!
              Current bug: Cheaper encoding not prioritized (MH-12493)
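The specialist-node trick can be illustrated with a small Python sketch. The node names, job types, and load values below are invented, and the sketch assumes a node never accepts a job whose configured load exceeds its own maxload.

```python
from typing import Dict, List

# Hypothetical per-node configuration: every node carries its own load table.
NODES: Dict[str, dict] = {
    # The specialist: encode is nearly free here, everything else is priced out.
    "gpu-node": {"maxload": 8.0, "loads": {"encode": 0.1, "inspect": 9.0}},
    # Everyone else: encode is priced out, other work costs a normal amount.
    "worker-1": {"maxload": 8.0, "loads": {"encode": 9.0, "inspect": 1.0}},
    "worker-2": {"maxload": 8.0, "loads": {"encode": 9.0, "inspect": 1.0}},
}


def nodes_that_can_run(job_type: str, nodes: Dict[str, dict]) -> List[str]:
    """Nodes on which the job's configured load fits under that node's maxload."""
    return [name for name, cfg in nodes.items()
            if cfg["loads"][job_type] <= cfg["maxload"]]


print(nodes_that_can_run("encode", NODES))    # only the specialist
print(nodes_that_can_run("inspect", NODES))   # only the general workers

# "This can block processing!": without the specialist nobody can take encodes.
without_gpu = {name: cfg for name, cfg in NODES.items() if name != "gpu-node"}
print(nodes_that_can_run("encode", without_gpu))
```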
  • 49. Job Dispatching: Service and Job Load
      Taking the safeties off
          Each node/host defines a maximum load for itself
          If the cost for a job exceeds maxload for all nodes the job never processes
          org.opencastproject.job.load.acceptexceeding
              This is true by default
              Setting this to false is safe
              Set this to false prior to changing job loads
  • 50. Job Dispatching: Accounting for Load
      function mainDispatch( )
          repeat
              jobs ← getAllJobs( )
              dispatchJobs(jobs)
          until shutdown
      function dispatchJobs(List jobs)
          for all job in jobs do
              serviceType ← job.serviceType
              candidateServices ← getServicesOfType(serviceType)
              candidateServices ← filterServicesByLoad(candidateServices, job.load)
              serviceId ← dispatchJob(job, candidateServices)
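The load filter added in this version of the dispatcher might look like the sketch below. It is a Python illustration under assumptions: each service record is assumed to carry its host's current load and maxload (`host_load`, `host_maxload` are invented names), and the limit is treated as strict.

```python
from typing import Dict, List

Service = Dict[str, float]


def filter_services_by_load(services: List[Service], job_load: float) -> List[Service]:
    """Keep only services whose host still has room for this job's load."""
    return [s for s in services
            if s["host_load"] + job_load <= s["host_maxload"]]


def least_loaded_first(services: List[Service]) -> List[Service]:
    """Optional ordering step (an assumption, not on the slide) to spread work."""
    return sorted(services, key=lambda s: s["host_load"] / s["host_maxload"])


candidates = [
    {"id": 1, "host_load": 7.0, "host_maxload": 8.0},
    {"id": 2, "host_load": 2.0, "host_maxload": 8.0},
    {"id": 3, "host_load": 4.0, "host_maxload": 4.0},
]
fitting = filter_services_by_load(candidates, job_load=2.0)
print([s["id"] for s in least_loaded_first(fitting)])
```

Only service 2 has room for a job of load 2.0 here; service 1 would end up at 9.0 of 8.0 and service 3 is already full.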
  • 57. Job Dispatching: Priority
      One thing people always want: How can I make this recording process in front of that one?
          This isn’t that
          MH-6850
      This is for handling undispatchable, failed, and queued jobs
          Undispatchable: No service accepted them
          Failed: Did not complete successfully
          Queued: New jobs
  • 58. Job Dispatching: Accounting for Priority
      function mainDispatch( )
          repeat
              jobs ← getPriorityJobs( )
              dispatchJobs(jobs)
              jobs ← getRestartJobs( )
              dispatchJobs(jobs)
              jobs ← getQueuedJobs( )
              dispatchJobs(jobs)
              jobs ← getAllJobs( )
              dispatchJobs(jobs)
          until shutdown
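One pass of this loop can be sketched in Python as a list of job sources consulted in a fixed order. The function `dispatch_pass` and the rule that a job is dispatched at most once per pass are assumptions of this sketch; the slide does not say how the overlap between `getAllJobs` and the earlier queries is handled.

```python
from typing import Callable, Iterable, List

JobSource = Callable[[], Iterable[int]]


def dispatch_pass(sources: List[JobSource], dispatch: Callable[[int], None]) -> List[int]:
    """Dispatch jobs source by source, in order; return the resulting job order."""
    seen = set()
    order = []
    for get_jobs in sources:
        for job_id in get_jobs():
            if job_id in seen:
                continue  # already handled by a higher-priority source
            seen.add(job_id)
            dispatch(job_id)
            order.append(job_id)
    return order


# Hypothetical job ids standing in for the four queries in the pseudocode.
sources = [
    lambda: [3],            # getPriorityJobs
    lambda: [2],            # getRestartJobs
    lambda: [1],            # getQueuedJobs
    lambda: [1, 2, 3, 4],   # getAllJobs
]
dispatched = []
print(dispatch_pass(sources, dispatched.append))
```

The ordering of the sources is the whole mechanism: nothing here lets a user push one recording ahead of another, which matches the "This isn’t that" remark on the previous slide.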
  • 62. On to workflows
      What is a workflow?
          It’s a recording?
          It’s a processing run for a recording?
          It’s a collection of jobs
          It’s a job with some metadata
  • 67. The Workflow Service
      The Workflow Service
          Keeps track of all workflows
          Organizes the creation of jobs
          Organizes the sequence of jobs
          Note that this is creation, not execution
          The origin point of all work in the system
  • 68. So how does this work?
      Who calls the workflow service?
          You do
              Created via the admin UI
              Created via ingest
          You get a WorkflowInstance
          Updating the workflow service takes the job ID!
  • 69. What does this look like?
      [Sequence diagram. Participants: User, AdminUI, WorkflowService, ServiceRegistry. Messages, in order: Start, .Start(), .createJob]
  • 72. Wait, what?
      Some of you might have noticed that the previous sequence has problems
          It just creates a job, then it stops
          It does not actually do any processing
      That’s because your workflow is a job
          Job type: workflow
          Job operation: START_WORKFLOW
          This gets dispatched just like any other job
  • 73. What does this look like?
      [Sequence diagram. Participants: User, AdminUI, WorkflowService, ServiceRegistry. Messages, in order: Start, .Start(), .createJob(ST WORKFLOW)]
  • 75. What does this look like?
      [Sequence diagram. Participants: ServiceRegistry, WorkflowService. Messages, in order: .createJob(START_WORKFLOW), .process(), .createJob(START_OPERATION)]
  • 77. But wait, there’s more!
      It begins
          Everything is a job
          It’s jobs all the way down
      What is START_OPERATION?
          It is a Workflow Job
  • 78. We need to go deeper...
      [Sequence diagram. Participants: ServiceRegistry, WorkflowService. Messages, in order: .createJob(START_WORKFLOW), .process(), .createJob(START_OPERATION)]
  • 79. We need to go deeper...
      [Sequence diagram. Participants: ServiceRegistry, WorkflowService, SomeService. Messages, in order: createJob(START_WORKFLOW), .process(), .createJob(START_OPERATION), process. Loop: for each workflow step]
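The "everything is a job" chain in this diagram can be played through with a toy model. This Python sketch is a deliberate simplification and not Opencast's implementation: the class and method names, the FIFO queue standing in for the dispatcher, and the service types `composer` and `distribution` are all assumptions, and the workflow operation handler layer from the next slide is left out.

```python
from collections import deque


class ServiceRegistry:
    """Toy registry: queues jobs and hands each one to the matching service."""

    def __init__(self):
        self.queue = deque()
        self.services = {}
        self.log = []

    def register(self, service_type, service):
        self.services[service_type] = service

    def create_job(self, service_type, operation, payload=None):
        self.queue.append((service_type, operation, payload))

    def run(self):
        while self.queue:
            service_type, operation, payload = self.queue.popleft()
            self.log.append(f"{service_type}:{operation}")
            self.services[service_type].process(operation, payload)


class WorkflowService:
    def __init__(self, registry, steps):
        self.registry = registry
        self.steps = steps  # list of (service_type, operation) pairs

    def start(self):
        # Starting a workflow only creates a job; no processing happens yet.
        self.registry.create_job("workflow", "START_WORKFLOW")

    def process(self, operation, step_index):
        if operation == "START_WORKFLOW":
            self.registry.create_job("workflow", "START_OPERATION", 0)
        elif operation == "START_OPERATION":
            service_type, op = self.steps[step_index]
            self.registry.create_job(service_type, op)  # the actual work
            if step_index + 1 < len(self.steps):        # loop: next workflow step
                self.registry.create_job("workflow", "START_OPERATION", step_index + 1)


class SomeService:
    def process(self, operation, payload):
        pass  # a real service would encode, publish, and so on


registry = ServiceRegistry()
workflow = WorkflowService(registry, [("composer", "encode"), ("distribution", "publish")])
registry.register("workflow", workflow)
registry.register("composer", SomeService())
registry.register("distribution", SomeService())
workflow.start()
registry.run()
print("\n".join(registry.log))
```

Every line of the printed log is a job that went through the registry, including the workflow's own START_WORKFLOW and START_OPERATION jobs.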
  • 80. And deeper...
      [Sequence diagram. Participants: ServiceRegistry, WorkflowService, SomeWOH, SomeService. Messages, in order: .process(), .createJob(), process, .start(), .foo(). Loop: for each workflow step]
  • 81. Wrapup
      This was a long, complex talk
          I hope I was clear
          Please ask any questions you might have
      This was actually simplified, there are at least two layers missing
          Bonus points if you can guess what they are!