CS 704D: Distributed Scheduling etc.

Scheduling
CS 704D Advanced OS

Scheduling Approaches
 Task assignment approach
 Schedules the tasks of a user-submitted process on
suitable nodes to improve performance
 Load balancing approach
 Distributes tasks across nodes to equalize the workload
of the nodes
 Load sharing approach
 Ensures that no node is idle while processes wait for a
processor

Desirable Features of a Good Scheduling Algorithm
 No a priori knowledge about the processes
 Dynamic in nature
 Quick decision making capability
 Balanced system performance and scheduling
overhead
 Stability
 Scalability
 Fault tolerance
 Fairness of service

Task Assignment Approach
The Basic Idea
 Assumptions (so that tasks can be assigned optimally)
 The process is already split into tasks at natural
boundaries, so that data transfers between tasks are
minimized
 The computation required by each task and the speed of
each processor are known
 The cost of executing each task on each node is known
 IPC costs between every pair of tasks are known
 Other constraints, such as the resource requirements of
tasks, the resources available at each node, and precedence
among tasks, are known
 Reassignment of tasks is generally not possible

Task Assignment Approach
The optimization to be done
 Minimize IPC costs
 Quick turnaround for the process
 High degree of parallelism
 Efficient utilization of system resources, in general
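
The cost-minimization part of the task assignment approach can be illustrated with a minimal sketch. It assumes the execution cost of every task on every node and the IPC cost between every pair of tasks are known, as the assumptions of this approach require. The function names and the cost tables are hypothetical, and the exhaustive search is only practical for very small inputs.

```python
from itertools import product


def assignment_cost(assign, exec_cost, ipc_cost):
    """Execution cost of each task on its node, plus the IPC cost of
    every pair of tasks that ends up on different nodes."""
    total = sum(exec_cost[task][node] for task, node in enumerate(assign))
    for i in range(len(assign)):
        for j in range(i + 1, len(assign)):
            if assign[i] != assign[j]:
                total += ipc_cost[i][j]
    return total


def best_assignment(exec_cost, ipc_cost):
    """Try every task-to-node mapping and keep the cheapest one."""
    n_tasks = len(exec_cost)
    n_nodes = len(exec_cost[0])
    return min(
        product(range(n_nodes), repeat=n_tasks),
        key=lambda assign: assignment_cost(assign, exec_cost, ipc_cost),
    )


# Hypothetical data: exec_cost[task][node], ipc_cost[task_a][task_b]
exec_cost = [[5, 10], [2, 3], [4, 4]]
ipc_cost = [[0, 6, 4], [6, 0, 8], [4, 8, 0]]
best = best_assignment(exec_cost, ipc_cost)
print(best, assignment_cost(best, exec_cost, ipc_cost))
```

With these hypothetical costs the cheapest mapping places all three tasks on node 0, which shows why minimizing IPC cost alone can conflict with the goal of a high degree of parallelism.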

Load Balancing Approach
 Static vs. dynamic
 Deterministic vs. probabilistic
 Centralized vs. Distributed
 Cooperative vs. Non-cooperative

Load Balancing Taxonomy
 Load balancing algorithms
   Static
     Deterministic
     Probabilistic
   Dynamic
     Centralized
     Distributed
       Cooperative
       Non-cooperative

Static vs. Dynamic
 Static: uses the average behavior of the system and ignores
its current state
 Simple, no need to process system state information
 But does not adjust to the current situation
 Dynamic: reacts to changes in system state
 Responds to the current state, avoiding needlessly poor
performance
 Provides better performance, but is more complex because
state information must be collected and processed

Deterministic vs. Probabilistic
 Deterministic algorithms allocate tasks based on
known properties of the nodes and of the processes to be
scheduled
 Probabilistic algorithms use information about
static attributes of the system, such as the number of
nodes, the processing capability of each node, and the
network topology, to formulate simple placement rules
 Deterministic algorithms are difficult to optimize and
cost more to implement

Centralized vs. Distributed
 A centralized server node assigns tasks based on state
information collected at that node. All other processor
nodes send their state to that node and keep it
updated.
 Quite efficient
 But the central node is a single point of failure
 One solution is to replicate the server k+1 times if k faults
are to be tolerated. The replicas must then be kept consistent
 If a delay of a few seconds is acceptable, the server can
instead be re-instantiated after a failure

Distributed Scheme
 k physically distributed entities
 Each is a local controller that runs concurrently
and asynchronously with the other entities
 Entities make decisions based on a system-wide
objective function
 In a fully distributed system, all N nodes are
scheduling entities: each schedules its local processes and
decides whether to accept remote processes

Cooperative vs. Non-cooperative
 In non-cooperative mode, entities make decisions
independently of other entities
 In cooperative mode, the entities cooperate to arrive at
a decision
 Cooperative algorithms are thus more complex and have
higher overheads
 Cooperative algorithms have better stability, though
Design Issues
 Load estimation policy
 Process transfer policy
 State information exchange policy
 Location policy
 Priority assignment policy
 Migration limiting policy

Load Estimation Policy
 The load on a node depends on parameters that are time
dependent and node dependent:
 Number of processes on the node at the time of
estimation
 Resource demands of these processes
 Instruction mixes of these processes
 Architecture and speed of the node’s processor

Load Estimation Methods
 A simple measure is the number of processes executing
at the node
 That may not be accurate, because the load really depends on
the remaining service time of those processes. Some methods
of estimating it:
 Memoryless method: assumes the remaining time is the same for
all processes, irrespective of the time each has already used.
The load estimate is thus just the number of processes
 Past repeats: the remaining service time of a process is equal
to the time it has used so far
 Distribution method: if the distribution of service times is
known, the remaining service time is the expected
remaining service time conditioned on the time already used
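
The three estimation methods can be compared with a minimal sketch. It assumes each process is described only by the CPU time it has used so far, and that the service-time distribution is available as a list of observed total service times. Both inputs and all function names are hypothetical.

```python
def memoryless_load(elapsed_times):
    """Every process is assumed to have the same remaining time,
    so the load is simply the number of processes."""
    return len(elapsed_times)


def past_repeats_load(elapsed_times):
    """Each process is assumed to need as much time again as it has
    already used, so the load is the sum of the elapsed times."""
    return sum(elapsed_times)


def distribution_load(elapsed_times, service_samples):
    """Expected remaining service time conditioned on the time already
    used, E[S - t | S > t], estimated from observed service times."""
    total = 0.0
    for t in elapsed_times:
        longer = [s for s in service_samples if s > t]
        if longer:  # no sample exceeds t: treat the process as about to finish
            total += sum(longer) / len(longer) - t
    return total


# Hypothetical node: two processes that have used 1 and 4 time units
elapsed = [1, 4]
samples = [2, 4, 6, 8]  # observed total service times (assumed data)
print(memoryless_load(elapsed),
      past_repeats_load(elapsed),
      distribution_load(elapsed, samples))
```

The three estimates differ for the same node, which is why the choice of method affects every later transfer decision.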

Process Transfer Policy
 A threshold decides whether a process will be transferred
 Static policy: predefined threshold, no exchange of state
information is required to decide the threshold
 Dynamic policy: the threshold for node ni is the product of
the average workload of all nodes and a predefined constant Ci.
Ci is proportional to the processing capability of ni relative
to the processing capability of the other nodes. State
information needs to be exchanged to keep the average
workload up to date
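
The dynamic threshold can be written out as a small sketch. The source only says that Ci is proportional to the relative processing capability of ni. Taking Ci as the node's capability divided by the average capability is an assumption made here for illustration, as are the function name and the data.

```python
def dynamic_threshold(loads, capability, node):
    """Threshold for `node` = average workload of all nodes * C_i.

    C_i is taken as capability[node] / average capability, which is
    one possible (assumed) way to make it proportional to the node's
    relative processing capability.
    """
    avg_load = sum(loads.values()) / len(loads)
    avg_capability = sum(capability.values()) / len(capability)
    c_i = capability[node] / avg_capability
    return avg_load * c_i


# Hypothetical system state collected through state information exchange
loads = {"n1": 2, "n2": 4, "n3": 6}
capability = {"n1": 1, "n2": 2, "n3": 3}

for name in loads:
    limit = dynamic_threshold(loads, capability, name)
    print(name, "threshold", limit, "transfer" if loads[name] > limit else "keep")
```

A faster node gets a higher threshold, so it keeps more work locally before it starts transferring processes.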

Process Transfer-Threshold
 Single threshold policy: a rigid single threshold can lead
to instability. If the local load is just below the threshold
and a remote process is accepted, the accepted process
pushes the load beyond the threshold
 Double threshold policy: the threshold is a band
bounded by a high mark and a low mark that
decide transfers (overloaded above the high mark, normal load
between the marks, under-loaded below the low mark)
 It is necessary that newly arriving processes do not
affect the local load significantly
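
A minimal sketch of the double threshold policy follows. The three regions come from the slide. The rules that only an under-loaded node accepts remote processes and only an overloaded node sends new local processes away are the usual reading of this policy and are stated here as an assumption. All names are hypothetical.

```python
def classify(load, low_mark, high_mark):
    """Place a node's load in one of the three regions of the band."""
    if load > high_mark:
        return "overloaded"
    if load < low_mark:
        return "underloaded"
    return "normal"


def accepts_remote(load, low_mark, high_mark):
    """Assumed rule: only an under-loaded node accepts remote processes."""
    return classify(load, low_mark, high_mark) == "underloaded"


def sends_new_local(load, low_mark, high_mark):
    """Assumed rule: only an overloaded node sends new local processes away."""
    return classify(load, low_mark, high_mark) == "overloaded"


for load in (1, 4, 9):
    print(load, classify(load, low_mark=2, high_mark=6))
```

A node in the normal region neither sends nor accepts processes, which is what removes the instability of the single threshold.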

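Thresholding
[Figure: a single-threshold policy divides the load range into overloaded (above the threshold) and underloaded (below it); a double-threshold policy divides it into overloaded (above the high mark), normal (between the marks), and underloaded (below the low mark)]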

State Information Exchange Policy
 Periodic broadcast: heavy communication load, does
not scale well
 Broadcast when state changes: a further refinement is to
transmit state information only when the state changes to
under-loaded or overloaded
 On-demand exchange: a node requests state when its load
changes from normal to under-loaded or overloaded
(note: only nodes that are not in the normal state need to
convey information)
 Exchange by polling: when needed, a node can poll
others

Location Policy
 Threshold: select nodes at random and check whether placing
the process there would overload the node
 Shortest: a predetermined number of nodes are polled and the
process is transferred to the one with the least load
 Bidding: the node requests bids for a process to be migrated
and other nodes bid for it. The best bidder wins, judged by
fastest time, cheapest processing, or best price/performance
 Pairing: lightly loaded and heavily loaded nodes are paired.
Nodes may send requests seeking a partner
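
The threshold and shortest policies can be contrasted in a short sketch. It assumes the sender can read the current load of any node it polls and that a node counts as overloaded when its load would exceed a given threshold. The function names, the probe limit parameter, and the data are hypothetical.

```python
import random


def threshold_policy(loads, sender, threshold, probe_limit, rng):
    """Probe random nodes one by one and take the first node that
    would not be overloaded by the extra process."""
    candidates = [n for n in loads if n != sender]
    probes = rng.sample(candidates, min(probe_limit, len(candidates)))
    for node in probes:
        if loads[node] + 1 <= threshold:
            return node
    return None  # no suitable node found: run the process locally


def shortest_policy(loads, sender, threshold, poll_count, rng):
    """Poll a fixed number of random nodes and take the least loaded
    one, provided the extra process would not overload it."""
    candidates = [n for n in loads if n != sender]
    polled = rng.sample(candidates, min(poll_count, len(candidates)))
    if not polled:
        return None
    best = min(polled, key=lambda n: loads[n])
    return best if loads[best] + 1 <= threshold else None


loads = {"a": 5, "b": 1, "c": 3}  # hypothetical loads
rng = random.Random(0)
print(threshold_policy(loads, "a", threshold=3, probe_limit=2, rng=rng))
print(shortest_policy(loads, "a", threshold=3, poll_count=2, rng=rng))
```

The threshold policy stops at the first acceptable node, while the shortest policy pays for every poll in order to pick the least loaded of the polled nodes.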

Priority Assignment Policy
 Selfish: local processes get higher priority
 Altruistic: remote processes get higher priority
 Intermediate: local processes get higher priority if their
number is greater than or equal to the number of remote
processes at the node. Otherwise remote processes are prioritized
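
The three priority rules reduce to a small decision function. The sketch below is illustrative only. The function name and the string labels are hypothetical.

```python
def higher_priority(policy, n_local, n_remote):
    """Return which class of processes ('local' or 'remote') gets the
    higher priority at a node under the given policy."""
    if policy == "selfish":
        return "local"
    if policy == "altruistic":
        return "remote"
    if policy == "intermediate":
        # Local processes win ties, as in the rule above
        return "local" if n_local >= n_remote else "remote"
    raise ValueError(f"unknown policy: {policy}")


print(higher_priority("intermediate", n_local=3, n_remote=3))
print(higher_priority("intermediate", n_local=2, n_remote=5))
```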

Priority Assignment Comparisons
 Selfish: worst response time for remote processes. The
penalty is higher if a remote process arrives at a highly
loaded node, lower otherwise
 Altruistic: best response time, local processes may face a
penalty
 Intermediate: response time performance is close to
that of the altruistic policy, yet local processes do not
suffer as much

Migration Limiting Policy
 Uncontrolled: local and remote processes are treated the
same way. Migration may happen any number of
times, which can cause instability
 Controlled: a migration count is used. Many designers
feel migration is expensive and thus limit the count to 1.
Some prefer limiting it to a small value k, determined
statically or dynamically
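
The controlled policy amounts to a counter carried with each process. The sketch below assumes a hypothetical `Process` record and uses `limit=None` to stand for the uncontrolled policy.

```python
from dataclasses import dataclass


@dataclass
class Process:
    pid: int
    migrations: int = 0  # how many times this process has moved so far


def may_migrate(proc, limit):
    """Uncontrolled policy: limit is None, so migration is always allowed.
    Controlled policy: allowed only while the count is below the limit."""
    return limit is None or proc.migrations < limit


def migrate(proc, limit):
    """Record one migration if the policy allows it."""
    if not may_migrate(proc, limit):
        return False
    proc.migrations += 1
    return True


p = Process(pid=1)
print(migrate(p, limit=1), migrate(p, limit=1), p.migrations)
```

With a limit of 1, a process that has already been transferred stays where it landed, which prevents it from bouncing between nodes.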

Load Sharing Approach
 Load sharing may be better than load balancing: load
fluctuates, and the overhead of exchanging state
information is considerable. It may be enough to make sure
no node is idle while any node has more than one process
 Priority assignment and migration limiting policies are the
same as in the load balancing approach

Load Sharing Design Issues
 Load estimation policies
 Process transfer policies
 Location policies
 Sender-initiated location policies
 Receiver-initiated location policies
 State information exchange policies
 Broadcast when state changes
 Poll when state changes

Load Estimation Policies
 The simplest load estimate only needs to tell whether a
node is idle or overloaded, by counting the number of
processes at the node
 Most modern systems have some processes running all
the time at any node, so a better estimator may be the
CPU utilization

Process Transfer Policies
 All-or-nothing strategy
 Fix the threshold at 1
 A node will accept a process if it has no processes, and
will try to send away a process when its count is more
than one
 If CPU utilization is used, a high-low policy will need
to be used

Location Policies
 A location policy is one of the following two types
 Sender initiated
 Receiver initiated
 The type depends on which node initiates the search for a
transfer partner

Location-Sender Initiated
 When a node's load exceeds the threshold, it either
broadcasts its status or starts random
probes of other nodes to find a node that can take a process
 A probe limit is specified to stop the node from
probing a large number of nodes and wasting time
 Scalability is better with limited probing
 Some analyses indicate the probe limit does not have
much of an effect

Location-Receiver Initiated
 When a node's load falls below the threshold, it starts
broadcasting or probing to find a node that can provide a
process
 A node can transfer a process only if doing so would
not cause its own load to go below the threshold
 With broadcast, the receiver can pick one candidate
from among the replies
 With random probing, the probing continues until a
candidate is found or the probe limit is reached

Observations on Location Policies
 Either location policy offers an advantage over not
transferring processes at all
 Sender-initiated policies are preferred when the system is
lightly or moderately loaded
 In a lightly loaded system, receiver-initiated policies would
generate many probe messages
 Receiver-initiated policies are preferable on heavily loaded
systems, since sender probes increase when the system is already
highly loaded, but only if process transfer costs under the
two policies are comparable
 If transfer costs under receiver-initiated policies are
significantly higher, sender-initiated policies provide
better performance

State Information Exchange Policies
 No need to exchange information periodically
 State information is needed only when a node is
under-loaded or overloaded
 Information is exchanged only when the state changes
 Policies used
 Broadcast when state changes
 Poll when state changes

State Info-Broadcast when State Changes
 When it becomes under-loaded or overloaded, a node broadcasts a
request for status information
 In a sender-initiated policy, the broadcast happens when
the node becomes overloaded
 In a receiver-initiated policy, the broadcast happens
when the node becomes under-loaded
 When the threshold is 1, the policy is called broadcast
when idle

State Info-Poll when State Changes
 When a node falls below the threshold, it initiates a
random polling cycle
 The cycle ends when a node is found that can transfer
a process to the receiver
 Or when a probing limit is reached
 When a fixed threshold of 1 is used, this is known as
the poll when idle policy
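
The polling cycle of an under-loaded (receiver) node can be sketched as follows. It assumes the receiver learns the polled node's load from the reply, and that a polled node gives up a process only if its own load would not drop below the threshold. The function name and the data are hypothetical.

```python
import random


def poll_for_work(loads, receiver, threshold, probe_limit, rng):
    """Randomly poll other nodes until one can hand over a process or
    the probe limit is reached. Returns the donor node or None."""
    others = [n for n in loads if n != receiver]
    rng.shuffle(others)
    for node in others[:probe_limit]:
        # The donor must stay at or above the threshold after the transfer
        if loads[node] - 1 >= threshold:
            return node
    return None  # probe limit reached without finding a donor


# Poll when idle: threshold fixed at 1, receiver "a" has no processes
loads = {"a": 0, "b": 1, "c": 3}
donor = poll_for_work(loads, "a", threshold=1, probe_limit=2,
                      rng=random.Random(0))
print(donor)
```

Node "b" cannot give up its only process without going idle itself, so only "c" qualifies as a donor in this example.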
