Weighted Flowtime on Capacitated Machines

Kyle Fox∗ (University of Illinois at Urbana-Champaign, Urbana, IL; kylefox2@illinois.edu)
Madhukar Korupolu (Google Research, Mountain View, CA; mkar@google.com)

∗ Portions of this work were done while this author was visiting Google Research.

Abstract

It is well-known that SRPT is optimal for minimizing flow time on machines that run one job at a time. However, running one job at a time is a big under-utilization for modern systems where sharing, simultaneous execution, and virtualization-enabled consolidation are a common trend to boost utilization. Such machines, used in modern large data centers and clouds, are powerful enough to run multiple jobs/VMs at a time subject to overall CPU, memory, network, and disk capacity constraints.

Motivated by this prominent trend and need, in this work we give the first scheduling algorithms to minimize weighted flow time on such capacitated machines. To capture the difficulty of the problem, we show that without resource augmentation, no online algorithm can achieve a bounded competitive ratio. We then investigate algorithms with a small resource augmentation in speed and/or capacity. Our first result is a simple (2 + ε)-capacity O(1/ε)-competitive greedy algorithm. Using only speed augmentation, we then obtain a 1.75-speed O(1)-competitive algorithm. Our main technical result is a near-optimal (1 + ε)-speed, (1 + ε)-capacity O(1/ε³)-competitive algorithm using a novel combination of knapsacks, densities, job classification into categories, and potential function methods. We show that our results also extend to the multiple unrelated capacitated machines setting.

1 Introduction

Four trends are becoming increasingly prominent in modern data centers. Modern machines are increasingly powerful and able to run multiple jobs at the same time subject to overall capacity constraints. Running only one job at a time on them leads to significant under-utilization. Virtualization technologies such as VMware [3] and Xen [13] are enabling workloads to be packaged as VMs and multiple VMs run on the same physical machine [5]. This packing of VMs on the same machine enables server consolidation [6, 7, 15] and data center cost reduction in the form of reduced hardware, floorspace, energy, cooling, and maintenance costs. This utilization boost and the resulting cost savings has been a driving force for virtualization adoption and transformation of IT in data centers.

This adoption, in turn, is driving a proliferation of compute farms and public, private, and hybrid clouds which share resources and machines among multiple users and jobs [1, 2, 4, 32]. Large compute farms and clouds often have excess capacity provisioned for peak workloads, and these can be shared with new users to improve utilization during off-peak times. Tasks such as MapReduce [21], which are I/O driven, are often run with multiple instances (of different MapReduces) on the same machine so they all progress toward completion simultaneously. A characterization of workloads in Google compute clusters is given in [20, 32, 34].

A common scenario in all of these situations is that of jobs (or VMs) arriving in an online manner with their resource requirements and processing time requirements and a scheduler scheduling them on one of the machines in the cluster. Jobs leave when they are completed. The problem is interesting even with a single machine. A common goal in such a scenario is that of minimizing average weighted flow time, defined as the weighted average of the job completion times minus arrival times, i.e., the total time spent by jobs in the system. Other common goals include load-balancing (minimizing maximum load on machines), minimizing average stretch (the stretch of a job is defined as its completion time minus its arrival time divided by its processing time), or completing as many jobs as possible before their deadlines if there are deadlines. Adding weights lets us prioritize the completion of certain jobs over others. In fact, weighted flow time subsumes the stretch metric if we set the weight of a job to be the reciprocal of its processing time. In this paper we focus on the weighted flow time objective.

1.1 Prior work on minimizing flow time: Single job at a time model. Traditionally, almost all the known flow time results have been for machines that run at most one job at a time. Consider the following model. A server sees n jobs arrive over time. Each job j arrives at time r_j
and requires a processing time p_j. At each point in time t, the server may choose to schedule one job j that has already arrived (r_j ≤ t). We say a job j has been completed when it has been scheduled for p_j time units. Preemption is allowed, so job j may be scheduled, paused, and then later scheduled without any loss of progress. Each job j is given a weight w_j. The goal of the scheduler is to minimize the total weighted flow time of the system, defined as Σ_j w_j (C_j − r_j) where C_j is the completion time of job j. For this paper in particular, we are interested in the clairvoyant online version of this problem where the server is not aware of any job j until its release, but once job j arrives, the server knows p_j and w_j.

It is well known that Shortest Remaining Processing Time (SRPT) is the optimal algorithm for minimizing flow time with unit weights in the above single-job machine model. In fact, SRPT is the best known algorithm for unit weights on multiple machines in both the online and offline setting. Leonardi and Raz [30] show SRPT has competitive/approximation ratio O(log(min{n/m, P})) where m is the number of machines and P is the ratio of the longest processing time to shortest. They also give a matching lower bound for any randomized online algorithm. Chekuri, Khanna, and Zhu [19] show every randomized online algorithm has competitive ratio Ω(min{W^1/2, P^1/2, (n/m)^1/4}) when the jobs are given arbitrary weights and there are multiple machines. Bansal and Chan [11] partially extended this result to show there is no deterministic online algorithm with constant competitive ratio when the jobs are given arbitrary weights, even with a single machine. However, there are some positive online results known for a single machine where the competitive ratio includes a function of the processing times and weights [12, 19]¹.

¹ The result of [19] is actually a semi-online algorithm in that P is known in advance.

The strong lower bounds presented above motivate the use of resource augmentation, originally proposed by Kalyanasundaram and Pruhs [28], which gives an algorithm a fighting chance on worst case scheduling instances when compared to an optimal offline algorithm. We say an algorithm is s-speed c-competitive if for any input instance I, the algorithm's weighted flow time while running at speed s is at most c times the optimal schedule's weighted flow time while running at unit speed. Ideally, we would like an algorithm to be constant factor competitive when given (1 + ε) times a resource over the optimal offline algorithm for any constant ε > 0. We call such algorithms [resource]-scalable. Chekuri et al. [18] gave the first speed-scalable online algorithms for minimizing flow time on multiple machines with unit weights. Becchetti et al. [14] gave the first analysis showing that the simple greedy algorithm Highest Density First (HDF) is speed-scalable with arbitrary job weights on a single machine. HDF always schedules the alive job with the highest weight to processing time ratio. Bussema and Torng [16] showed the natural generalization of HDF to multiple machines is also speed-scalable with arbitrary job weights. Table 1 gives a summary of these results.

1.2 The capacitated flow time problem. The above results assume that any machine can run at most one job at a time. This assumption is wasteful and does not take advantage of the sharing, simultaneous execution, and virtualization abilities of modern computers. The current trend in industry is not only to focus on distributing large workloads, but also to run multiple jobs/VMs at the same time on individual machines subject to overall capacity constraints. By not addressing the sharing and virtualization trend, theory is getting out of touch with the industry.

The problem of scheduling an online sequence of jobs (VMs) in such scenarios necessitates the "capacitated machines" variant of the flow time problem. In addition to each job j having a processing time and weight, job j also has a height h_j that represents how much of a given resource (e.g. RAM, network throughput) the job requires to be scheduled. The server is given a capacity that we assume is 1 without loss of generality. At each point in time, the server may schedule an arbitrary number of jobs such that the sum of heights for jobs scheduled is at most the capacity of the server. We assume all jobs have height at most 1, or some jobs could never be scheduled. The setting can be easily extended to use multiple machines and multi-dimensional heights, but even the single-machine, single-dimension case given above is challenging from a technical standpoint. We focus on the single-machine, single-dimension problem first, and later show how our results can be extended to multiple machines. We leave further handling of multiple dimensions as an interesting open problem.

The case of jobs having sizes related to machine capacities has been studied before in the context of load-balancing [9, 10] and in operations research and systems [6, 15, 31, 33]. However, this paper is the first instance we are aware of that studies capacitated machines with the objective of minimizing flow time. We call this the Capacitated Flow Time (CFT) problem. As flow time is a fundamental scheduling
Capacitated | Weights   | Machines | Competitive Ratio
no          | unit      | single   | 1 [folklore]
no          | unit      | multiple | Ω(log n/m + log P) [30]; O(log min{n/m, P}) [30]; (1 + ε)-speed O(1/ε)-comp. [18]
no          | arbitrary | single   | ω(1) [11]; O(log² P) [19, (semi-online)]; O(log W) [12]; O(log n + log P) [12]; (1 + ε)-speed O(1/ε)-comp. [14]
no          | arbitrary | multiple | Ω(min{W^1/2, P^1/2, (n/m)^1/4}) [19]; (1 + ε)-speed O(1/ε²)-comp. [16]
yes         | unit      | single   | (1.2 − ε)-capacity Ω(P) [this paper]; (1.5 − ε)-capacity Ω(log n + log P) [this paper]
yes         | weighted  | single   | (2 − ε)-capacity ω(1) [this paper]
yes         | weighted  | multiple | (2 + ε)-capacity O(1/ε)-comp. [this paper]; 1.75-speed O(1)-comp. [this paper]; (1 + ε)-speed (1 + ε)-capacity O(1/ε³)-comp. [this paper]

Table 1. A summary of prior and new results in online scheduling to minimize flow time. Here, W denotes the number of distinct weights given to jobs, and P denotes the ratio of the largest to least processing time.
objective, we believe this is an important problem to study and bridge the gap to practice.

1.3 Our results and outline of the paper. In Section 2, we show that online scheduling in the capacitated machines case is much more difficult than scheduling in the single-job model. In particular, Theorem 2.1 shows there exists no online randomized algorithm for the CFT problem with a bounded competitive ratio, even when all jobs have unit weight. This result stands in stark contrast to the same situation where SRPT is actually an optimal algorithm for the single job at a time model.

Motivated by the difficulty of online scheduling, we consider the use of resource augmentation in speed and capacity. In a capacity augmentation analysis, an algorithm is given capacity greater than 1 while being compared to the optimal schedule on a unit capacity server. We define u-capacity c-competitive accordingly. We also say an algorithm is s-speed u-capacity c-competitive if it is c-competitive when given both s-speed and u-capacity over the optimal unit resource schedule. We show in Corollary 2.2 and Theorem 2.3 that our lower bound holds even when an algorithm is given a small constant factor of extra capacity without additional speed. In fact, strong lower bounds continue to exist even as the augmented capacity approaches 2.

In Section 3, we propose and analyze a greedy algorithm Highest Residual Density First (HRDF) for the CFT problem and show it is (2 + ε)-capacity O(1/ε)-competitive for any small constant ε > 0. HRDF uses greedy knapsack packing with job residual densities (defined later). The algorithm and analysis are tight for capacity-only augmentation given the lower bound in Theorem 2.4. We then turn to the question of whether speed augmentation helps beat the (2 − ε)-capacity lower bound. Our initial attempt and result (not described in this paper) was an s-speed u-capacity constant-competitive algorithm for all s and u with s · u ≥ 2 + ε. We subsequently obtained an algorithm High Priority Category Scheduling (HPCS) which breaks through the 2 bound. HPCS is 1.75-speed O(1)-competitive and is described in Section 4.

Our main result is a (1 + ε)-speed (1 + ε)-capacity O(1/ε³)-competitive algorithm called Knapsack Category Scheduling (KCS). KCS builds on HPCS and is presented in Section 5. The KCS result shows the power of combining both speed and capacity augmentation and is the first of its kind.² Pseudocode descriptions of all our algorithms are given in Figures 1, 3, and 4. We conclude with a brief summary of our results as well as some open problems related to the CFT problem in Section 6.

All of our results can be extended to the multiple unrelated capacitated machine setting as detailed in Appendix D. In the unrelated machines model, each machine-job pair is given a weight, processing time, and height unrelated to any other machine-job pairs.

² In practice, machines are rarely filled to full capacity, so small augmentations in capacity are not unreasonable.
Technical novelty: Our algorithms use novel combinations and generalizations of knapsacks, job densities, and grouping of jobs into categories. While HRDF uses knapsacks and job densities and HPCS uses job densities and categories, KCS combines all three. In fact, our algorithms are the first we are aware of that do not make flow time scheduling decisions based on a single ordering of jobs. Our analysis methods combining potential functions, knapsacks, and categories are the first of their kind and may have other applications.

2 Lower Bounds

We begin the evaluation of the CFT problem by giving lower bounds on the competitive ratio for any online scheduling algorithm. Some lower bounds hold even when jobs are given unit weights. Recall P denotes the ratio of the largest processing time to the least processing time.

Theorem 2.1. Any randomized online algorithm A for the CFT problem has competitive ratio Ω(P) even when all jobs have unit weight.

Proof: Garg and Kumar [24] consider a variant on the single-job capacity identical machines model they call Parallel Scheduling with Subset Constraints. Our proof very closely follows the proof of [24, Theorem 3.2]. While our proof is a lower bound on any deterministic algorithm, it easily generalizes to randomized algorithms with oblivious adversaries. Further, the proof continues to hold even with small amounts of capacity augmentation. Fix a deterministic online algorithm A. We use only one machine and all jobs have unit weight. There are two types of jobs: Type I jobs have unit processing time and height 0.4; Type II jobs have processing time P ≫ 1 and height 0.6. Let T ≫ P be a large constant. On each time step t = 0, …, T − 1, we release two jobs of type I. At each time step t = 0, P, 2P, …, T − 1, we release one job of type II (we assume T − 1 is a multiple of P). The total processing time of all released jobs is 3T. The server can schedule two type I jobs at once, and one type I job with a type II job, but it cannot schedule two type II jobs at once. Therefore, one of the following must occur: (a) at least T/2 jobs of type I remain alive at time T, or (b) at least T/(2P) jobs of type II remain alive at time T.

Suppose case (a) occurs. We release 2 jobs of type I at each time step t = T, …, L, where L ≫ T. The total flow time of A is Ω(T · L). However, the optimal schedule finishes all jobs of type I during [0, T]. It has only T/P jobs (of type II) waiting at each time step during [T, L], so the optimal flow time is O(T · L/P).

Suppose case (b) occurs. We release one job of type II at each time step t = T, T + P, T + 2P, …, L, where L ≫ T. There are always T/(2P) jobs alive in A's schedule during [T, L]. So A's total flow time is Ω(T · L/P). The optimal schedule finishes all jobs of type II during [0, T]. It has 2T jobs of type I alive at time T, but it finishes these jobs during [T, 3T] while also processing jobs of type II. So, after time 3T, the optimal schedule has no more alive jobs and its total flow time is O(L). We get the theorem by setting T ≥ P².

Corollary 2.2. Any randomized online algorithm A for the CFT problem is (1.2 − ε)-capacity Ω(P)-competitive for any constant ε > 0 even when all jobs have unit weight.

In fact, superconstant lower bounds continue to exist even when an online algorithm's server is given more than 1.2 capacity. When we stop requiring input instances to have unit weights, then the lower bounds become resilient to even larger capacity augmentations.

Theorem 2.3. Any randomized online algorithm A for the CFT problem is (1.5 − ε)-capacity Ω(log n + log P)-competitive for any constant ε > 0 even when all jobs have unit weight.

Proof: There is a simple reduction from online scheduling in the single-job identical machines model to the capacitated server model. Let I be any input instance for the single-job identical machines model where there are two machines and all jobs have unit weight. We create an input instance I′ using the same jobs as I with each job given height 0.5. Any schedule of I′ is a schedule for I. The lower bound of [30, Theorem 9] immediately implies the theorem.

Theorem 2.4. No deterministic online algorithm A for the CFT problem is constant competitive even when given (2 − ε)-capacity for any constant ε > 0 assuming jobs are allowed arbitrary weights.

Proof: We apply a similar reduction to that given in the proof of Theorem 2.3. Let I be any input instance for the single-job identical machines model where there is one machine and jobs have arbitrary weight. We create an input instance I′ using the same jobs as I with each job given height 1. Any schedule of I′ is a schedule for I. The lower bound of [11, Theorem 2.1] immediately implies the theorem.
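The case analysis in the proof of Theorem 2.1 rests on which combinations of job types fit on a unit-capacity server. The following small sketch (illustrative only; the heights 0.4 and 0.6 are taken from the proof) checks the capacity arithmetic mechanically:

```python
from itertools import combinations_with_replacement

# Job heights from the Theorem 2.1 construction.
HEIGHTS = {"type I": 0.4, "type II": 0.6}

# Which pairs of jobs can share the unit-capacity server?
for pair in combinations_with_replacement(HEIGHTS, 2):
    total = sum(HEIGHTS[kind] for kind in pair)
    verdict = "fits" if total <= 1.0 else "does not fit"
    print(pair, round(total, 1), verdict)
```

As the proof states, two type I jobs fit (total height 0.8), a type I job fits alongside a type II job (total height 1.0), but two type II jobs do not (total height 1.2).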
HRDF(I, t):
    /* Q^A(t): jobs alive at time t.  p^A_j(t): remaining processing time of j */
    1. For each j ∈ Q^A(t): d^A_j(t) ← w_j / (h_j p^A_j(t))   /* assign residual densities */
    2. Sort Q^A(t) by d^A_j(t) descending and pack jobs from Q^A(t) up to capacity

Figure 1. The algorithm HRDF.
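The per-time-step rule of Figure 1 can be sketched in Python as follows. This is a minimal illustrative sketch, not the paper's implementation: the Job container is hypothetical, the server capacity defaults to 1, and jobs that do not fit are skipped while packing continues (one reasonable reading of "pack jobs up to capacity").

```python
from dataclasses import dataclass

@dataclass
class Job:
    weight: float     # w_j
    height: float     # h_j
    remaining: float  # remaining processing time p_j^A(t)

def hrdf_step(alive_jobs, capacity=1.0):
    """One scheduling step of HRDF: sort alive jobs by residual density
    w_j / (h_j * p_j^A(t)) and greedily pack them up to the capacity."""
    by_density = sorted(
        alive_jobs,
        key=lambda j: j.weight / (j.height * j.remaining),
        reverse=True,
    )
    scheduled, used = [], 0.0
    for job in by_density:
        if used + job.height <= capacity:
            scheduled.append(job)
            used += job.height
    return scheduled
```

For example, a job of weight 2, height 0.3, and remaining time 1 has residual density 2/0.3 ≈ 6.7 and is packed before a job of weight 1, height 0.8, and remaining time 2 (density 0.625); the second job is then skipped because 0.3 + 0.8 exceeds the unit capacity.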
3 Analysis Warmup: Highest Residual Density First

In this section, we analyze the deterministic online algorithm Highest Residual Density First (HRDF) for the CFT problem. A description of HRDF appears in Figure 1. Despite the simplicity of the algorithm, we are able to show the following result matching the lower bound of Theorem 2.4.

Theorem 3.1. HRDF is (2 + ε)-capacity O(1/ε)-competitive for any constant ε where 0 < ε < 1.

3.1 Analysis of HRDF. We use a potential function analysis to prove Theorem 3.1. Fix an input instance I, and consider running HRDF with 2 + ε capacity alongside the optimal schedule without resource augmentation. We begin our analysis by defining some notation. For any job j, let C^A_j be the completion time of j in HRDF's schedule and let C^O_j be the completion time of j in the optimal schedule. For any time t ≥ r_j, let A_j(t) = w_j (min{C^A_j, t} − r_j) be the accumulated weighted flow time of job j at time t, and let A_j = A_j(C^A_j). Finally, let A = Σ_j A_j be the total weighted flow time for HRDF's schedule. Let OPT_j(t), OPT_j, and OPT be defined similarly for the optimal schedule.

Our analysis of HRDF builds on the potential function approach outlined in [27], specifically [17, 22, 23], for the single-job at a time problem and extends it to the capacitated case. Unlike in [22, 23], our method supports arbitrary job weights and capacities, and unlike in [17], we do not use the notion of fractional flow time.³ We give a potential function Φ that is almost everywhere differentiable such that Φ(0) = Φ(∞) = 0 and then focus on bounding all continuous and discrete increases to A + Φ across time by a function of OPT in order to bound A by a function of OPT.⁴

³ Avoiding the use of fractional flow time allows us to analyze HRDF using capacity-only augmentation without speed augmentation [17]. It also saves us a factor of Ω(1/ε) in the competitive ratio for KCS.

⁴ An analysis of an HRDF-like algorithm for the unrelated single-job machine problem was given in [8] by relaxing an integer program and applying dual fitting techniques. We focus on potential function arguments in this work as the integrality gap of the natural linear program for our setting is unbounded for the small amounts of resource augmentation we use in later algorithms.

We give the remaining notation for our potential function. Let p^O_j(t) be the remaining processing time of job j at time t in the optimal schedule. Let Q^O(t) be the set of alive jobs in the optimal schedule at time t. Let Q^A_j(t) = {k ∈ Q^A(t) : d^A_k(t) > d^A_j(t)} be the set of alive jobs in HRDF's schedule more dense than job j. Let S^A(t) and S^O(t) be the sets of jobs processed at time t by HRDF's and the optimal schedule respectively. Finally, let W^A(t) = Σ_{j∈Q^A(t)} w_j be the total weight of all jobs alive in HRDF's schedule at time t. The rest of the notation we use in our potential function is given in Figure 1 and Table 2.

The potential function we use has two terms for each job j:

    Φ^A_j(t) = (w_j/ε) [R_j(t) + (1 + ε) p^A_j(t) − V^>_j(t) + Y_j(t) + F_j(t)]
    Φ^O_j(t) = (w_j/ε) R^<_j(t)

The potential function is defined as

    Φ(t) = Σ_{j∈Q^A(t)} Φ^A_j(t) − Σ_{j∈Q^O(t)} Φ^O_j(t).

3.2 Main proof idea: Using extra capacity to keep up with the optimal. We give full details of the potential function analysis in Appendix A, including the effects from job arrivals and completions, but briefly mention here some of the continuous changes made to A + Φ due to processing from HRDF and the optimal schedule. Intuitively, these changes compare S^A(t) and S^O(t), the jobs processed at time t by HRDF's and the optimal schedule respectively. As is the case for many potential function arguments, we must show that HRDF removes the remaining area of jobs more efficiently than the optimal schedule. The extra efficiency shows Φ falls fast enough to counteract changes in A. Showing an algorithm's schedule removes area more efficiently than the optimal schedule is often the most difficult part of a potential
Term     | Definition                                                                                      | Explanation
R_j(t)   | Σ_{k∈Q^A_j(t)} h_k p^A_k(t)                                                                     | Total remaining area of jobs more dense than j. Continuous decreases in R_j counteract continuous increases in A.
R^<_j(t) | Σ_{k∈Q^A_j(r_j)} h_k p^A_k(t)                                                                   | Total remaining area of jobs more dense than j upon j's arrival. The decrease from R^<_j cancels the increase to Φ caused by R_j when j arrives.
V^>_j(t) | Σ_{k∈Q^O(t)∩Q^A_j(r_k)} h_k p^O_k(t)                                                            | The optimal schedule's total remaining area for each job k more dense than j upon k's arrival. The increase to R_j caused by the arrival of a job k is canceled by an equal increase in V^>_j.
Y_j(t)   | Y_j(t) = 0 for all t < r_j, and d/dt Y_j(t) = Σ_{k∈(S^O(t)∖Q^A_j(r_k))} h_k for all t ≥ r_j      | Captures work done by the optimal schedule that does not affect V^>_j. Used to bound increases in Φ from the removal of R^<_j terms when jobs are completed.
F_j(t)   | F_j(t) = 0 for all t < r_j; d/dt F_j(t) = 0 if j ∉ S^A(t), and d/dt F_j(t) = (h_j/w_j) Σ_{k∈Q^O(t)} w_k otherwise | Captures the increase in flow time by the optimal schedule while our schedule processes j. Used to bound increases in Φ from the removal of V_j terms when jobs are completed.

Table 2. A summary of notation used in our potential functions.
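As a concrete reading of the first row of Table 2, the following sketch computes R_j(t) = Σ_{k∈Q^A_j(t)} h_k p^A_k(t), the total remaining area of alive jobs strictly more dense than j, for every alive job. The tuple representation of jobs is a hypothetical choice for illustration only.

```python
def residual_density(job):
    """Residual density w_j / (h_j * p_j^A(t)) for a (w, h, p) tuple."""
    w, h, p = job
    return w / (h * p)

def remaining_area_above(alive_jobs):
    """For each alive job j, compute R_j(t): the sum of h_k * p_k over
    alive jobs k strictly more dense than j (the set Q_j^A(t) of Table 2)."""
    result = {}
    for j in alive_jobs:
        result[j] = sum(h * p for (w, h, p) in alive_jobs
                        if residual_density((w, h, p)) > residual_density(j))
    return result
```

With two alive jobs, one of density 2/0.3 ≈ 6.7 and one of density 1/1.6 = 0.625, the denser job sees no remaining area above it, while the sparser job sees the denser job's area h·p.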
[Figure: three timelines labeled Input, HRDF, and OPT; job heights 0.3 and 0.8 are marked.]

Figure 2. A lower bound example for HRDF. Type I jobs appear every time step as green squares with height 0.3 and weight 2, while type II jobs appear at time steps {0, 1} mod 3 as purple rectangles with height 0.8 and weight 1. The process continues until time T ≫ 1. Left: the input set of jobs. Center: as type I jobs have higher residual density, HRDF picks one at each time step as it arrives. Right: the optimal schedule uses type II jobs at time steps {0, 1} mod 3 and three type I jobs at time steps 2 mod 3.
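Under one concrete reading of this example (unit processing times and unit server capacity are assumptions; the figure does not state them), a small discrete-time simulation reproduces the gap: HRDF's flow time grows quadratically in T while the Figure 2 schedule's grows only linearly.

```python
def hrdf_pick(alive):
    """Greedy pack by residual density w/h (unit remaining times)."""
    chosen, used = [], 0.0
    for job in sorted(alive, key=lambda j: j[0] / j[1], reverse=True):
        if used + job[1] <= 1.0:
            chosen.append(job)
            used += job[1]
    return chosen

def figure2_pick(alive):
    """The Figure 2 schedule: run a type II job when one fits, else type I."""
    chosen, used = [], 0.0
    for job in sorted(alive, key=lambda j: j[1], reverse=True):  # tallest first
        if used + job[1] <= 1.0:
            chosen.append(job)
            used += job[1]
    return chosen

def simulate(pick, T):
    """Total weighted flow time of the Figure 2 instance up to time T,
    assuming every job needs one unit of processing."""
    alive, flow = [], 0
    for t in range(T):
        alive.append((2, 0.3, t))        # one type I job (weight 2, height 0.3)
        if t % 3 in (0, 1):
            alive.append((1, 0.8, t))    # type II job (weight 1, height 0.8)
        for job in pick(alive):
            alive.remove(job)            # unit processing: completes this step
        flow += sum(w for (w, h, r) in alive)
    return flow
```

With these assumptions, HRDF schedules the fresh type I job each step (leaving capacity 0.7, too little for a type II job of height 0.8), so type II jobs accumulate forever, exactly as described in Section 4.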
function analysis and greatly motivates the other algorithms given in this paper. In our potential function, rate of work is primarily captured in the R_j, p^A_j, V_j, and Y_j terms.

Lemma 3.2. If HRDF's schedule uses 2 + ε capacity and the optimal schedule uses unit capacity, then

    d/dt Σ_{j∈Q^A(t)} [Φ^A_j − (w_j/ε) F_j](t) = d/dt Σ_{j∈Q^A(t)} (w_j/ε) [R_j + (1 + ε) p^A_j − V^>_j + Y_j](t) ≤ −W^A(t).

Proof: Consider any time t. For any job j, neither R_j nor (1 + ε) p^A_j can increase due to continuous processing of jobs. If HRDF's schedule processes j at time t, then d/dt (1 + ε) p^A_j(t) = −(1 + ε). If HRDF's schedule does not process j, then HRDF's schedule processes a set of more dense jobs of height at least (1 + ε); otherwise, there would be room to process j. Therefore, d/dt R_j(t) ≤ −(1 + ε) and

    d/dt Σ_{j∈Q^A(t)} (w_j/ε) [R_j + (1 + ε) p^A_j](t) ≤ −((1 + ε)/ε) W^A(t).

The optimal schedule is limited to scheduling jobs with a total height of 1. No job contributes to both V^>_j and Y_j, so

    d/dt Σ_{j∈Q^A(t)} (w_j/ε) [−V^>_j + Y_j](t) ≤ (1/ε) W^A(t).

Summing the two bounds gives the lemma.

4 Beyond Greedy: High Priority Category Scheduling

The most natural question after seeing the above analysis is whether there exists an algorithm that is constant factor competitive when given only speed augmentation. In particular, a (1 + ε)-speed constant competitive algorithm for any small constant ε would be ideal.
HPCS(I, t, κ):
    /* κ is an integer determined in the analysis */
    1. For each 1 ≤ i < κ:   /* divide jobs into categories by height */
           Q^A_i(t) ← {j ∈ Q^A(t) : 1/(i + 1) < h_j ≤ 1/i}   /* alive category i jobs */
           W_i(t) ← Σ_{j∈Q^A_i(t)} w_j   /* queued weight of category i */
           For each j ∈ Q^A_i(t): h_j ← 1/i   /* assign modified height of job j */
       Q^A_X(t) ← {j ∈ Q^A(t) : h_j ≤ 1/κ}   /* X is the small category */
       W_X(t) ← Σ_{j∈Q^A_X(t)} w_j
       For each j ∈ Q^A_X(t): h_j ← h_j   /* small jobs keep their height */
    2. For each j ∈ Q^A(t): d^A_j(t) ← w_j / (h_j p^A_j(t))   /* densities by modified height */
    3. i* ← highest queued weight category
       Sort Q^A_{i*}(t) by d^A_j(t) descending and pack jobs from Q^A_{i*}(t) up to capacity

Figure 3. The algorithm HPCS.
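The category logic of Figure 3 can likewise be sketched in Python. This is an illustrative sketch under assumptions: jobs are hypothetical (weight, height, remaining) tuples, and packing uses the modified heights, following the convention in the analysis that the modified height is used wherever height is used.

```python
from collections import defaultdict

def hpcs_step(alive_jobs, kappa, capacity=1.0):
    """One step of HPCS (Figure 3): bucket jobs by height into categories,
    pick the category with the highest queued weight, then pack that
    category greedily by density using modified heights.
    Returns the indices of the scheduled jobs."""
    categories = defaultdict(list)
    modified = {}
    for j, (w, h, p) in enumerate(alive_jobs):
        if h <= 1.0 / kappa:
            categories["X"].append(j)   # the small category
            modified[j] = h             # small jobs keep their height
        else:
            i = int(1.0 // h)           # category i: 1/(i+1) < h <= 1/i
            categories[i].append(j)
            modified[j] = 1.0 / i       # modified height
    # Category with the highest queued weight.
    best = max(categories,
               key=lambda c: sum(alive_jobs[j][0] for j in categories[c]))
    # Pack that category by density w / (modified height * remaining time).
    order = sorted(categories[best],
                   key=lambda j: alive_jobs[j][0]
                   / (modified[j] * alive_jobs[j][2]),
                   reverse=True)
    chosen, used = [], 0.0
    for j in order:
        if used + modified[j] <= capacity:
            chosen.append(j)
            used += modified[j]
    return chosen
```

For instance, with κ = 4, two jobs of height 0.45 land in category 2 (modified height 1/2), so both fit at once; if their combined weight dominates the other categories, HPCS schedules exactly those two jobs.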
Unfortunately, the input sequence given in Figure 2 shows that greedily scheduling by residual density does not suffice. HRDF's schedule will accumulate two type II jobs every three time steps, giving it a total flow time of Ω(T²) by time T. The optimal schedule will have flow time O(T), giving HRDF a competitive ratio of Ω(T). Even if we change HRDF to perform an optimal knapsack packing by density or add some small constant amount of extra speed, the lower bound still holds.

4.1 Categorizing jobs. The above example shows that greedily scheduling the high value jobs and ignoring others does not do well. In fact, type II jobs, though lower in value individually, cannot be allowed to build up in the queue forever. To address this issue, we use the notion of categories and group jobs of similar heights together. The intuition is that once a category or group of jobs accumulates enough weight, it becomes important and should be scheduled. The algorithm HPCS using this intuition is given in Figure 3. Note that HPCS schedules from one category at a time and packs as many jobs as possible from that category.

Within the category HPCS chooses, we can bound the performance of HPCS compared to the optimal. The only slack then is the ability of the optimal schedule to take jobs from multiple categories at once. However, using a surprising application of a well known bound for knapsacks, we show that the optimal schedule cannot affect the potential function too much by scheduling jobs from multiple categories at once. In fact, we are able to show HPCS is constant competitive with strictly less than double speed.⁵

⁵ We note that we can achieve constant competitiveness with speed strictly less than 1.75, but we do not have a closed form expression for the minimum speed possible using our techniques.

Theorem 4.1. HPCS is 1.75-speed O(1)-competitive.

4.2 Analysis of HPCS. We use a potential function argument to show the competitiveness of HPCS. The main change in our proof is that the terms in the potential function only consider jobs within a single category at a time. For any fixed input instance I, consider running HPCS's schedule at speed 1.75 alongside the optimal schedule at unit speed. For a non-small category i job j, let Q^A_{i,j}(t) = {k ∈ Q^A_i(t) : d^A_k(t) > d^A_j(t)} be the set of alive category i jobs more dense than job j. Define Q^A_{X,j}(t) similarly. We use the notation in Figure 3 along with the notation given in the proof for HRDF, but we use the modified height in every case height is used. Further, we restrict the definitions of R_j, R^<_j, V^>_j, Y_j, and F_j to include only jobs in the same category as j. For example, if job j belongs to category i then R_j(t) = Σ_{k∈Q^A_{i,j}(t)} h_k p^A_k(t).

Let λ be a constant we set later. The potential function we use has two terms for each job j:

    Φ^A_j(t) = λκ w_j [R_j(t) + p^A_j(t) − V^>_j(t) + Y_j(t) + F_j(t)]
    Φ^O_j(t) = λκ w_j R^<_j(t)

The potential function is again defined as

    Φ(t) = Σ_{j∈Q^A(t)} Φ^A_j(t) − Σ_{j∈Q^O(t)} Φ^O_j(t).
4.3 Main proof idea: Knapsack upper bounds on the optimal efficiency. As before, we defer the full analysis to Appendix B, but we mention here the same types of changes mentioned in Section 3. Our goal is to show that the optimal schedule cannot perform too much work compared to HPCS's schedule even if the optimal schedule processes jobs from several categories at once. Our proof uses a well known upper bound argument for knapsack packing. Consider any time t and let W*(t) be the queued weight of the category scheduled by HPCS at time t.

Lemma 4.2. If HPCS's schedule uses 1.75 speed and the optimal schedule uses unit speed, then for all κ ≥ 200 and λ ≥ 100,

    d/dt Σ_{j∈Q^A(t)} [Φ^A_j(t) − λκ w_j F_j(t)] = d/dt Σ_{j∈Q^A(t)} λκ w_j [R_j + p^A_j − V^>_j + Y_j](t) ≤ −κ W*(t).

There are κ categories, so this fall counteracts the increase in A.

Proof: Suppose HPCS schedules jobs from small category X. Let j be some job in category X. If HPCS's schedule processes j at time t, then d/dt p^A_j(t) = −1.75. If HPCS's schedule does not process j at time t, then it does process jobs from category X with height at least 1 − 1/κ. Therefore, d/dt R_j(t) ≤ −1.75(1 − 1/κ). Similar bounds hold if HPCS's schedule processes jobs from non-small category i, so

    d/dt Σ_{j∈Q^A(t)} λκ w_j [R_j + p^A_j](t) ≤ −1.75 λ (1 − 1/κ) κ W*(t).

Bounding the increases to Φ from changes to the V_j and Y_j terms takes more subtlety than before. We pessimistically assume every category has queued weight W*(t). We also imagine that for each non-small category i job j processed by the optimal schedule, job j has height 1/(i + 1). The optimal schedule can still process the same set of jobs with these assumptions, and the changes due to V_j and Y_j terms only increase with these assumptions.

We now model the changes due to the V_j and Y_j terms as a knapsack packing problem. For each non-small category i, there are i items in K, each with value λκ W*(t)/i and size 1/(i + 1). These items represent the optimal schedule being able to process i jobs at once from category i. Knapsack instance K also contains one item for each category X job j alive in the optimal schedule, with value λκ W*(t) h_j and size h_j. The capacity for K is 1. A valid solution to K is a set U of items of total size at most 1. The value of a solution U is the sum of values for items in U. The optimal solution to K has a value at least as large as the largest possible increase to Φ due to changes in the V_j and Y_j terms. Note that we cannot pack an item of size 1 and two items of size 1/2 in K. Define K₁ and K₂ to be K with one item of size 1 removed and one item of size 1/2 removed respectively. The larger of the optimal solutions to K₁ and K₂ is the optimal solution to K.

Let the knapsack density of an item be the ratio of its value to its size. Let v₁, v₂, … be the values of the items in K in descending order of knapsack density, and let s₁, s₂, … be the sizes of these same items. Greedily pack the items in decreasing order of knapsack density and let k be the index of the first item that does not fit. Folklore states that v₁ + v₂ + ⋯ + v_{k−1} + α v_k is an upper bound on the optimal solution to K, where α = (1 − (s₁ + s₂ + ⋯ + s_{k−1})) / s_k. This fact applies to K₁ and K₂ as well.

In all three knapsack instances described, ordering the items by size gives their order by knapsack density. Indeed, items of size 1/(i + 1) have knapsack density λκ W*(t)(i + 1)/i, and items created from category X jobs have knapsack density λκ W*(t). We may now easily verify that 1.45 λκ W*(t) and 1.73 λκ W*(t) are upper bounds for the optimal solutions to K₁ and K₂ respectively. Therefore,

    d/dt Σ_{j∈Q^A(t)} λκ w_j [−V^>_j + Y_j](t) ≤ 1.73 λκ W*(t).

5 Main Algorithm: Knapsack Category Scheduling

As stated in the previous section, a limitation of HPCS is that it chooses jobs from one category at a time, putting it at a disadvantage. In this section, we propose and analyze a new algorithm called Knapsack Category Scheduling (KCS) which uses a fully polynomial-time approximation scheme (FPTAS) for the knapsack problem [25, 29] to choose
terms as an instance K to the standard knapsack jobs from multiple categories. Categories still play
KCS(I, t, ε):    /* ε is any real number with 0 < ε ≤ 1/2 */
1. For each integer i ≥ 0 with (1 + ε/2)^i < 2/ε:    /* Divide jobs into categories by height */
       Q^A_i(t) ← {j ∈ Q^A(t) : 1/(1 + ε/2)^{i+1} < h_j ≤ 1/(1 + ε/2)^i}
       W_i(t) ← Σ_{j∈Q^A_i(t)} w_j
       For each j ∈ Q^A_i(t): h̄_j ← 1/(1 + ε/2)^i
   Q^A_X(t) ← Q^A(t) ∖ ∪_i Q^A_i(t)    /* Category X jobs have height at most ε/2 */
   W_X(t) ← Σ_{j∈Q^A_X(t)} w_j
   For each j ∈ Q^A_X(t): h̄_j ← h_j
2. For each j ∈ Q^A(t): d^A_j(t) ← w_j/(h̄_j p^A_j(t))    /* Assign densities by modified height */
3. Initialize knapsack K(t) with capacity (1 + ε)    /* Set up a knapsack instance to choose jobs */
   For each category i (including category X) and each j ∈ Q^A_i(t):
       Q^{A≻j}_i(t) ← {k ∈ Q^A_i(t) : d^A_k(t) > d^A_j(t)}    /* Category i jobs more dense than j */
       Add item j to K(t) with size h̄_j and value w_j + h̄_j Σ_{k∈Q^A_i(t)∖(Q^{A≻j}_i(t)∪{j})} w_k
4. Pack jobs from a (1 − ε/2)-approximation for K(t)

Figure 4. The algorithm KCS. (Here h̄_j denotes job j's modified height.)
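Steps 1 through 3 of Figure 4 can be read as executable pseudocode. The sketch below is our illustration only (function and variable names are ours, not the paper's): it groups queued jobs into geometric height classes, rounds heights within each class, and emits the knapsack items of step 3. The (1 − ε/2)-approximate packing of step 4, done via a knapsack FPTAS [25, 29], is left abstract here.

```python
def kcs_knapsack_instance(jobs, eps):
    """Steps 1-3 of KCS (Figure 4), as an illustrative sketch.

    jobs: list of (w, h, p) = (weight, height, remaining work), 0 < h <= 1.
    Returns knapsack items (size, value), one per job; the knapsack
    capacity of K(t) would be 1 + eps.
    """
    assert 0 < eps <= 0.5
    base = 1 + eps / 2
    # Step 1: category i holds jobs with 1/base**(i+1) < h <= 1/base**i;
    # the remaining jobs (height at most eps/2) form small category X.
    n = 0
    while base ** n < 2 / eps:
        n += 1
    cats = {i: [] for i in range(n)}
    cats['X'] = []
    hbar = {}  # modified heights
    for j, (w, h, p) in enumerate(jobs):
        for i in range(n):
            if 1 / base ** (i + 1) < h <= 1 / base ** i:
                cats[i].append(j)
                hbar[j] = 1 / base ** i  # round height up to the class top
                break
        else:
            cats['X'].append(j)
            hbar[j] = h                  # category X keeps its true height
    # Step 2: densities use the modified heights.
    dens = [jobs[j][0] / (hbar[j] * jobs[j][2]) for j in range(len(jobs))]
    # Step 3: item j's value is w_j plus hbar_j times the weight of jobs
    # in j's category that are no denser than j (they wait behind j).
    items = []
    for members in cats.values():
        for j in members:
            waiting = sum(jobs[k][0] for k in members
                          if k != j and dens[k] <= dens[j])
            items.append((hbar[j], jobs[j][0] + hbar[j] * waiting))
    return items
```

The category count built in step 1 also witnesses the bound κ ≤ 2 + log_{1+ε/2}(2/ε) = O(1/ε²) used in Section 5.1.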
a key role in KCS, as they help assign values to jobs when making scheduling decisions.

Detailed pseudocode for KCS is given in Figure 4. Steps 1 and 2 of KCS are similar to those of HPCS, except that KCS uses geometrically decreasing heights for job categories.⁶ Another interesting aspect of KCS is the choice of values for jobs in step 3. Intuitively, the value for each knapsack item j is chosen to reflect the total weight of less dense jobs waiting on j's completion in a greedy scheduling of j's category. More practically, the values are chosen so the optimal knapsack solution will affect the potential function Φ as much as possible.

KCS combines the best of all three techniques (knapsacks, densities, and categories), and we are able to show the following surprising result illustrating the power of combining speed and capacity augmentation.

Theorem 5.1. KCS is (1 + ε)-speed (1 + ε)-capacity O(1/ε³)-competitive.

5.1 Main proof idea: Combining categories optimally. We use a potential function analysis to prove Theorem 5.1. Fix an input instance I and consider running KCS's schedule with (1 + ε) speed and capacity alongside the optimal schedule with unit resources. We continue to use the same notation used for HPCS, but modified when necessary for KCS. Let κ be the number of categories used by KCS. Observe

    κ ≤ 2 + log_{1+ε/2}(2/ε) = O(1/ε²).

The potential function has two terms for each job j:

    Φ^A_j(t) = (8κw_j/ε) (R_j(t) + p^A_j(t) − V_j^>(t) + Y_j(t) + F_j(t))
    Φ^O_j(t) = (8κw_j/ε) R_j^<(t)

The potential function is defined as

    Φ(t) = Σ_{j∈Q^A(t)} Φ^A_j(t) − Σ_{j∈Q^O(t)} Φ^O_j(t).

Picking categories: Full details of the analysis appear in Appendix C, but we mention here the same types of changes mentioned in Section 3. The optimal schedule may cause large continuous increases to Φ by scheduling jobs from multiple 'high weight' categories. We show that KCS has the option of choosing jobs from these same categories. Because KCS approximates an optimal knapsack packing, KCS's schedule performs just as much work as the optimal schedule. Fix a time t. Let W∗(t) be the maximum queued weight of any category in KCS's schedule at time t.

⁶ As we shall see in the analysis, knapsacks and capacity augmentation enable KCS to use geometrically decreasing heights, whereas HPCS needs harmonically decreasing heights to pack jobs tightly and get a bound below 2.

Lemma 5.2. If KCS's schedule uses 1 + ε speed and capacity and the optimal schedule uses unit
speed and capacity, then

    d/dt Σ_{j∈Q^A(t)} [Φ^A_j(t) − (8κw_j/ε) F_j(t)] = d/dt Σ_{j∈Q^A(t)} (8κw_j/ε) (R_j + p^A_j − V_j^> + Y_j)(t) ≤ −κ W∗(t).

Proof: Suppose the optimal schedule processes T_i jobs from non-small category i which have a total unmodified height of H_i. Then,

    (1)    d/dt Σ_i Σ_{j∈Q^A_i(t)} (8κw_j/ε) (−V_j^> + Y_j)(t) ≤ Σ_i (8κW_i(t) / ((1 + ε/2)^i ε)) T_i.

Let H_X be the capacity not used by the optimal schedule for jobs in non-small categories. We see

    (2)    d/dt Σ_{j∈Q^A_X(t)} (8κw_j/ε) (−V_j^> + Y_j)(t) ≤ (8κW_X(t)/ε) H_X.

Let ∆O be the sum of the right hand sides of (1) and (2). Note that ∆O may be larger than the actual rate of increase in Φ due to the V_j and Y_j terms.

Now, consider the following set of jobs U ⊆ Q^A(t). For each non-small category i, set U contains the min{|Q^A_i(t)|, T_i} most dense jobs from Q^A_i(t). Each of the jobs chosen from Q^A_i(t) has height at most 1/(1 + ε/2)^i, and the optimal schedule processes H_i height worth of jobs, each with height at least 1/(1 + ε/2)^{i+1}. Therefore, the jobs chosen from Q^A_i(t) have total height at most (1 + ε/2)H_i. Set U also contains up to H_X + ε/2 height worth of jobs from Q^A_X(t) chosen greedily in decreasing order of density. The jobs of U together have height at most Σ_i (1 + ε/2)H_i + H_X + ε/2 ≤ 1 + ε.

Now, suppose KCS schedules the jobs from set U at unit speed or faster. Consider a non-small category i. Observe T_i ≤ (1 + ε/2)^i. Using similar techniques to those given earlier, we observe

    (3)    d/dt Σ_i Σ_{j∈Q^A_i(t)} (8κw_j/ε) (R_j + p^A_j)(t) ≤ −Σ_i (8κW_i(t) / ((1 + ε/2)^i ε)) T_i.

Now, consider any job j from small category X. If j ∈ U, then d/dt p^A_j(t) ≤ −1 ≤ −H_X. If j ∉ U, then there are jobs with higher height density in U with total height at least H_X. Therefore,

    (4)    d/dt Σ_{j∈Q^A_X(t)} (8κw_j/ε) (R_j + p^A_j)(t) ≤ −(8κW_X(t)/ε) H_X.

Let ∆U be the sum of the right hand sides of (3) and (4).

Recall the knapsack instance K(t) used by KCS. Let U be a solution to K(t) and let f(U) be the value of solution U. Let ∆A = Σ_{j∈Q^A(t)} d/dt (8κw_j/ε)(R_j + p^A_j)(t). If KCS schedules the jobs corresponding to U, then ∆A = −(8(1 + ε)κ/ε) f(U). Let U∗ be the optimal solution to K(t) and let ∆∗ = −(8(1 + ε)κ/ε) f(U∗). Because U as defined above is one potential solution to K(t) and KCS runs an FPTAS on K(t) to decide which jobs are scheduled, we see

    ∆O + ∆A ≤ ∆O + (1 − ε/2)∆∗
            ≤ ∆O − (8(1 + ε/4)κ/ε) f(U∗)
            ≤ ∆O + ∆U − 2κ f(U∗)
            = −2κ f(U∗).

If category X has the maximum queued weight, then one possible solution to K(t) is to greedily schedule jobs from category X in descending order of height density. Using previously shown techniques, we see f(U∗) ≥ W∗(t). If a non-small category i has maximum queued weight, then a possible solution is to greedily schedule jobs from category i in descending order of height density. Now we see f(U∗) ≥ (1/2)W∗(t). Using our bound on ∆O + ∆A, we prove the lemma.

6 Conclusions and Open Problems

In this paper, we introduce the problem of weighted flow time on capacitated machines and give the first upper and lower bounds for the problem. Our main result is a (1 + ε)-speed (1 + ε)-capacity O(1/ε³)-competitive algorithm that can be extended to the unrelated multiple machines setting. However, it remains an open problem whether (1 + ε) speed and unit capacity suffice to be constant competitive for any ε > 0. Further, the multi-dimensional case remains open, where jobs can be scheduled simultaneously given that no single dimension of the server's capacity is overloaded.
Acknowledgements. The authors would like to thank Chandra Chekuri, Sungjin Im, Benjamin Moseley, Aranyak Mehta, Gagan Aggarwal, and the anonymous reviewers for helpful discussions and comments on an earlier version of the paper. Research by the first author is supported in part by the Department of Energy Office of Science Graduate Fellowship Program (DOE SCGF), made possible in part by the American Recovery and Reinvestment Act of 2009, administered by ORISE-ORAU under contract no. DE-AC05-06OR23100.

References

[1] Amazon elastic compute cloud (Amazon EC2). http://aws.amazon.com/ec2/.
[2] Google AppEngine cloud. http://appengine.google.com/.
[3] VMware. http://www.vmware.com/.
[4] VMware cloud computing solutions (public/private/hybrid). http://www.vmware.com/solutions/cloud-computing/index.html.
[5] VMware distributed resource scheduler. http://www.vmware.com/products/drs/overview.html.
[6] VMware server consolidation solutions. http://www.vmware.com/solutions/consolidation/index.html.
[7] Server consolidation using VMware ESX Server, 2005. http://www.redbooks.ibm.com/redpapers/pdfs/redp3939.pdf.
[8] S. Anand, N. Garg, and A. Kumar. Resource augmentation for weighted flow-time explained by dual fitting. Proc. 23rd Ann. ACM-SIAM Symp. Discrete Algorithms, pp. 1228–1241. 2012.
[9] B. Awerbuch, Y. Azar, S. Plotkin, and O. Waarts. Competitive routing of virtual circuits with unknown duration. Proc. 5th Ann. ACM-SIAM Symp. Discrete Algorithms, pp. 321–327. 1994.
[10] Y. Azar, B. Kalyanasundaram, S. Plotkin, K. Pruhs, and O. Waarts. On-line load balancing of temporary tasks. Proc. 3rd Workshop on Algorithms and Data Structures, pp. 119–130. 1993.
[11] N. Bansal and H.-L. Chan. Weighted flow time does not admit O(1)-competitive algorithms. Proc. 20th Ann. ACM-SIAM Symp. Discrete Algorithms, pp. 1238–1244. 2009.
[12] N. Bansal and K. Dhamdhere. Minimizing weighted flow time. ACM Trans. Algorithms 3(4):39:1–39:14, 2007.
[13] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield. Xen and the art of virtualization. Proc. 19th Symp. on Operating Systems Principles (SOSP), pp. 164–177. 2003.
[14] L. Becchetti, S. Leonardi, A. Marchetti-Spaccamela, and K. Pruhs. Online weighted flow time and deadline scheduling. Proc. 5th International Workshop on Randomization and Approximation, pp. 36–47. 2001.
[15] N. Bobroff, A. Kochut, and K. Beaty. Dynamic placement of virtual machines for managing SLA violations. Proc. 10th Ann. IEEE Symp. Integrated Management (IM), pp. 119–128. 2007.
[16] C. Bussema and E. Torng. Greedy multiprocessor server scheduling. Oper. Res. Lett. 34(4):451–458, 2006.
[17] J. Chadha, N. Garg, A. Kumar, and V. N. Muralidhara. A competitive algorithm for minimizing weighted flow time on unrelated machines with speed augmentation. Proc. 41st Ann. ACM Symp. Theory Comput., pp. 679–684. 2009.
[18] C. Chekuri, A. Goel, S. Khanna, and A. Kumar. Multi-processor scheduling to minimize flow time with epsilon resource augmentation. Proc. 36th Ann. ACM Symp. Theory Comput., pp. 363–372. 2004.
[19] C. Chekuri, S. Khanna, and A. Zhu. Algorithms for minimizing weighted flow time. Proc. 33rd Ann. ACM Symp. Theory Comput., pp. 84–93. 2001.
[20] V. Chudnovsky, R. Rifaat, J. Hellerstein, B. Sharma, and C. Das. Modeling and synthesizing task placement constraints in Google compute clusters. Symp. on Cloud Computing. 2011.
[21] J. Dean and S. Ghemawat. MapReduce: Simplified data processing on large clusters. Proc. 6th Symp. on Operating System Design and Implementation. 2004.
[22] K. Fox. Online scheduling on identical machines using SRPT. Master's thesis, University of Illinois, Urbana-Champaign, 2010.
[23] K. Fox and B. Moseley. Online scheduling on identical machines using SRPT. Proc. 22nd ACM-SIAM Symp. Discrete Algorithms, pp. 120–128. 2011.
[24] N. Garg and A. Kumar. Minimizing average flow-time: Upper and lower bounds. Proc. 48th Ann. IEEE Symp. Found. Comput. Sci., pp. 603–613. 2007.
[25] O. H. Ibarra and C. E. Kim. Fast approximation algorithms for the knapsack and sum of subset problems. J. ACM 22(4):463–468, Oct. 1975.
[26] S. Im and B. Moseley. An online scalable algorithm for minimizing ℓk-norms of weighted flow time on unrelated machines. Proc. 22nd ACM-SIAM Symp. Discrete Algorithms, pp. 95–108. 2011.
[27] S. Im, B. Moseley, and K. Pruhs. A tutorial on amortized local competitiveness in online scheduling. SIGACT News 42(2):83–97, June 2011.
[28] B. Kalyanasundaram and K. Pruhs. Speed is as powerful as clairvoyance. Proc. 36th Ann. IEEE Symp. Found. Comput. Sci., pp. 214–221. 1995.
[29] E. Lawler. Fast approximation algorithms for knapsack problems. Mathematics of Operations Research 4(4):339–356, 1979.
[30] S. Leonardi and D. Raz. Approximating total flow time on parallel machines. J. Comput. Syst. Sci. 73(6):875–891, 2007.
[31] S. Martello and P. Toth. Knapsack Problems: Algorithms and Computer Implementations. John Wiley, 1990.
[32] A. Mishra, J. Hellerstein, W. Cirne, and C. Das. Towards characterizing cloud backend workloads: Insights from Google compute clusters, 2010.
[33] A. Singh, M. Korupolu, and D. Mohapatra. Server-storage virtualization: Integration and load-balancing in data centers. Proc. IEEE-ACM Supercomputing Conference (SC), pp. 53:1–53:12. 2008.
[34] Q. Zhang, J. Hellerstein, and R. Boutaba. Characterizing task usage shapes in Google compute clusters. Proc. 5th International Workshop on Large Scale Distributed Systems and Middleware. 2011.

Appendix

A Analysis of HRDF

In this appendix, we complete the analysis of HRDF. We use the definitions and potential function as given in Section 3. Consider the changes that occur to A + Φ over time. The events we need to consider follow:

Job arrivals: Consider the arrival of job j at time r_j. The potential function Φ gains the terms Φ^A_j and −Φ^O_j. We see

    Φ^A_j(r_j) − Φ^O_j(r_j) = ((1 + ε)w_j/ε) p_j ≤ ((1 + ε)/ε) OPT_j.

For any term Φ^A_k with k ≠ j, job j may only contribute to R_k and −V_k^>, and its contributions to the two terms cancel. Further, job j has no effect on any term Φ^O_k where k ≠ j. All together, job arrivals increase A + Φ by at most

    (5)    ((1 + ε)/ε) OPT.

Continuous job processing: Consider any time t. We have the following:

• By Lemma 3.2,

    d/dt Σ_{j∈Q^A(t)} (w_j/ε) (R_j + (1 + ε)p^A_j − V_j^> + Y_j)(t) ≤ −W^A(t).

• If HRDF's schedule processes job j at time t, then

    d/dt (w_j/ε) F_j(t) = (h_j w_j/ε) Σ_{k∈Q^O(t)} w_k/w_j = (h_j/ε) d/dt OPT(t).

Recall S^A(t) is the set of jobs processed by HRDF's schedule at time t. Observe that Σ_{j∈S^A(t)} h_j ≤ 2 + ε. Therefore,

    d/dt Σ_{j∈S^A(t)} (w_j/ε) F_j(t) ≤ ((2 + ε)/ε) d/dt OPT(t).

• Finally,

    −d/dt Σ_{j∈Q^O(t)} (w_j/ε) R_j^<(t) ≤ Σ_{j∈Q^O(t)} ((2 + ε)w_j/ε) = ((2 + ε)/ε) d/dt OPT(t).

Combining the above inequalities gives us the following:

    d/dt (A + Φ)(t) = Σ_{j∈Q^A(t)} [d/dt A_j(t) + d/dt Φ^A_j(t)] − Σ_{j∈Q^O(t)} d/dt Φ^O_j(t)
                    ≤ W^A(t) − W^A(t) + ((2 + ε)/ε) d/dt OPT(t) + ((2 + ε)/ε) d/dt OPT(t)
                    = ((4 + 2ε)/ε) d/dt OPT(t).

Therefore, the total increase to A + Φ due to continuous processing of jobs is at most

    (6)    ∫_{t≥0} ((4 + 2ε)/ε) (d/dt) OPT(t) dt = ((4 + 2ε)/ε) OPT.

Job completions: The completion of job j in HRDF's schedule increases Φ by

    −Φ^A_j(C^A_j) = (w_j/ε) (V_j^>(C^A_j) − Y_j(C^A_j) − F_j(C^A_j)).
Similarly, the completion of job j by the optimal schedule increases Φ by

    −Φ^O_j(C^O_j) = (w_j/ε) R_j^<(C^O_j).

We argue that the negative terms from these increases cancel the positive terms. Fix a job j and consider any job k contributing to V_j^>(C^A_j). By definition of V_j^>, job k arrives after job j and is more height dense than job j upon its arrival. We see

    w_k/(h_k p_k) > w_j/(h_j p^A_j(r_k))  ⟹  (h_j p^A_j(r_k) w_k)/w_j > h_k p_k.

Job k is alive to contribute a total of (h_j p^A_j(r_k) w_k)/w_j > h_k p_k to F_j during times HRDF's schedule processes job j. The decrease in Φ of k's contribution to F_j(C^A_j) negates the increase caused by k's contribution to V_j^>(C^A_j).

Now, consider job k contributing to R_j^<(C^O_j). By definition we know k arrives before job j and that job j is less height dense than job k at the time j arrives. We see

    w_k/(h_k p^A_k(r_j)) > w_j/(h_j p_j)  ⟹  w_k h_j p_j > w_j h_k p^A_k(r_j).

Job j does not contribute to V_k^>, because it is less height dense than job k upon its arrival. Further, job k is alive in HRDF's schedule when the optimal schedule completes job j. Therefore, all processing done to j by the optimal schedule contributes to Y_k. So, removing job j's contribution to Y_k(C^A_k) causes Φ to decrease by (w_k/ε) h_j p_j > (w_j/ε) h_k p^A_k(r_j), an upper bound on the amount of increase caused by removing k's contribution to −R_j^<(C^O_j). We conclude that job completions cause no overall increase in A + Φ.

Density inversions: The final change we need to consider for Φ comes from HRDF's schedule changing the ordering of the most height dense jobs. The R_j terms are the only ones changed by these events. Suppose HRDF's schedule causes some job j to become less height dense than another job k at time t. Residual density changes are continuous, so

    w_j/(h_j p^A_j(t)) = w_k/(h_k p^A_k(t))  ⟹  w_j h_k p^A_k(t) = w_k h_j p^A_j(t).

At time t, job k starts to contribute to R_j, but job j stops contributing to R_k. Therefore, the change to Φ from the inversion is

    (w_j/ε) h_k p^A_k(t) − (w_k/ε) h_j p^A_j(t) = 0.

We conclude that these events cause no overall increase in A + Φ.

A.1 Completing the proof. We may now complete the proof of Theorem 3.1. By summing the increases to A + Φ from (5) and (6), we see

    (7)    A ≤ ((5 + 3ε)/ε) OPT = O(1/ε) OPT.

B Analysis of HPCS

In this appendix, we complete the analysis of HPCS. We use the definitions and potential function as given in Section 4. Consider the changes that occur to A + Φ over time. As is the case for HRDF, job completions and density inversions cause no overall increase to A + Φ. The analyses for these changes are essentially the same as those given in Appendix A, except the constants may change. The remaining events we need to consider follow:

Job arrivals: Consider the arrival of job j at time r_j. The potential function Φ gains the terms Φ^A_j and −Φ^O_j. We see

    Φ^A_j(r_j) − Φ^O_j(r_j) = λκ w_j p_j ≤ λκ OPT_j.

For any term Φ^A_k with k ≠ j, job j may only contribute to R_k and −V_k^>, and its contributions to the two terms cancel. Further, job j has no effect on any term Φ^O_k where k ≠ j. All together, job arrivals increase A + Φ by at most

    (8)    λκ OPT.

Continuous job processing: Consider any time t. We have the following:

• By Lemma 4.2,

    d/dt Σ_{j∈Q^A(t)} λκ w_j (R_j + p^A_j − V_j + Y_j)(t) ≤ −κ W∗(t)

if κ ≥ 200 and λ ≥ 100.

• Note Σ_{j∈S^A(t)} h_j ≤ 1. Therefore,

    d/dt Σ_{j∈S^A(t)} λκ w_j F_j(t) = Σ_{j∈S^A(t)} λκ h_j w_j Σ_{k∈Q^O(t)} w_k/w_j ≤ λκ d/dt OPT(t).

• Observe:

    −d/dt Σ_{j∈Q^O(t)} λκ w_j R_j^<(t) ≤ Σ_{j∈Q^O(t)} 1.75 λκ w_j = 1.75 λκ d/dt OPT(t).
Combining the above inequalities gives us the following:

    d/dt (A + Φ)(t) = Σ_{j∈Q^A(t)} [d/dt A_j(t) + d/dt Φ^A_j(t)] − Σ_{j∈Q^O(t)} d/dt Φ^O_j(t)
                    ≤ κW∗(t) − κW∗(t) + λκ d/dt OPT(t) + 1.75 λκ d/dt OPT(t)
                    ≤ 2.75 λκ d/dt OPT(t).

Therefore, the total increase to A + Φ due to continuous processing of jobs is at most

    (9)    ∫_{t≥0} 2.75 λκ (d/dt) OPT(t) dt = 2.75 λκ OPT.

B.1 Completing the proof. We may now complete the proof of Theorem 4.1. Set κ = 200 and λ = 100. By summing the increases to A + Φ from (8) and (9), we see

    (10)    A ≤ 3.75 λκ OPT = O(OPT).

C Analysis of KCS

In this appendix, we complete the analysis of KCS. We use the definitions and potential function as given in Section 5. Consider the changes that occur to A + Φ over time. As is the case for HRDF, job completions and density inversions cause no overall increase to A + Φ. The analyses for these changes are essentially the same as those given in Appendix A, except the constants may change. The remaining events we need to consider follow:

Job arrivals: Consider the arrival of job j at time r_j. The potential function Φ gains the terms Φ^A_j and −Φ^O_j. We see

    Φ^A_j(r_j) − Φ^O_j(r_j) = (8κw_j/ε) p_j ≤ O(1/ε³) OPT_j.

Again, the changes to other terms cancel. All together, job arrivals increase A + Φ by at most

    (11)    O(1/ε³) OPT.

Continuous job processing: Consider any time t. We have the following:

• By Lemma 5.2,

    d/dt Σ_{j∈Q^A(t)} (8κw_j/ε) (R_j + p^A_j − V_j^> + Y_j)(t) ≤ −κ W∗(t).

• For any job j in non-small category i, the modified height satisfies h̄_j ≤ (1 + ε/2)h_j by definition. Also, Σ_{j∈S^A(t)} h_j ≤ 1 + ε. Therefore,

    d/dt Σ_{j∈S^A(t)} (8κw_j/ε) F_j(t) = Σ_{j∈S^A(t)} (8κ h̄_j w_j/ε) Σ_{k∈Q^O(t)} w_k/w_j ≤ (8(1 + ε)²κ/ε) d/dt OPT(t).

• The number of non-small category i jobs processed by KCS's schedule is at most (1 + ε)²/h_j. Therefore,

    −d/dt Σ_{j∈Q^O(t)} (8κw_j/ε) R_j^<(t) ≤ Σ_{j∈Q^O(t)} (8(1 + ε)³κ w_j/ε) = (8(1 + ε)³κ/ε) d/dt OPT(t).

Combining the above inequalities gives us the following:

    d/dt (A + Φ)(t) = Σ_{j∈Q^A(t)} [d/dt A_j(t) + d/dt Φ^A_j(t)] − Σ_{j∈Q^O(t)} d/dt Φ^O_j(t)
                    ≤ κW∗(t) − κW∗(t) + (8(1 + ε)²κ/ε) d/dt OPT(t) + (8(1 + ε)³κ/ε) d/dt OPT(t)
                    ≤ (16(1 + ε)³κ/ε) d/dt OPT(t).

Therefore, the total increase to A + Φ due to continuous processing of jobs is at most

    (12)    ∫_{t≥0} (16(1 + ε)³κ/ε) (d/dt) OPT(t) dt = O(1/ε³) OPT.

C.1 Completing the proof. We may now complete the proof of Theorem 5.1. By summing the increases to A + Φ from (11) and (12), we see

    (13)    A ≤ O(1/ε³) OPT.
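The step that combines the two OPT(t) rates into a single constant uses only 2 + ε ≤ 2(1 + ε); written out:

```latex
\frac{8(1+\epsilon)^2\kappa}{\epsilon} + \frac{8(1+\epsilon)^3\kappa}{\epsilon}
  \;=\; \frac{8(1+\epsilon)^2\kappa}{\epsilon}\,(2+\epsilon)
  \;\le\; \frac{8(1+\epsilon)^2\kappa}{\epsilon}\cdot 2(1+\epsilon)
  \;=\; \frac{16(1+\epsilon)^3\kappa}{\epsilon}.
```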
D Extensions to Multiple Machines

Here, we describe how our results can be extended to work with multiple unrelated machines. In the unrelated machines variant of the CFT problem, we have m machines. Job j has weight w_ℓj, processing time p_ℓj, and height h_ℓj if it is assigned to machine ℓ. At each point in time, the server may schedule an arbitrary number of jobs assigned to each machine as long as the sum of their heights does not exceed 1. For each machine ℓ and job j, we assume either h_ℓj ≤ 1 or p_ℓj = ∞.

The algorithms in this paper can be extended easily to the unrelated machines model. These extensions are non-migratory, meaning a job is fully processed on a single machine. In fact, our extensions are immediate dispatch, meaning each job is assigned to a machine upon its arrival. Let A be one of the algorithms given above. In order to extend A, we simply apply an idea of Chadha et al. [17] also used by Im and Moseley [26]. Namely, we run A on each machine independently and assign jobs to the machine that causes the least increase in A's potential function.

A more precise explanation follows. We use all the definitions given for A and its analysis. Let Q^A_ℓ(t) be the set of alive jobs assigned to machine ℓ at time t. Upon job j's arrival, assign job j to the machine ℓ that causes the least increase in Σ_{k∈Q^A_ℓ(t)} w_ℓk (R_k + p_ℓk), where each term R_k and p_k is modified to consider only jobs assigned to ℓ and their processing times and heights on ℓ. Each machine ℓ independently runs A on the jobs to which it is assigned, using processing times, weights, and heights as given for that machine.

The extensions to each algorithm A given above have the same competitive ratios as their single machine counterparts, given the amount of resource augmentation required in the single machine model for A.

D.1 Analysis. Here, we sketch the analysis of our extension for multiple unrelated machines. Our analysis is essentially the same as the single machine case, except the terms in the potential functions are slightly changed. Let A be the algorithm we are extending. We begin by assuming the optimal schedule is non-migratory and immediate dispatch; whenever a job j arrives, it is immediately assigned to a machine ℓ and all processing for j is performed on ℓ. For each term R_j, V_j^>, Y_j, and F_j, we change its definition to only include contributions by jobs scheduled on the machine chosen for job j by our extension to A. For R_j^<, we change its definition to only include contributions by jobs scheduled on the machine ℓ chosen for job j by the optimal schedule. For example, if A is HRDF, then let p^A_j(t) be the remaining processing time of j on machine ℓ. Let d^A_j(t) = w_ℓj/(h_ℓj p^A_j(t)), and let Q^{A≻j}(t) = {k ∈ Q^A_ℓ(t) : d^A_k(t) > d^A_j(t)}. Finally, we have R_j(t) = Σ_{k∈Q^{A≻j}(t)} h_ℓk p^A_k(t). The definition of the potential function for each algorithm A then remains unchanged.

The analysis for continuous job processing, job completions, and density inversions is exactly the same as those given above for A, as we restrict our attention to a single machine in each case. We only need to consider job arrivals. Let c_j be the coefficient of p^A_j(t) in the potential function for A (for example, c_j = (1 + ε)w_j/ε for HRDF). Recall that in each analysis, the increase to Φ from job arrivals was at most c_j OPT_j. Consider the arrival of job j at time r_j. Let ℓ^A_j be the machine chosen for j by our extension and let ℓ^O_j be the machine chosen by the optimal algorithm. By our extension's choice of machine, the potential function Φ increases by at most c_j p_{ℓ^O_j j} upon job j's arrival. This value is at most c_j OPT_j. Summing over all jobs, we see job arrivals increase Φ by the same amount they did in the earlier analyses.

Finally, we compare against optimal schedules that can be migratory. In this case, we model a migratory optimal schedule as assigning an α_ℓj fraction of job j to each machine ℓ (with Σ_ℓ α_ℓj = 1) upon job j's arrival. We further adjust our definitions of V_j^> and R_j^< to account for these fractional jobs; V_j^> only includes fractional jobs that are more dense than j, and R_j^< includes contributions by jobs that are more height dense than the fractional job the optimal schedule assigns in place of j. Now, by our extension's choice of machine, the arrival of job j increases Φ by Σ_ℓ α_ℓj c_j p_ℓj ≤ c_j OPT_j. Summing over all jobs completes our analysis of job arrivals. Continuous job processing, job completions, and density inversions work as before.
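The immediate-dispatch rule above (run A on each machine and send an arriving job to the machine whose potential increases least) can be sketched as follows for A = HRDF. This is our illustrative code, not the paper's implementation: the per-machine potential is the HRDF-style sum Σ_k w_k(R_k + p_k), with R_k the h·p volume of strictly denser jobs on the same machine.

```python
def machine_potential(jobs):
    """HRDF-style potential of one machine: sum of w_k * (R_k + p_k),
    where R_k totals h*p over strictly denser jobs on this machine
    (density = w / (h * p))."""
    total = 0.0
    for w, h, p in jobs:
        d = w / (h * p)
        residual = sum(h2 * p2 for w2, h2, p2 in jobs if w2 / (h2 * p2) > d)
        total += w * (residual + p)
    return total

def dispatch(machines, job_params):
    """Immediate dispatch: assign the arriving job to the machine whose
    potential increases least.  machines[m] is a list of (w, h, p) jobs
    already on machine m; job_params[m] is the arriving job's (w, h, p)
    on machine m, or None when the job cannot run there (p = infinity)."""
    best, best_inc = None, float('inf')
    for m, params in enumerate(job_params):
        if params is None:
            continue
        inc = machine_potential(machines[m] + [params]) - machine_potential(machines[m])
        if inc < best_inc:
            best, best_inc = m, inc
    machines[best].append(job_params[best])
    return best
```

Choosing the arg-min machine is what lets the arrival cost be charged against the optimal schedule's own machine choice, as in the analysis above.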