Weighted Flowtime on Capacitated Machines

Kyle Fox* (University of Illinois at Urbana-Champaign, Urbana, IL; kylefox2@illinois.edu)
Madhukar Korupolu (Google Research, Mountain View, CA; mkar@google.com)

  * Portions of this work were done while this author was visiting Google Research.

Abstract

It is well-known that SRPT is optimal for minimizing flow time on machines that run one job at a time. However, running one job at a time is a big under-utilization for modern systems where sharing, simultaneous execution, and virtualization-enabled consolidation are a common trend to boost utilization. Such machines, used in modern large data centers and clouds, are powerful enough to run multiple jobs/VMs at a time subject to overall CPU, memory, network, and disk capacity constraints.

Motivated by this prominent trend and need, in this work we give the first scheduling algorithms to minimize weighted flow time on such capacitated machines. To capture the difficulty of the problem, we show that without resource augmentation, no online algorithm can achieve a bounded competitive ratio. We then investigate algorithms with a small resource augmentation in speed and/or capacity. Our first result is a simple (2 + ε)-capacity O(1/ε)-competitive greedy algorithm. Using only speed augmentation, we then obtain a 1.75-speed O(1)-competitive algorithm. Our main technical result is a near-optimal (1 + ε)-speed, (1 + ε)-capacity O(1/ε³)-competitive algorithm using a novel combination of knapsacks, densities, job classification into categories, and potential function methods. We show that our results also extend to the multiple unrelated capacitated machines setting.

1    Introduction

Four trends are becoming increasingly prominent in modern data centers. Modern machines are increasingly powerful and able to run multiple jobs at the same time subject to overall capacity constraints. Running only one job at a time on them leads to significant under-utilization. Virtualization technologies such as VMware [3] and Xen [13] are enabling workloads to be packaged as VMs and multiple VMs to be run on the same physical machine [5]. This packing of VMs on the same machine enables server consolidation [6, 7, 15] and data center cost reduction in the form of reduced hardware, floorspace, energy, cooling, and maintenance costs. This utilization boost and the resulting cost savings have been a driving force for virtualization adoption.

This adoption, in turn, is driving a proliferation of compute farms and public, private, and hybrid clouds which share resources and machines among multiple users and jobs [1, 2, 4, 32]. Large compute farms and clouds often have excess capacity provisioned for peak workloads, and this capacity can be shared with new users to improve utilization during off-peak times. Tasks such as MapReduce [21], which are I/O driven, are often run with multiple instances (of different MapReduces) on the same machine so they all progress toward completion simultaneously. A characterization of workloads in Google compute clusters is given in [20, 32, 34].

A common scenario in all of these situations is that of jobs (or VMs) arriving in an online manner with their resource requirements and processing time requirements, and a scheduler scheduling them on one of the machines in the cluster. Jobs leave when they are completed. The problem is interesting even with a single machine. A common goal in such a scenario is to minimize average weighted flow time, defined as the weighted average of the job completion times minus arrival times, i.e., the total time spent by jobs in the system. Other common goals include load balancing (minimizing the maximum load on machines), minimizing average stretch (the stretch of a job is defined as completion time minus arrival time divided by processing time), or completing as many jobs as possible before their deadlines if there are deadlines. Adding weights lets us prioritize the completion of certain jobs over others. In fact, weighted flow time subsumes the stretch metric if we set the weight of a job to be the reciprocal of its processing time. In this paper we focus on the weighted flow time objective.

1.1    Prior work on minimizing flow time: Single job at a time model.  Traditionally, almost all the known flow time results have been for machines that run at most one job at a time.
Consider the following model. A server sees n jobs arrive over time. Each job j arrives at time rj and requires a processing time pj. At each point in time t, the server may choose to schedule one job j that has already arrived (rj ≤ t). We say a job j has been completed when it has been scheduled for pj time units. Preemption is allowed, so job j may be scheduled, paused, and then later scheduled without any loss of progress. Each job j is given a weight wj. The goal of the scheduler is to minimize the total weighted flow time of the system, defined as Σ_j wj·(Cj − rj), where Cj is the completion time of job j. For this paper in particular, we are interested in the clairvoyant online version of this problem where the server is not aware of any job j until its release, but once job j arrives, the server knows pj and wj.

It is well known that Shortest Remaining Processing Time (SRPT) is the optimal algorithm for minimizing flow time with unit weights in the above single-job machine model. In fact, SRPT is the best known algorithm for unit weights on multiple machines in both the online and offline setting. Leonardi and Raz [30] show SRPT has competitive/approximation ratio O(log(min{n/m, P})), where m is the number of machines and P is the ratio of the longest processing time to the shortest. They also give a matching lower bound for any randomized online algorithm. Chekuri, Khanna, and Zhu [19] show every randomized online algorithm has competitive ratio Ω(min{W^(1/2), P^(1/2), (n/m)^(1/4)}) when the jobs are given arbitrary weights and there are multiple machines. Bansal and Chan [11] partially extended this result to show there is no deterministic online algorithm with constant competitive ratio when the jobs are given arbitrary weights, even with a single machine. However, there are some positive online results known for a single machine where the competitive ratio includes a function of the processing times and weights [12, 19].¹

  ¹ The result of [19] is actually a semi-online algorithm in that P is known in advance.

The strong lower bounds presented above motivate the use of resource augmentation, originally proposed by Kalyanasundaram and Pruhs [28], which gives an algorithm a fighting chance on worst case scheduling instances when compared to an optimal offline algorithm. We say an algorithm is s-speed c-competitive if for any input instance I, the algorithm's weighted flow time while running at speed s is at most c times the optimal schedule's weighted flow time while running at unit speed. Ideally, we would like an algorithm to be constant factor competitive when given (1 + ε) times a resource over the optimal offline algorithm for any constant ε > 0. We call such algorithms [resource]-scalable. Chekuri et al. [18] gave the first speed-scalable online algorithms for minimizing flow time on multiple machines with unit weights. Becchetti et al. [14] gave the first analysis showing that the simple greedy algorithm Highest Density First (HDF) is speed-scalable with arbitrary job weights on a single machine. HDF always schedules the alive job with the highest weight to processing time ratio. Bussema and Torng [16] showed the natural generalization of HDF to multiple machines is also speed-scalable with arbitrary job weights. Table 1 gives a summary of these results.

1.2    The capacitated flow time problem.  The above results assume that any machine can run at most one job at a time. This assumption is wasteful and does not take advantage of the sharing, simultaneous execution, and virtualization abilities of modern computers. The current trend in industry is not only to focus on distributing large workloads, but also to run multiple jobs/VMs at the same time on individual machines subject to overall capacity constraints. By not addressing the sharing and virtualization trend, theory is getting out of touch with the industry.

The problem of scheduling an online sequence of jobs (VMs) in such scenarios necessitates the "capacitated machines" variant of the flow time problem. In addition to each job j having a processing time and weight, job j also has a height hj that represents how much of a given resource (e.g. RAM, network throughput) the job requires in order to be scheduled. The server is given a capacity that we assume is 1 without loss of generality. At each point in time, the server may schedule an arbitrary number of jobs such that the sum of heights of the jobs scheduled is at most the capacity of the server. We assume all jobs have height at most 1, or some jobs could never be scheduled. The setting can easily be extended to multiple machines and multi-dimensional heights, but even the single-machine, single-dimension case given above is challenging from a technical standpoint. We focus on the single-machine, single-dimension problem first, and later show how our results can be extended to multiple machines. We leave further handling of multiple dimensions as an interesting open problem.

The case of jobs having sizes related to machine capacities has been studied before in the context of load balancing [9, 10] and in operations research and systems [6, 15, 31, 33]. However, this paper is the first instance we are aware of that studies capacitated machines with the objective of minimizing flow time. We call this the Capacitated Flow Time (CFT) problem.
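To make the capacitated model above concrete, here is a minimal sketch that checks whether a set of concurrently scheduled jobs fits within the server capacity and computes the weighted flow time of a finished schedule. The data layout and function names are our own illustrative choices, not part of the paper.

    # Sketch of the CFT objective: jobs run concurrently as long as their total
    # height stays within the server capacity, and the cost of a schedule is
    # the total weighted flow time  sum_j w_j * (C_j - r_j).

    def fits(scheduled_jobs, capacity=1.0):
        # scheduled_jobs: jobs chosen to run at one instant, each with height h
        return sum(job["h"] for job in scheduled_jobs) <= capacity

    def weighted_flow_time(completions, jobs):
        # completions[j] = C_j;  jobs[j] = {"r": release, "p": processing, "w": weight, "h": height}
        return sum(jobs[j]["w"] * (completions[j] - jobs[j]["r"]) for j in jobs)

    # Example: two jobs of height 0.5 may run together on a unit-capacity server.
    jobs = {
        "a": {"r": 0, "p": 2, "w": 1.0, "h": 0.5},
        "b": {"r": 0, "p": 2, "w": 2.0, "h": 0.5},
    }
    print(fits(list(jobs.values())))                   # True
    print(weighted_flow_time({"a": 2, "b": 2}, jobs))  # 1.0*2 + 2.0*2 = 6.0
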
Capacitated | Weights   | Machines | Competitive Ratio
no          | unit      | single   | 1 [folklore]
no          | unit      | multiple | Ω(log(n/m) + log P) [30]; O(log min{n/m, P}) [30]; (1 + ε)-speed O(1/ε)-comp. [18]
no          | arbitrary | single   | ω(1) [11]; O(log² P) [19, semi-online]; O(log W) [12]; O(log n + log P) [12]; (1 + ε)-speed O(1/ε)-comp. [14]
no          | arbitrary | multiple | Ω(min{W^(1/2), P^(1/2), (n/m)^(1/4)}) [19]; (1 + ε)-speed O(1/ε²)-comp. [16]
yes         | unit      | single   | (1.2 − ε)-capacity Ω(P) [this paper]; (1.5 − ε)-capacity Ω(log n + log P) [this paper]
yes         | weighted  | single   | (2 − ε)-capacity ω(1) [this paper]
yes         | weighted  | multiple | (2 + ε)-capacity O(1/ε)-comp. [this paper]; 1.75-speed O(1)-comp. [this paper]; (1 + ε)-speed (1 + ε)-capacity O(1/ε³)-comp. [this paper]

Table 1. A summary of prior and new results in online scheduling to minimize flow time. Here, W denotes the number of distinct weights given to jobs, and P denotes the ratio of the largest to least processing time.



As flow time is a fundamental scheduling objective, we believe this is an important problem to study to help bridge the gap to practice.

1.3    Our results and outline of the paper.  In Section 2, we show that online scheduling in the capacitated machines case is much more difficult than scheduling in the single-job model. In particular, Theorem 2.1 shows there exists no online randomized algorithm for the CFT problem with a bounded competitive ratio, even when all jobs have unit weight. This result stands in stark contrast to the single job at a time model, where SRPT is an optimal algorithm.

Motivated by the difficulty of online scheduling, we consider the use of resource augmentation in speed and capacity. In a capacity augmentation analysis, an algorithm is given capacity greater than 1 while being compared to the optimal schedule on a unit capacity server. We define u-capacity c-competitive accordingly. We also say an algorithm is s-speed u-capacity c-competitive if it is c-competitive when given both s-speed and u-capacity over the optimal unit resource schedule. We show in Corollary 2.2 and Theorem 2.3 that our lower bound holds even when an algorithm is given a small constant factor of extra capacity without additional speed. In fact, strong lower bounds continue to exist even as the augmented capacity approaches 2.

In Section 3, we propose and analyze a greedy algorithm, Highest Residual Density First (HRDF), for the CFT problem and show it is (2 + ε)-capacity O(1/ε)-competitive for any small constant ε > 0. HRDF uses greedy knapsack packing with job residual densities (defined later). The algorithm and analysis are tight for capacity-only augmentation given the lower bound in Theorem 2.4. We then turn to the question of whether speed augmentation helps beat the (2 − ε)-capacity lower bound. Our initial attempt and result (not described in this paper) was an s-speed u-capacity constant-competitive algorithm for all s and u with s · u ≥ 2 + ε. We subsequently obtained an algorithm, High Priority Category Scheduling (HPCS), which breaks through the 2 bound. HPCS is 1.75-speed O(1)-competitive and is described in Section 4.

Our main result is a (1 + ε)-speed (1 + ε)-capacity O(1/ε³)-competitive algorithm called Knapsack Category Scheduling (KCS). KCS builds on HPCS and is presented in Section 5. The KCS result shows the power of combining both speed and capacity augmentation and is the first of its kind.² Pseudocode descriptions of all our algorithms are given in Figures 1, 3, and 4. We conclude with a brief summary of our results as well as some open problems related to the CFT problem in Section 6.

  ² In practice, machines are rarely filled to full capacity, so small augmentations in capacity are not unreasonable.

All of our results can be extended to the multiple unrelated capacitated machine setting as detailed in Appendix D. In the unrelated machines model, each machine-job pair is given a weight, processing time, and height unrelated to any other machine-job pair.
Technical novelty: Our algorithms use novel combinations and generalizations of knapsacks, job densities, and grouping of jobs into categories. While HRDF uses knapsacks and job densities and HPCS uses job densities and categories, KCS combines all three. In fact, our algorithms are the first we are aware of that do not make flow time scheduling decisions based on a single ordering of jobs. Our analysis methods combining potential functions, knapsacks, and categories are the first of their kind and may have other applications.

2    Lower Bounds

We begin the evaluation of the CFT problem by giving lower bounds on the competitive ratio for any online scheduling algorithm. Some lower bounds hold even when jobs are given unit weights. Recall P denotes the ratio of the largest processing time to the least processing time.

Theorem 2.1. Any randomized online algorithm A for the CFT problem has competitive ratio Ω(P) even when all jobs have unit weight.

Proof: Garg and Kumar [24] consider a variant on the single-job capacity identical machines model they call Parallel Scheduling with Subset Constraints. Our proof very closely follows the proof of [24, Theorem 3.2]. While our proof is a lower bound on any deterministic algorithm, it easily generalizes to randomized algorithms with oblivious adversaries. Further, the proof continues to hold even with small amounts of capacity augmentation. Fix a deterministic online algorithm A. We use only one machine and all jobs have unit weight. There are two types of jobs: Type I jobs have unit processing time and height 0.4; Type II jobs have processing time P ≫ 1 and height 0.6. Let T ≫ P be a large constant. On each time step t = 0, ..., T − 1, we release two jobs of type I. At each time step t = 0, P, 2P, ..., T − 1, we release one job of type II (we assume T − 1 is a multiple of P). The total processing time of all released jobs is 3T. The server can schedule two type I jobs at once, and one type I job with a type II job, but it cannot schedule two type II jobs at once. Therefore, one of the following must occur: (a) at least T/2 jobs of type I remain alive at time T, or (b) at least T/(2P) jobs of type II remain alive at time T.

Suppose case (a) occurs. We release 2 jobs of type I at each time step t = T, ..., L, where L ≫ T. The total flow time of A is Ω(T · L). However, the optimal schedule finishes all jobs of type I during [0, T]. It has only T/P jobs (of type II) waiting at each time step during [T, L], so the optimal flow time is O(T · L/P).

Suppose case (b) occurs. We release one job of type II at each time step t = T, T + P, T + 2P, ..., L, where L ≫ T. There are always T/(2P) jobs alive in A's schedule during [T, L], so A's total flow time is Ω(T · L/P). The optimal schedule finishes all jobs of type II during [0, T]. It has 2T jobs of type I alive at time T, but it finishes these jobs during [T, 3T] while also processing jobs of type II. So, after time 3T, the optimal schedule has no more alive jobs and its total flow time is O(L). We get the theorem by setting T ≥ P².

Corollary 2.2. Any randomized online algorithm A for the CFT problem is (1.2 − ε)-capacity Ω(P)-competitive for any constant ε > 0 even when all jobs have unit weight.

In fact, superconstant lower bounds continue to exist even when an online algorithm's server is given more than 1.2 capacity. When we stop requiring input instances to have unit weights, the lower bounds become resilient to even larger capacity augmentations.

Theorem 2.3. Any randomized online algorithm A for the CFT problem is (1.5 − ε)-capacity Ω(log n + log P)-competitive for any constant ε > 0 even when all jobs have unit weight.

Proof: There is a simple reduction from online scheduling in the single-job identical machines model to the capacitated server model. Let I be any input instance for the single-job identical machines model where there are two machines and all jobs have unit weight. We create an input instance I′ using the same jobs as I with each job given height 0.5. Any schedule of I′ is a schedule for I. The lower bound of [30, Theorem 9] immediately implies the theorem.

Theorem 2.4. No deterministic online algorithm A for the CFT problem is constant competitive even when given (2 − ε)-capacity for any constant ε > 0, assuming jobs are allowed arbitrary weights.

Proof: We apply a similar reduction to that given in the proof of Theorem 2.3. Let I be any input instance for the single-job identical machines model where there is one machine and jobs have arbitrary weights. We create an input instance I′ using the same jobs as I with each job given height 1. Any schedule of I′ is a schedule for I. The lower bound of [11, Theorem 2.1] immediately implies the theorem.
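To make the construction in the proof of Theorem 2.1 concrete, the following sketch generates the adversary's release sequence for case (a). The parameter values and helper name are illustrative assumptions of ours; the proof only needs T ≥ P² and L ≫ T.

    # Sketch of the adversarial release sequence from the proof of Theorem 2.1,
    # case (a).  Each job is (release time, processing time, height, weight).

    def release_sequence(P=10, T=101, L=10000):
        jobs = []
        for t in range(T):
            jobs.append((t, 1, 0.4, 1))      # two type I jobs per step
            jobs.append((t, 1, 0.4, 1))
            if t % P == 0:
                jobs.append((t, P, 0.6, 1))  # one type II job every P steps
        for t in range(T, L + 1):            # case (a): keep releasing type I jobs
            jobs.append((t, 1, 0.4, 1))
            jobs.append((t, 1, 0.4, 1))
        return jobs
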
HRDF(I, t):
    /* Q^A(t): jobs alive at time t.  p^A_j(t): remaining processing time of job j. */
    1. For each j ∈ Q^A(t): d^A_j(t) ← wj / (hj · p^A_j(t))   /* assign residual densities */
    2. Sort Q^A(t) by d^A_j(t) in descending order and pack jobs from Q^A(t) up to capacity

Figure 1. The algorithm HRDF.


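The greedy packing step of Figure 1 can be sketched as follows. This is a minimal, single-time-step illustration under one natural reading of "pack up to capacity" (walk down the density-sorted list and take every job that still fits); the data layout and function name are our own assumptions.

    # Minimal sketch of one HRDF scheduling decision (Figure 1).  Each alive
    # job is a dict with weight w, height h, and remaining processing time p.

    def hrdf_step(alive_jobs, capacity):
        # Residual density of job j:  d_j = w_j / (h_j * p_j).
        by_density = sorted(alive_jobs,
                            key=lambda j: j["w"] / (j["h"] * j["p"]),
                            reverse=True)
        scheduled, used = [], 0.0
        for job in by_density:
            if used + job["h"] <= capacity:   # greedily pack up to the capacity
                scheduled.append(job)
                used += job["h"]
        return scheduled

    # Example: with capacity 1, a dense thin job and a wider job run together.
    jobs = [{"w": 2.0, "h": 0.3, "p": 1.0}, {"w": 1.0, "h": 0.6, "p": 4.0}]
    print([j["w"] for j in hrdf_step(jobs, capacity=1.0)])   # [2.0, 1.0]
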
3    Analysis Warmup: Highest Residual Density First

In this section, we analyze the deterministic online algorithm Highest Residual Density First (HRDF) for the CFT problem. A description of HRDF appears in Figure 1. Despite the simplicity of the algorithm, we are able to show the following result matching the lower bound of Theorem 2.4.

Theorem 3.1. HRDF is (2 + ε)-capacity O(1/ε)-competitive for any constant ε where 0 < ε < 1.

3.1    Analysis of HRDF.  We use a potential function analysis to prove Theorem 3.1. Fix an input instance I, and consider running HRDF with 2 + ε capacity alongside the optimal schedule without resource augmentation. We begin our analysis by defining some notation. For any job j, let C^A_j be the completion time of j in HRDF's schedule and let C^O_j be the completion time of j in the optimal schedule. For any time t ≥ rj, let Aj(t) = wj · (min{C^A_j, t} − rj) be the accumulated weighted flow time of job j at time t and let Aj = Aj(C^A_j). Finally, let A = Σ_j Aj be the total weighted flow time for HRDF's schedule. Let OPTj(t), OPTj, and OPT be defined similarly for the optimal schedule.

Our analysis of HRDF builds on the potential function approach outlined in [27], specifically [17, 22, 23], for the single-job at a time problem and extends it to the capacitated case. Unlike in [22, 23], our method supports arbitrary job weights and capacities, and unlike in [17], we do not use the notion of fractional flow time.³ We give a potential function Φ that is almost everywhere differentiable such that Φ(0) = Φ(∞) = 0, and then focus on bounding all continuous and discrete increases to A + Φ across time by a function of OPT in order to bound A by a function of OPT.⁴

  ³ Avoiding the use of fractional flow time allows us to analyze HRDF using capacity-only augmentation without speed augmentation [17]. It also saves us a factor of Ω(1/ε) in the competitive ratio for KCS.
  ⁴ An analysis of an HRDF-like algorithm for the unrelated single-job machine problem was given in [8] by relaxing an integer program and applying dual fitting techniques. We focus on potential function arguments in this work as the integrality gap of the natural linear program for our setting is unbounded for the small amounts of resource augmentation we use in later algorithms.

We give the remaining notation for our potential function. Let p^O_j(t) be the remaining processing time of job j at time t in the optimal schedule. Let Q^O(t) be the set of alive jobs in the optimal schedule at time t. Let Q^A_{>j}(t) = {k ∈ Q^A(t) : d^A_k(t) > d^A_j(t)} be the set of alive jobs in HRDF's schedule more dense than job j. Let S^A(t) and S^O(t) be the sets of jobs processed at time t by HRDF's schedule and the optimal schedule respectively. Finally, let W^A(t) = Σ_{j∈Q^A(t)} wj be the total weight of all jobs alive in HRDF's schedule at time t. The rest of the notation we use in our potential function is given in Figure 1 and Table 2.

The potential function we use has two terms for each job j:

    Φ^A_j(t) = (wj/ε) · [Rj(t) + (1 + ε)·p^A_j(t) − V^>_j(t) + Yj(t) + Fj(t)]
    Φ^O_j(t) = (wj/ε) · R^<_j(t)

The potential function is defined as

    Φ(t) = Σ_{j∈Q^A(t)} Φ^A_j(t) − Σ_{j∈Q^O(t)} Φ^O_j(t).

3.2    Main proof idea: Using extra capacity to keep up with the optimal.  We give full details of the potential function analysis in Appendix A, including the effects from job arrivals and completions, but briefly mention here some of the continuous changes made to A + Φ due to processing by HRDF and the optimal schedule. Intuitively, these changes compare S^A(t) and S^O(t), the jobs processed at time t by HRDF's schedule and the optimal schedule respectively. As is the case for many potential function arguments, we must show that HRDF removes the remaining area of jobs more efficiently than the optimal schedule. The extra efficiency shows that Φ falls fast enough to counteract changes in A. Showing that an algorithm's schedule removes area more efficiently than the optimal schedule is often the most difficult part of a potential function analysis and greatly motivates the other algorithms given in this paper. In our potential function, the rate of work is primarily captured in the Rj, p^A_j, V^>_j, and Yj terms.
  Rj(t) = Σ_{k∈Q^A_{>j}(t)} hk · p^A_k(t).  Total remaining area of jobs more dense than j. Continuous decreases in Rj counteract continuous increases in A.

  R^<_j(t) = Σ_{k∈Q^A_{>j}(rj)} hk · p^A_k(t).  Total remaining area of jobs more dense than j upon j's arrival. The decrease from R^<_j cancels the increase to Φ caused by Rj when j arrives.

  V^>_j(t) = Σ_{k∈Q^O(t)∩Q^A_{>j}(rk)} hk · p^O_k(t).  The optimal schedule's total remaining area for each job k more dense than j upon k's arrival. The increase to Rj caused by the arrival of a job k is canceled by an equal increase in V^>_j.

  Yj(t):  Yj(t) = 0 for all t < rj, and (d/dt)Yj(t) = Σ_{k∈S^O(t)\Q^A_{>j}(rk)} hk for all t ≥ rj.  Captures work done by the optimal schedule that does not affect V^>_j. Used to bound increases in Φ from the removal of R^<_j terms when jobs are completed.

  Fj(t):  Fj(t) = 0 for all t < rj, (d/dt)Fj(t) = 0 if j ∉ S^A(t), and (d/dt)Fj(t) = (hj/wj) · Σ_{k∈Q^O(t)} wk otherwise.  Captures the increase in flow time by the optimal schedule while our schedule processes j. Used to bound increases in Φ from the removal of V^>_j terms when jobs are completed.

Table 2. A summary of notation used in our potential functions.




Figure 2. A lower bound example for HRDF. Type I jobs appear every time step as green squares with height 0.3 and weight 2, while type II jobs appear at time steps {0, 1} mod 3 as purple rectangles with height 0.8 and weight 1. The process continues until time T ≫ 1. Left: the input set of jobs. Center: as type I jobs have higher residual density, HRDF picks one at each time step as it arrives. Right: the optimal schedule runs type II jobs at time steps {0, 1} mod 3 and three type I jobs at time steps 2 mod 3.


Lemma 3.2. If HRDF's schedule uses 2 + ε capacity and the optimal schedule uses unit capacity, then

    Σ_{j∈Q^A(t)} (d/dt)[Φ^A_j − (wj/ε)·Fj](t) = Σ_{j∈Q^A(t)} (d/dt)(wj/ε)·[Rj + (1 + ε)·p^A_j − V^>_j + Yj](t) ≤ −W^A(t).

Proof: Consider any time t. For any job j, neither Rj nor (1 + ε)·p^A_j can increase due to continuous processing of jobs. If HRDF's schedule processes j at time t, then (d/dt)(1 + ε)·p^A_j(t) = −(1 + ε). If HRDF's schedule does not process j, then HRDF's schedule processes a set of more dense jobs of height at least (1 + ε); otherwise, there would be room to process j. Therefore, (d/dt)Rj(t) ≤ −(1 + ε), and

    Σ_{j∈Q^A(t)} (d/dt)(wj/ε)·[Rj + (1 + ε)·p^A_j](t) ≤ −((1 + ε)/ε)·W^A(t).

The optimal schedule is limited to scheduling jobs with a total height of 1. No job contributes to both V^>_j and Yj, so

    Σ_{j∈Q^A(t)} (d/dt)(wj/ε)·[−V^>_j + Yj](t) ≤ (1/ε)·W^A(t).

Summing the two bounds gives a total rate of change of at most −((1 + ε)/ε)·W^A(t) + (1/ε)·W^A(t) = −W^A(t), which proves the lemma.

4    Beyond Greedy: High Priority Category Scheduling

The most natural question after seeing the above analysis is whether there exists an algorithm that is constant factor competitive when given only speed augmentation. In particular, a (1 + ε)-speed constant competitive algorithm for any small constant ε would be ideal.
HPCS(I, t, κ):
    /* κ is an integer determined in the analysis */
    1. For each 1 ≤ i < κ:   /* divide jobs into categories by height */
           Q^A_i(t) ← { j ∈ Q^A(t) : 1/(i + 1) < hj ≤ 1/i }   /* alive category i jobs */
           Wi(t) ← Σ_{j∈Q^A_i(t)} wj   /* queued weight of category i */
           For each j ∈ Q^A_i(t): ĥj ← 1/i   /* modified height of job j */
       Q^A_X(t) ← { j ∈ Q^A(t) : hj ≤ 1/κ }   /* X is the small category */
       WX(t) ← Σ_{j∈Q^A_X(t)} wj
       For each j ∈ Q^A_X(t): ĥj ← hj
    2. For each j ∈ Q^A(t): d^A_j(t) ← wj / (ĥj · p^A_j(t))   /* assign densities by modified height */
    3. i* ← the category with the highest queued weight
       Sort Q^A_{i*}(t) by d^A_j(t) in descending order and pack jobs from Q^A_{i*}(t) up to capacity

Figure 3. The algorithm HPCS.


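As an illustration of Figure 3, here is a small sketch of one HPCS scheduling decision. The data layout, helper names, and tie-breaking details are our own assumptions and are not specified by the paper.

    # Sketch of one HPCS decision (Figure 3): group alive jobs into height
    # categories, pick the category with the largest queued weight, then pack
    # that category greedily by density computed with the modified heights.
    import math

    def hpcs_step(alive_jobs, kappa, capacity=1.0):
        categories = {}  # category id -> list of (modified height, job)
        for job in alive_jobs:                      # job: dict with w, h, p
            if job["h"] <= 1.0 / kappa:
                categories.setdefault("X", []).append((job["h"], job))
            else:
                i = math.floor(1.0 / job["h"])      # 1/(i+1) < h <= 1/i
                categories.setdefault(i, []).append((1.0 / i, job))
        # Category with the highest queued weight.
        best = max(categories.values(), key=lambda cat: sum(j["w"] for _, j in cat))
        best.sort(key=lambda item: item[1]["w"] / (item[0] * item[1]["p"]),
                  reverse=True)
        scheduled, used = [], 0.0
        for h_mod, job in best:
            if used + h_mod <= capacity:
                scheduled.append(job)
                used += h_mod
        return scheduled
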
Unfortunately, the input sequence given in Figure 2 shows that greedily scheduling by residual density does not suffice. HRDF's schedule will accumulate two type II jobs every three time steps, giving it a total flow time of Ω(T²) by time T. The optimal schedule will have flow time O(T), giving HRDF a competitive ratio of Ω(T). Even if we change HRDF to perform an optimal knapsack packing by density or add some small constant amount of extra speed, the lower bound still holds.

4.1    Categorizing jobs.  The above example shows that greedily scheduling the high value jobs and ignoring others does not do well. In fact, type II jobs, though lower in value individually, cannot be allowed to build up in the queue forever. To address this issue, we use the notion of categories and group jobs of similar heights together. The intuition is that once a category or group of jobs accumulates enough weight, it becomes important and should be scheduled. The algorithm HPCS using this intuition is given in Figure 3. Note that HPCS schedules from one category at a time and packs as many jobs as possible from that category.

Within the category HPCS chooses, we can bound the performance of HPCS compared to the optimal. The only slack then is the ability of the optimal schedule to take jobs from multiple categories at once. However, using a surprising application of a well known bound for knapsacks, we show that the optimal schedule cannot affect the potential function too much by scheduling jobs from multiple categories. In fact, we are able to show HPCS is constant competitive with strictly less than double speed.⁵

  ⁵ We note that we can achieve constant competitiveness with speed strictly less than 1.75, but we do not have a closed form expression for the minimum speed possible using our techniques.

Theorem 4.1. HPCS is 1.75-speed O(1)-competitive.

4.2    Analysis of HPCS.  We use a potential function argument to show the competitiveness of HPCS. The main change from our earlier proof is that the terms in the potential function only consider jobs within a single category at a time. For any fixed input instance I, consider running HPCS's schedule at speed 1.75 alongside the optimal schedule at unit speed. For a non-small category i and a job j in that category, let Q^A_{i,>j}(t) = {k ∈ Q^A_i(t) : d^A_k(t) > d^A_j(t)} be the set of alive category i jobs more dense than job j. Define Q^A_{X,>j}(t) similarly. We use the notation in Figure 3 along with the notation given in the proof for HRDF, but we use the modified height in every case height is used. Further, we restrict the definitions of Rj, R^<_j, V^>_j, Yj, and Fj to include only jobs in the same category as j. For example, if job j belongs to category i then Rj(t) = Σ_{k∈Q^A_{i,>j}(t)} ĥk · p^A_k(t).

Let λ be a constant we set later. The potential function we use has two terms for each job j:

    Φ^A_j(t) = λκ·wj · [Rj(t) + p^A_j(t) − V^>_j(t) + Yj(t) + Fj(t)]
    Φ^O_j(t) = λκ·wj · R^<_j(t)

The potential function is again defined as

    Φ(t) = Σ_{j∈Q^A(t)} Φ^A_j(t) − Σ_{j∈Q^O(t)} Φ^O_j(t).
4.3    Main proof idea: Knapsack upper bounds on the optimal efficiency.  As before, we defer the full analysis to Appendix B, but we mention here the same types of changes mentioned in Section 3. Our goal is to show the optimal schedule cannot perform too much work compared to HPCS's schedule even if the optimal schedule processes jobs from several categories at once. Our proof uses a well known upper bound argument for knapsack packing. Consider any time t and let W*(t) be the queued weight of the category scheduled by HPCS at time t.

Lemma 4.2. If HPCS's schedule uses 1.75 speed and the optimal schedule uses unit speed, then for all κ ≥ 200 and λ ≥ 100

    Σ_{j∈Q^A(t)} (d/dt)[Φ^A_j(t) − λκ·wj·Fj(t)] = Σ_{j∈Q^A(t)} (d/dt) λκ·wj·[Rj + p^A_j − V^>_j + Yj](t) ≤ −κ·W*(t).

There are κ categories, so this fall counteracts the increase in A.

Proof: Suppose HPCS schedules jobs from the small category X. Let j be some job in category X. If HPCS's schedule processes j at time t, then (d/dt)p^A_j(t) = −1.75. If HPCS's schedule does not process j at time t, then it does process jobs from category X with height at least 1 − 1/κ. Therefore, (d/dt)Rj(t) ≤ −1.75·(1 − 1/κ). Similar bounds hold if HPCS's schedule processes jobs from a non-small category i, so

    Σ_{j∈Q^A(t)} (d/dt) λκ·wj·[Rj + p^A_j](t) ≤ −1.75·λ·(1 − 1/κ)·κ·W*(t).

Bounding the increases to Φ from changes to the V^>_j and Yj terms takes more subtlety than before. We pessimistically assume every category has queued weight W*(t). We also imagine that, for each non-small category i job j processed by the optimal schedule, job j has height 1/(i + 1). The optimal schedule can still process the same set of jobs with these assumptions, and the changes due to the V^>_j and Yj terms only increase with these assumptions.

We now model the changes due to the V^>_j and Yj terms as an instance K of the standard knapsack packing problem. For each non-small category i, there are i items in K, each with value λκ·W*(t)/i and size 1/(i + 1). These items represent the optimal schedule being able to process i jobs at once from category i. Knapsack instance K also contains one item for each category X job j alive in the optimal schedule with value λκ·W*(t)·hj and size hj. The capacity of K is 1. A valid solution to K is a set U of items of total size at most 1. The value of a solution U is the sum of values of the items in U. The optimal solution to K has value at least as large as the largest possible increase to Φ due to changes in the V^>_j and Yj terms. Note that we cannot pack an item of size 1 and two items of size 1/2 in K. Define K1 and K2 to be K with one item of size 1 removed and one item of size 1/2 removed respectively. The larger of the optimal solutions to K1 and K2 is the optimal solution to K.

Let the knapsack density of an item be the ratio of its value to its size. Let v1, v2, ... be the values of the items in K in descending order of knapsack density. Let s1, s2, ... be the sizes of these same items. Greedily pack the items in decreasing order of knapsack density and let k be the index of the first item that does not fit. Folklore states that v1 + v2 + ··· + v_{k−1} + α·vk is an upper bound on the optimal solution to K, where α = (1 − (s1 + s2 + ··· + s_{k−1})) / sk. This fact applies to K1 and K2 as well.

In all three knapsack instances described, ordering the items by size gives their order by knapsack density. Indeed, items of size 1/(i + 1) have knapsack density λκ·W*(t)·(i + 1)/i. Items created from category X jobs have knapsack density λκ·W*(t). We may now easily verify that 1.45·λκ·W*(t) and 1.73·λκ·W*(t) are upper bounds for the optimal solutions to K1 and K2 respectively. Therefore,

    Σ_{j∈Q^A(t)} (d/dt) λκ·wj·[−V^>_j + Yj](t) ≤ 1.73·λκ·W*(t).

Combining this with the earlier bound, the total rate of change is at most (−1.75·(1 − 1/κ) + 1.73)·λκ·W*(t), which is at most −κ·W*(t) for all κ ≥ 200 and λ ≥ 100, proving the lemma.
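The folklore greedy bound used in the proof above is straightforward to compute. The following sketch, with a hypothetical item list of our own, returns the upper bound v1 + ··· + v_{k−1} + α·vk for a given knapsack capacity.

    # Sketch of the folklore fractional upper bound on a 0/1 knapsack optimum:
    # sort items by value/size, take whole items while they fit, and add an
    # alpha-fraction of the first item that does not fit.

    def fractional_upper_bound(items, capacity):
        # items: list of (value, size) pairs with positive sizes
        items = sorted(items, key=lambda vs: vs[0] / vs[1], reverse=True)
        bound, used = 0.0, 0.0
        for value, size in items:
            if used + size <= capacity:
                bound += value
                used += size
            else:
                alpha = (capacity - used) / size
                bound += alpha * value
                break
        return bound

    # Hypothetical instance: the bound always dominates the best integral packing.
    print(fractional_upper_bound([(6.0, 0.5), (5.0, 0.5), (4.0, 0.34)], 1.0))  # 11.6
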
5    Main Algorithm: Knapsack Category Scheduling

As stated in the previous section, a limitation of HPCS is that it chooses jobs from one category at a time, putting it at a disadvantage. In this section, we propose and analyze a new algorithm called Knapsack Category Scheduling (KCS) which uses a fully polynomial-time approximation scheme (FPTAS) for the knapsack problem [25, 29] to choose jobs from multiple categories. Categories still play a key role in KCS, as they help assign values to jobs when making scheduling decisions.

KCS(I, t, ε):
    /* ε is any real number with 0 < ε ≤ 1/2 */
    1. For each integer i ≥ 0 with (1 + ε/2)^i < 2/ε:   /* divide jobs into categories by height */
           Q^A_i(t) ← { j ∈ Q^A(t) : 1/(1 + ε/2)^(i+1) < hj ≤ 1/(1 + ε/2)^i }
           Wi(t) ← Σ_{j∈Q^A_i(t)} wj
           For each j ∈ Q^A_i(t): ĥj ← 1/(1 + ε/2)^i
       Q^A_X(t) ← Q^A(t) \ ∪_i Q^A_i(t)   /* category X jobs have height at most ε/2 */
       WX(t) ← Σ_{j∈Q^A_X(t)} wj
       For each j ∈ Q^A_X(t): ĥj ← hj
    2. For each j ∈ Q^A(t): d^A_j(t) ← wj / (ĥj · p^A_j(t))   /* assign densities by modified height */
    3. Initialize a knapsack instance K(t) with capacity (1 + ε)   /* a knapsack instance to choose jobs */
       For each category i and for category X, for each j ∈ Q^A_i(t):
           Q^A_{i,>j}(t) ← { k ∈ Q^A_i(t) : d^A_k(t) > d^A_j(t) }   /* category i jobs more dense than j */
           Add item j to K(t) with size ĥj and value wj + ĥj · Σ_{k∈Q^A_i(t)\Q^A_{i,>j}(t)\{j}} wk
    4. Pack jobs from a (1 − ε/2)-approximation for K(t)

Figure 4. The algorithm KCS.


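As a concrete illustration of Figure 4, the sketch below builds the knapsack instance K(t) and selects jobs. For simplicity it solves the knapsack exactly by brute force over subsets instead of using the FPTAS of [25, 29], so it is only meant for tiny examples; all field and function names are our own assumptions.

    # Sketch of one KCS decision (Figure 4).  Jobs are dicts with weight w,
    # height h, and remaining processing time p.
    from itertools import combinations

    def kcs_step(alive_jobs, eps, capacity=None):
        capacity = (1 + eps) if capacity is None else capacity
        # Step 1: categories with geometrically decreasing modified heights.
        def category(job):
            i = 0
            while (1 + eps / 2) ** i < 2 / eps:
                if job["h"] > 1 / (1 + eps / 2) ** (i + 1):
                    return i, 1 / (1 + eps / 2) ** i      # modified height
                i += 1
            return "X", job["h"]
        cats = {}
        for job in alive_jobs:
            c, h_mod = category(job)
            cats.setdefault(c, []).append((h_mod, job))
        # Steps 2-3: item value = w_j + h_mod * (weight of no-denser jobs in j's category).
        items = []
        for members in cats.values():
            for h_mod, job in members:
                dens = job["w"] / (h_mod * job["p"])
                no_denser = sum(k["w"] for hk, k in members
                                if k is not job and k["w"] / (hk * k["p"]) <= dens)
                items.append((h_mod, job["w"] + h_mod * no_denser, job))
        # Step 4: best feasible subset (stand-in for the (1 - eps/2)-approximation).
        best, best_val = [], -1.0
        for r in range(len(items) + 1):
            for subset in combinations(items, r):
                if sum(h for h, _, _ in subset) <= capacity:
                    val = sum(v for _, v, _ in subset)
                    if val > best_val:
                        best_val, best = val, [j for _, _, j in subset]
        return best
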
Detailed pseudocode for KCS is given in Figure 4. Steps 1 and 2 of KCS are similar to those of HPCS, except that KCS uses geometrically decreasing heights for job categories.⁶ Another interesting aspect of KCS is the choice of values for jobs in step 3. Intuitively, the value for each knapsack item j is chosen to reflect the total weight of less dense jobs waiting on j's completion in a greedy scheduling of j's category. More practically, the values are chosen so the optimal knapsack solution will affect the potential function Φ as much as possible.

  ⁶ As we shall see in the analysis, knapsacks and capacity augmentation enable KCS to use geometrically decreasing heights, whereas HPCS needs harmonically decreasing heights to pack jobs tightly and get a bound below 2.

KCS combines the best of all three techniques: knapsacks, densities, and categories, and we are able to show the following surprising result illustrating the power of combining speed and capacity augmentation.

Theorem 5.1. KCS is (1 + ε)-speed (1 + ε)-capacity O(1/ε³)-competitive.

5.1    Main proof idea: Combining categories optimally.  We use a potential function analysis to prove Theorem 5.1. Fix an input instance I and consider running KCS's schedule with (1 + ε) speed and capacity alongside the optimal schedule with unit resources. We continue to use the same notation used for HPCS, but modified when necessary for KCS. Let κ be the number of categories used by KCS. Observe

    κ ≤ 2 + log_{1+ε/2}(2/ε) = O(1/ε²).

The potential function has two terms for each job j:

    Φ^A_j(t) = (8κ·wj/ε) · [Rj(t) + p^A_j(t) − V^>_j(t) + Yj(t) + Fj(t)]
    Φ^O_j(t) = (8κ·wj/ε) · R^<_j(t)

The potential function is defined as

    Φ(t) = Σ_{j∈Q^A(t)} Φ^A_j(t) − Σ_{j∈Q^O(t)} Φ^O_j(t).

Picking categories: Full details of the analysis appear in Appendix C, but we mention here the same types of changes mentioned in Section 3. The optimal schedule may cause large continuous increases to Φ by scheduling jobs from multiple 'high weight' categories. We show that KCS has the option of choosing jobs from these same categories. Because KCS approximates an optimal knapsack packing, KCS's schedule performs just as much work as the optimal schedule. Fix a time t. Let W*(t) be the maximum queued weight of any category in KCS's schedule at time t.
Lemma 5.2. If KCS's schedule uses 1 + ε speed and capacity and the optimal schedule uses unit speed and capacity, then

    Σ_{j∈Q^A(t)} (d/dt)[Φ^A_j − (8κ·wj/ε)·Fj](t) = Σ_{j∈Q^A(t)} (d/dt)(8κ·wj/ε)·[Rj + p^A_j − V^>_j + Yj](t) ≤ −κ·W*(t).

Proof: Suppose the optimal schedule processes Ti jobs from non-small category i which have a total unmodified height of Hi. Then,

    (1)    Σ_{j∈Q^A_i(t)} (d/dt)(8κ·wj/ε)·[−V^>_j + Yj](t) ≤ Ti · (8κ·Wi(t)) / ((1 + ε/2)^i · ε).

Let HX be the capacity not used by the optimal schedule for jobs in non-small categories. We see

    (2)    Σ_{j∈Q^A_X(t)} (d/dt)(8κ·wj/ε)·[−V^>_j + Yj](t) ≤ HX · (8κ·WX(t)) / ε.

Let ∆O be the sum of the right hand sides of (1) and (2). Note that ∆O may be larger than the actual rate of increase in Φ due to the V^>_j and Yj terms.

Now, consider the following set of jobs U ⊆ Q^A(t). For each non-small category i, set U contains the min{|Q^A_i(t)|, Ti} most dense jobs from Q^A_i(t). Each of the jobs chosen from Q^A_i(t) has height at most 1/(1 + ε/2)^i and the optimal schedule processes Hi height worth of jobs, each with height at least 1/(1 + ε/2)^(i+1). Therefore, the jobs chosen from Q^A_i(t) have total height at most (1 + ε/2)·Hi. Set U also contains up to HX + ε/2 height worth of jobs from Q^A_X(t), chosen greedily in decreasing order of density. The jobs of U together have height at most Σ_i (1 + ε/2)·Hi + HX + ε/2 ≤ 1 + ε.

Now, suppose KCS schedules the jobs from U. If a job of Q^A_i(t) is not scheduled, then the scheduled jobs of Q^A_i(t) are at least as dense and have total modified height Ti/(1 + ε/2)^i, so

    (3)    Σ_{j∈Q^A_i(t)} (d/dt)(8κ·wj/ε)·[Rj + p^A_j](t) ≤ −Ti · (8κ·Wi(t)) / ((1 + ε/2)^i · ε).

If a job of Q^A_X(t) is not scheduled, then there are jobs with higher height density in U with total height at least HX, and

    (4)    Σ_{j∈Q^A_X(t)} (d/dt)(8κ·wj/ε)·[Rj + p^A_j](t) ≤ −HX · (8κ·WX(t)) / ε.

Let ∆U be the sum of the right hand sides of (3) and (4).

Recall the knapsack instance K(t) used by KCS. Let U′ be any solution to K(t) and let f(U′) be the value of solution U′. Let ∆A = Σ_{j∈Q^A(t)} (d/dt)(8κ·wj/ε)·[Rj + p^A_j](t). If KCS schedules the jobs corresponding to U′, then ∆A = −(8(1 + ε)κ/ε)·f(U′). Let U* be the optimal solution to K(t) and let ∆* = −(8(1 + ε)κ/ε)·f(U*). Because U as defined above is one potential solution to K(t) and KCS runs an FPTAS on K(t) to decide which jobs are scheduled, we see

    ∆O + ∆A ≤ ∆O + (1 − ε/2)·∆* ≤ ∆O − (8(1 + ε/4)κ/ε)·f(U*) ≤ ∆O + ∆U − 2κ·f(U*) = −2κ·f(U*).

If category X has the maximum queued weight, then one possible solution to K(t) is to greedily schedule jobs from category X in descending order of height density. Using previously shown techniques, we see f(U*) ≥ W*(t). If a non-small category i has the maximum queued weight, then a possible solution is to greedily schedule jobs from category i in descending order of height density. Now we see f(U*) ≥ (1/2)·W*(t). Using our bound on ∆O + ∆A, we prove the lemma.

6    Conclusions and Open Problems

In this paper, we introduce the problem of weighted
set U at unit speed or faster. Consider a non-small
                                                                  flow time on capacitated machines and give the first
category i. Observe Ti ≤ (1 + ε/2)i . Using similar
                                                                  upper and lower bounds for the problem. Our main
techniques to those given earlier, we observe
                                                                  result is a (1 + ε)-speed (1 + ε)-capacity O(1/ε3 )-
                              d 8κwj                              competitive algorithm that can be extended to the
(3)                                  Rj + pA (t) ≤
                                           j
                              dt ε                                unrelated multiple machines setting. However, it
            i    j∈QA (t)
                    i                                             remains an open problem whether (1 + ε)-speed and
                                      8κWi (t)                    unit capacity suffices to be constant competitive
                              −Ti                 .
                         i
                                     (1 + ε/2)i ε                 for any ε > 0. Further, the multi-dimensional
                                                                  case remains open, where jobs can be scheduled
    Now, consider any job j from small category X.                simulateously given that no single dimension of the
                d
If j ∈ U , then dt pA (t) =≤ −1 ≤ −HX . If j ∈ U ,
                    j                         /                   server’s capacity is overloaded.
Acknowledgements. The authors would like to thank Chandra Chekuri, Sungjin Im, Benjamin Moseley, Aranyak Mehta, Gagan Aggarwal, and the anonymous reviewers for helpful discussions and comments on an earlier version of the paper. Research by the first author is supported in part by the Department of Energy Office of Science Graduate Fellowship Program (DOE SCGF), made possible in part by the American Recovery and Reinvestment Act of 2009, administered by ORISE-ORAU under contract no. DE-AC05-06OR23100.

References
 [1] Amazon elastic compute cloud (Amazon EC2). http://aws.amazon.com/ec2/.
 [2] Google AppEngine cloud. http://appengine.google.com/.
 [3] VMware. http://www.vmware.com/.
 [4] VMware cloud computing solutions (public/private/hybrid). http://www.vmware.com/solutions/cloud-computing/index.html.
 [5] VMware distributed resource scheduler. http://www.vmware.com/products/drs/overview.html.
 [6] VMware server consolidation solutions. http://www.vmware.com/solutions/consolidation/index.html.
 [7] Server consolidation using VMware ESX Server, 2005. http://www.redbooks.ibm.com/redpapers/pdfs/redp3939.pdf.
 [8] S. Anand, N. Garg, and A. Kumar. Resource augmentation for weighted flow-time explained by dual fitting. Proc. 23rd Ann. ACM-SIAM Symp. Discrete Algorithms, pp. 1228–1241, 2012.
 [9] B. Awerbuch, Y. Azar, S. Plotkin, and O. Waarts. Competitive routing of virtual circuits with unknown duration. Proc. 5th Ann. ACM-SIAM Symp. Discrete Algorithms, pp. 321–327, 1994.
[10] Y. Azar, B. Kalyanasundaram, S. Plotkin, K. Pruhs, and O. Waarts. On-line load balancing of temporary tasks. Proc. 3rd Workshop on Algorithms and Data Structures, pp. 119–130, 1993.
[11] N. Bansal and H.-L. Chan. Weighted flow time does not admit O(1)-competitive algorithms. Proc. 20th Ann. ACM-SIAM Symp. Discrete Algorithms, pp. 1238–1244, 2009.
[12] N. Bansal and K. Dhamdhere. Minimizing weighted flow time. ACM Trans. Algorithms 3(4):39:1–39:14, 2007.
[13] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield. Xen and the art of virtualization. Proc. 19th Symp. on Operating Systems Principles (SOSP), pp. 164–177, 2003.
[14] L. Becchetti, S. Leonardi, A. Marchetti-Spaccamela, and K. Pruhs. Online weighted flow time and deadline scheduling. Proc. 5th International Workshop on Randomization and Approximation, pp. 36–47, 2001.
[15] N. Bobroff, A. Kochut, and K. Beaty. Dynamic placement of virtual machines for managing SLA violations. Proc. 10th Ann. IEEE Symp. Integrated Management (IM), pp. 119–128, 2007.
[16] C. Bussema and E. Torng. Greedy multiprocessor server scheduling. Oper. Res. Lett. 34(4):451–458, 2006.
[17] J. Chadha, N. Garg, A. Kumar, and V. N. Muralidhara. A competitive algorithm for minimizing weighted flow time on unrelated machines with speed augmentation. Proc. 41st Ann. ACM Symp. Theory Comput., pp. 679–684, 2009.
[18] C. Chekuri, A. Goel, S. Khanna, and A. Kumar. Multi-processor scheduling to minimize flow time with epsilon resource augmentation. Proc. 36th Ann. ACM Symp. Theory Comput., pp. 363–372, 2004.
[19] C. Chekuri, S. Khanna, and A. Zhu. Algorithms for minimizing weighted flow time. Proc. 33rd Ann. ACM Symp. Theory Comput., pp. 84–93, 2001.
[20] V. Chudnovsky, R. Rifaat, J. Hellerstein, B. Sharma, and C. Das. Modeling and synthesizing task placement constraints in Google compute clusters. Symp. on Cloud Computing, 2011.
[21] J. Dean and S. Ghemawat. MapReduce: Simplified data processing on large clusters. Proc. 6th Symp. on Operating System Design and Implementation, 2004.
[22] K. Fox. Online scheduling on identical machines using SRPT. Master's thesis, University of Illinois, Urbana-Champaign, 2010.
[23] K. Fox and B. Moseley. Online scheduling on identical machines using SRPT. Proc. 22nd ACM-SIAM Symp. Discrete Algorithms, pp. 120–128, 2011.
[24] N. Garg and A. Kumar. Minimizing average flow-time: Upper and lower bounds. Proc. 48th Ann. IEEE Symp. Found. Comput. Sci., pp. 603–613, 2007.
[25] H. Ibarra and E. Kim. Fast approximation algorithms for the knapsack and sum of subset problems. J. ACM 22(4):463–468, 1975.
[26] S. Im and B. Moseley. An online scalable algorithm for minimizing ℓk-norms of weighted flow time on unrelated machines. Proc. 22nd ACM-SIAM Symp. Discrete Algorithms, pp. 95–108, 2011.
[27] S. Im, B. Moseley, and K. Pruhs. A tutorial on amortized local competitiveness in online scheduling. SIGACT News 42(2):83–97, 2011.
[28] B. Kalyanasundaram and K. Pruhs. Speed is as powerful as clairvoyance. Proc. 36th Ann. IEEE Symp. on Found. Comput. Sci., pp. 214–221, 1995.
[29] E. Lawler. Fast approximation algorithms for the knapsack problems. Mathematics of Operations Research 4(4):339–356, 1979.
[30] S. Leonardi and D. Raz. Approximating total flow time on parallel machines. J. Comput. Syst. Sci. 73(6):875–891, 2007.
[31] S. Martello and P. Toth. Knapsack Problems: Algorithms and Computer Implementations. John Wiley, 1990.
[32] A. Mishra, J. Hellerstein, W. Cirne, and C. Das. Towards characterizing cloud backend workloads: Insights from Google compute clusters, 2010.
[33] A. Singh, M. Korupolu, and D. Mohapatra. Server-storage virtualization: Integration and load-balancing in data centers. Proc. IEEE-ACM Supercomputing Conference (SC), pp. 53:1–53:12, 2008.
[34] Q. Zhang, J. Hellerstein, and R. Boutaba. Characterizing task usage shapes in Google compute clusters. Proc. 5th International Workshop on Large Scale Distributed Systems and Middleware, 2011.
Appendix

A  Analysis of HRDF

In this appendix, we complete the analysis of HRDF. We use the definitions and potential function as given in Section 3. Consider the changes that occur to A + Φ over time. The events we need to consider follow:

Job arrivals: Consider the arrival of job j at time r_j. The potential function Φ gains the terms Φ^A_j and −Φ^O_j. We see
\[
  \Phi^A_j(r_j) - \Phi^O_j(r_j) = \frac{(1 + \varepsilon)w_j}{\varepsilon}\, p_j \le \frac{1 + \varepsilon}{\varepsilon}\, \mathrm{OPT}_j.
\]
For any term Φ^A_k with k ≠ j, job j may only contribute to R_k and −V_k^>, and its contributions to the two terms cancel. Further, job j has no effect on any term Φ^O_k where k ≠ j. All together, job arrivals increase A + Φ by at most
\[
  (5) \qquad \frac{1 + \varepsilon}{\varepsilon}\, \mathrm{OPT}.
\]

Continuous job processing: Consider any time t. We have the following:

• By Lemma 3.2,
\[
  \sum_{j \in Q^A(t)} \frac{d}{dt}\, \frac{w_j}{\varepsilon}\bigl(R_j + (1 + \varepsilon)p^A_j - V_j^{>} + Y_j\bigr)(t) \le -W^A(t).
\]

• If HRDF's schedule processes job j at time t, then
\[
  \frac{d}{dt}\, \frac{w_j}{\varepsilon} F_j(t) = \frac{h_j w_j}{\varepsilon} \sum_{k \in Q^O(t)} \frac{w_k}{w_j} = \frac{h_j}{\varepsilon}\, \frac{d}{dt}\mathrm{OPT}(t).
\]
Recall S^A(t) is the set of jobs processed by HRDF's schedule at time t. Observe that Σ_{j∈S^A(t)} h_j ≤ (2 + ε). Therefore,
\[
  \sum_{j \in S^A(t)} \frac{d}{dt}\, \frac{w_j}{\varepsilon} F_j(t) \le \frac{2 + \varepsilon}{\varepsilon}\, \frac{d}{dt}\mathrm{OPT}(t).
\]

• Finally,
\[
  \sum_{j \in Q^O(t)} \frac{d}{dt}\Bigl(-\frac{w_j}{\varepsilon} R_j^{<}\Bigr)(t) \le \sum_{j \in Q^O(t)} \frac{(2 + \varepsilon)w_j}{\varepsilon} = \frac{2 + \varepsilon}{\varepsilon}\, \frac{d}{dt}\mathrm{OPT}(t).
\]

Combining the above inequalities gives us the following:
\[
  \frac{d}{dt}(A + \Phi)(t) = \sum_{j \in Q^A(t)} \Bigl(\frac{d}{dt} A_j(t) + \frac{d}{dt} \Phi^A_j(t)\Bigr) - \sum_{j \in Q^O(t)} \frac{d}{dt} \Phi^O_j(t)
  \le W^A(t) - W^A(t) + \frac{2 + \varepsilon}{\varepsilon}\, \frac{d}{dt}\mathrm{OPT}(t) + \frac{2 + \varepsilon}{\varepsilon}\, \frac{d}{dt}\mathrm{OPT}(t)
  = \frac{4 + 2\varepsilon}{\varepsilon}\, \frac{d}{dt}\mathrm{OPT}(t).
\]
Therefore, the total increase to A + Φ due to continuous processing of jobs is at most
\[
  (6) \qquad \int_{t \ge 0} \frac{4 + 2\varepsilon}{\varepsilon}\, \frac{d}{dt}\mathrm{OPT}(t)\, dt = \frac{4 + 2\varepsilon}{\varepsilon}\, \mathrm{OPT}.
\]

Job completions: The completion of job j in HRDF's schedule increases Φ by
\[
  -\Phi^A_j(C^A_j) = \frac{w_j}{\varepsilon}\Bigl(V_j^{>}(C^A_j) - Y_j(C^A_j) - F_j(C^A_j)\Bigr).
\]
Similarly, the completion of job j by the optimal schedule increases Φ by
\[
  -\Phi^O_j(C^O_j) = \frac{w_j}{\varepsilon}\, R_j^{<}(C^O_j).
\]
We argue that the negative terms from these increases cancel the positive terms. Fix a job j and consider any job k contributing to V_j^>(C^A_j). By definition of V_j^>, job k arrives after job j and is more height dense than job j upon its arrival. We see
\[
  \frac{w_k}{h_k p_k} > \frac{w_j}{h_j p^A_j(r_k)} \;\Longrightarrow\; \frac{h_j p^A_j(r_k)\, w_k}{w_j} > h_k p_k.
\]
Job k is alive to contribute a total of h_j p^A_j(r_k) w_k / w_j > h_k p_k to F_j during times HRDF's schedule processes job j. The decrease in Φ of k's contribution to F_j(C^A_j) negates the increase caused by k's contribution to V_j^>(C^A_j).

Now, consider job k contributing to R_j^<(C^O_j). By definition we know k arrives before job j and that job j is less height dense than job k at the time j arrives. We see
\[
  \frac{w_k}{h_k p^A_k(r_j)} > \frac{w_j}{h_j p_j} \;\Longrightarrow\; w_k h_j p_j > w_j h_k p^A_k(r_j).
\]
Job j does not contribute to V_k^>, because it is less height dense than job k upon its arrival. Further, job k is alive in HRDF's schedule when the optimal schedule completes job j. Therefore, all processing done to j by the optimal schedule contributes to Y_k. So, removing job j's contribution to Y_k(C^A_k) causes Φ to decrease by (w_k/ε) h_j p_j > (w_j/ε) h_k p^A_k(r_j), an upper bound on the amount of increase caused by removing k's contribution to −R_j^<(C^O_j). We conclude that job completions cause no overall increase in A + Φ.

Density inversions: The final change we need to consider for Φ comes from HRDF's schedule changing the ordering of the most height dense jobs. The R_j terms are the only ones changed by these events. Suppose HRDF's schedule causes some job j to become less height dense than another job k at time t. Residual density changes are continuous, so
\[
  \frac{w_j}{h_j p^A_j(t)} = \frac{w_k}{h_k p^A_k(t)} \;\Longrightarrow\; w_j h_k p^A_k(t) = w_k h_j p^A_j(t).
\]
At time t, job k starts to contribute to R_j, but job j stops contributing to R_k. Therefore, the change to Φ from the inversion is
\[
  \frac{w_j}{\varepsilon}\, h_k p^A_k(t) - \frac{w_k}{\varepsilon}\, h_j p^A_j(t) = 0.
\]
We conclude that these events cause no overall increase in A + Φ.

A.1 Completing the proof. We may now complete the proof of Theorem 3.1. By summing the increases to A + Φ from (5) and (6), we see
\[
  (7) \qquad A \le \frac{5 + 3\varepsilon}{\varepsilon}\, \mathrm{OPT} = O\Bigl(\frac{1}{\varepsilon}\Bigr)\mathrm{OPT}.
\]

B  Analysis of HPCS

In this appendix, we complete the analysis of HPCS. We use the definitions and potential function as given in Section 4. Consider the changes that occur to A + Φ over time. As is the case for HRDF, job completions and density inversions cause no overall increase to A + Φ. The analyses for these changes are essentially the same as those given in Appendix A, except the constants may change. The remaining events we need to consider follow:

Job arrivals: Consider the arrival of job j at time r_j. The potential function Φ gains the terms Φ^A_j and −Φ^O_j. We see
\[
  \Phi^A_j(r_j) - \Phi^O_j(r_j) = \lambda\kappa w_j p_j \le \lambda\kappa\, \mathrm{OPT}_j.
\]
For any term Φ^A_k with k ≠ j, job j may only contribute to R_k and −V_k^>, and its contributions to the two terms cancel. Further, job j has no effect on any term Φ^O_k where k ≠ j. All together, job arrivals increase A + Φ by at most
\[
  (8) \qquad \lambda\kappa\, \mathrm{OPT}.
\]

Continuous job processing: Consider any time t. We have the following:

• By Lemma 4.2,
\[
  \sum_{j \in Q^A(t)} \frac{d}{dt}\, \lambda\kappa w_j\bigl(R_j + p^A_j - V_j + Y_j\bigr)(t) \le -\kappa W^*(t)
\]
if κ ≥ 200 and λ ≥ 100.

• Note Σ_{j∈S^A(t)} h_j ≤ 1. Therefore,
\[
  \sum_{j \in S^A(t)} \frac{d}{dt}\, \lambda\kappa w_j F_j(t) = \sum_{j \in S^A(t)} \lambda\kappa h_j w_j \sum_{k \in Q^O(t)} \frac{w_k}{w_j} \le \lambda\kappa\, \frac{d}{dt}\mathrm{OPT}(t).
\]

• Observe:
\[
  \sum_{j \in Q^O(t)} \frac{d}{dt}\bigl(-\lambda\kappa w_j R_j^{<}\bigr)(t) \le \sum_{j \in Q^O(t)} 1.75\,\lambda\kappa w_j = 1.75\,\lambda\kappa\, \frac{d}{dt}\mathrm{OPT}(t).
\]
Combining the above inequalities gives us the following:
\[
  \frac{d}{dt}(A + \Phi)(t) = \sum_{j \in Q^A(t)} \Bigl(\frac{d}{dt} A_j(t) + \frac{d}{dt} \Phi^A_j(t)\Bigr) - \sum_{j \in Q^O(t)} \frac{d}{dt} \Phi^O_j(t)
  \le \kappa W^*(t) - \kappa W^*(t) + \lambda\kappa\, \frac{d}{dt}\mathrm{OPT}(t) + 1.75\,\lambda\kappa\, \frac{d}{dt}\mathrm{OPT}(t)
  \le 2.75\,\lambda\kappa\, \frac{d}{dt}\mathrm{OPT}(t).
\]
Therefore, the total increase to A + Φ due to continuous processing of jobs is at most
\[
  (9) \qquad \int_{t \ge 0} 2.75\,\lambda\kappa\, \frac{d}{dt}\mathrm{OPT}(t)\, dt = 2.75\,\lambda\kappa\, \mathrm{OPT}.
\]

B.1 Completing the proof. We may now complete the proof of Theorem 4.1. Set κ = 200 and λ = 100. By summing the increases to A + Φ from (8) and (9), we see
\[
  (10) \qquad A \le 3.75\,\lambda\kappa\, \mathrm{OPT} = O(\mathrm{OPT}).
\]

C  Analysis of KCS

In this appendix, we complete the analysis of KCS. We use the definitions and potential function as given in Section 5. Consider the changes that occur to A + Φ over time. As is the case for HRDF, job completions and density inversions cause no overall increase to A + Φ. The analyses for these changes are essentially the same as those given in Appendix A, except the constants may change. The remaining events we need to consider follow:

Job arrivals: Consider the arrival of job j at time r_j. The potential function Φ gains the terms Φ^A_j and −Φ^O_j. We see
\[
  \Phi^A_j(r_j) - \Phi^O_j(r_j) = \frac{8\kappa w_j}{\varepsilon}\, p_j \le O\Bigl(\frac{1}{\varepsilon^3}\Bigr)\mathrm{OPT}_j.
\]
Again, the changes to other terms cancel. All together, job arrivals increase A + Φ by at most
\[
  (11) \qquad O\Bigl(\frac{1}{\varepsilon^3}\Bigr)\mathrm{OPT}.
\]

Continuous job processing: Consider any time t. We have the following:

• By Lemma 5.2,
\[
  \sum_{j \in Q^A(t)} \frac{d}{dt}\, \frac{8\kappa w_j}{\varepsilon}\bigl(R_j + p^A_j - V_j^{>} + Y_j\bigr)(t) \le -\kappa W^*(t).
\]

• For any job j in non-small category i, the height h_j is at most (1 + ε/2) times j's modified height by definition. Also, Σ_{j∈S^A(t)} h_j ≤ 1 + ε. Therefore,
\[
  \sum_{j \in S^A(t)} \frac{d}{dt}\, \frac{8\kappa w_j}{\varepsilon} F_j(t) = \sum_{j \in S^A(t)} \frac{8\kappa h_j w_j}{\varepsilon} \sum_{k \in Q^O(t)} \frac{w_k}{w_j} \le \frac{8(1 + \varepsilon)^2 \kappa}{\varepsilon}\, \frac{d}{dt}\mathrm{OPT}(t).
\]

• The number of non-small category i jobs processed by KCS's schedule is at most (1 + ε)^2/h_j. Therefore,
\[
  \sum_{j \in Q^O(t)} \frac{d}{dt}\Bigl(-\frac{8\kappa w_j}{\varepsilon} R_j^{<}\Bigr)(t) \le \sum_{j \in Q^O(t)} \frac{8(1 + \varepsilon)^3 \kappa w_j}{\varepsilon} = \frac{8(1 + \varepsilon)^3 \kappa}{\varepsilon}\, \frac{d}{dt}\mathrm{OPT}(t).
\]

Combining the above inequalities gives us the following:
\[
  \frac{d}{dt}(A + \Phi)(t) = \sum_{j \in Q^A(t)} \Bigl(\frac{d}{dt} A_j(t) + \frac{d}{dt} \Phi^A_j(t)\Bigr) - \sum_{j \in Q^O(t)} \frac{d}{dt} \Phi^O_j(t)
  \le \kappa W^*(t) - \kappa W^*(t) + \frac{8(1 + \varepsilon)^2 \kappa}{\varepsilon}\, \frac{d}{dt}\mathrm{OPT}(t) + \frac{8(1 + \varepsilon)^3 \kappa}{\varepsilon}\, \frac{d}{dt}\mathrm{OPT}(t)
  \le \frac{16(1 + \varepsilon)^3 \kappa}{\varepsilon}\, \frac{d}{dt}\mathrm{OPT}(t).
\]
Therefore, the total increase to A + Φ due to continuous processing of jobs is at most
\[
  (12) \qquad \int_{t \ge 0} \frac{16(1 + \varepsilon)^3 \kappa}{\varepsilon}\, \frac{d}{dt}\mathrm{OPT}(t)\, dt = O\Bigl(\frac{1}{\varepsilon^3}\Bigr)\mathrm{OPT}.
\]

C.1 Completing the proof. We may now complete the proof of Theorem 5.1. By summing the increases to A + Φ from (11) and (12), we see
\[
  (13) \qquad A \le O\Bigl(\frac{1}{\varepsilon^3}\Bigr)\mathrm{OPT}.
\]
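The analysis above treats KCS's per-instant selection abstractly through the knapsack instance K(t) and an FPTAS for it. Purely as an illustration, the Python sketch below shows the standard value-scaling knapsack FPTAS in the spirit of Ibarra and Kim [25] and Lawler [29]: item sizes would be job heights, the capacity would be 1 + ε, and item values would be whatever K(t) prescribes (which we do not reproduce here). The function name, interface, and toy numbers are our own assumptions, not the paper's construction.

    # Illustrative value-scaling knapsack FPTAS (in the spirit of [25, 29]).
    # KCS would call something like this on K(t): one item per queued job,
    # item size = the job's height, knapsack capacity = 1 + eps, and item
    # value as prescribed by K(t).  Interface and names are assumptions.

    def knapsack_fptas(values, sizes, capacity, eps):
        """Return (chosen_indices, total_value) with total size <= capacity
        and total value >= (1 - eps) * optimum, via the classic value-scaling DP."""
        usable = [i for i in range(len(values)) if sizes[i] <= capacity and values[i] > 0]
        if not usable:
            return [], 0.0

        vmax = max(values[i] for i in usable)
        mu = eps * vmax / len(usable)                     # value scaling factor
        sval = {i: int(values[i] // mu) for i in usable}  # scaled integer values
        vbound = sum(sval.values())

        INF = float("inf")
        # dp[k][v] = minimum total size of a subset of the first k usable items
        #            whose scaled value is exactly v.
        dp = [[INF] * (vbound + 1) for _ in range(len(usable) + 1)]
        dp[0][0] = 0.0
        for k, i in enumerate(usable, start=1):
            for v in range(vbound + 1):
                dp[k][v] = dp[k - 1][v]                        # skip item i
                if v >= sval[i]:
                    take = dp[k - 1][v - sval[i]] + sizes[i]   # take item i
                    if take < dp[k][v]:
                        dp[k][v] = take

        best = max(v for v in range(vbound + 1) if dp[len(usable)][v] <= capacity)

        chosen, v = [], best
        for k in range(len(usable), 0, -1):            # walk the table backwards
            i = usable[k - 1]
            if dp[k][v] == dp[k - 1][v]:               # item i was not needed here
                continue
            chosen.append(i)
            v -= sval[i]
        return chosen, sum(values[i] for i in chosen)

    # Example: four queued jobs, heights as sizes, capacity 1 + eps.
    if __name__ == "__main__":
        heights = [0.6, 0.5, 0.3, 0.2]
        values = [10.0, 9.0, 4.0, 3.0]
        print(knapsack_fptas(values, heights, capacity=1.1, eps=0.1))

On the toy instance the sketch selects the first two jobs (total height 1.1, value 19), matching the exact optimum; in general it only guarantees a (1 − ε) fraction of the optimal value, which is exactly the slack absorbed by the (1 − ε/2) factor in the proof of Lemma 5.2.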
D  Extensions to Multiple Machines

Here, we describe how our results can be extended to work with multiple unrelated machines. In the unrelated machines variant of the CFT problem, we have m machines. Job j has weight w_{ℓj}, processing time p_{ℓj}, and height h_{ℓj} if it is assigned to machine ℓ. At each point in time, the server may schedule an arbitrary number of jobs assigned to each machine as long as the sum of their heights does not exceed 1. For each machine ℓ and job j, we assume either h_{ℓj} ≤ 1 or p_{ℓj} = ∞.

The algorithms in this paper can be extended easily to the unrelated machines model. These extensions are non-migratory, meaning a job is fully processed on a single machine. In fact, our extensions are immediate dispatch, meaning each job is assigned to a machine upon its arrival. Let A be one of the algorithms given above. In order to extend A, we simply apply an idea of Chadha et al. [17] also used by Im and Moseley [26]. Namely, we run A on each machine independently and assign jobs to the machine that causes the least increase in A's potential function.

A more precise explanation follows. We use all the definitions given for A and its analysis. Let Q^A_ℓ(t) be the set of alive jobs assigned to machine ℓ at time t. Upon job j's arrival, assign job j to the machine ℓ that causes the least increase in Σ_{k∈Q^A_ℓ(t)} w_{ℓk}(R_k + p^A_k), where each term R_k and p_k is modified to consider only jobs assigned to ℓ and their processing times and heights on ℓ. Each machine ℓ independently runs A on the jobs to which it is assigned, using processing times, weights, and heights as given for that machine. The extensions to each algorithm A given above have the same competitive ratios as their single machine counterparts, given the amount of resource augmentation required in the single machine model for A. (A code sketch of this dispatch rule appears at the end of this appendix.)

D.1 Analysis. Here, we sketch the analysis of our extension for multiple unrelated machines. Our analysis is essentially the same as the single machine case, except the terms in the potential functions are slightly changed. Let A be the algorithm we are extending. We begin by assuming the optimal schedule is non-migratory and immediate dispatch; whenever a job j arrives, it is immediately assigned to a machine ℓ and all processing for j is performed on ℓ. For each term R_j, V_j^>, Y_j, and F_j, we change its definition to only include contributions by jobs scheduled on the machine chosen for job j by our extension to A. For R_j^<, we change its definition to only include contributions by jobs scheduled on the machine chosen for job j by the optimal schedule. For example, if A is HRDF, then let p^A_j(t) be the remaining processing time of j on machine ℓ. Let d^A_j(t) = w_{ℓj}/(h_{ℓj} p^A_j(t)). Let Q^A_{ℓj}(t) = {k ∈ Q^A_ℓ(t) : d^A_k(t) > d^A_j(t)}. Finally, we have R_j(t) = Σ_{k∈Q^A_{ℓj}(t)} h_{ℓk} p^A_k(t). The definition of the potential function for each algorithm A then remains unchanged.

The analysis for continuous job processing, job completions, and density inversions is exactly the same as that given above for A, as we restrict our attention to a single machine in each case. We only need to consider job arrivals. Let c_j be the coefficient of p^A_j(t) in the potential function for A (for example, c_j = (1 + ε)w_j/ε for HRDF). Recall that in each analysis, the increase to Φ from job arrivals was at most c_j OPT_j. Consider the arrival of job j at time r_j. Let ℓ^A_j be the machine chosen for j by our extension and let ℓ^O_j be the machine chosen by the optimal algorithm. By our extension's choice of machine, the potential function Φ increases by at most c_j p_{ℓ^O_j j} upon job j's arrival. This value is at most c_j OPT_j. Summing over all jobs, we see job arrivals increase Φ by the same amount they did in the earlier analyses.

Finally, we compare against optimal schedules that can be migratory. In this case, we model a migratory optimal schedule as assigning an α_{ℓj} fraction of job j to each machine ℓ (with Σ_ℓ α_{ℓj} = 1) upon job j's arrival. We further adjust our definitions of V_j^> and R_j^< to account for these fractional jobs; V_j^> only includes fractional jobs that are more dense than j, and R_j^< includes contributions by jobs that are more height dense than the fractional job the optimal schedule assigns in place of j. Now, by our extension's choice of machine, the arrival of job j increases Φ by at most Σ_ℓ α_{ℓj} c_j p_{ℓj} ≤ c_j OPT_j. Summing over all jobs completes our analysis of job arrivals. Continuous job processing, job completions, and density inversions work as before.
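The code sketch promised above follows. It evaluates, for an arriving job, the increase in Σ_{k∈Q^A_ℓ(t)} w_{ℓk}(R_k + p^A_k) on each machine and assigns the job to the minimizer. The Job and Machine classes, their fields, and the toy example are assumptions made for this sketch only; the paper specifies only the dispatch rule itself, and the accounting below simply expands the definition of R_k given in the analysis.

    # Illustrative immediate-dispatch rule for the unrelated-machines extension.
    # Class and field names (weight, proc, height, queue, remaining) are
    # assumptions for this sketch; only the dispatch rule comes from the text.
    from dataclasses import dataclass, field
    from typing import Dict, List

    @dataclass
    class Job:
        weight: Dict[int, float]      # w_{l j} per machine id
        proc: Dict[int, float]        # p_{l j} per machine id
        height: Dict[int, float]      # h_{l j} per machine id
        remaining: float = 0.0        # remaining processing time on its machine

    @dataclass
    class Machine:
        id: int
        queue: List[Job] = field(default_factory=list)

    def height_density(w, h, remaining):
        return w / (h * remaining)

    def dispatch(job, machines):
        """Assign an arriving job to the machine l minimizing the increase in
        sum_{k in Q_l} w_{l k} (R_k + p_k), where R_k counts the height*remaining
        volume of strictly more height-dense jobs queued on the same machine."""
        def increase_on(m):
            w, p, h = job.weight[m.id], job.proc[m.id], job.height[m.id]
            if h > 1.0:                        # cannot place job here (p_{l j} = infinity)
                return float("inf")
            d_new = height_density(w, h, p)
            delta = w * p                      # the new job's own p-term
            for k in m.queue:
                wk, hk, rem = k.weight[m.id], k.height[m.id], k.remaining
                if height_density(wk, hk, rem) > d_new:
                    delta += w * hk * rem      # k is denser: it enters the new job's R term
                else:
                    delta += wk * h * p        # otherwise the new job enters k's R term
            return delta

        target = min(machines, key=increase_on)
        job.remaining = job.proc[target.id]
        target.queue.append(job)               # immediate dispatch, non-migratory
        return target

    if __name__ == "__main__":
        m0, m1 = Machine(0), Machine(1)
        j1 = Job(weight={0: 2.0, 1: 2.0}, proc={0: 4.0, 1: 1.0}, height={0: 0.5, 1: 0.5})
        j2 = Job(weight={0: 1.0, 1: 1.0}, proc={0: 1.0, 1: 3.0}, height={0: 0.4, 1: 0.4})
        for j in (j1, j2):
            print("assigned to machine", dispatch(j, [m0, m1]).id)

In the toy run, the first job goes to machine 1 (where it is short) and the second goes to machine 0, because sending it to machine 1 would both take longer there and inflate the residual term of the denser job already queued.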

In fact, weighted flow time subsumes the technologies such as VMware [3] and Xen [13] are stretch metric if we set the weight of a job to be the enabling workloads to be packaged as VMs and reciprocal of its processing time. In this paper we multiple VMs run on the same physical machine [5]. focus on the weighted flow time objective. This packing of VMs on the same machine en- ables server consolidation [6, 7, 15] and data center 1.1 Prior work on minimizing flow time: cost reduction in the form of reduced hardware, Single job at a time model. Traditionally, floorspace, energy, cooling, and maintenance costs. almost all the known flow time results have been This utilization boost and the resulting cost savings for machines that run at most one job at a time. ∗ Portions of this work were done while this author was Consider the following model. A server sees n jobs visiting Google Research. arrive over time. Each job j arrives at time rj
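To pin down the objective described informally above, the following display restates weighted flow time and the stretch special case in standard notation; this is only a restatement of definitions already given in the surrounding text.

```latex
\text{weighted flow time of a schedule with completion times } C_j:\qquad
\sum_{j} w_j\,(C_j - r_j),
\qquad\text{and setting } w_j = 1/p_j \text{ recovers total stretch } \sum_{j}\frac{C_j - r_j}{p_j}.
```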
  • 2. and requires a processing time pj . At each point scalable. Chekuri et al. [18] gave the first speed- in time t, the server may choose to schedule one scalable online algorithms for minimizing flow time job j that has already arrived (rj ≤ t). We say a on multiple machines with unit weights. Becchetti job j has been completed when it has been scheduled et al. [14] gave the first analysis that the simple for pj time units. Preemption is allowed, so job j greedy algorithm Highest Density First (HDF) is may be scheduled, paused, and then later scheduled speed-scalable with arbitrary job weights on a single without any loss of progress. Each job j is given a machine. HDF always schedules the alive job with weight wj . The goal of the scheduler is to minimize the highest weight to processing time ratio. Bussema the total weighted flow time of the system, defined and Torng [16] showed the natural generalization as j wj (Cj −rj ) where Cj is the completion time of of HDF to multiple machines is also speed-scalable job j. For this paper in particular, we are interested with arbitrary job weights. Table 1 gives a summary in the clairvoyant online version of this problem of these results. where the server is not aware of any job j until its 1.2 The capacitated flow time problem. release, but once job j arrives, the server knows pj The above results assume that any machine can and wj . run at most one job at a time. This assumption It is well known that Shortest Remaining Pro- is wasteful and does not take advantage of the cessing Time (SRPT) is the optimal algorithm sharing, simultaneous execution, and virtualization for minimizing flow time with unit weights in the abilities of modern computers. The current trend in above single-job machine model. In fact, SRPT industry is not only to focus on distributing large is the best known algorithm for unit weights on workloads, but also to run multiple jobs/VMs at multiple machines in both the online and offline the same time on individual machines subject to setting. Leonardi and Raz [30] show SRPT has com- overall capacity constraints. By not addressing the petitive/approximation ratio O(log(min {n/m, P })) sharing and virtualization trend, theory is getting where m is the number of machines and P is the ratio out of touch with the industry. of the longest processing time to shortest. They also The problem of scheduling an online sequence give a matching lower bound for any randomized of jobs (VMs) in such scenarios necessitates the online algorithm. Chekuri, Khanna, and Zhu [19] “capacitated machines” variant of the flow time show every randomized online algorithm has com- problem. In addition to each job j having a petitive ratio Ω(min W 1/2 , P 1/2 , (n/m)1/4 ) when processing time and weight, job j also has a height hj the jobs are given arbitrary weights and there are that represents how much of a given resource (e.g. multiple machines. Bansal and Chan [11] partially RAM, network throughput) the job requires to be extended this result to show there is no deterministic scheduled. The server is given a capacity that we online algorithm with constant competitive ratio assume is 1 without loss of generality. At each when the jobs are given arbitrary weights, even point in time, the server may schedule an arbitrary with a single machine. However, there are some number of jobs such that the sum of heights for positive online results known for a single machine jobs scheduled is at most the capacity of the server. 
where the competitive ratio includes a function of We assume all jobs have height at most 1, or some the processing times and weights [12, 19]1 . jobs could never be scheduled. The setting can be The strong lower bounds presented above moti- easily extended to use multiple machines and multi- vate the use of resource augmentation, originally dimensional heights, but even the single-machine, proposed by Kalyanasundaram and Pruhs [28], single-dimension case given above is challenging from which gives an algorithm a fighting chance on worst a technical standpoint. We focus on the single- case scheduling instances when compared to an machine, single-dimension problem first, and later optimal offline algorithm. We say an algorithm show how our results can be extended to multiple is s-speed c-competitive if for any input instance I, machines. We leave further handling of multiple the algorithm’s weighted flow time while running at dimensions as an interesting open problem. speed s is at most c times the optimal schedule’s The case of jobs having sizes related to machine weighted flow time while running at unit speed. capacities has been studied before in the context of Ideally, we would like an algorithm to be constant load-balancing [9, 10] and in operations research and factor competitive when given (1 + ε) times a systems [6,15,31,33]. However, this paper is the first resource over the optimal offline algorithm for any instance we are aware of that studies capacitated constant ε > 0. We call such algorithms [resource]- machines with the objective of minimizing flow time. 1 The result of [19] is actually a semi-online algorithm in We call this the Capacitated Flow Time (CFT) that P is known in advance. problem. As flow time is a fundamental scheduling
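The greedy rule Highest Density First (HDF) mentioned above is easy to state in code. The sketch below is a minimal discrete-time illustration for the single-job-at-a-time model; the Job record and the step granularity are conveniences chosen here for exposition, not something fixed by the paper.

```python
from dataclasses import dataclass

@dataclass
class Job:
    release: float    # r_j
    proc: float       # p_j, original processing time
    weight: float     # w_j
    remaining: float  # remaining processing time

def hdf_step(alive, speed=1.0, dt=1.0):
    """One time step of Highest Density First on a single-job machine:
    run the alive job with the largest weight-to-processing-time ratio."""
    if alive:
        j = max(alive, key=lambda job: job.weight / job.proc)
        j.remaining -= speed * dt
```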
  • 3. Capacitated Weights Machines Competitive Ratio no unit single 1 [folklore] no unit multiple Ω(log n/m + log P ) [30] O(log min {n/m, P }) [30] (1 + ε)-speed O(1/ε)-comp. [18] no arbitrary single ω(1) [11] O(log2 P ) [19, (semi-online)] O(log W ) [12] O(log n + log P ) [12] (1 + ε)-speed O(1/ε)-comp. [14] no arbitrary multiple Ω min W 1/2 , P 1/2 , (n/m)1/4 [19] (1 + ε)-speed O(1/ε2 )-comp. [16] yes unit single (1.2 − ε)-capacity Ω(P ) [this paper] (1.5 − ε)-capacity Ω(log n + log P ) [this paper] yes weighted single (2 − ε)-capacity ω(1) [this paper] yes weighted multiple (2 + ε)-capacity O(1/ε)-comp. [this paper] 1.75-speed O(1)-comp. [this paper] (1 + ε)-speed (1 + ε)-capacity O(1/ε3 )-comp. [this paper] Table 1. A summary of prior and new results in online scheduling to minimize flowtime. Here, W denotes the number of distinct weights given to jobs, and P denotes the ratio of the largest to least processing time. objective, we believe this is an important problem HRDF uses greedy knapsack packing with job to study and bridge the gap to practice. residual densities (defined later). The algorithm and analysis are tight for capacity-only augmentation 1.3 Our results and outline of the paper. given the lower bound in Theorem 2.4. We then turn In Section 2, we show that online scheduling to the question of whether speed augmentation helps in the capacitated machines case is much more beat the (2 − ε)-capacity lower bound. Our initial difficult than scheduling in the single-job model. In attempt and result (not described in this paper) particular, Theorem 2.1 shows there exists no online was an s-speed u-capacity constant-competitive randomized algorithm for the CFT problem with a algorithm for all s and u with s · u ≥ 2 + ε. We bounded competitive ratio, even when all jobs have subsequently obtained an algorithm High Priority unit weight. This result stands in stark contrast Category Scheduling (HPCS) which breaks through to the same situation where SRPT is actually an the 2 bound. HPCS is 1.75-speed O(1)-competitive optimal algorithm for the single job at a time model. and is described in Section 4. Motivated by the difficulty of online scheduling, Our main result is a (1 + ε)-speed (1 + ε)- we consider the use of resource augmentation in capacity O(1/ε3 )-competitive algorithm called Knap- speed and capacity. In a capacity augmentation sack Category Scheduling (KCS). KCS builds on analysis, an algorithm is given capacity greater HPCS and is presented in Section 5. The KCS than 1 while being compared to the optimal schedule result shows the power of combining both speed and on a unit capacity server. We define u-capacity capacity augmentation and is the first of its kind.2 c-competitive accordingly. We also say an algo- Pseudocode descriptions of all our algorithms are rithm is s-speed u-capacity c-competitive if it is c- given in Figures 1, 3, and 4. We conclude with a competitive when given both s-speed and u-capacity brief summary of our results as well as some open over the optimal unit resource schedule. We show problems related to the CFT problem in Section 6. in Corollary 2.2 and Theorem 2.3 that our lower bound holds even when an algorithm is given a small All of our results can be extended to the multiple constant factor extra capacity without additional unrelated capacitated machine setting as detailed speed. In fact, strong lower bounds continue to exist in Appendix D. In the unrelated machines model, even as the augmented capacity approaches 2. 
each machine-job pair is given a weight, processing time, and height unrelated to any other machine-job In Section 3, we propose and analyze a greedy pairs. algorithm Highest Residual Density First (HRDF) for the CFT problem and show it is (2 + ε)-capacity 2 Inpractice, machines are rarely filled to full capacity, so O(1/ε)-competitive for any small constant ε > 0. small augmentations in capacity are not unreasonable.
  • 4. Technical novelty: Our algorithms use novel Suppose case (b) occurs. We release one job combinations and generalizations of knapsacks, job of type II at each time step t = T, T + P, T + densities, and grouping of jobs into categories. While 2P, . . . , L, L T . There are always T /2 · P jobs HRDF uses knapsacks and job densities and HPCS alive in A’s schedule during [T, L]. So A’s total flow uses job densities and categories, KCS combines time is Ω(T · L/P ). The optimal schedule finishes all three. In fact, our algorithms are the first we all jobs of type II during [0, T ]. It has 2T jobs of are aware of that do not make flow time scheduling type I alive at time T , but it finishes these jobs decisions based on a single ordering of jobs. Our during [T, 3T ] while also processing jobs of type II. analysis methods combining potential functions, So, after time 3T , the optimal schedule has no more knapsacks, and categories are the first of their kind alive jobs and its total flow time is O(L). We get and may have other applications. the theorem by setting T ≥ P 2 . 2 Lower Bounds Corollary 2.2. Any randomized online We begin the evaluation of the CFT problem by algorithm A for the CFT problem is (1.2 − ε)- giving lower bounds on the competitive ratio for any capacity Ω(P )-competitive for any constant ε > 0 online scheduling algorithm. Some lower bounds even when all jobs have unit weight. hold even when jobs are given unit weights. Recall P denotes the ratio of the largest processing time to In fact, superconstant lower bounds continue the least processing time. to exist even when an online algorithm’s server is given more than 1.2 capacity. When we stop Theorem 2.1. Any randomized online algorithm A requiring input instances to have unit weights, then for the CFT problem has competitive ratio Ω(P ) the lower bounds become resilient to even larger even when all jobs have unit weight. capacity augmentations. Proof: Garg and Kumar [24] consider a variant on Theorem 2.3. Any randomized online algorithm A the single-job capacity identical machines model they for the CFT problem is (1.5 − ε)-capacity call Parallel Scheduling with Subset Constraints. Ω(log n + log P )-competitive for any constant ε > 0 Our proof very closely follows the proof of [24, even when all jobs have unit weight. Theorem 3.2]. While our proof is a lower bound on any deterministic algorithm, it easily generalizes Proof: There is a simple reduction from online to randomized algorithms with oblivious adversaries. scheduling in the single-job identical machines model Further, the proof continues to hold even with to the capacitated server model. Let I be any input small amounts of capacity augmentation. Fix a instance for the single-job identical machines model deterministic online algorithm A. We use only one where there are two machines and all jobs have machine and all jobs have unit weight. There are unit weight. We create an input instance I using two types of jobs: Type I jobs have unit processing the same jobs as I with each job given height 0.5. time and height 0.4; Type II jobs have processing Any schedule of I is a schedule for I. The lower time P 1 and height 0.6. Let T P be a bound of [30, Theorem 9] immediately implies the large constant. On each time step t = 0, . . . , T − 1, theorem. we release two jobs of type I. At each time step t = 0, P, 2P, . . . , T − 1, we release one job of type Theorem 2.4. No deterministic online II (we assume T − 1 is a multiple of P ). 
The total algorithm A for the CFT problem is constant processing time of all released jobs is 3T . The server competitive even when given (2 − ε)-capacity for can schedule two type I jobs at once, and one type any constant ε > 0 assuming jobs are allowed I job with a type II job, but it cannot schedule two arbitrary weights. type II jobs at once. Therefore, one of the following must occur: (a) at least T /2 jobs of type I remain Proof: We apply a similar reduction to that given alive at time T , or (b) at least T /2 · P jobs of type in the proof of Theorem 2.3. Let I be any input II remain alive at time T . instance for the single-job identical machines model Suppose case (a) occurs. We release 2 jobs of where there is one machine and jobs have arbitrary type I at each time step t = T, . . . , L, L T . The weight. We create an input instance I using the total flow time of A is Ω(T ·L). However, the optimal same jobs as I with each job given height 1. Any schedule finishes all jobs of type I during [0, T ]. It has schedule of I is a schedule for I. The lower only T /P jobs (of type II) waiting at each time step bound of [11, Theorem 2.1] immediately implies during [T, L], so the optimal flow time is O(T · L/P ). the theorem.
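The adversarial instance behind the Omega(P) lower bound of Theorem 2.1 can be generated mechanically. The sketch below follows the release schedule described in the proof; the tuple layout and the phase2_case flag are conveniences introduced here, not part of the paper.

```python
def theorem21_instance(P, T, L, phase2_case):
    """Release schedule from the proof of Theorem 2.1 (a sketch).
    Type I jobs: processing time 1, height 0.4; type II jobs: processing time P,
    height 0.6; all weights are 1.  The adversary picks phase2_case ('a' or 'b')
    at time T depending on which job type the online algorithm left behind."""
    jobs = []  # entries are (release_time, processing_time, height, weight)
    for t in range(T):
        jobs += [(t, 1, 0.4, 1), (t, 1, 0.4, 1)]   # two type I jobs per step
        if t % P == 0:
            jobs.append((t, P, 0.6, 1))            # one type II job every P steps
    if phase2_case == 'a':
        for t in range(T, L + 1):                  # keep flooding with type I jobs
            jobs += [(t, 1, 0.4, 1), (t, 1, 0.4, 1)]
    else:
        for t in range(T, L + 1, P):               # keep releasing type II jobs
            jobs.append((t, P, 0.6, 1))
    return jobs
```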
  • 5. HRDF(I, t): /* QA (t) : Jobs alive at time t. pA (t): Remaining processing time of j */ j 1. For each j ∈ QA (t): dA (t) ← wj /(hj pA (t)) /* Assign residual densities */ j j 2. Sort QA (t) by dA (t) descending and pack jobs from QA (t) up to capacity j Figure 1. The algorithm HRDF. 3 Analysis Warmup: Highest Residual Den- We give the remaining notation for our potential sity First function. Let pO (t) be the remaining processing time j In this section, we analyze the deterministic online of job j at time t in the optimal schedule. Let QO (t) algorithm Highest Residual Density First (HRDF) be the set of alive jobs in the optimal schedule at for the CFT problem. A description of HRDF time t. Let QAj (t) = k ∈ QA (t) : dA (t) > dA (t) k j appears in Figure 1. Despite the simplicity of the be the set of alive jobs in HRDF’s schedule more algorithm, we are able to show the following result dense than job j. Let S A (t) and S O (t) be the set matching the lower bound of Theorem 2.4. of jobs processed at time t by HRDF’s and the optimal schedule respectively. Finally, let W A (t) = Theorem 3.1. HRDF is (2 + ε)-capacity O(1/ε)- j∈QA (t) wj be the total weight of all jobs alive in HRDF’s schedule at time t. The rest of the notation competitive for any constant ε where 0 < ε < 1. we use in our potential function is given in Figure 1 and Table 2. 3.1 Analysis of HRDF. We use a potential The potential function we use has two terms for function analysis to prove Theorem 3.1. Set an each job j: input instance I, and consider running HRDF with 2+ε capacity alongside the optimal schedule without wj ΦA (t) = j Rj (t) + (1 + ε)pA (t) − Vj> (t) j resource augmentation. We begin our analysis by ε A + Yj (t) + Fj (t) defining some notation. For any job j let Cj be the completion time of j in HRDF’s schedule and let Cj O wj < ΦO (t) = j R (t) be the completion time of j in the optimal schedule. ε j A For any time t ≥ rj , let Aj (t) = min Cj , t − rj The potential function is defined as be the accumulated weighted flow time of job j at A time t and let Aj = Aj (Cj ). Finally, let A = j Aj Φ(t) = ΦA (t) − ΦO (t). j j be the total weighted flow time for HRDF’s schedule. j∈QA (t) j∈QO (t) Let OPTj (t), OPTj , and OPT be defined similarly for the optimal schedule. 3.2 Main proof idea: Using extra capacity Our analysis of HRDF builds on the potential to keep up with the optimal. We give full function approach outlined in [27], specifically [17, details of the potential function analysis in Ap- 22, 23], for the single-job at a time problem and pendix A including the effects from job arrivals extends it to the capacitated case. Unlike in [22, 23], and completions, but briefly mention here some our method supports arbitrary job weights and of the continuous changes made to A + Φ due to capacities, and unlike in [17], we do not use the processing from HRDF and the optimal schedule. notion of fractional flow time.3 We give a potential Intuitively, these changes compare S A (t) and S O (t), function Φ that is almost everywhere differentiable the jobs processed at time t by HRDF’s and the such that Φ(0) = Φ(∞) = 0 and then focus optimal schedule respectively. As is the case for on bounding all continuous and discrete increases many potential function arguments, we must show to A + Φ across time by a function of OPT in order that HRDF removes the remaining area of jobs more to bound A by a function of OPT.4 efficiently than the optimal schedule. 
The extra 3 Avoiding the use of fractional flow time allows us to efficiency shows Φ falls fast enough to counteract analyze HRDF using capacity-only augmentation without changes in A. Showing an algorithm’s schedule speed augmentation [17]. It also saves us a factor of Ω(1/ε) removes area more efficiently than the optimal in the competitive ratio for KCS. schedule is often the most difficult part of a potential 4 An analysis of an HRDF-like algorithm for the unrelated single-job machine problem was given in [8] by relaxing an integrality gap of the natural linear program for our setting integer program and applying dual fitting techniques. We is unbounded for the small amounts of resource augmention focus on potential function arguments in this work as the we use in later algorithms.
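Figure 1's pseudocode for HRDF translates directly into a few lines. The sketch below assumes each job carries a weight, a height, and a remaining processing time; "pack up to capacity" is read here as a greedy scan that skips jobs that no longer fit, which is one natural interpretation of the pseudocode. In the analysis, HRDF is given capacity 2 + epsilon while the optimal schedule keeps capacity 1.

```python
from dataclasses import dataclass

@dataclass
class CapJob:
    weight: float     # w_j
    height: float     # h_j, at most the machine capacity
    remaining: float  # p_j^A(t), remaining processing time

def hrdf_step(alive, capacity):
    """One scheduling decision of HRDF: sort alive jobs by residual density
    w_j / (h_j * remaining) and pack them greedily up to the given capacity."""
    order = sorted(alive, key=lambda j: j.weight / (j.height * j.remaining),
                   reverse=True)
    chosen, used = [], 0.0
    for j in order:
        if used + j.height <= capacity:
            chosen.append(j)
            used += j.height
    return chosen
```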
  • 6. Term Definition Explanation Total remaining area of jobs more dense than j. Continuous Rj (t) k∈QAj (t) hk pA (t) k decreases in Rj counteract continuous increases in A. Total remaining area of jobs more dense than j upon j’s < Rj (t) k∈QAj (rj ) hk pA (t) k < arrival. The decrease from Rj cancels the increase to Φ caused by Rj when j arrives. The optimal schedule’s total remaining area for each job k more dense than j upon k’s arrival. The increase to Rj Vj> (t) k∈QO (t)∩QAj (rk ) hk pO (t) k caused by the arrival of a job k is canceled by an equal increase in Vj> . Yj (t) = 0 for all t < rj and Captures work done by the optimal schedule that does not Yj (t) d dt Yj (t) = k∈(S O (t)QAj (rk )) hk for affect Vj> . Used to bound increases in Φ from the removal < all t ≥ rj of Rj terms when jobs are completed. d Fj (t) = 0 for all t < rj , dt Fj (t) = 0 if j ∈/ Captures the increase in flow time by the optimal schedule Fj (t) S (t), and dt Fj (t) = hj k∈QO (t) wk oth- A d wj while our schedule processes j. Used to bound increases erwise. in Φ from the removal of Vj terms when jobs are completed. Table 2. A summary of notation used in our potential functions. Input HRDF OPT 0.8 0.3 time time time Figure 2. A lower bound example for HRDF. Type I jobs appear every time step as green squares with height 0.3 and weight 2 while type II jobs appear at time steps {0, 1} mod 3 as purple rectangles with height 0.8 and weight 1. The process continues until time T 1. Left: The input set of jobs. Center: As type I jobs have higher residual density, HRDF picks one at each time step as it arrives. Right: Optimal schedule uses type II jobs at time steps {0, 1} mod 3 and three type I jobs at time steps 2 mod 3. d function analysis and greatly motivates the other process j. Therefore, dt Rj (t) ≤ −(1 + ε) and algorithms given in this paper. In our potential function, rate of work is primarily captured in the d wj 1+ε A (Rj + (1 + ε)pA )(t) ≤ − j W (t). Rj , pA , Vj , and Yj terms. dt ε ε j j∈QA (t) Lemma 3.2. If HRDF’s schedule uses 2+ε capacity The optimal schedule is limited to scheduling and the optimal schedule uses unit capacity, then jobs with a total height of 1. No job contributes to both Vj> and Yj , so d wj ΦA − j Fj (t) = dt ε d wj 1 j∈QA (t) (−Vj> + Yj )(t) ≤ W A (t). d wj dt ε ε j∈QA (t) (Rj + (1 + ε)pA − Vj> + Yj )(t) j dt ε j∈QA (t) ≤ −W A (t). 4 Beyond Greedy: High Priority Category Proof: Consider any time t. For any job j, Scheduling neither Rj nor (1 + ε)pA can increase due to j The most natural question after seeing the above continuous processing of jobs. If HRDF’s schedule analysis is whether there exists an algorithm that is d processes j at time t, then dt (1 + ε)pA (t) = −(1 + ε). j constant factor competitive when given only speed If HRDF’s schedule does not process j, then HRDF’s augmentation. In particular, a (1+ε)-speed constant schedule processes a set of more dense jobs of height competitive algorithm for any small constant ε would at least (1 + ε). Otherwise, there would be room to be ideal.
  • 7. HPCS(I, t, κ): /* κ is an integer determined in analysis */ 1. For each 1 ≤ i < κ: /* Divide jobs into categories by height */ QA (t) ← j ∈ QA (t) : 1/(i + 1) < hj ≤ 1/i /* Assign alive category i jobs */ i Wi (t) ← j∈QA (t) wj /* Assign queued weight of category i */ i For each j ∈ QA (t): hj ← 1/i/* Assign modified height of job j */ i QA (t) ← j ∈ QA (t) : hj ≤ 1/κ /* X is the small category */ X WX (t) ← j∈QA (t) wj X For each j ∈ QA (t): hj ← hj X 2. For each j ∈ QA (t): dA (t) ← wj /(hj pA (t)) /* Assign densities by modified height */ j j 3. i∗ ← highest queued weight category Sort QA (t) by dA (t) descending and pack jobs from QA (t) up to capacity i∗ j i∗ Figure 3. The algorithms HPCS. Unfortunately, the input sequence given in Theorem 4.1. HPCS is 1.75-speed O(1)- Figure 2 shows that greedily scheduling by residual competitive. density does not suffice. HRDF’s schedule will accumulate two type II jobs every three time steps, 4.2 Analysis of HPCS. We use a potential giving it a total flow time of Ω(T 2 ) by time T . function argument to show the competitiveness of The optimal schedule will have flow time O(T ), HPCS. The main change in our proof is that the giving HRDF a competitive ratio of Ω(T ). Even terms in the potential function only consider jobs if we change HRDF to perform an optimal knapsack within a single category at a time. For any fixed packing by density or add some small constant input instance I, consider running HPCS’s schedule amount of extra speed, the lower bound still holds. at speed 1.75 alongside the optimal schedule at unit speed. For a non-small category i job j, let 4.1 Categorizing jobs. The above example QAj (t) = k ∈ QA (t) : dA (t) > dA (t) be the set i i k j shows that greedily scheduling the high value jobs of alive category i jobs more dense than job j. and ignoring others does not do well. In fact, type II Define QAj (t) similarly. We use the notation in X jobs, though lower in value individually, cannot be Figure 3 along with the notation given in the allowed to build up in the queue forever. To address proof for HRDF, but we use the modified height this issue, we use the notion of categories and group in every case height is used. Further, we restrict jobs of similar heights together. The intuition is the definitions of Rj , Rj , Vj> , Yj , and Fj to < that once a category or group of jobs accumulates include only jobs in the same category as j. For enough weight, it becomes important and should be example, if job j belongs to category i then Rj (t) = scheduled. The algorithm HPCS using this intuition A is given in Figure 3. Note that HPCS schedules from k∈QAj (t) hk pk (t). i one category at a time and packs as many jobs as Let λ be a constant we set later. The potential possible from that category. function we use has two terms for each job j. Within the category HPCS chooses, we can bound the performance of HPCS compared to the ΦA (t) = λκwj Rj (t) + pA (t) − Vj> (t) j j optimal. The only slack then is the ability of + Yj (t) + Fj (t) the optimal schedule to take jobs from multiple ΦO (t) j < = λκwj Rj (t) categories at once. However, using a surprising application of a well known bound for knapsacks, we show that the optimal schedule cannot affect the The potential function is again defined as potential function too much by scheduling jobs from multiple categories. In fact, we are able to show Φ(t) = ΦA (t) − j O Φj (t). 
HPCS is constant competitive with strictly less than j∈QA (t) j∈QO (t) double speed.5 with speed strictly less than 1.75, but we do not have a closed form expression for the minimum speed possible using our 5 We note that we can achieve constant competitiveness techniques.
  • 8. 4.3 Main proof idea: Knapsack upper packing problem. For each non-small category i, bounds on the optimal efficiency. As before, there are i items in K each with value λκW ∗ (t)/i we defer the full analysis for Appendix B, but we and size 1/(i+1). These items represent the optimal mention here the same types of changes mentioned schedule being able to process i jobs at once from in Section 3. Our goal is to show the optimal category i. Knapsack instance K also contains one schedule cannot perform too much work compared item for each category X job j alive in the optimal to HPCS’s schedule even if the optimal schedule schedule with value λκW ∗ (t)hj and size hj . The processes jobs from several categories at once. Our capacity for K is 1. A valid solution to K is a proof uses a well known upper bound argument set U of items of total size at most 1. The value for knapsack packing. Consider any time t and of a solution U is the sum of values for items in U . let W ∗ (t) be the queued weight of the category The optimal solution to K has a value at least as scheduled by HPCS at time t. large as the largest possible increase to Φ due to changes in the Vj and Yj terms. Note that we cannot Lemma 4.2. If HPCS’s schedule uses 1.75 speed pack an item of size 1 and two items of size 1/2 and the optimal schedule uses unit speed, then for in K. Define K1 and K2 to be K with one item of all κ ≥ 200 and λ ≥ 100 size 1 removed and one item of size 1/2 removed respectively. The larger of the optimal solutions d ΦA (t) − λκwj Fj (t) = to K1 and K2 is the optimal solution to K. dt j j∈QA (t) Let the knapsack density of an item be the ratio d of its value to its size. Let v1 , v2 , . . . be the values λκwj (Rj +pA − Vj + Yj )(t) j of the items in K in descending order of knapsack dt j∈QA (t) density. Let s1 , s2 , . . . be the sizes of these same ≤ −κW ∗ (t). items. Greedily pack the items in decreasing order of knapsack density and let k be the index of the There are κ categories, so this fall counteracts the first item that does not fit. Folklore states that increase in A. v1 + v2 + · · · + αvk is an upper bound on the optimal solution to K where α = 1−(s1 +s2sk +···+sk−1 ) . This Proof: Suppose HPCS schedules jobs from small fact applies to K1 and K2 as well. category X. Let j be some job in category X. In all three knapsack instances described, If HPCS’s schedule processes j at time t, then ordering the items by size gives their order by d A dt pj (t) = −1.75. If HPCS’s schedule does not knapsack density. Indeed, items of size 1/i process j at time t, then it does process jobs from have knapsack density λκW ∗ (t)(i + 1)/i. Items category X with height at least 1 − 1/κ. Therefore, created from category X jobs have knapsack d X dt Rj (t) ≤ −1.75(1 − 1/κ). Similar bounds hold density λκW ∗ (t). We may now easily verify if HPCS’s schedule processes jobs from non-small that 1.45λκW ∗ (t) and 1.73λκW ∗ (t) are upper category i, so bounds for the optimal solutions to K1 and K2 respectively. Therefore, d λκwj (Rj + pA )(t) ≤ j d dt λκwj (−Vj + Yj )(t) ≤ 1.73λκW ∗ (t). j∈QA (t) dt i∈QA (t) 1 ∗ − 1.75λ 1 − κW (t). κ Bounding the increases to Φ from changes to 5 Main Algorithm: Knapsack Category the Vj and Yj terms takes more subtlety than before. Scheduling We pessimistically assume every category has queued weight W ∗ (t). 
We also imagine that for each non- As stated in the previous section, a limitation of small category i job j processed by the optimal HPCS is that it chooses jobs from one category schedule, job j has height 1/(i + 1). The optimal at a time, putting it at a disadvantage. In this schedule can still process the same set of jobs with section, we propose and analyze a new algorithm these assumptions, and the changes due to Vj and Yj called Knapsack Category Scheduling (KCS) which terms only increase with these assumptions. uses a fully polynomial-time approximation scheme We now model the changes due to the Vj and Yj (FPTAS) for the knapsack problem [25,29] to choose terms as an instance K to the standard knapsack jobs from multiple categories. Categories still play
  • 9. KCS(I, t, ε): /* ε is any real number with 0 < ε ≤ 1/2 */ 1. For each integer i ≥ 0 with (1 + ε/2)i < 2/ε /* Divide jobs into categories by height */ QA (t) ← {j ∈ QA (t) : 1/(1 + ε/2)i+1 < hj ≤ 1/(1 + ε/2)i } i Wi (t) ← j∈QA (t) wj i For each j ∈ QA (t): hj ← 1/(1 + ε/2)i i QA (t) ← QA (t) ∪i QA (t) /* Category X jobs have height at most ε/2 */ X i WX (t) ← j∈QA (t) wj X For each j ∈ QA (t): hj ← hj X 2. For each j ∈ QA (t): dA (t) ← wj /(hj pA (t)) /* Assign densities by modified height */ j j 3. Initialize knapsack K(t) with capacity (1 + ε)/* Setup a knapsack instance to choose jobs */ For each category i and category X; for each j ∈ QA (t) i QAj (t) ← k ∈ QA (t) : dA (t) > dA (t) /* Assign category i jobs more dense than j */ i i k j Add item j to K(t) with size hj and value wj + hj k∈(QA (t)QAj (t){j}) wk i i 4. Pack jobs from a (1 − ε/2)-approximation for K(t) Figure 4. The algorithm KCS. a key role in KCS, as they help assign values to jobs KCS. Let κ be the number of categories used by when making scheduling decisions. KCS. Observe Detailed pseudocode for KCS is given in Figure 4. Steps 1 and 2 of KCS are similar to those of HPCS, κ ≤ 2 + log1+ε/2 (2/ε) = O(1/ε2 ). except that KCS uses geometrically decreasing heights for job categories.6 Another interesting The potential function has two terms for each job j. aspect of KCS is the choice of values for jobs in step 4. Intuitively, the value for each knapsack item j is 8κwj ΦA (t) = j Rj (t) + pA (t) − Vj> (t) j chosen to reflect the total weight of less dense jobs ε waiting on j’s completion in a greedy scheduling + Yj (t) + Fj (t) of j’s category. More practically, the values are 8κwj < chosen so the optimal knapsack solution will affect ΦO (t) = j Rj (t) ε the potential function Φ as much as possible. KCS combines the best of all three techniques: The potential function is defined as knapsack, densities, and categories, and we are able to show the following surprising result illustrating Φ(t) = ΦA (t) − j ΦO (t). j the power of combining speed and capacity augmen- j∈QA (t) j∈QO (t) tation. Picking categories: Full details of the analysis Theorem 5.1. KCS is (1+ε)-speed (1+ε)-capacity appear in Appendix C, but we mention here the O(1/ε3 )-competitive. same types of changes mentioned in Section 3. The optimal schedule may cause large continuous 5.1 Main proof idea: Combining categories increases to Φ by scheduling jobs from multiple ‘high optimally. We use a potential function analysis weight’ categories. We show that KCS has the option to prove Theorem 5.1. Fix an input instance I and of choosing jobs from these same categories. Because consider running KCS’s schedule with (1 + ε) speed KCS approximates an optimal knapsack packing, and capacity alongside the optimal schedule with KCS’s schedule performs just as much work as the unit resources. We continue to use the same notation optimal schedule. Fix a time t. Let W ∗ (t) be the used for HPCS, but modified when necessary for maximum queued weight of any category in KCS’s schedule at time t. 6 As we shall see in the analysis, knapsacks and capacity augmentation enable KCS to use geometrically decreasing heights, where as HPCS needs harmonically decreasing Lemma 5.2. If KCS’s schedule uses 1 + ε speed heights to pack jobs tightly and get a bound below 2. and capacity and the optimal schedule uses unit
  • 10. speed and capacity, then then there are jobs with higher height density in U with total height at least HX . d 8κwj ΦA − j Fj (t) = dt ε d 8κwj j∈QA (t) (4) Rj + pA (t) ≤ j d 8κwj dt ε j∈QA (t) (Rj +pA − Vj> + Yj )(t) j X dt ε 8κWi (t) j∈QA (t) − HX . ≤ −κW ∗ (t). ε Proof: Suppose the optimal schedule processes Ti Let ∆U be the sum of the right hand sides of (3) jobs from non-small category i which have a total and (4). unmodified height of Hi . Then, Recall the knapsack instance K(t) used by KCS. Let U be a solution to K(t) and d 8κwj (1) −Vj> + Yj (t) ≤ let f (U ) be the value of solution U . Let i dt ε ∆A = d 8κwj (Rj + pA )(t). If KCS j∈QA (t) i j∈Q A (t) dt ε j 8κWi (t) schedules the jobs corresponding to the U , then Ti . ∆A = − 8(1+ε)κ f (U ). Let U ∗ be the optimal i (1 + ε/2)i ε ε solution to K(t) and let ∆∗ = − 8(1+ε)κ f (U ∗ ). ε Let HX be the capacity not used by the optimal Because U as defined above is one potential solution schedule for jobs in non-small categories. We see to K(t) and KCS runs an FPTAS on K(t) to decide which jobs are scheduled, we see d 8κwj (2) −Vj> + Yj (t) ≤ dt ε ∆O + ∆A ≤ ∆O + (1 − ε/2)∆∗ j∈QA (t) X 8κWX (t) 8(1 + ε/4)κ HX . ≤ ∆O − f (U ∗ ) ε ε ≤ ∆O + ∆U − 2κf (U ∗ ) Let ∆O be the sum of the right hand sides of (1) and (2). Note that ∆O may be larger than the actual = −2κf (U ∗ ). rate of increase in Φ due to the Vj and Yj terms. Now, consider the following set of If category X has the maximum queued weight, jobs U ∈ QA (t). For each non-small category i, then one possible solution to K(t) is to greedily set U contains the min |QA (t)|, Ti most dense jobs schedule jobs from category X in descending order i from QA (t). Each of the jobs chosen from QA (t) has of height density. Using previously shown tech- i i height at most 1/(1 + ε/2)i and the optimal schedule niques, we see f (U ∗ ) ≥ W ∗ (t). If a non-small processes Hi height worth of jobs, each with height category i has maximum queued weight, then a at least 1/(1 + ε/2)i+1 . Therefore, the jobs chosen possible solution is to greedily schedule jobs from from QA (t) have total height at most (1 + ε/2)Hi . category i in descending order of height density. i Set U also contains up to HX + ε/2 height worth of Now we see f (U ∗ ) ≥ (1/2)W ∗ (t). Using our bound jobs from QA (t) chosen greedily in decreasing order on ∆O + ∆A , we prove the lemma. X of density. The jobs of U together have height at most i (1 + ε/2)Hi + HX + ε/2 ≤ 1 + ε. 6 Conclusions and Open Problems Now, suppose KCS schedules the jobs from In this paper, we introduce the problem of weighted set U at unit speed or faster. Consider a non-small flow time on capacitated machines and give the first category i. Observe Ti ≤ (1 + ε/2)i . Using similar upper and lower bounds for the problem. Our main techniques to those given earlier, we observe result is a (1 + ε)-speed (1 + ε)-capacity O(1/ε3 )- d 8κwj competitive algorithm that can be extended to the (3) Rj + pA (t) ≤ j dt ε unrelated multiple machines setting. However, it i j∈QA (t) i remains an open problem whether (1 + ε)-speed and 8κWi (t) unit capacity suffices to be constant competitive −Ti . i (1 + ε/2)i ε for any ε > 0. Further, the multi-dimensional case remains open, where jobs can be scheduled Now, consider any job j from small category X. simulateously given that no single dimension of the d If j ∈ U , then dt pA (t) =≤ −1 ≤ −HX . If j ∈ U , j / server’s capacity is overloaded.
  • 11. Acknowledgements. The authors would like to [15] N. Bobroff, A. Kochut, and K. Beaty. Dynamic thank Chandra Chekuri, Sungjin Im, Benjamin placement of virtual machines for managing sla Moseley, Aranyak Mehta, Gagan Aggarwal, and violations. Proc. 10th Ann. IEEE Symp. Integrated the anonymous reviewers for helpful discussions Management (IM), pp. 119 – 128. 2007. and comments on an earlier version of the paper. [16] C. Bussema and E. Torng. Greedy multiprocessor Research by the first author is supported in part by server scheduling. Oper. Res. Lett. 34(4):451–458, 2006. the Department of Energy Office of Science Graduate [17] J. Chadha, N. Garg, A. Kumar, and V. N. Mu- Fellowship Program (DOE SCGF), made possible in ralidhara. A competitive algorithm for minimizing part by the American Recovery and Reinvestment weighted flow time on unrelated machines with Act of 2009, administered by ORISE-ORAU under speed augmentation. Proc. 41st Ann. ACM Symp. contract no. DE-AC05-06OR23100. Theory Comput., pp. 679–684. 2009. References [18] C. Chekuri, A. Goel, S. Khanna, and A. Kumar. [1] Amazon elastic compute cloud (amazon ec2). http: Multi-processor scheduling to minimize flow time //aws.amazon.com/ec2/ . with epsilon resource augmentation. Proc. 36th [2] Google appengine cloud. http://appengine.google. Ann. ACM Symp. Theory Comput., pp. 363–372. com/ . 2004. [3] Vmware. http://www.vmware.com/ . [19] C. Chekuri, S. Khanna, and A. Zhu. Algorithms [4] Vmware cloud computing solutions for minimizing weighted flow time. Proc. 33rd Ann. (public/private/hybrid). http://www.vmware. ACM Symp. Theory Comput., pp. 84–93. 2001. com/solutions/cloud-computing/index.html . [20] V. Chudnovsky, R. Rifaat, J. Hellerstein, B. Sharma, [5] Vmware distributed resource scheduler. http:// and C. Das. Modeling and synthesizing task www.vmware.com/products/drs/overview.html . placement constraints in google compute clusters. Symp. on Cloud Computing. 2011. [6] Vmware server consolidation solutions. http: //www.vmware.com/solutions/consolidation/ [21] J. Dean and S. Ghemawat. Mapreduce: Simplified index.html . data processing on large clusters. Proc. 6th Symp. [7] Server consolidation using VMware ESX Server, on Operating System Design and Implementation. 2005. http://www.redbooks.ibm.com/redpapers/ 2004. pdfs/redp3939.pdf . [22] K. Fox. Online scheduling on identical machines [8] S. Anand, N. Garg, and A. Kumar. Resource using SRPT. Master’s thesis, University of Illinois, augmentation for weighted flow-time explained by Urbana-Champaign, 2010. dual fitting. Proc. 23rd Ann. ACM-SIAM Symp. [23] K. Fox and B. Moseley. Online scheduling on Discrete Algorithms, pp. 1228–1241. 2012. identical machines using SRPT. Proc. 22nd ACM- [9] B. Awerbuch, Y. Azar, S. Plotkin, and O. Waarts. SIAM Symp. Discrete Algorithms, pp. 120–128. Competitive routing of virtual circuits with un- 2011. known duration. Proc. 5th Ann. ACM-SIAM Symp. [24] N. Garg and A. Kumar. Minimizing average flow- Discrete Algorithms, pp. 321–327. 1994. time : Upper and lower bounds. Proc. 48th Ann. [10] Y. Azar, B. Kalyanasundaram, S. Plotkin, K. Pruhs, IEEE Symp. Found. Comput. Sci., pp. 603–613. and O. Waarts. On-line load balancing of temporary 2007. tasks. Proc. 3rd Workshop on Algorithms and Data [25] H. Ibarra and E. Kim. Fast approximation algo- Structures, pp. 119–130. 1993. rithms for the knapsack and sum of subset problems. [11] N. Bansal and H.-L. Chan. Weighted flow time does J. ACM 22(4):463–468. ACM, Oct. 1975. 
not admit O(1)-competitive algorithms. Proc. 20th [26] S. Im and B. Moseley. An online scalable algorithm Ann. ACM-SIAM Symp. Discrete Algorithms, pp. for minimizing lk-norms of weighted flow time on 1238–1244. 2009. unrelated machines. Proc. 22nd ACM-SIAM Symp. [12] N. Bansal and K. Dhamdhere. Minimizing weighted Discrete Algorithms, pp. 95–108. 2011. flow time. ACM Trans. Algorithms 3(4):39:1–39:14, [27] S. Im, B. Moseley, and K. Pruhs. A tutorial on 2007. amortized local competitiveness in online scheduling. [13] P. Barham, B. Dragovic, K. Fraser, S. Hand, SIGACT News 42(2):83–97. ACM, June 2011. T. Harris, A. Ho, R. Neuge-bauer, I. Pratt, and [28] B. Kalyanasundaram and K. Pruhs. Speed is as A. Warfield. Xen and the art of virtualization. powerful as clairvoyance. Proc. 36th Ann. IEEE Proc. 19th Symp. on Operating Systems Principles Symp. on Found. Comput. Sci., pp. 214–221. 1995. (SOSP), pp. 164–177. 2003. [29] E. Lawler. Fast approximation algorithms for the [14] L. Becchetti, S. Leonardi, A. Marchetti-Spaccamela, knapsack problems. Mathematics of Operations and K. Pruhs. Online weighted flow time and dead- Research 4(4):339–356, 1979. line scheduling. Proc. 5th International Workshop [30] S. Leonardi and D. Raz. Approximating total flow on Randomization and Approximation, pp. 36–47. time on parallel machines. J. Comput. Syst. Sci. 2001. 73(6):875–891, 2007.
  • 12. [31] S. Martello and P. Toth. Knapsack Problems: • If HRDF’s schedule processes job j at time t, Algorithms and Computer Implementations. John then Wiley, 1990. [32] A. Mishra, J. Hellerstein, W. Cirne, and C. Das. d wj hj wj wk Fj (t) = Towards characterizing cloud backend workloads: dt ε ε wj k∈QO (t) Insights from Google compute clusters, 2010. hj d [33] A. Singh, M. Korupolu, and D. Mohapatra. = OPT(t). Server-storage virtualization: Integration and load- ε dt balancing in data centers. Proc. IEEE-ACM Recall S A (t) is the set of jobs processed by Supercomputing Conference (SC), pp. 53:1–53:12. 2008. HRDF’s schedule at time t. Observe that [34] Q. Zhang, J. Hellerstein, and R. Boutaba. Char- hj ≤ (2 + ε). acterizing task usage shapes in Google compute j∈S A (t) clusters. Proc. 5th International Workshop on Large Scale Distributed Systems and Middleware. 2011. Therefore, d wj 2+ε d Fj (t) ≤ OPT(t). Appendix j∈S A (t) dt ε ε dt A Analysis of HRDF • Finally, In this appendix, we complete the analysis of HRDF. We use the definitions and potential function as d wj < (2 + ε)wj − (Rj )(t) ≤ given in Section 3. Consider the changes that occur dt ε ε j∈QO (t) j∈QO (t) to A + Φ over time. The events we need to consider 2+ε d follow: = OPT(t) ε dt Job arrivals: Consider the arrival of job j at Combining the above inequalities gives us the time rj . The potential function Φ gains the terms ΦA j following: and −ΦO . We see j (1 + ε)wj d d d ΦA (rj ) − ΦO (rj ) = j j pj (A + Φ)(t) = Aj (t) + ΦA (t) ε dt dt dt j j∈QA (t) 1+ε ≤ OPTj d O ε − Φ (t) dt j j∈QO (t) For any term ΦA with k = j, job j may only k > contribute to Rk and −Vk , and its contributions to 2+ε d ≤ W A (t) − W A (t) + OPT(t) the two terms cancel. Further, job j has no effect on ε dt any term ΦO where k = j. All together, job arrivals 2+ε d k + OPT(t) increase A + Φ by at most ε dt 4 + 2ε d 1+ε = OPT(t) (5) OPT. ε dt ε Therefore, the total increase to A + Φ due to continuous processing of jobs is at most Continuous job processing: Consider any time t. 4 + 2ε d 4 + 2ε (6) OPT(t)dt = OPT. We have the following: t≥0 ε dt ε • By Lemma 3.2, d wj Job completions: The completion of job j in (Rj + (1 + ε)pA − Vj> + Yj )(t) ≤ j dt ε HRDF’s schedule increases Φ by j∈QA (t) wj − W A (t). −ΦA (Cj ) = j A Vj> (Cj ) − Yj (Cj ) − Fj (Cj ) . A A A ε
  • 13. Similarly, the completion of job j by the optimal A.1 Completing the proof. We may now schedule increases Φ by complete the proof of Theorem 3.1. By summing wj the increases to A + Φ from (5) and (6), we see −ΦO (Cj ) = j O < O Rj (Cj ) . ε 5 + 3ε 1 We argue that the negative terms from these (7) A≤ OPT = O OPT. ε ε increases cancel the positive terms. Fix a job j and consider any job k contributing to Vj> (Cj ). By A B Analysis of HPCS > definition of Vj , job k arrives after job j and is In this appendix, we complete the analysis of HPCS. more height dense than job j upon its arrival. We We use the definitions and potential function as see given in Section 4. Consider the changes that occur wk wj hj pA (rk )wk j to A + Φ over time. As is the case for HRDF, > A (r ) =⇒ > hk pk . job completions and density inversions cause no hk pk hj pj k wj overall increase to A + Φ. The analyses for these Job A k is alive to contribute a total changes are essentially the same as those given in hj pj (rk )wk of wj > hk pk to Fj during times HRDF’s Appendix A, except the constants may change. The schedule processes job j. The decrease in Φ of k’s remaining events we need to consider follow: A contribution to Fj (Cj ) negates the increase caused by k’s contribution to Vj> (Cj ). A Job arrivals: Consider the arrival of job j at < O Now, consider job k contributing to Rj (Cj ). By time rj . The potential function Φ gains the terms ΦA j definition we know k arrives before job j and that and −ΦO . We see j job j is less height dense than job k at the time j ΦA (rj ) − ΦO (rj ) = λκwj pj j j arrives. We see wk wj ≤ λκOPTj . > =⇒ wk hj pj > wj hk pA (rj ). k hk pA (rj ) k hj pj For any term ΦA with k = j, job j may only k > Job j does not contribute to > Vk , because it is less contribute to Rk and −Vk , and its contributions to height dense than job k upon its arrival. Further, the two terms cancel. Further, job j has no effect on job k is alive in HRDF’s schedule when the optimal any term ΦO where k = j. All together, job arrivals k schedule completes job j. Therefore, all processing increase A + Φ by at most done to j by the optimal schedule contributes (8) λκOPT. A to Yk . So, removing job j’s contribution to Yk (Ck ) wk wj A causes Φ to decrease by ε hj pj > ε hk pk (rj ), an upper bound on the amount of increase caused < O by removing k’s contribution to −Rj (Cj ). We Continuous job processing: Consider any time t. conclude that job completions cause no overall We have the following: increase in A + Φ. • By Lemma 4.2, d Density inversions: The final change we need λκwj (Rj +pA −Vj +Yj )(t) ≤ −κW ∗ (t) j to consider for Φ comes from HRDF’s schedule dt j∈QA (t) changing the ordering of the most height dense jobs. if κ ≥ 200 and λ ≥ 100. The Rj terms are the only ones changed by these events. Suppose HRDF’s schedule causes some job j • Note j∈S A (t) hj ≤ 1. Therefore, to become less height dense than another job k at time t. Residual density changes are continuous, so d wk λκwj Fj (t) = λκhj wj wj wk dt wj j∈S A (t) j∈S A (t) k∈QO (t) = =⇒ wj hk pA (t) = wk hj pA (t). k j hj pA (t) j hk pA (t) k d ≤ λκ OPT(t). At time t, job k starts to contribute to Rj , but dt job j stops contributing to Rk . Therefore, the • Observe: change to Φ from the inversion is d < wj wk − λκwj (Rj )(t) ≤ 1.75λκwj hk pA (t) − k hj pA (t) = 0. j O dt j∈QO (t) ε ε j∈Q (t) We conclude that these events cause no overall d = 1.75λκ OPT(t) increase in A + Φ. dt
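The displayed bound labeled (7) in Appendix A.1 is hard to read in this extraction; reconstructed from the surrounding sums of (5) and (6), it states:

```latex
(7)\qquad A \;\le\; \frac{5+3\varepsilon}{\varepsilon}\,\mathrm{OPT} \;=\; O\!\left(\frac{1}{\varepsilon}\right)\mathrm{OPT}.
```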
  • 14. Combining the above inequalities gives us the • By Lemma 5.2, following: d 8κwj (Rj +pA −Vj> +Yj )(t) ≤ −κW ∗ (t). j dt ε d d d j∈QA (t) (A + Φ)(t) = Aj (t) + ΦA (t) dt dt dt j j∈QA (t) • For any job j in non-small category i, we d O have hj ≤ (1 + ε/2)hj by definition. Also, − Φ (t) dt j j∈S A (t) hj ≤ 1 + ε. Therefore, j∈QO (t) d ≤ κW ∗ (t) − κW ∗ (t) + λκ OPT(t) d 8κwj 8κhj wj wk dt Fj (t) = d dt ε ε wj i∈S A (t) i∈S A (t) + 1.75λκ OPT(t) dt 8(1 + ε)2 κ d d ≤ OPT(t). ≤ 2.75λκ OPT(t) ε dt dt Therefore, the total increase to A + Φ due to • The number of non-small category i jobs pro- continuous processing of jobs is at most cessed by KCS’s schedule is at most (1 + ε)2 /hj . d Therefore, (9) 2.75λκ OPT(t)dt = 2.75λκOPT. t≥0 dt d 8κwj < 8(1 + ε)3 κwj − (Rj )(t) ≤ B.1 Completing the proof. We may now dt ε ε i∈QO (t) i∈QO (t) complete the proof of Theorem 4.1. Set κ = 200 8(1 + ε)3 κ d and λ = 100. By summing the increases to A + Φ = OPT(t) ε dt from (8) and (9), we see (10) A ≤ 3.75λκOPT = O(OPT). Combining the above inequalities gives us the following: C Analysis of KCS In this appendix, we complete the analysis of KCS. d d d We use the definitions and potential function as (A + Φ)(t) = Aj (t) + ΦA (t) given in Section 5. Consider the changes that occur dt dt dt j j∈QA (t) to A + Φ over time. As is the case for HRDF, d O job completions and density inversions cause no − Φ (t) dt j overall increase to A + Φ. The analyses for these j∈QO (t) changes are essentially the same as those given in ≤ κW (t) − κW ∗ (t) ∗ Appendix A, except the constants may change. The 8(1 + ε)2 κ d remaining events we need to consider follow: + OPT(t) ε dt Job arrivals: Consider the arrival of job j at 8(1 + ε)3 κ d + OPT(t) time rj . The potential function Φ gains the terms ΦA j ε dt and −ΦO . We see 16(1 + ε)3 κ d j ≤ OPT(t) ε dt 8κwj ΦA (rj ) − ΦO (rj ) = j j pj Therefore, the total increase to A + Φ due to ε 1 continuous processing of jobs is at most ≤O OPTj . (12) ε3 16(1 + ε)3 κ d 1 Again, the changes to other terms cancel. All OPT(t)dt = O OPT t≥0 ε dt ε3 together, job arrivals increase A + Φ by at most 1 C.1 Completing the proof. We may now (11) O OPT. ε3 complete the proof of Theorem 5.1. By summing the increases to A + Φ from (11) and (12), we see Continuous job processing: Consider any time t. 1 (13) A≤O OPT. We have the following: ε3
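Likewise, the final bounds (10) and (13) for HPCS and KCS, reconstructed from the surrounding text (with kappa = 200 and lambda = 100 for HPCS), read:

```latex
(10)\qquad A \;\le\; 3.75\,\lambda\kappa\,\mathrm{OPT} \;=\; O(\mathrm{OPT}),
\qquad\qquad
(13)\qquad A \;\le\; O\!\left(\frac{1}{\varepsilon^{3}}\right)\mathrm{OPT}.
```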
  • 15. D Extensions to Multiple Machines scheduled on the machine chosen for job j by the optimal schedule. For example, if A is HRDF, then Here, we describe how our results can be extended let pA (t) be the remaining processing time of j j to work with multiple unrelated machines. In the unrelated machines variant of the CFT problem, we on machine . Let dA (t) = w j /(h j pA (t)). Let j j have m machines. Job j has weight w j , processing QAj (t) = k ∈ QA (t) : dA (t) > dA (t) k j Finally, we time p j , and height h j if it is assigned to machine . have Rj (t) = k∈QAj (t) h k pA (t). The definition k At each point in time, the server may schedule an of the potential function for each algorithm A then arbitrary number of jobs assigned to each machine remains unchanged. as long as the sum of their heights does not exceed 1. For each machine and job j, we assume either h j ≤ The analysis for continuous job processing, job 1 or p j = ∞. completitions, and density inversions is exactly the same as those given above for A as we restrict our The algorithms in this paper can be extended attention to a single machine in each case. We only easily to the unrelated machines model. These need to consider job arrivals. Let cj be the coefficient extensions are non-migratory, meaning a job is of pA (t) in the potential function for A (for example, j fully processed on a single machine. In fact, our cj = (1 + ε)wj /ε for HRDF). Recall that in each extensions are immediate dispatch, meaning each analysis, the increase to Φ from job arrivals was job is assigned to a machine upon its arrival. Let A at most cj OPTj . Consider the arrival of job j at be one of the algorithms given above. In order time rj . Let A be the machine chosen for j by to extend A, we simply apply an idea of Chadha j et al. [17] also used by Im and Moseley [26]. Namely, our extension and let O be the machine chosen by j we run A on each machine independently and assign the optimal algorithm. By our extension’s choice jobs to the machine that causes the least increase of machine, the potential function Φ increases by in A’s potential function. cj p O j upon job j’s arrival. This value is at most j A more precise explanation follows. We use cj OPTj . Summing over all jobs, we see job arrivals all the definitions given for A and its analysis. increase Φ by the same amount they did in the Let QA (t) be the set of alive jobs assigned to earlier analyses. machine at time t. Upon job j’s arrival, assign Finally, we compare against optimal schedules job j to the machine that causes the least increase that can be migratory. In this case, we model in A a migratory optimal schedule as assigning an α j k∈QA (t) w k (Rk + pk ) where each term Rk A fraction of job j to each machine ( α i = 1) upon and pk is modified to consider only jobs assigned job j’s arrival. We further adjust our definitions of to and their processing times and heights on Vj> and Rj to account for these fractional jobs; Yj> < . Each machine independently runs A on the only includes fractional jobs that are more dense jobs to which it is assigned using processing times, < than j and Rj includes contributions by jobs that weights, and heights as given for that machine. are more height dense than the fractional job the The extensions to each algorithm A given above optimal schedule assigns in place of j. 
Now by our have the same competitive ratios as their single extension’s choice in machine, the arrival of job j machine counterparts, given the amount of resource increases Φ by α j cj p j ≤ cj OPTj . Summing augmentation required in the single machine model over all jobs completes our analysis of job arrivals. for A. Continous job processing, job completions, and D.1 Analysis. Here, we sketch the analysis of density inversions work as before. our extension for multiple unrelated machines. Our analysis is essentially the same as the single machine case, except the terms in the potential functions are slightly changed. Let A be the algorithm we are extending. We begin by assuming the optimal schedule is non-migratory and immediate dispatch; whenever a job j arrives, it is immediately assigned to a machine and all processing for j is performed on . For each term Rj , Vj> , Yj , and Fj , we change its definition to only include contributions by jobs scheduled on the machine chosen for job j by our extension to A. For Rj , we change its < definition to only include contributions by jobs
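The immediate-dispatch rule used in the unrelated-machines extension is simple to phrase in code. In the sketch below, potential_increase is a caller-supplied placeholder for the per-machine increase in the chosen algorithm's potential function (the c_j-weighted term discussed above); neither it nor the list-based machine model is an API from the paper.

```python
def dispatch(job, machines, potential_increase):
    """Immediate dispatch for the unrelated-machines extension: on arrival,
    assign the job to the machine whose per-machine potential function would
    grow the least, then let the single-machine algorithm run on each machine
    independently."""
    best = min(machines, key=lambda m: potential_increase(m, job))
    best.append(job)   # machines are modeled here as lists of assigned jobs
    return best
```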