Presentation on miscreant jobs in HTCondor, presented at HTCondor Week 2013, showing how to reduce the number of bad jobs run and increase the chances of good jobs completing quickly.
1. Making HTCondor Energy Efficient by identifying miscreant jobs
Stephen McGough, Matthew Forshaw & Clive Gerrard, Newcastle University
Stuart Wheater, Arjuna Technologies Limited
2. Task lifecycle
(Talk outline: Motivation, Policy and Simulation, Conclusion)
[Timeline figure: an HTC user submits a task; after resource selection it runs until evicted by an interactive user logging in or a computer reboot, each eviction followed by a new resource selection, until eventual task completion.]
• Good / Bad
• 'Miscreant': has been evicted, but we don't know yet whether it's good or bad
3. Motivation
• We have run a high-throughput cluster for ~6 years
  – Allowing many researchers to perform more work, quicker
• The University has a strong desire to reduce energy consumption and CO2 production
  – Currently powering down computers & buying low-power PCs
  – "If a computer is not 'working' it should be powered down"
• Can we go further to reduce wasted energy?
  – Reduce the time computers spend running work which does not complete
  – Prevent re-submission of 'bad' jobs
  – Reduce the number of resubmissions for 'good' jobs
• Aims
  – Investigate policy for reducing energy consumption
  – Determine the impact on high-throughput users
4. Can we fix the number of retries?
[Figure: cumulative wasted seconds (millions) against number of evictions, for good jobs and bad jobs.]
• ~57 years of computing time during 2010
• ~39 years of wasted time
  – ~27 years for 'bad' tasks: average 45 retries, max 1,946 retries
  – ~12 years for 'good' tasks: average 1.38 retries, max 360 retries
• 100% 'good' task completion -> 360 retries
  – Still wastes ~13 years on 'bad' tasks
• 95% 'good' task completion -> 3 retries: 9,808 good tasks killed (3.32%)
• 99% 'good' task completion -> 6 retries: 2,022 good tasks killed (0.68%)
5. Can we make tasks short enough?
[Figure: cumulative free seconds against idle time (minutes).]
• Make tasks short enough to reduce miscreants
• Average idle interval: 371 minutes
• But to ensure availability of intervals:
  – 95%: need to reduce the time limit to 2 minutes
  – 99%: need to reduce the time limit to 1 minute
• Impractical to make tasks this short
6. Cluster Simulation
• High-level simulation of Condor
  – Trace logs from a twelve-month period are used as input
    • User logins / logouts (computer used)
    • Condor job submission times ('good'/'bad' and duration)

[Embedded text from the accompanying paper:]
end applications are unlikely to migrate to virtual desktops or user-owned devices due to hardware requirements and licensing conditions, so we expect to need to maintain a pool of hardware that will be useful for Condor for some time.

PUE values have been assigned at the cluster level, with values in the range 0.9 to 1.4. These values have not been empirically evaluated but are used here to steer jobs. In most cases the cluster rooms have a low enough computer density not to require cooling, giving these clusters a PUE value of 1.0. However, two clusters are located in rooms that require air conditioning, giving these a PUE of 1.4. Likewise, four clusters are based in a basement room, which is cold all year round; hence computer heat is used to offset heating requirements for the room, giving a PUE value of 0.9.

By default, computers within the cluster will enter the sleep state after a given interval of inactivity. This time depends on whether the cluster is open or not. During open hours computers will remain in the idle state for one hour before entering the sleep state, whilst outside of these hours the idle interval before sleep is reduced to 15 minutes. This policy (P2) was originally trialled under Windows XP, where the time for computers to resume from the shutdown state was considerable (sleep was an unreliable option for our environment). Likewise, the time interval before a Condor job could start using a computer (M1) was set to 15 minutes during cluster opening hours and 0 minutes outside.

Table I: Computer Types
Type      Cores  Speed   Power consumption (Active / Idle / Sleep)
Normal    2      ~3 GHz  57 W / 40 W / 2 W
High End  4      ~3 GHz  114 W / 67 W / 3 W
Legacy    2      ~2 GHz  100-180 W / 50-80 W / 4 W

Figure 3 illustrates the interactive logins for this period, showing the high degree of seasonality within the data. It is easy to distinguish between weeks and weekends, as well as where the three terms lie, along with the vacations. This represents 1,229,820 interactive uses of the computers.

Figure 4 depicts the profile for the 532,467 job submissions made to Condor during this period. As can be seen, the job submissions follow no clearly definable pattern. Note that out of these submissions, 131,909 were later killed by the original Condor user. In order to simulate these killed jobs, the simulation assumes that these will be non-terminating jobs and will keep submitting them to resources until the time at which the high-throughput user terminates them. The graph is clipped on Thursday 03/06/2010, as this date had 93,000 job submissions.

For the simulations we will report the total power consumed (in MWh) for the period. In order to determine the effect of a policy on high-throughput users, we will also report the average overhead observed by jobs submitted to Condor (in seconds), where overhead is defined as the amount of time in excess of the execution duration of the job. Other statistics will be reported as appropriate for particular policies.

[Figure 4: Condor job submission profile]

[State diagram: computers cycle between Active (interactive user / Condor), Idle, and Sleep states, woken via WOL; cycle-stealing users, interactive users, and high-throughput management interact with the machines under the management policy.]
7. n reallocation policies
[Figure: number of good tasks killed against number of retries (n), for each combination of N1-N3 and C1-C3.]
• N1(n): Abandon task if deallocated n times.
• N2(n): Abandon task if deallocated n times, ignoring interactive users.
• N3(n): Abandon task if deallocated n times, ignoring planned machine reboots.
• C1: Tasks allocated to resources at random, favouring awake resources.
• C2: Target less-used computers (longer idle times).
• C3: Tasks are allocated to computers in clusters with the least amount of time used by interactive users.
[Figures: energy consumption (MWh) and overheads on all good tasks against number of retries (n), for each N/C combination.]
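The three abandonment criteria (N1, N2, N3) differ only in which evictions they count. A minimal sketch in Python, assuming illustrative event labels and a `Task` record that are not part of the talk's actual simulator:

```python
from dataclasses import dataclass, field

# Hypothetical eviction causes; the talk distinguishes three reasons
# a task can be deallocated from a machine.
USER_LOGIN = "user_login"  # an interactive user logs in
REBOOT = "reboot"          # a planned machine reboot
OTHER = "other"            # any other eviction

@dataclass
class Task:
    evictions: list = field(default_factory=list)

    def record_eviction(self, cause: str) -> None:
        self.evictions.append(cause)

def should_abandon(task: Task, policy: str, n: int) -> bool:
    """Decide whether a task has been deallocated too often.

    N1(n): count every eviction.
    N2(n): ignore evictions caused by interactive users.
    N3(n): ignore evictions caused by planned machine reboots.
    """
    if policy == "N1":
        count = len(task.evictions)
    elif policy == "N2":
        count = sum(1 for c in task.evictions if c != USER_LOGIN)
    elif policy == "N3":
        count = sum(1 for c in task.evictions if c != REBOOT)
    else:
        raise ValueError(f"unknown policy: {policy}")
    return count >= n

# A task evicted twice by interactive users and once by a reboot:
t = Task()
for cause in (USER_LOGIN, USER_LOGIN, REBOOT):
    t.record_eviction(cause)
```

Under these labels, the example task would be abandoned by N1(3) but kept by N2(3) and N3(3), which is why N2 and N3 kill fewer good tasks at the same n.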
9. Individual Time Policy
• Impose a limit on individual execution time for a task.
  – Nightly reboots bound this to 24 hours.
  – What is the impact of lowering this?
• I1(t): Abandon if individual time > t.
[Figures: energy consumption (MWh), number of good tasks killed, and overheads on all good tasks against individual time (hours), for I1 under C1-C3.]
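I1(t) needs only the elapsed time of the current execution attempt, not the eviction history. A one-line check, sketched under assumed names and units (seconds for the measured run, hours for the threshold, as on the slide's axis):

```python
def should_abandon_individual(current_run_seconds: float, t_hours: float) -> bool:
    """I1(t): abandon a task whose current (individual) execution
    attempt has run for more than t hours. Nightly reboots already
    bound any single attempt to 24 hours; this tightens that bound."""
    return current_run_seconds > t_hours * 3600
```

With t = 8, an attempt that has run 10 hours is abandoned while a 5-hour attempt continues.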
10. Dedicated Resources
D1(m,d): Miscreant tasks are permitted to continue executing on a dedicated set of m resources (without interactive users or reboots), with a maximum duration d.
[Figures: energy consumption (MWh), number of good tasks killed, and overheads on all good tasks against maximum execution duration (hours), for m = 10-40 and n = 10-30.]
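One way to model D1(m, d): a pool of m dedicated slots admits flagged miscreants and abandons any task that accumulates more than d hours of execution there. This is an illustrative sketch, not the paper's simulator; the class and method names are assumptions:

```python
class DedicatedPool:
    """D1(m, d): m dedicated machines (no interactive users, no
    reboots) where miscreant tasks may run for at most d hours."""

    def __init__(self, m: int, d_hours: float):
        self.m = m
        self.d_seconds = d_hours * 3600
        self.running = {}  # task id -> seconds accumulated on the pool

    def try_admit(self, task_id) -> bool:
        """Admit a miscreant task if a dedicated slot is free."""
        if len(self.running) < self.m:
            self.running[task_id] = 0.0
            return True
        return False

    def tick(self, task_id, seconds: float) -> bool:
        """Advance a running task; return False (abandon and free the
        slot) once it exceeds the maximum duration d."""
        self.running[task_id] += seconds
        if self.running[task_id] > self.d_seconds:
            del self.running[task_id]
            return False
        return True
```

Because dedicated slots are scarce, a larger m admits more miscreants at once, while d caps how much energy any single never-completing task can waste, matching the m/d trade-off shown in the plots.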
11. Conclusion
• Simple policies can be used to reduce the effect of miscreant tasks in a multi-use cycle-stealing cluster.
  – N2 (total evictions, ignoring interactive users)
    • Order-of-magnitude reduction in energy consumption
  – Reduce the amount of effort wasted on tasks that will never complete
• Policies may be combined to achieve further improvements.
  – Adding in dedicated computers
12. Questions?
stephen.mcgough@ncl.ac.uk
m.j.forshaw@ncl.ac.uk
More info: McGough, A. Stephen; Forshaw, Matthew; Gerrard, Clive; Wheater, Stuart: "Reducing the Number of Miscreant Tasks Executions in a Multi-use Cluster", Cloud and Green Computing (CGC), 2012.