Presentation on miscreant jobs in HTCondor, presented at HTCondor Week 2013, showing how to reduce the number of bad jobs run and increase the chances of good jobs completing quickly.
1. Making HTCondor Energy Efficient by identifying miscreant jobs
Stephen McGough, Matthew Forshaw & Clive Gerrard, Newcastle University
Stuart Wheater, Arjuna Technologies Limited
2. Task lifecycle
(Talk outline: Motivation, Policy and Simulation, Conclusion)
[Timeline figure: an HTC user submits a task; after resource selection it runs until evicted by an interactive user logging in or a computer reboot, each eviction followed by a new resource selection, until eventual task completion.]
• Good / Bad
• 'Miscreant': has been evicted, but we don't know yet whether it's good or bad
3. Motivation
• We have run a high-throughput cluster for ~6 years
  – Allowing many researchers to perform more work, quicker
• The University has a strong desire to reduce energy consumption and CO2 production
  – Currently powering down computers & buying low-power PCs
  – "If a computer is not 'working' it should be powered down"
• Can we go further to reduce wasted energy?
  – Reduce the time computers spend running work which does not complete
  – Prevent re-submission of 'bad' jobs
  – Reduce the number of resubmissions for 'good' jobs
• Aims
  – Investigate policy for reducing energy consumption
  – Determine the impact on high-throughput users
4. Can we fix the number of retries?
[Figure: cumulative wasted seconds (millions) against number of evictions, for good jobs and bad jobs.]
• ~57 years of computing time during 2010
• ~39 years of wasted time
  – ~27 years for 'bad' tasks: average 45 retries, max 1,946 retries
  – ~12 years for 'good' tasks: average 1.38 retries, max 360 retries
• 100% 'good' task completion -> 360 retries
  – Still wastes ~13 years on 'bad' tasks
• 95% 'good' task completion -> 3 retries: 9,808 good tasks killed (3.32%)
• 99% 'good' task completion -> 6 retries: 2,022 good tasks killed (0.68%)
5. Can we make tasks short enough?
[Figure: cumulative free seconds against idle time (minutes).]
• Make tasks short enough to reduce miscreants
• Average idle interval: 371 minutes
• But to ensure availability of intervals:
  – 95%: need to reduce the time limit to 2 minutes
  – 99%: need to reduce the time limit to 1 minute
• Impractical to make tasks this short
6. Cluster Simulation
• High-level simulation of Condor
  – Trace logs from a twelve-month period are used as input
    • User logins / logouts (computer used)
    • Condor job submission times ('good'/'bad' and duration)

[Embedded text from the accompanying paper:]
end applications are unlikely to migrate to virtual desktops or user-owned devices due to hardware requirements and licensing conditions, so we expect to need to maintain a pool of hardware that will be useful for Condor for some time.

PUE values have been assigned at the cluster level, with values in the range 0.9 to 1.4. These values have not been empirically evaluated but are used here to steer jobs. In most cases the cluster rooms have a low enough computer density not to require cooling, giving these clusters a PUE value of 1.0. However, two clusters are located in rooms that require air conditioning, giving these a PUE of 1.4. Likewise, four clusters are based in a basement room, which is cold all year round; hence computer heat is used to offset heating requirements for the room, giving a PUE value of 0.9.

By default, computers within the cluster will enter the sleep state after a given interval of inactivity. This time depends on whether the cluster is open or not. During open hours computers will remain in the idle state for one hour before entering the sleep state, whilst outside of these hours the idle interval before sleep is reduced to 15 minutes. This policy (P2) was originally trialled under Windows XP, where the time for computers to resume from the shutdown state was considerable (sleep was an unreliable option for our environment). Likewise, the time interval before a Condor job could start using a computer (M1) was set to 15 minutes during cluster opening hours and 0 minutes outside.

Table I: Computer Types
Type      Cores  Speed   Power consumption (Active / Idle / Sleep)
Normal    2      ~3 GHz  57 W / 40 W / 2 W
High End  4      ~3 GHz  114 W / 67 W / 3 W
Legacy    2      ~2 GHz  100-180 W / 50-80 W / 4 W

Figure 3 illustrates the interactive logins for this period, showing the high degree of seasonality within the data. It is easy to distinguish between weeks and weekends, as well as where the three terms lie, along with the vacations. This represents 1,229,820 interactive uses of the computers.

Figure 4 depicts the profile for the 532,467 job submissions made to Condor during this period. As can be seen, the job submissions follow no clearly definable pattern. Note that out of these submissions, 131,909 were later killed by the original Condor user. In order to simulate these killed jobs, the simulation assumes that these will be non-terminating jobs and will keep submitting them to resources until the time at which the high-throughput user terminates them. The graph is clipped on Thursday 03/06/2010, as this date had 93,000 job submissions.

For the simulations we will report the total power consumed (in MWh) for the period. In order to determine the effect of a policy on high-throughput users, we will also report the average overhead observed by jobs submitted to Condor (in seconds), where overhead is defined as the amount of time in excess of the execution duration of the job. Other statistics will be reported as appropriate for particular policies.

[Figure 4: Condor job submission profile]

[State diagram: computers cycle between Active (interactive user / Condor), Idle, and Sleep states, woken via WOL; cycle-stealing users, interactive users, and high-throughput management interact with the machines under the management policy.]
7. n reallocation policies
[Figure: number of good tasks killed against number of retries (n), for each combination of N1-N3 and C1-C3.]
• N1(n): Abandon task if deallocated n times.
• N2(n): Abandon task if deallocated n times, ignoring interactive users.
• N3(n): Abandon task if deallocated n times, ignoring planned machine reboots.
• C1: Tasks allocated to resources at random, favouring awake resources.
• C2: Target less-used computers (longer idle times).
• C3: Tasks are allocated to computers in clusters with the least amount of time used by interactive users.
[Figures: energy consumption (MWh) and overheads on all good tasks against number of retries (n), for each N/C combination.]
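The three abandonment criteria (N1, N2, N3) differ only in which evictions they count. A minimal sketch in Python, assuming illustrative event labels and a `Task` record that are not part of the talk's actual simulator:

```python
from dataclasses import dataclass, field

# Hypothetical eviction causes; the talk distinguishes three reasons
# a task can be deallocated from a machine.
USER_LOGIN = "user_login"  # an interactive user logs in
REBOOT = "reboot"          # a planned machine reboot
OTHER = "other"            # any other eviction

@dataclass
class Task:
    evictions: list = field(default_factory=list)

    def record_eviction(self, cause: str) -> None:
        self.evictions.append(cause)

def should_abandon(task: Task, policy: str, n: int) -> bool:
    """Decide whether a task has been deallocated too often.

    N1(n): count every eviction.
    N2(n): ignore evictions caused by interactive users.
    N3(n): ignore evictions caused by planned machine reboots.
    """
    if policy == "N1":
        count = len(task.evictions)
    elif policy == "N2":
        count = sum(1 for c in task.evictions if c != USER_LOGIN)
    elif policy == "N3":
        count = sum(1 for c in task.evictions if c != REBOOT)
    else:
        raise ValueError(f"unknown policy: {policy}")
    return count >= n

# A task evicted twice by interactive users and once by a reboot:
t = Task()
for cause in (USER_LOGIN, USER_LOGIN, REBOOT):
    t.record_eviction(cause)
```

Under these labels, the example task would be abandoned by N1(3) but kept by N2(3) and N3(3), which is why N2 and N3 kill fewer good tasks at the same n.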
9. Individual Time Policy
• Impose a limit on individual execution time for a task.
  – Nightly reboots bound this to 24 hours.
  – What is the impact of lowering this?
• I1(t): Abandon if individual time > t.
[Figures: energy consumption (MWh), number of good tasks killed, and overheads on all good tasks against individual time (hours), for I1 under C1-C3.]
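I1(t) needs only the elapsed time of the current execution attempt, not the eviction history. A one-line check, sketched under assumed names and units (seconds for the measured run, hours for the threshold, as on the slide's axis):

```python
def should_abandon_individual(current_run_seconds: float, t_hours: float) -> bool:
    """I1(t): abandon a task whose current (individual) execution
    attempt has run for more than t hours. Nightly reboots already
    bound any single attempt to 24 hours; this tightens that bound."""
    return current_run_seconds > t_hours * 3600
```

With t = 8, an attempt that has run 10 hours is abandoned while a 5-hour attempt continues.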
10. Dedicated Resources
D1(m,d): Miscreant tasks are permitted to continue executing on a dedicated set of m resources (without interactive users or reboots), with a maximum duration d.
[Figures: energy consumption (MWh), number of good tasks killed, and overheads on all good tasks against maximum execution duration (hours), for m = 10-40 and n = 10-30.]
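One way to model D1(m, d): a pool of m dedicated slots admits flagged miscreants and abandons any task that accumulates more than d hours of execution there. This is an illustrative sketch, not the paper's simulator; the class and method names are assumptions:

```python
class DedicatedPool:
    """D1(m, d): m dedicated machines (no interactive users, no
    reboots) where miscreant tasks may run for at most d hours."""

    def __init__(self, m: int, d_hours: float):
        self.m = m
        self.d_seconds = d_hours * 3600
        self.running = {}  # task id -> seconds accumulated on the pool

    def try_admit(self, task_id) -> bool:
        """Admit a miscreant task if a dedicated slot is free."""
        if len(self.running) < self.m:
            self.running[task_id] = 0.0
            return True
        return False

    def tick(self, task_id, seconds: float) -> bool:
        """Advance a running task; return False (abandon and free the
        slot) once it exceeds the maximum duration d."""
        self.running[task_id] += seconds
        if self.running[task_id] > self.d_seconds:
            del self.running[task_id]
            return False
        return True
```

Because dedicated slots are scarce, a larger m admits more miscreants at once, while d caps how much energy any single never-completing task can waste, matching the m/d trade-off shown in the plots.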
11. Conclusion
• Simple policies can be used to reduce the effect of miscreant tasks in a multi-use cycle-stealing cluster.
  – N2 (total evictions, ignoring interactive users)
    • Order-of-magnitude reduction in energy consumption
  – Reduce the amount of effort wasted on tasks that will never complete
• Policies may be combined to achieve further improvements.
  – Adding in dedicated computers
12. Questions?
stephen.mcgough@ncl.ac.uk
m.j.forshaw@ncl.ac.uk
More info: McGough, A. Stephen; Forshaw, Matthew; Gerrard, Clive; Wheater, Stuart: "Reducing the Number of Miscreant Tasks Executions in a Multi-use Cluster", Cloud and Green Computing (CGC), 2012.