Presentation at the 1st Workshop on Science of Cyberinfrastructure: Research, Experience, Applications and Models.
Topic: A case study by the Big Data Systems Lab group at Clemson University on setting up non-root dynamic provisioning of two big data infrastructures, Hadoop and HPCC Systems, on a shared research computing resource.
Dynamic Provisioning of Data Intensive Computing Middleware Frameworks
1. Dynamic Provisioning of Data Intensive Computing Middleware Frameworks: A Case Study
Linh B. Ngo1, Michael E. Payne1, Flavio Villanustre2, Richard Taylor2, Amy W. Apon1
1School of Computing, Clemson University
2LexisNexis® Risk Solutions
2. Contents
1. Overview of Clemson University's Cyberinfrastructure Resource
2. Demand for Dynamic Data-Intensive Computing Middleware Frameworks
3. Dynamic Provisioning of Data-Intensive Computing Frameworks
4. Deploying Hadoop Ecosystem vs. Deploying HPCC Systems®
5. Lessons Learned
3. Cyberinfrastructure Resource at Clemson University
• Condominium model
• 2,007 compute nodes (21,400 cores), including 276 GPU nodes
• Sustained 551 TFLOPS (benchmarked on GPU nodes only)
• 1,289 active users, 12 academic departments across 36 fields of research
4. Cyberinfrastructure Resource at Clemson University
• Interconnects: 1G/10G Ethernet, Myrinet-10G, InfiniBand-40G, InfiniBand-56G
• Local storage between 100-200 GB (majority) and 400-900 GB (nodes added since 2013)
• Shared 233 TB OrangeFS scratch space and more than 3 PB archival space
5. Demand for Dynamic Data-Intensive Computing Middleware Frameworks
• Genome Sequencing (Hadoop MapReduce/GPGPU)
• Molecular Dynamics Forward Flux Sampling (Hadoop Streaming/LAMMPS)
• Streaming Data Infrastructure for Connected Vehicle System (Hadoop Distributed File System/Spark/Kafka)
• Big Scholarly Data (HPCC Systems)
• CS Course in Distributed and Cluster Computing (MPI/MapReduce, Hadoop/Spark/HPCC Systems® …)
6. Demand for Dynamic Data-Intensive Computing Middleware Frameworks
• Changes in the cyberinfrastructure support model for data infrastructure:
– Beyond a traditional remote distributed file system model
– From static, dedicated resources to dynamic resources
– Data management processes co-locate with computing processes
• Challenges for system administrators:
– Accommodating different frameworks for different research
– Complying with existing administrative policy and scheduling priority
• What can users do?
– Deploy dynamic data-intensive computing frameworks within the limits of user privilege and without the intervention of administrators
7. Dynamic Provisioning of Data-Intensive Computing Frameworks: Installation
• Where to install:
1. Home directory: persistent, limited storage
2. Shared distributed storage: fast, semi-persistent, "unlimited" storage
3. Local storage on compute nodes: fast, non-persistent, requires reinstallation
• How to handle dependencies:
1. Ideally in home or shared distributed storage (persistence)
2. Dynamic loading mechanisms via environment paths
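The environment-path mechanism above can be sketched as a few lines of shell. The `$HOME/software` prefix is a hypothetical choice for illustration; any user-writable directory in home or shared storage works the same way.

```shell
# Hypothetical user-writable install prefix; no root privileges required.
PREFIX="$HOME/software"
mkdir -p "$PREFIX/bin" "$PREFIX/lib" "$PREFIX/include"

# Dynamic loading via environment paths: binaries, shared libraries, and
# headers installed under $PREFIX are found at run time and build time.
export PATH="$PREFIX/bin:$PATH"
export LD_LIBRARY_PATH="$PREFIX/lib:${LD_LIBRARY_PATH:-}"
export CPATH="$PREFIX/include:${CPATH:-}"

echo "$PATH" | grep -q "$PREFIX/bin" && echo "user-local paths active"
```

Placing these exports in a job script (or `~/.bashrc`) makes user-installed dependencies visible to every framework component without administrator help.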
8. Dynamic Provisioning of Data-Intensive Computing Frameworks: Deployment
[Figure: deployment/configuration scripts launched from user.palmetto.clemson.edu read PBS_NODEFILE and set up target deployment directories on the local disks of the allocated nodes]
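The workflow in the figure can be sketched in shell. PBS writes `$PBS_NODEFILE` with one line per allocated core, so node names repeat and must be deduplicated; the node names and the `/local_scratch` path below are hypothetical stand-ins for the real Palmetto values.

```shell
# Simulate $PBS_NODEFILE when not inside a PBS job (one line per
# allocated core, so node names repeat). Names are hypothetical.
if [ -z "${PBS_NODEFILE:-}" ]; then
    PBS_NODEFILE=$(mktemp)
    printf 'node0001\nnode0001\nnode0002\nnode0002\n' > "$PBS_NODEFILE"
fi

# Deduplicate while preserving first-seen order to get the node list.
NODES=$(awk '!seen[$0]++' "$PBS_NODEFILE")
echo "allocated nodes:"
echo "$NODES"

# Target deployment directories on each node's local disk
# (path is illustrative; a real script would execute these via ssh).
for n in $NODES; do
    echo "ssh $n mkdir -p /local_scratch/$USER/deploy/{log,pid,storage}"
done
```

The same node list then drives all later placement and configuration decisions.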
9. Deploying Hadoop Ecosystem vs. Deploying HPCC Systems®: Overview
• Hadoop Ecosystem: open-source alternatives based on the conceptual architecture of a data-intensive computing infrastructure developed by Google
• HPCC Systems: comprehensive data-intensive computing system targeting enterprise users, developed in the early 2000s, open source since 2011
10. Deploying Hadoop Ecosystem vs. Deploying HPCC Systems®: Installation: Hadoop
• Self-contained, pre-compiled jar files
• No installation step is needed; relies on shell scripts to launch component daemons
• Dependencies: JDK
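Because Hadoop ships as pre-compiled jars, "installation" reduces to untarring and invoking the bundled daemon scripts in the right order. A minimal dry-run sketch, assuming a hypothetical unpack location and the Hadoop 2.x `sbin` script names:

```shell
# Hypothetical unpack location; no install step beyond extracting the tarball.
HADOOP_HOME="$HOME/software/hadoop-2.7.3"

# Master daemons must start before workers. Printed as a dry run here;
# a real deployment executes each line (worker lines via ssh to each node).
CMDS="$HADOOP_HOME/sbin/hadoop-daemon.sh start namenode
$HADOOP_HOME/sbin/yarn-daemon.sh start resourcemanager
$HADOOP_HOME/sbin/hadoop-daemon.sh start datanode
$HADOOP_HOME/sbin/yarn-daemon.sh start nodemanager"
echo "$CMDS"
```

Everything runs under the user's own account; configuration directories and JAVA_HOME are supplied through environment variables rather than system paths.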
11. Deploying Hadoop Ecosystem vs. Deploying HPCC Systems®: Installation: HPCC Systems
• Standard configure/make/make install
– Assumes an industrial production environment (with administrative privileges)
– Modification to avoid hard-coded system installation paths
– Modification of template XML configuration files to avoid default HPCC Systems-specific user creation and administrative checks
• Dependencies:
– Not on Palmetto: ICU, Xalan, Xerces, APR …
– On Palmetto but wrong version: Binutils
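The core of the non-root build is redirecting every install path away from system locations such as `/opt/HPCCSystems` and `/etc/HPCCSystems` into the user's home directory. A dry-run sketch (the HPCC Systems platform builds via CMake; the prefix and source directory below are illustrative, not exact):

```shell
# Dry-run sketch of a non-root build: point the install prefix at a
# user-writable directory instead of the default system locations.
# The directory names and cmake invocation are illustrative assumptions.
PREFIX="$HOME/software/hpccsystems"
BUILD="cmake -DCMAKE_INSTALL_PREFIX=$PREFIX ../HPCC-Platform && make && make install"
echo "$BUILD"
```

User-built copies of the missing dependencies (ICU, Xalan, Xerces, APR, a matching Binutils) are installed under the same prefix and picked up through the environment paths set earlier.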
12. Deploying Hadoop Ecosystem vs. Deploying HPCC Systems: Deployment: Hadoop
• Component placement determination
• Clean up target directories from previous deployments
• Create target directories (log, storage, pid …)
• Synchronize order of component start-up
[Figure: the 1st node in PBS_NODEFILE hosts the NameNode, ResourceManager, and Spark Master; the 2nd through nth nodes each host a DataNode, NodeManager, and Spark executor]
• Additional components (HBase, Hive, Kafka …) can be added to this deployment model
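The component placement in the figure follows directly from the node order in `$PBS_NODEFILE`: first unique node becomes the master, the rest become workers. A sketch with a simulated node file (node names hypothetical):

```shell
# Simulated $PBS_NODEFILE (one line per core; node names hypothetical).
NODEFILE=$(mktemp)
printf 'node0001\nnode0001\nnode0002\nnode0003\n' > "$NODEFILE"

# Unique nodes in first-seen order.
NODES=$(awk '!seen[$0]++' "$NODEFILE")

# First node hosts the master daemons; all remaining nodes are workers.
HEAD=$(echo "$NODES" | head -n1)
WORKERS=$(echo "$NODES" | tail -n +2)

echo "$HEAD: NameNode ResourceManager SparkMaster"
for w in $WORKERS; do
    echo "$w: DataNode NodeManager SparkExecutor"
done
```

Extra components such as HBase or Kafka slot into the same scheme by appending their daemons to the head-node or worker-node role lists.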
13. Deploying Hadoop Ecosystem vs. Deploying HPCC Systems: Deployment: HPCC Systems
• Determine node allocation and internal IP addresses
• HPCC Systems is configured via its own deployment programs (configmgr, configgen, hpcc-init)
[Figure: HPCC Systems components mapped onto the nodes listed in PBS_NODEFILE]
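Unlike Hadoop, HPCC Systems' configuration tools expect internal IP addresses rather than hostnames, so the node list has to be resolved first. A sketch, using `localhost` as a stand-in for real node names:

```shell
# Resolve each allocated node name to an internal IPv4 address so the
# HPCC Systems configuration tools can be fed an IP list.
# 'localhost' stands in for real cluster node names here.
NODES="localhost"
IPFILE=$(mktemp)
for n in $NODES; do
    getent ahostsv4 "$n" | awk '{print $1; exit}'
done > "$IPFILE"
cat "$IPFILE"
```

The resulting IP file is then handed to the HPCC Systems deployment programs when generating the environment configuration.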
14. Deploying Hadoop Ecosystem vs. Deploying HPCC Systems: Deployment: HPCC Systems
• Node memory constraints:
– HPCC Systems reserves 75% of available memory for Thor by default
– Palmetto does not allow unlimited memory reservation
– As a result, thor_master cannot launch new jobs via fork()
– Resolved by lowering the memory reservation
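The fix can be sketched as computing a reservation below the system limit instead of taking the 75% default. The 40% fraction and the `totalMemoryLimit` attribute name below are illustrative assumptions about the generated environment.xml, not the exact HPCC Systems setting:

```shell
# Compute a memory reservation that stays under the per-process limit
# instead of HPCC Systems' default 75% of physical memory.
# The 40% fraction is an illustrative, deliberately conservative choice.
TOTAL_KB=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
RESERVE_MB=$(( TOTAL_KB * 40 / 100 / 1024 ))
echo "reserving ${RESERVE_MB} MB for thor"

# A real deployment would then patch the generated environment.xml,
# e.g. (attribute name is a hypothetical placeholder, printed dry-run):
echo "sed -i 's/totalMemoryLimit=\"[0-9]*\"/totalMemoryLimit=\"${RESERVE_MB}\"/' environment.xml"
```

With the reservation lowered, thor_master's fork() calls succeed within the memory limits Palmetto imposes on user jobs.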
15. Lessons Learned
• A common approach can be adapted for both the Hadoop Ecosystem and HPCC Systems
• Limitations on non-administrative accounts can impact deployment and performance via system resource constraints
– Unable to utilize all available memory on allocated nodes (HPCC Systems)
• Dynamic deployment via non-administrative accounts gives users the initiative to experiment with and utilize new large-scale frameworks without additional burden on administrators
16. Lessons Learned
• Experience in deploying as users is, in turn, highly applicable to deployment with administrative privileges.
– E.g., the CloudLab cloud computing experimental testbed offers non-persistent, ephemeral, short-term (15-hour) allocations
– Script-based installation and deployment are needed, even with administrative rights, to automate the deployment of experiments
• Experience in deploying as administrators helps in debugging user-based deployments:
– The memory allocation issue in HPCC Systems was identified and resolved by changing system limits using administrative commands.
17. QUESTIONS?
Linh B. Ngo1, Michael E. Payne1, Flavio Villanustre2, Richard Taylor2, Amy W. Apon1
{lngo,mpayne3,aapon}@clemson.edu
1School of Computing, Clemson University
{flavio.villanustre,richard.taylor}@lexisnexis.com
2LexisNexis Risk Solutions
More information about HPCC Systems can be found at http://hpccsystems.com