SlideShare ist ein Scribd-Unternehmen logo
1 von 8
Downloaden Sie, um offline zu lesen
2nd IEEE International Conference on Cloud Computing Technology and Science



     CloudBATCH: A Batch Job Queuing System on Clouds with Hadoop and HBase


                       Chen Zhang                                                                Hans De Sterck
       David R. Cheriton School of Computer Science                                    Department of Applied Mathematics
                  University of Waterloo                                                      University of Waterloo
                     Waterloo, Canada                                                           Waterloo, Canada
             Email: c15zhang@cs.uwaterloo.ca                                           Email: hdesterck@math.uwaterloo.ca



    Abstract—As MapReduce becomes more and more popu-                    programs and paradigms.
 lar in data processing applications, the demand for Hadoop                 Going for public cloud services such as Amazon EC2 and
 clusters grows. However, Hadoop is incompatible with existing           S3 is not suitable for all Hadoop needs either. Specifically,
 cluster batch job queuing systems and requires a dedicated
 cluster under its full control. Hadoop also lacks support for
                                                                         it could be too expensive or infeasible to transfer and store
 user access control, accounting, fine-grain performance moni-            large amounts of data with Amazon. For example, a live
 toring and legacy batch job processing facilities comparable to         cell image processing application described in [7] generates
 existing cluster job queuing systems, making dedicated Hadoop           more than 300GB of data daily. Assuming 3MB/s WAN
 clusters less amenable for administrators and normal users              bandwidth, it would take about 27 hours to transfer all the
 alike with hybrid computing needs involving both MapReduce
 and legacy applications. As a result, getting a properly suited
                                                                         data into Amazon at the cost of $301 , and another $45 per
 and sized Hadoop cluster has not been easy in organizations             day for storage if using Amazon S3. Worse still, some of
 with existing clusters. This paper presents CloudBATCH, a               these big data sets might only need to be processed once
 prototype solution to this problem enabling Hadoop to function          and then discarded or archived, making the expensive data
 as a traditional batch job queuing system with enhanced func-           transfer even less cost-effective. It could be more suitable for
 tionality for cluster resource management. With CloudBATCH,
 a complete shift to Hadoop for managing an entire cluster to
                                                                         these types of data processing tasks to be completed within
 cater for hybrid computing needs becomes feasible.                      the institutions where the data are generated.
                                                                            Efforts are made to make Hadoop and traditional queuing
                                                                         systems coexist and work together. But these efforts have
                       I. I NTRODUCTION
                                                                         limitations themselves. The Hadoop project itself has tried
    MapReduce [1] is becoming increasingly popular and                   to work in this direction and provides a Hadoop On Demand
 used to solve a wide spectrum of problems ranging from                  (HOD) package in support of provisioning and managing
 bioinformatics to astronomy. The increase of MapReduce                  independent Hadoop MapReduce and HDFS instances on
 usage requires a growing number of Hadoop [2] clusters.                 a shared cluster of nodes managed by a resource manager
 However, because Hadoop is incompatible with existing                   called “Torque”. It is relatively easy to make HOD work with
 cluster batch job queuing systems and requires a dedicated              other resource managers such as Condor, SGE (Sun Grid
 cluster under its full control, it is not easy to get a Hadoop          Engine), and Moab. However, an important limitation [4] for
 cluster within organizations with existing cluster resources            HOD is that HOD cannot effectively move computation near
 due to administrative difficulties. It is also difficult to man-          data to exploit data locality because of its design choice of
 age Hadoop clusters due to Hadoop’s lack of functionality               creating small sub-clusters for processing individual users’
 for job management, user access control, accounting, fine-               tasks in a subset of compute nodes where data may not
 grain performance monitoring, etc. As a result, Hadoop clus-            reside. Other efforts have been made to develop new queuing
 ters may suffer from usage contention/monopoly if shared by             systems to work with Hadoop. Oracle attempted to provide
 multiple user groups. Additionally, even for institutions that          a solution and recently made a new release of the Oracle
 can afford several dedicated clusters for various usages, pro-          Grid Engine, SGE 6.2u5, claiming to be the first cluster
 visioning resources for Hadoop clusters is still tricky since           resource management system with Hadoop integration so
 the resources, once dedicated to Hadoop, cannot be used for             that users can submit Hadoop applications to an SGE cluster
 the load balancing of legacy batch job submissions to other             just like they would with any other parallel jobs. SGE
 overloaded traditional clusters even if they are idle. This             will take care of setting up the Hadoop MapReduce on
 is because Hadoop is designed for running a certain type                demand. Apart from the apparent limitation of requiring
 of applications, mainly back-end server computations such               users to use/switch to SGE, the SGE Hadoop integration also
 as page rank calculations, which are normally MapReduce                 suffers from the data locality problem, although less severe
 programs for large-scale data processing tasks rather than
                                                                           1 Inbound
 general computing problems with diverse types of legacy                               data transfer to Amazon is free till November 2010


978-0-7695-4302-4/10 $26.00 © 2010 IEEE                            368
DOI 10.1109/CloudCom.2010.22
than HOD, as well as problems concerning non-exclusive                  created for various purposes, such as the Capacity Scheduler,
host access (See Section II). Additionally, both HOD and                the Fair Scheduler [5], and the Dynamic Priority (DP)
SGE with Hadoop integration suffer from wasting resources               parallel task scheduler [4].
during the reduce phase, which is an intrinsic drawback                    However, Hadoop schedulers cannot make Hadoop more
for dynamically creating a private Hadoop cluster per user              favorable towards functioning as a cluster batch queuing
MapReduce application request, and there exists no solution             system. They are task level schedulers that only concern
for this problem yet.                                                   stateless decisions for job executions and do not provide
   To address the above mentioned problems, this paper                  rich functionality for cluster level management, such as user
attempts to provide a solution where Hadoop can function as             access control and accounting, and advanced job and job
a traditional batch job queuing system with enhanced man-               queue management with easily customizable priority levels
agement functionalities in addition to its original resource            and reservation support.
management capability. The goal is to develop a system that
enables cluster administrators to make a complete shift to
Hadoop for managing an entire cluster for hybrid computing              B. Hadoop on Demand
needs. Our system is named “CloudBATCH”. The general
idea about how CloudBATCH works is as follows: The                         As the Hadoop project’s effort to alleviate its incompati-
system includes a set of HBase [3] tables that are used                 bility with traditional queuing systems, Hadoop on Demand
for storing resource management information, such as user               (HOD) is added to Hadoop for dynamically creating and
credentials and queue and job information, which provide                using Hadoop clusters through existing queuing systems.
the basis for rich customized management functionalities.               The idea is to make use of the existing cluster queuing
In addition, based on the HBase tables, our system provides             system to schedule jobs that each run a Hadoop daemon
a queuing system that allows “Job Brokers” to submit                    on a compute node. These running daemons together create
“Wrappers” (Map-only MapReduce programs) as agents to                   an on-demand Hadoop cluster. After the Hadoop cluster
execute jobs in Hadoop. Users can submit jobs and check                 is set up, user submitted MapReduce applications can be
job status by using a “Client” at any time.                             executed. A simple walkthrough of the process for creating
   The remainder of the paper is structured as follows:                 a Hadoop cluster and executing user submitted MapReduce
in Section II we introduce related work about Hadoop                    jobs through HOD is as follows:
schedulers, Hadoop on Demand, and SGE with Hadoop
                                                                          1) User requests from cluster resource management sys-
integration. In Section III we briefly describe some desired
                                                                             tem a number of nodes on reserve and submits a job
properties of a batch job queuing system for clusters. Section
                                                                             called RingMaster to be started on one of the reserved
IV will describe the design and implementation of Cloud-
                                                                             nodes.
BATCH and how CloudBATCH achieves the properties
                                                                          2) Nodes are reserved and RingMaster is started on one
listed in Section III. In Section V we give a performance
                                                                             of the reserved nodes.
evaluation. Section VI concludes.
                                                                          3) RingMaster starts one process called HodRing on each
                    II. R ELATED W ORK                                       of the reserved nodes.
                                                                          4) HodRing brings up on-demand Hadoop daemons (na-
   In this section, we review existing solutions for Hadoop
                                                                             menode, jobtracker, tasktracker, etc.) according to the
resource management and its integration with existing clus-
                                                                             specifications maintained by RingMaster in configura-
ter batch job queuing systems, and examine their limitations.
                                                                             tion files.
To our knowledge, none of the existing solutions can enable
                                                                          5) MapReduce jobs that were submitted by the user are
Hadoop with a rich set of traditional cluster management
                                                                             executed on the on-demand cluster.
functionalities comparable to what CloudBATCH provides.
                                                                          6) The Hadoop cluster gets torn down and resources are
A. Hadoop Schedulers                                                         released.
   Hadoop allows customized schedulers to be developed                     As seen from the above walkthrough of an HOD process,
as pluggable components. The motivation for supporting                  data locality of the external HDFS is not exploited because
various schedulers, besides an obvious objective of achieving           the reservation and allocation of cluster nodes takes no
better system utilization, originates from the user demands             account of where the data to be processed are stored,
to share a Hadoop cluster allocated into different queues for           violating an important design principle and reducing the
various types of jobs (production jobs, interactive jobs, etc.)         advantage of the MapReduce framework. Another problem
among multiple users. The default Hadoop scheduler is a                 with HOD is that the HodRing processes are started through
simple FIFO queue for the entire Hadoop cluster which is                ssh by RingMaster, and the cluster resource management
insufficient to provide guaranteed capacity with fair resource           system is unable to track resource usage and to perform
allocations for mixed usage needs. Several schedulers are               thorough cleanup when the cluster is torn down.



                                                                  369
C. SGE with Hadoop Integration                                                    scheduling tasktrackers. However, a compute node could get
   Recently, Oracle released SGE 6.2u5 with Hadoop in-                            overloaded because multiple tasktrackers may be started on
tegration, enabling SGE to work with Hadoop without                               the same node for data locality concerns to execute tasks.
requiring a separate dedicated Hadoop cluster. The core                           In this case, an unpredictable number of Hadoop speculative
design idea is similar to HOD in that it also tries to start                      tasks [5] may be started on the other idling nodes where data
an on-demand Hadoop cluster by running Hadoop daemons                             do not reside, incurring extra overhead in data staging and
through the cluster resource management system on a set of                        waste of resources for executing the otherwise unnecessary
reserved compute nodes. The difference with HOD is that                           duplicated tasks. The unbalanced mingled execution of nor-
SGE exerts better concern for data locality when scheduling                       mal and speculative tasks may further mix up Hadoop sched-
tasktrackers and supports better resource usage monitoring                        ulers built in with the dynamically created Hadoop cluster
and cleanup because tasktrackers are directly started by SGE                      which are unaware of the higher level scheduling decisions
as opposed to be started by HodRing in HOD. The main                              made by SGE, harming performance in unpredictable ways.
problem, apart from locking users down to using SGE, lies                         Even if the default Hadoop speculative task functionality
in its mechanism of exploiting data locality and the non-                         is turned off, the potential danger of overloading a node
exclusive usage of compute nodes.                                                 persists, which could further contribute to unbalanced exe-
   Figure 1 illustrates how SGE with Hadoop integration                           cutions of Map tasks that result in wasting cluster resources
works:                                                                            as explained below. SGE also has an exclusive host access
                                                                                  facility. But if this facility is required, then the data locality
   1) A process called Job Submission Verifier (JSV) talks
                                                                                  exploitation mechanism would be much less useful because
       to the namenode of an external HDFS to obtain a data
                                                                                  normally a data node would potentially host data blocks
       locality mapping from the user submitted MapReduce
                                                                                  needed by several user applications while only one of them
       program’s HDFS paths to data node locations in
                                                                                  can benefit from data locality in the case with exclusive
       blocks and racks. Note that the “Load Sensors” are
                                                                                  host access. Additionally, the SGE installation file is about
       responsible for reporting on the block and rack data for
                                                                                  700MB, quite large compared to Hadoop which is only about
       each execution host where an SGE Execution Daemon
                                                                                  70MB in a tar ball and thus less costly to distribute.
       runs.
   2) The SGE scheduler starts the Hadoop Parallel Environ-                          Worse still, both HOD and SGE suffer from the same ma-
       ment (PE) with a jobtracker and a set of tasktrackers as                   jor problem intrinsic to the idea of creating a Hadoop cluster
       near the data nodes containing user application input                      on-the-fly for each user MapReduce application request.
       data as possible.                                                          The problem is a possible significant waste of resources in
   3) MapReduce applications that were submitted by the                           the Reduce phase, where nodes might be idling when the
       user are executed. Because several tasktrackers might                      number of Reduce tasks to be executed is much smaller than
       have been started on the same physical node, physical                      the number of Map tasks that are executed. This is because
       nodes could be overloaded when user applications start                     each of the on-demand clusters is privately tailored to a
       to be executed.                                                            single user MapReduce application submission and the size
   4) The Hadoop cluster gets torn down and resources are                         of the Hadoop cluster is fixed at node reservation time. If
       released.                                                                  the user MapReduce application requires far more Map tasks
                                                                                  than Reduce tasks and the number of nodes are reserved
                                                                                  matching the Map tasks (which is usually what users would
                                                                                  request), many of the machines in the on-demand Hadoop
                                                                                  cluster will be idling when the much smaller number of
                                                                                  Reduce tasks are running at the end. The waste of resources
                                                                                  could also occur in the case of having unbalanced executions
                                                                                  of Map tasks, in which case a portion of Map tasks get
                                                                                  finished ahead of time and wait for the others to finish
                                                                                  before being able to enter the Reduce phase. There exists
                                                                                  no solution for this problem yet.


                                                                                                   III. D ESIRED P ROPERTIES

Figure 1. SGE Hadoop Integration (taken from online blog post by Oracle).            In this section, we describe some properties that Hadoop
                                                                                  lacks good support of, yet are essnetial for batch job queu-
  As seen above, the major advantage about SGE compared                           ing systems. CloudBATCH supports the majority of these
to HOD is the utilization of data locality information for                        properties.



                                                                            370
A. Job Queue Management                                                                                                                Monitor


   Job queue management is critical for managing shared                                                                                      Check
                                                                                                                                             status

usage of a cluster for various purposes among multiple users.                                    Serial Job
                                                                                                                    Hbase Tables
For example, there could be a queue for batch serial jobs and            User   Client
                                                                                                                                                                                         Reservation
                                                                                          Reserved                   Job Table                  User Table    Queue Table
a queue for MPI applications. Cluster nodes can be assigned                                 Job
                                                                                                        MapReduce
                                                                                                           Job
                                                                                                                                                                                           Table


to queues with a certain minimum and maximum quantity                                                                poll                                        poll


and capacity guarantee for optimized resource utilization.                                                              Job                                         Job
                                                                                                                      Broker i                                    Broker j
Additionally, each queue should optionally have specific
                                                                                                                                                                        Submit Wrapper
policies concerning the limit of job running time, queue                                                                                                                 Execute Job

                                                                                                                            Submit Wrapper

job priority, etc., to avoid the need to specify these policies                                                              Execute Job
                                                                                                                                                             Wrapper
                                                                                                                                                                                         Wrapper
                                                                                                                                                                n                          Job

on a per job basis. Furthermore, user access control for                                                                                     Wrapper                                        y

                                                                                                                                               JobA
submitting to queues should also be in place.                                                                                                   x

                                                                                                                                Pool of MapReduce Worker Nodes


B. Job Scheduling and Reservation
                                                                                         Figure 2.            CloudBATCH architecture overview.
   Jobs should be put into a certain queue with queue-
specific properties, such as priority and execution time limit.
Jobs with higher priorities should be scheduled first and may             number of Job Brokers are polling for the HBase tables to
require preemption based on priority. Reservation for pre-               find jobs ready for execution. If a job is ready, a Wrapper
scheduled job submission and customized job submission                   containing the job is submitted to Hadoop MapReduce by
policies may be supported such as putting a threshold on the             the Job Broker. The Wrapper is responsible for monitoring
number of simultaneous job submissions allowed for each                  the execution status of the job and updating relevant records
user. In addition, job status and history shall be kept on a per         concerning job and resource information in HBase tables.
user basis together with descriptions for user management,               The Wrapper is also responsible for following some job
statistical study and data provenance purposes. Resource                 policies such as on execution time limit and may terminate
status organized by queues shall also be maintained con-                 a running job if it takes too long over the limit. Monitors are
cerning activeness and workload. Both the status of jobs                 currently responsible for detecting and handling failures that
and resources shall be able to be monitored in real-time or              may occur after Wrappers are submitted to Hadoop MapRe-
offline for analysis.                                                     duce but before they can start to execute the assigned jobs,
C. User Access Control and Accounting                                    which may result in jobs being marked as “Status:queued”
                                                                         but never getting executed.
   User access control shall be supported at least at the
queue level, which requires user credentials to be kept and                 The key part of CloudBATCH is a set of HBase tables that
validated when a job submission or reservation is made.                  store Meta-data information for resources. The main tables
Stateful job execution status and history along with user                are Queue Table (Table I), Job Table (shown in three parts
information shall be kept to support accounting. User groups             in Tables II, III, and IV), Scheduled Time Table (Table V),
may also be supported.                                                   and User Table (Table VI)2 .
   In the following section, we discuss to which degree                     The Queue Table (Table I) stores information about
CloudBATCH succeeds in meeting these desired properties.                 queues such as type of jobs (e.g., a queue for MapReduce
                                                                         jobs or serial jobs), queue capacity measured in the max-
                    IV. C LOUD BATCH                                     imum number of simultaneous active jobs allowed, queue
A. Architecture Overview                                                 job priority (numerical values with increasing priority level
                                                                         from 1-10), execution time limit in milliseconds associated
   CloudBATCH is designed as a fully distributed system
                                                                         with the queue, queue domain (private for some users or
on top of Hadoop and HBase. It uses a set of HBase
                                                                         public for all), and list of users or groups (access control list)
tables globally accessible across all the nodes to manage
                                                                         allowed to access the queue. Note that Table I is a sparse
Meta-data for jobs and resources, and runs jobs through
                                                                         table with rows of varying lengths. In our example settings,
Hadoop MapReduce for transparent data and computation
                                                                         queue “serial” is under the “public” domain and does not
distribution.
                                                                         specify a list of users/groups who are allowed to access the
   Figure 2 shows the architecture overview of the Cloud-
                                                                         queue, while queue “bioinformatics” is under the “private”
BATCH system. CloudBATCH has several distributed com-
                                                                         domain and only allows the listed users to submit jobs to it.
ponents, Clients, Job Brokers, Wrappers and Monitors. The
                                                                         There is no priority associated with queue “bioinformatics”
general sequence of job submission and execution using
                                                                         meaning that either the default priority of “1” or the priority
CloudBATCH is as follows: Users use Clients to submit
                                                                         specified by the user at job submission will be used.
jobs to the system. Job information with proper job status
is put into HBase tables by the Client. In the meantime, a                 2 HBase   column families are omitted for simplicity for all tables.



                                                                   371
The Job Table (Tables II, III, IV) is a large table con-              specific policies or restrictions such as the maximum number
taining extensive information about jobs. Here we show                   of simultaneous jobs allowed to be running in queues at any
the essential columns only and is split into three tables for            time. The “IndividualJobPriorityLimit” regulates the highest
better display. Jobs are identified by unique IDs, submitted              job priority value a user can submit a job with. This ensures
to queues, and associated with the submitting users. Cloud-              that only authorized users can submit high priority jobs that
BATCH currently accepts three types of jobs, namely, serial,             may preempt lower priority jobs if desired (future work).
MapReduce and Scheduled Time (i.e., serial or MapReduce                     In the following subsections, we introduce the key Cloud-
jobs scheduled with an earliest start time). Each job has an             BATCH components one by one and discuss how well
associated priority (either inherited from the queue it is sub-          CloudBATCH satisfies the desired properties mentioned in
mitted to or specified explicitly through the Client), execu-             Section III as well as several other design issues.
tion length limit, SubmitTime (the time at the HBase server
                                                                         B. Client
when the Client first puts in the job record), QueuedTime
(the time a Wrapper is submitted for this job), StartTime (the             The Client provides user interaction functionality for user
time the Wrapper starts executing the job), EndTime (the                 access control and job submission to proper queues with the
time the Wrapper finishes job execution and status update),               capability of staging Non-NFS files into Hadoop HDFS. The
and Status (submitted, queued, running, failed, succeeded).              Client normally runs on the job submitting machine. Figure
Each job should contain an executable command. In the case               3 shows the process of handling job submission.
                                                                                                                  Job submission command
when the Network File System (NFS) is used (which is
the common case in clusters among compute nodes), job                                                                   Get User Name
                                                                                                                      from commandline
commands should contain the necessary file paths needed
by the job. When NFS is not used, job commands should                                   Query User          Specific user
                                                                                      Table for policies
contain input file paths relative to the execution directory                                                                       Default user

on the compute node and in the meantime specify the                                                                         Apply User
corresponding absolute file paths on the machine where                                                                        policies


the files are stored (normally in the machine from which                        User non-exist
                                                                                                                                                              User not allowed for
                                                                                                                    Get Queue name from                       the intended queue
job submissions are made) using the –NonNFSFiles option                                                            commandline and query
                                                                                                                   Queue Table for policies
through the Client so that those files will be staged into                                                                          Default queue
HDFS by the Client and later retrieved by the Wrapper and                                 terminate                                                                  terminate
                                                                                                                           Apply Queue
put into the correct directory structure relative to the com-                                                                Policies


mand execution directory on the cloud node where the job                                                   Non-NFS files
                                                                                         Stage files
is executed (in this case, there is a single globally accessible
HDFS system and each cloud node also has its own local file                                                                               Scheduled Time Job    Update Scheduled Time
system). Other supporting non-NFS input files can also be                                                                                                               Table


staged in this way. Output result files are however not staged
                                                                                                                      Update Job Table
automatically in the current CloudBATCH prototype. Users
should include a file staging logic if necessary at the end of                Figure 3.          The flow of operations in the CloudBATCH Client.
their programs to stage result files explicitly to the desired
destinations. Please pay attention to the way how some of the               After a job is submitted through the commandline inter-
columns are named and data are stored. For example, instead              face, the Client parses the command for user and queue
of having a single “Status” column, we use columns named                 names (if unspecified, the default user and queue will be
by the corresponding status such as “Status:queued”. This                used). The User Table will be checked for access control
is designed for doing fast queries such as “all the jobs that            purposes, and user policies such as the maximum job priority
are queued” by taking advantage of some methods provided                 value allowed will be applied to the submitted job. The
by HBase that can rapidly scan for all the rows for columns              Queue Table will also be checked to see if the user is allowed
with non-empty values.                                                   to submit jobs to the specific queue and to apply queue
   The Scheduled Time Table (Table V) is used to record job              policies to the job as well. If the policy from the Queue Table
requests with a specified earliest start time. Scheduled job              conflicts with the one from the User Table, an exception
information is first entered into the Scheduled Time table                will be raised and the job should be submitted again with
and then put into the Job table with “Status:submitted”. The             corrected information. After these checks, the Client finally
value of “ScheduledTime” in the Scheduled Time Table is                  gets a unique job ID by the mechanism of global sequence
used to set the “SubmitTime” in the Job Table. When a Job                number assignment mentioned in [6] and puts the job infor-
Broker sees a scheduled job, it will not process it until the            mation into the Job Table with “Status:submitted”. If the job
“SubmitTime” has been reached.                                           uses non-NFS files, there is currently some limited support
   The User Table (Table VI) is used to record some user-                from the Client to let users specify colon-separated pairs of


                                                                   372
Table I
                                                            Q UEUE TABLE

  QueueID          QueueType     Capacity     JobPriority       JobLengthLimit       Domain    UserAlice     UserBob        Groupdefault
  serial           serial        50           1                 1200000              public                                 Y
  bioinformatics   MapReduce     10                             600000               private                 Y

                                                              Table II
                                                        J OB TABLE , PART 1

 JobID    QueueID:    QueueID:         JobType:serial       JobType:MapReduce         JobType:ScheduledTime      Priority:1    Priority:5
          serial      bioinformatics
 23       Y                            Y                                                                         Y
 24                   Y                                     Y                                                    Y
 25       Y                            Y                                              Y                                        Y

                                                              Table III
                                                        J OB TABLE , PART 2


 JobID    StartTime   EndTime    SubmitTime     QueuedTime        Job         Status:       Status:    Status:    Status:     Status:
                                                                  Length      submitted     queued     running    failed      succeeded
                                                                  Limit
 23                              t1                               1200000     Y
 24       t4                     t2             t3                600000                               Y
 25                              t5                               1200000     Y

                                                              Table IV
                                                        J OB TABLE , PART 3

 JobID    UserAlice   UserBob    Groupdefault    Groupbio        Command          JobDescription   AbsolutePath1      AbsolutePath2
 23       Y                      Y                               Command1         mpeg encoder     RelativeFileDir1
 24                   Y                          Y               Command2         bio sequence                        RelativeFileDir2
 25                   Y                          Y               Command3         space weather



absolute local file paths on the client machine and relative             locations.
remote file paths on the Hadoop execution node (used in the
job command) through the –NonNFSFiles option. The Client                C. Job Broker
creates a column in the Job table (Table IV) named after the               The Job Broker is responsible for polling the Job Table
absolute local file path with column value the remote file                for lists of user submitted jobs that are ready for execution.
path relative to the job command execution directory. The               For every job on the lists, the Job Broker submits to
absolute file path, concatenated with a HDFS path prefix                  the Hadoop MapReduce framework a “Wrapper” for job
containing the job ID, will be used as the temporary file path           execution. Figure 4 shows the tasks a Job Broker should
in HDFS where the Client will transfer the user-specified                perform periodically when polling the Job Table.
local files to. For example, for job ID X with command
                                                                           A Job Broker queries the Job Table periodically to get
“ls images/medical.jpg”, the user can specify a file pair
                                                                        a list of jobs with “Status:submitted” according to job
“/home/user/images/medical.jpg:images/medical.jpg”. Then
                                                                        priorities from high to low. For each of the jobs in the list,
the local image file will be transferred to HDFS un-
                                                                        the Job Broker checks whether the job is a scheduled job that
der “/HDFS/tmp/jobX/home/user/images/medical.jpg”. The
                                                                        has not yet reached its scheduled execution time, whether the
Wrapper that executes this job later will copy that HDFS
                                                                        intended job queue is already full to its maximum capacity,
file to the local relative path “images/medical.jpg” before
                                                                        and whether the user who submits the job already has a
invoking the job command. The executable of the command
                                                                        number of running jobs at the maximum value of his allowed
should be transferred this way as well if necessary. Future
                                                                        quota. If the job is allowed to be executed immediately, the
work can be added to distinguish between input files and
                                                                        Job Broker changes the job status to “Status:queued” and
output files so that non-NFS file users can have an option
                                                                        submits a Wrapper to execute it. Note that the job status is
to put the result files back into HDFS with customized
                                                                        changed before submitting the Wrapper. If the Wrapper fails


                                                                  373
Table V
                                                                                                S CHEDULED T IME TABLE

                          ScheduledTimeID                           UserBob               Groupbio                JobID     ScheduledTime       RequestSubmitTime
                          2                                         Y                     Y                       25        t5                  t0

                                                                                                                Table VI
                                                                                                              U SER TABLE

      UserID                      SimultaneousJobNumLimit                                  IndividualJobPriorityLimit                  Groupdefault   Groupbio     Groupspace
      UserAlice                   20                                                       5                                           Y
      UserBob                     30                                                       8                                           Y              Y

                               Query Job Table for a list of
                             Status:submitted jobs by Priority
                                       High to Low
                                                                                                                       pletes, either successful or failed, the Wrapper will update
                              For EACH job
                                                            Scheduled Time job with                                    the job status (“Status:successful” or “Status:failed”) in the
                                                           SubmitTime not yet reached
                                                                                                                       Job Table, perform cleanup of the temporary job execution
               Status:submitted   in queue X                                                                           directory and terminate itself normally.
                            Query Job Table for the total No. of
                                 running jobs in queue X
                                                                             Query Queue Table for the
                                                                               capacity of queue X
                                                                                                                       E. Monitor
                                                                                                                          Monitors are currently responsible for detecting and han-
                                                       Reached max queue capacity
                                                                                                                       dling Wrapper failures which may happen after Wrappers
                                                                                                                       are submitted to Hadoop MapReduce but fail to execute
                           Query User Table for the max No. of
                               simultaneous jobs allowed
                                                                          Query Job Table for the total No.
                                                                            of running jobs for this user
                                                                                                                       the assigned jobs, which results in jobs being marked as
                                                                                                                       “Status:queued” but never executed. The current method
                 Too many running tasks
                       for User                                                                                        to achieve this is to first set a time threshold T, then let
                                                                                                                       the Monitor poll the Job Table for jobs that stay in the
                                   Change job status to
                                                                                                                       “queued” status for a time period longer than T counting
                                     Status:queued
                                                                                                                       from “QueuedTime”. It is possible that jobs will still be
                                                                                                                       queued after a healthy Wrapper is submitted with it, even
      Check next job in
         the job list
                                     Submit Wrapper
                                      to execute job
                                                                                                                       though queue capacity or user active job number limit are
                                                                                                                       not reached. This may happen when the cluster is overloaded
  Figure 4.       The flow of operations in the CloudBATCH Job Broker.                                                  and newly submitted MapReduce jobs (i.e., Wrappers them-
                                                                                                                       selves) are queued by the Hadoop MapReduce framework.
                                                                                                                       Therefore, the threshold shall be chosen and adjusted very
before actually starting the job execution, the job could be                                                           carefully according to cluster size and load. Monitor usages
straggling with the “queued” status forever. This is handled                                                           are still not fully explored for our prototype and may be
by Monitors as described below.                                                                                        extended in the future for detecting jobs that have been
                                                                                                                       marked as “Status:running” but actually failed.
D. Wrapper
   Wrappers are responsible for executing jobs through the                                                             F. Discussion
Hadoop MapReduce framework. They are actually Map-only                                                                    CloudBATCH supports the majority of the desired prop-
MapReduce programs acting as agents to execute programs                                                                erties explained in Section III. For example, it adopts a
at compute nodes where they are scheduled to run by                                                                    scalable architecture where there is no single point of failure
the underlying Hadoop schedulers transparently. When a                                                                 in its components (HBase is assumed to be scalable and
Wrapper starts at some node, it first grabs from the HBase                                                              robust) and multiple numbers of its components (such as
table the necessary job information and files that need to be                                                           Job Brokers) can be run according to the scale of the cluster
staged to the local machine. Then it starts the job execution                                                          and job requests it manages; it supports advanced user and
through commandline invocation, MapReduce and legacy                                                                   job queue management with individually associated policies
job alike, and updates the job status to “Status:running”.                                                             concerning user access control and maximum number of
During job execution, it will check the execution status such                                                          active jobs allowed, queue capacity, job priority, etc. It
as total running time in order to follow some job policies                                                             also inherits Hadoop’s transparent data and computation
such as maximum execution time limit, and may terminate                                                                deployment as well as task level fault tolerance concerning
a running job if it takes too long. In the future, Wrappers                                                            the way how Wrappers are handled. Some other proper-
can be extended to do more advanced resource utilization                                                               ties are also supported by working complementarily with
monitoring during job execution. After job execution com-                                                              Hadoop schedulers, such as the minimum number of node


                                                                                                                 374
allocation guaranteed to each queue which can be specified
inside Hadoop schedulers. In addition, CloudBATCH can
handle simple non-NFS file staging requests rather than
requiring users’ manual efforts. The HBase table records
can also serve as a good basis for future data provenance
support. However, MPI is not yet supported since there is
no Hadoop version of the mpirun command yet. The direct
reservation of compute nodes rather than earliest start time
specification for jobs is not supported either because Hadoop
hides compute node details from explicit user control, a                 Figure 5. Server throughput of accepting job submissions vs Number of
tradeoff for the purpose of transparent user access and ease             concurrent Clients.
of management over large numbers of machines.
   Concerning fault tolerance, CloudBATCH is capable of
handling two types of failures. The first is fault tolerance for                    VI. C ONCLUSIONS AND F UTURE W ORK
jobs against machine failures. This is handled by the fault                 In this paper, we describe a system called “CloudBATCH”
tolerance mechanism of the underlying Hadoop MapRe-                      to enable Hadoop to function as a traditional batch job queu-
duce framework through Wrappers. Because Wrappers are                    ing system with enhanced management functionalities for
MapReduce programs, they will be restarted automatically                 cluster resource management. With CloudBATCH, clusters
by default for 6 times if machine failures occur. The other              can be managed using Hadoop alone to cater for hybrid
type of fault would potentially occur due to the concurrent              computing needs involving both MapReduce and legacy
execution of multiple CloudBATCH components. For exam-                   applications. Future work will be explored in the direction
ple, if several Job Brokers run simultaneously, they might               of further testing the system under multi-queue, multi-user
all try to update the status of the same job. This problem               situations with heavy load and refining the prototype im-
may be avoided by fail-safe component protocols which may                plementation of the system for trial production deployment
possibly make use of distributed transactions with global                in solving real-world use cases. CloudBATCH may also be
snapshot isolation on HBase tables [6] so that updates to the            exploited to make dedicated Hadoop clusters (when they
HBase tables by different components won’t conflict with                  coexist with other traditional clusters) useful for the load
each other.                                                              balancing of legacy batch job submissions.

             V. P ERFORMANCE E VALUATION                                                           R EFERENCES
   We present the result of a basic test for the server through-         [1] J. Dean and S. Ghemawat. Mapreduce: Simplified Data
put of handling job submissions vs the number of concurrent                  Processing on Large Clusters. Commun. ACM, 51:107–113,
                                                                             2008.
Clients. The test is performed because according to the
architecture of CloudBATCH, client job submission could                  [2] Hadoop. http://hadoop.apache.org/.
become a performance bottleneck. The test investigates how
fast the system can accept job submissions (inserting job                [3] HBase. http://hadoop.apache.org/hbase/.
information into the HBase tables with proper job status)                [4] T. Sandholm and K. Lai. Dynamic Proportional Share Schedul-
from more and more concurrent users. The result provides                     ing in Hadoop. In LNCS: Proceedings of the 15th Workshop
insight in the system capacity of handling user requests. We                 on Job Scheduling Strategies for Parallel Processing, pages
assume that there is only one queue with no restrictions in                  110–131, 2010.
terms of queue/user policies. Having multiple queues with
                                                                         [5] M. Zaharia, D. Borthakur, J. Sen Sarma, K. Elmeleegy,
complicated policies may decrease system performance,                        S. Shenker, and I. Stoica. Delay Scheduling: A Simple
which will be studied in future work. The test is performed                  Technique for Achieving Locality and Fairness in Cluster
on a 3-machine cluster with Hadoop and HBase installed.                      Scheduling. In Proceedings of the 5th European conference
   In the test, we run an increasing number of Clients with                  on Computer systems, EuroSys ’10, pages 265–278, 2010.
an even distribution of the number of Clients on each of
                                                                         [6] C. Zhang and H. De Sterck. Supporting Multi-row Distributed
our 3 machines concurrently. Each Client submits 100 jobs                    Transactions with Global Snapshot Isolation Using Bare-bones
consecutively to our system. The result (Figure 5) shows                     HBase. In Proceedings of the 11th International Conference
that the system can accept about 25 job submission requests                  on Grid Computing (Grid), 2010.
per second. We argue that the performance is satisfactory for
                                                                         [7] C. Zhang, H. De Sterck, A. Aboulnaga, H. Djambazian, and
normal user job submission request loads. The reason for the
                                                                             R. Sladek. Case Study of Scientific Data Processing on a Cloud
slight decline in throughput after reaching its peak value is                Using Hadoop. In LNCS: Proceedings of the 23rd International
because the Client components compete for the same system                    Symposium of High Performance Computing Systems and
resources used to host Hadoop and HBase.                                     Applications (HPCS), pages 400–415, 2009.



                                                                   375

Weitere ähnliche Inhalte

Was ist angesagt?

Survey of Parallel Data Processing in Context with MapReduce
Survey of Parallel Data Processing in Context with MapReduce Survey of Parallel Data Processing in Context with MapReduce
Survey of Parallel Data Processing in Context with MapReduce cscpconf
 
Characterization of hadoop jobs using unsupervised learning
Characterization of hadoop jobs using unsupervised learningCharacterization of hadoop jobs using unsupervised learning
Characterization of hadoop jobs using unsupervised learningJoão Gabriel Lima
 
Implementation of nosql for robotics
Implementation of nosql for roboticsImplementation of nosql for robotics
Implementation of nosql for roboticsJoão Gabriel Lima
 
Cidr11 paper32
Cidr11 paper32Cidr11 paper32
Cidr11 paper32jujukoko
 
Report Hadoop Map Reduce
Report Hadoop Map ReduceReport Hadoop Map Reduce
Report Hadoop Map ReduceUrvashi Kataria
 
Big Data Analysis and Its Scheduling Policy – Hadoop
Big Data Analysis and Its Scheduling Policy – HadoopBig Data Analysis and Its Scheduling Policy – Hadoop
Big Data Analysis and Its Scheduling Policy – HadoopIOSR Journals
 
Hadoop Seminar Report
Hadoop Seminar ReportHadoop Seminar Report
Hadoop Seminar ReportAtul Kushwaha
 
Introduction to Big Data and Hadoop using Local Standalone Mode
Introduction to Big Data and Hadoop using Local Standalone ModeIntroduction to Big Data and Hadoop using Local Standalone Mode
Introduction to Big Data and Hadoop using Local Standalone Modeinventionjournals
 
Integrating dbm ss as a read only execution layer into hadoop
Integrating dbm ss as a read only execution layer into hadoopIntegrating dbm ss as a read only execution layer into hadoop
Integrating dbm ss as a read only execution layer into hadoopJoão Gabriel Lima
 
Hadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log ProcessingHadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log ProcessingHitendra Kumar
 

Was ist angesagt? (17)

Survey of Parallel Data Processing in Context with MapReduce
Survey of Parallel Data Processing in Context with MapReduce Survey of Parallel Data Processing in Context with MapReduce
Survey of Parallel Data Processing in Context with MapReduce
 
Characterization of hadoop jobs using unsupervised learning
Characterization of hadoop jobs using unsupervised learningCharacterization of hadoop jobs using unsupervised learning
Characterization of hadoop jobs using unsupervised learning
 
Implementation of nosql for robotics
Implementation of nosql for roboticsImplementation of nosql for robotics
Implementation of nosql for robotics
 
Cidr11 paper32
Cidr11 paper32Cidr11 paper32
Cidr11 paper32
 
Report Hadoop Map Reduce
Report Hadoop Map ReduceReport Hadoop Map Reduce
Report Hadoop Map Reduce
 
Big Data Analysis and Its Scheduling Policy – Hadoop
Big Data Analysis and Its Scheduling Policy – HadoopBig Data Analysis and Its Scheduling Policy – Hadoop
Big Data Analysis and Its Scheduling Policy – Hadoop
 
Hadoop Seminar Report
Hadoop Seminar ReportHadoop Seminar Report
Hadoop Seminar Report
 
B04 06 0918
B04 06 0918B04 06 0918
B04 06 0918
 
Hadoop
HadoopHadoop
Hadoop
 
Hadoop Report
Hadoop ReportHadoop Report
Hadoop Report
 
Why Spark over Hadoop?
Why Spark over Hadoop?Why Spark over Hadoop?
Why Spark over Hadoop?
 
Understanding hdfs
Understanding hdfsUnderstanding hdfs
Understanding hdfs
 
Introduction to Big Data and Hadoop using Local Standalone Mode
Introduction to Big Data and Hadoop using Local Standalone ModeIntroduction to Big Data and Hadoop using Local Standalone Mode
Introduction to Big Data and Hadoop using Local Standalone Mode
 
Integrating dbm ss as a read only execution layer into hadoop
Integrating dbm ss as a read only execution layer into hadoopIntegrating dbm ss as a read only execution layer into hadoop
Integrating dbm ss as a read only execution layer into hadoop
 
Hadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log ProcessingHadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log Processing
 
Actian DataFlow Whitepaper
Actian DataFlow WhitepaperActian DataFlow Whitepaper
Actian DataFlow Whitepaper
 
An Introduction to the World of Hadoop
An Introduction to the World of HadoopAn Introduction to the World of Hadoop
An Introduction to the World of Hadoop
 

Andere mochten auch

Optimizing z/OS Batch
Optimizing z/OS BatchOptimizing z/OS Batch
Optimizing z/OS BatchMartin Packer
 
Batch job processing
Batch job processingBatch job processing
Batch job processingSon Nguyen
 
Munich 2016 - Z011601 Martin Packer - Parallel Sysplex Performance Topics topics
Munich 2016 - Z011601 Martin Packer - Parallel Sysplex Performance Topics topicsMunich 2016 - Z011601 Martin Packer - Parallel Sysplex Performance Topics topics
Munich 2016 - Z011601 Martin Packer - Parallel Sysplex Performance Topics topicsMartin Packer
 
Munich 2016 - Z011597 Martin Packer - How To Be A Better Performance Specialist
Munich 2016 - Z011597 Martin Packer - How To Be A Better Performance SpecialistMunich 2016 - Z011597 Martin Packer - How To Be A Better Performance Specialist
Munich 2016 - Z011597 Martin Packer - How To Be A Better Performance SpecialistMartin Packer
 
Create schedule monitor batch jobs with dynamic selection
Create schedule monitor batch jobs with dynamic selectionCreate schedule monitor batch jobs with dynamic selection
Create schedule monitor batch jobs with dynamic selectionRanjan Amit
 
Coupling Facility CPU
Coupling Facility CPUCoupling Facility CPU
Coupling Facility CPUMartin Packer
 
Work Instructions - Batch Job Monitoring team
Work Instructions - Batch Job Monitoring teamWork Instructions - Batch Job Monitoring team
Work Instructions - Batch Job Monitoring teamkuldeepkrsingh22
 
Methods of Production : Job, Batch & Mass Productiion
Methods of Production : Job, Batch & Mass ProductiionMethods of Production : Job, Batch & Mass Productiion
Methods of Production : Job, Batch & Mass ProductiionHarinadh Karimikonda
 
Mainframe JCL Part - 1
Mainframe JCL Part - 1Mainframe JCL Part - 1
Mainframe JCL Part - 1janaki ram
 
How to build the perfect pattern library
How to build the perfect pattern libraryHow to build the perfect pattern library
How to build the perfect pattern libraryWolf Brüning
 

Andere mochten auch (13)

Optimizing z/OS Batch
Optimizing z/OS BatchOptimizing z/OS Batch
Optimizing z/OS Batch
 
Batch Jobs: Beyond the Basics
Batch Jobs: Beyond the BasicsBatch Jobs: Beyond the Basics
Batch Jobs: Beyond the Basics
 
Batch job processing
Batch job processingBatch job processing
Batch job processing
 
Munich 2016 - Z011601 Martin Packer - Parallel Sysplex Performance Topics topics
Munich 2016 - Z011601 Martin Packer - Parallel Sysplex Performance Topics topicsMunich 2016 - Z011601 Martin Packer - Parallel Sysplex Performance Topics topics
Munich 2016 - Z011601 Martin Packer - Parallel Sysplex Performance Topics topics
 
Munich 2016 - Z011597 Martin Packer - How To Be A Better Performance Specialist
Munich 2016 - Z011597 Martin Packer - How To Be A Better Performance SpecialistMunich 2016 - Z011597 Martin Packer - How To Be A Better Performance Specialist
Munich 2016 - Z011597 Martin Packer - How To Be A Better Performance Specialist
 
Create schedule monitor batch jobs with dynamic selection
Create schedule monitor batch jobs with dynamic selectionCreate schedule monitor batch jobs with dynamic selection
Create schedule monitor batch jobs with dynamic selection
 
Coupling Facility CPU
Coupling Facility CPUCoupling Facility CPU
Coupling Facility CPU
 
Batch job schedule
Batch job scheduleBatch job schedule
Batch job schedule
 
Work Instructions - Batch Job Monitoring team
Work Instructions - Batch Job Monitoring teamWork Instructions - Batch Job Monitoring team
Work Instructions - Batch Job Monitoring team
 
JOB BATCH COSTING
JOB BATCH COSTINGJOB BATCH COSTING
JOB BATCH COSTING
 
Methods of Production : Job, Batch & Mass Productiion
Methods of Production : Job, Batch & Mass ProductiionMethods of Production : Job, Batch & Mass Productiion
Methods of Production : Job, Batch & Mass Productiion
 
Mainframe JCL Part - 1
Mainframe JCL Part - 1Mainframe JCL Part - 1
Mainframe JCL Part - 1
 
How to build the perfect pattern library
How to build the perfect pattern libraryHow to build the perfect pattern library
How to build the perfect pattern library
 

Ähnlich wie Cloud batch a batch job queuing system on clouds with hadoop and h-base

DESIGN ARCHITECTURE-BASED ON WEB SERVER AND APPLICATION CLUSTER IN CLOUD ENVI...
DESIGN ARCHITECTURE-BASED ON WEB SERVER AND APPLICATION CLUSTER IN CLOUD ENVI...DESIGN ARCHITECTURE-BASED ON WEB SERVER AND APPLICATION CLUSTER IN CLOUD ENVI...
DESIGN ARCHITECTURE-BASED ON WEB SERVER AND APPLICATION CLUSTER IN CLOUD ENVI...cscpconf
 
Design Issues and Challenges of Peer-to-Peer Video on Demand System
Design Issues and Challenges of Peer-to-Peer Video on Demand System Design Issues and Challenges of Peer-to-Peer Video on Demand System
Design Issues and Challenges of Peer-to-Peer Video on Demand System cscpconf
 
Hybrid Data Warehouse Hadoop Implementations
Hybrid Data Warehouse Hadoop ImplementationsHybrid Data Warehouse Hadoop Implementations
Hybrid Data Warehouse Hadoop ImplementationsDavid Portnoy
 
project report on hadoop
project report on hadoopproject report on hadoop
project report on hadoopManoj Jangalva
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component rebeccatho
 
A data aware caching 2415
A data aware caching 2415A data aware caching 2415
A data aware caching 2415SANTOSH WAYAL
 
Introduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to HadoopIntroduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to HadoopGERARDO BARBERENA
 
Taylor bosc2010
Taylor bosc2010Taylor bosc2010
Taylor bosc2010BOSC 2010
 
Hadoop online training by certified trainer
Hadoop online training by certified trainerHadoop online training by certified trainer
Hadoop online training by certified trainersriram0233
 
Introduction to Apache Hadoop
Introduction to Apache HadoopIntroduction to Apache Hadoop
Introduction to Apache HadoopChristopher Pezza
 

Ähnlich wie Cloud batch a batch job queuing system on clouds with hadoop and h-base (20)

DESIGN ARCHITECTURE-BASED ON WEB SERVER AND APPLICATION CLUSTER IN CLOUD ENVI...
DESIGN ARCHITECTURE-BASED ON WEB SERVER AND APPLICATION CLUSTER IN CLOUD ENVI...DESIGN ARCHITECTURE-BASED ON WEB SERVER AND APPLICATION CLUSTER IN CLOUD ENVI...
DESIGN ARCHITECTURE-BASED ON WEB SERVER AND APPLICATION CLUSTER IN CLOUD ENVI...
 
Design Issues and Challenges of Peer-to-Peer Video on Demand System
Design Issues and Challenges of Peer-to-Peer Video on Demand System Design Issues and Challenges of Peer-to-Peer Video on Demand System
Design Issues and Challenges of Peer-to-Peer Video on Demand System
 
B04 06 0918
B04 06 0918B04 06 0918
B04 06 0918
 
Hybrid Data Warehouse Hadoop Implementations
Hybrid Data Warehouse Hadoop ImplementationsHybrid Data Warehouse Hadoop Implementations
Hybrid Data Warehouse Hadoop Implementations
 
2.1-HADOOP.pdf
2.1-HADOOP.pdf2.1-HADOOP.pdf
2.1-HADOOP.pdf
 
project report on hadoop
project report on hadoopproject report on hadoop
project report on hadoop
 
G017143640
G017143640G017143640
G017143640
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component
 
A data aware caching 2415
A data aware caching 2415A data aware caching 2415
A data aware caching 2415
 
Introduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to HadoopIntroduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to Hadoop
 
Analytics 3
Analytics 3Analytics 3
Analytics 3
 
Taylor bosc2010
Taylor bosc2010Taylor bosc2010
Taylor bosc2010
 
Hadoop online training
Hadoop online trainingHadoop online training
Hadoop online training
 
Hadoop online training by certified trainer
Hadoop online training by certified trainerHadoop online training by certified trainer
Hadoop online training by certified trainer
 
Cppt Hadoop
Cppt HadoopCppt Hadoop
Cppt Hadoop
 
Cppt
CpptCppt
Cppt
 
Cppt
CpptCppt
Cppt
 
Bigdata and Hadoop Introduction
Bigdata and Hadoop IntroductionBigdata and Hadoop Introduction
Bigdata and Hadoop Introduction
 
HDFS
HDFSHDFS
HDFS
 
Introduction to Apache Hadoop
Introduction to Apache HadoopIntroduction to Apache Hadoop
Introduction to Apache Hadoop
 

Mehr von João Gabriel Lima

Deep marketing - Indoor Customer Segmentation
Deep marketing - Indoor Customer SegmentationDeep marketing - Indoor Customer Segmentation
Deep marketing - Indoor Customer SegmentationJoão Gabriel Lima
 
Aplicações de Alto Desempenho com JHipster Full Stack
Aplicações de Alto Desempenho com JHipster Full StackAplicações de Alto Desempenho com JHipster Full Stack
Aplicações de Alto Desempenho com JHipster Full StackJoão Gabriel Lima
 
Realidade aumentada com react native e ARKit
Realidade aumentada com react native e ARKitRealidade aumentada com react native e ARKit
Realidade aumentada com react native e ARKitJoão Gabriel Lima
 
Big data e Inteligência Artificial
Big data e Inteligência ArtificialBig data e Inteligência Artificial
Big data e Inteligência ArtificialJoão Gabriel Lima
 
Mineração de Dados no Weka - Regressão Linear
Mineração de Dados no Weka -  Regressão LinearMineração de Dados no Weka -  Regressão Linear
Mineração de Dados no Weka - Regressão LinearJoão Gabriel Lima
 
Segurança na Internet - Estudos de caso
Segurança na Internet - Estudos de casoSegurança na Internet - Estudos de caso
Segurança na Internet - Estudos de casoJoão Gabriel Lima
 
Segurança na Internet - Google Hacking
Segurança na Internet - Google  HackingSegurança na Internet - Google  Hacking
Segurança na Internet - Google HackingJoão Gabriel Lima
 
Segurança na Internet - Conceitos fundamentais
Segurança na Internet - Conceitos fundamentaisSegurança na Internet - Conceitos fundamentais
Segurança na Internet - Conceitos fundamentaisJoão Gabriel Lima
 
Mineração de Dados com RapidMiner - Um Estudo de caso sobre o Churn Rate em...
Mineração de Dados com RapidMiner - Um Estudo de caso sobre o Churn Rate em...Mineração de Dados com RapidMiner - Um Estudo de caso sobre o Churn Rate em...
Mineração de Dados com RapidMiner - Um Estudo de caso sobre o Churn Rate em...João Gabriel Lima
 
Mineração de dados com RapidMiner + WEKA - Clusterização
Mineração de dados com RapidMiner + WEKA - ClusterizaçãoMineração de dados com RapidMiner + WEKA - Clusterização
Mineração de dados com RapidMiner + WEKA - ClusterizaçãoJoão Gabriel Lima
 
Mineração de dados na prática com RapidMiner e Weka
Mineração de dados na prática com RapidMiner e WekaMineração de dados na prática com RapidMiner e Weka
Mineração de dados na prática com RapidMiner e WekaJoão Gabriel Lima
 
Visualizacao de dados - Come to the dark side
Visualizacao de dados - Come to the dark sideVisualizacao de dados - Come to the dark side
Visualizacao de dados - Come to the dark sideJoão Gabriel Lima
 
REST x SOAP : Qual abordagem escolher?
REST x SOAP : Qual abordagem escolher?REST x SOAP : Qual abordagem escolher?
REST x SOAP : Qual abordagem escolher?João Gabriel Lima
 
Game of data - Predição e Análise da série Game Of Thrones a partir do uso de...
Game of data - Predição e Análise da série Game Of Thrones a partir do uso de...Game of data - Predição e Análise da série Game Of Thrones a partir do uso de...
Game of data - Predição e Análise da série Game Of Thrones a partir do uso de...João Gabriel Lima
 
E-trânsito cidadão - IPVA em suas mãos
E-trânsito cidadão - IPVA em suas mãosE-trânsito cidadão - IPVA em suas mãos
E-trânsito cidadão - IPVA em suas mãosJoão Gabriel Lima
 
[Estácio - IESAM] Automatizando Tarefas com Gulp.js
[Estácio - IESAM] Automatizando Tarefas com Gulp.js[Estácio - IESAM] Automatizando Tarefas com Gulp.js
[Estácio - IESAM] Automatizando Tarefas com Gulp.jsJoão Gabriel Lima
 
Hackeando a Internet das Coisas com Javascript
Hackeando a Internet das Coisas com JavascriptHackeando a Internet das Coisas com Javascript
Hackeando a Internet das Coisas com JavascriptJoão Gabriel Lima
 

Mehr von João Gabriel Lima (20)

Cooking with data
Cooking with dataCooking with data
Cooking with data
 
Deep marketing - Indoor Customer Segmentation
Deep marketing - Indoor Customer SegmentationDeep marketing - Indoor Customer Segmentation
Deep marketing - Indoor Customer Segmentation
 
Aplicações de Alto Desempenho com JHipster Full Stack
Aplicações de Alto Desempenho com JHipster Full StackAplicações de Alto Desempenho com JHipster Full Stack
Aplicações de Alto Desempenho com JHipster Full Stack
 
Realidade aumentada com react native e ARKit
Realidade aumentada com react native e ARKitRealidade aumentada com react native e ARKit
Realidade aumentada com react native e ARKit
 
JS - IA
JS - IAJS - IA
JS - IA
 
Big data e Inteligência Artificial
Big data e Inteligência ArtificialBig data e Inteligência Artificial
Big data e Inteligência Artificial
 
Mineração de Dados no Weka - Regressão Linear
Mineração de Dados no Weka -  Regressão LinearMineração de Dados no Weka -  Regressão Linear
Mineração de Dados no Weka - Regressão Linear
 
Segurança na Internet - Estudos de caso
Segurança na Internet - Estudos de casoSegurança na Internet - Estudos de caso
Segurança na Internet - Estudos de caso
 
Segurança na Internet - Google Hacking
Segurança na Internet - Google  HackingSegurança na Internet - Google  Hacking
Segurança na Internet - Google Hacking
 
Segurança na Internet - Conceitos fundamentais
Segurança na Internet - Conceitos fundamentaisSegurança na Internet - Conceitos fundamentais
Segurança na Internet - Conceitos fundamentais
 
Web Machine Learning
Web Machine LearningWeb Machine Learning
Web Machine Learning
 
Mineração de Dados com RapidMiner - Um Estudo de caso sobre o Churn Rate em...
Mineração de Dados com RapidMiner - Um Estudo de caso sobre o Churn Rate em...Mineração de Dados com RapidMiner - Um Estudo de caso sobre o Churn Rate em...
Mineração de Dados com RapidMiner - Um Estudo de caso sobre o Churn Rate em...
 
Mineração de dados com RapidMiner + WEKA - Clusterização
Mineração de dados com RapidMiner + WEKA - ClusterizaçãoMineração de dados com RapidMiner + WEKA - Clusterização
Mineração de dados com RapidMiner + WEKA - Clusterização
 
Mineração de dados na prática com RapidMiner e Weka
Mineração de dados na prática com RapidMiner e WekaMineração de dados na prática com RapidMiner e Weka
Mineração de dados na prática com RapidMiner e Weka
 
Visualizacao de dados - Come to the dark side
Visualizacao de dados - Come to the dark sideVisualizacao de dados - Come to the dark side
Visualizacao de dados - Come to the dark side
 
REST x SOAP : Qual abordagem escolher?
REST x SOAP : Qual abordagem escolher?REST x SOAP : Qual abordagem escolher?
REST x SOAP : Qual abordagem escolher?
 
Game of data - Predição e Análise da série Game Of Thrones a partir do uso de...
Game of data - Predição e Análise da série Game Of Thrones a partir do uso de...Game of data - Predição e Análise da série Game Of Thrones a partir do uso de...
Game of data - Predição e Análise da série Game Of Thrones a partir do uso de...
 
E-trânsito cidadão - IPVA em suas mãos
E-trânsito cidadão - IPVA em suas mãosE-trânsito cidadão - IPVA em suas mãos
E-trânsito cidadão - IPVA em suas mãos
 
[Estácio - IESAM] Automatizando Tarefas com Gulp.js
[Estácio - IESAM] Automatizando Tarefas com Gulp.js[Estácio - IESAM] Automatizando Tarefas com Gulp.js
[Estácio - IESAM] Automatizando Tarefas com Gulp.js
 
Hackeando a Internet das Coisas com Javascript
Hackeando a Internet das Coisas com JavascriptHackeando a Internet das Coisas com Javascript
Hackeando a Internet das Coisas com Javascript
 

Kürzlich hochgeladen

Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 

Kürzlich hochgeladen (20)

Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 

Cloud batch a batch job queuing system on clouds with hadoop and h-base

  • 1. 2nd IEEE International Conference on Cloud Computing Technology and Science CloudBATCH: A Batch Job Queuing System on Clouds with Hadoop and HBase Chen Zhang Hans De Sterck David R. Cheriton School of Computer Science Department of Applied Mathematics University of Waterloo University of Waterloo Waterloo, Canada Waterloo, Canada Email: c15zhang@cs.uwaterloo.ca Email: hdesterck@math.uwaterloo.ca Abstract—As MapReduce becomes more and more popu- programs and paradigms. lar in data processing applications, the demand for Hadoop Going for public cloud services such as Amazon EC2 and clusters grows. However, Hadoop is incompatible with existing S3 is not suitable for all Hadoop needs either. Specifically, cluster batch job queuing systems and requires a dedicated cluster under its full control. Hadoop also lacks support for it could be too expensive or infeasible to transfer and store user access control, accounting, fine-grain performance moni- large amounts of data with Amazon. For example, a live toring and legacy batch job processing facilities comparable to cell image processing application described in [7] generates existing cluster job queuing systems, making dedicated Hadoop more than 300GB of data daily. Assuming 3MB/s WAN clusters less amenable for administrators and normal users bandwidth, it would take about 27 hours to transfer all the alike with hybrid computing needs involving both MapReduce and legacy applications. As a result, getting a properly suited data into Amazon at the cost of $301 , and another $45 per and sized Hadoop cluster has not been easy in organizations day for storage if using Amazon S3. Worse still, some of with existing clusters. This paper presents CloudBATCH, a these big data sets might only need to be processed once prototype solution to this problem enabling Hadoop to function and then discarded or archived, making the expensive data as a traditional batch job queuing system with enhanced func- transfer even less cost-effective. It could be more suitable for tionality for cluster resource management. With CloudBATCH, a complete shift to Hadoop for managing an entire cluster to these types of data processing tasks to be completed within cater for hybrid computing needs becomes feasible. the institutions where the data are generated. Efforts are made to make Hadoop and traditional queuing systems coexist and work together. But these efforts have I. I NTRODUCTION limitations themselves. The Hadoop project itself has tried MapReduce [1] is becoming increasingly popular and to work in this direction and provides a Hadoop On Demand used to solve a wide spectrum of problems ranging from (HOD) package in support of provisioning and managing bioinformatics to astronomy. The increase of MapReduce independent Hadoop MapReduce and HDFS instances on usage requires a growing number of Hadoop [2] clusters. a shared cluster of nodes managed by a resource manager However, because Hadoop is incompatible with existing called “Torque”. It is relatively easy to make HOD work with cluster batch job queuing systems and requires a dedicated other resource managers such as Condor, SGE (Sun Grid cluster under its full control, it is not easy to get a Hadoop Engine), and Moab. However, an important limitation [4] for cluster within organizations with existing cluster resources HOD is that HOD cannot effectively move computation near due to administrative difficulties. It is also difficult to man- data to exploit data locality because of its design choice of age Hadoop clusters due to Hadoop’s lack of functionality creating small sub-clusters for processing individual users’ for job management, user access control, accounting, fine- tasks in a subset of compute nodes where data may not grain performance monitoring, etc. As a result, Hadoop clus- reside. Other efforts have been made to develop new queuing ters may suffer from usage contention/monopoly if shared by systems to work with Hadoop. Oracle attempted to provide multiple user groups. Additionally, even for institutions that a solution and recently made a new release of the Oracle can afford several dedicated clusters for various usages, pro- Grid Engine, SGE 6.2u5, claiming to be the first cluster visioning resources for Hadoop clusters is still tricky since resource management system with Hadoop integration so the resources, once dedicated to Hadoop, cannot be used for that users can submit Hadoop applications to an SGE cluster the load balancing of legacy batch job submissions to other just like they would with any other parallel jobs. SGE overloaded traditional clusters even if they are idle. This will take care of setting up the Hadoop MapReduce on is because Hadoop is designed for running a certain type demand. Apart from the apparent limitation of requiring of applications, mainly back-end server computations such users to use/switch to SGE, the SGE Hadoop integration also as page rank calculations, which are normally MapReduce suffers from the data locality problem, although less severe programs for large-scale data processing tasks rather than 1 Inbound general computing problems with diverse types of legacy data transfer to Amazon is free till November 2010 978-0-7695-4302-4/10 $26.00 © 2010 IEEE 368 DOI 10.1109/CloudCom.2010.22
  • 2. than HOD, as well as problems concerning non-exclusive created for various purposes, such as the Capacity Scheduler, host access (See Section II). Additionally, both HOD and the Fair Scheduler [5], and the Dynamic Priority (DP) SGE with Hadoop integration suffer from wasting resources parallel task scheduler [4]. during the reduce phase, which is an intrinsic drawback However, Hadoop schedulers cannot make Hadoop more for dynamically creating a private Hadoop cluster per user favorable towards functioning as a cluster batch queuing MapReduce application request, and there exists no solution system. They are task level schedulers that only concern for this problem yet. stateless decisions for job executions and do not provide To address the above mentioned problems, this paper rich functionality for cluster level management, such as user attempts to provide a solution where Hadoop can function as access control and accounting, and advanced job and job a traditional batch job queuing system with enhanced man- queue management with easily customizable priority levels agement functionalities in addition to its original resource and reservation support. management capability. The goal is to develop a system that enables cluster administrators to make a complete shift to Hadoop for managing an entire cluster for hybrid computing B. Hadoop on Demand needs. Our system is named “CloudBATCH”. The general idea about how CloudBATCH works is as follows: The As the Hadoop project’s effort to alleviate its incompati- system includes a set of HBase [3] tables that are used bility with traditional queuing systems, Hadoop on Demand for storing resource management information, such as user (HOD) is added to Hadoop for dynamically creating and credentials and queue and job information, which provide using Hadoop clusters through existing queuing systems. the basis for rich customized management functionalities. The idea is to make use of the existing cluster queuing In addition, based on the HBase tables, our system provides system to schedule jobs that each run a Hadoop daemon a queuing system that allows “Job Brokers” to submit on a compute node. These running daemons together create “Wrappers” (Map-only MapReduce programs) as agents to an on-demand Hadoop cluster. After the Hadoop cluster execute jobs in Hadoop. Users can submit jobs and check is set up, user submitted MapReduce applications can be job status by using a “Client” at any time. executed. A simple walkthrough of the process for creating The remainder of the paper is structured as follows: a Hadoop cluster and executing user submitted MapReduce in Section II we introduce related work about Hadoop jobs through HOD is as follows: schedulers, Hadoop on Demand, and SGE with Hadoop 1) User requests from cluster resource management sys- integration. In Section III we briefly describe some desired tem a number of nodes on reserve and submits a job properties of a batch job queuing system for clusters. Section called RingMaster to be started on one of the reserved IV will describe the design and implementation of Cloud- nodes. BATCH and how CloudBATCH achieves the properties 2) Nodes are reserved and RingMaster is started on one listed in Section III. In Section V we give a performance of the reserved nodes. evaluation. Section VI concludes. 3) RingMaster starts one process called HodRing on each II. R ELATED W ORK of the reserved nodes. 4) HodRing brings up on-demand Hadoop daemons (na- In this section, we review existing solutions for Hadoop menode, jobtracker, tasktracker, etc.) according to the resource management and its integration with existing clus- specifications maintained by RingMaster in configura- ter batch job queuing systems, and examine their limitations. tion files. To our knowledge, none of the existing solutions can enable 5) MapReduce jobs that were submitted by the user are Hadoop with a rich set of traditional cluster management executed on the on-demand cluster. functionalities comparable to what CloudBATCH provides. 6) The Hadoop cluster gets torn down and resources are A. Hadoop Schedulers released. Hadoop allows customized schedulers to be developed As seen from the above walkthrough of an HOD process, as pluggable components. The motivation for supporting data locality of the external HDFS is not exploited because various schedulers, besides an obvious objective of achieving the reservation and allocation of cluster nodes takes no better system utilization, originates from the user demands account of where the data to be processed are stored, to share a Hadoop cluster allocated into different queues for violating an important design principle and reducing the various types of jobs (production jobs, interactive jobs, etc.) advantage of the MapReduce framework. Another problem among multiple users. The default Hadoop scheduler is a with HOD is that the HodRing processes are started through simple FIFO queue for the entire Hadoop cluster which is ssh by RingMaster, and the cluster resource management insufficient to provide guaranteed capacity with fair resource system is unable to track resource usage and to perform allocations for mixed usage needs. Several schedulers are thorough cleanup when the cluster is torn down. 369
  • 3. C. SGE with Hadoop Integration scheduling tasktrackers. However, a compute node could get Recently, Oracle released SGE 6.2u5 with Hadoop in- overloaded because multiple tasktrackers may be started on tegration, enabling SGE to work with Hadoop without the same node for data locality concerns to execute tasks. requiring a separate dedicated Hadoop cluster. The core In this case, an unpredictable number of Hadoop speculative design idea is similar to HOD in that it also tries to start tasks [5] may be started on the other idling nodes where data an on-demand Hadoop cluster by running Hadoop daemons do not reside, incurring extra overhead in data staging and through the cluster resource management system on a set of waste of resources for executing the otherwise unnecessary reserved compute nodes. The difference with HOD is that duplicated tasks. The unbalanced mingled execution of nor- SGE exerts better concern for data locality when scheduling mal and speculative tasks may further mix up Hadoop sched- tasktrackers and supports better resource usage monitoring ulers built in with the dynamically created Hadoop cluster and cleanup because tasktrackers are directly started by SGE which are unaware of the higher level scheduling decisions as opposed to be started by HodRing in HOD. The main made by SGE, harming performance in unpredictable ways. problem, apart from locking users down to using SGE, lies Even if the default Hadoop speculative task functionality in its mechanism of exploiting data locality and the non- is turned off, the potential danger of overloading a node exclusive usage of compute nodes. persists, which could further contribute to unbalanced exe- Figure 1 illustrates how SGE with Hadoop integration cutions of Map tasks that result in wasting cluster resources works: as explained below. SGE also has an exclusive host access facility. But if this facility is required, then the data locality 1) A process called Job Submission Verifier (JSV) talks exploitation mechanism would be much less useful because to the namenode of an external HDFS to obtain a data normally a data node would potentially host data blocks locality mapping from the user submitted MapReduce needed by several user applications while only one of them program’s HDFS paths to data node locations in can benefit from data locality in the case with exclusive blocks and racks. Note that the “Load Sensors” are host access. Additionally, the SGE installation file is about responsible for reporting on the block and rack data for 700MB, quite large compared to Hadoop which is only about each execution host where an SGE Execution Daemon 70MB in a tar ball and thus less costly to distribute. runs. 2) The SGE scheduler starts the Hadoop Parallel Environ- Worse still, both HOD and SGE suffer from the same ma- ment (PE) with a jobtracker and a set of tasktrackers as jor problem intrinsic to the idea of creating a Hadoop cluster near the data nodes containing user application input on-the-fly for each user MapReduce application request. data as possible. The problem is a possible significant waste of resources in 3) MapReduce applications that were submitted by the the Reduce phase, where nodes might be idling when the user are executed. Because several tasktrackers might number of Reduce tasks to be executed is much smaller than have been started on the same physical node, physical the number of Map tasks that are executed. This is because nodes could be overloaded when user applications start each of the on-demand clusters is privately tailored to a to be executed. single user MapReduce application submission and the size 4) The Hadoop cluster gets torn down and resources are of the Hadoop cluster is fixed at node reservation time. If released. the user MapReduce application requires far more Map tasks than Reduce tasks and the number of nodes are reserved matching the Map tasks (which is usually what users would request), many of the machines in the on-demand Hadoop cluster will be idling when the much smaller number of Reduce tasks are running at the end. The waste of resources could also occur in the case of having unbalanced executions of Map tasks, in which case a portion of Map tasks get finished ahead of time and wait for the others to finish before being able to enter the Reduce phase. There exists no solution for this problem yet. III. D ESIRED P ROPERTIES Figure 1. SGE Hadoop Integration (taken from online blog post by Oracle). In this section, we describe some properties that Hadoop lacks good support of, yet are essnetial for batch job queu- As seen above, the major advantage about SGE compared ing systems. CloudBATCH supports the majority of these to HOD is the utilization of data locality information for properties. 370
  • 4. A. Job Queue Management Monitor Job queue management is critical for managing shared Check status usage of a cluster for various purposes among multiple users. Serial Job Hbase Tables For example, there could be a queue for batch serial jobs and User Client Reservation Reserved Job Table User Table Queue Table a queue for MPI applications. Cluster nodes can be assigned Job MapReduce Job Table to queues with a certain minimum and maximum quantity poll poll and capacity guarantee for optimized resource utilization. Job Job Broker i Broker j Additionally, each queue should optionally have specific Submit Wrapper policies concerning the limit of job running time, queue Execute Job Submit Wrapper job priority, etc., to avoid the need to specify these policies Execute Job Wrapper Wrapper n Job on a per job basis. Furthermore, user access control for Wrapper y JobA submitting to queues should also be in place. x Pool of MapReduce Worker Nodes B. Job Scheduling and Reservation Figure 2. CloudBATCH architecture overview. Jobs should be put into a certain queue with queue- specific properties, such as priority and execution time limit. Jobs with higher priorities should be scheduled first and may number of Job Brokers are polling for the HBase tables to require preemption based on priority. Reservation for pre- find jobs ready for execution. If a job is ready, a Wrapper scheduled job submission and customized job submission containing the job is submitted to Hadoop MapReduce by policies may be supported such as putting a threshold on the the Job Broker. The Wrapper is responsible for monitoring number of simultaneous job submissions allowed for each the execution status of the job and updating relevant records user. In addition, job status and history shall be kept on a per concerning job and resource information in HBase tables. user basis together with descriptions for user management, The Wrapper is also responsible for following some job statistical study and data provenance purposes. Resource policies such as on execution time limit and may terminate status organized by queues shall also be maintained con- a running job if it takes too long over the limit. Monitors are cerning activeness and workload. Both the status of jobs currently responsible for detecting and handling failures that and resources shall be able to be monitored in real-time or may occur after Wrappers are submitted to Hadoop MapRe- offline for analysis. duce but before they can start to execute the assigned jobs, C. User Access Control and Accounting which may result in jobs being marked as “Status:queued” but never getting executed. User access control shall be supported at least at the queue level, which requires user credentials to be kept and The key part of CloudBATCH is a set of HBase tables that validated when a job submission or reservation is made. store Meta-data information for resources. The main tables Stateful job execution status and history along with user are Queue Table (Table I), Job Table (shown in three parts information shall be kept to support accounting. User groups in Tables II, III, and IV), Scheduled Time Table (Table V), may also be supported. and User Table (Table VI)2 . In the following section, we discuss to which degree The Queue Table (Table I) stores information about CloudBATCH succeeds in meeting these desired properties. queues such as type of jobs (e.g., a queue for MapReduce jobs or serial jobs), queue capacity measured in the max- IV. C LOUD BATCH imum number of simultaneous active jobs allowed, queue A. Architecture Overview job priority (numerical values with increasing priority level from 1-10), execution time limit in milliseconds associated CloudBATCH is designed as a fully distributed system with the queue, queue domain (private for some users or on top of Hadoop and HBase. It uses a set of HBase public for all), and list of users or groups (access control list) tables globally accessible across all the nodes to manage allowed to access the queue. Note that Table I is a sparse Meta-data for jobs and resources, and runs jobs through table with rows of varying lengths. In our example settings, Hadoop MapReduce for transparent data and computation queue “serial” is under the “public” domain and does not distribution. specify a list of users/groups who are allowed to access the Figure 2 shows the architecture overview of the Cloud- queue, while queue “bioinformatics” is under the “private” BATCH system. CloudBATCH has several distributed com- domain and only allows the listed users to submit jobs to it. ponents, Clients, Job Brokers, Wrappers and Monitors. The There is no priority associated with queue “bioinformatics” general sequence of job submission and execution using meaning that either the default priority of “1” or the priority CloudBATCH is as follows: Users use Clients to submit specified by the user at job submission will be used. jobs to the system. Job information with proper job status is put into HBase tables by the Client. In the meantime, a 2 HBase column families are omitted for simplicity for all tables. 371
  • 5. The Job Table (Tables II, III, IV) is a large table con- specific policies or restrictions such as the maximum number taining extensive information about jobs. Here we show of simultaneous jobs allowed to be running in queues at any the essential columns only and is split into three tables for time. The “IndividualJobPriorityLimit” regulates the highest better display. Jobs are identified by unique IDs, submitted job priority value a user can submit a job with. This ensures to queues, and associated with the submitting users. Cloud- that only authorized users can submit high priority jobs that BATCH currently accepts three types of jobs, namely, serial, may preempt lower priority jobs if desired (future work). MapReduce and Scheduled Time (i.e., serial or MapReduce In the following subsections, we introduce the key Cloud- jobs scheduled with an earliest start time). Each job has an BATCH components one by one and discuss how well associated priority (either inherited from the queue it is sub- CloudBATCH satisfies the desired properties mentioned in mitted to or specified explicitly through the Client), execu- Section III as well as several other design issues. tion length limit, SubmitTime (the time at the HBase server B. Client when the Client first puts in the job record), QueuedTime (the time a Wrapper is submitted for this job), StartTime (the The Client provides user interaction functionality for user time the Wrapper starts executing the job), EndTime (the access control and job submission to proper queues with the time the Wrapper finishes job execution and status update), capability of staging Non-NFS files into Hadoop HDFS. The and Status (submitted, queued, running, failed, succeeded). Client normally runs on the job submitting machine. Figure Each job should contain an executable command. In the case 3 shows the process of handling job submission. Job submission command when the Network File System (NFS) is used (which is the common case in clusters among compute nodes), job Get User Name from commandline commands should contain the necessary file paths needed by the job. When NFS is not used, job commands should Query User Specific user Table for policies contain input file paths relative to the execution directory Default user on the compute node and in the meantime specify the Apply User corresponding absolute file paths on the machine where policies the files are stored (normally in the machine from which User non-exist User not allowed for Get Queue name from the intended queue job submissions are made) using the –NonNFSFiles option commandline and query Queue Table for policies through the Client so that those files will be staged into Default queue HDFS by the Client and later retrieved by the Wrapper and terminate terminate Apply Queue put into the correct directory structure relative to the com- Policies mand execution directory on the cloud node where the job Non-NFS files Stage files is executed (in this case, there is a single globally accessible HDFS system and each cloud node also has its own local file Scheduled Time Job Update Scheduled Time system). Other supporting non-NFS input files can also be Table staged in this way. Output result files are however not staged Update Job Table automatically in the current CloudBATCH prototype. Users should include a file staging logic if necessary at the end of Figure 3. The flow of operations in the CloudBATCH Client. their programs to stage result files explicitly to the desired destinations. Please pay attention to the way how some of the After a job is submitted through the commandline inter- columns are named and data are stored. For example, instead face, the Client parses the command for user and queue of having a single “Status” column, we use columns named names (if unspecified, the default user and queue will be by the corresponding status such as “Status:queued”. This used). The User Table will be checked for access control is designed for doing fast queries such as “all the jobs that purposes, and user policies such as the maximum job priority are queued” by taking advantage of some methods provided value allowed will be applied to the submitted job. The by HBase that can rapidly scan for all the rows for columns Queue Table will also be checked to see if the user is allowed with non-empty values. to submit jobs to the specific queue and to apply queue The Scheduled Time Table (Table V) is used to record job policies to the job as well. If the policy from the Queue Table requests with a specified earliest start time. Scheduled job conflicts with the one from the User Table, an exception information is first entered into the Scheduled Time table will be raised and the job should be submitted again with and then put into the Job table with “Status:submitted”. The corrected information. After these checks, the Client finally value of “ScheduledTime” in the Scheduled Time Table is gets a unique job ID by the mechanism of global sequence used to set the “SubmitTime” in the Job Table. When a Job number assignment mentioned in [6] and puts the job infor- Broker sees a scheduled job, it will not process it until the mation into the Job Table with “Status:submitted”. If the job “SubmitTime” has been reached. uses non-NFS files, there is currently some limited support The User Table (Table VI) is used to record some user- from the Client to let users specify colon-separated pairs of 372
  • 6. Table I Q UEUE TABLE QueueID QueueType Capacity JobPriority JobLengthLimit Domain UserAlice UserBob Groupdefault serial serial 50 1 1200000 public Y bioinformatics MapReduce 10 600000 private Y Table II J OB TABLE , PART 1 JobID QueueID: QueueID: JobType:serial JobType:MapReduce JobType:ScheduledTime Priority:1 Priority:5 serial bioinformatics 23 Y Y Y 24 Y Y Y 25 Y Y Y Y Table III J OB TABLE , PART 2 JobID StartTime EndTime SubmitTime QueuedTime Job Status: Status: Status: Status: Status: Length submitted queued running failed succeeded Limit 23 t1 1200000 Y 24 t4 t2 t3 600000 Y 25 t5 1200000 Y Table IV J OB TABLE , PART 3 JobID UserAlice UserBob Groupdefault Groupbio Command JobDescription AbsolutePath1 AbsolutePath2 23 Y Y Command1 mpeg encoder RelativeFileDir1 24 Y Y Command2 bio sequence RelativeFileDir2 25 Y Y Command3 space weather absolute local file paths on the client machine and relative locations. remote file paths on the Hadoop execution node (used in the job command) through the –NonNFSFiles option. The Client C. Job Broker creates a column in the Job table (Table IV) named after the The Job Broker is responsible for polling the Job Table absolute local file path with column value the remote file for lists of user submitted jobs that are ready for execution. path relative to the job command execution directory. The For every job on the lists, the Job Broker submits to absolute file path, concatenated with a HDFS path prefix the Hadoop MapReduce framework a “Wrapper” for job containing the job ID, will be used as the temporary file path execution. Figure 4 shows the tasks a Job Broker should in HDFS where the Client will transfer the user-specified perform periodically when polling the Job Table. local files to. For example, for job ID X with command A Job Broker queries the Job Table periodically to get “ls images/medical.jpg”, the user can specify a file pair a list of jobs with “Status:submitted” according to job “/home/user/images/medical.jpg:images/medical.jpg”. Then priorities from high to low. For each of the jobs in the list, the local image file will be transferred to HDFS un- the Job Broker checks whether the job is a scheduled job that der “/HDFS/tmp/jobX/home/user/images/medical.jpg”. The has not yet reached its scheduled execution time, whether the Wrapper that executes this job later will copy that HDFS intended job queue is already full to its maximum capacity, file to the local relative path “images/medical.jpg” before and whether the user who submits the job already has a invoking the job command. The executable of the command number of running jobs at the maximum value of his allowed should be transferred this way as well if necessary. Future quota. If the job is allowed to be executed immediately, the work can be added to distinguish between input files and Job Broker changes the job status to “Status:queued” and output files so that non-NFS file users can have an option submits a Wrapper to execute it. Note that the job status is to put the result files back into HDFS with customized changed before submitting the Wrapper. If the Wrapper fails 373
  • 7. Table V S CHEDULED T IME TABLE ScheduledTimeID UserBob Groupbio JobID ScheduledTime RequestSubmitTime 2 Y Y 25 t5 t0 Table VI U SER TABLE UserID SimultaneousJobNumLimit IndividualJobPriorityLimit Groupdefault Groupbio Groupspace UserAlice 20 5 Y UserBob 30 8 Y Y Query Job Table for a list of Status:submitted jobs by Priority High to Low pletes, either successful or failed, the Wrapper will update For EACH job Scheduled Time job with the job status (“Status:successful” or “Status:failed”) in the SubmitTime not yet reached Job Table, perform cleanup of the temporary job execution Status:submitted in queue X directory and terminate itself normally. Query Job Table for the total No. of running jobs in queue X Query Queue Table for the capacity of queue X E. Monitor Monitors are currently responsible for detecting and han- Reached max queue capacity dling Wrapper failures which may happen after Wrappers are submitted to Hadoop MapReduce but fail to execute Query User Table for the max No. of simultaneous jobs allowed Query Job Table for the total No. of running jobs for this user the assigned jobs, which results in jobs being marked as “Status:queued” but never executed. The current method Too many running tasks for User to achieve this is to first set a time threshold T, then let the Monitor poll the Job Table for jobs that stay in the Change job status to “queued” status for a time period longer than T counting Status:queued from “QueuedTime”. It is possible that jobs will still be queued after a healthy Wrapper is submitted with it, even Check next job in the job list Submit Wrapper to execute job though queue capacity or user active job number limit are not reached. This may happen when the cluster is overloaded Figure 4. The flow of operations in the CloudBATCH Job Broker. and newly submitted MapReduce jobs (i.e., Wrappers them- selves) are queued by the Hadoop MapReduce framework. Therefore, the threshold shall be chosen and adjusted very before actually starting the job execution, the job could be carefully according to cluster size and load. Monitor usages straggling with the “queued” status forever. This is handled are still not fully explored for our prototype and may be by Monitors as described below. extended in the future for detecting jobs that have been marked as “Status:running” but actually failed. D. Wrapper Wrappers are responsible for executing jobs through the F. Discussion Hadoop MapReduce framework. They are actually Map-only CloudBATCH supports the majority of the desired prop- MapReduce programs acting as agents to execute programs erties explained in Section III. For example, it adopts a at compute nodes where they are scheduled to run by scalable architecture where there is no single point of failure the underlying Hadoop schedulers transparently. When a in its components (HBase is assumed to be scalable and Wrapper starts at some node, it first grabs from the HBase robust) and multiple numbers of its components (such as table the necessary job information and files that need to be Job Brokers) can be run according to the scale of the cluster staged to the local machine. Then it starts the job execution and job requests it manages; it supports advanced user and through commandline invocation, MapReduce and legacy job queue management with individually associated policies job alike, and updates the job status to “Status:running”. concerning user access control and maximum number of During job execution, it will check the execution status such active jobs allowed, queue capacity, job priority, etc. It as total running time in order to follow some job policies also inherits Hadoop’s transparent data and computation such as maximum execution time limit, and may terminate deployment as well as task level fault tolerance concerning a running job if it takes too long. In the future, Wrappers the way how Wrappers are handled. Some other proper- can be extended to do more advanced resource utilization ties are also supported by working complementarily with monitoring during job execution. After job execution com- Hadoop schedulers, such as the minimum number of node 374
  • 8. allocation guaranteed to each queue which can be specified inside Hadoop schedulers. In addition, CloudBATCH can handle simple non-NFS file staging requests rather than requiring users’ manual efforts. The HBase table records can also serve as a good basis for future data provenance support. However, MPI is not yet supported since there is no Hadoop version of the mpirun command yet. The direct reservation of compute nodes rather than earliest start time specification for jobs is not supported either because Hadoop hides compute node details from explicit user control, a Figure 5. Server throughput of accepting job submissions vs Number of tradeoff for the purpose of transparent user access and ease concurrent Clients. of management over large numbers of machines. Concerning fault tolerance, CloudBATCH is capable of handling two types of failures. The first is fault tolerance for VI. C ONCLUSIONS AND F UTURE W ORK jobs against machine failures. This is handled by the fault In this paper, we describe a system called “CloudBATCH” tolerance mechanism of the underlying Hadoop MapRe- to enable Hadoop to function as a traditional batch job queu- duce framework through Wrappers. Because Wrappers are ing system with enhanced management functionalities for MapReduce programs, they will be restarted automatically cluster resource management. With CloudBATCH, clusters by default for 6 times if machine failures occur. The other can be managed using Hadoop alone to cater for hybrid type of fault would potentially occur due to the concurrent computing needs involving both MapReduce and legacy execution of multiple CloudBATCH components. For exam- applications. Future work will be explored in the direction ple, if several Job Brokers run simultaneously, they might of further testing the system under multi-queue, multi-user all try to update the status of the same job. This problem situations with heavy load and refining the prototype im- may be avoided by fail-safe component protocols which may plementation of the system for trial production deployment possibly make use of distributed transactions with global in solving real-world use cases. CloudBATCH may also be snapshot isolation on HBase tables [6] so that updates to the exploited to make dedicated Hadoop clusters (when they HBase tables by different components won’t conflict with coexist with other traditional clusters) useful for the load each other. balancing of legacy batch job submissions. V. P ERFORMANCE E VALUATION R EFERENCES We present the result of a basic test for the server through- [1] J. Dean and S. Ghemawat. Mapreduce: Simplified Data put of handling job submissions vs the number of concurrent Processing on Large Clusters. Commun. ACM, 51:107–113, 2008. Clients. The test is performed because according to the architecture of CloudBATCH, client job submission could [2] Hadoop. http://hadoop.apache.org/. become a performance bottleneck. The test investigates how fast the system can accept job submissions (inserting job [3] HBase. http://hadoop.apache.org/hbase/. information into the HBase tables with proper job status) [4] T. Sandholm and K. Lai. Dynamic Proportional Share Schedul- from more and more concurrent users. The result provides ing in Hadoop. In LNCS: Proceedings of the 15th Workshop insight in the system capacity of handling user requests. We on Job Scheduling Strategies for Parallel Processing, pages assume that there is only one queue with no restrictions in 110–131, 2010. terms of queue/user policies. Having multiple queues with [5] M. Zaharia, D. Borthakur, J. Sen Sarma, K. Elmeleegy, complicated policies may decrease system performance, S. Shenker, and I. Stoica. Delay Scheduling: A Simple which will be studied in future work. The test is performed Technique for Achieving Locality and Fairness in Cluster on a 3-machine cluster with Hadoop and HBase installed. Scheduling. In Proceedings of the 5th European conference In the test, we run an increasing number of Clients with on Computer systems, EuroSys ’10, pages 265–278, 2010. an even distribution of the number of Clients on each of [6] C. Zhang and H. De Sterck. Supporting Multi-row Distributed our 3 machines concurrently. Each Client submits 100 jobs Transactions with Global Snapshot Isolation Using Bare-bones consecutively to our system. The result (Figure 5) shows HBase. In Proceedings of the 11th International Conference that the system can accept about 25 job submission requests on Grid Computing (Grid), 2010. per second. We argue that the performance is satisfactory for [7] C. Zhang, H. De Sterck, A. Aboulnaga, H. Djambazian, and normal user job submission request loads. The reason for the R. Sladek. Case Study of Scientific Data Processing on a Cloud slight decline in throughput after reaching its peak value is Using Hadoop. In LNCS: Proceedings of the 23rd International because the Client components compete for the same system Symposium of High Performance Computing Systems and resources used to host Hadoop and HBase. Applications (HPCS), pages 400–415, 2009. 375