Pós-Graduação em Ciência da Computação
“Nubilum: Resource Management System for
Distributed Clouds”
Por
Glauco Estácio Gonçalves
Tese de Doutorado
Universidade Federal de Pernambuco
posgraduacao@cin.ufpe.br
www.cin.ufpe.br/~posgraduacao
RECIFE, 03/2012
UNIVERSIDADE FEDERAL DE PERNAMBUCO
CENTRO DE INFORMÁTICA
PÓS-GRADUAÇÃO EM CIÊNCIA DA COMPUTAÇÃO
GLAUCO ESTÁCIO GONÇALVES
“Nubilum: Resource Management System for Distributed Clouds”
ESTE TRABALHO FOI APRESENTADO À PÓS-GRADUAÇÃO EM
CIÊNCIA DA COMPUTAÇÃO DO CENTRO DE INFORMÁTICA DA
UNIVERSIDADE FEDERAL DE PERNAMBUCO COMO REQUISITO
PARCIAL PARA OBTENÇÃO DO GRAU DE DOUTOR EM CIÊNCIA
DA COMPUTAÇÃO.
ORIENTADORA: Dra. JUDITH KELNER
CO-ORIENTADOR: Dr. DJAMEL SADOK
RECIFE, MARÇO/2012
Tese de Doutorado apresentada por Glauco Estácio Gonçalves à Pós-Graduação em
Ciência da Computação do Centro de Informática da Universidade Federal de
Pernambuco, sob o título “Nubilum: Resource Management System for Distributed
Clouds” orientada pela Profa. Judith Kelner e aprovada pela Banca Examinadora
formada pelos professores:
___________________________________________________________
Prof. Paulo Romero Martins Maciel
Centro de Informática / UFPE
__________________________________________________________
Prof. Stênio Flávio de Lacerda Fernandes
Centro de Informática / UFPE
____________________________________________________________
Prof. Kelvin Lopes Dias
Centro de Informática / UFPE
_________________________________________________________
Prof. José Neuman de Souza
Departamento de Computação / UFC
___________________________________________________________
Profa. Rossana Maria de Castro Andrade
Departamento de Computação / UFC
Visto e permitida a impressão.
Recife, 12 de março de 2012.
___________________________________________________
Prof. Nelson Souto Rosa
Coordenador da Pós-Graduação em Ciência da Computação do
Centro de Informática da Universidade Federal de Pernambuco.
To my family Danielle, João
Lucas, and Catarina.
Acknowledgments
I would like to express my gratitude to God, the cause of all things and of my own
existence, and to the Blessed Virgin Mary, to whom I appealed many times in prayer and
who always heard me.
I would like to thank my advisor Dr. Judith Kelner and my co-advisor Dr. Djamel
Sadok, whose expertise and patience added considerably to my doctoral experience. Thanks
for the trust in my capacity to conduct my doctorate at GPRT (Networks and
Telecommunications Research Group).
I am indebted to all the people from GPRT for their invaluable help with this work. A
very special thanks goes out to Patrícia, Marcelo, and André Vítor, who have given me
valuable comments over the course of my PhD.
I must also acknowledge my committee members, Dr. Jose Neuman, Dr. Otto Duarte,
Dr. Rossana Andrade, Dr. Stênio Fernandes, Dr. Kelvin Lopes, and Dr. Paulo Maciel for
reviewing my proposal and dissertation, offering helpful comments to improve my work.
I would like to thank my wife Danielle for her prayers, patience, and love, which gave
me the necessary strength to finish this work. A special thanks to my children, João Lucas
and Catarina. They are gifts of God that make life delightful.
Finally, I would like to thank my parents, João and Fátima, and my sisters, Cynara and
Karine, for their love. Their blessings have always been with me as I pursued my doctoral
research.
Abstract
The current infrastructure of Cloud Computing providers is composed of networking and
computational resources located in large datacenters hosting as many as hundreds of
thousands of diverse IT devices. In such a scenario, there are several management
challenges related to energy, failure handling, day-to-day operations, and temperature
control. Moreover, the geographical distance between resources and final users is a
source of delay when accessing the services. An alternative to such challenges is the
creation of Distributed Clouds (D-Clouds), whose resources are geographically distributed
along a network infrastructure with broad coverage.
Providing resources in such a distributed scenario is not a trivial task since, beyond
processing and storage resources, network resources must be taken into consideration,
offering users a connectivity service for data transportation (also called Network as a
Service, or NaaS). Thereby, the allocation of resources must consider the virtualization
of both servers and network devices. Furthermore, resource management must cover all
steps, from the initial discovery of the adequate resources to meet developers' demands
to their final delivery to the users.
Considering those resource management challenges in D-Clouds, this Thesis proposes
Nubilum, a system for resource management in D-Clouds that takes into account the
geo-locality of resources and NaaS aspects. Through its processes and algorithms, Nubilum
offers solutions for the discovery, monitoring, control, and allocation of resources in
D-Clouds in order to ensure the adequate functioning of the D-Cloud while meeting
developers' requirements. Nubilum and its underlying technologies and building blocks are
described, and its allocation algorithms are evaluated to verify their efficacy and efficiency.
Keywords: cloud computing, resource management mechanisms, network virtualization.
Resumo
Atualmente, a infraestrutura dos provedores de computação em Nuvem é composta por
recursos de rede e de computação, que são armazenados em datacenters de centenas de
milhares de equipamentos. Neste cenário, encontram-se diversos desafios quanto à gerência
de energia e ao controle de temperatura; além disso, a distância geográfica entre os recursos
e os usuários é fonte de atraso no acesso aos serviços. Uma alternativa a tais desafios é o
uso de Nuvens Distribuídas (Distributed Clouds – D-Clouds) com recursos distribuídos
geograficamente ao longo de uma infraestrutura de rede com cobertura abrangente.
Prover recursos em tal cenário distribuído não é uma tarefa trivial, pois, além de
recursos computacionais e de armazenamento, deve-se considerar recursos de rede os quais
são oferecidos aos usuários da nuvem como um serviço de conectividade para transporte de
dados (também chamado Network as a Service – NaaS). Desse modo, o processo de alocação
deve considerar a virtualização de ambos, servidores e elementos de rede. Além disso, a
gerência de recursos deve considerar desde a descoberta dos recursos adequados para
atender as demandas dos usuários até a manutenção da qualidade de serviço na sua entrega
final.
Considerando estes desafios em gerência de recursos em D-Clouds, este trabalho
propõe Nubilum: um sistema para gerência de recursos em D-Cloud que considera aspectos
de geo-localidade e NaaS. Por meio de seus processos e algoritmos, Nubilum oferece
soluções para descoberta, monitoramento, controle e alocação de recursos em D-Clouds de
forma a garantir o bom funcionamento da D-Cloud, além de atender os requisitos dos
desenvolvedores. As diversas partes e tecnologias de Nubilum são descritas em detalhes e
suas funções delineadas. Ao final, os algoritmos de alocação do sistema são também
avaliados de modo a verificar sua eficácia e eficiência.
Palavras-chave: computação em nuvem, mecanismos de alocação de recursos, virtualização
de redes.
Contents
Abstract v
Resumo vi
Abbreviations and Acronyms xii
1 Introduction 1
1.1 Motivation............................................................................................................................................. 2
1.2 Objectives ............................................................................................................................................. 4
1.3 Organization of the Thesis................................................................................................................. 4
2 Cloud Computing 6
2.1 What is Cloud Computing?................................................................................................................ 6
2.2 Agents involved in Cloud Computing.............................................................................................. 7
2.3 Classification of Cloud Providers...................................................................................................... 8
2.3.1 Classification according to the intended audience..................................................................................8
2.3.2 Classification according to the service type.............................................................................................8
2.3.3 Classification according to programmability.........................................................................................10
2.4 Mediation System............................................................................................................................... 11
2.5 Groundwork Technologies.............................................................................................................. 12
2.5.1 Service-Oriented Computing...................................................................................................................12
2.5.2 Server Virtualization..................................................................................................................................12
2.5.3 MapReduce Framework............................................................................................................................13
2.5.4 Datacenters.................................................................................................................................................14
3 Distributed Cloud Computing 15
3.1 Definitions.......................................................................................................................................... 15
3.2 Research Challenges inherent to Resource Management ............................................................ 18
3.2.1 Resource Modeling....................................................................................................................................18
3.2.2 Resource Offering and Treatment..........................................................................................................20
3.2.3 Resource Discovery and Monitoring......................................................................................................22
3.2.4 Resource Selection and Optimization....................................................................................................23
3.2.5 Summary......................................................................................................................................................27
4 The Nubilum System 28
4.1 Design Rationale................................................................................................................................ 28
4.1.1 Programmability.........................................................................................................................................28
4.1.2 Self-optimization........................................................................................................................................29
4.1.3 Existing standards adoption.....................................................................................................................29
4.2 Nubilum’s conceptual view.............................................................................................................. 29
4.2.1 Decision plane............................................................................................................................................30
4.2.2 Management plane.....................................................................................................................................31
4.2.3 Infrastructure plane...................................................................................................................................32
4.3 Nubilum’s functional components.................................................................................................. 32
4.3.1 Allocator......................................................................................................................................................33
4.3.2 Manager.......................................................................................................................................................34
4.3.3 Worker.........................................................................................................................................................35
4.3.4 Network Devices.......................................................................................................................................36
4.3.5 Storage System ...........................................................................................................................................37
4.4 Processes............................................................................................................................................. 37
4.4.1 Initialization processes..............................................................................................................................37
4.4.2 Discovery and monitoring processes......................................................................................................38
4.4.3 Resource allocation processes..................................................................................................................39
4.5 Related projects.................................................................................................................................. 40
5 Control Plane 43
5.1 The Cloud Modeling Language ....................................................................................................... 43
5.1.1 CloudML Schemas.....................................................................................................................................45
5.1.2 A CloudML usage example......................................................................................................................52
5.1.3 Comparison and discussion .....................................................................................................................56
5.2 Communication interfaces and protocols...................................................................................... 57
5.2.1 REST Interfaces.........................................................................................................................................57
5.2.2 Network Virtualization with Openflow.................................................................................................63
5.3 Control Plane Evaluation ................................................................................................................. 65
6 Resource Allocation Strategies 68
6.1 Manager Positioning Problem ......................................................................................................... 68
6.2 Virtual Network Allocation.............................................................................................................. 70
6.2.1 Problem definition and modeling ...........................................................................................................72
6.2.2 Allocating virtual nodes............................................................................................................................74
6.2.3 Allocating virtual links...............................................................................................................................75
6.2.4 Evaluation...................................................................................................................................................76
6.3 Virtual Network Creation................................................................................................................. 81
6.3.1 Minimum length Steiner tree algorithms ...............................................................................................82
6.3.2 Evaluation...................................................................................................................................................86
6.4 Discussion........................................................................................................................................... 89
7 Conclusion 91
7.1 Contributions ..................................................................................................................................... 92
7.2 Publications ........................................................................................................................................ 93
7.3 Future Work ....................................................................................................................................... 94
References 96
List of Figures
Figure 1 Agents in a typical Cloud Computing scenario (from [24]) ..................................................7
Figure 2 Classification of Cloud types (from [71]).................................................................................9
Figure 3 Components of an Archetypal Cloud Mediation System (adapted from [24]) ................11
Figure 4 Comparison between (a) a current Cloud and (b) a D-Cloud............................................16
Figure 5 ISP-based D-Cloud example ...................................................................................................17
Figure 6 Nubilum’s planes and modules...............................................................................................30
Figure 7 Functional components of Nubilum......................................................................................33
Figure 8 Schematic diagram of Allocator’s modules and relationships with other components..33
Figure 9 Schematic diagram of Manager’s modules and relationships with other components...34
Figure 10 Schematic diagram of Worker modules and relationships with the server system........35
Figure 11 Link discovery process using LLDP and Openflow ..........................................................38
Figure 12 Sequence diagram of the Resource Request process for a developer..............................39
Figure 13 Integration of different descriptions using CloudML........................................................44
Figure 14 Basic status type used in the composition of other types..................................................45
Figure 15 Type for reporting status of the virtual nodes ....................................................................46
Figure 16 XML Schema used to report the status of the physical node...........................................46
Figure 17 Type for reporting complete description of the physical nodes.......................................46
Figure 18 Type for reporting the specific parameters of any node ...................................................47
Figure 19 Type for reporting information about the physical interface ...........................................48
Figure 20 Type for reporting information about a virtual machine..................................................48
Figure 21 Type for reporting information about the whole infrastructure ......................................49
Figure 22 Type for reporting information about the physical infrastructure...................................49
Figure 23 Type for reporting information about a physical link .......................................................50
Figure 24 Type for reporting information about the virtual infrastructure .....................................50
Figure 25 Type describing the service offered by the provider .........................................................51
Figure 26 Type describing the requirements that can be requested by a developer .......................52
Figure 27 Example of a typical Service description XML ..................................................................53
Figure 28 Example of a Request XML..................................................................................................53
Figure 29 Physical infrastructure description........................................................................................54
Figure 30 Virtual infrastructure description..........................................................................................55
Figure 31 Communication protocols employed in Nubilum..............................................................57
Figure 32 REST operation for the retrieval of service information..................................................59
Figure 33 REST operation for updating information of a service ....................................................59
Figure 34 REST operation for requesting resources for a new application.....................................59
Figure 35 REST operation for changing resources of a previous request .......................................60
Figure 36 REST operation for releasing resources of an application ...............................................60
Figure 37 REST operation for registering a new Worker...................................................................60
Figure 38 REST operation to unregister a Worker..............................................................................61
Figure 39 REST operation for update information of a Worker ......................................................61
Figure 40 REST operation for retrieving a description of the D-Cloud infrastructure .................61
Figure 41 REST operation for updating the description of a D-Cloud infrastructure...................61
Figure 42 REST operation for the creation of a virtual node............................................................62
Figure 43 REST operation for updating a virtual node ......................................................................62
Figure 44 REST operation for removal of a virtual node...................................................................62
Figure 45 REST operation for requesting the discovered physical topology ..................................63
Figure 46 REST operation for the creation of a virtual link ..............................................................63
Figure 47 REST operation for updating a virtual link.........................................................................64
Figure 48 REST operation for removal of a virtual link.....................................................................64
Figure 49 Example of a typical rule for ARP forwarding...................................................................65
Figure 50 Example of the typical rules created for virtual links: (a) direct, (b) reverse..................65
Figure 51 Example of a D-Cloud with ten workers and one Manager.............................................69
Figure 52 Algorithm for allocation of virtual nodes............................................................................74
Figure 53 Example illustrating the minimax path................................................................................75
Figure 54 Algorithm for allocation of virtual links..............................................................................76
Figure 55 The (a) old and (b) current network topologies of RNP used in simulations................77
Figure 56 Results for the maximum node stress in the (a) old and (b) current RNP topology....78
Figure 57 Results for the maximum link stress in the (a) old and (b) current RNP topology ......79
Figure 58 Results for the mean link stress in the (a) old and (b) current RNP topology...............80
Figure 59 Mean path length (a) old and (b) current RNP topology..................................................80
Figure 60 Example creating a virtual network: (a) before the creation; (b) after the creation ......81
Figure 61 Search procedure used by the GHS algorithm....................................................................83
Figure 62 Placement procedure used by the GHS algorithm.............................................................84
Figure 63 Example of the placement procedure: (a) before and (b) after placement.....................85
Figure 64 Percentage of optimal samples for GHS and STA in the old RNP topology................87
Figure 65 Percentage of samples reaching relative error ≤ 5% in the old RNP topology.............88
Figure 66 Percentage of optimal samples for GHS and STA in the current RNP topology ........88
Figure 67 Percentage of samples reaching relative error ≤ 5% in the current RNP topology......89
List of Tables
Table I Summary of the main aspects discussed..................................................................................27
Table II MIMEtypes used in the overall communications.................................................................58
Table III Models for the length of messages exchanged in the system in bytes.............................67
Table IV Characteristics present in Nubilum’s resource model ........................................................71
Table V Reduced set of characteristics considered by the proposed allocation algorithms ..........72
Table VI Factors and levels used in the MPA’s evaluation ................................................................78
Table VII Factors and levels used in the GHS’s evaluation...............................................................86
Table VIII Scientific papers produced ..................................................................................................94
Abbreviations and Acronyms
CDN Content Delivery Network
CloudML Cloud Modeling Language
D-Cloud Distributed Cloud
DHCP Dynamic Host Configuration Protocol
GHS Greedy Hub Selection
HTTP Hypertext Transfer Protocol
IaaS Infrastructure as a Service
ISP Internet Service Provider
LLDP Link Layer Discovery Protocol
MPA Minimax Path Algorithm
MPP Manager Positioning Problem
NaaS Network as a Service
NV Network Virtualization
OA Optimal Algorithm
OCCI Open Cloud Computing Interface
PoP Point of Presence
REST Representational State Transfer
RP Replica Placement
RPA Replica Placement Algorithm
STA Steiner Tree Approximation
VM Virtual Machine
VN Virtual Network
XML Extensible Markup Language
ZAA Zhu and Ammar Algorithm
1 Introduction
“A linea incipere.”
Erasmus
Nowadays, it is common to access content across the Internet with little reference to the underlying
datacenter hosting infrastructure maintained by content providers. The technology used to provide
such a level of locality transparency also enables a new model for the provisioning of
computing services, known as Cloud Computing. This model is attractive because it allows resources to be
provisioned according to users' requirements, leading to overall cost reduction. Cloud users can rent
resources as they become necessary, in a much more scalable and elastic way. Moreover, such users
can transfer operational risks to cloud providers. From the viewpoint of those providers, the model
offers a way to better utilize their own infrastructure. Armbrust et al. [1] point out that this
model benefits from a form of statistical multiplexing, since it allocates resources to several users
concurrently on a demand basis. This statistical multiplexing of datacenters builds on several
decades of research in areas such as distributed computing, Grid computing, web
technologies, service computing, and virtualization.
Current Cloud Computing providers mainly use large, consolidated datacenters to
offer their services. However, the ever-increasing need for over-provisioning to meet peak
demands and to provide redundancy against failures, together with expensive cooling needs, are
important factors driving up the energy costs of centralized datacenters [62]. In current datacenters, the
cooling technologies used for heat dissipation account for as much as 50% of the total
power consumption [38]. In addition to these aspects, it must be observed that the network between
users and the Cloud is often an unreliable best-effort IP service, which can harm delay-constrained
services and interactive applications.
To deal with these problems, there have been indications that small cooperative
datacenters can be more attractive, since they offer a cheaper and lower-power alternative
that reduces the infrastructure costs of centralized Clouds [12]. These small datacenters can be built in
different geographical regions and connected by dedicated or public (provided by Internet Service
Providers) networks, configuring a new type of Cloud, referred to as a Distributed Cloud. Such
Distributed Clouds [20], or just D-Clouds, can exploit the possibility of creating (virtual) links and
the potential of sharing resources across geographic boundaries to provide latency-based allocation
of resources and fully utilize this emerging distributed computing power. D-Clouds can reduce
communication costs by simply provisioning storage, servers, and network resources close to
end-users.
D-Clouds can be considered an additional step in the ongoing deployment of Cloud
Computing: one that supports different requirements and leverages new opportunities for service
providers. Users in a Distributed Cloud will be free to choose where to allocate their resources in
order to serve specific market niches, to comply with constraints on the jurisdiction of software and
data, or to meet quality of service requirements of their clients.
1.1 Motivation
Similarly to Cloud Computing, one of the most important design aspects of D-Clouds is the
availability of “infinite” computing resources that may be used on demand. Cloud users see this
“infinite” resource pool because the Cloud continuously monitors and manages its
resources and allocates them in an elastic way. Nevertheless, providing on-demand
computing instances and network resources in a distributed scenario is not a trivial task. Dynamic
allocation of resources, and their possible reallocation, are essential characteristics for accommodating
unpredictable demands and, ultimately, contributing to the return on investment.
In the context of Clouds, the essential feature of any resource management system is to
guarantee that both user and provider requirements are met satisfactorily. Particularly in D-Clouds,
users may have network requirements, such as bandwidth and delay constraints, in addition to the
common computational requirements, such as CPU, memory, and storage. Furthermore, other user
requirements are relevant, including node locality, topology of nodes, jurisdiction, and application
interaction.
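A request in such a setting therefore mixes per-node computational requirements with per-link network requirements. The sketch below illustrates this combination as a plain data structure; all class names and fields are hypothetical illustrations chosen for this example, not the CloudML format used by Nubilum:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class NodeRequirement:
    """Computational requirements for one virtual node."""
    cpu_cores: int
    memory_mb: int
    storage_gb: int
    location: Optional[str] = None   # preferred region (locality/jurisdiction)

@dataclass
class LinkRequirement:
    """Network requirements for a virtual link between two requested nodes."""
    src: str
    dst: str
    bandwidth_mbps: float
    max_delay_ms: Optional[float] = None

@dataclass
class ResourceRequest:
    """A developer's request: named nodes plus the links among them (topology)."""
    nodes: Dict[str, NodeRequirement]
    links: List[LinkRequirement] = field(default_factory=list)

# Two nodes, one pinned near end-users, linked with bounded bandwidth and delay
req = ResourceRequest(
    nodes={
        "web": NodeRequirement(cpu_cores=2, memory_mb=2048, storage_gb=20,
                               location="recife"),
        "db": NodeRequirement(cpu_cores=4, memory_mb=8192, storage_gb=100),
    },
    links=[LinkRequirement("web", "db", bandwidth_mbps=100.0, max_delay_ms=10.0)],
)
```

Each field maps to one of the requirement classes listed above: CPU, memory, and storage on the computational side; bandwidth and delay constraints on the network side; and node locality and topology as additional user requirements.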
The development of solutions to cope with resource management problems remains a very
important topic in the field of Cloud Computing. In this regard, there are solutions
focused on grid computing ([49], [70]) and on datacenters in current Cloud Computing scenarios
([4]). However, such strategies do not fit D-Clouds well, as they are heavily based on assumptions
that do not hold in Distributed Cloud scenarios. For example, such solutions are designed for
over-provisioned networks and commonly do not take into consideration the cost of communication
between resources, which is an important aspect of D-Clouds that must be cautiously monitored
and/or reserved in order to meet users' requirements.
The design of a resource management system involves challenges beyond the specific
design of optimization algorithms for resource management. Since D-Clouds are composed of
computational and network devices with different architectures, software, and hardware capabilities,
the first challenge is the development of a suitable resource model covering all this heterogeneity
[20]. The next challenge is to describe how resources are offered, which is important
since the requirements supported by the D-Cloud provider are defined in this step. The remaining
challenges are related to the overall operation of the resource management system. When requests
arrive, the system should be aware of the current status of its resources in order to determine
whether there are sufficient available resources in the D-Cloud to satisfy the present request. Thus,
the right mechanisms for resource discovery and monitoring should also be designed, allowing the
system to be aware of the updated status of all its resources. Then, based on the current status and
the requirements of the request, the system may select and allocate resources to serve the present
request.
Please note that the solution to those challenges involves the fine-grained coordination of
several distributed components and the orchestrated execution of the several subsystems composing
the resource management system. At first glance, these subsystems can be organized into three
parts: one responsible for the direct negotiation of requirements with users; another responsible for
deciding what resources to allocate for given applications; and one last part responsible for the
effective enforcement of these decisions on the resources.
Designing such a system is a very interesting and challenging task, and it raises the following
research questions that will be investigated in this thesis:
1. How do Cloud users describe their requirements? In order to enable automatic
negotiation between users and the D-Cloud, the Cloud must recognize a language or
formalism for requirements description. Thus, the investigation of this topic must determine
the proper characteristics of such a language. In addition, it must survey the existing
approaches to this topic in related computing areas.
2. How to represent the resources available in the Cloud? Correlated to the first question,
the resource management system must also maintain an information model to represent all
the resources in the Cloud, including their relationships (topology) and their current status.
3. How are the users’ applications mapped onto Cloud resources? This question concerns
the core aspect of resource allocation, i.e., the algorithms, heuristics, and strategies that are
used to decide the set of resources meeting the applications’ requirements and optimizing a
utility function.
4. How to enforce the decisions made? The effective enforcement of the decisions involves
the extension of communication protocols, or the development of new ones, in order to
set up the state of the overall resources in the D-Cloud.
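To make the first two questions concrete, the sketch below shows one possible information model for application requests covering computational restrictions, network constraints, and geographic location. It is purely illustrative: every class and field name here is an assumption of this example, not the actual model proposed later in this Thesis.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Illustrative only: a minimal information model for the computational and
# network requirements discussed above. All names here are hypothetical.

@dataclass
class NodeRequest:
    cpu_cores: int                  # minimum number of CPU cores
    memory_mb: int                  # minimum amount of RAM
    storage_gb: int                 # minimum storage
    location: Optional[str] = None  # desired geographic region, if any

@dataclass
class LinkRequest:
    endpoints: Tuple[str, str]  # names of the two nodes to connect
    bandwidth_mbps: float       # minimum bandwidth
    max_delay_ms: float         # maximum tolerated delay

@dataclass
class ApplicationRequest:
    nodes: Dict[str, NodeRequest] = field(default_factory=dict)
    links: List[LinkRequest] = field(default_factory=list)

# Example: two servers in different regions connected by a constrained link.
req = ApplicationRequest(
    nodes={
        "web-north": NodeRequest(2, 2048, 20, location="north"),
        "web-south": NodeRequest(2, 2048, 20, location="south"),
    },
    links=[LinkRequest(("web-north", "web-south"), 100.0, 50.0)],
)
```

A model along these lines answers question 2 as well, since the same vocabulary (nodes, links, locations) can describe both what developers request and what the provider has available.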
1.2 Objectives
The main objective of this Thesis is to propose an integrated solution to problems related to the
management of resources in D-Clouds. Such a solution is presented as Nubilum, a resource
management system that offers a self-managed approach to the discovery, control, monitoring, and
allocation of resources in D-Clouds. Nubilum provides fine-grained orchestration of its components
in order to allocate applications on a D-Cloud.
The specific goals of this Thesis are strictly related to the research questions presented in
Section 1.1; they are:
• Elaborate an information model to describe D-Cloud resources and application
requirements, such as computational restrictions, topology, geographic location, and other
correlated aspects, that can be employed to request resources directly from the D-Cloud;
• Explore and extend communication protocols for the provisioning and allocation of
computational and communication resources;
• Develop algorithms, heuristics, and strategies to find suitable D-Cloud resources based on
several different application requirements;
• Integrate the information model, the algorithms, and the communication protocols into a
single solution.
1.3 Organization of the Thesis
This Thesis identifies the challenges involved in resource management in Distributed Cloud
Computing and presents solutions for some of these challenges. The remainder of this document is
organized as follows.
The general concepts that make up the basis for all the other chapters are introduced in the
second chapter. Its main objective is to discuss the definition of Cloud Computing, exploring this
definition and classifying the main approaches in this area.
The Distributed Cloud Computing concept and several important aspects of resource
management in such scenarios are introduced in the third chapter. Moreover, this chapter will
make a comparative analysis of related research areas and problems.
The fourth chapter introduces the first contribution of this Thesis: the Nubilum resource
management system, which aggregates the several solutions proposed in this Thesis. Moreover, the
chapter highlights the rationale behind Nubilum as well as its main modules and components.
The fifth chapter examines and evaluates the control plane of Nubilum. It describes the
proposed Cloud Modeling Language and details the communication interfaces and protocols used
for communicating between Nubilum components.
The sixth chapter gives an overview of the resource allocation problems in Distributed
Clouds, and makes a thorough examination of the specific problems related to Nubilum. Some
particular problems are analyzed and a set of algorithms is presented and evaluated.
The seventh chapter of this Thesis reviews the obtained evaluation results, summarizes the
contributions, and sets the path to future work and open issues on D-Clouds.
2 Cloud Computing
“Definitio est declaratio essentiae rei.”
Legal Proverb
In this chapter the main concepts of Cloud Computing will be presented. It begins with a discussion
on the definition of Cloud Computing (Section 2.1) and the main agents involved in Cloud
Computing (Section 2.2). Next, classifications of Cloud initiatives are offered in Section 2.3. An
exemplary and simple architecture of a Cloud Mediation System is presented in Section 2.4 followed
by a presentation in Section 2.5 of the main technologies acting behind the scenes of Cloud
Computing initiatives.
2.1 What is Cloud Computing?
A definition of Cloud Computing is given by the National Institute of Standards and Technology
(NIST) of the United States: “Cloud computing is a model for enabling convenient, on-demand
network access to a shared pool of configurable computing resources (e.g., networks, servers,
storage, applications, and services) that can be rapidly provisioned and released with minimal
management effort or service provider interaction” [45]. The definition says that on-demand
dynamic reconfiguration (elasticity) is a key characteristic. Additionally, the definition highlights
another Cloud Computing characteristic: it assumes that minimal management efforts are required
to reconfigure resources. In other words, the Cloud must offer self-service solutions that attend to
requests on demand, excluding from the scope of Cloud Computing those initiatives that operate
through the rental of computing resources on a weekly or monthly basis. Hence, the definition
restricts Cloud Computing to systems that provide automatic mechanisms for resource rental in real
time with minimal human intervention.
The NIST definition gives a satisfactory concept of Cloud Computing as a computing model,
but it does not cover the main object of Cloud Computing: the Cloud itself. Thus, in this Thesis,
Cloud Computing is defined as the computing model that operates based on Clouds. In turn, the
Cloud is defined as a conceptual layer that operates above an infrastructure to provide elastic
services in a timely manner.
This definition encompasses three main characteristics of Clouds. Firstly, it notes that a Cloud
is primarily a concept, i.e., a Cloud is an abstraction over an infrastructure. Thus, it is independent of
the employed technologies and therefore one can accept different setups, like Amazon EC2 or
Google App Engine, to be named Clouds. Moreover, the infrastructure is defined in a broad sense
once it can be composed by software, physical devices, and/or other Clouds. Secondly, all Clouds
have the same purpose: to provide services. This means that a Cloud hides the complexity of the
underlying infrastructure while exploring the potential of overlying services and acting as a
middleware. In addition, providing a service involves, implicitly, the use of some type of agreement
that should be guaranteed by the Cloud. Such agreements can vary from pre-defined contracts to
malleable agreements defining functional and non-functional requirements. Note that these services
are qualified as elastic ones, which has the same meaning of dynamic reconfiguration that appeared
in the NIST definition. Last but not least, the Cloud must provide services as quickly as possible
such that the infrastructure resources are allocated and reallocated to attend the users’ needs.
2.2 Agents involved in Cloud Computing
In contrast to previous approaches ([64], [8], [72], and [68]), this Thesis focuses on only three
distinct agents in Cloud Computing, as shown in Figure 1: clients, developers, and the provider. The
first notable point is that the provider deals with two types of users, called developers and clients.
Clients are the customers of a service produced by a developer. Clients use services from
developers, but such use generates demand on the provider that actually hosts the service, and
therefore the client can also be considered a user of the Cloud. It is important to highlight that in
some scenarios (like scientific computing or batch processing) a developer may behave as a client of
the Cloud because it is the end-user of the applications. The text will use “users” when referring to
both classes without distinction.
Figure 1 Agents in a typical Cloud Computing scenario (from [24])
Developers can be service providers, independent programmers, scientific institutions, and so
on, i.e., all who build applications on the Cloud. They create and run their applications while
leaving decisions related to the maintenance and management of the infrastructure to the provider.
Please note that, a priori, developers do not need to know about the technologies that make up the
Cloud infrastructure, nor about the specific location of each item in the infrastructure.
Lastly, the term application is used to mean all types of services that can be developed on the
Cloud. In addition, it is important to note that the type of applications supported by a Cloud
depends exclusively on the goals of the Cloud as determined by the provider. Such a wide range of
possible targets generates many different types of Cloud Providers that are discussed in the next
section.
2.3 Classification of Cloud Providers
Currently, there are several operational Cloud Computing initiatives; however, despite all being
called Clouds, they provide different types of services. For that reason, the academic community
([64], [8], [45], [72], and [71]) has classified these solutions in order to understand their
relationships. Three complementary classification proposals are presented as follows.
2.3.1 Classification according to the intended audience
This first simple taxonomy is suggested by NIST [45], which organizes providers according to the
audience to which the Cloud is aimed. There are four classes in this classification: Private Clouds,
Community Clouds, Public Clouds, and Hybrid Clouds.
The first three classes accommodate providers in a gradual widening of the intended audience.
The Private Cloud class encompasses those Clouds destined to be used solely by
an organization operating over their own datacenter or one leased from a third party for exclusive
use. When the Cloud infrastructure is shared by a group of organizations with similar interests it is
classified as a Community Cloud. Furthermore, the Public Cloud class encompasses all initiatives
intended to be used by the general public. Finally, Hybrid Clouds are simply the composition of two
or more Clouds pertaining to different classes (Private, Community, or Public).
2.3.2 Classification according to the service type
In [71], the authors offer a classification as represented in Figure 2. This taxonomy divides Clouds
into five categories: Cloud Application, Cloud Software Environment, Cloud Software Infrastructure,
Software Kernel, and Firmware/Hardware. The authors arranged the different types of Clouds in a
stack, showing that Clouds from higher levels are created using services in the lower levels. This idea
pertains to the definitions of Cloud Computing discussed previously in Sections 2.1 and 2.2.
Essentially, the Cloud provider does not need to be the owner of the infrastructure.
Figure 2 Classification of Cloud types (from [71])
The class at the top of the stack, also called Software-as-a-Service (SaaS), involves applications
accessed through the Internet, including social networks, Webmail, and Office tools. Such services
provide software to be used by the general public, whose main interest is to avoid tasks related to
software management, like installation and updating. From the point of view of the Cloud provider,
SaaS can decrease software implementation costs when compared with traditional processes.
Similarly, the Cloud Software Environment class, also called Platform-as-a-Service (PaaS),
encompasses Clouds that offer programming environments for developers. Through well-defined APIs,
developers can use software modules for access control, authentication, distributed processing, and
so on, in order to produce their own applications in the Cloud. Moreover, developers can contract
services for automatic scalability of their software, databases, and storage services.
In the middle of the stack there is the Cloud Software Infrastructure class of initiatives. This
class encompasses solutions that provide virtual versions of infrastructure devices found in
datacenters like servers, databases, and links. Clouds in this class can be divided into three subclasses
according to the type of resource that is offered by them. Computational resources are grouped in
the Infrastructure-as-a-Service (IaaS) subclass, which provides generic virtual machines that can be
used in many different ways by the contracting developer. Services for massive data storage are
grouped in the Data-as-a-Service (DaaS) subclass, whose main mission is to store users’ data
remotely, allowing those users to access their data from anywhere and at any time. Finally, the third
subclass, called Communications-as-a-Service (CaaS), is composed of solutions that offer virtual
private links and routers through telecommunication infrastructures.
The last two classes do not offer Cloud services specifically, but they are included in the
classification to show that providers offering Clouds in higher layers can have their own software
and hardware infrastructure. The Software Kernel class includes all of the software necessary to
provide services to the other categories like operating systems, hypervisors, cloud management
middleware, programming APIs, and libraries. Finally, the class of Firmware/Hardware covers all
sale and rental services of physical servers and communication hardware.
2.3.3 Classification according to programmability
The five-class scheme presented above can classify and organize the current spectrum of Cloud
Computing solutions, but such a model is limited because the number of classes and their
relationships will need to be rearranged as new Cloud services emerge. Therefore, in this Thesis, a
different classification model will be used based on the programmability concept, which was
previously introduced by Endo et al. [19].
Borrowed from the realm of network virtualization [11], programmability is a concept related
to the programming features a network element offers to developers, measuring how much freedom
the developer has to manipulate resources and/or devices. This concept can be easily applied to the
comparison of Cloud Computing solutions. More programmable Clouds offer environments where
developers are free to choose programming paradigms, languages, and platforms. Less
programmable Clouds restrict developers in some way: perhaps by forcing a set of programming
languages or by providing support for only one application paradigm. Moreover, programmability
directly affects the way developers manage their leased resources. From this point of view, providers
of less programmable Clouds are responsible for managing their infrastructure transparently to
developers. In turn, a more programmable Cloud leaves more of these tasks to developers, thus
introducing management difficulties due to the more heterogeneous programming environment.
Thus, Cloud Programmability can be defined as the degree of freedom developers have to
manipulate services leased from a provider. Programmability is a relative concept, i.e., it is adopted
to compare one Cloud with others. Also, programmability is directly proportional to the
heterogeneity in the infrastructure of the provider and inversely proportional to the amount of
effort that developers must spend to manage leased resources.
To illustrate how this concept can be used, one can classify two current Clouds: Amazon EC2
and Google App Engine. Clearly the Amazon EC2 is more programmable, since in this Cloud
developers can choose between different virtual machine classes, operating systems, and so on. After
they lease one of these virtual machines, developers can configure it to work as they see fit: as a web
server, as a content server, as a unit for batch processing, and so on. On the other hand, Google
App Engine can be classified as a less programmable solution, because it allows developers to create
Web applications that will be hosted by Google. This restricts developers to the Web paradigm and
to some programming languages.
2.4 Mediation System
Figure 3 introduces an Archetypal Cloud Mediation System. This is a conceptual model that will be
used as a reference to the discussion on Resource Management in this Thesis. The Archetypal Cloud
Mediation System focuses on one principle: resource management as the main service of any Cloud
Computing provider. Thus, other important services like authentication, accounting, and security are
out of the scope of this conceptual system and are therefore depicted separately from the Mediation
System. Clients also do not factor into this view of the system, since resource management is mainly
related to the allocation of developers’ applications and to meeting their requirements.
Figure 3 Components of an Archetypal Cloud Mediation System (adapted from [24])
The mediation system is responsible for the entire process of resource management in the
Cloud. Such a process covers tasks that range from the automatic negotiation of developers’
requirements to the execution of their applications. It has three main layers: negotiation, resource
management, and resource control.
The negotiation layer deals with the interface between the Cloud and developers. In the case
of Clouds selling infrastructure services, the interface can be a set of operations based on Web
Services for control of the leased virtual machines. Alternatively, in the case of PaaS services, this
interface can be an API for software development in the Cloud. Moreover, the negotiation layer
handles the process of contract establishment between the enterprises and the Cloud. Currently, this
process is simple and the contracts tend to be restrictive. One can expect that in the future, Clouds
will offer more sophisticated avenues for user interaction through high level abstractions and service
level policies.
The resource management layer is responsible for the optimal allocation of applications, aiming
at maximum resource usage. This function requires advanced strategies and heuristics
to allocate resources that meet the contractual requirements as established with the application
developer. These may include service quality restrictions, jurisdiction restrictions, elastic adaptation,
among others.
Metaphorically, one can say that while the resource management layer acts as the “brain” of
the Cloud, the resource control layer plays the role of its “limbs”. The resource control encompasses
all functions needed to enforce decisions generated by the upper layer. Beyond the tools used to
configure the Cloud resources effectively, all communication protocols used by the Cloud are
included in this layer.
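As a rough illustration of how these three layers cooperate, the toy sketch below wires a negotiation step, a naive first-fit allocation decision, and an enforcement step together. Every class, method, and data format here is a hypothetical simplification for illustration, not part of any real mediation system.

```python
# Toy sketch of the three mediation layers; all names and data formats
# are hypothetical simplifications.

class NegotiationLayer:
    """Interfaces with developers and validates their requests."""
    def negotiate(self, request):
        if not request.get("nodes"):
            raise ValueError("a request must ask for at least one node")
        return request  # a real layer would also establish a contract/SLA

class ResourceManagementLayer:
    """Decides which resources serve a request (the 'brain')."""
    def __init__(self, inventory):
        self.inventory = inventory  # e.g. {"server-1": free CPU cores}

    def allocate(self, request):
        plan = {}
        for node, cpu in request["nodes"].items():
            # naive first-fit: pick the first server with enough free CPU
            server = next(s for s, free in self.inventory.items() if free >= cpu)
            self.inventory[server] -= cpu
            plan[node] = server
        return plan

class ResourceControlLayer:
    """Enforces allocation decisions on the infrastructure (the 'limbs')."""
    def enforce(self, plan):
        # a real layer would issue hypervisor calls or protocol messages
        return [f"boot {node} on {server}" for node, server in plan.items()]

negotiation = NegotiationLayer()
management = ResourceManagementLayer({"server-1": 4, "server-2": 8})
control = ResourceControlLayer()

request = negotiation.negotiate({"nodes": {"app-a": 2, "app-b": 6}})
plan = management.allocate(request)
actions = control.enforce(plan)
```

In this toy run, "app-a" fits on "server-1" and "app-b" falls through to "server-2"; the point is only the division of responsibilities, with the allocation heuristic deliberately trivial.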
2.5 Groundwork Technologies
Some of the main technologies used by current Cloud mediation systems (namely Service-Oriented
Computing, Virtualization, MapReduce, and Datacenters) will be discussed next.
2.5.1 Service-Oriented Computing
Service-Oriented Computing defines a set of principles, architectural models, and technologies for
the design and development of distributed applications. The recent development of software while
focusing on services gave rise to SOA (Service-Oriented Architecture), which can be defined as an
architectural model “that supports the discovery, message exchange, and integration between loosely
coupled services using industry standards” [37]. The common technology for the implementation of
SOA principles is the Web Service that defines a set of standards to implement services over the
World Wide Web.
In Cloud Computing, SOA is the main paradigm for the development of functions on the
several layers of the Cloud. Cloud providers publish APIs for their services on the web, allowing
developers to use the Cloud and to automate several tasks related to the management of their
applications. Such APIs can assume the form of WSDL documents or REST-based interfaces.
Furthermore, providers can make available Software Development Kits (SDKs) and other toolkits
for the manipulation of applications running on the Cloud.
2.5.2 Server Virtualization
Server virtualization is a technique that allows a computer system to be partitioned into multiple
isolated execution environments, called Virtual Machines (VMs), each offering a service similar to
that of a single physical computer. Each VM can be configured in an independent way, having its
own operating system, applications, and network parameters. Commonly, such VMs are hosted on a
physical server running a hypervisor, the software that effectively virtualizes the server and manages
the VMs [54].
There are several hypervisor options that can be used for server virtualization. From the open-
source community, one can cite Citrix’s Xen1 and the Kernel-based Virtual Machine (KVM)2. From
the realm of proprietary solutions, some examples are VMware ESX3 and Microsoft’s Hyper-V4.
The main factor that boosted the adoption of server virtualization within Cloud
Computing is that such technology offers good flexibility regarding the dynamic reallocation of
workloads across servers. Such flexibility allows, for example, providers to execute maintenance on
servers without stopping developers’ applications (that are running on VMs) or to implement
strategies for better resource usage through the migration of VMs. Furthermore, server virtualization
supports the fast provisioning of new VMs through the use of templates, which enables
providers to offer elastic services to application developers [43].
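The maintenance scenario above can be sketched in a few lines: drain every VM from a host onto hosts with spare capacity, without the (simulated) VMs ever stopping. This is a toy model with hypothetical names; no real hypervisor API is involved.

```python
# Toy model of VM migration for host maintenance; names are hypothetical
# and no real hypervisor is involved.

class Host:
    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity  # e.g. CPU cores
        self.vms = {}             # VM name -> size

    def free(self):
        return self.capacity - sum(self.vms.values())

def drain(source, targets):
    """Migrate every VM off `source` to the first target that fits it."""
    for vm, size in list(source.vms.items()):
        target = next(t for t in targets if t.free() >= size)
        target.vms[vm] = source.vms.pop(vm)  # the VM keeps running elsewhere

h1 = Host("h1", 8)
h2 = Host("h2", 8)
h1.vms = {"vm-a": 3, "vm-b": 2}

drain(h1, [h2])  # h1 can now be taken down for maintenance
```

The same drain primitive, pointed at a different target set, doubles as a consolidation strategy for better resource usage.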
2.5.3 MapReduce Framework
MapReduce [15] is a programming framework developed by Google for distributed processing of
large data sets across computing infrastructures. Inspired by the map and reduce primitives present
in functional languages, its authors developed an entire framework for the automatic distribution of
computations. In this framework, developers are responsible for writing map and reduce operations
and for using them according to their needs, which is similar to the functional paradigm. These map
and reduce operations will be executed by the MapReduce system that transparently distributes
computations across the computing infrastructure and treats all issues related to node
communication, load balancing, and fault tolerance. For the distribution and synchronization of the
data required by the application, the MapReduce system also requires the use of a specially tailored
distributed file system called Google File System (GFS) [23].
Despite being introduced by Google, there are some open source implementations of the
MapReduce system, like Hadoop [6] and TPlatform [55]. The former is a popular open-source
software used for running applications on large clusters built of commodity hardware. This software
is used by large companies like Amazon, AOL, and IBM, as well as in different Web applications
such as Facebook, Twitter, Last.fm, among others. Basically, Hadoop is composed of two modules:
a MapReduce environment for distributed computing, and a distributed file system called the
Hadoop Distributed File System (HDFS). The latter is an academic initiative that provides a
development platform for Web mining applications. Similarly to Hadoop and Google’s MapReduce,
the TPlatform has a MapReduce module and a distributed file system known as the Tianwang File
System (TFS) [55].
1 http://www.xen.org/products/cloudxen.html
2 http://www.linux-kvm.org/page/Main_Page
3 http://www.vmware.com/
4 http://www.microsoft.com/hyper-v-server/en/us/default.aspx
The use of MapReduce solutions is a common groundwork technology in PaaS Clouds because
it offers a versatile sandbox for developers. Differently from IaaS Clouds, PaaS developers using a
general-purpose language with MapReduce support do not need to be concerned with software
configuration, software updates, and network configuration. All these tasks are the responsibility
of the Cloud provider, which, in turn, benefits from the fact that such configurations will be
standardized across the overall infrastructure.
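To give a flavor of the programming model, the sketch below runs a word count through user-supplied map and reduce functions in a single process. A real system such as Hadoop would distribute exactly these two functions across a cluster and handle node communication, load balancing, and fault tolerance transparently.

```python
from collections import defaultdict

# Single-process sketch of the MapReduce programming model (word count).

def map_fn(document):
    # emit an intermediate (word, 1) pair for every word in the input split
    return [(word, 1) for word in document.split()]

def reduce_fn(word, counts):
    # combine all intermediate values that share the same key
    return word, sum(counts)

def run_mapreduce(inputs, map_fn, reduce_fn):
    groups = defaultdict(list)  # the "shuffle" phase: group values by key
    for document in inputs:
        for key, value in map_fn(document):
            groups[key].append(value)
    return dict(reduce_fn(k, v) for k, v in groups.items())

result = run_mapreduce(["to be or not to be"], map_fn, reduce_fn)
# result == {"to": 2, "be": 2, "or": 1, "not": 1}
```

Note how the developer writes only `map_fn` and `reduce_fn`; everything inside `run_mapreduce` is what the framework takes off the developer's hands.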
2.5.4 Datacenters
Developers who are hosting their applications on a Cloud wish to scale their leased resources,
effectively increasing and decreasing their virtual infrastructure according to the demand of their
clients. This is also the case for developers making use of their own private Clouds. Thus,
independently of the class of Cloud under consideration, a robust and safe infrastructure is needed.
Whereas virtualization and MapReduce provide the software solutions required to attend to
this demand, the physical infrastructure of Clouds is based on datacenters, which are infrastructures
composed of IT components providing processing capacity, storage, and network services for one
or more organizations [66]. Currently, the size of a datacenter (in number of components) can vary
from tens to tens of thousands of components, depending on the datacenter’s mission. In addition,
there are several different IT components for datacenters, including switches and routers, load
balancers, storage devices, dedicated storage networks, and the main component of any datacenter:
servers [27].
Cloud Computing datacenters provide the power required to attend to developers’ demands in
terms of processing, storage, and networking capacities. A large datacenter running a virtualization
solution allows for a finer-grained division of the hardware’s power through the statistical
multiplexing of developers’ applications.
3 Distributed Cloud Computing
“Quae non prosunt singula, multa iuvant.”
Ovid
This chapter discusses the main concepts of Distributed Cloud (D-Cloud) Computing. It begins
with a discussion of its definition (Section 3.1) in an attempt to distinguish D-Clouds from
current Clouds and highlight their main characteristics. Next, the main research challenges regarding
resource management on D-Clouds will be described in Section 3.2.
3.1 Definitions
Current Cloud Computing setups involve huge investments in datacenters, which are the
common underlying infrastructure of Clouds, as previously detailed in Section 2.5.4. This
centralized infrastructure brings many well-known challenges, such as the need for resource
over-provisioning and the high cost of heat dissipation and temperature control. In addition to
concerns with infrastructure costs, one must observe that those datacenters are not necessarily close
to their clients, i.e., the network between end-users and the Cloud is often a long best-effort IP
connection, which means longer round-trip delays.
Considering such limitations, industry and academic researchers have presented indications that
small datacenters can sometimes be more attractive, since they offer a cheaper, lower-power
alternative while also reducing the infrastructure costs of centralized Clouds [12].
Moreover, Distributed Clouds, or just D-Clouds, as pointed out by Endo et al. in [20], can exploit
the possibility of link creation and the potential of sharing resources across geographic boundaries
to provide latency-based allocation of resources and, ultimately, to fully utilize this distributed
computing power. Thus, D-Clouds can reduce communication costs by simply provisioning data,
servers, and links close to end-users.
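A minimal sketch of such latency-based allocation, with hypothetical site names and latency measurements, could look as follows: among the sites with enough free capacity, the application is placed at the one with the lowest latency to the client's region.

```python
# Hypothetical sketch of latency-based allocation in a D-Cloud.

def place(app_demand, client_region, sites, latency_ms):
    """Pick the feasible site with the lowest latency to the client region.

    sites:      {site name: free capacity}
    latency_ms: {(site name, region): measured latency in milliseconds}
    """
    candidates = [s for s, free in sites.items() if free >= app_demand]
    if not candidates:
        raise RuntimeError("no site has enough free capacity")
    return min(candidates, key=lambda s: latency_ms[(s, client_region)])

sites = {"dc-north": 4, "dc-south": 2}
latency_ms = {("dc-north", "south"): 80.0, ("dc-south", "south"): 12.0}

chosen = place(2, "south", sites, latency_ms)
# chosen == "dc-south": both sites fit the demand, but dc-south is closer
```

Real allocation algorithms, including those examined later in this Thesis, must of course weigh latency against many other constraints; this sketch isolates only the geographic intuition behind Figures 4 and 5.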
Figure 4 illustrates how D-Clouds can reduce the cost of communication through the spread
of computational power and the usage of a latency-based allocation of applications. In Figure 4(a)
the client uses an application (App) running on the Cloud through the Internet, which is subject to
the latency imposed by the best-effort network. In Figure 4(b), the client is accessing the same App,
but in this case, the latency imposed by the network will be reduced due to the allocation of the App
on a server in a small datacenter closer to the client than in the previous scenario.
Figure 4 Comparison between (a) a current Cloud and (b) a D-Cloud
Please note that Figure 4(b) intentionally does not specify the network connecting the
infrastructure of the D-Cloud Provider. This network can be rented from different local ISPs (using
the Internet for interconnection) or from an ISP with wide area coverage. In addition, such an ISP
could be the D-Cloud Provider itself. This may be the case as the D-Cloud paradigm introduces an
organic change in the current Internet, where ISPs can start to act as D-Cloud providers. Thus, ISPs
could offer their communication and computational resources to developers interested in deploying
their applications in the specific markets covered by those ISPs.
This idea is illustrated by Figure 5, which shows a D-Cloud offered by a hypothetical Brazilian
ISP. In this example, a developer deployed its application (App) on two servers in order to attend to
requests from northern and southern clients. If the number of northeastern clients increases, the
developer can deploy its App (represented by the dotted box) on one server close to the northeast
region in order to improve its service quality. It is important to note that the contribution of this
Thesis falls within this last scenario, i.e., a scenario where the network and computational resources
are all controlled by the same provider.
Figure 5 ISP-based D-Cloud example
D-Clouds share similar characteristics with current Cloud Computing, including essential
offerings such as scalability, on-demand usage, and pay-as-you-go business plans. Furthermore, the
agents already stated for current Clouds (please see Figure 1) are exactly the same in the context of
D-Clouds. Finally, the many different classifications discussed in Section 2.3 can also be applied.
Despite the similarity, one may highlight two peculiarities of D-Clouds: support for geo-locality and
Network as a Service (NaaS) provisioning ([2], [63], [17]).
The geographical diversity of resources potentially improves cost and performance and gives
an advantage to several different applications, particularly those that do not require massive internal
communication among large server pools. In this category, as pointed out by [12], one can
highlight, first, applications already deployed in a distributed manner, like VoIP (Voice over IP)
and online games, and, second, applications that are good candidates for distributed
implementation, like traffic filtering and e-mail distribution. In addition, there are applications
that use software or data with specific legal restrictions on jurisdiction, and applications whose
public is restricted to one or more geographical areas, such as the tracking of bus or subway routes,
information about entertainment events, and local news.
Support for geo-locality can be considered a step further in the deployment of Cloud
Computing that leverages new opportunities for service providers. Thus, they will be free to choose
where to allocate their resources in order to serve specific niches, constraints on the jurisdiction of
software and data, or quality of service aspects of end-users.
NaaS (or Communication as a Service – CaaS, as cited in Section 2.3.2) allows service
providers to manage network resources instead of just computational ones. The authors in [2] define
NaaS as a service offering transport network connectivity with a level of virtualization suitable to be
invoked by service providers. In this way, D-Clouds are able to manage their network resources
at their convenience, offering better response times for hosted applications. NaaS is
close to the Network Virtualization (NV) research area [31], where the main problem consists in
choosing how to allocate a virtual network over a physical one, meeting requirements and
minimizing the usage of physical resources. Although NV and D-Clouds are subject to similar
problems and scenarios, there is an essential difference between the two. While NV commonly
models its resources at the infrastructure level (requests are always virtual networks mapped on
graphs), a D-Cloud can be engineered to work with applications at a different abstraction level,
exactly as occurs with actual Cloud service types like the ones described in Section 2.3.2. In this way,
one may see Network Virtualization simply as a particular instance of the D-Cloud. Other insights
about NV are given in Section 3.3.2.
Finally, it must be highlighted that the D-Cloud does not compete with the current Cloud
Computing paradigm, since the D-Cloud suits a certain type of application with hard
restrictions on geographical location, while existing Clouds remain attractive for
applications demanding massive computational resources and for simple applications with minor or
no restrictions on geographical location. Thus, current Cloud Computing providers are the first
potential candidates to take advantage of the D-Cloud paradigm, since they could hire
D-Cloud resources on demand and move applications to certain geographical locations in order
to meet specific developers’ requirements. In addition to current Clouds, D-Clouds can also
serve developers directly.
3.2 Research Challenges inherent to Resource Management
D-Clouds face challenges similar to the ones presented in the context of current Cloud Computing.
However, as stated in Chapter 1, the object of the present study is resource management in D-
Clouds. Thus, this Section gives special emphasis to the challenges of resource management in D-
Clouds, focusing on four categories as presented in [20]: a) resource modeling; b) resource
offering and treatment; c) resource discovery and monitoring; and d) resource selection.
3.2.1 Resource Modeling
The first challenge is the development of a suitable resource model, which is essential to all operations
in the D-Cloud, including management and control. Optimization algorithms are also strongly
dependent on the resource modeling scheme used.
In a D-Cloud environment, it is very important that resource modeling takes into account
physical resources as well as virtual ones. On one hand, the amount of detail in each resource
description should be chosen carefully: if resources are described in great detail, there is a risk that
resource optimization becomes hard and complex, since optimization problems considering
several modeled aspects can be NP-hard. On the other hand, more
detail gives more flexibility and leverages the usage of resources.
There are some alternatives for resource modeling in Clouds that could be applied to D-
Clouds. One can cite, for example, the OpenStack software project [53], which is focused on
producing an open standard Cloud operating system. It defines a RESTful HTTP service that
supports JSON and XML data formats and is used to request or exchange information about
Cloud resources and action commands. OpenStack also offers ways to describe how to scale servers
up or down (using pre-configured thresholds); it is extensible, allowing the seamless addition of new
features; and it returns additional error messages in case of faults.
Another resource modeling alternative is the Virtual Resources and Interconnection Networks
Description Language (VXDL) [39], whose main goal is to describe resources that compose a virtual
infrastructure while focusing on virtual grid applications. The VXDL is able to describe the
components of an infrastructure, their topology, and an execution chronogram. These three aspects
compose the main parts of a VXDL document. The computational resource specification part
describes resource parameters. Furthermore, some peculiarities of virtual Grids are also present,
such as the allocation of virtual machines in the same hardware and location dependence. The
specification of the virtual infrastructure can consider specific developers’ requirements such as
network topology and delay, bandwidth, and the direction of links. The execution chronogram
specifies the period of resource utilization, allowing efficient scheduling, which is a clear concern for
Grids rather than Cloud computing. Another interesting point of VXDL is the possibility of
describing resources individually or in groups, according to application needs. VXDL lacks support
for describing distinct services, since it is focused on grid applications only.
The proposal presented in [32], called VRD hereafter, describes resources in a network
virtualization scenario where infrastructure providers describe their virtual resources and services
prior to offering them. It takes into consideration the integration between the properties of virtual
resources and their relationships. An interesting point in the proposal is its use of functional and
non-functional attributes. Functional attributes are related to characteristics, properties, and
functions of components. Non-functional attributes specify criteria and constraints, such as
performance, capacity, and QoS. Among the functional properties that must be highlighted is the set
of component types: PhysicalNode, VirtualNode, Link, and Interface. Such properties suggest a
flexibility that can be used to represent routers or servers, in the case of nodes, and wired or wireless
links, in the case of communication links and interfaces.
Another proposal known as the Manifest language was developed by Chapman et al. [9]. They
proposed new meta-models to represent service requirements, constraints, and elasticity rules for
software deployment in a Cloud. The building block of such framework is the OVF (Open
Virtualization Format) standard, which was extended by Chapman et al. to realize the vision of D-
Clouds by considering locality constraints. These two points are very interesting to our scenario. With
regard to elasticity, it assumes a rule-based specification formed by three fields: a monitored
condition related to the state of the service (such as workload), an operator (relational and logical
ones are accepted), and an associated action to follow when the condition is met. The location
constraints identify sites that should be favored or avoided when selecting a location for a service.
Nevertheless, the Manifest language is focused on the software architecture. Hence, the language is
not concerned with other aspects such as resources’ status or network resources.
Cloud# is a language for modeling Clouds proposed by [16] to be used as a basis for Cloud
providers and clients to establish trust. The model is used by developers to understand the behavior
of Cloud services. The main goal of Cloud# is to describe how services are delivered, while taking
into consideration the interaction among physical and virtual resources. The main syntactic
construct within Cloud# is the computation unit CUnit, which can model Cloud systems, virtual
machines, or operating systems. A CUnit is represented as a tuple of six components modeling
characteristics and behaviors. This language gives developers a better understanding of the Cloud
organization and how their applications are dealt with.
3.2.2 Resource Offering and Treatment
Once the D-Cloud resources are modeled, the next challenge is to describe how resources are
offered to developers, which is important since the requirements supported by the provider are
defined in this step. This challenge also defines the interfaces of the D-Cloud. It
differs from resource modeling since the modeling is independent of the way that resources are
offered to developers. For example, the provider could model each resource individually, as
independent items on a fine-grained scale such as GHz of CPU or GB of memory, but could offer
them as a coupled collection or bundle of those items, such as the VM templates cited in Section
2.5.2.
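The difference between fine-grained modeling and bundled offering can be sketched as follows; the template names and sizes are purely illustrative assumptions, not taken from any particular provider.

```python
# Hypothetical VM templates: coarse-grained bundles assembled from
# fine-grained resource units (CPU cores, GB of RAM, GB of disk).
TEMPLATES = {
    "small":  {"cpu_cores": 1, "ram_gb": 2, "disk_gb": 20},
    "medium": {"cpu_cores": 2, "ram_gb": 4, "disk_gb": 40},
    "large":  {"cpu_cores": 4, "ram_gb": 8, "disk_gb": 80},
}

def can_host(server_free, template_name):
    """Check whether a server's free fine-grained capacity can hold
    the bundle requested as a template."""
    need = TEMPLATES[template_name]
    return all(server_free.get(k, 0) >= v for k, v in need.items())
```

The provider thus keeps its internal model fine-grained while exposing only the coarse bundles to developers.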
Recall that, in addition to computational requirements (CPU and memory) and traditional
network requirements, such as bandwidth and delay, new requirements are present under D-Cloud
scenarios. The topology of the nodes is a first interesting requirement to be described. Developers
should be able to set inter-nodes relationships and communication restrictions (e.g., downlink and
uplink rates). This is illustrated in the scenario where servers – configured and managed by
developers – are distributed across different geographical localities while needing to
communicate with each other in a specific way.
Jurisdiction is related to where (geographically) applications and their data must be stored and
handled. Due to restrictions such as copyright laws, D-Cloud users may want to limit the location
where their information will be stored (such as countries or continents). Another geographical
constraint can be imposed as a maximum (or minimum) physical distance (or delay value) between
nodes. Here, though developers do not know the actual topology of the nodes, they may
simply establish some delay threshold value, for example.
Developers should also be able to describe scalability rules, which would specify how and
when the application would grow and consume more resources from the D-Cloud. Authors in [21]
and [9] define a way of doing this, allowing the Cloud user to specify actions that should be taken,
like deploying new VMs, based on thresholds of metrics monitored by the D-Cloud itself.
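Such a rule can be expressed as a (metric, operator, threshold, action) tuple evaluated against metrics monitored by the D-Cloud; the metric names and actions below are illustrative assumptions, not the exact syntax of [21] or [9].

```python
import operator

# Hypothetical elasticity rules: (metric, operator, threshold, action).
RULES = [
    ("cpu_load", operator.gt, 0.80, "deploy_vm"),   # scale out under load
    ("cpu_load", operator.lt, 0.20, "remove_vm"),   # scale in when idle
]

def evaluate(rules, metrics):
    """Return the actions triggered by the current monitored metrics."""
    return [action
            for metric, op, threshold, action in rules
            if op(metrics.get(metric, 0.0), threshold)]
```

The mediation system would run such an evaluation periodically and hand the triggered actions to the allocation layer.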
Additionally, resource offering is associated with interoperability. Current Cloud providers offer
proprietary interfaces to access their services, which can lock users into their infrastructure, as
applications cannot easily be migrated between providers [8]. It is hoped that Cloud
providers recognize this problem and work together to offer a standardized API.
According to [61], Cloud interoperability faces two types of heterogeneities: vertical
heterogeneity and horizontal heterogeneity. The first type is concerned with interoperability within a
single Cloud and may be addressed by a common middleware throughout the entire infrastructure.
The second challenge, the horizontal heterogeneity, is related to Clouds from different providers.
Therefore, the key challenge is dealing with these differences. In this case, a high level of granularity
in the modeling may help to address the problem.
An important effort in the search for horizontal standardization comes from the Open Cloud
Manifesto5, which is an initiative supported by hundreds of companies that aims to discuss a way to
produce open standards for Cloud Computing. Their major doctrines are collaboration and
coordination of efforts on the standardization, adoption of open standards wherever appropriate,
and the development of standards based on customer requirements. Participants of the Open Cloud
Manifesto, through the Cloud Computing Use Case group, produced an interesting white paper [51]
highlighting the requirements that need to be standardized in a cloud environment to ensure
interoperability in the most typical scenarios of interaction in Cloud Computing.
5 http://www.opencloudmanifesto.org/
Another group involved with Cloud standards is the Open Grid Forum6, which intends to
develop the specification of the Open Cloud Computing Interface (OCCI)7. The goal of OCCI is to
provide an easily extendable RESTful interface for Cloud management. Originally, OCCI was
designed for IaaS setups, but its current specification [46] was extended to offer a generic scheme
for the management of different Cloud services.
3.2.3 Resource Discovery and Monitoring
When requests reach a D-Cloud, the system should be aware of the current status of resources, in
order to determine if there are available resources in the D-Cloud that could satisfy the requests. In
this way, the right mechanisms for resource discovery and monitoring should also be designed,
allowing the system to be aware of the updated status of all its resources. Then, based on the current
status and request’ requirements, the system may select and allocate resources to serve these new
request.
Resource monitoring should be continuous and support allocation and reallocation
decisions as part of the overall resource usage optimization. A careful analysis should be done to
find an acceptable trade-off between the amount of control overhead and the frequency of
resource information updates.
Monitoring may be passive or active. It is considered passive when one or more
entities collect information: such an entity may continuously send polling messages to nodes asking
for information or may do so on demand when necessary. On the other hand, monitoring is
active when nodes are autonomous and may decide when to send state information asynchronously
to some central entity. Naturally, D-Clouds may use both alternatives simultaneously to improve the
monitoring solution. In this case, it is necessary to synchronize updates in repositories to maintain
the consistency and validity of state information.
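The two styles can be illustrated with a minimal in-process sketch; in a real D-Cloud the calls below would be RPCs or a monitoring protocol, and the class and method names are hypothetical.

```python
class Node:
    """A monitored D-Cloud node with a single load metric."""
    def __init__(self, name, load):
        self.name, self.load = name, load

    def report(self):
        # Answers a poll from the collector (passive style).
        return {"node": self.name, "load": self.load}

class Collector:
    """Central entity holding the last known state of every node."""
    def __init__(self):
        self.state = {}

    def poll(self, nodes):
        # Passive monitoring: the collector asks each node.
        for n in nodes:
            info = n.report()
            self.state[info["node"]] = info["load"]

    def receive(self, info):
        # Active monitoring: a node pushes its state asynchronously.
        self.state[info["node"]] = info["load"]
```

Combining both paths into the same `state` repository is exactly where the synchronization concern mentioned above arises.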
Discovery and monitoring in a D-Cloud can be accompanied by the development of
specific communication protocols. Such protocols act as a standard control plane in the Cloud,
allowing interoperability between devices. It is expected that such protocols can control the
different elements present in the D-Cloud, including servers, switches, routers, load balancers, and
storage components. One possible method of coping with this challenge is to use smart
communication nodes with an open programming interface for creating new services within the
node. One example of this type of open node can be seen in the emerging OpenFlow-enabled
switches [44].
6 http://www.gridforum.org/
7 http://occi-wg.org/about/specification/
3.2.4 Resource Selection and Optimization
With information regarding Cloud resource availability at hand, a set of appropriate candidates may
then be highlighted. Next, the resource selection process finds the configuration that fulfills all
requirements and optimizes the usage of the infrastructure. Selecting solutions from a set of
available ones is not a trivial task due to the dynamicity, the high algorithmic complexity, and all the
different requirements that must be contemplated by the provider.
The problem of resource allocation is recurrent in computer science, and several computing
areas have faced this type of problem since early operating systems. Particularly in the Cloud
Computing field, due to the heterogeneous and time-variant environment of Clouds, resource
allocation becomes a complex task, forcing the mediation system to respond with minimal
turnaround time in order to maintain the developers’ quality requirements. Also, balancing
resource load and designing energy-efficient Clouds are major challenges in Cloud Computing.
This last aspect is especially relevant given the high demand for electricity to power and
cool the servers hosted in datacenters [7].
In a Cloud, energy savings may be achieved through many different strategies. Server
consolidation, for example, is a useful strategy for minimizing energy consumption while
maintaining high usage of server resources. This strategy saves energy by migrating VMs onto
fewer servers and putting idle servers into a standby state. Developing automated solutions for
server consolidation can be a very complex task, since these solutions can be mapped to bin-packing
problems known to be NP-hard [72].
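As an illustration of the bin-packing view, the classical first-fit decreasing heuristic can be sketched as below; this is a generic textbook heuristic, not the consolidation algorithm of any specific Cloud system, and VM loads are abstracted to a single dimension.

```python
def consolidate_ffd(vm_loads, server_capacity):
    """First-fit decreasing: pack VM loads onto as few servers as
    possible; servers left empty can then be put into standby.
    Returns a list of servers, each a list of the VM loads it hosts."""
    servers = []
    for load in sorted(vm_loads, reverse=True):
        for s in servers:
            if sum(s) + load <= server_capacity:
                s.append(load)       # fits on an already-open server
                break
        else:
            servers.append([load])   # open a new server
    return servers
```

For example, five VMs with loads 5, 7, 3, 2, and 3 fit on two servers of capacity 10, which is optimal here since the total load is 20.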
VM migration and cloning provide the technology to balance load over servers within a Cloud,
provide fault tolerance against unpredictable errors, or reallocate applications before a programmed
service interruption. However, although this technology is present in major industry hypervisors
(like VMware or Xen), some open problems remain to be investigated. These include cloning a
VM into multiple replicas on different hosts [40] and developing VM migration across wide-area
networks [14]. Also, VM migration introduces a network problem, since, after migration, VMs
require adaptation of the link-layer forwarding. Some of the strategies for new datacenter
architectures explained in [67] offer solutions to this problem.
Remodeling of datacenter architectures is another research field that tries to overcome
limitations on scalability, stiffness of address spaces, and node congestion in Clouds. The authors in
[67] surveyed this theme, highlighted the problems in the network topologies of state-of-the-art
datacenters, and discussed literature solutions for these problems. One of these solutions is the
D-Cloud, as also pointed out by [72], which offers an energy-efficient alternative for constructing a
cloud and an adapted solution for time-critical services and interactive applications.
Considering specifically the challenges of resource allocation in D-Clouds, one can highlight
correlated studies based on Replica Placement and Network Virtualization. The former is
applied in Content Distribution Networks (CDNs) and tries to decide where and when content
servers should be positioned in order to improve the system’s performance. This problem is
associated with the placement of applications in D-Clouds. The latter research field can be applied
to D-Clouds considering that a virtual network is an application composed of servers, databases,
and the network between them. Both research fields are described in the following sections.
Replica Placement
Replica Placement (RP) comprises a very broad class of problems. The main objective of this type
of problem is to decide where, when, and by whom servers or their content should be positioned in
order to improve CDN performance. The corresponding existing solutions to these problems are
generally known as Replica Placement Algorithms (RPAs) [35].
The general RP problem is modeled as a physical topology (represented by a graph), a set of
clients requesting services, and some servers to place on the graph (costs per server can be
considered instead). Generally, there is a pre-established cost function to be optimized that reflects
service-related aspects, such as the load of user requests, the distance from the server, etc. As
pointed out by [35], an RPA groups these aspects into two different components: the problem
definition, which consists of a cost function to be minimized under some constraints, and a
heuristic, which is used to search for near-optimal solutions in a feasible time frame, since the
defined problems are usually NP-complete.
Several different variants of this general problem have already been studied. According to [57],
they fall into two classes: facility location and minimum K-median. In the facility location problem,
the main goal is to minimize the total cost of the graph through the placement of a number of
servers, each of which has an associated cost. The minimum K-median problem, in turn, is similar
but assumes the existence of a pre-defined number K of servers. More details on the modeling and
comparison of different variants of the RP problem are provided by [35].
Different versions of this problem can be mapped onto resource allocation problems in D-
Clouds. A very simple mapping can be defined considering an IaaS service where virtual machines
can be allocated in a geo-distributed infrastructure. In such mapping, the topology corresponds to
the physical infrastructure elements of the D-Cloud, the VMs requested by developers can be
treated as servers, and the number of clients accessing each server would be their load.
Qiu et al. [57] proposed three different algorithms to solve the K-median problem in a CDN
scenario: Tree-based algorithm, Greedy algorithm, and Hot Spot algorithm. The Tree-based solution
assumes that the underlying graph is a tree, which is divided into several small trees, placing one server
in each small tree. The Greedy algorithm places servers one at a time in order to obtain a better
solution in each step until all servers are allocated. Finally, the Hot Spot solution attempts to place
servers in the vicinity of clients with the greatest demand. The results showed that the Greedy
Algorithm for replica placement could provide CDNs with performance that is close to optimal.
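The spirit of the greedy strategy can be sketched as follows, assuming a client-to-site distance matrix and per-client demands as inputs; this is a simplified illustration, not the exact algorithm of [57].

```python
def greedy_placement(dist, demand, k):
    """Greedily pick k server sites, one at a time, each time choosing
    the site that most reduces the total demand-weighted distance from
    every client to its nearest chosen server.
    dist[i][j]: distance from client i to candidate site j;
    demand[i]:  request load of client i."""
    chosen = []
    sites = range(len(dist[0]))
    for _ in range(k):
        best_site, best_cost = None, float("inf")
        for s in sites:
            if s in chosen:
                continue
            trial = chosen + [s]
            cost = sum(d * min(row[t] for t in trial)
                       for d, row in zip(demand, dist))
            if cost < best_cost:
                best_site, best_cost = s, cost
        chosen.append(best_site)
    return chosen
```

In the D-Cloud mapping sketched above, candidate sites would be the provider's physical locations and demands the per-region client loads of one developer.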
These solutions can be mapped onto D-Clouds considering the simple scenario of VM
allocation on a geo-distributed infrastructure with the restriction that each developer has a fixed
number of servers to serve their clients. In this case, the problem can be straightforwardly
reduced to the K-median problem, and the three proposed solutions could be applied. Basically, one
could treat each developer as a different CDN and optimize each one independently, while still
considering the limited capacity of the physical resources caused by the allocations of other developers.
Presti et al. [56] treat a RP variant considering a trade-off between the load of requests per
content and the number of replica additions and removals. Their solution considers that each server
in the physical topology decides autonomously, based on thresholds, when to clone overloaded
contents or remove underutilized ones. Such decisions also encompass the minimization of
the distance between clients and the respective accessed replica. A similar problem is investigated in
[50], but considering constraints on the QoS perceived by the client. The authors propose an offline
mathematical formulation and an online version that uses a greedy heuristic. The results
show that the heuristic achieves good results in little computational time.
The main focus of these solutions is to provide scalability to the CDN according to the load
caused by client requests. Thus, despite working only with the placement of content replicas, such
solutions can also be applied to D-Clouds with some simple modifications. Considering replicas as
allocated VMs, one can apply the threshold-based solution proposed in [56] to the simple scenario
of VM scalability on a geo-distributed infrastructure.
Network Virtualization
The main problem of NV is the allocation of virtual networks over a physical network ([10], [3]).
Analogously, D-Clouds’ main goal is to allocate application requests on physical resources according
to some constraints while attempting to obtain a clever mapping between the virtual and physical
resources. Therefore, problems on D-Clouds can be formulated as NV problems, especially in
scenarios considering IaaS-level services.
26
Several instances of the NV-based resource allocation problem can be reduced to NP-hard
problems [48]. Even the version where one knows beforehand all the virtual network requests that
will arrive in the system is NP-hard. The basic solution strategy is thus to restrict the problem space,
making it easier to deal with, and also to consider the use of simple heuristic-based algorithms to
achieve fast results.
Given a model based on graphs to represent both physical and virtual servers, switches, and
links [10], an algorithm that allocates virtual networks should consider the constraints of the
problem (CPU, memory, location, or bandwidth limits) and an objective function based on the
algorithm’s goals. In [31], the authors describe some possible objective functions to be
optimized, such as maximizing the revenue of the service provider or minimizing link
and node stress. They also survey heuristic techniques used when allocating virtual
networks, dividing them into two types: static and dynamic. The dynamic type permits reallocation
over time, adding more resources to already allocated virtual networks in order to obtain
better performance. The static type means that once a virtual network is allocated, it will hardly ever
change its setup.
To exemplify the type of problems studied in NV, one can discuss the problem
studied by Chowdhury et al. [10]. The authors propose an objective function related to the cost and
revenue of the provider, constrained by capacity and geo-location restrictions. They reduce the
problem to a mixed integer programming problem and then relax the integer constraints, deriving
two different algorithms to approximate the solution. Furthermore, the paper also
describes a load balancing algorithm, in which the original objective function is customized
in order to avoid using nodes and links with low residual capacity. This approach results in
allocation on less loaded components and an increase in the revenue and acceptance ratio of
the substrate network.
Such types of problems and solutions can be applied to D-Clouds. One example could be the
allocation of interactive servers with jurisdiction restrictions. In this scenario, the provider must
allocate applications (which can be mapped onto virtual networks) whose nodes are linked and
must be close to a certain geographical place according to a maximum tolerated delay. Thus, a
provider could apply the proposed algorithms with minor adjustments.
In the paper by Razzaq and Rathore [58], the virtual network embedding algorithm is divided
into two steps: node mapping and link mapping. In the node mapping step, nodes with the highest
resource demand are allocated first. The link mapping step is based on an edge-disjoint k-shortest
path algorithm, selecting the shortest path that can fulfill the virtual link bandwidth
requirement. In [42], a backtracking algorithm for the allocation of virtual networks onto substrate
networks, based on the graph isomorphism problem, is proposed. The modeling considers multiple
capacity constraints.
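A toy version of such a two-stage strategy can be sketched as follows. It is a simplification under assumed data structures: hop-count shortest paths stand in for edge-disjoint k-shortest paths, and node mapping is a naive greedy choice of the physical node with the most residual CPU.

```python
from collections import deque

def embed(vnodes, vlinks, pnodes, plinks):
    """Two-stage sketch: (1) map virtual nodes in decreasing CPU demand
    onto the physical node with most residual CPU; (2) map each virtual
    link onto the shortest physical path (by hop count) whose edges all
    have enough residual bandwidth.
    vnodes: {vn: cpu}, vlinks: {(va, vb): bw},
    pnodes: {pn: cpu}, plinks: {(pa, pb): bw} (undirected).
    Returns (node_map, link_map) or None if the embedding fails."""
    cpu = dict(pnodes)
    bw = {}
    for (a, b), c in plinks.items():
        bw[(a, b)] = bw[(b, a)] = c

    # Stage 1: node mapping, highest demand first.
    node_map = {}
    for vn in sorted(vnodes, key=vnodes.get, reverse=True):
        host = max(cpu, key=cpu.get)
        if cpu[host] < vnodes[vn]:
            return None                  # not enough CPU anywhere
        cpu[host] -= vnodes[vn]
        node_map[vn] = host

    def bfs(src, dst, need):
        # Shortest path using only edges with residual bandwidth >= need.
        seen, q = {src: [src]}, deque([src])
        while q:
            u = q.popleft()
            if u == dst:
                return seen[u]
            for (a, b), c in bw.items():
                if a == u and c >= need and b not in seen:
                    seen[b] = seen[u] + [b]
                    q.append(b)
        return None

    # Stage 2: link mapping with bandwidth bookkeeping.
    link_map = {}
    for (va, vb), need in vlinks.items():
        path = bfs(node_map[va], node_map[vb], need)
        if path is None:
            return None                  # no feasible path
        for a, b in zip(path, path[1:]):
            bw[(a, b)] -= need
            bw[(b, a)] -= need
        link_map[(va, vb)] = path
    return node_map, link_map
```

Decoupling the two stages keeps each step simple, at the cost of possibly rejecting requests that a coordinated node-and-link mapping would accept.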
Zhu and Ammar [74] proposed a set of four algorithms with the goal of balancing the load on
the physical links and nodes, but their algorithms do not consider capacity aspects. The algorithms
perform the initial allocation and make adaptive optimizations to obtain better allocations. The key
idea is to allocate virtual nodes considering the load of the node and the load of
the neighboring links of that node; thus, one can say that they perform the allocation in a coordinated
way. For virtual link allocation, the algorithm tries to select paths with few stressed links in the
network. For more details about the algorithms, see [74].
Considering the objectives of the NV and RP problems, one may note that NV problems are a
general form of the RP problem: RP problems try to allocate only virtual servers, whereas NV
considers the allocation of virtual servers and virtual links. Both categories of problems can be
applied to D-Clouds. Particularly, RP and NV problems may be mapped onto two different classes
of D-Clouds: less controllable D-Clouds and more controllable ones, respectively. RP problems
are suitable for scenarios where the allocation of servers is more critical than that of links. In turn,
NV problems are especially adapted to situations where the provider is an ISP that has full control
over the whole infrastructure, including the communication infrastructure.
3.2.5 Summary
The D-Cloud domain brings several engineering and research challenges, which were discussed in
this section and whose main aspects are summarized in Table I. Such challenges are only starting to
receive attention from the research community. In particular, the system, models, languages, and
algorithms presented in the next chapters cope with some of these challenges.
Table I Summary of the main aspects discussed

Resource Modeling:
- Heterogeneity of resources
- Physical and virtual resources must be considered
- Complexity vs. flexibility

Resource Offering and Treatment:
- Describe the resources offered to developers
- Describe the supported requirements
- New requirements: topology, jurisdiction, scalability

Resource Discovery and Monitoring:
- Monitoring must be continuous
- Control overhead vs. updated information

Resource Selection and Optimization:
- Find resources to fulfill developers’ requirements
- Optimize usage of the D-Cloud infrastructure
- Complex problems solved by approximation algorithms
4 The Nubilum System
“Expulsa nube, serenus fit saepe dies.”
(“Once the cloud is driven away, the day often becomes clear.”)
Popular Proverb
Section 2.4 introduced an Archetypal Cloud Mediation System, focusing specifically on the resource
management process, which ranges from the automatic negotiation of developers’ requirements to
the execution of their applications. Furthermore, that system was divided into three layers:
negotiation, resource management, and resource control. Keeping this simple archetypal mediation
system in mind, this chapter presents Nubilum, a resource management system that offers a
self-managed solution to the challenges resulting from the discovery, monitoring, control, and
allocation of resources in D-Clouds. This system appeared previously in [25] under the name
D-CRAS (Distributed Cloud Resource Allocation System).
Section 4.1 presents some decisions taken to guide the overall design and implementation of
Nubilum. Section 4.2 presents a conceptual view of Nubilum’s architecture, highlighting its
main modules. The functional components of Nubilum are detailed in Section 4.3. Section 4.4
presents the main processes performed by Nubilum. Section 4.5 closes this chapter by summarizing
the contributions of the system and comparing them with related resource management systems.
4.1 Design Rationale
As stated previously in Section 1.2, the objective of this Thesis is to develop a self-manageable
system for resource management in D-Clouds. Before developing the system and its
corresponding architecture, some design decisions that will guide the development of the system
must be delineated and justified.
4.1.1 Programmability
The first aspect to be defined is the abstraction level at which Nubilum will act. Given that D-
Cloud concerns can be mapped onto previous approaches from the Replica Placement and
Network Virtualization research areas (see Section 3.2.4), a straightforward approach would be to
consider a D-Cloud working at the same abstraction level. Therefore, knowing that proposals in
both areas commonly work at the IaaS level, i.e., providing virtualized infrastructures,
Nubilum naturally also operates at the IaaS level.
Thus, Nubilum offers a Network Virtualization service: applications are treated as virtual
networks, and the provider's infrastructure is the physical network. In this way, the allocation
problem becomes a virtual network assignment problem, and previous solutions from the NV area can be
applied. Note that this approach does not exclude previous Replica Placement solutions, since
that area can be viewed as a particular case of Network Virtualization.
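To make the abstraction concrete, the virtual network assignment problem can be sketched as a mapping of virtual nodes onto physical hosts with sufficient capacity. The greedy heuristic below is only an illustrative sketch, not Nubilum's actual allocation algorithm; the host names and CPU-capacity fields are hypothetical.

```python
# Illustrative greedy virtual-network assignment (NOT Nubilum's real
# algorithm): place each virtual node on the physical host with the most
# spare CPU capacity. Names and capacity units are hypothetical.

def assign(virtual_nodes, physical_nodes):
    """virtual_nodes: dict of virtual node name -> required CPU.
    physical_nodes: dict of host name -> available CPU.
    Returns a dict mapping virtual node -> host, or None if infeasible."""
    spare = dict(physical_nodes)          # remaining capacity per host
    mapping = {}
    # Place the most demanding virtual nodes first (classic greedy order).
    for vnode, need in sorted(virtual_nodes.items(), key=lambda kv: -kv[1]):
        host = max(spare, key=spare.get)  # host with the most spare capacity
        if spare[host] < need:
            return None                   # no host can take this node
        spare[host] -= need
        mapping[vnode] = host
    return mapping

if __name__ == "__main__":
    vn = {"web": 2, "db": 4}
    pn = {"host-a": 4, "host-b": 8}
    print(assign(vn, pn))  # {'db': 'host-b', 'web': 'host-a'}
```

A production assignment would also map virtual links onto physical paths with enough bandwidth, which is what makes the general problem hard; the sketch covers only the node-mapping half.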
4.1.2 Self-optimization
As defined in Section 2.1, the Cloud must provide services in a timely manner, i.e., resources
required by users must be configured as quickly as possible. To meet this restriction,
Nubilum must operate as far as possible without human intervention, which is the very definition
of self-management from Autonomic Computing [69].
Such operation involves the maintenance and adjustment of D-Cloud resources in the face of
changing application demands and of accidental or malicious failures. Thus, Nubilum must provide
solutions that cope with the four aspects advocated by Autonomic Computing: self-configuration, self-
healing, self-optimization, and self-protection. This Thesis focuses on investigating self-
optimization (and, to some extent, self-configuration) on D-Clouds; the other two
aspects are considered out of the scope of this proposal.
According to [69], self-optimization involves letting a system's elements "continually seek
ways to improve their operation, identifying and seizing opportunities to make themselves more
efficient in performance or cost". This definition fits the aim of Nubilum well: the system must
automatically monitor and control resources to guarantee the optimal functioning of
the Cloud while meeting developers' requirements.
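Self-optimization of this kind is usually realized as an autonomic monitor-analyze-plan-execute loop. The sketch below illustrates one pass of such a loop under simple assumptions; the function, metric, and action names are hypothetical, not Nubilum's real interfaces.

```python
# Hedged sketch of one pass of an autonomic monitor-analyze-plan-execute
# loop for self-optimization (all names are hypothetical illustrations,
# not Nubilum's actual components).

def optimize_step(cloud_state, sla_limits):
    """cloud_state: dict of resource name -> observed load (0.0-1.0).
    sla_limits: dict of resource name -> maximum acceptable load.
    Returns the list of corrective actions; empty when all SLAs hold."""
    actions = []
    for resource, load in cloud_state.items():   # Monitor: read metrics
        limit = sla_limits.get(resource, 1.0)
        if load > limit:                         # Analyze: detect violation
            # Plan: schedule a rebalancing action for the hot resource.
            actions.append(("rebalance", resource))
    return actions                               # Execute is left to the caller

if __name__ == "__main__":
    state = {"node-1": 0.95, "node-2": 0.40}
    limits = {"node-1": 0.80, "node-2": 0.80}
    print(optimize_step(state, limits))  # [('rebalance', 'node-1')]
```

Running this loop periodically, without operator involvement, is what "continually seek ways to improve their operation" amounts to in practice.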
4.1.3 Existing standards adoption
The Open Cloud Manifesto, an industry initiative that aims to produce open
standards for Cloud Computing, states that Cloud providers "must use and adopt existing standards
wherever appropriate" [51]. The Manifesto argues that the IT industry has already invested
considerable effort in standardization, so it is more productive and economical to reuse such
standards when appropriate. Following this line, Nubilum adopts industry standards
whenever possible; this adoption also extends to open processes and software tools.
4.2 Nubilum’s conceptual view
As shown in Figure 6, the conceptual view of Nubilum's architecture comprises three planes: a
Decision plane, a Management plane, and an Infrastructure plane. Starting from the bottom, the
lower plane hosts all modules responsible for the appropriate virtualization of each resource in the
D-Cloud.
História da Igreja - A queda do Império RomanoHistória da Igreja - A queda do Império Romano
História da Igreja - A queda do Império RomanoGlauco Gonçalves
 
História da Igreja - Concílios de Nicéia e Constantinopla
História da Igreja - Concílios de Nicéia e ConstantinoplaHistória da Igreja - Concílios de Nicéia e Constantinopla
História da Igreja - Concílios de Nicéia e ConstantinoplaGlauco Gonçalves
 

Mehr von Glauco Gonçalves (20)

História da Igreja - Cruzadas
História da Igreja - CruzadasHistória da Igreja - Cruzadas
História da Igreja - Cruzadas
 
Nubilum: Sistema para gerência de recursos em Nuvens Distribuídas
Nubilum: Sistema para gerência de recursos em Nuvens DistribuídasNubilum: Sistema para gerência de recursos em Nuvens Distribuídas
Nubilum: Sistema para gerência de recursos em Nuvens Distribuídas
 
A Santa Inquisição
A Santa InquisiçãoA Santa Inquisição
A Santa Inquisição
 
História da Igreja - Fátima e o Século XX
História da Igreja - Fátima e o Século XX História da Igreja - Fátima e o Século XX
História da Igreja - Fátima e o Século XX
 
História da Igreja - O Século XIX e as Revoluções
História da Igreja - O Século XIX e as RevoluçõesHistória da Igreja - O Século XIX e as Revoluções
História da Igreja - O Século XIX e as Revoluções
 
História da Igreja - Revolução Francesa
História da Igreja - Revolução FrancesaHistória da Igreja - Revolução Francesa
História da Igreja - Revolução Francesa
 
História da Igreja - Embates islâmico-cristãos
História da Igreja - Embates islâmico-cristãosHistória da Igreja - Embates islâmico-cristãos
História da Igreja - Embates islâmico-cristãos
 
História da Igreja - Reforma e Contra-reforma
História da Igreja - Reforma e Contra-reformaHistória da Igreja - Reforma e Contra-reforma
História da Igreja - Reforma e Contra-reforma
 
História da Igreja - O Renascimento
História da Igreja - O RenascimentoHistória da Igreja - O Renascimento
História da Igreja - O Renascimento
 
História da Igreja - Visão Geral da Modernidade
História da Igreja - Visão Geral da ModernidadeHistória da Igreja - Visão Geral da Modernidade
História da Igreja - Visão Geral da Modernidade
 
Igreja na Idade Média
Igreja na Idade MédiaIgreja na Idade Média
Igreja na Idade Média
 
História da Igreja - O Cisma do Ocidente
História da Igreja - O Cisma do OcidenteHistória da Igreja - O Cisma do Ocidente
História da Igreja - O Cisma do Ocidente
 
História da Igreja - Os gloriosos séculos XII e XIII
História da Igreja - Os gloriosos séculos XII e XIIIHistória da Igreja - Os gloriosos séculos XII e XIII
História da Igreja - Os gloriosos séculos XII e XIII
 
História da Igreja - O Cisma do Oriente
História da Igreja - O Cisma do OrienteHistória da Igreja - O Cisma do Oriente
História da Igreja - O Cisma do Oriente
 
História da Igreja - Cluny e a reforma da Igreja
História da Igreja - Cluny e a reforma da IgrejaHistória da Igreja - Cluny e a reforma da Igreja
História da Igreja - Cluny e a reforma da Igreja
 
História da Igreja - Francos: de Clóvis a Carlos Magno
História da Igreja - Francos: de Clóvis a Carlos MagnoHistória da Igreja - Francos: de Clóvis a Carlos Magno
História da Igreja - Francos: de Clóvis a Carlos Magno
 
História da Igreja - Visão geral da Idade Média
História da Igreja - Visão geral da Idade MédiaHistória da Igreja - Visão geral da Idade Média
História da Igreja - Visão geral da Idade Média
 
O Primado de São Pedro
O Primado de São PedroO Primado de São Pedro
O Primado de São Pedro
 
História da Igreja - A queda do Império Romano
História da Igreja - A queda do Império RomanoHistória da Igreja - A queda do Império Romano
História da Igreja - A queda do Império Romano
 
História da Igreja - Concílios de Nicéia e Constantinopla
História da Igreja - Concílios de Nicéia e ConstantinoplaHistória da Igreja - Concílios de Nicéia e Constantinopla
História da Igreja - Concílios de Nicéia e Constantinopla
 

Kürzlich hochgeladen

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...apidays
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 

Kürzlich hochgeladen (20)

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 

Nubilum: Resource Management System for Distributed Clouds

  • 1. Pós-Graduação em Ciência da Computação “Nubilum: Resource Management System for Distributed Clouds” Por Glauco Estácio Gonçalves Tese de Doutorado Universidade Federal de Pernambuco posgraduacao@cin.ufpe.br www.cin.ufpe.br/~posgraduacao RECIFE, 03/2012
  • 2. UNIVERSIDADE FEDERAL DE PERNAMBUCO CENTRO DE INFORMÁTICA PÓS-GRADUAÇÃO EM CIÊNCIA DA COMPUTAÇÃO GLAUCO ESTÁCIO GONÇALVES “Nubilum: Resource Management System for Distributed Clouds” ESTE TRABALHO FOI APRESENTADO À PÓS-GRADUAÇÃO EM CIÊNCIA DA COMPUTAÇÃO DO CENTRO DE INFORMÁTICA DA UNIVERSIDADE FEDERAL DE PERNAMBUCO COMO REQUISITO PARCIAL PARA OBTENÇÃO DO GRAU DE DOUTOR EM CIÊNCIA DA COMPUTAÇÃO. ORIENTADORA: Dra. JUDITH KELNER CO-ORIENTADOR: Dr. DJAMEL SADOK RECIFE, MARÇO/2012
  • 3.
  • 4. Tese de Doutorado apresentada por Glauco Estácio Gonçalves à Pós-Graduação em Ciência da Computação do Centro de Informática da Universidade Federal de Pernambuco, sob o título “Nubilum: Resource Management System for Distributed Clouds” orientada pela Profa. Judith Kelner e aprovada pela Banca Examinadora formada pelos professores: ___________________________________________________________ Prof. Paulo Romero Martins Maciel Centro de Informática / UFPE ___________________________________________________________ Prof. Stênio Flávio de Lacerda Fernandes Centro de Informática / UFPE ___________________________________________________________ Prof. Kelvin Lopes Dias Centro de Informática / UFPE ___________________________________________________________ Prof. José Neuman de Souza Departamento de Computação / UFC ___________________________________________________________ Profa. Rossana Maria de Castro Andrade Departamento de Computação / UFC Visto e permitida a impressão. Recife, 12 de março de 2012. ___________________________________________________ Prof. Nelson Souto Rosa Coordenador da Pós-Graduação em Ciência da Computação do Centro de Informática da Universidade Federal de Pernambuco.
  • 5. To my family Danielle, João Lucas, and Catarina.
  • 6. iv Acknowledgments I would like to express my gratitude to God, cause of all things and also of my existence; and to the Blessed Virgin Mary, to whom I appealed many times in prayer, always being attended. I would like to thank my advisor Dr. Judith Kelner and my co-advisor Dr. Djamel Sadok, whose expertise and patience added considerably to my doctoral experience. Thanks for the trust in my capacity to conduct my doctorate at GPRT (Networks and Telecommunications Research Group). I am indebted to all the people from GPRT for their invaluable help with this work. A very special thanks goes out to Patrícia, Marcelo, and André Vítor, who have given valuable comments over the course of my PhD. I must also acknowledge my committee members, Dr. Jose Neuman, Dr. Otto Duarte, Dr. Rossana Andrade, Dr. Stênio Fernandes, Dr. Kelvin Lopes, and Dr. Paulo Maciel, for reviewing my proposal and dissertation and offering helpful comments to improve my work. I would like to thank my wife Danielle for her prayer, patience, and love, which gave me the necessary strength to finish this work. A special thanks to my children, João Lucas and Catarina. They are gifts of God that make life delightful. Finally, I would like to thank my parents, João and Fátima, and my sisters, Cynara and Karine, for their love. Their blessings have always been with me as I pursued my doctoral research.
  • 7. v Abstract The current infrastructure of Cloud Computing providers is composed of networking and computational resources located in large datacenters that host hundreds of thousands of diverse IT equipment. In such a scenario, there are several management challenges related to energy management, failure and operational management, and temperature control. Moreover, the geographical distance between resources and final users is a source of delay when accessing the services. An alternative to such challenges is the creation of Distributed Clouds (D-Clouds), with resources geographically distributed along a network infrastructure with broad coverage. Providing resources in such a distributed scenario is not a trivial task, since, beyond processing and storage resources, network resources must be taken into consideration, offering users a connectivity service for data transportation (also called Network as a Service – NaaS). Thereby, the allocation of resources must consider the virtualization of both servers and network devices. Furthermore, resource management must cover all steps, from the initial discovery of the adequate resources for attending developers’ demands to their final delivery to the users. Considering those challenges in resource management in D-Clouds, this Thesis proposes Nubilum, a system for resource management on D-Clouds that considers geo-locality of resources and NaaS aspects. Through its processes and algorithms, Nubilum offers solutions for discovery, monitoring, control, and allocation of resources in D-Clouds in order to ensure the adequate functioning of the D-Cloud while meeting developers’ requirements. Nubilum and its underlying technologies and building blocks are described, and its allocation algorithms are evaluated to verify their efficacy and efficiency. Keywords: cloud computing, resource management mechanisms, network virtualization.
  • 8. vi Resumo Atualmente, a infraestrutura dos provedores de computação em Nuvem é composta por recursos de rede e de computação, que são armazenados em datacenters de centenas de milhares de equipamentos. Neste cenário, encontram-se diversos desafios quanto à gerência de energia e controle de temperatura, além de, devido à distância geográfica entre os recursos e os usuários, ser fonte de atraso no acesso aos serviços. Uma alternativa a tais desafios é o uso de Nuvens Distribuídas (Distributed Clouds – D-Clouds) com recursos distribuídos geograficamente ao longo de uma infraestrutura de rede com cobertura abrangente. Prover recursos em tal cenário distribuído não é uma tarefa trivial, pois, além de recursos computacionais e de armazenamento, deve-se considerar recursos de rede os quais são oferecidos aos usuários da nuvem como um serviço de conectividade para transporte de dados (também chamado Network as a Service – NaaS). Desse modo, o processo de alocação deve considerar a virtualização de ambos, servidores e elementos de rede. Além disso, a gerência de recursos deve considerar desde a descoberta dos recursos adequados para atender as demandas dos usuários até a manutenção da qualidade de serviço na sua entrega final. Considerando estes desafios em gerência de recursos em D-Clouds, este trabalho propõe Nubilum: um sistema para gerência de recursos em D-Cloud que considera aspectos de geo-localidade e NaaS. Por meio de seus processos e algoritmos, Nubilum oferece soluções para descoberta, monitoramento, controle e alocação de recursos em D-Clouds de forma a garantir o bom funcionamento da D-Cloud, além de atender os requisitos dos desenvolvedores. As diversas partes e tecnologias de Nubilum são descritas em detalhes e suas funções delineadas. Ao final, os algoritmos de alocação do sistema são também avaliados de modo a verificar sua eficácia e eficiência. Palavras-chave: computação em nuvem, mecanismos de alocação de recursos, virtualização de redes.
  • 9. vii Contents
    Abstract v
    Resumo vi
    Abbreviations and Acronyms xii
    1 Introduction 1
    1.1 Motivation 2
    1.2 Objectives 4
    1.3 Organization of the Thesis 4
    2 Cloud Computing 6
    2.1 What is Cloud Computing? 6
    2.2 Agents involved in Cloud Computing 7
    2.3 Classification of Cloud Providers 8
    2.3.1 Classification according to the intended audience 8
    2.3.2 Classification according to the service type 8
    2.3.3 Classification according to programmability 10
    2.4 Mediation System 11
    2.5 Groundwork Technologies 12
    2.5.1 Service-Oriented Computing 12
    2.5.2 Server Virtualization 12
    2.5.3 MapReduce Framework 13
    2.5.4 Datacenters 14
    3 Distributed Cloud Computing 15
    3.1 Definitions 15
    3.2 Research Challenges inherent to Resource Management 18
    3.2.1 Resource Modeling 18
    3.2.2 Resource Offering and Treatment 20
    3.2.3 Resource Discovery and Monitoring 22
    3.2.4 Resource Selection and Optimization 23
    3.2.5 Summary 27
    4 The Nubilum System 28
    4.1 Design Rationale 28
    4.1.1 Programmability 28
    4.1.2 Self-optimization 29
    4.1.3 Existing standards adoption 29
    4.2 Nubilum’s conceptual view 29
    4.2.1 Decision plane 30
    4.2.2 Management plane 31
    4.2.3 Infrastructure plane 32
    4.3 Nubilum’s functional components 32
    4.3.1 Allocator 33
    4.3.2 Manager 34
  • 10. viii
    4.3.3 Worker 35
    4.3.4 Network Devices 36
    4.3.5 Storage System 37
    4.4 Processes 37
    4.4.1 Initialization processes 37
    4.4.2 Discovery and monitoring processes 38
    4.4.3 Resource allocation processes 39
    4.5 Related projects 40
    5 Control Plane 43
    5.1 The Cloud Modeling Language 43
    5.1.1 CloudML Schemas 45
    5.1.2 A CloudML usage example 52
    5.1.3 Comparison and discussion 56
    5.2 Communication interfaces and protocols 57
    5.2.1 REST Interfaces 57
    5.2.2 Network Virtualization with Openflow 63
    5.3 Control Plane Evaluation 65
    6 Resource Allocation Strategies 68
    6.1 Manager Positioning Problem 68
    6.2 Virtual Network Allocation 70
    6.2.1 Problem definition and modeling 72
    6.2.2 Allocating virtual nodes 74
    6.2.3 Allocating virtual links 75
    6.2.4 Evaluation 76
    6.3 Virtual Network Creation 81
    6.3.1 Minimum length Steiner tree algorithms 82
    6.3.2 Evaluation 86
    6.4 Discussion 89
    7 Conclusion 91
    7.1 Contributions 92
    7.2 Publications 93
    7.3 Future Work 94
    References 96
List of Figures
Figure 1 Agents in a typical Cloud Computing scenario (from [24]) .... 7
Figure 2 Classification of Cloud types (from [71]) .... 9
Figure 3 Components of an Archetypal Cloud Mediation System (adapted from [24]) .... 11
Figure 4 Comparison between (a) a current Cloud and (b) a D-Cloud .... 16
Figure 5 ISP-based D-Cloud example .... 17
Figure 6 Nubilum’s planes and modules .... 30
Figure 7 Functional components of Nubilum .... 33
Figure 8 Schematic diagram of Allocator’s modules and relationships with other components .... 33
Figure 9 Schematic diagram of Manager’s modules and relationships with other components .... 34
Figure 10 Schematic diagram of Worker modules and relationships with the server system .... 35
Figure 11 Link discovery process using LLDP and Openflow .... 38
Figure 12 Sequence diagram of the Resource Request process for a developer .... 39
Figure 13 Integration of different descriptions using CloudML .... 44
Figure 14 Basic status type used in the composition of other types .... 45
Figure 15 Type for reporting status of the virtual nodes .... 46
Figure 16 XML Schema used to report the status of the physical node .... 46
Figure 17 Type for reporting complete description of the physical nodes .... 46
Figure 18 Type for reporting the specific parameters of any node .... 47
Figure 19 Type for reporting information about the physical interface .... 48
Figure 20 Type for reporting information about a virtual machine .... 48
Figure 21 Type for reporting information about the whole infrastructure .... 49
Figure 22 Type for reporting information about the physical infrastructure .... 49
Figure 23 Type for reporting information about a physical link .... 50
Figure 24 Type for reporting information about the virtual infrastructure .... 50
Figure 25 Type describing the service offered by the provider .... 51
Figure 26 Type describing the requirements that can be requested by a developer .... 52
Figure 27 Example of a typical Service description XML .... 53
Figure 28 Example of a Request XML .... 53
Figure 29 Physical infrastructure description .... 54
Figure 30 Virtual infrastructure description .... 55
Figure 31 Communication protocols employed in Nubilum .... 57
Figure 32 REST operation for the retrieval of service information .... 59
Figure 33 REST operation for updating information of a service .... 59
Figure 34 REST operation for requesting resources for a new application .... 59
Figure 35 REST operation for changing resources of a previous request .... 60
Figure 36 REST operation for releasing resources of an application .... 60
Figure 37 REST operation for registering a new Worker .... 60
Figure 38 REST operation to unregister a Worker .... 61
Figure 39 REST operation for updating information of a Worker .... 61
Figure 40 REST operation for retrieving a description of the D-Cloud infrastructure .... 61
Figure 41 REST operation for updating the description of a D-Cloud infrastructure .... 61
Figure 42 REST operation for the creation of a virtual node .... 62
Figure 43 REST operation for updating a virtual node .... 62
Figure 44 REST operation for removal of a virtual node .... 62
Figure 45 REST operation for requesting the discovered physical topology .... 63
Figure 46 REST operation for the creation of a virtual link .... 63
Figure 47 REST operation for updating a virtual link .... 64
Figure 48 REST operation for removal of a virtual link .... 64
Figure 49 Example of a typical rule for ARP forwarding .... 65
Figure 50 Example of the typical rules created for virtual links: (a) direct, (b) reverse .... 65
Figure 51 Example of a D-Cloud with ten workers and one Manager .... 69
Figure 52 Algorithm for allocation of virtual nodes .... 74
Figure 53 Example illustrating the minimax path .... 75
Figure 54 Algorithm for allocation of virtual links .... 76
Figure 55 The (a) old and (b) current network topologies of RNP used in simulations .... 77
Figure 56 Results for the maximum node stress in the (a) old and (b) current RNP topology .... 78
Figure 57 Results for the maximum link stress in the (a) old and (b) current RNP topology .... 79
Figure 58 Results for the mean link stress in the (a) old and (b) current RNP topology .... 80
Figure 59 Mean path length (a) old and (b) current RNP topology .... 80
Figure 60 Example creating a virtual network: (a) before the creation; (b) after the creation .... 81
Figure 61 Search procedure used by the GHS algorithm .... 83
Figure 62 Placement procedure used by the GHS algorithm .... 84
Figure 63 Example of the placement procedure: (a) before and (b) after placement .... 85
Figure 64 Percentage of optimal samples for GHS and STA in the old RNP topology .... 87
Figure 65 Percentage of samples reaching relative error ≤ 5% in the old RNP topology .... 88
Figure 66 Percentage of optimal samples for GHS and STA in the current RNP topology .... 88
Figure 67 Percentage of samples reaching relative error ≤ 5% in the current RNP topology .... 89
List of Tables
Table I Summary of the main aspects discussed .... 27
Table II MIME types used in the overall communications .... 58
Table III Models for the length of messages exchanged in the system in bytes .... 67
Table IV Characteristics present in Nubilum’s resource model .... 71
Table V Reduced set of characteristics considered by the proposed allocation algorithms .... 72
Table VI Factors and levels used in the MPA’s evaluation .... 78
Table VII Factors and levels used in the GHS’s evaluation .... 86
Table VIII Scientific papers produced .... 94
Abbreviations and Acronyms
CDN Content Delivery Network
CloudML Cloud Modeling Language
D-Cloud Distributed Cloud
DHCP Dynamic Host Configuration Protocol
GHS Greedy Hub Selection
HTTP Hypertext Transfer Protocol
IaaS Infrastructure as a Service
ISP Internet Service Provider
LLDP Link Layer Discovery Protocol
MPA Minimax Path Algorithm
MPP Manager Positioning Problem
NaaS Network as a Service
NV Network Virtualization
OA Optimal Algorithm
OCCI Open Cloud Computing Interface
PoP Point of Presence
REST Representational State Transfer
RP Replica Placement
RPA Replica Placement Algorithm
STA Steiner Tree Approximation
VM Virtual Machine
VN Virtual Network
XML Extensible Markup Language
ZAA Zhu and Ammar Algorithm
1 Introduction

“A linea incipere.”
Erasmus

Nowadays, it is common to access content across the Internet with little reference to the underlying datacenter hosting infrastructure maintained by content providers. The technology used to provide this level of locality transparency also enables a new model for the provisioning of computing services, known as Cloud Computing. This model is attractive because it allows resources to be provisioned according to users’ requirements, leading to overall cost reduction. Cloud users can rent resources as they become necessary, in a much more scalable and elastic way. Moreover, such users can transfer operational risks to cloud providers. From the viewpoint of those providers, the model offers a way to better utilize their own infrastructure. Armbrust et al. [1] point out that this model benefits from a form of statistical multiplexing, since it allocates resources to several users concurrently on a demand basis. This statistical multiplexing of datacenters builds on several decades of research in areas such as distributed computing, Grid computing, web technologies, service computing, and virtualization.

Current Cloud Computing providers mainly use large, consolidated datacenters to offer their services. However, the ever-increasing need for over-provisioning to meet peak demands and to provide redundancy against failures, combined with expensive cooling needs, drives up the energy costs of centralized datacenters [62]. In current datacenters, the cooling technologies used for heat dissipation account for as much as 50% of the total power consumption [38]. In addition to these aspects, the network between users and the Cloud is often an unreliable best-effort IP service, which can harm delay-constrained services and interactive applications.
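The statistical multiplexing argument above — that a shared pool serving many users' fluctuating demands needs far less capacity than provisioning every user for its own peak — can be illustrated with a small simulation. All numbers below are illustrative, not taken from this Thesis:

```python
import random

random.seed(7)

# Hypothetical demand traces for 20 users over 100 time slots,
# each fluctuating independently between 0 and 10 resource units.
users = [[random.uniform(0, 10) for _ in range(100)] for _ in range(20)]

# Dedicated provisioning: each user gets capacity for its own peak.
dedicated = sum(max(trace) for trace in users)

# Statistical multiplexing: the shared pool only needs to cover the
# peak of the *aggregate* demand across all users.
aggregate = [sum(trace[t] for trace in users) for t in range(100)]
multiplexed = max(aggregate)

print(f"dedicated capacity:   {dedicated:.1f}")
print(f"multiplexed capacity: {multiplexed:.1f}")
assert multiplexed < dedicated  # pooling smooths out independent peaks
```

Because the users' peaks rarely coincide, the aggregate peak is well below the sum of individual peaks, which is precisely the saving a Cloud provider captures.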
To deal with these problems, there are indications that small cooperative datacenters can be more attractive, since they offer a cheaper, lower-power alternative that reduces the infrastructure costs of centralized Clouds [12]. These small datacenters can be built in different geographical regions and connected by dedicated or public (provided by Internet Service Providers) networks, forming a new type of Cloud referred to as a Distributed Cloud. Such Distributed Clouds [20], or just D-Clouds, can exploit the possibility of creating (virtual) links and the potential of sharing resources across geographic boundaries to provide latency-based allocation of resources and fully utilize this emerging distributed computing power. D-Clouds can reduce communication costs by simply provisioning storage, servers, and network resources close to end-users. D-Clouds can be considered an additional step in the ongoing deployment of Cloud Computing: one that supports different requirements and leverages new opportunities for service providers. Users in a Distributed Cloud will be free to choose where to allocate their resources in order to serve specific market niches, constraints on the jurisdiction of software and data, or quality-of-service aspects of their clients.

1.1 Motivation

Similarly to Cloud Computing, one of the most important design aspects of D-Clouds is the availability of “infinite” computing resources that may be used on demand. Cloud users see this “infinite” resource pool because the Cloud continuously monitors and manages its resources and allocates them elastically. Nevertheless, providing on-demand computing instances and network resources in a distributed scenario is not a trivial task. Dynamic allocation of resources, and their possible reallocation, is essential for accommodating unpredictable demands and, ultimately, contributes to return on investment. In the context of Clouds, the essential feature of any resource management system is to guarantee that both user and provider requirements are met satisfactorily. Particularly in D-Clouds, users may have network requirements, such as bandwidth and delay constraints, in addition to the common computational requirements, such as CPU, memory, and storage. Furthermore, other user requirements are relevant, including node locality, topology of nodes, jurisdiction, and application interaction.
The development of solutions to cope with resource management problems remains a very important topic in the field of Cloud Computing. In this regard, there are solutions focused on grid computing ([49], [70]) and on datacenters in current Cloud Computing scenarios ([4]). However, such strategies do not fit D-Clouds well, as they rely heavily on assumptions that do not hold in Distributed Cloud scenarios. For example, such solutions are designed for over-provisioned networks and commonly do not take into consideration the communication cost between resources, an important aspect of D-Clouds that must be carefully monitored and/or reserved in order to meet users’ requirements.
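The point about communication cost can be made concrete with a toy comparison: an allocator that ranks candidate hosts only by free capacity may pick a node far from the application's users, while one that also weighs network distance picks differently. The hosts, numbers, and linear weighting below are illustrative assumptions, not an algorithm from this Thesis:

```python
# Toy candidate hosts: (name, free CPU cores, latency in ms from the
# host to the application's users). All values are illustrative.
candidates = [
    ("dc-north", 16, 80.0),
    ("dc-south", 8, 12.0),
    ("dc-west", 12, 45.0),
]

def pick(hosts, latency_weight):
    # Higher score is better: reward free capacity, penalize latency.
    return max(hosts, key=lambda h: h[1] - latency_weight * h[2])

# A datacenter-style allocator that ignores communication cost:
print(pick(candidates, latency_weight=0.0)[0])   # → dc-north
# A D-Cloud-style allocator that also accounts for latency to users:
print(pick(candidates, latency_weight=0.2)[0])   # → dc-south
```

The two policies choose different hosts from the same pool, which is why allocation strategies built for over-provisioned datacenter networks cannot simply be reused in a D-Cloud.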
The design of a resource management system involves challenges beyond the design of optimization algorithms for resource management. Since D-Clouds are composed of computational and network devices with different architectures, software, and hardware capabilities, the first challenge is the development of a suitable resource model covering all this heterogeneity [20]. The next challenge is to describe how resources are offered, which is important since the requirements supported by the D-Cloud provider are defined in this step. The remaining challenges relate to the overall operation of the resource management system. When requests arrive, the system should be aware of the current status of resources, in order to determine whether there are sufficient available resources in the D-Cloud to satisfy the present request. Thus, the right mechanisms for resource discovery and monitoring should also be designed, allowing the system to be aware of the updated status of all its resources. Then, based on the current status and the requirements of the request, the system may select and allocate resources to serve the request.

Please note that the solution to these challenges involves the fine-grained coordination of several distributed components and the orchestrated execution of the several subsystems composing the resource management system. At a first glance, these subsystems can be organized into three parts: one responsible for the direct negotiation of requirements with users; another responsible for deciding which resources to allocate to given applications; and a last part responsible for the effective enforcement of these decisions on the resources. Designing such a system is a very interesting and challenging task, and it raises the following research questions that will be investigated in this thesis:

1. How do Cloud users describe their requirements?
In order to enable automatic negotiation between users and the D-Cloud, the Cloud must recognize a language or formalism for requirements description. Thus, the investigation of this topic must determine the proper characteristics of such a language and survey existing approaches to this topic in related computing areas.

2. How to represent the resources available in the Cloud?

Correlated with the first question, the resource management system must also maintain an information model representing all the resources in the Cloud, including their relationships (topology) and their current status.

3. How are users’ applications mapped onto Cloud resources?

This question concerns the core of resource allocation, i.e., the algorithms, heuristics, and strategies used to decide on the set of resources that meets the applications’ requirements while optimizing a utility function.
4. How to enforce the decisions made?

The effective enforcement of the decisions involves extending communication protocols, or developing new ones, in order to set up the state of the overall resources in the D-Cloud.

1.2 Objectives

The main objective of this Thesis is to propose an integrated solution to problems related to the management of resources in D-Clouds. This solution is presented as Nubilum, a resource management system that addresses, in a self-managed way, the challenges of discovery, control, monitoring, and allocation of resources in D-Clouds. Nubilum provides fine-grained orchestration of its components in order to allocate applications on a D-Cloud. The specific goals of this Thesis are strictly related to the research questions presented in Section 1.1; they are:

• Elaborate an information model to describe D-Cloud resources and application requirements, such as computational restrictions, topology, geographic location, and other related aspects that can be employed to request resources directly from the D-Cloud;
• Explore and extend communication protocols for the provisioning and allocation of computational and communication resources;
• Develop algorithms, heuristics, and strategies to find suitable D-Cloud resources based on several different application requirements;
• Integrate the information model, the algorithms, and the communication protocols into a single solution.

1.3 Organization of the Thesis

This Thesis identifies the challenges involved in resource management in Distributed Cloud Computing and presents solutions for some of these challenges. The remainder of this document is organized as follows.

The general concepts that form the basis for all the other chapters are introduced in the second chapter. Its main objective is to discuss Cloud Computing, exploring its definition and classifying the main approaches in this area.
The Distributed Cloud Computing concept and several important aspects of resource management in such scenarios are introduced in the third chapter. Moreover, this chapter makes a comparative analysis of related research areas and problems.
The fourth chapter introduces the first contribution of this Thesis: the Nubilum resource management system, which aggregates the several solutions proposed in this Thesis. Moreover, the chapter highlights the rationale behind Nubilum as well as its main modules and components.

The fifth chapter examines and evaluates the control plane of Nubilum. It describes the proposed Cloud Modeling Language and details the communication interfaces and protocols used for communication between Nubilum components.

The sixth chapter gives an overview of the resource allocation problems in Distributed Clouds and makes a thorough examination of the specific problems related to Nubilum. Some particular problems are analyzed, and a set of algorithms is presented and evaluated.

The seventh chapter reviews the obtained evaluation results, summarizes the contributions, and sets the path for future work and open issues on D-Clouds.
2 Cloud Computing

“Definitio est declaratio essentiae rei.”
Legal Proverb

In this chapter the main concepts of Cloud Computing are presented. It begins with a discussion on the definition of Cloud Computing (Section 2.1) and the main agents involved in Cloud Computing (Section 2.2). Next, classifications of Cloud initiatives are offered in Section 2.3. An exemplary and simple architecture of a Cloud Mediation System is presented in Section 2.4, followed by a presentation in Section 2.5 of the main technologies acting behind the scenes of Cloud Computing initiatives.

2.1 What is Cloud Computing?

A definition of Cloud Computing is given by the National Institute of Standards and Technology (NIST) of the United States: “Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction” [45].

The definition says that on-demand dynamic reconfiguration (elasticity) is a key characteristic. Additionally, it highlights another Cloud Computing characteristic: it assumes that minimal management effort is required to reconfigure resources. In other words, the Cloud must offer self-service solutions that serve requests on demand, excluding from the scope of Cloud Computing those initiatives that operate through the rental of computing resources on a weekly or monthly basis. Hence, it restricts Cloud Computing to systems that provide automatic mechanisms for resource rental in real time with minimal human intervention.

The NIST definition gives a satisfactory concept of Cloud Computing as a computing model, but it does not cover the main object of Cloud Computing: the Cloud. Thus, in this Thesis, Cloud Computing is defined as the computing model that operates based on Clouds.
In turn, the Cloud is defined as a conceptual layer that operates above an infrastructure to provide elastic services in a timely manner.
This definition encompasses three main characteristics of Clouds. Firstly, it notes that a Cloud is primarily a concept, i.e., an abstraction over an infrastructure. Thus, it is independent of the employed technologies, and therefore one can accept different setups, like Amazon EC2 or Google App Engine, to be named Clouds. Moreover, the infrastructure is defined in a broad sense, since it can be composed of software, physical devices, and/or other Clouds. Secondly, all Clouds have the same purpose: to provide services. This means that a Cloud hides the complexity of the underlying infrastructure while exposing the potential of the overlying services, acting as a middleware. In addition, providing a service implicitly involves some type of agreement that should be guaranteed by the Cloud. Such agreements can vary from pre-defined contracts to malleable agreements defining functional and non-functional requirements. Note that these services are qualified as elastic, which carries the same meaning as the dynamic reconfiguration in the NIST definition. Last but not least, the Cloud must provide services as quickly as possible, such that infrastructure resources are allocated and reallocated to meet users’ needs.

2.2 Agents involved in Cloud Computing

In contrast to previous approaches ([64], [8], [72], and [68]), this Thesis focuses on only three distinct agents in Cloud Computing, as shown in Figure 1: clients, developers, and the provider. The first notable point is that the provider deals with two types of users, called developers and clients. Clients are the customers of a service produced by a developer. Clients use services from developers, but such use generates demand on the provider that actually hosts the service, and therefore the client can also be considered a user of the Cloud.
It is important to highlight that in some scenarios (like scientific computing or batch processing) a developer may behave as a client of the Cloud, because it is the end-user of the applications. The text will use “users” when referring to both classes without distinction.

Figure 1 Agents in a typical Cloud Computing scenario (from [24])

Developers can be service providers, independent programmers, scientific institutions, and so on, i.e., all who build applications on the Cloud. They create and run their applications while leaving decisions related to maintenance and management of the infrastructure to the provider. Please note that, a priori, developers do not need to know about the technologies that make up the Cloud infrastructure, nor about the specific location of each item in the infrastructure. Lastly, the term application is used to mean all types of services that can be developed on the Cloud. In addition, it is important to note that the types of applications supported by a Cloud depend exclusively on the goals of the Cloud as determined by the provider. Such a wide range of possible targets generates many different types of Cloud Providers, which are discussed in the next section.

2.3 Classification of Cloud Providers

Currently, there are several operational initiatives of Cloud Computing; however, despite all being called Clouds, they provide different types of services. For that reason, the academic community ([64], [8], [45], [72], and [71]) has classified these solutions in order to understand their relationships. Three complementary proposals for classification are presented next.

2.3.1 Classification according to the intended audience

This first simple taxonomy is suggested by NIST [45], which organizes providers according to the audience at which the Cloud is aimed. There are four classes in this classification: Private Clouds, Community Clouds, Public Clouds, and Hybrid Clouds. The first three classes accommodate providers in a gradual opening of the intended audience. The Private Cloud class encompasses Clouds destined to be used solely by one organization, operating over its own datacenter or one leased from a third party for exclusive use. When the Cloud infrastructure is shared by a group of organizations with similar interests, it is classified as a Community Cloud. Furthermore, the Public Cloud class encompasses all initiatives intended to be used by the general public.
Finally, Hybrid Clouds are simply the composition of two or more Clouds pertaining to different classes (Private, Community, or Public).

2.3.2 Classification according to the service type

In [71], the authors offer a classification as represented in Figure 2. This taxonomy divides Clouds into five categories: Cloud Application, Cloud Software Environment, Cloud Software Infrastructure, Software Kernel, and Firmware/Hardware. The authors arranged the different types of Clouds in a stack, showing that Clouds at higher levels are created using services at the lower levels. This idea is consistent with the definitions of Cloud Computing discussed previously in Sections 2.1 and 2.2: essentially, the Cloud provider does not need to be the owner of the infrastructure.
Figure 2 Classification of Cloud types (from [71])

The class at the top of the stack, also called Software-as-a-Service (SaaS), involves applications accessed through the Internet, including social networks, webmail, and office tools. Such services provide software to be used by the general public, whose main interest is to avoid tasks related to software management, like installation and updating. From the point of view of the Cloud provider, SaaS can decrease the costs of software implementation when compared with traditional processes.

Similarly, the Cloud Software Environment class, also called Platform-as-a-Service (PaaS), encloses Clouds that offer programming environments for developers. Through well-defined APIs, developers can use software modules for access control, authentication, distributed processing, and so on, in order to produce their own applications in the Cloud. Moreover, developers can contract services for automatic scalability of their software, databases, and storage services.

In the middle of the stack there is the Cloud Software Infrastructure class of initiatives. This class encompasses solutions that provide virtual versions of infrastructure devices found in datacenters, like servers, databases, and links. Clouds in this class can be divided into three subclasses according to the type of resource they offer. Computational resources are grouped in the Infrastructure-as-a-Service (IaaS) subclass, which provides generic virtual machines that can be used in many different ways by the contracting developer. Services for massive data storage are grouped in the Data-as-a-Service (DaaS) subclass, whose main mission is to store users’ data remotely, allowing those users to access their data from anywhere and at any time. Finally, the third subclass, called Communications-as-a-Service (CaaS), is composed of solutions that offer virtual private links and routers through telecommunication infrastructures.
The last two classes do not offer Cloud services specifically, but they are included in the classification to show that providers offering Clouds in higher layers can have their own software and hardware infrastructure. The Software Kernel class includes all of the software necessary to provide services to the other categories, like operating systems, hypervisors, cloud management middleware, programming APIs, and libraries. Finally, the Firmware/Hardware class covers all sale and rental services of physical servers and communication hardware.

2.3.3 Classification according to programmability

The five-class scheme presented above can classify and organize the current spectrum of Cloud Computing solutions, but such a model is limited because the number of classes and their relationships will need to be rearranged as new Cloud services emerge. Therefore, in this Thesis, a different classification model will be used, based on the programmability concept previously introduced by Endo et al. [19]. Borrowed from the realm of network virtualization [11], programmability is a concept related to the programming features a network element offers to developers, measuring how much freedom the developer has to manipulate resources and/or devices. This concept can be easily applied to the comparison of Cloud Computing solutions. More programmable Clouds offer environments where developers are free to choose programming paradigms, languages, and platforms. Less programmable Clouds restrict developers in some way, perhaps by forcing a set of programming languages or by providing support for only one application paradigm. Programmability also directly affects the way developers manage their leased resources. From this point of view, providers of less programmable Clouds are responsible for managing their infrastructure transparently to developers. In turn, a more programmable Cloud leaves more of these tasks to developers, thus introducing management difficulties due to the more heterogeneous programming environment.

Thus, Cloud Programmability can be defined as the level of sovereignty developers have to manipulate services leased from a provider. Programmability is a relative concept, i.e., it is used to compare one Cloud with others.
Also, programmability is directly proportional to the heterogeneity of the provider's infrastructure and inversely proportional to the amount of effort that developers must spend to manage leased resources. To illustrate how this concept can be used, one can classify two current Clouds: Amazon EC2 and Google App Engine. Clearly, Amazon EC2 is the more programmable of the two, since developers can choose among different virtual machine classes, operating systems, and so on. After they lease one of these virtual machines, developers can configure it to work as they see fit: as a web server, as a content server, as a unit for batch processing, and so on. On the other hand, Google App Engine can be classified as a less programmable solution, because it only allows developers to create Web applications that will be hosted by Google. This restricts developers to the Web paradigm and to a limited set of programming languages.
2.4 Mediation System
Figure 3 introduces an Archetypal Cloud Mediation System. This is a conceptual model that will be used as a reference for the discussion on Resource Management in this Thesis. The Archetypal Cloud Mediation System focuses on one principle: resource management as the main service of any Cloud Computing provider. Thus, other important services like authentication, accounting, and security are out of the scope of this conceptual system and are therefore separated from the Mediation System. Clients also do not factor into this view of the system, since resource management is mainly related to the allocation of developers' applications and meeting their requirements.
Figure 3 Components of an Archetypal Cloud Mediation System (adapted from [24])
The mediation system is responsible for the entire process of resource management in the Cloud. Such a process covers tasks that range from the automatic negotiation of developers' requirements to the execution of their applications. It has three main layers: negotiation, resource management, and resource control. The negotiation layer deals with the interface between the Cloud and developers. In the case of Clouds selling infrastructure services, the interface can be a set of operations based on Web Services for control of the leased virtual machines. Alternatively, in the case of PaaS services, this interface can be an API for software development in the Cloud. Moreover, the negotiation layer handles the process of contract establishment between enterprises and the Cloud. Currently, this process is simple and the contracts tend to be restrictive. One can expect that, in the future, Clouds will offer more sophisticated avenues for user interaction through high-level abstractions and service-level policies.
The resource management layer is responsible for the allocation of applications so as to obtain the maximum usage of resources. This function requires advanced strategies and heuristics to allocate resources that meet the contractual requirements established with the application developer. These may include service quality restrictions, jurisdiction restrictions, and elastic adaptation, among others. Metaphorically, one can say that while the resource management layer acts as the “brain” of the Cloud, the resource control layer plays the role of its “limbs”. Resource control encompasses all functions needed to enforce the decisions generated by the upper layer. Beyond the tools used to effectively configure the Cloud resources, all communication protocols used by the Cloud are included in this layer.
2.5 Groundwork Technologies
This section discusses some of the main technologies used by current Cloud mediation systems, namely Service-oriented Computing, Virtualization, MapReduce, and Datacenters.
2.5.1 Service-Oriented Computing
Service-Oriented Computing defines a set of principles, architectural models, and technologies for the design and development of distributed applications. The recent development of software focused on services gave rise to SOA (Service-Oriented Architecture), which can be defined as an architectural model “that supports the discovery, message exchange, and integration between loosely coupled services using industry standards” [37]. The common technology for the implementation of SOA principles is the Web Service, which defines a set of standards to implement services over the World Wide Web. In Cloud Computing, SOA is the main paradigm for the development of functions on the several layers of the Cloud. Cloud providers publish APIs for their services on the web, allowing developers to use the Cloud and to automate several tasks related to the management of their applications.
Such APIs can assume the form of WSDL documents or REST-based interfaces. Furthermore, providers can make available Software Development Kits (SDKs) and other toolkits for the manipulation of applications running on the Cloud.
2.5.2 Server Virtualization
Server virtualization is a technique that allows a computer system to be partitioned into multiple isolated execution environments, called Virtual Machines (VMs), each offering a service similar to that of a single physical computer. Each VM can be configured in an independent way, having its own operating system, applications, and network parameters. Commonly, such VMs are hosted on a
physical server running a hypervisor, the software that effectively virtualizes the server and manages the VMs [54]. There are several hypervisor options that can be used for server virtualization. From the open-source community, one can cite Citrix's Xen and the Kernel-based Virtual Machine (KVM). From the realm of proprietary solutions, some examples are VMware ESX and Microsoft's Hyper-V. The main factor that boosted the adoption of server virtualization within Cloud Computing is that such technology offers good flexibility regarding the dynamic reallocation of workloads across servers. Such flexibility allows, for example, providers to execute maintenance on servers without stopping developers' applications (which are running on VMs), or to implement strategies for better resource usage through the migration of VMs. Furthermore, server virtualization is well suited to the fast provisioning of new VMs through the use of templates, which enables providers to offer elastic services to application developers [43].
2.5.3 MapReduce Framework
MapReduce [15] is a programming framework developed by Google for the distributed processing of large data sets across computing infrastructures. Inspired by the map and reduce primitives present in functional languages, its authors developed an entire framework for the automatic distribution of computations. In this framework, developers are responsible for writing map and reduce operations and for using them according to their needs, similarly to the functional paradigm. These map and reduce operations are executed by the MapReduce system, which transparently distributes computations across the computing infrastructure and handles all issues related to node communication, load balancing, and fault tolerance.
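The map/reduce contract described above can be sketched in a few lines of Python. This is a minimal, sequential stand-in for the distributed framework — the word-count job and all function names are illustrative, and a real MapReduce system would run the map and reduce calls on many nodes with a distributed shuffle phase in between:

```python
from collections import defaultdict

def map_fn(document):
    """Developer-written map: emit (word, 1) for every word in a split."""
    for word in document.split():
        yield (word, 1)

def reduce_fn(word, counts):
    """Developer-written reduce: combine all values emitted for one key."""
    return (word, sum(counts))

def run_job(documents):
    """Sequential stand-in for the framework's distributed shuffle/sort."""
    intermediate = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            intermediate[key].append(value)
    return dict(reduce_fn(k, v) for k, v in intermediate.items())

print(run_job(["to be or not to be"]))
# {'to': 2, 'be': 2, 'or': 1, 'not': 1}
```

The developer only supplies `map_fn` and `reduce_fn`; everything inside `run_job` is the part the framework handles transparently.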
1 http://www.xen.org/products/cloudxen.html
2 http://www.linux-kvm.org/page/Main_Page
3 http://www.vmware.com/
4 http://www.microsoft.com/hyper-v-server/en/us/default.aspx
For the distribution and synchronization of the data required by the application, the MapReduce system also requires the use of a specially tailored distributed file system called the Google File System (GFS) [23]. Although the approach was introduced by Google, there are open-source implementations of the MapReduce system, like Hadoop [6] and TPlatform [55]. The former is popular open-source software used for running applications on large clusters built of commodity hardware. This software is used by large companies like Amazon, AOL, and IBM, as well as in different Web applications such as Facebook, Twitter, and Last.fm, among others. Basically, Hadoop is composed of two modules: a MapReduce environment for distributed computing, and a distributed file system called the Hadoop Distributed File System (HDFS). The latter is an academic initiative that provides a
development platform for Web mining applications. Similarly to Hadoop and Google's MapReduce, the TPlatform has a MapReduce module and a distributed file system known as the Tianwang File System (TFS) [55]. The use of MapReduce solutions is common groundwork technology in PaaS Clouds because it offers a versatile sandbox for developers. Unlike in IaaS Clouds, PaaS developers using a general-purpose language with MapReduce support do not need to be concerned with software configuration, software updates, and network configuration. All these tasks are the responsibility of the Cloud provider, which, in turn, benefits from the fact that such configurations will be standardized across the overall infrastructure.
2.5.4 Datacenters
Developers who host their applications on a Cloud wish to scale their leased resources, effectively increasing and decreasing their virtual infrastructure according to the demand of their clients. This is also the case for developers making use of their own private Clouds. Thus, independently of the class of Cloud under consideration, a robust and safe infrastructure is needed. Whereas virtualization and MapReduce provide the software solutions required to meet this demand, the physical infrastructure of Clouds is based on datacenters, which are infrastructures composed of IT components providing processing capacity, storage, and network services for one or more organizations [66]. Currently, the size of a datacenter (in number of components) can vary from tens to tens of thousands of components, depending on the datacenter's mission. In addition, there are several different IT components in datacenters, including switches and routers, load balancers, storage devices, dedicated storage networks, and the main component of any datacenter: servers [27].
Cloud Computing datacenters provide the power required to meet developers' demands in terms of processing, storage, and networking capacities. A large datacenter running a virtualization solution allows for a finer-grained division of the hardware's capacity through the statistical multiplexing of developers' applications.
3 Distributed Cloud Computing
“Quae non prosunt singula, multa iuvant.” Ovid
This chapter discusses the main concepts of Distributed Cloud (D-Cloud) Computing. It begins with a discussion of their definition (Section 3.1) in an attempt to distinguish D-Clouds from current Clouds and highlight their main characteristics. Next, the main research challenges regarding resource management on D-Clouds are described in Section 3.2.
3.1 Definitions
Current Cloud Computing setups involve huge investments in datacenters, the common underlying infrastructure of Clouds, as previously detailed in Section 2.5.4. This centralized infrastructure brings many well-known challenges, such as the need for resource over-provisioning and the high cost of heat dissipation and temperature control. In addition to concerns with infrastructure costs, one must observe that those datacenters are not necessarily close to their clients, i.e., the network between end-users and the Cloud is often a long best-effort IP connection, which means longer round-trip delays. Considering such limitations, industry and academia have presented indications that small datacenters can sometimes be more attractive, since they offer a cheaper, lower-power alternative while also reducing the infrastructure costs of centralized Clouds [12]. Moreover, Distributed Clouds, or just D-Clouds, as pointed out by Endo et al. in [20], can exploit the possibility of creating links and the potential of sharing resources across geographic boundaries to provide latency-based allocation of resources and ultimately fully utilize this distributed computing power. Thus, D-Clouds can reduce communication costs by simply provisioning data, servers, and links close to end-users. Figure 4 illustrates how D-Clouds can reduce the cost of communication through the spread of computational power and the use of latency-based allocation of applications.
In Figure 4(a) the client uses an application (App) running on the Cloud through the Internet, which is subject to the latency imposed by the best-effort network. In Figure 4(b), the client is accessing the same App,
but in this case, the latency imposed by the network is reduced because the App is allocated on a server in a small datacenter closer to the client than in the previous scenario.
Figure 4 Comparison between (a) a current Cloud and (b) a D-Cloud
Please note that Figure 4(b) intentionally does not specify the network connecting the infrastructure of the D-Cloud Provider. This network can be rented from different local ISPs (using the Internet for interconnection) or from an ISP with wide-area coverage. In addition, the ISP itself could be the D-Cloud Provider. This may be the case as the D-Cloud paradigm introduces an organic change in the current Internet, where ISPs can start to act as D-Cloud providers. Thus, ISPs could offer their communication and computational resources to developers interested in deploying their applications in the specific markets covered by those ISPs. This idea is illustrated by Figure 5, which shows a D-Cloud offered by a hypothetical Brazilian ISP. In this example, a developer deployed its application (App) on two servers in order to serve requests from northern and southern clients. If the number of northeastern clients increases, the developer can deploy its App (represented by the dotted box) on one server close to the northeast region in order to improve its service quality. It is important to note that the contribution of this Thesis falls in this last scenario, i.e., a scenario where the network and computational resources are all controlled by the same provider.
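The latency-based allocation illustrated by Figure 4 can be sketched as a toy placement routine that picks, among the sites able to host an application, the one with the lowest client-perceived latency. The site names, latency figures, and the simple CPU capacity check are illustrative assumptions, not part of any real D-Cloud interface:

```python
def place_app(latencies, free_cpus, demand):
    """Pick the lowest-latency site that still has enough free CPUs.

    latencies: site -> measured RTT to the client region, in ms
    free_cpus: site -> CPUs currently available on that site
    demand:    CPUs the application requires
    """
    feasible = [s for s in latencies if free_cpus[s] >= demand]
    if not feasible:
        raise RuntimeError("no site can host the application")
    return min(feasible, key=latencies.get)

# Illustrative measurements: the nearest site is full, so the
# allocator falls back to the next-closest feasible one.
latencies = {"north": 12, "south": 85, "central": 40}
free_cpus = {"north": 0, "south": 16, "central": 8}
print(place_app(latencies, free_cpus, demand=4))  # central
```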
Figure 5 ISP-based D-Cloud example
D-Clouds share similar characteristics with current Cloud Computing, including essential offerings such as scalability, on-demand usage, and pay-as-you-go business plans. Furthermore, the agents already described for current Clouds (please see Figure 1) are exactly the same in the context of D-Clouds. Finally, the different classifications discussed in Section 2.3 also apply. Despite the similarity, one may highlight two peculiarities of D-Clouds: support for geo-locality and Network as a Service (NaaS) provisioning ([2], [63], [17]). The geographical diversity of resources potentially improves cost and performance and benefits several different applications, particularly those that do not require massive internal communication among large server pools. In this category, as pointed out by [12], one can emphasize, firstly, applications currently deployed in a distributed manner, like VoIP (Voice over IP) and online games; secondly, applications that are good candidates for distributed implementation, like traffic filtering and e-mail distribution. In addition, there are other types of applications that use software or data with specific legal restrictions on jurisdiction, and applications whose public is restricted to one or more geographical areas, like the tracking of bus or subway routes, information about entertainment events, local news, etc. Support for geo-locality can be considered a further step in the deployment of Cloud Computing that leverages new opportunities for service providers. Thus, they will be free to choose where to allocate their resources in order to serve specific niches, constraints on the jurisdiction of software and data, or quality-of-service aspects of end-users.
NaaS (or Communication as a Service – CaaS, as cited in Section 2.3.2) allows service providers to manage network resources, instead of just computational ones. The authors in [2] define NaaS as a service offering transport network connectivity with a level of virtualization suitable to be
invoked by service providers. In this way, D-Clouds are able to manage their network resources according to their convenience, offering better response times for hosted applications. NaaS is close to the Network Virtualization (NV) research area [31], where the main problem consists in choosing how to allocate a virtual network over a physical one, meeting requirements while minimizing the usage of physical resources. Although NV and D-Clouds are subject to similar problems and scenarios, there is an essential difference between the two. While NV commonly models its resources at the infrastructure level (requests are always virtual networks mapped on graphs), a D-Cloud can be engineered to work with applications at a different abstraction level, exactly as occurs with the actual Cloud service types described in Section 2.3.2. In this way, one may see Network Virtualization simply as a particular instance of the D-Cloud. Other insights about NV are given in Section 3.3.2. Finally, it must be highlighted that the D-Cloud does not compete with the current Cloud Computing paradigm: the D-Cloud fits certain types of applications that have hard restrictions on geographical location, while existing Clouds continue to be attractive for applications demanding massive computational resources or simple applications with minor or no restrictions on geographical location. Thus, current Cloud Computing providers are the first potential candidates to take advantage of the D-Cloud paradigm, since current Clouds could hire D-Cloud resources on demand and move applications to certain geographical locations in order to meet specific developers' requirements. In addition to current Clouds, D-Clouds can also serve developers directly.
3.2 Research Challenges inherent to Resource Management
D-Clouds face challenges similar to the ones presented in the context of current Cloud Computing.
However, as stated in Chapter 1, the object of the present study is resource management in D-Clouds. Thus, this Section gives special emphasis to the challenges for resource management in D-Clouds, focusing on four categories as presented in [20]: a) resource modeling; b) resource offering and treatment; c) resource discovery and monitoring; and d) resource selection.
3.2.1 Resource Modeling
The first challenge is the development of a suitable resource model, which is essential to all operations in the D-Cloud, including management and control. Optimization algorithms are also strongly dependent on the resource modeling scheme used. In a D-Cloud environment, it is very important that resource modeling takes into account physical resources as well as virtual ones. On one hand, the amount of detail in each resource should be treated carefully, since if resources are described in great detail, there is a risk that the
resource optimization becomes hard and complex, since optimization problems that consider the many modeled aspects can become NP-hard. On the other hand, more detail gives more flexibility and leverages the usage of resources. There are some alternatives for resource modeling in Clouds that could be applied to D-Clouds. One can cite, for example, the OpenStack software project [53], which is focused on producing an open-standard Cloud operating system. It defines a RESTful HTTP service that supports the JSON and XML data formats and is used to request or exchange information about Cloud resources and action commands. OpenStack also offers ways to describe how to scale servers up or down (using pre-configured thresholds); it is extensible, allowing the seamless addition of new features; and it returns additional error messages in case of faults. Another resource modeling alternative is the Virtual Resources and Interconnection Networks Description Language (VXDL) [39], whose main goal is to describe the resources that compose a virtual infrastructure, focusing on virtual grid applications. VXDL is able to describe the components of an infrastructure, their topology, and an execution chronogram. These three aspects compose the main parts of a VXDL document. The computational resource specification part describes resource parameters. Furthermore, some peculiarities of virtual Grids are also present, such as the allocation of virtual machines on the same hardware and location dependence. The specification of the virtual infrastructure can consider specific developers' requirements such as network topology and delay, bandwidth, and the direction of links. The execution chronogram specifies the period of resource utilization, allowing efficient scheduling, which is a clear concern for Grids rather than for Cloud computing.
Another interesting point of VXDL is the possibility of describing resources individually or in groups, according to application needs. VXDL lacks support for the description of distinct services, since it is focused on grid applications only. The proposal presented in [32], called VRD hereafter, describes resources in a network virtualization scenario where infrastructure providers describe their virtual resources and services prior to offering them. It takes into consideration the integration between the properties of virtual resources and their relationships. An interesting point in the proposal is its use of functional and non-functional attributes. Functional attributes are related to characteristics, properties, and functions of components. Non-functional attributes specify criteria and constraints, such as performance, capacity, and QoS. Among the functional properties, the set of component types must be highlighted: PhysicalNode, VirtualNode, Link, and Interface. Such properties suggest a flexibility that can be used to represent routers or servers, in the case of nodes, and wired or wireless links, in the case of communication links and interfaces.
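The split between functional and non-functional attributes described for VRD can be illustrated with a minimal sketch. This is not VRD's actual syntax; the class name, field names, and attribute values below are all illustrative assumptions chosen to mirror the concepts in the text:

```python
from dataclasses import dataclass, field

@dataclass
class Component:
    """One described resource: a type plus two attribute dictionaries."""
    name: str
    ctype: str  # e.g. PhysicalNode, VirtualNode, Link, Interface
    functional: dict = field(default_factory=dict)      # what it is / does
    non_functional: dict = field(default_factory=dict)  # constraints, QoS

# A physical server and the link connecting it to a second node.
server = Component(
    name="srv-01",
    ctype="PhysicalNode",
    functional={"role": "server", "hypervisor": "xen"},
    non_functional={"cpu_cores": 16, "memory_gb": 64},
)
link = Component(
    name="lnk-01",
    ctype="Link",
    functional={"endpoints": ("srv-01", "srv-02")},
    non_functional={"bandwidth_mbps": 1000, "max_delay_ms": 5},
)
print(server.ctype, link.non_functional["max_delay_ms"])  # PhysicalNode 5
```

The point of the split is that an allocator can match requests against `non_functional` constraints without caring what kind of component carries them.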
Another proposal, known as the Manifest language, was developed by Chapman et al. [9]. They proposed new meta-models to represent service requirements, constraints, and elasticity rules for software deployment in a Cloud. The building block of this framework is the OVF (Open Virtualization Format) standard, which was extended by Chapman et al. to realize the vision of D-Clouds considering locality constraints. These two points are very interesting to our scenario. With regard to elasticity, it assumes a rule-based specification formed by three fields: a monitored condition related to the state of the service (such as workload), an operator (relational and logical ones are accepted), and an associated action to follow when the condition is met. The location constraints identify sites that should be favored or avoided when selecting a location for a service. Nevertheless, the Manifest language is focused on the software architecture. Hence, the language is not concerned with other aspects such as resource status or network resources. Cloud# is a language for modeling Clouds proposed by [16] to be used as a basis for Cloud providers and clients to establish trust. The model is used by developers to understand the behavior of Cloud services. The main goal of Cloud# is to describe how services are delivered, taking into consideration the interaction among physical and virtual resources. The main syntactic construct within Cloud# is the computation unit CUnit, which can model Cloud systems, virtual machines, or operating systems. A CUnit is represented as a tuple of six components modeling characteristics and behaviors. This language gives developers a better understanding of the Cloud organization and of how their applications are dealt with.
3.2.2 Resource Offering and Treatment
Once the D-Cloud resources are modeled, the next challenge is to describe how resources are offered to developers, which is important since the requirements supported by the provider are defined in this step. This challenge also defines the interfaces of the D-Cloud. It differs from resource modeling because the modeling is independent of the way resources are offered to developers. For example, the provider could model each resource individually, like independent items on a fine-grained scale such as GHz of CPU or GB of memory, but could offer them as a coupled collection of those items or a bundle, such as the VM templates cited in Section 2.5.2. Recall that, in addition to computational requirements (CPU and memory) and traditional network requirements, such as bandwidth and delay, new requirements are present in D-Cloud scenarios. The topology of the nodes is a first interesting requirement to be described. Developers should be able to set inter-node relationships and communication restrictions (e.g., downlink and uplink rates). This is illustrated in the scenario where servers – configured and managed by
developers – are distributed at different geographical localities while needing to communicate with each other in a specific way. Jurisdiction is related to where (geographically) applications and their data must be stored and handled. Due to restrictions such as copyright laws, D-Cloud users may want to limit the locations where their information will be stored (such as countries or continents). Another geographical constraint can be imposed by a maximum (or minimum) physical distance (or delay value) between nodes. Here, although developers do not know the actual topology of the nodes, they may simply establish some delay threshold value, for example. Developers should also be able to describe scalability rules, which specify how and when the application should grow and consume more resources from the D-Cloud. The authors in [21] and [9] define a way of doing this, allowing the Cloud user to specify actions that should be taken, like deploying new VMs, based on thresholds of metrics monitored by the D-Cloud itself. Additionally, resource offering is associated with interoperability. Current Cloud providers offer proprietary interfaces to access their services, which can lock users into their infrastructure, as applications cannot easily be migrated between providers [8]. It is hoped that Cloud providers will recognize this problem and work together to offer a standardized API. According to [61], Cloud interoperability faces two types of heterogeneity: vertical heterogeneity and horizontal heterogeneity. The first is concerned with interoperability within a single Cloud and may be addressed by a common middleware throughout the entire infrastructure. The second, horizontal heterogeneity, relates to Clouds from different providers. Therefore, the key challenge is dealing with these differences. In this case, a high level of granularity in the modeling may help to address the problem.
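The rule-based scalability specification mentioned above — a monitored condition, an operator, and an action, as in [21] and [9] — can be sketched as a tiny rule evaluator. The metric name `cpu_load` and the action `deploy_vm` are illustrative placeholders, not identifiers from either proposal:

```python
import operator

# Relational operators the rules may use (a subset, for illustration).
OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge}

def evaluate(rule, metrics):
    """Return the rule's action if its condition holds, else None.

    rule:    (metric_name, operator_symbol, threshold, action)
    metrics: current monitored values, metric_name -> number
    """
    metric, op, threshold, action = rule
    return action if OPS[op](metrics[metric], threshold) else None

rule = ("cpu_load", ">", 0.8, "deploy_vm")
print(evaluate(rule, {"cpu_load": 0.93}))  # deploy_vm
print(evaluate(rule, {"cpu_load": 0.41}))  # None
```

In a real system, the D-Cloud's monitoring layer would feed `metrics` and the mediation system would execute the returned action.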
An important effort in the search for horizontal standardization comes from the Open Cloud Manifesto, an initiative supported by hundreds of companies that aims to discuss a way to produce open standards for Cloud Computing. Its major doctrines are collaboration and coordination of standardization efforts, the adoption of open standards wherever appropriate, and the development of standards based on customer requirements. Participants of the Open Cloud Manifesto, through the Cloud Computing Use Case group, produced an interesting white paper [51] highlighting the requirements that need to be standardized in a cloud environment to ensure interoperability in the most typical scenarios of interaction in Cloud Computing.
5 http://www.opencloudmanifesto.org/
Another group involved with Cloud standards is the Open Grid Forum, which is developing the specification of the Open Cloud Computing Interface (OCCI). The goal of OCCI is to provide an easily extendable RESTful interface for Cloud management. Originally, OCCI was designed for IaaS setups, but its current specification [46] was extended to offer a generic scheme for the management of different Cloud services.
3.2.3 Resource Discovery and Monitoring
When requests reach a D-Cloud, the system should be aware of the current status of resources in order to determine whether there are available resources in the D-Cloud that could satisfy the requests. For this purpose, the right mechanisms for resource discovery and monitoring should be designed, allowing the system to be aware of the updated status of all its resources. Then, based on the current status and the requests' requirements, the system may select and allocate resources to serve these new requests. Resource monitoring should be continuous and help in taking allocation and reallocation decisions as part of the overall resource usage optimization. A careful analysis should be done to find a good and acceptable trade-off between the amount of control overhead and the frequency of resource information updates. Monitoring may be passive or active. It is considered passive when there are one or more entities collecting information. Such an entity may continuously send polling messages to nodes asking for information, or may do this on demand when necessary. On the other hand, monitoring is active when nodes are autonomous and may decide when to send state information asynchronously to some central entity. Naturally, D-Clouds may use both alternatives simultaneously to improve the monitoring solution. In this case, it is necessary to synchronize updates in repositories to maintain the consistency and validity of state information.
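The passive/active distinction above can be made concrete with a small sketch. Both styles update the same state repository: in the passive style a collector polls the nodes, while in the active style a node pushes its own state. The classes and field names are illustrative stand-ins for real monitored nodes and repositories:

```python
class Repository:
    """Central store of the last known state of every node."""
    def __init__(self):
        self.state = {}

    def poll_all(self, nodes):
        """Passive monitoring: the collector asks each node in turn."""
        for node in nodes:
            self.state[node.name] = node.status()

    def push(self, name, status):
        """Active monitoring: a node reports its own state asynchronously."""
        self.state[name] = status

class Node:
    def __init__(self, name, load):
        self.name, self.load = name, load

    def status(self):
        return {"load": self.load}

repo = Repository()
repo.poll_all([Node("n1", 0.2), Node("n2", 0.7)])  # passive round
repo.push("n3", {"load": 0.9})                     # active report
print(sorted(repo.state))  # ['n1', 'n2', 'n3']
```

When both styles run at once, as the text notes, updates to `repo.state` would need synchronization to keep the stored state consistent.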
Discovery and monitoring in a D-Cloud can be supported by the development of specific communication protocols. Such protocols act as a standard control plane in the Cloud, allowing interoperability between devices. It is expected that such protocols can control the different elements present in the D-Cloud, including servers, switches, routers, load balancers, and storage components. One possible method of coping with this challenge is to use smart communication nodes with an open programming interface for creating new services within the node. One example of this type of open node can be seen in the emerging OpenFlow-enabled switches [44].
6 http://www.gridforum.org/
7 http://occi-wg.org/about/specification/
3.2.4 Resource Selection and Optimization
With information regarding Cloud resource availability at hand, a set of appropriate candidates may then be highlighted. Next, the resource selection process finds the configuration that fulfills all requirements and optimizes the usage of the infrastructure. Selecting one solution from the set of available ones is not a trivial task due to the dynamicity, the high algorithmic complexity, and all the different requirements that must be contemplated by the provider. The problem of resource allocation is recurrent in computer science, and several computing areas have faced this type of problem since early operating systems. Particularly in the Cloud Computing field, due to the heterogeneous and time-variant environment of Clouds, resource allocation becomes a complex task, forcing the mediation system to respond with minimal turnaround time in order to maintain the developer's quality requirements. Also, balancing resource load and designing energy-efficient Clouds are major challenges in Cloud Computing. This last aspect is especially relevant as a result of the high demand for electricity to power and cool the servers hosted in datacenters [7]. In a Cloud, energy savings may be achieved through many different strategies. Server consolidation, for example, is a useful strategy for minimizing energy consumption while maintaining high usage of servers' resources. This strategy saves energy by migrating VMs onto fewer servers and putting idle servers into a standby state. Developing automated solutions for server consolidation can be a very complex task, since these solutions can be mapped to bin-packing problems known to be NP-hard [72]. VM migration and cloning provide a technology to balance load over servers within a Cloud, provide fault tolerance against unpredictable errors, or reallocate applications before a programmed service interruption.
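The bin-packing view of server consolidation mentioned above can be illustrated with the classic first-fit decreasing heuristic: pack VM loads onto as few servers as possible so that the remaining servers can be put into standby. This is only a sketch under simplifying assumptions — a single CPU dimension and illustrative integer loads — whereas a real consolidator must also weigh memory, network, and migration costs:

```python
def consolidate(vm_loads, server_capacity):
    """First-fit decreasing: return a list of servers (lists of VM loads)."""
    servers = []
    for load in sorted(vm_loads, reverse=True):  # largest VMs first
        for server in servers:                   # first server that fits
            if sum(server) + load <= server_capacity:
                server.append(load)
                break
        else:                                    # none fits: power on a new one
            servers.append([load])
    return servers

# Five VMs (loads in CPU cores) fit on two 10-core servers,
# so three of the five candidate hosts could enter standby.
packing = consolidate([5, 7, 2, 4, 1], server_capacity=10)
print(len(packing))  # 2
```

First-fit decreasing does not guarantee the optimum — the underlying problem is NP-hard, as noted above — but it is a common, fast approximation.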
Although this technology is present in the major industry hypervisors (such as VMware and Xen), some open problems remain to be investigated. These include cloning a VM into multiple replicas on different hosts [40] and performing VM migration across wide-area networks [14]. VM migration also introduces a network problem since, after migration, VMs require adaptation of the link-layer forwarding. Some of the strategies for new datacenter architectures explained in [67] offer solutions to this problem. Remodeling datacenter architectures is another research field that tries to overcome limitations on scalability, stiffness of address spaces, and node congestion in Clouds. The authors in [67] surveyed this theme, highlighted the problems in the network topologies of state-of-the-art datacenters, and discussed literature solutions for these problems. One of these solutions is the D-Cloud, as
also pointed out by [72], which offers an energy-efficient alternative for constructing a Cloud and a well-adapted solution for time-critical services and interactive applications. Considering specifically the challenges of resource allocation in D-Clouds, one can highlight correlated studies on Replica Placement and Network Virtualization. The former is applied in Content Distribution Networks (CDNs) and tries to decide where and when content servers should be positioned in order to improve the system's performance; this problem is closely related to the placement of applications in D-Clouds. The latter research field can be applied to D-Clouds considering that a virtual network is an application composed of servers, databases, and the network between them. Both research fields are described in the following sections.

Replica Placement

Replica Placement (RP) consists of a very broad class of problems whose main objective is to decide where, when, and by whom servers or their content should be positioned in order to improve CDN performance. The existing solutions to these problems are generally known as Replica Placement Algorithms (RPAs) [35]. The general RP problem is modeled as a physical topology (represented by a graph), a set of clients requesting services, and a number of servers to place on the graph (costs per server can be considered instead). Generally, there is a pre-established cost function to be optimized that reflects service-related aspects, such as the load of users' requests, the distance from the server, etc. As pointed out by [35], an RPA groups these aspects into two components: the problem definition, which consists of a cost function to be minimized under some constraints, and a heuristic, which is used to search for near-optimal solutions in a feasible time frame, since the defined problems are usually NP-complete. Several different variants of this general problem have already been studied.
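To make the structure of such cost functions concrete, one common variant minimizes the total demand-weighted distance between clients and their closest replica (the notation here is illustrative, not taken from [35]):

```latex
\min_{R \subseteq V,\; |R| \le K} \;\; \sum_{c \in C} d_c \cdot \min_{r \in R} \mathrm{dist}(c, r)
```

where V is the set of candidate nodes in the topology, C the set of clients, d_c the request load of client c, dist(c, r) the graph distance between client c and replica r, and K a bound on the number of servers to place.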
According to [57], they fall into two classes: facility location and minimum K-median. In the facility location problem, the main goal is to minimize the total cost of the graph through the placement of a number of servers, each with an associated cost. The minimum K-median problem, in turn, is similar but assumes the existence of a pre-defined number K of servers. More details on the modeling and comparison of different variants of the RP problem are provided by [35]. Different versions of this problem can be mapped onto resource allocation problems in D-Clouds. A very simple mapping can be defined considering an IaaS service where virtual machines are allocated on a geo-distributed infrastructure. In such a mapping, the topology corresponds to the physical infrastructure elements of the D-Cloud, the VMs requested by developers can be treated as servers, and the number of clients accessing each server would be its load.
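A minimal sketch of a greedy heuristic for the K-median problem, in the spirit of the greedy placement strategies from the literature: servers are placed one at a time, each round choosing the candidate node that most reduces the total demand-weighted distance from clients to their nearest server. The toy topology and all names are hypothetical; `dist[c][v]` is assumed precomputed, e.g. via shortest paths.

```python
# Greedy K-median placement (illustrative sketch).
def greedy_k_median(nodes, clients, demand, dist, k):
    placed = []
    for _ in range(k):
        best_node, best_cost = None, float("inf")
        for v in nodes:
            if v in placed:
                continue
            trial = placed + [v]
            # Total cost: each client served by its nearest placed server.
            cost = sum(demand[c] * min(dist[c][s] for s in trial)
                       for c in clients)
            if cost < best_cost:
                best_node, best_cost = v, cost
        placed.append(best_node)
    return placed

# Toy 3-node line topology a -- b -- c, with clients at a and c.
dist = {"a": {"a": 0, "b": 1, "c": 2},
        "c": {"a": 2, "b": 1, "c": 0}}
demand = {"a": 10, "c": 1}
print(greedy_k_median(["a", "b", "c"], ["a", "c"], demand, dist, k=1))
# ['a'] -- the single server goes next to the heaviest demand
```

In the D-Cloud mapping described above, `nodes` would be the physical hosts, each placed "server" a developer's VM, and `demand` the client load directed at each VM.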
Qiu et al. [57] proposed three different algorithms to solve the K-median problem in a CDN scenario: a Tree-based algorithm, a Greedy algorithm, and a Hot Spot algorithm. The Tree-based solution assumes that the underlying graph is a tree, which is divided into several small trees, placing one server in each small tree. The Greedy algorithm places servers one at a time, choosing at each step the placement that yields the best solution, until all servers are allocated. Finally, the Hot Spot solution attempts to place servers in the vicinity of the clients with the greatest demand. The results showed that the Greedy algorithm for replica placement could provide CDNs with performance close to optimal. These solutions can be mapped onto D-Clouds considering the simple scenario of VM allocation on a geo-distributed infrastructure with the restriction that each developer has a fixed number of servers to serve their clients. In such a case, the problem can be straightforwardly reduced to the K-median problem and the three proposed solutions could be applied. Basically, one could treat each developer as a different CDN and optimize each one independently, while still considering the limited capacity of the physical resources caused by the allocations of other developers. Presti et al. [56] treat an RP variant considering a trade-off between the load of requests per content and the number of replica additions and removals. Their solution considers that each server in the physical topology decides autonomously, based on thresholds, when to clone overloaded content or to remove underutilized replicas. Such decisions also encompass the minimization of the distance between clients and the respective accessed replica. A similar problem is investigated in [50], but considering constraints on the QoS perceived by the client. The authors propose an offline mathematical formulation and an online version that uses a greedy heuristic.
The results show that the heuristic achieves good results with little computational time. The main focus of these solutions is to provide scalability to the CDN according to the load caused by client requests. Thus, despite working only with the placement of content replicas, such solutions can also be applied to D-Clouds with some simple modifications. Considering replicas as allocated VMs, one can apply the threshold-based solution proposed in [56] to the simple scenario of VM scalability on a geo-distributed infrastructure.

Network Virtualization

The main problem of NV is the allocation of virtual networks over a physical network [10][3]. Analogously, the D-Cloud's main goal is to allocate application requests on physical resources according to some constraints while attempting to obtain a clever mapping between the virtual and physical resources. Therefore, problems on D-Clouds can be formulated as NV problems, especially in scenarios considering IaaS-level services.
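To make the NV mapping task concrete, the sketch below embeds a virtual network request onto a substrate: virtual nodes are mapped greedily (highest CPU demand first, onto the substrate node with the most residual CPU), and each virtual link is then mapped onto a shortest substrate path found by BFS. This is a deliberately simplified illustration — bandwidth checks on links are omitted, and all topologies and numbers are hypothetical.

```python
from collections import deque

def bfs_path(adj, src, dst):
    """Shortest path by hop count between two substrate nodes."""
    prev, queue = {src: None}, deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            path = []
            while u is not None:
                path.append(u)
                u = prev[u]
            return path[::-1]
        for w in adj[u]:
            if w not in prev:
                prev[w] = u
                queue.append(w)
    return None

def embed(v_nodes, v_links, s_cpu, s_adj):
    """Two-step embedding: node mapping, then link mapping."""
    mapping, cpu = {}, dict(s_cpu)
    for vn, dem in sorted(v_nodes.items(), key=lambda x: -x[1]):
        host = max((n for n in cpu
                    if cpu[n] >= dem and n not in mapping.values()),
                   key=lambda n: cpu[n], default=None)
        if host is None:
            return None                 # not enough capacity: reject request
        mapping[vn] = host
        cpu[host] -= dem
    paths = {(a, b): bfs_path(s_adj, mapping[a], mapping[b])
             for (a, b) in v_links}
    return mapping, paths

# Substrate: square A-B-C-D; request: two virtual nodes and one link.
s_adj = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["A", "C"]}
result = embed({"v1": 4, "v2": 2}, [("v1", "v2")],
               {"A": 8, "B": 3, "C": 6, "D": 1}, s_adj)
print(result)
```

Choosing the node with the most residual CPU is a "worst-fit" policy that spreads load, echoing the stress-balancing objectives surveyed in the NV literature.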
Several instances of the NV-based resource allocation problem can be reduced to NP-hard problems [48]. Even the versions where all the virtual network requests that will arrive in the system are known beforehand are NP-hard. The basic solution strategy is thus to restrict the problem space, making the problem easier to deal with, and to use simple heuristic-based algorithms to achieve fast results. Given a model based on graphs to represent both physical and virtual servers, switches, and links [10], an algorithm that allocates virtual networks should consider the constraints of the problem (CPU, memory, location, or bandwidth limits) and an objective function based on the algorithm's goals. In [31], the authors describe some possible objective functions to be optimized, such as maximizing the revenue of the service provider, minimizing link and node stress, etc. They also survey the heuristic techniques used when allocating virtual networks, dividing them into two types: static and dynamic. The dynamic type permits reallocation over time, adding more resources to already allocated virtual networks in order to obtain better performance. The static type means that once a virtual network is allocated, its setup will hardly ever change. To exemplify the type of problems studied in NV, consider the one studied by Chowdhury et al. [10]. The authors propose an objective function related to the cost and revenue of the provider, constrained by capacity and geo-location restrictions. They reduce the problem to a mixed integer program and then relax the integer constraints, deriving two different algorithms that approximate the solution. Furthermore, the paper also describes a Load Balancing algorithm, in which the original objective function is customized in order to avoid using nodes and links with low residual capacity.
This approach results in allocation on less loaded components and increases the revenue and acceptance ratio of the substrate network. Such problems and solutions can be applied to D-Clouds. One example could be the allocation of interactive servers with jurisdiction restrictions. In this scenario, the provider must allocate applications (which can be mapped onto virtual networks) whose nodes are linked and must be close to a certain geographical place according to a maximum tolerated delay. Thus, a provider could apply the proposed algorithms with minor adjustments. In the paper of Razzaq and Rathore [58], the virtual network embedding algorithm is divided into two steps: node mapping and link mapping. In the node mapping step, the nodes with the highest resource demands are allocated first. The link mapping step is based on an edge-disjoint k-shortest path algorithm, selecting the shortest path that can fulfill the virtual link bandwidth
requirement. In [42], a backtracking algorithm for the allocation of virtual networks onto substrate networks, based on the graph isomorphism problem, is proposed; the modeling considers multiple capacity constraints. Zhu and Ammar [74] proposed a set of four algorithms with the goal of balancing the load on the physical links and nodes, although their algorithms do not consider capacity aspects. Their algorithms perform the initial allocation and make adaptive optimizations to obtain better allocations. The key idea is to allocate virtual nodes considering both the load of the node and the load of the neighboring links of that node; thus one can say that they perform the allocation in a coordinated way. For virtual link allocation, the algorithm tries to select paths with few stressed links in the network. For more details about the algorithms see [74]. Considering the objectives of NV and RP problems, one may note that NV problems are a general form of the RP problem: RP problems try to allocate virtual servers, whereas NV considers the allocation of virtual servers and virtual links. Both categories of problems can be applied to D-Clouds. Particularly, RP and NV problems may be mapped onto two different classes of D-Clouds: less controllable and more controllable ones, respectively. RP problems are suitable for scenarios where the allocation of servers is more critical than that of links. In turn, NV problems are especially suited to situations where the provider is an ISP with full control over the whole infrastructure, including the communication infrastructure.

3.2.5 Summary

The D-Cloud domain brings several engineering and research challenges that were discussed in this section and whose main aspects are summarized in Table I. Such challenges are only starting to receive attention from the research community.
Particularly, the system, models, languages, and algorithms presented in the next chapters cope with some of these challenges.

Table I Summary of the main aspects discussed

Resource Modeling:
- Heterogeneity of resources
- Physical and virtual resources must be considered
- Complexity vs. flexibility

Resource Offering and Treatment:
- Describe the resources offered to developers
- Describe the supported requirements
- New requirements: topology, jurisdiction, scalability

Resource Discovery and Monitoring:
- Monitoring must be continuous
- Control overhead vs. updated information

Resource Selection and Optimization:
- Find resources to fulfill developers' requirements
- Optimize usage of the D-Cloud infrastructure
- Complex problems solved by approximation algorithms
4 The Nubilum System

"Expulsa nube, serenus fit saepe dies."
Popular Proverb

Section 2.4 introduced an Archetypal Cloud Mediation system focusing specifically on the resource management process, which ranges from the automatic negotiation of developers' requirements to the execution of their applications. This system was divided into three layers: negotiation, resource management, and resource control. Keeping in mind this simple archetypal mediation system, this chapter presents Nubilum, a resource management system that offers a self-managed solution for the challenges arising from the discovery, monitoring, control, and allocation of resources in D-Clouds. This system appeared previously in [25] under the name D-CRAS (Distributed Cloud Resource Allocation System). Section 4.1 presents the decisions taken to guide the overall design and implementation of Nubilum. Section 4.2 presents a conceptual view of Nubilum's architecture, highlighting its main modules. The functional components of Nubilum are detailed in Section 4.3. Section 4.4 presents the main processes performed by Nubilum. Section 4.5 closes this chapter by summarizing the contributions of the system and comparing it with correlated resource management systems.

4.1 Design Rationale

As stated in Section 1.2, the objective of this Thesis is to develop a self-manageable system for resource management on D-Clouds. Before developing the system and its corresponding architecture, some design decisions that will guide its development must be delineated and justified.

4.1.1 Programmability

The first aspect to be defined is the abstraction level at which Nubilum will act. Given that D-Cloud concerns can be mapped onto previous approaches from the Replica Placement and Network Virtualization research areas (see Section 3.2.4), a straightforward approach is to consider a D-Cloud working at the same abstraction level.
Therefore, knowing that proposals in both areas commonly operate at the IaaS level, i.e., providing virtualized infrastructures, Nubilum naturally also operates at the IaaS level.
Nubilum offers a Network Virtualization service: applications are treated as virtual networks and the provider's infrastructure is the physical network. In this way, the allocation problem is a virtual network assignment problem, and previous solutions from the NV area can be applied. Note that such an approach does not exclude previous Replica Placement solutions, because that area can be viewed as a particular case of Network Virtualization.

4.1.2 Self-optimization

As defined in Section 2.1, the Cloud must provide services in a timely manner, i.e., resources required by users must be configured as quickly as possible. To meet such a restriction, Nubilum must operate as much as possible without human intervention, which is the very definition of self-management from Autonomic Computing [69]. This operation involves the maintenance and adjustment of the D-Cloud resources in the face of changing application demands and innocent or malicious failures. Thus, Nubilum must provide solutions to cope with the four aspects advocated by Autonomic Computing: self-configuration, self-healing, self-optimization, and self-protection. Particularly, this Thesis focuses on investigating self-optimization – and, to some extent, self-configuration – on D-Clouds. The other two aspects are considered out of the scope of this proposal. According to [69], self-optimization of a system involves letting its elements "continually seek ways to improve their operation, identifying and seizing opportunities to make themselves more efficient in performance or cost". Such a definition fits very well the aim of Nubilum, which must ensure the automatic monitoring and control of resources to guarantee the optimal functioning of the Cloud while meeting developers' requirements.
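The kind of autonomic control loop implied by this definition is often structured as monitor-analyze-plan-execute. The sketch below is a toy illustration of that loop; the thresholds and the "migrate load to the least loaded node" plan are illustrative assumptions, not Nubilum's actual policy.

```python
# Toy monitor-analyze-plan-execute loop for self-optimization.
OVERLOAD, UNDERLOAD = 0.8, 0.2   # illustrative thresholds

def monitor(cloud):
    """Collect the current load of every node (here: copy the dict)."""
    return dict(cloud)

def analyze(loads):
    """Flag nodes whose load crosses a threshold."""
    return {n: ("hot" if l > OVERLOAD else "cold")
            for n, l in loads.items() if l > OVERLOAD or l < UNDERLOAD}

def plan(symptoms, loads):
    """Propose migrations from hot nodes to the least loaded node."""
    target = min(loads, key=loads.get)
    return [(n, target) for n, state in symptoms.items()
            if state == "hot" and n != target]

def execute(cloud, migrations, shift=0.3):
    """Apply the planned migrations by moving load between nodes."""
    for src, dst in migrations:
        cloud[src] -= shift
        cloud[dst] += shift

cloud = {"n1": 0.9, "n2": 0.5, "n3": 0.1}
loads = monitor(cloud)
execute(cloud, plan(analyze(loads), loads))
print(cloud)  # n1 is relieved; n3 absorbs part of the load
```

Running this loop continuously is what lets the system "seize opportunities" to improve its operation without human intervention, which is precisely the self-optimization property discussed above.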
4.1.3 Existing standards adoption

The Open Cloud Manifesto, an industry initiative that aims to discuss a way to produce open standards for Cloud Computing, states that Cloud providers "must use and adopt existing standards wherever appropriate" [51]. The Manifesto argues that the IT industry has already made several efforts and investments in standardization, so it seems more productive and economical to use such standards when appropriate. Following this same line, Nubilum adopts industry standards whenever possible. Such adoption also extends to open processes and software tools.

4.2 Nubilum's conceptual view

As shown in Figure 6, the conceptual view of Nubilum's architecture is composed of three planes: a Decision plane, a Management plane, and an Infrastructure plane. Starting from the bottom, the lower plane nestles all modules responsible for the appropriate virtualization of each resource in the