Cloud DBMS for large scale data analysis

The concept of ‘cloud computing’ is currently receiving
considerable attention, both in the research and commercial arenas.
Cloud computing is the delivery of computing as a service rather
than a product, whereby shared resources, software and information
are provided to computers and other devices as a utility
(like the electricity grid) over a network (typically the Internet).

In this paper we discuss the limitations and opportunities
of deploying data management applications on these emerging
cloud computing platforms.

We present a list of features that a DBMS designed for large scale
data analysis tasks running on an Amazon-style offering should contain.
We thus express the need for a new DBMS, designed specifically
for cloud computing environments.

Data management applications are potential
candidates for deployment in the cloud.
Cloud computing vendors typically maintain little
more than the hardware, and give customers a
set of virtual machines in which to install their
own software.
Cloud-based DBMS are extremely scalable. They
are able to handle volumes of data and
processes that would exhaust a typical DBMS.

• We thus foreground a research objective for
large scale data analysis in the cloud,
showing why currently available systems are
not ideally suited for cloud deployment, and
arguing that there is a need for a newly
designed DBMS, architected specifically for
cloud computing platforms.

• Cloud computing is a subscription-based service
where you can obtain networked storage space
and computer resources.
• There are different types of clouds that you can
subscribe to depending on your needs. As a home
user or small business owner, you will most likely
use public cloud services.
• Public Cloud - A public cloud can be accessed by any subscriber
with an internet connection and access to the cloud space.
• Private Cloud - A private cloud is established for a specific group
or organization and limits access to just that group.

• Community Cloud - A community cloud is shared among two or more
organizations that have similar cloud requirements.
• Hybrid Cloud - A hybrid cloud is essentially a combination of at
least two clouds, where the clouds included are a mixture of
public, private, or community.

• Compute power is elastic, but only if the workload is parallelizable.
• Agility
• Cost
• Reliability
• Data is stored at an untrusted host.
• Data is replicated, often across large geographic distances.
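
The first point above, that elastic compute only pays off for parallelizable workloads, can be illustrated with Amdahl's law. This is not taken from the slides; it is a standard bound brought in here as an illustration. The function name `amdahl_speedup` and the sample fractions are assumptions chosen for the sketch.

```python
def amdahl_speedup(parallel_fraction: float, nodes: int) -> float:
    """Upper bound on speedup when only part of a workload runs in parallel.

    parallel_fraction: share of the work (0.0 to 1.0) that can be spread
                       across nodes; the remainder runs serially.
    nodes:             number of cloud nodes rented for the job.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # Amdahl's law: the serial part is not shortened by adding nodes.
    return 1.0 / (serial_fraction + parallel_fraction / nodes)


if __name__ == "__main__":
    # Illustrative values only: compare how far extra nodes can help.
    for fraction in (0.5, 0.9, 0.99):
        for nodes in (10, 100, 1000):
            bound = amdahl_speedup(fraction, nodes)
            print(f"parallel={fraction:.2f} nodes={nodes:4d} speedup<={bound:.1f}")
```

With half of the work serial, the bound stays below 2 however many nodes are added, which is why renting more machines only helps workloads that partition well.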

Transactional data management
• Typically does not use a shared-nothing architecture.
• ACID properties are hard to maintain in transactional data management.
• Transactional database systems are generally small.
• There are enormous risks in storing transactional data on an untrusted host.

Analytical data management
• Shared-nothing architecture is a good match for analytical data management.
• ACID properties are not needed.
• Analytical data management systems are generally larger than transactional systems.
• Particularly sensitive data can often be left out of the analysis data.
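
To make the analytical side of this comparison concrete, here is a minimal sketch of a shared-nothing aggregation, assuming a toy in-memory data set. Rows are hash-partitioned across nodes, each node aggregates only its own partition, and the partial results are merged at the end. Sensitive fields are dropped before the data is partitioned. All names here (`SENSITIVE_FIELDS`, `total_by_group`, the `SALES` rows) are hypothetical, and the nodes are simulated by a plain loop rather than by separate machines.

```python
from collections import defaultdict
from zlib import crc32

# Hypothetical choice of fields that should never leave the trusted site.
SENSITIVE_FIELDS = {"customer_name"}


def strip_sensitive(row):
    """Return a copy of the row without the sensitive fields."""
    return {k: v for k, v in row.items() if k not in SENSITIVE_FIELDS}


def partition_rows(rows, num_nodes, partition_key):
    """Hash-partition rows so that each simulated node owns a disjoint subset."""
    partitions = [[] for _ in range(num_nodes)]
    for row in rows:
        node = crc32(str(row[partition_key]).encode("utf-8")) % num_nodes
        partitions[node].append(row)
    return partitions


def local_aggregate(partition, group_key, value_key):
    """Work done by one node, using only the rows it owns."""
    totals = defaultdict(float)
    for row in partition:
        totals[row[group_key]] += row[value_key]
    return dict(totals)


def merge_partials(partials):
    """Combine the per-node partial sums into the final result."""
    merged = defaultdict(float)
    for partial in partials:
        for group, subtotal in partial.items():
            merged[group] += subtotal
    return dict(merged)


def total_by_group(rows, num_nodes, partition_key, group_key, value_key):
    cleaned = [strip_sensitive(row) for row in rows]
    partitions = partition_rows(cleaned, num_nodes, partition_key)
    # In a real deployment each call below would run on a separate node.
    partials = [local_aggregate(p, group_key, value_key) for p in partitions]
    return merge_partials(partials)


SALES = [
    {"order_id": 1, "customer_name": "Alice", "region": "EU", "amount": 10.0},
    {"order_id": 2, "customer_name": "Bob", "region": "US", "amount": 5.0},
    {"order_id": 3, "customer_name": "Carol", "region": "EU", "amount": 2.5},
    {"order_id": 4, "customer_name": "Dave", "region": "US", "amount": 1.5},
]

if __name__ == "__main__":
    print(total_by_group(SALES, 3, "order_id", "region", "amount"))
```

Because no node reads another node's rows, the only coordination is the final merge. Read-mostly analytical queries fit this pattern well, whereas transactional updates that need ACID guarantees across partitions do not.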

At present there is an implicit need to build a new database
designed specifically for clouds, taking into account their
applications, needs and compatibility.
Such a database requires an architecture that can detect and prevent
the various threats, attacks and other security-related issues that
continuously erode the efficiency and productivity of the cloud,
so that it can serve as a future platform for cloud computing.
The next step is to propose a model for grid computing as well.

