CHAPTER 1
INTRODUCTION
1.1 CLOUD COMPUTING
Cloud computing can be seen as the logical evolution of outsourcing
IT services. It is a model of computing in which large groups of remote servers
are networked to allow centralized data storage and online access to computer
services or resources. Cloud computing relies on the sharing of resources to achieve
coherence and economies of scale, similar to a utility (like the electricity grid)
over a network. Because software need not be installed on the local PC, cloud
computing overcomes platform dependency issues and thus makes business
applications mobile and collaborative. Examples include applications such as
Google Apps and Microsoft Online, infrastructures such as Amazon's EC2,
Eucalyptus, and Nimbus, and storage services and developer platforms such as
Amazon's S3 and Windows Azure. Much of the data stored in
clouds is highly sensitive, for example, medical records and social network data.
By analogy with this usage, the word cloud was adopted as a metaphor for
the Internet: a standardized cloud-like shape was used to denote a network on
telephony schematics and later to depict the Internet in computer network
diagrams. With this simplification, the implication is that the specifics of how the
end points of a network are connected are not relevant for the purposes of
understanding the diagram. The cloud symbol was used to represent the Internet
as early as 1994, in diagrams in which servers were shown connected to, but
external to, the cloud.
The term "moving to cloud" also refers to an organization moving
away from a traditional model (buy the dedicated hardware and depreciate it over
a period of time) to the cloud model (use a shared cloud infrastructure and pay as
one uses it). Cloud computing, or in simpler shorthand just "the cloud", also
focuses on maximizing the effectiveness of the shared resources. Cloud resources
are usually not only shared by multiple users but are also dynamically
reallocated on demand, allowing the same physical resources to serve many
users efficiently.
1.1.1 Architecture
Cloud architecture, the systems architecture of the software systems
involved in the delivery of cloud computing, typically involves multiple cloud
components communicating with each other over a loose coupling mechanism
such as a messaging queue. Elastic provisioning implies intelligence in the use of
tight or loose coupling as applied to mechanisms such as these and others.
Fig 1.1 Cloud computing sample architecture
Cloud Engineering
Cloud engineering is the application of engineering disciplines to
cloud computing. It brings a systematic approach to the high-level concerns of
commercialization, standardization, and governance in conceiving, developing,
operating and maintaining cloud computing systems. It is a multidisciplinary
method encompassing contributions from diverse areas such as systems,
software, web, performance, information, security, platform, risk, and quality
engineering.
Cloud Computing Paradigm
The increasing demand for flexibility in obtaining and releasing
computing resources in a cost-effective manner has resulted in a wide adoption
of the Cloud computing paradigm. The availability of an extensible pool of
resources for the user provides an effective alternative to deploy applications
with high scalability and processing requirements. In general, a Cloud computing
infrastructure is built by interconnecting large-scale virtualized data centers, and
computing resources are delivered to the user over the Internet in the form of an
on-demand service by using virtual machines. While the benefits are immense,
this computing paradigm has significantly changed the dimension of risks on
user’s applications, specifically because the failures that manifest in the
datacenters are outside the scope of the user’s organization. Nevertheless, these
failures can severely affect the applications deployed in virtual
machines and, as a result, there is an increasing need to address users’ reliability
and availability concerns.
1.1.2 Service Models in Cloud
Cloud computing providers offer their services according to several
fundamental models: service delivery in cloud computing comprises three
different service models, namely Infrastructure-as-a-Service (IaaS), Platform-as-
a-Service (PaaS), and Software-as-a-Service (SaaS). The three service models, or
layers, are complemented by an end-user layer that encapsulates the end-user
perspective on cloud services. The vast majority of SaaS solutions are based on a
multi-tenant architecture. With this model, a single version of the application,
with a single configuration (hardware, network, operating system), is used for all
customers ("tenants").
Fig 1.2 Cloud Service models outline
Infrastructure as a Service (IaaS)
In the most basic cloud-service model, and according to the IETF
(Internet Engineering Task Force), providers of IaaS offer computers – physical
or (more often) virtual machines – and other resources. IaaS clouds often offer
additional resources such as a virtual-machine disk image library, raw block
storage, file or object storage, firewalls, load balancers, IP addresses, virtual
local area networks (VLANs), and software bundles. IaaS-cloud providers supply
these resources on-demand from their large pools installed in data centers. For
wide-area connectivity, customers can use either the Internet or carrier clouds
(dedicated virtual private networks).
Physical resources are abstracted by virtualization, which means they
can then be shared by several operating systems and end user environments on
the virtual resources – ideally, without any mutual interference. These virtualized
resources usually comprise CPU and RAM, data storage resources (elastic block
store and databases), and network resources. To deploy their applications, cloud
users install operating-system images and their application software on the cloud
infrastructure. In this model, the cloud user patches and maintains the operating
systems and the application software. Cloud providers typically bill IaaS services
on a utility computing basis: cost reflects the amount of resources allocated and
consumed.
Fig 1.3 Infrastructure as a Service (IaaS) architecture
Software as a Service (SaaS)
In the business model using software as a service (SaaS), users are
provided access to application software and databases. Cloud providers manage
the infrastructure and platforms that run the applications. SaaS is sometimes
referred to as "on-demand software" and is usually priced on a pay-per-use basis.
SaaS providers generally price applications using a subscription fee. In the SaaS
model, cloud providers install and operate application software in the cloud and
cloud users access the software from cloud clients. Cloud users do not manage
the cloud infrastructure and platform where the application runs. This eliminates
the need to install and run the application on the cloud user's own computers,
which simplifies maintenance and support. Cloud applications are different from
other applications in their scalability—which can be achieved by cloning tasks
onto multiple virtual machines at run-time to meet changing work demand. Load
balancers distribute the work over the set of virtual machines.
The services on the application layer can be seen as an extension of
the ASP (application service provider) model, in which an application is run,
maintained, and supported by a service vendor. The main differences between the
services on the application layer and the classic ASP model are the encapsulation
of the application as a service, the dynamic procurement, and billing by units of
consumption (pay as you go). However, both models pursue the goal of focusing
on core competencies by outsourcing applications.
Fig 1.4 Software as a Service (SaaS) architecture
Platform as a Service (PaaS)
In the PaaS model, cloud providers deliver a computing platform,
typically including operating system, programming language execution
environment, database, and web server. Application developers can develop and
run their software solutions on a cloud platform without the cost and complexity
of buying and managing the underlying hardware and software layers. With some
PaaS offers like Microsoft Azure and Google App Engine, the underlying
computer and storage resources scale automatically to match application demand
so that the cloud user does not have to allocate resources manually. Architectures
have also been proposed to facilitate real-time applications in such cloud
environments. Platform as a service (PaaS) thus provides a computing platform
and a solution stack as a service, complementing the software as a service (SaaS)
and infrastructure as a service (IaaS) models of cloud computing.
Fig 1.5 Platform as a Service (PaaS) architecture
1.2 CLOUD SERVICE PROVIDERS
Organizations or enterprises provide various services to cloud users.
The confidentiality and integrity of cloud data should be maintained by the CSP.
The provider should ensure that users' data and applications are secure on the
cloud: the CSP must neither leak users' information nor modify or access their
content, and it must guard against attackers who may eavesdrop on network
communication.
Amazon:
Amazon was one of the first companies to offer cloud services to the
public, and they are very sophisticated. Amazon offers a number of cloud
services, including:
Elastic Compute Cloud (EC2): offers virtual machines and extra
CPU cycles for your organization.
Simple Storage Service (S3): allows you to store items up to 5 GB in
size in Amazon's virtual storage service.
Simple Queue Service (SQS): allows your machines to talk to each
other using this message-passing API.
Google:
In stark contrast to Amazon’s offerings is Google’s App Engine. On
Amazon you get root privileges, but on App Engine, you can’t write a file in your
own directory. Google removed the file-write feature from Python as a security
measure, and to store data you must use Google's database. Google offers online
documents and spreadsheets, and encourages developers to build features for
those and other online software using its Google App Engine.
According to the CSA (Cloud Security Alliance), the top threats that
undermine the security of data and code in cloud environments are:
Abuse and Nefarious Use of Cloud Computing
Insecure Application Programming Interfaces
Malicious Insiders
Shared Technology Vulnerabilities
Data Loss/Leakage
Account, Service & Traffic Hijacking
Auditability Schemes
Auditing reduces risk for customers and gives providers incentives
to improve their services. Considering the available schemes, auditability falls
into two categories: private auditability and public auditability. Even though
schemes with private auditability can attain higher scheme efficiency, public
auditability permits anyone, not just the client (data owner), to challenge the
cloud server on the correctness of data storage while keeping no private
information. Clients are thus able to delegate the evaluation of the service
performance to an independent third party auditor (TPA) without devoting
their own computation resources. We can therefore denote the types of
auditing protocols as Data Owner Auditing and Third Party Auditing.
According to the method of data storage auditing, schemes can be categorized
into three types: Message Authentication Code (MAC)-based methods, AES-based
homomorphic methods and Boneh-Lynn-Shacham signature (BLS [12])-based
homomorphic methods. The challenging issues of data storage auditing include
dynamic auditing, collaborative auditing and batch auditing. Three performance
criteria must be met when designing auditing protocols: low storage overhead,
low communication cost and low computational complexity.
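To make the first category concrete, the following is a minimal C# sketch of a MAC-based check, assuming HMAC-SHA256 as the MAC; the class and method names are hypothetical, and real schemes add block indexing and key management on top of this idea.

using System.Security.Cryptography;

// Minimal sketch of a MAC-based storage audit (hypothetical names).
// The owner keeps a secret key and one tag per block; to audit, it
// retrieves a block from the server and recomputes the tag.
class MacAuditor
{
    private readonly byte[] key; // secret key held by the data owner

    public MacAuditor(byte[] key) { this.key = key; }

    // Computed once per block before the data is outsourced.
    public byte[] Tag(byte[] block)
    {
        using (var hmac = new HMACSHA256(key))
            return hmac.ComputeHash(block);
    }

    // Verifies a block returned by the cloud server against the stored tag.
    public bool Verify(byte[] block, byte[] storedTag)
    {
        byte[] recomputed = Tag(block);
        if (recomputed.Length != storedTag.Length) return false;
        int diff = 0;
        for (int i = 0; i < recomputed.Length; i++)
            diff |= recomputed[i] ^ storedTag[i]; // constant-time comparison
        return diff == 0;
    }
}

Note that a plain MAC-based check requires retrieving each challenged block, which is exactly the communication cost that the homomorphic methods above are designed to avoid.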
Third Party Auditing (TPA)
Cloud consumers save data on cloud servers, so security as well as
data storage correctness is a primary concern. For data owners with huge
amounts of outsourced data, the task of auditing data correctness in a cloud
environment can be difficult and expensive.
Third party auditing lets users safely delegate integrity checking
tasks to third party auditors (TPA); such a scheme can almost guarantee the
simultaneous localization of data errors (i.e., the identification of misbehaving
servers).
A novel and homogeneous structure is introduced to provide security
to different cloud types. To achieve data storage security, the BLS (Boneh-Lynn-
Shacham) [12] algorithm is used to sign the data blocks before outsourcing
data to the cloud. The Reed-Solomon technique is used for error correction and
to ensure data storage correctness.
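The per-block signing flow can be sketched as follows in C#. Since the base class library provides no BLS implementation, ECDSA is used here purely as a stand-in signer, and the block size and names are illustrative assumptions:

using System;
using System.Collections.Generic;
using System.Security.Cryptography;

// Sketch of signing data blocks before outsourcing (ECDSA stands in
// for BLS, which the .NET base class library does not provide).
class BlockSigner
{
    private readonly ECDsa key = ECDsa.Create();
    private const int BlockSize = 4096; // illustrative block size

    // Split a file into fixed-size blocks and sign each block.
    public List<(byte[] Block, byte[] Signature)> SignFile(byte[] file)
    {
        var result = new List<(byte[], byte[])>();
        for (int off = 0; off < file.Length; off += BlockSize)
        {
            int len = Math.Min(BlockSize, file.Length - off);
            var block = new byte[len];
            Buffer.BlockCopy(file, off, block, 0, len);
            result.Add((block, key.SignData(block, HashAlgorithmName.SHA256)));
        }
        return result; // blocks and signatures are then uploaded to the cloud
    }

    // Later, any returned block can be checked against its signature.
    public bool VerifyBlock(byte[] block, byte[] signature) =>
        key.VerifyData(block, signature, HashAlgorithmName.SHA256);
}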
1.3 ORGANIZATION OF THE PROJECT
This section gives a brief about the sections and their contents.
Chapter 2 discusses the Literature Survey.
Chapter 3 deals with the existing system, drawbacks of the existing
system and it briefs about the proposed system.
Chapter 4 and Chapter 5 give the various software and hardware
requirement details needed to implement the project.
Chapter 6 describes the architectural design of the proposed system.
Chapter 7 discusses the various modules and the input and output
details of the system in detail.
Chapter 8 includes the results of the implemented system
Chapter 9 concludes with a summary of the future enhancement of
the work. Following the future enhancement is the list of papers referred for the
project.
In this chapter we presented an overview of the project, explained the
various service providers, the auditing and service model traits used, and
described the organization of the thesis.
CHAPTER 2
LITERATURE REVIEW
The literature survey is an important step in the software development
process. The use of privacy-preserving public auditing mechanisms and
cryptographic schemes can increase the security level of the data stored on cloud
servers.
The following is a survey of some existing techniques for cloud
security:
2.1 ENABLING PUBLIC VERIFIABILITY AND DATA DYNAMICS
FOR STORAGE SECURITY IN CLOUD COMPUTING
Authors: Q. Wang, C. Wang
Cloud Computing has been envisioned as the next generation
architecture of IT Enterprise. It moves the application software and databases to
the centralized large data centers, where the management of the data and services
may not be fully trustworthy. This unique paradigm brings about many new
security challenges, which have not been well understood. This work studies the
problem of ensuring the integrity of data storage in Cloud Computing. In
particular, we consider the task of allowing a third party auditor (TPA), on behalf
of the cloud client, to verify the integrity of the dynamic data stored in the cloud.
The introduction of TPA eliminates the involvement of the client through the
auditing of whether his data stored in the cloud is indeed intact, which can be
important in achieving economies of scale for Cloud Computing. The support for
data dynamics via the most general forms of data operation, such as block
modification, insertion and deletion, is also a significant step toward practicality,
since services in Cloud Computing are not limited to archive or backup data
only. While prior works on ensuring remote data integrity often lack the support
of either public verifiability or dynamic data operations, this paper achieves both.
We first identify the difficulties and potential security problems of
direct extensions with fully dynamic data updates from prior works and then
show how to construct an elegant verification scheme for the seamless
integration of these two salient features in our protocol design. In particular, to
achieve efficient data dynamics, we can improve the existing proof of storage
models by manipulating the classic Merkle Hash Tree (MHT) construction for
block tag authentication. To support efficient handling of multiple auditing tasks,
we can further explore the technique of bilinear aggregate signature [5] to extend
our main result into a multi-user setting, where TPA can perform multiple
auditing tasks simultaneously. Extensive security and performance analysis show
that the proposed schemes are highly efficient and provably secure.
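To illustrate the MHT construction mentioned above, the following is a minimal C# sketch of computing a Merkle root over a set of blocks, assuming SHA-256 as the hash and the common convention of duplicating a trailing odd node; the actual scheme additionally authenticates block tags and maintains update paths:

using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;

// Minimal Merkle Hash Tree: the root authenticates all block tags, so a
// verifier only needs the root plus a logarithmic-size sibling path.
static class MerkleTree
{
    static byte[] Hash(byte[] data)
    {
        using (var sha = SHA256.Create()) return sha.ComputeHash(data);
    }

    public static byte[] Root(IList<byte[]> blocks)
    {
        // Leaves are hashes of the blocks (or of their tags).
        var level = blocks.Select(Hash).ToList();
        while (level.Count > 1)
        {
            var next = new List<byte[]>();
            for (int i = 0; i < level.Count; i += 2)
            {
                byte[] left = level[i];
                // Duplicate the last node when the level has odd size.
                byte[] right = i + 1 < level.Count ? level[i + 1] : level[i];
                next.Add(Hash(left.Concat(right).ToArray()));
            }
            level = next;
        }
        return level[0];
    }
}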
2.2 PRIVACY-PRESERVING PUBLIC AUDITING FOR DATA
STORAGE SECURITY IN CLOUD STORAGE
Authors: C. Wang, Q. Wang
Using Cloud Storage, users can remotely store their data and enjoy the
on-demand high quality applications and services from a shared pool of
configurable computing resources, without the burden of local data storage and
maintenance. However, the fact that users no longer have physical possession of
the outsourced data makes the data integrity protection in Cloud Computing a
formidable task, especially for users with constrained computing resources. In
this paper, the authors propose a secure cloud storage system supporting privacy-
preserving public auditing. They further extend this result to enable the TPA to
perform audits for multiple users simultaneously and efficiently.
2.3 PORS: PROOFS OF RETRIEVABILITY FOR LARGE FILES
Authors: A. Juels, B. S. Kaliski [10]
We define and explore proofs of retrievability (POR). A POR scheme
enables an archive or backup service (prover) to produce a concise proof that a
user (verifier) can retrieve a target file F, that is, that the archive retains and
reliably transmits file data sufficient for the user to recover F in its entirety. A
POR may be viewed as a kind of cryptographic proof of knowledge (POK), but
one specially designed to handle a large file (or bit string) F. We explore POR
protocols here in which the communication costs, number of memory accesses
for the prover, and storage requirements of the user (verifier) are small
parameters essentially independent of the length of F. In addition to proposing
new, practical POR constructions, we explore implementation considerations and
optimizations that bear on previously explored, related schemes. In a POR,
unlike a POK, neither the prover nor the verifier need actually have knowledge
of F. PORs give rise to a new and unusual security definition whose formulation
is another contribution of our work.
We view PORs as an important tool for semi-trusted online archives.
Existing cryptographic techniques help users ensure the privacy and integrity of
files they retrieve. It is also natural, however, for users to want to verify that
archives do not delete or modify files prior to retrieval. The goal of a POR is to
accomplish these checks without users having to download the files themselves.
A POR can also provide quality-of-service guarantees, i.e., show that a file is
retrievable within a certain time bound.
2.4 SCALABLE AND EFFICIENT PROVABLE DATA POSSESSION
Authors: G. Ateniese, R. D. Pietro
In this paper the authors introduce a model for provable data possession
(PDP) [2] that allows a client that has stored data at an untrusted server to verify
that the server possesses the original data without retrieving it. The model
generates probabilistic proofs of possession by sampling random sets of blocks
from the server, which drastically reduces I/O costs. The client maintains a
constant amount of metadata to verify the proof. The challenge/response protocol
transmits a small, constant amount of data, which minimizes network
communication. Thus, the PDP [2], [11] model for remote data checking
supports large data sets in widely distributed storage systems. To support the
dynamic auditing, Ateniese et al. [9] developed a dynamic provable data
possession protocol based on cryptographic hash function and symmetric key
encryption. Their idea is to precompute a certain number of metadata during the
setup period, so that the number of updates and challenges is limited and fixed
beforehand.
The authors construct a highly efficient and provably secure PDP
technique [2] based entirely on symmetric key cryptography, while not requiring
any bulk encryption. Also, in contrast with its predecessors, this PDP technique
allows outsourcing of dynamic data, i.e., it efficiently supports operations, such
as block modification, deletion and append.
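The sampling step at the heart of PDP can be sketched as follows in C# (names and counts are illustrative): the verifier challenges a small random subset of block indices, so proof cost stays essentially independent of file size. A commonly cited figure from the PDP literature is that challenging about 460 blocks detects 1% corruption with roughly 99% probability.

using System;
using System.Collections.Generic;
using System.Security.Cryptography;

// Sketch of a PDP-style challenge: sample c random block indices out of n.
// Checking a small random sample detects large-scale corruption with high
// probability while keeping I/O and communication costs low.
static class PdpChallenge
{
    public static int[] SampleIndices(int totalBlocks, int challengeSize)
    {
        using (var rng = RandomNumberGenerator.Create())
        {
            var chosen = new HashSet<int>();
            var buf = new byte[4];
            while (chosen.Count < challengeSize)
            {
                rng.GetBytes(buf);
                chosen.Add((int)(BitConverter.ToUInt32(buf, 0) % (uint)totalBlocks));
            }
            var result = new int[chosen.Count];
            chosen.CopyTo(result);
            return result;
        }
    }
}

// Example: int[] challenge = PdpChallenge.SampleIndices(1000000, 460);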
2.5 TOWARDS SECURE AND DEPENDABLE STORAGE
SERVICES IN CLOUD COMPUTING
Authors: C. Wang, Q. Wang
In this paper, the authors propose an effective and flexible distributed
storage verification scheme with explicit dynamic data support to ensure the
correctness and availability of users’ data in the cloud. They rely on erasure
correcting code in the file distribution preparation to provide redundancies and
guarantee the data dependability against Byzantine servers, where a storage
server may fail in arbitrary ways. This construction drastically reduces the
communication and storage overhead as compared to the traditional replication
based file distribution techniques. By utilizing the homomorphic token with
distributed verification of erasure coded data, their scheme achieves the storage
correctness insurance as well as data error localization: whenever data corruption
has been detected during the storage correctness verification, this scheme can
almost guarantee the simultaneous localization of data errors, i.e., the
identification of the misbehaving server(s).
In order to strike a good balance between error resilience and data
dynamics, their work further explores the algebraic properties of the token
computation and erasure-coded data, and demonstrates how to efficiently support
dynamic operations on data blocks while maintaining the same level of storage
correctness assurance. In order to save the time, resources, and even the related
online burden of users, the proposed main scheme is extended to support third
party auditing, where users can safely delegate the integrity checking tasks to
third party auditors and be worry-free in using the cloud storage services.
CHAPTER 3
SYSTEM STUDY AND ANALYSIS
3.1 INTRODUCTION
Cloud service providers manage an enterprise-class infrastructure that
offers a scalable, secure and reliable environment for users, at a much lower
marginal cost due to the sharing nature of resources. It is routine for users to use
cloud storage services to share data with others in a team, as data sharing
becomes a standard feature in most cloud storage offerings, including Dropbox
and Google Docs.
The integrity of data in cloud storage, however, is subject to
skepticism and scrutiny, as data stored in an untrusted cloud can easily be lost or
corrupted, due to hardware failures and human errors [1]. To protect the integrity
of cloud data, it is best to perform public auditing by introducing a third party
auditor (TPA), who offers its auditing service with more powerful computation
and communication abilities than regular users.
The first provable data possession (PDP) mechanism [2] to perform
public auditing is designed to check the correctness of data stored in an untrusted
server, without retrieving the entire data. Moving a step forward, Wang et al. [3]
(referred to as WWRL) is designed to construct a public auditing mechanism for
cloud data, so that during public auditing, the content of private data belonging to
a personal user is not disclosed to the third party auditor.
We believe that sharing data among multiple users is perhaps one of
the most engaging features that motivate cloud storage. A unique problem
introduced during the process of public auditing for shared data in the cloud is
how to preserve identity privacy from the TPA, because the identities of signers
on shared data may indicate that a particular user in the group or a special block
in shared data is a more valuable target than others. For example, Alice and Bob
work together as a group and share a file in the cloud. The shared file is divided
into a number of small blocks, which are independently signed by users. Once a
block in this shared file is modified by a user, this user needs to sign the new
block using her private key. The TPA needs to know the identity of
the signer on each block in this shared file, so that it is able to audit the integrity
of the whole file based on requests from Alice or Bob.
We propose Oruta, a new privacy preserving public auditing
mechanism for shared data in an untrusted cloud. In Oruta, we utilize ring
signatures [4] to construct homomorphic authenticators, so that the third party
auditor is able to verify the integrity of shared data for a group of users without
retrieving the entire data — while the identity of the signer on each block in
shared data is kept private from the TPA. In addition, we further extend our
mechanism to support batch auditing, which can audit multiple shared data
simultaneously in a single auditing task. Meanwhile, Oruta continues to use
random masking to support data privacy during public auditing, and leverage
index hash tables [7] to support fully dynamic operations on shared data. A
dynamic operation indicates an insert, delete or update operation on a single
block in shared data. A high-level comparison between Oruta and existing
mechanisms in the literature is shown. To the best of our knowledge, this represents the
first attempt towards designing an effective privacy preserving public auditing
mechanism for shared data in the cloud.
3.2 EXISTING SYSTEM
Many mechanisms have been proposed to allow not only a data owner
itself but also a public verifier to efficiently perform integrity checking without
downloading the entire data from the cloud, which is referred to as public
auditing. In these mechanisms, data is divided into many small blocks, where
each block is independently signed by the owner; and a random combination of
all the blocks instead of the whole data is retrieved during integrity checking. A
public verifier could be a data user (e.g., researcher) who would like to utilize the
owner’s data via the cloud or a third-party auditor (TPA) who can provide expert
integrity checking services.
Moving a step forward, Wang et al. designed an advanced auditing
mechanism so that, during public auditing on cloud data, the content of private
data belonging to a personal user is not disclosed to any public verifiers.
Unfortunately, the current public auditing solutions mentioned above only focus on
personal data in the cloud. We believe that sharing data among multiple users is
perhaps one of the most engaging features that motivate cloud storage. Therefore,
it is also necessary to ensure the integrity of shared data in the cloud.
Existing public auditing mechanisms [15], [16] can actually be
extended to verify shared data integrity. However, a significant new privacy issue
introduced in the case of shared data with the use of existing mechanisms is the
leakage of identity privacy to public verifiers.
3.3 DISADVANTAGES OF EXISTING SYSTEM
Failing to preserve identity privacy on shared data during public auditing
will reveal significant confidential information to public verifiers.
Protecting this confidential information is essential and critical to
preserving identity privacy from public verifiers during public auditing.
3.4 PROPOSED SYSTEM
To solve the above privacy issue on shared data, we propose Oruta, a
novel privacy-preserving public auditing mechanism. More specifically, we
utilize ring signatures [4] to construct homomorphic authenticators in Oruta, so
that a public verifier is able to verify the integrity of shared data without
retrieving the entire data while the identity of the signer on each block in shared
data is kept private from the public verifier.
In addition, we further extend our mechanism to support batch
auditing, which can perform multiple auditing tasks simultaneously and improve
the efficiency of verification for multiple auditing tasks. Meanwhile, Oruta is
compatible with random masking, which has been utilized in WWRL and can
preserve data privacy from public verifiers. Moreover, we also leverage index
hash tables [7] from a previous public auditing solution to support dynamic data.
A high-level comparison among Oruta and existing mechanisms is presented.
3.5 PROBLEM STATEMENT
3.5.1 System Model
This application involves three parties: the cloud server, the third
party auditor (TPA) and users. There are two types of users in a group: the
original user and a number of group users.
The original user and group users are both members of the group.
Group members are allowed to access and modify shared data created by the
original user based on access control policies [8]. Shared data and its verification
information (i.e. signatures) are both stored in the cloud server. The third party
auditor is able to verify the integrity of shared data in the cloud server on behalf
of group members. Our system model includes the cloud server, the third party
auditor and users. The user is responsible for deciding who is able to share her
data before outsourcing data to the cloud. When a user wishes to check the
integrity of shared data, she first sends an auditing request to the TPA. After
receiving the auditing request, the TPA generates an auditing message to the
cloud server, and retrieves an auditing proof of shared data [14] from the cloud
server. Then the TPA verifies the correctness [13] of the auditing proof. Finally,
the TPA sends an auditing report to the user based on the result of the
verification.
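The four-step interaction just described can be pictured in a hedged C# sketch. Every type and method name below is a hypothetical placeholder introduced for illustration, not part of the actual implementation:

using System;

// Illustrative message flow of the auditing protocol (all names hypothetical).
class AuditRequest { public string FileId; }
class AuditMessage { public string FileId; public int[] ChallengedBlocks; }
class AuditProof   { public byte[] Aggregate; }
class AuditReport  { public bool DataIntact; }

interface ICloudServer { AuditProof GenerateProof(AuditMessage challenge); }

class ThirdPartyAuditor
{
    private readonly Random rng = new Random();

    // Step 1: the user has sent an auditing request for a shared file.
    public AuditReport Audit(AuditRequest request, ICloudServer cloud)
    {
        // Step 2: generate an auditing message challenging random blocks.
        var blocks = new int[10];
        for (int i = 0; i < blocks.Length; i++) blocks[i] = rng.Next(1000);
        var challenge = new AuditMessage { FileId = request.FileId, ChallengedBlocks = blocks };

        // Step 3: retrieve an auditing proof of shared data from the cloud.
        AuditProof proof = cloud.GenerateProof(challenge);

        // Step 4: verify the proof and report the result back to the user.
        return new AuditReport { DataIntact = VerifyProof(challenge, proof) };
    }

    private bool VerifyProof(AuditMessage challenge, AuditProof proof)
    {
        // Placeholder: the real scheme checks homomorphic authenticators here.
        return proof != null && proof.Aggregate != null;
    }
}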
3.5.2 Threat Model
Integrity Threats
Two kinds of threats related to the integrity of shared data are
possible. First, an adversary may try to corrupt the integrity of shared data and
prevent users from using data correctly. Second, the cloud service provider may
inadvertently corrupt (or even remove) data in its storage due to hardware
failures and human errors [1]. Making matters worse, to protect its reputation
the cloud service provider may even be reluctant to inform users about such
corruption of data.
Privacy Threats
The identity of the signer on each block in shared data is private and
confidential to the group. During the process of auditing, a semi-trusted TPA,
who is only responsible for auditing the integrity of shared data, may try to reveal
the identity of the signer on each block in shared data based on verification
information. Once the TPA reveals the identity of the signer on each block, it can
easily distinguish a high-value target (a particular user in the group or a special
block in shared data).
3.6 DESIGN OBJECTIVES
To enable the TPA efficiently and securely verify shared data for a
group of users, Oruta should be designed to achieve the following properties:
(1) Public Auditing: The third party auditor is able to publicly verify
the integrity of shared data for a group of users without retrieving the entire data.
(2) Correctness: The third party auditor is able to correctly detect
whether there is any corrupted block in shared data.
(3) Unforgeability: Only a user in the group can generate valid
verification information on shared data.
(4) Identity Privacy: During auditing, the TPA cannot distinguish the
identity of the signer on each block in shared data.
3.7 ADVANTAGES OF PROPOSED SYSTEM
A public verifier is able to correctly verify shared data integrity.
A public verifier cannot distinguish the identity of the signer on each
block in shared data during the process of auditing.
The ring signatures [4] generated are not only able to preserve identity
privacy but are also able to support blockless verifiability.
CHAPTER 4
REQUIREMENTS SPECIFICATION
4.1 INTRODUCTION
A system requirements specification is a complete description of the behavior of
the system to be developed. It includes a set of use cases that describes all of the
interactions that the users will have with the software. A good requirement
defines how an application will interact with system hardware, other programs
and human users in a wide variety of real-world situations. The Software
Requirements Specification has been developed for future reference in case
of any ambiguity or misunderstanding. It provides a detailed description of
the problem that the software must solve.
4.2 HARDWARE REQUIREMENTS
Processor : Intel Pentium
Hard Disk : 500 GB
Mouse : Touch Pad
RAM : 4GB
4.3 SOFTWARE REQUIREMENTS
Operating System : Windows
Coding Language : ASP.NET with C#
Database : Microsoft SQL Server
CHAPTER 5
SOFTWARE ENVIRONMENT
5.1 FEATURES OF .NET
Microsoft .NET is a set of Microsoft software technologies for rapidly
building and integrating XML Web services, Microsoft Windows-based
applications, and Web solutions. The .NET Framework is a language-neutral
platform for writing programs that can easily and securely interoperate. There is
no language barrier with .NET: there are numerous languages available to the
developer, including Managed C++, C#, Visual Basic and JScript. The .NET
framework provides the foundation for components to interact seamlessly,
whether locally or remotely on different platforms. It standardizes common data
types and communications protocols so that components created in different
languages can easily interoperate.
".NET" is also the collective name given to various software
components built upon the .NET platform. These will be both products (Visual
Studio .NET and Windows .NET Server, for instance) and services (like Passport,
.NET My Services, and so on).
5.1.1 The .NET Framework
The .NET Framework has two main parts:
1. The Common Language Runtime (CLR).
2. A hierarchical set of class libraries.
The CLR is described as the "execution engine" of .NET. It provides
the environment within which programs run. The most important features are:
Conversion from a low-level assembler-style language, called
Intermediate Language (IL), into code native to the platform being
executed on.
Memory management, notably including garbage collection.
Checking and enforcing security restrictions on the running code.
Loading and executing programs, with version control and other such
features.
Managed Code:
Managed code is code that targets .NET and contains certain extra
information, "metadata", to describe itself. While both managed and unmanaged
code can run in the runtime, only managed code contains the information that
allows the CLR to guarantee, for instance, safe execution and interoperability.
Managed Data:
With managed code comes managed data. The CLR provides memory
allocation and deallocation facilities, and garbage collection. Some .NET
languages use Managed Data by default, such as C#, Visual Basic.NET and
Jscript.NET, whereas others, namely C++, do not. Targeting CLR can, depending
on the language you’re using, impose certain constraints on the features
available. As with managed and unmanaged code, one can have both managed
and unmanaged data in .NET applications – data that doesn’t get garbage
collected but instead is looked after by unmanaged code.
Common Type System:
The CLR uses something called the Common Type System (CTS) to
strictly enforce type-safety. This ensures that all classes are compatible with each
other, by describing types in a common way. CTS define how types work within
the runtime, which enables types in one language to interoperate with types in
another language, including cross-language exception handling. As well as
ensuring that types are only used in appropriate ways, the runtime also ensures
that code doesn’t attempt to access memory that hasn’t been allocated to it.
Common Language Specification:
The CLR provides built-in support for language interoperability. To
ensure that you develop managed code that can be fully used by
developers using any programming language, a set of language features and
rules for using them called the Common Language Specification (CLS) has been
defined. Components that follow these rules and expose only CLS features are
considered CLS-compliant.
5.1.2 The Class Library
.NET provides a single-rooted hierarchy of classes, containing over
7000 types. The root of the namespace is called System; this contains basic types
like Byte, Double, Boolean, and String, as well as Object. All objects derive from
System.Object. As well as objects, there are value types. Value types can be
allocated on the stack, which can provide useful flexibility. There are also
efficient means of converting value types to object types if and when necessary.
The class library is subdivided into a number of sets (or
namespaces), each providing distinct areas of functionality, with dependencies
between the namespaces kept to a minimum.
5.2 LANGUAGES SUPPORTED BY .NET
The multi-language capability of the .NET Framework and Visual
Studio .NET enables developers to use their existing programming skills to build
all types of applications and XML Web services. The .NET framework supports
new versions of Microsoft’s old favorites Visual Basic and C++ (as VB.NET and
Managed C++), but there are also a number of new additions to the family.
Visual Basic .NET has been updated to include many new and
improved language features that make it a powerful object-oriented programming
language. These features include inheritance, interfaces, and overloading, among
others. Visual Basic also now supports structured exception handling, custom
attributes, and multi-threading.
Managed Extensions for C++ and attributed programming are just
some of the enhancements made to the C++ language. Managed Extensions
simplify the task of migrating existing C++ applications to the new .NET
Framework.
Microsoft Visual J# .NET provides the easiest transition for Java-
language developers into the world of XML Web Services and dramatically
improves the interoperability of Java-language programs with existing software
written in a variety of other programming languages.
ActiveState has created Visual Perl and Visual Python, which
enable .NET-aware applications to be built in either Perl or Python. Both
products can be integrated into the Visual Studio .NET environment. Visual Perl
includes support for ActiveState's Perl Dev Kit. Other languages for which
.NET compilers are available include
FORTRAN
COBOL
Eiffel
C#.NET is also compliant with CLS (Common Language
Specification) and supports structured exception handling. CLS is set of rules and
constructs that are supported by the CLR (Common Language Runtime).
5.3 CONSTRUCTORS AND DESTRUCTORS:
Constructors are used to initialize objects, whereas destructors are
used to destroy them. In other words, destructors are used to release the resources
allocated to the object. In C#.NET, the Finalize procedure is available. The
Finalize procedure is used to complete the tasks that must be performed when
an object is destroyed. The Finalize procedure is called automatically when
an object is destroyed. In addition, the Finalize procedure can be called only
from the class it belongs to or from derived classes.
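For example, a C# destructor (which the compiler turns into a Finalize override) might release an unmanaged resource as follows; the handle and class name are illustrative:

using System;

class FileWrapper
{
    private IntPtr handle; // some unmanaged resource (illustrative)

    public FileWrapper(IntPtr h) { handle = h; }

    // C# destructor syntax: compiled into a protected Finalize override,
    // called automatically by the garbage collector when the object dies.
    ~FileWrapper()
    {
        // Release the unmanaged resource here.
        handle = IntPtr.Zero;
    }
}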
5.3.1 Garbage Collection
Garbage Collection is another new feature in C#.NET. The .NET
Framework monitors allocated resources, such as objects and variables. In
addition, the .NET Framework automatically releases memory for reuse by
destroying objects that are no longer in use.
In C#.NET, the garbage collector checks for the objects that are not
currently in use by applications. When the garbage collector comes across an
object that is marked for garbage collection, it releases the memory occupied by
the object.
5.3.2 Overloading
Overloading is another feature in C#. Overloading enables us to
define multiple procedures with the same name, where each procedure has a
different set of arguments. Besides using overloading for procedures, we can use
it for constructors and properties in a class.
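For instance, a minimal illustration of overloading (class and method names assumed for the example):

using System;

class Printer
{
    // Three procedures share the name Print but take different arguments.
    public void Print(string text) { Console.WriteLine(text); }
    public void Print(int number)  { Console.WriteLine(number); }

    // An overload with a repeat count.
    public void Print(string text, int times)
    {
        for (int i = 0; i < times; i++) Console.WriteLine(text);
    }
}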
5.3.3 Multithreading
C#.NET also supports multithreading. An application that supports
multithreading can handle multiple tasks simultaneously. We can use
multithreading to decrease the time taken by an application to respond to user
interaction.
5.3.4 Structured Exception Handling
C#.NET supports structured exception handling, which enables us to detect
and remove errors at runtime. In C#.NET, we use try...catch...finally
statements to create exception handlers. Using try...catch...finally statements,
we can create robust and effective exception handlers to improve the
performance of our application.
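A minimal example of the pattern (the file name is illustrative):

using System;
using System.IO;

class ExceptionDemo
{
    static void Main()
    {
        StreamReader reader = null;
        try
        {
            reader = new StreamReader("data.txt"); // illustrative file name
            Console.WriteLine(reader.ReadToEnd());
        }
        catch (FileNotFoundException e)
        {
            // Handle the specific error detected at runtime.
            Console.WriteLine("File missing: " + e.FileName);
        }
        finally
        {
            // Runs whether or not an exception was thrown.
            if (reader != null) reader.Dispose();
        }
    }
}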
5.4 OBJECTIVES OF .NET FRAMEWORK
1. To provide a consistent object-oriented programming environment
whether object code is stored and executed locally, executed locally but
Internet-distributed, or executed remotely.
2. To provide a code-execution environment that minimizes software
deployment and versioning conflicts and guarantees safe execution of code.
3. To eliminate the performance problems of scripted or interpreted
environments.
CHAPTER 6
SYSTEM DESIGN
6.1 SYSTEM ARCHITECTURE
The architecture involves three parties: the cloud server, the third
party auditor (TPA) and users. There are two types of users in a group: the
original user and a number of group users.
Fig 6.1: System Architecture
The original user and group users are both members of the group.
Group members are allowed to access and modify shared data created by the
original user based on access control policies [8]. Shared data and its verification
information (i.e. signatures) are both stored in the cloud server. The third party
auditor is able to verify the integrity of shared data in the cloud server on behalf
of group members. Our system model includes the cloud server, the third party
auditor and users. The user is responsible for deciding who is able to share her
data before outsourcing data to the cloud. When a user wishes to check the
integrity of shared data, she first sends an auditing request to the TPA. After
receiving the auditing request, the TPA generates an auditing message to the
cloud server, and retrieves an auditing proof of shared data [14] from the cloud
server. Then the TPA verifies the correctness of the auditing proof [13]. Finally,
the TPA sends an auditing report to the user based on the result of the
verification.
6.2 DATA FLOW DIAGRAM
A Data Flow diagram (DFD) is a graphical representation of the flow
of data through an information system, modeling its process aspects. Often they
are a preliminary step used to create an overview of the system which can later
be elaborated. DFDs can also be used for the visualization of data processing
(structured design).
6.2.1 Level 0 Data Flow Diagram
Fig 6.2: Level 0 Data Flow Diagram
The step-by-step flow from input to output processing of
data is shown in this diagram. This includes the user's input and the
corresponding output the user receives.
6.2.2 Level 1 Data Flow Diagram-Admin
The admin logs in to the application with a valid username and
password. The admin has access to the files verified and other details pertaining
to each file; overall, the admin has options such as viewing the files verified by
the TPA and the changes made to the existing files.
Fig 6.3: Level 1 Data Flow Diagram
6.2.3 Level 2 Data Flow Diagram-Owner
Fig 6.4: Level 2 Data Flow Diagram
An owner is a person who can access resources from the cloud. The
owner would first register to the interface to get the services with the valid
username and password. In order to correctly audit the integrity of the entire data,
a public verifier needs to choose the appropriate public key for each block. Then
they can request for the file to the cloud service admin. There will be a third
party auditor who performs the integrity checking of the data before providing it
to the owner or the users. This is done by first splitting the data into blocks and
then performing an integrity check. The owner has the option of downloading the
verified file and also uploading new files.
6.2.4 Level 3 Data Flow Diagram-TPA
The TPA registers to the application with a valid username and
password. The TPA logs in to the application and verifies the integrity of data.
The TPA views the list of files uploaded by the owner and, without the key, has
the privilege of encrypting the data and saving it on the cloud. The TPA also
views data which is uploaded by various owners.
Fig 6.5: Level 3 Data Flow Diagram
6.3 USE CASE DIAGRAM
A use case captures the interactions that occur between the users of a
system and the system itself. The use case diagram shows a
number of external actors and their connections to the use cases that the system
provides. A use case is a description of a functionality that the system provides.
The use case diagram presents an outside view of the system. The use-case
model consists of the following:
1. The use-case diagrams illustrate the actors, the use cases and their
relationships.
2. Use cases also require a textual description (use case specification), as
the visual diagrams can’t contain all of the information that is necessary.
3. The customers, the end-users, the domain experts, and the developers all
have an input into the development of the use-case model.
The step-by-step flow from input to output processing of
data is shown in the diagram. This includes the user's input and the corresponding
output the user receives. The admin logs in with a valid username and password
to the application. The admin has access to the files verified and other details
pertaining to each file.
An owner is a person who can access resources from the cloud. The
owner would first register to the interface to get the services with the valid
username and password. In order to correctly audit the integrity of the entire data,
a public verifier needs to choose the appropriate public key for each block. Then
they can request for the file to the cloud service admin. There will be a third
party auditor who performs the integrity checking of the data before providing it
to the owner or the users. This is done by first splitting the data into blocks and
then performing an integrity check. The owner has the option of downloading the
verified file and also uploading new files. The TPA registers to the application with
a valid username and password. The TPA logs in to the application and verifies the
integrity of data. The TPA views the list of files uploaded by the owner and, without
the key, has the privilege of encrypting the data and saving it on the cloud. The TPA
also views data which is uploaded by various owners.
Fig 6.6 Use Case Diagram
CHAPTER 7
SYSTEM IMPLEMENTATION
7.1 PRIVACY PRESERVING PUBLIC AUDITING MODULE:
The public auditing mechanism [2], [3], [6] in Oruta consists of five
algorithms: Key-Gen, Sig-Gen, Modify, Proof-Gen and Proof-Verify. In Key-Gen,
users generate their own public/private key pairs. In Sig-Gen, a user is able to
compute ring signatures [4] on blocks in shared data. In Modify, each user is able
to perform an insert, delete or update operation on a block and compute the new
ring signature [4] on the new block. Proof-Gen is run by the TPA and the cloud
server together to generate a proof of possession of shared data. In Proof-Verify,
the TPA verifies the proof and sends an auditing report to the user.
The proposed scheme is as follows:
Setup Phase
Audit Phase
Setup Phase
The user initializes the public and secret parameters of the system by
executing KeyGen, and pre-processes the data file F by using SigGen to generate
the verification metadata. The user then stores the data file F and the verification
metadata at the cloud server. The user may alter the data file F by performing
updates on the stored data in cloud.
Audit Phase
TPA issues an audit message to the cloud server to make sure that the
cloud server has retained the data file F properly at the time of the audit. The
cloud server will create a response message by executing GenProof using F and
its verification metadata as inputs. The TPA then verifies the response from the
cloud server via VerifyProof.
An owner is a person who can access resources from the cloud. The
owner would first register to the interface to get the services with the valid
username and password. In order to correctly audit the integrity of the entire data,
a public verifier needs to choose the appropriate public key for each block. Then
they can request for the file to the cloud service admin. There will be a third
party auditor who performs the integrity checking of the data before providing it
to the owner or the users. This is done by first splitting the data into blocks and
then performing an integrity check.
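The five algorithms can be summarized in the following C# interface sketch. The types and signatures are illustrative assumptions; the actual construction uses ring signatures over bilinear groups, which the base class library does not provide:

// Hedged interface sketch of Oruta's five algorithms (names and types are
// illustrative; the real scheme builds ring signatures on bilinear pairings).
interface IOrutaScheme
{
    // Key-Gen: each user generates a public/private key pair.
    (byte[] PublicKey, byte[] PrivateKey) KeyGen();

    // Sig-Gen: a user computes a ring signature on a block, so that any
    // member of the group could plausibly be the signer.
    byte[] SigGen(byte[] privateKey, byte[][] groupPublicKeys,
                  byte[] block, byte[] blockId);

    // Modify: insert, delete or update a block and re-sign only that block.
    byte[] Modify(byte[] privateKey, byte[][] groupPublicKeys,
                  byte[] newBlock, byte[] blockId);

    // Proof-Gen: run jointly by the TPA and the cloud server to produce a
    // proof of possession of the challenged blocks of shared data.
    byte[] ProofGen(int[] challengedBlocks);

    // Proof-Verify: the TPA checks the proof and reports to the user.
    bool ProofVerify(byte[] proof, byte[][] groupPublicKeys);
}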
7.2 BATCH AUDITING MODULE:
With the establishment of privacy-preserving public auditing in Cloud
Computing, TPA may concurrently handle multiple auditing delegations upon
different users’ requests. Auditing these tasks individually can be
tedious and very inefficient for the TPA. Batch auditing not only allows the TPA to perform
multiple auditing tasks simultaneously, but also greatly reduces the computation
cost on the TPA side. Given K auditing delegations on K distinct data files from
K different users, it is more advantageous for TPA to batch these multiple tasks
together and audit at one time.
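A sketch of the batching idea, reusing the hypothetical IOrutaScheme interface from Section 7.1 (in the real scheme the K verification equations collapse into a single aggregated check rather than a loop):

using System.Collections.Generic;

// Hedged sketch: the TPA batches K auditing delegations from K users.
class BatchAuditor
{
    public Dictionary<string, bool> BatchAudit(
        IOrutaScheme scheme,
        List<(string User, byte[] Proof, byte[][] GroupKeys)> delegations)
    {
        // One pass over the K delegations; the real efficiency gain comes
        // from aggregating the K verifications into a single equation.
        var report = new Dictionary<string, bool>();
        foreach (var d in delegations)
            report[d.User] = scheme.ProofVerify(d.Proof, d.GroupKeys);
        return report;
    }
}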
The TPA registers to the application with a valid username and
password. The TPA logs in to the application and verifies the integrity of data.
The TPA views the list of files uploaded by the owner without the key.
7.3 DATA DYNAMICS MODULE:
Supporting data dynamics for privacy-preserving public auditing
is also of paramount importance. We now show how our main scheme can be
adapted to build upon existing work to support data dynamics, including
block-level operations of modification, deletion and insertion. We can adopt this
technique in our design to achieve privacy-preserving public auditing with
support for data dynamics.
To enable each user in the group to easily modify data in the cloud
and share the latest version of the data with the rest of the group, Oruta should also
support dynamic operations on shared data. A dynamic operation includes an
insert, delete or update operation on a single block. However, since the
computation of a ring signature [4] includes an identifier of a block, traditional
methods, which only use the index of a block as its identifier, are not suitable for
supporting dynamic operations on shared data. The reason is that, when a user
modifies a single block in shared data by performing an insert or delete
operation, the indices of all the blocks after the modified block change, and these
index changes require users to recompute the signatures of those blocks, even
though the content of those blocks is not modified.
Modify: A user in the group modifies the block in the shared data by performing
one of the following three operations:
Insert: The user inserts a new block mj into shared data, where the total number
of blocks in shared data is n. He/she computes the identifier of the inserted
block mj as idj = {vj, rj}, where idj is the identifier of the j-th block and vj is its
virtual index. For the rest of the blocks, the identifiers are not changed. The user
outputs the new ring signature [4] of the inserted block with SignGen, and uploads
it to the cloud server. The total number of blocks in shared data increases to n+1.
38
Delete: The user deletes the block mj, its identifier idj and its ring signature [4]
from the cloud server. The identifiers of the other blocks in shared data remain
the same. The total number of blocks in shared data in the cloud decreases to n-1.
Update: The user updates the j-th block in shared data with a new block mj. The
virtual index of this block remains the same, and a new ring signature [4] is
computed. The user computes the new identifier of the updated block; the
identifiers of the other blocks in shared data are not changed. The user outputs
the new ring signature [4] of the new block with SignGen, and uploads it to the
cloud server. The total number of blocks in shared data is still n.
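The identifier bookkeeping behind these three operations can be sketched as follows in C#; the virtual-index spacing and the 16-byte random value are illustrative assumptions, and signatures are treated as opaque byte arrays:

using System;
using System.Collections.Generic;

// Hedged sketch of block identifiers idj = {vj, rj}: a virtual index vj plus
// a random value rj, so inserting or deleting one block never forces the
// identifiers (and thus the signatures) of other blocks to be recomputed.
class SharedFile
{
    class BlockId { public long V; public byte[] R; }

    private readonly List<(BlockId Id, byte[] Block, byte[] Signature)> blocks =
        new List<(BlockId, byte[], byte[])>();
    private readonly Random rng = new Random();

    private BlockId NewId(long virtualIndex)
    {
        var r = new byte[16];
        rng.NextBytes(r);
        return new BlockId { V = virtualIndex, R = r };
    }

    // Insert: a fresh virtual index between the neighbours leaves every
    // other identifier unchanged; n grows to n + 1.
    public void Insert(int j, byte[] mj, byte[] ringSignature)
    {
        long left = j > 0 ? blocks[j - 1].Id.V : 0;
        long right = j < blocks.Count ? blocks[j].Id.V : left + 2000000;
        blocks.Insert(j, (NewId((left + right) / 2), mj, ringSignature));
    }

    // Delete: remove the block, its identifier and its ring signature;
    // all other identifiers stay the same and n shrinks to n - 1.
    public void Delete(int j) => blocks.RemoveAt(j);

    // Update: keep the virtual index, draw a new random value, and store
    // the newly computed ring signature; n is unchanged.
    public void Update(int j, byte[] mj, byte[] ringSignature) =>
        blocks[j] = (NewId(blocks[j].Id.V), mj, ringSignature);
}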
CHAPTER 9
CONCLUSION
9.1 CONCLUSION
We propose Oruta, the first privacy preserving public auditing
mechanism for shared data in the cloud. We utilize ring signatures [4] to
construct homomorphic authenticators [2], [6], so the TPA is able to audit the
integrity of shared data yet cannot distinguish who the signer is on each block,
thereby achieving identity privacy. To improve the efficiency of verification for
multiple auditing tasks, we further extend our mechanism to support batch
auditing.
9.2 FUTURE ENHANCEMENT
An interesting problem in our future work is how to efficiently audit
the integrity of shared data with dynamic groups while still preserving the
identity of the signer on each block from the third party auditor. To avoid
data duplication and data loss, we plan to utilize a Merkle hash tree for data
splitting in an advanced manner. A homomorphic ring signature [4] algorithm will
be implemented in order to generate signatures for a larger number of users in the
group. High data privacy will be achieved, since the auditor is unaware of the
contents stored in the cloud.
REFERENCES
[1] A. Juels and B. S. Kaliski, "PORs: Proofs of Retrievability for Large Files,"
in Proc. ACM Conference on Computer and Communications Security (CCS),
2007, pp. 584–597.
[2] B. Chen, R. Curtmola, G. Ateniese, and R. Burns, "Remote Data Checking
for Network Coding-based Distributed Storage Systems," in Proc. ACM Cloud
Computing Security Workshop (CCSW), 2010, pp. 31–42.
[3] B. Chen, R. Curtmola, G. Ateniese, and R. Burns, "Remote Data Checking
for Network Coding-based Distributed Storage Systems," in Proc. ACM Cloud
Computing Security Workshop (CCSW), 2010, pp. 31–42.
[4] C. Erway, A. Kupcu, C. Papamanthou, and R. Tamassia, "Dynamic Provable
Data Possession," in Proc. ACM Conference on Computer and Communications
Security (CCS), 2009, pp. 213–222.
[5] C. Wang, Q. Wang, K. Ren, and W. Lou, "Ensuring Data Storage Security in
Cloud Computing," in Proc. IEEE/ACM International Workshop on Quality of
Service (IWQoS), 2009, pp. 1–9.
[6] C. Wang, Q. Wang, K. Ren, and W. Lou, "Privacy-Preserving Public
Auditing for Data Storage Security in Cloud Computing," in Proc. IEEE
International Conference on Computer Communications (INFOCOM), 2010, pp.
525–533.
[7] D. Boneh, B. Lynn, and H. Shacham, "Short Signatures from the Weil
Pairing," in Proc. International Conference on the Theory and Application of
Cryptology and Information Security (ASIACRYPT), Springer-Verlag, 2001, pp.
514–532.
[8] D. Boneh, C. Gentry, B. Lynn, and H. Shacham, "Aggregate and Verifiably
Encrypted Signatures from Bilinear Maps," in Proc. International Conference on
the Theory and Applications of Cryptographic Techniques (EUROCRYPT),
Springer-Verlag, 2003, pp. 416–432.
[9] G. Ateniese, R. Burns, R. Curtmola, J. Herring, L. Kissner, Z. Peterson, and
D. Song, "Provable Data Possession at Untrusted Stores," in Proc. ACM
Conference on Computer and Communications Security (CCS), 2007, pp. 598–
610.
[10] G. Ateniese, R. D. Pietro, L. V. Mancini, and G. Tsudik, "Scalable and
Efficient Provable Data Possession," in Proc. International Conference on
Security and Privacy in Communication Networks (SecureComm), 2008.
[11] H. Shacham and B. Waters, "Compact Proofs of Retrievability," in Proc.
International Conference on the Theory and Application of Cryptology and
Information Security (ASIACRYPT), Springer-Verlag, 2008, pp. 90–107.
[12] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. H. Katz, A. Konwinski,
G. Lee, D. A. Patterson, A. Rabkin, I. Stoica, and M. Zaharia, "A View of Cloud
Computing," Communications of the ACM, vol. 53, no. 4, pp. 50–58, April
2010.
[13] Q. Zheng and S. Xu, "Secure and Efficient Proof of Storage with
Deduplication," in Proc. ACM Conference on Data and Application Security and
Privacy (CODASPY), 2012.
[14] R. L. Rivest, A. Shamir, and Y. Tauman, "How to Leak a Secret," in Proc.
International Conference on the Theory and Application of Cryptology and
Information Security (ASIACRYPT), Springer-Verlag, 2001, pp. 552–565.
[15] S. Yu, C. Wang, K. Ren, and W. Lou, "Achieving Secure, Scalable, and
Fine-grained Data Access Control in Cloud Computing," in Proc. IEEE
International Conference on Computer Communications (INFOCOM), 2010, pp.
534–542.
[16] Y. Zhu, H. Wang, Z. Hu, G.-J. Ahn, H. Hu, and S. S. Yau, "Dynamic Audit
Services for Integrity Verification of Outsourced Storage in Clouds," in Proc.
ACM Symposium on Applied Computing (SAC), 2011, pp. 1550–1557.