INTERNATIONAL JOURNAL FOR TRENDS IN ENGINEERING & TECHNOLOGY, VOLUME 5, ISSUE 2, MAY 2015, ISSN: 2349-9303
Periodic Auditing of Data in Cloud Using Random Bits
K. Devika 1, M. Jawahar 2
1,2 Computer Science and Engineering, K.S.R Institute for Engineering and Technology
Email: devikannandk@gmail.com
Abstract─ Cloud storage is a service that moves a user's data into large data centers, where the data is remotely located, maintained, managed, and backed up over the Internet. The cloud must provide a way to check the integrity of the user's data even when the user cannot access it physically. In this paper we provide a proof of retrievability (POR) scheme to ensure data integrity in the cloud based on an SLA (service level agreement). In addition, we provide a dynamic audit service for verifying the integrity of outsourced data on untrusted storage, using a method based on probabilistic query and periodic verification to improve the performance of audit services.
Index Terms─ Cloud Storage, Data Integrity, POR, Dynamic Audit, Audit Service.
1 INTRODUCTION
The owner (client) of the data moves it to a third-party cloud storage server, which is expected, presumably for a fee, to faithfully store the data and provide it back to the owner whenever required. Because data generation is far outpacing data storage, it proves costly for small firms to frequently update their hardware as additional data is created, and maintaining the storage can be a difficult task. By reducing the costs of storage, maintenance, and personnel, outsourcing data to cloud storage helps such firms.
Cloud storage can also assure reliable storage of important data by keeping multiple copies of it, thereby reducing the chance of losing data to hardware failures, and it provides a reliable solution to the problem of avoiding local storage of data. In this paper we implement a protocol for obtaining a proof of data possession in the cloud, referred to as a proof of retrievability (POR). By using this protocol to obtain and verify a proof, a user is assured that the data stored at a remote data store in the cloud (called a cloud storage archive, or simply an archive) has not been modified by the archive, and thereby that its integrity is preserved.
Frequent verification checks on the storage archive prevent the archive from misrepresenting or modifying the data stored in the cloud without the consent of the data owner. These checks must allow the data owner to quickly, efficiently, securely, and frequently verify that the cloud archive is not cheating, since the archive might delete or modify some of the owner's data.
2 RELATED WORK
In developing proofs of data possession at untrusted cloud storage servers, we are often limited by the resources at the cloud server as well as at the client. Because the data sizes are large and the data is stored at remote servers, accessing an entire file is expensive in I/O costs to the storage server, and transmitting the file across the network to the client can consume heavy bandwidth.
Storing user data in the cloud, despite its advantages, raises many interesting security concerns and needs to be extensively investigated before it can become a reliable solution to the problem of avoiding local storage of data. As noted above, data generation is far outpacing data storage, so it proves costly for small firms to frequently update their hardware, and maintaining the storage can be a difficult task; transmitting files across the network to the client also consumes heavy bandwidth. The problem is further complicated by the fact that the owner of the data may be a small device, such as a PDA (personal digital assistant) or a mobile phone, with limited CPU power, battery life, and communication bandwidth.
3 ARCHITECTURE
One of the important concerns that must be addressed is assuring the customer of the correctness of his data in the cloud. The cloud must provide a way to check whether the integrity of the user's data is maintained or compromised while the data is not physically accessible to the user. In this paper we provide a scheme that gives a proof of data integrity in the cloud, which the customer can employ to check the correctness of his data. This proof can be agreed upon by both the customer and the cloud, and it can be incorporated into the service level agreement (SLA).
It is important to note that our proof of data integrity protocol checks only the integrity of the data, that is, whether the data has been illegally modified or deleted. Apart from reducing storage costs, outsourcing data to the cloud also helps in reducing maintenance. The scheme avoids local storage of data, and by reducing the costs of storage, maintenance, and personnel, it also reduces the chance of losing data to hardware failures.
The client pre-processes its data file F before storing it in the cloud and creates suitable metadata for the file, which is later used to verify the integrity of the data at the cloud storage. When checking data integrity, the client queries the cloud storage and, from the replies, concludes whether the integrity of its stored data is preserved. Although the verifier does not store any of the data itself, it needs to keep only a single cryptographic key, irrespective of the size of the data file F, and two functions that generate a random sequence for use in the data integrity protocol.
Fig. 1. Architecture
The verifier pre-processes the file before storing it at the archive, appending some metadata to the file that is stored along with it. Verifiers can later check the integrity of their data by challenging the archive using this metadata, and the archive proves the integrity of the user's data by responding to the verifier's challenge.
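As an illustration, the following is a minimal sketch of this pre-process/challenge/respond/verify cycle in Python. It assumes an HMAC-per-block metadata scheme and, for simplicity, keeps the tags at the verifier rather than appending them to the archived file; the block size, the hash choice, and the function names are our own illustrative assumptions, not the paper's exact construction.

import hashlib
import hmac
import random

BLOCK = 4096  # assumed block size in bytes

def preprocess(key: bytes, data: bytes):
    # Client side: split the file into blocks and compute one HMAC tag
    # per block as metadata; the blocks themselves go to the archive.
    blocks = [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]
    tags = [hmac.new(key, str(i).encode() + blk, hashlib.sha256).digest()
            for i, blk in enumerate(blocks)]
    return blocks, tags

def challenge(seed: int, n_blocks: int, sample_size: int):
    # Verifier side: derive a pseudo-random subset of block indices,
    # so only a few blocks must be fetched per audit.
    return random.Random(seed).sample(range(n_blocks), sample_size)

def respond(archive_blocks, indices):
    # Archive side: fetch and return only the challenged blocks.
    return [archive_blocks[i] for i in indices]

def verify(key: bytes, tags, indices, returned_blocks) -> bool:
    # Verifier side: recompute the HMAC of each returned block and
    # compare it against the stored metadata.
    return all(
        hmac.compare_digest(
            tags[i],
            hmac.new(key, str(i).encode() + blk, hashlib.sha256).digest())
        for i, blk in zip(indices, returned_blocks))

Only the challenged blocks are fetched and returned, so the archive's I/O and the proof transmitted over the network stay small.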
Our scheme was developed to reduce the computational and storage overhead of the client, to minimize the computational overhead of the cloud storage server, and to minimize the size of the proof of data integrity so as to reduce network bandwidth consumption. Compared with other schemes, the storage required at the client is minimal, which is an advantage for thin clients. Encrypting data generally consumes a large amount of computational power; in this scheme, the encryption is limited to only a fraction of the whole data, saving on the computation time of the client. In many schemes, generating the proof of data integrity requires the archive to perform tasks that need a lot of computational power, but in our scheme the archive just needs to fetch and send a few bits of data to the client.
3.1 Dynamic Auditing of Data in Cloud
We provide a dynamic audit service for verifying the integrity of outsourced data on untrusted storage, using a method based on probabilistic query and periodic verification to improve the performance of audit services. Our audit service comprises three processes that realize these functions.
i) Tag Generation: The client (data owner, DO) uses a secret key sk to preprocess a file, which consists of a collection of n blocks.
Fig. 2. Tag generation for audit file
The DO generates a set of public verification parameters (PVPs) and an index hash table (IHT), which are stored at the TPA (third-party auditor), transmits the file and some verification tags to the CSP (cloud service provider), and may then delete its local copy (Fig. 2).
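A minimal sketch of this tag-generation step follows, under our own assumptions: tags are HMACs over the block content and its IHT record, and the PVP is abbreviated to a single digest; the real construction uses signature tags and encrypted secrets as described in Section 4.

import hashlib
import hmac
import secrets

def tag_gen(sk: bytes, file_blocks):
    # DO side: build an IHT record and a verification tag for each block.
    # Records are (No, B_i, V_i, R_i) as defined for the IHT in Section 4.
    iht, tags = [], []
    for i, block in enumerate(file_blocks):
        r_i = secrets.randbelow(2**32)            # random integer R_i
        iht.append((i, i, 1, r_i))
        chi_i = "%d||%d||%d" % (i, 1, r_i)        # chi_i = B_i||V_i||R_i
        tags.append(hmac.new(sk, chi_i.encode() + block,
                             hashlib.sha256).digest())
    pvp = hashlib.sha256(b"PVP" + sk).digest()    # stand-in for the encrypted PVP
    return tags, iht, pvp

# Distribution: (file_blocks, tags) go to the CSP; (pvp, iht) go to the
# TPA; the DO may then delete its local copy.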
ii) Periodic Sampling Audit: The TPA issues "random sampling" challenges to audit the integrity and availability of the outsourced data against the verification information (the PVP and the IHT, the index hash table) stored at the TPA, by using an interactive proof protocol of retrievability (Fig. 3).
Fig. 3. Periodic sampling audit
iii) Audit for Dynamic Operations: An authorized application (AA) holds the DO's secret key sk. It can manipulate the outsourced data and update the associated IHT stored at the TPA. The secrecy of sk and the checking algorithm ensure that the storage server cannot cheat the AAs or forge valid audit records (Fig. 4).
Fig. 4. Dynamic data operations and audit
The AAs, which may be cloud application services running inside the cloud for various application purposes, must be specifically authorized by DOs to manipulate the outsourced data. Any unauthorized modification of the data will be detected in the audit or verification processes by the AA, which must present authentication information to the TPA. Based on this kind of strong authorization-verification mechanism, the CSP is trusted to guarantee the security of the stored data. The TPA can therefore be constructed in the cloud and maintained by the CSP.
The TPA must be secure enough to resist malicious attacks, and to ensure trust and security it should be strictly controlled to prevent unauthorized access, even by internal members of the cloud. The TPA in the cloud should be mandated by a trusted third party (TTP). The TTP mechanism improves the performance of the audit service and provides the DO with maximum access transparency, meaning that DOs are entitled to use the audit service without additional cost. These processes involve several procedures: the KeyGen, TagGen, Update, Delete, and Insert algorithms, and an interactive proof protocol of retrievability.
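Read as an API, these procedures might look as follows; this is only an interface sketch with assumed signatures, since the paper does not fix them.

from typing import Any, Protocol

class AuditService(Protocol):
    # Procedures named in the text; the signatures are our assumptions.
    def key_gen(self, security_param: int) -> tuple[bytes, Any]: ...
    def tag_gen(self, sk: bytes, blocks: list) -> tuple[list, Any, Any]: ...
    def update(self, sk: bytes, index: int, new_block: bytes) -> None: ...
    def delete(self, sk: bytes, index: int) -> None: ...
    def insert(self, sk: bytes, index: int, new_block: bytes) -> None: ...
    def prove(self, challenge: Any) -> Any: ...                # CSP side of the interactive POR
    def verify(self, challenge: Any, proof: Any) -> bool: ...  # TPA side of the interactive POR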
4 OUR CONTRIBUTION
A. Fragment Structure and Secure Tags: A general fragment structure for outsourced storage is introduced to maximize storage efficiency and audit performance. As an instance of this framework, an outsourced file F is split into n blocks {m1, m2, ..., mn}, and each block mi is split into s sectors {mi,1, mi,2, ..., mi,s}. The fragment framework consists of n block-tag pairs (mi, σi), where σi is a signature tag of block mi generated from some secrets τ = (τ1, τ2, ..., τs). In the verification protocol, the tags and the corresponding data can be used to construct a response to the TPA's challenges such that the response can be verified without the raw data. We call a tag secure if it is unforgeable by anyone except the original signer. Finally, the encrypted secrets (called the PVP) are stored at the TTP, and the block-tag pairs are stored at the CSP.
Although this fragment structure is simple and straightforward, the file is split into n×s sectors and each block (of s sectors) corresponds to a single tag, so the storage required for signature tags can be reduced by increasing s. This structure thus reduces the extra storage for tags and improves audit performance. Several schemes exist for combining the s sectors of a block to generate a secure signature tag, e.g., MAC-based, ECC, or RSA schemes [6], [7]. Their scalability, performance, and security are built on collision-resistant hash functions and the random oracle model.
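To make the n×s layout concrete, the sketch below splits a byte string into n blocks of s sectors each, with each sector read as an integer so that it can later enter the homomorphic aggregation; the sector size is an illustrative assumption.

def fragment(data: bytes, s: int, sector_size: int = 32):
    # Split data into n blocks, each holding s sectors of sector_size
    # bytes; sectors are read as integers for later aggregation.
    step = s * sector_size
    blocks = []
    for off in range(0, len(data), step):
        chunk = data[off:off + step].ljust(step, b"\0")  # pad the last block
        blocks.append([
            int.from_bytes(chunk[j * sector_size:(j + 1) * sector_size], "big")
            for j in range(s)])
    return blocks  # n = len(blocks); one tag covers the s sectors of a block

Doubling s halves the number of blocks and hence the number of tags that must be stored, which is the storage/performance tradeoff described above.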
B. Periodic Sampling Audit: To detect anomalies in a timely manner while rationally allocating resources, a probabilistic audit based on sampling checking is preferable, and it still achieves effective detection of misbehavior. The fragment structure supports such a probabilistic audit: given a randomly chosen challenge (or query) Q = {(i, vi)}i∈I, where I is a subset of the block indices and vi is a random coefficient, an efficient algorithm produces a constant-size response (µ1, µ2, ..., µs, σ′), where each µj is computed from all {mi,j, vi}i∈I and σ′ is computed from all {σi, vi}i∈I. This algorithm relies on homomorphic properties to aggregate the data and tags into a constant-size response, which minimizes network communication costs. Since a single sampling check may overlook a small number of data abnormalities, we propose a periodic sampling approach to auditing outsourced data, named Periodic Sampling Audit. Table 1 shows an IHT with random values used by this approach.
Table 1 Index Hash Table
With this approach, the TPA merely needs to access small portions of the file in each audit activity, and the audit activities can be efficiently scheduled within an audit period. This method therefore reduces the number of samples in each audit while still detecting exceptions periodically.
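The aggregation can be sketched with a linearly homomorphic MAC over a prime field, as below. The tag form σi = h(χi) + Σj τj·mi,j (mod p), the modulus, and the function names are illustrative stand-ins for the paper's signature tags; what the sketch preserves is the constant-size response and verification without raw data.

P = 2**127 - 1  # prime modulus (illustrative)

def make_tag(tau, chi_hash, sectors):
    # sigma_i = h(chi_i) + sum_j tau_j * m_{i,j}  (mod p)
    return (chi_hash + sum(t * m for t, m in zip(tau, sectors))) % P

def respond(blocks, tags, Q):
    # CSP side: aggregate the challenged blocks and tags into a
    # constant-size response (mu_1, ..., mu_s, sigma') for Q = {(i, v_i)}.
    s = len(blocks[0])
    mu = [sum(v * blocks[i][j] for i, v in Q) % P for j in range(s)]
    sigma = sum(v * tags[i] for i, v in Q) % P
    return mu, sigma

def check(tau, chi_hashes, Q, mu, sigma):
    # TPA side: verify sigma' == sum_i v_i*h(chi_i) + sum_j tau_j*mu_j
    # (mod p) without ever seeing the raw sectors.
    expected = (sum(v * chi_hashes[i] for i, v in Q)
                + sum(t * m for t, m in zip(tau, mu))) % P
    return expected == sigma

Whatever the size of the challenged subset I, the response is always s + 1 field elements, which is the constant-size property used above.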
C. Index-Hash Table: We introduce a simple IHT to support dynamic data operations, to record the changes of file blocks, and to generate the hash value of each block in the verification process. The structure of our IHT is similar to the file block allocation table in file systems. Generally, the IHT χ consists of a serial number, a block number, a version number, and a random integer (Table 1). To prevent the forgery of data blocks and tags, we must ensure that all records in the IHT differ from one another. In addition to recording data changes, each record χi in the table is used to generate a unique hash value, which in turn is used to construct a signature tag σi under the secret key sk. The relationship between χi and σi must be cryptographically secured, and we make use of it in designing our verification protocol. Because no one can forge a valid χi (held at the TPA) or σi (held at the CSP) without the secret key sk, the IHT provides higher assurance when monitoring the behavior of an untrusted CSP and also provides valuable evidence for computer forensics.
4.1 Implementation of Dynamic Operations
To support dynamic data operations, the TPA must employ an IHT to record the current status of the stored files; some existing index schemes for dynamic scenarios are insecure because replay attacks can exploit repeated hash values. To solve this problem, we use a simple IHT χ = {χi}, as described in Table 1, which includes four columns: No., Bi, Vi, and Ri. No. denotes the serial number i of data block mi, Bi is the original number of the block, Vi is the version number that records the updates of this block, and Ri is a random integer used to avoid collisions. To ensure security, we require that each χi = "Bi||Vi||Ri" be unique in this table. Although the same value of "Bi||Vi" may be produced by repeated insert and delete operations, the random Ri avoids this collision. An alternative method is to generate an updated random value by R′i ← H_{Ri}(Σ_{j=1}^{s} m′i,j), where the initial value is Ri ← H_{ξ(1)}(Σ_{j=1}^{s} mi,j) and mi = {mi,j} denotes the i-th data block. Table 1 also shows an example of how the IHT changes under different operations, where an empty record (i = 0) is used to support operations on the first record.
An "append" operation is realized as an "insert" operation on the last record. It is easy to prove that each χi is unique in χ in our scheme. Based on this construction of IHTs, we propose a simple method to support dynamic data modification. To improve performance, all tags and the IHT should obviously be renewed and reorganized periodically; to improve the efficiency of updating the IHT, the sequential lists can be replaced with dynamically linked lists.
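A minimal sketch of such an IHT with insert, update, delete, and append is given below, assuming SHA-256 as the keyed hash H and a 32-bit Ri; the record layout follows Table 1, while the class and method names are our own.

import hashlib

def H(key: int, value: int) -> int:
    # Keyed hash used to derive the random component R_i.
    h = hashlib.sha256()
    h.update(str(key).encode())
    h.update(str(value).encode())
    return int.from_bytes(h.digest()[:4], "big")

class IHT:
    # Records are (B_i, V_i, R_i); chi_i = B_i||V_i||R_i must stay unique.
    def __init__(self, xi1: int):
        self.xi1 = xi1               # secret xi^(1) used for initial R_i values
        self.records = [(0, 0, 0)]   # empty record (i = 0) for ops on the first record

    def insert(self, pos: int, block_no: int, sectors):
        # R_i <- H_{xi(1)}(sum_j m_{i,j}) for a freshly inserted block.
        self.records.insert(pos, (block_no, 1, H(self.xi1, sum(sectors))))

    def update(self, pos: int, new_sectors):
        # R'_i <- H_{R_i}(sum_j m'_{i,j}); bump the version number V_i.
        b, v, r = self.records[pos]
        self.records[pos] = (b, v + 1, H(r, sum(new_sectors)))

    def delete(self, pos: int):
        del self.records[pos]

    def append(self, block_no: int, sectors):
        # "Append" is an insert at the last position.
        self.insert(len(self.records), block_no, sectors)

Repeated inserts and deletes at the same position can reproduce the same Bi||Vi, but the freshly derived Ri keeps each χi unique, which is the property required above.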
5 CONCLUSIONS
In this paper, we have shown how outsourced data can be checked to ensure the integrity of clients' data. In addition, we presented a construction of dynamic audit services for untrusted and outsourced storage, which supports the dynamic operations performed by authorized users on data stored in the cloud. We also enhance the performance of TPAs and storage service providers by presenting an efficient method for periodic sampling audit. The scheme carries a minimal, constant amount of overhead, which minimizes communication and computation costs.
REFERENCES
[1] E. Mykletun, M. Narasimha, and G. Tsudik (2006), "Authentication and Integrity in Outsourced Databases," ACM Trans. Storage, vol. 2, no. 2, pp. 107-138.
[2] Amazon Web Services (2008), "Amazon S3 Availability Event: July 20, 2008," http://status.aws.amazon.com/s3-20080720.html.
[3] A. Juels and B.S. Kaliski Jr. (2007), "PORs: Proofs of Retrievability for Large Files," Proc. ACM Conf. Computer and Communications Security (CCS '07), pp. 584-597.
[4] M. Mowbray (2009), "The Fog over the Grimpen Mire: Cloud Computing and the Law," Technical Report HPL-2009-99, HP Labs.
[5] A.A. Yavuz and P. Ning (2009), "BAF: An Efficient Publicly Verifiable Secure Audit Logging Scheme for Distributed Systems," Proc. Ann. Computer Security Applications Conf. (ACSAC), pp. 219-228.
[6] C.C. Erway, A. Küpçü, C. Papamanthou, and R. Tamassia (2009), "Dynamic Provable Data Possession," Proc. 16th ACM Conf. Computer and Comm. Security, pp. 213-222.
[7] H. Shacham and B. Waters (2008), "Compact Proofs of Retrievability," Proc. 14th Int'l Conf. Theory and Application of Cryptology and Information Security: Advances in Cryptology (ASIACRYPT '08), J. Pieprzyk, ed., pp. 90-107.
[8] H.-C. Hsiao, Y.-H. Lin, A. Studer, C. Studer, K.-H. Wang, H. Kikuchi, A. Perrig, H.-M. Sun, and B.-Y. Yang (2009), "A Study of User-Friendly Hash Comparison Schemes," Proc. Ann. Computer Security Applications Conf. (ACSAC), pp. 105-114.
[9] A.R. Yumerefendi and J.S. Chase (2007), "Strong Accountability for Network Storage," Proc. Sixth USENIX Conf. File and Storage Technologies (FAST), pp. 77-92.
[10] Y. Zhu, H. Wang, Z. Hu, G.-J. Ahn, H. Hu, and S.S. Yau (2010), "Efficient Provable Data Possession for Hybrid Clouds," Proc. 17th ACM Conf. Computer and Comm. Security, pp. 756-758.
[11] M. Xie, H. Wang, J. Yin, and X. Meng (2007), "Integrity Auditing of Outsourced Data," Proc. 33rd Int'l Conf. Very Large Databases (VLDB), pp. 782-793.
[12] C. Wang, Q. Wang, K. Ren, and W. Lou (2010), "Privacy-Preserving Public Auditing for Data Storage Security in Cloud Computing," Proc. IEEE INFOCOM, pp. 1-9.
[13] Y. Zhu, G.-J. Ahn, H. Hu, S.S. Yau, H.G. An, and C.-J. Hu (2013), "Dynamic Audit Services for Outsourced Storages in Clouds," IEEE Trans. Services Computing.
[14] Sravan Kumar R and Ashutosh Saxena (2011), "Data Integrity Proofs in Cloud Storage."