TSINGHUA SCIENCE AND TECHNOLOGY
ISSN 1007-0214 03/11 pp17-27
Volume 20, Number 1, February 2015
I-sieve: An Inline High Performance Deduplication System Used in
Cloud Storage
Jibin Wang, Zhigang Zhao, Zhaogang Xu, Hu Zhang, Liang Li, and Ying Guo
Abstract: Data deduplication is an emerging and widely employed method in current storage systems. As this
technology is gradually applied to inline scenarios such as virtual machines and cloud storage systems, this
study proposes a novel deduplication architecture called I-sieve. The goal of I-sieve is to realize a high performance
data sieve system based on iSCSI in the cloud storage system. We also design the corresponding index and
mapping tables and present a multi-level cache using a solid state drive to reduce RAM consumption and to optimize
lookup performance. A prototype of I-sieve is implemented based on the open source iSCSI target, and many
experiments have been conducted driven by virtual machine images and testing tools. The evaluation results
show excellent deduplication and foreground performance. More importantly, I-sieve can co-exist with the existing
deduplication systems as long as they support the iSCSI protocol.
Key words: I-sieve; cloud storage; data deduplication
1 Introduction
With the rapid development of information technology,
large amounts of data are stored in different storage
systems. According to EMC reports[1], the amount of data will reach 44 zettabytes by 2020. At the same
time, the total capacity of shipped disk storage systems reached only 7.1 exabytes in 2011, based on IDC
reports[2].
There is a huge gap between data production and
available storage capacity for the foreseeable future,
which means that much of the information will get
discarded. On the other hand, even though disk cost
per GB has seen significant declines in recent years,
the total storage costs have revealed a general trend of
rapid increase[3]. One of the promising technologies for
Jibin Wang, Zhigang Zhao, Zhaogang Xu, Hu Zhang, Liang
Li, and Ying Guo are with the Shandong Computer Science
Center (National Supercomputer Center in Jinan), Shandong
Provincial Key Laboratory of Computer Networks, Jinan
250101, China. E-mail: {wangjb, zhaozhg, xuzhaogang, zhanghu, liliang, guoying}@sdas.org.
To whom correspondence should be addressed.
Manuscript received: 2014-11-15; revised: 2014-12-15;
accepted: 2015-01-07
improving storage efficiency is deduplication, which is
considered to be one of the best optimizations to reduce
storage costs. This technology can effectively eliminate
redundant data blocks by using pointers to reference
an already stored identical data block in the storage
system. Deduplication can work with many data types, including block[4–7] and sub-file[8–11] (also
known as super-block or fixed-size container file), whole-file[12], and hybrid-file[13, 14].
Historically, data deduplication technology was applied only to backup systems[15–17]. This is because
backup applications consist primarily of duplicate
data, especially for cyclical full backup applications.
Another reason is that most backup systems do
not need a high foreground performance. The main
indicators to evaluate backup system performance
are data throughput and deduplication ratio. Similar to optimizations in cloud applications[18, 19] or cloud
services[20, 21], there are many promising studies on performance optimization resulting from
deduplication. Prior work on deduplication has primarily focused on optimizing current backup
systems, including chunking algorithms[5, 22–24] and performance optimizations[7, 16]. Many of the
solutions have been applied in the current commercial
systems as the secondary storage.
With the widespread use of primary storage systems
for storing large amounts of data, such as online video,
user directories, and VM images, there has been more research[6, 7, 22, 25] focusing on inline
deduplication for primary workloads rather than only backup environments, especially with cloud
storage[26–28]. The challenge
with inline deduplication is to determine mechanisms
to decrease the high latency introduced due to
deduplication operations, including hash table indexing
and metadata management. Ideally, an in-memory hash
table can decrease latency effectively; however, the
reality is that only a small subset of related tables can
be resident in the limited memory due to the large scale
of the current deduplication systems; as a result, most
of them are migrated to disk.
Another popular research area is to determine how
to speed-up the table lookup process and reduce the
number of chunk hashes in RAM. There are three types
of optimizations to reduce the size of the hash table in RAM: reducing the hash length, using a large
chunk size, and using a sampling hash. It is simple to compute a
hash using a shorter fingerprint (e.g., an MD5 digest truncated to 32 bits or a SHA-1 digest truncated
to 40 bits); however, challenges arise as a result. Specifically, as the number of chunks
increases, more hash collisions can occur for different
chunk blocks, which can lead to data inconsistency in
the storage system. A large average chunk size in the chunking algorithm is an efficient way to
cut down the total number of chunks tracked in RAM. However, this strategy can reduce the
deduplication ratio, since the average chunk size in current file systems is still
small[29]. Separate from the above two methods, the
sampling hash is widely used in many deduplication
systems, especially for large scale storage systems.
It is also relatively suitable for combining with hash
and cache structures to improve index performance;
therefore, we also utilize the sampling hash in our I-sieve design.
Given the significant difference in performance
between RAM and disk, fast storage devices such as flash memory (typically in the form of a Solid
State Drive, SSD) can be a feasible solution in deduplication systems. Such a device can play the
role of a buffer cache, effectively narrowing the gap between memory and disk. In ChunkStash[7], all
metadata is stored
in SSD in order to improve indexing performance;
this optimization only shows marginal loss in the
deduplication quality. Compared to increasing memory
capacity, using flash memory SSD is a more effective
way to improve lookup performance.
This study describes the design, implementation, and
evaluation of our deduplication system (called I-sieve)
based on the open source code iSCSI. We also propose
a series of optimizations to improve inline performance
for primary workloads. Our evaluation shows data savings of up to 82.96% in office environments,
together with an improvement of about 91% in user response time for common I/O applications. In
summary, our key contributions primarily include the
following.
- Analysis of the features of current deduplication systems, together with a detailed summary of their challenges.
- An inline deduplication system (I-sieve) for primary workloads, which serves as a module embedded in the cloud storage system.
- Design of novel index tables aimed at block-level deduplication, and a new multi-cache structure using SSD to boost foreground performance.
- Implementation of an I-sieve prototype in the Linux operating system based on an open source iSCSI target.
- Evaluation of I-sieve performance by simulating real-world applications such as office and VM environments.
The remainder of this paper is organized as follows.
In Section 2, we provide the background description
and motivation for this paper. In Section 3, we present
the design of the I-sieve, including the indexing table
and the multi-level cache. The evaluation method,
experimental settings, and workloads are presented in
Section 4 followed by numerical results and analysis in
Section 5. In Section 6, we give an overview of related
work, and in Section 7, we conclude the paper.
2 Background and Motivation
Before discussing our deduplication system, we first
introduce the traditional deduplication process in
current storage systems, and subsequently analyze its
strengths and weaknesses.
2.1 Traditional deduplication system
As mentioned above, data deduplication is one of the
popular data reduction technologies, which has been
applied in many backup storage systems[8–11]. In Fig. 1, we summarize the general deduplication
process. There
Fig. 1 I/O request process in traditional deduplication systems.
are two major I/O operations for any storage system,
read and write. As shown in Fig. 1a, when the system
deals with write I/O requests from the frontend, data
deduplication operations happen on the write path of the
file system, and the following four steps are performed.
(1) File chunking. In this step, all files are
divided into different types of blocks by the chunking
algorithms (e.g., fixed-size blocks[15] and variable-size blocks[5, 24, 30]). The system then
computes a fingerprint of each block using a hash function such as SHA-1[31] or MD5[32]. Subsequently,
all processed blocks with corresponding fingerprints are
handled by the next deduplication module.
(2) Deduplication checking. In this stage, the fingerprint of each split block is checked against the
fingerprint table. If the block has a duplicate in the
table, then this block is not written to disk; instead only
the metadata is recorded (similar to the organization of
super blocks or block groups in the EXT file system).
However, if the block does not have a duplicate in the table, then the fingerprint of this block is
inserted into the fingerprint table as a unique block in the system. This
step involves an extra fingerprint indexing process to be
performed to check for duplicate data when compared
with the traditional I/O process.
(3) Metadata updating. As the block indexing process
completes, the corresponding metadata of the file is
updated, including the mapping of blocks and files
based on pointers. Subsequently, the metadata is stored
in designated zones on the disk by periodic destage
operations.
(4) Data blocks storage. The final step in the write
path is to store the unique data blocks. The storage
elements in most current storage systems are organized into containers[7] or bins[10], which contain a
specified
number of data blocks on the disk. When the container
or bin reaches the target size, it is sealed and written
from RAM memory to disk. In addition, some extra
block metadata is recorded to identify data blocks
belonging to a container or file.
Figure 1b shows a data view of the traditional
deduplication system after storing two files, FileA and
FileB. As seen in the figure, blocks a and b are
the deduplicated data blocks in those two files. Only
a single copy is stored on the disk and appropriate
metadata is also created during this process. As seen
in Fig. 1c, the process of handling read I/O for FileA
is the inverse of the write I/O process. First, the
system reads the corresponding metadata information,
and subsequently obtains all blocks associated with
FileA. Next, the system reads the relevant blocks from
disk. Since the requested blocks may exist in various
containers or bins on the disk, this step can be a time
consuming process requiring more than one I/O request.
Finally, the data blocks are returned back to the system.
Based on the processing flow, we can see that the traditional deduplication module is tightly bound
to the file system along the entire read and write path, either embedded in a modified file system
such as liveDFS[33] or built into a dedicated file system like Venti[15] or HydraFS[34].
In other words, current deduplication systems are co-evolved with their applications, and are not
suitable for deployment as an independent module. In addition, the complex metadata management model
also restricts performance, especially for the mappings in the deduplication operation work path,
even though some optimizations[6, 7, 16, 22] improve deduplication performance to some extent.
2.2 Classifying deduplication systems
Before designing a deduplication system, we first describe how deduplication systems are classified,
so that the design can be matched to the required applications. Currently, there are many ways to
classify deduplication
systems.
Primary versus secondary deduplication systems.
This classification is based on the workloads served.
Primary deduplication systems are designed for
improving performance, in which workloads are
sensitive to I/O latency. Secondary deduplication
systems are primarily used for secondary storage
environments, such as backup applications, which
require high data throughput. As mentioned above, data
deduplication operations are time-consuming, which is
why primary deduplication systems are rarely used in
actual production environments. However, for small-scale deployments, we believe that current
primary storage systems with deduplication are a feasible scheme, since significant duplicate data
exists in primary storage workloads as well.
Post-processing versus inline deduplication. Post-
processing deduplication is an out-of-band approach
where data is not deduplicated until after the backup
has completed. In other words, with post-processing
deduplication, new data is first stored on the storage
device and then processed at a later time for duplication.
On the other hand, with inline deduplication, chunk hashes are computed on the target device as data
enters the storage device in real time. In this paper, we
primarily focus on the inline deduplication system for
its high space utilization and real-time characteristics.
We summarize the current deduplication studies, and
list the following observations that drive us to build a
new deduplication system.
- More duplicate data exists in many typical environments. According to the conclusions in Ref. [29], there exists approximately 50% duplicate data in primary storage systems and as much as 72% in backup datasets; all the analyzed data comes from different departments of Microsoft. Some cloud-based applications, such as live virtual machine environments, can achieve at least 40% data reduction[33]. The data storage characteristics of those scenarios make it possible to build a small-scale, high performance deduplication system (e.g., private-cloud storage).
- Complex data management in the work path. The operations in a traditional deduplication system consist of read, write, and delete operations, and every operation path updates a great deal of information, including the mapping table, block metadata, fingerprint table, and so on. All these work paths leave room for optimization to boost overall performance.
- Poor performance in inline deduplication systems. Most current deduplication systems are used in backup scenarios. However, for real-time applications, these systems are not suitable due to their poor read or write performance.
Given this, we propose an inline, block level
primary deduplication system with high performance
called I-sieve. The goal of I-sieve is to build a
small-scale storage system for use in an office or
private-cloud environments. In order to address the
challenge of low performance in inline deduplication
systems, a lightweight indexing table and a two-level
cache structure are used to improve deduplication
performance in I-sieve. The evaluations show that I-sieve achieves an excellent tradeoff between
foreground performance and deduplication ratio.
3 Structure of I-sieve and the Indexing
Table Design
In this section, we first review the architecture of I-sieve. Next, we present an efficient fingerprint indexing
algorithm. Our I-sieve system always assumes that the
total size of the storage system is 10 TB; all designs
of the mapping table and cache are based on this
assumption.
3.1 Structure of I-sieve
As shown in Fig. 2, I-sieve is a backend storage module,
and all the foreground clients access the storage service
using the iSCSI protocol. Thus, I-sieve can be perceived
as a data sieve, which eliminates much of the data
redundancy seen in file systems. I-sieve is implemented as a module embedded into iSCSI; therefore,
it can be incorporated with many storage applications, and also be easily deployed.
As seen in Fig. 2, I-sieve consists of three key functional components: a Deduplication engine, a
Multi-cache, and a Snapshot module.
The Deduplication engine provides block-level
data deduplication, including hash table (also called
fingerprint table) management and an I/O remapping
module. All the foreground write requests are handled
within the deduplication engine, and then transferred to
disk request queues.
The Multi-cache module primarily handles cache
requests, and also acts as a bridge that connects
memory, SSD, and disk devices. The Snapshot module is responsible for the reliability of the
deduplicated data blocks; it does so by performing periodic snapshot operations.
Fig. 2 Architectural overview of I-sieve.
From the view of each storage application, I-
sieve plays the role of a bridge between file system
and storage devices. All frontend I/O requests to
storage devices are redirected to the deduplication
engine first. As shown in Fig. 2, the deduplication
engine is below the file system layer; consequently,
the only work of this engine is to handle file I/O
requests. Unlike other deduplication approaches, I-
sieve has simpler data processing logic; thus, its
operation work path is relatively shorter, which makes
the deduplication process more efficient. This is
because the deduplicated data blocks only need some
remapping operations without needing more I/O request
information, which is the biggest difference between
block-level deduplication and other methods.
3.2 Design of the index tables
As described in Fig. 2, the index tables, including the
fingerprint table and other mapping tables, are required
in every deduplication operation. The efficient design
of the index table can result in great performance
improvements, especially for the inline primary
deduplication system.
There are two challenges to be addressed in the design of the fingerprint index tables in I-sieve:
low memory consumption and efficient remapping logic.
Generally, the traditional fingerprint table uses the full
hash index for data deduplication, however, this can
result in memory exhaustion due to large data volumes.
For example, if the deduplication system has 10 TB capacity and uses a 4 KB block size for chunking,
then there are about 2.68 × 10^9 unique blocks within the system. Assuming that each hash entry in
the index table takes 30 bytes, the total memory consumption is about 75 GB just for the fingerprint
table. Therefore, it is not a wise decision to hold the entire hash table in RAM.
On the other hand, there is also a need for the
remapping logic to redirect all frontend LBAs to their
real positions on the disks, since blocks with different
LBAs may have the same data content. In our index table design, we use the SHA-1 hash[31], for its
collision-resistance properties, to compute the fingerprint. Each fingerprint entry takes 20 bytes,
corresponding to the 160-bit output of this hash function.
As shown in Fig. 3, we resolve the above issues with three key tables: the LBA Remapping Table
(LRT), the Multi-index Table (MT), and the Hash node Table (HT). We
provide more details below.
LRT is responsible for remapping the frontend I/O
requests to the fingerprint index table. The first item
of each entry in LRT records the LBA from the top
of the file system, and the second item stores pointer
information of the fingerprint table.
Considering the size of the fingerprint table, we
present a novel structure, called MT, to reduce the table
size. Instead of storing the full index table in RAM,
MT is divided into many cells, called buckets, based
on the first three 8 bits of the full fingerprint of the
block. Every bucket has 256 entries with the same hash
prefix. As MT increases, all identical prefix buckets are
organized with linked lists level by level. In order to
save space with the HT, we use the remaining 136 bits
as the key in HT in the last level. The design motivation
of MT has another consideration, which also facilitates
the cache of buckets using the high performance of
SSD.
3.3 Performance optimization using Bloom filter
As can be seen from Fig. 3, even though the multi-index
table can reduce the scale of the index table in memory,
Fig. 3 Index table structures and the mapping process in I-sieve.
there are still a large number of queries to check for duplicate fingerprints. This lookup process
can be time-consuming, even with an SSD. Thus, we use a high-efficiency data structure, the bloom
filter[35], to optimize the indexing process. The bloom filter works as follows.
An empty bloom filter is a bit array of m bits, all set
to 0. There must also be k independent hash functions
defined, each of which maps or hashes one element to
a position in the m-bit array with a uniform random
distribution. When an element is added, it is fed to each
of the k hash functions to get k array positions. Then the bits at all these positions are set to 1. In order to
query for an element (test whether it is in the array),
a check is performed to see if any of the bits at these
positions are 0, and if so, the element is not in the array.
If, however, all the bits are 1, then either the element is
in the array, or the bits are set to 1 during the insertion
of other elements, which results in a false positive. The
idea is to allocate a vector to test whether a data block
exists in the system; bloom filters provide a fast and
space-efficient probabilistic data structure used in many
deduplication systems[16, 36]. However, false positive retrieval results are possible in bloom
filters, which can be represented by the following expression,

p_{\mathrm{false\,positive}} = \left(1 - \left(1 - \frac{1}{m}\right)^{kn}\right)^{k} \approx \left(1 - e^{-kn/m}\right)^{k},
where k denotes the number of the independent hashes,
m and n represent the table size (array size as mentioned
above) and the number of blocks in the storage system
respectively. In order to reduce the possibility of false
positives in the duplicate checking process, we select
10 hash functions in I-sieve, which results in a false
positive rate of only about 0.098%. Assuming that the total number of blocks n is 2.7 × 10^9 for a
10 TB storage system, the memory consumption is about 4.5 GB, which is within the acceptable range.
3.4 Multi-level cache of I-sieve
Besides the optimization of index performance, the
organization of the index tables and data blocks is
important for the overall system performance as well.
Therefore, we introduce a flash-based storage device
SSD acting as a fast cache between RAM memory and
disk.
Figure 4 shows the description of our multi-level
cache structure. I-sieve uses a Cache Mapping Table
(CMT) to manage different metadata and data caches in
RAM and SSD. For the metadata, I-sieve uses two-level mapping, RAM-SSD, which means that the SSD
stores all the metadata. Some of the frequently used metadata is migrated from SSD or disk to RAM in
order to boost performance. Specifically, the first three levels of buckets are kept resident in
RAM, taking up only about 64 MB. Due to the scale of each HT shown in Fig. 4, I-sieve selects some
hot HTs, based on request count, to store in RAM; the rest are stored on SSD. For
the cached data blocks, there are three tiers of mapping
in I-sieve, RAM, SSD, and Disk. All deduplicated data
blocks are eventually stored on disks; however, newly added blocks are written to temporary SSD
locations and then imported to the disks during a particular time window.
Fig. 4 Structure of the multi-cache in I-sieve.
The system also caches hot spot blocks in RAM in order
to improve read performance; the selection of blocks is
based on the value of the reference count described in
HT.
4 Evaluations
For our evaluations of I-sieve, we primarily use the
deduplication ratio and the user response time as the
key performance indicators. The experimental settings
for each test are the same, and each test is run three
times to eliminate any discrepancies.
4.1 Experimental settings
The testbed details of the server and client are listed in
Table 1, and they are connected via a gigabit switch.
In addition, a Microsoft iSCSI software initiator is
installed on the client and the corresponding iSCSI
target with the I-sieve module is deployed on the storage
server, on which a commercial cloud storage system is
deployed.
4.2 Implementation and workloads
We implement the I-sieve prototype based on the iSCSI code[37] with about 2000 lines of C code. The I-sieve
prototype is released separately as a module; thus, it can
be easily deployed in most application environments.
The workloads that drive our experiments have two
parts: one is a live Virtual Machine (VM) image, and the other is the IOmeter tool. To simulate a
real work environment, we set up two Windows systems in virtual machines together with some
frequently used applications (e.g., MS Office, application development environments,
etc.). The details of the experimental workloads are
listed in Table 2. Note that all VMs in our experiments
are set to 10 GB in size.
We also evaluate foreground performance using the
Table 1 Hardware details of the storage server and the client. Note that all hardware on both the
client and the server is the same except for the OS and memory.
OS (server): Fedora 14 (kernel 2.6.35)
OS (client): Windows 7 Ultimate with SP1 (x86)
Mainboard: Supermicro X8DT6
Disks: 2 × Seagate ST31000524AS, 1 TB
SSD (server): 1 × Intel SSD 520 Series, 128 GB
CPU: Intel(R) Xeon(R) CPU 5650
NIC: Intel 82574L
Memory (server): 32 GB DDR3
Memory (client): 8 GB DDR3
Table 2 Operating systems and typical applications used in the evaluation (VM descriptions and
abbreviations). Note that the system setup images and software are obtained from official websites
or mirror sites. All install processes use the default settings in VMs.

Operating system    None    Office 2003    Office 2007    VS 2005    VS 2008
Windows XP SP2      SP2     SP2+O3         SP2+O7         SP2+V5     SP2+V8
Windows XP SP3      SP3     SP3+O3         SP3+O7         SP3+V5     SP3+V8
IOmeter tool. The request size generated by IOmeter is 4 KB because the average file size in the
file system is 4 KB[29]. Since some studies[29, 33] have confirmed that fixed-size block chunking is
suitable for VM applications, we adopt the fixed-size chunking algorithm in the I-sieve
deduplication engine.
Unless otherwise noted, the workloads used in
evaluations run completely on the client, and we
observe and analyze experimental results based on the original records from the storage server.
5 Results Analysis
We first validate whether I-sieve can eliminate data
redundancy significantly in virtual machines.
Fig. 5 Performance of the data deduplication in I-sieve. Note that all live virtual machine images
are stored to the storage pool in order.
Figure 5a shows the actual data capacity growing with the deployment order. I-sieve achieves 82%
space savings for
the Windows XP SP2 virtual image. Excluding the zero-filled blocks of the virtual machine image[38],
I-sieve still achieves a 23% deduplication ratio for the Windows
XP SP2 system. Similar results are also seen with other
virtual machine images. In addition, we observe some
interesting phenomena.
First, different applications within the same operating system have limited data duplication, such
as SP2+O3 and SP2+V5. Second, most of the data redundancy exists across different service packs of
the same application; for example, SP3+O3 achieves at least 86.54% data reduction compared to
SP2+O3. Third, the variation in deduplication ratio shown in Fig. 5b clearly reflects how Windows
systems and their application versions evolve across iterations, which also points to sensible VM
deployment strategies.
The second part of our evaluation is to test I-sieve
foreground performance to confirm the feasibility of
inline deduplication. We simulate common I/O features using IOmeter, and compare three different
models: the traditional iSCSI model (RAW for short), I-sieve deduplication with the cache module
(DC), and I-sieve deduplication with the cache and bloom filter modules (DCB).
As shown in Fig. 6, I-sieve outperforms RAW in all six cases; in particular, for the random write
application, the response time of DC is only 0.7649 ms, an approximately 91% improvement over RAW.
This is because all I/O requests to I-sieve are cached by our multi-cache module, for both
sequential and random I/Os. RAW only optimizes sequential operations: small sequential I/Os are
combined into a large sequential data block that is handed to the underlying disks to improve
performance.
Fig. 6 Evaluations of the foreground performance in I-sieve. All results are obtained using the
IOmeter tool installed on the client. The labels R, W, Seq, and Rnd in the legend represent read
operation, write operation, sequential, and random I/O modes, respectively.
From Fig. 6, one can see that I-sieve masks the I/O features of the six types of applications, and
its foreground performance closely approximates that of RAW. The reason is that I-sieve sits below
the file system; consequently, it does not distinguish among I/O requests, such as read versus
write, and all I/O requests go through the complete deduplication process. As can also be seen from
the figure, I-sieve with the bloom filter module (DCB) shows slightly better performance than DC,
and the results demonstrate the impact of the bloom filter on the speed of block retrieval.
6 Related Works
Deduplication technology has been widely used in most inline storage systems, including primary[6]
and secondary storage[25]. There are also differences in deduplication granularity. In file-level
deduplication[12], the deduplication ratio is not significant[13, 29] when compared with
block-level[4–7] or sub-file-level[8–11] deduplication. Therefore, block chunking algorithms, both
fixed-size and variable-size, are often used to realize block-level deduplication. Some applications
(e.g., office and virtual machine environments) are well suited for fixed-size deduplication[29, 33].
To improve foreground performance for inline deduplication, iDedup[6], a latency-aware inline data
deduplication system for primary storage, describes a novel mechanism to reduce latency. It takes
advantage
of the spatial locality and temporal nature of the primary
workloads in order to avoid extra disk I/Os and seeks.
iDedup is complementary to I-sieve, and together they
can further shorten response time.
Besides workload-based optimizations, ChunkStash[7] uses a three-tier deduplication architecture by
adding a fast device, SSD. By utilizing SSD, the penalty
of index lookup misses in RAM is reduced by orders of
magnitude because such lookups are served from a fast,
flash-based index. In the same spirit, we also employ
SSD as a strategy for improving performance in I-sieve.
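A minimal sketch of this tiered lookup path follows, with Python dicts standing in for the in-RAM table and the flash-resident key-value store (the class and attribute names are illustrative, not ChunkStash's or I-sieve's actual structures).

```python
class TieredIndex:
    """Sketch of a RAM-then-SSD fingerprint index lookup path.

    The SSD tier is simulated here with a dict standing in for a
    flash-resident key-value store.
    """

    def __init__(self):
        self.ram = {}   # hot fingerprints -> block address
        self.ssd = {}   # flash-resident index (simulated)

    def lookup(self, fp):
        # 1. Cheap in-memory hit.
        if fp in self.ram:
            return self.ram[fp]
        # 2. A RAM miss is served from flash rather than spinning disk,
        #    which is where the orders-of-magnitude saving comes from.
        addr = self.ssd.get(fp)
        if addr is not None:
            self.ram[fp] = addr  # promote the entry back into RAM
        return addr

    def insert(self, fp, addr):
        self.ram[fp] = addr
        self.ssd[fp] = addr
```

Only a lookup that misses both tiers would ever touch the full on-disk index; keeping the flash tier between RAM and disk is what bounds the miss penalty.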
In contrast to current optimizations for inline deduplication systems, I-sieve improves performance through a shorter deduplication operation path and a multi-level cache structure. I-sieve can serve as an I/O filter between file I/O and disk I/O; it introduces no additional data-processing structures, such as the "container" or "bin" structures used in other approaches.
7 Conclusions
In this paper, we propose I-sieve, a high performance inline deduplication system for use in cloud storage. We design novel index tables to support the I-sieve architecture, which serves as a bridge between frontend and backend systems. We also implement a prototype of I-sieve based on iSCSI and evaluate it with virtual machines and the IOmeter tool. We present our detailed test results and demonstrate that I-sieve has excellent foreground performance compared with traditional iSCSI applications. In addition, I-sieve is also suitable for deduplication in small storage environments, especially with virtual machine applications. Finally, I-sieve can co-exist with existing deduplication systems as long as they support the iSCSI protocol.
Acknowledgements
We thank Wei Huang for programming the I-sieve prototype. We also thank the anonymous reviewers for their valuable insights, which have greatly improved the quality of the paper. This work was supported by the Young Scholars Program of the Shandong Academy of Science (No. 2014QN013) and the National High-Tech Research and Development (863) Program of China (No. 2012AA011202).
References
[1] EMC Corporation, The emc digital universe study,
Technical Report, 2014.
[2] International Data Corporation, The 2011 digital universe
study, Technical Report, 2011.
[3] D. R. Merrill, Storage economics: Four principles for
reducing total cost of ownership, http://www.hds.com/
assets/pdf/four-principles-for-reducing-total-cost-of-
ownership.pdf, 2011.
[4] A. T. Clements, I. Ahmad, M. Vilayannur, and J. Li,
Decentralized deduplication in san cluster file systems, in
Proceedings of the 2009 Conference on USENIX Annual
Technical Conference, 2009.
[5] E. Kruus, C. Ungureanu, and C. Dubnicki, Bimodal
content defined chunking for backup streams, in
Proceedings of the 8th USENIX Conference on File and
Storage Technologies, 2010.
[6] K. Srinivasan, T. Bisson, G. Goodson, and K. Voruganti,
idedup: Latency-aware, inline data deduplication for
primary storage, in Proceedings of the 10th USENIX
Conference on File and Storage Technologies, 2012.
[7] B. Debnath, S. Sengupta, and J. Li, Chunkstash: Speeding
up inline storage deduplication using flash memory, in
Proceedings of the Annual Conference on USENIX Annual
Technical Conference, 2010.
[8] C. Dubnicki, L. Gryz, L. Heldt, M. Kaczmarczyk, W.
Kilian, P. Strzelczak, J. Szczepkowski, C. Ungureanu, and
M. Welnicki, Hydrastor: A scalable secondary storage, in
Proccedings of the 7th Conference on File and Storage
Technologies, 2009, pp. 197–210.
[9] W. Dong, F. Douglis, K. Li, H. Patterson, S. Reddy,
and P. Shilane, Tradeoffs in scalable data routing for
deduplication clusters, in Proceedings of the 9th USENIX
Conference on File and Storage Technologies, 2011.
[10] D. Bhagwat, K. Eshghi, D. D. E. Long, and M. Lillibridge,
Extreme binning: Scalable, parallel deduplication
for chunk-based file backup, in Proceedings of the
IEEE International Symposium on Modeling, Analysis
Simulation of Computer and Telecommunication Systems,
2009, pp. 1–9.
[11] C. Ungureanu, B. Atkin, A. Aranya, S. Gokhale, S. Rago,
G. Calkowski, C. Dubnicki, and A. Bohra, Hydrafs: A
high-throughput file system for the hydrastor content-
addressable storage system, in Proceedings of the 8th
USENIX Conference on File and Storage Technologies,
2010.
[12] W. J. Bolosky, S. Corbin, D. Goebel, and J. R. Douceur,
Single instance storage in windows 2000, in Proceedings
of the 4th Conference on USENIX Windows Systems
Symposium, 2000.
[13] C. Policroniades and I. Pratt, Alternatives for detecting
redundancy in storage systems data, in Proceedings of
the Annual Conference on USENIX Annual Technical
Conference, 2004.
[14] C. Liu, Y. Lu, C. Shi, G. Lu, D.-C. Du, and D.-S. Wang,
Admad: Application-driven metadata aware deduplication
archival storage system, in Proceedings of the Fifth IEEE
International Workshop on Storage Network Architecture
and Parallel I/Os, 2008, pp. 29–35.
[15] S. Quinlan and S. Dorward, Venti: A new approach to
archival data storage, in Proceedings of the 1st USENIX
Conference on File and Storage Technologies, 2002.
[16] B. Zhu, K. Li, and H. Patterson, Avoiding the disk
bottleneck in the data domain deduplication file system, in
Proceedings of the 6th USENIX Conference on File and
Storage Technologies, 2008, pp. 1–14.
[17] J. A. Paulo and J. Pereira, A survey and classification of
storage deduplication systems, ACM Computing Surveys,
vol. 47, no. 1, pp. 11:1–11:30, 2014.
[18] F. Liu, Y. Sun, B. Li, B. Li, and X. Zhang, Fs2you: Peer-
assisted semipersistent online hosting at a large scale, IEEE
Transactions on Parallel and Distributed Systems, vol. 21,
no. 10, pp. 1442–1457, 2010.
[19] F. Liu, B. Li, B. Li, and H. Jin, Peer-assisted on-demand
streaming: Characterizing demands and optimizing
supplies, IEEE Transactions on Computers, vol. 62, no. 2,
pp. 351–361, 2013.
[20] F. Xu, F. Liu, L. Liu, H. Jin, B. Li, and B. Li, iaware:
Making live migration of virtual machines interference-
aware in the cloud, IEEE Transactions on Computers, vol.
63, no. 12, pp. 3012–3025, 2014.
[21] F. Xu, F. Liu, H. Jin, and A. Vasilakos, Managing
performance overhead of virtual machines in cloud
computing: A survey, state of the art, and future directions,
Proceedings of the IEEE, vol. 102, no. 1, pp. 11–31, 2014.
[22] M. Lillibridge, K. Eshghi, D. Bhagwat, V. Deolalikar,
G. Trezise, and P. Camble, Sparse indexing: Large
scale, inline deduplication using sampling and locality, in
Proccedings of the 7th Conference on File and Storage
Technologies, 2009, pp. 111–123.
[23] A. Muthitacharoen, B. Chen, and D. Mazières, A low-bandwidth network file system, in Proceedings of the 18th
ACM Symposium on Operating Systems Principles, 2001,
pp. 174–187.
[24] D. R. Bobbarjung, S. Jagannathan, and C. Dubnicki,
Improving duplicate elimination in storage systems,
Transactions on Storage, vol. 2, no. 4, pp. 424–448, 2006.
[25] Data Domain, Data Domain appliance series: High-speed
inline deduplication storage, http://www.emc.com/backup-
and-recovery/data-domain/data-domain-deduplication-
storage-systems.htm, 2011.
[26] W. Leesakul, P. Townend, and J. Xu, Dynamic data
deduplication in cloud storage, in Proceedings of the 8th
IEEE International Symposium on Service Oriented System
Engineering, 2014, pp. 320–325.
[27] Y. Fu, H. Jiang, N. Xiao, L. Tian, F. Liu, and L.
Xu, Application-aware local-global source deduplication
for cloud backup services of personal storage, IEEE
Transactions on Parallel and Distributed Systems, vol. 25,
no. 5, pp. 1155–1165, 2014.
[28] P. Puzio, R. Molva, M. Onen, and S. Loureiro, Cloudedup:
Secure deduplication with encrypted data for cloud storage,
in Proceedings of the 5th IEEE International Conference
on Cloud Computing Technology and Science, 2013.
[29] D. T. Meyer and W. J. Bolosky, A study of practical
deduplication, in Proceedings of the 9th USENIX
Conference on File and Storage Technologies, 2011.
[30] K. Eshghi and H. K. Tang, A framework for analysing and
improving content-based chunking algorithms, Tech. Rep.
HPL–2005–30(RI), 2005.
[31] National Institute of Standards and Technology, FIPS PUB
180-1: Secure Hash Standard, Technical Report, 1995.
[32] R. Rivest, The md5 message-digest algorithm,
http://www.ietf.org/rfc/rfc1321.txt, 1992.
[33] C.-H. Ng, M. Ma, T.-Y. Wong, P. P. C. Lee, and J.
C. S. Lui, Live deduplication storage of virtual machine
images in an open-source cloud, in Proceedings of the 12th
International Middleware Conference, 2011, pp. 80–99.
[34] C. Ungureanu, B. Atkin, A. Aranya, S. Gokhale, S. Rago,
G. Calkowski, C. Dubnicki, and A. Bohra, Hydrafs: A
high-throughput file system for the hydrastor content-
addressable storage system, in Proceedings of the 8th
USENIX Conference on File and Storage Technologies,
2010.
[35] B. H. Bloom, Space/time trade-offs in hash coding with
allowable errors, Communications of the ACM, vol. 13, no.
7, pp. 422–426, 1970.
[36] G. Lu, Y. J. Nam, and D.-C. Du, Bloomstore: Bloom-
filter based memory-efficient key-value store for indexing
of data deduplication on flash, in Proceedings of the
28th IEEE Symposium on Mass Storage Systems and
Technologies, 2012.
[37] iSCSI enterprise target, http://sourceforge.net/projects/
iscsitarget/files/, 2010.
[38] K. Jin and E. L. Miller, The effectiveness of deduplication
on virtual machine disk images, in Proceedings of SYSTOR
2009: The Israeli Experimental Systems Conference, 2009.
Jibin Wang received the bachelor degree
in computer science from Chengdu
University of Information Technology,
China, in 2007, the MS and PhD degrees
in computer science from Huazhong
University of Science and Technology,
China, in 2010 and 2013, respectively.
He is currently working at the Shandong
Computer Science Center (National Supercomputer Center in
Jinan). His research interests include computer architecture,
networked storage systems, parallel and distributed systems,
and cloud computing.
Zhigang Zhao received the BS and
MS degrees in management science and
engineering from Shandong Normal
University, China, in 2002 and 2005,
respectively. Presently, he is a research
assistant in the Shandong Computer
Science Center (National Supercomputer
Center in Jinan). His research interests
include cloud computing, service-oriented computing, and big
data.
Zhaogang Xu received the bachelor
degree in network engineering from
Shandong University of Science and
Technology, China, in 2010, and the MS
degree in computer architecture from
Shandong University of Science and
Technology, China, in 2013. Now, he
works in the Shandong Computer Science
Center (National Supercomputer Center in Jinan), Shandong
Provincial Key Laboratory of Computer Networks. His research
interests include cloud computing and big data storage.
Hu Zhang received the bachelor degree
in electronic information science and
technology from Liaocheng University
of Information Technology, China, in
2009, and the MS degree in signal and
information processing from Southwest
Jiaotong University, China, in 2013.
Presently, he is an engineer in the
Shandong Computer Science Center (National Supercomputer
Center in Jinan), Shandong Provincial Key Laboratory of
Computer Networks. His research interests include cloud
computing and big data mining.
Liang Li received the bachelor degree
in computer science from Shandong
University of Science and Technology,
China, in 2003, and the MS degree in
computer science from Shandong University
in 2006. Presently, she is a research assistant
in the Shandong Computer Science Center
(National Supercomputer Center in Jinan),
Shandong Provincial Key Laboratory of Computer Networks.
Her research interests include service-oriented computing and
big data mining.
Ying Guo received the bachelor degree in
information management from Shandong
University of Finance and Economics,
China, in 2000, and the MS degree in
enterprise management from Dongbei
University of Finance and Economics,
China, in 2004. She is an associate
research professor currently working at the
Shandong Computer Science Center. Her
research interests include cloud computing
and software engineering.