Eight Considerations for
Evaluating Disk-Based Backup
          Solutions




Introduction

 The movement from tape-based to disk-based backup is well underway. Disk eliminates the classic
 problems of tape backup: backing up to disk is faster than backing up to tape, and disk-based
 backups are more reliable and more secure than tape-based backups. Unlike tape, disk resides in
 hermetically sealed enclosures mounted in data center racks, so it is not exposed to heat and
 humidity, and it sits protected behind the data center's physical and network security.

 In the past, the movement from tape- to disk-based backup has been less compelling due to the
 expense of storing backup data on disk instead of tape. Despite the disadvantages of tape, tape’s
 cost advantage has allowed it to maintain its presence in the data center, often with straight disk (for
 faster nightly backups) fronting a tape library. With the advent of data deduplication, however, tape’s
 cost advantage has been largely eroded.

 When disk-based backup is used in conjunction with data deduplication, only the unique data is
 stored, and depending upon the type of data deduplication and the specific implementation,
 reduction rates from 10:1 to as much as 50:1 can be realized. As a result, only a fraction of the
 disk is required compared to straight disk without data deduplication.

 Let’s look at a simple example below to see how this works. Assume the backup environment is as
 follows:

     •   5 TB of primary data
     •   Full backups of all data every Friday night (weekly); incremental backups on files and full
         backups on e-mail and databases nightly (Monday-Thursday, or four nights per week)
     •   12 weeks of onsite retention for the full weekly backups; 4 weeks of onsite retention for the
         nightly backups

 To back up this amount of data with straight disk would require the following:

     •   Each nightly backup is about 25% of the full backup (so 25% of 5TB) = 1.25 TB per night;
         four nights per week over 4 weeks = (1.25 TB) x (4 nights) x (4 weeks) = 20 TB
     •   Each weekly full is 5 TB, so over 12 weeks, this would require (5 TB) x (12 weeks) = 60 TB
     •   Total disk needed is 20 TB + 60 TB = 80 TB of usable disk; adding in the disk required for
         RAID brings the total amount of disk needed to about 100 TB.

 Using 100 TB of disk for this type of backup environment would of course be prohibitively expensive
 for most organizations. However, with data deduplication, one can expect to reduce the amount of
 disk needed to about 5% to 10% of the amount of disk needed for straight disk. This means that
 with data deduplication you can perform disk-based backups with only 5 TB – 10 TB of disk, in
 contrast to the 100 TB needed in the example above. And it is this drastic reduction in the amount
 of disk needed that has put disk-based backup on a comparable footing with tape in terms of cost,
 and that has enabled the wave from tape- to disk-based backup that is now underway.
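The arithmetic above can be checked with a short script. All figures are the paper's own assumptions; the ~25% RAID overhead and the 5%-10% deduplicated footprint are the rough rules of thumb quoted in the text.

```python
# Re-checking the worked example above.
FULL_BACKUP_TB = 5.0           # weekly full backup of all primary data
NIGHTLY_FRACTION = 0.25        # each nightly is ~25% of a full
NIGHTS_PER_WEEK = 4            # Monday-Thursday
NIGHTLY_RETENTION_WEEKS = 4
WEEKLY_RETENTION_WEEKS = 12

nightly_tb = FULL_BACKUP_TB * NIGHTLY_FRACTION * NIGHTS_PER_WEEK * NIGHTLY_RETENTION_WEEKS
weekly_tb = FULL_BACKUP_TB * WEEKLY_RETENTION_WEEKS
usable_tb = nightly_tb + weekly_tb        # 20 + 60 = 80 TB usable
raw_tb = usable_tb * 1.25                 # ~25% extra for RAID -> ~100 TB raw

# With deduplication needing only 5%-10% of the straight-disk footprint:
dedup_low_tb, dedup_high_tb = raw_tb * 0.05, raw_tb * 0.10

print(nightly_tb, weekly_tb, usable_tb, raw_tb)   # 20.0 60.0 80.0 100.0
print(dedup_low_tb, dedup_high_tb)                # 5.0 10.0
```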




Considerations When Examining Disk-
Based Backup Approaches

Now that it is economically feasible to move from a tape-based to disk-based backup
approach, a large number of vendors with varying approaches have developed disk-
based backup systems employing data deduplication. This has caused a great amount
of confusion for IT managers looking to adopt a disk-based backup system for their
organization. One cannot assume that all disk-based backup approaches are created
equal – the following areas should be examined closely when evaluating various
disk-based backup systems.

1. Backup Performance - What backup performance will you achieve? Does the
   implementation slow writes to disk by performing compute-intensive processing
   on the way to disk (inline deduplication), or does it write directly to disk
   for the fastest performance and then deduplicate after the backup is complete
   (post-process deduplication)?

2. Restore Performance - What restore speeds can systems achieve? Do they keep a
   full backup ready to be restored? If they only keep deduplicated data what method
   do they use to re-assemble the data and how fast and reliable is it? How quickly can
   they make an offsite tape copy? Do they have to re-assemble data before copying to
   a tape?

3. Deduplication Approach - Does the implementation use block level deduplication
   or zone level deduplication? What are the pros and cons of each? If using block
   level what size block do they use and what deduplication rate do they get? How well
   does the deduplication approach lend itself to scalability?

4. Scalability - As the data grows, can they maintain the backup window by adding
   disk, processor, memory and bandwidth, or do they only add disk, so that a
   growing amount of data is gated by a fixed amount of processor and memory? Is
   there a breaking point or forklift-upgrade point where the system can no
   longer keep up, forcing you to replace the controller server with a faster,
   more powerful one?

5. Support for Heterogeneous Environments - Is the approach truly heterogeneous?
   Can it take in data from multiple backup applications? Can it take in data from
   various utilities including SQL dumps, Oracle RMAN, or UNIX Tar? Can it support
   backups from specialty VMware applications such as Veeam or Quest / Vizioncore?
   Or can it only take in data from its own agents and not outside data?

6. Support for Backup App Features Such as GRT and OST - How solid is the GRT
   (Granular Recovery Technology) implementation, and more importantly, how fast
   is it? How solid is the OST (Symantec OpenStorage) implementation? Can it
   utilize OST's full performance advantages, and does it implement all of the
   features that allow the backup catalog to be updated for offsite data tracking
   and restore?

7. Offsite Data Protection - Can they replicate data offsite for disaster recovery,
   and if so, how effective is their deduplication and how much bandwidth is truly
   required? If something happens to the primary site, how fast and easy is it to
   restore from the offsite system?

8. Total Cost of Ownership - What is the true cost up front and over time?
   Some systems have a forklift-upgrade point: for a while you just add disk
   shelves as your data grows, which appears less expensive, but eventually you
   must replace the entire front-end system at 60% to 70% of your original cost.



ExaGrid – A Disk-Based Backup Approach
ExaGrid started with the premise of building the best disk-based backup solution taking
into account all of the considerations discussed above. While ExaGrid achieves some
of the best deduplication rates in the industry, as you can see from the discussion
above, a disk-based backup solution is not about deduplication alone. It is about
addressing all of the key components related to backup and applying them to a disk-
based backup solution. In the remainder of this paper, we will discuss how ExaGrid
addresses these various aspects of disk-based backup.




1.  Backup Performance: The Fastest Backup Performance with Post-Process Deduplication

ExaGrid employs post-process deduplication that allows the backups to
write directly to disk at disk speeds. This produces a faster backup and
shorter backup window. The rationale here is to defer the compute-
intensive process until after the backup has landed, so as not to impact the
time it takes to perform the backup. Another approach in the market is inline
deduplication, which deduplicates data on the fly, before it lands on the disk.
Because inline deduplication can potentially cause a bottleneck at the point
where data is streaming into the backup appliance, inline deduplication can
result in slower performance and a longer backup window (see figure at
right).

Proponents of inline deduplication often argue that their approach requires
less disk and is therefore less expensive. However, because inline
deduplication must rely on faster and more expensive processors--and more
memory--in order to avoid being prohibitively slow, any cost differences in
the amount of disk used are overcome by the need for more expensive
processors and memory. Post-process deduplication also provides additional
advantages with respect to restores and tape copies, which will be discussed further
below.


2.  Restore Performance: Quick Restores and Offsite Tape Copy 

Full system restores are the most important restores: hundreds to thousands of
employees can be idled at once when a full system is down, and the longer it takes to
recover, the more hours of productivity are lost. However, nearly all disk-based backup
appliances and backup-software-based deduplication implementations use inline
deduplication (discussed above). In addition to the issues mentioned in the previous
section, the inline method requires a given backup to be rehydrated, or put back
together from its deduplicated state, before it can be restored. This takes time – and
time is typically not a luxury when a full system is down!

ExaGrid’s post-process approach, because it allows backups
to land on disk prior to deduplication, is able to keep that
backup on disk in its complete form. As you proceed with
nightly and weekly backups over time, the ExaGrid appliance
maintains your most recent backup on disk in its complete,
undeduplicated form, so that it can be rapidly restored
when or if it is needed. This approach saves
valuable time and productivity in the event of a full system
restore. It is also quite useful with virtual server backups
using server virtualization such as VMware. In this case,
because a complete backup typically consists of one or more
virtual servers in their entirety, ExaGrid’s ability to enable
rapid restores of the most recent backup effectively gives you
the ability to restore multiple virtual servers very quickly.

Finally, ExaGrid’s post-process approach is very useful for
making fast tape copies. Because the ExaGrid appliance
keeps your most recent backup in its complete form, this
same backup can very easily be used to quickly generate a
tape copy. With an inline deduplication approach, a tape
copy would require the backup to be put back together again
(rehydrated) prior to being sent to tape, even if the tape copy were scheduled
mere moments after the backup itself took place. The result using inline
deduplication, then, is a much slower tape copy – and a longer period of time
until the data is fully copied to tape and protected offsite.
                                 




3.  Deduplication Approach: Zone-Level Deduplication

There are several key areas to look at when evaluating a vendor’s deduplication
approach. The first, most basic aspect of deduplication is how well it reduces data.
After all, the whole point of deduplication with respect to disk-based backup is to reduce
the amount of disk needed such that the cost of backing up to disk can remain low. But
there are other key aspects of deduplication that can have a profound impact on the
ability of the solution to support a variety of backup applications, and on how well the
solution is able to scale as a customer’s data grows.

One common method of deduplication is known as “block-level” deduplication. This
method takes a block of bytes and then looks for other blocks that match, storing only
the unique blocks. The key to block-level deduplication is the size of the block. Smaller
block sizes, say around 8 KB, can be more easily matched and hence will result in a
higher deduplication rate than larger block sizes (e.g., 64 KB). Block-level deduplication
when used with smaller block sizes achieves excellent data reduction. This method,
because it is generic in nature, also lends itself well to supporting a variety of different
applications.
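As a minimal sketch of the block-level idea (generic, not any particular vendor's implementation): fixed-size blocks are hashed, only unique blocks are stored, and a per-backup "recipe" of hashes lets the stream be reassembled.

```python
import hashlib

def dedup_store(data: bytes, block_size: int = 8 * 1024):
    """Split data into fixed blocks; store each unique block only once."""
    index = {}            # block hash -> stored block (the "hash table")
    recipe = []           # ordered list of hashes to reassemble the data
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in index:      # only unique blocks consume disk
            index[digest] = block
        recipe.append(digest)
    return index, recipe

def restore(index, recipe) -> bytes:
    """Rehydration: reassemble the original stream from the unique blocks."""
    return b"".join(index[d] for d in recipe)

# A backup stream where the same 8 KB pattern repeats 100 times dedupes to 1 block:
stream = (b"x" * 8192) * 100
index, recipe = dedup_store(stream)
print(len(index), len(recipe))       # 1 100
assert restore(index, recipe) == stream
```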

The problem with block-level deduplication, however, is its lack of scalability. Because
block-level deduplication stores and matches unique blocks, a tracking table (known as
a hash table) is required to manage all of the backup data that is stored. And the
smaller the block size, the larger the hash table that is needed – such that with 8 KB size
blocks, one billion entries are needed to deal with just 10 TB of data! This forces an
appliance architecture consisting of a controller unit with multiple disk shelves – an
inferior configuration that will be discussed further in the section on scalability, below.
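The scale problem is easy to see with a back-of-the-envelope calculation of the arithmetic behind the claim above: one tracking-table entry is needed per block, so entry counts scale inversely with block size.

```python
def entries_needed(data_tb: float, block_kb: float) -> float:
    """Hash-table entries needed to track a given amount of data."""
    return data_tb * 1024**3 / block_kb    # TB expressed in KB, divided by block size

print(f"{entries_needed(10, 8):.2e}")      # 8 KB blocks, 10 TB -> ~1.3e9 entries
print(f"{entries_needed(10, 64):.2e}")     # 64 KB blocks -> 8x fewer entries
```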

ExaGrid utilizes a type of deduplication called “zone-level” deduplication. With zone-
level deduplication, the backup jobs are broken into large 50 MB to 100 MB segments
(instead of blocks). These segments are then broken down into zones, or areas, and
the deduplication algorithm looks for the bytes that have changed from one backup to
the next. Like block-level deduplication, zone-level deduplication achieves excellent
data reduction, and lends itself well to supporting a variety of different applications.

Unlike block-level deduplication, however, the tracking tables required for zone-level
deduplication are much smaller. The tracking tables can therefore be easily copied
across appliances, allowing for a much more scalable grid-based architecture, as
discussed below.
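A highly simplified illustration of the zone-level idea (emphatically not ExaGrid's actual algorithm): the backup stream is cut into large segments, each compared against the corresponding segment of the previous backup, so the tracking table needs only one entry per large segment rather than one per small block.

```python
import hashlib

def segment_digests(stream: bytes, segment_size: int = 50 * 1024 * 1024):
    """One digest per large segment (50 MB default, per the description above)."""
    return [hashlib.sha256(stream[i:i + segment_size]).hexdigest()
            for i in range(0, len(stream), segment_size)]

def changed_segments(prev_digests, curr_digests):
    """Indices of segments that differ from the prior backup and need new storage."""
    return [i for i, d in enumerate(curr_digests)
            if i >= len(prev_digests) or prev_digests[i] != d]

# Tiny segments for illustration: only the third segment changed between backups.
prev = segment_digests(b"A" * 30, segment_size=10)
curr = segment_digests(b"A" * 20 + b"B" * 10, segment_size=10)
print(changed_segments(prev, curr))   # [2]
```

Because the table holds one entry per 50 MB segment rather than per 8 KB block, it is small enough to copy between appliances, which is what makes the grid architecture described below practical.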


4.  Scalability 

Scalability is another important aspect in evaluating a disk-based backup appliance, and
there are a couple of important things to consider when examining how well a given
solution scales. First is the ability of the solution to scale as the amount of backup data
grows, and second is how easily the system can be upgraded when additional
horsepower is needed.


When looking at the various disk-based backup configurations available on the market,
there are two basic alternatives – a controller / disk shelf model, and a grid-based
system. With the controller-shelf model, all of the processing power, memory, and
bandwidth are contained in the controller. Some disk may be contained in the controller
as well, but when there is more data and a need for expansion, additional disk shelves
are added to the controller. This implies a static amount of processing power, memory,
and bandwidth for a given system even as the amount of data is growing, which in turn
results in one or both of the following negative effects: (i) as the amount of backup data
grows with a constant level of processing power, memory, and bandwidth, the backup
starts to take longer and longer; (ii) the amount of processing power, memory, and
bandwidth must be over-provisioned when the system is first acquired to allow for future
data growth, resulting in a more expensive system at the time of purchase. In
addition, each controller can only handle a certain amount of disk, and when the
customer’s data increases beyond that level, the entire system must be swapped out for
a new one in a costly “forklift” upgrade.




ExaGrid instead uses a grid-based configuration, where each appliance contains
processing power, memory, bandwidth, and disk. When the system needs to expand,
additional appliance nodes are attached to the grid, bringing with them additional
processing power, memory, and bandwidth, as well as disk. This type of configuration
allows the system to maintain all the aspects of performance as the amount of data
grows – you are no longer simply adding disk to a static amount of processing power,
memory, and bandwidth – and you are only paying for the amount of processing power,
memory, and bandwidth as you need it, rather than up front. A grid-based approach
also avoids the costly forklift upgrades that come with controller / disk shelf
configurations.

In addition to maintaining backup performance as your data grows and allowing you to
seamlessly upgrade to larger and larger systems, ExaGrid’s grid-based configuration
automatically load-balances available capacity in the grid, maintaining a virtual pool of
storage that is shared across the grid. All of the systems can also be managed through a
single user interface that can access every system on the grid, providing a simple,
single-dashboard view of deduplication and replication status for any system in the
grid.




5. Heterogeneity 

Customer environments are made up of many backup approaches, backup applications
and utilities, and different disk-based backup approaches support these in different
ways. Customers may have any number of backups occurring in their environment,
including traditional backup applications, specialized VMware backup utilities,
direct-to-disk SQL dumps or Oracle RMAN backups, and specific UNIX utilities such as
UNIX TAR.


Backup application software solutions that have incorporated deduplication by definition
only support their own backup application, with its own backup server software and its
own backup client agents. These solutions are not able to support backup data from
other backup applications or utilities.



Disk-based backup appliances with data deduplication such as ExaGrid’s, however, are
able to support backup data from multiple sources, including a variety of backup
applications and utilities. Performing deduplication in the backup software limits the
ability to have all data from all sources stored and deduplicated in a single target device.
Unless 100% of your backup data passes through that particular backup application, a
purpose built disk-based backup appliance such as ExaGrid’s is the best choice to meet
the requirements of your entire environment.




6.  Support for Backup Application Features such as GRT and OST 

Another area to consider when looking at disk-based backup solutions is how well a
particular solution supports advanced backup application features such as GRT
(Granular Recovery Technology) and OST (Symantec OpenStorage). Some solutions do not
integrate well with these features; poorly-implemented GRT support, for example,
may take hours to restore an individual e-mail, or may not work at all.

Symantec’s OpenStorage (OST) is another popular feature that allows for more integrated
offsite data protection, and it is important to check whether it is supported if you are
using Symantec NetBackup or Backup Exec.


7.  Offsite Data Protection 

There are many reasons to keep a complete set of backups offsite at a second location.
This can be accomplished by making offsite tape sets or by replicating data from the
primary site disk-based backup system to a second-site disk-based backup system.
There are many questions to ask when considering offsite data protection in a disk-
based backup system:

First, what is the deduplication rate? As discussed earlier, the deduplication rate is the
determining factor in the amount of data that is reduced and the amount of disk that is
required. But the deduplication rate also impacts the amount of bandwidth needed to
maintain a second site within a given backup window, since only the data that has
changed is sent over the WAN to the offsite system. The poorer the deduplication
rate, the more data must be sent to maintain the offsite backup, and the more bandwidth
is required for a given window. Deduplication rates can vary greatly, particularly
with backup application software deduplication. ExaGrid achieves the
highest deduplication rates and requires the lowest bandwidth.
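As a rough illustration of this bandwidth relationship (the 2% daily change rate and 10-hour replication window below are illustrative assumptions, not figures from the text):

```python
def wan_mbps(primary_tb: float, daily_change_rate: float, window_hours: float) -> float:
    """WAN bandwidth needed to replicate only the changed (deduplicated) data
    within a given window. Poorer dedup -> more changed data -> more bandwidth."""
    changed_bits = primary_tb * 1e12 * 8 * daily_change_rate   # decimal TB -> bits
    return changed_bits / (window_hours * 3600) / 1e6          # bits/s -> Mbit/s

# 5 TB primary, 2% unique new data per day, 10-hour replication window:
print(f"{wan_mbps(5, 0.02, 10):.1f} Mbit/s")   # ~22.2 Mbit/s
```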

Second, does the offsite system keep only deduplicated data or some form of already
rehydrated data in order to offer quick disaster recovery restores? Any system that does
inline deduplication only stores the deduplicated data and therefore results in slower
disaster recovery (DR) times. As mentioned earlier, ExaGrid performs post-process
deduplication, which makes the most recent backup available in its complete form. And
as an ExaGrid appliance performs replication to an offsite ExaGrid appliance, that most
recent backup is also maintained in its complete form on the offsite system as well. The
result is that the data is ready to quickly restore from either the local or the remote
system.

Third, can data from the offsite system be used to restore any lost or corrupted data at
the primary site? ExaGrid holds a patent on this capability: if anything happens to any of
the backup data at the primary site, the offsite system can be used to restore or replace
the lost or corrupted data. This creates an added level of safety.




Fourth, does the system allow you to demonstrate DR restores for an auditor? ExaGrid
has a dedicated function to be able to demonstrate DR restores for an internal or
external audit.




8. Total Cost of Ownership 

Cost needs to be examined both up front and over time. You want a lower price up front,
but over time you also don’t want to have to repurchase any part of the system.
Because ExaGrid performs post process deduplication, an ExaGrid configuration does
not require the same level of processor and memory that the controller / disk shelf
approach uses. This allows ExaGrid to be more cost effective up front. Because
ExaGrid’s zone-level deduplication requires smaller tracking tables and allows the
system to scale via a grid-based configuration, you can add servers into a grid and keep
adding as you grow. There are no points where you must perform a forklift upgrade, so
your investment is protected.

It is also important to look at cost effectiveness when comparing ExaGrid to non-
appliance based deduplication systems, such as deduplication via the backup
application software. On the surface, it would appear that backup application software
deduplication would be fairly low-cost. After all, you can just turn on deduplication from
within the backup server and you’re good to go, right? Not exactly. It is important to
keep in mind that using backup application software deduplication typically requires
greater resources on the backup server – that is, more processing power, more memory,
and more disk. In addition, how do you determine exactly how much of each of these
resources is needed to work optimally in your particular backup environment? And
what do you do as the amount of data you’re backing up grows? To answer these
questions likely means additional costs in the form of professional services. When you
get to the bottom line, then, the additional costs of all these items will typically exceed
what you will be paying for an ExaGrid appliance.

So whether you’re comparing ExaGrid to other appliance solutions, or to non-appliance
solutions such as backup application software deduplication, ExaGrid is the most cost
effective disk-based backup solution, up front and over time.




Conclusion
The movement from tape-based to disk-based backup is indeed underway, whether that
is for disk onsite and tape remaining offsite, or for disk both onsite and offsite. There are
a great variety of approaches in the market today. While deduplication is an important
factor, disk-based backup does not begin and end just with deduplication. It is important
to take into account all of the various factors discussed above when evaluating these
systems.








About ExaGrid

 ExaGrid offers the only disk-based backup appliance with data deduplication purpose-
 built for backup that leverages a unique architecture optimized for performance,
 scalability and price. The product was named “Product of the Year” for Backup and
 Recovery Hardware in 2010 by Storage magazine / SearchStorage.com.

 ExaGrid’s unique combination of post-process deduplication, most recent backup
 cache, and GRID scalability enables IT departments to achieve the shortest backup
 window and the fastest, most reliable restores, tape copy, and disaster recovery
 without performance degradation or forklift upgrades as data grows.

 With offices and distribution worldwide, ExaGrid has more than 3,500 systems installed
 and hundreds of published customer success stories and testimonial videos available
 at www.exagrid.com.




ExaGrid Systems, Inc | 2000 West Park Drive | Westborough, MA 01581 | 1-800-868-6985 | www.exagrid.com

© 2011 ExaGrid Systems, Inc. All rights reserved.
ExaGrid is a registered trademark of ExaGrid Systems, Inc.



TECHNICAL BRIEF▶ Using Virtual Tape Libraries with Backup Exec 2014Symantec
 
Business Continuity Presentation
Business Continuity PresentationBusiness Continuity Presentation
Business Continuity Presentationperry57123
 
Combining IBM Real-time Compression and IBM ProtecTIER Deduplication
Combining IBM Real-time Compression and IBM ProtecTIER DeduplicationCombining IBM Real-time Compression and IBM ProtecTIER Deduplication
Combining IBM Real-time Compression and IBM ProtecTIER DeduplicationIBM India Smarter Computing
 
S de2784 footprint-reduction-edge2015-v2
S de2784 footprint-reduction-edge2015-v2S de2784 footprint-reduction-edge2015-v2
S de2784 footprint-reduction-edge2015-v2Tony Pearson
 
Free Presentation I Safe
Free Presentation   I SafeFree Presentation   I Safe
Free Presentation I Saferusssealey
 
Free Presentation ... I-Safe
Free Presentation ...  I-SafeFree Presentation ...  I-Safe
Free Presentation ... I-Safedaniel_aplin
 
Deduplication Solutions Are Not All Created Equal: Why Data Domain?
Deduplication Solutions Are Not All Created Equal: Why Data Domain?Deduplication Solutions Are Not All Created Equal: Why Data Domain?
Deduplication Solutions Are Not All Created Equal: Why Data Domain?EMC
 
Tape Storage Future Directions and the Data Explosion
Tape Storage Future Directions and the Data ExplosionTape Storage Future Directions and the Data Explosion
Tape Storage Future Directions and the Data ExplosionIBM India Smarter Computing
 
Case Study British Red Cross 161257
Case Study British Red Cross 161257Case Study British Red Cross 161257
Case Study British Red Cross 161257AsigraCloudBackup
 
Business Continuity Presentation[1]
Business Continuity Presentation[1]Business Continuity Presentation[1]
Business Continuity Presentation[1]jrm1224
 

Ähnlich wie 8 considerations for evaluating disk based backup solutions (20)

The economics of backup 5 ways disk backup can help your business
The economics of backup 5 ways disk backup can help your businessThe economics of backup 5 ways disk backup can help your business
The economics of backup 5 ways disk backup can help your business
 
Generic RLM White Paper
Generic RLM White PaperGeneric RLM White Paper
Generic RLM White Paper
 
Streamlining Backup: Enhancing Data Protection with Backup Appliances
Streamlining Backup: Enhancing Data Protection with Backup AppliancesStreamlining Backup: Enhancing Data Protection with Backup Appliances
Streamlining Backup: Enhancing Data Protection with Backup Appliances
 
2010 data protection best practices
2010 data protection best practices2010 data protection best practices
2010 data protection best practices
 
03 Data Recovery - Notes
03 Data Recovery - Notes03 Data Recovery - Notes
03 Data Recovery - Notes
 
Decision Forward Cloud Backup-guide
Decision Forward Cloud Backup-guideDecision Forward Cloud Backup-guide
Decision Forward Cloud Backup-guide
 
Disaster Recovery & Data Backup Strategies
Disaster Recovery & Data Backup StrategiesDisaster Recovery & Data Backup Strategies
Disaster Recovery & Data Backup Strategies
 
Tape and cloud strategies for VM backups
Tape and cloud strategies for VM backupsTape and cloud strategies for VM backups
Tape and cloud strategies for VM backups
 
The Sun ZFS Backup Appliance
The Sun ZFS Backup ApplianceThe Sun ZFS Backup Appliance
The Sun ZFS Backup Appliance
 
Four Assumptions Killing Backup Storage Webinar
Four Assumptions Killing Backup Storage WebinarFour Assumptions Killing Backup Storage Webinar
Four Assumptions Killing Backup Storage Webinar
 
TECHNICAL BRIEF▶ Using Virtual Tape Libraries with Backup Exec 2014
TECHNICAL BRIEF▶ Using Virtual Tape Libraries with Backup Exec 2014TECHNICAL BRIEF▶ Using Virtual Tape Libraries with Backup Exec 2014
TECHNICAL BRIEF▶ Using Virtual Tape Libraries with Backup Exec 2014
 
Business Continuity Presentation
Business Continuity PresentationBusiness Continuity Presentation
Business Continuity Presentation
 
Combining IBM Real-time Compression and IBM ProtecTIER Deduplication
Combining IBM Real-time Compression and IBM ProtecTIER DeduplicationCombining IBM Real-time Compression and IBM ProtecTIER Deduplication
Combining IBM Real-time Compression and IBM ProtecTIER Deduplication
 
S de2784 footprint-reduction-edge2015-v2
S de2784 footprint-reduction-edge2015-v2S de2784 footprint-reduction-edge2015-v2
S de2784 footprint-reduction-edge2015-v2
 
Free Presentation I Safe
Free Presentation   I SafeFree Presentation   I Safe
Free Presentation I Safe
 
Free Presentation ... I-Safe
Free Presentation ...  I-SafeFree Presentation ...  I-Safe
Free Presentation ... I-Safe
 
Deduplication Solutions Are Not All Created Equal: Why Data Domain?
Deduplication Solutions Are Not All Created Equal: Why Data Domain?Deduplication Solutions Are Not All Created Equal: Why Data Domain?
Deduplication Solutions Are Not All Created Equal: Why Data Domain?
 
Tape Storage Future Directions and the Data Explosion
Tape Storage Future Directions and the Data ExplosionTape Storage Future Directions and the Data Explosion
Tape Storage Future Directions and the Data Explosion
 
Case Study British Red Cross 161257
Case Study British Red Cross 161257Case Study British Red Cross 161257
Case Study British Red Cross 161257
 
Business Continuity Presentation[1]
Business Continuity Presentation[1]Business Continuity Presentation[1]
Business Continuity Presentation[1]
 

Mehr von Servium

Choosing the right tool for the job - Ten reasons why workstations trump your PC
Choosing the right tool for the job - Ten reasons why workstations trump your PCChoosing the right tool for the job - Ten reasons why workstations trump your PC
Choosing the right tool for the job - Ten reasons why workstations trump your PCServium
 
Servium Freshen Up with NextGen Proliant
Servium Freshen Up with NextGen ProliantServium Freshen Up with NextGen Proliant
Servium Freshen Up with NextGen ProliantServium
 
Mimecast unified-email-management-datasheet
Mimecast unified-email-management-datasheetMimecast unified-email-management-datasheet
Mimecast unified-email-management-datasheetServium
 
Email continuity-datasheet
Email continuity-datasheetEmail continuity-datasheet
Email continuity-datasheetServium
 
Email archive-datasheet
Email archive-datasheetEmail archive-datasheet
Email archive-datasheetServium
 
Ds security
Ds securityDs security
Ds securityServium
 
Tsg mimecast-service-comparison-datasheet
Tsg mimecast-service-comparison-datasheetTsg mimecast-service-comparison-datasheet
Tsg mimecast-service-comparison-datasheetServium
 
Emerging Tech Showcase Exagrid
Emerging Tech Showcase ExagridEmerging Tech Showcase Exagrid
Emerging Tech Showcase ExagridServium
 
Emerging Tech Showcase Oracle
Emerging Tech Showcase OracleEmerging Tech Showcase Oracle
Emerging Tech Showcase OracleServium
 
Exa grid systems product line
Exa grid systems product lineExa grid systems product line
Exa grid systems product lineServium
 
Eliminate tape everywhere data sheet
Eliminate tape everywhere data sheetEliminate tape everywhere data sheet
Eliminate tape everywhere data sheetServium
 

Mehr von Servium (11)

Choosing the right tool for the job - Ten reasons why workstations trump your PC
Choosing the right tool for the job - Ten reasons why workstations trump your PCChoosing the right tool for the job - Ten reasons why workstations trump your PC
Choosing the right tool for the job - Ten reasons why workstations trump your PC
 
Servium Freshen Up with NextGen Proliant
Servium Freshen Up with NextGen ProliantServium Freshen Up with NextGen Proliant
Servium Freshen Up with NextGen Proliant
 
Mimecast unified-email-management-datasheet
Mimecast unified-email-management-datasheetMimecast unified-email-management-datasheet
Mimecast unified-email-management-datasheet
 
Email continuity-datasheet
Email continuity-datasheetEmail continuity-datasheet
Email continuity-datasheet
 
Email archive-datasheet
Email archive-datasheetEmail archive-datasheet
Email archive-datasheet
 
Ds security
Ds securityDs security
Ds security
 
Tsg mimecast-service-comparison-datasheet
Tsg mimecast-service-comparison-datasheetTsg mimecast-service-comparison-datasheet
Tsg mimecast-service-comparison-datasheet
 
Emerging Tech Showcase Exagrid
Emerging Tech Showcase ExagridEmerging Tech Showcase Exagrid
Emerging Tech Showcase Exagrid
 
Emerging Tech Showcase Oracle
Emerging Tech Showcase OracleEmerging Tech Showcase Oracle
Emerging Tech Showcase Oracle
 
Exa grid systems product line
Exa grid systems product lineExa grid systems product line
Exa grid systems product line
 
Eliminate tape everywhere data sheet
Eliminate tape everywhere data sheetEliminate tape everywhere data sheet
Eliminate tape everywhere data sheet
 

Kürzlich hochgeladen

The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...Aggregage
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7DianaGray10
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding TeamAdam Moalla
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfinfogdgmi
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6DianaGray10
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopBachir Benyammi
 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfDaniel Santiago Silva Capera
 
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?IES VE
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintMahmoud Rabie
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfAijun Zhang
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesDavid Newbury
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfDianaGray10
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaborationbruanjhuli
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Brian Pichman
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Adtran
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...DianaGray10
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAshyamraj55
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024SkyPlanner
 

Kürzlich hochgeladen (20)

The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdf
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6
 
201610817 - edge part1
201610817 - edge part1201610817 - edge part1
201610817 - edge part1
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 Workshop
 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
 
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership Blueprint
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdf
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond Ontologies
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024
 

8 considerations for evaluating disk based backup solutions

    • Each nightly backup is about 25% of a full backup, or 25% of 5 TB = 1.25 TB per night; four nights per week over 4 weeks of retention = (1.25 TB) x (4 nights) x (4 weeks) = 20 TB
    • Each weekly full is 5 TB, so 12 weeks of retention requires (5 TB) x (12 weeks) = 60 TB
    • Total usable disk needed is 20 TB + 60 TB = 80 TB; adding the disk required for RAID brings the total to about 100 TB

Using 100 TB of disk for this type of backup environment would, of course, be prohibitively expensive for most organizations. With data deduplication, however, one can expect to need only about 5% to 10% of the disk required for straight disk. This means that with data deduplication you can perform disk-based backups with only 5 TB to 10 TB of disk, in contrast to the 100 TB calculated above. It is this drastic reduction in the amount of disk required that has put disk-based backup on a comparable cost footing with tape, and that has enabled the movement from tape- to disk-based backup now underway.
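The arithmetic above can be checked with a short script. This is a sketch of the paper's own round numbers: the 25% RAID overhead and the 5%–10% deduplication ratio are the text's illustrative figures, not measured values.

```python
# Sketch of the straight-disk vs. deduplicated capacity math above.
# All figures in TB; RAID overhead and dedup ratios are the round
# numbers used in the text, not measurements.
PRIMARY = 5.0                      # TB of primary data

nightly = 0.25 * PRIMARY           # each nightly is ~25% of a full backup
nightly_total = nightly * 4 * 4    # 4 nights/week, 4 weeks retention
weekly_total = PRIMARY * 12        # 12 weekly fulls retained
usable = nightly_total + weekly_total
raw = usable * 1.25                # RAID overhead brings 80 TB to ~100 TB

dedup_low, dedup_high = raw * 0.05, raw * 0.10  # 5%-10% of straight disk

print(f"usable={usable} TB, raw~{raw} TB, "
      f"with dedup {dedup_low}-{dedup_high} TB")
```

Running the sketch reproduces the figures in the text: 80 TB usable, roughly 100 TB raw, and 5–10 TB once deduplicated.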
Considerations When Examining Disk-Based Backup Approaches

Now that it is economically feasible to move from tape-based to disk-based backup, a large number of vendors have developed disk-based backup systems employing data deduplication, with widely varying approaches. This has caused a great deal of confusion for IT managers looking to adopt a disk-based backup system for their organization. One cannot assume that all disk-based backup approaches are created equal, and the following areas should be examined closely when evaluating disk-based backup systems.

1. Backup Performance - What backup performance will you achieve? Does the implementation slow writes to disk by performing compute-intensive processing on the way to disk (inline deduplication), or does it write directly to disk for the fastest performance and then deduplicate after the backup is complete (post-process deduplication)?

2. Restore Performance - What restore speeds can the system achieve? Does it keep a full backup ready to be restored? If it keeps only deduplicated data, what method does it use to reassemble the data, and how fast and reliable is that method? How quickly can it make an offsite tape copy? Does it have to reassemble data before copying to tape?

3. Deduplication Approach - Does the implementation use block-level deduplication or zone-level deduplication? What are the pros and cons of each? If block-level, what block size is used and what deduplication rate results? How well does the deduplication approach lend itself to scalability?

4. Scalability - As data grows, can the system maintain the backup window by adding disk, processor, memory, and bandwidth, or does it only add disk, so that growing data is gated by a fixed amount of processor and memory? Is there a breaking point, or forklift upgrade, where the system can no longer keep up and the controller server must be replaced with a faster, more powerful one?
5. Support for Heterogeneous Environments - Is the approach truly heterogeneous? Can it take in data from multiple backup applications? Can it take in data from various utilities, including SQL dumps, Oracle RMAN, or UNIX tar? Can it support backups from specialty VMware applications such as Veeam or Quest/Vizioncore? Or can it only take in data from its own agents and not outside data?

6. Support for Backup Application Features Such as Granular Recovery (GRT) and OST - How solid is the GRT (granular recovery) implementation and, more importantly, how fast is it? How solid is the OST (Symantec OpenStorage)
implementation? Can it utilize the full performance advantages, and has the vendor implemented all the features that allow the backup catalog to be updated for offsite data tracking and restore?

7. Offsite Data Protection - Can the system replicate data offsite for disaster recovery? If so, how granular is its deduplication, and how much bandwidth is truly required? If something happens to the primary site, how fast and easy is it to restore from the offsite copy?

8. Total Cost of Ownership - What is the true cost up front and over time? Some systems reach a forklift-upgrade point: for a while you just add disk shelves as your data grows, which appears inexpensive, but eventually you must replace the entire front-end system at 60% to 70% of your original cost.

ExaGrid – A Disk-Based Backup Approach

ExaGrid started with the premise of building the best disk-based backup solution, taking into account all of the considerations discussed above. While ExaGrid achieves some of the best deduplication rates in the industry, as the discussion above shows, a disk-based backup solution is not about deduplication alone. It is about addressing all of the key components related to backup and applying them to a disk-based backup solution. In the remainder of this paper, we will discuss how ExaGrid addresses these various aspects of disk-based backup.

1. Backup Performance: The Fastest Backup Performance with Post-Process Deduplication

ExaGrid employs post-process deduplication, which allows backups to write directly to disk at disk speeds. This produces a faster backup and a shorter backup window. The rationale is to defer the compute-intensive process until after the backup has landed, so as not to impact the time it takes to perform the backup. Another approach in the market is inline deduplication, which deduplicates data on the fly, before it lands on disk.
Because inline deduplication can cause a bottleneck at the point where data streams into the backup appliance, it can result in slower performance and a longer backup window (see figure at right). Proponents of inline deduplication often argue that their approach requires less disk and is therefore less expensive. However, because inline deduplication must rely on faster, more expensive processors and more memory to avoid being prohibitively slow, any savings in the amount of disk used are offset by the need for more expensive
processors and memory. Post-process deduplication also provides additional advantages with respect to restores and tape copies, which are discussed further below.

2. Restore Performance: Quick Restores and Offsite Tape Copy

Full system restores are the most important restores, as hundreds to thousands of employees at a time can be down when a full system is down; the longer it takes to recover, the more hours of productivity are lost. Nearly all disk-based backup appliances and backup-software deduplication implementations, however, use inline deduplication (discussed above). Unfortunately, in addition to the issues mentioned in the previous section, inline deduplication requires a given backup to be rehydrated, or put back together from its deduplicated state, before it can be restored. This takes time, and time is typically not a luxury when a full system is down.

ExaGrid's post-process approach, because it allows backups to land on disk prior to deduplication, is able to keep that backup on disk in its complete form. As nightly and weekly backups proceed over time, the ExaGrid appliance maintains a complete copy of your most recent backup on disk, so that it can be rapidly restored when or if it is needed. This approach saves valuable time and productivity in the event of a full system restore. It is also quite useful with virtual server backups using server virtualization such as VMware: because a complete backup typically consists of one or more virtual servers in their entirety, ExaGrid's ability to rapidly restore the most recent backup effectively gives you the ability to restore multiple virtual servers very quickly.

Finally, ExaGrid's post-process approach is very useful for making fast tape copies. Because the ExaGrid appliance keeps your most recent backup in its complete form, that same backup can easily be used to quickly generate a tape copy.
With an inline deduplication approach, a tape copy requires the backup to be put back together (rehydrated) before being sent to tape, even if the tape copy is scheduled mere moments after the backup itself took place. The result with inline deduplication is a much slower tape copy, and a longer period of time until the data is fully copied to tape and protected offsite.
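The post-process flow described above can be sketched in a few lines. This is a toy model, not ExaGrid's implementation; the class name and the 4 KB block size are illustrative assumptions.

```python
import hashlib

class PostProcessAppliance:
    """Toy model of post-process deduplication: the backup lands on
    disk at full speed, deduplication runs afterwards, and the most
    recent backup is kept whole for fast restores and tape copies."""

    def __init__(self, block_size: int = 4096):
        self.block_size = block_size
        self.landing_zone = b""   # most recent backup, complete form
        self.chunk_store = {}     # hash -> unique block (deduplicated history)

    def backup(self, data: bytes) -> None:
        self.landing_zone = data  # straight write to disk, no inline work
        self._deduplicate(data)   # deferred, off the backup path

    def _deduplicate(self, data: bytes) -> None:
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            self.chunk_store.setdefault(hashlib.sha256(block).digest(), block)

    def restore_latest(self) -> bytes:
        # No rehydration needed: the latest backup is already whole,
        # so a restore or tape copy reads it directly.
        return self.landing_zone
```

An inline appliance, by contrast, would hold only the equivalent of `chunk_store` and would have to reassemble blocks on every restore or tape copy of the latest backup.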
3. Deduplication Approach: Zone-Level Deduplication

There are several key areas to look at when evaluating a vendor's deduplication approach. The first, most basic aspect is how well it reduces data: the whole point of deduplication in disk-based backup is to reduce the amount of disk needed so that the cost of backing up to disk remains low. But other aspects of deduplication can have a profound impact on the solution's ability to support a variety of backup applications, and on how well it scales as a customer's data grows.

One common method is known as "block-level" deduplication. This method takes a block of bytes and then looks for other blocks that match, storing only the unique blocks. The key to block-level deduplication is the block size. Smaller block sizes, say around 8 KB, can be matched more easily and hence yield a higher deduplication rate than larger block sizes (e.g., 64 KB). Block-level deduplication with small block sizes achieves excellent data reduction, and because the method is generic in nature, it also lends itself well to supporting a variety of applications.

The problem with block-level deduplication, however, is its lack of scalability. Because it stores and matches unique blocks, a tracking table (known as a hash table) is required to manage all of the backup data that is stored, and the smaller the block size, the larger the hash table: with 8 KB blocks, over a billion entries are needed to deal with just 10 TB of data. This forces an appliance architecture consisting of a controller unit with multiple disk shelves, an inferior configuration that is discussed further in the section on scalability below.

ExaGrid utilizes a type of deduplication called "zone-level" deduplication.
With zone- level deduplication, the backup jobs are broken into large 50 MB to 100 MB segments (instead of blocks). These segments are then broken down into zones, or areas, and the deduplication algorithm looks for the bytes that have changed from one backup to the next. Like block-level deduplication, zone-level deduplication achieves excellent data reduction, and lends itself well to supporting a variety of different applications. Unlike block-level deduplication, however, the tracking tables required for zone-level deduplication are much smaller. The tracking tables can therefore be easily copied across appliances, allowing for a much more scalable grid-based architecture, as discussed below. 4.  Scalability  Scalability is another important aspect in evaluating a disk-based backup appliance, and there are a couple of important things to consider when examining how well a given solution scales. First is the ability of the solution to scale as the amount of backup data grows, and second is how easily the system can be upgraded when additional horsepower is needed. 6
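The scaling pressure described above comes from the size of the block index. The following is a minimal, illustrative sketch of fixed-block deduplication – not any vendor’s actual implementation (real products use more sophisticated chunking and far more compact indexes) – showing both the mechanism and why small blocks inflate the index:

```python
# Minimal sketch of fixed-size block-level deduplication (illustrative only).
import hashlib

def dedup_store(data: bytes, block_size: int, index: dict) -> int:
    """Split data into fixed-size blocks; store only blocks whose hash is new.
    Returns the number of bytes actually written to the store."""
    written = 0
    for off in range(0, len(data), block_size):
        block = data[off:off + block_size]
        digest = hashlib.sha256(block).digest()
        if digest not in index:        # unseen block -> store it
            index[digest] = block
            written += len(block)
    return written

# Two "backups" that share half of their content.
backup1 = b"A" * 32768 + b"B" * 32768
backup2 = b"A" * 32768 + b"C" * 32768   # first half unchanged

index = {}
print(dedup_store(backup1, 8192, index))  # 16384: only 2 unique 8 KB blocks stored
print(dedup_store(backup2, 8192, index))  # 8192: only the changed half is stored

# Why small blocks strain the index: entries needed to track 10 TB of unique data.
print((10 * 2**40) // (8 * 2**10))   # 8 KB blocks  -> 1,342,177,280 entries
print((10 * 2**40) // (64 * 2**10))  # 64 KB blocks -> 167,772,160 entries
```

The last two lines reproduce the arithmetic behind the “over a billion entries for 10 TB” figure: halving the block size doubles the number of index entries for the same amount of data.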
When looking at the various disk-based backup configurations available on the market, there are two basic alternatives: a controller / disk shelf model, and a grid-based system.

With the controller-shelf model, all of the processing power, memory, and bandwidth are contained in the controller. Some disk may be contained in the controller as well, but when there is more data and a need for expansion, additional disk shelves are added to the controller. This means a static amount of processing power, memory, and bandwidth for a given system even as the amount of data grows, which in turn results in one or both of the following negative effects: (i) as the amount of backup data grows against a constant level of processing power, memory, and bandwidth, the backup takes longer and longer; (ii) the processing power, memory, and bandwidth must be overprovisioned when the system is first acquired to allow for future data growth, resulting in a more expensive system at the time of purchase. In addition, each controller can only handle a certain amount of disk, and when the customer’s data increases beyond that level, the entire system must be swapped out for a new one in a costly “forklift” upgrade.

ExaGrid instead uses a grid-based configuration, where each appliance contains processing power, memory, bandwidth, and disk. When the system needs to expand, additional appliance nodes are attached to the grid, bringing with them additional processing power, memory, and bandwidth, as well as disk. This type of configuration allows the system to maintain every aspect of performance as the amount of data grows – you are no longer simply adding disk to a static amount of processing power, memory, and bandwidth – and you pay for processing power, memory, and bandwidth only as you need them, rather than up front.
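The backup-window effect of the two architectures can be seen with a toy model. All figures (per-node throughput and capacity) are hypothetical, chosen only to illustrate the shape of the curves, not taken from any product datasheet:

```python
# Toy model: backup window vs. data growth for the two architectures.
# Controller/shelf: ingest throughput stays fixed no matter how much disk is added.
# Grid: each appliance node adds capacity AND ingest throughput.

def backup_window_hours(data_tb: float, throughput_tb_per_hour: float) -> float:
    """Hours needed to ingest data_tb at the given aggregate throughput."""
    return data_tb / throughput_tb_per_hour

NODE_THROUGHPUT = 2.0   # TB/hour per appliance node (assumed)
NODE_CAPACITY = 10.0    # TB of backup data per appliance node (assumed)

for data_tb in (10, 20, 40):
    fixed = backup_window_hours(data_tb, NODE_THROUGHPUT)         # throughput never grows
    nodes = -(-int(data_tb) // int(NODE_CAPACITY))                # ceiling division
    grid = backup_window_hours(data_tb, NODE_THROUGHPUT * nodes)  # throughput scales with nodes
    print(f"{data_tb} TB: controller {fixed:.0f} h, grid ({nodes} nodes) {grid:.0f} h")
```

With these assumed numbers the controller’s window stretches from 5 to 20 hours as data quadruples, while the grid’s window stays flat at 5 hours because each added node brings its own ingest throughput.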
A grid-based approach also avoids the costly forklift upgrades that come with controller / disk shelf configurations. In addition to maintaining backup performance as your data grows and allowing you to seamlessly upgrade to larger and larger systems, ExaGrid’s grid-based configuration automatically load-balances available capacity in the grid, maintaining a virtual pool of storage that is shared across the grid. All of the systems can also be managed through a single user interface that is able to access every system on the grid. This provides
a simple, single-dashboard view of deduplication and replication status for any system in the grid.

5. Heterogeneity

Customer environments are made up of many backup approaches, backup applications, and utilities, and different disk-based backup solutions support these in different ways. Customers may have any number of backups occurring in their environment, including traditional backup applications, specialized VMware backup utilities, direct-to-disk SQL dumps or Oracle RMAN backups, and specific UNIX utilities such as UNIX tar.

Backup application software solutions that have incorporated deduplication by definition support only their own backup application, with its own backup server software and its own backup client agents. These solutions are not able to support backup data from other backup applications or utilities. Disk-based backup appliances with data deduplication such as ExaGrid’s, however, are able to support backup data from multiple sources, including a variety of backup applications and utilities. Performing deduplication in the backup software limits the ability to have all data from all sources stored and deduplicated in a single target device. Unless 100% of your backup data passes through that particular backup application, a purpose-built disk-based backup appliance such as ExaGrid’s is the better choice for meeting the requirements of your entire environment.
6. Support for Backup Application Features such as GRT and OST

Another area to consider when looking at disk-based backup solutions is how well a particular solution supports advanced backup application features such as GRT (Granular Recovery Technology) and OST (Symantec OpenStorage). Some solutions do not integrate well with these features – a poorly implemented GRT integration, for example, may take hours to restore an individual e-mail, or may not work at all. OST is another popular feature that allows for more integrated offsite data protection. If you are using Symantec NetBackup or Backup Exec, it is important to check whether these features are supported.

7. Offsite Data Protection

There are many reasons to keep a complete set of backups offsite at a second location. This can be accomplished by making offsite tape sets or by replicating data from the primary-site disk-based backup system to a second-site disk-based backup system. There are several questions to ask when considering offsite data protection in a disk-based backup system.

First, what is the deduplication rate? As discussed earlier, the deduplication rate is the determining factor in how much the data is reduced and how much disk is required. But the deduplication rate also affects the amount of bandwidth needed to maintain a second site for a given backup window, since only the data that has changed is sent over the WAN to the offsite system. The poorer the deduplication rate, the more data must be sent to maintain the offsite backup, and the more bandwidth is required for a given backup window. Deduplication rates can vary greatly, particularly with backup application software deduplication. ExaGrid achieves the highest deduplication rates and requires the lowest bandwidth.

Second, does the offsite system keep only deduplicated data, or some form of already rehydrated data that allows quick disaster recovery restores?
Any system that does inline deduplication stores only the deduplicated data and therefore results in slower disaster recovery (DR) times. As mentioned earlier, ExaGrid performs post-process deduplication, which keeps the most recent backup available in its complete form. And as an ExaGrid appliance replicates to an offsite ExaGrid appliance, that most recent backup is also maintained in its complete form on the offsite system. The result is that the data is ready to be restored quickly from either the local or the remote system.

Third, can data from the offsite system be used to restore any lost or corrupted data at the primary site? ExaGrid holds the patent on this capability: if anything happens to any of the backup data at the primary site, the offsite system can be used to restore or replace the lost or corrupted data. This creates an added level of safety.
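The bandwidth question raised above can be made concrete with a rough WAN sizing sketch. All of the inputs – nightly backup size, deduplication ratios, and replication window – are hypothetical illustrations, not figures from any vendor:

```python
# Rough WAN sizing for offsite replication (all inputs hypothetical).
# Only deduplicated changes cross the WAN, so the required bandwidth is
# roughly (nightly backup size / dedup ratio) spread over the transfer window.

def wan_mbps(backup_gb: float, dedup_ratio: float, window_hours: float) -> float:
    """Megabits per second needed to replicate one night's deduplicated data."""
    unique_gb = backup_gb / dedup_ratio
    return unique_gb * 8 * 1000 / (window_hours * 3600)  # GB -> megabits, hours -> seconds

# 500 GB nightly backup, replicated within an 8-hour window.
print(round(wan_mbps(500, 10, 8), 1))   # 10:1 reduction -> ~13.9 Mbps
print(round(wan_mbps(500, 20, 8), 1))   # 20:1 reduction -> ~6.9 Mbps
```

Under these assumptions, doubling the deduplication ratio halves the WAN bandwidth needed for the same replication window – which is why the deduplication rate matters for the offsite link, not just for onsite disk.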
Fourth, does the system allow you to demonstrate DR restores for an auditor? ExaGrid has a dedicated function for demonstrating DR restores for an internal or external audit.

8. Total Cost of Ownership

Cost needs to be examined both up front and over time: you want a lower price up front, and over time you don’t want to have to repurchase any part of the system. Because ExaGrid performs post-process deduplication, an ExaGrid configuration does not require the same level of processor and memory that the controller / disk shelf approach uses, which allows ExaGrid to be more cost-effective up front. And because ExaGrid’s zone-level deduplication requires smaller tracking tables and allows the system to scale via a grid-based configuration, you can add servers to a grid and keep adding as you grow. There is no point at which you must perform a forklift upgrade, so your investment is protected.

It is also important to look at cost-effectiveness when comparing ExaGrid to non-appliance-based deduplication systems, such as deduplication via the backup application software. On the surface, backup application software deduplication would appear to be fairly low-cost. After all, you can just turn on deduplication from within the backup server and you’re good to go, right? Not exactly. Using backup application software deduplication typically requires greater resources on the backup server – more processing power, more memory, and more disk. In addition, how do you determine exactly how much of each component is needed to work optimally in your particular backup environment? And what do you do as the amount of data you’re backing up grows? Answering these questions likely means additional costs in the form of professional services. When you get to the bottom line, the additional costs of all these items will typically exceed what you would pay for an ExaGrid appliance.
So whether you’re comparing ExaGrid to other appliance solutions, or to non-appliance solutions such as backup application software deduplication, ExaGrid is the most cost-effective disk-based backup solution, up front and over time.
Conclusion

The movement from tape-based to disk-based backup is indeed underway, whether that means disk onsite with tape remaining offsite, or disk both onsite and offsite. There is a great variety of approaches in the market today. While deduplication is an important factor, disk-based backup does not begin and end with deduplication; it is important to take into account all of the factors discussed above when evaluating these systems.
About ExaGrid

ExaGrid offers the only disk-based backup appliance with data deduplication purpose-built for backup that leverages a unique architecture optimized for performance, scalability, and price. The product was named “Product of the Year” for Backup and Recovery Hardware in 2010 by Storage magazine / SearchStorage.com. ExaGrid’s unique combination of post-process deduplication, most recent backup cache, and GRID scalability enables IT departments to achieve the shortest backup window and the fastest, most reliable restores, tape copy, and disaster recovery without performance degradation or forklift upgrades as data grows. With offices and distribution worldwide, ExaGrid has more than 3,500 systems installed and hundreds of published customer success stories and testimonial videos available at www.exagrid.com.

ExaGrid Systems, Inc | 2000 West Park Drive | Westborough, MA 01581 | 1-800-868-6985 | www.exagrid.com
© 2011 ExaGrid Systems, Inc. All rights reserved. ExaGrid is a registered trademark of ExaGrid Systems, Inc.