Physical Models and Logical
Architecture
Sunita Shrivastava
Vijay Sen
1
Work in Progress, Needs your input
• Work In progress
– Intent is to collect feedback and hear your views
– Will need your help to drive this to its logical conclusion
– Several drill downs required
• Why is this significant?
– Historically we did data protection/backup against the following backdrop: no virtualization, no
cloud, low-end storage technologies
– New Scenarios around data protection are now a lot more feasible due to the availability and
maturity of these technologies
– We need to understand our existing assets
– Intent is to not redesign the entire code but to be able to think through this systematically.
And come up with the following answers
a) Are we handling the new scenarios in an optimal way ?
b) What are the common constructs/shared meta data across the scenarios and hence across the
subservices ? Are we building silo’d solutions ?
c) Is a layered AND highly scalable AND HA solution feasible? Where and how do we change gears?
d) If so, what would a roadmap for transition to this look like?
2
Proposed plan
• What’s our vision?
– Exit criteria: Comprehensive vision objectives and non-objectives agreed upon by staff
• What assets do we have today?
– Exit criteria: Perceived strengths and weaknesses identified for the assets
• What are the new relevant scenarios?
– Exit criteria: vNext scenarios signed off
• What are the limitations today?
– Exit criteria: Gaps between scenarios required and assets available identified
• What is the new architecture?
– Exit criteria: Benefits of the new arch well understood; split between platform and management bought off by Windows team and leads
3
High Level Scenarios
• Backup
• HA ( geo clustering)
• Disaster Recovery
– Rehydration
• Application Migration
• Archival
4
Principles
Guiding principles
• Unified management across data protection, DR and migration
• Tailored to application owners and hosters
• Hybrid cloud awareness
• Enterprise class offering
• Alignment with Windows and CDM
• Our team innovates in management, but leverages replication
technologies
• We are a platform and an end to end solution (management is not
extensible, but we are a platform for other replication providers)
Non goals
• Support for non-Microsoft clouds
• TBD
Hoster Debate?
• Couple of options
– Provide a stack to hosters which allows them to
offer a recovery service all within their data
center, where they provide storage
• Another variation would be that Azure Storage is used
– Provide a stack to hosters which allows them to
leverage the “Recovery Service”(running in Azure)
for both backup and DR
– A combination of the above
6
Existing Assets
• Windows Server 8 Backup (Full Server, Critical Volumes(BMR), System State, Individual Volumes, Files/Folders)
– Strengths
• Free, Simple, Sweet Spot – 8 to 10 machines, Used in departmental/branch office servers in Enterprise Scenarios, Primary Ask is BMR
– Weaknesses
• Clustering Support, Centralized Console/Monitoring
• Client Backup
– File (zip file based, for compat?)
– System Restore (snapshot based, file level recovery, only backs up settings/registry/system files, affiliated with app/driver installs)
– System Image Backup
– History Vault (File level restore, uses Shadow Copies)
• DPM
– Strengths
• Adoption in midmarket
• SQL is the largest workload?
– What do we add over the SQL technologies as a value proposition?
• Exchange the largest workload in Enterprise?
• What about Sharepoint support?
• Can we make a claim that due to strong recovery models (item level) , customers like to use DPM when protecting our apps?
• Can we make a claim otherwise?
– Weaknesses
• Blockers for adoption in large Enterprises
– Tape Support
– Need lower data loss intervals for mission critical applications
– No support for de-duplication - Need more analysis
– Is Scale a blocker?
» Current Deployment Scale for customers : Fan of 10/13 servers, at most 3 to 4 DPM servers
» We are at the cusp for scale, as demonstrated by 64 node clusters where storage separation is necessary
» DPM Limit 80 TB for recovery volume and 40 TB for replica volume
7
Existing Assets(2)
• OBS
– Strengths
• A service on Azure that is designed for scale and availability
– Weaknesses
• No monitoring of whether backup is actually happening
• Large footprint; worker roles could be shared for higher utilization,
lower COGS and a smaller footprint
• What have we learnt from the early exposure?
• DPM as a Gateway to Cloud
– Strengths
• Great Story for off-site protection of data
– Weaknesses
• Scale Models ?
8
Gaps
• Non Optimal Data Movement
• Storage coupled too tightly with servers, not fungible across servers
• Lack of a Single Unified Protection Namespace
– For Backup, DR, HA
– For different Segments
• Not Hoster Friendly
• Silo’d services in the Enterprise
– No coherent SLA’s
• No application awareness
• Our Resources not as leveraged as they can be
• IMHO,
– a single protection namespace is the single biggest investment that we need to make from a
management perspective
– Layering over a replication service from a platform perspective
– Full fledged support, workflows for recovery
– Storage service for storage management is an investment we need to support hosters and
large enterprises
9
Introducing Protection Namespace
• A protection namespace hosts protected (or protectable) elements
– A binding is applied when a protectable element is made protected
– The binding specifies how a protected element is protected
• Protectable Element
– <Name, PEType>
• Protected Element
– <Name, PEType, ProtectionType, Target Recovery Service URI, Destination URI(optional)>
• User Specifies <Disaster Loss Tolerance, RPO, RTO, Destination Preference> on the
basis of which a protection type is chosen
• A protection namespace is rooted at the level of tenancy/subscription
• The leaf nodes are protected elements
• Container nodes serve only an organizational purpose within a namespace
– Is a flat space good enough?
• What can be imported into the namespace?
– Can there be policies to automatically discover protectable elements and import them automagically?
• Questions
• What is the relationship of this namespace to other potential elements or constructs within System
Center/Tofino?
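The tuples above can be sketched as a small data model. This is only an illustration: the class names, the grouping of the binding fields into a `Binding` object, the `import_element`/`protect` methods, and the `recovery://node-x` URI are assumptions, not part of any existing API.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class ProtectableElement:
    """<Name, PEType> as listed on the slide."""
    name: str
    pe_type: str  # e.g. "VM", "Volume", "SQL DB"


@dataclass(frozen=True)
class Binding:
    """How an element is protected (hypothetical grouping of the slide's fields)."""
    protection_type: str
    target_recovery_service_uri: str
    destination_uri: Optional[str] = None  # optional, as on the slide


@dataclass(frozen=True)
class ProtectedElement:
    element: ProtectableElement
    binding: Binding


@dataclass
class ProtectionNamespace:
    """Rooted at a tenancy/subscription; leaves are protected elements."""
    subscription_id: str
    protectable: Dict[str, ProtectableElement] = field(default_factory=dict)
    protected: Dict[str, ProtectedElement] = field(default_factory=dict)

    def import_element(self, element: ProtectableElement) -> None:
        self.protectable[element.name] = element

    def protect(self, name: str, binding: Binding) -> ProtectedElement:
        # Applying a binding turns a protectable element into a protected one.
        element = self.protectable.pop(name)
        protected = ProtectedElement(element, binding)
        self.protected[name] = protected
        return protected


ns = ProtectionNamespace("tenant-1")
ns.import_element(ProtectableElement("VM abc", "VM"))
pe = ns.protect("VM abc", Binding("Hyper-V Replication-Hot", "recovery://node-x"))
print(pe.binding.protection_type)
```

A flat dictionary is used here on purpose; container nodes could be layered on top without changing the element types.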
10
Understanding Protection Namespaces
• Providing a unified view across the Protection Namespace will prevent fragmented silo’d solutions which require a
lot of bookkeeping
• Protection Name Space (Sliced By Node/Clusters)
– <Source Type, Source Name(URI), Protection Type, Target Recovery Service URI, Status>
– Node A
• <Host, ‘A’, Windows Server Backup, Disk Z, Green>
• <Host, ‘A’, Windows Server Backup, Network Share Z, Green >
– Node B
• <Folder, ‘Folder xyz’, Snapshot Replication, Azure Storage, Green>
– Node C
• <Volume, ‘Volume a’, Snapshot Replication, Enterprise
– Node D
• <VM, ‘VM abc’, Hyper-V Replication-Hot, Target Host Node X, Green>
• <VM, ‘VM cde’, Snapshot Replication, Azure Storage, Green>
• <VM, ‘VM fgh’, Hyper-V Replication-Cold, Azure Storage, Green>
– Node E
• <SQL DB, DB ‘bcd’, SQL Logging, Green>
– Cluster E
• <VM, ‘VM efg’, Snapshot Replication, Yellow>
• <VM, ‘VM cde’, Hyper-V Replication-Hot, Target Host Cluster G>
– Cluster F
• <All VMs protected by Snapshot Replication, Status Green>
– SAN G
• <SAN Volumes, Volume A to G, SAN Replication, Target SAN>
– Node H
• <Volume,
11
Protection Namespaces and Apps
• Protection Name Space (Sliced By Application) <Protection Type,
Schedule, Status>
– Application XYZ (Replace with real life example, Hrweb?)
• Web Tier
– Node A
– Node B
– Node C
» < Windows Server Backup, Once in 15 days, 3 Recovery Points>
• Middle Tier
– Node E
– Node F
» <Windows Server Backup, Once in 15 days, 3 Recovery Points>
• Data Tier
– Cluster VMs
– < Hyper-V Replication, 5 minutes, 15 Recovery Points>
• Protection Name Space Chaining
• <SQL DB A, SQL Synchronous Replication, SQL DB B>
• <SQL DB B, SQL Asynchronous Replication, SQL DB C>
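Chaining can be pictured as following links from one protected element to the next. The sketch below is based on the chaining example above, with names normalized; the `protection_chain` function and the one-target-per-source assumption are illustrative only.

```python
from typing import Dict, List, Tuple

# (source, protection type, target) triples, based on the chaining example.
links: List[Tuple[str, str, str]] = [
    ("SQL DB A", "SQL Synchronous Replication", "SQL DB B"),
    ("SQL DB B", "SQL Asynchronous Replication", "SQL DB C"),
]


def protection_chain(source: str, links: List[Tuple[str, str, str]]) -> List[str]:
    """Follow links from `source` until no further target exists."""
    # Assumption: each source has at most one target.
    next_hop: Dict[str, str] = {src: dst for src, _ptype, dst in links}
    chain = [source]
    seen = {source}
    while chain[-1] in next_hop:
        target = next_hop[chain[-1]]
        if target in seen:  # guard against a cyclic configuration
            break
        chain.append(target)
        seen.add(target)
    return chain


print(protection_chain("SQL DB A", links))
```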
12
Protection Namespace
• Single Server
– Local Protection Namespace handled by the Local
Recovery Service
– Should we allow Publishing
– Optionally, can publish to the Recovery Service or can publish to
the Recovery Service(Azure)
• Do we really need a protection namespace?
– Is it beneficial to the Application
Owner/Administrator?
– Is it beneficial to the Hoster (Fabric/Service Provider)?
– Is it beneficial to the Fabric Administrator within a
Data Center?
13
Application Migration and Protection
Namespace
• What is the relationship of Application Migration to the
Protection/Recovery Service?
• Catalog/VSS Writer could aid in Application Discovery?
– Unknown
• Commonalities
– Application Migration Equals “IR + Simplified Recovery + Hydration”
– Failover to cloud in case of Disaster Recovery is essentially equivalent
to Application Migration in terms of requirements around the
environment required by the application
• A migrated application may need a VPN
• An application failing over to the cloud may need a VPN to be configured
– In the long term, what do we as a Protection/Recovery Team need to
do to ensure that the application can be protected appropriately as it
is migrated
14
Three segments
• Enterprise
• Cloud
– Azure Services
• Hybrid
– A unified namespace across the enterprise and
cloud
15
Management Tasks for the Protection
Namespace
• The success of the entire solution depends not only on plumbing but the ease with which data protection needs
of a customer can be met
• There are plenty of significant Management Tasks
– Management of Protection NameSpace
• Protection Name Spaces
– Contain Protected Elements
» Application, VM, Volumes, Collection of Volumes
– Hierarchical, Nested
» By Location(Site), By Cluster, By Node
» By Application
– Overlapping ?
– Management of Protection Policies
– Simplifying Policies
» Stock Policies ?
» Based on Intent and Calibration ?
– Provisioning for Replication
– Driving the underlying replication
– Monitoring the Status of Protection
– Given the policies and the namespaces, alert if things are not on schedule
– Orchestration for Recovery of Data
• Indexing and Cataloging for Efficient Retrieval
• Orchestration of Recovery
– Management for Disaster Recovery
• Reserving Fabric
• Testing of Fail Over and Fail back to Primary Site
– Orchestration of Hydration
– Management of Storage, Bandwidth, Fabric, Networking for all of the above
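The monitoring task ("alert if things are not on schedule") can be sketched as a comparison between each element's expected interval and the age of its latest recovery point. The `ProtectionStatus` type, the `overdue` function, and the sample dates are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class ProtectionStatus:
    name: str                      # protected element name
    interval: timedelta            # how often a recovery point is expected
    last_recovery_point: datetime  # when the latest recovery point was taken


def overdue(elements: List[ProtectionStatus], now: datetime) -> List[str]:
    """Return names of elements whose latest recovery point is older than the interval."""
    return [e.name for e in elements if now - e.last_recovery_point > e.interval]


now = datetime(2012, 1, 1, 12, 0)  # fixed clock so the example is repeatable
elements = [
    ProtectionStatus("VM abc", timedelta(minutes=5), now - timedelta(minutes=3)),
    ProtectionStatus("Node A", timedelta(days=15), now - timedelta(days=16)),
]
print(overdue(elements, now))
```

A real monitor would also need to decide how much slack to allow before raising an alert; that policy is left out here.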
16
VMs and Our Scenarios
• Why do VMs require backup?
– VM corruption?
– Guest Level Backup vs VM Level Backup
• However, it is important to understand where guest level replication makes more sense
• Where does a combination of guest and VM level backup make sense?
• VMs lend themselves more easily to migration
– DR drives virtualization, as DR requires migration
• Definition #1 (For Azure?)
– Cold Backup
• Medium to Low Recovery Time, Low Data Loss Tolerance
– Hot Backup
• Low Recovery Time + Low Data Loss Tolerance
• Definition #2 (Applicable for Private Clouds)
– Cold Backup
• Low Data Loss Tolerance, Fabric is not reserved, Recovery may take long, however resources may be shared more effectively
– Hot Backup
• Low Recovery Time, Low Data Loss Tolerance, Fabric is reserved
17
Physical Model (Enterprise Only)
• Diagram labels:
– Sites: Site X, Site Y
– Protected sources: Protected Node, Protected Node (NAS), Protected Node (Hyper-V Host), Protected Cluster, CSV Volumes, VM, Application
– Storage: DAS, SAN, Storage Service, Cloud Storage, Protected Data (possibly SAN, possibly NAS), Archived Data, Protection MetaData
– Services: Protection Service, Recovery Service, Hydration Service, Subscription Service, Policy/Monitoring Service, DR Service, Archival and Reporting Service, Catalog Service, Fabric Mgmt Service, Cloud Service (Self Service)
– Recovery side: Recovery Cloud (Private), Recovery Node (Hyper-V Host)
– Flows: Hot Backup, Cold Backup, Server Backup, Large Storage Backup, App Migration
18
Physical Model (Direct to Cloud)
• Diagram labels:
– Cloud Storage: Azure Blobs (for data), SQL Azure (for metadata)
– Sites: Site X, Site Y, Production Cloud (Private)
– Protected sources: Protected Node, Protected Node (NAS), Protected Node (Hyper-V Host), Protected Cluster, CSV Volumes, VM, Application
– Storage: DAS, SAN, Storage Service
– Services: Protection Service, Recovery Service, Hydration Service, Policy/Monitoring Service, DR Service, Archival and Reporting Service, Fabric Mgmt Service, Cloud Service (Self Service)
– Recovery side: Recovery Cloud (IaaS), Recovery Node (Hyper-V Host)
– Flows: Hot Backup, Cold Backup, Large Storage Backup, Hydration
19
Logical Architecture
• Diagram labels:
– Recovery Service (Web Tier)
– Protection Service, Recovery Service (Data Tier), Retrieval Service, Catalog Service, Disaster Recovery Service, App Migration Service, Hydration Service
– Infrastructure Services (Subscription (tenancy), Transport, Jobs, Networking)
– Storage Service (Protection Data)
– Recovery Service (Job Service): Recovery Providers, Offline Recovery Providers
– Windows Server: Local Recovery Service
– Replication Service: Replication Provider, Replication Provider (Snapshot), Replication Provider (Hyper-V)
– VSS Providers
20
Example (Enterprise DR)
• Diagram labels:
– Recovery Service Portal, Migration Service Portal
– Recovery Service (Web Tier), Protection Service (Data Tier), Protection Service (Job Service)
– Enterprise Storage Service (Protection Data)
– Data Post Processing Roles
– Storage Provider (Hyper-V R), Storage Provider (Modified VHD Writer)
– Xport Provider (File Write), Xport Provider (Hyper-V R VM), Xport Provider (Modified VHD Writer)
– Catalog Providers
– Windows Server, Hyper-V: Local Recovery Service
– Replication Service: Replication Provider, Replication Provider (Snapshot), Replication Provider (Hyper-V)
– VSS Providers
21
Hyper-V R To Cloud Example
• Diagram labels:
– Recovery Service Portal, Migration Service Portal
– Recovery Service (Web Tier), Protection Service (Data Tier), Recovery Service (Job Service)
– Azure Storage (Protection Data)
– Data Post Processing Role
– Xport Provider (File Write), Xport Provider (Hyper-V R Cloud), Xport Provider (Modified VHD Writer)
– Catalog Providers
– Windows Server: Local Recovery Service
– Replication Service: Replication Provider, Replication Provider (Snapshot), Replication Provider (Hyper-V)
– VSS Providers
22
Site Protection
• Diagram labels:
– Recovery Service Portal, Migration Service Portal
– Xport Provider (File Write)
– Catalog Providers
– Hyper-V
– Data Post Processing Role
23
Windows Server Backup Example (?)
24
SQL Replication Example
25
Application Migration Sharepoint
Example
26
Fine Grained Recovery From Hyper-V
27
• What other examples do we need?
28
Capabilities of Components
• Replication Provider
– Capability Profile
• Supported Protected Element Types
• Min Data Loss Tolerance Window
• Max Data Loss Tolerance Window
• Application Consistency Support
– Requirement profile
• Require Off site Post Processing - Should this be a Xport Provider Requirement?
• Recovery Service Profile
– Capability Profile (Are these per protected Element Type)
• Recovery Time
• Recovery Points in Time
• Retention Time
• Encryption At Rest
• Supported Offline Recovery Providers
• Storage/Xport Provider Profile
– Capability Profile
• Client Side Encryption
• Which Recovery Service are they affiliated to?
– Recovery Service in Cloud --- storage is in cloud
– Recovery Service in Enterprise (DPM vnext) – storage is in Enterprise
– Recovery Service in another node or Cluster – data is stored in storage local to that node/cluster
• Is there a notion of a Recovery Mgmt service that other Recovery Services can use for
keeping their metadata and cataloging?
29
Major Components
• Subscription Service : Create a Protection Name Space for a given
customer
• Protection Service : Allows creation of Protected Elements within a
protection namespace
• Catalog Service : Provides for creation of a catalog for protected
elements for a given protection namespace
• Recovery Service : Allows recovery of data for a protected element
in a protection name space
• Hydration Service : Uses the recovery service to hydrate VMs in a
private cloud or to Azure
• Job Service : Performs long running tasks submitted by the main
services and provides the infrastructure to monitor their progress
• Data Post Processing Roles : A replication provider can register a
data post processing role to process data before it is stored
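The Data Post Processing Roles idea (a provider registers a step that runs before data is stored) could look roughly like the sketch below. The `PostProcessingRegistry` class, its methods, the provider name, and the in-memory `store` are hypothetical; compression with `zlib` stands in for whatever processing a real role would perform.

```python
import zlib
from typing import Callable, Dict, List

PostProcessor = Callable[[bytes], bytes]


class PostProcessingRegistry:
    """Maps a replication provider name to the roles it registered (hypothetical API)."""

    def __init__(self) -> None:
        self._roles: Dict[str, List[PostProcessor]] = {}

    def register(self, provider: str, role: PostProcessor) -> None:
        self._roles.setdefault(provider, []).append(role)

    def process(self, provider: str, data: bytes) -> bytes:
        # Apply the provider's roles in registration order before storing.
        for role in self._roles.get(provider, []):
            data = role(data)
        return data


registry = PostProcessingRegistry()
registry.register("snapshot", zlib.compress)  # stand-in post processing step

store: Dict[str, bytes] = {}  # stand-in for the storage service
payload = b"block data " * 100
store["recovery-point-1"] = registry.process("snapshot", payload)
print(len(payload), len(store["recovery-point-1"]))
```

Providers with no registered role pass their data through unchanged, so registration stays optional.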
30
31
Components (Client Side)
• Recovery Service (Agent) : Manages/Orchestrates the
processes in providing protection for a protected
element and associates it with a recovery service.
• Replication Service : Provides the framework/platform
for different replication providers to plug in
• Replication Provider
• Xport Provider
• Catalog Provider
• VSS Writers
Benefits
• Sets the Framework for a unified namespace for
Backup/DR/HA
• Create a Hoster Friendly Stack
– Hosters should want to deploy our stack in their
datacenter to provide value added offerings
– Retain a model where Hosters can also easily leverage
Azure resources for their recovery scenarios
– Need to understand what kinds of extensibilities they
would need beyond building their own portal
– Over time we have a mostly unified codebase written
to the service model
33
Roadmap/Next Steps
• Next steps
– Build a roadmap, possibly multi-release, to get
there
• Vteams to discuss and iterate over this
34
Plausible Roadmap
• V Next
– Build the Protection Mgmt service for the Azure segment
(Protect on Azure)
• Align with Tofino
• Notion of application definition or service template
– How do we leverage and align with that?
– Evolve DPM to be the protection mgmt service for the
Enterprises/Hosters
• Adopt the OBS/Service architecture that supports multi-tenancy
• Be the platform of choice for hosters to adopt to provide data
protection services to their customers
• Ensure that it works seamlessly with OBS service to provide geo-
protection using the Azure Cloud Storage
• V Next Next
– Figure out the evolution of components to serve the hybrid
cloud or the combined namespace
35
The Data Replication Problem
• Limiting Factors
– Throughput at the sending and the receiving side
– Storage at the processing side
• Consists of the following parts
– IR, Change Tracking and Data Movement
– Catalog
• IR
– Can we avoid/circumvent the problem by the use of published well known images?
• Change Tracking
– Data must be self descriptive
• Data Movement
– Channel
• Must Implement Push and Pull
• Selection of EndPoint Listener
– Azure Replication Storage Service (cloud backup for VMs)
– Private Cloud Replication Storage Service
– Hyper-v Host Replication Listener (hot backup)
• Negotiate for compression
• Encryption on Wire
• Support Throttling
• Catalog
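"Negotiate for compression" on the channel can be sketched as picking the first codec both ends support. The function name and the codec names in the example are assumptions for illustration; the slide does not specify a negotiation scheme.

```python
from typing import List, Optional


def negotiate_compression(sender_prefs: List[str],
                          listener_supported: List[str]) -> Optional[str]:
    """Pick the first codec in the sender's preference order that the listener supports."""
    supported = set(listener_supported)
    for codec in sender_prefs:
        if codec in supported:
            return codec
    return None  # no common codec: fall back to sending uncompressed


print(negotiate_compression(["lz4", "gzip"], ["gzip", "deflate"]))
```

The same pattern would apply to negotiating wire encryption or transmission format between the sender and the endpoint listener.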
36
A Layered Architecture? Possible?
• Extensibility at Source
– Note: Replication Layer solely focused on change tracking
– Responsibilities (Replication Layer)
» 1. Enable Change Tracking
» 2. Notification of handlers for safe transmission and persistence of data
– Responsibilities (Data Protection Layer)
» 1. Authentication
» 2. Transmission Format
» 3. Provide the acks required as per the replication protocol
• Extensibility at the listener
– Responsibilities (Replication Layer)
» 1. Change Tracking
» 2. Provide a set of listeners at the destination end of the channel
» 3. Authentication for the channel
» 4. Formats of Transmission
– Responsibilities (Data Protection Layer): two models here
» a) Data Protection Layer controls persistence
» b) Data Protection Layer preps the storage and the replication layer writes directly to the storage
– Notes: Transmit change data to a Specified Listener; the entities at the two ends of the channel agree on the protocol
37
Replication Provider Profile
• Min Data Loss Tolerance Window
• Max Data Loss Tolerance Window
• Application Consistency Support
• Recovery Time
– This depends more on the state in which the most up-to-
date copy is kept
• Recovery Points in Time
– To some extent this is not a capability of the provider but a
limit imposed by the storage or driven by requirements
• Retention Time
– Not really a capability of the provider
38
Requirements for Coupling Replication
Providers and Storage Providers
39
Basic Interaction
• User tells the Mgmt Layer the source he needs to
protect, specifies the SLAs(Data Loss Tolerance, RPO,
RTO and Retention Requirements)
• Mgmt Layer queries the Replication Service
– Replication Service queries the replication providers which
have registered with it
– Returns the provider
– Mgmt Layer will provide the choice to the users
• Mgmt will ask the Replication Service to configure for
replication with the user’s choice
• There is an initial handshake with the listener endpoint
where queries for storage are negotiated
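The query step of this interaction can be sketched as follows. Everything here is an assumption made for illustration: the `ProviderProfile` fields mirror the capability profile bullets, the provider names and window values are made up, and the matching rule (a provider qualifies if its minimum data loss window does not exceed the user's tolerance) is one possible interpretation.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class ProviderProfile:
    name: str
    supported_types: FrozenSet[str]  # Supported Protected Element Types
    min_loss_window_s: int           # Min Data Loss Tolerance Window (seconds)
    max_loss_window_s: int           # Max Data Loss Tolerance Window (seconds)
    app_consistent: bool             # Application Consistency Support


class ReplicationService:
    def __init__(self) -> None:
        self._providers: List[ProviderProfile] = []

    def register(self, profile: ProviderProfile) -> None:
        self._providers.append(profile)

    def query(self, pe_type: str, loss_tolerance_s: int,
              need_app_consistency: bool) -> List[Tuple[str, int]]:
        """Return (provider name, window to configure) for each matching provider."""
        matches = []
        for p in self._providers:
            if pe_type not in p.supported_types:
                continue
            if need_app_consistency and not p.app_consistent:
                continue
            if p.min_loss_window_s > loss_tolerance_s:
                continue  # provider cannot replicate often enough
            matches.append((p.name, min(loss_tolerance_s, p.max_loss_window_s)))
        return matches


service = ReplicationService()
service.register(ProviderProfile("provider-a", frozenset({"VM"}), 30, 900, False))
service.register(ProviderProfile("provider-b", frozenset({"VM", "Volume"}), 900, 86400, True))

# The Mgmt layer would present these choices to the user.
print(service.query("VM", loss_tolerance_s=3600, need_app_consistency=False))
```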
40
Appendix
41
Windows 8 Storage Investments
• Windows Storage Pools : Storage virtualization over commodity disks but
providing advanced capabilities
– Spaces : Virtual disks created off of storage pools
• Offloaded Data Transfer
– Copy is offloaded to the intelligent storage array
• SMB Scaleout
– SMB Direct : Clients need a NIC with RDMA capability
– SMB Multipath : Adds robustness
– SMB VSS for Remote File Shares
• CSV –
– Available for application workloads, integrated with storage pools, thin
provisioning, smb scale out, support for fully featured VSS
• Data De-duplication
– On server : how does dedup compare to our compression
– On Host : DPM 2012 can handle deduped
42
Replication Comparison
• Hyper-v Replication
– Provides low data loss tolerance and write order consistency
– Depends on MSCS clustering
• Not very resilient to primary host failure (Will require resync)
• Not very resilient to replica Failure
• Buffers will overflow, Doesn’t have log folding
– Doesn’t separate Staging of VMs from data storage
• Replica Server may be receiving data for some VMs and at the same time hosting a VM that has failed over
– How will it leverage storage deduplication?
• Snapshotting and USN based File Tracking Mechanisms
– USN based file change tracking mechanisms coupled with volume snapshotting help extract the changes between two snapshots
– File System Filter Driver helps track the file blocks that have changed
– Resync’s are required if tracking is upset
– More resilient to DPM server outage
– Snapshotting on the receiving side is a blocker for scale --- how many concurrent vss snapshots can a server perform across
different volumes?
• Chained Snapshotting helps utilize epoch based recovery
• Each snapshot representing an epoch
• Data Loss Tolerance
– For Hyper-v, scsi writes are copied into a buffered log pretty much continuously
– For DPM, copy on write is enabled during the interval that buffered copies happen
• So,
– How low can we squeeze the data loss tolerance with DPM?
– How high can we squeeze the data loss window with hyper-v R?
– We need instrumentation of data, ideally we should be able to compare the same workload
– We can calibrate the workload and intent and choose….but then
• What happens when the workload changes?
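To make "extract the changes between two snapshots" concrete, the sketch below compares two snapshot images block by block. This is a brute-force comparison chosen for clarity, not how the USN journal or a file system filter driver tracks changes; the block size and sample data are arbitrary.

```python
import hashlib
from typing import List

BLOCK_SIZE = 4  # tiny block size so the example is easy to follow


def block_hashes(snapshot: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Hash each fixed-size block of a snapshot image."""
    return [
        hashlib.sha256(snapshot[i:i + block_size]).digest()
        for i in range(0, len(snapshot), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> List[int]:
    """Indices of blocks in `new` that differ from, or do not exist in, `old`."""
    old_h = block_hashes(old, block_size)
    new_h = block_hashes(new, block_size)
    return [i for i, h in enumerate(new_h) if i >= len(old_h) or h != old_h[i]]


snap1 = b"AAAABBBBCCCC"
snap2 = b"AAAAXXXXCCCCDDDD"
print(changed_blocks(snap1, snap2))  # block 1 was rewritten, block 3 is new
```

Only the changed blocks would need to move over the wire; a change tracker avoids the full scan this sketch performs.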
43
Catalog
• Catalog – Historical, Tells what the high level contents
of a backup are
– This essentially provides for browsability before full
recovery is undertaken
• The metadata for the structure/high level contents of an
application is part of the data associated with a given
recovery point. However, the catalog can help identify
which recovery point may have the data of
interest
– Can the catalog information be handed down the VSS
snapshot process?
• We expect the catalog to be tree structured
• This can be huge for a large application
• In such cases, can the applications be responsible for keeping an
up-to-date catalog?
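A tree-structured catalog that answers "which recovery point may have the data of interest" could be sketched like this. The nested-dictionary layout, the recovery point ids, and the sample paths are hypothetical.

```python
from typing import Dict, List

# Hypothetical catalog: recovery point id -> tree of names
# (a dict is a container, None marks a leaf).
Catalog = Dict[str, dict]

catalog: Catalog = {
    "rp-001": {"SiteA": {"Docs": {"plan.docx": None}}},
    "rp-002": {"SiteA": {"Docs": {"plan.docx": None, "budget.xlsx": None}}},
}


def contains(tree: dict, path: List[str]) -> bool:
    """Walk the tree along `path`; True if every component exists."""
    node = tree
    for part in path:
        if not isinstance(node, dict) or part not in node:
            return False
        node = node[part]
    return True


def recovery_points_with(catalog: Catalog, path: List[str]) -> List[str]:
    """Recovery points whose catalog tree contains the given path."""
    return sorted(rp for rp, tree in catalog.items() if contains(tree, path))


print(recovery_points_with(catalog, ["SiteA", "Docs", "budget.xlsx"]))
```

Browsing happens against the catalog alone, so no recovery data has to be read until a recovery point is chosen.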
44
Replicated Content Format
• DPM stores the content uncompressed/unencrypted and
uses VSS snapshots as a mechanism to create
point-in-time copies
• Hyper-v R supports VHD 2.0, data is not encrypted at
rest but may be encrypted for transmission, data is not
compressed
• OBS supports a modified VHD 1.0 (meta data is vhd
1.0, blocks are compressed and encrypted at rest)
• We are doing some tests on how much extraction,
encryption and decompression add to the recovery
time
45

More Related Content

What's hot

Ibm spectrum scale fundamentals workshop for americas part 6 spectrumscale el...
Ibm spectrum scale fundamentals workshop for americas part 6 spectrumscale el...Ibm spectrum scale fundamentals workshop for americas part 6 spectrumscale el...
Ibm spectrum scale fundamentals workshop for americas part 6 spectrumscale el...xKinAnx
 
Techgate solution sets 2014
Techgate solution sets 2014Techgate solution sets 2014
Techgate solution sets 2014Techgate plc
 
Disaster Recovery- A Case Study
Disaster Recovery- A Case StudyDisaster Recovery- A Case Study
Disaster Recovery- A Case Studyoneneckitservices
 
Technical track 2: arcserve UDP for virtualization & cloud
Technical track 2: arcserve UDP for virtualization & cloudTechnical track 2: arcserve UDP for virtualization & cloud
Technical track 2: arcserve UDP for virtualization & cloudarcserve data protection
 
Arcserve Portfolio Technical Overview
Arcserve Portfolio Technical OverviewArcserve Portfolio Technical Overview
Arcserve Portfolio Technical OverviewGina Tragos
 
Trends in Data Protection with DCIG
Trends in Data Protection with DCIGTrends in Data Protection with DCIG
Trends in Data Protection with DCIGGina Tragos
 
Next Generation Data Protection Architecture
Next Generation Data Protection Architecture Next Generation Data Protection Architecture
Next Generation Data Protection Architecture Gina Tragos
 
Data Domain Architecture
Data Domain ArchitectureData Domain Architecture
Data Domain Architecturekoesteruk22
 
Using multi tiered storage systems for storing both structured & unstructured...
Using multi tiered storage systems for storing both structured & unstructured...Using multi tiered storage systems for storing both structured & unstructured...
Using multi tiered storage systems for storing both structured & unstructured...ORACLE USER GROUP ESTONIA
 
Oracle Coherence: in-memory datagrid
Oracle Coherence: in-memory datagridOracle Coherence: in-memory datagrid
Oracle Coherence: in-memory datagridEmiliano Pecis
 
Storage Architectures And Options
Storage Architectures And OptionsStorage Architectures And Options
Storage Architectures And OptionsAlan McSweeney
 
Provisioning server high_availability_considerations2
Provisioning server high_availability_considerations2Provisioning server high_availability_considerations2
Provisioning server high_availability_considerations2Nuno Alves
 
Druva In Sync Product Overview
Druva In Sync Product OverviewDruva In Sync Product Overview
Druva In Sync Product Overviewrammotive
 
2/18 Technical Overview
2/18 Technical Overview2/18 Technical Overview
2/18 Technical OverviewGina Tragos
 
Learn the facts about replication in mainframe storage webinar
Learn the facts about replication in mainframe storage webinarLearn the facts about replication in mainframe storage webinar
Learn the facts about replication in mainframe storage webinarHitachi Vantara
 
Times Ten in-memory database when time counts - Laszlo Ludas
Times Ten in-memory database when time counts - Laszlo LudasTimes Ten in-memory database when time counts - Laszlo Ludas
Times Ten in-memory database when time counts - Laszlo LudasORACLE USER GROUP ESTONIA
 
IMCSummit 2015 - Day 2 General Session - Flash-Extending In-Memory Computing
IMCSummit 2015 - Day 2 General Session - Flash-Extending In-Memory ComputingIMCSummit 2015 - Day 2 General Session - Flash-Extending In-Memory Computing
IMCSummit 2015 - Day 2 General Session - Flash-Extending In-Memory ComputingIn-Memory Computing Summit
 
Business Track 3: arcserve udp licensing pricing & support made simple
Business Track 3: arcserve udp licensing pricing & support made simpleBusiness Track 3: arcserve udp licensing pricing & support made simple
Business Track 3: arcserve udp licensing pricing & support made simplearcserve data protection
 

What's hot (20)

Ibm spectrum scale fundamentals workshop for americas part 6 spectrumscale el...
Ibm spectrum scale fundamentals workshop for americas part 6 spectrumscale el...Ibm spectrum scale fundamentals workshop for americas part 6 spectrumscale el...
Ibm spectrum scale fundamentals workshop for americas part 6 spectrumscale el...
 
Hitachi Data Services. Business Continuity
Hitachi Data Services. Business ContinuityHitachi Data Services. Business Continuity
Hitachi Data Services. Business Continuity
 
Techgate solution sets 2014
Techgate solution sets 2014Techgate solution sets 2014
Techgate solution sets 2014
 
Disaster Recovery- A Case Study
Disaster Recovery- A Case StudyDisaster Recovery- A Case Study
Disaster Recovery- A Case Study
 
Technical track 2: arcserve UDP for virtualization & cloud
Technical track 2: arcserve UDP for virtualization & cloudTechnical track 2: arcserve UDP for virtualization & cloud
Technical track 2: arcserve UDP for virtualization & cloud
 
Arcserve Portfolio Technical Overview
Arcserve Portfolio Technical OverviewArcserve Portfolio Technical Overview
Arcserve Portfolio Technical Overview
 
Technical track 2_Virtualization & Cloud
Technical track 2_Virtualization & CloudTechnical track 2_Virtualization & Cloud
Technical track 2_Virtualization & Cloud
 
Trends in Data Protection with DCIG
Trends in Data Protection with DCIGTrends in Data Protection with DCIG
Trends in Data Protection with DCIG
 
Next Generation Data Protection Architecture
Next Generation Data Protection Architecture Next Generation Data Protection Architecture
Next Generation Data Protection Architecture
 
Data Domain Architecture
Data Domain ArchitectureData Domain Architecture
Data Domain Architecture
 
Using multi tiered storage systems for storing both structured & unstructured...
Using multi tiered storage systems for storing both structured & unstructured...Using multi tiered storage systems for storing both structured & unstructured...
Using multi tiered storage systems for storing both structured & unstructured...
 
Oracle Coherence: in-memory datagrid
Oracle Coherence: in-memory datagridOracle Coherence: in-memory datagrid
Oracle Coherence: in-memory datagrid
 
Storage Architectures And Options
Storage Architectures And OptionsStorage Architectures And Options
Storage Architectures And Options
 
Provisioning server high_availability_considerations2
Provisioning server high_availability_considerations2Provisioning server high_availability_considerations2
Provisioning server high_availability_considerations2
 
Druva In Sync Product Overview
Druva In Sync Product OverviewDruva In Sync Product Overview
Druva In Sync Product Overview
 
2/18 Technical Overview
2/18 Technical Overview2/18 Technical Overview
2/18 Technical Overview
 
Learn the facts about replication in mainframe storage webinar
Learn the facts about replication in mainframe storage webinarLearn the facts about replication in mainframe storage webinar
Learn the facts about replication in mainframe storage webinar
 
Times Ten in-memory database when time counts - Laszlo Ludas
Times Ten in-memory database when time counts - Laszlo LudasTimes Ten in-memory database when time counts - Laszlo Ludas
Times Ten in-memory database when time counts - Laszlo Ludas
 
IMCSummit 2015 - Day 2 General Session - Flash-Extending In-Memory Computing
IMCSummit 2015 - Day 2 General Session - Flash-Extending In-Memory ComputingIMCSummit 2015 - Day 2 General Session - Flash-Extending In-Memory Computing
IMCSummit 2015 - Day 2 General Session - Flash-Extending In-Memory Computing
 
Business Track 3: arcserve udp licensing pricing & support made simple
Business Track 3: arcserve udp licensing pricing & support made simpleBusiness Track 3: arcserve udp licensing pricing & support made simple
Business Track 3: arcserve udp licensing pricing & support made simple
 

Logical Architecture for Protection

  • 1. Physical Models and Logical Architecture Sunita Shrivastava Vijay Sen 1
  • 2. Work in Progress, Needs your input • Work in progress – Intent is to collect feedback and hear your views – Will need your help to drive this to its logical conclusion – Several drill downs required • Why is this significant? – Historically we did data protection/backup in the following backdrop: no virtualization, no cloud, low end storage technologies – New scenarios around data protection are now a lot more feasible due to the availability and maturity of these technologies – We need to understand our existing assets – Intent is not to redesign the entire code but to be able to think through this systematically and come up with the following answers a) Are we handling the new scenarios in an optimal way? b) What are the common constructs/shared metadata across the scenarios and hence across the subservices? Are we building siloed solutions? c) Is a layered AND highly scalable AND HA solution feasible? Where and how do we change gears? d) If so, what would a roadmap for transition to this look like? 2
  • 3. Proposed plan What’s our vision? What assets do we have today? What are the new relevant scenarios? What are the limitations today? What is the new architecture? Exit criteria Benefits of the new arch well understood; Split between platform and management bought off by Windows team and leads Exit criteria vNext scenarios signed off Exit criteria Perceived strengths and weaknesses identified for the assets Exit criteria Comprehensive vision objectives and non-objectives agreed upon by staff Exit criteria Gaps between scenarios required and assets available identified 3
  • 4. High Level Scenarios • Backup • HA ( geo clustering) • Disaster Recovery – Rehydration • Application Migration • Archival 4
  • 5. Principles • Guiding principles – Unified management across data protection, DR and migration – Tailored to application owners and hosters – Hybrid cloud awareness – Enterprise class offering – Alignment with Windows and CDM – Our team innovates in management, but leverages replication technologies – We are a platform and an end to end solution (management is not extensible, but we are a platform for other replication providers) • Non-goals – Support for non-Microsoft clouds – TBD
  • 6. Hoster Debate? • Couple of options – Provide a stack to hosters which allows them to offer a recovery service all within their data center, where they provide storage • Another variation would be that Azure Storage is used – Provide a stack to hosters which allows them to leverage the “Recovery Service”(running in Azure) for both backup and DR – A combination of the above 6
  • 7. Existing Assets • Windows Server 8 Backup (Full Server, Critical Volumes (BMR), System State, Individual Volumes, Files/Folders) – Strengths • Free, Simple, Sweet Spot – 8 to 10 machines, Used in departmental/branch office servers in Enterprise Scenarios, Primary Ask is BMR – Weaknesses • Clustering Support, Centralized Console/Monitoring • Client Backup – File (zip file based, for compat?) – System Restore (snapshot based, file level recovery, only backs up settings/registry/system files, affiliated with app/driver installs) – System Image Backup – History Vault (File level restore, uses Shadow Copies) • DPM – Strengths • Adoption in midmarket • SQL is the largest workload? – What do we add over the SQL technologies as a value proposition? • Is Exchange the largest workload in Enterprise? • What about SharePoint support? • Can we make a claim that due to strong recovery models (item level), customers like to use DPM when protecting our apps? • Can we make a claim otherwise? – Weaknesses • Blockers for adoption in large Enterprises – Tape Support – Need lower data loss intervals for mission critical applications – No support for de-duplication - Need more analysis – Is Scale a blocker? » Current Deployment Scale for customers: Fan of 10/13 servers, at most 3 to 4 DPM servers » We are at the cusp for scale, as demonstrated by 64 node clusters where storage separation is necessary » DPM Limit 80 TB for recovery volume and 40 TB for replica volume 7
  • 8. Existing Assets(2) • OBS – Strengths • A service on Azure that is designed for scale and availability – Weaknesses • No monitoring of whether backup is actually happening • Large footprint, worker roles could be shared for higher utilization, fewer cogs and smaller footprint • What have we learnt from the early exposure • DPM as a Gateway to Cloud – Strengths • Great Story for off-site protection of data – Weakness • Scale Models ? 8
  • 9. Gaps • Non-Optimal Data Movement • Storage coupled too tightly with servers, not fungible across servers • Lack of a Single Unified Protection Namespace – For Backup, DR, HA – For different Segments • Not Hoster Friendly • Siloed services in the Enterprise – No coherent SLAs • No application awareness • Our Resources not as leveraged as they can be • IMHO, – a single protection namespace is the single biggest investment that we need to make from a management perspective – Layering over a replication service from a platform perspective – Full-fledged support, workflows for recovery – Storage service for storage management is an investment we need to support hosters and large enterprises 9
  • 10. Introducing Protection Namespace
    • A protection namespace hosts protected (or protectable) elements
      – A binding is applied when a protectable element is made protected
      – The binding specifies how a protected element is protected
    • Protectable Element
      – <Name, PEType>
    • Protected Element
      – <Name, PEType, ProtectionType, Target Recovery Service URI, Destination URI (optional)>
    • User specifies <Disaster Loss Tolerance, RPO, RTO, Destination Preference>, on the basis of which a protection type is chosen
    • A protection namespace is rooted at the level of tenancy/subscription
    • The leaf nodes are protected elements
    • Container nodes only serve an organizational purpose within a namespace
      – Is a flat space good enough?
    • What can be imported into the namespace?
      – Can there be policies to automatically discover protectable elements and import them automagically?
    • Questions
      – What is the relationship of this namespace to other potential elements or constructs within System Center/Tofino?
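The binding step on this slide can be sketched in Python. This is only an illustration under stated assumptions: the class names, the `bind` helper, the selection thresholds, and the example URIs are hypothetical and do not come from the deck, and only RPO and RTO are used to pick a protection type (Disaster Loss Tolerance is left out for brevity).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProtectableElement:
    """<Name, PEType> as listed on the slide."""
    name: str
    pe_type: str  # e.g. "VM", "Volume", "Folder", "SQL DB"


@dataclass(frozen=True)
class ProtectedElement:
    """<Name, PEType, ProtectionType, Target Recovery Service URI, Destination URI>."""
    name: str
    pe_type: str
    protection_type: str
    target_recovery_service_uri: str
    destination_uri: Optional[str] = None


@dataclass(frozen=True)
class ProtectionIntent:
    """What the user specifies; field names are illustrative."""
    rpo_minutes: int
    rto_minutes: int
    destination_preference: Optional[str] = None


def choose_protection_type(pe_type: str, intent: ProtectionIntent) -> str:
    # Hypothetical thresholds: the deck does not say how a type is chosen.
    if pe_type == "VM" and intent.rpo_minutes <= 15 and intent.rto_minutes <= 15:
        return "Hyper-V Replication-Hot"
    if intent.rpo_minutes <= 24 * 60:
        return "Snapshot Replication"
    return "Windows Server Backup"


def bind(element: ProtectableElement, intent: ProtectionIntent,
         recovery_service_uri: str) -> ProtectedElement:
    """Apply a binding: a protectable element becomes a protected element."""
    return ProtectedElement(
        name=element.name,
        pe_type=element.pe_type,
        protection_type=choose_protection_type(element.pe_type, intent),
        target_recovery_service_uri=recovery_service_uri,
        destination_uri=intent.destination_preference,
    )
```

Keeping the user's intent separate from the chosen protection type mirrors the slide's split between what the user specifies and what the binding records.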
  • 11. Understanding Protection Namespaces
    • Providing a unified view across the Protection Namespace will prevent fragmented siloed solutions which require a lot of bookkeeping
    • Protection Name Space (Sliced By Node/Clusters)
      – <Source Type, Source Name (URI), Protection Type, Target Recovery Service URI, Status>
      – Node A
        • <Host, ‘A’, Windows Server Backup, Disk Z, Green>
        • <Host, ‘A’, Windows Server Backup, Network Share Z, Green>
      – Node B
        • <Folder, ‘Folder xyz’, Snapshot Replication, Azure Storage, Green>
      – Node C
        • <Volume, ‘Volume a’, Snapshot Replication, Enterprise
      – Node D
        • <VM, ‘VM abc’, Hyper-V Replication-Hot, Target Host Node X, Green>
        • <VM, ‘VM cde’, Snapshot Replication, Azure Storage, Green>
        • <VM, ‘VM fgh’, Hyper-V Replication-Cold, Azure Storage, Green>
      – Node E
        • <SQL DB, DB ‘bcd’, SQL Logging, Green>
      – Cluster E
        • <VM, ‘VM efg’, Snapshot Replication, Yellow>
        • <VM, ‘VM cde’, Hyper-V Replication-Hot, Target Host Cluster G>
      – Cluster F
        • <All VMs protected by Snapshot Replication, Status Green>
      – SAN G
        • <SAN Volumes, Volume A to G, SAN Replication, Target SAN>
      – Node H
        • <Volume,
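As a rough illustration of the unified view, the sketch below rolls per-element status up to the owning node or cluster. The `Entry` type, the `SEVERITY` ordering, and the "Red" status are assumptions made for the example; the slide itself only shows Green and Yellow.

```python
from typing import Dict, Iterable, NamedTuple


class Entry(NamedTuple):
    owner: str            # node, cluster, or SAN that the slice is keyed by
    source_type: str
    source_name: str
    protection_type: str
    target: str
    status: str


# Assumed ordering; "Red" is hypothetical and does not appear on the slide.
SEVERITY = {"Green": 0, "Yellow": 1, "Red": 2}


def worst_status_by_owner(entries: Iterable[Entry]) -> Dict[str, str]:
    """Return the most severe status seen for each node or cluster."""
    worst: Dict[str, str] = {}
    for entry in entries:
        current = worst.get(entry.owner, "Green")
        if SEVERITY[entry.status] >= SEVERITY[current]:
            worst[entry.owner] = entry.status
    return worst


ENTRIES = [
    Entry("Node A", "Host", "A", "Windows Server Backup", "Disk Z", "Green"),
    Entry("Node D", "VM", "VM abc", "Hyper-V Replication-Hot", "Target Host Node X", "Green"),
    Entry("Node D", "VM", "VM cde", "Snapshot Replication", "Azure Storage", "Green"),
    Entry("Cluster E", "VM", "VM efg", "Snapshot Replication", "", "Yellow"),
]
```

With one flat list of entries, the same data could equally be sliced by application or by protection type, which is the bookkeeping a single namespace is meant to avoid duplicating.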
  • 12. Protection Namespaces and Apps
    • Protection Name Space (Sliced By Application) <Protection Type, Schedule, Status>
      – Application XYZ (Replace with real life example, Hrweb?)
        • Web Tier
          – Node A
          – Node B
          – Node C
            » <Windows Server Backup, Once in 15 days, 3 Recovery Points>
        • Middle Tier
          – Node E
          – Node F
            » <Windows Server Backup, Once in 15 days, 3 Recovery Points>
        • Data Tier
          – Cluster VMs
          – <Hyper-V Replication, 5 minutes, 15 Recovery Points>
    • Protection Name Space Chaining
      – <SQL DB A, SQL Synchronous Replication, SQL DB B>
      – <SQL DB B, SQL Asynchronous Replication, SQL DB C>
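Chaining can be read as a list of <source, replication type, target> links that are followed hop by hop. The sketch below assumes each source has at most one outgoing link; the `resolve_chain` name and the cycle check are illustrative additions, not something the deck specifies.

```python
from typing import List, Tuple

Link = Tuple[str, str, str]  # (source, replication type, target)

LINKS: List[Link] = [
    ("SQL DB A", "SQL Synchronous Replication", "SQL DB B"),
    ("SQL DB B", "SQL Asynchronous Replication", "SQL DB C"),
]


def resolve_chain(links: List[Link], start: str) -> List[Link]:
    """Follow links from `start` until an element has no outgoing link."""
    by_source = {src: (how, dst) for src, how, dst in links}
    chain: List[Link] = []
    seen = {start}
    current = start
    while current in by_source:
        how, dst = by_source[current]
        if dst in seen:
            raise ValueError(f"replication cycle detected at {dst!r}")
        chain.append((current, how, dst))
        seen.add(dst)
        current = dst
    return chain
```

Calling `resolve_chain(LINKS, "SQL DB A")` walks both hops, so a monitoring view could report that SQL DB C is the final copy of SQL DB A.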
  • 13. Protection Namespace • Single Server – Local Protection Namespace handled by the Local Recovery Service – Should we allow Publishing – Optionally, can publish to the Recovery Service or can publish to the Recovery Service(Azure) • Do we really need a protection namespace? – Is it beneficial to the Application Owner/Administrator? – Is it beneficial to the Hoster (Fabric/Service Provider)? – Is it beneficial to the Fabric Administrator within a Data Center? 13
  • 14. Application Migration and Protection Namespace • What is the relationship of the Application Migration to Protection/Recovery Service • Catalog/VSS Writer could aid in Application Discovery? – Unknown • Commonalities – Application Migration Equals “IR + Simplified Recovery + Hydration” – Failover to cloud in case of Disaster Recovery is essentially equivalent of Application Migration in terms of requirements around the ambience required by the application • A migrated application may need a VPN • An application failing over to the cloud may need a VPN to be configured – In the long term, what do we as a Protection/Recovery Team need to do to ensure that the application can be protected appropriately as it is migrated 14
  • 15. Three segments • Enterprise • Cloud – Azure Services • Hybrid – A unified namespace across the enterprise and cloud 15
  • 16. Management Tasks for the Protection Namespace • The success of the entire solution depends not only on plumbing but the ease with which data protection needs of a customer can be met • There are plenty of significant Management Tasks – Management of Protection NameSpace • Protection Name Spaces – Contain Protected Elements » Application, VM, Volumes, Collection of Volumes – Hierarchical, Nested » By Location(Site), By Cluster, By Node » By Application – Overlapping ? – Management of Protection Policies – Simplifying Policies » Stock Policies ? » Based on Intent and Calibration ? – Provisioning for Replication – Driving the underlying replication – Monitoring the Status of Protection – Given the policies and the namespaces, alert if things are not on schedule – Orchestration for Recovery of Data • Indexing and Cataloging for Efficient Retrieval • Orchestration of Recovery – Management for Disaster Recovery • Reserving Fabric • Testing of Fail Over and Fail back to Primary Site – Orchestration of Hydration – Management of Storage, Bandwidth, Fabric, Networking for all of the above 16
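The monitoring task above ("alert if things are not on schedule") could look roughly like the following. The function name, the dictionary shapes, and the grace period are assumptions for illustration; the schedule values in the usage note echo the application example in the deck.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def overdue_elements(last_success: Dict[str, datetime],
                     schedule: Dict[str, timedelta],
                     now: datetime,
                     grace: timedelta = timedelta(0)) -> List[str]:
    """Return protected elements whose latest recovery point is older than
    their schedule allows, or that have no recovery point at all."""
    alerts = []
    for name, interval in schedule.items():
        last = last_success.get(name)
        if last is None or now - last > interval + grace:
            alerts.append(name)
    return sorted(alerts)
```

For example, with a schedule of `{"Web Tier": timedelta(days=15), "Data Tier": timedelta(minutes=5)}`, a data tier whose last recovery point is 20 minutes old would be reported while a web tier backed up 11 days ago would not.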
  • 17. VMs and Our Scenarios • Why do VMs require backup? – VM corruption? – Guest Level Backup vs VM Level Backup • However, it is important to understand where guest level replication makes more sense • Where does a combination of guest and VM level backup make sense? • VMs lend themselves more easily to migration – DR drives virtualization, as DR requires migration • Definition #1 (For Azure?) – Cold Backup • Medium to Low Recovery Time, Low Data Loss Tolerance – Hot Backup • Low Recovery Time + Low Data Loss Tolerance • Definition #2 (Applicable for Private Clouds) – Cold Backup • Low Data Loss Tolerance, Fabric is not reserved, Recovery may get long, however resources may be shared more effectively – Hot Backup • Low Recovery Time, Low Data Loss Tolerance, Fabric is reserved 17
  • 18. Physical Model (Enterprise Only) [Diagram. Locations: Site X, Site Y, Recovery Cloud (Private). Nodes and storage: Protected Node, Protected Node (NAS), Protected Node (Hyper-V Host), Protected Cluster, Recovery Node (Hyper-V Host), VM, Application, DAS, SAN, CSV Volumes, Cloud Storage. Data: Protection MetaData, Protected Data (possibly SAN, possibly NAS), Archived Data. Services: Protection Service, Recovery Service, Hydration Service, Subscription Service, Policy/Monitoring Service, DR Service, Archival and Reporting Service, Catalog Service, Storage Service, Fabric Mgmt Service, Cloud Service (Self Service). Flows: Hot Backup, Cold Backup, App Migration, Server Backup, Large Storage Backup.] 18
  • 19. Physical Model (Direct to Cloud) [Diagram. Locations: Site X, Site Y, Production Cloud (Private), Recovery Cloud (IaaS). Cloud Storage: Azure Blobs for data, SQL Azure for metadata. Nodes and storage: Protected Node, Protected Node (NAS), Protected Node (Hyper-V Host), Protected Cluster, Recovery Node (Hyper-V Host), VM, Application, DAS, SAN, CSV Volumes. Services: Protection Service, Recovery Service, Hydration Service, Policy/Monitoring Service, DR Service, Archival and Reporting Service, Storage Service, Fabric Mgmt Service, Cloud Service (Self Service). Flows: Hot Backup, Cold Backup, Large Storage Backup, Hydration, VM Hydration.] 19
  • 20. Windows Server Logical Architecture [Diagram. Portals: Recovery Service Portal, Migration Service Portal. Services: Recovery Service (Web Tier), Recovery Service (Data Tier), Recovery Service (Job Service), Protection Service, Retrieval Service, Catalog Service, Disaster Recovery Service, App Migration Service, Hydration Service, Storage Service (Protection Data), Infrastructure Services (Subscription (tenancy), Transport, Jobs, Networking), Data Post Processing Roles. Recovery side: Recovery Providers, Off line Recovery Providers. Client side: Local Recovery Service, Replication Service, Replication Provider, Replication Provider (Snapshot), Replication Provider (Hyper-V), VSS Providers, Storage Provider (Hyper-V R), Storage Provider (Modified VHD Writer), Xport Provider (File Write), Catalog Providers.] 20
  • 21. Windows Server Hyper-V Example (Enterprise DR) [Diagram. Portals: Recovery Service Portal, Migration Service Portal. Services: Recovery Service (Web Tier), Protection Service (Data Tier), Protection Service (Job Service), Enterprise Storage Service (Protection Data), Data Post Processing Role. Client side (Windows Server): Local Recovery Service, Replication Service, Replication Provider, Replication Provider (Snapshot), Replication Provider (Hyper-V), VSS Providers, Xport Provider (Hyper-V R VM), Xport Provider (Modified VHD Writer), Xport Provider (File Write), Catalog Providers.] 21
  • 22. Windows Server Hyper-V R To Cloud Example [Diagram. Portals: Recovery Service Portal, Migration Service Portal. Services: Recovery Service (Web Tier), Protection Service (Data Tier), Recovery Service (Job Service), Azure Storage (Protection Data), Data Post Processing Role. Client side (Hyper-V): Local Recovery Service, Replication Service, Replication Provider, Replication Provider (Snapshot), Replication Provider (Hyper-V), VSS Providers, Xport Provider (Hyper-V R Cloud), Xport Provider (Modified VHD Writer), Xport Provider (File Write), Catalog Providers.] 22
  • 24. Windows Server Backup Example (?) 24
  • 27. Fine Grained Recovery From Hyper-V 27
  • 28. • What other examples do we need 28
  • 29. Capabilities of Components
    • Replication Provider
      – Capability Profile
        • Supported Protected Element Types
        • Min Data Loss Tolerance Window
        • Max Data Loss Tolerance Window
        • Application Consistency Support
      – Requirement Profile
        • Require Off-site Post Processing (should this be an Xport Provider requirement?)
    • Recovery Service Profile
      – Capability Profile (are these per Protected Element Type?)
        • Recovery Time
        • Recovery Points in Time
        • Retention Time
        • Encryption At Rest
        • Supported Offline Recovery Providers
    • Storage/Xport Provider Profile
      – Capability Profile
        • Client Side Encryption
        • Which Recovery Service are they affiliated to?
          – Recovery Service in Cloud: storage is in cloud
          – Recovery Service in Enterprise (DPM vNext): storage is in Enterprise
          – Recovery Service in another node or cluster: data is stored in storage local to that node/cluster
    • Is there a notion of a Recovery Mgmt service that other Recovery Services can rely on for keeping their metadata and cataloging?
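One way to use a replication provider's capability profile is to filter providers against a requested data loss window. The sketch below is an assumption-laden illustration: the field names, the matching rule (requested window must fall between the provider's min and max), and the two sample providers with their numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class ReplicationProviderProfile:
    name: str
    supported_pe_types: FrozenSet[str]
    min_data_loss_window_minutes: int
    max_data_loss_window_minutes: int
    application_consistent: bool


def matching_providers(providers: Iterable[ReplicationProviderProfile],
                       pe_type: str,
                       data_loss_window_minutes: int,
                       need_app_consistency: bool) -> List[str]:
    """Names of providers whose capability profile covers the request."""
    return [
        p.name
        for p in providers
        if pe_type in p.supported_pe_types
        and p.min_data_loss_window_minutes
        <= data_loss_window_minutes
        <= p.max_data_loss_window_minutes
        and (p.application_consistent or not need_app_consistency)
    ]


# Sample profiles with made-up numbers, for illustration only.
PROVIDERS = [
    ReplicationProviderProfile("hyperv", frozenset({"VM"}), 5, 60, False),
    ReplicationProviderProfile(
        "snapshot", frozenset({"VM", "Volume", "Folder"}), 60, 1440, True),
]
```

A protection service could run this kind of filter when a binding is created, then pick among the remaining providers using the recovery service and storage profiles.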
  • 30. Major Components • Subscription Service : Create a Protection Name Space for a given customer • Protection Service : Allows creation of Protected Elements within a protection namespace • Catalog Service : Provides for creation of a catalog for protected elements for a given protection namespace • Recovery Service : Allows recovery of data for a protected element in a protection name space • Hydration Service : Uses the recovery service to hydrate VMs in a private cloud or to Azure • Job Service : Performs long running tasks submitted by the main services and provides the infrastructure to monitor their progress • Data Post Processing Roles : A replication provider can register a data post processing role to process data before it is stored 30
  • 32. Components (Client Side) • Recovery Service (Agent) : Manages/orchestrates the processes involved in providing protection for a protected element and associates it with a recovery service. • Replication Service : Provides the framework/platform for different replication providers to plug in • Replication Provider • Xport Provider • Catalog Provider • VSS Writers
  • 33. Benefits • Sets the Framework for a unified namespace for Backup/DR/HA • Create a Hoster Friendly Stack – Hosters should want to deploy our stack in their datacenter to provide value added offerings – Retain a model where Hosters can also easily leverage Azure resources for their recovery scenarios – Need to understand what kinds of extensibilities they would need beyond building their own portal – Over time we have a mostly unified codebase written to the service model 33
  • 34. Roadmap/Next Steps • Next steps – Build a roadmap, possibly multi-release, to get there • Vteams to discuss and iterate over this 34
  • 35. Plausible Roadmap • V Next – Build the Protection Mgmt service for the Azure segment (Protect on Azure) • Align with Tofino • Notion of application definition or service template – How do we leverage and align with that? – Evolve DPM to be the protection mgmt service for the Enterprises/Hosters • Adopt the OBS/Service architecture that supports multi-tenancy • Be the platform of choice for hosters to adopt to provide data protection services to their customers • Ensure that it works seamlessly with OBS service to provide geo-protection using the Azure Cloud Storage • V Next Next – Figure out the evolution of components to serve the hybrid cloud or the combined namespace 35
  • 36. The Data Replication Problem • Limiting Factors – Throughput at the sending and the receiving side – Storage at the processing side • Consists of the following parts – IR, Change Tracking and Data Movement – Catalog • IR – Can we avoid/circumvent the problem by the use of published well known images? • Change Tracking – Data must be self descriptive • Data Movement – Channel • Must Implement Push and Pull • Selection of EndPoint Listener – Azure Replication Storage Service (cloud backup for VMs) – Private Cloud Replication Storage Service – Hyper-v Host Replication Listener (hot backup) • Negotiate for compression • Encryption on Wire • Support Throttling • Catalog 36
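The channel requirements on this slide (push and pull, compression negotiation, encryption on the wire, throttling) can be illustrated with a small handshake. This is a minimal sketch, not the actual protocol: `ChannelOffer`, `negotiate`, the codec names and the rule that encryption is enabled only when both ends support it are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelOffer:
    """Capabilities one end of the replication channel advertises (hypothetical)."""
    modes: frozenset          # subset of {"push", "pull"}
    compression: tuple        # codec names in preference order; "none" = uncompressed
    encryption_on_wire: bool  # whether this end can encrypt the channel
    max_bytes_per_sec: int    # throttle ceiling this end will accept


def negotiate(sender: ChannelOffer, listener: ChannelOffer) -> dict:
    """Pick settings both ends support; raise if they share no transfer mode."""
    modes = sender.modes & listener.modes
    if not modes:
        raise ValueError("no common transfer mode")
    # First codec in the sender's preference order that the listener also supports.
    codec = next((c for c in sender.compression if c in listener.compression), "none")
    return {
        # Assumption: prefer push when both modes are available.
        "mode": "push" if "push" in modes else "pull",
        "compression": codec,
        # Assumption: encrypt only when both ends are able to.
        "encryption_on_wire": sender.encryption_on_wire and listener.encryption_on_wire,
        # The stricter throttle wins.
        "max_bytes_per_sec": min(sender.max_bytes_per_sec, listener.max_bytes_per_sec),
    }
```

A source that prefers `lz4` talking to a listener that only offers `gzip` would settle on `gzip`, and the lower of the two throttle ceilings would apply.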
  • 37. A Layered Architecture ? Possible? Description Responsibilities (Replication Layer) Responsibilities (Data Protection Layer) Pros/Cons Extensibility at Source Replication Layer Solely focused on change Tracking 1. Enable Change Tracking 2. Notification of handlers for safe transmission and persistence of data 1. Authentication 2. Transmission Format 3. Provide the acks required as per the replication protocol Extensibility at the listener 1. Change Tracking 2. Provide a set of listeners at the destination end of the channel 3. Authentication for the channel 4. Formats of Transmission Two Models here : a) Data Protection Layer controls persistence b) Data Protection Layer preps the storage and the replication layers writes directly to the storage Transmit change data to a Specified Listener The entities at the two end of the channel agree to protocol 37
  • 38. Replication Provider Profile • Min Data Loss Tolerance Window • Max Data Loss Tolerance Window • Application Consistency Support • Recovery Time – This depends more on the state in which the most up-to-date copy is kept • Recovery Points in Time – To some extent this is not a capability of the provider but a limit imposed by the storage or driven by requirements • Retention Time – Not really a capability of the provider 38
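One way to picture the profile is as a small record that a provider publishes when it registers. The sketch below is illustrative only: the class name, field names and the `can_meet` rule are assumptions. It models only the first three items, since the slide itself notes that recovery time, recovery points and retention are driven more by storage and requirements than by the provider.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class ReplicationProviderProfile:
    """Capabilities a replication provider advertises (hypothetical shape)."""
    name: str
    min_data_loss_window: timedelta   # tightest window the provider can sustain
    max_data_loss_window: timedelta   # loosest window it can be configured for
    application_consistent: bool      # whether it can produce app-consistent copies

    def can_meet(self, tolerated_loss: timedelta) -> bool:
        # Assumption: a provider qualifies if its tightest window is no larger
        # than the data loss the customer is willing to tolerate.
        return self.min_data_loss_window <= tolerated_loss
```

Under this assumption, a provider whose tightest window is 15 minutes qualifies for a one-hour tolerance but not for a one-minute tolerance.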
  • 39. Requirements for Coupling Replication Providers and Storage Providers 39
  • 40. Basic Interaction • User tells the Mgmt Layer the source he needs to protect, specifies the SLAs(Data Loss Tolerance, RPO, RTO and Retention Requirements) • Mgmt Layer queries the Replication Service – Replication Service queries the replication providers which have registered with it – Returns the provider – Mgmt Layer will provide the choice to the users • Mgmt will ask the Replication Service to configure for replication with the user’s choice • There is an initial handshake with the listener endpoint where queries for storage are negotiated 40
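The interaction above can be sketched end to end: providers register with the Replication Service, the Mgmt Layer queries it with the user's SLA, the user picks from the returned candidates, and the service is asked to configure replication. All names here (`Sla`, `Provider`, `ReplicationService`, the provider names in the usage note) are hypothetical. Only the data loss tolerance and application consistency parts of the SLA are modelled, and the handshake with the listener endpoint is left out.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Sla:
    data_loss_tolerance: timedelta
    needs_app_consistency: bool


@dataclass(frozen=True)
class Provider:
    name: str
    min_data_loss_window: timedelta
    application_consistent: bool


class ReplicationService:
    """Keeps the registered providers and the per-source configuration."""

    def __init__(self):
        self._providers = []
        self.configured = {}  # source name -> chosen provider name

    def register(self, provider: Provider) -> None:
        self._providers.append(provider)

    def query(self, sla: Sla) -> list:
        # Return every registered provider that can satisfy the SLA.
        return [
            p for p in self._providers
            if p.min_data_loss_window <= sla.data_loss_tolerance
            and (p.application_consistent or not sla.needs_app_consistency)
        ]

    def configure(self, source: str, provider: Provider) -> None:
        # Called by the Mgmt Layer once the user has made a choice.
        self.configured[source] = provider.name
```

For example, with a log-based provider (seconds of loss, not application consistent) and a snapshot-based provider (15 minutes, application consistent) registered, an SLA of 30 minutes with application consistency would return only the snapshot-based one.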
  • 42. Windows 8 Storage Investments • Windows Storage Pools : Storage virtualization over commodity disks but providing advanced capabilities – Spaces : Virtual disks created off of storage pools • Offloaded Data Transfer – Copy is offloaded to the intelligent storage array • SMB Scaleout – SMB Direct : Clients need a NIC with RDMA capability – SMB Multipath : Adds robustness – SMB VSS for Remote File Shares • CSV – – Available for application workloads, integrated with storage pools, thin provisioning, smb scale out, support for fully featured VSS • Data De-duplication – On server : how does dedup compare to our compression – On Host : DPM 2012 can handle deduped 42
  • 43. Replication Comparison • Hyper-v Replication – Provides low data loss tolerance and write order consistency – Depends on MSCS clustering • Not very resilient to primary host failure (Will require resync) • Not very resilient to replica Failure • Buffers will overflow, Doesn’t have log folding – Doesn’t separate Staging of VMs from data storage • Replica Server may be receiving data for some VMs and at the same time hosting a VM that has failed over – How will it leverage storage deduplication? • Snapshotting and USN based File Tracking Mechanisms – USN based file change tracking mechanisms coupled with volume snapshotting help extract the changes between two snapshots – File System Filter Driver helps track the file blocks that have changed – Resyncs are required if tracking is upset – More resilient to DPM server outage – Snapshotting on the receiving side is a blocker for scale --- how many concurrent vss snapshots can a server perform across different volumes? • Chained Snapshotting helps utilize epoch based recovery • Each snapshot representing an epoch • Data Loss Tolerance – For Hyper-v, scsi writes are copied into a buffered log pretty much continuously – For DPM, copy on write is enabled during the interval that buffered copies happen • So, – How low can we squeeze the data loss tolerance with DPM? – How high can we squeeze the data loss window with hyper-v R? – We need instrumentation of data, ideally we should be able to compare the same workload – We can calibrate the workload and intent and choose….but then • What happens when the workload changes? 43
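The idea of extracting changes between two snapshots can be shown with a toy model in which each snapshot records, per file, the last update sequence number (USN) seen for that file. This is a simplified sketch, not how DPM or the NTFS change journal is actually read: the `{path: usn}` maps and the `changed_files` helper are assumptions made for illustration.

```python
def changed_files(prev_snapshot: dict, curr_snapshot: dict) -> dict:
    """Compare two {path: last_usn} maps taken at consecutive snapshots."""
    added = sorted(p for p in curr_snapshot if p not in prev_snapshot)
    deleted = sorted(p for p in prev_snapshot if p not in curr_snapshot)
    # A file is treated as modified when its USN advanced between snapshots.
    modified = sorted(
        p for p in curr_snapshot
        if p in prev_snapshot and curr_snapshot[p] > prev_snapshot[p]
    )
    return {"added": added, "deleted": deleted, "modified": modified}
```

Only the files listed in the result would need block-level inspection and transfer. If the tracking state is lost, there is no trustworthy previous map to compare against, which is why a resync is required.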
  • 44. Catalog • Catalog – Historical, Tells what the high level contents of a backup are – This essentially provides for browsability before full recovery is undertaken • The meta data for the structure/high level contents of an application structure is a part of the data associated with a certain recovery point however the catalog can help you with identification of which recovery point may have the data of interest – Can the catalog information be handed down the VSS snapshot process • We expect the catalog to be tree structured • This can be huge for a large application • In such cases, can the applications be responsible for keeping an up-to-date catalog? 44
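Since the catalog is expected to be tree structured and to help identify which recovery point holds the data of interest, a minimal sketch might look like the following. `CatalogNode`, `recovery_points_with`, the slash-separated paths and the recovery point labels are all hypothetical; a real catalog for a large application would need a persistent, indexed store rather than in-memory dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogNode:
    """One node in a tree-structured catalog of a backup's high level contents."""
    name: str
    children: dict = field(default_factory=dict)

    def add(self, path: str) -> "CatalogNode":
        # Create any missing nodes along a slash-separated path.
        node = self
        for part in path.split("/"):
            node = node.children.setdefault(part, CatalogNode(part))
        return node

    def contains(self, path: str) -> bool:
        node = self
        for part in path.split("/"):
            node = node.children.get(part)
            if node is None:
                return False
        return True


def recovery_points_with(catalogs: dict, path: str) -> list:
    """Return the recovery point labels whose catalog lists the given path."""
    return sorted(rp for rp, root in catalogs.items() if root.contains(path))
```

Browsing then becomes a lookup over the catalogs, without touching the backup data itself until a recovery point has been chosen.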
  • 45. Replicated Content Format • DPM stores the content uncompressed/unencrypted and uses VSS snapshots as a mechanism to create point-in-time copies • Hyper-v R supports VHD 2.0, data is not encrypted at rest but may be encrypted for transmission, data is not compressed • OBS supports a modified VHD 1.0 (meta data is vhd 1.0, blocks are compressed and encrypted at rest) • We are doing some tests on how much extraction, encryption and decompression add to the recovery time 45

Editor's Notes

  1. Work in progress. The intent is to collect feedback, hear your views and finally help drive this process.
  2. I started by looking at the replication technologies used by the various teams, bottom up, but realized that in order to make any judgment/decision, I need: the context in which these need to be evaluated, and tons of instrumentation and perf data.
  3. PMs will drive a comprehensive scenario review. This is at a very high level. All of these scenarios have the following things in common : First, “Data needs to be moved offsite”. We will refer to this as Replication. The replication channel needs to move data effectively given bandwidth and storage constraints. It can be given hints if the scenarios and settings can help. Second, “Data needs to be tracked so that only changes are sent”. We will refer to this as change tracking. Third, “Data needs to be stored”. Fourthly, this data needs to be stamped for cataloging and indexed for quick retrieval. We will refer to this as Meta Data Management. The whole set of processes to accomplish these needs orchestration. The entire mgmt solution needs to be highly available and depending on the scale of the customer, scalable as well.
  4. “Azure Recovery Service”
  5. Thanks to Vijay and Anand for providing this background!
  6. This slide obviously needs
  7. These are gaps that I see mostly from a management perspective. Lack of a single unified protection namespace causes non optimal data movements ! It is important to understand this. I will drill into the protection namespace concept a little bit… This may be a controversial statement and I would like all your inputs/feedback.
  8. A single protection namespace for managing data movements related to backup/dr and ha(geo clustering) is being proposed.
  9. Each Subservice has its own namespace today. It is hard to tell the multiple layers of data movement that are happening around an entity from a single pane. Slice by Physical Location of Protected Element (PE); Slice by Domain of PE; Slice by Type of PE. What could be the types: SAN Replication, VSS Based Snapshotting, VM Change Logging.
  10. Web Tier : Doesn’t need to be backed that often, backed for application install and configuration
  11. Each of these topics requires a drill down
  12. Not sure these definitions will be required --- but they are useful in terms of evaluating cogs. If the cogs for hot back ups are significantly higher and the recovery times provided by hot backup are still within 3 to 4 hours, would customers want to have a hot backup on cloud? Is it worth engineering for that? Tier 0 service migration is not a priority? Hot backup as implemented by hyper-v has host affinity. It is not as robust in the face of host failures.
  13. This picture looks very much like the Protected or DPM DR’d picture. Note it doesn’t show agents as yet. It doesn’t show direct app migration to the host of interest. It doesn’t yet show the windows consumer backup (do we care?). However, note that the service at site X can be made more manageable, more scalable and more DPM has a federated console. This picture is suggesting the possibility of a single namespace for protection of critical assets within an enterprise, managed by a single entity. 2) Note, I have shown DR service as a component of the BBCDR (ha, ha) service. DR service maintains all the hotbackup and cold backup policies, reserves storage through fabric mgmt/cloud service, helps test failovers and failbacks by using the hydration service as necessary. Currently we are building a cloud version of DR, but I see this as being symmetric for a large enterprise. This raises the important question: is building for cloud, and not an on-premise version, the right thing to do in some cases? 3) Simply replace Site Y with cloud. This represents the DPM to cloud configuration that we are building today. 4) Left Hand side shows protected elements: Applications, Storage affiliated with a NAS box (Predictions show large and growing amounts of data on NAS) -> what does this mean for us in terms of replication, VMs (whose vhd files can reside on a SAN or CSV or NAS), VMs. 5) Note that for aggregated large volumes, the benefits of deduplication can be many fold.
  14. This picture looks very much like the Protected or DPM DR’d picture. 1) However, note that the service at site X can be made more manageable, more scalable and more DPM has a federated console. This picture is suggesting the possibility of a single namespace for protection of critical assets within an enterprise, managed by a single entity. 2) Note, I have shown DR service as a component of the BBCDR (ha, ha) service. DR service maintains all the hotbackup and cold backup policies, reserves storage through fabric mgmt/cloud service, helps test failovers and failbacks by using the hydration service as necessary. Currently we are building a cloud version of DR, but I see this as being symmetric for a large enterprise. Raises the important question, is in some cases build for cloud and not an on-premise version, the right thing to do. BTW, I believe it is the right approach… 3) Simply replace Site Y, with cloud. This represents the DPM to cloud configuration that we are building today. 4) Left Hand side shows protected elements Applications, Storage affiliated with a NAS box VMs (whose vhd files can reside on a SAN or CSV or NAS) VMs 5) How much time does recovery from cold backup really cost over the hot backup case? What are the cogs? How does snapshotting compare with logging on large clusters (64 nodes). 4) Note that for aggregated large volumes, the benefits of deduplication can be many fold.
  15. Should we call this the Protection Service or the Recovery Service ???? I haven’t shown the cloud version --- but at this point you can draw the obvious conclusions 1) Which replication providers are significant? We have some that are existing. 2) Can we collapse all the existing agents/services into a single extensible service that ships in the box? 3) What is the construct for tenancy in an enterprise? That becomes a grouping mechanism for the sake of billing and metering and for delegated administration in some cases. 4) Enterprise storage service will be aligned with windows 8/9 storage enhancements and make full use of those.
  16. Recovery service is where this binding is published. This is akin to what DR is doing today for private cloud.
  17. Need to flesh out the sequence on the client: User enables Hyper-V Replication to Cloud. Which UI? What invokes the appropriate transport provider to get coupled? How does the log get flattened and stored in the Azure Store? How does the user perform failover of the workload? Which UI?
  18. The added data post processing role should be seen as a COG. Can the jobs performed by the job service be billed for the customer depending on the compute used?
  19. How does the enterprise work with the cloud stack?
  20. V next : Build a
  21. The limiting factors are: a) network throughput at both the sending side and the receiving side; b) storage at the sending side, but more so at the receiving side.
  22. SNIA Replication Profile – Which models does it support? Krishan and the intern will look at this in more detail.
  23. Does hyper-v over smb remove the need for cluster broker?
  24. Work in progress.