2. Read me (Remove when presenting)
• This is a draft document
– Reviews are required by Penton and Metalogix
– Review each page and notes
– Edit as you see fit and highlight change in RED
• The presenter has 15-20 minutes to present
• The presentation contains 15-20 slides to
meet time slot
3. Abstract (Remove when presenting)
With SharePoint becoming more entrenched in
organizations, its importance to the business has
increased significantly. SharePoint might not be directly
linked to revenue generation in your organization but
it’s most likely become a tool that people use daily and
when there is a failure or data loss, you hear about it.
This webinar will provide you with a holistic view of
SharePoint backup and restore with a focus on key
subjects that must be covered in order to plan, design,
implement and manage a SharePoint backup and
restore solution that meets requirements and is
sustainable.
4. BIO
Ron Charity
A published technologist with 20+ years in
infrastructure and application consulting.
Experience working in the US, Canada,
Australia and Europe. Has worked with
SharePoint and related technologies since
2000.
Plays guitar in a band, rides a Harley
Nightster, owns a Superbird, and enjoys
travel, especially to beach destinations.
5. Agenda
• Scoping and alignment
• Stakeholder Requirements and SLAs
• Corporate policy / roadblocks
• Information architecture considerations
• Technical components
• Operational components
• Quality assurance, POCs and testing
• Training and Awareness
• Proof of Concept/Pilot
• Further Reading and Contact information
6.
7.
8. Stakeholder Requirements
• So where do you begin?
• To fully understand requirements and expectations, you must
reach out to the business and all the IT stakeholders, including those who:
–Use SharePoint as a tool daily for collaboration
–Run applications (or third-party components) on top of SharePoint
–Sustain related infrastructure (IT persons)
–Work for third-party support organizations
• Questions for them include:
–Are clients impacted?
–Is there an impact to business operations?
–Is the Brand impacted?
–Is there a compliance / records mgmt. impact?
9. Stakeholder Requirements cont’d
• Are there outsourcing contracts associated with Backup and
Restore? Related infrastructure?
• What Backup and Restore tools are in place today? Do they
have SharePoint support?
• What Backup and Restore infrastructure is in place today?
• What skills are in place today related to Backup and Restore?
SharePoint and SQL?
• Are there constraints with the IT environment today? Network
bandwidths? Storage? Tape Libraries?
• What are the existing backup rotation schedules and windows?
• Where are the SharePoint farms located today? What is their
configuration? How much data?
10. Defining SLAs
• Defining the SLAs will require a mix of technical skills, financial
skills and political savvy
• The challenge is creating a solution that addresses business
and financial needs yet works within business and technical
limitations
–What is (isn’t) backed up and why (Think RTO/RPO)
–Data restore performance and administration implications
–Backup speed performance as it relates to capacity plans
–IT, Site Administrator and end user responsibilities
–The process for provisioning backup and restore
–The process for recovering data
11. Components of a solution
• Policy
–Application tier policy
–Data management policy
–Security policy
–Records Management
–Third Party contracts
• Process
–Backup and recovery process
–Tasks, activities and hand offs
–Farm rebuild process and
testing
–Operational ticketing,
reporting and escalations
• People
–Procurement
–Product management
–Staffing and skillsets
–Support and outsourcing
–Politics
• Tools
–Backup software
–Help Desk software
–Monitoring software
–Change management
–Code management
–Configuration management
12. Information Architecture
• As farms grow and the number of sites increases, it can be
advantageous to structure site collections by value to the
organization
• For example
–Critical sites – those critical to business operations are
placed in a site collection(s) separate from normal sites due
to their SLA being higher
–Normal sites are those that don’t impact business
operations and their SLA is lower
• This enables operations to
–Manage backups and restores more easily
–Reduce backup windows and/or stagger jobs
13. Technical Architecture
• Generally the backup architecture consists of
–The SharePoint farm (and Backup agents installed on the
WFEs and SQL servers).
–Staging farm (usually a single server).
–Storage (a location for disk backups) and tape backup
systems.
–Backup operator console.
• Some backup tools are very complex and require multiple
servers for web, business logic and storage to manage backup
images, deduplication, backup job scheduling and logs.
• Also consider all the networking infrastructure between your
SharePoint farms and backup systems – this can be a
constraint.
15. Backup approaches
•SQL Server Backups
–Not application aware
–Not granular – content database level
–Generally faster but space and pipe intensive
–Leverage existing SQL investments
•SharePoint
–Application aware
–Granular control of backup and restore
–Generally slower but optimizes space and pipe
–Requires infrastructure
16. Slip Streaming and WSPs
•Optimize your farm rebuild.
•Document farm, server etc. rebuild processes.
•If you have customized your farm consider slip
streaming your SharePoint installation.
•Make sure your customizations are packaged in WSPs
(Hopefully your developers created WSPs to
automate installation).
•Test the process on a regular basis.
17. Policy and Process
•Document and publish all policy and process
•Make sure task ownership and hand-offs are
clear and agreed to by senior mgmt.
•Change control to govern changes to
environment
•Important items include:
–How-to manual
–Help desk call routing
–Communications plan
–Service levels
18. Training and Awareness
•Training involves a mix of orientations for
business stakeholders and IT staff
•Operational and service level aspects must be
highlighted
•Specific training includes:
–Help desk call routing
–Operator training
–Business stakeholder orientations
–Ongoing reinforcement
19. Test plan
•Contains key tests for validating the backup will
meet SLA.
•Consists of initial solution certification testing
and ongoing testing of correct operation.
•Plan must be documented and signed off by all
stakeholders
–Business units
–IT department(s)
–Third parties (Contractors, service providers)
20. Test plan
Test | Expected Outcome | Actual Outcomes | Pass/Fail
Farm recovery | Ability to recover farm end to end. | Recoverability; identify any gaps; time required | Yes or no
Servers | Ability to recover each server individually (e.g. WFE fails) | Same as above | Yes or no
Site Collection / Application | Ability to recover a site collection and associated application | Same as above | Yes or no
Site | Recover a site and its associated settings | Same as above | Yes or no
Lists and libraries | Ability to recover a Web Part and its associated settings | Same as above | Yes or no
Data | Ability to recover Data such as documents, pictures etc. | Same as above | Yes or no
21. Proof of concepts and Pilots
•POCs and Pilots are controlled, limited
deployments
•POCs and Pilots prove the solution works in
your environment
•They enable you and others to carry out the test
plan, proving the solution works and possibly finding
gaps in the technology and/or stakeholder
expectations
•Use project management controls to manage the
POC and Pilot.
22. Sign off with stakeholders
•Present POC and Pilot findings to stakeholders
•A walk through the test plan results is helpful
•Obtain a sign off from the stakeholders in
writing
•Conduct a demo of the solution if need be
•If there are gaps in the testing or errors that
occurred these must be highlighted and a plan
assembled to address the deficiencies
23. Backup schedule
• Plan your backups carefully to ensure proper
data backup occurs while not overloading
servers due to running multiple jobs
• Consider the following:
– When to run full backups –Weekly?
– When to run incremental backups – daily?
•Monitor and report on duration of backup jobs
•Avoid job overlap such as indexing, profile
imports, virus scans etc.
25. Governance and Communications
•Manages the backup solution (SharePoint
service) as a program throughout its lifecycle
•Consists of stakeholders (business, IT,
purchasing, third parties)
•Executive sponsors help guide and steer
•Decision and communication framework keeps
people aligned
•Also deals with grievances and disconnects
26. Next steps
•Document the SLA for your SharePoint service
offering – get sign off from stakeholders
•Create a business case to obtain funding
•Create project controls for a POC (Charter,
Communication plan, schedule, risk plan, test
plan etc.)
•Build POC environment and carry out tests and
document results
•Review findings and decide on next steps
27. Further Reading
•Create a SharePoint backup and recovery plan
•SharePoint Disaster Recovery Guide
•SharePoint Backup Overview
•SharePoint OOB Backup
•Optimize SQL Server for Backup
•Best Practices
28. Contact Information
• Questions? Ideas or suggestions you want to
share?
• Text chat or contact me at
– roncharity@gmail.com
– ca.linkedin.com/in/ronjcharity/
Editor's Notes
Draft
Version 2.0
Date 10/9/2014
Left blank intentionally
Left blank intentionally
Intentionally left blank
Mental note >> What's the point? Why should they care?
There are many ways to approach this topic
Being allotted 20 minutes, I must briefly touch on important areas
I’m a consultant / architect – take a holistic approach – multiple view points
Your level of success depends what you’re managing to as success criteria
I will be prescriptive throughout the webinar and will be available through email
Lots to cover…
Mental note >> What's the point? Why should they care?
You require a strategy and plan to be truly successful.
Success often depends on specific points of view.
Especially in large organizations without governance or those using outsourced resources.
An executive sponsor is key to your success.
Leverage the team, company policy, stakeholders.
Maneuver carefully around fiefdoms and other politics.
Mental note >> What's the point? Why should they care?
Successful people usually
Have a strategy
Have a solid network
Have some help
Senior sponsor
Think of the it this way…
Coyote as your sponsor and Gorn as all the politics
Mental note >> What's the point? Why should they care?
So where do you begin? To fully understand requirements and expectations you must reach out to the business and all the IT stakeholders. There are two critical goals at play: 1) gathering requirements from the various stakeholders and 2) continuously educating the stakeholders to proactively manage expectations (relationship building). Generally, there are many stakeholders, including people that:
Use SharePoint as a tool daily for collaboration
Run applications (or components) on top of SharePoint
IT persons that sustain SharePoint and related infrastructure
Third-party support organizations
So not only must you be concerned about SharePoint, but also about applications that, for lack of better words, “sit on top of” SharePoint, and about other IT concerns. Where do you start? Interview each stakeholder, and have a set of questions ready for them in advance with an explanation for each question.
For business staff, the following questions are a good start:
Is the data directly linked to revenue generation?
What is the cost per hour?
If lost what is the cost to recreate?
Is the brand impacted?
Is the data directly classed as corporate records?
Who uses the data?
Company staff? Business partners? Clients?
How many people rely on the data?
When do the users access the data?
For IT staff, the following questions are a good start:
Are there outsourcing contracts associated with Backup and Restore? Related infrastructure?
What Backup and Restore tools are in place today? Do they have SharePoint support?
What Backup and Restore infrastructure is in place today?
What skills are in place today related to Backup and Restore? SharePoint and SQL?
Are there constraints with the IT environment today? Network bandwidths? Storage? Tape Libraries?
What are the existing backup rotation schedules and windows?
Where are the SharePoint farms located today? What is their configuration? How much data?
For additional reading refer to the Microsoft planning and requirements workbook you can download (http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=10895 ).
Once the interviews are complete you can document the service level objectives and distribute them for review, edits and final agreement.
Mental note >> What's the point? Why should they care?
Defining the SLAs will require a mix of technical skills, financial skills and political savvy. Generally the technical aspects are well defined by the various backup and restore toolset vendors. Specifically, they have experienced staff and numerous whitepapers that provide comparisons and value statements as well as technical data. The true challenge is creating a solution that addresses business expectations, financing (what is being requested vs. what you can afford) and environmental realities (infrastructure readiness, SharePoint complexities such as customizations, etc.).
In your SLA, you must state the facts regarding the backup and restore service
What is backed up and why (Think RTO/RPO)
What isn’t backed up and why (Think RTO/RPO)
When data is backed up and any performance, change control and administration implications
Data restore performance and administration implications
Backup speed performance as it relates to capacity plans
IT, Site Administrator and end user responsibilities
The process for provisioning backup and restore
The process for recovering data
The SLA must be publicized and reviewed on a regular basis to manage expectations. When provisioning new farms, I suggest that business and IT stakeholders physically sign off on their understanding.
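One way to make the RTO/RPO statements in an SLA concrete is to compare each tier's backup interval against its agreed recovery point objective. The following is a minimal Python sketch under stated assumptions, not part of any backup product: the tier names and hour values are hypothetical, and it treats worst-case data loss as simply the gap between two successive backups.

```python
from dataclasses import dataclass


@dataclass
class SlaTier:
    """One SLA tier; names and numbers are illustrative assumptions."""
    name: str
    rpo_hours: float              # maximum tolerable data loss, in hours
    backup_interval_hours: float  # time between successive backups


def meets_rpo(tier: SlaTier) -> bool:
    """Worst-case data loss is the gap between two successive backups."""
    return tier.backup_interval_hours <= tier.rpo_hours


# Hypothetical tiers for illustration only.
tiers = [
    SlaTier("critical", rpo_hours=4, backup_interval_hours=1),
    SlaTier("normal", rpo_hours=24, backup_interval_hours=24),
    SlaTier("archive", rpo_hours=24, backup_interval_hours=168),
]

for tier in tiers:
    status = "meets RPO" if meets_rpo(tier) else "violates RPO"
    print(f"{tier.name}: {status}")
```

A check like this is only a starting point for the SLA conversation; the real RPO values come from the stakeholder interviews.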
Mental note >> What's the point? Why should they care?
A SharePoint backup and restore solution is a complex topic; to be successful it must address more than the backup tools, which are only the tip of the iceberg. It must also include cost, people, process and policy to make sure the solution meets expectations and is sustainable. These are usually the most complex and least foreseen topics to address.
Backing up and restoring a SharePoint farm is a complex task. Think about what it takes to build a farm from the ground up, and then add a few years of usage (data, customizations, configuration changes, service packs, cumulative updates, etc.).
For example, you have to rebuild the server(s), load Windows Server, load SQL Server and then load SharePoint. Then you have to apply service packs, cumulative updates and customizations, with all the reboots in between during the build.
Specifically, the following provides further explanation:
Policy
What are the current policies for RTO/RPO? Do they differ from SharePoint’s requirements?
What is the data policy? Value of data to the organization? Compliance requirements?
What are the administration policies that must be followed to remain compliant?
What are the security policies regarding data protection and handling?
Process
What backup and restore processes are in place today?
How will the SharePoint backup and restore solution impact SharePoint administration and usage?
What tests must be performed to ensure success?
How will it impact operational windows? What jobs are in place today? How resource-intensive are they?
People
How will the SharePoint backup and restore solution impact current operations staffing? Are outsourcers involved?
How will site owners and users be impacted by backup and recovery? What is their role in the process?
What training and awareness programs are required? Operator and helpdesk training and user awareness training? What tools do I require to support the training and communications?
Tools
What tools best meet our requirements? What infrastructure do we have today? Are there agents available for SharePoint? Do I require a second toolset and supporting infrastructure?
What is the data center footprint? Servers, Network, Storage, Power and A/C
Which vendor(s) do we have agreements with? What services do they provide?
Do we outsource? How will that impact services costs? Initial provisioning costs plus monthly fees?
How will our infrastructure be impacted? More servers? Networking? Storage?
How do I balance capacity with recovery times? SQL database sizes vs. application needs?
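The trade-off between content database size and recovery time can be estimated with simple arithmetic. The sketch below is a rough illustration: the function name, the `transfer_fraction` parameter and the sample figures are all assumptions, and real restores also depend on verification, index rebuilds and other steps that only testing in your environment can measure.

```python
def max_db_size_gb(rto_hours: float,
                   restore_throughput_mb_s: float,
                   transfer_fraction: float = 0.5) -> float:
    """Largest database (GB) restorable within the RTO.

    transfer_fraction is the assumed share of the RTO available for
    moving data; the rest is reserved for non-transfer steps.
    """
    if not 0 < transfer_fraction <= 1:
        raise ValueError("transfer_fraction must be in (0, 1]")
    transfer_seconds = rto_hours * 3600 * transfer_fraction
    return transfer_seconds * restore_throughput_mb_s / 1024


# Hypothetical example: 4-hour RTO, 100 MB/s measured restore throughput.
print(max_db_size_gb(4, 100))  # half of the RTO spent transferring data
```

A figure like this can feed the capacity plan as an upper bound on content database size per SLA tier.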
Mental note >> What's the point? Why should they care?
As farms grow and the number of sites increases, it can be advantageous to structure site collections by value to the organization
For example
Critical sites – those critical to business operations are placed in a site collection(s) separate from normal sites due to their SLA being higher
Normal sites are those that don’t impact business operations and their SLA is lower
This enables operations to
Manage backups and restores more easily
Reduce backup windows and/or stagger jobs
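Grouping site collections by SLA tier can be sketched as a simple inventory that maps each tier to its own backup job. This is a minimal Python illustration; the URLs and tier labels are hypothetical, and a real inventory would come from your farm's configuration.

```python
from collections import defaultdict

# Hypothetical inventory: (site collection URL, SLA tier).
site_collections = [
    ("/sites/finance", "critical"),
    ("/sites/orders", "critical"),
    ("/sites/team-a", "normal"),
    ("/sites/social", "normal"),
]


def group_by_tier(collections):
    """Return {tier: [site collection URLs]} so each tier gets its own job."""
    jobs = defaultdict(list)
    for url, tier in collections:
        jobs[tier].append(url)
    return dict(jobs)


jobs = group_by_tier(site_collections)
for tier, urls in jobs.items():
    print(f"{tier}: {urls}")
```

With tiers separated this way, the critical job can run more often and the normal job can be scheduled in a different window.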
Mental note >> What's the point? Why should they care?
Generally the backup architecture consists of
The SharePoint farm (and Backup agents installed on the WFEs and SQL servers).
Staging farm (usually a single server).
Storage (a location for disk backups) and tape backup systems.
Backup operator console.
Some backup tools are very complex and require multiple servers for web, business logic and storage to manage backup images, deduplication, backup job scheduling and logs.
Also consider all the networking infrastructure between your SharePoint farms and backup systems – this can be a constraint.
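To see why the network between the farm and the backup systems can be a constraint, it helps to estimate the backup window from the slowest stage in the data path. The sketch below is a rough Python illustration; the stage names and throughput figures are hypothetical and should be replaced with measurements from your own environment.

```python
def backup_window_hours(data_gb: float, stage_throughput_mb_s: dict):
    """Return (hours, bottleneck stage); the slowest stage bounds the job."""
    bottleneck = min(stage_throughput_mb_s, key=stage_throughput_mb_s.get)
    rate = stage_throughput_mb_s[bottleneck]
    hours = data_gb * 1024 / rate / 3600
    return hours, bottleneck


# Hypothetical throughput per stage, in MB/s.
stages = {
    "farm_disk_read": 400,
    "network": 110,
    "backup_storage_write": 200,
}

hours, stage = backup_window_hours(500, stages)
print(f"about {hours:.1f} h, limited by {stage}")
```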
Mental note >> What's the point? Why should they care?
Operator Console
Backup Software operator console
SharePoint Farm
Production SharePoint farm
Staging Farm
SharePoint farm used by some toolsets to stage data recovery before being restored to the production farm
Should reside in same datacenter as production farm for recovery speed purposes
Change control is critical to keeping farms in sync
Client traffic network
Consider isolating traffic to reduce the chance of performance degradation
Farm network
Farm and operational traffic network
Consider isolating traffic to reduce the chance of performance degradation
Storage Area Network
Location for backup files
Tape / Disk Library
Location for transferring backup files to tape
Other items include
Help Desk software
Monitoring and reporting software
Change management
Code management
Configuration management
Mental note >> What's the point? Why should they care?
SQL Server Backups
Not application aware
Done at SQL layer
Doesn't include all SP files and DBs
More work for you
Not granular
content database level
Backup foot print
Generally more processing and pipe required
Equals more workload
Generally faster
Backup of database
Simpler than backing up all the SP files (OS, SQL, Index etc.)
Leverage existing SQL investments
No spin up costs if you have SQL backups in place
Or simply can add SQL agent to backup system
SharePoint
Application aware
Item level, security, index etc.
Granular control of backup
Can backup and restore item or farm
Slower due to being application aware
More logic required
Backup foot print
Generally less storage and pipe required
Equals less workload
Requires infrastructure
Software (console and agents, even more if complex)
Servers for backup software and databases (SQL DB for bus logic and de-dupe)
Staffing and support
Mental note >> What's the point? Why should they care?
If you have customized your farm you might want to consider slip streaming your SharePoint installation. This blog (http://blogs.technet.com/b/seanearp/archive/2009/05/20/slipstreaming-sp2-into-sharepoint-server-2007.aspx ) describes the process for SharePoint 2007.
Also, if you have customizations, make sure they are packaged in WSPs (hopefully your developers created WSPs to automate installation). This blog covers how to create WSPs (http://geekswithblogs.net/evgenyblog/archive/2008/01/27/118966.aspx ).
Mental note >> What's the point? Why should they care?
Your solution will require policy and procedural documents for operators, site administrators and users to follow to ensure the solution delivers on requirements (expectations). To achieve this you’ll require the following documents accompanied by training:
How-to manual – how to back up all the farms, how to rebuild farms and how to recover individual components of a farm such as a server, site collection, site, web parts and data.
Help Desk call handling manual – how to handle requests for backup and recovery, what questions to ask, how to track requests, follow up, tools to use etc…
Communications plan – this document includes policy and instructions regarding communications with the parties involved. If you have a communications department they would generally handle this with your assistance.
Contact list – this would include contacts for media, farm owners, support, help desk and other important contacts such as data center persons and or third parties (outsourcers).
Service levels – specific facts regarding RTO/RPO, business, IT and third party responsibilities.
Change control – processes, tasks and policy followed.
Mental note >> What's the point? Why should they care?
Now that you have a solution in place people must be trained how to use the solution (Administration and operation), stakeholders must be educated about the solution (SLA) and general awareness must occur. For training the following is recommended:
Training programs for operators – how-to training so they understand how to back up, restore and handle other related administration such as communications.
Awareness training – other staff such as stakeholders and architects will have questions about the solution and it’s important to provide them with architectural information.
You might want to hold follow-up sessions to reinforce key points and drive awareness across your organization. You might also want to create a site that contains all the information about the backup solution, such as design documents, provisioning forms, backup schedules, performance data (speed of backup and restore) and key contacts.
Mental note >> What's the point? Why should they care?
Your testing should include two components;
1) initial testing of the solution in a Proof of Concept/Pilot environment and
2) ongoing testing (Fire drills) which should occur one to two times a year.
In order to test properly (confidently) a documented plan is required that also includes test scripts and format for documenting test results.
Generally, the test plan would include a list of tests, expected outcomes and report on the actual outcome.
The test plan would be utilized during the Proof of Concept/Pilot for running end to end tests of the solution and signing off with the stakeholders.
Also, the test plan demonstrates thoroughness helping build credibility with stakeholders.
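A test plan of this shape (test, expected outcome, actual outcome, pass/fail) can be kept as a small structured record so results are easy to summarize for sign-off. The following Python sketch is one possible layout, not a prescribed format; the class name, field names and sample results are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecoveryCheck:
    """One row of the test plan; field names are illustrative."""
    name: str
    expected: str
    actual: Optional[str] = None      # filled in after the test is run
    passed: Optional[bool] = None     # None means not yet run
    minutes_taken: Optional[float] = None


def summarize(checks):
    """Count passed, failed and not-yet-run tests."""
    run = [c for c in checks if c.passed is not None]
    return {
        "total": len(checks),
        "passed": sum(1 for c in run if c.passed),
        "failed": sum(1 for c in run if not c.passed),
        "not_run": len(checks) - len(run),
    }


# Hypothetical results for illustration only.
plan = [
    RecoveryCheck("Farm recovery", "Recover farm end to end",
                  actual="Recovered, no gaps", passed=True, minutes_taken=310),
    RecoveryCheck("Servers", "Recover each server individually",
                  actual="WFE rebuild failed on customization", passed=False),
    RecoveryCheck("Site", "Recover a site and its associated settings"),
]

print(summarize(plan))
```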
Mental note >> What's the point? Why should they care?
When developing your test cases it’s important to include
Any details you want tested
Make sure the results are noted so that stakeholders’ expectations can be managed.
For example,
the test cases for Web Parts and Data should include verifying that metadata (columns), content types, version history, workflows etc. are recovered, since these are important configuration changes and have user impacts if not present.
Mental note >> What's the point? Why should they care?
Whether you use proofs of concept (POCs) or pilots (or both), the outcome is generally the same: you prove the solution works in your environment. Your POC or pilot must reside in your data centers and test representations of production systems and datasets (for pilots you might want to back up actual production systems). This might seem costly, but it provides a quality check that ensures your solution works with no surprises. Your POC/Pilot must include the following:
Charter – what is the scope of the project (Technology tests, process development, performance tests etc…)
Staffing plan – you’ll require operational staff, farm/application owners and vendors’ technical staff
Test Plan – what are we testing (farm rebuild, server rebuild, site collection recovery, site recovery, document recovery)
Physical environment – the technology required by the solution
Your POC/Pilot must deliver the following outcomes:
Backup and recovery process documented step by step
Any prerequisites are documented
Backup and restore performance documented
Data loss (if any) is documented
Test plan report for each test documented
Plan for deploying solution into production
Impact/risk assessment completed
Mental note >> What's the point? Why should they care?
Present findings to stakeholders
Once the POC and Pilot are completed, get sign off on the test report.
Make sure sign off is official – signature of document or email confirming acceptance and understanding
Conduct a demo of the solution if need be – to highlight success and gaps
If there are gaps in the testing or errors that occurred these must be highlighted and a plan assembled to address the deficiencies
Mental note >> What's the point? Why should they care?
When planning your backup schedule there are several things you must consider to make sure you can recover successfully without saturating the servers by running multiple jobs at once. Consider the following:
When to run full backups – Monthly? Weekly? Depending on your SLA, weekly is probably best
When to run incremental backups – daily is the norm
Duration of backup jobs – this is key to understand since you have to plan your windows to avoid overlap with other jobs and degraded performance during usage
Avoiding overlap with other jobs (Virus scans etc…) – plan your jobs so they don’t overlap
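The choice between full and incremental backups affects how many backup sets a restore needs. The Python sketch below illustrates this under a stated assumption: incrementals capture only changes since the previous backup, so a restore needs the latest full backup plus every incremental after it. The schedule shown (weekly full, daily incremental) is a hypothetical example, and backup tools differ in how they define incremental and differential backups.

```python
def restore_chain(backups, target_day):
    """Return the backups needed to restore to target_day.

    backups: list of (day, kind) sorted by day, kind is "full" or
    "incremental". Assumes each incremental depends on the one before it.
    """
    eligible = [b for b in backups if b[0] <= target_day]
    full_positions = [i for i, b in enumerate(eligible) if b[1] == "full"]
    if not full_positions:
        raise ValueError("no full backup on or before the target day")
    return eligible[full_positions[-1]:]


# Hypothetical two-week schedule: full on days 0 and 7, incremental otherwise.
backups = [(d, "full" if d % 7 == 0 else "incremental") for d in range(14)]

print(restore_chain(backups, 10))
```

The longer the gap between full backups, the longer this chain becomes, which lengthens restores and adds more points of failure.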
Mental note >> What's the point? Why should they care?
List your operational jobs
Make note of when they start
Length of job in hours
Continue to monitor for the life of the solution
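Once the operational jobs are listed with start times and durations, overlaps can be found mechanically. The following Python sketch is a simple illustration; the job names and times are hypothetical, and it assumes every job runs daily and lasts less than 24 hours.

```python
from itertools import combinations

# Hypothetical jobs: (name, start hour on a 24-hour clock, duration in hours).
jobs = [
    ("full backup", 22.0, 5.0),
    ("search crawl", 1.0, 2.0),
    ("profile import", 4.0, 1.0),
    ("virus scan", 6.0, 1.5),
]


def overlaps(a, b):
    """True if two daily jobs overlap, including across midnight."""
    _, a_start, a_len = a
    _, b_start, b_len = b
    for shift in (-24, 0, 24):  # compare against previous, same and next day
        s = b_start + shift
        if a_start < s + b_len and s < a_start + a_len:
            return True
    return False


def find_overlaps(job_list):
    """Return the name pairs of all jobs whose windows collide."""
    return [(a[0], b[0]) for a, b in combinations(job_list, 2) if overlaps(a, b)]


print(find_overlaps(jobs))
```

Rerunning a check like this with measured durations helps catch jobs that drift into each other as data volumes grow.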
Mental note >> What's the point? Why should they care?
Governance will play a key role in keeping the stakeholders aligned.
Specifically, backup and recovery is a complex topic; it requires tools, process, policy and staffing to function properly.
Non-IT types tend to oversimplify IT and IT types tend to overcomplicate it.
Each has a different perspective and expectation regarding the topic. Governance creates a forum for communications, tabling requirements and issues and working through them with the end goal being consensus.
Your governance plan should consist of the following:
Executive decision maker
Stakeholders from the business and IT
Tools for tracking issues, discussion topics and decisions
Decision framework
Communication plan
Mental note >> What's the point? Why should they care?
Document the SLA for your SharePoint service offering – get sign off from stakeholders
Create a business case to obtain funding
Create project controls for a POC (Charter, Communication plan, schedule, risk plan, test plan etc.)
Build POC environment and carry out tests and document results
Review findings and decide on next steps
Mental note >> What's the point? Why should they care?
SharePoint backup and recovery overview - http://technet.microsoft.com/en-us/library/cc261687.aspx
SharePoint Volume Shadow Copy Service - http://msdn.microsoft.com/en-us/library/cc264295.aspx
To back up using the SharePoint out-of-the-box tools - http://technet.microsoft.com/en-us/library/ee663490.aspx
To back up using SQL Server backup tools - http://msdn.microsoft.com/library/ms187048.aspx
Shane Young’s (SharePoint MVP, SharePoint 911) SharePoint 2010 backup webinar recording (http://technet.microsoft.com/en-us/sharepoint/ee518668)
SharePoint 2010 Disaster Recovery Guide – Amazon (http://www.amazon.com/SharePoint-2010-Disaster-Recovery-Guide/dp/1435456459 )
SharePoint 2007 Disaster Recovery Guide – Amazon (http://www.amazon.com/SharePoint-2007-Disaster-Recovery-Guide/dp/1584505990/ref=sr_1_2?s=books&ie=UTF8&qid=1310414136&sr=1-2 )
Microsoft best practices http://technet.microsoft.com/en-us/library/gg266384.aspx
How to optimize SQL Server backup and restore performance http://go.microsoft.com/fwlink/?LinkId=126630