Presentation name
Aharon Shilo | CEO | DBCS Ltd. (DBCS)
About me

Who am I - DBA
• Married + 3
• More than 10 years in the field
• Certified PRO in SQL Server and Oracle technologies
• Former CTO and practice lead at John Bryce Training
• CEO of DBCS, a company operating in the field
• Consultant to Bezeq International, the Central Bureau of Statistics, Pontis,
  Traffilog, galcomm, storenext and more.
• Introduction to High Availability in SQL Server:
  Hardware and software solutions
• Features and techniques comparison
  –   Log Shipping
  –   Database Mirroring
  –   Replication
  –   Database Snapshots
  –   Backup improvements
  –   Online operations
• HADR deep dive: How to implement the next
  generation of high availability and disaster
  recovery solution with SQL Server
Introduction to High Availability and Disaster Recovery

• Definitions
  – Introduce key terms and concepts
• Business Continuity Planning
  – Overview of the BCP process
• SQL Server High Availability Planning
  – How does BCP apply to SQL Server availability?
High Availability and Disaster Recovery: Definition
• High Availability
  – High availability is a system design protocol and associated
    implementation that ensures a certain absolute degree of
    operational continuity during a given measurement period
  – Availability is defined in terms of service level agreements (SLA)
      • Recovery Time
      • Data loss during unplanned downtime
  – A highly available application should be accessible by users x% of
    the time
• Disaster Recovery
  – Processes and procedures designed to restore business operations
    after a natural or human-induced disaster
  – Typically involves providing redundancy spanning multiple sites or
    across geographic regions
Defining x and SLA
      Availability   Acceptable Downtime            Acceptable Data Loss
      Class          (hrs/yr) OR RTO                (time of last copy) OR RPO
      Tier 1         >99.99% (1 hr or less)         5 min or less
      Tier 2         99.9% - 99.99% (1-8.5 hrs)     5 mins to 8.5 hrs
      Tier 3         <99.9% (hours to days)         Hours to days

•   Recovery Time Objective (RTO) guided by availability requirements
     –   How much downtime can you tolerate?
•   Recovery Point Objective (RPO) guided by criticality of application data
     –   How much data can you lose?




[Figure: RPO plotted against RTO, with the Tier 1 region marked]
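The downtime figures in the tier table follow from simple arithmetic on the availability percentage. A minimal T-SQL sketch, assuming a 365-day year (8,760 hours); the two percentages are the tier boundaries from the table, and the column names are illustrative only:

```sql
-- Allowed downtime per year for a given availability percentage.
-- Assumption: a 365-day year, i.e. 8,760 hours.
SELECT availability_pct,
       CAST((100.0 - availability_pct) / 100.0 * 365 * 24 AS decimal(9, 2))
           AS downtime_hours_per_year
FROM (VALUES (99.9), (99.99)) AS t(availability_pct);
```

Under that assumption, 99.9% allows about 8.76 hours of downtime per year and 99.99% allows about 0.88 hours, which the table rounds to the 8.5-hour and 1-hour boundaries.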
Protection Levels
• Local HA
   – Protection against resource failures: Machine, Database Corruption, Disk
   – Location Redundancy: Building, < 10 miles
• Regional DR
   – Protection against: Network Outages, Site Failures
   – Location Redundancy: City, County, < 100-200 miles
• Geographic DR
   – Protection against: Natural Disasters
   – Location Redundancy: State, Country, > 100-200 miles
Business Continuity Planning
• Impact Analysis
   – Critical Functions
   – Threat Identification
   – Recovery Objectives
• Solution Design
   – Achieve recovery objectives for relevant threats within specified
     constraints such as budget and human resources
   – Cost/benefit analysis of solutions
• Implementation
   – Deploy the recommended solution
• Testing
   – Test to see if the solution meets the recovery requirements
• Maintenance
   – Yearly testing and review of procedures
[Figure: BCP cycle of Analysis, Solution Design, Implementation, Testing, Maintenance]
SQL Server High Availability Planning

• Analysis
   – Application tiers serviced by the databases
   – Causes of database downtime
   – Protection levels: Local HA, Regional DR, Geographic DR
• Solution Design
   – What solutions exist?
   – What are the characteristics and cost of each solution?
• Implementation
   – What are the deployment steps and best practices?
• Testing
   – How do I test my implementation?
• Maintenance
   – How do I monitor and maintain the solution?
[Figure: planning cycle of Analysis, Solution Design, Implementation, Testing, Maintenance]
Database Downtime Drivers
• Unplanned Downtime
   – Failure Protection
   – User Errors
• Planned Downtime
   – Online Administration
   – Predictable Resourcing
Solution Design
• Understand the solutions and choices before making a decision
   – Solution Architecture
   – HA Capabilities
   – Limitations and Caveats
   – Cost Vector
SQL Server Always On Technologies
Always On Technologies
• Provides a full range of options to minimize downtime and maintain
  appropriate levels of application availability
• Increases Availability
   – Backup and Restore
   – Log Shipping
   – Database Mirroring
   – Failover Clustering
   – Peer-Peer Replication
• Decreased Downtime
   – Online Index Operations
   – Table Partitioning
   – Enhanced Locking
   – Resource Governor
   – Database Snapshot
   – Dedicated Admin Connection
   – Dynamic Configuration
Always On Technology Overview
• Architecture Overview
   – How does it work?
• Solution Characteristics
   – Data Loss Guarantees
   – Failover Characteristics
   – Redundancy Levels and Utilization
   – Cost
   – Limitations and Caveats
• Technologies that increase availability
   – Backup and Restore
   – Log Shipping
   – Database Mirroring
   – Failover Clustering
   – Peer-Peer Replication
What’s New in SQL Server 2008
• New Features
   – Resource Governor: manage SQL Server workloads and resources by
     specifying limits on resource consumption
   – Backup Compression: reduce backup and restore time
• Feature Enhancements
   – Database Mirroring
      • Automatic recovery from page corruption
      • Log stream compression
      • Faster recovery on failover
   – Log Shipping
      • Sub-Minute Log Shipping
      • Backup compression
   – Failover Clustering
      • 16 nodes
      • Rolling upgrade
   – Peer-Peer Replication
      • Hot add new nodes
Backup & restore
Backup and Restore

• Base availability technology for any solution
   – Protects against failures and supports recovery from errors
   – Provides Local HA and Site DR
       • Need to ensure the backups are accessible if site goes down
   – High RTO due to restore time
   – RPO=0 can never be guaranteed

• Types: Full, Differential, and Transaction Log
   – File-group backup/restore for large databases


• Backup Compression provides faster and
  smaller backups in SQL Server 2008
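
The three backup types can be sketched as follows. This is a minimal example, not a complete backup plan: the database name `AdventureWorks` is taken from the later slides, while the `D:\Backup\...` paths are assumptions. In SQL Server 2008, creating compressed backups required Enterprise Edition, so drop `COMPRESSION` where it is not available.

```sql
-- Full backup, compressed (SQL Server 2008 and later).
-- The backup paths are hypothetical.
BACKUP DATABASE AdventureWorks
TO DISK = N'D:\Backup\AdventureWorks_full.bak'
WITH COMPRESSION, INIT;

-- Differential backup: only the data changed since the last full backup.
BACKUP DATABASE AdventureWorks
TO DISK = N'D:\Backup\AdventureWorks_diff.bak'
WITH DIFFERENTIAL, COMPRESSION, INIT;

-- Transaction log backup (needs the FULL or BULK_LOGGED recovery model).
BACKUP LOG AdventureWorks
TO DISK = N'D:\Backup\AdventureWorks_log.trn'
WITH COMPRESSION, INIT;
```

The interval between log backups bounds the RPO, which is why RPO=0 cannot be guaranteed with backups alone.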
Enhanced Error Detection

• In SQL Server 2000 RESTORE
  VERIFYONLY does not guarantee that the
  backup is good
  – Data may be corrupt
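• In SQL Server 2005 RESTORE
  VERIFYONLY performs far more checks
  – Verifies page checksums when the backup was
    taken WITH CHECKSUM; only a test restore
    proves that the data is fully restorable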
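
A minimal sketch of verifying a backup file without restoring it. The file path is an assumption; the statement reads the backup and reports whether the backup set is readable and complete, but it does not restore any data.

```sql
-- Verify that the backup set is complete and readable.
-- The path is hypothetical; point it at an existing backup file.
RESTORE VERIFYONLY
FROM DISK = N'D:\Backup\AdventureWorks_full.bak';
```

A periodic test restore onto a separate server remains the stronger check.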
Database Checksums

• SQL Server 2000 had TornPageDetection to
  detect incomplete I/O operations caused by
  power failures
• SQL Server 2005 adds checksums to data
  pages
  – Header of every page contains a checksum value
  – When reading page, it re-computes checksum and
    compares with checksum stored
  – Returns error (824) if difference found
  – Detects errors not reported by I/O Subsystem
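
Page checksums are a per-database setting. The sketch below, which assumes a database named `AdventureWorks`, switches the option on, confirms it, and lists any pages that SQL Server has already flagged as suspect. A checksum is written to a page only when that page is next written to disk, so existing pages are not protected immediately.

```sql
-- Enable page checksums for the database (name is an assumption).
ALTER DATABASE AdventureWorks SET PAGE_VERIFY CHECKSUM;

-- Confirm the current page verification setting.
SELECT name, page_verify_option_desc
FROM sys.databases
WHERE name = N'AdventureWorks';

-- Pages that raised 823/824 errors are recorded in msdb.
SELECT database_id, file_id, page_id, event_type, error_count, last_update_date
FROM msdb.dbo.suspect_pages;
```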
Backup Checksums

• Detect errors introduced by backup hardware but
  not reported by hardware or operating system
   – Backup media error detection
   – Backup devices do not always detect errors
   – Works with
      • RESTORE
      • RESTORE VERIFYONLY

• Restore also checks page checksums, if present
   – Disk error detection on data pages prior to backup
• Can continue past errors if desired
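
The backup checksum options can be sketched like this; the database name and file paths are assumptions. `WITH CHECKSUM` validates existing page checksums while reading and adds a checksum over the backup stream, and `CONTINUE_AFTER_ERROR` is the "continue past errors" option.

```sql
-- Back up with checksums: page checksums are validated as pages are read,
-- and a checksum over the backup stream is stored in the backup.
BACKUP DATABASE AdventureWorks
TO DISK = N'D:\Backup\AdventureWorks_checksum.bak'
WITH CHECKSUM, INIT;

-- Verify the backup, re-checking the stored checksums.
RESTORE VERIFYONLY
FROM DISK = N'D:\Backup\AdventureWorks_checksum.bak'
WITH CHECKSUM;

-- Last resort on a damaged database: keep going past checksum errors
-- so that whatever is still readable gets backed up.
BACKUP DATABASE AdventureWorks
TO DISK = N'D:\Backup\AdventureWorks_damaged.bak'
WITH CHECKSUM, CONTINUE_AFTER_ERROR, INIT;
```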
Backup Compression
• Common questions:
   – "How much compression will I see?"
   – "Will it be comparable to, say, SQL Litespeed?"
• One simple answer: "It depends!"
• All data compresses differently; the compression ratio achieved depends on:
   – The type of data in the database
   – Whether the data in the database is already compressed
   – Whether the data/database is encrypted
• "We saw an 85 percent reduction in file size using SQL Server 2008 Backup
  Compression," says Colin Neller, Senior Software Engineer at ServiceU and
  part of the company's SQL Server 2008 implementation team. "A backup file
  that was previously over 300 GB is now only 40 GB, and the job runs in
  about half the time."
Backup Compression: Backup Performance
• Backup of a 322 MB AdventureWorks database
   – Uncompressed: hardly any CPU used (avg 5%), runtime = 39.5s,
     no compression
   – Compressed: a lot more CPU used (avg 25%), but runtime = 21.6s (45%
     improvement) and backup stored in 76.7 MB (4.2x compression ratio)
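
To measure the ratio on your own backups rather than rely on a benchmark, the backup history in `msdb` can be queried. A minimal sketch, assuming the database is named `AdventureWorks`; the `compressed_backup_size` column is available from SQL Server 2008 onward.

```sql
-- Compression ratio of the ten most recent backups of one database.
SELECT TOP (10)
       database_name,
       backup_finish_date,
       backup_size / 1048576.0            AS backup_size_mb,
       compressed_backup_size / 1048576.0 AS compressed_size_mb,
       backup_size / NULLIF(compressed_backup_size, 0) AS compression_ratio
FROM msdb.dbo.backupset
WHERE database_name = N'AdventureWorks'
ORDER BY backup_finish_date DESC;
```

An uncompressed backup shows a ratio of 1, because both size columns hold the same value.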
DEMO
DATABASE SNAPSHOTS
Database Snapshots
• Read-only, consistent view of a database
  – Specified point-in-time
• Modifying data
  – Copy-on-write of affected pages
• Reading data
  – Accesses snapshot if data has changed
  – Redirected to original database otherwise
[Figure: changed pages copied from the source database into the 12:00 snapshot]
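
Creating a snapshot is a single statement. In this sketch the logical file name `AdventureWorks_Data` and the snapshot file path are assumptions; the real logical name can be read from `sys.database_files` in the source database, and a database with several data files needs one entry per file.

```sql
-- Create a point-in-time, read-only snapshot of AdventureWorks.
-- NAME must match the logical data file name of the source database
-- (assumed here); FILENAME is the sparse file that receives the
-- copy-on-write pages (hypothetical path).
CREATE DATABASE AdventureWorks_dbsnapshot_1200
ON ( NAME = AdventureWorks_Data,
     FILENAME = N'D:\Snapshots\AdventureWorks_data_1200.ss' )
AS SNAPSHOT OF AdventureWorks;
```

The snapshot can then be queried like any read-only database, for example `SELECT ... FROM AdventureWorks_dbsnapshot_1200.Production.WorkOrderRouting`.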
Using Database Snapshot to Recover Data
Scenario               Example Code / Steps
Undeleting             -- Copies rows back from the snapshot. Add a WHERE clause that
rows                   -- selects only the deleted rows, or existing rows are inserted again.
                       INSERT INTO Production.WorkOrderRouting
                       SELECT * FROM
                       AdventureWorks_dbsnapshot_1800.Production.WorkOrderRouting

Undoing                -- Puts back the value the row had when the snapshot was taken.
an update              UPDATE HumanResources.Department
                       SET Name = ( SELECT Name FROM
                         AdventureWorks_dbsnapshot_1800.HumanResources.Department
                           WHERE DepartmentID = 1)
                       WHERE DepartmentID = 1

Recovering             1 Script the object in the database snapshot
a dropped              2 Execute the script in the source database
object                 3 Repopulate the object (if appropriate)

Caution: Not a substitute for a comprehensive backup and restore strategy
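
Besides copying individual rows back, the whole database can be reverted to a snapshot. A minimal sketch using the snapshot name from the table above; reverting discards every change made after the snapshot was taken, and any other snapshots of the same database have to be dropped first.

```sql
-- Revert the source database to the state captured by the snapshot.
-- All changes made after 18:00 are lost.
USE master;
RESTORE DATABASE AdventureWorks
FROM DATABASE_SNAPSHOT = 'AdventureWorks_dbsnapshot_1800';
```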
DEMO
Log Shipping
Log Shipping

•   Automated transaction log backup and
    restore provides redundancy at the
    database level

•   SQLLogship.exe provides the underlying
    framework for doing automated backup,
    copy and restore
     – Backup on primary instance
     – Restore on secondary instance(s)

•   Scheduling is done through
    SQL Server Agent jobs
     – SQL Server 2008 provides sub-minute
       scheduling interval providing the ability
       to do quick backup and restores

•   No automatic failover capabilities
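
What the log shipping jobs automate can be sketched by hand as one backup, copy and restore cycle. The share name, local paths and file names below are assumptions, and the secondary database is assumed to have been initialized from a full backup restored WITH NORECOVERY.

```sql
-- On the primary instance: back up the transaction log to a share.
BACKUP LOG AdventureWorks
TO DISK = N'\\backupshare\logship\AdventureWorks_0001.trn'
WITH INIT;

-- (The copy job moves the file to the secondary server.)

-- On the secondary instance: apply the log backup and stay in the
-- restoring state so that later log backups can still be applied.
RESTORE LOG AdventureWorks
FROM DISK = N'D:\LogShip\AdventureWorks_0001.trn'
WITH NORECOVERY;

-- Alternative to NORECOVERY: STANDBY keeps the secondary readable
-- between restores (undo file path is hypothetical).
-- RESTORE LOG AdventureWorks
-- FROM DISK = N'D:\LogShip\AdventureWorks_0001.trn'
-- WITH STANDBY = N'D:\LogShip\AdventureWorks_undo.dat';
```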
Log Shipping (Key terms)
• Primary Server:
  – Contains your primary database.
  – SQL Server Agent makes periodic transaction log
    backups to capture changes.
• Secondary Server:
  – Contains an unrecovered copy of the production
    database.
  – One secondary (standby) server can contain standby
    databases from multiple primary servers.
Log Shipping (Key terms) cont…


• Monitor Server (Optional)
  – Monitors the status of the log-shipping jobs on the
    primary and each standby server.
  – One monitoring server can monitor multiple primary-
    standby server pairs.
  – Should use a server other than the primary or the
    standby to detect problems on either server.
Log Shipping
[Figure: The primary database performs log backups; each of three secondary
databases copies and restores the backups; a monitor database raises alerts]
Strength & weakness
• Strengths
  – Can Ship Logs Across WAN (Wide-Area Network)
  – Protects an Entire Database
• Weaknesses
  – Configured Per Database
  – NO AUTOMATIC FAILOVER
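
Because failover is manual, the switch to a secondary has to be scripted or performed by an operator. A minimal sketch under the assumption that the primary is still reachable; the share and file names are hypothetical.

```sql
-- On the primary, if it is still reachable: back up the tail of the log.
-- NORECOVERY leaves the old primary in the restoring state.
BACKUP LOG AdventureWorks
TO DISK = N'\\backupshare\logship\AdventureWorks_tail.trn'
WITH NORECOVERY;

-- On the secondary: apply any remaining log backups, then the tail.
RESTORE LOG AdventureWorks
FROM DISK = N'\\backupshare\logship\AdventureWorks_tail.trn'
WITH NORECOVERY;

-- Bring the secondary online as the new primary.
RESTORE DATABASE AdventureWorks WITH RECOVERY;
```

Applications then need to be redirected to the new server, since log shipping does not redirect clients.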
DEMO
Mirroring
Database Mirroring

•   A database level high availability solution
    that can provide complete protection
    against data loss (in synchronous mode) and
    fast recovery through automatic failover

•   Maintains a redundant database by
    shipping log blocks when the
    transactions are committed on the
    principal

•   Synchronous and Asynchronous
    modes provide the spectrum of
    options to choose between
    availability and performance

•   Automatic failover when using
    witness server
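
A mirroring session can be sketched in three steps. The endpoint name, port 5022 and the `example.com` host names are assumptions, and the mirror database is assumed to have been restored from a backup of the principal WITH NORECOVERY.

```sql
-- 1. On the principal and the mirror: create a mirroring endpoint.
--    (On the witness, use ROLE = WITNESS instead.)
CREATE ENDPOINT Mirroring
    STATE = STARTED
    AS TCP (LISTENER_PORT = 5022)
    FOR DATABASE_MIRRORING (ROLE = PARTNER);

-- 2. On the mirror server first: point at the principal.
ALTER DATABASE AdventureWorks
    SET PARTNER = N'TCP://principal.example.com:5022';

-- 3. On the principal: point at the mirror, then add the witness
--    that enables automatic failover.
ALTER DATABASE AdventureWorks
    SET PARTNER = N'TCP://mirror.example.com:5022';
ALTER DATABASE AdventureWorks
    SET WITNESS = N'TCP://witness.example.com:5022';
```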
Database Mirroring Modes

• High-Availability Mode
   – Safety Full; Synchronous operation
   – Database is available whenever a quorum exists
   – Automatic failover
• High-Protection Mode
   – Safety Full; Synchronous operation
   – No witness – quorum provided by partners
   – If Principal loses quorum, it stops servicing the database
       • Ensures high protection; database is never in 'exposed' state
   – Manual failover only; no automatic failover
   – A transition mode; should not be in this mode for long
• High-Performance Mode
   – Safety Off; Asynchronous operation
   – Manual failover only
       • Supports only one form of role switching: forced service (with
         possible data loss)
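
The modes map onto a handful of `ALTER DATABASE` options. The statements below are alternatives to run as needed, not a script to execute top to bottom; the database name is an assumption.

```sql
USE master;

-- Synchronous operation (high-availability or high-protection mode).
ALTER DATABASE AdventureWorks SET PARTNER SAFETY FULL;

-- Asynchronous operation (high-performance mode).
ALTER DATABASE AdventureWorks SET PARTNER SAFETY OFF;

-- Manual failover: run on the principal while the session is synchronized.
ALTER DATABASE AdventureWorks SET PARTNER FAILOVER;

-- Forced service: run on the mirror when the principal is lost.
-- Transactions that were not yet sent to the mirror are lost.
ALTER DATABASE AdventureWorks SET PARTNER FORCE_SERVICE_ALLOW_DATA_LOSS;

-- Inspect the current role, state and safety level of mirrored databases.
SELECT DB_NAME(database_id) AS database_name,
       mirroring_role_desc,
       mirroring_state_desc,
       mirroring_safety_level_desc,
       mirroring_witness_state_desc
FROM sys.database_mirroring
WHERE mirroring_guid IS NOT NULL;
```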
Database Mirroring
 How it works

[Figure: The application commits on the principal while a witness observes both
partners. Numbered steps 1-5 run between SQL Server, Log and Data on the
principal and on the mirror. Callout: the mirror is always redoing, so it
remains current.]
DBM – Automatic Page Recovery


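1. Bad page detected on the principal
2. Principal requests the page from the mirror
3. Mirror finds the page
4. Mirror retrieves the page
5. Mirror transfers the page to the principal
6. Principal writes the page
[Figure: client, witness, principal and mirror; each partner has its own Data and Log]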
Database Mirroring Enhancements

• Enhancements in SQL 2008
  – Compression of stream data for which at least a 12.5
    percent compression ratio can be achieved.
  – Automatic Recovery from Corrupted Pages.
  – Page read-ahead during the undo phase.
  – Improved use of log send buffers.
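
Automatic page repair attempts are recorded in a dynamic management view introduced in SQL Server 2008. A minimal sketch for reviewing them on either partner:

```sql
-- Pages that mirroring tried to repair automatically on this instance.
SELECT DB_NAME(database_id) AS database_name,
       file_id,
       page_id,
       error_type,
       page_status,
       modification_time
FROM sys.dm_db_mirroring_auto_page_repair;
```

An empty result simply means no repair has been attempted on this instance.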
Strength & Weakness

• Strengths
  – Can Mirror Across WAN
  – Automatic failover that is nearly instantaneous, faster
    than failover clustering
  – Protects an Entire Database
• Weaknesses
  – Asynchronous (high-performance) mode requires Enterprise Edition
  – Must be Configured Per Database
DEMO
Replication
Replication
• Primarily used where
  availability is required in
  conjunction with scale out of
  read activity
• Failover is possible, but only as a
  custom solution
• Not limited to entire
  database; Can define subset
  of source database or tables
• Copy of database is
  continuously accessible for
  read activity
• Latency between source
  and copy can be as low as
  seconds
Transactional Replication

• A high performance data replication solution that provides
  granular table level replication
   – Logical data movement provides flexibility and
     better hardware utilization

• Key scenarios:
   – Customized application-specific DR
   – Real-time reporting on a secondary server that can be used for Site DR
   – Scale out application queries with ability to use any one
     database copy for Site DR

• Two types relevant for HA and DR
   – Transactional and Peer-to-Peer
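
Replication latency on an existing transactional publication can be measured with tracer tokens. This sketch assumes a publication named `AdvWorksProductTran` (hypothetical) and is run at the Publisher in the publication database; the ten-second wait is an arbitrary pause to let the token reach the Subscribers.

```sql
USE AdventureWorks;  -- publication database (assumed name)

-- Post a tracer token into the publication's log stream.
DECLARE @token_id int;
EXEC sys.sp_posttracertoken
     @publication = N'AdvWorksProductTran',   -- hypothetical publication
     @tracer_token_id = @token_id OUTPUT;

-- Give the token time to travel to the Distributor and Subscribers.
WAITFOR DELAY '00:00:10';

-- Show the measured latency for that token.
EXEC sys.sp_helptracertokenhistory
     @publication = N'AdvWorksProductTran',
     @tracer_id = @token_id;
```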
Peer-to-Peer Replication
• Provides high availability and read scalability
• Builds redundancy by eliminating single point of failure
• Enables online upgrades of servers
• Maximizes application uptime
• Supports both ring and grid topologies
• Centralized management using Management Studio
[Figure: four peer nodes connected to each other]
New Features
[Figure: User requests reach an application server that load-balances reads
and writes across the replicated data]
Strength & Weakness

• Strengths
  – Perpetual or on-demand replication of data, local or
    remote
  – Protects (duplicates or merges) the exact portion of the
    database I want
• Weaknesses
  – Configured per database, even per table
  – Generally does not protect or duplicate an entire
    Database
DEMO
Failover Clustering
Failover Clustering

• Instance level protection built on Windows
  Failover Clustering shared disk model
  – Cluster nodes typically co-located within the
    same site to provide local HA
  – Regional DR possible using VLAN and stretch
    storage level replication


• No built in data redundancy like database
  mirroring and log shipping
  – Data protection has to be provided at the
    storage level or by combining with other solutions
Failover Clustering

[Figure: Node 1, Node 2 and Node 3 present one virtual server and share a single disk]
SQL Server Cluster Topologies
• Supports many scenarios:
   •   Single Instance
   •   Multiple Instance
   •   Multiple Active Nodes
   •   N+1: N active nodes, 1 inactive node
   •   N+M: N active nodes, M inactive nodes
[Figure: instances Inst1, Inst2 and Inst3 placed on cluster nodes for each topology]
Failover Clustering (Facts)
• Redundancy at database instance level
    – All databases fail over together
    – Shared copy of system databases
• Single data copy on shared storage device
    – No I/O overhead reducing throughput
    – Storage unit is single point of failure for cluster
• All database services are clustered
    – SQL Agent; Analysis Services; Full-Text engine, MS DTC
•   Automatic failover (up to minutes)
•   DBMS accessed over virtual IP
•   Storage is controlled by one cluster node at a time
•   Requires hardware certified by Microsoft for Microsoft
    Cluster Service
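
From inside SQL Server you can check whether an instance is clustered and which physical node currently hosts it. A minimal sketch; it makes no assumptions beyond running on SQL Server 2005 or later.

```sql
-- Is this instance clustered, and which node is it running on now?
SELECT SERVERPROPERTY('IsClustered')                 AS is_clustered,
       SERVERPROPERTY('ComputerNamePhysicalNetBIOS') AS current_node,
       SERVERPROPERTY('ServerName')                  AS virtual_server_name;

-- All nodes that can host this clustered instance
-- (returns no rows on a stand-alone instance).
SELECT NodeName
FROM sys.dm_os_cluster_nodes;
```

Running the first query before and after a failover shows `current_node` change while `virtual_server_name` stays the same.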
Strength & Weakness
• Strengths
  – Provides Protection Against a Node Failure, Protects
    the Entire SQL Instance
  – Automatic Failover Supported
• Weaknesses
  –   Generally Expensive, Requires Specialty Hardware
  –   Not Trivial to Configure and Manage
  –   Doesn't Protect Against a Complete Site Failure
DEMO
Best Practices
• Backup your system databases after
  modifications.
• Test if backups are restorable.
• Practice / Test your disaster recovery plans.
• Documentation is not only for you.
• Keep dedicated DR Server ready.
• Use BACKUP CHECKSUM features.
• Run DBCC CHECKDB regularly.
• Don't ignore any runtime errors.
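
Two of these practices translate directly into statements that can be scheduled as SQL Server Agent jobs. A minimal sketch; the backup paths and the `AdventureWorks` database name are assumptions.

```sql
-- Back up the system databases after configuration changes
-- (logins, jobs, linked servers and so on). Paths are hypothetical.
BACKUP DATABASE master
TO DISK = N'D:\Backup\master.bak'
WITH CHECKSUM, INIT;

BACKUP DATABASE msdb
TO DISK = N'D:\Backup\msdb.bak'
WITH CHECKSUM, INIT;

-- Regular consistency check; report all errors, suppress informational output.
DBCC CHECKDB (N'AdventureWorks') WITH NO_INFOMSGS, ALL_ERRORMSGS;
```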
Which Solution Is Best for Us?
Always On Solution Characteristics

                 Redundancy     Failover                     Utilization             Cost
                 and RPO
Solutions        No Data Loss   Failover Unit   Auto         Read  Multiple  Write   Hardware  App Perf  Manageability
                 (RPO=0)        (Inst/DB/Tab)   Failover                                       Impact
                                                (RTO)
Log Shipping                                                 *                       Low       Low       Low
DBM Sync                                        + **         *                       Low       High      Low
DBM Async                                                    *                       Low       Low       Low
Cluster                                                                              High***   Low***    Low***
Transactional                                                                        Low       Low       High
Replication
Peer-Peer                                                                            Low       Low       High
Replication

 * Log Shipping and Database Mirroring can provide point in time read capability using STANDBY or
   database snapshots respectively
 ** Database Mirroring provides fastest failover to hot secondary
 *** Depends on SAN technology
Recap
• Application availability requirements or SLA drive primary solution choices
   – RPO and RTO are the key metrics used to define the SLA
• Need mitigation against planned and unplanned downtimes
• Multiple solution choices provide varying cost/benefit trade-offs
• Other requirements apart from application SLA factor into the choice
• Understand constraints and tradeoffs you can make
[Figure: Application availability covers unplanned and planned downtime,
addressed by Clustering, Database Mirroring, Log Shipping and Peer-Peer
Replication]
AdventureWorks Inc Scenario
• AdventureWorks Inc is a manufacturing company that manufactures and sells
  bicycles across the world. It runs a number of applications, some of them
  mission critical, on multiple SQL Server instances
• The DBA team is run by Darren, who is responsible for deploying and
  managing the application databases. One of his core responsibilities is to
  ensure availability of all application databases in order to meet the
  application SLA
• One datacenter located in Omaha
• Three applications
   – Manufacturing – Tier 1
   – Finance – Tier 2
   – Scheduling – Tier 3
• Manufacturing application runs on a dedicated SQL Server 2008 instance
   – All other applications run on a second instance
• Availability of manufacturing application is critical
• Implement a solution at the lowest possible cost
Application Requirements

  Applications    Data Loss   RTO in   Failover Unit   Auto       Read   Multiple   Read
                  RPO=0       secs     (Inst/DB/Tab)   Failover          Sites      Write
  Manufacturing
  Finance
  Scheduling

  • Manufacturing application has strict SLAs
  • Finance application requires readability on the secondary
     – The reports are run every 4 hours and need to be fresh as of the
       last one hour. To offload the reporting load from the main system
       they would like to utilize the mirror
Solution Choice for Manufacturing Application

    Solutions          Data Loss   Fast   Failover Unit   Auto       Read   >1      Read
                       RPO=0       RTO    (Inst/DB/Tab)   Failover   Copy   Sites   Write
    Cluster
    SAN Replication
    DBM - Sync
    DBM - Async
    Log Shipping
    Transactional
    Replication
    Peer-Peer
    Replication

•   Clustering can provide a zero data loss solution that can also provide fast
    instance level failover
•   Use RAID configuration to provide data redundancy on the SAN
•   If a redundant copy is required that can provide instance failover with zero
    data loss use SAN replication
      –   High Cost
•   Use synchronous database mirroring if instance failover is not needed

Solution: Clustering with RAID
Solution Choice for Finance Application

    Solutions                   Data Loss   Fast   Failover Unit      Auto       Read   >1      Read
                                RPO=0       RTO    Inst   DB    Tab   Failover   Copy   Sites   Write
    Cluster
    SAN Replication
    DBM - Sync
    DBM - Async
    Log Shipping
    Transactional Replication
    Peer-Peer Replication

•   For database level redundancy with acceptable data loss and minimal perf impact,
    asynchronous database mirroring is an optimal choice
•   Use database snapshots at periodic intervals to provide a readable
    snapshot of the data for reporting
•   Low cost solution

[Diagram: Omaha Datacenter. Labels: Finance, Scheduling, Async Database Mirroring, Db Snapshot every hour, Reports.]
Adding a Regional Datacenter Into
  the Mix

• Regulatory and compliance requirements drive
  the need for having an additional datacenter within
  a 10 mile radius to provide redundancy against
  site level failure.
   – It is now required that all applications have the ability
     to fail over to the regional datacenter across the river in
     Council Bluffs


• The SLA needs to be maintained for tier 1
  applications even in the case of site failures
Regional Site Solution Choices

[Diagram: Omaha Datacenter and CB Datacenter.
  Manufacturing: Cluster with SAN; Sync Mirroring (no witness).
  Finance and Scheduling: Async Database Mirroring; Db Snapshot every hour; Reports; Log Shipping.]
A Complete Topology

• Considering the potential of floods and
  tornadoes destroying the regional data centers,
  Adventureworks Inc wants to maintain a disaster
  recovery site in San Antonio, TX

• The disaster recovery site has lower SLA
  requirements for all applications
  – The manufacturing application can have an RPO of 1
    hour
  – The RTO is set at 4 hours
Topology Diagram

[Diagram: Manufacturing on Cluster with SAN; Sync Mirroring (no witness); Log Shipping.]
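
The disaster recovery targets stated for the manufacturing application (an RPO of 1 hour and an RTO of 4 hours) can be sanity-checked against a log shipping schedule. The sketch below is a simple planning model and not part of the original deck: the `LogShippingSchedule` fields and the example numbers are hypothetical, and the worst case assumes the site is lost just before a log backup finishes copying.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogShippingSchedule:
    """Hypothetical planning inputs, all in minutes."""
    backup_interval: float   # gap between log backups on the primary
    copy_duration: float     # time to copy one backup to the DR site
    restore_duration: float  # time to restore pending backups and bring the DB online


def worst_case_data_loss(schedule: LogShippingSchedule) -> float:
    # Assumption: the site is lost just before the newest backup finishes
    # copying, so everything logged since the previous backup was taken is
    # lost: one backup interval plus the copy time.
    return schedule.backup_interval + schedule.copy_duration


def meets_targets(schedule: LogShippingSchedule, rpo: float, rto: float) -> bool:
    # Both targets must hold: data loss within the RPO, restore within the RTO.
    return (worst_case_data_loss(schedule) <= rpo
            and schedule.restore_duration <= rto)


schedule = LogShippingSchedule(backup_interval=15, copy_duration=5, restore_duration=90)
print(meets_targets(schedule, rpo=60, rto=240))
```

Under this model a 60-minute backup interval would already miss a 1-hour RPO once any copy time is added, which is why the interval is usually chosen well below the RPO.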
Scale Out and Availability

Scenario
• Adventureworks is building a new web based order
  management system that allows customers from all
  over the world to access the system and place orders
• The core group of customers are in Western
  Europe, South East Asia and North America

Requirements
   –   Geo Redundancy
   –   Data Locality
   –   High Availability
   –   Local Read-Scale

Workload Characteristics
   – Mainly reads
   – Few writes

Application Characteristics
   – Each user logging in connects to a particular server
       • Partitioned based on user-id and region
       • Writes from a user always happen on one server regardless of
         the region the user logs in from
   – All reads redirected to the closest geo-location
       • Reasonable tolerance for latency (5-10 minutes)
Replication Topology

[Diagram: replication topology with Peer Nodes (for example Asia1 and Asia2) and Read-Only Servers.]
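
The routing rule from the scale-out scenario (writes from a user always land on one server, reads go to the closest geo-location) can be sketched as follows. This is an illustration only: the server naming, the two-nodes-per-region constant and the modulo partitioning are assumptions, not details given in the deck.

```python
PEER_NODES_PER_REGION = 2  # assumption, suggested by labels such as Asia1 and Asia2


def home_server(user_id: int, home_region: str) -> str:
    """Peer node that owns all writes for this user (hypothetical naming)."""
    node = user_id % PEER_NODES_PER_REGION + 1
    return f"{home_region}{node}"


def route(user_id: int, home_region: str, login_region: str, is_write: bool) -> str:
    if is_write:
        # Writes always go to the user's home peer node, wherever they log in from.
        return home_server(user_id, home_region)
    # Reads go to a read-only server in the region closest to the login.
    return f"{login_region}-ReadOnly"


print(route(7, "Asia", "Europe", is_write=True))
print(route(7, "Asia", "Europe", is_write=False))
```

Keeping every write for a user on a single peer node avoids conflicting updates between peers, while the stated 5-10 minute latency tolerance is what makes the regional read-only copies acceptable for reads.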
Licensing Facts
• Passive servers are mirror, log
  shipped secondary and
  clustering passive node
• No license required on passive
  if it is truly passive
• A passive server does not need
  a license if the number of
  processors in the passive server
  is equal to or less than the
  number of processors in the
  active server.
• The passive server can take the
  duties of the active server for 30
  days. Afterwards, it must be
  licensed accordingly.
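
The licensing rules above can be restated as a small check. This sketch only encodes the conditions listed on the slide; the function and parameter names are hypothetical, and it is no substitute for the actual license terms.

```python
def passive_server_needs_license(passive_processors: int,
                                 active_processors: int,
                                 days_acting_as_active: int = 0,
                                 truly_passive: bool = True) -> bool:
    """Return True when, by the slide's rules, the passive server needs a license."""
    if not truly_passive:
        # A server that does any real work is not passive.
        return True
    if passive_processors > active_processors:
        # The exemption only covers a passive server no larger than the active one.
        return True
    # The passive server may take over the active role for up to 30 days.
    return days_acting_as_active > 30


print(passive_server_needs_license(passive_processors=4, active_processors=4))
print(passive_server_needs_license(4, 4, days_acting_as_active=45))
```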
HA Features Edition Support
Feature                 Express   Workgroup   Standard   Enterprise   Comments
Database Mirroring                            (1)                     Advanced high availability solution
                                                                      that includes fast failover and
                                                                      automatic client redirection
Failover Clustering                           (2)
Backup Log-shipping                                                   Data backup and recovery solution
Online System Changes                                                 Includes Hot Add Memory, dedicated
                                                                      administrative connection, and
                                                                      other online operations
Online Indexing
Online Restore
Fast Recovery                                                         Database available when undo
                                                                      operations begin

     (1) Single thread redo
     (2) Limited to 2 node cluster
Summary
• There is no "one size fits all" solution

• Consider the cost, benefits and constraints and compare them
  to the availability requirements of the organization to
  determine the best solution

• Use the charts to understand cost, benefit and
  constraints of
  the various SQL Server High Availability solutions

• TEST the solution to ensure it can meet the availability
  requirements and meet SLAs
• Questions & answers
SQL Server AlwaysOn:
Mission Critical Capabilities in SQL
Server “Denali”


            •   Jon Jahren
            •   Exec VP, Prediktor
            •   jon.jahren@prediktor.no
High Availability and Disaster
        Recovery
        SQL Server “Denali” AlwaysOn

•   Faster failover, easier administration with Availability Groups
    •   Identify databases to failover as a unit to reduce unplanned
        downtime
    •   Faster application failover using virtual name
    •   Increase application uptime using flexible failover policy
    •   Enable better data redundancy and protection with up to four
        secondaries and up to two synchronous secondaries

•   Limited downtime with enhanced online operations

•   Run Microsoft SQL Server® on Windows Server® Core to
    reduce planned downtime (50-60% fewer OS patch reboots)

[Diagram: availability configurations labelled Shared Storage, Non-Shared Storage and Disaster Recovery.]

Maximize Resources
Higher return on high availability investments
•   Increase hardware utilization through active secondaries for
    backups, reporting, and ad hoc queries
•   Reuse existing infrastructure with support for both SAN and
    direct attached storage

Simplify management and administration
•   Integrated manageability for one-stop configuration
•   Easy setup and monitoring integrated into Microsoft SQL
    Server Management Studio
•   Availability Groups that provide failover units with contained
    dependencies (such as logins)


      Breakthrough Performance and Scale

• Dramatically faster star-join query processing,
  much faster than current SQL Server (~10X)
  •   Query speed increase varies with query and data
• Reduced I/O
• Consistent query performance
• Reduced performance tuning effort

Mission Critical High Availability Solution
• Meets mission critical high availability SLA
• Integrated
• Flexible
• Efficient

    Microsoft recommended prescriptive HA solutions and
                   customer references
Introducing SQL Server
                AlwaysOn
Integrated, Flexible, Efficient high Availability for
mission critical business
    A high availability platform built for the future


AlwaysOn provides database level and instance
              level protection
      AlwaysOn Availability Groups   AlwaysOn Failover Cluster Instances
      for database protection        for instance level protection

         Multi-Database Failover        Multisite Clustering
         Multiple Secondaries           Flexible Failover Policy
         Active Secondaries             Improved Diagnostics
         Integrated HA Management       Built for consolidation scenarios
AlwaysOn – A flexible solution
AlwaysOn provides the flexibility of different HA
configurations

[Diagram: two configurations. One uses direct attached storage with local, regional and geo targets; the other uses shared storage with regional and geo secondaries. Legend: Synchronous Data Movement, Asynchronous Data Movement.]
AlwaysOn Availability Groups
   AlwaysOn Availability Groups is a new feature that enhances and
   combines database mirroring and log shipping capabilities

Flexible
 • Multi-database failover
 • Multiple secondaries
    – Total of 4 secondaries
    – 2 synchronous secondaries
    – 1 automatic failover pair
 • Synchronous and asynchronous data movement
 • Built in compression and encryption
 • Automatic and manual failover

Integrated
 • Application failover using virtual name
 • Configuration Wizard
 • Dashboard
 • System Center Integration
 • Rich diagnostic infrastructure
 • File-stream replication

Efficient
 • Active Secondary
    – Readable Secondary
    – Backup from Secondary
 • Automation using PowerShell
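
The replica limits listed for Availability Groups (4 secondaries in total, 2 synchronous secondaries, 1 automatic failover pair) can be expressed as a configuration check. This is a sketch with hypothetical type and function names; the last rule, that an automatic failover partner must be synchronous, is not stated on the slide and is included as an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Secondary:
    name: str
    synchronous: bool
    automatic_failover: bool = False


def validate_secondaries(secondaries: List[Secondary]) -> List[str]:
    """Return a list of problems; an empty list means the layout fits the limits."""
    problems = []
    if len(secondaries) > 4:
        problems.append("more than 4 secondaries")
    if sum(s.synchronous for s in secondaries) > 2:
        problems.append("more than 2 synchronous secondaries")
    if sum(s.automatic_failover for s in secondaries) > 1:
        # One automatic failover pair means the primary plus one secondary.
        problems.append("more than 1 automatic failover partner")
    for s in secondaries:
        if s.automatic_failover and not s.synchronous:
            # Assumption: automatic failover needs a synchronous secondary.
            problems.append(f"{s.name}: automatic failover on an asynchronous secondary")
    return problems


layout = [
    Secondary("local", synchronous=True, automatic_failover=True),
    Secondary("regional", synchronous=True),
    Secondary("geo", synchronous=False),
]
print(validate_secondaries(layout))
```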
Availability Groups Virtual Name
Availability Groups Virtual Name allows applications to failover seamlessly to any secondary
     – Application reconnects using a virtual name after a failover to a secondary
     – Application retries during failover
     – Connect to new primary once failover is complete and the virtual name is online

[Diagram: availability group AG_HR with virtual name HR_VNN; the HR DB is hosted on ServerA, ServerB and ServerC. ServerA starts as Primary with ServerB and ServerC as Secondary; after failover ServerB is Primary and ServerC is Secondary.]

Connection string: -server HR_VNN;-catalog HRDB
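
The application-side retry behaviour can be sketched as a loop that keeps reconnecting through the virtual name until the new primary is online. The `connect` callable is a hypothetical stand-in for whatever database driver the application uses, and the attempt count and delay are illustrative values only.

```python
import time
from typing import Any, Callable


def connect_with_retry(connect: Callable[[str, str], Any],
                       server: str = "HR_VNN",
                       catalog: str = "HRDB",
                       attempts: int = 5,
                       delay_seconds: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> Any:
    """Retry against the virtual name; the name stays the same across failover."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return connect(server, catalog)
        except ConnectionError as exc:
            # During failover the virtual name is briefly offline.
            last_error = exc
            if attempt < attempts:
                sleep(delay_seconds)
    raise last_error


# Usage with a fake driver that fails twice while failover is in progress.
calls = {"count": 0}

def fake_connect(server: str, catalog: str) -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("virtual name offline")
    return f"connected to {catalog} via {server}"

print(connect_with_retry(fake_connect, sleep=lambda seconds: None))
```

Because the application only ever names `HR_VNN`, it does not need to know which server currently holds the primary role.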
Backward Compatible
What about Server Objects?
• Introducing Contained Databases or CDBs
   – Unit of application programmability in Denali
      • A DB which establishes a boundary between application
        and server
• CDBs sever the user–login relationship
   – Windows users no longer need matching logins
   – Users with passwords replace SQL logins
• Authentication information moves with the CDB
Databases are not always easy to move

[Diagram: a User DB and the objects it depends on outside the database.
  Master: Instance Collation, Logins, Credentials, Linked Server Defs., CLR, …
  MSDB: Agent, Replication, DB Mail, …
  TempDB: TempDB Collation
  Also: Other DBs, Other Apps]
Introducing the Contained Database

• New database option – CONTAINMENT
 •    Only option supported in Denali is PARTIAL, meaning
      non-enforced containment
• Partially contained databases solve problems
  related to:
 •    Logins: Database Users with passwords or mapped
      directly to Windows principals
 •    System Collation: Temp tables use the database's
      collation
 •    sys.dm_db_uncontained_entities will display all
      potential containment breaches
Availability Group Architecture

[Diagram: a Windows Server Failover Cluster hosting databases with Active Log Synchronization between them.]

Availability Group uses Windows Server Failover Cluster (WSFC) for
 • Inter-node health detection
 • Failover coordination
 • Primary health detection
 • Distributed data store for settings and state
 • Distributed change notifications

WSFC Common Microsoft Availability Platform
 • SQL Server AlwaysOn Failover cluster instances
 • SQL Server AlwaysOn Availability Group
 • Microsoft Hyper-V
 • Microsoft Exchange
 • Built-in WSFC workloads (e.g. file share, NLB, etc) and third party workloads
AlwaysOn Availability Group
    Instance Preparation

1.    Install WSFC on each machine and create a single WSFC cluster

2.    Install SQL Server Instances on each machine

3.    Enable AlwaysOn through SQL Configuration Manager

4.    CREATE ENDPOINT on each instance

•    Notes:
     – Steps 1 and 2 can occur in any order (except for AlwaysOn Failover Cluster
       Instance (FCI) installation which of course requires WSFC installed)
WSFC Cluster vs. SQL Server “Cluster”
  Setup
• Install WSFC feature

• Setup WSFC cluster


• Configure SAN and Shared Disks

• Install SQL Server Failover Cluster Instance (FCI):
    –   Specify resource group
    –   Select shared disks
    –   Configure virtual IPs
    –   Configure virtual network names
    –   Specify domain accounts for services
    –   Configure domain groups*
Simplified WSFC Cluster setup in Windows
Server 2008+
Availability Group Concepts Recap

• Availability Group
    – Defines the high availability requirements
        • Databases, Replicas, Availability Mode,
          Failover Mode etc


• Availability Replica
    – SQL Server Instances that are part of the
      availability group which hosts the physical
      copy of the database
    – Role: Primary, Secondary, Resolving

• Availability Database
    – SQL Server database that is part of an
      availability group
    – This can be a regular database or contained
      database
Availability Group Architecture Drilldown

[Diagram: three SQL Server instances, each with an AG Res DLL and a WSFC Service, hosting Availability Group 1 and Availability Group 2. Callouts:
  – Client connections transparently redirected to primary via IP and network name resources
  – User tells SQL to failover Availability Group 2 to Node1
  – Clients disconnected from AG2
  – WSFC tells AG Res DLL to bring AG2 offline
  – WSFC tells AG Res DLL to bring AG2 online
  – SQL confirms and tells WSFC
  – Notification of new primary
  – Secondaries request primary connection]
Active Secondary – Making Secondary
                    Readable

[Diagram: InstanceA and InstanceB (SQLservr.exe) each host DB1 and DB2 with Primary and Secondary roles, and Reports run against each instance.]

• Readable secondary allows offloading read queries to secondary
• Close to real-time data; latency of log synchronization impacts data freshness
• Read applications can reconnect to another secondary on failover
• Not a replacement for replication scenarios
Active Secondary: Enabling Backup
            On Secondary

• Backups can be done on any replica of a database
• Secondary replica may be synchronous or asynchronous
• Backups on primary replica still work
• Log backups done on all replicas form a single log chain
• Recovery Advisor makes restores simple

[Diagram: a Primary taking the R/W workload and two Secondaries, each with Backups.]
Readable Secondary Latency

[Diagram: Primary and Secondary copies of DB1. On the Primary: Commit, Log Cache, Log Flush to the DB1 Log, Log Capture onto the Network. On the Secondary: Log Apply, Log Cache, Log Harden to the DB1 Log, Acknowledge back to the Primary, then the Redo Thread redoes pages and the DB1 data page is updated.]

    • Updated data is visible on the readable secondary as and when the
      page is redone
    • Redo happens asynchronously after log hardening on the secondary
Readable Secondary Behavior

• Contention between redo thread and query thread
  avoided by
   – Internally mapping read workload to non blocking isolation levels
       •   Read Uncommitted → Snapshot Isolation
       •   Read Committed → Snapshot Isolation
       •   Repeatable Read → Snapshot Isolation
       •   Serializable → Snapshot Isolation
   – Ignoring all locking hints

• Maintains query performance on secondary compared to
  primary
   – Auto-create statistics on the secondary replica but persist them in
     TempDB
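
The isolation-level mapping on a readable secondary can be modelled as a small lookup. This sketch only mirrors the behaviour described on the slide; the function names are hypothetical and nothing here calls SQL Server.

```python
from typing import List

SNAPSHOT = "SNAPSHOT"

# Levels that the slide says are mapped internally on a readable secondary.
MAPPED_LEVELS = {
    "READ UNCOMMITTED",
    "READ COMMITTED",
    "REPEATABLE READ",
    "SERIALIZABLE",
}


def effective_isolation(requested: str, on_readable_secondary: bool) -> str:
    """Isolation level a read query effectively runs under."""
    level = requested.strip().upper()
    if on_readable_secondary and level in MAPPED_LEVELS:
        return SNAPSHOT
    return level


def effective_hints(hints: List[str], on_readable_secondary: bool) -> List[str]:
    """Locking hints are ignored on a readable secondary."""
    return [] if on_readable_secondary else list(hints)


print(effective_isolation("Serializable", on_readable_secondary=True))
print(effective_hints(["HOLDLOCK"], on_readable_secondary=True))
```

Running every read under snapshot isolation is what keeps report queries from blocking the redo thread.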
Providing Instance Availability
Key Enhancements
• Fast instance failover through predictable database recovery time
• Native support for multi-site clustering across subnets enables DR
  using failover cluster instances
• Flexible Failover Policy
    – Eliminates false failover
    – Configurable failure condition levels
    – Better diagnostics
• SMB support enables consolidation of more than 26 instances
• Support TEMPDB on local drive
AlwaysOn Failover Cluster Instance

• AlwaysOn Failover Cluster Instance provides
  instance level failover

• Key Enhancements
  – Multi-site clustering across subnets
  – Flexible Failover Policy
  – Improved system diagnostics
  – Support for network attached storage (NAS) using
    SMB
  – Support for tempdb on local drive
Multi-Site Clustering
• Multi-site clustering provides protection from site failures




• AlwaysOn Failover Cluster Instance natively supports multi-site
  clustering without requiring V-LAN
   – Each site can have separate IP subnet
   – DNS entry updated to reflect current IP address on failover
Flexible Failover Policy
• User sets new Cluster properties HealthCheckTimeout and FailureConditionLevel

• FailureConditionLevel (0 to 5):
    –   5 – Failover or restart on any qualified failure conditions
    –   4 – Failover or restart on moderate SQL Server errors
    –   3 – Failover or restart on critical SQL Server errors
    –   2 – Failover or restart on SQL Server unresponsive
    –   1 – Failover or restart on SQL Server down
    –   0 – No Automatic Failover or restart

• Diagnostics returned regardless of FailureConditionLevel

• All levels optimized to minimize false failures

• Diagnostics generated for Health State Components
    –   System
    –   Resource
    –   Query Processing
    –   IO Subsystem
    –   Events

[Diagram: the WSFC Service asks the FCI Res DLL whether the SQL FCI is alive (IsAlive / LooksAlive). The FCI Res DLL runs exec sp_server_diagnostics against the SQL Server Failover Cluster Instance, which periodically returns diagnostics. The IsAlive / LooksAlive result is based on the diagnostics and FailureConditionLevel.]
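
The FailureConditionLevel scale can be read as a threshold. The sketch below models that reading; the enum and function names are hypothetical, and it assumes the levels are cumulative (each level also covers the conditions of the levels below it), which the slide implies but does not state outright.

```python
from enum import IntEnum


class Failure(IntEnum):
    """Failure classes, numbered by the lowest level that reacts to them."""
    SERVER_DOWN = 1
    SERVER_UNRESPONSIVE = 2
    CRITICAL_ERROR = 3
    MODERATE_ERROR = 4
    OTHER_QUALIFIED_FAILURE = 5


def should_failover_or_restart(failure_condition_level: int, observed: Failure) -> bool:
    if not 0 <= failure_condition_level <= 5:
        raise ValueError("FailureConditionLevel must be between 0 and 5")
    # Assumption: levels are cumulative, so level 3 also reacts to classes 1 and 2.
    # Level 0 never triggers because every failure class is at least 1.
    return int(observed) <= failure_condition_level


print(should_failover_or_restart(3, Failure.SERVER_UNRESPONSIVE))
print(should_failover_or_restart(3, Failure.MODERATE_ERROR))
```

Diagnostics are still returned at every level, so a low setting limits automatic action without hiding health information.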
Reducing Planned Downtime

 Support for Windows Server Core
    Reduce OS patching by as much as 50-60%

 Support for rolling upgrade and patching of SQL Server
  for both Availability Groups and Failover Cluster Instance

 Fast failover time for both Availability Groups and Failover
  Cluster Instances

 New online operations supported
    LOB Index
    Adding of column with default
AlwaysOn Solution Guidance
Flexible Solution Choices
AlwaysOn             AlwaysOn                    AlwaysOn
Availability      Failover Cluster       Multi-site Failover Cluster
 Groups              Instances                    Instances




               Optionally combine with
               Availability Groups
               for DR
Virtualization with AlwaysOn Guidance

 Virtualization provides best consolidation isolation

 Virtualization without AlwaysOn:
    Simplest management story for limited HA/DR:
             Planned           Unplanned
     Host    Live Migration    VM failover (OS restart)
     Guest   Downtime during   No protection from
             patch             virtualization

 When to use AlwaysOn for the guest:
    Need better HA/DR protection than standalone VM
Available Now – CTP1
•   SQL Server Code Name Denali CTP1 is now public

•   CTP1 has the following feature set that you can test and provide
    feedback
     – AlwaysOn Failover Cluster Instance Features are RTM Quality:
         • Multi-Subnet Failover
         • Flexible Failover Policy

     – AlwaysOn Availability Groups Preview
         •   Ability to configure availability groups through T-SQL, SSMS, and PowerShell
         •   Multiple databases support in availability groups
         •   Read-only access to the secondary
         •   Support for Filestream data type
         •   Manually failing over and resynchronizing without reseeding
         •   Failing over client connections using the new connectivity story based on virtual
             network names and virtual IP addresses
         •   Including logins in user databases through a Contained Database
         •   SSMS, Catalog Views, and DMVs to view and monitor state
         •   Support for multiple availability groups on the same instance
         •   Support for availability groups on standalone instances and/or failover cluster
             instances
Conclusion
•   SQL Server AlwaysOn is a comprehensive high availability solution
     –   Better application availability
     –   Higher return on investment
     –   Simplified deployment and management
•   AlwaysOn Availability Group and AlwaysOn Failover Cluster Instance
    provide flexibility in HA configuration
•   Windows Server Core support significantly reduces downtime due to patching
•   SQL Server AlwaysOn Availability Group
     –   Multi-database failover
     –   Multiple secondaries
     –   Synchronous and asynchronous data movement
     –   Built-in compression and encryption
     –   Automatic and manual failover
     –   Flexible failover policy
     –   Automatic Page Repair
     –   Readable secondary
     –   Secondary backup
     –   Automatic application redirection using virtual name
     –   Configuration Wizard
     –   AlwaysOn Dashboard
     –   System Center integration
     –   Automation using PowerShell
     –   Rich diagnostic infrastructure
•   SQL Server AlwaysOn Failover Cluster Instance
     –   Multi-site clustering across subnets
     –   Flexible failover policy
     –   Improved system diagnostics
     –   Support for network attached storage (NAS) using SMB
     –   Support for tempdb on local drive
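
Several of the Availability Group features listed above (multi-database failover, multiple secondaries, synchronous and asynchronous data movement, automatic and manual failover, readable secondary, secondary backup, virtual name) map onto options of a single DDL statement. The following is a hedged sketch using the syntax of the released SQL Server 2012. Every server name, database name, endpoint URL, and IP address is hypothetical. It also assumes the AlwaysOn feature is enabled on each instance, database mirroring endpoints already exist, and the databases use the full recovery model with a full backup taken.

```sql
-- Hypothetical names throughout: AG_Sales, SalesDb, InventoryDb, SQLNODE1..3.
CREATE AVAILABILITY GROUP [AG_Sales]
WITH (AUTOMATED_BACKUP_PREFERENCE = SECONDARY)      -- secondary backup
FOR DATABASE [SalesDb], [InventoryDb]               -- multi-database failover unit
REPLICA ON
    N'SQLNODE1' WITH (
        ENDPOINT_URL      = N'TCP://sqlnode1.example.local:5022',
        AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
        FAILOVER_MODE     = AUTOMATIC
    ),
    N'SQLNODE2' WITH (
        ENDPOINT_URL      = N'TCP://sqlnode2.example.local:5022',
        AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,      -- synchronous data movement
        FAILOVER_MODE     = AUTOMATIC,               -- automatic failover partner
        SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY)  -- readable secondary
    ),
    N'SQLNODE3' WITH (
        ENDPOINT_URL      = N'TCP://sqlnode3.example.local:5022',
        AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,     -- asynchronous data movement
        FAILOVER_MODE     = MANUAL,                  -- manual failover only
        SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY)
    );

-- Virtual name that applications connect to instead of a physical server.
ALTER AVAILABILITY GROUP [AG_Sales]
ADD LISTENER N'AGSalesListener' (
    WITH IP ((N'10.0.0.50', N'255.255.255.0')),
    PORT = 1433
);
```

After this runs on the primary, each secondary instance still has to join the group and restore the databases before data movement starts; the Configuration Wizard mentioned above performs those steps as well.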
AlwaysOn Resources
 "Denali" AlwaysOn Resource Center: http://msdn.microsoft.com/en-us/sqlserver/gg490638(en-us,MSDN.10)


       CTP download
       Documentation
       MSDN forums
       Microsoft Connect
       AlwaysOn Blog

       Credits :
       Vinod Kumar
       Balmukund Lakhani
       Matt Hollingsworth
       Jon Jahren
Sql Explore Hebrew

  • 1. ‫שם המצגת‬ ‫אהרון שילה| מנכ"ל| די בי סי אס בע"מ (‪)DBCS‬‬
  • 2. ‫קצת עלי‬ ‫מי אני - ‪DBA‬‬ ‫נשוי + 3‬ ‫•‬ ‫למעלה מ01 שנים בתחום‬ ‫•‬ ‫מוסמך ‪ PRO‬בטכנולוגיות ‪Sql Server‬‬ ‫•‬ ‫ו-‪Oracle‬‬ ‫לשעבר ‪ CTO‬ומוביל תחום בג'ון ברייס‬ ‫•‬ ‫הדרכה‬ ‫מנכ"ל חברת ‪ DBCS‬העוסקת בתחום‬ ‫•‬ ‫יועץ ל- בזק בינ"ל, הלמ"ס, פונטיס,‬ ‫•‬ ‫טרפילוג, ‪ storenext ,galcomm‬ועוד.‬
  • 3. • Introduction to High Availability in SQL Server: Hardware and software solutions • Features and techniques comparison – Log Shipping – Database Mirroring – Replication – Database Snapshots – Backup improvements – Online operations • HADR deep dive: How to implement the next generation of high availability and disaster recovery solution with SQL Server
  • 4. Introduction to High Availability and Disaster Recovery • Definitions – Introduce key terms and concepts • Business Continuity Planning – Overview of the BCP process • SQL Server High Availability Planning – How does BCP apply to SQL Server availability?
  • 5. High Availability and Disaster Recovery: Definition • High Availability • Disaster Recovery • High availability is a system design protocol and associated • Processes and procedures implementation that ensures a designed to restore business certain absolute degree of operational continuity during a given operations due to a natural or measurement period human-induced disaster – Typically involves providing • Availability defined in terms of redundancy spanning multiple service level agreements (SLA) – Recovery Time sites or across geographic – Data loss during unplanned regions downtime • A highly available application should be accessible by users x% of the time
  • 6. Defining x and SLA Availability Acceptable Acceptable Data • Recovery Time Objective (RTO) Class Downtime (hrs/yr) Loss (time of last OR RTO copy) OR RPO guided by availability requirements – How much downtime can you tolerate? Tier 1 >99.99% 5 min or less (1 hr or less) Tier 2 99.9% - 99.99% (1- 5 mins to 8.5 hrs • Recovery Point Objective (RPO) 8.5 hrs) guided by criticality of Tier 3 (<99.9%) Hours to days application data (Hours to days) – How much data can you lose? RPO Tier1 RTO
  • 7. Protection Levels Regional DR • Protection against resource failures – Machine – Database Corruption – Disk • Location Redundancy Geographic DR – Building  Protection against – < 10 miles  Natural Disasters  Location Redundancy  Protection against – State, Country  Network Outages – > 100-200 miles  Site Failures  Location Redundancy – City, County Local HA – < 100-200 miles
  • 8. Business Continuity Planning • Impact Analysis – Critical Functions Analysis – Threat Identification – Recovery Objectives • Solution Design – Achieve recovery objectives for Solution relevant threats within specified Maintenance Design constraints like budget, human resources etc – CostBenefit analysis of solutions • Implementation – Deploy the recommended solution • Testing Implementati – Test to see if the solution meets the Testing on recovery requirements • Maintenance – Yearly testing and review of procedures
  • 9. SQL Server High Availability Planning • Analysis – Application tiers serviced by the databases – Causes of database downtime – Protection levels: Local HA, Regional DR, Geographic DR Analysis • Solution Design – Need to understand what solutions exists? – What are the characteristics and Maintenance Solution Design cost of the solution? • Implementation – What are the deployment steps and best practices? • Testing Testing Implementation – How do I test my implementation? • Maintenance – How do I monitor and maintain the solution?
  • 10. Database Downtime Drivers Analysis Failure Protection Unplanned Downtime User Errors Database Downtime Online Administration Planned Downtime Predictable Resourcing
  • 11. Solution Design Solution Design • Understand the Solution Architecture solutions and choices before HA Capabilities making a decision Limitations and Caveats Cost Vector
  • 12. SQL Server Solution Design Always On Technologies
  • 13. Always On Technologies Solution Design • Provides a full range of options to • Backup and Restore • Log Shipping minimize downtime Increases • Database Mirroring and maintain Availability • • Failover Clustering Peer-Peer Replication appropriate levels of application availability • Online Index Operations • Table Partitioning Decreased • Enhanced Locking • Resource Governor Downtime • Database Snapshot • Dedicated Admin Connection • Dynamic Configuration
  • 14. Always On Technology Overview Solution • Architecture Overview Design – How does it work? • Backup and Restore • Solution Characteristics Increases • Log Shipping • Database Mirroring – Data Loss Guarantees Availability • Failover Clustering – Failover Characteristics • Peer-Peer Replication – Redundancy Levels and Utilization – Cost – Limitations and Caveats
  • 15. What’s New in SQL Server 2008 • New Features • Feature Enhancements • Resource Governor • Database Mirroring – Automatic recovery from – Manage SQL Server page corruption workloads and resources – Log stream compression by specifying limits on – Faster recovery on failover resource consumption • Log Shipping – Sub-Minute Log Shipping – Backup compression • Backup Compression • Failover Clustering – Reduce backup and restore – 16 nodes time – Rolling upgrade • Peer-Peer Replication – Hot add new nodes
  • 17. Backup and Restore Solution Design • Base availability technology for any solution – Protects against failures and recovery from errors – Provides Local HA and Site DR • Need to ensure the backups are accessible if site goes down – High RTO due to restore time – RPO=0 can never be guaranteed • Types: Full, Differential, and Transaction Log – File-group backup/restore for large databases • Backup Compression provides faster and smaller backups in SQL Server 2008
  • 18. Enhanced Error Detection • In SQL Server 2000 RESTORE VERIFYONLY does not guarantee that the backup is good – Data may be corrupt • In SQL Server 2005 RESTORE VERIFYONLY checks everything – Ensures that the data is correct
  • 19. Database Checksums • SQL Server 2000 had TornPageDetection to detect incomplete I/O Operations by power failures • SQL Server 2005 adds checksums to data pages – Header of every page contains a checksum value – When reading page, it re-computes checksum and compares with checksum stored – Returns error (824) if difference found – Detects errors not reported by I/O Subsystem
  • 20. Backup Checksums • Detect errors introduced by backup hardware but not reported by hardware or operating system – Backup media error detection – Backup devices do not always detect errors – Works with • RESTORE • RESTORE VERIFYONLY • Restore also checks page checksums, if present – Disk error detection on data pages prior to backup • Can continue past errors if desired
  • 21. Backup Compression • Common questions: • ―We saw an 85 percent – ―How much compression will I see?‖ reduction in file size using – ―Will it be comparable to, say, SQL Litespeed?‖ SQL Server 2008 Backup Compression,‖ says Colin • One simple answer: Neller, Senior Software ―It depends!‖ Engineer at ServiceU and part of the company‘s SQL • All data compresses Server 2008 implementation differently – the compression team. ―A backup file that was ratio achieved depends on: previously over 300 GB is – The type of data in the database – Whether the data in the database is now only 40 GB, and the job already compressed runs in about half the time.‖ – Whether the data/database is encrypted
  • 22. Backup Compression: Backup Performance • Backup of a 322 MB Adventureworks database Uncompressed Compressed Hardly any CPU used (avg 5%), A LOT more CPU used (avg 25%) runtime = 39.5s, compression BUT runtime = 21.6s (45% ratio of 0. improvement) and backup stored in 76.7MB (4.2x compression ratio)
  • 23. DEMO
  • 25. Database Snapshots • Read-only, consistent view of a Page database – Specified point-in-time • Modifying data – Copy-on-write of affected pages • Reading data Page – Accesses snapshot if data has changed – Redirected to original database 12:00 Snapshot otherwise
  • 26. Using Database Snapshot to Recover Data Scenario Example Code / Steps Undeleting INSERT INTO Production.WorkOrderRouting rows SELECT * FROM AdventureWorks_dbsnapshot_1800.Prod.WorkOrderRouting Undoing UPDATE HR.Department an update SET Name = ( SELECT Name FROM AdventureWorks_dbsnapshot_1800.HR.Department WHERE DepartmentID = 1) WHERE DepartmentID = 1 Recovering 1 Script the object in the database snapshot a dropped object 2 Execute the script in the source database 3 Repopulate the object (if appropriate) Caution: Not a substitute for a comprehensive backup and restore strategy
  • 27. DEMO
  • 29. Log Shipping Solution Design • Automated transaction log backup and restore provides redundancy at the database level • SQLLogship.exe provides the underlying framework for doing automated backup, copy and restore – Backup on primary instance – Restore on secondary instance(s) • Scheduling is done through SQL Server Agent jobs – SQL Server 2008 provides sub-minute scheduling interval providing the ability to do quick backup and restores • No automatic failover capabilities
  • 30. Log Shipping (Key terms) • Primary Server: – Contains your primary database. – SQL Server Agent makes periodic transaction log backups to capture changes. • Secondary Server – Contain an unrecovered copy of the production database. – One standby server can contain standby databases from multiple primary servers.
  • 31. Log Shipping (Key terms) cont… • Monitor Server (Optional) – Monitors the status of the log-shipping jobs on the primary and each standby server. – One monitoring server can monitor multiple primary- standby server pairs. – Should use a server other than the primary or the standby to detect problems on either server.
  • 32. Log Shipping Copy and Restore Backups Perform Backups Copy Secondary Database Copy and Restore Backups Copy Copy Secondary Database Primary Database Copy and Restore Backups Raise Secondary Database Alerts Monitor Database
  • 33. Strength & weakness • Strengths – Can Ship Logs Across WAN (Wide-Area Network) – Protects an Entire Database • Weaknesses – Configured Per Database – NO AUTOMATIC FAILOVER
  • 34. DEMO
  • 36. Database Mirroring Solution Design • A database level high availability solution that provides complete protection against data loss and fast recovery through automatic failover • Maintains a redundant database by shipping log blocks when the transactions are committed on the principal • Synchronous and Asynchronous modes provide the spectrum of options to choose between availability and performance • Automatic failover when using witness server
  • 37. Database Mirroring Modes • High-Availability Mode – Safety Full; Synchronous operation – Database is available whenever a quorum exists – Automatic failover • High-Protection Mode – Safety Full; Synchronous operation – No witness – quorum provided by partners – If Principal loses quorum, it stops servicing the database • Ensures high protection; database is never in ‗exposed‘ state – Manual failover only; no automatic failover – A transition mode; should not be in this mode for long • High-Performance Mode – Safety Off; Asynchronous operation – Manual failover only • Supports only one form of role switching: forced service (with possible data loss)
  • 38. Database Mirroring How it works Mirror is always Application Witness redoing – it remains current Commit Principal Mirror 1 5 2 SQL Server SQL Server 2 >2 4 3 >3 Log Data Log Data
  • 39. DBM – Automatic Page Recovery Witness Client 2. Request page 3. Find page 6. Write 5. Transfer page 1. Bad Page Page Detected Log XData Data Log Principal 4. Retrieve page Mirror
  • 40. Database Mirroring Enhancements • Enhancements in SQL 2008 – Compression of stream data for which at least a 12.5 percent compression ratio can be achieved. – Automatic Recovery from Corrupted Pages. – Page read-ahead during the undo phase. – Improved use of log send buffers.
  • 41. Strength & Weakness • Strengths – Can Mirror Across WAN – Automatic Failover, and Nearly Instantaneous, Better than Failover Clustering – Protects an Entire Database • Weaknesses – Requires Enterprise Edition – Must be Configured Per Database
  • 42. DEMO
  • 44. Replication • Primarily used where availability is required in conjunction with scale out of read activity • Failover possible; a custom solution • Not limited to entire database; Can define subset of source database or tables • Copy of database is continuously accessible for read activity • Latency between source and copy can be as low as seconds
  • 45. Transactional Replication Solution Design • A high performance data replication solution that provides granular table level replication – Logical data movement provides flexibility and better hardware utilization • Key scenarios: – Customized application-specific DR – Real-time reporting on secondary server that be used for Site DR – Scale out application queries with ability to use any one database copy for Site DR • Two types relevant for HA and DR – Transactional and Peer-to-Peer
  • 46. Peer-to-Peer Replication • Provides high availability Peer Node Peer Node and read scalability • Builds redundancy by eliminating single point of failure • Enable online upgrades of servers Peer Node Peer Node • Maximize Application Uptime • Support for both Ring and Grid Topology • Centralized Management using Management Studio
  • 47. New Features Replicated Data Write Load Balancing Read Application Server User Requests
  • 48. Strength & Weakness • Strengths – Perpetual or on-demand replication of data, local or remote – Protects (duplicates or merges) the exact portion of the database I want • Weaknesses – Configured per database, even per table – Generally does not protect or duplicate an entire Database
  • 49. DEMO
  • 51. Failover Clustering (Solution Design) • Instance level protection built on the Windows Failover Clustering shared disk model – Cluster nodes are typically co-located within the same site to provide local HA – Regional DR is possible using a VLAN and stretched storage level replication • No built-in data redundancy, unlike database mirroring and log shipping – Data protection has to be provided at the storage level or by combining with other solutions
  • 52. Failover Clustering [Diagram labels: Node 1, Node 2, Node 3, Virtual Server, Shared Disk]
  • 53. SQL Server Cluster Topologies • Supports many scenarios: – Single Instance – Multiple Instance – Multiple Active Nodes – N+1 – N+M [Diagram: failover clusters hosting Inst1, Inst2, and Inst3; panels: Multiple Active Nodes; N+1: N Active, 1 Inactive Node; N+M: N Active, M Inactive Nodes]
  • 54. Failover Clustering (Facts) • Redundancy at database instance level – All databases fail over together – Shared copy of system databases • Single data copy on shared storage device – No I/O overhead reducing throughput – Storage unit is a single point of failure for the cluster • All database services are clustered – SQL Agent, Analysis Services, Full-Text engine, MS DTC • Automatic failover (can take up to minutes) • DBMS accessed over a virtual IP • Storage is controlled by one cluster node at a time • Requires hardware certified by Microsoft for Microsoft Cluster Service
  • 55. Strengths & Weaknesses (Failover Clustering) • Strengths – Provides protection against a node failure and protects the entire SQL Server instance – Automatic failover supported • Weaknesses – Generally expensive, requires specialty hardware – Not trivial to configure and manage – Doesn't protect against a complete site failure
  • 56. DEMO
  • 57. Best Practices • Back up your system databases after modifications. • Test that backups are restorable. • Practice and test your disaster recovery plans. • Documentation is not only for you. • Keep a dedicated DR server ready. • Use the BACKUP CHECKSUM feature. • Run DBCC CHECKDB regularly. • Don't ignore any runtime errors.
  • 58. What Solution Is Best for Us?
  • 59. Always On Solution (Solution Design): solution characteristics chart
      Column groups: Redundancy and RPO; Failover; Utilization; Cost
      Columns: No Data Loss (RPO=0); Failover Unit (Inst, DB, Tab); Auto Failover (RTO); Read; Multiple; Write; Hardware; App Perf Impact; Manageability
      [Per-solution marks in the redundancy, failover, and utilization columns are not recoverable; the cost ratings are listed below]
      Solution                    Hardware   App Perf Impact   Manageability
      Log Shipping *              Low        Low               Low
      DBM Sync * **               Low        High              Low
      DBM Async *                 Low        Low               Low
      Cluster                     High ***   Low ***           Low ***
      Transactional Replication   Low        Low               High
      Peer-Peer Replication       Low        Low               High
      * Log Shipping and Database Mirroring can provide point-in-time read capability using STANDBY or database snapshots, respectively
      ** Database Mirroring provides the fastest failover to a hot secondary
      *** Depends on SAN technology
  • 60. Recap (Solution Design) • Application availability requirements or SLA drive primary solution choices – RPO and RTO are the key metrics used to define the SLA • Need mitigation against planned and unplanned downtime • Multiple solution choices that provide varying cost/benefits • Other requirements apart from application SLA factor into the choice • Understand constraints and tradeoffs you can make [Diagram: Application Availability split into Unplanned Downtime and Planned Downtime; solutions shown: Database Mirroring, Clustering, Log Shipping, Peer-Peer Replication]
  • 61. Always On Solution (Solution Design): repeats the solution characteristics chart from slide 59
  • 62. AdventureWorks Inc Scenario (Solution Design) • Adventureworks Inc is a manufacturing company that manufactures and sells bicycles across the world. There are a number of applications, some of them mission critical, that run on multiple SQL Server instances • The DBA team is run by Darren, who is responsible for deploying and managing the application databases. One of his core responsibilities is to ensure availability of all application databases in order to meet the application SLA • One datacenter located in Omaha • Three applications – Manufacturing – Tier 1 – Finance – Tier 2 – Scheduling – Tier 3 • Manufacturing application runs on a dedicated SQL Server 2008 instance – All other applications run on a second instance • Availability of the manufacturing application is critical • Implement a solution at the lowest possible cost
  • 63. Application Requirements (Solution Design) [Table: rows Manufacturing, Finance, Scheduling; columns Data Loss RPO=0, RTO in secs, Failover Unit (Inst, DB, Tab), Auto Failover, Read, Multiple Sites, Read Write; cell values not recoverable] • Manufacturing application has strict SLAs • Finance application requires readability on the secondary – The reports are run every 4 hours and need to be fresh as of the last hour. To offload the reporting load from the main system, they would like to use the mirror
  • 64. Solution Choice for Manufacturing Application (Solution Design) [Table: rows Cluster, SAN Replication, DBM - Sync, DBM - Async, Log Shipping, Transactional Replication, Peer-Peer Replication; header: Solutions | Data Loss RPO=0 | Fast RTO | Failover Unit (Inst, DB, Tab) | Auto Failover | Read | >1 Sites | Read Write | Copy; cell marks not recoverable] • Clustering can provide a zero data loss solution that can also provide fast instance level failover • Use a RAID configuration to provide data redundancy on the SAN • If a redundant copy is required that can provide instance failover with zero data loss, use SAN replication (high cost solution) • Use synchronous database mirroring if instance failover is not needed [Result box: Clustering with RAID]
  • 65. Solution Choice for Finance Application (Solution Design) [Table: rows Cluster, SAN Replication, DBM - Sync, DBM - Async, Log Shipping, Transactional Replication, Peer-Peer Replication; header: Solutions | Data Loss RPO=0 | Fast RTO | Failover Unit (Inst, DB, Tab) | Auto Failover | Read | >1 Sites | Read Write | Copy; cell marks not recoverable] • For database level redundancy with acceptable data loss and minimal perf impact, asynchronous database mirroring is an optimal choice • Use database snapshots at periodic intervals to provide a readable snapshot of the data for reporting • Low cost solution [Diagram: Omaha Datacenter; Finance and Scheduling with Async Database Mirroring; Reports read from a Db Snapshot taken every hour]
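The freshness requirement for the Finance reports (data no older than one hour, served from periodic snapshots on the mirror) comes down to simple date arithmetic. The following is a minimal sketch under stated assumptions: the function names are hypothetical, snapshots are assumed to be created on a fixed schedule, and the additional lag of asynchronous mirroring itself is ignored.

```python
from datetime import datetime, timedelta


def latest_snapshot_before(report_time, first_snapshot, interval):
    """Return the creation time of the newest snapshot taken at or before report_time."""
    if report_time < first_snapshot:
        raise ValueError("no snapshot exists yet at report_time")
    # timedelta // timedelta yields the number of whole intervals elapsed.
    completed_intervals = (report_time - first_snapshot) // interval
    return first_snapshot + completed_intervals * interval


def meets_freshness(report_time, first_snapshot, interval, max_staleness):
    """True if the newest available snapshot is no older than max_staleness."""
    snapshot_time = latest_snapshot_before(report_time, first_snapshot, interval)
    return report_time - snapshot_time <= max_staleness


if __name__ == "__main__":
    first = datetime(2011, 1, 1, 0, 0)
    report = datetime(2011, 1, 1, 4, 59)
    # Hourly snapshots keep a report within the one-hour freshness bound.
    print(meets_freshness(report, first, timedelta(hours=1), timedelta(hours=1)))
```

With an hourly schedule the worst case is a report that starts just before the next snapshot, which is still under one hour stale; a two-hour schedule would not satisfy the same bound.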
  • 66. Adding a Regional Datacenter Into the Mix (Solution Design) • Regulatory and compliance requirements drive the need for an additional datacenter within a 10 mile radius to provide redundancy against site level failure. – It is now required that all applications have the ability to fail over to the regional datacenter across the river in Council Bluffs • The SLAs need to be maintained for tier 1 applications even in the case of site failures
  • 67. Regional Site Solution Choices (Solution Design) [Diagram labels: Omaha Datacenter, CB Datacenter; Manufacturing: Cluster with SAN, Sync Mirroring (no witness); Finance, Scheduling: Async Database Mirroring, Log Shipping; Reports: Db Snapshot every hour]
  • 68. A Complete Topology (Solution Design) • Considering the potential of floods and tornadoes destroying the regional data centers, Adventureworks Inc wants to maintain a disaster recovery site in San Antonio, TX • The disaster recovery site has lower SLA requirements for all applications – The manufacturing application can have an RPO of 1 hour – The RTO is set at 4 hours
  • 69. Topology Diagram (Solution Design) [Diagram labels: Manufacturing, Cluster with SAN, Sync Mirroring (no witness), Log Shipping]
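Whether a log shipping schedule can meet the one-hour RPO of the disaster recovery site can be estimated with a simplified model. This sketch assumes (it is not from the slides) that log backups run at a fixed interval and that each backup reaches the DR site a fixed copy delay after it is taken; just before a new backup arrives, the DR site is missing up to one interval plus one copy delay of changes. The function names are hypothetical.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval, copy_delay):
    """Upper bound on lost changes if the primary site fails at the worst moment.

    Simplified model: the DR site only holds backups that have finished copying,
    so the exposure is one full backup interval plus the copy delay.
    """
    return backup_interval + copy_delay


def meets_rpo(backup_interval, copy_delay, rpo):
    """True if the schedule keeps worst-case data loss within the RPO."""
    return worst_case_data_loss(backup_interval, copy_delay) <= rpo


if __name__ == "__main__":
    rpo = timedelta(hours=1)  # RPO stated for the manufacturing application at the DR site
    print(meets_rpo(timedelta(minutes=15), timedelta(minutes=5), rpo))
    print(meets_rpo(timedelta(minutes=60), timedelta(minutes=5), rpo))
```

Under this model a 15-minute backup interval with a 5-minute copy delay stays well inside the RPO, while an hourly interval does not once any copy delay is added.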
  • 70. Scale Out and Availability (Solution Design) • Scenario – Adventureworks is building a new web based order management system that allows customers from all over the world to access the system and place orders – The core group of customers are in Western Europe, South East Asia and North America • Requirements – Geo redundancy – Data locality – High availability – Local read-scale • Workload characteristics – Mainly reads – Few writes • Application characteristics – Each user logging in connects to a particular server: partitioned based on user-id and region; writes from a user always happen on one server regardless of the region the user logs in from – All reads redirected to the closest geo-location: reasonable tolerance for latency (5-10 minutes)
  • 71. Replication Topology (Solution Design) [Diagram labels: Asia1, Asia2, Peer Nodes, Read-Only Servers]
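The routing rule from the scale-out scenario (a user's writes always go to one server, reads go to the closest location) can be sketched as follows. This is an illustration only: the region keys, server names, and the `route_request` function are hypothetical and do not come from the slides.

```python
# Hypothetical server names; the slides only show labels such as Asia1 and Asia2.
WRITE_NODE_BY_REGION = {
    "asia": "asia-peer",
    "europe": "europe-peer",
    "north_america": "north-america-peer",
}
READ_NODE_BY_REGION = {
    "asia": "asia-readonly",
    "europe": "europe-readonly",
    "north_america": "north-america-readonly",
}


def route_request(operation, home_region, login_region):
    """Pick a server for a request.

    Writes always go to the peer node of the user's home region, regardless of
    where the user logs in from. Reads go to the read-only server closest to
    the login location.
    """
    if operation == "write":
        return WRITE_NODE_BY_REGION[home_region]
    if operation == "read":
        return READ_NODE_BY_REGION[login_region]
    raise ValueError("operation must be 'read' or 'write'")


if __name__ == "__main__":
    # A user whose data is partitioned to Europe logs in from Asia.
    print(route_request("write", home_region="europe", login_region="asia"))
    print(route_request("read", home_region="europe", login_region="asia"))
```

Because reads may be served from a different location than the user's writes, the application has to tolerate the replication latency mentioned in the scenario.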
  • 72. Licensing Facts • Passive servers are the mirror, the log shipping secondary, and the passive node of a cluster • No license is required on a passive server if it is truly passive • A passive server does not need a license if the number of processors in the passive server is equal to or less than the number of processors in the active server. • The passive server can take over the duties of the active server for 30 days. Afterwards, it must be licensed accordingly.
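The licensing facts on this slide can be read as a small decision rule. The sketch below encodes only what the slide states; the function and parameter names are hypothetical, and it is not a substitute for the actual license terms.

```python
def passive_server_needs_license(is_truly_passive, passive_processors,
                                 active_processors, days_acting_as_active=0):
    """Apply the rules from the slide to decide whether a passive server needs a license."""
    if not is_truly_passive:
        # A server that does other work is not passive for licensing purposes.
        return True
    if passive_processors > active_processors:
        # The exemption only holds when the passive server has no more processors.
        return True
    if days_acting_as_active > 30:
        # After 30 days of running the active workload it must be licensed.
        return True
    return False


if __name__ == "__main__":
    print(passive_server_needs_license(True, 4, 4))
    print(passive_server_needs_license(True, 4, 4, days_acting_as_active=45))
```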
  • 73. HA Features Edition Support [Table: columns Feature, Express, Workgroup, Standard, Enterprise, Comments; per-edition marks not recoverable] • Database Mirroring (footnote 1): advanced high availability solution that includes fast failover and automatic client redirection • Failover Clustering (footnote 2) • Backup Log-shipping: data backup and recovery solution • Online System Changes: includes Hot Add Memory, dedicated administrative connection, and other online operations • Online Indexing • Online Restore • Fast Recovery: database available when undo operations begin • Footnotes: 1 Single thread redo; 2 Limited to 2 node cluster
  • 74. Summary • There is no "one size fits all" solution • Consider the cost/benefits/constraints and compare them to the availability requirements of the organization to determine the best solution • Use the charts to understand the cost, benefits, and constraints of the various SQL Server High Availability solutions • TEST the solution to ensure it can meet the availability requirements and the SLAs
  • 76. SQL Server AlwaysOn: Mission Critical Capabilities in SQL Server “Denali” • Jon Jahren • Exec VP, Prediktor • jon.jahren@prediktor.no
  • 77. High Availability and Disaster Recovery: SQL Server “Denali” AlwaysOn • Faster failover, easier administration with Availability Groups • Identify databases to fail over as a unit to reduce unplanned downtime • Faster application failover using virtual name • Increase application uptime using flexible failover policy • Enable better data redundancy and protection with up to four secondaries and up to two synchronous secondaries • Limited downtime with enhanced online operations • Run Microsoft SQL Server® on Windows Server® Core to reduce planned downtime (50-60% fewer OS patch reboots) [Diagram labels: Shared Storage, Non-Shared Storage, Disaster Recovery]
  • 78. Maximize Resources Higher return on high availability investments • Increase hardware utilization through active secondaries for backups, reporting, and ad hoc queries • Reuse existing infrastructure with support for both SAN and direct attached storage Simplify management and administration • Integrated manageability for one-stop configuration • Easy setup and monitoring integrated into Microsoft SQL Server Management Studio • Availability Groups that provide failover units with contained dependencies (such as logons)
  • 79. Breakthrough Performance and Scale • Dramatically faster star-join query processing, much faster than current SQL Server (~10X) • Query speed increase varies with query and data • Reduced I/O • Consistent query performance • Reduced performance tuning effort
  • 80. Mission Critical High Availability Solution • Meets mission critical high availability SLA • Integrated, Flexible, Efficient • Microsoft recommended prescriptive HA solutions and customer references
  • 81. Introducing SQL Server AlwaysOn • Integrated, flexible, efficient high availability for mission critical business • A high availability platform built for the future • AlwaysOn provides database level and instance level protection • AlwaysOn Availability Groups for database protection – Multi-Database Failover – Multiple Secondaries – Active Secondaries – Integrated HA Management • AlwaysOn Failover Cluster Instances for instance level protection – Multisite Clustering – Flexible Failover Policy – Improved Diagnostics – Built for consolidation scenarios
  • 82. AlwaysOn – A flexible solution • AlwaysOn provides the flexibility of different HA configurations [Diagram: direct attached storage with local, regional and geo targets; shared storage with regional and geo secondaries; legend: Synchronous Data Movement, Asynchronous Data Movement]
  • 83.
  • 84. AlwaysOn Availability Groups • AlwaysOn Availability Groups is a new feature that enhances and combines database mirroring and log shipping capabilities • Flexible – Multi-database failover – Multiple secondaries: total of 4 secondaries, 2 synchronous secondaries, 1 automatic failover pair – Synchronous and asynchronous data movement – Built-in compression and encryption – Automatic and manual failover • Integrated – Application failover using virtual name – Configuration Wizard – Dashboard – System Center integration – Rich diagnostic infrastructure – FILESTREAM replication • Efficient – Active secondary: readable secondary, backup from secondary – Automation using PowerShell
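The replica limits listed on this slide (a total of 4 secondaries, 2 synchronous secondaries, 1 automatic failover pair) can be checked before attempting a deployment. The following is a minimal sketch with hypothetical class and function names; it also assumes that an automatic failover partner must be a synchronous secondary, which is how SQL Server availability groups behave.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Secondary:
    name: str
    synchronous: bool
    automatic_failover: bool = False


def validate_secondaries(secondaries):
    """Return a list of problems with a planned set of secondary replicas."""
    problems = []
    if len(secondaries) > 4:
        problems.append("more than 4 secondaries")
    if sum(1 for s in secondaries if s.synchronous) > 2:
        problems.append("more than 2 synchronous secondaries")
    automatic = [s for s in secondaries if s.automatic_failover]
    if len(automatic) > 1:
        # One automatic failover pair means the primary plus a single partner.
        problems.append("more than 1 automatic failover partner")
    for s in automatic:
        if not s.synchronous:
            problems.append(s.name + ": automatic failover requires a synchronous secondary")
    return problems


if __name__ == "__main__":
    plan = [
        Secondary("local", synchronous=True, automatic_failover=True),
        Secondary("regional", synchronous=True),
        Secondary("dr-site", synchronous=False),
    ]
    print(validate_secondaries(plan))
```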
  • 85. Availability Groups Virtual Name • Availability Groups Virtual Name allows applications to fail over seamlessly to any secondary – Application reconnects using a virtual name after a failover to a secondary • Application retries during failover • Connect to the new primary once failover is complete and the virtual name is online (-server HR_VNN; -catalog HRDB) [Diagram: ServerA, ServerB, and ServerC each host the HR DB in availability group AG_HR behind virtual network name HR_VNN; primary and secondary roles shown before and after failover]
  • 87. What about Server Objects? • Introducing Contained Databases, or CDBs – Unit of application programmability in Denali – A DB which establishes a boundary between application and server • Authentication information moves with the CDB – CDBs sever the user–login relationship – Windows users no longer need matching logins – Users with passwords replace SQL logins
  • 88. Databases are not always easy to move [Diagram labels: Master, MSDB, TempDB, User DB, Instance Collation, Agent, Logins, Replication, Credentials, DB Mail, Linked Server Defs., CLR, TempDB Collation, Other DBs, Other Apps]
  • 89. Introducing the Contained Database • New database option – CONTAINMENT • The only option supported in Denali is PARTIAL, meaning non-enforced containment • Partially contained databases solve problems related to: • Logins: database users with passwords or mapped directly to Windows principals • System collation: temp tables use the database's collation • sys.dm_db_uncontained_entities will display all potential containment breaches
  • 90. Availability Group Architecture • Availability Group uses Windows Server Failover Cluster (WSFC) for – Inter-node health detection – Failover coordination – Primary health detection – Distributed data store for settings and state – Distributed change notifications • WSFC: Common Microsoft Availability Platform – SQL Server AlwaysOn Failover Cluster Instances – SQL Server AlwaysOn Availability Group – Microsoft Hyper-V – Microsoft Exchange – Built-in WSFC workloads (e.g. file share, NLB, etc.) and third party workloads [Diagram: Windows Server Failover Cluster; databases connected by Active Log Synchronization]
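The "application retry during failover" behavior can be sketched as a retry loop with backoff around whatever connection call the application uses. This is a minimal sketch, not a driver API: `connect_with_retry` and the `connect` callable are hypothetical, and a real application would pass a function that opens a connection to the virtual name (HR_VNN in the slide).

```python
import time


def connect_with_retry(connect, attempts=5, initial_delay=1.0, backoff=2.0,
                       sleep=time.sleep):
    """Call connect() until it succeeds or the attempts are exhausted.

    While a failover is in progress the virtual name is briefly offline, so
    connection errors are retried with an increasing delay.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts:
                raise  # give up and surface the last error to the caller
            sleep(delay)
            delay *= backoff


if __name__ == "__main__":
    outcomes = iter([ConnectionError("failover in progress"), "connected to HR_VNN"])

    def fake_connect():
        # Stand-in for a real connection call; fails once, then succeeds.
        result = next(outcomes)
        if isinstance(result, Exception):
            raise result
        return result

    print(connect_with_retry(fake_connect, sleep=lambda seconds: None))
```

The `sleep` parameter is injectable only so the loop can be exercised without waiting.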
  • 91. AlwaysOn Availability Group Instance Preparation 1. Install WSFC on each machine and create a single WSFC cluster 2. Install SQL Server Instances on each machine 3. Enable AlwaysOn through SQL Configuration Manager 4. CREATE ENDPOINT on each instance • Notes: – Steps 1 and 2 can occur in any order (except for AlwaysOn Failover Cluster Instance (FCI) installation which of course requires WSFC installed)
  • 92. WSFC Cluster vs. SQL Server “Cluster” Setup • Install WSFC feature • Set up WSFC cluster • Configure SAN and shared disks • Install SQL Server Failover Cluster Instance (FCI): – Specify resource group – Select shared disks – Configure virtual IPs – Configure virtual network names – Specify domain accounts for services – Configure domain groups*
  • 93. Simplified WSFC Cluster setup in Windows Server 2008+
  • 94. Availability Group Concepts Recap • Availability Group – Defines the high availability requirements • Databases, replicas, availability mode, failover mode, etc. • Availability Replica – A SQL Server instance that is part of the availability group and hosts a physical copy of the database – Role: Primary, Secondary, Resolving • Availability Database – SQL Server database that is part of an availability group – This can be a regular database or a contained database
  • 95. Availability Group Architecture Drilldown [Diagram: three SQL Server instances hosting Availability Group 1 and Availability Group 2, each with an AG Res DLL and a WSFC Service; callouts: client connections transparently redirected to primary via IP and network name; user tells SQL to fail over Availability Group 2 to Node1; SQL confirms and tells WSFC; WSFC tells AG Res DLL to bring AG2 offline; clients disconnected from AG2; WSFC tells AG Res DLL to bring AG2 online; notification of new primary; secondaries request primary connection]
  • 96. Active Secondary – Making Secondary Readable • Readable secondary allows offloading read queries to the secondary • Close to real-time data; latency of log synchronization impacts data freshness • Read applications can reconnect to another secondary on failover • Not a replacement for replication scenarios [Diagram: two SQLservr.exe processes, InstanceA (Primary, Secondary) and InstanceB (Secondary, Primary), each hosting DB1 and DB2; Reports]
  • 97. Active Secondary: Enabling Backup On Secondary • Backups can be done on any replica of a database • Secondary replica may be synchronous or asynchronous • Backups on the primary replica still work • Log backups done on all replicas form a single log chain • Recovery Advisor makes restores simple [Diagram: R/W workload on the Primary; Backups taken on the Primary and on two Secondaries]
  • 98. Readable Secondary Latency • Updated data is visible on the readable secondary as and when the page is redone • Redo happens asynchronously after log hardening on the secondary [Diagram labels: Primary: Commit, Log Cache, Log Flush, Log Capture, DB1 Log, DB1 Data; Network; Secondary: Log Apply, Log Cache, Log Harden, Acknowledge, Redo Thread, Redo Pages, DB1 Log, DB1 Data, Page Updated]
  • 99. Readable Secondary Behavior • Contention between redo thread and query thread avoided by – Internally mapping read workload to non-blocking isolation levels • Read Uncommitted → Snapshot Isolation • Read Committed → Snapshot Isolation • Repeatable Read → Snapshot Isolation • Serializable → Snapshot Isolation – Ignoring all locking hints • Maintains query performance on secondary compared to primary – Auto-create statistics on the secondary replica but persist them in TempDB
  • 101. Key Enhancements • Flexible Failover Policy – Eliminates false failover – Configurable failure condition levels – Better diagnostics • Fast instance failover through predictable database recovery time • Native support for multi-site clustering across subnets enables DR using failover cluster instances • SMB support enables consolidation of more than 26 instances • Support for TEMPDB on local drive
  • 102. AlwaysOn Failover Cluster Instance • AlwaysOn Failover Cluster Instance provides instance level failover • Key Enhancements – Multi-site clustering across subnets – Flexible Failover Policy – Improved system diagnostics – Support for network attached storage (NAS) using SMB – Support for tempdb on local drive
  • 103. Multi-Site Clustering • Multi-site clustering provides protection from site failures • AlwaysOn Failover Cluster Instance natively supports multi-site clustering without requiring a VLAN – Each site can have a separate IP subnet – DNS entry updated to reflect current IP address on failover
  • 104. Flexible Failover Policy • User sets new cluster properties HealthCheckTimeout and FailureConditionLevel • FailureConditionLevel (0 to 5): – 5 – Failover or restart on any qualified failure conditions – 4 – Failover or restart on moderate SQL Server errors – 3 – Failover or restart on critical SQL Server errors – 2 – Failover or restart on SQL Server unresponsive – 1 – Failover or restart on SQL Server down – 0 – No automatic failover or restart • Diagnostics generated for Health State Components: System, Resource, Query Processing, IO Subsystem, Events • Diagnostics returned regardless of FailureConditionLevel • All levels optimized to minimize false failures [Diagram: SQL Server Failover Cluster Instance periodically returns diagnostics (exec sp_server_diagnostics) to the FCI Res DLL; WSFC Service asks the Res DLL if the SQL FCI is alive (IsAlive/LooksAlive); the IsAlive/LooksAlive result is based on diagnostics and FailureConditionLevel]
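How FailureConditionLevel gates the failover decision can be modeled in a few lines. This sketch assumes the levels are cumulative, so that each level also covers the conditions of the levels below it; the enum and function names are hypothetical and only mirror the level descriptions on the slide.

```python
from enum import IntEnum


class FailureCondition(IntEnum):
    """Lowest FailureConditionLevel at which each condition triggers action."""
    SERVER_DOWN = 1
    SERVER_UNRESPONSIVE = 2
    CRITICAL_ERROR = 3
    MODERATE_ERROR = 4
    ANY_QUALIFIED_FAILURE = 5


def should_failover_or_restart(failure_condition_level, detected):
    """True if the detected condition is covered by the configured level.

    Level 0 disables automatic failover and restart entirely, because every
    condition requires at least level 1.
    """
    if not 0 <= failure_condition_level <= 5:
        raise ValueError("FailureConditionLevel must be between 0 and 5")
    return detected <= failure_condition_level


if __name__ == "__main__":
    level = 3
    for condition in FailureCondition:
        print(condition.name, should_failover_or_restart(level, condition))
```

Diagnostics are still returned at every level, so the model only describes when action is taken, not what is reported.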
  • 105. Reducing Planned Downtime • Support for Windows Server Core – Reduce OS patching by as much as 50-60% • Support for rolling upgrade and patching of SQL Server for both Availability Groups and Failover Cluster Instances • Fast failover time for both Availability Groups and Failover Cluster Instances • New online operations supported – LOB index – Adding a column with a default
  • 107. Flexible Solution Choices • AlwaysOn Availability Groups • AlwaysOn Failover Cluster Instances • AlwaysOn Multi-site Failover Cluster Instances • Optionally combine with Availability Groups for DR
  • 108. Virtualization with AlwaysOn Guidance • Virtualization provides best consolidation isolation • Virtualization without AlwaysOn: – Simplest management story for limited HA/DR: Host: planned downtime handled by Live Migration, unplanned downtime by VM failover (OS restart); Guest: downtime during patching when planned, no protection from virtualization when unplanned • When to use AlwaysOn for the guest: – Need better HA/DR protection than a standalone VM
  • 109. Available Now – CTP1 • SQL Server Code Name Denali CTP1 is now public • CTP1 has the following feature set that you can test and provide feedback – AlwaysOn Failover Cluster Instance Features are RTM Quality: • Multi-Subnet Failover • Flexible Failover Policy – AlwaysOn Availability Groups Preview • Ability to configure availability groups through T-SQL, SSMS, and PowerShell • Multiple databases support in availability groups • Read-only access to the secondary • Support for Filestream data type • Manually failing over and resynchronizing without reseeding • Failing over client connections using the new connectivity story based on virtual network names and virtual IP addresses • Including logins in user databases through a Contained Database • SSMS, Catalog Views, and DMVs to view and monitor state • Support for multiple availability groups on the same instance • Support for availability groups on standalone instances and/or failover cluster instances
  • 110. Conclusion • SQL Server AlwaysOn is a comprehensive high availability solution – Better application availability – Higher return on investment – Simplified deployment and management • AlwaysOn Availability Group and AlwaysOn Failover Cluster Instance provide flexibility in HA configuration • Windows Server Core support significantly reduces downtime due to patching • SQL Server AlwaysOn Availability Group – Multi-database failover – Multiple secondaries – Synchronous and asynchronous data movement – Built-in compression and encryption – Automatic and manual failover – Flexible failover policy – Automatic Page Repair – Readable secondary – Secondary backup – Automatic application redirection using virtual name – Configuration Wizard – AlwaysOn Dashboard – System Center integration – Automation using PowerShell – Rich diagnostic infrastructure • SQL Server AlwaysOn Failover Cluster Instance – Multi-site clustering across subnets – Flexible Failover Policy – Improved system diagnostics – Support for network attached storage (NAS) using SMB – Support for tempdb on local drive
  • 111. AlwaysOn Resources • “Denali” AlwaysOn Resource Center: http://msdn.microsoft.com/en-us/sqlserver/gg490638(en-us,MSDN.10) – CTP download – Documentation – MSDN forums – Microsoft Connect – AlwaysOn Blog • Credits: – Vinod Kumar – Balmukund Lakhani – Matt Hollingsworth – Jon Jahren