Preventing Data Corruption in the
Event of an Extended Power Outage

White Paper 10
Revision 3


by Victor Avelar




    Executive summary

    Despite advances in computer technology, power outages continue to be a major cause
    of PC and server downtime. Protecting computer systems with uninterruptible power
    supply (UPS) hardware is part of a total solution, but power management software is
    also necessary to prevent data corruption after extended power outages. Various
    software configurations are discussed, and best practices aimed at ensuring uptime
    are presented.

    Contents

    Introduction
    Recommended configurations for UPS software
    Different types of operating system shutdown
    Best practices
    Conclusion
    Resources




Introduction

               An extended power outage, which can strike at any time, can prevent unprotected computers
               from initiating their required shutdown procedure. PC and server operating systems are not
               designed to tolerate abrupt losses of power, known as “hard” shutdowns. Instead, they rely on
               a set of built-in processes that prepare a computer for shutdown, such as saving memory
               contents and stopping applications and services. Shutting down in this manner is often
               referred to as a “graceful” shutdown. Hard shutdowns, on the other hand, can result in lost
               or corrupted data and a lengthier time-to-recovery after power returns.

               An uninterruptible power supply (UPS) can protect the system from damaging power problems
               and improve server availability by allowing users to continue working without interruption
               during a short power outage. An extended power outage is defined as any outage that might
               outlast the UPS’s runtime. During such an outage, a system equipped with UPS shutdown
               software can communicate with the UPS and perform a graceful, unattended system shutdown
               before the UPS battery is exhausted.
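
The decision that UPS shutdown software has to make can be sketched in a few lines of Python. This is a minimal illustration, not the logic of any particular product: the `UpsStatus` fields, the `should_start_shutdown` function, and the 60-second safety margin are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class UpsStatus:
    """Snapshot of UPS state; field names are hypothetical, not a vendor API."""
    on_battery: bool
    runtime_remaining_s: float  # estimated seconds of battery runtime left


def should_start_shutdown(status: UpsStatus,
                          shutdown_duration_s: float,
                          safety_margin_s: float = 60.0) -> bool:
    """Return True once the remaining runtime no longer comfortably covers
    the time the operating system needs to shut down gracefully."""
    if not status.on_battery:
        return False  # utility power is present; nothing to do
    return status.runtime_remaining_s <= shutdown_duration_s + safety_margin_s
```

A short outage never trips the condition because utility power returns while plenty of runtime remains. An extended outage eventually does, while there is still enough battery left to complete the shutdown.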

               Extended power outages have many causes, ranging from a local transformer failure due to
               lightning to a regional power grid going offline. Steps must be taken to protect computer
               systems and the data they store from the corrupting effects of a hard shutdown. One cause
               of potential data corruption in the event of an extended power outage is abnormal
               termination of applications or the operating system while they are manipulating data. This
               can affect documents, critical file system structures (such as File Allocation Tables), or
               dynamic application data. In many cases it also increases time-to-recovery when power
               returns, as the operating system or application attempts to rebuild the corrupted structures.

               Another concern is the computer’s hard drive. The industry has made progress over the last
               decade in preventing “head crashes” (where the read/write head of the hard drive could
               damage the surface of the disk if not properly “parked”). However, another advance in hard
               drive technology has actually increased the likelihood of data corruption. To achieve high
               levels of performance, hard disk controllers are often designed to take advantage of caching
               techniques, which involve temporarily writing information to memory and then writing the
               data out to the actual disk later. In the event of a power loss, information in the cache is
               lost, potentially corrupting files or data.
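
To make the caching risk concrete, the Python sketch below shows how an application can ask for its data to be pushed out of memory buffers before it treats a write as complete. It is an illustration and does not come from the white paper. The `durable_write` helper is a hypothetical name, and whether `os.fsync` reaches past the drive’s own write cache depends on the operating system and hardware, so this complements a graceful shutdown rather than replacing it.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Write data to path via a temporary file, forcing it out of
    application and operating system buffers before the rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # flush Python's userspace buffer
            os.fsync(tmp.fileno())  # ask the OS to push data to the device
        os.replace(tmp_path, path)  # atomically swap in the complete file
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Because the new contents are swapped in only after they have been flushed, a power loss during the write should leave the previous version of the file in place rather than a partially written one.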

               One does not have to search extensively in business and government publications to see
               that, despite technological advances, data corruption due to power loss is still a widely
               recognized problem in the IT industry. This is emphasized in the industry quotes below:

                 • “Even a moment’s disruption can have devastating effects on power sensitive customers
                    such as internet service providers, data centers, wireless telecommunication networks,
                    on-line traders, computer chip manufacturers and medical research centers. For these
                    customers, power disruptions can result in data corruption, burned circuit boards,
                    component damage, file corruption and lost customers.”
                    - U.S. Dept. of Energy Office of Power Technologies, Electrical Power Interruption Cost
                    Estimates for Individual Industries, Sectors, and U.S. Economy, February 2002
                 • “Failure to boot after a power failure is generally caused by corrupted files or a
                    damaged hard disk - neither of which last known good configuration is capable of
                    repairing.”
                    - MCSE Microsoft® Windows® XP Professional Readiness Review Exam 70-270,
                    Section 70-270.04.03.002, 11/28/2001
                 • “Total failures, or blackouts, constitute a complete loss of electrical power to the
                    networking or computing equipment…these failures can cause system and network crashes,
                    PC lockups, and corruption or loss of valuable data from servers and workstations.”
                    - Contingency Planning Management Magazine, Power Protection Basics, March 2002
                 • “The system and its data can become corrupt as a result of a power failure....a UPS can
                    protect the system if power is lost. A UPS usually provides ...temporary power which
                    may be enough to permit a graceful shutdown.”
                    - National Institute of Standards and Technology, Special Publication 800-34,
                    Contingency Planning Guide for Information Technology Systems, June 2002




Recommended configurations for UPS software

                               Configuration 1: Protecting a single computer with a single UPS
                               In this configuration, each computer is backed up by its own UPS, and the UPS
                               communicates with the computer over a serial or USB cable. UPS software is installed on
                               the computer to provide graceful, unattended shutdown in the event of an extended power
                               outage. In this case the UPS is managed locally by the connected computer. This is the
                               simplest configuration and is widespread for both server and workstation deployments.



                               Configuration 2: Protecting two to three computers with a single UPS
                               In this configuration, several computers are plugged into a larger UPS (typically one rated at
                               1500 VA or higher). One computer will be connected directly to the serial port on the UPS,
                               while the other two are connected to an expansion card installed in the UPS that provides two
                               additional serial ports. In this situation, all three computers will have graceful shutdown
                               capability, but management of the UPS is handled via the computer connected directly to the
                               UPS. Note that since the USB standard addresses communication with a single system only,
                               USB connections cannot be used in this configuration. Although this scheme can be
                               extended to handle up to 24 computers (via daisy-chaining), APC by Schneider Electric does
                               not recommend such an approach due to the additional cabling required.




                               Figure 1: Protecting a single computer with a single UPS. Diagram elements:
                               management console, server running UPS software, UPS. Links shown: power;
                               serial or USB communications.

                               Figure 2: Protecting two to three computers with a single UPS. Diagram elements:
                               management console, servers running UPS software, interface expander, UPS with
                               built-in expansion slot. Links shown: power; serial or USB communications.




                               Configuration 3: Protecting three or more computers with a single UPS
                               An increasingly popular approach involves managing the UPS directly over an Ethernet
                               network. A network management card (with a real-time operating system and hardware
                               watchdog chip) installed in the UPS eliminates the requirement for server-based
                               management. The InfraStruXure architecture from APC is one example of this approach.
                               Software installed on the computers in this configuration needs to provide only
                               shutdown functionality, since management capabilities are embedded in the UPS itself.


                               Figure 3: Protecting three or more computers with a single UPS. Diagram elements:
                               management console, servers running UPS software, network management card, UPS
                               with built-in expansion slot. Links shown: power; network.








Different types of operating system shutdown

                  Modern operating systems such as Microsoft Windows® increasingly include more advanced
                  approaches to power management, including new methods of shutting down. Although these
                  advances have largely been driven by laptop user requirements, selecting the right method
                  for use with UPS software can decrease time-to-recovery after an extended power outage.



                  Shutdown
                  This is the traditional method, in which the computer’s operating system receives a
                  shutdown command from the UPS shutdown software and goes through a sequence of terminating
                  active processes before exiting. On a Windows® system, for instance, this would bring the
                  computer to the state where the message “It is now safe to turn off your computer” appears.



                  Shutdown and “off”
                  This is similar to the method above, but at the end of the process the operating system
                  commands the computer to turn off, and it goes into a state where it no longer draws
                  power. This can be a useful approach for Configuration 2 above: one computer can be shut
                  down and turned off to lengthen the runtime of the remaining computers (this approach is
                  sometimes referred to as “load shedding”). Shutdown and “off” capability sometimes requires
                  a BIOS setting change to enable the “off” portion to occur.
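
The effect of load shedding on runtime can be approximated with simple arithmetic. The Python sketch below assumes, purely for illustration, a fixed usable battery energy so that runtime scales inversely with load. Real UPS runtime curves are nonlinear, and the battery capacity, computer names, and wattages here are hypothetical.

```python
def estimated_runtime_min(battery_wh: float, loads_w: dict[str, float]) -> float:
    """Rough runtime estimate: usable battery energy divided by total load.
    Real UPS runtime curves are nonlinear; this is for illustration only."""
    total_w = sum(loads_w.values())
    if total_w <= 0:
        return float("inf")
    return battery_wh / total_w * 60.0


def shed(loads_w: dict[str, float], name: str) -> dict[str, float]:
    """Return the loads that remain after one computer is shut down and off."""
    return {k: v for k, v in loads_w.items() if k != name}


# Hypothetical figures: a 300 Wh usable battery and three computers.
loads = {"file-server": 300.0, "mail-server": 200.0, "workstation": 100.0}
before = estimated_runtime_min(300.0, loads)
after = estimated_runtime_min(300.0, shed(loads, "workstation"))
print(f"runtime before shedding: {before:.0f} min, after: {after:.0f} min")
```

Under this simplified model, turning off the 100 W workstation stretches the estimate from about 30 to about 36 minutes for the two servers that remain.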



                  Hibernation
                  A hibernation process (for instance, as found in Microsoft’s latest Windows® operating
                  systems) is similar to the methods above, but some highly valuable additional steps are
                  taken:

                    1. First, the computer’s desktop state, including all open files and documents, is
                         saved. The operating system does this by saving all of RAM to a large file on the
                         hard disk.
                    2. Then the system is shut down and powered off.
                    3. When power returns and the operating system boots up, the RAM is reloaded from the
                         hard disk.
                    4. The desktop and all open files and applications are then presented as they appeared
                         before the hibernation occurred.


                  Hibernation has a major advantage over the other methods: it preserves both work in
                  progress and the state of the machine before the shutdown occurred. For these reasons, APC
                  strongly recommends customers consider selecting this method of shutdown for their UPS
                  software.
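
For readers scripting their own response to a UPS event, the methods above map onto operating system commands roughly as follows. This Python sketch only looks up a command line and does not run it. The `COMMANDS` table and `shutdown_command` helper are hypothetical names, the Linux entries assume a systemd-based host, and commercial UPS software normally exposes the same choice as a configuration setting.

```python
import platform
from typing import Optional

# Command lines for each shutdown method. The Windows entries use the
# built-in shutdown.exe switches; the Linux entries assume a systemd host.
COMMANDS = {
    "Windows": {
        "shutdown_off": ["shutdown", "/s", "/t", "0"],
        "hibernate": ["shutdown", "/h"],
    },
    "Linux": {
        "shutdown_off": ["systemctl", "poweroff"],
        "hibernate": ["systemctl", "hibernate"],
    },
}


def shutdown_command(method: str, system: Optional[str] = None) -> list:
    """Look up the command for a shutdown method without running it."""
    system = system or platform.system()
    try:
        return COMMANDS[system][method]
    except KeyError:
        raise ValueError(f"no {method!r} command known for {system!r}") from None
```

Keeping the lookup separate from execution makes it possible to test the selection on any machine before wiring it to a real UPS event.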



                  Standby
                  When a computer goes into “standby” mode, it is not turned completely off, but is placed into
                  a low power state where certain components (monitor, I/O chips, etc.) are powered down.
                  DRAM continues to be refreshed, so when the computer is taken out of “standby” mode it
                  typically returns to its previous state very quickly. If you select a standby setting for
                  your computer, make sure that the UPS you select can “wake” the system in the event of an
                  extended power outage so a graceful shutdown can be initiated. Otherwise the system may
                  stay in standby until the UPS is completely drained, at which point power to the system
                  will be dropped (a “hard” shutdown).








Best practices

                              Purchase a UPS with extended runtime capability and a generator if required
                              The amount of standardized data on AC power reliability is limited. However, two
                              significant surveys of AC power reliability in the USA have been done, one by AT&T
                              Bell Labs and one by IBM. In addition, American Power Conversion has experience from
                              approximately 8 million installed UPS systems, many of which are capable of logging
                              power problems. In the USA, the data from the surveys agrees with the experience of
                              APC and shows the following essential points:

                              The average number of outages sufficient to cause IT system malfunction per year at a
                              typical site is approximately 15:

                                • 90% of the outages are less than 5 minutes in duration (conversely, 10% are greater
                                   than 5 minutes)
                                • 99% of the outages are less than 1 hour in duration (conversely, 1% are greater than 1
                                   hour)
                                • Total cumulative outage duration is approximately 100 minutes per year

                              This information is highly variable from site to site, and in some geographic
                              locations in the USA, such as Florida, the outage rate is an order of magnitude
                              higher. Building-specific problems can also raise the outage rate by as much as three
                              orders of magnitude. This data is also believed to be representative of Japan and
                              Western Europe.
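
The figures above translate directly into how often a typical site should expect an outage that outlasts a given UPS runtime. The short Python calculation below uses only the numbers quoted in this section; the variable and function names are introduced here for illustration.

```python
OUTAGES_PER_YEAR = 15          # typical-site figure quoted above
FRACTION_OVER_5_MIN = 0.10     # 10% of outages exceed 5 minutes
FRACTION_OVER_1_HOUR = 0.01    # 1% of outages exceed 1 hour


def expected_per_year(fraction: float, outages: float = OUTAGES_PER_YEAR) -> float:
    """Expected yearly count of outages in a given duration class."""
    return outages * fraction


over_5_min = expected_per_year(FRACTION_OVER_5_MIN)
over_1_hour = expected_per_year(FRACTION_OVER_1_HOUR)
print(f"outages longer than 5 minutes: about {over_5_min:.1f} per year")
print(f"outages longer than 1 hour: about one every {1 / over_1_hour:.1f} years")
```

At a typical site this works out to roughly 1.5 outages per year that exceed five minutes, and an outage longer than an hour about once every six to seven years. Sites with the higher outage rates mentioned above would scale these figures accordingly.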

                              Since 10% of outages are greater than 5 minutes and 1% are greater than one hour,
                              purchasing a UPS with extended runtime capability merits serious consideration when
                              the cost of downtime is high. In cases where hours of runtime are required, a
                              generator is recommended. However, a UPS with a few minutes of runtime is still
                              required to maintain the load until the generator comes online. For more information
                              on this topic see APC White Paper 52, Four Steps to Determine when a Standby
                              Generator is Needed for Small Data Centers and Network Rooms.



                              Protect the network equipment with UPSs
                              Applications are only as available as the network through which they are accessed.
                              Power protection for hubs, routers, and switches is an essential but sometimes
                              overlooked ingredient in ensuring availability of applications. Additionally, if
                              computers are running UPS shutdown software as in Configuration 3 above, the software
                              requires the network to be up during the power outage in order to communicate with
                              the UPS. If the network is unprotected, the computers will not shut down gracefully.



                              Accommodate each server’s individual time requirement for
                              shutdown
                              The time required to properly shut down the operating system varies from system to
                              system. Some email servers with many accounts, for instance, have been known to take
                              upwards of 20 minutes to shut down. Make sure the UPS software’s settings take each
                              of your computers’ specific requirements into account.
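
One way to check these settings is to work backward from the UPS runtime to the latest moment each server can begin shutting down. The Python sketch below is illustrative only. The server names, the shutdown durations other than the 20-minute email server, the 30-minute UPS runtime, and the 60-second margin are all assumptions.

```python
def latest_start_s(ups_runtime_s: float, shutdown_s: float,
                   margin_s: float = 60.0) -> float:
    """Seconds after the outage begins by which shutdown must start.
    A negative result means the UPS runtime cannot cover this server."""
    return ups_runtime_s - shutdown_s - margin_s


# Hypothetical shutdown durations in seconds; the 20-minute mail server
# mirrors the example in the text.
servers = {"mail": 20 * 60, "web": 3 * 60, "file": 5 * 60}
UPS_RUNTIME_S = 30 * 60  # assumed 30 minutes of runtime at this load

for name, duration in sorted(servers.items(), key=lambda kv: -kv[1]):
    start = latest_start_s(UPS_RUNTIME_S, duration)
    status = "ok" if start >= 0 else "UPS runtime too short"
    print(f"{name}: start shutdown by {start / 60:.0f} min into outage ({status})")
```

The slowest server sets the tightest deadline. With these assumed numbers, the mail server must begin shutting down nine minutes into the outage, long before the faster servers need to react.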








Conclusion

             Without shutdown software installed on the protected computer, the net effect of the UPS is
             simply to delay the inevitable. Regardless of which configuration, which best practices, and
             which particular UPS software is utilized, APC highly recommends customers not overlook
             this requirement – the small effort involved in installing and configuring such software can be
             well worth it in the event of an extended power outage that exceeds the runtime of the UPS.




                      About the author
                 Victor Avelar is a Senior Research Analyst at APC by Schneider Electric. He is responsible
                 for data center design and operations research, and consults with clients on risk assessment
                 and design practices to optimize the availability and efficiency of their data center
                 environments. Victor holds a Bachelor’s degree in Mechanical Engineering from Rensselaer
                 Polytechnic Institute and an MBA from Babson College. He is a member of AFCOM and the
                 American Society for Quality.








 Resources



                                         Four Steps to Determine when a Standby Generator
                                         is Needed for Small Data Centers and Network Rooms
                                         APC White Paper 52




                                         APC White Paper Library
                                         whitepapers.apc.com




                                         APC TradeOff Tools™
                                         tools.apc.com




References

                                       1. Allen and Segall, Monitoring of Computer Installations for Power Line Disturbances,
                                           IBM, IEEE PES Winter Conference, 1974.
                                           A study conducted from 1969 to 1970 using 38 monitor-months of data.
                                       2. Goldstein and Speranza, The Quality of US Commercial AC Power, AT&T Bell Labs,
                                           INTELEC Conference, 1982.
                                           A study conducted from 1977 to 1979 at 24 sites around the US.
                                       3. Martzloff, Power Quality Site Surveys: Facts, Fiction, and Fallacies, IEEE Transactions
                                           on Industry Applications, Vol 24, No 6.




                                              Contact us
                                        For feedback and comments about the content of this white paper:

                                            Data Center Science Center, APC by Schneider Electric
                                            DCSC@Schneider-Electric.com

                                        If you are a customer and have questions specific to your data center project:

                                              Contact your APC by Schneider Electric representative




 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTExpeed Software
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsPixeldarts
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthThinkNow
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfmarketingartwork
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024Neil Kimberley
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)contently
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024Albert Qian
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsKurio // The Social Media Age(ncy)
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Search Engine Journal
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summarySpeakerHub
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next Tessa Mero
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentLily Ray
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best PracticesVit Horky
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project managementMindGenius
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...RachelPearson36
 

Featured (20)

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPT
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage Engineerings
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental Health
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
 
Skeleton Culture Code
Skeleton Culture CodeSkeleton Culture Code
Skeleton Culture Code
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 

WP10 Preventing Data Corruption During Power Outages

  • 1. Preventing Data Corruption in the Event of an Extended Power Outage
White Paper 10, Revision 3, by Victor Avelar

Executive summary
Despite advances in computer technology, power outages continue to be a major cause of PC and server downtime. Protecting computer systems with uninterruptible power supply (UPS) hardware is part of a total solution, but power management software is also necessary to prevent data corruption after extended power outages. Various software configurations are discussed, and best practices aimed at ensuring uptime are presented.

Contents
Introduction, 2
Recommended configurations for UPS software, 3
Different types of operating system shutdown, 5
Best practices, 6
Conclusion, 7
Resources, 8
  • 2. Introduction
An extended power outage, which can strike at any time, can prevent unprotected computers from initiating their required shutdown procedure. PC and server operating systems are not designed to support abrupt losses of power, known as “hard” shutdowns, but rather rely on a set of built-in processes that prepare a computer for shutdown, such as saving memory and stopping applications and services. Shutting down in this manner is often referred to as a “graceful” shutdown. Hard shutdowns, on the other hand, can result in lost or corrupted data and a lengthier time-to-recovery after power returns.

An uninterruptible power supply (UPS) can protect the system from damaging power problems and improve server availability by allowing users to continue working without interruption during a short power outage. During an extended power outage, defined as any outage that might outlast the UPS's runtime, a system equipped with UPS shutdown software can communicate with the UPS and perform a graceful, unattended system shutdown before the UPS battery is exhausted.

Extended power outages have many causes, ranging from a local transformer failure due to lightning to a regional power grid going offline. Steps must be taken to protect computer systems and the data they store from the corrupting effects of a hard shutdown. One cause of potential data corruption in the event of an extended power outage is abnormal termination of applications or the operating system while they are manipulating data. This can affect documents, critical file system structures (such as File Allocation Tables), or dynamic application data, and in many cases can also lead to increased time-to-recovery when power returns, as the operating system or application attempts to rebuild corrupted tables and similar structures. Another cause of concern is the computer's hard drive.
While hard drive technology has progressed over the last decade in preventing “head crashes” (where the read/write head of the hard drive could actually damage the surface of the disk if not properly “parked”), another advance in hard drive technology has actually increased the likelihood of data corruption. To achieve high levels of performance, hard disk controllers are often designed to take advantage of caching techniques, which involve temporarily writing information to memory and then writing the data out to the actual disk later. In the event of a power loss, information in the cache is lost, potentially corrupting files or data.

One does not have to search extensively in business and government publications to see that, despite technological advances, data corruption due to power loss is still a widely recognized problem in the IT industry. This is emphasized in the industry quotes below:
    • “Even a moment’s disruption can have devastating effects on power sensitive customers such as internet service providers, data centers, wireless telecommunication networks, on-line traders, computer chip manufacturers and medical research centers. For these customers, power disruptions can result in data corruption, burned circuit boards, component damage, file corruption and lost customers.” - U.S. Dept. of Energy Office of Power Technologies, Electrical Power Interruption Cost Estimates for Individual Industries, Sectors, and U.S. Economy, February 2002
    • "Failure to boot after a power failure is generally caused by corrupted files or a damaged hard disk - neither of which last known good configuration is capable of repairing."
      - MCSE Microsoft® Windows® XP Professional Readiness Review Exam 70-270, Section 70-270.04.03.002, 11/28/2001
    • “Total failures, or blackouts, constitute a complete loss of electrical power to the networking or computing equipment…these failures can cause system and network crashes, PC lockups, and corruption or loss of valuable data from servers and workstations.” - Contingency Planning Management Magazine, Power Protection Basics, March 2002
    • "The system and its data can become corrupt as a result of a power failure....a UPS can protect the system if power is lost. A UPS usually provides ...temporary power which may be enough to permit a graceful shutdown." - National Institute of Standards and Technology, Special Publication 800-34, Contingency Planning Guide for Information Technology Systems, June 2002

APC by Schneider Electric White Paper 10 Rev 3
  • 3. Recommended configurations for UPS software

Configuration 1: Protecting a single computer with a single UPS
In this configuration, each computer is backed up by its own UPS, and the UPS communicates with the computer over a serial or USB cable. UPS software is installed on the computer to provide graceful, unattended shutdown in the event of an extended power outage. In this case the UPS is managed locally by the connected computer. This is the simplest configuration and is widespread for both server and workstation deployments.

Configuration 2: Protecting two to three computers with a single UPS
In this configuration, several computers are plugged into a larger UPS (typically one rated at 1500 VA or higher). One computer is connected directly to the serial port on the UPS, while the other two are connected to an expansion card installed in the UPS that provides two additional serial ports. In this situation, all three computers have graceful shutdown capability, but management of the UPS is handled by the computer connected directly to the UPS. Note that since the USB standard addresses communication with a single system only, USB connections cannot be used in this configuration. Although this scheme can be extended to handle up to 24 computers (via daisy-chaining), APC by Schneider Electric does not recommend such an approach due to the additional cabling required.
Figure 1: Protecting a single computer with a single UPS (diagram labels: server running UPS software, management console, UPS; legend: power, serial or USB communications)
  • 4. Figure 2: Protecting two to three computers with a single UPS (diagram labels: servers running UPS software, interface expander, management console, UPS with built-in expansion slot; legend: power, serial or USB communications)

Configuration 3: Protecting three or more computers with a single UPS
An increasingly popular approach involves managing the UPS directly over an Ethernet network. A network management card (with a real-time operating system and hardware watchdog chip) installed in the UPS eliminates the requirement for server-based management. One example is the InfraStruXure architecture from APC. Software installed on the computers in this configuration needs only to provide shutdown functionality, since management capabilities are embedded in the UPS itself.

Figure 3: Protecting three or more computers with a single UPS (diagram labels: servers running UPS software, network management card, management console, UPS with built-in expansion slot; legend: power, network)
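
The shutdown-only role of the software in Configuration 3 can be illustrated with a small decision function. This is a minimal sketch, not APC's implementation: the `UpsStatus` fields, the `should_shut_down` function, and the 60-second safety margin are hypothetical names and values chosen for illustration, and a real client would obtain the status from the UPS or its network management card.

```python
from dataclasses import dataclass


@dataclass
class UpsStatus:
    """Hypothetical snapshot of what a UPS reports to a shutdown client."""
    on_battery: bool          # True while utility power is lost
    runtime_remaining_s: int  # estimated battery runtime left, in seconds


def should_shut_down(status: UpsStatus, shutdown_time_s: int,
                     margin_s: int = 60) -> bool:
    """Return True when a graceful shutdown must start now.

    The shutdown has to finish before the battery is exhausted, so it is
    triggered once the remaining runtime no longer covers the time the
    operating system needs to shut down plus an assumed safety margin.
    """
    if not status.on_battery:
        return False  # utility power is present; nothing to do
    return status.runtime_remaining_s <= shutdown_time_s + margin_s


# A server that needs 5 minutes to shut down, checked twice during an outage
print(should_shut_down(UpsStatus(True, 1200), shutdown_time_s=300))  # False
print(should_shut_down(UpsStatus(True, 330), shutdown_time_s=300))   # True
```

Keeping the decision separate from the transport (serial, USB, or network) reflects the point made above: the client only has to decide when to shut down, while management of the UPS lives elsewhere.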
  • 5. Different types of operating system shutdown
Modern operating systems such as Microsoft Windows® increasingly include more advanced approaches to power management, including new methods of shutting down. Although these advances have largely been driven by laptop user requirements, selecting the right one for use with UPS software can decrease time-to-recovery after an extended power outage.

Shutdown
This is the traditional method, in which the computer's operating system receives a shutdown command from the UPS shutdown software and goes through a sequence of killing active processes before exiting. On a Windows® system, for instance, this would bring the computer to the state where the message “You may now turn off your computer” appears.

Shutdown and “off”
This is similar to the method above, but at the end of the process the operating system actually commands the computer to turn off, and it goes into a state where it no longer draws power. This can be a useful approach for Configuration 2 above: one computer can be shut down and turned off to lengthen the runtime of the remaining computers (this approach is sometimes referred to as “load shedding”). Shutdown and “off” capability sometimes requires a BIOS setting change to enable the “off” portion to occur.

Hibernation
A hibernation process (for instance, as found in Microsoft’s latest Windows® operating systems) is similar to the methods above, but some highly valuable additional steps are taken:
1. First, the computer’s desktop state, including all open files and documents, is saved. It does this by saving all of RAM to a large file on the hard disk.
2. Then the system is shut down and powered off.
3. When power returns and the operating system boots up, the RAM is reloaded from the hard disk.
4. The desktop and all open files and applications are then presented as they appeared before the hibernation occurred.
Hibernation has a major advantage over the other methods: it preserves both work in progress and the state of the machine before the shutdown occurred. For these reasons, APC strongly recommends that customers consider selecting this method of shutdown for their UPS software.

Standby
When a computer goes into “standby” mode, it is not turned completely off, but is placed into a low power state where certain components (monitor, I/O chips, etc.) are powered down. DRAM continues to be refreshed, and when the computer is taken out of “standby” mode, it typically reverts to the previous state very quickly. If you select a standby setting for your computer, it is important to make sure that the UPS you select can “wake” the system in the event of an extended power outage so a graceful shutdown can be initiated. Otherwise the system may stay in the standby state until the UPS is completely drained, and then power to the system will be dropped (a “hard” shutdown).
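
To make the difference between the methods concrete, the sketch below maps two of them onto the built-in Windows `shutdown` command. It is an illustration only and not how any particular UPS software works: the method names and the `build_command` and `run_shutdown` helpers are hypothetical, it assumes a Windows version whose `shutdown` command supports `/h`, and it assumes hibernation is enabled on the machine. Standby is left out because, as far as I know, the `shutdown` command has no switch for it.

```python
import subprocess
import sys

# Argument lists for the built-in Windows `shutdown` command:
#   /s    shut down the computer
#   /t 0  wait zero seconds before starting
#   /h    hibernate the computer
SHUTDOWN_COMMANDS = {
    "shutdown_and_off": ["shutdown", "/s", "/t", "0"],
    "hibernate": ["shutdown", "/h"],
}


def build_command(method: str) -> list[str]:
    """Return the command line for a shutdown method (hypothetical names)."""
    try:
        return list(SHUTDOWN_COMMANDS[method])
    except KeyError:
        raise ValueError(f"unsupported shutdown method: {method!r}") from None


def run_shutdown(method: str, dry_run: bool = True) -> list[str]:
    """Build the command and, unless dry_run is set, execute it on Windows."""
    command = build_command(method)
    if not dry_run and sys.platform == "win32":
        subprocess.run(command, check=True)
    return command


# Dry run: only shows what would be executed
print(run_shutdown("hibernate"))  # ['shutdown', '/h']
```

The dry-run default is deliberate, so that the example can be tried without powering off the machine it runs on.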
  • 6. Best practices

Purchase a UPS with extended runtime capability and a generator if required
The amount of standardized data on AC power reliability is limited. However, two significant surveys of AC power reliability in the USA have been done, one by AT&T Bell Labs and one by IBM. In addition, American Power Conversion has experience from approximately 8 million installed UPS systems, many of which are capable of logging power problems. In the USA, the survey data agrees with the experience of APC and shows the following essential points. The average number of outages sufficient to cause IT system malfunction per year at a typical site is approximately 15:
    • 90% of the outages are less than 5 minutes in duration (conversely, 10% are greater than 5 minutes)
    • 99% of the outages are less than 1 hour in duration (conversely, 1% are greater than 1 hour)
    • Total cumulative outage duration is approximately 100 minutes per year

This information is highly variable from site to site, and in some geographic locations in the USA, such as Florida, the outage rate is an order of magnitude higher. Building-specific problems can also raise the outage rate by as much as 3 orders of magnitude. This data is also believed to be representative of Japan and Western Europe.

Since 10% of outages are greater than 5 minutes and 1% are greater than one hour, purchasing a UPS with extended runtime capability merits serious consideration when the cost of downtime is high. In cases where hours of runtime are required, a generator is recommended. However, a UPS with a few minutes of runtime is still required to maintain the load until the generator comes online.
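
The outage figures above translate directly into how often a given UPS runtime would be exceeded at a typical site. The following is a small worked calculation using only the numbers quoted above; the function and constant names are made up for this example, and only the two published thresholds (5 and 60 minutes) are covered.

```python
OUTAGES_PER_YEAR = 15            # outages that disturb IT equipment at a typical site
TOTAL_OUTAGE_MIN_PER_YEAR = 100  # cumulative outage duration per year

# Share of outages lasting longer than a given duration: minutes -> percent
PERCENT_LONGER_THAN = {5: 10, 60: 1}


def outages_exceeding(runtime_min: int) -> float:
    """Expected outages per year that outlast a UPS with this runtime.

    Only the published thresholds are supported; other values raise KeyError.
    """
    return OUTAGES_PER_YEAR * PERCENT_LONGER_THAN[runtime_min] / 100


for runtime in (5, 60):
    per_year = outages_exceeding(runtime)
    print(f"{runtime:>2} min runtime: {per_year:.2f} outages/year exceed it, "
          f"about one every {1 / per_year:.1f} years")

mean_min = TOTAL_OUTAGE_MIN_PER_YEAR / OUTAGES_PER_YEAR
print(f"mean outage length: {mean_min:.1f} min")
```

At the typical site described, a UPS with 5 minutes of runtime would be outlasted about 1.5 times per year, and one with an hour of runtime about 0.15 times per year, or roughly once every six to seven years. Sites with higher outage rates would scale these figures accordingly.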
For more information on when a generator is needed, see APC White Paper 52, Four Steps to Determine when a Standby Generator is Needed for Small Data Centers and Network Rooms.

Protect the network equipment with UPSs
Applications are only as available as the network through which they are accessed. Power protection for hubs, routers, and switches is an essential but sometimes overlooked ingredient in ensuring availability of applications. Additionally, if computers are running UPS shutdown software as in Configuration 3 above, the software requires the network to be up during the power outage in order to communicate properly. If the network is unprotected, graceful shutdown of the computers will not be accomplished.

Accommodate each server’s individual time requirement for shutdown
The time required to properly shut down the operating system varies from system to system. Some email servers with many accounts, for instance, have been known to take upwards of 20 minutes to shut down. Make sure the UPS software’s settings take each of your computers’ specific requirements into account and are set properly.
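
One way to check such settings is to work out, for each server, how long after the start of an outage its shutdown can begin and still finish on battery. This is a rough sketch under stated assumptions: the 30-minute UPS runtime, the 2-minute safety margin, and the server names and shutdown times are invented for illustration (only the 20-minute email server figure echoes the text above), and `latest_start_min` is a hypothetical helper.

```python
UPS_RUNTIME_MIN = 30   # assumed battery runtime at the current load
SAFETY_MARGIN_MIN = 2  # assumed buffer for runtime estimation error

# Hypothetical servers and the time each needs for a graceful shutdown
SHUTDOWN_TIME_MIN = {"file-server": 3, "web-server": 5, "email-server": 20}


def latest_start_min(shutdown_time_min: int,
                     runtime_min: int = UPS_RUNTIME_MIN,
                     margin_min: int = SAFETY_MARGIN_MIN) -> int:
    """Minutes after the outage begins by which the shutdown must start.

    A negative result means the shutdown cannot complete on battery.
    """
    return runtime_min - margin_min - shutdown_time_min


for name, needed in SHUTDOWN_TIME_MIN.items():
    start = latest_start_min(needed)
    if start < 0:
        print(f"{name}: needs {needed} min, cannot finish on battery")
    else:
        print(f"{name}: needs {needed} min, start within {start} min of the outage")
```

With these assumed values the email server has to begin shutting down within 8 minutes of the outage, while the file server can wait up to 25 minutes, which is why a single shutdown delay applied to every machine is unlikely to suit all of them.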
  • 7. Conclusion
Without shutdown software installed on the protected computer, the net effect of the UPS is simply to delay the inevitable. Regardless of which configuration, which best practices, and which particular UPS software are used, APC highly recommends that customers not overlook this requirement. The small effort involved in installing and configuring such software can be well worth it in the event of an extended power outage that exceeds the runtime of the UPS.

About the author
Victor Avelar is a Senior Research Analyst at APC by Schneider Electric. He is responsible for data center design and operations research, and consults with clients on risk assessment and design practices to optimize the availability and efficiency of their data center environments. Victor holds a Bachelor’s degree in Mechanical Engineering from Rensselaer Polytechnic Institute and an MBA from Babson College. He is a member of AFCOM and the American Society for Quality.
  • 8. Resources
Four Steps to Determine when a Standby Generator is Needed for Small Data Centers and Network Rooms, APC White Paper 52
APC White Paper Library: whitepapers.apc.com
APC TradeOff Tools™: tools.apc.com

References
1. Allen and Segall, Monitoring of Computer Installations for Power Line Disturbances, IBM, IEEE PES Winter conference, 1974. A study conducted from 1969 to 1970 using 38 monitor-months of data.
2. Goldstein and Speranza, The Quality of US Commercial AC Power, AT&T Bell Labs, INTELEC conference, 1982. A study conducted from 1977 to 1979 at 24 sites around the US.
3. Martzloff, Power Quality Site Surveys: Facts, Fiction, and Fallacies, IEEE Transactions on Industry Applications, Vol 24, No 6.

Contact us
For feedback and comments about the content of this white paper: Data Center Science Center, APC by Schneider Electric, DCSC@Schneider-Electric.com
If you are a customer and have questions specific to your data center project: contact your APC by Schneider Electric representative