October 2012
MANAGEMENT BRIEF
IBM i for Enterprise Businesses
Quantifying the Value of Resilience
International Technology Group
609 Pacific Avenue, Suite 102
Santa Cruz, California 95060-4406
Telephone: + 831-427-9260
Email: Contact@ITGforInfo.com
Website: ITGforInfo.com
3. International Technology Group i
TABLE OF CONTENTS
EXECUTIVE SUMMARY
Value Proposition
Costs of Downtime
Severe Unplanned Outages
Security and Malware Protection
Conclusions
RISK TRENDS
Overview
Supply Chain Disruption
Retail Vulnerabilities
Financial Services
Risk Sensitivities
Availability and Recovery
Security and Malware
Data Breaches
PLATFORM DIFFERENTIATORS
Overview
IBM i
Principal Characteristics
High-end Storage Support
Power Systems
Overview
Virtualization
PowerVM and x86 Virtualization
Availability Optimization
Power Systems
Software Solutions
DETAILED DATA
Company Profiles
Costs of Downtime
Calculation Process
Supply Chain Companies
Financial Services Companies
Severe Unplanned Outages
List of Figures
1. Average Costs of Downtime per Hour – Supply Chain Companies
2. Average Costs of Downtime per Hour – Financial Services Companies
3. Three-year Costs of Downtime by Platform – Supply Chain Companies
4. Three-year Costs of Downtime by Platform – Financial Services Companies
5. Three-year Risk Exposure to Severe Unplanned Outages – Averages for Supply Chain Companies
6. Three-year Risk Exposure to Severe Unplanned Outages – Averages for Financial Services Companies
7. Comparative Vulnerability Data – January 2008 Through June 2012
8. Comparative Vulnerability Data – Lifetime Totals
9. Basic Manufacturing Supply Chain Processes – SCOR Model
10. Potential Costs of Outages – Manufacturing Companies
11. Data Breach Costs – U.S. Examples
12. IBM i Single-level Storage Structure
13. IBM i and Power Systems Autonomic Functions
14. IBM i and Power Systems Architecture
15. System Environment Layers – Example
16. Key Power Systems Availability Optimization Technologies
17. Company Profiles
18. Average Costs of Outages per Hour Detail
EXECUTIVE SUMMARY
Value Proposition
The IBM i operating environment has a longstanding track record of maintaining extremely high levels of
availability, security and disaster recovery. Users routinely describe it as “highly stable...extremely
robust…completely dependable…rock-solid” and comparable terms.
This has been the experience not only of midsize businesses, but also of large organizations requiring
enterprise-class capabilities. Among IBM i users are some of the world’s largest corporations, including
members of the Fortune 100 and FTSE 100.
Among this group, IBM i typically supports high-volume business-critical systems. Examples include
enterprise resource planning (ERP) systems, along with supply chain management, core banking and
retail, e-commerce and equivalents in a wide range of industries. IBM i offers levels of availability,
security and recoverability that are – by wide margins – greater than any competitive platform.
What is the value of these strengths? Few would dispute that disruption of core enterprise systems can
affect the bottom line. Many organizations, however, do not factor costs of downtime into their platform
selection processes. This may be a serious mistake. Business damage due to planned as well as unplanned
outages may vary significantly between platforms.
This report presents two sets of three-year cost comparisons for use of IBM i, Microsoft Windows Server
Failover Clusters (WSFC), and Oracle Exadata Database Machine to support core enterprise systems in
six companies. Comparisons are presented for companies operating supply chains, and for financial
services companies with revenues of between $1 billion and $10 billion.
Results may be summarized as follows:
• Costs of downtime – i.e., business costs due to outages – averaged 90 percent less for use of IBM
i than for Windows server clusters, and 71 percent less than for Oracle Exadata. This calculation
is for planned outages and unplanned outages of less than three hours duration.
Lower IBM i costs of downtime translated into three-year business savings of $2.8 million to
$35.3 million compared to use of clustered Windows servers, and $700,000 to $8.6 million
compared to use of Oracle Exadata.
• Risk exposure to severe unplanned outages of 6 to 24 hours duration is also significantly lower
for use of IBM i. These calculations, which employ a standard probability/impact methodology,
indicate that risks of severe business damage for use of IBM i average 93 percent less than for use of
clustered Windows servers and 73 percent less than for use of Oracle Exadata.
These variances translated into $257,000 to $7.43 million in higher risk exposure for use of
clustered Windows servers and $56,000 to $1.69 million for use of Oracle Exadata.
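The probability/impact approach mentioned above can be illustrated with a minimal sketch. It is not the model used for this report: the scenario probabilities and impact values below are hypothetical placeholders, and treating three-year exposure as three times the expected annual loss is a simplifying assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class OutageScenario:
    """One severe-outage scenario (all values here are hypothetical)."""
    label: str
    annual_probability: float  # chance of occurrence in a given year
    impact_usd: float          # business damage if the outage occurs


def risk_exposure(scenarios, years=3):
    """Expected loss: sum of probability x impact, scaled to the period."""
    expected_annual_loss = sum(
        s.annual_probability * s.impact_usd for s in scenarios
    )
    return years * expected_annual_loss


# Hypothetical inputs, not taken from the report.
scenarios = [
    OutageScenario("6-hour outage", annual_probability=0.02, impact_usd=5_000_000),
    OutageScenario("24-hour outage", annual_probability=0.005, impact_usd=40_000_000),
]

exposure = risk_exposure(scenarios)
print(f"Three-year risk exposure: ${exposure:,.0f}")
```

With these placeholder inputs, a platform that halves the probability of each scenario would halve the resulting exposure, which is the sense in which platform choice affects the figures quoted above.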
Comparisons are based on use of IBM i 7.1 with IBM PowerHA SystemMirror for i high availability
clusters on latest-generation Power Systems; Windows Server 2008 R2, SQL Server 2008 R2 and WSFC
on latest-generation Intel E5- and E7-based platforms; and current Oracle Exadata models with Oracle
11g Database including Real Application Clusters (RAC).
Lower costs of downtime and risk exposure for use of IBM i are due to fundamental differences in
architecture and technology.
IBM i is designed specifically to run business-critical systems. High levels of availability reflect features
built into the IBM i kernel, and embedded into Power Systems hardware and firmware. IBM i is the most
tightly integrated and automated operating environment in existence. The potential for system, operator or
administrator errors is minimal.
The strengths of IBM i in security and malware protection reflect the system’s distinctive object-based
architecture. Objects are encapsulated in a manner that places strict controls on data as well as system
code, making it extremely difficult for unauthorized instructions to execute. Security violations are rare,
and malware incidents are virtually unknown. There are no known native IBM i viruses.
Disaster recovery capabilities are built into the IBM i kernel and tightly integrated with IBM PowerHA
and third-party failover and recovery solutions. These have supported high-volume business-critical
systems for decades.
IBM i and Power Systems routinely handle enterprise-class workloads requiring high levels of scalability
and performance – many users employ IBM Power 770, 780 and 795 models with up to 64, 96 and 256
POWER7 cores respectively – and offer highly granular, real-time virtualization even in demanding
production environments.
In terms of technological currency, IBM i implements the full function SQL-compliant DB2 relational
database, Internet standards and interfaces to tablets and smartphones. It also supports a wide range of
development languages, including C/C++, COBOL, RPG, Java, PHP, XML and others.
IBM i users have been able to take full advantage of Internet and, more recently, mobile technologies to
employ popular “open” development tools and to exploit growing pools of developer skills.
A further point should be noted. The virtualization strengths of IBM i and PowerVM provide a strong
base for realization of public as well as private clouds. IBM i and PowerVM are central platforms in IBM
cloud strategy, and will be fully supported in the future evolution of IBM cloud solutions and services.
In most organizations – including those upon which comparisons presented in this report are based – IBM
i systems coexist with a variety of UNIX and x86 servers. IBM i may not be appropriate for all
applications. But for core systems that “run the business,” its distinctive strengths are unmatched.
Costs of Downtime
The comparisons presented in this report are based on detailed financial and operational data supplied by
60 companies employing IBM i, WSFC or Oracle Exadata to run core enterprise systems.
Based on this input, six composite company profiles were created. These included companies operating
supply chains (an auto parts manufacturer, a retail chain and an industrial distributor) as well as financial
services companies (a diversified retail bank, a property and casualty insurer and a services company).
Average costs of downtime per hour were first calculated for these companies, and then multiplied by
numbers of hours of downtime for each of the three platform options. The focus was placed on underlying
hardware and software platform outages, rather than application-level downtime.
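The calculation described above can be sketched as follows. This is an illustration rather than the report's own worksheet. The hourly cost and three-year totals are the auto parts manufacturer values from Figures 1 and 3; the assignment of the three totals to platforms (ascending order) is an assumption, and the function names are hypothetical.

```python
def downtime_cost(cost_per_hour, hours_down):
    """Cost of downtime: average hourly cost multiplied by hours of outage."""
    return cost_per_hour * hours_down


def implied_hours(total_cost, cost_per_hour):
    """Invert the calculation to recover the hours of downtime it implies."""
    return total_cost / cost_per_hour


# Auto parts manufacturer: average cost of downtime per hour (Figure 1).
COST_PER_HOUR = 1_213_710

# Three-year costs of downtime (Figure 3). The platform mapping is an
# assumption: the three reported values are assigned in ascending order.
three_year_costs = {
    "IBM i/Power": 3_520_000,
    "Oracle Exadata": 12_140_000,
    "Microsoft WSFC": 38_840_000,
}

hours_by_platform = {
    platform: round(implied_hours(cost, COST_PER_HOUR), 1)
    for platform, cost in three_year_costs.items()
}

for platform, hours in hours_by_platform.items():
    print(f"{platform}: about {hours} hours of downtime over three years")
```

Read this way, differences in three-year cost between platforms come entirely from differences in hours of downtime, since the hourly cost is a property of the company rather than of the platform.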
For supply chain companies, average costs of downtime ranged from $549,000 to $1.21 million per
hour. Figure 1 summarizes these results.
Figure 1: Average Costs of Downtime per Hour – Supply Chain Companies
Costs of downtime allow for cascading effects. In tightly integrated supply chains characterized by lean
operating models (i.e., there are few or no inventory buffers) and real-time operations, evidence has shown
that the effects of disruption at any point may cascade across the entire supply chain.
This significantly changes the costs of downtime equation. In the past, for example, companies often
calculated that, if annual sales were $5 billion, the cost of an hour of downtime was $5 billion divided by
8,760 hours per year = $570,000. In practice, however, the cost may be four or five times higher, and the
effects may continue to be felt for days or even weeks.
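The arithmetic in this example can be reproduced in a few lines. The sketch below uses the $5 billion sales figure and the four-to-five-times multiplier from the text; the function name is hypothetical, and the multiplier is applied as a simple scalar for illustration only.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours


def naive_hourly_cost(annual_sales):
    """Traditional estimate: annual sales spread evenly over every hour."""
    return annual_sales / HOURS_PER_YEAR


annual_sales = 5_000_000_000
naive = naive_hourly_cost(annual_sales)

# Cascading effects: the text suggests actual cost may be 4 to 5 times higher.
cascade_low, cascade_high = 4 * naive, 5 * naive

print(f"Naive cost per hour:   ${naive:,.0f}")
print(f"With cascading (4-5x): ${cascade_low:,.0f} to ${cascade_high:,.0f}")
```

The naive figure works out to roughly $570,000 per hour, while the cascading range is on the order of $2.3 million to $2.9 million per hour.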
For supply chain companies, allowance was also made for lost sales, increases in operational costs,
remedial costs such as late delivery and imperfect order penalties, and related effects. Selling, general and
administrative (SG&A) costs for the retail chain are due to disruption of store operations.
Costs of downtime may be substantial in other industries where cascading occurs. These include
transportation, where schedule disruption may have major bottom-line impacts; third-party logistics
services; engineering and construction; energy companies and public sector organizations.
For financial services companies, costs of downtime varied by type of business. For the bank, costs of
downtime include lost or delayed transaction fees, lost interest income and lost productivity for branch
and call center staff.
For the insurer and services company, costs include lost policy income (the company transacted a large
volume of business over the Internet, and was thus exceptionally vulnerable to effects of outages) and lost
services income respectively. Other items include lost interest income and lost productivity for customer-
facing staff.
Costs of downtime for all three companies allow for the effects of customer attrition, calculated using
appropriate customer lifetime value (CLV) metrics, and lost customer acquisition costs.
Average costs of downtime ranged from $128,000 to $259,000 per hour. Figure 2 summarizes these
results.
Figure 1 data ($ thousands per hour), supply chain companies:
• Retail Chain: 685.03 (lost sales, supply chain disruption, SG&A costs)
• Auto Parts Manufacturer: 1,213.71 (outbound supply chain disruption, inbound supply chain & production disruption, customer penalties & remedial costs)
• Industrial Distributor: 549.16 (lost sales, supply chain disruption)
Figure 2: Average Costs of Downtime per Hour – Financial Services Companies
Based on these values, three-year costs of downtime for the three platform options were as shown in
figures 3 and 4.
Figure 3: Three-year Costs of Downtime by Platform – Supply Chain Companies
Comparatively high WSFC costs of downtime are notable in that “five nines” (99.999 percent)
availability is commonly claimed for this platform. There are several reasons for this disparity.
One is that such claims commonly refer to low-volume environments and/or applications whose
characteristics are significantly different to those of enterprise business systems. The technical challenges
of maintaining high levels of availability for, say, email or collaborative networks are not the same as
those for high-volume transactional and mixed workloads.
Figure 2 data ($ thousands per hour), financial services companies:
• Insurance Company: 150.36 (lost income, other costs)
• Bank: 259.45 (customer attrition, lost fee income, other costs)
• Services Company: 127.86 (lost income, other costs)
Figure 3 data ($ millions, three-year costs of downtime):
Company                  IBM i/Power  Oracle Exadata  Microsoft WSFC
Industrial Distributor   1.10         4.39            9.88
Retail Chain             2.77         8.56            20.55
Auto Parts Manufacturer  3.52         12.14           38.84
Figure 4: Three-year Costs of Downtime by Platform – Financial Services Companies
A second reason is that claims typically refer to avoidance of unplanned outages rather than overall
downtime. Windows cluster environments tend to be highly complex, and in practice require extensive
software maintenance. Complexity also increases risks that unplanned outages will occur – there are more
potential points of failure.
WSFC deployments supporting enterprise-class systems are typically customized by professional services
firms. Modifications, as well as testing of these, tend to be more difficult and time-consuming than might
be the case in a less complex Microsoft environment.
Higher levels of availability for Oracle Exadata reflect more resilient hardware and use of the company’s
RAC cluster technology. Planned outages, however, tend to be longer and more frequent than for IBM i.
Oracle Exadata systems have been variously deployed for business intelligence (BI), transactional
applications and consolidation of Oracle database servers. The underlying architecture is, however,
primarily optimized for high-performance analytics.
Severe Unplanned Outages
There is a great deal of evidence that, when severe unplanned outages occur, the bottom-line impact
increases in a manner that is as much exponential as arithmetic. A 24-hour outage, for example, may not
have four times the impact of a 6-hour outage. It may have 20 times more.
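One way to picture this kind of non-linear growth is a power-law curve fitted to the example just given. This is purely an illustrative assumption: the report does not specify a functional form, and the function names below are hypothetical.

```python
import math


def severity_exponent(duration_ratio, impact_ratio):
    """Exponent k such that impact scales as duration ** k."""
    return math.log(impact_ratio) / math.log(duration_ratio)


def scaled_impact(base_impact, base_hours, hours, k):
    """Impact of an outage of `hours`, relative to a known base outage."""
    return base_impact * (hours / base_hours) ** k


# Example from the text: a 24-hour outage has 20 times the impact
# of a 6-hour outage, rather than 4 times.
k = severity_exponent(24 / 6, 20)
print(f"Implied exponent k: {k:.2f}")

# Relative impact at intermediate durations, with the 6-hour outage = 1.0.
for hours in (6, 12, 18, 24):
    print(f"{hours:>2} hours: {scaled_impact(1.0, 6, hours, k):.1f}x")
```

An exponent of 1 would correspond to purely arithmetic growth; under this assumed curve the exponent is a little over 2, so doubling the length of an outage more than quadruples its impact.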
Experience with major supply chain failures has shown that effects may extend beyond operational costs
and lost sales to include reputational damage, impaired corporate financial performance, share price
declines, reduced investor confidence and other negative effects.
Financial services companies are equally if not more vulnerable. Customer attrition and remedial costs are
likely to be substantial, reputational damage may be immediate and massive, and regulatory penalties and
legal costs may be incurred.
Figure 4 data ($ millions, three-year costs of downtime):
Company            IBM i/Power  Oracle Exadata  Microsoft WSFC
Services Company   0.32         1.02            3.07
Insurance Company  0.45         1.50            4.81
Bank               0.78         2.98            9.34
The example of the recent (June 2012) core banking system outage affecting the UK’s Royal Bank of
Scotland (RBS) is instructive. The outage left 17 million of the company’s 23 million customers unable to
access account information, withdraw or transfer funds, or process payments for up to six days. Media
coverage was massive and predominantly negative.
In August 2012, RBS projected more than $200 million in remedial costs for customer reimbursements,
overdraft extensions and related actions. The company was also obliged to extend hours at more than
1,200 out of 2,500 branches and double call center staff in order to handle customer queries. The extent of
customer attrition is still unclear.
Like costs of downtime, risk exposure is materially affected by platform choices. In supply chain
companies, three-year exposure for use of IBM i on Power Systems averaged 93 percent less than for
WSFC, and 73 percent less than for use of Oracle Exadata.
Disparities for financial services companies were generally similar. Comparable averages were 95 percent
and 76 percent less respectively. Figures 5 and 6 illustrate these results.
Figure 5: Three-year Risk Exposure to Severe Unplanned Outages
– Averages for Supply Chain Companies
Figure 6: Three-year Risk Exposure to Severe Unplanned Outages
– Averages for Financial Services Companies
Figure 5 data ($ thousands): IBM i/Power 357.63; Oracle Exadata 1,319.79; Microsoft WSFC 4,858.51
Figure 6 data ($ thousands): IBM i/Power 34.25; Oracle Exadata 140.76; Microsoft WSFC 604.54
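The percentage differences quoted for supply chain companies can be checked against the Figure 5 averages. The sketch below assumes that the three Figure 5 values correspond to IBM i/Power, Oracle Exadata and Microsoft WSFC in that order; the function name is hypothetical.

```python
def percent_less(value, baseline):
    """How much lower `value` is than `baseline`, as a percentage."""
    return (1 - value / baseline) * 100


# Figure 5: three-year risk exposure, supply chain averages ($ thousands).
# The platform mapping is an assumption based on the order of the values.
IBM_I = 357.63
ORACLE_EXADATA = 1_319.79
MICROSOFT_WSFC = 4_858.51

vs_wsfc = percent_less(IBM_I, MICROSOFT_WSFC)
vs_exadata = percent_less(IBM_I, ORACLE_EXADATA)

print(f"IBM i vs. WSFC:    {vs_wsfc:.1f} percent less")
print(f"IBM i vs. Exadata: {vs_exadata:.1f} percent less")
```

Rounded to whole numbers, these reproduce the 93 percent and 73 percent figures cited for supply chain companies.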
Security and Malware Protection
A further area of risk exposure should be highlighted. Hacking and infection by malware (malicious code)
remain ubiquitous threats for all large organizations.
Companies that experience customer data breaches may incur fines and other regulatory penalties, along
with costs of remedial actions such as notifications, monitoring for identity theft, query handling, and
investigation and resolution of security flaws. In the event of a publicized breach, customer attrition and
reputational damage may also be substantial.
Even if customer data is not compromised, other types of sensitive information may be compromised, and
damage to systems and software may occur.
In security and malware protection, differences between IBM i and competitive platforms are not merely
significant – they are dramatic. These differences are reflected in data compiled by Secunia, one of the
industry’s leading authorities on security and malware exposure.
Figure 7 shows numbers of advisory notices issued by the company between January 2008 and June 2012
inclusive for the most recent versions of IBM i, the two principal Linux distributions – Red Hat
Enterprise Linux (RHEL) and SUSE Linux Enterprise Server (SLES) – and for Windows Server 2008.
Severity              Windows Server 2008  RHEL Server 5  RHEL Server 6  SLES 10  SLES 11  IBM i 7.1  i5/OS 6.x
Extremely critical    3                    1              0              0        0        0          0
Highly critical       64                   93             61             134      88       0          0
Moderately critical   34                   185            84             79       53       0          6
Less critical         73                   175            85             60       66       0          5
Not critical          5                    53             31             18       14       0          0
TOTAL ADVISORIES      179                  507            261            291      221      0          11
Source: Secunia
Figure 7: Comparative Vulnerability Data – January 2008 Through June 2012
Figure 8 shows lifetime vulnerabilities; i.e., the number of vulnerabilities recorded by the company since
each version was introduced. Multiple vulnerabilities may be documented in a single advisory notice.
Platform             Release Date   Lifetime Vulnerabilities
Windows Server 2008  February 2008  352
RHEL Server 5        March 2007     1,871
RHEL Server 6        November 2010  906
SLES 10              July 2006      3,557
SLES 11              March 2009     1,889
IBM i 7.1            April 2010     0
i5/OS 6.x            January 2008   16
Source: Secunia
Figure 8: Comparative Vulnerability Data – Lifetime Totals
(Oracle Enterprise Linux, the most commonly employed Exadata operating system, is not tracked
separately. It is based on RHEL. Windows Server 2012 became generally available in September 2012.)
Disparities are confirmed by other sources. During 2011, for example, the National Vulnerability Database,
maintained by the U.S. National Institute of Standards and Technology (NIST), recorded 197 medium and
high security vulnerabilities for Linux, and 130 for Windows Server. None were recorded for IBM i.
The significance of IBM i security strengths is reinforced by two factors. One is that, as most security
authorities recognize, firewall-based perimeter defenses are no longer enough. Penetration of these has
become increasingly common, and they do not prevent escalating threats of insider abuse. Higher levels
of protection are required for core business databases.
The second is that, since the onset of recession, businesses have become reluctant to increase spending on
IT security, and many have reduced it. Threats, however, have continued to increase. Organizations have
been faced with a choice between greater expenditure or greater risk. IBM i enables them to avoid this
choice. Better security may be maintained at a lower cost.
Conclusions
No matter how one rates the value of IBM i’s distinctive strengths, that value is increasing over time.
Industry trends are magnifying the effects of downtime. Among companies operating supply chains,
economic conditions have accelerated inventory drawdowns, while moves to lower-cost offshore sites
increase supply chain complexity and geographic dispersion. Adoption of real-time business analytics and
technologies such as RFID will further reduce cycle times.
For financial services companies, mergers and acquisitions mean that core systems outages affect larger
numbers of customers. Growth of online and – increasingly – mobile services has introduced new points
of vulnerability. Competitive pressures have increased risks of customer attrition. In these and other
industries, use of social media has increased security and malware exposure.
IBM i has been employed, in some cases for more than 20 years, by large users worldwide. It was
designed to offer a simple, reliable, secure and easy-to-administer platform to support core business
systems.
In an era when the IT world has veered toward ever-greater complexity, IBM i has retained these
characteristics. More than any other server environment available today, it is designed to minimize the
complexities with which organizations must deal.
Over the last few years the IT industry has, ironically, rediscovered the advantages of complexity
reduction. The principal value propositions for cloud computing – faster deployment and provisioning,
more effective use of virtualization to enable consolidation and reduced administrative overhead – have
been enjoyed by IBM i users for decades.
Others may enjoy the same benefits.
RISK TRENDS
Overview
Key industry trends mean that the significance of IBM i strengths in availability and disaster recovery,
and in security and malware resistance, is increasing over time. These trends are discussed in this section.
The following section, Platform Differentiators, deals with differences in architecture and technology
between IBM i and Power Systems, and competitive hardware and software platforms. The last section,
Detailed Data, provides additional information on the methodology employed for calculations. Detailed
cost breakdowns are also provided.
Supply Chain Disruption
Decades of experience have shown that, in industries operating supply chains, downtime costs money.
Risks of supply chain disruption have, however, been the subject of greatly increased attention since the
mid-2000s. This shift has been driven by a number of trends, including the following:
1. Integration. ERP systems have progressively expanded to integrate a broader range of
transactional processes, as well as new analytical and collaborative functions.
ERP environments now commonly include customer relationship management (CRM), e-
commerce, supply chain management (SCM), product data management (PDM) and product
lifecycle management (PLM), supplier relationship management (SRM), BI and a wide range of
other applications.
Businesses, however, have found that the benefits of broader functionality and organization-wide
process integration have a side effect: they become fundamentally dependent upon their systems.
Quite simply, an outage may grind the entire business to a halt.
Vulnerabilities are magnified by consolidation of systems. Mergers and acquisitions, as well as
adoption of shared services structures for order processing, finance, human resources (HR),
customer service and other functions have contributed to this trend.
Exposure extends to solutions for planning and forecasting, analytics, mobile computing and
other informational applications. Even if applications are deployed on different platforms, they
draw upon core databases – if these are down, they will at best be working with stale data.
2. Globalization. The growth of offshore sourcing has caused procurement and logistics operations
to grow more complex, while transportation times have increased.
The impact of disruptions tends to be greater for regional and global supply chains than for those
in more restricted geographies. A delay in shipping from a local plant to a nearby distribution
center, for example, may mean waiting for another truck. A delay in shipping from a Chinese
plant to North America or Europe may mean waiting 10 days for the next ship.
3. Supply chain strategies. Adoption of just in time, lean and real-time operating models has further
increased vulnerability.
In most supply chain industries, lean strategies have become the norm. In consumer products and
retailing, they have been reflected in techniques such as Efficient Customer Response (ECR),
Collaborative Planning, Forecasting and Replenishment (CPFR), Continuous Replenishment and
Vendor Managed Inventory (VMI).
The effects of lean strategies may permeate the entire supply chain. At the corporate or business
unit level, for example, forecasting and planning cycles may be reduced from weeks to days, or to
24 hours or less.
At the other end of the spectrum, cross docking (i.e., the immediate transshipment of goods
between arriving and departing vehicles, without intermediate storage) in distribution centers may
increase both efficiency and vulnerability to disruption.
In the automotive industry, for example, suppliers now receive continuous demand signals from
their customers, recalibrate plans and forecasts, and initiate procurement, production and logistics
actions in real-time. In a fiercely competitive industry, supplier shortfalls are rarely tolerated.
The automotive parts company profiled in this report, for example, delivers to its customers’
manufacturing plants on a just-in-time basis, often several times a day. Orders are typically
processed in minutes and deliveries dispatched within two to five hours.
4. Cascading effects. These may be simply illustrated. Even a basic manufacturing supply chain
will typically involve most or all of the processes summarized in figure 9.
SOURCE
• Identify sources of supply
• Select supplier(s)
• Negotiate with supplier(s)
• Schedule product deliveries
• Receive product
• Verify product
• Transfer product
• Authorize supplier payment
MAKE
• Schedule production
• Set up production
• Issue product
• Produce
• Inspect/test product
• Package product
• Stage product
• Release to delivery
DELIVER
• Process inquiry & quote
• Receive, enter & validate order
• Reserve inventory resources
• Reserve delivery resources
• Determine delivery date
• Consolidate orders
• Build loads
• Route shipments
• Select carrier(s)/rate(s)
• Receive product
• Pick product
• Pack product
• Load product
• Generate shipping docs
• Ship product
• Customer receipt & verify
• Install product
• Invoice customer
Figure 9: Basic Manufacturing Supply Chain Processes – SCOR Model
The figure above is based on selected segments of the Supply Chain Operations Reference
(SCOR) model developed by the Supply Chain Council.
Delays in one process can spread rapidly to others. For example, a delay in delivering parts to a
plant may cause finished product shipment deadlines to be missed. This may affect transportation
schedules and distribution center operations, affecting other deliveries. The impact is cumulative.
5. Customer responses. Economic conditions, changing expectations and mounting competition
have made customers less tolerant of supplier failures. Although the costs of operational
disruption may be substantial, the largest bottom-line impact often involves customers.
Sales may be lost, and customers may defect. Even if this does not occur, suppliers may be
subject to late delivery, imperfect order and other penalties. It may also be necessary to offer
special discounts or terms and conditions in order to win back the customer’s business.
A less visible, but potentially more damaging erosion of confidence may also occur. This could
cause the customer to hedge by diverting some future purchases to other suppliers in order to
reduce dependence. In addition, the customer might be reluctant to rely upon the company for
future strategic orders, particularly where these were time-sensitive.
No manufacturer wants to learn that customers now consider them a high-risk supplier.
An additional set of “strategic” costs may be incurred if outages are severe, protracted or both. Share
prices may be affected. Other effects such as reduced brand value; increased risk provision; higher
insurance premiums; and a variety of reputational, legal and compliance problems may be experienced.
System outages may have a wide range of potential cost impacts. Figure 10, for example, shows a
representative list of these for manufacturing companies.
STRATEGIC COSTS
• Charge against earnings
• Financial metrics/ratios
• Share price decline
• Share price volatility
• Cost of capital
• Increased risk provision
• Reduced brand value
• Insurance premiums
• Damaged reputation: financial markets; customers/prospects; banks; business partners; M&A candidates
• Impaired credit
• Liquidity exposure
• Legal exposure: customers; third parties; shareholders
• Compliance exposure: regulatory reporting; impaired inspection; impaired traceability
CUSTOMER-RELATED COSTS
• Lost short-term sales
• Lost short-term profit
• Lost future sales/profit
• Late delivery penalties
• Imperfect order penalties
• Product defect penalties
• Customer rebates
• Buyback pricing/concessions
• Additional customer service cost
OPERATIONAL COSTS
• Idle capacity: overall supply chain; procurement; plant operations; logistics/distribution; transportation; warehouses; third-party services
• Personnel costs: idleness/underutilization; reduced productivity; additional work required; overtime/shift premiums; additional T&E costs
• Finance processes: delayed billing/receivables; inventory carrying cost; cash flow cost; delayed close
• Costs of change: procurement change; revised order processing; special order cost; production schedule change; line change cost; costs of logistics change; supplier premiums; expedited transportation; additional handling cost; additional inventory cost; additional checking cost
• Error-related costs: order processing errors; product defect; specification error; manufacturing error; quality failure; shipment error; damaged product; wrong packaging; routing error; wrong delivery time
• Other costs: lost promotional expenditure; lost marketing expenditure; IT costs; administrative costs; overhead
Figure 10: Potential Costs of Outages – Manufacturing Companies
The potential significance of such effects was highlighted by a study co-authored by Kevin Hendricks of
the University of Western Ontario and Vinod Singhal of the Georgia Institute of Technology. After
reviewing the financial results of more than 800 public companies that had experienced severe supply
chain disruptions, the authors concluded that company stocks experienced 33 to 40 percent lower returns
relative to industry benchmarks over a three-year period because of these.
The study also reported declines of 7 percent in sales growth, 107 percent in operating income, 114
percent in return on sales and 93 percent in return on assets, as well as increases in cost of sales, SG&A
expenses and inventory levels.
A further implication should be highlighted. Disruptions tend to raise error rates across any or all stages
of supply chains. This is particularly likely if there is a rush to catch up with backlogs. Results may
include dissatisfied customers, remedial costs, legal and regulatory exposure and other negative effects.
A clear conclusion emerges. Whether outages result in operational disruption, customer-related costs
and/or strategic costs, they have a significant impact. Maintenance of the highest possible level of
availability and recovery for core supply chain systems should be a central goal of IT strategy.
Retail Vulnerabilities
Retailers worldwide have experienced many of the same trends as manufacturers. Supply chains have
become more complex and fragile, logistics structures have been consolidated and cycle times have been
cut across the board.
Acceleration has affected processes such as sales and inventory tracking, and merchandising decisions.
Although the pace has varied between retail lines and geographies, there has been a steady trend toward
more frequent new product launches, and greater use of time-sensitive promotions and markdowns.
System downtime occurring during such periods may have particularly serious effects.
Service interruptions may also cause lost sales and customers. In conventional storefront retailing, the
industry “rule of thumb” is that 40 to 80 percent of stockouts result in lost sales rather than purchases of an
alternative in-store product. Additional costs may be incurred for changes to store displays, backorders,
restocking, markdowns and other remedial actions.
Online sales are even more vulnerable. More than 20 years of experience with retail websites has shown
that 24x365 usage is the norm, and that even short outages during off-peak periods may cause significant
loss of business. If protracted outages occur at times of high usage (e.g., during seasonal sales peaks, or in
response to new product launches, promotions or Internet buzz), losses may be massive.
It has become a truism that, in online retailing, shoppers who are diverted to another supplier because
they are unable to research a product, determine availability or place an order may not return. Even if they
do, they are more likely to buy from multiple sources in the future.
Retailers also face growing use of mobile devices in stores. In the United States, for example, more than
40 percent of tablet and smartphone owners use these for comparison-shopping while visiting retail
outlets, and some estimates put the ratio at over 60 percent. There are similar trends in other geographies.
It has long been a principle that, in e-commerce, “customers are only a few clicks away from
competitors” and that online outages translate rapidly into lost sales. Mobility extends this effect to stores.
Increasingly, any customer may be “only a few clicks away from competitors.”
Financial Services
Risk Sensitivities
Financial services are, more than any other industry, sensitive to risk. Financial institutions are equipped
with highly sophisticated risk management processes and systems. Cultures of risk awareness and
mitigation are well established.
This extends to IT. Most companies have developed high availability, disaster recovery and security
infrastructures over decades. Risk sensitivity, however, has its blind spots. While the importance of such
infrastructures is generally understood, there is less awareness that the platforms around which they are
built may themselves be risk factors.
In large banks, core systems are typically mainframe-based. A significant minority, however, run on IBM
i. Worldwide, more than 15,000 banks, including large as well as small and midsize institutions, run core
systems on this platform. The general industry recognition is that IBM i offers mainframe-class levels of
availability, security and recoverability.
IBM i is supported by most of the industry’s major ISVs offering core banking and electronic funds
transfer (EFT) solutions, and has also been deployed by numerous insurance and other financial services
companies.
As the results presented in this report indicate, risk exposure may be significantly greater for other
platforms with which there is less experience in large-scale core system deployments.
Availability and Recovery
The financial services industry has long been sensitive to outages. The cost per hour of downtime for
trading, credit card processing, ATM and debit card networks, and other high-volume EFT systems has
often run to hundreds of thousands or millions of dollars.
With the growth of Internet services, vulnerability has increased further. “Normal business hours” no
longer exist. Most companies experience some level of activity at all hours of the day and night, 365 days
per year. Any interruption of service, at any time, may affect customers. An outage at times of high
activity may impact millions.
Apart from lost fee income, lost or delayed payments and other financial effects, customer loss may also
occur. Even if defections cannot be attributed to a specific incident, their effects will show up in overall
attrition statistics.
Outages may accelerate trends that are already causing concern among many companies. Bank customer
attrition rates, for example, continue to increase. In North America and Western Europe, annual rates are
already in the 5 to 10 percent per year range, while in many developing economies rates of 10 to 20
percent are becoming the norm.
Similar trends have been reported in insurance and other financial services businesses. For these, as for
banks, service issues are – by a wide margin – the most common cause of attrition.
Customer loss is magnified if it is measured in terms of CLV. In banking, insurance and other lines of
business, the effects are amplified by the growing numbers of products held per customer, and by the fact
that relationships tend to become more profitable the longer they last.
Allowance should also be made for lost customer acquisition costs. In banking, for example, acquisition
costs in developed countries are routinely $200 to $400 per customer, and average costs are escalating in
developing geographies. This expenditure is inevitably lost if a customer defects.
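The scale of this effect can be illustrated with simple arithmetic. The sketch below multiplies a customer base by an annual attrition rate and a per-customer acquisition cost, using the 5 to 10 percent attrition and $200 to $400 acquisition cost ranges cited above. The customer base of one million and the function name are hypothetical, and the attrition figures cover defections from all causes, not only those triggered by outages.

```python
def lost_acquisition_spend(customers, attrition_pct, acquisition_cost):
    """Acquisition spend (USD) written off when customers defect in one year.

    attrition_pct is a whole-number percentage, so the arithmetic stays exact.
    """
    defectors = customers * attrition_pct // 100
    return defectors * acquisition_cost


# Hypothetical bank with one million customers (an assumption, not report data).
customers = 1_000_000
low = lost_acquisition_spend(customers, 5, 200)     # low end of both ranges
high = lost_acquisition_spend(customers, 10, 400)   # high end of both ranges
print(f"Lost acquisition spend: ${low:,} to ${high:,} per year")
```

Under these assumptions, the written-off acquisition spend ranges from $10 million to $40 million per year, before any allowance for lost CLV.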
Disruption of core banking systems may be exceptionally damaging. Over time, these have developed
links to a wide range of other systems within banking infrastructures. A disruption may create cascading
effects as severe and long lasting as those in supply chain companies.
The recent Royal Bank of Scotland core banking system outage, for example, affected not only batch
processing but also all branch systems, ATMs, debit and credit cards, online banking and call center
systems. All channels and customer touch points were affected.
Vulnerability to such disruptions has tended to increase. Mergers and acquisitions have led many banks in
developed countries to merge legacy core banking systems (this was notably the case for Royal Bank of
Scotland), while in developing economies new deployments have often been driven by the need to
support business growth and offer new services.
The replacement of a core banking system is, under any scenario, a high-risk proposition. Risks increase
in proportion to the size of institutions. They increase further if new systems are deployed on platforms
whose stability and robustness are problematic.
Security and Malware
Financial services companies are the preferred target of the most sophisticated cybercriminals, including
organized gangs operating worldwide.
Hacking as well as malware attacks are growing more sophisticated over time. Companies also face a
growing threat from “hacktivists” promoting social and political agendas. During 2011, groups such as
Anonymous and its affiliates are believed to have exposed more confidential records in the U.S. than
cybercriminals. Despite occasional law enforcement successes, the problem continues to grow.
Financial services companies continue to invest heavily in perimeter defenses. These are, increasingly,
by-passed by two forms of threat:
1. Advanced persistent threats (APTs) involve malware that illicitly collects and forwards
confidential information over time. In many cases, APTs, which operate inside firewalls, have
functioned for months or years before being detected. No doubt, many have not been detected.
Increasingly, APTs have been directed to theft of funds rather than identity information. During
late 2011 and 2012, for example, a growing number of banks have reported “High Roller” attacks
which target high balance customer accounts and transfer funds elsewhere.
2. Insider abuse also appears to be expanding, and some industry sources estimate that insiders now
account for between a quarter and a third of all cybercrime incidents in financial services
companies. Perpetrators range from low-level employees to high-level executives, often
cooperating with external cybercriminals. Again, schemes routinely operate for months or years.
Economic conditions have contributed to growth in all types of cybercrime.
Growing attention is also being paid to the threat of nation-state attacks. Rogue nations are capable of
assembling and protecting larger numbers of computer specialists, who may have access to greater
resources and more advanced skills than cybercriminals. Financial services companies and
payments infrastructures are natural targets.
Data Breaches
Despite increasingly stringent privacy laws in most countries, data breaches remain pervasive.
In the United States, for example, credit card processor Global Payments reported in March 2012 that
hackers had compromised more than 1.5 million accounts of American Express, Discover, MasterCard
and Visa cardholders. Some external estimates put the number of accounts compromised at over 7
million. It is believed that hackers first penetrated Global Payments during 2011.
Penetration over long periods would not be unusual. For example, services company Heartland Payment
Systems recently revealed that a Ukraine-based hacker group operated inside the company’s perimeter
defenses for around six months. The company experienced a major breach in 2008 that exposed 134
million credit card accounts.
In June 2011, Citicorp disclosed that a hacker attack had compromised more than 360,000 customer
accounts. Numerous other such incidents have been reported during 2011 and 2012 worldwide.
In most countries, privacy laws expose businesses to regulatory penalties in the event of data breaches,
and other costs may be substantial. Figure 11 shows examples.
ACTIVITY | COSTS
Forensic examination & fixes | Weeks to months using specialists at $1,000-5,000 per person/day (1)
Customer notification | $0.20 to $5 per customer, depending on medium (1)
Query-handling | $10 to $25 per customer (call center) (1)
Credit/identity monitoring | $100 to $300 per customer per year (1)
Other customer remedial actions | $15 to $1,000+ per customer (1)
Reissue payment card | $12-22 per card (2)
Legal costs | Average legal defense cost: $500,000; average legal settlement: $1 million (3)
Regulatory fines & penalties | Variable
Management, PR costs | Variable
Customer attrition, brand damage | Variable
Sources:
(1) International Technology Group
(2) “Data Breach Cost,” Zurich Insurance Group, 2011
(3) “Cyber Liability & Data Breach Insurance Claims, A Study of Actual Payouts for Covered Data Breaches,” June 2011, NetDiligence
Figure 11: Data Breach Costs – U.S. Examples
Companies that have quantified breach costs report that customer attrition and brand damage represent the
largest cost components.
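The per-customer ranges in Figure 11 can be combined into a rough estimate of direct breach costs. The following is a minimal sketch, not a costing method used in this report: the function name, the number of affected customers and the shares of customers who contact the call center or accept monitoring are all hypothetical. It covers only notification, query-handling, one year of credit monitoring and average legal costs. Forensic work, card reissue, fines, attrition and brand damage are excluded.

```python
# Per-customer cost ranges in USD (low, high), taken from Figure 11.
PER_CUSTOMER = {
    "notification": (0.20, 5.00),
    "query_handling": (10.00, 25.00),
    "credit_monitoring_per_year": (100.00, 300.00),
}
LEGAL_DEFENSE = 500_000        # average legal defense cost (Figure 11)
LEGAL_SETTLEMENT = 1_000_000   # average legal settlement (Figure 11)


def breach_cost_range(customers, call_share, monitoring_share):
    """Return (low, high) direct cost estimates in USD for one breach.

    call_share and monitoring_share are assumed fractions (0 to 1) of
    affected customers who call in or accept one year of monitoring.
    """
    shares = {
        "notification": 1.0,  # every affected customer is notified
        "query_handling": call_share,
        "credit_monitoring_per_year": monitoring_share,
    }
    low = high = float(LEGAL_DEFENSE + LEGAL_SETTLEMENT)
    for item, (unit_low, unit_high) in PER_CUSTOMER.items():
        affected = customers * shares[item]
        low += affected * unit_low
        high += affected * unit_high
    return low, high


# Hypothetical scenario: 100,000 accounts, 10% call in, 20% take monitoring.
low, high = breach_cost_range(100_000, call_share=0.10, monitoring_share=0.20)
print(f"Estimated direct costs: ${low:,.0f} to ${high:,.0f}")
```

Even with the largest components left out, the estimate is dominated by credit monitoring, which shows how sensitive the total is to the take-up assumption.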
PLATFORM DIFFERENTIATORS
Overview
IBM i and Power Systems represent the convergence of two major technology streams:
1. IBM i originated with the AS/400 in 1988, and has been progressively enhanced to incorporate
new technologies.
According to the company, IBM i is employed by more than 150,000 organizations worldwide.
Although the installed base has decreased since the early 2000s, most of this has been due to
system consolidation. Many organizations that had initially deployed AS/400s to remote sites
later replaced these with larger centralized systems.
IBM i is supported by more than 2,500 ISVs – including most major vendors of ERP and
industry-specific core business systems – along with systems integrators and professional services
firms worldwide. It enjoys one of the highest levels of customer loyalty of any platform.
Many organizations continue to employ custom-developed RPG- and COBOL-based systems.
Among this group, application modernization initiatives – ranging from simple addition of
browser-based interfaces to large-scale re-engineering projects employing service oriented
architecture (SOA) – have been common.
IBM’s policy on i technology upgrades is distinctive. As a general principle, the company
introduces new i releases every two years.
New technology is also implemented in Technology Updates, which are introduced every six
months, and may be applied in a simple and non-disruptive manner. This approach, which was
widely requested by customers, enables them to implement new capabilities in an incremental
manner rather than through major migrations every few years.
2. Power Systems are built upon the seventh generation of IBM POWER reduced instruction set
computing (RISC) architecture. POWER7-based systems, which also support the IBM AIX
UNIX-based operating system and Power versions of RHEL and SLES Linux, have consistently
outperformed competitive platforms in a wide range of industry benchmarks.
POWER7-based systems incorporate industry-leading advances in chip density, memory
technology, multithreading, virtualization, workload management, availability optimization,
energy efficiency and other areas.
In the UNIX server market, Power Systems have progressively increased their share since 2008,
and by the end of 2011 had reached the 50 percent mark. This share has continued to expand
during 2012 in both developed and growth markets worldwide.
In addition, IBM i runs on Power processors in new IBM PureFlex Systems, which combine IBM
Power, System x (x86) and midrange Storwize V7000 disk arrays in a single integrated platform.
PureFlex Systems implement common management services across the full range of operating
systems, systems software and hypervisors supported by the platform.
For large organizations considering whether to deploy new enterprise business systems on IBM i or
competitive servers, or debating whether to maintain commitments to existing i-based systems, it is
important to understand the differences between these platforms.
IBM i
Principal Characteristics
Major IBM i features include the following:
1. Core design. The core IBM i design is built around an object-based kernel in which all system
resources are defined and managed as objects.
The kernel incorporates single-level storage capability, meaning that the system treats all storage
resources, including main memory and disks, as a single logical entity. Placement and
management of data on all resources is handled automatically by the system, minimizing tasks
that must be handled by administrators.
This capability, illustrated in figure 12, enables high levels of configuration flexibility; improves
system administrator productivity; and materially improves the efficiency with which processor
and storage resources are used, improving performance and capacity utilization.
Figure 12: IBM i Single-level Storage Structure
A further benefit is that integration and management of solid-state drives (SSDs) is comparatively
simple. IBM i automatically places the most frequently accessed data on SSDs, reallocates data to
SSDs or hard drives as workloads evolve, and optimizes performance on an ongoing basis.
IBM i users have realized performance gains from use of SSDs in high-throughput applications
such as large batch runs (reductions of 20 to 50 percent in elapsed time are common) and initial
program loads (IPLs).
The IBM i kernel also embeds the Technology Independent Machine Interface (TIMI), a unique
IBM i feature that acts as a “virtual” instruction set with which applications interact regardless of
the instruction set of underlying processor hardware.
The TIMI has enabled IBM to update underlying hardware platforms without obliging users to
recompile applications software. Organizations have found avoidance of costs, workloads and
disruptions of application migration to be major benefits.
[Figure 12 elements: single-level storage; storage management; objects; main memory (RAM); disk storage; solid state]
2. System integration. IBM i includes not only operating system functions, but also DB2 for i, an
integrated file system, WebSphere Application Server (WAS), Tivoli Directory Server, Java
Virtual Machine (JVM) environments, and more than 300 tools handling system, database,
storage, backup and recovery, communications, security, operations and other management tasks.
DB2 for i is an i-optimized version of the IBM DB2 platform, which is offered by the company for
Windows, Linux, UNIX and mainframe systems. It is a full-function SQL relational database
enabling high levels of transactional as well as query performance, along with industry-leading
data compression, encryption and Extensible Markup Language (XML) compatibility.
IBM i components are not simply bundled. They are engineered to interact with each other in a
simple and efficient manner, and extensive testing is carried out to ensure that they do so. This
testing extends not only across IBM hardware and software, but also across key independent
software vendor (ISV) solutions.
The implications are important. Integration affects performance – efficient software structures
generate lower system overhead – as well as availability. Tightly integrated, tested systems are
less likely to experience outages.
Equivalent functionality in Windows and x86 Linux server environments typically requires that
users acquire, install, configure and administer multiple software products from different vendors.
Integration and testing of these is less coordinated, and version upgrades rarely follow the same
schedule. Deployment complexity and management challenges are increased.
In addition to increasing full time equivalent (FTE) staffing for system, database and security
administration, less integrated environments are more likely to degrade performance.
Maintenance of availability, security and disaster recovery also becomes a great deal more
problematic.
3. Workload management. Since its inception, IBM i has incorporated industry-leading workload
management (in IBM i terminology, work management) capabilities designed to handle diverse
workloads such as online, batch and collaborative processing in a highly efficient manner.
The backbone of these capabilities is provided by IBM i subsystems, which leverage the IBM i
object-based architecture – individual workloads or applications (e.g., ERP, CRM, e-mail, Web
serving) are described and managed independently. The system allocates memory, limits
consumption of resources by individual workloads, and manages scheduling, tuning and other
tasks automatically, or based on priorities set by users.
Subsystems are integral to the IBM i design, and may be employed independently of or in
conjunction with PowerVM virtualization. This approach represents one of the most elegant and
sophisticated forms of workload management available for any server platform.
4. Automation. IBM i was designed to automatically handle a wide range of functions – including
configuration, tuning, software updates, availability and security optimization and other common
operational tasks – for which most other systems require extensive manual intervention.
Although the most visible effect of automation is that it reduces FTE staffing (users report that
IBM i typically requires two to five times fewer administrators than Windows and x86 Linux
equivalents), other benefits may be expected.
A system that can determine workload requirements and reallocate system resources in a matter
of milliseconds, for example, will use capacity more efficiently than one that is dependent on
administrator or operator intervention. Automation reduces the potential for human errors leading
to performance bottlenecks, outages, data loss or corruption and other negative effects.
IBM i automation strengths have been reinforced by autonomic technologies. Autonomic
computing – meaning the application of artificial intelligence technologies to IT administration
and optimization tasks – has been a major IBM development focus since the 1990s, and the
company is the recognized industry leader in this area.
Four categories of autonomic functions – self-configuring, self-optimizing, self-protecting and
self-healing – are implemented in IBM i and Power Systems. These functions, which represent
one of the most advanced implementations of autonomic technologies within the IBM product
line, are summarized in figure 13.
5. Security and malware resistance. The strengths of IBM i’s object-based design are reinforced by
tight integration of security functions with compiler, directory server and object-based file system
structures. In contrast, security functions for Windows and x86 Linux are implemented as
software subsystems. The level of integration is significantly less.
IBM i also contains a full IP security suite, including support for the principal industry security
standards and encryption techniques; and extensive access control and audit facilities. Single
sign-on is enabled using an industry-leading IBM autonomic technology, Enterprise Identity
Mapping (EIM), which maps user IDs across all middleware and application components.
The time and effort that must be spent on routine security and malware protection tasks, and in
patching and auditing is a great deal less than for Windows and x86 Linux servers.
A broader IBM i characteristic is that its different components are implemented in a highly synergistic
manner. For example, DB2 for i exploits the underlying object-based structure and single level storage
capabilities of the operating system. Multithreading, virtualization, workload management and other
functions are closely integrated.
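The principle behind subsystem-based work management, in which memory is allocated by priority while consumption by any one workload is capped, can be sketched as follows. This is an illustrative model only. The class, field names, priority convention and figures are hypothetical, and it does not represent the algorithm or interfaces that IBM i actually uses.

```python
from dataclasses import dataclass


@dataclass
class Subsystem:
    name: str
    priority: int    # lower number = more important (hypothetical convention)
    demand_gb: int   # memory the workload currently wants
    cap_gb: int      # upper limit on what the workload may consume


def allocate_memory(total_gb, subsystems):
    """Grant memory in priority order, never exceeding a subsystem's cap."""
    remaining = total_gb
    grants = {}
    for sub in sorted(subsystems, key=lambda s: s.priority):
        grant = min(sub.demand_gb, sub.cap_gb, remaining)
        grants[sub.name] = grant
        remaining -= grant
    return grants


workloads = [
    Subsystem("batch", priority=3, demand_gb=64, cap_gb=32),
    Subsystem("erp_online", priority=1, demand_gb=48, cap_gb=64),
    Subsystem("web", priority=2, demand_gb=24, cap_gb=16),
]
print(allocate_memory(96, workloads))
# {'erp_online': 48, 'web': 16, 'batch': 32}
```

In this example the online ERP workload receives everything it requests, while the caps prevent the Web and batch workloads from crowding it out even when their demand exceeds their limits.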
High-end Storage Support
The IBM i presence in the high-end systems market is reflected in support by the industry’s principal
vendors of enterprise-class disk arrays and software.
IBM’s System Storage DS8000, which offers the highest levels of performance and availability within the
IBM storage product line, may be attached to IBM i systems. The DS8000 platform is commonly
employed for the most business-critical mainframe- and UNIX server-based systems worldwide.
Easy Tier, IBM’s solution for automated storage tiering, is supported by IBM i for DS8000 as well as
other IBM disk arrays. Easy Tier has a reputation for enabling full-function tiering while minimizing the
complexities with which storage administrators must deal.
IBM PowerHA SystemMirror for i integrates IBM’s top-of-the-line Metro Mirror and Global Mirror tools
for synchronous and asynchronous remote replication respectively. Metro Mirror supports failover and
recovery at distances of up to 300 kilometers, while there is no distance limit to Global Mirror coverage.
IBM i users have also deployed the company’s XIV Storage System. Built around an innovative parallel
processing design, the XIV system has demonstrated exceptional reliability, high-volume snapshot
copying and disk-caching capabilities. Integrated software and low management overheads have also
contributed to its popularity.
IBM i is supported by EMC for its high-end VMAX arrays, including the multiple-petabyte VMAX 40K.
EMC announced in May 2012 that its automated storage tiering technology, FAST VP (Fully Automated
Storage Tiering for Virtual Pools), could be exploited by VMAX arrays attached to IBM i systems.
SYSTEM
Self-configuring
- Connect automated services
- CPU capacity upgrade on demand
- Enterprise Identity Mapping
- EZSetup Wizards
- Hot plug disk & I/O
- Linux & Windows Virtual I/O
- RAID subsystem
- Switchable auxiliary storage pools
- Windows file/print support
- Windows dynamic storage addition
- Wireless system management access
Self-protecting
- Automatic virus removal
- Chipkill Memory
- Digital certificates
- Digital object tagging
- Enterprise Identity Mapping
- Integrated Kerberos support
- Integrated SSL support
- IP takeover
- RAID subsystem
- Self-protecting kernel
- Tagged storage
Self-optimizing
- Adaptive e-transaction services
- Automatic performance management
- Automatic workload balancing
- Dynamic disk load balancing
- Dynamic LPAR for i & Linux
- Expert Cache
- Global resource manager
- Heterogeneous workload manager
- Quality of service optimization
- Single-level storage
Self-healing
- ABLE problem management engine
- Auto-fix defective PTFs
- Automatic performance adjuster
- Chipkill Memory, dynamic bit steering
- Concurrent maintenance
- Domino auto restart, clustering
- Dynamic IP takeover, clustering
- Electronic Service Agent (“call home”)
- First-failure data capture & alerts
- Service director
DATABASE
Self-configuring
- Automatic collection of object relationships
- Automatic data spreading & disk allocation
- Automatic data striping & disk balancing
- Automatic disk space allocation
- Automatic distributed access configuration
- Automatic object placement
- Automatic self-balancing indexes
- Automatic tablespace allocation
- Automatic TCP/IP startup
- Graphical database monitor
Self-protecting
- Automatic encryption management
- Automatic enforcement of user query & storage limits
- Automatic synchronization of user security
- Digital object signing
- Object auditing
- OS-controlled resource management
Self-optimizing
- Adaptive Query Processing
- Automatic Index Advisor
- Automatic memory pool tuning
- Automatic query plan adjustment
- Automatic rebind & reoptimization
- Automatic statistics collection
- Auto Tuner
- Caching of open data paths & statements
- Cost-based Query Optimizer
- On Demand Performance Center
- Performance monitoring & analysis
Self-healing
- Automatic object backup/restore
- Automatic database object extents
- Automatic database restart
- Automatic index rebalancing
- Automatic journaling of indexes & objects
- Automatic rebuild of catalog views
- Automatic restart of journal processing
- Self-managed database logging
- Self-managed journal receivers
- Systems managed access path protection
Figure 13: IBM i and Power Systems Autonomic Functions
EMC and IBM cooperate under an agreement first concluded in 2006, and recently extended to 2016, to
ensure full integration of IBM i with VMAX arrays.
A wide range of other IBM and third-party disk arrays may be used with IBM i systems.
Power Systems
Overview
Power Systems have been the recognized industry leader in server performance since the mid-2000s. To
some extent, this has been a function of the performance delivered by successive generations of POWER
processors. However, other factors come into play.
In Power Systems, system-level performance potential has been optimized at all levels of design and
implementation – including microelectronics, module- and subsystem-level components, internal
communications, I/O and system-level hardware and software.
Key features include highly effective compiler- and operating system-level performance acceleration,
including chip simultaneous multithreading; low levels of symmetric multiprocessing (SMP) overhead;
and extensive system-level integration and optimization of performance-related features.
Intelligent Cache and Intelligent Threads in Power Systems allow cache allocation and numbers of
threads (two to four may be employed) to be varied according to workload requirements. Parameters may
be set by administrators, or determined automatically by the system based on application priorities.
The overall architecture, illustrated in figure 14, integrates with IBM i to allow users to manipulate a
wider range of variables – including subsystems, threads, processors, cache, main memory and I/O,
multiple types of partition, multiple threads and dedicated or pooled processors – with higher levels of
granularity and flexibility than any competitive platform.
Power Systems are optimized not only to deliver high levels of performance for single applications and
workloads, but also for the mixed workload environments that are typically generated by core enterprise
systems. Transactional as well as query and collaborative workloads may be handled concurrently in a
highly efficient manner.
Current-generation Power Systems include single-socket (710 and 720), two-socket (730 and 740) and
four-socket (750, 770 and 780) models covering a wide range of prices, and performance and
expandability levels; and the high-end Power 795, which is configurable up to 32 sockets (256 cores).
There are also single- and two-socket POWER7-based blade models.
Virtualization
Effective virtualization consists of more than the ability to create virtual machines.
Multiple mechanisms are required to create and modify partitions; share system resources between these,
and change resource allocations as needs change. It is also necessary to prioritize availability of resources
to different applications based on business criticality; monitor and control workload execution processes;
and meet service-level performance and uptime targets.
PowerVM virtualization meets these requirements. Capabilities include three types of partitioning:
1. Logical partitions (LPARs) are microcode-based partitions that may be configured in increments
as small as 1/10th of a core. The technology was originally developed for IBM mainframes.
As a general principle, this approach (often referred to as hard partitioning) offers better isolation
of workloads than software-based techniques. Workloads running in different partitions are less
likely to interfere with each other, enabling higher levels of concentration. LPARs provide
additional security functions.
Figure 14: IBM i and Power Systems Architecture
No equivalent capability is available for Intel-based servers with Windows, x86 Linux and/or x86
virtualization tools, or for newer Oracle Sun servers.
2. Micro-partitions are software-based partitions. They are typically employed to support instances
requiring limited system resources, and to improve load balancing for large, complex workloads.
Micro-partitions may be configured in initial increments of 1/20th of a core, and subsequent
increments as small as 1/100th of a core.
[Figure 14 elements: resource sharing (processors, cache, memory, I/O, threads); Virtual I/O Servers; dedicated processors (physical processors); shared processor pools (virtual processors); LPARs and micro-partitions; virtual LAN, virtual tape and virtual disks; PowerVM hypervisor; IBM i 7.1 (object-based architecture, single-level storage, system integration & automation); workload management subsystems]
LPARs and micro-partitions are supported by mechanisms that allow processor, memory and I/O
resources to be pooled and reallocated in an extremely granular manner. The system monitors
resource utilization every 10 milliseconds, and may change allocations as rapidly.
Business-critical workloads may run in dedicated LPARs, using dedicated physical processors.
However, other workloads may be executed based on assigned priorities using combinations of
threads, partitions and shared processor pools. The system allows workloads to run on one or
more processor cores within shared pools.
3. Virtual I/O Servers allow operating system instances running in multiple LPARs to share a
common pool of LAN adapters as well as Fibre Channel, SCSI and RAID devices; i.e., it is not
necessary to dedicate adapters to individual partitions. Hardware, maintenance and energy cost
savings may be realized. Virtual I/O Servers may be duplexed to provide redundancy.
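The partitioning granularity described above can be made concrete with a short sketch. It checks whether a set of micro-partition entitlements respects the 1/20th-core initial increment and fits within a shared processor pool. Entitlements are expressed as whole hundredths of a core, which matches the 1/100th-core subsequent increment and avoids rounding errors. The function names, partition names and pool size are hypothetical, and this is not a PowerVM interface.

```python
MIN_HUNDREDTHS = 5  # 1/20th of a core, the initial increment cited above


def valid_entitlement(hundredths):
    """True if the entitlement is a whole number of 1/100th-core units
    that meets the 1/20th-core minimum."""
    return isinstance(hundredths, int) and hundredths >= MIN_HUNDREDTHS


def fits_in_pool(entitlements, pool_cores):
    """True if every entitlement is valid and the total fits in the pool."""
    if not all(valid_entitlement(e) for e in entitlements.values()):
        return False
    return sum(entitlements.values()) <= pool_cores * 100


# Hypothetical four-core shared pool hosting three micro-partitions.
partitions = {"web": 5, "reporting": 37, "erp": 250}  # 0.05, 0.37, 2.50 cores
print(fits_in_pool(partitions, pool_cores=4))   # True: 2.92 cores of 4
print(fits_in_pool({"tiny": 3}, pool_cores=4))  # False: below 1/20th core
```

The example shows how fine-grained increments allow a small instance and a large one to share the same pool without rounding either up to a whole core.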
PowerVM also provides key availability optimization features. Live Partition Mobility, introduced for
IBM i 7.1 in April 2012, allows movement of active LPARs between Power Systems without disrupting
operations. Service interruptions of one or two seconds may occur due to network latency. These are,
however, rarely noticeable to users.
This capability has proved particularly attractive to organizations that need to perform scheduled
maintenance and software upgrades without downtime.
The PowerHA SystemMirror for i clustering solution enables failover and recovery of even large-scale,
highly granular PowerVM environments in a highly efficient and reliable manner.
PowerVM and x86 Virtualization
x86 virtualization tools such as VMware, Microsoft Hyper-V, Xen, KVM and Oracle VM employ only a
single, software-based partitioning method. While they may be able to support diverse workloads, they do
so less efficiently. System overhead may be significantly larger.
(Hard partitioning is supported on the Intel Itanium-based HP Integrity with HP-UX and RHEL, and on
older Oracle Sun SPARC-based M-Series with the Solaris operating system. New installations of these
are, however, now comparatively rare.)
Differences in other areas should also be highlighted.
• Workload management. Most workloads experience fluctuations, and processes (e.g., online,
batch, collaborative) may vary. Unexpected spikes may occur. When multiple applications are
concentrated on a single physical platform – particularly if these generate mixed workloads –
highly granular, real-time monitoring and resource assignment will be required.
If systems cannot provide such capabilities, administrators will tend to limit the number and size
of partitions to prevent workloads interfering with each other. This is one of the key weaknesses
of VMware and other x86 hypervisors, and helps explain why most installations of these realize
only a fraction of their architectural potential.
• Complexity. Ironically, solutions intended to reduce complexity by enabling consolidation of
physical x86 servers have often had the reverse effect. As figure 15 illustrates, virtualization
introduces a new layer of architecture into system environments.
Figure 15: System Environment Layers – Example
In an IBM i environment, the bottom four layers shown in the figure above are integrated by
IBM. In addition, the company’s close relationships with ISVs mean that the applications layer
is better tested and optimized for the overall IBM stack than is the case for Windows and x86
Linux servers.
A VMware environment, in contrast, will typically include components from Intel or Advanced
Micro Devices (AMD); the server hardware manufacturer; operating system, database and/or
application suppliers; and VMware itself. The number of vendors may be significantly larger if
storage and networks, and third-party tools are included.
Integration among these vendors may leave much to be desired and, even though they cooperate,
overall complexity in customer installations will still be significantly greater than for IBM i on
Power Systems.
Attention should be drawn to a further differentiator. VMware and other x86 tools have become common
hacker and malware targets. Businesses that deploy them have often found that their vulnerabilities
increase, while patching workloads expand.
IBM i is less vulnerable, as is PowerVM. The National Vulnerability Database maintained by the U.S.
National Institute of Standards and Technology (NIST), for example, recorded 39 medium- and high-
severity vulnerabilities for VMware, and 13 for Xen and KVM, during 2011. None were reported for
PowerVM over the same period.
Lower PowerVM vulnerability reflects, to some extent, the fact that it is less targeted than x86
equivalents. However, security and malware protection mechanisms are more closely embedded and
integrated across IBM i, Power Systems and PowerVM than is the case for competitive platforms.
Availability Optimization
Power Systems
A first set of availability optimization features is built into Power Systems hardware and microcode. It
includes the following:
• Basic capabilities include high levels of component reliability and redundancy, along with hot-
swap capabilities enabling devices to be replaced without taking systems offline. Redundant and
hot-swap components include disk drives, PCI adapters, fans, blowers, power supplies and, on
high-end models, system clocks, service processors and power regulators.
[Figure 15 layers, bottom to top: Hardware; Virtualization; Operating System; Databases/Middleware; Applications]
• Monitoring, diagnostic and fault isolation and resolution facilities are built into all major
components, including processors, main memory, cache and packaging modules, as well as
adapters, power supplies, cooling and other devices. In many cases, multiple layers of protection
and self-test are implemented.
Key functionality is provided by IBM-developed Chipkill and First Failure Data Capture
(FFDC) technologies. Chipkill is significantly more reliable than conventional error correction
code (ECC) techniques. FFDC employs embedded sensors that identify and report failures to a
separately powered Service Processor, which also monitors environmental conditions.
The Service Processor can automatically notify system administrators or contact an IBM Support
Center (electronic support or call home service) to report events requiring service intervention.
• Fault masking capabilities prevent outages in case failures do occur. For example, in the event
an instruction fails to execute due to a hardware or software fault, the system will automatically
repeat the operation. If the failure persists, the operation will be repeated on a different processor
and, if this does not succeed, the failed processor will be taken out of service.
In addition, memory sparing enables alternate memory modules to be activated in the event of
failures, and an enhanced memory subsystem enables memory controller and cache sparing.
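The escalation sequence described under fault masking (retry, then alternate processor, then removal from service) can be sketched as simple control flow. This is purely an illustration of the order of steps as the text describes them. The real mechanism is implemented in Power Systems hardware and firmware, and the function name, the `run_on` callback and the returned status strings are all hypothetical.

```python
from typing import Callable


def execute_with_fault_masking(
    run_on: Callable[[str], bool], primary: str, alternate: str
) -> str:
    """Apply the three escalation steps described in the text.

    `run_on(processor)` is a hypothetical callback that returns True when
    the operation completes on the named processor.
    """
    # Step 1: first attempt on the original processor.
    if run_on(primary):
        return "completed"
    # Step 2: processor instruction retry on the same processor.
    if run_on(primary):
        return "completed after retry"
    # Step 3: alternate processor recovery.
    if run_on(alternate):
        return "completed on alternate processor"
    # Step 4: the failed processor is taken out of service (checkstopped).
    return "checkstop"


if __name__ == "__main__":
    # Simulated fault: the operation only ever succeeds on "cpu1".
    print(execute_with_fault_masking(lambda cpu: cpu == "cpu1", "cpu0", "cpu1"))
```

In the simulated case the operation fails twice on the primary processor and completes on the alternate, so no outage is visible to the workload.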
Availability optimization features of Power Systems are summarized in figure 16. Additional capabilities
are provided for high-end Power 770, 780 and 795 models.
LPARs contribute to reduction of planned outages. Software modifications may be made and new
versions installed and assured without disrupting operations. Backups may be performed, and batch
workloads executed concurrently with online processes.
Software Solutions
Avoidance of planned as well as unplanned outages is a central IBM i design parameter. High levels of
stability, integration and automation minimize risks of unplanned outages caused by software failures and
human error, and reduce both the frequency and duration of planned outages.
Specialized features further minimize risks of data loss in the event of an unplanned outage. These
include Remote Journaling (file and system changes may be automatically copied to a second server),
Save While Active (backups may be performed without taking systems offline) and Independent
Auxiliary Storage Pools (IASPs) (data may be mirrored to local or remotely located alternate systems).
Additional protection may be provided by IBM or third-party clustered failover solutions. IBM PowerHA
SystemMirror for i, for example, builds upon IASP technology to provide more advanced database
mirroring, failover and recovery. Synchronous or asynchronous replication may be employed.
Although the amount of time required to fail over and restart systems and reinstate data may vary, the
best-practice norm for use of PowerHA SystemMirror for i is that operations may be resumed in a matter
of seconds, and data fully restored within an hour. Users have routinely achieved mainframe-class
failover and recovery even for complex large-scale transactional workloads.
BASIC CAPABILITIES

Redundancy, hot-swap & related: Redundant/hot-swap disks, PCI adapters, GX buses, fans & blowers, power supplies, power regulators & other components. Redundant disk controllers. I/O paths & oscillators. Concurrent system clock repair.

Concurrent firmware update: Server microcode may be updated without taking systems offline.

Concurrent maintenance: Allows processors, memory cards & adapters to be replaced, upgraded or serviced without taking systems offline.

MONITORING, DIAGNOSTICS & FAULT ISOLATION/RESOLUTION

Hardware-assisted memory scrubbing: Automatic daily test of all system memory. Detects & reports developing memory errors before they cause problems.

Chipkill error checking: Employs RAID-like striping of data across memory devices to provide redundancy & enable reinstatement of original data. Significantly more reliable than conventional error correction code (ECC) technology.

First Failure Data Capture (FFDC): Employs 1,000+ embedded sensors that identify errors in any system component. Root causes of errors are determined without the need to recreate problems or run tracing or diagnostics programs.

FAULT MASKING

Processor instruction retry, alternate processor recovery, processor-contained checkstop: If an instruction fails to execute due to a hardware or software fault, the system automatically retries the operation. If the failure persists, the operation is repeated on a different processor &, if this does not succeed, the failed processor is taken out of service (checkstopped). Only LPARs supported by the failed processor are affected.

Dynamic processor sparing: Allows idle Capacity Upgrade on Demand (CUoD) processors to be automatically activated as replacements for failed processors.

Partition availability priority: In the event of a processor failure, maintains LPAR-based workloads based on assigned priorities; i.e., remaining processor capacity is assigned to the highest-priority workloads.

Memory sparing: Enables redundant memory to be activated in the event of failure.

Enhanced memory subsystem: Enables memory controller & cache sparing.

Enhanced cache recovery: Detects & purges processor & cache errors. Recovers original data.

Dynamic I/O line bit repair (eRepair): Detects & bypasses failed memory pins.

PCI bus parity error retry: Retries an I/O operation if an error occurs.

Figure 16: Key Power Systems Availability Optimization Technologies
DETAILED DATA
Company Profiles
The results presented in this report were based on the company profiles summarized in figure 17.
SUPPLY CHAIN COMPANIES

Auto Parts Manufacturer
Business profile: Tier 1 automotive parts manufacturer; $8 billion sales; 50,000 employees; 80 manufacturing & distribution centers worldwide
Applications: Automotive ERP system

Retail Chain
Business profile: Hard lines retailer; $5 billion sales; 25,000 full-time employees; 500 stores + Internet, catalog & call center channels; 5 distribution centers
Applications: Core merchandise management, logistics management, finance & HR

Industrial Distributor
Business profile: Industrial distributor; $3 billion sales; 7,000+ employees; 400 branches; 10 distribution centers
Applications: ERP system, e-commerce

FINANCIAL SERVICES COMPANIES

Bank
Business profile: Diversified retail bank; $10 billion revenues; $300 billion assets; 30,000+ employees; 1,150 branches + ATMs, Internet & mobile banking services
Applications: Core banking, EFT/POS, online banking, card management, financial

Insurance Company
Business profile: Property & casualty insurer; $3 billion revenues; $5 billion assets; 5,000+ employees; 3 million customers; agent, Internet & call center channels
Applications: Core policy & claims management, customer-facing Web services, call center operations, finance & compliance

Services Company
Business profile: Loan processing services; $1 billion revenues; 5,000+ customers; 2,500+ employees
Applications: Core processing, customer service, online billing & payments

Figure 17: Company Profiles
Profiles were constructed using survey data from 60 companies in the same industries; i.e., automotive
parts manufacturing, hard lines retail and industrial distribution for supply chain companies; and retail
banking, property and casualty insurance and financial IT services for financial services companies.
Companies employed IBM i, WSFC and Oracle Exadata clusters.
Companies employed systems that could be realistically compared across platforms; e.g., the same ERP
suites were used for comparisons where these were supported on IBM i and Windows servers.
Data was collected on business operations including, where appropriate, vulnerability to cascading
effects; applications employed including packaged as well as custom software, and workloads;
availability experiences including frequency and duration of planned as well as unplanned outages;
security and disaster recovery arrangements, and other subjects.
Costs of Downtime
Calculation Process
Costs of downtime were calculated using a two-phase process. First, average costs per hour of downtime
were calculated for all companies using appropriate industry- and organization-specific values.
“Average,” in this context, means that costs are based on overall annual volumes of business activity
divided by hours of operation (in all cases, 24 x 365 = 8,760). Values were as described below.
Second, average costs of downtime per hour were multiplied by numbers of hours of downtime per year
for each platform. These were calculated based on user input.
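The two-phase process above can be sketched as follows. The function names are hypothetical, and the input values in the example are placeholders rather than figures from this report. Only the 8,760-hour divisor comes from the text.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours of operation, as stated in the text


def average_cost_per_hour(annual_value: float) -> float:
    """Phase 1: spread an annual business value evenly over all hours of operation."""
    return annual_value / HOURS_PER_YEAR


def annual_cost_of_downtime(cost_per_hour: float, downtime_hours_per_year: float) -> float:
    """Phase 2: multiply the hourly cost by a platform's hours of downtime per year."""
    return cost_per_hour * downtime_hours_per_year


if __name__ == "__main__":
    # Hypothetical inputs, not taken from the report.
    hourly = average_cost_per_hour(annual_value=87_600_000.0)
    print(hourly)                                                        # 10000.0
    print(annual_cost_of_downtime(hourly, downtime_hours_per_year=3.5))  # 35000.0
```

Because the hourly figure is an average over the whole year, this approach does not distinguish peak from off-peak hours. Platform differences enter only through the hours of downtime per year.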
Supply Chain Companies
Values for these were as follows:
• For all companies, supply chain disruption costs include costs incurred for planning and
operational processes between initial customer queries and final delivery.
Calculations include costs of idle and underutilized capacity, including personnel; handling of
delivery delays (including distribution center and transportation costs); additional inventory
carrying costs; costs of customer billing and payments processing delays; costs of change for
affected processes; and, for the retail chain, increased markdown costs.
• For the automotive parts manufacturer, supply chain disruption costs are divided between
inbound supply chain and production disruption, consisting of costs incurred between supplier
queries and factory release; and outbound supply chain disruption, consisting of costs incurred
between factory release and final customer delivery.
These categories generally correspond to the “Source and Make” and “Deliver” segments
respectively of the Supply Chain Operations Reference (SCOR) model developed by the Supply
Chain Council. Inbound supply chain and production disruption calculations include the effects
of delays on production operations, including costs of production scheduling and setup changes.
Because the company has achieved high levels of vertical integration, inbound supply chain costs
are comparatively low. Other costs include customer penalties and remedial costs including
penalties for late delivery and imperfect orders, along with buyback costs such as additional
discounts and rebates.
• For the retail chain, costs of downtime include lost sales due to stockouts and, for the
company’s Internet channel, inability to quote product availability and process customer orders
due to outages; and selling, general and administrative (SG&A) costs primarily due to disruption
of store operations. SG&A costs include idle capacity, handling and administrative costs for late
and imperfect deliveries, and reordering, display changes and restocking.
• For the industrial distributor, costs of downtime include lost sales due to inventory shortages,
inability to process customer queries and orders due to outages and related effects. Customer
penalties and remedial costs are included in supply chain disruption costs.
Values were calculated based on user input as well as published material such as company financial
reports and presentations.
Financial Services Companies
Values for these were as follows:
• For the bank, costs of downtime include customer attrition (lost customer income), lost
transaction fees (including ATM/debit fees, and fees for transactions conducted online and
through call centers) and other costs, including lost interest, lost customer acquisition expenditure
and productivity loss by branch, call center and other customer-facing staff during outages.
• For the insurance company, costs include lost policy income due to customer attrition, missed
sales opportunities and payment delays caused by outages, and other costs, including lost interest,
lost customer acquisition expenditure and productivity loss by call center and other customer-
facing staff during outages.
• For the services company, costs include lost fee income, customer attrition, lost interest and
productivity loss by customer interaction center staff during outages.
Values for customer loss and missed sales opportunities were calculated based on CLV. Published
materials were again employed where appropriate.
Breakdowns of costs of downtime per hour for individual companies are shown in figure 18.
Cost category                                 Outage cost per hour ($000)

SUPPLY CHAIN COMPANIES

AUTO PARTS MANUFACTURER
Outbound supply chain disruption                759.06
Inbound supply chain & production disruption    185.03
Customer penalties & remedial costs             269.62
TOTAL ($000)                                  1,213.71

RETAIL CHAIN
Lost sales                                      383.10
Supply chain disruption                         218.62
SG&A costs                                       83.31
TOTAL ($000)                                    685.03

INDUSTRIAL DISTRIBUTOR
Lost sales                                      265.80
Supply chain disruption                         283.36
TOTAL ($000)                                    549.16

FINANCIAL SERVICES COMPANIES

BANK
Customer attrition                              108.21
Lost fee income                                 126.01
Other costs                                      25.23
TOTAL ($000)                                    259.45

INSURANCE COMPANY
Lost income                                     146.31
Other costs                                       4.05
TOTAL ($000)                                    150.36

SERVICES COMPANY
Lost income                                      79.18
Other costs                                      48.68
TOTAL ($000)                                    127.86

Figure 18: Average Costs of Outages per Hour Detail
Severe Unplanned Outages
Calculations for exposure to these were based on two sets of estimates:
1. Probability of 6-, 12- or 24-hour outages for each platform for each company. Probabilities
were calculated based on user input as well as general industry data for the frequency and severity
of outages for IBM i on Power Systems, WSFC and Oracle Exadata.
2. Costs of downtime for 6-, 12- and 24-hour outages for each company. Costs include the same
components as for average costs of downtime per hour calculations, although the proportions of
different components varied, in some cases significantly. For supply chain companies, allowance
was made for cascading effects.
For 12- and 24-hour outages affecting financial services companies, costs also include customer
notification, query- and complaint-handling, along with customer reimbursements, extended
overdrafts and payment deadlines, and other remedial costs.
The probability of severe unplanned outages was then multiplied by projected business impact; e.g., if the
probability of a six-hour outage was 0.18, and the cost of such an outage was $10.46 million, the
calculation was 0.18 x $10.46 million = $1.883 million. Overall totals were calculated as the sum of
business impact for all outages over a three-year period.
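The exposure calculation above can be sketched as a probability-weighted sum. The six-hour probability and cost are the example values quoted in the text. The 12- and 24-hour values are hypothetical placeholders, and the function names are illustrative only.

```python
from typing import Iterable, Tuple


def expected_loss(probability: float, outage_cost: float) -> float:
    """Probability of a severe outage multiplied by its projected business impact."""
    return probability * outage_cost


def total_exposure(scenarios: Iterable[Tuple[float, float]]) -> float:
    """Sum of expected losses over all (probability, cost) outage scenarios."""
    return sum(expected_loss(p, c) for p, c in scenarios)


if __name__ == "__main__":
    # Example from the text: 0.18 x $10.46 million (values in $ millions).
    print(round(expected_loss(0.18, 10.46), 3))  # 1.883

    # The second and third scenarios are hypothetical placeholders.
    scenarios = [(0.18, 10.46), (0.05, 25.0), (0.01, 60.0)]
    print(round(total_exposure(scenarios), 3))
```

Each scenario contributes in proportion to both its likelihood and its cost, so a rare 24-hour outage can weigh as heavily in the total as a more probable six-hour one.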
All values for costs of downtime as well as severe unplanned outage exposure were for the United States.
ABOUT THE INTERNATIONAL TECHNOLOGY GROUP
ITG sharpens your awareness of what’s happening and your competitive edge
. . . this could affect your future growth and profit prospects
International Technology Group (ITG), established in 1983, is an independent research and management
consulting firm specializing in information technology (IT) investment strategy, cost/benefit metrics,
infrastructure studies, deployment tactics, business alignment and financial analysis.
ITG was an early innovator and pioneer in developing total cost of ownership (TCO) and return on
investment (ROI) processes and methodologies. In 2004, the firm received a Decade of Education Award
from the Information Technology Financial Management Association (ITFMA), the leading professional
association dedicated to education and advancement of financial management practices in end-user IT
organizations.
The firm has undertaken more than 120 major consulting projects, released more than 250 management
reports and white papers, and delivered more than 1,800 briefings and presentations to individual
clients, user groups, industry conferences and seminars throughout the world.
Client services are designed to provide factual data and reliable documentation to assist in the decision-
making process. Information provided establishes the basis for developing tactical and strategic plans.
Important developments are analyzed and practical guidance is offered on the most effective ways to
respond to changes that may impact complex IT deployment agendas.
A broad range of services is offered, furnishing clients with the information necessary to complement
their internal capabilities and resources. Customized client programs involve various combinations of the
following deliverables:
Status Reports In-depth studies of important issues
Management Briefs Detailed analysis of significant developments
Management Briefings Periodic interactive meetings with management
Executive Presentations Scheduled strategic presentations for decision-makers
Email Communications Timely replies to informational requests
Telephone Consultation Immediate response to informational needs
Clients include a cross section of IT end users in the private and public sectors representing multinational
corporations, industrial companies, financial institutions, service organizations, educational institutions,
federal and state government agencies as well as IT system suppliers, software vendors and service firms.
Federal government clients have included agencies within the Department of Defense (e.g., DISA),
Department of Transportation (e.g., FAA) and Department of Treasury (e.g., US Mint).