2. The History of Data Storage
• Storage media: charcoal and dirt on stone
• Data type: analog (image)
• Storage life: >17,000 years (in a sealed dry cave)
‘Diamond Sutra’ (the world’s earliest complete surviving dated printed book), AD 868
• Storage media: ink on paper
• Data type: analog (images, characters)
• Storage life: >1,100 years (sealed in a cave)
Andrei Khurshudov, 2007
3. The History of Computer Data Storage
[Figure: timeline of computer data storage technologies, 1940 to 2020, grouped into sequential access and direct access to data]
Sequential access to data: punch cards (“Do not fold, spindle or mutilate”), punched tape, magnetic tape, Compact Cassette
Direct access to data: magnetic drum, RAMAC (1956), 3340 Winchester, the floppy, hard disk drives in 5.25”, 3.5”, 2.5” and 1.8” form factors, 1.8” perpendicular drive (2005), Zip, Jaz, CD-ROM, DVD, Blu-ray/HD DVD, hybrid drive, SSD, holographic disk
Other years marked on the timeline: 1962, 1980, 1983, 1988, 1991
Callouts: “Need more storage!”, “Don’t know how to sell more storage…”
4. The First HDD is Born
• RAMAC stands for “Random Access Method of Accounting and Control”
• Born: 1956
• Capacity: 5 MB
• Disk diameter: 24”
• Recording surfaces: 100
• Tracks/surface: 100
• RPM: 1200
• Weight: >1 ton
• Cost: leased for $3,200 per month
“While the storage capacity of the drive could have been increased above five megabytes, the
marketing department at IBM was against a larger capacity drive because they didn't know how
to sell a product with more storage” (source: Currie Munce, VP, IBM Research)
5. Modern Disk Drive
About 50 years old: runs faster with every year…
Mass-produced electro-mechanical device: 2006 total industry output >400M drives
Utilizes principles of magnetic recording: most recent products utilize PMR
Relies on a flying magnetic element: typical mechanical separation ~5-10 nm
Available in several standard form factors: 1”, 1.8”, 2.5”, 3.5”
Designed for several distinct markets: Desktop, Enterprise, Mobile, HH, CE
Uses various computer interfaces: PATA, SATA, SAS, SCSI, FCAL
Historically high data density growth rate: CAGR of 30% to 50% over the last decades
Experiences constant cost pressure: cost per GB is under $0.50 and falling
Always under attack from disruptive technologies: destroys or assimilates competition for 50 years
Continually expands into new markets: most recent are CE, automotive, archival
Highly competitive industry: Darwinian principles in accelerated action*
Industry share leader: Seagate, with ~40% of the total market share
* “The Innovator’s Dilemma” by Clayton M. Christensen
6. Disk Drive Industry Trends
[Figures: disk drive industry trend charts, including the 0.85” drive]
Sources: PC World, “The Hard Drive Turns 50”; Coughlin Associates; Bear Stearns Technology Conference, 2006; Ed Grochowski, IBM
Drives get denser, smaller, faster, and cheaper
Reliability becomes increasingly difficult to achieve
7. Yesterday, Today, and Tomorrow
[Figure: three panels labeled Yesterday, Today, and Tomorrow]
There’s plenty of room at the bottom!
8. Estimated Number of Units Shipped
[Figure: bar chart of estimated HDD units shipped per calendar year, CY00 to CY12; vertical axis labeled “Units, Millions”, with ticks from 0 to 900,000]
Source: Seagate Market Research
Rapid overall HDD unit growth will continue into the foreseeable future
Units shipped in 2012 are projected to be more than 1.5X those shipped in 2007
9. Strong Link Between Information Growth and Storage Produced
[Figure: “New Data” weighed against “New Storage”]
New Data:
• Internet
• Blogs
• Movies
• TV
• Music
• Maps
• Databases
• Archives
• Business
• Legal
• Science
• Diaries
• Art
• Gaming
• Literature
• Noise
• Etc.
Balance is required!
Data storage technology underpins information growth
10. Estimated Total PB’s Capacity Shipped
[Figure: total PB shipped per year, 2002 to 2012, with projection; vertical axis “Total PB's shipped”, 0 to 500,000; exponential growth fit y = 7872.3 e^{0.3679x}, R^2 = 0.9883]
Source: Seagate Market Research
Information growth trend is indeed exponential!
Overall information growth will scale with the HDD capacity growth
It is estimated that over 90% of all new information produced in the world is being stored on
magnetic media, most of it on hard disk drives (Google)
Shipped capacity doubles every 30 months
Over 1M PB of storage will be produced between 2008 and 2012
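As a rough check of the figures above, the fitted trend printed on the chart can be evaluated directly. This is a minimal sketch, not the author's calculation: it assumes that x in the fit counts years from 2001 (x = 1 for 2002), which the slide does not state, and the function names are illustrative.

```python
import math

# Fit parameters as printed on the chart: y = A * exp(B * x), y in PB per year.
A = 7872.3
B = 0.3679

# Assumption (not stated on the slide): x = 1 corresponds to 2002.
BASE_YEAR = 2001


def pb_shipped(year: int) -> float:
    """Projected PB shipped in a given calendar year under the fit."""
    return A * math.exp(B * (year - BASE_YEAR))


def doubling_time_months() -> float:
    """Doubling time implied by the exponent alone, in months."""
    return 12.0 * math.log(2.0) / B


def total_pb(first_year: int, last_year: int) -> float:
    """Sum of projected yearly shipments over an inclusive range of years."""
    return sum(pb_shipped(y) for y in range(first_year, last_year + 1))


if __name__ == "__main__":
    print(f"2012 projection: {pb_shipped(2012):,.0f} PB")
    print(f"Doubling time from fit: {doubling_time_months():.1f} months")
    print(f"2008-2012 total: {total_pb(2008, 2012):,.0f} PB")
```

Under this assumption the 2008 to 2012 total comes out above 1M PB, in line with the slide. The doubling time derived from the exponent alone need not match the rounded figure quoted above, which may rest on different data.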
11. Long-term Storage Growth Projection
[Figure: long-term storage growth projection; total PB shipped on a logarithmic axis from 1 to 1,000,000,000,000 PB, years 2000 to 2050; levels marked at Petabyte (1), Exabyte (1,000), Zettabyte (1,000,000) and Yottabyte (1,000,000,000), with “Alotabyte?” beyond them. Chart by Andrei Khurshudov]
Exponential growth in storage capacity will
enable the information avalanche!
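To illustrate how such a projection reaches the marked levels, the sketch below extrapolates an exponential trend and solves for the year in which yearly shipments would reach a given size. It assumes, hypothetically, that the long-term chart simply extends the fit y = 7872.3 e^{0.3679x} printed on the shipped-capacity chart, with x = 1 for 2002; neither assumption is confirmed by the slides.

```python
import math

A = 7872.3        # PB, fit coefficient from the shipped-capacity chart
B = 0.3679        # per year, fit exponent from the same chart
BASE_YEAR = 2001  # assumption: x = 1 corresponds to 2002

# Two of the levels marked on the chart, expressed in PB.
LEVELS_PB = {
    "Zettabyte": 1e6,
    "Yottabyte": 1e9,
}


def year_reached(level_pb: float) -> float:
    """Fractional year at which A * exp(B * x) equals level_pb."""
    return BASE_YEAR + math.log(level_pb / A) / B


if __name__ == "__main__":
    for name, level in LEVELS_PB.items():
        print(f"{name} per year reached around {year_reached(level):.0f}")
```

An extrapolation over several decades is extremely sensitive to the exponent, so the resulting years are indicative at best.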
12. Definitions of reliability
Reliability is the probability of performing required functions for a
specified time under the stated operational conditions
For HDD:
Required functions include storing and accessing data at the specified high
data rate and with specified power consumption, acoustic noise, start-up
time, etc.
Specified time is the service life, which is typically 3 to 10 years.
Stated operational conditions are those specified by the HDD
specification (temperature, humidity, shock, vibration, etc.)
Weibull reliability model:
Describes the “weakest link” in a product
Treats the system as a series of components, each having finite
reliability:
[Figure: HDD reliability as a chain of components in series (HDI, Motor, PCBA, Code, etc.) with reliabilities R1, R2, …, Rn]
HDD fails if any one component fails!
R = R1 * R2 * R3 * … * Rn
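The series model can be sketched in a few lines. The example below is illustrative only: it assumes each component follows a two-parameter Weibull reliability function R_i(t) = exp(-(t/eta_i)^beta_i), and the shape (beta) and scale (eta) values are hypothetical placeholders, not measured HDD data.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    beta: float  # Weibull shape parameter (hypothetical value)
    eta: float   # Weibull scale parameter in hours (hypothetical value)

    def reliability(self, hours: float) -> float:
        """Two-parameter Weibull reliability R(t) = exp(-(t/eta)^beta)."""
        return math.exp(-((hours / self.eta) ** self.beta))


def system_reliability(components: list[Component], hours: float) -> float:
    """Series system: R = R1 * R2 * ... * Rn, so any failure fails the drive."""
    return math.prod(c.reliability(hours) for c in components)


# Hypothetical parameters chosen only to make the example run.
DRIVE = [
    Component("HDI", beta=0.8, eta=5.0e6),
    Component("Motor", beta=1.2, eta=8.0e6),
    Component("PCBA", beta=1.0, eta=1.0e7),
    Component("Code", beta=0.7, eta=2.0e7),
]

if __name__ == "__main__":
    five_years = 5 * 8760  # hours of continuous operation
    for c in DRIVE:
        print(f"{c.name}: {c.reliability(five_years):.4f}")
    print(f"Drive: {system_reliability(DRIVE, five_years):.4f}")
```

Because every factor is at most 1, the system reliability can never exceed that of its weakest component.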
13. HDD Reliability Trends
[Figure: Manufacturer’s HDD MTBF Specifications. From: Ed Grochowski, IBM]
• MTBF indicates, on average, how many hours a product is expected to operate before a failure.
• MTBF = Total Product Operational Time / Number of Failures
Current typical MTBF numbers (by product class):
Server: 1,400,000 hours
Desktop: 700,000 hours
Mobile: 400,000 hours
The Ultimate Battle
Reliability vs. Storage Density
Reliability vs. Cost
Reliability vs. Performance
Reliability vs. Development Time
Reliability vs. Environment
…
Reliability keeps increasing with time in spite of design
complexities and more stringent qualification test
requirements
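An MTBF specification is often easier to interpret as an annualized failure rate (AFR). A minimal sketch of the conversion follows, assuming a constant failure rate (exponential lifetime) and continuous operation for 8,760 hours per year; real duty cycles vary by product class, so the results are indicative only.

```python
import math

HOURS_PER_YEAR = 8760  # assumption: the drive is powered on all year

# Typical MTBF figures quoted on this slide, in hours.
MTBF_HOURS = {
    "Server": 1_400_000,
    "Desktop": 700_000,
    "Mobile": 400_000,
}


def mtbf(total_operating_hours: float, failures: int) -> float:
    """MTBF = total product operational time / number of failures."""
    return total_operating_hours / failures


def afr(mtbf_hours: float, hours_per_year: float = HOURS_PER_YEAR) -> float:
    """Annualized failure rate under a constant-failure-rate assumption."""
    return 1.0 - math.exp(-hours_per_year / mtbf_hours)


if __name__ == "__main__":
    for product, hours in MTBF_HOURS.items():
        print(f"{product}: AFR is roughly {afr(hours) * 100:.2f}%")
```

For MTBF values this large the result is close to the simpler ratio of hours per year to MTBF.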
14. HDD Reliability Hierarchy
Involvement: Dealing with…
Customer perception of reliability: limited statistics
Reliability in User Environment: closing gap between expected reliability and reality
Manufacturing for Reliability: the last line of defense; balancing quality against cost
Product reliability qualification: advanced test techniques and failure modes analysis
Design for Reliability: engineering and technology principles
Reliability Physics & Theory: fundamental laws of nature
HDD reliability is built upon Tribology!
15. A Perspective on HDD Reliability
Cumulative Failure / Repair / Return rates (after 3-4 years)
[Figure: horizontal bar chart, 0 to 50%, with one bar per product in this order: Laptop computer; Refrigerator: side-by-side, with icemaker and dispenser; Rider mower; Desktop computer; Lawn tractor; Washing machine (front-loading); Self-propelled mower; Vacuum cleaner (canister); Washing machine (top-loading); Dishwasher; Gas range; Refrigerator: top- and bottom-freezer, w/ icemaker; Wall oven (electric); Push mower (gas); Microwave oven (over-the-range); Cooktop (gas); Clothes dryer; Average for CE products; Vacuum cleaner (upright); Camcorder (digital); Refrigerator: top- and bottom-freezer, no icemaker; Cooktop (electric); Range (electric); Digital camera; TV: 30- to 36-inch direct view; TV: 25- to 27-inch direct view; Proton rocket; HDD; Medical pacemakers; Sony PS3 (with HDD)]
When compared to many other products, HDD reliability looks very high
Average 3-4 year cumulative repair rate for CE products is 15%
HDD is a component, not a product
Source: Consumer Reports National Research Center, 2006 Product Reliability Survey; http://en.wikipedia.org/wiki/Proton_rocket; www.seagate.com; http://www.medscape.com/viewarticle/536755
16. The Actual Cost of Unreliability
If a company experiences a major loss of data:
60% of companies that lose their data will shut down within 6 months of the
disaster (source: Bostoncomputing.net)
72% of businesses that suffer major data loss disappear within 24 months
(Source: Realty Times)
93% of companies that lost their data center for 10 days or more due to a
disaster filed for bankruptcy within one year of the disaster (source:
Bostoncomputing.net)
Recreating data from scratch is estimated to cost between $2000
and $8000 per MB (Source: Realty Times)
Of those companies participating in the 2001 Cost of Downtime
Survey (Source: 2001 Cost of Downtime Survey Results):
8% said it would cost their companies more than $1 million per hour
18% said each hour would cost between $251K and $1 million
28% said each hour would cost between $51K and $250K
46% said each hour of downtime would cost their companies up to $50K
17. Aggravating Aspects of Data Loss
40% of small and medium-sized businesses do not back up their data (Source: Realty
Times)
40-50% of all backups are not fully recoverable (Source: Realty Times)
34% of companies fail to test their tape backups, and
of those that do, 77% have found tape backup failures (source: Bostoncomputing.net)
“More than 109,000 TBs of unique enterprise PC data are not being regularly
backed up” (IDC)
A national Harris Interactive survey reveals (Source: Realty Times):
Only 25% of users frequently back up digital files, even though 85% of
computer users say they are very concerned about losing important digital data
37% of the survey's respondents admitted to backing up their files less than once
per month
9% admitted they have never backed up their files
More than 22% said backing up information is on their to-do list, but they
seldom do it
18. Why do drives fail?
[Figure: generic HDD failure mode Pareto chart with bars for: write abort, high-fly write, NTF, CND, scratch, TA, head degradation, grown defect, motor, mishandling, handling damage, PCB]
Callout beside NTF and CND: up to 40%; system-dependent
Callout beside mishandling and handling damage: up to 30%; system-dependent, personnel-dependent, procedure-dependent, etc.
Observation:
Tribology is responsible for many failure modes!
19. Tribology inside HDD
[Figure: HDD with its tribological interfaces labeled]
Connectors
FDB Motor
Head-Disk Interface
Ramp (friction and wear)
Pivot Bearing
Screws (wear and torque retention)
There are multiple ways in which tribology impacts HDD reliability
20. The Role of Tribology in HDD Reliability
It is estimated that 15% to 35% of all HDD failures are
linked to Tribology (25% on average)
Improving tribological robustness enhances overall disk drive
reliability
Major known failure modes related to tribological issues:
Scratch (on both head and media; with or w/out particles)
Thermal erasure (disk) and head degradation
New defects
Weak write / read
Crash
Failure of some other moving parts
Etc.
21. Future Improvement Opportunities
HDD reliability:
Number of drives that will not fail between 2008 and 2012 per
every 0.1% AFR improvement: ~3,000,000
Amount of stored information that will not be lost/impacted
between 2008 and 2012 per every 0.1% AFR improvement:
~1,000,000 TB (or 1 EB)
Tribology:
Number of drives that will not fail between 2008 and 2012 due to
Tribological problems per every 0.1% AFR improvement:
~750,000
Amount of stored information that will not be lost/impacted
between 2008 and 2012 due to Tribological problems per every
0.1% AFR improvement: ~250,000 TB = 250 PB
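These estimates follow from simple proportions. The sketch below reproduces them under stated assumptions: roughly 3 billion drives and 1,000,000 PB shipped between 2008 and 2012 (both back-derived from the rounded figures in this deck, not quoted by the author as inputs), and the 25% average tribology share of failures given earlier.

```python
AFR_IMPROVEMENT = 0.001  # 0.1% expressed as a fraction
TRIBOLOGY_SHARE = 0.25   # average share of failures linked to tribology

# Assumptions back-derived from the deck's rounded results.
DRIVES_SHIPPED_2008_2012 = 3_000_000_000
CAPACITY_SHIPPED_PB = 1_000_000


def saved(total: float, share: float = 1.0) -> float:
    """Quantity spared by the AFR improvement, optionally scaled by a share."""
    return total * AFR_IMPROVEMENT * share


if __name__ == "__main__":
    print(f"Drives spared: {saved(DRIVES_SHIPPED_2008_2012):,.0f}")
    print(f"Capacity spared: {saved(CAPACITY_SHIPPED_PB):,.0f} PB")
    print("Drives spared (tribology): "
          f"{saved(DRIVES_SHIPPED_2008_2012, TRIBOLOGY_SHARE):,.0f}")
    print("Capacity spared (tribology): "
          f"{saved(CAPACITY_SHIPPED_PB, TRIBOLOGY_SHARE):,.0f} PB")
```

Treating the five-year shipment totals as the exposed population is a simplification; a fuller model would track how long each drive stays in service.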
22. Is this worth the effort?
Petabytes in use:
The “American Memory” project is one of the largest digitized archives of U.S.
history, with more than 7.5 million digital records from 100 collections of
manuscripts, books, maps, films, sound recordings and photographs. The total
size of the project is 0.008 Petabytes [Wired]
As of November 2006, eBay had 2 Petabytes of data [Wikipedia]
Jefferson National Accelerator Facility has a 2 Petabyte storage farm used to
collect data from experiments on the particle accelerator [Wikipedia]
RapidShare in 2007 had 3.5 Petabytes of hard-disk storage [Wikipedia]
The San Diego Supercomputer Center (SDSC) in the USA has a 1-Petabyte hard
disk store and a 6-Petabyte robotic tape store [Wikipedia]
Microsoft stores on 900 servers a total of about 14 Petabytes. These are mostly
imagery for Microsoft's digital model planet, Virtual Earth [Wikipedia]
15 Petabytes of data will be generated each year in particle physics experiments
using CERN’s Large Hadron Collider, due to be launched in May 2008 [Wikipedia]
The total storage capacity needed for the above data is ~ 44 PB
A failure rate reduction of only 0.005% over the next 5 years would preserve
enough capacity to cover all of the above storage needs
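The two closing figures can be checked by summing the capacities listed above and comparing the sum with the capacity expected to ship. A minimal sketch, assuming the roughly 1,000,000 PB projected for 2008 to 2012 on the shipped-capacity slide as the base; the dictionary keys are shorthand labels, not names from the sources.

```python
# Capacities listed on this slide, in PB.
DATASETS_PB = {
    "American Memory": 0.008,
    "eBay": 2.0,
    "Jefferson Lab": 2.0,
    "RapidShare": 3.5,
    "SDSC disk": 1.0,
    "SDSC tape": 6.0,
    "Microsoft Virtual Earth": 14.0,
    "CERN LHC (per year)": 15.0,
}

# Assumption: capacity shipped 2008-2012, taken from the deck's projection.
SHIPPED_PB = 1_000_000

total_pb = sum(DATASETS_PB.values())
fraction = total_pb / SHIPPED_PB

if __name__ == "__main__":
    print(f"Total: {total_pb:.1f} PB")
    print(f"Share of shipped capacity: {fraction * 100:.4f}%")
```

The sum lands just under 44 PB, and its share of the assumed shipped capacity rounds up to the 0.005% quoted on the slide.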
23. Future Scenario
Exponential growth of data over time
(information avalanche)
Lower cost of data storage per GB
Many more disk drives required to
accommodate all of the new data and backup
Continually increasing reliability of disk drives
Nevertheless, more total failures (in absolute
terms) unless HDD reliability increases at a
faster rate than the drive unit growth
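The last point is a matter of two competing rates. In the sketch below all numbers are hypothetical (a fleet of 500M drives growing 10% per year, a starting AFR of 1%), chosen only to show that absolute failures rise whenever unit growth outpaces reliability improvement and fall when it does not.

```python
def yearly_failures(units: float, afr: float, unit_growth: float,
                    afr_improvement: float, years: int) -> list[float]:
    """Expected failures per year as the fleet grows and AFR improves."""
    failures = []
    for _ in range(years):
        failures.append(units * afr)
        units *= 1.0 + unit_growth    # more drives in service
        afr *= 1.0 - afr_improvement  # each drive slightly more reliable
    return failures


if __name__ == "__main__":
    # Hypothetical case 1: AFR improves 5% per year, slower than unit growth.
    slow = yearly_failures(500e6, 0.01, 0.10, 0.05, 5)
    # Hypothetical case 2: AFR improves 15% per year, faster than unit growth.
    fast = yearly_failures(500e6, 0.01, 0.10, 0.15, 5)
    print([round(f) for f in slow])
    print([round(f) for f in fast])
```

The year-over-year change in failures is the product (1 + unit growth) × (1 − AFR improvement), so the trend turns downward only when that product drops below 1.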
24. Conclusions
Data storage capacity growth enables overall
information growth
Reliability of data storage devices is a key element in this
growth
Unreliability is extremely costly
Even small improvements in reliability will have huge
impact on the amount of information preserved in the
future
Tribology is, and will remain, a major enabler of the
future information growth
Relative contribution of Tribology to HDD unreliability is on
the order of 25%
25. References
“The Innovator’s Dilemma” by Clayton M. Christensen
Google: Failure Trends in a Large Disk Drive Population, E. Pinheiro, W.-D. Weber and L. André Barroso, FAST 2007
Wired: http://www.wired.com/science/discoveries/news/2002/10/55509
Wikipedia on Petabytes: http://en.wikipedia.org/wiki/Petabyte
Consumer Reports National Research Center, 2006 Product Reliability Survey: http://www.squaretrade.com/htm/pop/lm_failureRates.html
Proton rocket launcher: http://en.wikipedia.org/wiki/Proton_rocket
HDD specifications: www.seagate.com
Medical pacemaker reliability: http://www.medscape.com/viewarticle/536755
2001 Cost of Downtime Survey Results: http://www.datadepositbox.com/media/data-loss-statistics.asp
BostonComputing.net: http://www.bostoncomputing.net/consultation/databackup/statistics
IDC: IDC analyst Fred Broussard, PC Backup and Higher Prioritization for the Enterprise and Consumer, July 2002