1. The future of Data Centers?
Prof Ian Bitterlin
CEng PhD BSc(Hons) BA DipDesInn
MIET MCIBSE MBCS MIEEE
Visiting Professor, School of Mechanical Engineering, University of Leeds
Chief Technology Officer, Emerson Network Power Systems, EMEA
Member, UK Expert Panel, EN50600 – Data Centre Infrastructure - TCT7/-/3
UK National Body Representative, ISO/IEC JTC1 SC39 WG1 – Resource Efficient Data Centres
Project Editor for ISO/IEC 30143, General Requirements of KPIs, WUE, CUE & REC
Committee Member, BSI IST/46 – Sustainability for and by IT
Member, Data Centre Council of Intellect UK
SVP & Technical Director (Power), Data Centre Alliance – not-for-profit Trade Association
Chairman of Judges, DataCenterDynamics, USA & EMEA Awards
Chairman of The Green Grid’s EMEA Technical Work Group
2. Data is growing faster and faster
Capacity driven by exponential data growth
80% CAGR compared to the 40% CAGR of Moore’s Law
Virtualisation of hardware partly closes the gap
Growth in emerging markets is faster than mature regions
Increasing capacity and efficiency of ICT hardware has always been outstripped by demand
3. The Law of Accelerating Returns: Kurzweil
Information generation
• 2009 = 50GB/s
• 2020 = 500GB/ s
• 10,000,000x increase
The Singularity is Near
Raymond Kurzweil, 2005, Viking
Introduced the ‘law of
accelerating returns’ and
extended Moore’s Law
Ray Kurzweil has been described as “the restless genius” by the Wall Street Journal, and “the ultimate
thinking machine” by Forbes magazine, ranking him #8 among entrepreneurs in the United States and
calling him the “rightful heir to Thomas Edison”. PBS included Ray as one of 16 “revolutionaries who
made America,” along with other inventors of the past two centuries.
4. Moore’s Law
Gordon Moore was a founder of Intel
In 1965 he formulated Moore’s Law, which predicted the doubling
of the number of transistors on a microprocessor every two years
Moore’s Law has held true ever since
Applies as well to
– Doubling compute capacity
– Halving the Watts/FLOP
– Halving kWh per unit of compute load etc
Kurzweil now suggests that the doubling is every 1.2 years
Encourages ever-shorter hardware refresh rates
– Facebook 9-12 months, Google 24-30 months etc
Keeping ICT hardware 3 years is energy profligate
5. Five ‘Moore’ years?
Is 3D graphene the fifth paradigm?
6. Data generation growth
• At Photonics West 2009 in San Jose, Cisco correctly predicted for
2012 that ‘20 US homes with FTTH will generate more traffic than the
entire internet backbone carried in 1995’
• Japanese average home with FTTH - download rate is 500MB per
day, dominated by HD-Video
• More video content is uploaded to YouTube every month than a TV
station can broadcast in 300 years 24/7/365
• Phones with 4G are huge data-generators. Even with 3G in 2011
Vodafone reported a 79% data-growth in one year – was that all
social networking?
• 4K UHD-TV? A 3D 4K Movie = 2h download over fast broadband
7. Jevons Paradox (Rebound Effect)
‘It is a confusion of ideas to suppose that the economical
use of fuel is equivalent to diminished consumption. The
very contrary is the truth’
William Stanley Jevons, 1865
The Coal Question, Published 1865, London, Macmillan Co
Newcomen’s engine was c2% thermally efficient and coal supplies in the UK were highly strained
Watt’s engine replaced it with c5% efficiency - but the result was rapid increase in coal consumption
Can the same be said of data generation and proliferation?
Don’t forget that less than 30% of the world’s population have access to the internet
And the rest want it .
8. Infrastructure and energy!
Time magazine reported that it takes 0.0002kWh to stream 1 minute of video from the YouTube data centre
Based on Jay Walker’s recent TED talk, 0.01kWh of energy is consumed on average in downloading 1MB over the Internet
The average Internet device consumes around 0.001kWh per minute of video streaming
For 1.7B downloads of this 17MB file, each streamed for 4 minutes, this gives the overall energy for just this one pop video in one year
9.
310GWh in one year from 15th July 2012
c36MW of 24/7/365 diesel generation
310GWh = more than the annual electricity
consumption of Burundi, population 9 million
(273GWh in 2008)
100 million litres of fuel oil
250,000 Tons CO2
80,000 average UK car years
– 960 million miles (c8,000 cars, cradle to grave)
Just for one pop-video on YouTube
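The ~310GWh headline can be roughly reproduced from the per-unit figures on the previous slide. A minimal back-of-envelope sketch (the 1.7B downloads, 17MB file and 4-minute stream come from the slides; the rest is arithmetic, and it lands within a few percent of the quoted 310GWh and c36MW):

```python
# Back-of-envelope check of the "one pop video" energy claim, using
# the per-unit figures quoted on the previous slide:
# 0.01 kWh/MB transferred, 0.0002 kWh/min in the data centre,
# 0.001 kWh/min in the viewing device. Illustrative only.

downloads = 1.7e9          # views in the first year
file_mb = 17               # MB per download
minutes = 4                # minutes streamed per view

kwh_per_view = (file_mb * 0.01           # network transfer
                + minutes * 0.0002       # data-centre streaming
                + minutes * 0.001)       # end-user device

total_gwh = downloads * kwh_per_view / 1e6   # kWh -> GWh
avg_mw = total_gwh * 1000 / 8760             # continuous MW over a year

print(f"{total_gwh:.0f} GWh/year, c{avg_mw:.0f} MW continuous")
```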
10. Japanese IP Router power consumption
• Paper by S. Namiki, T. Hasama & H. Ishikawa
• National Institute of Advanced Industrial Science and Technology
• Network Photonics Research Center, 2009
• Japanese traffic has grown exponentially
• Broadband Subscribers Mar-00 to Jul-07, 0.22 to 27.76 million
• 40% CAGR in daily average JPIX Traffic
• 11/04 324Gbps
• 11/05 468Gbps
• 11/06 637Gbps
• 05/07 722Gbps
• By Sep-07 10.52 million FTTH subscribers
• Forecast c25million subscribers by end-2010
• Forecast download per user per day = 225MB
• The current technologies can’t scale to the future traffic
• Japan needs a new technology paradigm with 3-4 orders of magnitude
energy reduction on today’s technology
11. Energy limitation on current technology
The current technology would consume the entire
grid power capacity before 2030!
12. Data has always outstripped Moore's Law
Vodafone experienced 69% annual data growth in mobile data in 2011
13. Choose your starting point
10% of grid capacity consumed in 4-6 years? 100% in under 10 years?
The result is unsustainable with any start-value
14. Can data centres be ‘sustainable’?
• Never in isolation!
• Data centres are the factories of the digital age
• They convert power into digital services – it’s impossible to calculate the
‘efficiency’ as there is no definition of ‘work done’
• All the energy is treated as waste and, in almost every case, is dumped
into the local environment
• Only if the application of the data centre can be shown to be an enabler
of a low-carbon process can it be regarded as sustainable
• Not ‘sustainable’, unless
• The load is a low-carbon solution
• They have minimised consumption by best-in-class hardware
• They have reduced their PUE to the minimum for business case
• They source power from renewable (or low-carbon) sources?
• They re-use waste heat
• Is a true ‘parallel computing’ model ‘efficient’?
• If you build two ultra-low PUE facilities (close to PUE=1) to push
redundancy and availability into the hardware-software layer then could
your peak overall power consumption be 2?
15. Fast broadband for all?
The EU has a digital agenda that involves super-fast broadband for all
citizens at an affordable price, if not free to those who are less able to pay
Faster access will, according to Jevons Paradox, generate a power demand
increase but no government has yet appeared to understand the direct
linkage mechanism between data-generation and power demand
Faster access used for business or education is one thing, but for social
networking?
Faster access used for education, medical services and security may be key to
many 3rd-World nations’ development
‘Internet access will become a privilege, not a right’
Vint Cerf, 2011
Inventor of the IP address and often regarded as one of the ‘Fathers of the Internet’
Now VP and Chief Internet Evangelist, Google – working on inter-Galactic IP addresses
16. Industry predictions that point the way
• Nokia Siemens Networks
• By 2015 2,500% mobile data
• 23 Exabytes/year (23,000,000,000,000,000,000 bytes)
• Planning for 1,000x increase in network storage capacity 2010-2020
• Cisco
• By 2015 2,600% mobile data
• 76 Exabytes/year
• Internet traffic increases 32%/year to 966 Exabytes/year
• 3,900% of the 2005 Internet traffic (by volume)
• IDC
• 2009-2020 data-growth of 4,400%
• A faster growth rate than Moore’s Law and technology?
17. But ICT infrastructure needs energy...
• A viral-like spread and expansion of digital data – but how will it be
transferred?
• By courier on hard-drives or via fibre?
• At the moment sending 2TB between Bristol and California is cheaper, faster and
lower carbon footprint by DHL on a jumbo-jet
• Is there a natural limit to growth? Or an un-natural one?
• We all remember when Gartner (2008) said that energy consumption
of data-centres will grow by 1,600% from 2005 to 2025 and that ICT
produces 2% of worldwide CO2 emissions
• Could the 2% of ‘today’ grow into...
• Cisco 39x = 78% by 2015
• Nokia Siemens 25x = 50% by 2015
• IDC 44x = 88% by 2020
• Gartner 16x = 32% by 2025
18. Is ‘The Cloud’ an answer?
Partly, ‘The Cloud’ = ‘Someone else’s data-centre’
– They will proliferate and get bigger
– They will increase dramatically in ICT utilisation
– Built in an increasingly modular/scalable fashion
They will strive for low costs via ultra-low PUE
– They will innovate and move to sub-1.15 PUE
– Low energy cooling, ‘thermal management’ (major influence)
– High efficiency UPS with advanced eco-mode
– Visibility and control via DCIM will be essential
19. Big, virtualised, heavily loaded and ‘greener’
• UK data centres consume c1GW
• 35-40,000 ‘data-centres’, ripe for consolidation/outsourcing
• If average PUE=2 then ICT load = 500MW
• ‘Cloud’ is an outsourced, flexible, pay-as-you-go compute and
storage business with relaxed hardware SLAs in a highly
virtualised environment
• ‘Cloud’ = ‘someone else’s data centre’
• ‘Cloud’ will (has already?) become a commodity service driven by the cost of
power and its efficient use
• Logically ‘cloud’ should be more efficient with a cost driven PUE
of 1.2, cutting grid demand by 40%
• But data-growth will continue to demand more power
20. Don’t pay for heavyweight reports on the growth rate of data
centres
Choose your ‘best guess’ data growth-rate
– Currently 80%? e.g. mobile data, storage sales etc
Deduct Moore’s Law (40%CAGR)
– E.g. 80%-40% = 40% annual power growth
Bitterlin’s Law ☺ 2012
Compare virtualisation software sales to server sales and take
a view on the impact
– E.g. Halving the 40% = 20%
So, data-centres power-growth rate is currently 20% - and
mostly in emerging markets rather than in the old economies
A paradigm shift will only extend exponential growth, not solve
the power-growth problem
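The rule of thumb above can be sketched directly. This is the slide’s own subtract-and-halve heuristic, not compound-growth arithmetic; the default inputs are the slide’s example figures:

```python
# A sketch of the "Bitterlin's Law" estimate described above:
# start from a guessed data growth rate, deduct Moore's Law,
# then take a view on virtualisation's impact by halving.

def dc_power_growth(data_cagr=0.80, moore_cagr=0.40, virtualisation_factor=0.5):
    """Rough annual data-centre power growth, using the slide's
    simple subtract-and-halve heuristic (not compound arithmetic)."""
    return (data_cagr - moore_cagr) * virtualisation_factor

print(f"{dc_power_growth():.0%}")  # -> 20%
```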
21. It’s all about the money
Power Usage Effectiveness
A universally accepted and harmonised metric that covers the
infrastructure and soon to be embodied in an ISO/IEC Standard
22. Power costs have become dominant
At UK power costs, 40-60% of 10-year data
centre TCO is the cost of electrical power
– Land, structure, ICT hardware and staffing are all
outweighed by the cost of electricity
ICT hardware costs have fallen to less than 3
years of its own power consumption
– Refresh rates have fallen to 3y, for some 1y
Low PUE has become the dominant mantra
Monitoring and control have become vital
23. An example of a UK colo cost model
Tier 3 build cost = £10k-£13k/kW
One 4kW cabinet lease = £27,500/year
– c£6k/year/kW
Power cost for 4kW IT at PUE 1.6 = £5,600pa
– Over 10 years = 4x the infrastructure build cost
The cost of power dominates the TCO and a low
PUE becomes a key enabler
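The cost figures above can be cross-checked: a 4kW cabinet at PUE 1.6 costing £5,600/year implies a tariff of roughly £0.10/kWh. A quick sketch (the tariff is derived, not stated on the slide):

```python
# Checking the colo example above: a 4 kW IT cabinet at PUE 1.6
# draws 6.4 kW of facility power, so £5,600/year implies a
# tariff of roughly £0.10/kWh. Illustrative only.
it_kw = 4.0
pue = 1.6
annual_cost = 5600.0

annual_kwh = it_kw * pue * 8760      # total facility energy for the cabinet
tariff = annual_cost / annual_kwh    # implied GBP per kWh
print(f"{annual_kwh:.0f} kWh/year, implied tariff £{tariff:.3f}/kWh")
```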
24. PUE = 1.7 (EU CoC Participant average)
1MVA utility feed, Total 800 kW:
– IT terminal load: 470 kW
– Cooling fans, pumps & compressors: 250 kW
– Distribution & conversion losses: 35 kW
– Lighting & small power: 15 kW
– Security, NOC, BMS, outdoor lighting: 13 kW
– Ventilation (fresh air): 5 kW
– Communications: 2 kW
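The headline PUE follows directly from the slide’s load breakdown. A sketch (values as read from the slide; note the itemised loads sum to 790 kW against the stated 800 kW total, presumably rounding in the source):

```python
# Recomputing the slide's PUE from its load breakdown (kW).
# The listed items sum to 790 kW against the stated 800 kW total;
# with 800 kW the ratio is 800/470 = 1.70, the slide's headline.
loads_kw = {
    "IT terminal load": 470,
    "Cooling fans, pumps & compressors": 250,
    "Distribution & conversion losses": 35,
    "Lighting & small power": 15,
    "Security, NOC, BMS, outdoor lighting": 13,
    "Ventilation (fresh air)": 5,
    "Communications": 2,
}
total_kw = sum(loads_kw.values())
pue = total_kw / loads_kw["IT terminal load"]
print(f"total {total_kw} kW, PUE = {pue:.2f}")
```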
25. The misuse of PUE for marketing?
Has Facebook, Google et al spoiled it for the
mainstream data-center industry?
Ultra-low PUE’s set unachievable targets for
enterprise facilities
– 1.12 by Google to the PUE-shattering 1.07 by Facebook
26. ‘Horses for courses’
What is good for Google is not usually
acceptable or possible for enterprise facilities,
but it is not ‘wrong’ – it’s ‘right’ for Google!
– Fresh-air cooling but with short refresh cycle
• Low ambient locations are preferable
– No central UPS but ride-thru battery built into server
• Redundancy in the software/hardware layer
Resultant PUE 1.12 and going down
– With a very high processor utilisation from a single
application like ‘search’
27. Is a low PUE ‘sustainable’ engineering?
• Cooling efficiency
• Site selection, latitude and local climate (water-usage a limiting factor?)
• Rigorous air-management in the room
• High server inlet temperature (avoiding fan ramp-up, 27°C?)
• Minimum humidification and de-hum (if any?)
• Free-cooling coils for when the external ambient is cool
• If possible avoid compressor operation altogether
• Power efficiency
• Avoid high levels of redundancy and low partial loads in general
• Design redundancy to always run at 60% load
• Adopt high-efficiency, modular, transformer-less UPS where efficiency is 96% at
20% load
• Adopt eco-mode UPS where peak efficiency is 99% with an annual average
efficiency close to 98%
• Apply high efficiency lighting etc
• Best practice gets us to a PUE of 1.11-1.15
• Extreme data-centre ‘engineering’ gets us down to below 1.1
• ‘Risk’ (perceived or real) increases as PUE goes sub-1.2
28. Can ICT save the planet?
• Will ICT lower our energy consumption and help to counter Global
Warming?
• Less travel, video conferencing, home working
• Internet shopping, smarter logistics (no right-hand turns?)
• Smarter buildings (sensors, sensors everywhere )
• Better manufacturing
• Smart-grid enablement
• Better education and access to medical services
• But we all seem to want more digital services and content
• 24x7 x Forever
• Wherever the location, fixed and mobile
• Increasingly HD-video content
• 4G mobile networks and 4K-TV will exacerbate the problem
• Government plan for ‘fast-broadband for all’ at low cost will only drive consumption up
• Let’s not forget that 25% of the world’s population has access to the
internet and the rest want/need it
29. Power & Cooling in the past
• Data-centres have evolved from the Mainframe
machine-rooms of the mid-50s to the file-server and
storage-array dominated mega-facilities of today
• From 35W/m² in the 70s to 5,000W/m² in 2010
• The power requirement hardly changed in 20 years
• 1990 441Hz, derived from aircraft technology
• 1997 50Hz, voltage frequency ±1%, fidelity 10ms
• 1997 50Hz, voltage frequency ±10%, fidelity 20ms
• But in 2013 things may have regressed
• The environmental requirements of IT hardware have
changed drastically in very recent times
• The original specification was based on humidity control for punch-
cards and read/write accuracy on magnetic tape-heads
• Below 45%RH too much static-electricity built up
• Above 55%RH the punch-cards absorbed too much moisture
• Humidification and de-hum were key elements in the thermal
management design and the result was precision air-con
• Temperature was controlled to 22°C±1°C (usually return air)
• Until 2-3 years ago, and still for (far too) many facilities, this was/is the
‘safe’ SLA and avoids any conflict for legacy loads
30. Cooling is the low-hanging fruit
pPUE = Partial Power Usage Effectiveness
The cooling system has become the most important target for
saving energy in the data centre
31. PUE only measures the infrastructure
PUE takes no account of the IT load or its ‘efficiency’
PUE must never be used to compare facilities
PUE is annualised energy (kWh), not ‘power’ (kW)
PUE varies by location, season and load
Low PUE enables a bigger IT load
Peak power can be very different from PUE
32. PUE varies with load & climate
PUE = energy ratio of the annualised ‘kWh-Facility’ divided by ‘kWh-ICT load’
Above example PUE = 9 at 10% load improves to 1.4 at 100% load
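The partial-load behaviour in the example above falls out of a simple fixed-overhead model: if part of the facility power is constant regardless of IT load, the overhead share balloons at low utilisation. The coefficients below are assumptions for illustration, chosen only to match the slide’s 1.4-at-100% endpoint, not the measured curve behind its 9.0-at-10% figure:

```python
# Illustrative fixed-overhead model of why PUE worsens at partial
# load. If a fraction of facility power is fixed (always-on fans,
# transformers, lighting) and the rest scales with IT load, then
# PUE = 1 + variable + fixed / load_fraction.
# fixed=0.25 and variable=0.15 are assumed coefficients.

def pue(load_fraction, fixed=0.25, variable=0.15):
    return 1 + variable + fixed / load_fraction

for lf in (0.1, 0.25, 0.5, 1.0):
    print(f"{lf:>4.0%} load -> PUE {pue(lf):.2f}")
```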
33. Partial load performance is key
Partial load is endemic in Data Centres
worldwide
– 400MW of Trinergy delivered in the last 2 years is
running with an average load of 29%
Partial load is the enemy of energy efficiency
– Modular/scalable solutions are the key to keeping the
system load high and efficiency maximised
– Trinergy example, running at 97.8% efficiency
High redundancy often exacerbates partial load
34. Compressor-free cooling?
• UK examples, where the design peak external dry-bulb
ambient is c33°C and the wet-bulb c23°C:
• Open fresh-air system with adiabatic cooling, limited to peak 26°C
server inlet = 100 hours/year compressor operation
• Closed system with air-to-air heat-exchanger and adiabatic spray,
limited to peak 30°C server inlet = zero hours/year compressor
operation
• Note! ‘Free-cooling’ does not mean ‘fresh-air’
• Wherever the peak external ambient is below 35°C and
water for evaporation is available it is possible to have
compressor-free cooling 8760h/year and keep within
the latest Class 2 ‘recommended’ limits
• Annualised PUE of 1.15 could be achieved Europe-wide
• Compared to industry legacy of 3 in operation
• More than a 60% reduction in power consumption
35. The UK could avoid compressor operation...
Approach temperature of 7K (indirect or direct airside economization)
Maximum server inlet temperature of 30°C for 50 hours/year using
water for adiabatic cooling – about 1,000T/MW/year
Average server inlet temperature of a ‘traditional’ 22°C
(Chart: monthly average dry-bulb and wet-bulb temperatures, °C)
36. Risk, real or perceived?
Complexity can be the enemy of reliability
Balancing redundancy and the chances for human error is key
37. What is your appetite for risk?
This is the first question that a designer should
ask a data-centre client
– Thermal envelope for hardware
• ASHRAE TC9.9 Class 1, 2, 3 or 4?
• Recommended or Allowable for ‘X’ hours per year?
– Contamination and corrosion
• Air quality? Direct or Indirect economisation?
– Power Quality and Availability
• High efficiency UPS?
• Single-bus or dual-bus power?
High reliability usually costs energy
38. Enabling factors for innovation
ASHRAE TC9.9 slowly widening the ‘recommended’ and,
faster, the ‘allowable’ thermal windows
– Allowable A1 temperature 18°-32°C, Humidity 20-80%RH
– Encouraging no refrigeration in data centres of the future
The Green Grid pushing DCMM, the Maturity Model
– Eco-mode UPS plus no refrigeration, even in back-up
EU CoC is reported to be considering +45°C?
ISO/IEC, ETSI & ITU will push energy efficiency of data
centres to the top of the agenda
39. The future: Ever wider thermal envelope
• The critical change has been to concentrate on server inlet
temperatures, maximising the return-air temperature
• Rigorous air-containment is ‘best practice’
Do ASHRAE need to go further and expand the
‘Recommended’, not just the ‘Allowable’?
41. Our industry is like a comet
Facebook, Google et al are the bright-white tip but
99.5% of the matter is in the dark tail
Governed by paranoia rather than engineering
Not littered with Early Adopters; thermal SLAs are more
often still based upon ASHRAE 2004 limits
– 22°C (where?) and 45-55%RH
42. Chilled Water, DX or Adiabatic?
Chilled Water will remain dominant for >1MW multi-storey and larger city-
centre locations where space and external wall runs are limited and
flexibility of heat rejection location is low
– Latest technology from ENP will enable pPUE of 1.4
– Adiabatic coils likely to become a standard feature
– Will remain dominant where ambient conditions are very hot and/or very
humid
– Will remain dominant for tight thermal envelope SLAs
DX will remain dominant for smaller facilities and city-centre locations.
Up to c300kW
– Latest technology from ENP enables pPUE of 1.2
Adiabatic systems will dominate the new green-field mega-facilities
– Latest technology from ENP will enable pPUE of 1.06
– Indirect economization will dominate over Direct (fresh-air) systems
– Water consumption may be an issue for some locations
43. Power Architecture
70% of all failures are human error
Reliability versus human-error versus energy efficiency?
2N power removes a lot of human error!
44. The drive for higher Availability leads to increasing complexity
45.
Uptime Institute Tier Ratings for Data Centres
ANSI/TIA 942 – Infrastructure Standard for Data Centres
ANSI/BICSI 002 – Data Centre Design and Implementation Best Practice
New EN Standard BS EN 50600 will be introduced in 2013 and use the
terminology ‘Availability Class’ in four discrete steps
Site Distribution: Tier Topology
46. Why are there only four tiers/classes?
Before the founders of The Uptime Institute pioneered the dual-cord
load, critical loads only had one power connection (one active path)
With single-cord loads you can only have two tiers/classes
– Single path without redundant components
– Single path with redundant components, e.g. N+1 UPS
Static Transfer Switches were first introduced in Air Traffic Control
applications to increase the power availability, but an STS is always a
single point of failure
With dual-cord loads two more tiers/classes were made available
– Dual-path with one ‘active’ (e.g. N+1 UPS) and one ‘passive’ – a wrap-around
pathway that could be used in emergency or for covering routine
maintenance in the ‘active’ path
– Dual-path with two ‘active’ paths (e.g. 2(N+1) or 2S) where no common
point of failure exists between the two pathways and load availability is
maximised
The (0) classification of BICSI doesn’t really reflect a dedicated data-centre
47. UTI Tier Classifications: I to IV
• The Tier classification system takes into account that 16 sub-systems
contribute to the overall site availability
• Tier I = 99.67% site
• Tier II = 99.75% site
• Tier III = 99.98% site
• Tier IV = 99.99% site = 99.9994% power-system
• Note that any system requiring 4h maintenance per year = 99.95% max
• All systems have to meet:
Tier IV later revised to 2(N)
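The availability percentages above translate directly into annual downtime, which is why 4 hours of maintenance per year caps a site at around 99.95%. A quick sketch:

```python
# Converting the Tier availability percentages into annual downtime
# hours (8760 h/year), and showing why 4h/year of maintenance alone
# limits a site to c99.95% availability.
tiers = {"Tier I": 99.67, "Tier II": 99.75, "Tier III": 99.98, "Tier IV": 99.99}

for tier, avail in tiers.items():
    downtime_h = (1 - avail / 100) * 8760
    print(f"{tier}: {downtime_h:.1f} h/year downtime")

maintenance_availability = (1 - 4 / 8760) * 100
print(f"4h/year maintenance limits availability to {maintenance_availability:.2f}%")
```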
48. 漐Ў
Combinations of MTBF/MTTR = Any TierCombinations of MTBF/MTTR = Any TierCombinations of MTBF/MTTR = Any TierCombinations of MTBF/MTTR = Any Tier
49.
N - Meets base load requirements with no redundancy
– Note that where N > 1 the reliability is rapidly degraded
N+1 - One additional unit/path/module more than the base
requirement; the stoppage of a single unit will not disrupt
operations
– N+2 is also specified so that maintenance does not degrade resilience
Levels of Redundancy
– An N+1 system running at partial load can become N+2
2N - Two complete units/paths/modules for every one required
for the base system; failure of one entire system will not disrupt
operations for dual-corded loads
2(N+1) - Two complete (N+1) units/paths/modules; failure of one
system still leaves an entire system with resilient components
for dual-corded loads
50. Redundancy: What is ‘N’?
N = the number of modules needed to carry the base load:
– N=1: unitary string, module capacity = load, MTBF = X
– N=2: power-parallel, 2x module capacity = load, MTBF = 0.5X
– N=3: power-parallel, 3x module capacity = load, MTBF = 0.33X
51. Redundancy: What is ‘N+1’?
– N=1: each module capacity = 100% of load, MTBF = 10X
– N=2: each module capacity = 50% of load, MTBF = 9X
– N=3: each module capacity = 33.3% of load, MTBF = 8X
52. Redundancy: What is ‘2N’?
Two complete A and B systems feeding dual-corded loads:
– N=1: each module capacity = 100% of load, MTBF = 100X
– N=3: each module capacity = 33.3% of load, MTBF = 50X
53. Redundancy: What is ‘2(N+1)’?
Two complete (N+1) A and B systems feeding dual-corded loads:
– N=1: each module capacity = 100% of load, MTBF = 1000X
– N=2: each module capacity = 50% of load, MTBF = 800X
54. Think smart: When N+1 = 2N for no cost
For dual-cord loads (or PoU-STSs) and when N=1:
– Two modules, each with capacity = 100% of load
– Configured as N+1 on a single bus: R = 10X
– Configured as 2N on separate A and B buses: R = 100X
55. Distribution limits the MTBF & Availability
Mains/Generator Feed → UPS Input Switchboard → UPS Output Switchboard → Critical Load Bus
(with a Maintenance Bypass wrapping around the UPS)
N+X does not improve things – the MTBF and Availability are entirely dependent
upon the output switches; only 2N offers high Availability
MCCB/ACB MTBF = 250,000h, so two in series offer a 125,000h ceiling
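The series ceiling quoted above follows from the standard constant-failure-rate model: components in series add their failure rates, so the combined MTBF is the reciprocal of the summed reciprocals. A minimal sketch:

```python
# Series-MTBF ceiling: with constant failure rates, devices in
# series add their failure rates (1/MTBF), so two breakers of
# 250,000h MTBF each cap the path at 125,000h.

def series_mtbf(*mtbfs):
    """Combined MTBF of components in series (constant failure rates)."""
    return 1 / sum(1 / m for m in mtbfs)

print(f"{series_mtbf(250_000, 250_000):.0f} h")
```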
56. Connection to the (one!) utility grid
Typical grid voltage levels: 230-400kV, 66kV, 33kV, 11kV, 400V – a data centre can connect at any of these
The higher the connection voltage the better
Fewest shared connections
Diverse substations
Diverse routing
Best = A+B
57.
In the EU we have EN 50160:2000 Voltage characteristics of electricity
supplied by public distribution systems (see next slide)
In the USA:
– The System Average Interruption Frequency Index (SAIFI):
• The average number of sustained interruptions per customer per year
• A SAIFI of 0.9 indicates that the utility’s average customer experiences a sustained
electric interruption roughly every 13.3 months (12 months / 0.9)
Utility Supply: Power Quality Metrics
– The Customer Average Interruption Duration Index (CAIDI):
• An average of outage minutes experienced by each customer who experiences a
sustained interruption
– The Momentary Average Interruption Frequency Index (MAIFI):
• The average number of momentary interruptions experienced by utility customers
– Depending upon state regulations, momentary interruptions are defined as any
interruption lasting less than 2 to 5 minutes – NOT 20ms!
In all cases national regulations provide for a public power supply that
is not suitable for compute loads with embedded microprocessors
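Since SAIFI is interruptions per customer per year, the mean interval between interruptions for the average customer is simply 12/SAIFI months. A one-line sketch:

```python
# SAIFI = average number of sustained interruptions per customer
# per year, so the mean time between interruptions for the
# average customer is 12 / SAIFI months.

def months_between_interruptions(saifi):
    return 12 / saifi

print(f"{months_between_interruptions(0.9):.1f} months")  # -> 13.3 months
```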
58. EN 50160:2000 - Voltage characteristics of electricity supplied by public distribution systems

Phenomenon | Limits | Measurement interval | Monitoring period | Acceptance percentage
Frequency | 49.5 to 50.5Hz / 47 to 52Hz | 10s | 1 week | 95% / 100%
Slow voltage changes | 230V ±10%, and outside of 10% for 5% of the time | 10 minutes | 1 week | 95%
Voltage sags (<1min) | 10 to 1,000 times per year (<85% nominal) | 10ms | 1 year | 100%
Short interruptions (<3min) | 10 to 100 times per year (<1% nominal) | 10ms | 1 year | 100%
Accidental, long interruptions (>3min) | 10 to 50 times per year (<1% nominal) | 10ms | 1 year | 100%
Temporary over-voltage (Line-Ground) | Mostly <1.5kV | 10ms | 1 year | 100%
Transient over-voltage (Line-Ground) | Mostly <6kV | N/A | N/A | 100%
Voltage unbalance | Mostly 2% but occasionally 3% | 10 minutes | 1 week | 95%
Harmonic voltages | <8% Total Harmonic Distortion (THD) | 10 minutes | 1 week | 95%

If you try to plot this against the CBEMA Curve you get MTBF c50h
59. Black-out at the 11kV distribution level
UK Electricity Council data, 1988

MDT (hours) | MTBF Urban (years) | MTBF Rural (years)
0.01 (36sec) | 3.1 | 0.39
0.02 | 3.2 | 0.40
0.08 | 3.7 | 0.46
0.20 (12mins) | 4.1 | 0.50
0.33 | 4.4 | 0.55
0.50 (30mins) | 4.9 | 0.60
0.65 | 5.7 | 0.70
0.80 (48mins) | 6.8 | 0.80
1.00 | 8.2 | 0.90
How much diesel fuel do you need to store?

Black-out: total loss of voltage on three phases
Brown-out: depression of one, or more, phases
Frequency: grid or standby-set generated
Surges: switching, fault clearance & re-closure
Voltage distortion: caused by consumer connection
Micro-breaks: short-circuits & fault clearance
Swells: over-voltage for several cycles
Sags: under-voltage for several cycles

Quality of the grid supply: 34 German data centers, 1995
Deviations (10ms to V±5%, 50Hz±1%) over 2190 hours
| Worst | Average | Best
MTBF | 43 h | 155 h | 685 h
MDT | 81.45 s | 1.72 s | 0.1 s
Availability | 99.94738% | 99.99969% | 99.99999%
Typical connection voltage: 380V (worst) to 20kV (best)
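The availability row of the German survey can be reproduced from its MTBF and mean down time (MDT) figures using the standard relation A = MTBF / (MTBF + MDT). A quick check against the three columns:

```python
# Reproducing the availability column of the German data-centre
# survey above from its MTBF (hours) and MDT (seconds) figures:
# A = MTBF / (MTBF + MDT), expressed as a percentage.

def availability(mtbf_h, mdt_s):
    mdt_h = mdt_s / 3600
    return mtbf_h / (mtbf_h + mdt_h) * 100

for label, mtbf, mdt in [("worst", 43, 81.45), ("average", 155, 1.72), ("best", 685, 0.1)]:
    print(f"{label}: {availability(mtbf, mdt):.6f}%")
```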
60. Typical utility power quality
107,834 MV (medium-voltage) deviations (RMS) over 24 months
– 300 MV feeders
– 49.90 events/connection/year
– MTBF = 175h
– MTTR = 3.6s
– 2% better when closer to the sub-station feed
Over 60% of events are:
– under 10 cycles in duration (<200ms)
– less than 50% voltage sag
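The quoted MTBF can be sanity-checked from the event rate on the same slide:

```python
HOURS_PER_YEAR = 8760.0

events_per_year = 49.90
mtbf_h = HOURS_PER_YEAR / events_per_year
print(round(mtbf_h, 1))  # ~175.6h, consistent with the slide's MTBF = 175h

# With MTTR = 3.6s per event, the implied unavailability is tiny:
mttr_h = 3.6 / 3600.0
grid_availability = mtbf_h / (mtbf_h + mttr_h)
print(f"{grid_availability:.5%}")
```

High availability on paper, in other words, while still delivering fifty load-threatening events per connection per year.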
61. UPS requirements for big data centres
High efficiency below 40% load, pPUE of 1.03
Maximum protection when grid is poor quality
Scalable for ‘invest as you grow’ CapEx to MW
Low voltage distortion against non-linear load current
– Emerson Trinergy provides all
• 98% efficiency over a full year (average 97.8% at 29% load)
• Three operating modes from double-conversion upwards
• 200kW blocks, 1600kW modules to multi-MW LV systems
• THVD 3% with 100% distorted load
• MV systems optional
63. Server hardware developments?
Relaxed cooling but increased demands for UPS?
64. But now it’s the turn of the ‘One’!
• Typical servers in 2013 consume 40% (from as low as
23% to as much as 80%) of their peak power when
doing zero IT ‘work’
• Average microprocessor utilisation across the globe is
c10%, whilst the best virtualisation takes it to c40%, and
then only for (rare) homogeneous loads; only HPC reaches c90%
• If the IT hardware had a linear power demand profile
versus IT load we would only be using 10% grid power
• In the UK that could mean 100MW instead of 1000MW
• PUE of 1.2 is a law of diminishing returns and
increasing risk, so is it time to look at the ICT load?
• DCIM can offer a path to high utilisation rates
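The ‘linear power demand profile’ bullet can be sketched numerically. The peak power and utilisation figures below are illustrative assumptions, not numbers from the slide:

```python
P_PEAK = 300.0  # watts, an assumed server nameplate figure

def server_power(util, idle_frac):
    """Power draw assuming a straight line from idle power to peak."""
    p_idle = idle_frac * P_PEAK
    return p_idle + (P_PEAK - p_idle) * util

# 2013-typical server idling at 40% of peak, running at 10% utilisation:
typical = server_power(0.10, idle_frac=0.40)       # 138.0 W
# Ideal energy-proportional server doing the same work:
proportional = server_power(0.10, idle_frac=0.0)   # 30.0 W
print(typical / proportional)                      # ~4.6x less grid power
```

With the worst observed idle fractions (c80%) the ratio approaches the order-of-magnitude saving the slide suggests.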
65. Spec_Power: OEMs input data
Utility servers
In this small extract from
the web-site, HP ProLiant
models average 41% idle power and vary from 24% to 79%
HP is ‘better’ than ‘worst’
http://www.spec.org/power_ssj2008/
66. This is the real ‘efficiency’ battleground
Average utilisation must increase
The IT load will become highly dynamic and the PUE may
get ‘worse’, although the overall energy consumption will
reduce!
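The PUE-gets-‘worse’ point can be made with a two-line ratio: if utilisation rises and the IT load shrinks, a partly fixed facility overhead becomes a larger fraction of a smaller total. All numbers below are illustrative:

```python
def pue(it_kwh, overhead_kwh):
    """Power Usage Effectiveness: total facility energy over IT energy."""
    return (it_kwh + overhead_kwh) / it_kwh

before_it, before_oh = 1000.0, 200.0   # low utilisation
after_it,  after_oh  = 600.0,  180.0   # consolidated, higher utilisation

print(pue(before_it, before_oh), before_it + before_oh)  # 1.2 and 1200 kWh
print(pue(after_it, after_oh), after_it + after_oh)      # 1.3 and  780 kWh
```

PUE has deteriorated from 1.2 to 1.3, yet total energy consumption has fallen by a third — exactly the trade-off the slide describes.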
67. 13th Generation Servers?
Optimised for 27°C inlet temperature
– 300W server would have typical 20W fan load
Capable of 45°C inlet temperature
– Server power rises 60% with 200W fan load
– Dramatic increase in noise
20K delta-T, front-to-back
All terminations for power and connectivity
brought to front – nothing in the hot-aisle
Disaggregation?
68. High efficiency has consequences
69. ʀͬ
Neutral Current
(Balanced load)
The problem with high harmonic loadsThe problem with high harmonic loads
Phase Currents
Kirchoff’s Law:
Sum of the currents
at a junction is zero
Source: Visa International, London, 1995
(Balanced load)
N-E Potential
(5.4V Peak) GRD
N
70. Neutral current induces noise in the Earth
[Oscillogram: Neutral-to-Earth voltage over one 20ms mains cycle]
High frequency current flowing through the impedance of
the Neutral conductor causes voltage impulses (with respect
to Earth) in the Neutral. This “noise” on the Earth can cause
communication errors.
71. Utility Supply? What the load needs
[IEEE-1100/CBEMA voltage-tolerance curve: voltage (% of nominal, 0–300%) versus event duration (0.02ms to 20s at 60Hz); an acceptable range bounded by unacceptable regions above and below, annotated with an electro-mechanical switch (60-80ms) and an STS (4ms)]
Note! The IEEE-1100/CBEMA Curve was only ever issued
for 120V/60Hz single-phase equipment
72. Current and future power quality demands?
Pre-1997 CBEMA PQ Curve (IEEE 446/1100)
– 10ms zero-voltage immunity
Post-1997 CBEMA/ITIC PQ Curve
– 20ms zero-voltage immunity
A typical 2012 server, when fully loaded, only meets the pre-
97 10ms zero-voltage tolerance
– In mature markets the MTBF of the grid to this spec = 250h
– Leading PF of c0.95
– Harmonics at low load c30% THID, at full load c5% THID
– More need for UPS with leading-PF capacity and low THVD
against load current distortion
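The 10ms versus 20ms immunity figures come down to stored energy in the PSU’s DC bus. A rough sizing sketch — the bus voltages are assumed values typical of a PFC front end, not figures from the slide:

```python
def holdup_capacitance_f(p_w, t_s, v_nom, v_min):
    """C such that 0.5*C*(v_nom^2 - v_min^2) supplies p_w for t_s seconds."""
    return 2.0 * p_w * t_s / (v_nom ** 2 - v_min ** 2)

# 300W server PSU, 400V bus allowed to sag to 300V during the interruption:
c_10ms = holdup_capacitance_f(300.0, 0.010, 400.0, 300.0)
c_20ms = holdup_capacitance_f(300.0, 0.020, 400.0, 300.0)
print(round(c_10ms * 1e6), round(c_20ms * 1e6))  # ~86 and ~171 microfarads
```

Doubling the ride-through doubles the bulk capacitance — cost and volume the OEMs have been shaving, which is why fully loaded servers have drifted back to the pre-97 tolerance.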
73. Standards in development?
I am Spartacus! Everyone is involved in guides, white papers and
standards. Governments are increasingly interested in energy
efficiency
74. International Standards work
EN50600 - Data Centre Infrastructure
– Facility, power, cooling, cabling, fire, security etc
– Availability Class replaces Tiers
ISO/IEC JTC1 SC39 – Resource Efficient Data Centres
– Sustainability for and by ICT
– WG1 – Metrics;
• IEC 30134-1 Introduction to KPIs
• -2 PUE, -3 ITEE, -4 ITEU, -5 WUE
• Then CUE, KREC, RWH and others
• Korea favours an aggregated ‘silver bullet’ KPI
– WG2 – Sustainability by ICT; low carbon enablement
The Green Grid
– Innovators in energy efficient facilities
– Original work being adopted in ISO
– Technical work continues apace so please come and join us!
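The KPIs in the 30134 series are simple ratios over the same denominator, annual IT energy. A minimal sketch of the Green Grid/ISO-style definitions (the sample figures are illustrative):

```python
def pue(total_facility_kwh, it_kwh):
    """Power Usage Effectiveness: total facility energy / IT energy."""
    return total_facility_kwh / it_kwh

def wue(water_litres, it_kwh):
    """Water Usage Effectiveness: litres of water / kWh of IT energy."""
    return water_litres / it_kwh

def cue(co2_kg, it_kwh):
    """Carbon Usage Effectiveness: kg CO2 / kWh of IT energy."""
    return co2_kg / it_kwh

print(pue(12_000_000.0, 10_000_000.0))  # 1.2
```

Keeping the denominator common is what lets the metrics be compared and, as Korea proposes, potentially aggregated.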
75. Why metrics?
You can’t control what you don’t measure
– Identify areas that need improvement and take actions
– Monitor that improvement
– Continuously move forward
Legislation has to be based on measurements
– The CRC was to be based on PUE improvement
– The best metrics are those suggested by the industry
– Most facilities cannot be judged by the extremes of Google, Facebook
et al
76. Conclusions or predictions?
Data Centres are at the heart of the internet, enabling our digital
economy. They will expand as our demands, for social, educational,
medical and business purposes, for digital content and services grow
– Facilities will become storage dominant and footprint will increase
– Loads will become more load:power linear and, as a result, more dynamic.
– Thermal management will become increasingly adopted and PUEs will fall to c1.2
across all of Europe
– Only larger, highly virtualised and heavily loaded facilities will enable low-cost digital
services as the cost of power escalates
Despite our best efforts power consumption will rise, not fall
– Data growth continues to outstrip Moore’s Law and a paradigm shift in network
photonics and devices will be required but, even then, a change in usage behaviour
will probably be required
– Bitterlin’s Law forecasts a growth rate at c20% CAGR for the foreseeable future –
often in connected locations where energy is cheap and taxes are low
Only a restriction in access will moderate power consumption
– Probably for ‘social’ applications rather than business, medical or education?
– Through price, tax or legislation?
Using DCIM to match load to capacity and maximising utilisation is
one key component
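Bitterlin’s Law at c20% CAGR compounds quickly; the 100-unit starting value below is arbitrary:

```python
def project(value, cagr, years):
    """Compound annual growth: value after `years` at rate `cagr`."""
    return value * (1.0 + cagr) ** years

print(round(project(100.0, 0.20, 10), 1))  # ~619 — roughly 6x in a decade
# For comparison, data growth at 80% CAGR (slide 2) over the same period:
print(round(project(100.0, 0.80, 10), 1))
```

Even the ‘moderate’ 20% power-growth forecast multiplies consumption six-fold in ten years, which is why restriction of access, price or tax enters the conclusions at all.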
77. But predicting the future of IT is risky...
1997 – the world’s fastest super-computer
SANDIA National Laboratories ‘ASCI RED’
1.8 teraflops, 150m² raised floor, 800kW
2006 (+9 years): Sony Playstation3
1.8 teraflops, 0.08m², 0.2kW
Top500, June 2013 (China): 33.86 PetaFLOPS – c20,000x 1997
Source: www.top500.org
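The ASCI RED versus Playstation3 comparison works out to a striking efficiency gain, computed here from the slide’s own numbers:

```python
# Same 1.8 teraflops delivered nine years apart:
asci_tf, asci_kw = 1.8, 800.0  # ASCI RED, 1997
ps3_tf, ps3_kw = 1.8, 0.2      # Playstation3, 2006

improvement = (ps3_tf / ps3_kw) / (asci_tf / asci_kw)
print(round(improvement))  # 4000x better flops-per-watt in nine years
```

A 4000x flops-per-watt improvement in nine years — which is the risk in any prediction, including the ones on the previous slide.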
78. Questions?
Data centres are here to stay and will
increase in number and power. We
need to explain that this power growth
problem is of society’s own making
and not ‘dirty data-centres’.