Computing Platforms for the XXIc - DSD/SEAA Keynote
1. Computing Platforms for the XXI Century
Abstract:
Wikipedia defines Platform as "A raised level surface on which people or things can stand". A more familiar
technical interpretation applies to the hardware and OS configuration applicable to the execution of
software; most frequently applicable to highly stable PC or Mainframe architectures. But the world has
changed a lot since serious computing power moved into the embedded consumer arena. Now, with runs of
many millions for single products, the argument for customisation is much more justifiable; so the traditional
view of platforms is struggling against a tide of individuality. Can the ARM architecture bring stability back
into this chaos, or is something else needed? Isaac Newton realised the reality of platforms when he talked of
standing on the shoulders of giants. A platform is a stable place where engineers and scientists can stand to
achieve more than they would otherwise have done. So our XXI Century Platforms are the shape to deliver
improved Productivity, Reuse, Quality, TTM, Cost, etc. for the System Products we are now charged to
deliver. It's business, stupid!
Context
Keynote at the Euromicro conference
http://www.teisa.unican.es/dsd-seaa-2013/
The series (est. 1973) is known worldwide for its scientific quality. Its main event in 2013 is the co-located Digital System Design
(DSD) and Software Engineering and Advanced Applications (SEAA) conference in Santander, Spain.
45min Keynote, 60min Slot. 4sep13
Pdf and Tube available at http://ianp24.blogspot.co.uk/
2. 1v0
Prof. Ian Phillips
Principal Staff Eng’r,
ARM Ltd
ian.phillips@arm.com
Visiting Prof. at ...
Contribution to Industry
Award 2008
Euromicro DSD/SEAA Keynote
Santander, Spain
04sep13
3. Classic Computing Platforms
General Purpose Compute Platforms
PC – Dominated by x86 architecture (Intel + AMD + Windows)
Linux
OpenBSD
FreeVMS
MacOS ‘N’ – Universal Binaries (PowerPC/x86)
Mainframe - IBM, EMC, Hitachi, Unisys, HP, NEC, Fujitsu
DOS
But also Apple ...
Windows ‘N’
Fortran
C/C++
Cobol - One of the first languages (1959). In 1997, 80% of the world's business ran on COBOL, with >200 billion lines of code in existence and >5 billion lines of new code annually (Gartner).
Portable Computing – Pocketable GP Compute Platforms
iOS (iPad/iPhone/iPod)
Android
Windows 8
... We all have our personal favourites!
4. But What About Embedded?
Computers, but without General Programmability
The Chip as a Platform?
MCU and CPU chips from many vendors?
The PCB Platform?
ARM IP a Platform?
What about the RTOS’s?
∘ The CPUs?
∘ The GPUs?
∘ AMBA?
∘ CoreLink Cells?
∘ SoC Methods?
There are 45 listed on ARM’s web-site
Or the Design Tools?
Verilog/VHDL and Synthesis?
Digital Logic : Based on Boolean Mathematics?
Software Kernels/RTOs Debuggers?
Lots of form-factors, targeting different markets
... By-Far the Biggest Footprint of Computers Today!
[Figure: BeagleBone Black (TI)]
7. A Machine for Computing ...
Computing: A general term for algebraic manipulation of data ...
[Diagram: Numerated Phenomena IN (x) → y = F(x, t, s) → Processed Data/Information OUT (y)]
... State and Time are normally factors in this.
It can include phenomena ranging from human thinking to calculations
with a narrower meaning.
Usually used to exercise analogies (models) of real-world situations;
Frequently in real-time (Fast enough to be a stabilising factor in a loop).
Wikipedia
... Not prescriptive about Implementation Technology!
... Not prescriptive about ease of (re)Programmability!
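A minimal sketch of y = F(x, t, s) in software, assuming a simple integrator as F: the output depends on the input x, the time t and the carried state s. The choice of an integrator and the names `F` and `run` are illustrative assumptions; the slide does not prescribe any particular function or implementation technology.

```python
def F(x, t, s):
    """y = F(x, t, s): integrate input x over time.

    s is the carried state: (time of previous sample, running total).
    Returns the output y and the new state.
    """
    last_t, total = s
    # Time is a factor: the contribution of x is weighted by elapsed time.
    dt = 0.0 if last_t is None else t - last_t
    total += x * dt
    return total, (t, total)


def run(samples):
    """Feed (t, x) samples through F, threading the state between calls."""
    state = (None, 0.0)
    outputs = []
    for t, x in samples:
        y, state = F(x, t, state)
        outputs.append(y)
    return outputs


if __name__ == "__main__":
    print(run([(0.0, 2.0), (1.0, 2.0), (3.0, 1.0)]))
```

The same F could equally be realised in analogue hardware or fixed logic, which is the slide's point about implementation not being prescribed.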
8. Electronic Systems [1]: the KET for 21c!
Fundamental to the solutions to all of Society's Challenges
Dependent on them today; we will become ever more so in the future
National Independence is not an option: but Mutual Co-Dependence is!
Though Animated by Electronics; ES are much more than that ...
... They Include all the Technologies and Methods to make them ‘work’ as a Product.
The most important technology is the one that doesn’t work!
... ES Technologies will literally be the Platform on which the 21c will be constructed.
[1] aka Cyber-Physical Systems (Geek-Talk!)
9. Putting Technology into Context
21c Businesses have to be Selling things that People (End-Customers) want to buy.
Globalisation makes Them Focus on Their Core Competencies
Customers, Competition, Operations and Investors are Global
Objective: (World) Best at That; Outsource ‘everything else’
Nationality: Has little meaning (Loyalty, Tradition, etc)
Business needs
End-Customers buy Functionality not Technology
Technologies enable Product Options
Business-Models make Money
..but..
New Products are
Technology (HW, SW, Mechanics, Optics, etc) is just a way to enable Product Options (Create Differentiation)!
New Technology always increases Cost/Risk ... But not always Value
Design is a Cost/Risk to be Minimised
... Technology is never a Product in its own right!
10. Moore’s Law is a Technology Opportunity
[Figure: Transistors/Chip (M) and Transistor/PM (K) plotted against Approximate Process Geometry (100um down to 10nm). Source: ITRS’99; http://en.wikipedia.org/wiki/Moore’s_law]
11. Markets provide the Growth Drivers
1st Era: Select work tasks
2nd Era: Broad-based computing for specific tasks
3rd Era: Computing as part of our lives
[Figure: Millions of Units against year, 1960 to 2020, across the three eras]
... Yesterday's Markets are still valuable; just not the Biggest!
13. What Happened to the Productivity Gap?
Pre-1990 chip design was entire ...
Moore’s Law was handled by ever Bigger Teams and ever Faster Tools
With Improved Productivity through HDL and Synthesis
... I was a chip designer in 1975; and did it all, myself, in 3mth (1k gates!)
Post-1995 reuse silently entered the picture ...
Circuit Blocks
CPUs (and Software)
External IP
Up-Integration (Incl. Software)
Chip Reuse (ASSP)
... With Supporting Methodology!
... Delivering Productivity, Quality and Reliability
... Birth of HW/SW IP Companies (eg ARM c1991)
... But it also brought the Commoditisation of Silicon (and FABs) !
14. How Much Reuse Today?
Mobile Products have ~500m gate SoCs / ~500m lines of code
Doubling every 18mth
Designer Productivity: is just 100-1000 Gates(Lines)/day
That is tested, verified, incorporated gates(lines)
That’s 2,500-25,000 p.yrs to clean-sheet design! (Un-Resourceable)
Typically ‘Product Designs’ have 50-200 p.yr available ...
That’s just ~0.5% New ... >99.5% Reuse already!
Not Viable to do clean-sheet product design ... nor has it been since ~1995
The core HW/SW is only a part of a Product ...
There’s all of the other Components and Sub-Systems
There’s the IO systems (RF, Audio, Optical, Geo-spatial, Temporal)
There’s the Mechanical
There’s the Reproduction (Factory)
There's the Business Model (Cash-flow, Distribution, Legal)
There’s the Support (Repair, Installation, Maintenance, Replacement)
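The clean-sheet effort figures on this slide can be reproduced with a short calculation. This is a sketch under one stated assumption: roughly 200 working days per person-year, which the slide does not give but which makes its 2,500-25,000 p.yr range come out exactly. The function names are illustrative.

```python
WORKING_DAYS_PER_YEAR = 200  # assumption; not stated on the slide


def clean_sheet_person_years(size, per_day, days_per_year=WORKING_DAYS_PER_YEAR):
    """Person-years to design `size` gates (or lines) at `per_day` per designer-day."""
    return size / per_day / days_per_year


def new_design_fraction(available_py, required_py):
    """Share of the product that the available effort could create from scratch."""
    return available_py / required_py


SOC_SIZE = 500e6  # ~500m gates (or lines of code), from the slide

best_case = clean_sheet_person_years(SOC_SIZE, 1000)   # fastest designers
worst_case = clean_sheet_person_years(SOC_SIZE, 100)   # slowest designers

if __name__ == "__main__":
    print(f"Clean-sheet effort: {best_case:,.0f} to {worst_case:,.0f} p.yrs")
    for available in (50, 200):
        lo = new_design_fraction(available, worst_case)
        hi = new_design_fraction(available, best_case)
        print(f"{available} p.yr available -> {lo:.1%} to {hi:.1%} new design")
```

With 50-200 p.yr available, the new-design share ranges from 0.2% to 8% depending on which ends of the ranges are paired; the slide's ~0.5% falls toward the low end of that range.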
15. How do we Reuse?
Design Tools (across all Product Disciplines) underpin this ...
Reuse of Modules and Components
Reuse of Existing Code and Circuits
Sharing Methodology
Sharing Architecture
Creating Tools to Accelerate Methodology and Repeatability
Design For “x” (DFx) is Design For Up-Stream (Re)Deployment
A significant part is (and will remain) Knowledge based ...
The Designer has done similar work before
The Team has Collective experience
The Company has experience and a customer base
The Design Engineering Role is ...
To create Order out of Chaos
To apply state-of-the-art and knowledge; to create a Viable Product
16. Platforms Mean Productivity
Reusing rather than Re-Developing
Allows Focus on your value-add; and less on stuff that you can acquire competitively (which has become commoditised).
∘ English as the lingua-franca
∘ Instant global telecoms (ICT)
∘ IT and the Internet
∘ International Contract Law
∘ The World-Trade Organisation (WTO)
∘ Standardisation of GP-Compute Architecture
Globalisation has changed the meaning of Local...
∘ Actual Business-2-Business cooperation (Partnering, not just Out-Sourcing).
∘ In all aspects of business: Technical and Administrative
∘ Irrespective of geographic location
∘ Irrespective of tangibility of ‘product’.
... Just like
does.
Platforms have changed scope of Reuse ...
... And these businesses avoid Commoditisation
... By Differentiating their Platform Products
18. All Exponentials Must End ...
Growing opinion that 14 or 7nm will be the smallest yieldable node ... Ever!
Just 3-4 gen. (5-8yr) to the end of Planar Scaling
Only things on the drawing board today ... can get into the last of the planar chips!
It's also the end-of-the-road for ‘promising technologies’!
Clean-Sheet Synthesis
Scalable Processor Arrays
Formal Design
Top-Down Design
[Figure: process nodes 130nm, 90nm, 30nm, 14nm, 7nm]
...And the end for Moore’s Law?
19. Moore's Real Law: x2 Functionality Every 18mth!
Cascade of Technologies supporting Functional growth ...
[Figure: Functional Density (units), 10^0 to 10^12, against year, 1960 to 2020. Electronic era: 1975-2005; System era: 2003-2030]
... The ‘Law’ started with Wood ⇒ Stone ⇒ Bronze ⇒ Iron
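As a rough illustration of what "x2 every 18mth" compounds to, the sketch below computes the growth factor over a number of years and the time needed to reach a given factor. The function names are illustrative, and the 18-month doubling period is taken from the slide title as a fixed assumption.

```python
import math

DOUBLING_PERIOD_YEARS = 1.5  # "x2 every 18mth", from the slide title


def growth_factor(years, doubling_period=DOUBLING_PERIOD_YEARS):
    """Multiplicative growth in functionality after `years`."""
    return 2.0 ** (years / doubling_period)


def years_to_grow(factor, doubling_period=DOUBLING_PERIOD_YEARS):
    """Years needed for functionality to grow by `factor`."""
    return math.log2(factor) * doubling_period


if __name__ == "__main__":
    print(growth_factor(15))      # ten doublings in 15 years
    print(years_to_grow(1e6))     # time for a million-fold increase
```

Ten doublings in 15 years give roughly a thousand-fold increase, which is why sustaining the trend needs a cascade of technologies rather than a single one.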
20. … System-Scaling Maintains Momentum!
Interposer today
Die-Integration ..and..
13aug13
Genuine 3D-Process very soon
24-Layers
3D NAND-Flash
4x Transfer
to Production
Die-Stack
10 Layer Interposer
Die-Stack Mixed-Technology
8x Sampling
Active Carrier
PV - 500nm Ge
RF - 300nm GaAs
CPU- 90nm Si CMOS
DRAM - 20nm Si FIN-MOS
300nm Si CMOS
10 stack 1.6 mm
... A disconnect for Moore’s Planar-Scaling Law,
... but not for ‘his’ System-Scaling Law.
21. Packing Technology in an iConic Product
Analogue and Digital Design
Embedded Software
Mechanics, Plastics and Glass
Micro-Machines (MEMs)
Displays and Transducers
Robotics and Test
Knowledge and Know-How
Research, Education and Training
Components, Sub-Systems and Systems;
Design, Assembly and Manufacture
Metrology, Methodology and Tools
... Involving Many Specialist Businesses
... Round and Round the World
...Not-Least from Europe
22. A lot of Technologies in a Smart Phone
... And more than 99% is Reused!
23. Take a Look Inside...
Level-1: Modules
The Control Board.
http://www.ifixit.com
25. Inside The Control Board
(b-side)
Level-2: Sub-Assemblies
More Visible Computing Contributors ...
A4 Processor. Spec: Apple; Design & Mfr: Samsung - Digital-CMOS (nm) ...
Provides the iPhone 4 with its GP computing power.
(Said to contain ARM A8 600 MHz CPU and other ARM IP)
ST-Micro: 3 axis Gyroscope - MEMS-CMOS (ARM Partner)
Broadcom: Wi-Fi, Bluetooth, and GPS - Analogue-CMOS (ARM Partner)
Skyworks: GSM - Analogue-Bipolar
Triquint: GSM PA - Analogue-GaAs
Infineon: GSM Transceiver - Analogue/Digital-CMOS (ARM Partner)
[Figure labels: GPS; Bluetooth, EDR & FM]
http://www.ifixit.com
26. The A4 SIP Package
(Cross-section)
Memory ‘Package’
2 Memory Dies
Processor SOC Die
Glue
4-Layer Platform ‘Package’
Down 3-Levels: IC Packaging
The processor is the centre rectangle. The silver circles beneath it are solder balls.
Two rectangles above are RAM die, offset to make room for the wirebonds.
Putting the RAM close to the processor reduces latency, which makes the RAM faster and cuts power.
Unknown Mfr (Memory)
Samsung/ARM (Processor)
Unknown (SIP Technology)
Source ... http://www.ifixit.com
27. The Processor Unit
(Nvidia Tegra 3, around 1B transistors)
NB: The Tegra 3 is similar to the A4/5, but is not used in the iPhone
28. Lots and Lots of Designers ...
159 Tier-1 Suppliers ...
Thousands of Design Engineers
10s of thousands of Engineers Globally
... Hundreds more Tier-2 suppliers (Including ARM)
29. So What Does ARM Really Do?
“ARM designs processor technology that lies at the heart of advanced consumer products”
30. 1991: ARM a RISC-Processor Core …
[Figure: block diagram of the ARM RISC-processor core.
Blocks: Address Register, Address Incrementer, Incrementer, PC, PC Update, Scan Debug Control, Register Bank, Instruction Decoder, Decode Stage, Instruction Decompression and Control Logic, Multiplier, Barrel Shifter, 32 Bit ALU, Write Data Register, Read Data Register.
Buses: A Bus, B Bus, ALU Bus.
Signals: ADDR[31:0], WDATA[31:0], RDATA[31:0], nIRQ, nFIQ, nRESET, ABORT, TRANS, PROT, CFGBIGEND, CLK, CLKEN, WRITE, SIZE[1:0], LOCK, CPnOPC, CPnCPI, CPA, CPB]
32. But Systems Got Ever-More Complex!
Today, users require a pocket ‘Super-Computer’ ...
Silicon Technology Provides a few-Billion transistors ...
ARM’s Technology (still) makes it Practical to utilise them ...
Nvidia Tegra 3: 10 Processors, all ARM
• 4 x A9 Processors (2x2)
• 4 x MALI 400 Fragment Proc.
• 1 x MALI 400 Vertex Proc.
• 1 x MALI Video CoDec
• Software Stacks, OS’s and Design Tools
ARM Technology gives chip/system designers ...
• Improved Productivity
• Improved TTM
• Improved Quality/Certainty
... So By Definition ARM is (≥1) Platform!
33. Making Systems out of Transistors
ARM Technology drives efficient Electronic System solutions:
Software increasing system efficiency with optimized software solutions
Diverse components, including CPU and GPU processors designed for specific tasks
Interconnect System IP delivering coherency and the quality of service required for lowest memory bandwidth
Physical IP for a highly optimized processor implementation
Backed by >900 Global Partners ...
>800 Licences
Millions of Developers
34. Methodology As Well As Hardware
C/C++
Debug & Trace
Development
Energy Trace
Modules
Middleware
35. The Right Horse for The Course ...
About 50MTr
About 50KTr
... Delivering ~5x speed (Architecture + Process + Clock)
37. Power-Efficiency
Watts don’t just Happen; they are Caused!
In the Chip
Matching the processor to the application
Minimise voltage/frequency (P = CV²f)
Variable/Gated clock domains
Variable/Switched voltage domains
Maximises ‘Activity Power’ dependence (Counter Intuitive)
In the Software
Give the OS and the Application SW
Information and Controls
Methodology and Utilities
In the System
Architecture
Extend control beyond the chip
... HW Dissipates; but SW Makes It!
38. Parallel is More Power-Efficient
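[Diagram: one Processor clocked at f, Input → Output, compared with two parallel Processors each clocked at f/2, sharing Input and Output at rate f]
Single processor: Capacitance = C, Voltage = V, Frequency = f, Power = CV²f
Two parallel processors: Capacitance = 2.2C, Voltage = 0.6V, Frequency = 0.5f, Power = 0.4CV²f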
... By a factor determined by Amdahl or Gustafson?
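The power comparison on this slide follows directly from P = CV²f. The sketch below reproduces it, with values normalised so the single processor has C = V = f = 1, and adds the standard Amdahl and Gustafson speedup formulas the slide alludes to. The function names and the parallel fraction used in the demo are illustrative assumptions, not figures from the talk.

```python
def dynamic_power(c, v, f):
    """Dynamic (switching) power: P = C * V^2 * f."""
    return c * v * v * f


def amdahl_speedup(p, n):
    """Amdahl's law: fixed problem size, parallel fraction p, n processors."""
    return 1.0 / ((1.0 - p) + p / n)


def gustafson_speedup(p, n):
    """Gustafson's law: problem size scales with n, parallel fraction p."""
    return (1.0 - p) + p * n


# Normalised values from the slide: single processor is C = V = f = 1.
single = dynamic_power(1.0, 1.0, 1.0)
parallel = dynamic_power(2.2, 0.6, 0.5)  # two processors at half frequency

if __name__ == "__main__":
    print(f"Parallel / single power: {parallel / single:.3f}")
    p = 0.9  # hypothetical parallel fraction
    print(f"Amdahl speedup, 2 cores:    {amdahl_speedup(p, 2):.2f}")
    print(f"Gustafson speedup, 2 cores: {gustafson_speedup(p, 2):.2f}")
```

The saving comes mainly from the squared voltage term: running at half the frequency allows the lower supply voltage, which more than offsets the 2.2x capacitance of the duplicated hardware. How much of that is realised depends on how well the workload parallelises.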
39. CoreLink Supports Multi-Processing
Heterogeneous processors – CPU, GPU, DSP and accelerators
Virtualized Interrupts
Up to 4 cores per cluster
Up to 4 coherent clusters
Integrated L3 cache (8-16MB) with Snoop Filter
Up to 18 AMBA interfaces for I/O coherent accelerators and IO
IO Virtualisation with System MMU
Uniform System memory
Peripheral address space
[Figure: CoreLink™ CCN-504 Cache Coherent Network. Four Quad Cortex-A15 clusters, each with L2 cache, connect over ACE; DSPs, PCIe, DPI, Crypto, USB, SATA and 10-40 GbE connect via AHB/ACE and NIC-400. Two CoreLink™ DMC-520 controllers provide dual channel DDR3/4 x72 (PHY, x72 DDR4-3200). A NIC-400 Network Interconnect serves Flash and GPIO. Interrupt Control is shown alongside.]
40. big.LITTLE Processing
For High-Performance systems...
Tightly coupled combination of two ARM CPU clusters:
Cortex-A15 (big Performance) and Cortex-A7 (LITTLE Power) - functionally identical
Same programmer's view, looks the same to OS and applications
big.LITTLE combines high-performance and low power
Automatically selects the right processor for the right job
Redefines the efficiency/performance trade-off
big: “Demanding tasks” - >2x Performance
LITTLE: “Always on, always connected tasks” - 30% of the Power (select use cases)
[Figure: two comparison charts, each labelled Current, big.LITTLE, smartphone]
41. Fine-Tuned to Different Performance Points
LITTLE (Cortex-A7, Cortex-A53)
Most energy-efficient applications processor from ARM
Simple, in-order, 8 stage pipelines
Performance better than mainstream, high-volume smartphones (Cortex-A8 and Cortex-A9)
big (Cortex-A15, Cortex-A57)
Highest performance in mobile power envelope
Complex, out-of-order, multi-issue pipelines
Up to 2x the performance of today’s high-end smartphones
[Figure: pipeline diagrams with stages labelled Queue, Issue, Integer]
42. big.LITTLE Software Model
CPU Migration
Migrate a single processor workload to the appropriate CPU
Migration = save context then resume on another core
Also known as Linaro “In Kernel Switcher”
DVFS driver modifications and kernel modifications
Based on standard power management routines
Small modification to OS and DVFS, ~600 lines of code
big.LITTLE MP
OS scheduler moves threads/tasks to appropriate CPU
Based on CPU workload
Based on dynamic thread performance requirements
Enables highest peak performance by using all cores at once
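The CPU Migration model above can be pictured as a load-driven switch between the two clusters. The following is a toy sketch only: the thresholds, the hysteresis scheme and all names are hypothetical, and it does not reproduce the Linaro In Kernel Switcher or any real scheduler code.

```python
BIG, LITTLE = "big", "LITTLE"

# Hypothetical thresholds; real systems derive these from DVFS operating points.
UP_THRESHOLD = 0.8    # migrate to big above this load
DOWN_THRESHOLD = 0.3  # migrate back to LITTLE below this load


def choose_cluster(current, load):
    """Pick the cluster for the next interval given the current one and CPU load (0..1).

    Two thresholds give hysteresis, so a load hovering near one value
    does not cause repeated migrations (each one costs a context save/restore).
    """
    if current == LITTLE and load > UP_THRESHOLD:
        return BIG
    if current == BIG and load < DOWN_THRESHOLD:
        return LITTLE
    return current


def simulate(loads, start=LITTLE):
    """Return the cluster used for each load sample in turn."""
    cluster = start
    trace = []
    for load in loads:
        cluster = choose_cluster(cluster, load)
        trace.append(cluster)
    return trace


if __name__ == "__main__":
    print(simulate([0.1, 0.9, 0.5, 0.2]))
```

big.LITTLE MP differs in that the scheduler places individual threads and can use all cores at once, rather than switching a whole workload between clusters.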
43. Businesses Within The Global Life-Cycle
Company A, Product-X
Design
Design Tools
Training
Education
ICT
Conferences
Patents
Know-How
Tool-Libraries
Models
Software
Research
Methods
Design
Integrate
Tools
Technologies
Prototypes
FABs
Components
Know-How
Methods
Qualify
Equipment
Know-How
Standards
Procedures
ICT
Methods
Training
Reproduce
Big Finance
Equipment
Know-How
Components
Out-Sourcing
JIT
Factory Auto’n
Methods
TQM
Training
Install
Equipment
Know-How
Standards
Methods
Supply
Logistics
Training
Maintain
Equipment
Know-How
Supply
Logistics
Training
Upgrade
Equipment
Know-How
Supply
Logistics
Training
Qualify
Reproduce
Equipment
Know-How
Standards
Logistics
Training
Companies B & C Provide Their Valued Product(s) to Other Customers As Well (Efficiency of Reuse)...
Company-B, Product-J,K,L
Integrate
DeCommission
Install
Maintain
Upgrade
DeCommission
... Enabled By Globalisation: ICT, WTO, English Language, Containers and Int’l Contract Law
Company-C, Product-M,N,O
Design → Integrate → Qualify → Reproduce → Install → Maintain → Upgrade → DeCommission
... All Platforms are Valued in Product Life-Cycles
44. Conclusions ...
Business is about Making Money for Investors ...
Technology just enables Product Options, not all of which are Valuable
“Optimality” is seldom a Product Differentiator; “Better” is!
... Most Tech. Enterprises provide Components into Product Life-Cycles
Platforms are just Productivity Aids ...
A way of creating new Products as quickly and cheaply as possible
Valued is not the same as Valuable
ARM is a Productivity Aid to the biggest market for Computers today
... So by definition ARM’s Products are (key) Computing Platforms (plural)
Electronic Systems will be the foundation of our future ...
They will be fundamental to whatever Society makes of the 21C (+ and -)
And Society will be increasingly unaware of them!
Requirements for ever more Sophisticated Functionality will require ever more sophisticated Technology-Platforms throughout their Life-Cycles
... But Electronic Systems will be The Product-Platform for the XXIc
45. Prof. Ian Phillips
Principal Staff Eng’r,
ARM Ltd
ian.phillips@arm.com
Visiting Prof. at ...
Contribution to Industry
Award 2008