5. Graph of Moore’s Law – with MS
11/01/2008 EADS 5
[Figure: graph of Moore’s Law; label: Human Intelligence]
6. Memory bottleneck
• The CPU can add two numbers in less than one nanosecond.
– If they are both in registers
• Putting a number from memory into a register takes about 100 nanoseconds.
• “Stall”: the CPU sits idle while it waits on memory.
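The arithmetic behind the bullets above can be sketched as follows. This is a rough model, not a measurement: it takes the slide's figures at face value (1 ns as an upper bound for an add, about 100 ns for a load), and the names ADD_NS, LOAD_NS, total_time_ns and stall_fraction are made up for illustration.

```python
# Rough cost model using the figures quoted on the slide (assumed, not measured).
ADD_NS = 1.0     # upper bound: an add on register operands takes less than 1 ns
LOAD_NS = 100.0  # approximate time to bring one number from memory into a register


def total_time_ns(num_adds, loads_per_add):
    """Time for num_adds additions when each one needs loads_per_add loads from memory."""
    return num_adds * (ADD_NS + loads_per_add * LOAD_NS)


def stall_fraction(loads_per_add):
    """Fraction of the total time the CPU spends waiting on memory."""
    stall = loads_per_add * LOAD_NS
    return stall / (ADD_NS + stall)


print(total_time_ns(1000, 0))  # operands already in registers
print(total_time_ns(1000, 1))  # one operand fetched from memory per add
print(stall_fraction(1))       # share of time spent stalled in the second case
```

Under these assumptions, a single memory fetch per addition makes the loop roughly a hundred times slower, and nearly all of that time is stall.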
8. GPU architecture
• GPUs devote much less chip area to cache than CPUs do.
• GPUs have many (100-1000) cores, each a simpler, slower processing unit than a CPU core.
• GPU cores all perform the same instructions, but on different data.
• Not all the cores can be active at once. When one stalls, another one starts up.
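The last bullet describes hiding memory latency by switching to other work whenever something stalls. A minimal sketch of that idea follows, assuming each unit of work computes for 1 ns and then waits 100 ns on memory (the figures from the memory bottleneck slide). The functions utilization and work_items_to_hide_latency are hypothetical names, and the model ignores switching overhead.

```python
import math


def utilization(num_work_items, compute_ns=1.0, latency_ns=100.0):
    """Fraction of time the arithmetic hardware is busy.

    Each work item computes for compute_ns, then stalls for latency_ns.
    While one item is stalled, another ready item runs in its place.
    """
    busy = num_work_items * compute_ns
    cycle = compute_ns + latency_ns
    return min(1.0, busy / cycle)


def work_items_to_hide_latency(compute_ns=1.0, latency_ns=100.0):
    """Smallest number of work items that keeps the hardware busy all the time."""
    return math.ceil((compute_ns + latency_ns) / compute_ns)


print(utilization(1))                 # a single item: mostly stalled
print(utilization(50))                # about half busy
print(work_items_to_hide_latency())   # items needed for full utilization
```

With these numbers it takes on the order of a hundred interchangeable work items to keep the arithmetic units busy, which is consistent with the slide's point that a GPU has far more cores than can be active at once.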
9. GPU and CPU: The Differences
[Figure: block diagrams of a CPU and a GPU; labeled components: Control, Cache, ALU, DRAM]
GPU:
• More transistors devoted to computation, instead of caching or flow control
• Suitable for data-intensive computation
• High arithmetic/memory operation ratio
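The arithmetic/memory operation ratio can be made concrete with a small calculation. The two workloads below are illustrative assumptions and do not come from the slides, and arithmetic_intensity is a hypothetical helper name.

```python
def arithmetic_intensity(arithmetic_ops, memory_ops):
    """Arithmetic operations performed per memory operation."""
    return arithmetic_ops / memory_ops


# Example 1 (assumed workload): y[i] = a * x[i] + y[i]
# Per element: 1 multiply + 1 add, and 2 loads (x[i], y[i]) + 1 store (y[i]).
scaled_add = arithmetic_intensity(arithmetic_ops=2, memory_ops=3)

# Example 2 (assumed workload): evaluate a degree-16 polynomial at x[i] with
# Horner's rule and store the result.
# Per element: 16 multiplies + 16 adds, and 1 load + 1 store.
polynomial = arithmetic_intensity(arithmetic_ops=32, memory_ops=2)

print(scaled_add)   # below 1: dominated by memory traffic
print(polynomial)   # well above 1: dominated by arithmetic
```

By this measure the second workload is the better fit for the GPU described above, since it does many arithmetic operations for each number it moves to or from memory.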