1) A CPU core can cycle over 3 billion times per second, yet light travels only about 10 cm during a single CPU cycle.
2) Soon, servers will have 128 CPU cores delivering over 400 billion CPU cycles per second, but most of that power will be wasted waiting for data.
3) By 2022 there will be a 128x increase in transistors per chip, but disk storage cannot keep up with the bandwidth demands: disks will be used only for archival purposes, with DRAM, flash, and phase-change memory replacing them for active data.
11. 2010 - 2022
128X increase in transistors per chip
[Diagram: CPU connected to NIC, RAM, FLASH and DISK]
• Moore's Law will continue for at least 10 years
• Transistors per area will double ~every 2 years
• 128X increase in ~12 years
• 2022: 512 Gbit / DRAM, 8 Tbit / Flash
• Frequency gains are difficult
• Pollack's rule: power scales quadratically with clock performance
• Parallelism with more cores is a must
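The doubling arithmetic behind the slide can be sketched as follows; note that a strict "double every 2 years" cadence over 12 years gives 64x, so the slide's 128x implicitly assumes a slightly faster cadence of one doubling every ~1.7 years (an assumption made explicit here):

```python
# Transistor growth under Moore's-law-style doubling (sketch).
# A 128x increase in 12 years implies 7 doublings, i.e. one
# doubling roughly every 12/7 ~= 1.7 years; the "~every 2 years"
# rule of thumb would give only 64x in the same span.
def growth_factor(years, doubling_period_years):
    return 2 ** (years / doubling_period_years)

assert round(growth_factor(12, 2)) == 64        # doubling every 2 years
assert round(growth_factor(12, 12 / 7)) == 128  # doubling every ~1.7 years
```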
12. 2010 - 2022
128X increase in transistors per chip
• 2014: 64 cores, 2016: 128 cores, 2022: 1024 cores
• Memory/IO bandwidth needs to grow with processing power
• Disks cannot follow!
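The core-count milestones above follow a simple doubling schedule; a minimal sketch (the function and its parameters are illustrative, not from the talk):

```python
# Core-count projection (sketch): doubling every 2 years from a
# baseline of 64 cores in 2014 reproduces the slide's data points.
def cores(year, base_year=2014, base_cores=64, doubling_years=2):
    return base_cores * 2 ** ((year - base_year) // doubling_years)

assert cores(2014) == 64
assert cores(2016) == 128
assert cores(2022) == 1024
```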
13. 2010 - 2022
128X increase in transistors per chip

                   2010      2022
Cores per chip:    10        1024
Memory bandwidth:  40 Gb/s   2.5 Tb/s   (challenging, but needed to feed the cores!)
IO bandwidth:      2 Gb/s    250 Gb/s

• No big change: single-core clock rate (will stay < 5 GHz)
• But impressive overall computing power: ~5000 (core * GHz)
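A quick check of the "~5000 (core * GHz)" figure, and of what the projected memory bandwidth would leave for each core (a sketch using the slide's projections, not measurements):

```python
# Back-of-envelope: aggregate compute and per-core memory bandwidth
# for the 2022 projection (sketch; all inputs are the slide's numbers).
cores = 1024
clock_ghz = 5.0                 # single-core clock stays < 5 GHz
mem_bandwidth_tbps = 2.5        # projected 2022 memory bandwidth

compute = cores * clock_ghz     # core * GHz, the slide's "~5000"
per_core_gbps = mem_bandwidth_tbps * 1000 / cores

assert compute == 5120.0
assert round(per_core_gbps, 2) == 2.44  # only ~2.4 Gb/s per core
```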
14. Disks are Tape
DISK: spinning rust
• Forget hard disks!
• Disks cannot get faster
• Disks cannot follow bandwidth requirements
• Random-read scanning of a 1 TB disk today takes 15 – 150 days (!)
• To reach 1 TB/s you would need 10,000 disks in parallel
• Disks can only be archives any more (sequential access)
• DRAM, Flash and PCM will be the replacement
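The 15 – 150 day range follows from seek-bound arithmetic; a sketch, where the IOPS and page-size figures are assumptions chosen to show how such a range can arise:

```python
# Rough estimate of randomly reading an entire 1 TB disk, one page
# at a time (sketch; ~150-200 random IOPS and 0.5-4 KB pages are
# assumed figures, not from the talk).
def scan_days(disk_bytes, page_bytes, iops):
    seconds = (disk_bytes / page_bytes) / iops
    return seconds / 86_400  # seconds per day

tb = 10 ** 12
assert round(scan_days(tb, 4096, 200)) == 14   # best case: ~2 weeks
assert round(scan_days(tb, 512, 150)) == 151   # worst case: ~5 months
```

Sequential streaming, by contrast, reads the same terabyte in hours, which is why the slide relegates disks to archive (sequential) duty.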
15. 2010 - 2022
128X increase in transistors per chip

                   2010      2022
Cores per chip:    16        1024
Memory bandwidth:  40 GB/s   2.5 TB/s
IO bandwidth:      2 GB/s    250 GB/s

No big change: latency
16. Latency and Bandwidth
Two determining factors, which won't change:
• RAM – CPU latency: ~0.1 µs
• NIC latency via LAN or WAN: 0.1 – 100 ms

• NICs move to PCI Express
• Throughput x 2 / year
• May move onto CPU chip
• Access time falls by 50% / year
• 10 – 100 Gbit/s already today
• Goes from SATA to PCI Express
• Latency in cluster ~1 µs possible (InfiniBand / optical Ethernet)
• LAN/WAN latency 0.1 – 100 ms
19. A CPU accesses Level 1 cache memory in 1 – 2 cycles.
It accesses Level 2 cache memory in 6 – 20 cycles.
20. It accesses Level 2 cache memory in 6 – 20 cycles.
It accesses RAM in 100 – 400 cycles.
21. It accesses RAM in 100 – 400 cycles.
It accesses Flash memory in 5000 cycles.
22. It accesses Flash memory in 5000 cycles.
It accesses disc storage in 1,000,000 cycles.
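Converting those cycle counts into wall-clock time makes the gaps concrete; a sketch assuming a 3 GHz core (the clock rate is an assumption, matching point 1 of the summary):

```python
# Convert the slides' access costs from CPU cycles to wall-clock
# time, assuming a 3 GHz core (sketch).
CLOCK_HZ = 3e9

def cycles_to_ns(cycles):
    return cycles / CLOCK_HZ * 1e9

assert cycles_to_ns(1) < 1                            # L1: sub-nanosecond
assert round(cycles_to_ns(400)) == 133                # RAM worst case: ~133 ns
assert round(cycles_to_ns(5000), -1) == 1670          # Flash: ~1.7 us
assert round(cycles_to_ns(1_000_000) / 1000) == 333   # disk: ~333 us
```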
23. Translate cycles to miles and assume you were a CPU core...
Then Level 1 cache would be in the building,
Level 2 cache would be at the edge of this city,
RAM would be in a different state,
Flash memory would be in a different country,
... and disc storage would be the planet Mars.
25. Software Implications
Latency and locality are the determining factors.
What could that mean?

Roundtrip latencies (in CPU cycles):
• RAM: ~500 cycles
• Flash: 1,000 – 5,000 cycles
• Disk (archive): 1,000,000 cycles
• NIC (LAN/WAN): up to 500,000,000 cycles
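A tiny cost model shows why locality dominates: even a sliver of traffic falling through to a slow tier swamps the total. This is a sketch using the slides' roundtrip cycle counts; the workload split is an illustrative assumption:

```python
# Why locality dominates: a minimal cost model (sketch) comparing a
# workload that stays in RAM against one where 0.1% of accesses
# fall through to disk. Cycle costs are the slides' roundtrip figures.
COST = {"ram": 500, "flash": 5_000, "disk": 1_000_000}

def total_cycles(accesses):
    """accesses: list of (tier, count) pairs."""
    return sum(COST[tier] * count for tier, count in accesses)

ram_heavy = total_cycles([("ram", 1_000_000)])
disk_heavy = total_cycles([("ram", 999_000), ("disk", 1_000)])

assert ram_heavy == 500_000_000
assert disk_heavy == 1_499_500_000
# Redirecting just 0.1% of accesses to disk triples the total cost.
assert round(disk_heavy / ram_heavy, 3) == 2.999
```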
26. Why Bother?
Systems may just get smaller!
More users for transaction processing on a single machine - isn't that great?
Already today most customers could run the ERP load of a company on a single blade.
Commodity hardware becomes sufficient for ERP.
No threat!
(or maybe becoming a commodity is a threat?)