CS2305: Computer Architecture
Fundamentals of Computer Design
(Computer Architecture: Chapter 1 )
Department of Computer Science and Engineering
Fundamentals Introduction Agenda
1.1 Introduction
Classes of Computers
Defining Computer Architecture
Trends in Technology
Trends in Power and Energy in ICs
Trends in Cost
Dependability
Measuring Performance
Quantitative Principles
Evolution of Processors
x86 Manufacturers
(In the past)
Transmeta (discontinued its x86 line)
Rise Technology (acquired by SiS)
IDT (Centaur Technology x86 division acquired by VIA)
National Semiconductor (sold the x86 PC designs to VIA and later the
x86 embedded designs to AMD)
Cyrix (acquired by National Semiconductor)
NexGen (acquired by AMD)
Chips and Technologies (acquired by Intel)
IBM (discontinued its own x86 line)
UMC (discontinued its x86 line)
NEC (discontinued its x86 line)
Intel 4004 Die Photo
Introduced in 1971
First microprocessor
2,300 transistors, 12 mm2 (die size), 108 kHz
Intel 8086 Die Scan
Introduced in 1978
Basic architecture of the IA32 PC
29,000 transistors
33 mm2 5 MHz
Intel 80486 Die Scan
Introduced in 1989
1st pipelined implementation of IA32
1,200,000 transistors
81 mm2 25 MHz
Pentium Die Photo
Introduced in 1993
1st superscalar implementation of IA32
3,100,000 transistors
296 mm2 60 MHz
Pentium III
Introduced in 1999
9,500,000 transistors
125 mm2 450 MHz
Pentium 4 and Duo
Intel Pentium 4: 55M transistors (2001)
Intel Core 2 Extreme quad-core: 2 x 291M transistors (2006)
Intel Itanium: 221M transistors
Dual-Core Itanium 2 (Montecito)
1.72 billion transistors
2 GHz clock frequency
List of Intel Microprocessors
1 The 4-bit processors
1.1 Intel 4004
1.2 Intel 4040
2 The 8-bit processors
2.1 8008
2.2 8080
2.3 8085
3 Microcontrollers
3.1 Intel 8048
3.2 Intel 8051
3.3 Intel 80151
3.4 Intel 80251
3.5 MCS-96 Family
4 The bit-slice processor
4.1 3000 Family
5 The 16-bit processors: MCS-86 family
5.1 8086
5.2 8088
5.3 80186
5.4 80188
5.5 80286
6 32-bit processors: the non-x86 microprocessors
6.1 iAPX 432
6.2 i960 aka 80960
6.3 i860 aka 80860
6.4 XScale
7 32-bit processors: the 80386 range
7.1 80386DX
7.2 80386SX
7.3 80376
7.4 80386SL
7.5 80386EX
8 32-bit processors: the 80486 range
8.1 80486DX
8.2 80486SX
8.3 80486DX2
8.4 80486SL
8.5 80486DX4
9 32-bit processors: P5 microarchitecture
9.1 Original Pentium
9.2 Pentium with MMX Technology
10 32-bit processors: P6/Pentium M microarchitecture
10.1 Pentium Pro
10.2 Pentium II
10.3 Celeron (Pentium II-based)
10.4 Pentium III
10.5 Pentium II and III Xeon
10.6 Celeron (Pentium III Coppermine-based)
10.7 Celeron (Pentium III Tualatin-based)
10.8 Pentium M
10.9 Celeron M
10.10 Intel Core
10.11 Dual-Core Xeon LV
11 32-bit processors: NetBurst microarchitecture
11.1 Pentium 4
11.2 Xeon
11.3 Mobile Pentium 4-M
11.4 Pentium 4 EE
11.5 Pentium 4E
11.6 Pentium 4F
12 64-bit processors: IA-64
12.1 Itanium
12.2 Itanium 2
13 64-bit processors: Intel 64 – NetBurst microarchitecture
13.1 Pentium 4F
13.2 Pentium D
13.3 Pentium Extreme Edition
13.4 Xeon
14 64-bit processors: Intel 64 – Core microarchitecture
14.1 Xeon
14.2 Intel Core 2
14.3 Pentium Dual-Core
14.4 Celeron
14.5 Celeron M
15 64-bit processors: Intel 64 – Nehalem microarchitecture
15.1 Intel Pentium
15.2 Core i3
15.3 Core i5
15.4 Core i7
15.5 Xeon
16 64-bit processors: Intel 64 – / microarchitecture
16.1 Celeron
16.2 Pentium
16.3 Core i3
16.4 Core i5
16.5 Core i7
Comparison of Intel Processors
Series Nomenclature
Clock Rate
Socket 2, Socket 3,Socket
Fabrication
Number of Cores
Intel Pentium Intel Pentium
P5, P54C, P54CTB, P54CS
60 MHz – 200 MHz 120 MHz – 300 MHz
4, Socket 5,Socket 7
800 nm – 350 nm 350 nm – 250 nm
Unknown Unknown
Single Single
50 MHz – 66 MHz
Intel Atom
Z5xx, Z6xx, N2xx, 2xx, 3xx, N4xx, D4xx, D5xx, N5xx, D2xxx, N2xxx
Diamondville, Pineview, Silverthorne, Lincroft, Cedarview, Medfield,
800 MHz – 2.13 GHz
Socket PBGA437, Socket PBGA441, Socket micro-FCBGA8 559
32 nm, 45 nm
Single,Double
400 MHz, 533 MHz, 667 MHz, 2.5 GT/s
512 KiB – 1 MiB
Intel Celeron
3xx, 4xx, 5xx
Banias, Cedar Mill, Conroe, Coppermine, Covington, Dothan, Mendocino, Northwood, Prescott, Tualatin, Willamette, Yonah
Klamath, Deschutes, Tonga, , Coppermine, Tualatin
Allendale, Cascades, Clovertown, Conroe, Cranford, Dempsey, Drake, Dunnington, Foster, Gainestown, Gallatin, Harpertown, Irwindale, Kentsfield, Nocona, Paxville, Potomac, Prestonia, Sossaman, Tanner, Tigerton, Tulsa, Wolfdale, Woodcrest
Mill,Northwood, Prescott, Willamette
266 MHz – 3.6 GHz
Slot 1,Socket 370, Socket 478, Socket 479,Socket 495, LGA 775,Socket
M, Socket T
45 nm, 65 nm, 90 nm, 130 nm, 180 nm, 250 nm
5.5 W – 86 W
Single,Double
66 MHz, 100 MHz, 133 MHz, 400 MHz, 533 MHz, 800 MHz
0 KiB – 1 MiB
Intel Pentium Intel Pentium
150 MHz – 200 MHz 233 MHz – 450 MHz
Slot 1, MMC-1, MMC-2, Mini-Cartridge
350 nm, 500 nm 250 nm, 350 nm
29.2W-47W 16.8 W – 38.2 W
Single Single
60 MHz, 66 MHz 66 MHz, 100 MHz
256 KiB, 512 KiB, 1024 KiB 256 KiB – 512 KiB
Intel Pentium
450 MHz – 1.4 GHz
Slot 1, Socket 370
130 nm, 180 nm, 250 nm
100 MHz, 133 MHz
256 KiB – 512 KiB
Intel Xeon
n3xxx, n5xxx, n7xxx
400 MHz – 4.4 GHz
Slot 2,Socket 603, Socket 604, Socket J, Socket
T, Socket B LGA 1156,LGA 1366
45 nm, 65 nm, 90 nm, 130 nm, 180 nm, 250 nm
16 W – 165 W
Single,Double, Quad, Hexa, Octa
100 MHz, 133 MHz,
400 MHz, 533 MHz,
667 MHz, 800 MHz,
1066 MHz, 1333 MHz, 1600 MHz, 4.8 GT/s, 5.86 GT/s, 6.4 GT/s
256 KiB – 12 MiB
4 MiB – 16 MiB
1.3 GHz – 3.8 GHz
Socket 423,Socket 478, LGA 775,Socket T
65 nm, 90 nm, 130 nm, 180 nm
21 W – 115 W
400 MHz, 533 MHz, 800 MHz, 1066 MHz
256 KiB – 2 MiB
Pentium 4 Extreme Edition
Gallatin,Prescott 2M
3.2 GHz – 3.73 GHz 800 MHz – 2.266 GHz 2.66 GHz – 3.73 GHz
Socket 478,Socket T
90 nm, 130 nm 90 nm, 130 nm 65 nm, 90 nm
92W-115W 5.5W-27W 95W-130W
Single Single Double
800 MHz, 1066 MHz
512 KiB – 1 MiB
0 KiB – 2 MiB –
Pentium D/EE
Intel Pentium Dual-Core
Smithfield,Presler
2×1 MiB – 2×2 MiB
Intel Pentium New Intel Core
Penryn,Wolfdale,
1.2 GHz – 3.33 GHz
Socket 775,Socket
P, Socket T,LGA 1156, LGA 1155,
32nm,45nm,65nm
Single,Double
800 MHz, 1066 MHz, 2.5GT/s, 5 GT/s
2×256 KiB – 2 MiB
0 KiB – 3 MiB
Intel Core Intel Core Intel Core
M, Socket P,Socket J, Socket T
Intel Core
i7-6xx, i7-7xx, i7-8xx, i7-9xx, i7-2xxx, i7-37xx, i7-38xx, i7-47xx
XM, Lynnfield, ,
-E, , Haswell
1.6 GHz – 3.6 GHz
LGA 1156,LGA 1155, LGA 1366,LGA 2011
22nm,32nm,45nm
4.8 GT/s, 6.4 GT/s
6MiB-10MiB
Intel Core
i7-970, i7-980, i7-980x, i7-990x, i7-39xx, i7-38xx
Gulftown, -E Code Name
3.2 GHz – 3.46 GHz Clock Rate
LGA 1366,LGA 2011 Socket
32 nm Fabrication
Number of Cores
6.4 GT/s Bus Speed
6×256 KiB L2 Cache
12MiB-15MiB L3 Cache
Series Nomenclature
P55C, Tillamook
60 MHz – 66 MHz
Banias,Dothan
Socket 479
400 MHz, 533 MHz 533 MHz, 800 MHz, 1066 MHz
1 MiB – 2 MiB
E2xxx, E3xxx, E5xxx,
T2xxx, T3xxx
E5xxx, E6xxx, T4xxx,
SU2xxx, SU4xxx, G69xx,
P6xxx, U5xxx, G6xx, G8xx, , , B9xx
Txxxx, Lxxxx, , Lxxxx, Exxxx, Txxxx, P7xxx, Xxxxx, Qxxxx, Q, Conroe, Merom, Penryn, Kentsfield, Wolfdale, Yorkfield
1.06 GHz – 2.33 GHz 1.06 GHz – 3.33 GHz
Socket 775,Socket
65 nm 45nm,65nm
5.5W-49W 5.5W-150W
Single,Double Single,Double, Quad
533 MHz, 667 MHz 533 MHz, 667 MHz, 800 MHz, 1066 MHz, 1333 MHz, 1600 MHz
1 MiB – 12 MiB
i3-xxx, i3-2xxx, i3-3xxx
Arrandale, Clarkdale, , Arrandale, Clarkdale, Clarksfield, Lynnfield, , Bloomfield, Nehalem, Clarksfield, Clarksfield
2.4 GHz – 3.4 GHz 1.06 GHz – 3.46 GHz
LGA 1156, LGA 1155 LGA 1156, LGA 1155
22nm,32nm 22nm,32nm,45nm
35W-73W 17W-95W
Double Double,Quad
1066 MHz, 1600 MHz, 2.5 – 5 GT/s
256 KiB 256 KiB
3MiB-4MiB 4MiB-8MiB
i5-7xx, i5-6xx, i5-2xxx, i5-3xxx
2.5 – 5 GT/s
Allendale, Penryn, Wolfdale, Yonah
1.6 GHz – 2.93 GHz
Socket 775,Socket M, Socket P,Socket T
533 MHz, 667 MHz, 800 MHz, 1066 MHz
1 MiB – 2 MiB
Processor Transistor Count
(from http://en.wikipedia.org/wiki/Transistor_count)
Processor                   Transistor count   Date of introduction   Manufacturer
Intel 4004                  2 300              1971                   Intel
Intel 8008                  2 500              1972                   Intel
Intel 8080                  4 500              1974                   Intel
Intel 8088                  29 000             1978                   Intel
Intel 80286                 134 000            1982                   Intel
Intel 80386                 275 000            1985                   Intel
Intel 80486                 1 200 000          1989                   Intel
Pentium                     3 100 000          1993                   Intel
AMD K5                      4 300 000          1996                   AMD
Pentium II                  7 500 000          1997                   Intel
AMD K6                      8 800 000          1997                   AMD
Pentium III                 9 500 000          1999                   Intel
AMD K6-III                  21 300 000         1999                   AMD
AMD K7                      22 000 000         1999                   AMD
Pentium 4                   42 000 000         2000                   Intel
Itanium                     25 000 000                                Intel
Barton                      54 300 000                                AMD
AMD K8                      105 900 000                               AMD
Itanium 2                   220 000 000                               Intel
Itanium 2 with 9MB cache    592 000 000                               Intel
Cell                        241 000 000                               Sony/IBM/Toshiba
Core 2 Duo                  291 000 000                               Intel
Core 2 Quad                 582 000 000                               Intel
Dual-Core Itanium 2         1 700 000 000                             Intel
Moore’s Law
Exponential growth: transistor count doubles roughly every 18 months (Gordon Moore, Intel co-founder)
Figure annotations: 10 μm process, 13.5 mm2 die; 0.09 μm process, 596 mm2 die; 42 million and 1.7 billion transistors
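The doubling rule can be turned into a quick projection. The following is a minimal sketch, assuming the 18-month doubling period quoted on the slide and taking two data points from the transistor count table (Intel 4004: 2,300 transistors in 1971; Pentium 4: 42,000,000 in 2000). The function names are hypothetical and chosen only for illustration.

```python
import math


def projected_transistors(start_count, start_year, year, doubling_months=18.0):
    """Project a transistor count assuming a fixed doubling period."""
    elapsed_months = (year - start_year) * 12
    return start_count * 2 ** (elapsed_months / doubling_months)


def implied_doubling_months(count_a, year_a, count_b, year_b):
    """Doubling period (in months) implied by two observed data points."""
    doublings = math.log2(count_b / count_a)
    return (year_b - year_a) * 12 / doublings


if __name__ == "__main__":
    # What strict 18-month doubling would predict for 2000, starting from the 4004.
    print(projected_transistors(2300, 1971, 2000))
    # What the two observed points actually imply.
    print(implied_doubling_months(2300, 1971, 42_000_000, 2000))
```

For these two data points the implied doubling period comes out closer to two years than to 18 months, which is why the rule is usually quoted as a range.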
Memory Capacity (Single Chip DRAM)
Moore’s Law for Memory: capacity increases by 4x every 3 years
Year   Size (Mb)   Cycle time
1980   0.0625      250 ns
1983   0.25        220 ns
1986   1           190 ns
1989   4           165 ns
1992   16          145 ns
1996   64          120 ns
2000   256         100 ns
2007   2 Gb        52 ns
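The "4x every 3 years" rule can be checked against the capacity figures listed above. This is a rough sketch, not part of the original slides; it assumes the 2007 entry "2G" means 2048 Mb, and the helper names are hypothetical.

```python
import math

# (year, capacity in Mb) pairs from the slide; 2 Gb is taken as 2048 Mb (assumption).
DRAM_CAPACITY_MB = [
    (1980, 0.0625), (1983, 0.25), (1986, 1), (1989, 4),
    (1992, 16), (1996, 64), (2000, 256), (2007, 2048),
]


def annual_growth(factor, years):
    """Per-year growth factor equivalent to `factor` growth over `years` years."""
    return factor ** (1.0 / years)


def years_per_quadrupling(cap_a, year_a, cap_b, year_b):
    """Average number of years needed for capacity to grow 4x between two points."""
    quadruplings = math.log(cap_b / cap_a, 4)
    return (year_b - year_a) / quadruplings


if __name__ == "__main__":
    print(annual_growth(4, 3))  # yearly factor implied by "4x every 3 years"
    (y0, c0), (y1, c1) = DRAM_CAPACITY_MB[0], DRAM_CAPACITY_MB[4]
    print(years_per_quadrupling(c0, y0, c1, y1))  # 1980 to 1992
    (y0, c0), (y1, c1) = DRAM_CAPACITY_MB[4], DRAM_CAPACITY_MB[-1]
    print(years_per_quadrupling(c0, y0, c1, y1))  # 1992 to 2007
```

The first four generations match the rule exactly; after 1992 the figures show the quadrupling interval stretching out.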
Trends in Technology
Technology trends have closely followed Moore’s Law: “Transistor density of chips doubles every 1.5-2.0 years”
As a consequence of Moore’s Law:
Processor speed doubles every 1.5-2.0 years
DRAM size doubles every 1.5-2.0 years
These figures are targets that the computer industry aims for.
Rapid Improvements
Understand the unprecedented innovations in computers!
What has been making computers faster and cheaper? And how can the innovation continue?
50 years of non-stop innovation – a technological miracle!
                     IBM 7030 (Stretch), 1961   Pentium 4 Desktop, 2000
Capacity (memory)                               1G (typical)
Speed (CPU)          1.2 MIPS                   1000 MIPS
Price                US$13,500,000
Automobile and Computer
What would cars be like if they had improved the way computers have?
                        A good car in the 1960s   A car in the 2000s, had it developed like computers
Price                   US$5,000                  US$0.29
Capacity (passengers)                             39,062
Speed (km/h)                                      83,333
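The speed figure in the analogy can be reproduced by applying the CPU speed-up (1.2 MIPS to 1000 MIPS) to a car. The sketch below assumes a baseline of 100 km/h for the 1960s car. That baseline is not stated in the slides and is chosen here only because it reproduces the 83,333 km/h figure. The function name is hypothetical.

```python
def scale_like_computers(baseline, old_value, new_value):
    """Scale a baseline quantity by the same ratio as new_value / old_value."""
    return baseline * new_value / old_value


# Assumed baseline, not given in the slides.
CAR_SPEED_KMH_1960S = 100

# CPU speeds from the slide: IBM 7030 (1.2 MIPS) vs. Pentium 4 desktop (1000 MIPS).
scaled_speed = scale_like_computers(CAR_SPEED_KMH_1960S, 1.2, 1000)

if __name__ == "__main__":
    print(round(scaled_speed))  # speed in km/h of the "computer-like" car
```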
How to Make Computers Faster?
Option 1: Increase the clock rate or main
Today’s mainstream: 3 GHz
Option 2: Increase the logic density (number of gates on a chip)
Today’s mainstream: 14 nm process (2015)
In comparison: 10 microns (1971)
Always useful?
Single Processor Performance
HW Technology
HW Technology + Arc. Org. ideas
Power issue + ILP run out
Move to multi-processor
Growth in Clock Rate
Effects of Dramatic Growth
1) It has significantly enhanced the capability available to computer users
2) The dramatic improvement in cost-performance leads to new classes of computers
Personal computers/workstations emerged in the 1980s
Smart cell phones and tablets
Warehouse-scale computers
3) We see the dominance of microprocessor-based computers across the entire range of computer design
Minicomputers, made from off-the-shelf logic or gate arrays, have disappeared
Even mainframes and supercomputers are made from microprocessors
4) Impact on software development
Trade performance (C/C++) for productivity (Java, C#)
Current Trends in Architecture
Historic switch in 2003: from uniprocessor to multiple processors per chip
Rapid improvement in single-processor performance ended in 2003
Walls encountered
Power issue
Little instruction-level parallelism (ILP) left to exploit
This signals the change from relying solely on instruction-level parallelism to exploiting more coarse-grained parallelism
Data-level parallelism (DLP)
Thread-level parallelism (TLP)
Request-level parallelism (RLP)
These require explicit restructuring of the applications
Agenda
Introduction
1.2 Classes of Computers
Defining Computer Architecture
Trends in Technology
Trends in Power and Energy in ICs
Trends in Cost
Dependability
Measuring Performance
Quantitative Principles
Five Classes of Computers
1) Personal Mobile Device (PMD), e.g., smart phones, tablet computers
Emphasis on energy efficiency and real-time performance
2) Desktop Computing
Emphasis on price-performance
3) Servers
Emphasis on availability, scalability, throughput
4) Clusters / Warehouse-Scale Computers
Used for “Software as a Service (SaaS)”
Emphasis on availability and price-performance
5) Embedded Computers
Emphasis on price
System Characteristics
Personal Mobile Device (PMD)
PMD: Collection of wireless devices with multimedia user interfaces
Energy efficiency is critical
Most of the devices are battery-powered
There is no fan for cooling the processor
Flash memory is used instead of disks
Energy and size requirements
Responsiveness or real-time performance
Hard real time
Soft real time
Desktop Computing
The largest market in dollar terms. How about now?
The metric is price-performance: a combination of both price and performance
Compute performance
Graphics performance
Servers
Servers are the backbone of large-scale enterprise computing
Availability: the percentage of time that a server is operational
Scalability: the ability to scale up to meet increasing demand for services
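Availability as defined above is a simple ratio, and it translates directly into expected downtime per year. The following is a small sketch under that definition; the function names are hypothetical and the year is taken as 365 days.

```python
HOURS_PER_YEAR = 365 * 24


def availability(uptime_hours, downtime_hours):
    """Fraction of total time during which the server is operational."""
    return uptime_hours / (uptime_hours + downtime_hours)


def annual_downtime_hours(avail):
    """Expected hours of downtime per year for a given availability (0..1)."""
    return (1.0 - avail) * HOURS_PER_YEAR


if __name__ == "__main__":
    for a in (0.99, 0.999, 0.9999):
        print(f"{a:.2%} available -> {annual_downtime_hours(a):.2f} h down per year")
```

Each additional "nine" of availability cuts the yearly downtime by a factor of ten, which is why availability dominates server design when an hour of downtime is expensive.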
Cost of Downtime
Clusters/Warehouse-Scale Computers (WSC)
Software as a Service (SaaS): search, social networking, video sharing
Cluster: a collection of desktop computers or servers connected by local area networks, each running its own OS
Energy: 80% of the cost of a warehouse is associated with power and cooling of the computers
Availability
Throughput: the amount of work done per unit of time
Supercomputers vs. WSC
Supercomputer: a large pool of processors
Both supercomputers and WSCs are expensive
Differences between supercomputers and WSCs
Supercomputers run large, communication-intensive applications and thus require faster internal networks
Supercomputers emphasize floating-point performance
WSCs emphasize interactive applications, large-scale storage, dependability and high bandwidth
Embedded Computers
They can be found in almost every machine: washing machines, microwaves, and printers
Can be 8-, 16-, or 32-bit
An embedded computer is usually application-specific
It does not run a varied, changing set of applications
Two Kinds of Parallelism in Applications
Data-Level Parallelism (DLP)
• Many data items can be operated on at the same time
Task-Level Parallelism (TLP)
• Tasks of work are created that can operate independently and largely in parallel
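The difference between the two kinds of parallelism shows up in how a program is structured. The sketch below is illustrative only: it uses Python's standard `concurrent.futures` thread pool, and the helper functions (`square`, `count_words`, `longest_word`) are hypothetical examples. In CPython, threads do not speed up CPU-bound work because of the global interpreter lock, so this shows the program structure rather than a performance gain.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    return x * x


def data_parallel_squares(values):
    """DLP: the same operation is applied to many data items at once."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(square, values))


def count_words(text):
    return len(text.split())


def longest_word(text):
    return max(text.split(), key=len)


def task_parallel_stats(text):
    """TLP: different, independent tasks run side by side."""
    with ThreadPoolExecutor() as pool:
        words = pool.submit(count_words, text)
        longest = pool.submit(longest_word, text)
        return words.result(), longest.result()


if __name__ == "__main__":
    print(data_parallel_squares([1, 2, 3, 4]))
    print(task_parallel_stats("data and task level parallelism"))
```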
Four Major Ways for Exploiting Parallelism
Instruction-Level Parallelism (ILP) – e.g., Pipelining
• Data-level parallelism
Vector architectures / Graphics Processing Units (GPUs)
• Data-level parallelism
Thread-Level Parallelism – e.g., Multicore
• Data-level parallelism or task-level parallelism
Request-Level Parallelism – e.g., Clusters
• Data-level parallelism or task-level parallelism
Flynn’s Taxonomy (Michael J. Flynn)
Single instruction stream, single data stream (SISD)
Single instruction stream, multiple data streams (SIMD)
• Vector architectures
• Multimedia extensions
• Graphics processor units (GPU)
Multiple instruction streams, single data stream (MISD)
• No commercial implementation
Multiple instruction streams, multiple data streams (MIMD)
• Tightly-coupled MIMD
• Loosely-coupled MIMD
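The taxonomy is a two-by-two grid over the number of instruction streams and data streams. It can be expressed as a tiny lookup, sketched here with a hypothetical function name.

```python
def flynn_class(instruction_streams, data_streams):
    """Return the Flynn category for the given numbers of streams."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("stream counts must be at least 1")
    instr = "S" if instruction_streams == 1 else "M"
    data = "S" if data_streams == 1 else "M"
    return f"{instr}I{data}D"


if __name__ == "__main__":
    print(flynn_class(1, 1))   # uniprocessor
    print(flynn_class(1, 16))  # one instruction stream over 16 data streams
    print(flynn_class(8, 8))   # eight independent instruction and data streams
```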