CS2305: Computer Architecture
Fundamentals of Computer Design
(Computer Architecture: Chapter 1 )

Department of Computer Science and Engineering

Fundamentals Introduction Agenda
 1.1 Introduction
 Classes of Computers
 Defining Computer Architecture
 Trends in Technology
 Trends in Power and Energy in ICs
 Trends in Cost
 Dependability
 Measuring Performance
 Quantitative Principles

Fundamentals
Evolution of Processors

Fundamentals
x86 Manufacturers
(In the past)
 Transmeta (discontinued its x86 line)
 Rise Technology (acquired by SiS)
 IDT (Centaur Technology x86 division acquired by VIA)
 National Semiconductor (sold the x86 PC designs to VIA and later the
x86 embedded designs to AMD)
 Cyrix (acquired by National Semiconductor)
 NexGen (acquired by AMD)
 Chips and Technologies (acquired by Intel)
 IBM (discontinued its own x86 line)
 UMC (discontinued its x86 line)
 NEC (discontinued its x86 line)

Fundamentals
Intel 4004 Die Photo
Introduced in 1971
First microprocessor
 2,300 transistors
12 mm² (die size), 108 kHz

Fundamentals
Intel 8086 Die Scan
Introduced in 1978
Basis of the x86 (later IA-32) PC architecture
 29,000 transistors
33 mm², 5 MHz

Fundamentals
Intel 80486 Die Scan
Introduced in 1989
1st pipelined implementation of IA32
 1,200,000 transistors
81 mm2 25 MHz

Fundamentals
Pentium Die Photo
Introduced in 1993
1st superscalar implementation of IA32
 3,100,000 transistors
296 mm2 60 MHz

Fundamentals
Pentium III
Introduced in 1999
 9,500,000 transistors
125 mm2 450 MHz

Fundamentals
Pentium IV and Duo
Intel P4: 55M transistors (2001)
Intel Core 2 Extreme Quad-core: 2 × 291M transistors (2006)
Intel Itanium: 221M transistors

Fundamentals
Dual-Core Itanium 2 (Montecito)
 1.72 B Transistors
 2 GHz frequency

Fundamentals
List of Intel Microprocessors
 1 The 4-bit processors
   1.1 Intel 4004
   1.2 Intel 4040
 2 The 8-bit processors
   2.1 8008
   2.2 8080
   2.3 8085
 3 Microcontrollers
   3.1 Intel 8048
   3.2 Intel 8051
   3.3 Intel 80151
   3.4 Intel 80251
   3.5 MCS-96 Family
 4 The bit-slice processor
   4.1 3000 Family
 5 The 16-bit processors: MCS-86 family
   5.1 8086
   5.2 8088
   5.3 80186
   5.4 80188
   5.5 80286
 6 32-bit processors: the non-x86 microprocessors
   6.1 iAPX 432
   6.2 i960 aka 80960
   6.3 i860 aka 80860
   6.4 XScale
 7 32-bit processors: the 80386 range
   7.1 80386DX
   7.2 80386SX
   7.3 80376
   7.4 80386SL
   7.5 80386EX
 8 32-bit processors: the 80486 range
   8.1 80486DX
   8.2 80486SX
   8.3 80486DX2
   8.4 80486SL
   8.5 80486DX4
 9 32-bit processors: P5 microarchitecture
   9.1 Original Pentium
   9.2 Pentium with MMX Technology
 10 32-bit processors: P6/Pentium M microarchitecture
   10.1 Pentium Pro
   10.2 Pentium II
   10.3 Celeron (Pentium II-based)
   10.4 Pentium III
   10.5 Pentium II and III Xeon
   10.6 Celeron (Pentium III Coppermine-based)
   10.7 Celeron (Pentium III Tualatin-based)
   10.8 Pentium M
   10.9 Celeron M
   10.10 Intel Core
   10.11 Dual-Core Xeon LV
 11 32-bit processors: NetBurst microarchitecture
   11.1 Pentium 4
   11.2 Xeon
   11.3 Mobile Pentium 4-M
   11.4 Pentium 4 EE
   11.5 Pentium 4E
   11.6 Pentium 4F
 12 64-bit processors: IA-64
   12.1 Itanium
   12.2 Itanium 2
 13 64-bit processors: Intel 64 – NetBurst microarchitecture
   13.1 Pentium 4F
   13.2 Pentium D
   13.3 Pentium Extreme Edition
   13.4 Xeon
 14 64-bit processors: Intel 64 – Core microarchitecture
   14.1 Xeon
   14.2 Intel Core 2
   14.3 Pentium Dual-Core
   14.4 Celeron
   14.5 Celeron M
 15 64-bit processors: Intel 64 – Nehalem microarchitecture
   15.1 Intel Pentium
   15.2 Core i3
   15.3 Core i5
   15.4 Core i7
   15.5 Xeon
 16 64-bit processors: Intel 64 – / microarchitecture
   16.1 Celeron
   16.2 Pentium
   16.3 Core i3
   16.4 Core i5
   16.5 Core i7

Fundamentals
Comparison of Intel Processors
Columns: Series Nomenclature, Code Name, Clock Rate, Socket, Fabrication, Power, Number of Cores, Bus Speed, L2 Cache, L3 Cache

Intel Atom
  Series Nomenclature: Z5xx, Z6xx, N2xx, 2xx, 3xx, N4xx, D4xx, D5xx, N5xx, D2xxx, N2xxx
  Code Name: Diamondville, Pineview, Silverthorne, Lincroft, Cedarview, Medfield
  Clock Rate: 800 MHz – 2.13 GHz
  Socket: Socket PBGA437, Socket PBGA441, Socket micro-FCBGA8 559
  Fabrication: 32 nm, 45 nm
  Number of Cores: Single, Double
  Bus Speed: 400 MHz, 533 MHz, 667 MHz, 2.5 GT/s
  L2 Cache: 512 KiB – 1 MiB

Intel Celeron
  Series Nomenclature: 3xx, 4xx, 5xx
  Code Name: Banias, Cedar Mill, Conroe, Coppermine, Covington, Dothan, Mendocino, Northwood, Prescott, Tualatin, Willamette, Yonah
  Clock Rate: 266 MHz – 3.6 GHz
  Socket: Slot 1, Socket 370, Socket 478, Socket 479, Socket 495, LGA 775, Socket M, Socket T
  Fabrication: 45 nm, 65 nm, 90 nm, 130 nm, 180 nm, 250 nm
  Power: 5.5 W – 86 W
  Number of Cores: Single, Double
  Bus Speed: 66 MHz, 100 MHz, 133 MHz, 400 MHz, 533 MHz, 800 MHz
  L2 Cache: 0 KiB – 1 MiB

Intel Xeon
  Series Nomenclature: n3xxx, n5xxx, n7xxx
  Code Name: Allendale, Cascades, Clovertown, Conroe, Cranford, Dempsey, Drake, Dunnington, Foster, Gainestown, Gallatin, Harpertown, Irwindale, Kentsfield, Nocona, Paxville, Potomac, Prestonia, Sossaman, Tanner, Tigerton, Tulsa, Wolfdale, Woodcrest
  Clock Rate: 400 MHz – 4.4 GHz
  Socket: Slot 2, Socket 603, Socket 604, Socket J, Socket T, Socket B, LGA 1156, LGA 1366
  Fabrication: 45 nm, 65 nm, 90 nm, 130 nm, 180 nm, 250 nm
  Power: 16 W – 165 W
  Number of Cores: Single, Double, Quad, Hexa, Octa
  Bus Speed: 100 MHz, 133 MHz, 400 MHz, 533 MHz, 667 MHz, 800 MHz, 1066 MHz, 1333 MHz, 1600 MHz, 4.8 GT/s, 5.86 GT/s, 6.4 GT/s
  L2 Cache: 256 KiB – 12 MiB
  L3 Cache: 4 MiB – 16 MiB

Pentium 4 Extreme Edition
  Code Name: Gallatin, Prescott 2M
  Clock Rate: 3.2 GHz – 3.73 GHz
  Socket: Socket 478, Socket T
  Fabrication: 90 nm, 130 nm
  Power: 92 W – 115 W
  Number of Cores: Single
  Bus Speed: 800 MHz, 1066 MHz
  L2 Cache: 512 KiB – 1 MiB
  L3 Cache: 0 KiB – 2 MiB

Pentium D/EE
  Code Name: Smithfield, Presler
  Clock Rate: 2.66 GHz – 3.73 GHz
  Fabrication: 65 nm, 90 nm
  Power: 95 W – 130 W
  Number of Cores: Double
  Bus Speed: 533 MHz, 800 MHz, 1066 MHz
  L2 Cache: 2×1 MiB – 2×2 MiB

Intel Pentium Dual-Core
  Series Nomenclature: E2xxx, E3xxx, E5xxx, T2xxx, T3xxx
  Code Name: Allendale, Penryn, Wolfdale, Yonah
  Clock Rate: 1.6 GHz – 2.93 GHz
  Socket: Socket 775, Socket M, Socket P, Socket T
  Bus Speed: 533 MHz, 667 MHz, 800 MHz, 1066 MHz
  L2 Cache: 1 MiB – 2 MiB

Intel Core i3
  Series Nomenclature: i3-xxx, i3-2xxx, i3-3xxx
  Code Name: Arrandale, Clarkdale
  Clock Rate: 2.4 GHz – 3.4 GHz
  Socket: LGA 1156, LGA 1155
  Fabrication: 22 nm, 32 nm
  Power: 35 W – 73 W
  Number of Cores: Double
  Bus Speed: 1066 MHz, 1600 MHz, 2.5 – 5 GT/s
  L2 Cache: 256 KiB
  L3 Cache: 3 MiB – 4 MiB

Intel Core i5
  Series Nomenclature: i5-7xx, i5-6xx, i5-2xxx, i5-3xxx
  Code Name: Arrandale, Clarkdale, Clarksfield, Lynnfield
  Clock Rate: 1.06 GHz – 3.46 GHz
  Socket: LGA 1156, LGA 1155
  Fabrication: 22 nm, 32 nm, 45 nm
  Power: 17 W – 95 W
  Number of Cores: Double, Quad
  Bus Speed: 2.5 – 5 GT/s
  L2 Cache: 256 KiB
  L3 Cache: 4 MiB – 8 MiB

Intel Core i7
  Series Nomenclature: i7-6xx, i7-7xx, i7-8xx, i7-9xx, i7-2xxx, i7-37xx, i7-38xx, i7-47xx
  Code Name: Lynnfield, Haswell
  Clock Rate: 1.6 GHz – 3.6 GHz
  Socket: LGA 1156, LGA 1155, LGA 1366, LGA 2011
  Fabrication: 22 nm, 32 nm, 45 nm
  Bus Speed: 4.8 GT/s, 6.4 GT/s
  L3 Cache: 6 MiB – 10 MiB

Intel Core i7 (six-core)
  Series Nomenclature: i7-970, i7-980, i7-980x, i7-990x, i7-39xx, i7-38xx
  Code Name: Gulftown
  Clock Rate: 3.2 GHz – 3.46 GHz
  Socket: LGA 1366, LGA 2011
  Fabrication: 32 nm
  Bus Speed: 6.4 GT/s
  L2 Cache: 6×256 KiB
  L3 Cache: 12 MiB – 15 MiB

[Remaining rows of the table: the other Intel Pentium, Intel Core, and Intel Core 2 families, described by the same columns]

Fundamentals
Processor Transistor Count
(from http://en.wikipedia.org/wiki/Transistor_count)
Processor                    Transistor count    Date of introduction    Manufacturer
Intel 4004                   2 300               1971                    Intel
Intel 8008                   2 500               1972                    Intel
Intel 8080                   4 500               1974                    Intel
Intel 8088                   29 000              1978                    Intel
Intel 80286                  134 000             1982                    Intel
Intel 80386                  275 000             1985                    Intel
Intel 80486                  1 200 000           1989                    Intel
Pentium                      3 100 000           1993                    Intel
AMD K5                       4 300 000           1996                    AMD
Pentium II                   7 500 000           1997                    Intel
AMD K6                       8 800 000           1997                    AMD
Pentium III                  9 500 000           1999                    Intel
AMD K6-III                   21 300 000          1999                    AMD
AMD K7                       22 000 000          1999                    AMD
Pentium 4                    42 000 000          2000                    Intel
Itanium 2 with 9MB cache     592 000 000                                 Intel
Core 2 Duo                   291 000 000                                 Intel
Core 2 Quad                  582 000 000                                 Intel
Dual-Core Itanium 2          1 700 000 000                               Intel
Further transistor counts listed without a surviving processor name: 25 000 000; 54 300 000; 105 900 000; 220 000 000; 241 000 000 (Sony/IBM/Toshiba)

Fundamentals
Moore’s Law
Exponential growth
Transistor count will be doubled every 18 months
 Attributed to an Intel co-founder
[Figure: transistor count over time, with data points labeled 10 μm, 13.5 mm²; 42 million; and 0.09 μm, 596 mm², 1.7 billion]
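
The 18-month claim can be checked against the transistor counts quoted in these slides. The sketch below is only a back-of-the-envelope calculation: the function name is made up for illustration, and the 2006 date for the 1.7-billion-transistor chip is an assumption rather than something the slide states.

```python
import math

def doubling_period_years(count_start, count_end, years_elapsed):
    """Years per doubling implied by growth from count_start to count_end."""
    doublings = math.log2(count_end / count_start)
    return years_elapsed / doublings

# 2,300 transistors (Intel 4004, 1971) to about 1.7 billion
# (Dual-Core Itanium 2; the year 2006 is an assumption).
period = doubling_period_years(2_300, 1.7e9, 2006 - 1971)
print(f"Implied doubling period: {period:.2f} years ({period * 12:.0f} months)")
```

Under these assumptions the implied doubling period comes out somewhat longer than 18 months, closer to the upper end of the "1.5 to 2.0 years" range quoted later in the deck.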

Fundamentals
Memory Capacity (Single Chip DRAM)
Moore’s Law for Memory: capacity increases by 4x every 3 years
Year    Size (Mb)       Cycle time
1980    0.0625          250 ns
1983    0.25            220 ns
1986    1               190 ns
1989    4               165 ns
1992    16              145 ns
1996    64              120 ns
2000    256             100 ns
2007    2048 (2 Gb)     52 ns
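
To see how closely the DRAM figures on this slide follow the "4x every 3 years" rule, the following sketch computes the average time per quadrupling between two entries. The list and function names are illustrative only, and 2G is taken to mean 2048 Mb.

```python
import math

# (year, capacity in Mb) pairs from this slide; 2 Gb is written as 2048 Mb.
DRAM_CAPACITY_MB = [
    (1980, 0.0625), (1983, 0.25), (1986, 1), (1989, 4),
    (1992, 16), (1996, 64), (2000, 256), (2007, 2048),
]

def years_per_4x(first, last):
    """Average number of years per 4x capacity increase between two entries."""
    (year0, cap0), (year1, cap1) = first, last
    quadruplings = math.log2(cap1 / cap0) / 2  # log base 4
    return (year1 - year0) / quadruplings

early = years_per_4x(DRAM_CAPACITY_MB[0], DRAM_CAPACITY_MB[4])    # 1980 to 1992
overall = years_per_4x(DRAM_CAPACITY_MB[0], DRAM_CAPACITY_MB[-1])  # 1980 to 2007
print(f"1980-1992: {early:.1f} years per 4x")
print(f"1980-2007: {overall:.1f} years per 4x")
```

With these numbers the rule holds exactly through 1992 and the pace slows afterwards, which is what the longer gaps between the later table entries suggest.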

Fundamentals
Trends in Technology
Trends in technology have closely followed Moore’s Law: “Transistor density of chips doubles every 1.5-2.0 years”
As a consequence of Moore’s Law:
Processor speed doubles every 1.5-2.0 years
DRAM size doubles every 1.5-2.0 years
These constitute a target that the computer industry aims for.
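
A doubling period is easier to compare with other growth figures once it is converted into a yearly rate. The following is a minimal sketch of that conversion; the function name is hypothetical.

```python
def annual_growth_factor(doubling_period_years):
    """Multiplicative growth per year for a quantity that doubles every given period."""
    return 2 ** (1 / doubling_period_years)

for period in (1.5, 2.0):
    factor = annual_growth_factor(period)
    print(f"Doubling every {period} years is about {100 * (factor - 1):.0f}% growth per year")
```

So "doubling every 1.5-2.0 years" corresponds to roughly 41% to 59% growth per year.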

Fundamentals
Rapid Improvement

Fundamentals
Rapid Improvements
Understand the unprecedented innovations in computers!
What has been making computers faster and cheaper? And how can the innovations continue?
 50 years of non-stop innovation – a technological miracle!
                     IBM 7030 (Stretch), 1961    Pentium 4 Desktop, 2000
Capacity (memory)                                1G (typical)
Speed (CPU)          1.2 MIPS                    1000 MIPS
Price                US$13,500,000

Fundamentals
Automobile and Computer
What would cars be like if they had improved the way computers did?
                         A good car in the 1960’s    A car in the 2000’s, had it developed like computers
Price                    US$5,000                    US$0.29
Capacity (passengers)                                39062
Speed (in km/hour)                                   83333
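
The analogy applies the computer's improvement ratio to the car. As a rough sketch, scaling a car's speed by the CPU speed ratio from the previous slide (1.2 MIPS to 1000 MIPS) looks like this. The 100 km/h starting speed is an assumption chosen for illustration, since the slide does not give the 1960s car's speed, and the function name is hypothetical.

```python
STRETCH_MIPS = 1.2       # IBM 7030 (Stretch), 1961, from the previous slide
PENTIUM4_MIPS = 1000.0   # Pentium 4 desktop, 2000, from the previous slide

def scale_like_computers(car_value, old_computer_value, new_computer_value):
    """Scale a car attribute by the same ratio by which the computer improved."""
    return car_value * (new_computer_value / old_computer_value)

ASSUMED_CAR_SPEED_KMH = 100.0  # assumption: not stated in the slides
speed = scale_like_computers(ASSUMED_CAR_SPEED_KMH, STRETCH_MIPS, PENTIUM4_MIPS)
print(f"{speed:,.0f} km/h")
```

With the assumed 100 km/h starting point, the result matches the 83333 km/h figure on the slide.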

Fundamentals
How to Make Computers Faster?
Option 1: To increase the clock rate or main
Today’s mainstream: 3 GHz
Option 2: To increase the logic density (number of gates in a chip)
Today’s mainstream: 14 nm processor (2015)
In comparison, 10 microns (1971)
Always useful?
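
The two options can be put into numbers using figures quoted in these slides: 108 kHz (Intel 4004) against 3 GHz for clock rate, and 10 microns against 14 nm for feature size. The sketch below makes an idealized assumption that logic density grows with the inverse square of the feature size; real processes do not scale this cleanly, and the function names are illustrative.

```python
def linear_shrink(old_feature_nm, new_feature_nm):
    """How many times smaller the new feature size is."""
    return old_feature_nm / new_feature_nm

def ideal_density_gain(old_feature_nm, new_feature_nm):
    """Idealized density gain, assuming area per gate scales with feature size squared."""
    return linear_shrink(old_feature_nm, new_feature_nm) ** 2

clock_gain = 3e9 / 108e3                 # 108 kHz (Intel 4004) to 3 GHz
shrink = linear_shrink(10_000, 14)       # 10 microns = 10,000 nm, down to 14 nm
density = ideal_density_gain(10_000, 14)

print(f"Clock rate gain:     about {clock_gain:,.0f}x")
print(f"Linear shrink:       about {shrink:,.0f}x")
print(f"Ideal density gain:  about {density:,.0f}x")
```

Under these assumptions, logic density has had far more room to grow than clock rate, which is one way to read the "Always useful?" question.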

Fundamentals Introduction
Single Processor Performance
[Figure: single-processor performance over time, annotated in order: HW technology; HW technology + architecture and organization ideas; power issue + ILP running out; move to multi-processor]

Fundamentals
Growth in Clock Rate

Fundamentals Introduction Effects of Dramatic Growth
 1) It has significantly enhanced the capability of computers available to users
 2) The dramatic improvement in cost-performance leads to new classes of computers
   Personal computers/workstations emerged in 1980
   Smart cell phones and tablets
   Warehouse-scale computers
 3) We see the dominance of microprocessor-based computers across the entire range of computer design
   Minicomputers, made from off-the-shelf logic or gate arrays, disappear
   Even mainframes and supercomputers are made from microprocessors
 4) Impact on software development
   Trade performance (C/C++) for productivity (Java, C#)

Fundamentals Introduction
Current Trends in Architecture
 Historic switch in 2003: from a uniprocessor to multiple processors per chip. The era of rapid single-processor performance improvement ended around 2003
 Walls encountered
   Power issue
   Little instruction-level parallelism (ILP) left to exploit
 This signals the change from relying solely on instruction-level parallelism to exploiting more coarse-grained parallelism
   Data-level parallelism (DLP)
   Thread-level parallelism (TLP)
   Request-level parallelism (RLP)
 These require explicit restructuring of the applications

Fundamentals Classes of Computers Agenda
 Introduction
 1.2 Classes of Computers
 Defining Computer Architecture
 Trends in Technology
 Trends in Power and Energy in ICs
 Trends in Cost
 Dependability
 Measuring Performance
 Quantitative Principles

Fundamentals Classes of Computers Five Classes of Computers
 1) Personal Mobile Device (PMD)
   e.g. smart phones, tablet computers
   Emphasis on energy efficiency and real-time
 2) Desktop Computing
   Emphasis on price-performance
 3) Servers
   Emphasis on availability, scalability, throughput
 4) Clusters / Warehouse Scale Computers
   Used for “Software as a Service (SaaS)”
   Emphasis on availability and price-performance
 5) Embedded Computers
   Emphasis: price

Fundamentals
System Characteristics

Fundamentals
Personal Mobile Device (PMD)
 PMD: Collection of wireless devices with multimedia user interfaces
 Energy efficiency is critical
  Most of the devices are driven by batteries
  There is no fan for cooling the processor
 Flash memory is used instead of disks
  Energy and size requirements
 Responsiveness or real-time performance
  Hard real time
  Soft real time

Fundamentals
Desktop Computing
The largest market in dollar terms
  How about now?
The metric of price-performance: the combination of price and performance
  Compute performance
  Graphics performance

Fundamentals
Servers are the backbone of large-scale enterprise computing
Availability: the percentage of time that a server is operational
Scalability: the ability to scale up to meet increasing demand for services
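
Since availability is defined as the percentage of time a server is operational, it translates directly into an amount of downtime per year. The following is a small sketch of that arithmetic; the function names and the 99.9% example value are illustrative assumptions, not figures from the slides.

```python
HOURS_PER_YEAR = 365 * 24

def availability(uptime_hours, downtime_hours):
    """Fraction of total time that the server is operational."""
    return uptime_hours / (uptime_hours + downtime_hours)

def downtime_hours_per_year(availability_fraction):
    """Expected hours of downtime in one year at the given availability."""
    return (1 - availability_fraction) * HOURS_PER_YEAR

# Example value chosen for illustration: a server that is up 99.9% of the time.
print(f"{downtime_hours_per_year(0.999):.2f} hours of downtime per year")
```

Multiplying such a downtime figure by the hourly cost of an outage gives the kind of estimate that a cost-of-downtime comparison is built on.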

Fundamentals
Cost of Downtime

Fundamentals
Clusters/Warehouse-Scale Computers (WSC)
Software as a Service (SaaS): search, social networking, video sharing
Cluster: a collection of desktop computers or servers connected by local area networks, each of which runs its own OS
Energy: 80% of the cost of a warehouse is associated with power and cooling of the computers
Availability
Throughput: the amount of work done per unit time

Fundamentals
Supercomputers vs. WSC
 Supercomputer: a large pool of processors
 Both supercomputers and WSCs are expensive
 Differences between supercomputers and WSCs
  Supercomputers run large, communication-intensive applications and thus require faster internal networks
  Supercomputers emphasize floating-point performance
  WSCs emphasize interactive applications, large-scale storage, dependability, and high bandwidth

Fundamentals
Embedded Computers
They can be found in almost every machine: washing machines, microwaves, and printers
Can be 8-, 16-, or 32-bit
An embedded computer is usually application-specific
It does not run a varied, changing set of applications

Fundamentals Classes of Computers Two Kinds of Parallelism in Applications
Data-Level Parallelism (DLP)
• Many data items can be operated on at the same time
Task-Level Parallelism (TLP)
• Tasks of work are created that can operate independently and largely in parallel
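
The difference between the two kinds of parallelism shows up in how the work is structured. The sketch below uses Python's standard thread pool only to illustrate that structure; the helper functions are invented for the example, and in CPython threads do not necessarily run CPU-bound work in parallel, so this is not a performance demonstration.

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    return x * x

def word_count(text):
    return len(text.split())

def total(values):
    return sum(values)

def data_level(items):
    """Data-level parallelism: the same operation applied to many data items."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(square, items))

def task_level():
    """Task-level parallelism: different, independent tasks submitted together."""
    with ThreadPoolExecutor() as pool:
        words = pool.submit(word_count, "warehouse scale computers")
        summed = pool.submit(total, [1, 2, 3])
        return words.result(), summed.result()

print(data_level([1, 2, 3, 4]))
print(task_level())
```

In the first function every worker runs the same code on a different element; in the second, each worker runs a different piece of code that does not depend on the other.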

Fundamentals
Four Major Ways for Exploiting Parallelism
Instruction-Level Parallelism (ILP)- e.g., Pipelining
• Data-level parallelism
Vector architectures/Graphics Processing Units (GPUs)
• Data-level parallelism
Thread-Level Parallelism – e.g., Multicore
• Data-level parallelism or task-level parallelism
Request-Level Parallelism – e.g., Clusters
• Data-level parallelism or task-level parallelism

Fundamentals Classes of Computers Flynn’s Taxonomy
Single instruction stream, single data stream (SISD)
Single instruction stream, multiple data streams (SIMD)
• Vector architectures
• Multimedia extensions
• Graphics processor units (GPU)
Multiple instruction streams, single data stream (MISD)
• No commercial implementation
Multiple instruction streams, multiple data streams (MIMD)
• Tightly-coupled MIMD
• Loosely-coupled MIMD
Proposed by Michael J. Flynn
