COMPUTER ORGANIZATION AND DESIGN
The Hardware/Software Interface
Computer Abstractions and Technology

The Computer Revolution
 Progress in computer technology
 Underpinned by Moore’s Law
 Makes novel applications feasible
 Computers in automobiles
 Cell phones
 Human genome project
 World Wide Web
 Search engines
 Computers are pervasive
Chapter 1 — Computer Abstractions and Technology — 2
§1.1 Introduction

Classes of Computers
 Personal computers
 General purpose, variety of software
 Subject to cost/performance tradeoff
 Server computers
 Network based
 High capacity, performance, reliability
 Range from small servers to building sized

Classes of Computers
 Supercomputers
 High-end scientific and engineering calculations
 Highest capability, but a small fraction of the overall computer market
 Embedded computers
 Hidden as components of systems
 Stringent power/performance/cost constraints

The PostPC Era
 Personal Mobile Device (PMD)
 Battery operated
 Connects to the Internet
 Hundreds of dollars
 Smart phones, tablets, electronic glasses
 Cloud computing
 Warehouse Scale Computers (WSC)
 Software as a Service (SaaS)
 Part of the software runs on a PMD and part runs in the cloud
 Amazon and Google

What You Will Learn
 How programs are translated into the machine language
 And how the hardware executes them
 The hardware/software interface
 What determines program performance
 And how it can be improved
 How hardware designers improve performance
 What is parallel processing

Understanding Performance
 Algorithm
 Determines number of operations executed
 Programming language, compiler, architecture
 Determine number of machine instructions executed per operation
 Processor and memory system
 Determine how fast instructions are executed
 I/O system (including OS)
 Determines how fast I/O operations are executed

Eight Great Ideas
 Design for Moore’s Law
 Use abstraction to simplify design
 Make the common case fast
 Performance via parallelism
 Performance via pipelining
 Performance via prediction
 Hierarchy of memories
 Dependability via redundancy
§1.2 Eight Great Ideas in Computer Architecture

Below Your Program
 Application software
 Written in high-level language
 System software
 Compiler: translates HLL code to machine code
 Operating System: service code
 Handling input/output
 Managing memory and storage
 Scheduling tasks & sharing resources
 Hardware
 Processor, memory, I/O controllers
§1.3 Below Your Program

Levels of Program Code
 High-level language
 Level of abstraction closer to problem domain
 Provides for productivity and portability
 Assembly language
 Textual representation of instructions
 Hardware representation
 Binary digits (bits)
 Encoded instructions and data

Components of a Computer
The BIG Picture
 Same components for all kinds of computer
 Desktop, server, embedded
 Input/output includes
 User-interface devices
 Display, keyboard, mouse
 Storage devices
 Hard disk, CD/DVD, flash
 Network adapters
 For communicating with other computers
§1.4 Under the Covers

Touchscreen
 PostPC device
 Supersedes keyboard
 Resistive and Capacitive types
 Most tablets, smart phones use capacitive
 Capacitive allows multiple touches simultaneously

Through the Looking Glass
 LCD screen: picture elements (pixels)
 Mirrors content of frame buffer memory

Opening the Box
 Capacitive multitouch LCD screen
 3.8 V, 25 watt-hour battery
 Computer board

Inside the Processor (CPU)
 Datapath: performs operations on data
 Control: sequences datapath, memory, …
 Cache memory
 Small fast SRAM memory for immediate access to data

Abstractions
The BIG Picture
 Abstraction helps us deal with complexity
 Hide lower-level detail
 Instruction set architecture (ISA)
 The hardware/software interface
 Application binary interface
 The ISA plus system software interface
 Implementation
 The details underlying an interface

A Safe Place for Data
 Volatile main memory
 Loses instructions and data when power off
 Non-volatile secondary memory
 Magnetic disk
 Flash memory
 Optical disk (CDROM, DVD)

Networks
 Communication, resource sharing, nonlocal access
 Local area network (LAN): Ethernet
 Wide area network (WAN): the Internet
 Wireless network: WiFi, Bluetooth

Technology Trends
 Electronics technology continues to evolve
 Increased capacity and performance
 Reduced cost
[Figure: DRAM capacity]
Technology and relative performance/cost
 Vacuum tube
 Transistor
 Integrated circuit (IC)
 Very large scale IC (VLSI)
 Ultra large scale IC: 250,000,000,000
§1.5 Technologies for Building Processors and Memory

Semiconductor Technology
 Silicon: semiconductor
 Add materials to transform properties:
 Conductors
 Insulators

Manufacturing ICs
 Yield: proportion of working dies per wafer

Intel Core i7 Wafer
 300mm wafer, 280 chips, 32nm technology
 Each chip is 20.7 x 10.5 mm

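Integrated Circuit Cost
$$\text{Cost per die} = \frac{\text{Cost per wafer}}{\text{Dies per wafer} \times \text{Yield}}$$
$$\text{Dies per wafer} \approx \frac{\text{Wafer area}}{\text{Die area}}$$
$$\text{Yield} = \frac{1}{\left(1 + (\text{Defects per area} \times \text{Die area} / 2)\right)^2}$$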
 Nonlinear relation to area and defect rate
 Wafer cost and area are fixed
 Defect rate determined by manufacturing process
 Die area determined by architecture and circuit design
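The three cost formulas above can be sketched in Python. This is only an illustration: the function names, the wafer cost, and the defect density are hypothetical values not given in the slides, while the wafer diameter and die dimensions reuse the Core i7 wafer figures. The dies-per-wafer estimate ignores dies lost at the wafer edge, as the approximation does.

```python
import math

def dies_per_wafer(wafer_area_mm2, die_area_mm2):
    """Approximate die count: wafer area divided by die area (no edge loss)."""
    return wafer_area_mm2 / die_area_mm2

def die_yield(defects_per_mm2, die_area_mm2):
    """Yield = 1 / (1 + defects per area * die area / 2)^2."""
    return 1.0 / (1.0 + defects_per_mm2 * die_area_mm2 / 2.0) ** 2

def cost_per_die(wafer_cost, wafer_area_mm2, die_area_mm2, defects_per_mm2):
    """Cost per die = cost per wafer / (dies per wafer * yield)."""
    good_dies = (dies_per_wafer(wafer_area_mm2, die_area_mm2)
                 * die_yield(defects_per_mm2, die_area_mm2))
    return wafer_cost / good_dies

# 300 mm wafer and 20.7 x 10.5 mm die, as on the Core i7 wafer slide.
wafer_area = math.pi * (300 / 2) ** 2
die_area = 20.7 * 10.5

# Hypothetical inputs, chosen only to exercise the formulas.
WAFER_COST = 5000.0          # assumed cost per wafer
DEFECTS_PER_MM2 = 0.001      # assumed defect density

print(f"dies per wafer: {dies_per_wafer(wafer_area, die_area):.0f}")
print(f"yield:          {die_yield(DEFECTS_PER_MM2, die_area):.2f}")
print(f"cost per die:   {cost_per_die(WAFER_COST, wafer_area, die_area, DEFECTS_PER_MM2):.2f}")
```

Doubling the die area with a nonzero defect rate more than doubles the cost per die, which is the nonlinear relation the slide points out.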

Defining Performance
 Which airplane has the best performance?
[Figure: bar charts comparing the Boeing 777, Boeing 747, and BAC/Sud Concorde on passenger capacity, cruising range (miles), cruising speed (mph), and passengers x mph]
§1.6 Performance

Response Time and Throughput
 Response time
 How long it takes to do a task
 Throughput
 Total work done per unit time
 e.g., tasks/transactions/… per hour
 How are response time and throughput affected by
 Replacing the processor with a faster version?
 Adding more processors?
 We’ll focus on response time for now…

Relative Performance
 Define Performance = 1/Execution Time
 “X is n times faster than Y”
$$\frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n$$
 Example: time taken to run a program
 10s on A, 15s on B
 $\text{Execution Time}_B / \text{Execution Time}_A = 15\,\text{s} / 10\,\text{s} = 1.5$
 So A is 1.5 times faster than B
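The definition above translates directly into code. The sketch below uses hypothetical function names and the 10 s and 15 s timings from the example.

```python
def performance(execution_time_s):
    """Performance is defined as the reciprocal of execution time."""
    return 1.0 / execution_time_s

def times_faster(time_x_s, time_y_s):
    """Return n such that X is n times faster than Y."""
    return performance(time_x_s) / performance(time_y_s)

# Example from the slide: the program takes 10 s on A and 15 s on B.
n = times_faster(10.0, 15.0)
print(f"A is {n:.1f} times faster than B")  # prints "A is 1.5 times faster than B"
```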

Measuring Execution Time
 Elapsed time
 Total response time, including all aspects
 Processing, I/O, OS overhead, idle time
 Determines system performance
 CPU time
 Time spent processing a given job
 Discounts I/O time, other jobs’ shares
 Comprises user CPU time and system CPU time
 Different programs are affected differently by CPU and system performance

CPU Clocking
 Operation of digital hardware governed by a constant-rate clock
[Figure: clock timing diagram marking the clock period, with data transfer and computation in each clock cycle followed by a state update]
 Clock period: duration of a clock cycle
 e.g., 250 ps = 0.25 ns = $250 \times 10^{-12}$ s
 Clock frequency (rate): cycles per second
 e.g., 4.0 GHz = 4000 MHz = $4.0 \times 10^{9}$ Hz
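Clock period and clock rate are reciprocals, so the two examples above describe the same clock. A minimal check, with hypothetical function names:

```python
def clock_rate_hz(period_s):
    """Clock rate (cycles per second) is the reciprocal of the clock period."""
    return 1.0 / period_s

def clock_period_s(rate_hz):
    """Clock period (seconds per cycle) is the reciprocal of the clock rate."""
    return 1.0 / rate_hz

period = 250e-12                 # 250 ps, from the slide
rate = clock_rate_hz(period)     # about 4.0e9 Hz, i.e. 4.0 GHz
print(f"{period * 1e12:.0f} ps period -> {rate / 1e9:.1f} GHz")
```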

CPU Time
$$\text{CPU Time} = \text{CPU Clock Cycles} \times \text{Clock Cycle Time} = \frac{\text{CPU Clock Cycles}}{\text{Clock Rate}}$$
 Performance improved by
 Reducing number of clock cycles
 Increasing clock rate
 Hardware designer must often trade off clock rate against cycle count

CPU Time Example
 Computer A: 2GHz clock, 10s CPU time
 Designing Computer B
 Aim for 6s CPU time
 Can do faster clock, but causes 1.2 × clock cycles (some instructions will take more than 1 cycle to finish)
 How fast must Computer B clock be?
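$$\text{Clock Rate}_B = \frac{\text{Clock Cycles}_B}{\text{CPU Time}_B} = \frac{1.2 \times \text{Clock Cycles}_A}{6\,\text{s}}$$
$$\text{Clock Cycles}_A = \text{CPU Time}_A \times \text{Clock Rate}_A = 10\,\text{s} \times 2\,\text{GHz} = 20 \times 10^{9}$$
$$\text{Clock Rate}_B = \frac{1.2 \times 20 \times 10^{9}}{6\,\text{s}} = \frac{24 \times 10^{9}}{6\,\text{s}} = 4\,\text{GHz}$$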
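The Computer B calculation can be re-run numerically as a sanity check. The variable and function names below are hypothetical; the numbers are the ones given in the example.

```python
def cpu_time_s(clock_cycles, clock_rate_hz):
    """CPU time = CPU clock cycles / clock rate."""
    return clock_cycles / clock_rate_hz

# Computer A: 2 GHz clock, 10 s CPU time.
cycles_a = 10.0 * 2.0e9          # CPU time * clock rate = 20e9 cycles

# Computer B: needs 1.2x the cycles and must finish in 6 s.
cycles_b = 1.2 * cycles_a        # about 24e9 cycles
rate_b = cycles_b / 6.0          # required clock rate in Hz

print(f"Computer B clock rate: {rate_b / 1e9:.1f} GHz")
```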

Instruction Count (IC) and CPI
$$\text{Clock Cycles} = \text{Instruction Count} \times \text{Cycles per Instruction}$$
$$\text{CPU Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock Cycle Time} = \frac{\text{Instruction Count} \times \text{CPI}}{\text{Clock Rate}}$$
 Instruction Count (IC) for a program
 Determined by program, ISA and compiler
 Average Cycles Per Instruction (CPI)
 Determined by CPU hardware
 If different instructions have different CPI
 Average CPI is affected by instruction mix

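CPI (Cycles Per Instr.) Example
 Computer A: Cycle Time = 250ps, CPI = 2.0
 Computer B: Cycle Time = 500ps, CPI = 1.2
 Same Instruction Set Architecture (ISA)
 Which is faster, and by how much?
$$\text{CPU Time}_A = \text{Instruction Count} \times \text{CPI}_A \times \text{Cycle Time}_A = I \times 2.0 \times 250\,\text{ps} = I \times 500\,\text{ps}$$
$$\text{CPU Time}_B = \text{Instruction Count} \times \text{CPI}_B \times \text{Cycle Time}_B = I \times 1.2 \times 500\,\text{ps} = I \times 600\,\text{ps}$$
$$\frac{\text{CPU Time}_B}{\text{CPU Time}_A} = \frac{I \times 600\,\text{ps}}{I \times 500\,\text{ps}} = 1.2$$
 A is faster, by a factor of 1.2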
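The comparison can be reproduced with a few lines of Python. The instruction count `I` is not given in the example, so the value below is an arbitrary assumption; it cancels out of the ratio. The function name is also hypothetical.

```python
def cpu_time_ps(instruction_count, cpi, cycle_time_ps):
    """CPU time = instruction count * CPI * clock cycle time (in picoseconds)."""
    return instruction_count * cpi * cycle_time_ps

I = 1_000_000  # assumed instruction count; any positive value gives the same ratio

time_a = cpu_time_ps(I, cpi=2.0, cycle_time_ps=250)   # I * 500 ps
time_b = cpu_time_ps(I, cpi=1.2, cycle_time_ps=500)   # I * 600 ps

ratio = time_b / time_a
print(f"CPU Time B / CPU Time A = {ratio:.1f}")  # prints 1.2, so A is faster
```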

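CPI in More Detail
 If different instruction classes take different numbers of cycles (instruction mix)
$$\text{Clock Cycles} = \sum_{i=1}^{n} (\text{CPI}_i \times \text{Instruction Count}_i)$$
 Weighted average CPI
$$\text{CPI} = \frac{\text{Clock Cycles}}{\text{Instruction Count}} = \sum_{i=1}^{n} \left( \text{CPI}_i \times \frac{\text{Instruction Count}_i}{\text{Instruction Count}} \right)$$
 The ratio $\text{Instruction Count}_i / \text{Instruction Count}$ is the relative frequency of class $i$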

CPI Example
 Alternative compiled code sequences using instructions in classes A, B, C
Class              A   B   C
CPI for class      1   2   3
IC in sequence 1   2   1   2
IC in sequence 2   4   1   1
 Sequence 1: IC = 5
 Clock Cycles = 2×1 + 1×2 + 2×3 = 10
 Avg. CPI = 10/5 = 2.0
 Sequence 2: IC = 6
 Clock Cycles = 4×1 + 1×2 + 1×3 = 9
 Avg. CPI = 9/6 = 1.5
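The two sequences can be checked with a short script. The per-class CPIs (1, 2, 3) and the per-class instruction counts are read off the products in the clock-cycle sums above; the dictionary layout and function names are hypothetical.

```python
# CPI for each instruction class, taken from the clock-cycle sums.
CPI_PER_CLASS = {"A": 1, "B": 2, "C": 3}

def clock_cycles(counts):
    """Sum of CPI_i * instruction count_i over all classes."""
    return sum(CPI_PER_CLASS[cls] * n for cls, n in counts.items())

def average_cpi(counts):
    """Weighted average CPI = clock cycles / total instruction count."""
    return clock_cycles(counts) / sum(counts.values())

sequence_1 = {"A": 2, "B": 1, "C": 2}   # IC = 5
sequence_2 = {"A": 4, "B": 1, "C": 1}   # IC = 6

for name, seq in (("sequence 1", sequence_1), ("sequence 2", sequence_2)):
    print(name, "cycles:", clock_cycles(seq), "avg CPI:", average_cpi(seq))
```

Sequence 2 executes more instructions but needs fewer clock cycles, so instruction count alone does not decide which sequence is faster.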

Performance Summary
The BIG Picture
$$\text{CPU Time} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock cycle}}$$
 Performance depends on
 Algorithm: affects IC, possibly CPI (instr. mix)
 Programming language: affects IC, CPI
 Compiler: affects IC, CPI
 Instruction set architecture: affects IC, CPI, $T_c$ (clock cycle time, the reciprocal of clock rate)

Power Trends
 In CMOS IC technology
$$\text{Power} = \text{Capacitive load} \times \text{Voltage}^2 \times \text{Frequency}$$
 Figure annotations: power ×30, voltage 5V → 1V, frequency ×1000
§1.7 The Power Wall

Reducing Power
 Suppose a new CPU has
 85% of capacitive load of old CPU
 15% voltage and 15% frequency reduction
$$\frac{P_{\text{new}}}{P_{\text{old}}} = \frac{C_{\text{old}} \times 0.85 \times (V_{\text{old}} \times 0.85)^2 \times F_{\text{old}} \times 0.85}{C_{\text{old}} \times V_{\text{old}}^2 \times F_{\text{old}}} = 0.85^4 = 0.52$$
 The power wall
 We can’t reduce voltage further
 We can’t remove more heat
 How else can we improve performance?
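The power ratio in this example follows from the CMOS power relation (capacitive load × voltage squared × frequency). The sketch below uses hypothetical names and arbitrary values for the old CPU, since only the ratio matters.

```python
def dynamic_power(capacitive_load, voltage, frequency):
    """Power = capacitive load * voltage^2 * frequency."""
    return capacitive_load * voltage ** 2 * frequency

# Arbitrary (assumed) values for the old CPU; they cancel in the ratio.
C_OLD, V_OLD, F_OLD = 1.0, 1.0, 1.0

p_old = dynamic_power(C_OLD, V_OLD, F_OLD)
# New CPU: 85% of the capacitive load, 15% lower voltage and frequency.
p_new = dynamic_power(0.85 * C_OLD, 0.85 * V_OLD, 0.85 * F_OLD)

ratio = p_new / p_old            # 0.85 ** 4
print(f"P_new / P_old = {ratio:.2f}")  # prints 0.52
```

Voltage enters squared, which is why lowering it was the main lever on power until it could not be reduced further.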

Uniprocessor Performance
Constrained by power, instruction-level parallelism, memory latency
§1.8 The Sea Change: The Switch to Multiprocessors

Multiprocessors
 Multicore microprocessors
 More than one processor per chip
 Requires explicitly parallel programming
 Compare with instruction level parallelism
 Hardware executes multiple instructions at once
 Hidden from the programmer
 Hard to do
 Programming for performance
 Load balancing
 Optimizing communication and synchronization

Pitfall: Amdahl’s Law
 Improving an aspect of a computer and expecting a proportional improvement in overall performance
$$T_{\text{improved}} = \frac{T_{\text{affected}}}{\text{improvement factor}} + T_{\text{unaffected}}$$
 Example: multiply accounts for 80s/100s
 How much improvement in multiply performance to get 5× overall?
$$20 = \frac{80}{n} + 20$$
 Can’t be done!
 Corollary: make the common case fast
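A short script makes the limit concrete. The function names are hypothetical; the 80 s and 20 s split is the one from the example. However large the improvement factor for multiply, the unaffected 20 s remain, so the overall speedup approaches 5× but never reaches it.

```python
def improved_time(t_affected, t_unaffected, improvement_factor):
    """T_improved = T_affected / improvement factor + T_unaffected."""
    return t_affected / improvement_factor + t_unaffected

def overall_speedup(t_affected, t_unaffected, improvement_factor):
    """Original total time divided by the improved total time."""
    original = t_affected + t_unaffected
    return original / improved_time(t_affected, t_unaffected, improvement_factor)

# Multiply takes 80 s of a 100 s run; try ever larger improvement factors.
for n in (2, 4, 10, 100, 1_000_000):
    print(f"multiply {n}x faster -> overall {overall_speedup(80.0, 20.0, n):.3f}x")
```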
§1.10 Fallacies and Pitfalls

Fallacy: Low Power at Idle
 As per the i7 power benchmarks
 At 100% load: 258W
 At 50% load: 170W (66%)
 At 10% load: 121W (47%)
 Google data center
 Mostly operates at 10% – 50% load
 At 100% load less than 1% of the time
 Consider designing processors to make power proportional to load
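The percentages in parentheses are each power reading divided by the 258 W peak. A small sketch (dictionary and function names are hypothetical) shows how far the measured power is from being proportional to load:

```python
# Measured power (watts) at each load level (percent), from the slide.
POWER_W = {100: 258, 50: 170, 10: 121}

def fraction_of_peak(load_percent):
    """Power at the given load as a fraction of power at 100% load."""
    return POWER_W[load_percent] / POWER_W[100]

for load in (100, 50, 10):
    used = 100 * fraction_of_peak(load)
    print(f"{load:3d}% load uses {used:.0f}% of peak power")
```

With power proportional to load, the 10% case would draw about 10% of peak rather than 47%.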

Pitfall: MIPS as a Performance Metric
 MIPS: Millions of Instructions Per Second
 Doesn’t account for
 Differences in ISAs between computers
 Differences in complexity between instructions
$$\text{MIPS} = \frac{\text{Instruction count}}{\text{Execution time} \times 10^{6}} = \frac{\text{Instruction count}}{\dfrac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}} \times 10^{6}} = \frac{\text{Clock rate}}{\text{CPI} \times 10^{6}}$$
 CPI varies between programs on a given CPU
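The two forms of the MIPS formula can be compared numerically. All numbers below (a 4 GHz clock, an instruction count of 10^9, and CPIs of 2.0 and 1.0) are assumptions chosen for illustration, and the function names are hypothetical.

```python
def mips_from_time(instruction_count, execution_time_s):
    """MIPS = instruction count / (execution time * 10^6)."""
    return instruction_count / (execution_time_s * 1e6)

def mips_from_cpi(clock_rate_hz, cpi):
    """MIPS = clock rate / (CPI * 10^6)."""
    return clock_rate_hz / (cpi * 1e6)

CLOCK_RATE_HZ = 4.0e9        # assumed clock rate
INSTRUCTION_COUNT = 1.0e9    # assumed instruction count

# Same CPU, two hypothetical programs with different average CPI.
for cpi in (2.0, 1.0):
    execution_time = INSTRUCTION_COUNT * cpi / CLOCK_RATE_HZ
    print(f"CPI {cpi}: {mips_from_time(INSTRUCTION_COUNT, execution_time):.0f} MIPS "
          f"(from time), {mips_from_cpi(CLOCK_RATE_HZ, cpi):.0f} MIPS (from CPI)")
```

The same processor reports a different MIPS rating for each program, which is why a single MIPS figure says little about performance.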

Concluding Remarks
 Cost/performance is improving
 Due to underlying technology development
 Hierarchical layers of abstraction
 In both hardware and software
 Instruction set architecture
 The hardware/software interface
 Execution time: the best performance measure
 Power is a limiting factor
 Use parallelism to improve performance
§1.11 Concluding Remarks
