Computer Architecture ELEC3441
Lecture 7 – Memory Technologies
Dr. Hayden Kwok-Hay So
Department of Electrical and Electronic Engineering
CPU vs Memory
[Figure: CPU connected to Memory]
• instruction fetch
• load (lw, lh, lb)
• store (sw, sh, sb)
HKUEEE ENGG3441 – HS
Processor-DRAM Gap (latency)
[Figure: relative performance (log scale, 1 to 100,000) vs. year (1980 to 2010)]
• Processor: ~60%/year (2x every 1.5 yr)
• Memory: ~7%/year (2x every ~10 yr)
• Gap growing 50%/yr
3 GHz processor, 100 ns latency → 300 cycles to get 1st data
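The 300-cycle figure is clock frequency multiplied by memory latency. A minimal Python sketch of that arithmetic follows; the function name `stall_cycles` is an illustrative choice, not something from the lecture.

```python
def stall_cycles(clock_hz: float, latency_s: float) -> int:
    """Processor cycles that elapse while waiting for one memory access."""
    return round(clock_hz * latency_s)


# 3 GHz processor, 100 ns DRAM latency
print(stall_cycles(3e9, 100e-9))  # prints 300
```

Doubling the clock rate at the same latency doubles the number of cycles lost per access, which is why the gap matters more as processors get faster.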
Memory Challenges
■ Requirements for an ideal system:
  • Large capacity
  • High performance
    • low latency
    • high bandwidth
■ Reality:
  • Capacity, latency, and bandwidth often counteract each other
■ Solution:
  • Use different types of memory to build a memory hierarchy that performs well on average
[Figure: Performance. Axis: normalized performance; first level shown: regfile]
Types of Memory
[Figure: memory types grouped as ROM / RAM and as Volatile (DRAM, SRAM) / Non-Volatile (Flash, EEPROM, PCM, RRAM)]
■ ROM: read-only memory
■ RAM: random access memory
■ Volatile: content may disappear without power
■ Non-Volatile: content remains without power
Early Memory Examples
ROM
• Punched cards: from the early 1700s through the Jacquard loom, Babbage, and then IBM
• Punched paper tape: instruction stream in the Harvard Mk 1
RAM
• Core memory: invented in the late 40s/early 50s at MIT
• DEC PDP-8/E core memory board, 4K words x 12 bits (1968)
Semiconductor Memory
■ Semiconductor memory began to be competitive in the early 1970s
  • Intel was formed to exploit the market for semiconductor memory
  • Early semiconductor memory was Static RAM (SRAM). SRAM cell internals are similar to a latch (cross-coupled inverters).
■ The first commercial Dynamic RAM (DRAM) was the Intel 1103
  • 1 Kbit of storage on a single chip
  • charge on a capacitor used to hold the value
Semiconductor memory quickly replaced core in the '70s
DRAM
■ Dynamic Random Access Memory
■ Data stored in a single capacitor
■ Read/write access through 1 transistor
  • Sometimes referred to as a 1T cell
■ Read is destructive
  • Data stored in the capacitor is lost after a read
  • → write back data after read
■ Data lost over time due to charge leakage
  • → refresh needed
  • Most modern DRAMs have auto-refresh capability
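The destructive read, write-back, and refresh behaviour can be sketched with a toy model. This is a simplification under stated assumptions: charge is a unitless fraction, and the leakage rate and sensing threshold (`LEAK_PER_TICK`, `THRESHOLD`) are made-up illustrative values, not device parameters.

```python
class DramCell:
    """Toy 1T1C cell: charge is a fraction of full charge, 0.0 to 1.0."""

    LEAK_PER_TICK = 0.1   # assumed leakage per time step (illustrative)
    THRESHOLD = 0.5       # assumed: charge above this is sensed as 1

    def __init__(self, bit: int = 0):
        self.charge = float(bit)

    def write(self, bit: int) -> None:
        self.charge = float(bit)

    def tick(self, n: int = 1) -> None:
        """Let the capacitor leak for n time steps."""
        self.charge = max(0.0, self.charge - n * self.LEAK_PER_TICK)

    def read(self) -> int:
        """Destructive read followed by write-back of the sensed value."""
        bit = 1 if self.charge > self.THRESHOLD else 0
        self.charge = 0.0   # sensing destroys the stored charge
        self.write(bit)     # write-back restores a full-strength value
        return bit

    def refresh(self) -> None:
        """A refresh is a read whose result is discarded."""
        self.read()


cell = DramCell(1)
cell.tick(4)          # charge decays to about 0.6, still readable as 1
cell.refresh()        # restored to full charge
cell.tick(4)
print(cell.read())    # 1

stale = DramCell(1)
stale.tick(8)         # no refresh: charge decays to about 0.2
print(stale.read())   # 0, the stored 1 has been lost
```

The second cell shows why refresh must happen before the charge drops below what the sense amplifier can distinguish.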
1-T DRAM Cell
One-Transistor Dynamic RAM [Dennard, IBM]
[Figure: cell schematic with word line, bit line, access transistor, and storage capacitor (FET gate, trench, or stack) tied to VREF]
[Figure: cell cross-section, labelled: TiN top electrode (VREF), Ta2O5 dielectric, W bottom electrode, poly word line, access transistor]
Trench Capacitor
■ Maximize capacitance through a vertical trench into the substrate
■ Lost favor to the stacked capacitor
[Figure: IBM trench capacitor cell (not to scale), labelled: column address bit line, row address word line, transfer node strap, P+ regions, N-well, P- substrate, trench capacitor]
Modern DRAM Structure
[Samsung, sub-70nm DRAM, 2004]
DRAM Architecture
[Figure: array of one-bit memory cells with 2^N rows (word lines, Row 1 to Row 2^N) and 2^M columns (bit lines, Col. 1 to Col. 2^M). The N+M address bits are split: N bits select a row, M bits go to the column decoder & sense amplifiers, which deliver the data D.]
■ Bits stored in 2-dimensional arrays on chip
■ Modern chips have around 4-8 logical banks on each chip
■ Each logical bank is physically implemented as many smaller arrays
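The split of an (N+M)-bit address into a row and a column index can be sketched as below. The function name `split_address` is hypothetical, and placing the row in the upper bits is an assumption for illustration; real devices and memory controllers choose their own mappings.

```python
def split_address(addr: int, row_bits: int, col_bits: int) -> tuple[int, int]:
    """Split an (N+M)-bit cell address into (row, column).

    Assumption: the upper N bits select the row, the lower M bits the column.
    """
    if not 0 <= addr < (1 << (row_bits + col_bits)):
        raise ValueError("address out of range")
    row = addr >> col_bits
    col = addr & ((1 << col_bits) - 1)
    return row, col


# N = 3 row bits, M = 2 column bits: an 8 x 4 array
print(split_address(0b10110, row_bits=3, col_bits=2))  # (5, 2)
```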
Read Operation
1. Precharge bitlines to Vdd/2
2. Enable target word line
   • Cell stores 1: V_bitline ↑
   • Cell stores 0: V_bitline ↓
3. Sense amps sense changes in bitline voltage
   • Latch results
4. Select desired column from output latch
[Figure: cell array before and after sensing; bitlines start at 0.5 and the latch holds full 0/1 values]
DRAM Operation (1)
■ 3 basic operations: Row Access, Column Access, Precharge
■ Row Access
  • Select row depending on address
  • Each row can have up to 10s of Kb
  • Sense amps sense a very small change in bitline level
  • Voltage change is small because the charge stored in a small cap is shared with a long bitline
  • Sense amps restore full swing level + restore data in cell (refresh)
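Why the bitline swing is small follows from charge conservation when the cell capacitor is connected to the precharged bitline. The sketch below works that out; the supply voltage and both capacitances are assumed round numbers chosen for illustration, not values from the lecture.

```python
def bitline_swing(v_cell: float, v_precharge: float,
                  c_cell: float, c_bitline: float) -> float:
    """Bitline voltage change after the cell capacitor shares charge with it."""
    v_final = (c_cell * v_cell + c_bitline * v_precharge) / (c_cell + c_bitline)
    return v_final - v_precharge


VDD = 1.0                         # assumed supply voltage (illustrative)
C_CELL, C_BL = 30e-15, 300e-15    # assumed 30 fF cell, 300 fF bitline

for stored in (1, 0):
    dv = bitline_swing(stored * VDD, VDD / 2, C_CELL, C_BL)
    print(stored, f"{dv * 1000:+.1f} mV")
# 1 +45.5 mV
# 0 -45.5 mV
```

With a bitline ten times larger than the cell, a full-rail stored value moves the bitline by only a few percent of Vdd, which is the signal the sense amplifier has to amplify back to full swing.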
DRAM Operation (2)
■ Column Access:
  • Select the desired bits out of the entire row
  • Usually select just a small portion (4, 8, 16, or 32 bits)
  • On Read: send the data out of the package
  • On Write:
    • write the desired data into the sense amp latches
    • let the sense amps "refresh" the array with the new data + original data from unchanged bits
■ Precharge:
  • Precharge bitlines for the next operation
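The three operations can be tied together in a toy bank model with a single row of sense-amp latches. This is a behavioural sketch only: the class and method names (`DramBank`, `activate`, and so on) are illustrative choices, and timing and electrical behaviour are ignored.

```python
class DramBank:
    """Toy bank: a 2-D bit array plus one row of sense-amp latches."""

    def __init__(self, rows: int, cols: int):
        self.array = [[0] * cols for _ in range(rows)]
        self.row_buffer = None   # sense-amp latches; None means precharged
        self.open_row = None

    def _require_open(self) -> None:
        if self.row_buffer is None:
            raise RuntimeError("no open row: activate a row first")

    def activate(self, row: int) -> None:
        """Row access: sense a whole row into the latches."""
        if self.open_row is not None:
            raise RuntimeError("precharge before opening another row")
        self.row_buffer = list(self.array[row])
        self.open_row = row

    def read(self, col: int, width: int = 1) -> list:
        """Column access (read): pick a few bits out of the sensed row."""
        self._require_open()
        return self.row_buffer[col:col + width]

    def write(self, col: int, bits: list) -> None:
        """Column access (write): overwrite latches, then restore the row.

        Unchanged columns are written back with their original data.
        """
        self._require_open()
        self.row_buffer[col:col + len(bits)] = bits
        self.array[self.open_row] = list(self.row_buffer)

    def precharge(self) -> None:
        """Close the row so the bitlines are ready for the next access."""
        self.row_buffer = None
        self.open_row = None


bank = DramBank(rows=4, cols=8)
bank.activate(2)
bank.write(3, [1, 1])
print(bank.read(0, 8))   # [0, 0, 0, 1, 1, 0, 0, 0]
bank.precharge()
bank.activate(2)
print(bank.read(3, 2))   # [1, 1]
```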
Performance
■ Latency vs. Bandwidth
■ Each step (Precharge, RAS, CAS) takes 15-20 ns in modern DRAM
■ Getting the first bit takes very long → high latency
■ But since an entire row of bits is sensed, subsequent column data can be sent out of the package at high bandwidth
  • Various burst mode accesses
  • Modern SDRAM (Synchronous DRAM) has a very high bandwidth output to send data out as soon as possible
  • e.g. Double Data Rate (DDR) interface
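The latency versus bandwidth trade-off can be illustrated with a rough timing model. The 15 ns step time is the lower end of the range quoted above; the 1 ns per additional burst transfer (`T_BEAT_NS`) is an assumed value for illustration only.

```python
T_STEP_NS = 15.0   # per-step time; the slide gives 15-20 ns for each step
T_BEAT_NS = 1.0    # assumed time per additional column transfer in a burst


def access_time_ns(n_columns: int, row_open: bool = False) -> float:
    """Time to read n_columns consecutive words from one row."""
    # An open row needs only CAS; otherwise precharge + RAS + CAS.
    setup = T_STEP_NS if row_open else 3 * T_STEP_NS
    return setup + (n_columns - 1) * T_BEAT_NS


first = access_time_ns(1)
burst = access_time_ns(8)
print(first)       # 45.0
print(burst)       # 52.0
print(burst / 8)   # 6.5 ns per word, amortized
```

Under these assumptions a single word costs 45 ns, while an 8-word burst costs only slightly more in total, so the per-word cost drops sharply once the row is open.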
SRAM Overview
■ Static Random Access Memory
■ Stored data persists as long as power is supplied
■ Design mostly based on standard digital circuit technology
  • e.g. no exotic capacitors like DRAM
■ Simple read/write interface
  • no complex command sequence
■ Challenge:
  • Capacity
SRAM
1971 state of the art:
Intel 2102, a 1 Kb, 1 MHz static RAM chip with 6000 nFET transistors in a 10 μm process.
SRAM Cell Design
■ The most common SRAM cell is built around a simple pair of cross-coupled inverters, with outputs Q and Q’
■ READ: simply take the value from Q or Q’
  • non-destructive
■ Additional logic is needed to write into the cross-coupled inverters
SRAM Write
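[Figure: cross-coupled inverters (Q, Q’) connected to bitlines B and B’ through two access transistors gated by WE]
■ Write data by "forcing" values through the 2 access transistors
  • Set B and B’ to the desired value
  • Turn on write enable (WE) momentarily
  • Disabling WE keeps the new value in Q and Q’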
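The write sequence above can be mimicked with a small behavioural model. It is a logic-level sketch only: the class name `SramCell` and its methods are illustrative, and the analog fight between the access transistors and the inverters is reduced to a simple assignment.

```python
class SramCell:
    """Toy cross-coupled inverter pair with two access transistors."""

    def __init__(self):
        self.q = 0   # Q' is always the complement of Q

    @property
    def q_bar(self) -> int:
        return 1 - self.q

    def read(self) -> int:
        """Non-destructive: the inverters keep driving Q while it is read."""
        return self.q

    def write(self, b: int, b_bar: int, we: bool) -> None:
        """Force B and B' onto Q and Q' while write enable (WE) is high."""
        if b == b_bar:
            raise ValueError("B and B' must be complementary")
        if we:
            self.q = b   # access transistors overpower the inverters
        # with WE low the cell is isolated and holds its value


cell = SramCell()
cell.write(b=1, b_bar=0, we=True)
cell.write(b=0, b_bar=1, we=False)   # ignored: WE is low
print(cell.read(), cell.q_bar)       # 1 0
print(cell.read())                   # 1, reading did not disturb the cell
```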
Typical 6T SRAM Cell
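■ Each inverter can be implemented using 2 transistors
■ The entire SRAM cell uses 6 transistors: 2 inverters x 2 transistors + 2 access transistors
■ Each SRAM cell is 6-10x larger than 1 DRAM cell
[Figure: 6T cell with word enable WE, bitlines B and B’, and storage nodes Q and Q’]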
DRAM vs SRAM
■ DRAM has a 6x-10x density advantage
■ SRAM has low access latency
  • No complex command
■ SRAM has deterministic latency
  • No refresh
■ SRAM can be fabricated in the same logic IC process
  • Can be built together with the CPU in the same die
Next Time…
Providing capacity of DRAM and low latency of SRAM:
Memory Cache and Hierarchy
Acknowledgements
■ These slides contain material developed and copyrighted by:
  • Arvind (MIT)
  • Krste Asanovic (MIT/UCB)
  • Joel Emer (Intel/MIT)
  • James Hoe (CMU)
  • John Kubiatowicz (UCB)
  • David Patterson (UCB)
  • John Lazzaro (UCB)
■ MIT material derived from course 6.823
■ UCB material derived from courses CS152 and CS252