FIT1047 – Week 4
Part 1:
Memory, Indirect Addressing
Reference: https://www.alexandriarepository.org/syllabus/introduction-to-computer-systems-networks-and-security/
FIT1047 Monash University
Recap
So far we saw
● Basic MARIE programming
● Combinational circuits (decoders, muxes, adders, ALUs)
● Sequential circuits (flip flops, registers)
FIT1047 Monash University
Overview
● Memory organisation ○ Addresses
● Accessing memory
○ Indirect addressing ○ Subroutines
Memory
FIT1047 Monash University
Memory
Think of it as a sequence of “boxes”:
1004 3005 2006 7000 008E 0D80 0000
0123456
Each box contains a value (here: a 16-bit number). This could be a machine code instruction, or data.
…
We give each box an address: the number of the box, starting from 0. Programs can read and change the value stored at a location.
FIT1047 Monash University
What is stored in memory?
FIT1047 Monash University
What is stored in memory?
The memory doesn’t know! The CPU doesn’t know! It’s up to the program to interpret the memory
in a certain way.
FIT1047 Monash University
Addressing
Most architectures store one byte per memory location: 10 05 AB 0F 8E 0D 00
0123456
So each byte has its own address. This is called byte-addressable.
…
FIT1047 Monash University
Addressing in MARIE
Some architectures (including MARIE) store one word per location:
…
1004 3005 2006 7000 008E 0D80 0000
0123456
So each word has its own address.
This is called word-addressable. Remember: In MARIE, one word is 16 bits.
FIT1047 Monash University
RAM
chip
row
Each chip = 214 =16384 bits
Column = 8 chips
Row = 8 chips
TotalRAMsize=64chipsX16384=26 X 214 =220 BITS=1Mbits
Each chip = 214 =16384 bits = 2KB TotalRAMsize=64chipsX2KB=128KB
A RAM module with 1 megabit (220 bit) capacity. Source: Wikipedia.
column
FIT1047 Monash University
RAM
Each module is made up of multiple chips Each chip has a fixed size L×W
● L: number of locations
● W: number of bits per location
E.g. 2K×8 means
= 2×210 locations of 8 bits each = 2×210×23 = 214 bits per chip
FIT1047 Monash University
RAM
64 chips, 214 bits each
Each chip = 214 =16384 bits
Column = 8 chips 220 BITS = 1Mbits
Row = 8 chips
Total RAM size = 64 chips X 16384 = 26 X 214 = 220 BITS = 1Mbits 2Byte=Word Addressable with 28 address with 8 bits
A RAM module with 1 megabit (220 bit) capacity. Source: Wikipedia.
How can we address individual locations?
1Byte= Byte Addressable with 212 address with 12 bits
FIT1047 Monash University
RTL of Basic MARIE Code
Register Transfer Language or RTL shows how the CPU (Assembler) works. Within the CPU there are many components including:
• AC or Accumulator : intermediate data is stored within the AC
• PC or Program Counter : as the name suggests it counts the current position of the code, each line has it’s own address
• MAR or Memory Access Register , stores or fetches the ‘data’ at the given address
• MBR or Memory Buffer Register , stores the data when being transferred
• IR or Instruction Register
Note that the end of each code, you will need to increment the PC by 1, so: PC ← PC + 1
FIT1047 Monash University
RTL of Basic MARIE Code
FIT1047 Monash University
RAM
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
211×8
8 rows
8 × 8 × 211 = 217 locations
Each address is 17 bits long!
8 columns = 8×211 bytes per row
FIT1047 Monash University
RAM addressing
Example address (17 bits):
010 110 00010010011
Row 2 Column 6 3-bit 3-bit
Byte addr 14710
We could implement this using MUXes!
● One MUX per chip selects the correct byte (here Byte address in dec: 147)
● One MUX per row selects the chip in a column (here: 6)
● One MUX per module selects the row (here: 2)
Indirect Addressing
FIT1047 Monash University
Accessing memory in MARIE
So far:
● Store X ● Load X ● Add X ● Jump X
Use value stored at X This is not very flexible!
FIT1047 Monash University
Indirect Addressing
Use address stored at X Comparison:
● Load X:
Load value stored at address X into AC.
● LoadI X:
Look up value stored at address X, use it as an address, load value from that address into AC.
(“indirect load”)
FIT1047 Monash University
Indirect Addressing
Example:
… 1004 3005 0206 7000 008E 0D80 0EA3 … 200 201 202 203 204 205 206
Load 202: Load value from address 202 into AC. Result: AC=0206.
LoadI 202: Look up value stored at address 202. Value is 0206. Then load value from that address into AC. Result: AC=0EA3.
FIT1047 Monash University
Indirect Addressing
Advantages:
● Addresses don’t need to be hard-coded into our program code
● We can compute the address!
● This enables important programming patterns, e.g. looping through a list of
values
FIT1047 Monash University
Indirect Addressing: Example
List of numbers
00COne, DEC1 00DSum, DEC0 00E Addr, HEX 00F 00F DEC 70
010 DEC 73
011 DEC 84
012 DEC 0
000 Loop, 001
002
003
004
005
006
007
008
009 End, 00A
00B
LoadI Addr /AC=70,73,84,0 SkipCond 800 /AC≠0, stop if 0 Jump End AddSum/AC,AC←AC+sum=0 Store Sum /Sum = AC
Load Addr Add One Store Addr Jump Loop Load Sum Output Halt
Addr incr 00F to 010
Program computes sum of a list of numbers.
Note: length of list is not hard-coded!
End of list indicated by DEC 0
FIT1047 Monash University
Indirect Addressing
Other instructions that work with indirect addressing:
● AddI X:
Use address stored at X, load value from that address and add to value currently stored in AC.
● JumpI X:
Jump to address stored at address X.
FIT1047 Monash University
RTL for LoadI X
1. MAR ← PC
2. MBR ← M[MAR]
3. IR ← MBR
4. PC ← PC+1
5. MAR ← X
6. MBR ← M[MAR]
7. MAR ← MBR
8. MBR ← M[MAR]
9. AC ← MBR
fetch
decode execute
Subroutines
FIT1047 Monash University
Subroutines
AKA procedures, functions, methods.. A piece of code that
● Has a well-defined function
● Needs to be executed often
● We can call, passing arguments to it
● Returns to where it was called from
FIT1047 Monash University
Subroutines in Machine Code
ISAs provide support for subroutines. In MARIE:
● JnSX:
Stores PC into X, then jumps to X+1. X hold the return address.
(“Jump and Store”)
● JumpI X:
Jump to address stored at X. Returns to the calling code.
FIT1047 Monash University
Subroutine Example
Load FortyTwo /AC holds 42
Store Print_Arg / Stores Contents of AC into Address Print_Arg (DEC 0) JnS Print / Jumps and Store: Stores value of PC at address Print=0 then
/ increments PC to Print+1 = 1 Halt /End the program
FortyTwo, DEC 42
/ Subroutine that prints one number
Print_Arg, DEC 0 Print, HEX 0
Load Print_Arg Output
JumpI Print
/ put argument here
/ holds return address
/ return to caller
FIT1047 Monash University
Subroutine Example
/Subroutine to double: subroutine_double.mas
loop, Input Store Val
Output Store Temp Jns ACDouble
Load Temp Output Jump loop
Val, Dec 00
/ ***** Subroutine *****
/ get value
ACDouble,
Dec 0
Load Temp
Add Temp
Store Temp JumpI ACDouble
/ return address placed here / Loading Temp
/ Doubling AC
/ Save Temp
Temp, Dec 0
/ return with value in AC / Empty: subroutine argument
/ / / /
/ /
store value as argument temp
call subroutine ACDouble.
Jns=Jumps and Stores: Stores PC at addr ACDouble and jumps to ACDouble+1 load subroutine temporary data
Infinite loop Initial Value
FIT1047 Monash University
Summary + Outlook
Memory:
● Stores bits, up to the program to interpret them
● Need one address per byte (in byte-addressable memory) or word (in word-
addressable memory, e.g. MARIE) Indirect addressing + subroutines:
● Important if we want to write more complex programs Labs this week:
● More MARIE programming
FIT1047 – Week 4
Part 2:
I/O and Interrupts
Reference: https://www.alexandriarepository.org/syllabus/introduction-to-computer-systems-networks-and-security/
FIT1047 Monash University
Recap
We’ve now seen
● CPUs
● Memory
There’s one component missing to complete the picture:
Input and Output
FIT1047 Monash University
Overview
Computers are useless without input/output. T oday:
● How does the CPU communicate with I/O devices?
● How does it handle time critical I/O?
FIT1047 Monash University
Early I/O
The first computers had very limited I/O.
● Punched paper tape or cards ( Also called Hollerith cards, IBM cards or paper cards)
● T eleprinters
Teletype with papertape punch and reader
Image source: Wikipedia
https://simple.wikipedia.org/wiki/Punched_tape https://commons.wikimedia.org/wiki/File:Teletype_with_pape rtape_punch_and_reader.jpg
FIT1047 Monash University
Modern I/O
● Keyboard, mouse, touch pad, touch screen, voice control, gestures, accelerometers, barometers, GPS, …
● Screens, printers, audio, smart home appliances, robots, … Also classed as I/O:
● Storage (hard disks, SSDs, SD cards)
● Network devices (WiFi, 4G, Ethernet)
FIT1047 Monash University
Modern I/O interfaces
I/O devices are now usually connected via interfaces:
● Standardised connectors and protocols
● Can be internal or external Examples: USB, PCIe, HDMI, …
https://en.wikipedia.org/wiki/PCI_Express
FIT1047 Monash University
I/O and the CPU
The CPU needs to be able to
● Receive data from an I/O device
● Send data to an I/O device
Why send to input devices?
● E.g. set sensitivity, calibrate, switch on/off, …
Why receive from output devices?
● E.g. check if ready, check if successful, …
FIT1047 Monash University
I/O and the CPU
I/O devices have their own registers.
E.g. keyboard device: register holds the currently pressed key How can we access these registers in machine code?
Two methods:
● memory-mapped and ● instruction-basedI/O.
FIT1047 Monash University
I/O Access Method 1: Memory-mapped
Each I/O register is “mapped” to a special memory address. We can then use
Load/Store.
Example: keyboard register is mapped to address A000.
● Load A000 loads currently pressed key into AC register. Example: printer register is mapped to address A100.
● Store A100 stores current AC value in printer register (which then gets printed out).
FIT1047 Monash University
I/O Access Method 1: Memory-mapped Advantages:
● No need for new instructions ● Simple
Disadvantages:
● We cannot use “mapped” addresses for memory any more
● The overall amount of usable memory is reduced (because some addresses
are now unavailable)
● Programs may accidentally access I/O (when a program has a bug)
FIT1047 Monash University
I/O Access Method 2: Instruction-based
● Add special I/O instructions to CPU ISA. For example, Input and Output.
● Each I/O register still has an address (as in memory-mapped I/O).
● But now these addresses are separate from the memory. Example:
Load A000 loads value from memory address A000
Input A000 loads value from I/O register A000 (e.g. the keyboard)
FIT1047 Monash University
When to perform I/O
Now we know how to communicate with I/O devices.
But most I/O devices are much, much slower than the CPU. So when should the CPU communicate with the device?
● How does it know that a key has been pressed?
● How does it know a printer is ready for receiving the next character? ●…
Two methods: programmed and interrupt-based I/O.
FIT1047 Monash University
Programmed I/O
Programmer adds checks into code to periodically checks I/O registers. Also called polling I/O.
Pseudocode:
while (true) {
if (keyboardRegister.canRead()) {
processKeyboardRegister();
} else if (printerRegister.canWrite()) {
processPrinterRegister(); }
}
FIT1047 Monash University
Programmed I/O
Advantages:
● Very simple (no extra hardware needed)
● Programmer can decide how often to poll
(e.g. 10000 times/second for network, 10 times/second for keyboard) Disadvantages:
● Programmer must be careful (poll registers regularly)
● Program is “I/O driven” (constructed around I/O)
● CPU is constantly in “busy loop”, causing high power usage
Programmed I/O is mostly used in embedded special-purpose systems.
FIT1047 Monash University
Interrupts
Opposite of polling:
● CPU is notified when it should communicate with I/O device
● CPU interrupts current program, executes special interrupt handler code,
then continues normal program
● Programmer can write separate code for main program and interrupt handler
(better separation of concerns)
FIT1047 Monash University
Interrupt signals
Device notifies CPU of pending interrupt by setting a bit in a special register. CPU checks before each fetch-decode-execute cycle:
● Is interrupt bit set? Call interrupt handler.
● Otherwise: normal fetch-decode-execute.
Let’s look at the RTL for this.
FIT1047 Monash University
RTL for Fetch with Interrupts
An interrupt handler is like a subroutine:
● Use JnS to
○ Store the return address
○ Jump to the code implementing the handler
● When the handler finishes, return to caller using JumpI RTL for JnS X:
• MBR ← PC
• M[MAR] ← MBR
• PC ← MAR
• PC ← PC + 1
Save current PC at X Jump to X+1
FIT1047 Monash University
RTL for Fetch with Interrupts
1. If InterruptBit is 1:
a. ClearInterruptBit
b. MAR ← InterruptHandler
c. MBR ← PC
d. M[MAR] ← MBR
e. PC ← MAR
f. PC ← PC + 1
2. MAR ← PC
3. MBR ← M[MAR]
4. IR ← MBR
5. PC ← PC+1
6. …
Normal fetch cycle
Save current PC at InterruptHandler Jump to InterruptHandler+1
FIT1047 Monash University
Interrupt Handler
Must leave the CPU in the same state as before the interrupt! ● For MARIE: Contents of AC must be the same as before
This is called a context switch. It can be achieved by
● Shadow registers:
When an interrupt happens, the CPU switches to a separate register file
Or
● In Programming the interrupt handler to save registers to memory (see next example)
FIT1047 Monash University
Interrupt Vectors
How can the interrupt handler distinguish interrupts from different devices?
● Each device is assigned an identification number
● When raising an interrupt, the device stores its number in a special register
● The interrupt handler uses that number to jump to different subroutines
● The list of subroutines (one per device) is called an interrupt vector
FIT1047 Monash University
Example Interrupt Handler (MARIE syntax)
InterruptHandler, HEX 0
Store SaveAC
Save AC register to memory
SaveAC, Destination, InterruptVecAddr, InterruptVec,
Load InterruptVecAddr Add InterruptDeviceID Store Destination JumpI Destination
HEX 0
HEX 0
ADR InterruptVec ADR KeyboardHandler ADR MouseHandler ADR PrinterHandler ADR SoundHandler
Add device number to address of interrupt vector
Jump to device handler
/ Temporary storage for AC register
/ Computed destination handler
/ device 0 / device 1 / device 2 / device 3
Restore AC register from memory Return to normal program
KeyboardHandler, … / do some work Load SaveAC
JumpI InterruptHandler
FIT1047 Monash University
Interrupts in x86 PCs
Original design:
● 15 interrupt request (IRQ) signals
● hardware must be configured to use
correct IRQ
● Results in conflicts
● e.g. setting jumper on a network or sound card
● devices have to share IRQs
FIT1047 Monash University
Interrupts in x86 PCs
https://en.wikipedia.org/wiki/Sound_card
FIT1047 Monash University
Modern Interrupts
Use Advanced Programmable Interrupt Controllers (APICs).
● Integrated into CPUs
● Support more IRQs (results in fewer
conflicts)
● Include high-resolution timers
○ E.g. cause an interrupt every millisecond
○ We will see later why that’s useful!
FIT1047 Monash University
Disadvantages of interrupt-based I/O
● Different devices may need different priorities
○ E.g. keyboard (very low) vs graphics card (very high priority)
○ Can be achieved using different interrupt signals
● All memory transfers run through the CPU
○ E.g. read byte from disk, store into memory; or
○ Read byte from memory, transfer to graphics card
● I/O devices are fully controlled by CPU Solution: DMA
FIT1047 Monash University
Comparison Chart
Operations
Interrupt
Polling
Basic operation
Device notify CPU that it needs CPU attention.
CPU constantly checks device status whether it needs CPU’s attention.
Mechanism
An interrupt is a hardware mechanism.
Polling is a Protocol.(software)
Servicing
Interrupt handler services the Device.
CPU services the device.
Identification
Interrupt-request(IRQ’s) line indicates that device needs servicing.
Command-ready bit indicates the device needs servicing.
CPU response
CPU is interrupted only when a device needs servicing through IRQ, which saves CPU cycles.
CPU has to wait and check whether a device needs servicing which wastes lots of CPU cycles.
Events
An interrupt can occur at any time.
CPU polls the devices at regular interval.
Efficiency
Interrupt becomes inefficient when devices keep on interrupting the CPU repeatedly.
Polling becomes inefficient when CPU rarely finds a device ready for service.
An Analogy
When door bell rings(IRQ request), then we check the door to see who has arrived.
In Polling, we constantly open the door to check if someone has arrived.
FIT1047 Monash University
Direct Memory Access (DMA)
Modern CPUs can delegate memory transfer to dedicated controller.
● Hard disk controller can transfer data directly to RAM
● Graphics card can fetch image directly from RAM
● CPU is free to perform other tasks
CPU and DMA controller share the data and address bus:
● Only one can perform memory transfer at the same time
FIT1047 Monash University
Summary
I/O
● Memory-mapped vs. instruction based
● Programmed (polling) vs interrupt-driven
Interrupts
● Require context switch (call subroutine to handle interrupt, save registers)
● Jump into interrupt vector (one handler per device)
DMA
● Off-load responsibility for data transfer to special controller
● Keeps CPU free to do useful tasks
FIT1047 Monash University
Outlook
Next lecture:
● Booting a computer
● BIOS / UEFI
● BIOS (Basic Input-Output system), Intel has announced plans to completely replace it with
UEFI(Unified Extended Firmware Interface Forum) by 2020.
https://en.wikipedia.org/wiki/BIOS
https://www.howtogeek.com/56958/htg-explains-how-uefi-will-replace- the-bios/