Reconfigurable computing
Small Embedded Systems
Unit 2.5 Instruction Set Architecture
Introduction
Processor architecture
High level languages and assembly languages
Big-endian and little-endian
Registers
The ALU needs to be provided with a source of data x, y and a place to store its results z
This is done in registers, which each contain a single word of data
Registers hold values read in from data memory and values ready to be written out to data memory
Various registers can be switched to x, y and z
[Figure: ALU with data inputs x and y, result output z, an Opcode input and Status lines, connected through a switching network to register 0, register 1 and register 2]
Register File
Registers are often arranged as a register file: a tiny multi-port memory, fed with three input addresses:
Source1 selects one of the registers to be fed to x
Source2 selects one of the registers to be fed to y
Destination selects one of the registers to receive the result z
Applying appropriate values to these addresses selects the data to be operated upon
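As a minimal sketch of the addressing described above, the register file can be modelled in C as a small array indexed by the three addresses. The names regfile_t, alu_add and regfile_step are hypothetical, the register count of 8 is an assumption, and the ALU is reduced to a single add operation for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_REGS 8  /* r0..r7; the register count is an assumption of this sketch */

typedef struct {
    uint16_t r[NUM_REGS];  /* each register holds a single 16-bit word */
} regfile_t;

/* Stand-in for the ALU: only addition is modelled here. */
uint16_t alu_add(uint16_t x, uint16_t y) {
    return (uint16_t)(x + y);
}

/* Source1 and Source2 select the registers fed to x and y;
   Destination selects the register that receives the result z. */
void regfile_step(regfile_t *rf, unsigned source1, unsigned source2,
                  unsigned destination) {
    assert(source1 < NUM_REGS && source2 < NUM_REGS && destination < NUM_REGS);
    uint16_t x = rf->r[source1];
    uint16_t y = rf->r[source2];
    rf->r[destination] = alu_add(x, y);
}
```

With Source1 = 4, Source2 = 5 and Destination = 7, a call such as regfile_step(&rf, 4, 5, 7) leaves the sum of r4 and r5 in r7 and does not change r4 or r5.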
[Figure: register file r0 to r7 connected to the ALU (Opcode, x, y, z, Status). Source1 = 4 feeds the value stored in r4 to x, Source2 = 5 feeds the value stored in r5 to y, and Destination = 7 stores the result z in r7]
Machine Instructions
A machine instruction is a binary word that encodes:
The operation to be carried out
The location of the operands
The location to store the result
Bit number:   15-9      8-6    5-3    2-0
Field:        opcode    dest   src1   src2
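The packing of the fields into one word can be sketched in C as follows. This assumes a 7-bit opcode in the most significant bits followed by three 3-bit register addresses in the order dest, src1, src2; the function name encode3 is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field positions for the 3-register format:
   opcode = bits 15..9, dest = bits 8..6, src1 = bits 5..3, src2 = bits 2..0. */
uint16_t encode3(unsigned opcode, unsigned dest, unsigned src1, unsigned src2) {
    assert(opcode < 128);                      /* must fit in 7 bits */
    assert(dest < 8 && src1 < 8 && src2 < 8);  /* must fit in 3 bits each */
    return (uint16_t)((opcode << 9) | (dest << 6) | (src1 << 3) | src2);
}
```

Each field is shifted to its position and the results are OR-ed together, so no field can overwrite another as long as the range checks hold.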
Machine Instructions
Here is a 16-bit instruction that would add the values stored in r4 and r5 and put the result into r7
Remember that our ALU opcode for add is 1
Bit number:   15-9      8-6    5-3    2-0
Field:        opcode    dest   src1   src2
Value:        0000001   111    100    101
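To check the example, here is a hedged C sketch that splits an instruction word back into its fields and executes it. It assumes the same layout as above (7-bit opcode, then 3-bit dest, src1 and src2); the names fields3_t, decode3 and execute3 are hypothetical, and only the add opcode is modelled.

```c
#include <assert.h>
#include <stdint.h>

#define OPCODE_ADD 1u  /* the ALU opcode for add given in the text */

typedef struct {
    unsigned opcode;  /* bits 15..9 (assumed position) */
    unsigned dest;    /* bits 8..6 */
    unsigned src1;    /* bits 5..3 */
    unsigned src2;    /* bits 2..0 */
} fields3_t;

/* Split a 16-bit instruction word into its four fields. */
fields3_t decode3(uint16_t word) {
    fields3_t f;
    f.opcode = (word >> 9) & 0x7Fu;
    f.dest   = (word >> 6) & 0x07u;
    f.src1   = (word >> 3) & 0x07u;
    f.src2   = word & 0x07u;
    return f;
}

/* Execute one instruction on an 8-entry register file.
   Only add is modelled; any other opcode trips the assertion. */
void execute3(uint16_t regs[8], uint16_t word) {
    fields3_t f = decode3(word);
    assert(f.opcode == OPCODE_ADD);
    regs[f.dest] = (uint16_t)(regs[f.src1] + regs[f.src2]);
}
```

Under these assumptions the bit pattern 0000001 111 100 101 is the word 0x03E5, and decode3(0x03E5) yields opcode 1, dest 7, src1 4 and src2 5.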
Instruction Formats
This is just one possible format
Many different formats are in use on different 16-bit processors
The format above has some disadvantages
3 bits for each address
Can only use 2^3 = 8 registers (more would be better)
7 bits for the opcode
Can have 2^7 = 128 different instructions
But if we use 4 bits for each register address, we have only 4 bits left for the opcode: 16 operations won’t be enough
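The arithmetic behind this trade-off can be checked with a few lines of C. This is only a sketch: the helper names opcode_bits and count_for_bits are hypothetical, and a fixed 16-bit instruction word is assumed.

```c
#include <assert.h>

#define INSTRUCTION_BITS 16u

/* Opcode bits left in a 16-bit word after reserving `fields` register
   addresses of `reg_bits` bits each. */
unsigned opcode_bits(unsigned reg_bits, unsigned fields) {
    assert(reg_bits * fields <= INSTRUCTION_BITS);
    return INSTRUCTION_BITS - reg_bits * fields;
}

/* Number of distinct values that fit in `bits` bits, i.e. 2^bits. */
unsigned long count_for_bits(unsigned bits) {
    assert(bits < 32);
    return 1UL << bits;
}
```

For three 3-bit addresses, opcode_bits(3, 3) gives 7 and count_for_bits(7) gives 128 instructions; for three 4-bit addresses, opcode_bits(4, 3) gives 4 and count_for_bits(4) gives only 16.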
Bit number:   15-9      8-6    5-3    2-0
Field:        opcode    dest   src1   src2
The 2-register Format
The format above uses 3 registers which can all be different
The 2-register format restricts our register use
The destination must always be the same as one of the sources
So we only need to give the address of the other source
This allows us to have 6 bits for the opcode (64 different instructions) and 5 bits for each register address (32 registers in the register file)
This (with modifications) is the approach used in the Atmel ATmega328P
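A minimal C sketch of the 2-register idea follows. It assumes the simplified layout of a 6-bit opcode followed by a 5-bit dest and a 5-bit source; this is not the actual ATmega328P encoding, and the names encode2 and add2 are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: opcode = bits 15..10, dest = bits 9..5, source = bits 4..0. */
uint16_t encode2(unsigned opcode, unsigned dest, unsigned source) {
    assert(opcode < 64);               /* must fit in 6 bits */
    assert(dest < 32 && source < 32);  /* must fit in 5 bits each */
    return (uint16_t)((opcode << 10) | (dest << 5) | source);
}

/* Two-register add: the destination doubles as one of the sources,
   so its old value is overwritten by the result. */
void add2(uint16_t regs[32], unsigned dest, unsigned source) {
    assert(dest < 32 && source < 32);
    regs[dest] = (uint16_t)(regs[dest] + regs[source]);
}
```

The cost of the shorter encoding is visible in add2: to keep the old value of the destination register, the program must first copy it elsewhere.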
3-register format:
Bit number:   15-9      8-6    5-3    2-0
Field:        opcode    dest   src1   src2
2-register format:
Bit number:   15-10     9-5    4-0
Field:        opcode    dest   source
Additional Opcodes
The discussion above is a simplified picture
We would need further opcodes to handle:
Load a word from memory into a register
LOAD r1=MEM[0x0100]
Reads the value stored at memory location 0x0100 into r1
Load literal values into a register
LOAD r1=0x0100
Sets the value of r1 to the literal value 0x0100
Store a register value into memory
Jump to a different point in the instruction stream
Other ALU instructions (e.g. shift, rotate)
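To illustrate how these extra operations differ from ALU instructions, here is a hedged C sketch of a toy machine that executes them. Everything in it is hypothetical: the type and function names, the memory size, the byte order within a word, and the representation of an instruction as a struct rather than an encoded word.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_BYTES 0x0200u  /* hypothetical data memory size */
#define NUM_REGS  8u

typedef enum { OP_LOAD_MEM, OP_LOAD_LIT, OP_STORE, OP_JUMP } op_t;

typedef struct {
    op_t     op;
    unsigned reg;    /* register operand (ignored by OP_JUMP) */
    uint16_t value;  /* memory address, literal value, or jump target */
} instr_t;

typedef struct {
    uint16_t regs[NUM_REGS];
    uint8_t  mem[MEM_BYTES];  /* byte-addressed data memory */
    uint16_t pc;              /* index of the next instruction */
} cpu_t;

/* Read a 16-bit word, low byte at the lower address (an assumption). */
uint16_t mem_read16(const cpu_t *c, uint16_t addr) {
    assert(addr + 1u < MEM_BYTES);
    return (uint16_t)(c->mem[addr] | (c->mem[addr + 1u] << 8));
}

/* Write a 16-bit word using the same byte order. */
void mem_write16(cpu_t *c, uint16_t addr, uint16_t v) {
    assert(addr + 1u < MEM_BYTES);
    c->mem[addr] = (uint8_t)(v & 0xFFu);
    c->mem[addr + 1u] = (uint8_t)(v >> 8);
}

/* Execute one instruction and update the program counter. */
void step(cpu_t *c, instr_t in) {
    assert(in.op == OP_JUMP || in.reg < NUM_REGS);
    c->pc++;  /* default: continue with the next instruction */
    switch (in.op) {
    case OP_LOAD_MEM: c->regs[in.reg] = mem_read16(c, in.value); break;
    case OP_LOAD_LIT: c->regs[in.reg] = in.value; break;
    case OP_STORE:    mem_write16(c, in.value, c->regs[in.reg]); break;
    case OP_JUMP:     c->pc = in.value; break;
    }
}
```

Note the difference between the two loads: OP_LOAD_MEM treats 0x0100 as an address and fetches what is stored there, while OP_LOAD_LIT puts the number 0x0100 itself into the register.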
High Level and Assembly Languages
High level languages, such as C, are designed for humans to express the computation that they wish to perform
A program written in a high level language is compiled into machine code
Assembly language is a human-readable version of machine code and shows the detail of what the machine is doing
High Level and Assembly Languages
Suppose we have a piece of C code,
    uint16_t a, b, c;
    c = b+a;
with variables stored at these data memory locations
    a: 0x0100
    b: 0x0102
    c: 0x0104
If we compile on a micro with 16-bit ALU, data bus and data memory words, the assembly code might look like this:
    LOAD r1=MEM[0x0100]        Load variable a into register 1
    LOAD r2=MEM[0x0102]        Load variable b into register 2
    ADD r1=r2+r1               Compute the result
    STORE r1 to MEM[0x0104]    Store register 1 into memory location of variable c
High Level and Assembly Languages
If we compile on a micro with 8-bit ALU, data bus and data memory words, the assembly code might look like this:
    LOAD r1=MEM[0x0100]        Load low byte of variable a into r1
    LOAD r2=MEM[0x0101]        Load high byte of variable a into r2
    LOAD r3=MEM[0x0102]        Load low byte of variable b into r3
    LOAD r4=MEM[0x0103]        Load high byte of variable b into r4
    ADD r1=r3+r1               Compute the low byte of the result
    ADDC r2=r4+r2              Add-with-carry the high byte of the result
    STORE r1 to MEM[0x0104]    Store low byte of variable c
    STORE r2 to MEM[0x0105]    Store high byte of variable c
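The way an 8-bit machine builds a 16-bit addition from an add followed by an add-with-carry can be sketched in C. The function names add8 and add16_via_8bit are hypothetical; the carry is passed explicitly here, whereas a real processor would keep it in a status flag.

```c
#include <assert.h>
#include <stdint.h>

/* 8-bit add that also reports the carry out of bit 7. */
uint8_t add8(uint8_t x, uint8_t y, unsigned carry_in, unsigned *carry_out) {
    assert(carry_in <= 1);
    unsigned sum = (unsigned)x + (unsigned)y + carry_in;  /* at most 0x1FF */
    *carry_out = sum >> 8;
    return (uint8_t)(sum & 0xFFu);
}

/* 16-bit add built from two 8-bit adds, mirroring ADD followed by ADDC. */
uint16_t add16_via_8bit(uint16_t a, uint16_t b) {
    unsigned carry_lo, carry_hi;
    /* ADD: low bytes, no carry in */
    uint8_t lo = add8((uint8_t)(a & 0xFFu), (uint8_t)(b & 0xFFu), 0, &carry_lo);
    /* ADDC: high bytes plus the carry from the low-byte add */
    uint8_t hi = add8((uint8_t)(a >> 8), (uint8_t)(b >> 8), carry_lo, &carry_hi);
    (void)carry_hi;  /* a carry out of bit 15 is discarded, as in uint16_t arithmetic */
    return (uint16_t)(((unsigned)hi << 8) | lo);
}
```

For example, adding 0x12FF and 0x0001 produces a low byte of 0x00 with a carry of 1, so the high byte becomes 0x12 + 0x00 + 1 = 0x13 and the result is 0x1300.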
Stages of Compilation of C Program
Pre-processor: remove comments, expand macros, expand included files
Compiler: convert to assembly
Assembler: convert to machine code
Linker: if multiple files were compiled, link them all together; link to library functions
High Level and Assembly Languages
High level languages, such as C, are designed to be as platform independent as possible
Assembly language exposes the programmer to machine-dependent issues (e.g. size of instruction and words, location of resources in memory, etc.)
Assembly languages differ greatly between different processor families
Assembly programming by humans is sometimes needed to optimise speed or to deal with awkward machine-dependent details (e.g. interrupt handling), but normally humans would use a high level language
Endian-ness
In the preceding, we made an assumption about which way round the bytes of a number are stored
In practice, there are two different arrangements that can be used by processors:
Big endian: store most significant byte first
Little endian: store least significant byte first
Suppose variable a has the value 0x1234 and is stored at memory address 0x0100
This is how the bytes would be laid out in memory:
Memory location    Big endian    Little endian
0x0100             0x12          0x34
0x0101             0x34          0x12
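The two byte orders can be sketched in C as follows. The function names are hypothetical; the stores write the bytes explicitly, so they give the same layout on any host, while host_is_little_endian inspects how the machine running the code lays out a uint16_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Big endian: most significant byte at the lower address. */
void store16_big(uint8_t *mem, uint16_t v) {
    assert(mem != NULL);
    mem[0] = (uint8_t)(v >> 8);
    mem[1] = (uint8_t)(v & 0xFFu);
}

/* Little endian: least significant byte at the lower address. */
void store16_little(uint8_t *mem, uint16_t v) {
    assert(mem != NULL);
    mem[0] = (uint8_t)(v & 0xFFu);
    mem[1] = (uint8_t)(v >> 8);
}

/* Returns 1 if the machine running this code is little endian, else 0. */
int host_is_little_endian(void) {
    uint16_t probe = 0x1234;
    const uint8_t *bytes = (const uint8_t *)&probe;
    return bytes[0] == 0x34;
}
```

Storing 0x1234 with store16_big gives the bytes 0x12, 0x34 in increasing address order, and store16_little gives 0x34, 0x12.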
Summary
Machine instructions encode the ALU opcode and the addresses of the operands
Different processors make different compromises on how the encoding should be done
Assembly language is human readable and has a direct correspondence to the machine instructions
High level languages are compiled into assembly language and then assembled into machine code