SCHOOL OF ENGINEERING
DIGITAL SYSTEM DESIGN 4 ELEE10007
Exam Date: 17/05/2017 From and To: 09:30-11:30 Exam Diet: May 2017 Please read full instructions before commencing writing
Exam paper information
Copyright By PowCoder代写 加微信 powcoder
• Paper consists of TWO Sections
• Candidates to answer THREE questions
• SECTION A: (ONE question) Answer whole section
• SECTION B: Answer TWO out of THREE questions
Special instructions
• Students should assume reasonable values for any data not given in a question nor available on a datasheet, and should make any such assumptions clear on their script.
• Students in any doubt as to the interpretation of the wording of a question, should make their own decision, and should state it clearly on their script.
• Please write your name in the space indicated at the right hand side on the front cover of the answer book. Also enter you examination number in the appropriate space on the front cover.
• Write ONLY your examination number on any extra sheets or worksheets used and firmly attach these to the answer book(s).
• This examination will be marked anonymously.
Special items
• Data sheets (two pages)
Convenor of Board of Examiners: Professor R Examiners: Professor P Mellor and Dr Z ECTION A Question A1
a) The MIPS architecture is an example of a RISC machine.
(i) Define the term RISC and explain why there has been a recent
move away from complex instruction machines. (2)
(ii) With respect to a single-cycle processor MIPS implementation,
and the four MIPS design principles, describe the main factors
that motivated designers to use a pipelined architecture. (2)
b) Describe the Von-Neumann bottleneck, use a diagram to show its
physical origin and, again using diagrams, describe how the two
Harvard architectures solve this problem. (4)
c) Assume that registers s0, s1, and s2 contain the values -10, 5 and 3 respectively.
In the MIPS code below, the third instruction is shown as machine- code instead of assembly language.
main:add $s3, $s1, $s1 #5+5 add $s4, $s1, $s0 #5+-10
exit: 0000 0010 0011 0010 1010 1000 0010 0010
(i) Use the MIPS green-card to decode this instruction. In your
answer you should provide the assembly language
representation and an appropriate comment to explain what the code is doing. (3)
(ii) Write a series of MIPS assembly instructions that implement the c-code fragment below, assume that $s1 stores i, $s2 stores j, and $s3 stores k. You should fully annotate your MIPS code.
while ( i < j )
{ k++ ; }i=i*2;
(i) Calculate the maximum reverse address (in hexadecimal) that
can be reached for this instruction. (2)
(ii) Explain why MIPS architectures include a shift-left-by-two unit.
What would be the implications if this unit were omitted? (2)
(iii) In your answer to part c)(ii) you may have used a pseudo- instruction: explain how the assembler deals with these.
d) Consider a 32-bit MIPS architecture and assume the programme counter is currently at a decimal address of 3248674260. Consider an i-format instruction which uses PC-relative addressing.
ELEE10007 Digital System Design 4 - May 2017
SECTION B Question B1
(i) With the aid of a suitable diagram, describe the hierarchy of memory and storage in a modern computer system with
reference to the purpose of the memory architecture. (5)
(ii) Explain the two forms of locality in cache memory systems and
how these are exploited to improve memory access times. (4)
(ii) Define the address structure if it is a directly mapped cache.
(iii) How does the address structure change if it is changed to an 8-way set-associative cache?
(iv) If this was changed to a fully associative cache, would you expect the hit time to increase or decrease? Give a reason for your answer.
Assume a directly mapped cache with the same characteristics as in part b). Consider a data streaming process that reads a 1024 KiB working set with the following byte address stream:
0 2 4 6 8 10 12 14 16 ...
(i) Calculate the miss rate for this address stream.
(ii) How does the miss rate change if the cache is changed to an 8-way set associative structure and what form of locality does this represent?
A cache system has the following characteristics: • Cache size of 32 KiB
• Block size of 16 words (64 bytes)
• 32 bit address size
(i) How many blocks are in the cache?
ELEE10007 Digital System Design 4 - May 2017
Question B2
a) Amdahl’s Law can be embodied by the following equation:
Execution time after improvement =
Execution time affected by improvement + Execution time unaffected
Amount of Improvement
(i) Explain how this applies to parallel computing systems and the concept of weak scaling versus strong scaling.
(ii) Consider a problem that involves the multiplication of two
floating point numbers and the sum of two 10x10 arrays of
integers. Assume that multiplication cannot take place at the
same time as addition and that each takes roughly the same
time. Calculate the relative speedup for running this problem
on a 10 core multiprocessor and a 100 core multiprocessor
system, over a single processor with the same characteristics. (2)
(iii) Comment on the efficiency of the parallel computer systems
used for the problem in part a)(ii). (2)
b) Flynn’s taxonomy defines four different computer architectures based on the arrangement of data and instruction streams. For each type provide a diagram illustrating the architecture and suggest an example implementation or application.
(i) SISD (2) (ii) SIMD (2) (iii) MISD (2) (iv) MIMD (2)
c) Figure B2c shows a roofline model diagram for a multicore processor.
(i) Which properties of the processor limit the performance for
arithmetic intensities of 1/2 and 2 respectively? (2)
(ii) Sketch the graph you would expect if the memory performance was improved to double the attainable speed. Comment on the performance gain.
Continued overleaf
ELEE10007 Digital System Design 4 - May 2017
1/8 1/4 1/2 1 2 4 8 16 Arithmetic Intensity: FLOPs/Byte Ratio
Figure B2c
ELEE10007 Digital System Design 4 - May 2017
Attainable GFLOPs/second
Question B3
a) Figure B3a shows part of a simple MIPS architecture, which is capable of implementing four instructions: load word, store word, branch-on-equal, and R-type arithmetic. The four R-type functions are shown within the ALU block. The two-bit AluOP is derived from the opcode by the main controller.
(i) What arithmetic functions does the ALU perform for load/store- word and branch instructions?
For full marks explain the features of the MIPS ISA that require
the ALU to perform these functions. (6)
(ii) Use the MIPS green card to determine the “funct” codes for
each of the four ALU functions shown in Figure B3a. (2)
(iii) Using information provided by Figure B3a and the MIPS green card, design a combinational logic block which implements the desired functionality of the ALU Controller. You should clearly show each step in your working and draw the full logic diagram.
Figure B3a
END OF PAPER
ELEE10007 Digital System Design 4 - May 2017
M I P S Reference Data CORE INSTRUCTION SET
ARITHMETIC CORE INSTRUCTION SET
OPCODE / FMT /FT / FUNCT (Hex) 11/8/1/-- 11/8/0/-- 0/--/--/1a 0/--/--/1b 11/10/--/0
11/11/--/0 11/10/--/y 11/11/--/y
11/10/--/3
11/11/--/3
11/10/--/2
11/11/--/2
11/10/--/1
11/11/--/1
31/--/--/--
35/--/--/--
0 /--/--/10 0 /--/--/12 10 /0/--/0 0/--/--/18 0/--/--/19 0/--/--/3 39/--/--/--
3d/--/--/--
NAME, MNEMONIC Add add Add Immediate addi Add Imm. Unsigned addiu Add Unsigned addu And and And Immediate andi
Branch On Equal beq Branch On Not Equal bne
divu add.s
Lo=R[rs]/R[rt]; Hi=R[rs]%R[rt] R Lo=R[rs]/R[rt]; Hi=R[rs]%R[rt]
Jump And Link Jump Register
Load Byte Unsigned
Load Halfword Unsigned
Load Linked
Load Upper Imm.
Or Immediate
Set Less Than
Set Less Than Imm.
Set Less Than Imm. Unsigned
Set Less Than Unsig. Shift Left Logical Shift Right Logical
Store Byte
Store Conditional
Store Halfword
Store Word Subtract
Subtract Unsigned
j J jal J jr R
lbu I lhu I
ll I lui I lw I nor R or R ori I slt R slti I
sltiu I sltu R
sll R srl R
sub R subu R
PC=JumpAddr R[31]=PC+8;PC=JumpAddr PC=R[rs] R[rt]={24’b0,M[R[rs]
+SignExtImm](7:0)} R[rt]={16’b0,M[R[rs]
+SignExtImm](15:0)} R[rt] = M[R[rs]+SignExtImm] R[rt] = {imm, 16’b0}
R[rt] = M[R[rs]+SignExtImm] R[rd] = ~ (R[rs] | R[rt])
R[rd] = R[rs] | R[rt]
R[rt] = R[rs] | ZeroExtImm
R[rd] = (R[rs] < R[rt]) ? 1 : 0 R[rt] = (R[rs] < SignExtImm)? 1 : R[rt] = (R[rs] < SignExtImm)
R[rd] = (R[rs] < R[rt]) ? 1 : 0
R[rd] = R[rt] << shamt
R[rd] = R[rt] >> shamt M[R[rs]+SignExtImm](7:0) =
R[rt](7:0) M[R[rs]+SignExtImm] = R[rt]; R[rt] = (atomic) ? 1 : 0 M[R[rs]+SignExtImm](15:0) =
(5) 2hex (5) 3hex
F[fd]=F[fs] – F[ft]
{F[fd],F[fd+1]} = {F[fs],F[fs+1]} –
{F[ft],F[ft+1]} F[rt]=M[R[rs]+SignExtImm] F[rt]=M[R[rs]+SignExtImm];
F[rt+1]=M[R[rs]+SignExtImm+4] R[rd] = Hi
R[rd] = Lo
R[rd] = CR[rs]
{Hi,Lo} = R[rs] * R[rt]
{Hi,Lo} = R[rs] * R[rt]
R[rd] = R[rt] >>> shamt M[R[rs]+SignExtImm] = F[rt] M[R[rs]+SignExtImm] = F[rt]; M[R[rs]+SignExtImm+4] = F[rt+1]
21 20 21 20
16 15 16 15
MAT OPERATION (in Verilog)
R R[rd] = R[rs] + R[rt]
I R[rt] = R[rs] + SignExtImm I R[rt] = R[rs] + SignExtImm
R R[rd] = R[rs] + R[rt] R R[rd] = R[rs] & R[rt]
I R[rt] = R[rs] & ZeroExtImm
I if(R[rs]==R[rt]) PC=PC+4+BranchAddr
I if(R[rs]!=R[rt]) PC=PC+4+BranchAddr
OPCODE / FUNCT (Hex)
(1) 0 / 20hex (1,2) 8hex
FOR- NAME, MNEMONIC MAT
Branch On FP True bc1t FI Branch On FP False bc1f FI
OPERATION if(FPcond)PC=PC+4+BranchAddr (4) if(!FPcond)PC=PC+4+BranchAddr (4)
M[R[rs]+SignExtImm] = R[rt] R[rd] = R[rs] – R[rt]
R[rd] = R[rs] – R[rt]
REGISTER NAME, NUMBER, USE, CALL CONVENTION
(2) (2,7) R[rt](15:0) (2)
PSEUDOINSTRUCTION SET
NAME Branch Less Than
Branch Greater Than
Branch Less Than or Equal
Branch Greater Than or Equal
Load Immediate
OPERATION if(R[rs]
if(R[rs]<=R[rt]) PC = Label if(R[rs]>=R[rt]) PC = Label R[rd] = immediate
R[rd] = R[rs]
0 / 08hex (2) 24hex
(2) 25hex (2,7) 30hex
0 (2) ahex
0 / 00hex 0 / 02hex
28hex 38hex 29hex
(2) 2bhex (1) 0 / 22hex 0 / 23hex
FP Divide Single FP Divide
FP Multiply Single FP Multiply Double
FP Subtract Single FP Subtract Double
Load FP Single Load FP
Move From Hi Move From Lo Move From Control Multiply
Multiply Unsigned Shift Right Arith. Store FP Single Store FP
div.s FR div.d FR mul.s FR mul.d FR sub.s FR sub.d FR
F[fd] = F[fs] / F[ft] {F[fd],F[fd+1]} = {F[fs],F[fs+1]} /
{F[ft],F[ft+1]} F[fd] = F[fs] * F[ft]
{F[fd],F[fd+1]} = {F[fs],F[fs+1]} * {F[ft],F[ft+1]}
(6) 0 / 2bhex
31 26 25 31 26 25
21 20 21 20
Divide Unsigned
FP Add Single
FP Compare Single c.x.s* FP Compare c.x.d* Double
FR F[fd ]= F[fs] + F[ft]
FR {F[fd],F[fd+1]} = {F[fs],F[fs+1]} +
{F[ft],F[ft+1]} FR FPcond = (F[fs] op F[ft]) ? 1 : 0
FR FPcond = ({F[fs],F[fs+1]} op {F[ft],F[ft+1]}) ? 1 : 0
* (x is eq, lt, or le) (op is ==, <, or <=) ( y is 32, 3c, or 3e)
mfhi mflo mfc0 mult multu sra swc1
R R R R R R I
FLOATING-POINT INSTRUCTION FORMATS
16 15 MNEMONIC
blt bgt ble bge li
NAME NUMBER USE
PRESERVED ACROSS A CALL?
$zero 0 The Constant Value 0 N.A.
$at 1 Assembler Temporary No
$v0-$v1 2-3 Values for Function Results No and Expression Evaluation
$a0-$a3 4-7 Arguments No
$t0-$t7 8-15 Temporaries No
$s0-$s7 16-23 Saved Temporaries Yes
$t8-$t9 24-25 Temporaries No
$k0-$k1 26-27 Reserved for OS Kernel No
$gp 28 Global Pointer Yes
$sp 29 Stack Pointer Yes
$fp 30 Frame Pointer Yes
$ra 31 Return Address No
BASIC INSTRUCTION FORMATS
(1) May cause overflow exception
(2) SignExtImm = { 16{immediate[15]}, immediate
(3) ZeroExtImm = { 16{1b’0}, immediate }
(4) BranchAddr = { 14{immediate[15]}, immediate,
(5) JumpAddr = { PC+4[31:28], address, 2’b0 }
(6) Operands considered unsigned numbers (vs. 2’s comp.)
(7) Atomic test&set pair; R[rt] = 1 if pair atomic, 0 if not atomic
Copyright 2009 by Elsevier, Inc., All rights reserved. From Patterson and Hennessy, Computer Organization and Design, 4th ed.
MIPS Reference Data Card (“ ”) 1. Pull along perforation to separate card 2. Fold bottom side (columns 3 and 4) together
OPCODES, BASE CONVERSION, ASCII SYMBOLS
IEEE 754 FLOATING-POINT STANDARD
(-1)S × (1 + Fraction) × 2(Exponent - Bias) where Single Precision Bias = 127,
Double Precision Bias = 1023.
IEEE Single Precision and Double Precision Formats:
31 30 2322
63 62 5251
MEMORY ALLOCATION
IEEE 754 Symbols
S.P. MAX = 255, D.P. MAX = 2047
MIPS opcode (31:26)
(1) MIPS funct (5:0)
(2) MIPS funct (5:0)
Deci- Hexa- ASCII Binary mal deci- Char- mal acter
Deci- Hexa- ASCII mal deci- Char- mal acter
(1) sll add.f sub.f j srl mul.f jal sra div.f
00 0000 0 0 NUL 00 0001 1 1 SOH 00 0010 2 2 STX 00 0011 3 3 ETX
64 40 @ 65 41 A 66 42 B 67 43 C
beq sllv sqrt.f bne abs.f blez srlv mov.f bgtz srav neg.f
00 0100 4 4 EOT 00 0101 5 5 ENQ 00 0110 6 6 ACK 00 0111 7 7 BEL
68 44 D 69 45 E 70 46 F 71 47 G
addi jr addiu jalr slti movz sltiu movn
00 1000 8 8 BS 00 1001 9 9 HT 00 1010 10 a LF 00 1011 11 b VT
72 48 H 73 49 I 74 4a J 75 4b K
andi syscall round.w.f ori break trunc.w.f xori ceil.w.f lui sync floor.w.f
00 1100 00 1101 00 1110 00 1111
12 c FF 13 d CR 14 e SO 15 f SI
76 4c L 77 4d M 78 4e N 79 4f O
mfhi (2) mthi
mflo movz.f mtlo movn.f
01 0000 01 0001 01 0010 01 0011
16 10 DLE 17 11 DC1 18 12 DC2 19 13 DC3
80 50 P 81 51 Q 82 52 R 83 53 S
01 0100 01 0101 01 0110 01 0111
20 14 DC4 21 15 NAK 22 16 SYN 23 17 ETB
84 54 T 85 55 U 86 56 V 87 57 W
mult multu div divu
01 1011 27 1b ESC
24 18 CAN 25 19 EM 26 1a SUB
88 58 X 89 59 Y 90 5a Z 91 5b [
01 1100 01 1101 01 1110 01 1111
28 1c FS 29 1d GS 30 1e RS 31 1f US
92 5c \ 93 5d ] 94 5e ^ 95 5f _
lb add cvt.s.f lh addu cvt.d.f lwl sub
10 0000 10 0001 10 0010 10 0011
32 20 Space 33 21 ! 34 22 " 35 23 #
96 60 ‘ 97 61 a 98 62 b 99 63 c
lbu and cvt.w.f lhu or
10 0100 10 0101 10 0110 10 0111
36 24 $ 37 25 % 38 26 & 39 27 ’
100 64 d 101 65 e 102 66 f 103 67 g
swl slt sw sltu
10 1000 10 1001 10 1010 10 1011
40 28 ( 41 29 ) 42 2a * 43 2b +
104 68 h 105 69 i 106 6a j 107 6b k
10 1100 10 1101 10 1110 10 1111
44 2c , 45 2d - 46 2e . 47 2f /
108 6c l 109 6d m 110 6e n 111 6f o
ll tge c.f.f lwc1 tgeu c.un.f lwc2 tlt c.eq.f pref tltu c.ueq.f
11 0000 11 0001 11 0010 11 0011
48 30 0 49 31 1 50 32 2 51 33 3
112 70 p 113 71 q 114 72 r 115 73 s
teq c.olt.f c.ult.f ldc2 tne c.ole.f c.ule.f
11 0100 11 0101 11 0110 11 0111
52 34 4 53 35 5 54 36 6 55 37 7
116 74 t 117 75 u 118 76 v 119 77 w
swc1 c.ngle.f
swc2 c.seq.f
11 1000 11 1001 11 1010 11 1011
56 38 8 57 39 9 58 3a : 59 3b ;
120 78 x 121 79 y 122 7a z 123 7b {
sdc1 c.nge.f
sdc2 c.le.f c.ngt.f
11 1100 11 1101 11 1110 11 1111
60 3c < 61 3d = 62 3e > 63 3f ?
124 7c | 125 7d } 126 7e ~ 127 7f DEL
1 to MAX – 1
± Fl. Pt. Num.
Higher Memory Addresses
Stack Grows
Lower Memory Addresses
UEI MLE 410
BD = Branch Delay, UM = User Mode, EL = Exception Level, IE =Interrupt Enable
EXCEPTION CODES
SIZE PREFIXES (10x for Disk, Communication; 2x for Memory)
if fmt(25:21)==17ten (11hex) f = d (double)
Copyright 2009 by Elsevier, Inc., All rights reserved. From Patterson and Hennessy, Computer Organization and Design, 4th ed.
Argument 6
Argument 5
Saved Registers
Local Variables
Dynamic Data
Static Data
7fff fffchex
1000 8000hex 1000 0000hex 0040 0000hex
STACK FRAME
DATA ALIGNMENT
Double Word
01234567 Value of three least significant bits of byte address (Big Endian)
EXCEPTION CONTROL REGISTERS: CAUSE AND STATUS
31 15 86 2
Interrupt Mask
Number Name Cause of Exception
Number Name Cause of Exception
0 Int Interrupt (hardware)
9 Exception
4 AdEL Address Error Exception (load or instruction fetch)
Reserved Instruction Exception
5 AdES Address Error Exception (store)
Coprocessor Unimplemented
Bus Error on Instruction Fetch
Arithmetic Overflow Exception
Bus Error on Load or Store
13 Tr Trap
8 Exception
15 FPE Floating Point Exception
PRE- SIZE FIX
PRE- SIZE FIX
PRE- SIZE FIX
PRE- SIZE FIX
103, 210 Kilo-
1015, 250 Peta-
10-3 milli-
10-15 femto-
106, 220 Mega-
1018, 260 Exa-
10-6 micro-
10-18 atto-
109, 230 Giga-
1021, 270 Zetta-
10-9 nano-
10-21 zepto-
1012, 240 Tera-
1024, 280 Yotta-
10-12 pico-
10-24 yocto-
(1) opcode(31:26) == 0
(2) opcode(31:26) == 17ten (11hex); if fmt(25:21)==16ten (10hex) f = s (single);
The symbol for each prefix is just its first letter, except μ is used for micro.
Exception Code
MIPS Reference Data Card (“ ”) 1. Pull along perforation to separate card 2. Fold bottom side (columns 3 and 4) together
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com