CS代考 ELEE10007

SCHOOL OF ENGINEERING
DIGITAL SYSTEM DESIGN 4 ELEE10007
Exam Date: 17/05/2017 From and To: 09:30-11:30 Exam Diet: May 2017 Please read full instructions before commencing writing
Exam paper information

Copyright By PowCoder代写 加微信 powcoder

• Paper consists of TWO Sections
• Candidates to answer THREE questions
• SECTION A: (ONE question) Answer whole section
• SECTION B: Answer TWO out of THREE questions
Special instructions
• Students should assume reasonable values for any data not given in a question nor available on a datasheet, and should make any such assumptions clear on their script.
• Students in any doubt as to the interpretation of the wording of a question, should make their own decision, and should state it clearly on their script.
• Please write your name in the space indicated at the right hand side on the front cover of the answer book. Also enter you examination number in the appropriate space on the front cover.
• Write ONLY your examination number on any extra sheets or worksheets used and firmly attach these to the answer book(s).
• This examination will be marked anonymously.
Special items
• Data sheets (two pages)
Convenor of Board of Examiners: Professor R Examiners: Professor P Mellor and Dr Z ECTION A Question A1
a) The MIPS architecture is an example of a RISC machine.
(i) Define the term RISC and explain why there has been a recent
move away from complex instruction machines. (2)
(ii) With respect to a single-cycle processor MIPS implementation,
and the four MIPS design principles, describe the main factors
that motivated designers to use a pipelined architecture. (2)
b) Describe the Von-Neumann bottleneck, use a diagram to show its
physical origin and, again using diagrams, describe how the two
Harvard architectures solve this problem. (4)
c) Assume that registers s0, s1, and s2 contain the values -10, 5 and 3 respectively.
In the MIPS code below, the third instruction is shown as machine- code instead of assembly language.
main:add $s3, $s1, $s1 #5+5 add $s4, $s1, $s0 #5+-10
exit: 0000 0010 0011 0010 1010 1000 0010 0010
(i) Use the MIPS green-card to decode this instruction. In your
answer you should provide the assembly language
representation and an appropriate comment to explain what the code is doing. (3)
(ii) Write a series of MIPS assembly instructions that implement the c-code fragment below, assume that $s1 stores i, $s2 stores j, and $s3 stores k. You should fully annotate your MIPS code.
while ( i < j ) { k++ ; }i=i*2; (i) Calculate the maximum reverse address (in hexadecimal) that can be reached for this instruction. (2) (ii) Explain why MIPS architectures include a shift-left-by-two unit. What would be the implications if this unit were omitted? (2) (iii) In your answer to part c)(ii) you may have used a pseudo- instruction: explain how the assembler deals with these. d) Consider a 32-bit MIPS architecture and assume the programme counter is currently at a decimal address of 3248674260. Consider an i-format instruction which uses PC-relative addressing. ELEE10007 Digital System Design 4 - May 2017 SECTION B Question B1 (i) With the aid of a suitable diagram, describe the hierarchy of memory and storage in a modern computer system with reference to the purpose of the memory architecture. (5) (ii) Explain the two forms of locality in cache memory systems and how these are exploited to improve memory access times. (4) (ii) Define the address structure if it is a directly mapped cache. (iii) How does the address structure change if it is changed to an 8-way set-associative cache? (iv) If this was changed to a fully associative cache, would you expect the hit time to increase or decrease? Give a reason for your answer. Assume a directly mapped cache with the same characteristics as in part b). Consider a data streaming process that reads a 1024 KiB working set with the following byte address stream: 0 2 4 6 8 10 12 14 16 ... (i) Calculate the miss rate for this address stream. (ii) How does the miss rate change if the cache is changed to an 8-way set associative structure and what form of locality does this represent? A cache system has the following characteristics: • Cache size of 32 KiB • Block size of 16 words (64 bytes) • 32 bit address size (i) How many blocks are in the cache? ELEE10007 Digital System Design 4 - May 2017 Question B2 a) Amdahl’s Law can be embodied by the following equation: Execution time after improvement = Execution time affected by improvement + Execution time unaffected Amount of Improvement (i) Explain how this applies to parallel computing systems and the concept of weak scaling versus strong scaling. (ii) Consider a problem that involves the multiplication of two floating point numbers and the sum of two 10x10 arrays of integers. Assume that multiplication cannot take place at the same time as addition and that each takes roughly the same time. Calculate the relative speedup for running this problem on a 10 core multiprocessor and a 100 core multiprocessor system, over a single processor with the same characteristics. (2) (iii) Comment on the efficiency of the parallel computer systems used for the problem in part a)(ii). (2) b) Flynn’s taxonomy defines four different computer architectures based on the arrangement of data and instruction streams. For each type provide a diagram illustrating the architecture and suggest an example implementation or application. (i) SISD (2) (ii) SIMD (2) (iii) MISD (2) (iv) MIMD (2) c) Figure B2c shows a roofline model diagram for a multicore processor. (i) Which properties of the processor limit the performance for arithmetic intensities of 1/2 and 2 respectively? (2) (ii) Sketch the graph you would expect if the memory performance was improved to double the attainable speed. Comment on the performance gain. Continued overleaf ELEE10007 Digital System Design 4 - May 2017 1/8 1/4 1/2 1 2 4 8 16 Arithmetic Intensity: FLOPs/Byte Ratio Figure B2c ELEE10007 Digital System Design 4 - May 2017 Attainable GFLOPs/second Question B3 a) Figure B3a shows part of a simple MIPS architecture, which is capable of implementing four instructions: load word, store word, branch-on-equal, and R-type arithmetic. The four R-type functions are shown within the ALU block. The two-bit AluOP is derived from the opcode by the main controller. (i) What arithmetic functions does the ALU perform for load/store- word and branch instructions? For full marks explain the features of the MIPS ISA that require the ALU to perform these functions. (6) (ii) Use the MIPS green card to determine the “funct” codes for each of the four ALU functions shown in Figure B3a. (2) (iii) Using information provided by Figure B3a and the MIPS green card, design a combinational logic block which implements the desired functionality of the ALU Controller. You should clearly show each step in your working and draw the full logic diagram. Figure B3a END OF PAPER ELEE10007 Digital System Design 4 - May 2017 M I P S Reference Data CORE INSTRUCTION SET ARITHMETIC CORE INSTRUCTION SET OPCODE / FMT /FT / FUNCT (Hex) 11/8/1/-- 11/8/0/-- 0/--/--/1a 0/--/--/1b 11/10/--/0 11/11/--/0 11/10/--/y 11/11/--/y 11/10/--/3 11/11/--/3 11/10/--/2 11/11/--/2 11/10/--/1 11/11/--/1 31/--/--/-- 35/--/--/-- 0 /--/--/10 0 /--/--/12 10 /0/--/0 0/--/--/18 0/--/--/19 0/--/--/3 39/--/--/-- 3d/--/--/-- NAME, MNEMONIC Add add Add Immediate addi Add Imm. Unsigned addiu Add Unsigned addu And and And Immediate andi Branch On Equal beq Branch On Not Equal bne divu add.s Lo=R[rs]/R[rt]; Hi=R[rs]%R[rt] R Lo=R[rs]/R[rt]; Hi=R[rs]%R[rt] Jump And Link Jump Register Load Byte Unsigned Load Halfword Unsigned Load Linked Load Upper Imm. Or Immediate Set Less Than Set Less Than Imm. Set Less Than Imm. Unsigned Set Less Than Unsig. Shift Left Logical Shift Right Logical Store Byte Store Conditional Store Halfword Store Word Subtract Subtract Unsigned j J jal J jr R lbu I lhu I ll I lui I lw I nor R or R ori I slt R slti I sltiu I sltu R sll R srl R sub R subu R PC=JumpAddr R[31]=PC+8;PC=JumpAddr PC=R[rs] R[rt]={24’b0,M[R[rs] +SignExtImm](7:0)} R[rt]={16’b0,M[R[rs] +SignExtImm](15:0)} R[rt] = M[R[rs]+SignExtImm] R[rt] = {imm, 16’b0} R[rt] = M[R[rs]+SignExtImm] R[rd] = ~ (R[rs] | R[rt]) R[rd] = R[rs] | R[rt] R[rt] = R[rs] | ZeroExtImm R[rd] = (R[rs] < R[rt]) ? 1 : 0 R[rt] = (R[rs] < SignExtImm)? 1 : R[rt] = (R[rs] < SignExtImm) R[rd] = (R[rs] < R[rt]) ? 1 : 0 R[rd] = R[rt] << shamt R[rd] = R[rt] >> shamt M[R[rs]+SignExtImm](7:0) =
R[rt](7:0) M[R[rs]+SignExtImm] = R[rt]; R[rt] = (atomic) ? 1 : 0 M[R[rs]+SignExtImm](15:0) =
(5) 2hex (5) 3hex
F[fd]=F[fs] – F[ft]
{F[fd],F[fd+1]} = {F[fs],F[fs+1]} –
{F[ft],F[ft+1]} F[rt]=M[R[rs]+SignExtImm] F[rt]=M[R[rs]+SignExtImm];
F[rt+1]=M[R[rs]+SignExtImm+4] R[rd] = Hi
R[rd] = Lo
R[rd] = CR[rs]
{Hi,Lo} = R[rs] * R[rt]
{Hi,Lo} = R[rs] * R[rt]
R[rd] = R[rt] >>> shamt M[R[rs]+SignExtImm] = F[rt] M[R[rs]+SignExtImm] = F[rt]; M[R[rs]+SignExtImm+4] = F[rt+1]
21 20 21 20
16 15 16 15
MAT OPERATION (in Verilog)
R R[rd] = R[rs] + R[rt]
I R[rt] = R[rs] + SignExtImm I R[rt] = R[rs] + SignExtImm
R R[rd] = R[rs] + R[rt] R R[rd] = R[rs] & R[rt]
I R[rt] = R[rs] & ZeroExtImm
I if(R[rs]==R[rt]) PC=PC+4+BranchAddr
I if(R[rs]!=R[rt]) PC=PC+4+BranchAddr
OPCODE / FUNCT (Hex)
(1) 0 / 20hex (1,2) 8hex
FOR- NAME, MNEMONIC MAT
Branch On FP True bc1t FI Branch On FP False bc1f FI
OPERATION if(FPcond)PC=PC+4+BranchAddr (4) if(!FPcond)PC=PC+4+BranchAddr (4)
M[R[rs]+SignExtImm] = R[rt] R[rd] = R[rs] – R[rt]
R[rd] = R[rs] – R[rt]
REGISTER NAME, NUMBER, USE, CALL CONVENTION
(2) (2,7) R[rt](15:0) (2)
PSEUDOINSTRUCTION SET
NAME Branch Less Than
Branch Greater Than
Branch Less Than or Equal
Branch Greater Than or Equal
Load Immediate
OPERATION if(R[rs]R[rt]) PC = Label
if(R[rs]<=R[rt]) PC = Label if(R[rs]>=R[rt]) PC = Label R[rd] = immediate
R[rd] = R[rs]
0 / 08hex (2) 24hex
(2) 25hex (2,7) 30hex
0 (2) ahex
0 / 00hex 0 / 02hex
28hex 38hex 29hex
(2) 2bhex (1) 0 / 22hex 0 / 23hex
FP Divide Single FP Divide
FP Multiply Single FP Multiply Double
FP Subtract Single FP Subtract Double
Load FP Single Load FP
Move From Hi Move From Lo Move From Control Multiply
Multiply Unsigned Shift Right Arith. Store FP Single Store FP
div.s FR div.d FR mul.s FR mul.d FR sub.s FR sub.d FR
F[fd] = F[fs] / F[ft] {F[fd],F[fd+1]} = {F[fs],F[fs+1]} /
{F[ft],F[ft+1]} F[fd] = F[fs] * F[ft]
{F[fd],F[fd+1]} = {F[fs],F[fs+1]} * {F[ft],F[ft+1]}
(6) 0 / 2bhex
31 26 25 31 26 25
21 20 21 20
Divide Unsigned
FP Add Single
FP Compare Single c.x.s* FP Compare c.x.d* Double
FR F[fd ]= F[fs] + F[ft]
FR {F[fd],F[fd+1]} = {F[fs],F[fs+1]} +
{F[ft],F[ft+1]} FR FPcond = (F[fs] op F[ft]) ? 1 : 0
FR FPcond = ({F[fs],F[fs+1]} op {F[ft],F[ft+1]}) ? 1 : 0
* (x is eq, lt, or le) (op is ==, <, or <=) ( y is 32, 3c, or 3e) mfhi mflo mfc0 mult multu sra swc1 R R R R R R I FLOATING-POINT INSTRUCTION FORMATS 16 15 MNEMONIC blt bgt ble bge li NAME NUMBER USE PRESERVED ACROSS A CALL? $zero 0 The Constant Value 0 N.A. $at 1 Assembler Temporary No $v0-$v1 2-3 Values for Function Results No and Expression Evaluation $a0-$a3 4-7 Arguments No $t0-$t7 8-15 Temporaries No $s0-$s7 16-23 Saved Temporaries Yes $t8-$t9 24-25 Temporaries No $k0-$k1 26-27 Reserved for OS Kernel No $gp 28 Global Pointer Yes $sp 29 Stack Pointer Yes $fp 30 Frame Pointer Yes $ra 31 Return Address No BASIC INSTRUCTION FORMATS (1) May cause overflow exception (2) SignExtImm = { 16{immediate[15]}, immediate (3) ZeroExtImm = { 16{1b’0}, immediate } (4) BranchAddr = { 14{immediate[15]}, immediate, (5) JumpAddr = { PC+4[31:28], address, 2’b0 } (6) Operands considered unsigned numbers (vs. 2’s comp.) (7) Atomic test&set pair; R[rt] = 1 if pair atomic, 0 if not atomic Copyright 2009 by Elsevier, Inc., All rights reserved. From Patterson and Hennessy, Computer Organization and Design, 4th ed. MIPS Reference Data Card (“ ”) 1. Pull along perforation to separate card 2. Fold bottom side (columns 3 and 4) together OPCODES, BASE CONVERSION, ASCII SYMBOLS IEEE 754 FLOATING-POINT STANDARD (-1)S × (1 + Fraction) × 2(Exponent - Bias) where Single Precision Bias = 127, Double Precision Bias = 1023. IEEE Single Precision and Double Precision Formats: 31 30 2322 63 62 5251 MEMORY ALLOCATION IEEE 754 Symbols S.P. MAX = 255, D.P. MAX = 2047 MIPS opcode (31:26) (1) MIPS funct (5:0) (2) MIPS funct (5:0) Deci- Hexa- ASCII Binary mal deci- Char- mal acter Deci- Hexa- ASCII mal deci- Char- mal acter (1) sll add.f sub.f j srl mul.f jal sra div.f 00 0000 0 0 NUL 00 0001 1 1 SOH 00 0010 2 2 STX 00 0011 3 3 ETX 64 40 @ 65 41 A 66 42 B 67 43 C beq sllv sqrt.f bne abs.f blez srlv mov.f bgtz srav neg.f 00 0100 4 4 EOT 00 0101 5 5 ENQ 00 0110 6 6 ACK 00 0111 7 7 BEL 68 44 D 69 45 E 70 46 F 71 47 G addi jr addiu jalr slti movz sltiu movn 00 1000 8 8 BS 00 1001 9 9 HT 00 1010 10 a LF 00 1011 11 b VT 72 48 H 73 49 I 74 4a J 75 4b K andi syscall round.w.f ori break trunc.w.f xori ceil.w.f lui sync floor.w.f 00 1100 00 1101 00 1110 00 1111 12 c FF 13 d CR 14 e SO 15 f SI 76 4c L 77 4d M 78 4e N 79 4f O mfhi (2) mthi mflo movz.f mtlo movn.f 01 0000 01 0001 01 0010 01 0011 16 10 DLE 17 11 DC1 18 12 DC2 19 13 DC3 80 50 P 81 51 Q 82 52 R 83 53 S 01 0100 01 0101 01 0110 01 0111 20 14 DC4 21 15 NAK 22 16 SYN 23 17 ETB 84 54 T 85 55 U 86 56 V 87 57 W mult multu div divu 01 1011 27 1b ESC 24 18 CAN 25 19 EM 26 1a SUB 88 58 X 89 59 Y 90 5a Z 91 5b [ 01 1100 01 1101 01 1110 01 1111 28 1c FS 29 1d GS 30 1e RS 31 1f US 92 5c \ 93 5d ] 94 5e ^ 95 5f _ lb add cvt.s.f lh addu cvt.d.f lwl sub 10 0000 10 0001 10 0010 10 0011 32 20 Space 33 21 ! 34 22 " 35 23 # 96 60 ‘ 97 61 a 98 62 b 99 63 c lbu and cvt.w.f lhu or 10 0100 10 0101 10 0110 10 0111 36 24 $ 37 25 % 38 26 & 39 27 ’ 100 64 d 101 65 e 102 66 f 103 67 g swl slt sw sltu 10 1000 10 1001 10 1010 10 1011 40 28 ( 41 29 ) 42 2a * 43 2b + 104 68 h 105 69 i 106 6a j 107 6b k 10 1100 10 1101 10 1110 10 1111 44 2c , 45 2d - 46 2e . 47 2f / 108 6c l 109 6d m 110 6e n 111 6f o ll tge c.f.f lwc1 tgeu c.un.f lwc2 tlt c.eq.f pref tltu c.ueq.f 11 0000 11 0001 11 0010 11 0011 48 30 0 49 31 1 50 32 2 51 33 3 112 70 p 113 71 q 114 72 r 115 73 s teq c.olt.f c.ult.f ldc2 tne c.ole.f c.ule.f 11 0100 11 0101 11 0110 11 0111 52 34 4 53 35 5 54 36 6 55 37 7 116 74 t 117 75 u 118 76 v 119 77 w swc1 c.ngle.f swc2 c.seq.f 11 1000 11 1001 11 1010 11 1011 56 38 8 57 39 9 58 3a : 59 3b ; 120 78 x 121 79 y 122 7a z 123 7b { sdc1 c.nge.f sdc2 c.le.f c.ngt.f 11 1100 11 1101 11 1110 11 1111 60 3c < 61 3d = 62 3e > 63 3f ?
124 7c | 125 7d } 126 7e ~ 127 7f DEL
1 to MAX – 1
± Fl. Pt. Num.
Higher Memory Addresses
Stack Grows
Lower Memory Addresses
UEI MLE 410
BD = Branch Delay, UM = User Mode, EL = Exception Level, IE =Interrupt Enable
EXCEPTION CODES
SIZE PREFIXES (10x for Disk, Communication; 2x for Memory)
if fmt(25:21)==17ten (11hex) f = d (double)
Copyright 2009 by Elsevier, Inc., All rights reserved. From Patterson and Hennessy, Computer Organization and Design, 4th ed.
Argument 6
Argument 5
Saved Registers
Local Variables
Dynamic Data
Static Data
7fff fffchex
1000 8000hex 1000 0000hex 0040 0000hex
STACK FRAME
DATA ALIGNMENT
Double Word

01234567 Value of three least significant bits of byte address (Big Endian)
EXCEPTION CONTROL REGISTERS: CAUSE AND STATUS
31 15 86 2
Interrupt Mask
Number Name Cause of Exception
Number Name Cause of Exception
0 Int Interrupt (hardware)
9 Exception
4 AdEL Address Error Exception (load or instruction fetch)
Reserved Instruction Exception
5 AdES Address Error Exception (store)
Coprocessor Unimplemented
Bus Error on Instruction Fetch
Arithmetic Overflow Exception
Bus Error on Load or Store
13 Tr Trap
8 Exception
15 FPE Floating Point Exception
PRE- SIZE FIX
PRE- SIZE FIX
PRE- SIZE FIX
PRE- SIZE FIX
103, 210 Kilo-
1015, 250 Peta-
10-3 milli-
10-15 femto-
106, 220 Mega-
1018, 260 Exa-
10-6 micro-
10-18 atto-
109, 230 Giga-
1021, 270 Zetta-
10-9 nano-
10-21 zepto-
1012, 240 Tera-
1024, 280 Yotta-
10-12 pico-
10-24 yocto-
(1) opcode(31:26) == 0
(2) opcode(31:26) == 17ten (11hex); if fmt(25:21)==16ten (10hex) f = s (single);
The symbol for each prefix is just its first letter, except μ is used for micro.
Exception Code
MIPS Reference Data Card (“ ”) 1. Pull along perforation to separate card 2. Fold bottom side (columns 3 and 4) together

程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com