CS代考 ELEE10007 - PowCoder代写

SCHOOL OF ENGINEERING DIGITAL SYSTEM DESIGN 4 ELEE10007
Exam Date: 06/05/2019 From and To: 14:30-16:30 Exam Diet: May 2019 Please read full instructions before commencing writing
Exam paper information
• This paper consists of TWO sections.

• Candidates should attempt THREE questions, chosen as follows:
• Section A: ONE question. Attempt the whole section.
• Section B: Attempt TWO out of the THREE questions.
Special instructions
• Students should assume reasonable values for any data not given in a question nor available on a datasheet, and should make any such assumptions clear on their script.
• Students in any doubt as to the interpretation of the wording of a question, should make their own decision, and should state it clearly on their script.
• Please write your name in the space indicated at the right hand side on the front cover of the answer book. Also enter your examination number in the appropriate space on the front cover.
• Write ONLY your examination number on any extra sheets or worksheets used and firmly attach these to the answer book(s).
• This examination will be marked anonymously.
Special items
• Formula sheets (2 pages)
Convenor of Board of Examiners: Professor R Examiner: Professor J Morrow and Dr Z ECTION A Question A1
a) (i) With the aid of appropriate diagrams describe the key differences between the and Harvard computer architectures. (4)
(ii) Explain the problem with the architecture that the Harvard architecture is designed to solve. (2)
b) In a multiple issue processor, parts of the datapath are duplicated in order to provide instruction level parallelism.
(i) Describe the key differences between dynamic and static multiple- issue process architectures. (2)
(ii) Consider a programme with a loop structure that involves reading
data from successive memory locations, performing an arithmetic operation and writing the result back. Describe the process by
which such a loop is “unrolled” in order to improve performance on
a typical multiple issue processor. (4)
c) Suppose you need to perform two mathematical operations, the first is a sum of 20 scalar variables while the second is a matrix sum of a pair of 100×100 arrays. Assume for the following questions that the scalar sums cannot be parallelised and each sum takes t seconds.
(i) What are the execution time speedup factors if 10 processors or
100 processors are used rather than 1? (2)
(ii) What concept or law of computer design does the previous result illustrate? (1)
(iii) What would happen if the load wasn’t balanced when the
calculations are performed with 100 processors? (1)
d) The roofline model is a method of characterising the performance of different processor systems undertaking a range of different tasks (kernels) with different characteristics.
(i) Referring to Figure A1 overleaf, state which processor has the
highest performance for a kernel which requires one word of data
to be read in and then written out to memory for every four floating
point operations. (2)
(ii) What aspect of the processor with the lower performance should be improved for it to equal the processor with the higher performance, and by how much?
Continued overleaf
ELEE10007 Digital System Design 4 – May 2019

ELEE10007 Digital System Design 4 – May 2019

SECTION B Question B1
Consider the following MIPS code when answering this question. Refer to the MIPS which is appended to the exam paper.
sll $t0, $s3, 2 add $t0, $t0, $s6 lw $t0, 0($t0)
sll $t1, $s4, 2 add $t1, $t1, $s6 lw $t1, 0($t1)
add $t2, $t0 $t1 sw $t2, 32($s7)
a) For the FIRST THREE instructions in the MIPS code shown above:
(i) State which of these are R-format, and which are I-format
instructions? (2)
(ii) For the R-format instruction(s), show the decimal value of the
opcode (OP), source register (RS), target register (RT), destination register (RD), shift amount (shamt) and function bit (funct). For the I-format instruction(s), show the value of the opcode (OP), source register (RS), target register (RT), and the immediate field. (6)
(iii) Translate the three instructions into machine language expressed
in hexadecimal. (3)
b) For the LAST THREE instructions in the MIPS code shown above:
(i) What hardware elements are needed to implement these
instructions? (2)
(ii) Sketch the datapath with high-level abstractions of the hardware
elements that can run the add/lw/sw instructions. (4)
c) Translate the MIPS code to C. Assume that the variables f, g, h, i, and
j are assigned to registers $s0, $s1, $s2, $s3, and $s4, respectively. Assume that the base address of the arrays A and B are in registers
$s6 and $s7, respectively. (3)
ELEE10007 Digital System Design 4 – May 2019

Question B2
a) Considering the following MIPS code:
0000 0000 0000 1000 0101 0000 0010 1010 bne $t2, $0, ELSE
ELSE: addi $t2, $t2, 2 DONE:
(i) The first instruction is shown as machine code instead of assembly language. Use the MIPS green-card appended to the exam paper to decode this instruction to the assembly language representation.
the PC to the address as 0x4000 0000? Explain your answer. (3)
(ii) Is it possible to use the branch-not-equal (bne) MIPS assembly instruction to set the PC to 0x4000 0000? Explain your answer.
(iii) What are the addressing modes when using j and bne?
c) To implement the bne instruction in MIPS architecture:
(ii) Assume $t0 holds the value 0x00101000, what is the value of $t2 after running the above instructions? In your answer you should explain what this MIPS loop is doing.
b) Suppose that the program counter (PC) is set to 0x2000 0000.
(i) Is it possible to use the jump (j) MIPS assembly instruction to set
(i) What arithmetic functions does the Arithmetic Logic Unit (ALU) perform? (2)
(ii) Sketch the datapath with high-level abstractions of the hardware elements that can run the bne instructions. (4)
ELEE10007 Digital System Design 4 – May 2019

Question B3
a) The following questions are concerned with the arrangement of
memory and cache in a modern computer system.
(i) With the aid of an appropriate diagram describe the hierarchy of memory and storage, with reference to the aim or purpose of this memory architecture. (3)
(ii) Explain the concept of temporal locality and suggest how this is exploited through the arrangement and operation of the cache. (2)
(iii) Explain the concept of spatial locality and suggest how this is
exploited through the arrangement and operation of the cache. (2)
(iv) Explain the key purpose of a virtual memory system. (2)
b) If a processor has a 4-way set associative instruction cache with a total size of 32 KiB and a block size of 16 words, define the following characteristics of the cache, assuming that the address is 32 bits wide:
(i) Number of blocks in the cache; (1)
(ii) Number of index bits in the address; (1)
(iii) Number of tag bits in the address; (1)
(iv) If the cache is rearranged to be directly mapped, describe the new address structure with a suitable diagram. (2)
c) Decisions made in the design of a memory architecture will have an impact (positive or negative) on the hit time, miss rate and miss penalty. Describe the effects on these parameters when:
(i) The level of associativity is decreased; (2) (ii) The block size is increased; (2) (iii) Write back is used instead of write through in the data cache. (2)
ELEE10007 Digital System Design 4 – May 2019
END OF PAPER

M I P S Reference Data CORE INSTRUCTION SET
ARITHMETIC CORE INSTRUCTION SET
OPCODE / FMT /FT / FUNCT (Hex) 11/8/1/– 11/8/0/– 0/–/–/1a 0/–/–/1b 11/10/–/0
11/11/–/0 11/10/–/y 11/11/–/y
11/10/–/3
11/11/–/3
11/10/–/2
11/11/–/2
11/10/–/1
11/11/–/1
31/–/–/–
35/–/–/–
0 /–/–/10 0 /–/–/12 10 /0/–/0 0/–/–/18 0/–/–/19 0/–/–/3 39/–/–/–
3d/–/–/–
NAME, MNEMONIC Add add Add Immediate addi Add Imm. Unsigned addiu Add Unsigned addu And and And Immediate andi
Branch On Equal beq Branch On Not Equal bne
divu add.s
Lo=R[rs]/R[rt]; Hi=R[rs]%R[rt] R Lo=R[rs]/R[rt]; Hi=R[rs]%R[rt]
Jump And Link Jump Register
Load Byte Unsigned
Load Halfword Unsigned
Load Linked
Load Upper Imm.
Or Immediate
Set Less Than
Set Less Than Imm.
Set Less Than Imm. Unsigned
Set Less Than Unsig. Shift Left Logical Shift Right Logical
Store Byte
Store Conditional
Store Halfword
Store Word Subtract
Subtract Unsigned
j J jal J jr R
lbu I lhu I
ll I lui I lw I nor R or R ori I slt R slti I
sltiu I sltu R
sll R srl R
sub R subu R
PC=JumpAddr R[31]=PC+8;PC=JumpAddr PC=R[rs] R[rt]={24’b0,M[R[rs]
+SignExtImm](7:0)} R[rt]={16’b0,M[R[rs]
+SignExtImm](15:0)} R[rt] = M[R[rs]+SignExtImm] R[rt] = {imm, 16’b0}
R[rt] = M[R[rs]+SignExtImm] R[rd] = ~ (R[rs] | R[rt])
R[rd] = R[rs] | R[rt]
R[rt] = R[rs] | ZeroExtImm
R[rd] = (R[rs] < R[rt]) ? 1 : 0 R[rt] = (R[rs] < SignExtImm)? 1 : R[rt] = (R[rs] < SignExtImm) R[rd] = (R[rs] < R[rt]) ? 1 : 0 R[rd] = R[rt] << shamt R[rd] = R[rt] >> shamt M[R[rs]+SignExtImm](7:0) =
R[rt](7:0) M[R[rs]+SignExtImm] = R[rt]; R[rt] = (atomic) ? 1 : 0 M[R[rs]+SignExtImm](15:0) =
(5) 2hex (5) 3hex
F[fd]=F[fs] – F[ft]
{F[fd],F[fd+1]} = {F[fs],F[fs+1]} –
{F[ft],F[ft+1]} F[rt]=M[R[rs]+SignExtImm] F[rt]=M[R[rs]+SignExtImm];
F[rt+1]=M[R[rs]+SignExtImm+4] R[rd] = Hi
R[rd] = Lo
R[rd] = CR[rs]
{Hi,Lo} = R[rs] * R[rt]
{Hi,Lo} = R[rs] * R[rt]
R[rd] = R[rt] >>> shamt M[R[rs]+SignExtImm] = F[rt] M[R[rs]+SignExtImm] = F[rt]; M[R[rs]+SignExtImm+4] = F[rt+1]
MAT OPERATION (in Verilog)
R R[rd] = R[rs] + R[rt]
I R[rt] = R[rs] + SignExtImm I R[rt] = R[rs] + SignExtImm
R R[rd] = R[rs] + R[rt] R R[rd] = R[rs] & R[rt]
I R[rt] = R[rs] & ZeroExtImm
I if(R[rs]==R[rt]) PC=PC+4+BranchAddr
I if(R[rs]!=R[rt]) PC=PC+4+BranchAddr
OPCODE / FUNCT (Hex)
(1) 0 / 20hex (1,2) 8hex
FOR- NAME, MNEMONIC MAT
Branch On FP True bc1t FI Branch On FP False bc1f FI
OPERATION if(FPcond)PC=PC+4+BranchAddr (4) if(!FPcond)PC=PC+4+BranchAddr (4)
M[R[rs]+SignExtImm] = R[rt] R[rd] = R[rs] – R[rt]
R[rd] = R[rs] – R[rt]
REGISTER NAME, NUMBER, USE, CALL CONVENTION
(2) (2,7) R[rt](15:0) (2)
PSEUDOINSTRUCTION SET
NAME Branch Less Than
Branch Greater Than
Branch Less Than or Equal
Branch Greater Than or Equal
Load Immediate
OPERATION if(R[rs]R[rt]) PC = Label
if(R[rs]<=R[rt]) PC = Label if(R[rs]>=R[rt]) PC = Label R[rd] = immediate
R[rd] = R[rs]
0 / 08hex (2) 24hex
(2) 25hex (2,7) 30hex
0 (2) ahex
0 / 00hex 0 / 02hex
28hex 38hex 29hex
(2) 2bhex (1) 0 / 22hex 0 / 23hex
FP Divide Single FP Divide
FP Multiply Single FP Multiply Double
FP Subtract Single FP Subtract Double
Load FP Single Load FP
Move From Hi Move From Lo Move From Control Multiply
Multiply Unsigned Shift Right Arith. Store FP Single Store FP
div.s FR div.d FR mul.s FR mul.d FR sub.s FR sub.d FR
F[fd] = F[fs] / F[ft] {F[fd],F[fd+1]} = {F[fs],F[fs+1]} /
{F[ft],F[ft+1]} F[fd] = F[fs] * F[ft]
{F[fd],F[fd+1]} = {F[fs],F[fs+1]} * {F[ft],F[ft+1]}
(6) 0 / 2bhex
31 26 25 31 26 25
21 20 21 20
Divide Unsigned
FP Add Single
FP Compare Single c.x.s* FP Compare c.x.d* Double
FR F[fd ]= F[fs] + F[ft]
FR {F[fd],F[fd+1]} = {F[fs],F[fs+1]} +
{F[ft],F[ft+1]} FR FPcond = (F[fs] op F[ft]) ? 1 : 0
FR FPcond = ({F[fs],F[fs+1]} op {F[ft],F[ft+1]}) ? 1 : 0
* (x is eq, lt, or le) (op is ==, <, or <=) ( y is 32, 3c, or 3e) mfhi mflo mfc0 mult multu sra swc1 R R R R R R I FLOATING-POINT INSTRUCTION FORMATS 16 15 MNEMONIC blt bgt ble bge li NAME NUMBER USE PRESERVED ACROSS A CALL? $zero 0 The Constant Value 0 N.A. $at 1 Assembler Temporary No $v0-$v1 2-3 Values for Function Results No and Expression Evaluation $a0-$a3 4-7 Arguments No $t0-$t7 8-15 Temporaries No $s0-$s7 16-23 Saved Temporaries Yes $t8-$t9 24-25 Temporaries No $k0-$k1 26-27 Reserved for OS Kernel No $gp 28 Global Pointer Yes $sp 29 Stack Pointer Yes $fp 30 Frame Pointer Yes $ra 31 Return Address No BASIC INSTRUCTION FORMATS (1) May cause overflow exception (2) SignExtImm = { 16{immediate[15]}, immediate (3) ZeroExtImm = { 16{1b’0}, immediate } (4) BranchAddr = { 14{immediate[15]}, immediate, 2’b0 } (5) JumpAddr = { PC+4[31:28], address, 2’b0 } (6) Operands considered unsigned numbers (vs. 2’s comp.) (7) Atomic test&set pair; R[rt] = 1 if pair atomic, 0 if not atomic 21 20 21 20 16 15 16 15 Copyright 2009 by Elsevier, Inc., All rights reserved. From Patterson and Hennessy, Computer Organization and Design, 4th ed. MIPS Reference Data Card (“ ”) 1. Pull along perforation to separate card 2. Fold bottom side (columns 3 and 4) together OPCODES, BASE CONVERSION, ASCII SYMBOLS IEEE 754 FLOATING-POINT (-1)S × (1 + Fraction) × 2(Exponent - Bias) where Single Precision Bias = 127, Double Precision Bias = 1023. IEEE Single Precision and Double Precision Formats: 31 30 2322 63 62 5251 MEMORY ALLOCATION IEEE 754 Symbols S.P. MAX = 255, D.P. MAX = 2047 MIPS opcode (31:26) (1) MIPS funct (5:0) (2) MIPS funct (5:0) Deci- Hexa- ASCII Binary mal deci- Char- mal acter Deci- Hexa- ASCII mal deci- Char- mal acter (1) sll add.f sub.f j srl mul.f jal sra div.f 00 0000 0 0 NUL 00 0001 1 1 SOH 00 0010 2 2 STX 00 0011 3 3 ETX 64 40 @ 65 41 A 66 42 B 67 43 C beq sllv sqrt.f bne abs.f blez srlv mov.f bgtz srav neg.f 00 0100 4 4 EOT 00 0101 5 5 ENQ 00 0110 6 6 ACK 00 0111 7 7 BEL 68 44 D 69 45 E 70 46 F 71 47 G addi jr addiu jalr slti movz sltiu movn 00 1000 8 8 BS 00 1001 9 9 HT 00 1010 10 a LF 00 1011 11 b VT 72 48 H 73 49 I 74 4a J 75 4b K andi syscall round.w.f ori break trunc.w.f xori ceil.w.f lui sync floor.w.f 00 1100 00 1101 00 1110 00 1111 12 c FF 13 d CR 14 e SO 15 f SI 76 4c L 77 4d M 78 4e N 79 4f O mfhi (2) mthi mflo movz.f mtlo movn.f 01 0000 01 0001 01 0010 01 0011 16 10 DLE 17 11 DC1 18 12 DC2 19 13 DC3 80 50 P 81 51 Q 82 52 R 83 53 S 01 0100 01 0101 01 0110 01 0111 20 14 DC4 21 15 NAK 22 16 SYN 23 17 ETB 84 54 T 85 55 U 86 56 V 87 57 W mult multu div divu 01 1011 27 1b ESC 24 18 CAN 25 19 EM 26 1a SUB 88 58 X 89 59 Y 90 5a Z 91 5b [ 01 1100 01 1101 01 1110 01 1111 28 1c FS 29 1d GS 30 1e RS 31 1f US 92 5c \ 93 5d ] 94 5e ^ 95 5f _ lb add cvt.s.f lh addu cvt.d.f lwl sub 10 0000 10 0001 10 0010 10 0011 32 20 Space 33 21 ! 34 22 " 35 23 # 96 60 ‘ 97 61 a 98 62 b 99 63 c lbu and cvt.w.f lhu or 10 0100 10 0101 10 0110 10 0111 36 24 $ 37 25 % 38 26 & 39 27 ’ 100 64 d 101 65 e 102 66 f 103 67 g swl slt sw sltu 10 1000 10 1001 10 1010 10 1011 40 28 ( 41 29 ) 42 2a * 43 2b + 104 68 h 105 69 i 106 6a j 107 6b k 10 1100 10 1101 10 1110 10 1111 44 2c , 45 2d - 46 2e . 47 2f / 108 6c l 109 6d m 110 6e n 111 6f o ll tge c.f.f lwc1 tgeu c.un.f lwc2 tlt c.eq.f pref tltu c.ueq.f 11 0000 11 0001 11 0010 11 0011 48 30 0 49 31 1 50 32 2 51 33 3 112 70 p 113 71 q 114 72 r 115 73 s teq c.olt.f c.ult.f ldc2 tne c.ole.f c.ule.f 11 0100 11 0101 11 0110 11 0111 52 34 4 53 35 5 54 36 6 55 37 7 116 74 t 117 75 u 118 76 v 119 77 w swc1 c.ngle.f swc2 c.seq.f 11 1000 11 1001 11 1010 11 1011 56 38 8 57 39 9 58 3a : 59 3b ; 120 78 x 121 79 y 122 7a z 123 7b { sdc1 c.nge.f sdc2 c.le.f c.ngt.f 11 1100 11 1101 11 1110 11 1111 60 3c < 61 3d = 62 3e > 63 3f ?
124 7c | 125 7d } 126 7e ~ 127 7f DEL
1 to MAX – 1
± Fl. Pt. Num.
Argument 6
Argument 5
Saved Registers
Local Variables
Dynamic Data
Static Data
7fff fffchex
1000 8000hex 1000 0000hex 0040 0000hex
STACK FRAME
Higher Memory Addresses
Stack Grows
Lower Memory Addresses
DATA ALIGNMENT
Double Word

01234567 Value of three least significant bits of byte address (Big Endian)
EXCEPTION CONTROL REGISTERS: CAUSE AND STATUS
31 15 86 2
Interrupt Mask
Exception Code
BD = Branch Delay, UM = User Mode, EL = Exception Level, IE =Interrupt Enable
EXCEPTION CODES
SIZE PREFIXES (10 x for Disk, Communication; 2 x for Memory)
if fmt(25:21)==17ten (11hex) f = d (double)
Copyright 2009 by Elsevier, Inc., All rights reserved. From Patterson and Hennessy, Computer Organization and Design, 4th ed.
Number Name Cause of Exception
Number Name Cause of Exception
0 Int Interrupt (hardware)
9 Exception
4 AdEL Address Error Exception (load or instruction fetch)
Reserved Instruction Exception
5 AdES Address Error Exception (store)
Coprocessor Unimplemented
Bus Error on Instruction Fetch
Arithmetic Overflow Exception
Bus Error on Load or Store
13 Tr Trap
8 Exception
15 FPE Floating Point Exception
PRE- SIZE FIX
PRE- SIZE FIX
PRE- SIZE FIX
PRE- SIZE FIX
103, 210 Kilo-
1015, 250 Peta-
10-3 milli-
10-15 femto-
106, 220 Mega-
1018, 260 Exa-
10-6 micro-
10-18 atto-
109, 230 Giga-
1021, 270 Zetta-
10-9 nano-
10-21 zepto-
1012, 240 Tera-
1024, 280 Yotta-
10-12 pico-
10-24 yocto-
(1) opcode(31:26) == 0
(2) opcode(31:26) == 17ten (11hex); if fmt(25:21)==16ten (10hex) f = s (single);
The symbol for each prefix is just its first letter, except μ is used for micro.
MIPS Reference Data Card (“ ”) 1. Pull along perforation to separate card 2. Fold bottom side (columns 3 and 4) together

程序代写 CS代考加微信: powcoder QQ: 1823890830 Email: powcoder@163.com

Related Posts