CS计算机代考程序代写 CSE 141 Homework 2 – Solutions

CSE 141 Homework 2 – Solutions
Problem 1:
begin loop
finish
addi $t0,$zero,0
addi $t1,$zero,1
slt $t2,$a0,$t1 #
bne $t2,$zero,finish add $t0,$t0,$t1
addi $t1,$t1,2
j loop #
add $v0,$t0,$zero
# return v
$a = n (positive integer) $v = output
# initialize variable v=0; # initialize step=1
# if step >=n goto finish # v=v+step
# step=step+2
The code sum all odd integers between 0 and n
Problem 2: [2 pts each starting from b]
a) move $t5,$t3
add $t5,$zero,$t3
b) clear $t5
add $t5,$zero,$zero
c) li $t5,small
addi $t5,small,$zero
d) li $t5,big
lui $t5,bigH
addi $t5,$t5,bigL
bigH – decimal value of the 16 higher bits of number big;
e) lw $t5,big($t3)
lui $at,bigH
addi $at,$at,bigL add $at,$at,$t3

lw $t5,0($at)
f) addi $t5,$t3,big
lui $at,bigH
addi $at,$at,bigL addi $t5,$at,$t3
g) beq $t5,small,L
addi $at,$zero,small beq $at,$t5,L
h) beq $t5,big,L
addi $at,$zero,big beq $at,$t5,L
i) ble $t5,$t3,L
slt $at,$t5,$t3 beq $at,$zero,L beq $t5,$t3,L
j) bgt $t3,$t5,L
blt $t3,$t5,L
k) bge $t5,$t3,L
ble $t3,$t5,L
# see (f)
#see (i)
Problem 3: [15 pts code + 5 pts for the final statistics]
for(i=0;i<=100;i=i+1){a[i] =b[i] +c} $a0 - a (base address) $a1 - b (base address) $t0 - i $s0 - c add $t0,$zero,$zero addi $t1,$zero,101 loop slt $t2,$t0,$t1 bne $t2,$zero,exit # i=0 #limit = 100 add $t3,$t0,$t0 add $t4,$t3,$t3 add $t5,$a1,$t4 lw $t6,0($t5) add $t2,$t6,$s0 add $t3,$a0,$t4 sw $t2,0($t3) addi $t0,$t0,1 j loop Number of instructions executed = 2+101.11 = 1113 Number of memory data references = 101.2 =202 Problem 4: a) Percentage of all memory accesses that are for data: Pdata = 33/133 = 25% b) Percentage of all memory accesses that are reads: Preads = (2/3).33/133 = 16% Problem 5: CPIgcc = 0.48*1.3+0.33*1.6+0.17*1.7+0.02*2=1.48 CPIspice=0.5*1.3+0.41*1.6+0.08*1.7+0.01*2=1.46 Problem 6: [10 pts each item] a) Execution Time=IC*CPI*ClockPeriod We know that: CPInew=CPIold ClockPeriodnew=1.1*ClockPeriodold In order to keep the same performance we must have: Execution Timenew= Execution Timeold where, Execution Timenew=ICnew*CPInew*ClockPeriodnew Execution Timeold=ICold*CPIold*ClockPeriodold Dividing term by term the two equations above we get: ICnew=0.91*ICold This means that the new gcc must have 91% of the size of the old gcc. This reduction comes from eliminating loads/add pairs. A total of (100-91)/(33*2/3)=41% loads of gcc must be eliminated at least. b) Assume that in the above sequence we make the following replacement: addm $2, addr($3) sub $3,$2,$8 In such a case the information that should be loaded to register $8 is not present there when the subtraction operation is performed. The semantics of the second program is therefore incorrect. Problem 7: [10 pts code + 10 pts statistics] a=(x*y)-(b*c) Acc: Load[b] Mult[c] Store[temp] Load[x] Mult[y] Subt[temp] Store[a] Load/Store: Load $1,[b] Load $2,[c] lw $8,addr($3) add $2,$2,$8 sub $3,$2,$8 Mult $1,$1,$2 Load $2, [x] Load $3, [y] Mult $2, $2, $3 Sub $1, $2, $1 Store $1, [a] Stack: Push[b] Push[c] Mult Push[x] Push[y] Mult Subt Pop[a] Mem: Mult [a], [x], [y] Mult [temp], [b], [c] Subt [a], [a], [temp] Instruction bytes fetched Acc: 3+3+3+3+3+3+3=21 bytes Load/Store: 4+4+4+4+4+4+4+4=32 bytes Stack: 3+3+1+3+3+1+1+3=18 bytes Mem: 7+7+7=21 bytes Memory bytes transferred Acc: 7*4=28 bytes Load/Store: 5*4=20 bytes Stack: 5*4=20 bytes Mem: 9*4=36 bytes In both cases the most efficient archictecture is the stack. Problem 8: [20 pts] faverg: sub $sp,$sp,8 sw $s0,4($sp) sw $s1,0($sp) # $s0=i # $s1=sum # sum=0 # i=0 # Important !!! $t1=4*i loop: exit: add $s1,$zero,$zero add $s0,$zero,$zero add $t0,$s0,$s0 add $t1,$t0,$t0 add $t2,$a0,$t1 lw $t3,0($t2) beq $t3,NULL,exit add $s1,$s1,$t3 addi $t1,$t1,4 j loop addi $s0,$s0,1 div $s1,$s0 add $v0,$Lo,$zero lw $s0,4($sp) lw $s1,0($sp) addi $sp,$sp,8 jr $ra