IT代写 CMU 18-642 https://course.ece.cmu.edu/~ece642/ , Sourced from . 2005

15-213 Recitation 7: Style, Valgrind, Blocking
Monday, October 11th, 2021

■ Logistics

Copyright By PowCoder代写 加微信 powcoder

■ Code Reviews
■ Blocking
■ Valgrind / Intro to Git
■ Looking Ahead: Cache Lab

■ Cache lab is due tomorrow!
■ Come to office hours for help ■ NO Midterm!

Code Reviews

Code Reviews
• Why code reviews?
• Used in industry – Nearly all companies utilize code reviews
• Systematic code reviews are highly effective at finding bugs
efficiently and effectively.

Code Reviews
• Industry example from an embedded system machine critical pipeline flow device requiring high software quality
No code reviews / planned testing
The same team implemented testing and code reviews. This is a similar project done 5 years later.
Example from CMU 18-642 https://course.ece.cmu.edu/~ece642/ , Sourced from . 2005

Code Review Signup
• All students in the course will receive an email with a link to signup for a code review timeslot.
• All students will receive a final style score from 0-4 points
• 213 code reviews will be short (<= 15 minutes) and cover code style and code quality. Code Style • Properly document your code • Function + File header comments, overall operation of large blocks, any tricky bits • Write robust code – check error and failure conditions • Write modular code • Use interfaces for data structures, e.g. create/insert/remove/free functions for a linked • No magic numbers – use #define or static const • Formatting • 80 characters per line (use Autolab’s highlight feature to double-check) • Consistent braces and whitespace • No memory or file descriptor leaks ■ Finding memory leaks - part of the style ■ $ valgrind –leak-resolution=high –leak-check=full –show-reachable=yes –track-fds=yes ./myProgram arg1 arg ■ Remember that Valgrind can be used for other things, like finding invalid reads and writes! Activity: Valgrind Activity Setup ■ Split up into groups of 2-3 people ■ One person needs a laptop ■ Log in to a Shark machine, and type: $ wget https://www.cs.cmu.edu/~213/activities/rec7.tar $ tar -xvf rec7.tar 213_exam_answers.c Example: Matrix Multiplication /* multiply 4x4 matrices */ void mm(int a[4][4], int b[4][4], int c[4][4]) { int i, j, k; for (i = 0; i < 4; i++) for (j = 0; j < 4; j++) for (k = 0; k < 4; k++) c[i][j] += a[i][k] * b[k][j]; Let’s step through this to see what’s actually happening Example: Matrix Multiplication ■ Assume a tiny cache with 4 lines of 8 bytes (2 ints) ■ S = 1, E = 4, B = 8 ■ Let’s see what happens if we don’t use blocking c[0][0] += a[0][0] * b[0][0] Key: Grey = accessed Dark grey = currently accessing Red border = in cache c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] Key: Grey = accessed Dark grey = currently accessing Red border = in cache 0 0 0 1 0 2 c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][0] += a[0][2] * b[2][0] Key: Grey = accessed Dark grey = currently accessing Red border = in cache 00 10 20 30 0 0 0 1 0 2 0 3 c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0] Key: Grey = accessed Dark grey = currently accessing Red border = in cache 0 0 0 1 0 2 0 3 1 0 c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0] c[0][1] += a[0][0] * b[0][1] Key: Grey = accessed Dark grey = currently accessing Red border = in cache 0 0 0 1 0 2 0 3 1 0 1 1 c[0][0] += a[0][0] * c[0][0] += a[0][1] * c[0][0] += a[0][2] * c[0][0] += a[0][3] * c[0][1] += a[0][0] * c[0][1] += a[0][1] * b[0][0] b[1][0] b[2][0] b[3][0] b[0][1] b[1][1] Key: Grey = accessed Dark grey = currently accessing Red border = in cache 0 0 0 1 0 2 0 3 1 0 1 1 1 2 c[0][0] += a[0][0] * c[0][0] += a[0][1] * c[0][0] += a[0][2] * c[0][0] += a[0][3] * c[0][1] += a[0][0] * c[0][1] += a[0][1] * c[0][1] += a[0][2] * b[0][0] b[1][0] b[2][0] b[3][0] b[0][1] b[1][1] b[2][1] Key: Grey = accessed Dark grey = currently accessing Red border = in cache 0 0 0 1 0 2 0 3 1 0 1 1 1 2 1 3 c[0][0] += a[0][0] * c[0][0] += a[0][1] * c[0][0] += a[0][2] * c[0][0] += a[0][3] * c[0][1] += a[0][0] * c[0][1] += a[0][1] * c[0][1] += a[0][2] * c[0][1] += a[0][3] * b[0][0] b[1][0] b[2][0] b[3][0] b[0][1] b[1][1] b[2][1] b[3][1] Key: Grey = accessed Dark grey = currently accessing Red border = in cache 0 0 0 1 0 2 0 3 1 0 1 1 1 2 1 3 c[0][0] += a[0][0] * c[0][0] += a[0][1] * c[0][0] += a[0][2] * c[0][0] += a[0][3] * c[0][1] += a[0][0] * c[0][1] += a[0][1] * c[0][1] += a[0][2] * c[0][1] += a[0][3] * b[0][0] b[1][0] b[2][0] b[3][0] b[0][1] b[1][1] b[2][1] b[3][1] Key: Grey = accessed Dark grey = currently accessing Red border = in cache What is the miss rate of a? 0 0 0 1 0 2 0 3 1 0 1 1 1 2 1 3 c[0][0] += a[0][0] * c[0][0] += a[0][1] * c[0][0] += a[0][2] * c[0][0] += a[0][3] * c[0][1] += a[0][0] * c[0][1] += a[0][1] * c[0][1] += a[0][2] * c[0][1] += a[0][3] * b[0][0] b[1][0] b[2][0] b[3][0] b[0][1] b[1][1] b[2][1] b[3][1] Key: Grey = accessed Dark grey = currently accessing Red border = in cache What is the miss rate of a? What is the miss rate of b? Example: Matrix Multiplication (blocking) /* multiply 4x4 matrices using blocks of size 2 */ void mm_blocking(int a[4][4], int b[4][4], int c[4][4]) { int i, j, k; int i_c, j_c, k_c; int B = 2; // control loops for (i_c = 0; i_c < 4; i_c += B) for (j_c = 0; j_c < 4; j_c += B) for (k_c = 0; k_c < 4; k_c += B) // block multiplications for (i = i_c; i < i_c + B; i++) for (j = j_c; j < j_c + B; j++) for (k = k_c; k < k_c + B; k++) c[i][j] += a[i][k] * b[k][j]; Let’s step through this to see what’s actually happening Example: Matrix Multiplication (blocking) ■ Assume a tiny cache with 4 lines of 8 bytes (2 ints) ■ S = 1, E = 4, B = 8 ■ Let’s see what happens if we now use blocking c[0][0] += a[0][0] * b[0][0] Key: Grey = accessed Dark grey = currently accessing Red border = in cache c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] Key: Grey = accessed Dark grey = currently accessing Red border = in cache 0 0 0 1 1 0 c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] Key: Grey = accessed Dark grey = currently accessing Red border = in cache 00 10 20 30 0 0 0 1 1 0 1 1 c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] Key: Grey = accessed Dark grey = currently accessing Red border = in cache 0 0 0 1 1 0 1 1 0 0 c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] Key: Grey = accessed Dark grey = currently accessing Red border = in cache 0 0 0 1 1 0 1 1 0 0 0 1 c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] Key: Grey = accessed Dark grey = currently accessing Red border = in cache 0 0 0 1 1 0 1 1 0 0 0 1 1 0 c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] Key: Grey = accessed Dark grey = currently accessing Red border = in cache 0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1 c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1] Key: Grey = accessed Dark grey = currently accessing Red border = in cache 0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1 c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1] c[0][0] += a[0][2] * b[2][0] 0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1 c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1] c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0] 0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1 c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1] 8 0 9 0 10 0 c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0] c[0][1] += a[0][2] * b[2][1] 0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1 c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1] 80 90 100 110 c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0] c[0][1] += a[0][2] * b[2][1] c[0][1] += a[0][3] * b[3][1] 0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1 c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1] 8 0 9 0 10 0 11 0 12 1 c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0] c[0][1] += a[0][2] * b[2][1] c[0][1] += a[0][3] * b[3][1] c[1][0] += a[1][2] * b[2][0] 0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1 c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1] iter i j k 8 0 0 2 9 0 0 3 10 0 1 2 11 0 1 3 12 1 0 2 13 1 0 3 c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0] c[0][1] += a[0][2] * b[2][1] c[0][1] += a[0][3] * b[3][1] c[1][0] += a[1][2] * b[2][0] c[1][0] += a[1][3] * b[3][0] 0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1 c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1] iter i j k 8 0 0 2 9 0 0 3 10 0 1 2 11 0 1 3 12 1 0 2 13 1 0 3 14 1 1 2 c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0] c[0][1] += a[0][2] * b[2][1] c[0][1] += a[0][3] * b[3][1] c[1][0] += a[1][2] * b[2][0] c[1][0] += a[1][3] * b[3][0] c[1][1] += a[1][2] * b[2][1] 0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1 c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1] iter i j k 8 0 0 2 9 0 0 3 10 0 1 2 11 0 1 3 12 1 0 2 13 1 0 3 14 1 1 2 15 1 1 3 c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0] c[0][1] += a[0][2] * b[2][1] c[0][1] += a[0][3] * b[3][1] c[1][0] += a[1][2] * b[2][0] c[1][0] += a[1][3] * b[3][0] c[1][1] += a[1][2] * b[2][1] c[1][1] += a[1][3] * b[3][1] 0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1 c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1] 8 0 9 0 10 0 11 0 12 1 13 1 14 1 15 1 0 2 0 3 1 2 1 3 c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0] c[0][1] += a[0][2] * b[2][1] c[0][1] += a[0][3] * b[3][1] c[1][0] += a[1][2] * b[2][0] c[1][0] += a[1][3] * b[3][0] c[1][1] += a[1][2] * b[2][1] What is the miss rate of a? 0 3 1 2 1 3 c[1][1] += a[1][3] * b[3][1] What is the miss rate of b? Introduction to Git Version control is your friend What is Git? • Most widely used version control system out there • Version control: • Help track changes to your source code over time • Help teams manage changes on shared code Git Commands • Clone: git clone
• Add: git add . or git add
• Push / Pull: git push / git pull
• Commit: git commit -m “your-commit-message”
• Good commit messages are key!
• Bad:“commit”, “change”, “fixed”
• Good: “Fixed buffer overflow potential in AttackLab”

If you get stuck…
■ Reread the writeup
■ Look at CS:APP Chapter 6
■ Review lecture notes (http://cs.cmu.edu/~213)
■ Come to Office Hours (Sunday to Friday, 6:00-10:00 PM Locations on Piazza)
■ Post private question on Piazza
■ man malloc, man valgrind, man gdb

Practice Problems

Class Question / Discussions
■ We’ll work through a series of questions
■ Write down your answer for each question ■ You can discuss with your classmates

What Type of Locality?
• The following function exhibits which type of
locality? Consider only array accesses.
void who(int *arr, int size) {
for (int i = 0; i < size-1; ++i) arr[i] = arr[i+1]; A. B. C. D. Spatial Temporal Both A and B Neither A nor B What Type of Locality? • The following function exhibits which type of locality? Consider only array accesses. void who(int *arr, int size) { for (int i = 0; i < size-1; ++i) arr[i] = arr[i+1]; A. B. C. D. Spatial Temporal Both A and B Neither A nor B What Type of Locality? • The following function exhibits which type of locality? Consider only array accesses. void coo(int *arr, int size) { for (int i = size-2; i >= 0; –i)
arr[i] = arr[i+1];
A. B. C. D.
Spatial Temporal
Both A and B Neither A nor B

What Type of Locality?
• The following function exhibits which type of
locality? Consider only array accesses.
void coo(int *arr, int size) {
for (int i = size-2; i >= 0; –i)
arr[i] = arr[i+1];
A. B. C. D.
Spatial Temporal
Both A and B Neither A nor B

Calculating Cache Parameters
• Given the following address partition, how many
int values will fit in a single data block?
# of int in block 0
Unknown: We need more info
B. index offset C. D.

Calculating Cache Parameters
• Given the following address partition, how many
int values will fit in a single data block?
# of int in block 0
Unknown: We need more info
B. index offset C. D.

Direct-Mapped Cache Example
• Assuming a 32-bit address (i.e. m=32), how many bits are used for tag (t), set index (s), and block offset (b).
per data block
Cache block
Cache block
Cache block
Cache block
Set 1: Set 2:
E = 1 lines per set
tsb A.123 B. 27 2 3 C. 25 4 3 D. 1 4 8 E. 20 4 8
t s bits b
Tag Set index Block offset

Direct-Mapped Cache Example
• Assuming a 32-bit address (i.e. m=32), how many bits are used for tag (t), set index (s), and block offset (b).
per data block
Cache block
Cache block
Cache block
Cache block
Set 1: Set 2:
E = 1 lines per set
tsb A.123 B. 27 2 3 C. 25 4 3 D. 1 4 8 E. 20 4 8
t s bits b
Tag Set index Block offset

Which Set Is it?
• Which set is the address 0xFA1C located in?
per data block
Cache block
Cache block
Cache block
Cache block
Set 1: Set 2:
E = 1 lines per set
Set # for 0xFA1C 0
More than one of the above
Tag Set index Block offset

Which Set Is it?
• Which set is the address 0xFA1C located in?
per data block
Cache block
Cache block
Cache block
Cache block
Set 1: Set 2:
E = 1 lines per set
Set # for 0xFA1C 0
More than one of the above
Tag Set index Block offset

Cache Block Range
• What range of addresses will be in the same block as
address 0xFA1C? 8 bytes
per data block
Cache block
Cache block
Cache block
Cache block
Set 1: Set 2:
A. B. C. D. E.
Addr. Range
0xFA1C – 0xFA23
0xFA1C – 0xFA1F
0xFA18 – 0xFA1F
It depends on the access size (byte, word, etc)
Tag Set index Block offset

Cache Block Range
• What range of addresses will be in the same block as
address 0xFA1C? 8 bytes
per data block
Cache block
Cache block
Cache block
Cache block
Set 1: Set 2:
A. B. C. D. E.
Addr. Range
0xFA1C – 0xFA23
0xFA1C – 0xFA1F
0xFA18 – 0xFA1F
It depends on the access size (byte, word, etc)
Tag Set index Block offset

Cache Misses
If N = 16, how many bytes does the loop access of a?
int foo(int* a, int N)
int sum = 0;
for(i = 0; i < N; i++) sum += a[i]; } return sum; } Accessed Bytes A4 B 16 C 64 D 256 Cache Misses If N = 16, how many bytes does the loop access of a? int foo(int* a, int N) int sum = 0; for(i = 0; i < N; i++) sum += a[i]; } return sum; } Accessed Bytes A4 B 16 C 64 D 256 Cache Misses Consider a 32 KB cache in 程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com