15-213 Recitation 7: Style, Valgrind, Blocking
Monday, October 11th, 2021
■ Logistics
Copyright By PowCoder代写 加微信 powcoder
■ Code Reviews
■ Blocking
■ Valgrind / Intro to Git
■ Looking Ahead: Cache Lab
■ Cache lab is due tomorrow!
■ Come to office hours for help ■ NO Midterm!
Code Reviews
Code Reviews
• Why code reviews?
• Used in industry – Nearly all companies utilize code reviews
• Systematic code reviews are highly effective at finding bugs
efficiently and effectively.
Code Reviews
• Industry example from an embedded system machine critical pipeline flow device requiring high software quality
No code reviews / planned testing
The same team implemented testing and code reviews. This is a similar project done 5 years later.
Example from CMU 18-642 https://course.ece.cmu.edu/~ece642/ , Sourced from . 2005
Code Review Signup
• All students in the course will receive an email with a link to signup for a code review timeslot.
• All students will receive a final style score from 0-4 points
• 213 code reviews will be short (<= 15 minutes) and cover code style
and code quality.
Code Style
• Properly document your code
• Function + File header comments, overall operation of large blocks, any tricky bits
• Write robust code – check error and failure conditions
• Write modular code
• Use interfaces for data structures, e.g. create/insert/remove/free functions for a linked
• No magic numbers – use #define or static const
• Formatting
• 80 characters per line (use Autolab’s highlight feature to double-check)
• Consistent braces and whitespace
• No memory or file descriptor leaks
■ Finding memory leaks - part of the style
■ $ valgrind –leak-resolution=high –leak-check=full
–show-reachable=yes –track-fds=yes ./myProgram arg1 arg
■ Remember that Valgrind can be used for other things, like finding invalid reads and writes!
Activity: Valgrind
Activity Setup
■ Split up into groups of 2-3 people
■ One person needs a laptop
■ Log in to a Shark machine, and type:
$ wget https://www.cs.cmu.edu/~213/activities/rec7.tar
$ tar -xvf rec7.tar
213_exam_answers.c
Example: Matrix Multiplication
/* multiply 4x4 matrices */
void mm(int a[4][4], int b[4][4], int c[4][4]) {
int i, j, k;
for (i = 0; i < 4; i++)
for (j = 0; j < 4; j++)
for (k = 0; k < 4; k++)
c[i][j] += a[i][k] * b[k][j];
Let’s step through this to see what’s actually happening
Example: Matrix Multiplication
■ Assume a tiny cache with 4 lines of 8 bytes (2 ints) ■ S = 1, E = 4, B = 8
■ Let’s see what happens if we don’t use blocking
c[0][0] += a[0][0] * b[0][0]
Key: Grey = accessed
Dark grey = currently accessing Red border = in cache
c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0]
Key: Grey = accessed
Dark grey = currently accessing Red border = in cache
0 0 0 1 0 2
c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][0] += a[0][2] * b[2][0]
Key: Grey = accessed
Dark grey = currently accessing Red border = in cache
00 10 20 30
0 0 0 1 0 2 0 3
c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0]
Key: Grey = accessed
Dark grey = currently accessing Red border = in cache
0 0 0 1 0 2 0 3 1 0
c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0] c[0][1] += a[0][0] * b[0][1]
Key: Grey = accessed
Dark grey = currently accessing Red border = in cache
0 0 0 1 0 2 0 3 1 0 1 1
c[0][0] += a[0][0] * c[0][0] += a[0][1] * c[0][0] += a[0][2] * c[0][0] += a[0][3] * c[0][1] += a[0][0] * c[0][1] += a[0][1] *
b[0][0] b[1][0] b[2][0] b[3][0] b[0][1] b[1][1]
Key: Grey = accessed
Dark grey = currently accessing Red border = in cache
0 0 0 1 0 2 0 3 1 0 1 1 1 2
c[0][0] += a[0][0] * c[0][0] += a[0][1] * c[0][0] += a[0][2] * c[0][0] += a[0][3] * c[0][1] += a[0][0] * c[0][1] += a[0][1] * c[0][1] += a[0][2] *
b[0][0] b[1][0] b[2][0] b[3][0] b[0][1] b[1][1] b[2][1]
Key: Grey = accessed
Dark grey = currently accessing Red border = in cache
0 0 0 1 0 2 0 3 1 0 1 1 1 2 1 3
c[0][0] += a[0][0] * c[0][0] += a[0][1] * c[0][0] += a[0][2] * c[0][0] += a[0][3] * c[0][1] += a[0][0] * c[0][1] += a[0][1] * c[0][1] += a[0][2] * c[0][1] += a[0][3] *
b[0][0] b[1][0] b[2][0] b[3][0] b[0][1] b[1][1] b[2][1] b[3][1]
Key: Grey = accessed
Dark grey = currently accessing Red border = in cache
0 0 0 1 0 2 0 3 1 0 1 1 1 2 1 3
c[0][0] += a[0][0] * c[0][0] += a[0][1] * c[0][0] += a[0][2] * c[0][0] += a[0][3] * c[0][1] += a[0][0] * c[0][1] += a[0][1] * c[0][1] += a[0][2] * c[0][1] += a[0][3] *
b[0][0] b[1][0] b[2][0] b[3][0] b[0][1] b[1][1] b[2][1] b[3][1]
Key: Grey = accessed
Dark grey = currently accessing Red border = in cache
What is the miss rate of a?
0 0 0 1 0 2 0 3 1 0 1 1 1 2 1 3
c[0][0] += a[0][0] * c[0][0] += a[0][1] * c[0][0] += a[0][2] * c[0][0] += a[0][3] * c[0][1] += a[0][0] * c[0][1] += a[0][1] * c[0][1] += a[0][2] * c[0][1] += a[0][3] *
b[0][0] b[1][0] b[2][0] b[3][0] b[0][1] b[1][1] b[2][1] b[3][1]
Key: Grey = accessed
Dark grey = currently accessing Red border = in cache
What is the miss rate of a? What is the miss rate of b?
Example: Matrix Multiplication (blocking)
/* multiply 4x4 matrices using blocks of size 2 */
void mm_blocking(int a[4][4], int b[4][4], int c[4][4]) {
int i, j, k;
int i_c, j_c, k_c;
int B = 2;
// control loops
for (i_c = 0; i_c < 4; i_c += B)
for (j_c = 0; j_c < 4; j_c += B)
for (k_c = 0; k_c < 4; k_c += B)
// block multiplications
for (i = i_c; i < i_c + B; i++)
for (j = j_c; j < j_c + B; j++)
for (k = k_c; k < k_c + B; k++)
c[i][j] += a[i][k] * b[k][j];
Let’s step through this to see what’s actually happening
Example: Matrix Multiplication (blocking)
■ Assume a tiny cache with 4 lines of 8 bytes (2 ints) ■ S = 1, E = 4, B = 8
■ Let’s see what happens if we now use blocking
c[0][0] += a[0][0] * b[0][0]
Key: Grey = accessed
Dark grey = currently accessing Red border = in cache
c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0]
Key: Grey = accessed
Dark grey = currently accessing Red border = in cache
0 0 0 1 1 0
c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1]
Key: Grey = accessed
Dark grey = currently accessing Red border = in cache
00 10 20 30
0 0 0 1 1 0 1 1
c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1]
Key: Grey = accessed
Dark grey = currently accessing Red border = in cache
0 0 0 1 1 0 1 1 0 0
c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0]
Key: Grey = accessed
Dark grey = currently accessing Red border = in cache
0 0 0 1 1 0 1 1 0 0 0 1
c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0]
Key: Grey = accessed
Dark grey = currently accessing Red border = in cache
0 0 0 1 1 0 1 1 0 0 0 1 1 0
c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1]
Key: Grey = accessed
Dark grey = currently accessing Red border = in cache
0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1
c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1]
Key: Grey = accessed
Dark grey = currently accessing Red border = in cache
0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1
c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1]
c[0][0] += a[0][2] * b[2][0]
0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1
c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1]
c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0]
0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1
c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1]
8 0 9 0 10 0
c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0] c[0][1] += a[0][2] * b[2][1]
0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1
c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1]
80 90 100 110
c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0] c[0][1] += a[0][2] * b[2][1] c[0][1] += a[0][3] * b[3][1]
0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1
c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1]
8 0 9 0 10 0 11 0 12 1
c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0] c[0][1] += a[0][2] * b[2][1] c[0][1] += a[0][3] * b[3][1] c[1][0] += a[1][2] * b[2][0]
0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1
c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1]
iter i j k
8 0 0 2 9 0 0 3 10 0 1 2 11 0 1 3 12 1 0 2 13 1 0 3
c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0] c[0][1] += a[0][2] * b[2][1] c[0][1] += a[0][3] * b[3][1] c[1][0] += a[1][2] * b[2][0] c[1][0] += a[1][3] * b[3][0]
0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1
c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1]
iter i j k
8 0 0 2 9 0 0 3 10 0 1 2 11 0 1 3 12 1 0 2 13 1 0 3 14 1 1 2
c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0] c[0][1] += a[0][2] * b[2][1] c[0][1] += a[0][3] * b[3][1] c[1][0] += a[1][2] * b[2][0] c[1][0] += a[1][3] * b[3][0] c[1][1] += a[1][2] * b[2][1]
0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1
c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1]
iter i j k
8 0 0 2 9 0 0 3 10 0 1 2 11 0 1 3 12 1 0 2 13 1 0 3 14 1 1 2 15 1 1 3
c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0] c[0][1] += a[0][2] * b[2][1] c[0][1] += a[0][3] * b[3][1] c[1][0] += a[1][2] * b[2][0] c[1][0] += a[1][3] * b[3][0] c[1][1] += a[1][2] * b[2][1] c[1][1] += a[1][3] * b[3][1]
0 0 0 1 1 0 1 1 0 0 0 1 1 0 1 1
c[0][0] += a[0][0] * b[0][0] c[0][0] += a[0][1] * b[1][0] c[0][1] += a[0][0] * b[0][1] c[0][1] += a[0][1] * b[1][1] c[1][0] += a[1][0] * b[0][0] c[1][0] += a[1][1] * b[1][0] c[1][1] += a[1][0] * b[0][1] c[1][1] += a[1][1] * b[1][1]
8 0 9 0 10 0 11 0 12 1 13 1 14 1 15 1
0 2 0 3 1 2 1 3
c[0][0] += a[0][2] * b[2][0] c[0][0] += a[0][3] * b[3][0] c[0][1] += a[0][2] * b[2][1] c[0][1] += a[0][3] * b[3][1]
c[1][0] += a[1][2] * b[2][0] c[1][0] += a[1][3] * b[3][0] c[1][1] += a[1][2] * b[2][1]
What is the miss rate of a?
0 3 1 2 1 3
c[1][1] += a[1][3] * b[3][1] What is the miss rate of b?
Introduction to Git
Version control is your friend
What is Git?
• Most widely used version control system out there
• Version control:
• Help track changes to your source code over time
• Help teams manage changes on shared code
Git Commands
• Clone: git clone
• Add: git add . or git add
• Push / Pull: git push / git pull
• Commit: git commit -m “your-commit-message”
• Good commit messages are key!
• Bad:“commit”, “change”, “fixed”
• Good: “Fixed buffer overflow potential in AttackLab”
If you get stuck…
■ Reread the writeup
■ Look at CS:APP Chapter 6
■ Review lecture notes (http://cs.cmu.edu/~213)
■ Come to Office Hours (Sunday to Friday, 6:00-10:00 PM Locations on Piazza)
■ Post private question on Piazza
■ man malloc, man valgrind, man gdb
Practice Problems
Class Question / Discussions
■ We’ll work through a series of questions
■ Write down your answer for each question ■ You can discuss with your classmates
What Type of Locality?
• The following function exhibits which type of
locality? Consider only array accesses.
void who(int *arr, int size) {
for (int i = 0; i < size-1; ++i)
arr[i] = arr[i+1];
A. B. C. D.
Spatial Temporal
Both A and B Neither A nor B
What Type of Locality?
• The following function exhibits which type of
locality? Consider only array accesses.
void who(int *arr, int size) {
for (int i = 0; i < size-1; ++i)
arr[i] = arr[i+1];
A. B. C. D.
Spatial Temporal
Both A and B Neither A nor B
What Type of Locality?
• The following function exhibits which type of
locality? Consider only array accesses.
void coo(int *arr, int size) {
for (int i = size-2; i >= 0; –i)
arr[i] = arr[i+1];
A. B. C. D.
Spatial Temporal
Both A and B Neither A nor B
What Type of Locality?
• The following function exhibits which type of
locality? Consider only array accesses.
void coo(int *arr, int size) {
for (int i = size-2; i >= 0; –i)
arr[i] = arr[i+1];
A. B. C. D.
Spatial Temporal
Both A and B Neither A nor B
Calculating Cache Parameters
• Given the following address partition, how many
int values will fit in a single data block?
# of int in block 0
Unknown: We need more info
B. index offset C. D.
Calculating Cache Parameters
• Given the following address partition, how many
int values will fit in a single data block?
# of int in block 0
Unknown: We need more info
B. index offset C. D.
Direct-Mapped Cache Example
• Assuming a 32-bit address (i.e. m=32), how many bits are used for tag (t), set index (s), and block offset (b).
per data block
Cache block
Cache block
Cache block
Cache block
Set 1: Set 2:
E = 1 lines per set
tsb A.123 B. 27 2 3 C. 25 4 3 D. 1 4 8 E. 20 4 8
t s bits b
Tag Set index Block offset
Direct-Mapped Cache Example
• Assuming a 32-bit address (i.e. m=32), how many bits are used for tag (t), set index (s), and block offset (b).
per data block
Cache block
Cache block
Cache block
Cache block
Set 1: Set 2:
E = 1 lines per set
tsb A.123 B. 27 2 3 C. 25 4 3 D. 1 4 8 E. 20 4 8
t s bits b
Tag Set index Block offset
Which Set Is it?
• Which set is the address 0xFA1C located in?
per data block
Cache block
Cache block
Cache block
Cache block
Set 1: Set 2:
E = 1 lines per set
Set # for 0xFA1C 0
More than one of the above
Tag Set index Block offset
Which Set Is it?
• Which set is the address 0xFA1C located in?
per data block
Cache block
Cache block
Cache block
Cache block
Set 1: Set 2:
E = 1 lines per set
Set # for 0xFA1C 0
More than one of the above
Tag Set index Block offset
Cache Block Range
• What range of addresses will be in the same block as
address 0xFA1C? 8 bytes
per data block
Cache block
Cache block
Cache block
Cache block
Set 1: Set 2:
A. B. C. D. E.
Addr. Range
0xFA1C – 0xFA23
0xFA1C – 0xFA1F
0xFA18 – 0xFA1F
It depends on the access size (byte, word, etc)
Tag Set index Block offset
Cache Block Range
• What range of addresses will be in the same block as
address 0xFA1C? 8 bytes
per data block
Cache block
Cache block
Cache block
Cache block
Set 1: Set 2:
A. B. C. D. E.
Addr. Range
0xFA1C – 0xFA23
0xFA1C – 0xFA1F
0xFA18 – 0xFA1F
It depends on the access size (byte, word, etc)
Tag Set index Block offset
Cache Misses
If N = 16, how many bytes does the loop access of a?
int foo(int* a, int N)
int sum = 0;
for(i = 0; i < N; i++)
sum += a[i]; }
return sum; }
Accessed Bytes
A4 B 16 C 64 D 256
Cache Misses
If N = 16, how many bytes does the loop access of a?
int foo(int* a, int N)
int sum = 0;
for(i = 0; i < N; i++)
sum += a[i]; }
return sum; }
Accessed Bytes
A4 B 16 C 64 D 256
Cache Misses
Consider a 32 KB cache in
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com