程序代写代做代考 cache simulator compiler cache Carnegie Mellon

Carnegie Mellon
Design and Debugging
15-213/18-213/15-513: Introduction to Computer Systems 12th Lecture, June 18, 2020
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
1

Carnegie Mellon
After this lecture
 You will be able to:
▪ Describe the steps to debug complex code failures
▪ Identify ways to manage the complexity when programming ▪ State guidelines for communicating the intention of the code
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
2

Carnegie Mellon
Outline
 Debugging
▪ Defects and Failures
▪ Scientific Debugging ▪ Tools
 Design
▪ Managing complexity
▪ Communication ▪ Naming
▪ Comments
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
3

Carnegie Mellon
Defects and Infections
1. The programmer creates a defect 2. The defect causes an infection
3. The infection propagates
4. The infection causes a failure
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
4

Carnegie Mellon
Curse of Debugging
 Not every defect causes a failure!
 Testing can only show the presence of errors – not their absence. (Dijkstra 1972)
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
5

Carnegie Mellon
Defects to Failures
 Code with defects will introduce erroneous or “infected” state
▪ Correct code may propagate this state
▪ Eventually an erroneous state is observed
 Some executions will not trigger the defect
▪ Others will not propagate “infected” state
 Debugging sifts through
the code to find the defect
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
6

Carnegie Mellon
Explicit Debugging
 Stating the problem
▪ Describe the problem aloud or in writing
▪ A.k.a. “Rubber duck” or “teddy bear” method
▪ Often a comprehensive problem description is sufficient to solve
the failure
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
7

Carnegie Mellon
Scientific Debugging
 Before debugging, you need to construct a hypothesis as to the defect
▪ Propose a possible defect and why it explains the failure conditions  Ockham’s Razor – given several hypotheses, pick the
simplest / closest to current work
Failing Runs
Code
Problem Description
Other Runs
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
8
Hypothesis

Carnegie Mellon
Scientific Debugging
▪ Make predictions based on your hypothesis
▪ What do you expect to happen under new conditions ▪ What data could confirm or refute your hypothesis
▪ How can I collect that data? ▪ What experiments?
▪ What collection mechanism?
Prediction
▪ Does the data refute the hypothesis?
▪ Refine the hypothesis based on the new inputs
Observation & Conclusion
Hypothesis
Experiment
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
9

Carnegie Mellon
Scientific Debugging
 A set of experiments has confirmed the hypothesis ▪ This is the diagnosis of the defect
 Develop a fix for the defect
 Run experiments to confirm the fix
▪ Otherwise, how do you know that it is fixed?
Conclusion Diagnosis Fix Confirm
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
10

Carnegie Mellon
Code with a Bug
int fib(int n)
{
int f, f0 = 1, f1 = 1; while (n > 1) {
n = n – 1;
f = f0 + f1;
f0 = f1;
f1 = f;
}
return f; }
int main(..) {
..
for (i = 9; i > 0; i–) printf(“fib(%d)=%d\n”,
i, fib(i));
$ gcc -o fib fib.c fib(9)=55 fib(8)=34

fib(2)=2
fib(1)=134513905
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
11
A defect has caused a failure.

Carnegie Mellon
Constructing a Hypothesis
 Specification defined the first Fibonacci number as 1 ▪ We have observed working runs (e.g., fib(2))
▪ We have observed a failing run ▪ We then read the code
 fib(1) failed
Code
for (i = 9; …) while (n > 1) { int f;
// Hypothesis
Hypothesis
Result depends on order of calls Loop check is incorrect
F is uninitialized
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
12

Carnegie Mellon
Brute Force Approach
 First, compilation flags ▪ MUST include “-Wall”
▪ Should include “-Werror”
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
13
Prompt> gcc -Wall -Werror -O3 -o badfib badfib.c
badfib.c: In function ‘fib’:
badfib.c:12:5: error: ‘f’ may be used uninitialized in this funct
return f;
^
cc1: all warnings being treated as errors
i

Carnegie Mellon
Brute Force Approach

fib(2)=2
 First, compilation flags: “-Wall –Werror”
▪ MUST include “-Wall”
▪ Should include “-Werror”
fib(1)=0
 Second, other optimization levels …
▪ Try at least –O3 and –O0
fib(2)=2
fib(1)=0
fib(0)=0
prompt>gcc -O1 -o badfib badfib.c
prompt>./badfib

fib(2)=2
fib(1)=9
fib(0)=9
prompt>gcc -O0 -o badfib badfib.c prompt>./badfib

fib(2)=2
fib(1)=2
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
14
prompt>gcc -O3 -o badfib badfib.c
prompt>./badfib
fib(0)=0
prompt>gcc -O2 -o badfib badfib.c
prompt>./badfib
fib(0)=2

Carnegie Mellon
Brute Force Approach
 First, compilation flags: “-Wall –Werror”
▪ MUST include “-Wall”
▪ Should include “-Werror”
 Second, other optimization levels ▪ Try at least –O3 and –O0
 Valgrind (even if your program appears to be working!) ▪ Run on both –O3 and –O0
▪ Only run after all warnings are gone!
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
15

Carnegie Mellon
prompt> gcc -g -O3 -o badfib badfib.c
prompt> valgrind badfib
==1462== Memcheck, a memory error detector
==1462== Copyright (C) 2002-2017, and GNU GPL’d, by Julia ==1462== Using Valgrind-3.13.0 and LibVEX; rerun with -h ==1462== Command: badfib
==1462==
fib(9)=55
fib(8)=34
fib(7)=21
fib(6)=13
fib(5)=8
fib(4)=5
fib(2)=2
fib(1)=0
fib(0)=0
fib(3)=3
Valgrind is not perfect. On –O3 it finds no errors!
==1462==
16
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition

Carnegie Mellon
prompt> gcc -g -O0 -o badfib badfib.c
prompt> valgrind badfib
==1561== Memcheck, a memory error detector
==1561== Copyright (C) 2002-2017, and GNU GPL’d, by Julia ==1561== Using Valgrind-3.13.0 and LibVEX; rerun with -h ==1561== Command: badfib
==1561==
fib(9)=55
fib(8)=34
fib(7)=21
fib(6)=13
fib(5)=8
fib(4)=5
fib(2)=2
==1561== Conditional jump or move depends on uninitialise
==1561== at 0x4E988DA: vfprintf (vfprintf.c:1642)
fib(3)=3
Valgrind is not perfect, but pretty darn good.
==1561== by 0x4EA0F25: printf (printf.c:33)
17
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition

Carnegie Mellon
Constructing a Hypothesis
 Specification defined the first Fibonacci number as 1 ▪ We have observed working runs (e.g., fib(2))
▪ We have observed a failing run ▪ We then read the code
 fib(1) failed
Code
for (i = 9; …) while (n > 1) { int f;
// Hypothesis
Hypothesis
Result depends on order of calls Loop check is incorrect
F is uninitialized
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
19

Carnegie Mellon
Prediction
 Propose a new condition or conditions
▪ What will logically happen if your hypothesis is correct? ▪ What data can be
 fib(1) failed // Hypothesis ▪ // Result depends on order of calls
▪ If fib(1) is called first, it will return correctly. ▪ // Loop check is incorrect
▪ Change to n >= 1 and run again. ▪ // f is uninitialized
▪ Change to int f = 1;
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
20

Carnegie Mellon
Experiment
 Identical to the conditions of a prior run
▪ Except with one condition changed  Conditions
▪ Program input, using a debugger, altering the code
 fib(1) failed // Hypothesis
▪ If fib(1) is called first, it will return correctly.
▪ Fails.
▪ Change to n >= 1
▪ fib(1)=2
▪ fib(0)=…
▪ Change to int f = 1;
▪ Works. Sometimes a prediction can be a fix. Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
21

Carnegie Mellon
Observation
 What is the observed result?
▪ Factual observation, such as “Calling fib(1) will return 1.” ▪ The conclusion will interpret the observation(s)
 Don’t interfere.
▪ printf() can interfere
▪ Like quantum physics, sometimes observations are part of the experiment
 Proceed systematically.
▪ Update the conditions incrementally so each observation relates to
a specific change
 Do NOT ever proceed past first bug. Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
22

Carnegie Mellon
Debugging Tools
 Observing program state can require a variety of tools ▪ Debugger (e.g., gdb)
▪ What state is in local / global variables (if known) ▪ What path through the program was taken
▪ Valgrind
▪ Does execution depend on uninitialized variables ▪ Are memory accesses ever out-of-bounds
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
23

Carnegie Mellon
Diagnosis
 A scientific hypothesis that explains current observations and makes future predictions becomes a theory
▪ We’ll call this a diagnosis
 Use the diagnosis to develop a fix for the defect ▪ Avoid post hoc, ergo propter hoc fallacy
▪ Or correlation does not imply causation
 Understand why the defect and fix relate
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
24

Carnegie Mellon
Fix and Confirm
 Confirm that the fix resolves the failure
 If you fix multiple perceived defects, which fix was for the failure?
▪ Be systematic
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
26

Carnegie Mellon
Learn
 Common failures and insights ▪ Why did the code fail?
▪ What are my common defects?
 Assertions and invariants
▪ Add checks for expected behavior
▪ Extend checks to detect the fixed failure
 Testing
▪ Every successful set of conditions is added to the test suite
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
27

Carnegie Mellon
Quick and Dirty
 Not every problem needs scientific debugging ▪ Set a time limit: (for example)
▪ 0 minutes – -Wall, valgrind
▪ 1 – 10 minutes – Informal Debugging
▪ 10 – 60 minutes – Scientific Debugging
▪ > 60 minutes – Take a break / Ask for help
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
28

Carnegie Mellon
Code Smells
 Use of uninitialized variables  Unused values
 Unreachable code
 Memory leaks
 Interface misuse  Null pointers
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
29

Carnegie Mellon
Outline
 Debugging
▪ Defects and Failures
▪ Scientific Debugging ▪ Tools
 Design
▪ Managing complexity
▪ Communication ▪ Naming
▪ Comments
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
30

Carnegie Mellon
Design
 A good design needs to achieve many things: ▪ Performance
▪ Availability
▪ Modifiability, portability
▪ Scalability
▪ Security
▪ Testability
▪ Usability
▪ Cost to build, cost to operate
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
31

Carnegie Mellon
Design
 A good design needs to achieve many things: ▪ Performance
▪ Availability
▪ Modifiability, portability
▪ Scalability
▪ Security
▪ Testability
▪ Usability
▪ Cost to build, cost to operate
But above all else: it must be readable
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
32

Carnegie Mellon
Design
Good Design does:
Complexity Management & Communication
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
33

Carnegie Mellon
Complexity
 There are well known limits to how much complexity a human can manage easily.
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
34

Carnegie Mellon
Complexity Management
 However, patterns can be very helpful…
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
35

Carnegie Mellon
Complexity Management
Many techniques have been developed to help manage complexity:
 Separation of concerns  Modularity
 Reusability
 Extensibility
 DRY
 Abstraction
 Information Hiding  …
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
36

Carnegie Mellon
Managing Complexity
 Given the many ways to manage complexity ▪ Design code to be testable
▪ Try to reuse testable chunks
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
37

Carnegie Mellon
Complexity Example
 Split a cache access into three+ testable components ▪ State all of the steps that a cache access requires
Convert address into tag, set index, block offset
Look up the set using the set index
Check if the tag matches any line in the set If so, hit
If not a match, miss, then
Find the LRU block
Evict the LRU block
Read in the new line from memory
Update LRU
Update dirty if the access was a store
▪ Which steps depend on the operation being a load or a store? Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
38

Carnegie Mellon
Complexity Example
 Split a cache access into three+ testable components ▪ State all of the steps that a cache access requires
Convert address into tag, set index, block offset
Look up the set using the set index
Check if the tag matches any line in the set If so, hit
If not a match, miss, then
Find the LRU block
Evict the LRU block
Read in the new line from memory
Update LRU
Update dirty if the access was a store
▪ Which steps depend on the operation being a load or a store? Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
39

Carnegie Mellon
Designs need to be testable
 Testable design
▪ Testing versus Contracts
▪ These are complementary techniques
 Testing and Contracts are
▪ Acts of design more than verification ▪ Acts of documentation
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
40

Carnegie Mellon
Designs need to be testable
 Testable design
▪ Testing versus Contracts
▪ These are complementary techniques
 Testing and Contracts are
▪ Acts of design more than verification
▪ Acts of documentation: executable documentation!
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
41

Carnegie Mellon
Testing Example
 For your cache simulator, you can write your own traces ▪ Write a trace to test for a cache hit
L 50, 1 L 50, 1
▪ Write a trace to test dirty bytes in cache S 100, 1
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
42

Carnegie Mellon
Testable design is modular
 Modular code has: separation of concerns, encapsulation, abstraction
▪ Leads to: reusability, extensibility, readability, testability
 Separation of concerns
▪ Create helper functions so each function does “one thing”
▪ Functions should neither do too much nor too little ▪ Avoid duplicated code
 Encapsulation, abstraction, and respecting the interface
▪ Each module is responsible for its own internals
▪ No outside code “intrudes” on the inner workings of another module
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
43

Carnegie Mellon
Trust the Compiler!
 Use plenty of temporary variables  Use plenty of functions
 Let compiler do the math
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
44

Carnegie Mellon
Communication
When writing code, the author is communicating with:  The machine
 Other developers of the system
 Code reviewers
 Their future self
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
45

Carnegie Mellon
Communication
There are many techniques that have been developed around code communication:
 Tests
 Naming
 Comments
 Commit Messages  Code Review
 Design Patterns
 …
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
46

Carnegie Mellon
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
47
Naming

Carnegie Mellon
Avoid deliberately meaningless names:
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
48

Carnegie Mellon
Naming is understanding
“If you don’t know what a thing should be called, you cannot know what it is.
If you don’t know what it is, you cannot sit down and write the code.” – Sam Gardiner
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
49

Carnegie Mellon
Better naming practices
1. Start with meaning and intention
2. Use words with precise meanings (avoid “data”, “info”, “perform”)
3. Prefer fewer words in names
4. Avoid abbreviations in names
5. Use code review to improve names
6. Read the code out loud to check that it sounds okay
7. Actually rename things
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
50

Carnegie Mellon
Naming guidelines – Use dictionary words
 Only use dictionary words and abbreviations that appear in a dictionary.
▪ For example: FileCpy -> FileCopy
▪ Avoid vague abbreviations such as acc, mod, auth, etc..
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
51

Carnegie Mellon
Avoid using single-letter names  Single letters are unsearchable
▪ Give no hints as to the variable’s usage  Exceptions are loop counters
▪ Especially if you know why i, j, etc were originally used
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
52

Carnegie Mellon
Limit name character length
“Good naming limits individual name length, and reduces the need for specialized vocabulary” – Philip Relf
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
53

Carnegie Mellon
Limit name word count
 Keep names to a four word maximum
 Limit names to the number of words that people can read at a glance.
 Which of each pair do you prefer? a1) arraysOfSetsOfLinesOfBlocks
a2) cache
b1) evictedData
b2) evictedDataBytes
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
54

Carnegie Mellon
Describe Meaning  Use descriptive names.
 Avoid names with no meaning: a, foo, blah, tmp, etc
 There are reasonable exceptions:
void swap(int* a, int* b) {
int tmp = *a;
*a = *b;
*b = tmp;
}
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
55

Carnegie Mellon
Use a large vocabulary  Be more specific when possible:
▪ Person -> Employee
 What is size in this binaryTree?
struct binaryTree {
int size;

};
height
numChildren
subTreeNumNodes
keyLength
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
56

Carnegie Mellon
Use problem domain terms
 Use the correct term in the problem domain’s language.
▪ Hint: as a student, consider the terms in the assignment  In cachelab, consider the following:
line element
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
57

Carnegie Mellon
Use opposites precisely
 Consistently use opposites in standard pairs
▪ first/end -> first/last
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
58

Carnegie Mellon
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
59
Comments

Carnegie Mellon
Don’t Comments
▪ Don’t say what the code does
▪ because the code already says that ▪ Don’t explain awkward logic
▪ improve the code to make it clear ▪ Don’t add too many comments
▪ it’s messy, and they get out of date
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
60

Carnegie Mellon
Awkward Code
 Imagine someone (TA, employer, etc) has to read your code
▪ Would you rather rewrite or comment the following?
(*(void **)((*(void **)(bp)) + DSIZE)) = (*(void **)(bp + DSIZE));
▪ How about? bp->prev->next = bp->next;
▪ Both lines update program state in the same way.
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
61

Carnegie Mellon
Do Comments
 Answer the question: why the code exists
 When should I use this code?
 When shouldn’t I use it?
 What are the alternatives to this code?
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
62

Carnegie Mellon
Why does this exist?
 Explain why a magic number is what it is.
// Each address is 64-bit, which is 16 + 1 hex characters const int MAX_ADDRESS_LENGTH = 17;
 When should this code be used? Is there an alternative?
unsigned power2(unsigned base, unsigned expo){
unsigned i;
unsigned result = 1; for(i=0;i foo.c
▪ The commit messages are your record of your work ▪ Communicating to your future self
▪ Describe in one line what you did
“Parses command line arguments”
“fix bug in unique tests, race condition not solved” “seg list finished, performance is …”
 Use branches
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
67

Carnegie Mellon
Summary
 Programs have defects
▪ Be systematic about finding them
 Programs are more complex than humans can manage
▪ Write code to be manageable
 Programming is not solitary, even if you are communicating with a grader or a future self
▪ Be understandable in your communication
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
68

Carnegie Mellon
Acknowledgements
 Some debugging content derived from:
▪ http://www.whyprogramsfail.com/slides.php
 Some code examples for design are based on:
▪ “The Art of Readable Code”. Boswell and Foucher. 2011.  Lecture originally written by
▪ Michael Hilton and Brian Railing
Bryant and O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition
69