
EECS 376: Foundations of Computer Science
University of Michigan, Winter 2022, Discussion Notes 1

1 Proof Methods
In this section, we review some techniques for proving statements.
1.1 Direct proof


Direct proof requires that we start with a collection of assumptions and get directly to a conclusion. When we say direct, we mean that we assume a statement P, prove that P =⇒ Q, and then conclude that Q must be true.
We will do one example, which shows how to complete an “if and only if” proof.

Proposition 1.1. Prove that x = y if and only if xy = (x+y)^2/4.
Aside. You will need to prove two “directions” here: the “if” and the “only if” part. In the future, you may find it convenient to treat each “direction” as an independent proof; that is, you could complete one direction with direct proof and the other using proof by contrapositive (coming up).
Proof. Begin by supposing that x = y. First, we seek to show that xy = (x+y)^2/4. Then, using the initial assumption,

    (x+y)^2/4 = (x+x)^2/4 = (2x)^2/4 = x^2 = xy.
To show the other direction, we forget what we just did and instead assume that xy = (x+y)^2/4. Now, we seek to show that x = y. Beginning with our assumption, we deduce:

    4xy = (x+y)^2  =⇒  x^2 − 2xy + y^2 = 0  =⇒  (x−y)^2 = 0.
This occurs exactly when x = y, which is what we wanted to show.

1.2 Proof by contradiction
Suppose we want to prove the statement P. We can do this by first assuming that P is false, and then arriving at a contradiction. Since the assumption that P is false led to a contradiction, it must be that P is true. We illustrate this by proving
Proposition 1.2. There are infinitely many primes.
Proof. Suppose for the sake of contradiction that there are only finitely many primes. Let p1, . . . , pm
be the list of all of them. Consider the integer
    q = 1 + p1 p2 · · · pm.
Let p be a prime that divides q. We claim that p ≠ pj for every j. For the sake of contradiction, suppose that p = pj for some j. Then, since pj divides both q and p1 p2 · · · pm, it divides their difference, q − p1 p2 · · · pm = 1, which is impossible because no prime divides 1. Therefore, p is a new prime not on the list, contradicting the fact that the list contains all the primes. We conclude that there are infinitely many primes.
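Euclid’s argument can also be illustrated numerically. The sketch below (Python; the helper name `smallest_prime_factor` is our own, not part of the notes) builds q from the first six primes and finds a prime factor of q that is not on the list:

```python
from math import prod

def smallest_prime_factor(n):
    """Return the smallest prime factor of n >= 2 by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor found, so n itself is prime

primes = [2, 3, 5, 7, 11, 13]   # pretend this list were "all" the primes
q = 1 + prod(primes)            # q = 30031
p = smallest_prime_factor(q)    # p = 59, which is not on the list
print(q, p, p in primes)        # 30031 59 False
```

Note that q itself need not be prime (30031 = 59 · 509); the proof only guarantees that some prime factor of q is missing from the list.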

1.3 Proving the contrapositive
Suppose we want to prove a statement of the form P ⇒ Q. We can instead prove the contrapositive of the statement, ¬Q ⇒ ¬P, which is logically equivalent to it. Sometimes it may be easier to prove the contrapositive than the original statement. We illustrate this technique by proving
Proposition 1.3. Let a, b ∈ Z, and suppose ab is odd. Then a and b are both odd.
Proof. We will prove the contrapositive: if one of a and b is even, then ab is even.
So suppose that one of a and b is even; without loss of generality, assume it is a. Then a = 2k
for some integer k, and hence ab = 2(kb), which is even.

2 Asymptotic (“Big-O”) Notation
“Big-O” notation is a means for reasoning about asymptotic quantities and relationships. In particular, Big-O notation allows us to quickly describe what happens in limiting cases, without worrying about specific constants. This is an important analysis technique in computer science that we will use throughout this course.
For example, consider two programs A and B that take in an input and solve a problem. How long they take to solve the problem depends on the input: if they receive a large input, they will naturally take longer to run. However, suppose that for all inputs, B runs one second slower than A. Of course, this may make a difference in some applications (for example, if A and B are powering some aspect of user interaction on a website), but B can basically do anything that A can. Any problem we are trying to solve with A, we can solve with B, if we are willing to wait just one more second. In this course, we are less interested in specific applications and more interested in what problems are (efficiently) solvable at all, so we need a notation that allows us to see that there is not a big difference between A and B.
Suppose instead that B takes twice as long to run as A. This may seem like a big difference, and indeed it is bigger than before. If A takes 10 seconds to run, B would take 20 seconds. If A takes 1000 seconds to run, B takes 2000 seconds. So there is no fixed upper bound on how much longer B may make us wait than A. However, although waiting on B may be frustrating, in a sense if A can solve a problem in a “reasonable” amount of time, then B will also be able to solve the problem in a “reasonable” amount of time. Even if A took a week (for example, A may involve training a big neural network for a machine learning application), waiting two weeks for B would not be fun, but it would still be doable. In particular, there is always hope that someone will develop better hardware, or a slightly more customized implementation, and bring the running time of B right on par with A.
On the other hand, suppose B’s running time is exponential in A’s: if A takes t seconds, B takes e^t seconds. This may not be a problem when A runs quickly. For example, if A takes 2 seconds, B would take e^2 ≈ 7.4 seconds. But if A takes 200 seconds, B would take e^200 ≈ 7.2 × 10^86 seconds, which is more seconds than there are estimated to be atoms in the universe! Needless to say, no matter how much we optimize hardware or tweak the implementation, there is no way that B will be solving the problem in any of our lifetimes, while A will be done in about the time it might take to brew a cup of tea while we wait. Now there is a clear difference: there are some problems that we can solve with A that are clearly beyond B’s reach.
What we see in the final scenario is that there is a point at which, if the input is big enough, there will be a huge gulf between A and B. Thus, we would say that “asymptotically”, as we consider larger and larger inputs, the difference between A and B will only grow to the point that

B will never stand a chance of keeping pace. This is what we will formalize with our definition of big-O notation.
Definition 2.1. Let f(n) and g(n) be functions. If there exist constants M > 0 and n0 such that
for all n > n0, |f(n)| ≤ M|g(n)|, then we say that f(n) = O(g(n)).
In a sense, big-O is like asymptotic ≤ notation. Similarly, we have asymptotic ≥ notation,
“big-omega”:
Definition 2.2. Let f(n) and g(n) be functions. If there exist constants M > 0 and n0 such that
for all n > n0, |f(n)| ≥ M|g(n)|, then we say that f(n) = Ω(g(n)).
We can also define asymptotic equality by simply combining our big-O and big-Ω notations:
Definition 2.3. Let f(n) and g(n) be functions. If f(n) = O(g(n)) and f(n) = Ω(g(n)), then we say that f(n) = Θ(g(n)).
Of the asymptotic notations, big-O and big-Θ are by far the most commonly used. Big-Ω and the following notations also come up in literature fairly frequently, but are less common to see in practice than big-O and big-Θ.
We can define asymptotic <, which we call “little-o”:

Definition 2.4. Let f(n) and g(n) be functions. If for all M > 0 there exists a constant n0 such that for all n > n0, |f(n)| < M|g(n)|, then we say that f(n) = o(g(n)).

Asymptotic > notation is (you guessed it) “little-omega”:
Definition 2.5. Let f(n) and g(n) be functions. If for all M > 0 there exists a constant n0 such that for all n > n0, |f(n)| > M|g(n)|, then we say that f(n) = ω(g(n)).
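Before the examples, a quick numeric sanity check of these definitions can be helpful. The sketch below (Python; `bounded_above` is our own illustrative helper) tests whether candidate witnesses M and n0 satisfy the big-O inequality at sampled points. Checking finitely many n is evidence, not a proof:

```python
def bounded_above(f, g, M, n0, n_max=10**6, step=997):
    """Check |f(n)| <= M * |g(n)| at sampled n in (n0, n_max)."""
    return all(abs(f(n)) <= M * abs(g(n))
               for n in range(n0 + 1, n_max, step))

# 2n = O(n): the witnesses M = 2, n0 = 1 work.
print(bounded_above(lambda n: 2 * n, lambda n: n, M=2, n0=1))    # True

# n^2 is NOT O(n): no fixed M works; M = 100 already fails once n > 100.
print(bounded_above(lambda n: n * n, lambda n: n, M=100, n0=1))  # False
```

A False result only rules out one particular (M, n0) pair; a True result over a finite range suggests, but does not prove, the asymptotic bound.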
Example 2.6. Let us compare the runtimes of program B from our second scenario and B from our final scenario. If the runtime of A is n, then we could say that B’s runtime in the second scenario is f(n) = 2n, and in the final scenario it is g(n) = e^n. What is the asymptotic relationship of f(n) to g(n)? (Fill in the blank: f(n) = ___(g(n)).)
Solution: f(n) = o(g(n)). To see this, note that lim_{n→∞} 2n/e^n = 0 (by L’Hôpital’s rule, for example), so for every M > 0 there is an n0 such that 2n < M · e^n for all n > n0.
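The little-o claim can also be seen numerically: the ratio 2n/e^n shrinks toward 0 as n grows. A short check (Python; a sanity check, not a proof):

```python
import math

# ratio f(n)/g(n) = 2n / e^n at a few sample points
ratios = [2 * n / math.exp(n) for n in (1, 5, 10, 20)]
print(ratios)
print(all(a > b for a, b in zip(ratios, ratios[1:])))  # True: strictly decreasing
```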
Note: it would be correct to say that 2n = O(e^n) (if a < b then a ≤ b). However, although such a statement is true, it would not be the most helpful statement we could make. Saying that 2n = o(e^n) is a stronger statement that is also true, so in general we want to make the strongest statement we can. Similarly, if we were trying to reason about the asymptotic properties of f(n) = 2n, we could say “f(n) is O(e^n)”. This would be a true statement, but it would be overkill, since e^n asymptotically dominates 2n so much. It would not give us much helpful information about how fast f(n) actually does grow. A more helpful statement we can make (take a second to verify for yourself!) is that f(n) = O(n). This is an asymptotically tight upper bound. (In fact, a stronger statement we can make is that f(n) = Θ(n), but in practice many people will content themselves with giving the upper bound and saying “f(n) is O(n)”.)

Example 2.7. Let f(n) = 3n + 2 ln(n). What is the asymptotic relationship of f(n) to g(n) = n?

Solution: f(n) = Θ(n). To see this, we first show that f(n) = O(n) (one side of the asymptotic equality). We want to find M and n0 such that 3n + 2 ln(n) ≤ Mn for all n > n0. Rearranging, we need (M − 3)n ≥ 2 ln(n). This is easily verified to be true if we set M = 4 and n0 = 10, among other choices. For the other side of the asymptotic equality, we show that f(n) = Ω(n). This is easy to do: we can set M = 1 (or any other positive number less than 3) and n0 = 1 (or any other positive number), and verify that 3n + 2 ln(n) > n, i.e., 2n + 2 ln(n) > 0, for all n ≥ 1. Having shown both directions, that f(n) = O(n) and f(n) = Ω(n), we conclude that f(n) = Θ(n).
Note: similar to above, in practice you would be likely to hear someone just say “f(n) is O(n)”, since in many cases when reasoning about asymptotic behavior we are interested in asymptotic upper bounds, and this bound is tight. But Θ(n) is the most precise answer here, and there are times when it is important to reason about the bound from both sides. If in doubt from the context, better to be safe than sorry.
3 General Strategy in Proving f(n) = O(g(n)) and f(n) ≠ O(g(n))

To prove that f(n) = O(g(n)), there are several important approaches:

• if f(n) is in the form of a recurrence relation, verify whether we can use the Master Theorem, or whether a non-Master-Theorem recurrence substitution works (covered in next week’s lectures and section notes).

• if you can verify that

    lim_{n→∞} |f(n)/g(n)| < ∞,

this implies that f(n) = O(g(n)). You will sometimes find L’Hôpital’s rule very helpful when evaluating the limit.

• alternatively, you can try to argue according to the basic definition:

    ∃ n0, M > 0 s.t. ∀ n > n0, |f(n)| ≤ M|g(n)|.

This, however, requires you to show the existence of a pair n0, M with this property.

• in the case that f(n), g(n) are complicated, it might be worthwhile to find the dominant term beforehand. E.g., if f(n) = A(n) + B(n) + C(n) and B(n) + C(n) = O(A(n)), then f(n) = Θ(A(n)).

• if you are allowed to assume that one function is big-O of another, you can sometimes derive other big-O relationships. E.g., if we may assume that n = O(n^2), then we know there exists a pair n0, M such that ∀ n > n0, |n| ≤ M · n^2; this pair n0, M then implies that ∀ n > n0, |n · e^n| ≤ M · |n^2 · e^n|, because e^n is positive. We have thereby proved n e^n = O(n^2 e^n).

To prove that f(n) ≠ O(g(n)), there are several important approaches:

• if f(n) is in the form of a recurrence relation, verify whether we can use the Master Theorem to conclude that f(n) = Θ(y(n)) for some y(n); then, you can argue that y(n) ≠ O(g(n)) with the strategies below.

• if you can verify that

    lim_{n→∞} |f(n)/g(n)| = ∞,

this implies that f(n) ≠ O(g(n)).

• alternatively, you can argue from the negation of the definition:

    ∀ n0, M > 0, ∃ n > n0 s.t. |f(n)| > M|g(n)|.

This requires you to show that any pair n0, M has a corresponding n with the above property, which might suggest expressing n in terms of n0 and M and showing the inequality is always true.

• in the case that f(n), g(n) are complicated, looking for the dominant term would also help.

• if you can show that g(n) = o(f(n)), you can surely say f(n) ≠ O(g(n)). Remember, the little-o mentioned in the above statement can be established as follows:

    g(n) = o(f(n)) ⇐⇒ lim_{n→∞} |g(n)/f(n)| = 0.

You can think of little-o as a stronger form of big-O: big-O requires that the LHS grow no faster than the RHS, while little-o requires that the LHS grow much slower than the RHS.
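The limit strategy above can be explored numerically before attempting a proof. For example, to guess the relationship between f(n) = n^2 and g(n) = n ln n (our own illustrative choice of functions), tabulate the ratio:

```python
import math

f = lambda n: n * n              # f(n) = n^2
g = lambda n: n * math.log(n)    # g(n) = n ln n

for n in (10, 10**3, 10**6):
    print(n, f(n) / g(n))
# The ratio grows without bound, suggesting f(n) != O(g(n));
# indeed lim f/g = lim n / ln(n) = infinity (e.g., by L'Hopital's rule).
```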
4 Other proof principles

4.1 Proof by induction
Induction is used to prove a statement about a countably infinite set, such as the positive integers. To prove a statement P(n) about the positive integers, we show two things:
(1) P(1) is true.
(2) If k is any positive integer and P (k) is true, then P (k + 1) is true as well.
If we can show these two things, then we can conclude from (1) that P (1) is true. Then, using (1) and (2) together, we conclude that P (2) is true, and then so is P (3), and so on. As an application, we prove the following:
Proposition 4.1. Let S be a set with n elements, and let P(S) be the power set of S, i.e., the set of all subsets of S. Then P(S) contains 2^n elements.

Proof. We perform induction on the size of S. Let P(n) be the statement “A set of size n has 2^n subsets.” Our goal is to prove that P(n) is true for all nonnegative integers n.
Our base case will be when S is the empty set of size 0. In this case, S has exactly one subset, the empty set, and hence P(S) contains 1 = 2^0 elements. Therefore, P(0) is true.
Now, suppose that P (k) is true for some nonnegative integer k. We need to show that P (k + 1) is true. Let S be a set of size k+1. Single out an element a ∈ S. Then each element of P(S) either contains a, or it doesn’t.

The elements of P(S) that do not contain a are precisely the subsets of S \ {a}, the set S with a removed. This is a set of size k, and by the induction hypothesis, there are 2^k such sets.

The elements of P(S) that contain a are of the form T ∪ {a}, where T is an element of P(S) that does not contain a. Again, there are 2^k such sets. Therefore, the total number of elements of P(S) is 2^k + 2^k = 2 · 2^k = 2^(k+1). Therefore, we have shown that P(k + 1) is true, which completes the proof.
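Proposition 4.1 can be brute-force checked for small n by enumerating subsets directly (Python sketch; `power_set` is our own illustrative helper):

```python
from itertools import combinations

def power_set(s):
    """Return a list of all subsets of s."""
    items = list(s)
    return [set(c) for r in range(len(items) + 1)
                   for c in combinations(items, r)]

for n in range(6):
    assert len(power_set(range(n))) == 2 ** n
print("|P(S)| = 2^n verified for n = 0..5")
```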
4.2 The pigeonhole principle
The pigeonhole principle states that if n items are placed in m boxes and n > m, then at least one box contains more than one item. As an application, we prove the following:
Proposition 4.2. Suppose there are 6 people at a party. Then one of the following must be true:

(1) There is a group of 3 people who have all met each other before (they are friends).
(2) There is a group of 3 people who have never met each other (they are strangers).
Proof. Label the participants {A, B, C, D, E, F}. Consider person A, and imagine we place the 5 other people into 2 categories, “friend” or “stranger.” By the pigeonhole principle, one of these categories contains at least 3 people. Without loss of generality (by labeling the people appropriately), suppose that A has at least 3 friends, and they are B, C, and D. There are 3 cases:
Case 1. B, C, and D are all friends. Then they form a group satisfying condition (1) of the proposition, and we are done.
Case 2. Some 2 of them are friends, say, B and C. Then A, B, and C form a group satisfying condition (1) of the proposition, and we are done.
Case 3. None of them are friends. Then they form a group satisfying condition (2) of the proposition, and again we are done.
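Since the proof covers every possible party, Proposition 4.2 can also be verified exhaustively by computer: color each of the 15 pairs among 6 people “friend” or “stranger” and check every coloring for a monochromatic triangle (Python sketch; this is the statement that the Ramsey number R(3,3) is at most 6):

```python
from itertools import combinations, product

people = range(6)
pairs = list(combinations(people, 2))   # the 15 pairs among 6 people

def has_mono_triangle(coloring):
    """True if some 3 people are mutual friends or mutual strangers."""
    color = dict(zip(pairs, coloring))
    return any(color[(a, b)] == color[(a, c)] == color[(b, c)]
               for a, b, c in combinations(people, 3))

# All 2^15 = 32768 friend/stranger assignments contain such a triple.
print(all(has_mono_triangle(c) for c in product((0, 1), repeat=15)))  # True
```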
5 Information
Information is an important concept in not only computers but all aspects of life, so it will be a subject of interest to us. Since we are trying to abstract away from specific applications, we will need to formulate a rigorous, standardized way to view it. Information may be stored in many forms, and in the real world is often messy and unstandardized (take databases, for example). Information retrieval is an entire academic field of research, one of whose biggest challenges is the lack of standardization in human-generated and managed information.
For the purposes of this course, however, information is very simple: everything is a bitstring.
You have probably seen binary notation before. Binary notation is a way to represent numbers with strings of bits. For example, the number 19 can be represented by the bitstring 10011. Note that larger numbers require more bits to represent, which of course is true in our standard Arabic numeral scheme too. For example, the number 100 takes 3 digits to represent, or 7 bits in binary. The number 19 takes 2 digits, or 5 bits as we have seen.
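These counts are easy to reproduce with Python’s built-in base-2 formatting (a quick check of the examples above):

```python
for n in (19, 100):
    bits = format(n, "b")   # binary representation, no leading zeros
    print(n, bits, len(bits))
# 19 -> 10011 (5 bits); 100 -> 1100100 (7 bits), as claimed above.
```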
How might we represent language (text, or words) with bitstrings? There are many ways to do this. If you have heard of ASCII notation, which is a way of mapping letters (more generally characters) to numbers, we could map each character in a word to its ASCII number, and thus represent the word one character at a time. Of course, we can represent longer documents one word at a time in this manner. So it turns out that we can represent arbitrary information–files, websites, and even computer programs–as bitstrings!
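One concrete version of the character-at-a-time scheme described above (Python sketch; we use 8 bits per ASCII code, a common convention the notes do not pin down):

```python
def to_bitstring(word):
    """Encode a string as a bitstring, 8 bits per ASCII character."""
    return "".join(format(ord(ch), "08b") for ch in word)

print(to_bitstring("hi"))   # 0110100001101001
```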

Inputs to programs are also information (that the program needs to run), so we can also represent them as bitstrings. This bitstring may represent different types of information, depending on what the program does (but no matter the program its input can always be formatted as a bitstring, at least in our minds). For example, the input to some programs may even be the source code of another program! (For example, the first program might be a compiler.) Nevertheless, no matter how crazy the input, we can regard it for our purposes as just another bitstring.
