CS 70 Discrete Mathematics and Probability Theory
Spring 2018 Ayazifar and Solutions
CS 70, Spring 2018, Final Solutions 1
Copyright By PowCoder代写 加微信 powcoder
1. Discrete Math: True/False (12 parts: 3 points each.)
1. ∀x,∀y,¬P(x,y)≡¬∃y,∃x,P(x,y)
Answer: True. If for every x and y P(x, y) is not true, then there doesn’t exist an x and y where P(x, y) is true.
2. (P =⇒ Q)≡(Q =⇒ P).
Answer: False. The converse of a statement is not logically equivalent to the statement.
3. Any simple graph with n vertices can be colored with n − 1 colors. Answer: False. The complete graph requires n colors to properly color it.
4. The set of all finite, undirected graphs is countable.
Answer: True. Can enumerate by considering all graphs with just one vertex, then two vertices, three, etc., all of which have a finite number of configurations of edges.
5. Thefunction f(x)=ax (mod N)isabijectionfromandto{0,…,N−1}ifandonlyifgcd(a,N)=1. Answer: True. a has a multiplicative inverse mod N.
6. Foraprime p,thefunction f(x)=xd (mod p)isabijectionfromandto{0,…,p−1}whengcd(d,p− 1)=1.
Answer:True.Theinversefunctionisg(x)=xe (modp)withe=d−1 (modp−1).
7. A male optimal pairing cannot be female optimal.
Answer: False. There could just be a single stable pairing.
8. For any undirected graph, the number of odd-degree vertices is odd.
Answer: False. The sum of the degrees is even, since it is twice the number of edges, and thus the number of odd degree vertices is even.
9. For every real number x, there is a program that given k, will print out the kth digit of x. Answer: False. The number of programs is countable but the number of real numbers is not.
10. There is a program that, given another program P, will determine if P halts when given no input. Answer: False. One can reduce from the halting problem as follows. Given a program P and x, produce a program that has a constant string with value x and runs P on x. This program does not take any input, so determining if it halts also determines whether P halts on x.
11. Any connected simple graph with n vertices and exactly n edges is planar.
Answer: True. It is a tree plus an edge. Since a tree has one face, that edge can be drawn in that face.
12. Given two numbers, x and y, that are relatively prime to N, the product xy is relatively prime to N. Answer: True. Neither have a prime factor in common with N and neither does their product which only has prime factors from x and y.
2. Discrete Math:Short Answer (10 parts: 4 points each)
1. If gcd(x,y) = d, what is the least common multiple of x and y (smallest natural number n where both
x|n and y|n)? [Leave your answer in terms of x,y,d]
Answer: xy. We have x = kd and y = ld where gcd(k,l) = 1. xy = kld, and thus is divisible by x dd
and y. Moreover, n must contain all the prime factors of k and l and d and since they have none in common, n must be as large as this product.
2. Considerthegraphwithvertices{0,…,N−1}andedges(i,i+a) (mod N)forsomea̸≡0 (mod N). Let d = gcd(a,N). What is the length of the longest cycle in this graph in terms of some subset of N,a, and d?
Answer: N/d. If one takes j = N/d steps along edges from i one gets to i + ja = i (mod N) since
ja=N/d(kd)=Nk=0 (mod N).
3. What is the minimum number of women who get their favorite partner (first in their preference list) in a female optimal stable pairing? (Note that the minimum is over any instance.)
Answer: 0. Consider a 3-men, 3-women example where women A and B have man 1 as their favorite, and man 1 rejects B, who then proposes to woman C’s current partner. She then asks man 1, who likes her best and rejects woman A. Now, no woman has her favorite partner.
4. What is the number of ways to split 7 dollars among Alice, Bob and Eve? (Each person should get an
whole number of dollars.)
Answer: 9. We think of assigning a number to Alice, one to Bob and one to Eve, where the numbers 2
sum to 5. We map this situation to one of having 5 stars and 2 bars and then choosing where the bars go out of the seven positions.
5. What is 624 (mod 35)?
Answer: 1. This is from the fact that a(p−1)(q−1) ≡ 1 (mod pq) when gcd(a, pq) = 1. Here a = 6 and pq = (7)(5) = 35 and (p−1)(q−1) = 24.
Alternative Answer: Notice that 62 = 1 (mod 35), which tells us 624 = 1 (mod 35).
6. If one has three distinct degree at most d polynomials, P(x), Q(x), R(x), what is the maximum number of intersections across all pairs of polynomials?
Recall that we define intersections to be two polynomials having the same value at a point. (That is if P(1) = Q(1), and P(2) = R(2) and R(3) = Q(3), that is three intersections. If they all meet at a point P(1) = Q(1) = R(1), that is three intersections.)
Answer: 3d. Consider if there were 3d + 1. Any intersection point can be assigned to a pair of poly- nomials that intersect at that point. The number of points assigned to at least one pair of polynomials must be d + 1 which means that pair of polynomials is the same.
Have all three polynomials intersect on the same d points and differ on a d + 1 fixed point yields 3d intersections.
7. Working modulo a prime p > d, given a degree exactly d polynomial P(x), how many polynomials
Q(x) of degree at most d are there such that P(x) and Q(x) intersect at exactly d points?
Answer: p(p−1). d
For any polynomial Q(x) that intersects at d points, we have R(x) = P(x) − Q(x) = 0 at these d points. This polynomial can be factored into the form R(x) = a0(x−r1)···(x−rd). We need to choose a0 and r1,…,rd to specify Q(x). The number of ways to choose a0 is p−1 and the number of ways to choose r1,…,rd without repetition as the intersection points are distinct.
8. Recall that the vertices in a d-dimensional hypercube correspond to 0 − 1 strings of length d. We call the number of 1’s in this representation the weight of a vertex.
(a) How many vertices in a d-dimensional hypercube have weight k? Answer: d. Need to choose where the 1’s are.
(b) Howmanyedgesarebetweenverticeswithweightatmostkandverticeswithweightgreaterthan k?
Answer: d(d − k). Each vertex of weight k has d − k neighbors of weight k + 1. k
9. Foraprimep,howmanyelementsof{0,…,pk−1}arerelativelyprimetop?
Answer: pk−1(p−1). One can eliminate all the numbers that are divisible by p, which gives pk − pk−1 and factor out p.
Or one can also recall Euler’s totient formula which we used on the homework.
3. Some proofs. (3 parts. 5/5/8 points.)
1. Recall for x,y, with gcd(x,y) = d, that there are a,b ∈ Z where ax+by = d. Prove that gcd(a,b) = 1. Answer: Noticex=dtandy=dsforintegerssandt,plugginginwegetasd+btd=d.
Dividing, we get as+bt = 1. Now, this suggests gcd(a,b) = 1 by Extended Euclid.
2. You have n coins. The probability of the ith coin being heads is 1/(i + 1) (i.e., the biases of the coins are 1, 1,…, 1 ). You flip all the coins. What is the probability that you see an even number of heads?
Prove it. (Hint: the answer is quite simple.)
Answer: 21 . For one coin we are asking for the probability that there are zero heads, or 1/2. Let Ai be the event that the number of heads is even in the first i coins, and Hi be the event that the ith coin is heads. We have
P(Ai+1) = P(Ai)P(Hi) + P(Ai)P(Hi) = (1/2)(1/i) + (1/2)(1 − 1/i) = 1/2.
Alternative Solution: Note that, because of the 1st coin with bias 1/2, every odd-headed arrangement has another even-headed arrangement of equal probability, corresponding to flipping the first coin.
3. Consider a game with two players alternating turns. The game begins with N > 0 flags. On each turn, each player can remove 1,2,3, or 4 flags. A player wins if they remove the last flag (even if they removed several in that turn).
Show that if both players play optimally, player 2 wins if N is a multiple of 5, and player 1 wins otherwise. (Note player 1 goes first.)
Answer: The base case is that Player 1 wins if N ∈ {1, 2, 3, 4}. We assume the statement by induction, and note that if N is not a multiple of 5, that Player 1 can make it a multiple of 5 by removing one of {1,2,3,4} flags. Then player 2 must remove some number of flags in {1,2,3,4} and the resulting number of flags that player 1 faces is not a multiple of 5 and it is smaller. Thus, by induction player 1 has a winning strategy.
4. Probability:True/False. (7 parts, 3 points each.)
1. For a random variable X , the event “X = 3” is independent of the event “X = 4”. Answer: False. They are mutually exclusive not independent.
2. Let X,Y be Normal with mean μ and variance σ2, independent of each other. Let Z = 2X +3Y. Then, LLSE[Z | X] = MMSE[Z | X].
Answer: True.
Notice,LLSE[Z|X]=5μ+cov(X,2X+3Y)(X−μ)=2X+3μ asX andY areindependent.
MMSE[Z|X] = E[2X +3Y | X] = 2X +E[3Y | X] = 2X +3μ as X and Y are independent.
3. Any irreducible Markov chain where one state has a self loop is aperiodic.
Answer: True. Since one can reach this state from any state and vice versa, there is a cycle of every length, which means that the period which is the gcd of all cycle lengths is at most 1.
4. Given a Markov Chain, let the random variables X1,X2,X3,…, where Xt = the state visited at time t in the Markov Chain. Then E[Xt |Xt−1 = x] = E[Xt |Xt−1 = x ∩ Xt−2 = x′].
Answer: True. The value of Xt conditioned on Xt−1 is independent of all previous times.
5. Given an expected value μ, a variance σ2 ≥ 0, and a probability p ∈ (0,1), it is always possible to choose a and b such that a discrete random variable X which is a with probability p and b with probability 1 − p will have the specified expected value and variance.
Answer: True. Can think of it as a system of two equations for variance and expected value, with two unknowns a and b.
6. Consider two random variables, X and Y , with joint density function f (x, y) = 4xy when x, y ∈ [0, 1]
and 0 elsewhere. X and Y are independent.
Answer: True. f(x)=1 f(x,y)dy=2x. f(x|Y =y)= f(x,y)/1 f(x,y)dx=4xy/(4y(1/2))=2x 00
7. Suppose every state in a Markov chain has exactly one outgoing transition. There is one state, s, whose outgoing transition is a self-loop. All other states’ outgoing transitions are not self-loops. If a unique stationary distribution exists, it must have probability 1 on s and 0 everywhere else.
Answer: True. You can always leave every other state, but s is the hotel california; you may check out, but you can never leave.
5. Probability: Short Answer. (17 parts, 4 points each.)
1. Consider X ∼ G(p), a geometric random variable X with parameter p. What is Pr[X > i|X > j] for i ≥ j?
Answer: (1 − p)i− j . This is the event that the first i − j trials fail after the jth trial.
2. Suppose we have a random variable, X, with pdf
cx2,if 0 ≤ x ≤ 1
0, otherwise
What is c?
Answer: 3. 1 cx2 = cx3/3|1 = c/3 = 1, which implies that c = 3.
3. Given a binomial random variable X with parameters n and p, (X ∼ B(n, p)) what is Pr[X = E[X]]?
(You should assume pn is an integer.)
Answer: n ppn(1−p)(1−p)n pn
4. Pr[A|B] = 1/2, and Pr[B] = 1/2, and A and B are independent events. What is Pr[A]?
Answer: 1/2. Pr[A|B] = Pr[A] for independent events. For A and B to be independent, Pr[A] × Pr[B] = Pr[A|B]Pr[B].
5. Aaron is teaching section and has 6 problems on the worksheet. The time it takes for him to finish covering each question are i.i.d. random variables that follow the exponential distribution with param- eter λ = 1/20. Additionally, for each question, Aaron may choose to skip it entirely with probability p = 1/3. What is the expected time of section?
Answer: 80 minutes. Let Xi be the random variable corresponding to the time for the ith problem where it is 0, if Aaron doesn’t do it. E[Xi] = E[Xi|skip]Pr[skip]+E[Xi|notskip]Pr[notskip] = 0∗1/3+ 20 ∗ 2/3 = 40/3. The total time is 6E[X1] = 80.
6. Let X be a uniformly distributed variable on the interval [3, 6]. What is Var(X)?
Answer: 9/12. The variance of X being uniform over [0, 1] is 1/12. The width of this interval is 3 so we get 9/12.
7. Label N teams as team 1 through team N. They play a tournament and get ranked from rank 1 to rank N (with no ties). All rankings are equally likely.
(a) What is the total number of rankings where team 1 is ranked higher than team 2?
Answer: N! . The number of orders with 1 before 2 is the same a the number where 1 is after 2; a 2
bijection is to switch the 1 and 2.
(b) What is the expected number of teams with a strictly lower rank number than their team number?
For example, if team 3 was rank 1, their rank number (1) is lower than their team number (3).
Simplify your answer (i.e. no summations).
Answer: N−1 . For team numbered i, let Xi be the indicator variable that i ends up at a lower rank. 2
E[Xi] = Pr[Xi = 1] = (i − 1)/N since each position is equally likely and there are i − 1 positions before i. Summing up over i, we obtain N1 (N − 1)(N )/2 = (N − 1)/2.
8. Let X be a random variable that is never smaller than −1 and has expectation 5. Give a non-trivial upper bound on the probability that X is at least 12.
Answer: 6/13. Let Y = X + 1, which is a non-negative random variable with expectation 6. By Markov’s inequality, Pr[X ≥ 12] = Pr[Y ≥ 13] ≤ 6 .
9. Let X be a random variable with mean E[X] = 5 with E[X2] = 29. Give a non-trivial upper bound on the probability that X is larger than 12.
Answer: 4/49. Chebyshev’s inequality suggests that Pr[|X − E[X|] > t] ≤ Var(X) = 4 . Moreover,
Var(X) = E[X2]−E[X]2 = 4.
10. Let T be the event that an individual gets a positive result on a medical test for a disease and D be the event that an individual has the disease. The test has the property that Pr[T |D] = .9 and Pr[T |D] = .01. Morever, Pr[D] = .01. Given a positive result, what the probability that the individual has a disease? (No need to simplify your answer, though it should be a complete expression with numbers.)
Answer: Pr[D|T]= Pr[T|D]×Pr[D] = .009 . Pr[T |D]×Pr[D]+Pr[T |Dc ]Pr[Dc ] .009+.0099
11. Let R be a continuous random variable corresponding to a reading on a medical test for an individual and D be the event that the individual has a disease. The probability of an individual having the disease is p. Further, let fR|D(r) (and fR|D(r)) be the conditional probability density for R conditioned on D (respectively conditioned on D). Given a reading of r, give an expression for the probability the individual has the disease in terms of fR|D(r), fR|D(r), and p.
Answer: fR|D(r)×p . fR|D(r)×p+ fR|D(r)×(1−p)
This can be seen as the limit of Pr[(r≤R≤r+dr)∩D] . Pr[r≤R≤r+dr]
The answer computes this with factors of dr in the numerator and denominator which then divide out.
12. For continuous random variables, X and Y where Y = g(X ) for some differentiable, bijective function g : R → R. What is fY (y) in terms of fX (·), g(·), g−1(·) and g′(·)? (Possibly useful to remember that
fY(y)dy=Pr[y≤Y ≤y+dy].)
Answer: fX(g−1(y))/g′(g−1(y)).
Recall that fY (y)dy = Pr[y ≤ Y ≤ y + dy]. We have fX (x)dx = Pr[x ≤ X ≤ x + dx]. Now, we have y = g(x), so dy = g′(x)dx which also states that dx = dy/g′(y).
For a fixed y and dy, x = g−1(y), and x+dx = g−1(y)+dy/g′(x).
Thus, we have fY (y)dy = Pr[g−1(y) ≤ X ≤ g−1(y) + dy/g′(x)] = fX (x)dy/g′(x). Dividing by dy yields the expression.
13. What is the stationary distribution, π, for the following three state Markov chain? (Hint: π(0) = 3/4)
1/4 1/4 03/41 2
where you are, the next state is 0 with probability 3/4, π(1) = (1/4)π(0), and finally π(2) is whatever
it needs to be to add up to 1.
14. Consider continuous random variables, X and Y , with joint density that is f (x, y) = 2 for x, y ∈ [0, 1] and where y < x. That is, the distribution is uniform over the shaded region in the figure below.
Answer: π(0) = 3/4,π(1) = (3/4)∗(1/4) = 3/16,π(2) = 1/16. π(0) is by noting that no matter
Say someone takes a sample of X or Y with equal probability, and then announces that the value is 2/3. What is the probability that the sample is from X?
Answer: 2/3.
We wish to compute Pr[ from X | see 2/3] which is just Pr[ from X and 2/3] .
Pr[2/3] Now Pr[ from X and 2/3] ∝ (1/2) × fX (2/3) = (2/3)(2) = 4/3
The Pr[from Y and 2/3] ∝ (1/2) × fY (2/3) = ((1/3)(2)) = 2/3
Thus, the ratio is 4/3 or 2/3. 2/3+4/3
15. Given a random variable X ∼ Expo(λ ), consider the integer valued random variable K = ⌈X ⌉.
(a) What is Pr[K = k]? Answer: (1 − e−λ )(e−λ )k−1 A proof is as follows:
(b) What standard distribution with associated parameter(s) does this correspond to?
Answer: It is geometric with parameter p = 1 − e−λ .
This can be seen as the process where the probability of success in an integer interval is the probability that the exponential variable is in an interval of length 1, which is 1 − e−λ .
6. Longer Probability Questions.
k −λt λe dt
Pr[K = k] =
= (−e−λ (k) + e−λ (k−1))
k−1 =e−λ(k−1)(1−e−λ)
1. [I iterated my expectations, and you can, too!] (4 parts. 5 points each.)
Consider two discrete random variables X and Y . For notational purposes, X has probability mass
function (or distribution), pX (x) = Pr[X = x], mean μX , and variance σX2 . Similiarly, random variable Y has PMF pY (y) = Pr[Y = y], mean μY and variance σY2.
For each of True/False parts in this problem, either prove the corresponding statement is True in general or use exactly one of the counterexamples provided below to show the statement is False.
(a) Potential Counterexample I (b) Potential Counterexample II
(a) Suppose E[Y|X] = c, where c is a fixed constant. This means that the conditional mean E[Y|X]
does not depend on X.
i. Showthatc=μY,themeanofY.
Answer: Apply the Law of Iterated Expectations:
E[Y] = E[E[Y|X]] = E[c] = c.
So, c = μY . To show this in more detail,
E[Y] = E[E[Y|X]] = ∑E[Y|X = x] pX(x) = c∑pX(x) = c.
ii. True or False?
The random variables X and Y are independent.
Answer: False. InPotentialCounterexampleI,E[Y|X =1]=E[Y|X =−1]=E[Y|X =0]=0. However, X and Y are not independent, as the pY |X (1|0) = 1/3 ̸= 1/5 = pY (1). Alternatively, note that pY,X (1, 1) = 0, whereas neither pY (1) nor pX (1) is zero. So, X and Y are dependent, even though the conditional mean E[Y|X] does not depend on X.
iii. True or False?
The random variables X and Y are uncorrelated, meaning that cov(X , Y ) = 0.
Answer: True.Thecovarianceis0,ifE[XY]=E[X]E[Y].Inthiscase,
E[XY] = ∑x∑yPr[X = x,Y = y] xy
= ∑Pr[X = x]x∑yPr[Y = y|X = x] xy
=∑Pr[X =x]xE[Y|X =x] x
= c∑Pr[X = x]· = cE[X] = E[Y]E[X]. x
An easier proof is that if the MMSE is a constant (and thus a linear function), the LLSE must be the same as the MMSE, and so is also a constant. From our formula for the LLSE, that implies that cov(X,Y) = 0.
Another Alternative: E[E[XY | X]] = E[XE[Y | X]] = E[cX] = cE[X] = E[Y]E[X].
(b) Suppose X and Y are uncorrelated, meaning that cov(X , Y ) = 0. True or False?
The conditional mean is E[Y|X] = c, where c is a fixed constant, meaning that E[Y|X] does not depend on X.
Answer: False.
We use Counterexample II to show this. Notice that μX = E(X) = 0. Therefore,
cov(X,Y)=E(XY)− μX μY
=0 =E(XY)=E(X3)
= ∑x3pX(x)=3 ∑x3 x=−1 x=−1
So far we’ve shown that X and Y are uncorrelated. However, note that
In particular,
E(Y|X)=E(X2|X)=X2.
+1 E(Y|X) = 0
if X = −1 if X = 0
if X = +1.
Clearly, E(Y|X) varies with X; it is not a constant.
2. [Estimations of a random variable with noise.] (6 parts. 2/4/2/2/4/8 points.)
Let random variable Y denote the blood pressure of a patient, and suppose we model it as a Gaussian random variable having mean μY and variance σY2.
Our blood pressure monitor (measuring device) is faulty. It yields a measurement
where the noise W is a zero mean Gaussian random variable (μW = 0) with variance σW2 . Assume that the noise W is uncorrelated with Y . Note, that the actual blood pressure Y is inaccessible to us, due to the additive noise W .
(a) Show that σX2 = σY2 + σW2 .
Answer: The variance of the sum of the two uncorrelated random variables is the sum of the
variances.
(b) ShowthatL(Y|X),theLinearLeast-SquareErrorEstimatefortheb
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com