EECS 70 Discrete Mathematics and Probability Theory Fall 2020
Random Variables: Variance and Covariance
Note 16
We have seen in the previous note that if we take a biased coin that shows heads with probability p and toss it n times, then the expected number of heads is np. What this means is that if we repeat the experiment multiple times, where in each experiment we toss the coin n times, then on average we get np heads. But in any single experiment, the number of heads observed can be any value between 0 and n. What can we say about how far off we are from the expected value? That is, what is the typical deviation of the number of heads from np?
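To make this concrete, here is a minimal Python sketch; the bias p = 0.6, the number of tosses n = 100, and the number of repeated experiments are arbitrary choices for illustration. The average head count across many experiments is close to np, while any single experiment can land noticeably above or below it.

```python
import random

# Illustrative parameters (arbitrary choices, not from the note).
p, n, trials = 0.6, 100, 10_000

# Number of heads in each experiment of n tosses of a coin with bias p.
counts = [sum(random.random() < p for _ in range(n)) for _ in range(trials)]

average = sum(counts) / trials
print(f"expected heads np = {n * p}")
print(f"average over {trials} experiments ≈ {average:.2f}")
print(f"single-experiment range observed: {min(counts)} to {max(counts)}")
```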
1 Random Walk
Let us consider a simpler setting that is equivalent to tossing a fair coin n times, but is more amenable to analysis. Suppose we have a particle that starts at position 0 and performs a random walk in one dimension. At each time step, the particle moves either one step to the right or one step to the left with equal probability (this kind of random walk is called symmetric), and the move at each time step is independent of all other moves. We think of these random moves as taking place according to whether a fair coin comes up heads or tails. The expected position of the particle after n moves is back at 0, but how far from 0 should we typically expect the particle to end up?
Denoting a right-move by $+1$ and a left-move by $-1$, we can describe the probability space here as the set of all sequences of length $n$ over the alphabet $\{\pm 1\}$, each having equal probability $\frac{1}{2^n}$. Let the r.v. $S_n$ denote the position of the particle (relative to our starting point 0) after $n$ moves. Thus, we can write
$$S_n = X_1 + X_2 + \cdots + X_n, \tag{1}$$
where $X_i = +1$ if the $i$-th move is to the right and $X_i = -1$ if the move is to the left.
The expectation of $S_n$ can be easily computed as follows. Since $E[X_i] = (\frac{1}{2} \times 1) + (\frac{1}{2} \times (-1)) = 0$, applying linearity of expectation immediately gives $E[S_n] = \sum_{i=1}^{n} E[X_i] = 0$. But of course this is not very informative, and is due to the fact that positive and negative deviations from 0 cancel out.
What we are really asking is: What is the expected value of $|S_n|$, the distance of the particle from 0? Rather than consider the r.v. $|S_n|$, which is a little difficult to work with due to the absolute value operator, we will instead look at the r.v. $S_n^2$. Notice that this also has the effect of making all deviations from 0 positive, so it should also give a good measure of the distance from 0. However, because it is the squared distance, we will need to take a square root at the end.
We will now show that the expected square distance after $n$ steps is equal to $n$:

Proposition 16.1. For the random variable $S_n$ defined in (1), we have $E[S_n^2] = n$.
Proof. We use the expression (1) and expand the square:
$$E[S_n^2] = E\big[(X_1 + X_2 + \cdots + X_n)^2\big] = E\Big[\sum_{i=1}^{n} X_i^2 + 2\sum_{i<j} X_i X_j\Big] = \sum_{i=1}^{n} E[X_i^2] + 2\sum_{i<j} E[X_i X_j]. \tag{2}$$
Since each $X_i$ takes values $\pm 1$, we have $X_i^2 = 1$ always, so $E[X_i^2] = 1$. For $i < j$, the moves $X_i$ and $X_j$ are independent, so $E[X_i X_j] = E[X_i]\,E[X_j] = 0$. Substituting these into (2) gives $E[S_n^2] = n$.
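As a sanity check on Proposition 16.1, the following Python sketch simulates many random walks and compares the empirical average of $S_n^2$ to $n$; the walk length and the number of trials are arbitrary choices for illustration. It also estimates $E[|S_n|]$, which comes out on the order of $\sqrt{n}$, in line with the square-root-at-the-end remark above.

```python
import random

# Illustrative parameters (arbitrary choices): walk length n, number of simulated walks.
n, trials = 100, 20_000

# Simulate many walks; each final position S_n is a sum of n independent +/-1 steps.
finals = [sum(random.choice((-1, 1)) for _ in range(n)) for _ in range(trials)]

mean_sq = sum(s * s for s in finals) / trials
mean_abs = sum(abs(s) for s in finals) / trials

print(f"empirical E[S_n^2] ≈ {mean_sq:.1f}  (Proposition 16.1: exactly {n})")
print(f"empirical E[|S_n|] ≈ {mean_abs:.2f} (on the order of sqrt(n) ≈ {n ** 0.5:.1f})")
```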
For random variables $X$ and $Y$ with $\sigma(X) > 0$ and $\sigma(Y) > 0$, the correlation of $X$ and $Y$ is defined as
$$\mathrm{Corr}(X,Y) = \frac{\mathrm{Cov}(X,Y)}{\sigma(X)\,\sigma(Y)}.$$
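As an illustration of this definition, the following Python sketch estimates $\mathrm{Cov}(X,Y)$, $\sigma(X)$, $\sigma(Y)$, and hence $\mathrm{Corr}(X,Y)$ from simulated samples; the particular construction Y = X + noise is an arbitrary choice for this sketch, not something from the note.

```python
import math
import random

# Draw samples of two dependent variables; Y = X + noise is an arbitrary choice.
random.seed(0)
xs = [random.gauss(0, 1) for _ in range(50_000)]
ys = [x + random.gauss(0, 1) for x in xs]

mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)    # sample Cov(X, Y)
sigma_x = math.sqrt(sum((x - mx) ** 2 for x in xs) / len(xs))       # sample sigma(X)
sigma_y = math.sqrt(sum((y - my) ** 2 for y in ys) / len(ys))       # sample sigma(Y)

corr = cov / (sigma_x * sigma_y)                                    # Corr(X, Y)
print(f"Cov(X,Y)  ≈ {cov:.3f}")
print(f"Corr(X,Y) ≈ {corr:.3f}")   # falls in [-1, +1], as Theorem 16.4 guarantees
```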
Correlation is more useful than covariance because the former always ranges between −1 and +1, as the following theorem shows:
Theorem 16.4. For any pair of random variables X and Y with σ(X) > 0 and σ(Y) > 0, −1 ≤ Corr(X,Y) ≤ +1.
Proof. Let $E[X] = \mu_X$ and $E[Y] = \mu_Y$, and define $\tilde{X} = (X - \mu_X)/\sigma(X)$ and $\tilde{Y} = (Y - \mu_Y)/\sigma(Y)$. Then, $E[\tilde{X}^2] = E[\tilde{Y}^2] = 1$, so
$$0 \le E\big[(\tilde{X} - \tilde{Y})^2\big] = E[\tilde{X}^2] + E[\tilde{Y}^2] - 2E[\tilde{X}\tilde{Y}] = 2 - 2E[\tilde{X}\tilde{Y}],$$
$$0 \le E\big[(\tilde{X} + \tilde{Y})^2\big] = E[\tilde{X}^2] + E[\tilde{Y}^2] + 2E[\tilde{X}\tilde{Y}] = 2 + 2E[\tilde{X}\tilde{Y}],$$
which implies $-1 \le E[\tilde{X}\tilde{Y}] \le +1$. Now, noting that $E[\tilde{X}] = E[\tilde{Y}] = 0$, we obtain $\mathrm{Corr}(X,Y) = \mathrm{Cov}(\tilde{X},\tilde{Y}) = E[\tilde{X}\tilde{Y}]$. Hence, $-1 \le \mathrm{Corr}(X,Y) \le +1$.
Note that the above proof shows that $\mathrm{Corr}(X,Y) = +1$ if and only if $E[(\tilde{X} - \tilde{Y})^2] = 0$, which implies $\tilde{X} = \tilde{Y}$ with probability 1. Similarly, $\mathrm{Corr}(X,Y) = -1$ if and only if $E[(\tilde{X} + \tilde{Y})^2] = 0$, which implies $\tilde{X} = -\tilde{Y}$ with probability 1. In terms of the original random variables $X, Y$, this means the following: if $\mathrm{Corr}(X,Y) = \pm 1$, then there exist constants $a$ and $b$ such that, with probability 1,
$$Y = aX + b,$$
where $a > 0$ if $\mathrm{Corr}(X,Y) = +1$ and $a < 0$ if $\mathrm{Corr}(X,Y) = -1$.
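The following Python sketch illustrates these two extreme cases: for exact linear relationships $Y = aX + b$, the sample correlation comes out as $+1$ when $a > 0$ and $-1$ when $a < 0$ (the specific values of $a$ and $b$ below are arbitrary choices).

```python
import math
import random

def corr(xs, ys):
    # Sample version of Corr(X,Y) = Cov(X,Y) / (sigma(X) * sigma(Y)).
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sx * sy)

random.seed(1)
xs = [random.gauss(0, 1) for _ in range(10_000)]

print(corr(xs, [3 * x + 5 for x in xs]))     # a = 3 > 0:  prints 1.0 (up to rounding)
print(corr(xs, [-2 * x + 1 for x in xs]))    # a = -2 < 0: prints -1.0 (up to rounding)
```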
5 Exercises
1. For any random variable $X$ and constant $c$, show that $\mathrm{Var}(cX) = c^2\,\mathrm{Var}(X)$.
2. For $X \sim \mathrm{Uniform}\{1, \ldots, n\}$, show that $E[X]$ and $\mathrm{Var}(X)$ are as shown in (4).