ANLY-601 Pattern Recognition
Homework 1
Due Tuesday, January 29, 2018
Use only your course notes — no internet or texts.
1. Moments of Gaussian Densities (10 points)
Consider the one-dimensional Gaussian pdf
$$ p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \, \exp\left( -\frac{(x-m)^2}{2\sigma^2} \right) . $$
Use the fact that
$$ \int_{-\infty}^{\infty} \exp(-\alpha u^2) \, du = \sqrt{\frac{\pi}{\alpha}} $$
and the identity
$$ \int u^2 \exp(-\alpha u^2) \, du = -\frac{d}{d\alpha} \int \exp(-\alpha u^2) \, du $$
to show that the even central moments of the Gaussian density are
$$ E\left[(x-m)^n\right] = \bigl(1 \cdot 3 \cdot 5 \cdots (n-1)\bigr) \, \sigma^n \quad \text{for } n \text{ even} . $$
Use symmetry arguments (hint: antisymmetric integrand over symmetric bounds) to show that the odd central moments are all zero:
$$ E\left[(x-m)^n\right] = 0 \quad \text{for } n \text{ odd} . $$
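If you want to sanity-check the claimed formulas numerically before deriving them, here is a minimal sketch using scipy quadrature; the values of m and sigma are arbitrary example choices, and the script only verifies the pattern, it is not a derivation.

    import numpy as np
    from scipy.integrate import quad

    # Arbitrary example parameters for the Gaussian.
    m, sigma = 1.5, 2.0

    def pdf(x):
        return np.exp(-(x - m)**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

    def central_moment(n):
        # Numerically integrate (x - m)^n p(x) over the whole real line.
        val, _ = quad(lambda x: (x - m)**n * pdf(x), -np.inf, np.inf)
        return val

    for n in range(1, 7):
        # Claimed value: 1*3*5*...*(n-1) * sigma^n for even n, 0 for odd n.
        claimed = np.prod(np.arange(1, n, 2)) * sigma**n if n % 2 == 0 else 0.0
        print(n, round(central_moment(n), 6), claimed)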
2. Conditional and Unconditional Variance (10 points)
In class we showed the relationship between conditional means and unconditional means.
Specifically for random variables x ∈ RN and y ∈ RM , the conditional mean of x is
$$ E[x|y] = \int x \, p(x|y) \, d^N x $$
and the unconditional mean is
$$ E[x] = \int x \, p(x) \, d^N x = \int x \left( \int p(x|y) \, p(y) \, d^M y \right) d^N x = \int \left( \int x \, p(x|y) \, d^N x \right) p(y) \, d^M y = E_y\bigl[\, E_x[x|y] \,\bigr] . $$
The relationship between the conditional variance and the unconditional variance is a bit
more interesting. For simplicity, take x ∈ R and y ∈ R (scalar random variables). The
conditional variance is
$$ \mathrm{var}(x|y) = \int \bigl(x - E[x|y]\bigr)^2 \, p(x|y) \, dx . \qquad (1) $$
(Note that, like the conditional mean, the conditional variance is a function of y.) Show that the unconditional variance is related to the conditional variance by
$$ \mathrm{var}(x) = \int \bigl(x - E[x]\bigr)^2 \, p(x) \, dx = E_y\bigl[\, \mathrm{var}_x(x|y) \,\bigr] + \mathrm{var}_y\bigl(\, E[x|y] \,\bigr) . \qquad (2) $$
Your derivation must show explicitly what var_y(E[x|y]) means in terms of integral averages over the relevant quantities.
(Hint: Rewrite
$$ \begin{aligned} \bigl(x - E[x]\bigr)^2 &= \bigl(x - E[x] + E[x|y] - E[x|y]\bigr)^2 = \bigl(x - E[x|y] + E[x|y] - E[x]\bigr)^2 \\ &= \bigl(x - E[x|y]\bigr)^2 + \bigl(E[x|y] - E[x]\bigr)^2 + 2\,\bigl(x - E[x|y]\bigr)\bigl(E[x|y] - E[x]\bigr) \, . \end{aligned} $$
)
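As a numerical sanity check of both identities (the iterated mean and equation (2)), here is a minimal Monte Carlo sketch; the joint model y ~ N(0,1), x|y ~ N(y, 0.5^2) is an arbitrary choice for illustration, not part of the problem.

    import numpy as np

    rng = np.random.default_rng(0)
    N = 1_000_000

    # Arbitrary example joint model: y ~ N(0,1), x|y ~ N(y, 0.5^2).
    y = rng.normal(0.0, 1.0, N)
    x = rng.normal(y, 0.5)

    # Iterated mean: E[x] should match E_y[E[x|y]] = E[y] for this model.
    print(x.mean(), y.mean())

    # Law of total variance: var(x) vs E_y[var(x|y)] + var_y(E[x|y]).
    # Here var(x|y) = 0.25 (constant in y) and E[x|y] = y.
    print(x.var(), 0.25 + y.var())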
3. Maximum Likelihood Estimation (5 points)
This problem has an interesting practical origin, which I'll explain after you hand in your solutions.
I have a bag filled with m plastic balls, numbered consecutively 1, 2, . . . ,m. I don’t tell you
what the value of m is; I want you to make a (statistically informed) guess.
So I give you one piece of data. I reach into the bag and pull out one of the balls at random
(i.e. with probability 1/m) and hand it to you. It has the value “19” printed on it.
Let's compute the maximum likelihood estimate of the total number of balls m. Mathematically, this is the value of m that maximizes p(x = 19|m). Start by building a likelihood function — since there is one ball with each number 1, 2, 3, ..., 19, ..., m, any number on a ball in the range 1 to m can be observed with equal probability:
$$ p(1|m) = p(2|m) = \cdots = p(m|m) = 1/m . $$
Note also that it's not possible to observe a number on a ball greater than (the unknown) m:
$$ p(n|m) = 0 \quad \text{for } n > m . $$
These two pieces of information fix the likelihood function p(x|m). Given this information,
what is the value of m that maximizes the likelihood of the data p(19|m)?
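If you'd like to inspect this likelihood numerically, here is a minimal sketch built directly from the two facts above; the candidate range 1 to 40 for m is an arbitrary choice for illustration.

    # Likelihood of the single observation "19" as a function of m.
    observed = 19

    def likelihood(m, n=observed):
        # p(n|m) = 1/m for 1 <= n <= m, and 0 for n > m.
        return 1.0 / m if n <= m else 0.0

    # Scan an arbitrary range of candidate values of m.
    for m in range(1, 41):
        print(m, likelihood(m))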