ANLY-601 Pattern Recognition
Homework 2
Due Thursday, Feb. 15, 2018
Use only your course notes, integral tables or Mathematica.
1. Linear Gaussian Systems. (12 points)
Let x be a Gaussian distributed scalar random variable with mean zero and variance $\sigma_x^2$. Let
y be related to x by
\[ y = A x + \mu + \epsilon \tag{1} \]
where A and µ are constants, and $\epsilon$ is Gaussian noise with mean zero and variance $\sigma_\epsilon^2$.
Assume x and $\epsilon$ are statistically independent. For example, x might be a hidden continuous
state variable, and y an observable linearly related to x with additive noise $\epsilon$. More concretely,
suppose x is the Celsius temperature of some object (drawn from a population of objects with
zero-mean Gaussian temperature distribution) and y is the reading of a noisy, Fahrenheit
thermometer (with A and µ the scale change and offset between the Celsius and Fahrenheit
systems). We are going to estimate x and its distribution by using Bayes’ theorem to calculate
p(x|y).
(a) From the definition of y in terms of x and $\epsilon$, show that
\[ E[y|x] = E[\,(A x + \mu + \epsilon) \,|\, x\,] = A x + \mu . \tag{2} \]
You need not write p(y|x) explicitly to evaluate this, but you can. If you don’t write the
density, use words to make a clear argument for the result.
(b) Also use the definition of y to show that
\[ \mathrm{var}(y|x) = E[\,(y - E[y|x])^2 \,|\, x\,] = E[\,(A x + \mu + \epsilon - E[y|x])^2 \,|\, x\,] = \sigma_\epsilon^2 . \tag{3} \]
Again, you need not write p(y|x) explicitly to show this, but you can.
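A simulation can serve as a sanity check on parts (a) and (b) without replacing the argument they ask for. The sketch below holds x fixed, draws many noisy readings from the model in equation (1), and compares the sample mean and variance with the right-hand sides of equations (2) and (3). The numerical values (A = 1.8, µ = 32, a noise standard deviation of 0.5, and x = 10) and the helper name `sample_y_given_x` are illustrative assumptions, not part of the assignment.

```python
import random
import statistics


def sample_y_given_x(x, A, mu, sigma_eps, n, seed=0):
    """Draw n readings y = A*x + mu + eps for one fixed value of x."""
    rng = random.Random(seed)
    return [A * x + mu + rng.gauss(0.0, sigma_eps) for _ in range(n)]


# Illustrative values (assumptions): Celsius-to-Fahrenheit scale and offset.
A, mu, sigma_eps = 1.8, 32.0, 0.5
x_fixed = 10.0

ys = sample_y_given_x(x_fixed, A, mu, sigma_eps, n=100_000)
cond_mean = statistics.fmean(ys)      # sample estimate of E[y|x]
cond_var = statistics.pvariance(ys)   # sample estimate of var(y|x)

print(f"sample E[y|x]   = {cond_mean:.3f}, Eq. (2) gives {A * x_fixed + mu:.3f}")
print(f"sample var(y|x) = {cond_var:.4f}, Eq. (3) gives {sigma_eps ** 2:.4f}")
```

Agreement to within sampling error is consistent with the claimed results but does not prove them.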
(c) Show that
\[ \mu_y = E[\,y\,] = \mu \tag{4} \]
\[ \mathrm{var}(y) = E[\,(y - \mu_y)^2\,] = A^2 \sigma_x^2 + \sigma_\epsilon^2 \tag{5} \]
using the fact that $\epsilon$ and x are statistically independent (and therefore uncorrelated).
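The marginal moments in equations (4) and (5) can be checked the same way, this time drawing x and the noise independently on every trial. This is a minimal sketch under assumed parameter values (A = 1.8, µ = 32, standard deviations 5.0 for x and 0.5 for the noise); the function name `sample_y` is hypothetical.

```python
import random
import statistics


def sample_y(A, mu, sigma_x, sigma_eps, n, seed=0):
    """Draw n values of y = A*x + mu + eps with x and eps independent Gaussians."""
    rng = random.Random(seed)
    ys = []
    for _ in range(n):
        x = rng.gauss(0.0, sigma_x)        # hidden state, zero mean
        eps = rng.gauss(0.0, sigma_eps)    # measurement noise, zero mean
        ys.append(A * x + mu + eps)
    return ys


# Illustrative values (assumptions).
A, mu, sigma_x, sigma_eps = 1.8, 32.0, 5.0, 0.5

ys = sample_y(A, mu, sigma_x, sigma_eps, n=200_000)
mean_y = statistics.fmean(ys)
var_y = statistics.pvariance(ys)
predicted_var = A ** 2 * sigma_x ** 2 + sigma_eps ** 2   # right-hand side of Eq. (5)

print(f"sample mean = {mean_y:.3f}, Eq. (4) gives {mu:.3f}")
print(f"sample var  = {var_y:.3f}, Eq. (5) gives {predicted_var:.3f}")
```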
(d) Finally, use the fact that linear functions of Gaussian variables are Gaussian, and
equations (2)-(5) to write p(y|x), p(y) and then, via Bayes’ theorem, the posterior distribution
\[ p(x|y) = \frac{p(y|x)\, p(x)}{p(y)} . \]
Your answer should explicitly express the conditional mean E[x|y] and the conditional
variance var(x|y) in terms of y, A, µ, $\sigma_x^2$, and $\sigma_\epsilon^2$.
Hint: You should complete the square to write p(x|y) and extract the conditional mean
and variance.
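One way to cross-check the completed-square result is to apply Bayes’ theorem numerically: evaluate p(y|x) p(x) on a grid of x values, normalize, and read off the mean and variance. The sketch below does this with a plain Riemann sum. The parameter values, the observed reading y = 50, the grid settings, and the function names are all assumptions chosen for illustration. It deliberately does not use the closed form, so it can be compared against whatever expression the derivation produces.

```python
import math


def gaussian_pdf(z, mean, var):
    """Density of a Gaussian with the given mean and variance, evaluated at z."""
    return math.exp(-0.5 * (z - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def grid_posterior_moments(y, A, mu, sigma_x, sigma_eps, half_width=8.0, n=20001):
    """Approximate E[x|y] and var(x|y) by normalizing p(y|x) p(x) on a grid.

    The grid spans +/- half_width prior standard deviations around zero.
    """
    lo = -half_width * sigma_x
    dx = 2.0 * half_width * sigma_x / (n - 1)
    xs = [lo + i * dx for i in range(n)]
    # Unnormalized posterior: likelihood times prior at each grid point.
    weights = [
        gaussian_pdf(y, A * x + mu, sigma_eps ** 2) * gaussian_pdf(x, 0.0, sigma_x ** 2)
        for x in xs
    ]
    total = sum(weights)  # proportional to p(y); the grid spacing cancels in the ratios
    mean = sum(w * x for w, x in zip(weights, xs)) / total
    var = sum(w * (x - mean) ** 2 for w, x in zip(weights, xs)) / total
    return mean, var


# Illustrative values (assumptions).
A, mu, sigma_x, sigma_eps = 1.8, 32.0, 5.0, 0.5
y_obs = 50.0

post_mean, post_var = grid_posterior_moments(y_obs, A, mu, sigma_x, sigma_eps)
print(f"grid E[x|y]   = {post_mean:.4f}")
print(f"grid var(x|y) = {post_var:.5f}")
```

Repeating the check for a few values of y shows how the grid-based mean moves with the reading and whether the grid-based variance changes at all, both of which the analytic answer should reproduce.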
2. The 80/20 Rule: Heavy-Tailed Densities (12 points)
Read ‘Million Dollar Murray’ on the class Blackboard site under Documents → Other Notes and
Readings.
The Pareto principle (also called the 80/20 rule) states that for many phenomena, 80% of
the effects come from 20% of the causes. The notion that a few sources together carry a
huge proportion of the effects has attracted a lot of attention in recent years, although the
ratios are not often 80/20. For example, the wealthiest 1% of US households collectively
hold 40% of the country’s wealth (Christopher Ingraham, Washington Post web page, Dec.
6, 2017).
In both social and physical phenomena, heavy-tailed distributions — densities with tails that
drop off as a power law, rather than exponentially — occur widely and give rise to such skewed
proportions. Such distributions contrast strongly with those whose tails decay exponentially,
which are appropriate for phenomena with cases that cluster around the peak of the density
(e.g. Gaussian and exponential densities).
The Pareto Type I density is
\[ p(x) = \alpha\, x_m^{\alpha} \left( \frac{1}{x} \right)^{\alpha+1} \tag{6} \]
with support on $x_m \le x < \infty$, and parameter restricted to α > 1. Let’s use the Pareto Type
I to explore wealth allocation for heavy-tailed distributions.
Let x denote the net worth of individuals in a population modeled by a Pareto Type I density.
For convenience, set the minimum net worth to $x_m = 1$.
(a) (2 points) Sketch or plot a Pareto Type I density with $x_m = 1$ and α > 1. Show that the
proportion of the population with net worth greater than or equal to $x_0 > x_m = 1$ is
\[ f_0 = \int_{x_0}^{\infty} p(x)\, dx = \left( \frac{1}{x_0} \right)^{\alpha} . \tag{7} \]
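Equation (7) can be checked empirically by drawing samples from the density and counting how many fall at or above $x_0$. The sketch below uses inverse-CDF sampling: with $x_m = 1$, equation (7) implies that the cumulative distribution is $1 - x^{-\alpha}$, so a uniform draw u maps to $x = (1-u)^{-1/\alpha}$. The values α = 3 and $x_0 = 2$ and the function name `sample_pareto` are assumptions made for illustration.

```python
import random


def sample_pareto(alpha, n, seed=0):
    """Draw n Pareto Type I samples with minimum value 1 by inverting the CDF."""
    rng = random.Random(seed)
    # rng.random() lies in [0, 1), so 1 - u lies in (0, 1] and the power is finite.
    return [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]


# Illustrative values (assumptions).
alpha, x0 = 3.0, 2.0

xs = sample_pareto(alpha, 200_000)
tail_fraction = sum(x >= x0 for x in xs) / len(xs)   # empirical f_0
predicted = (1.0 / x0) ** alpha                      # right-hand side of Eq. (7)

print(f"empirical f_0 = {tail_fraction:.4f}, Eq. (7) gives {predicted:.4f}")
```

Python's standard library also provides `random.paretovariate(alpha)`, which could be used in place of the hand-written inversion.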
(b) (2 points) Show that the average wealth in the entire population is
\[ \mathrm{wealth} = E[x] = \int_{1}^{\infty} x\, p(x)\, dx = \frac{\alpha}{\alpha - 1} . \tag{8} \]
(c) (2 points) Show that the portion of the population-average wealth contributed by individuals
with net worth greater than or equal to $x_0$ is
\[ \mathrm{wealth}(x_0) = \int_{x_0}^{\infty} x\, p(x)\, dx = \frac{\alpha}{\alpha - 1} \left( \frac{1}{x_0} \right)^{\alpha} x_0 \tag{9} \]
(d) (2 points) Conclude that the fraction of the total wealth held by those with net worth
greater than $x_0$ is
\[ f_0^{\$} = \frac{\mathrm{wealth}(x_0)}{\mathrm{wealth}} = \left( \frac{1}{x_0} \right)^{\alpha - 1} . \tag{10} \]
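The results in equations (8) through (10) can be compared against a Monte Carlo estimate. The sketch below draws Pareto samples with minimum value 1, averages net worth over the whole population, averages the net worth contributed by those at or above $x_0$ (still dividing by the full population size, to match the integral in equation (9)), and takes the ratio. The choices α = 3 and $x_0 = 2$ are assumptions. A value of α above 2 is used so that the sample variance is finite, because for α close to 1 the estimates converge very slowly.

```python
import random


def sample_pareto(alpha, n, seed=1):
    """Draw n Pareto Type I samples with minimum value 1 by inverting the CDF."""
    rng = random.Random(seed)
    return [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]


# Illustrative values (assumptions).
alpha, x0 = 3.0, 2.0
xs = sample_pareto(alpha, 200_000)

mean_wealth = sum(xs) / len(xs)                        # compare with Eq. (8)
top_wealth = sum(x for x in xs if x >= x0) / len(xs)   # compare with Eq. (9)
top_share = top_wealth / mean_wealth                   # compare with Eq. (10)

print(f"mean wealth      = {mean_wealth:.4f}, Eq. (8) gives {alpha / (alpha - 1.0):.4f}")
print(f"wealth above x0  = {top_wealth:.4f}, "
      f"Eq. (9) gives {alpha / (alpha - 1.0) * (1.0 / x0) ** alpha * x0:.4f}")
print(f"share of wealth  = {top_share:.4f}, Eq. (10) gives {(1.0 / x0) ** (alpha - 1.0):.4f}")
```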
(e) (4 points) To write the proportion of wealth held by the rich in terms of their proportion
of the population, solve Equation (7) for $x_0(f_0)$ and substitute into Equation (10). Plot
the resulting $f_0^{\$}$ vs $f_0$ for values of α in the range $1.05 \le \alpha \le 100$. Make plots for
$0 \le f_0 \le 0.2$ as well as for $0 \le f_0 \le 1.0$. Interpret the results in words. What value of
α corresponds to Ingraham’s report that $f_0 = 0.01$ yields $f_0^{\$} = 40\%$?
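The two-step recipe in part (e) translates directly into code: invert equation (7) to get the threshold net worth for a given population fraction, then evaluate equation (10) at that threshold. The sketch below tabulates the result for a few values of α; the function names and the particular grid of α and $f_0$ values are assumptions, and the table can be handed to any plotting tool. The point $f_0 = 0$ is excluded from the grid because the threshold diverges there.

```python
def threshold_for_fraction(f0, alpha):
    """Invert Eq. (7): net worth x0 at or above which a fraction f0 of people lie."""
    return f0 ** (-1.0 / alpha)


def wealth_share(f0, alpha):
    """Evaluate Eq. (10) at the threshold that corresponds to population fraction f0."""
    x0 = threshold_for_fraction(f0, alpha)
    return (1.0 / x0) ** (alpha - 1.0)


# Illustrative grid (assumption); f0 = 0 is omitted because x0 diverges there.
alphas = [1.05, 1.5, 3.0, 100.0]
fractions = [0.01, 0.05, 0.2, 1.0]

print("alpha   " + "  ".join(f"f0={f0:<5}" for f0 in fractions))
for alpha in alphas:
    row = "  ".join(f"{wealth_share(f0, alpha):8.3f}" for f0 in fractions)
    print(f"{alpha:<7} {row}")
```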