ANLY-601 Pattern Recognition

Homework 2
Due Thursday, Feb. 15, 2018

Use only your course notes, integral tables, or Mathematica.

1. Linear Gaussian Systems (12 points)

Let x be a Gaussian distributed scalar random variable with mean zero and variance $\sigma_x^2$. Let y be related to x by

$$y = Ax + \mu + \epsilon \tag{1}$$

where A and μ are constants, and ε is Gaussian noise with mean zero and variance $\sigma_\epsilon^2$. Assume x and ε are statistically independent. For example, x might be a hidden continuous state variable, and y an observable linearly related to x with additive noise ε. More concretely, suppose x is the Celsius temperature of some object (drawn from a population of objects with a zero-mean Gaussian temperature distribution) and y is the reading of a noisy Fahrenheit thermometer (with A and μ the scale change and offset between the Celsius and Fahrenheit systems). We are going to estimate x and its distribution by using Bayes’ theorem to calculate p(x|y).

  (a) From the definition of y in terms of x and ε, show that

    $$E[y \mid x] = E[(Ax + \mu + \epsilon) \mid x] = Ax + \mu. \tag{2}$$

    You need not write p(y|x) explicitly to evaluate this, but you can. If you don’t write the density, use words to make a clear argument for the result.

  (b) Also use the definition of y to show that

    $$\mathrm{var}(y \mid x) = E[(y - E[y \mid x])^2 \mid x] = E[(Ax + \mu + \epsilon - E[y \mid x])^2 \mid x] = \sigma_\epsilon^2. \tag{3}$$

    Again, you need not write p(y|x) explicitly to show this, but you can.

  (c) Show that

    $$\mu_y = E[y] = \mu \tag{4}$$

    $$\mathrm{var}(y) = E[(y - \mu_y)^2] = A^2 \sigma_x^2 + \sigma_\epsilon^2 \tag{5}$$

    using the fact that ε and x are statistically independent (and therefore uncorrelated).

  (d) Finally, use the fact that linear functions of Gaussian variables are Gaussian, together with equations (2)-(5), to write p(y|x) and p(y), and then, via Bayes’ theorem, the posterior distribution p(x|y):

    $$p(x \mid y) = \frac{p(y \mid x)\, p(x)}{p(y)}.$$

    Your answer should explicitly express the conditional mean E[x|y] and the conditional variance var(x|y) in terms of A, μ, $\sigma_x^2$, and $\sigma_\epsilon^2$.

    Hint: You should complete the square to write p(x|y) and extract the conditional mean and variance.
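
A derivation for Problem 1 can be sanity-checked by simulation. The sketch below is only an illustration, not part of the assignment: it draws samples from the model in equation (1), compares the sample mean and variance of y with equations (4) and (5), and estimates E[x|y] and var(x|y) empirically from samples whose y falls in a narrow window. The parameter values, the function names, and the window width are all assumptions chosen for the example.

```python
import random

def simulate(A, mu, sigma_x, sigma_eps, n=100_000, seed=0):
    """Draw n pairs (x, y) with y = A*x + mu + eps, x and eps independent Gaussians."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, sigma_x) for _ in range(n)]
    ys = [A * x + mu + rng.gauss(0.0, sigma_eps) for x in xs]
    return xs, ys

def mean_var(values):
    """Sample mean and (population) variance of a list of numbers."""
    m = sum(values) / len(values)
    return m, sum((v - m) ** 2 for v in values) / len(values)

def conditional_mean_var(xs, ys, y0, half_width=0.5):
    """Mean and variance of x over samples whose y lies within half_width of y0."""
    selected = [x for x, y in zip(xs, ys) if abs(y - y0) < half_width]
    return mean_var(selected)

# Hypothetical values: a Celsius-to-Fahrenheit scale and offset, with made-up spreads.
A, mu, sigma_x, sigma_eps = 1.8, 32.0, 10.0, 2.0

xs, ys = simulate(A, mu, sigma_x, sigma_eps)
print("sample mean, var of y:", mean_var(ys))
print("equations (4), (5):   ", (mu, A**2 * sigma_x**2 + sigma_eps**2))
print("empirical E[x|y], var(x|y) near y = 50:", conditional_mean_var(xs, ys, 50.0))
```

The last line gives numbers to compare against the closed-form conditional mean and variance obtained in part (d). The finite window width slightly inflates the empirical conditional variance, so agreement should be expected only approximately.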

2. The 80/20 Rule: Heavy-Tailed Densities (12 points)
Read ‘Million Dollar Murray’ on the class Blackboard site under Documents → Other Notes and Readings.

The Pareto principle (also called the 80/20 rule) states that for many phenomena, 80% of the effects come from 20% of the causes. The notion that a few sources together carry a huge proportion of the effects has attracted a lot of attention in recent years, although the ratios are not often 80/20. For example, the wealthiest 1% of US households collectively hold 40% of the country’s wealth (Christopher Ingraham, Washington Post web page, Dec. 6, 2017).

In both social and physical phenomena, heavy-tailed distributions (densities with tails that drop off as a power law, rather than exponentially) occur widely and give rise to such skewed proportions. Such distributions contrast strongly with those whose tails decay exponentially, which are appropriate for phenomena whose cases cluster around the peak of the density (e.g. Gaussian and exponential densities).

The Pareto Type I density is

$$p(x) = \alpha\, x_m^{\alpha} \left(\frac{1}{x}\right)^{\alpha+1} \tag{6}$$

with support on $x_m \le x < \infty$, and parameter restricted to α > 1. Let’s use the Pareto Type I to explore wealth allocation for heavy-tailed distributions.

Let x denote the net worth of individuals in a population modeled by a Pareto Type I density. For convenience, set the minimum net worth to $x_m = 1$.
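
To get a feel for this setup before working the parts below, one can draw a synthetic population and measure how concentrated its wealth is. The following is a minimal sketch under stated assumptions: it uses `random.paretovariate` from the Python standard library, which samples a Pareto distribution with minimum value 1, and the value of α, the sample size, and the helper names are arbitrary illustrative choices rather than anything specified in the assignment.

```python
import random

def sample_net_worth(alpha, n, seed=0):
    """Draw n net-worth samples from a Pareto Type I density with x_m = 1."""
    rng = random.Random(seed)
    return [rng.paretovariate(alpha) for _ in range(n)]

def top_share(samples, fraction):
    """Share of total wealth held by the richest `fraction` of the samples."""
    ranked = sorted(samples, reverse=True)
    k = max(1, int(round(fraction * len(ranked))))
    return sum(ranked[:k]) / sum(ranked)

# alpha = 3.0 is an arbitrary illustrative choice, not a value from the assignment.
worth = sample_net_worth(alpha=3.0, n=100_000)
print("minimum sampled net worth:", min(worth))
print("share of wealth held by the richest 20%:", top_share(worth, 0.20))
```

Rerunning with smaller values of α shows the share held by the richest group growing, although for α near 1 the sample estimate becomes very noisy because a handful of extreme draws dominate the total.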

  (a) (2 points) Sketch or plot a Pareto Type I density with $x_m = 1$ and α > 1. Show that the proportion of the population with net worth greater than or equal to $x_0 > x_m = 1$ is

    $$f_0 = \int_{x_0}^{\infty} p(x)\, dx = \left(\frac{1}{x_0}\right)^{\alpha}. \tag{7}$$

  (b) (2 points) Show that the average wealth in the entire population is

    $$\text{wealth} = E[x] = \int_{1}^{\infty} x\, p(x)\, dx = \frac{\alpha}{\alpha - 1}. \tag{8}$$

  (c) (2 points) Show that the average wealth among individuals with net worth greater than or equal to $x_0$ is

    $$\text{wealth}(x_0) = \int_{x_0}^{\infty} x\, p(x)\, dx = \frac{\alpha}{\alpha - 1} \left(\frac{1}{x_0}\right)^{\alpha} x_0. \tag{9}$$

  (d) (2 points) Conclude that the fraction of the total wealth held by those with net worth greater than $x_0$ is

    $$f_{\$0} = \frac{\text{wealth}(x_0)}{\text{wealth}} = \left(\frac{1}{x_0}\right)^{\alpha - 1}. \tag{10}$$

  (e) (4 points) To write the proportion of wealth held by the rich in terms of their proportion of the population, solve Equation (7) for $x_0(f_0)$ and substitute into Equation (10). Plot the resulting $f_{\$0}$ vs. $f_0$ for values of α in the range 1.05 ≤ α ≤ 100. Make plots for $0 \le f_0 \le 0.2$ as well as for $0 \le f_0 \le 1.0$. Interpret the results in words. What value of α corresponds to Ingraham’s report that $f_0 = 0.01$ yields $f_{\$0} = 40\%$?
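
The closed forms quoted in equations (7), (8), and (10) can be checked numerically before they are used in part (e). The sketch below is an illustration under stated assumptions, not a required method: it integrates the Pareto Type I density with $x_m = 1$ using a simple midpoint rule after the substitution x = exp(t). The helper names, the truncation point `t_max`, the step count, and the test values α = 1.5 and $x_0$ = 4 are all hypothetical choices.

```python
import math

def pareto_pdf(x, alpha):
    """Pareto Type I density with x_m = 1 (equation (6))."""
    return alpha * x ** (-(alpha + 1.0)) if x >= 1.0 else 0.0

def tail_integral(g, x0, t_max=60.0, steps=50_000):
    """Midpoint-rule estimate of the integral of g over [x0, infinity).

    Substitutes x = exp(t) and truncates the range at t = log(x0) + t_max,
    so the result is only an approximation.
    """
    t0 = math.log(x0)
    h = t_max / steps
    total = 0.0
    for i in range(steps):
        x = math.exp(t0 + (i + 0.5) * h)
        total += g(x) * x * h  # dx = x dt under the substitution
    return total

alpha, x0 = 1.5, 4.0  # arbitrary illustrative values, not taken from the assignment

f0 = tail_integral(lambda x: pareto_pdf(x, alpha), x0)
wealth = tail_integral(lambda x: x * pareto_pdf(x, alpha), 1.0)
wealth_x0 = tail_integral(lambda x: x * pareto_pdf(x, alpha), x0)

print("f0:", f0, "vs (1/x0)**alpha:", (1 / x0) ** alpha)
print("wealth:", wealth, "vs alpha/(alpha-1):", alpha / (alpha - 1))
print("wealth share:", wealth_x0 / wealth, "vs (1/x0)**(alpha-1):", (1 / x0) ** (alpha - 1))
```

For α close to 1 the integrand of the wealth integrals decays very slowly in t, so the truncation at `t_max` becomes the dominant error and `t_max` would need to be raised considerably.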
