CSE 515T (Fall 2019) Assignment 2
Due Wednesday, 16 October 2019
1. Show the correspondence between the decision rule derived from Bayesian decision theory (minimizing the posterior expected loss) and the “Bayes rule” derived from the frequentist perspective (choosing a “prior” p(θ) and minimizing the Bayes risk, i.e., the risk averaged over that prior).
2. (Curse of dimensionality.) Consider a d-dimensional, zero-mean, spherical multivariate Gaussian distribution:
$p(x) = \mathcal{N}(x; 0, I_d).$
Equivalently, each entry of x is drawn iid from a univariate standard normal distribution.
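As a minimal sketch of this entry-by-entry view of sampling (written in Python with only the standard library, rather than matlab), the helper below draws each coordinate independently from a standard normal. The name `sample_spherical_gaussian` and the fixed seed are illustrative assumptions, not part of the assignment.

```python
import random


def sample_spherical_gaussian(d, n, seed=0):
    """Draw n vectors from N(0, I_d) by sampling each entry iid from N(0, 1)."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]


if __name__ == "__main__":
    # Three draws in d = 5 dimensions, rounded for display.
    for x in sample_spherical_gaussian(d=5, n=3):
        print([round(v, 3) for v in x])
```

Because the covariance is the identity, no matrix factorization is needed; a general covariance would require, for example, a Cholesky factor applied to these iid draws.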
In familiar small dimensions (d ≤ 3), “most” of the vectors drawn from a multivariate Gaussian distribution will lie near the mean. For example, the famous 68–95–99.7 rule for d = 1 indicates that large deviations from the mean are unusual. Here we will consider the behavior in larger dimensions.
• Draw 10 000 samples from p(x) for each dimension in d ∈ {1, 5, 10, 50, 100}, and compute the length of each vector drawn: $y_d = \sqrt{x^\top x} = \bigl(\sum_{i=1}^{d} x_i^2\bigr)^{1/2}$. Estimate the distribution of each $y_d$ using either a histogram or a kernel density estimate (in matlab, hist and ksdensity, respectively). Plot your estimates. (Please do not hand in your raw samples!) Summarize the behavior of this distribution as d increases.
• The true distribution of $y_d^2$ is a chi-square distribution with d degrees of freedom (the distribution of $y_d$ itself is the less-commonly seen chi distribution). Use this fact to compute the probability that $y_d < 5$ for each of the dimensions in the last part.
• For d = 1000, compute the 5th and 95th percentiles of $y_d$. Is the mean x = 0 a representative summary of the distribution in high dimensions? This behavior has been called “the curse of dimensionality.”
3. (Laplace approximation.) Find a Laplace approximation to the gamma distribution:
$p(\theta \mid \alpha, \beta) = \frac{1}{Z}\, \theta^{\alpha - 1} \exp(-\beta\theta).$
Plot the approximation against the true density for (α, β) = (3, 1).
The true value of the normalizing constant is
$Z = \frac{\Gamma(\alpha)}{\beta^{\alpha}}.$
If we fix β = 1, then Z = Γ(α), so we may use the Laplace approximation to estimate the Gamma function. Analyze the quality of this approximation as a function of α. Read the Wikipedia article about Stirling’s approximation. Do you see a connection?
4. (Gaussian process regression.) Consider the following data:
x = [−2.26, −1.31, −0.43, 0.32, 0.34, 0.54, 0.86, 1.83, 2.77, 3.58]⊤;
y = [1.03, 0.70, −0.68, −1.36, −1.74, −1.01, 0.24, 1.55, 1.68, 1.53]⊤.
Fix the observation noise variance at $\sigma^2 = 0.5^2$.
• Examining a scatter plot of the data, guess which values of (λ, l) in the squared exponential covariance (if any) might explain this data well.
• Perform Gaussian process regression for these data on the interval x∗ ∈ [−4, 4] using the squared exponential covariance for the same set of hyperparameters (λ, l) above. Plot the posterior mean and the pointwise 95% credible interval for each. Which predictions look the best?
• Visualize the log model evidence $\log p(y \mid x, \lambda, l, \sigma^2)$ as a function of (λ, l). You can choose to make, for example, a heatmap or a contour plot of this function.
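
The quantities in problem 4 can be sketched with a Cholesky factorization, as below. This is a minimal illustration in Python (standard library only), not a reference solution, and it rests on several assumptions the assignment does not state: a zero prior mean, the parameterization k(x, x′) = λ² exp(−(x − x′)² / (2l²)) for the squared exponential covariance, and a noise variance of 0.5² = 0.25. The names `sq_exp`, `cholesky`, `gp_fit`, and `gp_predict` are hypothetical, and the values λ = 1, l = 1 in the usage lines are arbitrary placeholders rather than suggested answers.

```python
import math


def sq_exp(a, b, lam, ell):
    """Assumed form: k(a, b) = lam^2 * exp(-(a - b)^2 / (2 * ell^2))."""
    return lam ** 2 * math.exp(-0.5 * ((a - b) / ell) ** 2)


def cholesky(A):
    """Lower-triangular L with L L^T = A, for symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def solve_lower(L, b):
    """Forward substitution: solve L z = b."""
    z = []
    for i, bi in enumerate(b):
        z.append((bi - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    return z


def solve_lower_transpose(L, b):
    """Back substitution: solve L^T z = b."""
    n = len(b)
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (b[i] - sum(L[k][i] * z[k] for k in range(i + 1, n))) / L[i][i]
    return z


def gp_fit(x, y, lam, ell, noise_var):
    """Return (L, alpha, log_evidence) for a zero-mean GP with Gaussian noise."""
    n = len(x)
    K = [[sq_exp(x[i], x[j], lam, ell) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = solve_lower_transpose(L, solve_lower(L, y))  # (K + noise_var I)^{-1} y
    log_evidence = (-0.5 * sum(yi * ai for yi, ai in zip(y, alpha))
                    - sum(math.log(L[i][i]) for i in range(n))  # -0.5 * log det
                    - 0.5 * n * math.log(2.0 * math.pi))
    return L, alpha, log_evidence


def gp_predict(x, L, alpha, lam, ell, x_star):
    """Posterior mean and variance of the latent function at one point x_star."""
    k_star = [sq_exp(xi, x_star, lam, ell) for xi in x]
    mean = sum(ki * ai for ki, ai in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = sq_exp(x_star, x_star, lam, ell) - sum(vi * vi for vi in v)
    return mean, max(var, 0.0)  # clip tiny negative values from round-off


X = [-2.26, -1.31, -0.43, 0.32, 0.34, 0.54, 0.86, 1.83, 2.77, 3.58]
Y = [1.03, 0.70, -0.68, -1.36, -1.74, -1.01, 0.24, 1.55, 1.68, 1.53]

# Placeholder hyperparameters (lam = 1, ell = 1); noise variance assumed 0.25.
L, alpha, log_ev = gp_fit(X, Y, lam=1.0, ell=1.0, noise_var=0.5 ** 2)
mean, var = gp_predict(X, L, alpha, 1.0, 1.0, x_star=0.0)
half_width = 1.96 * math.sqrt(var)  # pointwise 95% interval, latent function
print(log_ev, mean - half_width, mean + half_width)
```

Evaluating `gp_predict` over a grid on [−4, 4] gives the posterior mean and interval curves, and evaluating `gp_fit` over a grid of (λ, l) gives the surface for a heatmap or contour plot. The interval here is for the latent function; an interval for noisy observations would add the noise variance to `var`.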