STAT 513/413, ASSIGNMENT 5
1. Widely used in reliability and survival analysis is the Weibull distribution with the density

$$ f(y;\lambda,\kappa) = \begin{cases} \lambda^{-\kappa}\,\kappa\, y^{\kappa-1} e^{-(y/\lambda)^{\kappa}} & \text{for } y \ge 0, \\ 0 & \text{for } y < 0. \end{cases} $$

Suppose that the data y1, y2, . . . , yn are modeled as the outcomes of independent random variables Y1, Y2, . . . , Yn, each with the Weibull distribution as above.
(a) (Theoretical.) Derive the equations that the maximum likelihood estimates of the parameters λ and κ have to satisfy. Solve as much of those as you can in closed form.
(b) (Computational.) For the rest, design and implement a numerical procedure to finish the problem; in order to learn something by doing that, do it from first principles: hence, please, no packages or downloaded code, and no functions optim(), optimize(), nls(), or uniroot(). Test it on (at least) two examples: in each of these choose n (preferably not a very small one), choose λ and κ, generate the data (you may use the R function rweibull() for this), compute the estimates, assess the convergence, and compare to the values you started with. Report the results.

2A. (Only for STAT 413) Construct a test example for the simultaneous (that is, both in μ and σ) location-scale model with the Cauchy standard distribution: select μ and σ, and then generate 30 random numbers using rcauchy() (and store them).
(a) Write R functions implementing the negative log-likelihood and its gradient for your particular generated sample (you can then transfer those into the final function conveniently, given the R scoping rules). These are both functions of two variables (μ and σ); the first returns one number, the second a vector of two. Do persp() and contour() plots of your negative log-likelihood function, for judiciously selected ranges of μ and σ (that is, showing the interesting part of the function in an illuminating way).
Using the R function optim() now,
(b) Find the solutions for μ and σ using the Nelder-Mead method. Plot the result into the contour plot.
(c) Find the solutions for μ and σ using the BFGS method; you must supply the gradient here (if you forget, the method will run anyway, but will be less precise, and your solution will be deemed unsatisfactory). Plot the result into the contour plot; comment on all results and possibly other aspects.

2B. (For STAT 513; STAT 413 may choose to do this one instead of the previous one, 2A; STAT 513 have no choice.) Suppose that the data y1, y2, . . . , yn can be modeled as the outcomes of independent random variables Y1, Y2, . . . , Yn, which all have the same distribution: a mixture of two normal distributions with density

$$ f(y_i; p, \mu_1, \mu_2, \sigma_1, \sigma_2) = \frac{p}{\sigma_1\sqrt{2\pi}}\, e^{-\frac{(y_i-\mu_1)^2}{2\sigma_1^2}} + \frac{1-p}{\sigma_2\sqrt{2\pi}}\, e^{-\frac{(y_i-\mu_2)^2}{2\sigma_2^2}}. $$

The distribution, for the data that show the length of eruptions of the geyser Old Faithful in Yellowstone National Park (R dataset faithful), can be interpreted as switching between two possible regimes: with probabilities p and 1 − p, the length of the eruption follows respectively the normal distribution with μ1 and σ1, and that with μ2 and σ2. The plot of the kernel estimate of the probability density, invoked by the R command
    > plot(density(faithful$eruptions))
can be seen on the right.
We want to estimate all five unknown parameters involved: μ1, μ2, σ1, σ2, and p. To this end, we consider a hypothetical situation in which, apart from y1, y2, . . . , yn, we would also possess certain additional data z1, z2, . . . , zn, which would specify whether yi comes from the first (zi = 1) or the second regime (1 − zi = 1). These additional data would be supposed to be the outcomes of 0-1 random variables Z1, Z2, . . . , Zn.
(a) (Theoretical.) Assuming the hypothetical situation, write, for given yi and zi, the negative log-likelihood of the parameters μ1, μ2, σ1, σ2, and p, and derive the maximum likelihood estimates of these parameters.
(b) (Theoretical.) Replace yi and zi by the corresponding random variables now. For fixed μ1, μ2, σ1, σ2, and p, what is the probability of Zi = 1 given Yi = yi? Note that this is also the conditional expected value of Zi under the same condition, as Zi is an indicator variable. Use this to derive the conditional expected value of the negative log-likelihood you derived in (a).
(c) (Computational.) Piecing together (a) and (b), design and implement an EM algorithm to find the parameters of the mixture model on the basis of the data y1, y2, . . . , yn only (and not z1, z2, . . . , zn now, of course, as those pertain to the situation that was purely hypothetical). Apply this algorithm to the eruptions variable of the R dataset faithful and plot the resulting estimated density into the picture shown above.
As indicated in Lecture 22, page 410 of the Rizzo textbook contains a formula which looks almost like a solution of (b). While this formula is correct for what is done with it in the textbook, beware: it is NOT the solution of our problem – although its form is very similar.
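One possible shape for the iteration in 2B(c) is sketched below in base R. It is only a sketch under stated assumptions, not the required solution: the function name em_mix2, the starting values, the stopping rule (change in the observed-data log-likelihood), and the tolerance are all hypothetical choices that do not come from the assignment.

```r
# Sketch of an EM iteration for a two-component normal mixture.
# em_mix2 and all its arguments are hypothetical names chosen for illustration.
em_mix2 <- function(y, p, mu1, mu2, s1, s2, tol = 1e-8, maxit = 1000) {
  loglik_old <- -Inf
  for (it in seq_len(maxit)) {
    # E-step: conditional probability that observation i comes from regime 1,
    # given the current parameter values
    d1 <- p * dnorm(y, mu1, s1)
    d2 <- (1 - p) * dnorm(y, mu2, s2)
    w <- d1 / (d1 + d2)
    # M-step: complete-data estimates with the unknown z_i replaced by w_i
    p   <- mean(w)
    mu1 <- sum(w * y) / sum(w)
    mu2 <- sum((1 - w) * y) / sum(1 - w)
    s1  <- sqrt(sum(w * (y - mu1)^2) / sum(w))
    s2  <- sqrt(sum((1 - w) * (y - mu2)^2) / sum(1 - w))
    # Observed-data log-likelihood at the updated parameters, used for stopping
    loglik <- sum(log(p * dnorm(y, mu1, s1) + (1 - p) * dnorm(y, mu2, s2)))
    if (abs(loglik - loglik_old) < tol) break
    loglik_old <- loglik
  }
  list(p = p, mu1 = mu1, mu2 = mu2, s1 = s1, s2 = s2,
       loglik = loglik, iterations = it)
}

# Starting values are assumptions, read off the two bumps of the density plot
fit <- em_mix2(faithful$eruptions, p = 0.5, mu1 = 2, mu2 = 4.5, s1 = 0.5, s2 = 0.5)

# Overlay the fitted mixture density (dashed) on the kernel estimate
plot(density(faithful$eruptions))
curve(fit$p * dnorm(x, fit$mu1, fit$s1) + (1 - fit$p) * dnorm(x, fit$mu2, fit$s2),
      add = TRUE, lty = 2)
```

Since EM can converge to different local solutions, it may be worth rerunning the sketch from several starting values and comparing the returned loglik.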
4. The dataset cars records braking distances, dist, depending on the initial velocities, speed. (If you look more closely at the units, you will find that the dataset is pretty ancient.)
(a) Design and implement a function for computing the parameters of a quadratic relationship (y = a + bx + cx^2) between a quantile (for given p) of braking distance and initial speed, via the IRLS method.
(b) Draw the picture of the datapoints and plot the fitted curves into it.
Do not use any package for this problem. (You may, however, use it for checking the correctness of the results, but this is not to be reported in the solution.)
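For orientation, one way the IRLS idea in problem 4 could look in base R is sketched here. This is a hedged illustration, not the prescribed method: the function name irls_quantile, the least-squares starting point, the weights (p or 1 − p divided by the absolute residual), and the small constant eps guarding against division by zero are all assumptions made for this sketch.

```r
# Sketch of IRLS for a quadratic quantile fit y = a + b x + c x^2.
# irls_quantile, eps, tol and maxit are hypothetical names and settings.
irls_quantile <- function(x, y, p, eps = 1e-6, tol = 1e-8, maxit = 500) {
  X <- cbind(1, x, x^2)
  beta <- solve(crossprod(X), crossprod(X, y))   # ordinary least-squares start
  for (it in seq_len(maxit)) {
    r <- as.vector(y - X %*% beta)
    # Weights chosen so that w * r^2 equals the check loss
    # r * (p - 1(r < 0)) whenever |r| >= eps
    w <- ifelse(r >= 0, p, 1 - p) / pmax(abs(r), eps)
    # Weighted least-squares step: solve (X' W X) beta = X' W y
    beta_new <- solve(crossprod(X, w * X), crossprod(X, w * y))
    done <- max(abs(beta_new - beta)) < tol
    beta <- beta_new
    if (done) break
  }
  setNames(as.vector(beta), c("a", "b", "c"))
}

# Data points with fitted curves for a few illustrative values of p
plot(dist ~ speed, data = cars)
xs <- seq(min(cars$speed), max(cars$speed), length.out = 200)
for (p in c(0.1, 0.5, 0.9)) {
  b <- irls_quantile(cars$speed, cars$dist, p)
  lines(xs, as.vector(cbind(1, xs, xs^2) %*% b), lty = 2)
}
```

The iteration may stop at maxit without meeting tol, because IRLS for this kind of loss can converge slowly; the choice of eps trades numerical stability against closeness to the exact quantile fit.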
Due Monday, April 12, 2021
[Figure: kernel density estimate, titled "density.default(x = faithful$eruptions)"; x-axis label "N = 272 Bandwidth = 0.3348"; y-axis label "Density".]