2. (25) Thinking Through Nearest Neighbors.
The nearest neighbor technique that we discussed in class is a popular classification rule. The idea is
straightforward: suppose we have a training dataset (X_i, Y_i), 1 <= i <= n. Now, to allocate a class to a
new observation x, we look at the k closest points X_i to x; those are called the k nearest neighbors;
we assign x to the class that is most represented in this set of nearest neighbors. (In the case of
ties, flip a coin!) Selecting the 'right' number k of neighbors is the topic of this question.
(a) What would happen if the number of neighbors equals the number of cases in the training sample?
(b) What would happen if the number of neighbors is 1? Does this even make sense? Explain.
Suggest a method for selecting a reasonable number of neighbors.
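To see how the rule behaves at the two extremes asked about above, here is a minimal sketch of k-nearest-neighbor classification written from scratch in Python (the exam asks for R; this toy data set, its labels, and the query point are hypothetical and chosen only for illustration):

```python
from collections import Counter

def knn_predict(train_X, train_y, x, k):
    """Classify x by majority vote among its k nearest training points."""
    # Rank training indices by squared Euclidean distance to x
    ranked = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    votes = Counter(train_y[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-D training set: 3 points of class "A", 4 of class "B"
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (6, 6)]
train_y = ["A", "A", "A", "B", "B", "B", "B"]

# k = 1: the single nearest point decides
print(knn_predict(train_X, train_y, (0.5, 0.5), 1))  # prints "A"

# k = n: every prediction is the overall majority class
print(knn_predict(train_X, train_y, (0.5, 0.5), 7))  # prints "B"
```

Note how the k = n call ignores where the query point sits and simply returns the majority class of the whole training set, while k = 1 follows the local data exactly; this is the tension that a method such as cross-validation over k is meant to balance.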
3. (25) Logistic Regression vs Linear Discriminant Analysis.
(a) 74 fleas from three different species were sampled, and 6 measurements were recorded on each.
The data are available in the Exams folder on our class Blackboard site.
Using Logistic Regression, create a model for predicting the species of flea based on the 6
measurements. Then, using Linear Discriminant Analysis, develop a rule for predicting the species
based on the same 6 measurements. Set up your problem so that you are able to calculate the
misclassification error for both rules. Explain in detail the decisions you made and how you
implemented these methods, including your R code.
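Whichever classifiers you fit, the misclassification error is simply the fraction of held-out cases whose predicted label disagrees with the true label. A minimal sketch in Python (the species labels and predictions below are hypothetical placeholders, not output from the actual flea data):

```python
def misclassification_error(y_true, y_pred):
    """Fraction of cases whose predicted label differs from the true label."""
    assert len(y_true) == len(y_pred)
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical held-out labels and predictions from the two rules
y_test      = ["concinna", "heptapotamica", "heikertingeri", "concinna"]
logreg_pred = ["concinna", "heptapotamica", "concinna",      "concinna"]
lda_pred    = ["concinna", "heptapotamica", "heikertingeri", "heikertingeri"]

print(misclassification_error(y_test, logreg_pred))  # prints 0.25
print(misclassification_error(y_test, lda_pred))     # prints 0.25
```

Computing this on the same held-out set for both rules makes the two error rates directly comparable, which is the point of setting up the problem with a common train/test split.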