Big Idea Fundamentals Standard Approach: Measuring Misclassification Rate on a Hold-out Test Set Summary

Fundamentals of Machine Learning for Predictive Data Analytics
Chapter 8: Evaluation (Sections 8.1, 8.2, 8.3)
John D. Kelleher, Brian Mac Namee and Aoife D'Arcy

1 Big Idea
2 Fundamentals
3 Standard Approach: Measuring Misclassification Rate on a Hold-out Test Set
4 Summary

The most important part of the design of an evaluation experiment for a predictive model is ensuring that the data used to evaluate the model is not the same as the data used to train the model.

The purpose of evaluation is threefold:
1 to determine which model is the most suitable for a task
2 to estimate how the model will perform
3 to convince users that the model will meet their needs

Standard Approach: Measuring Misclassification Rate on a Hold-out Test Set

Figure: The process of building and evaluating a model using a hold-out test set.
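
The partitioning step of this process can be sketched as follows. This is a minimal illustration that assumes a simple random split; the function name `hold_out_split`, the 30% test fraction, and the fixed seed are hypothetical choices made for the example and do not come from the slides.

```python
import random


def hold_out_split(instances, test_fraction=0.3, seed=0):
    """Randomly partition instances into a training set and a hold-out test set.

    test_fraction and seed are illustrative defaults (assumptions), not
    values prescribed by the source material.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(instances)              # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)   # seeded shuffle for repeatability
    n_test = max(1, round(len(shuffled) * test_fraction))
    test_set = shuffled[:n_test]
    training_set = shuffled[n_test:]
    return training_set, test_set


# Example with 20 instance IDs: the model would be trained on `train`
# and evaluated only on `test`.
instances = list(range(1, 21))
train, test = hold_out_split(instances, test_fraction=0.3)
```

Because the two partitions share no instances, performance measured on the test set is not inflated by data the model has already seen during training.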

Table: A sample test set with model predictions.
ID  Target  Pred.  Outcome    ID  Target  Pred.  Outcome
1   spam    ham    FN         11  ham     ham    TN
2   spam    ham    FN         12  spam    ham    FN
3   ham     ham    TN         13  ham     ham    TN
4   spam    spam   TP         14  ham     ham    TN
5   ham     ham    TN         15  ham     ham    TN
6   spam    spam   TP         16  ham     ham    TN
7   ham     ham    TN         17  ham     spam   FP
8   spam    spam   TP         18  spam    spam   TP
9   spam    spam   TP         19  ham     ham    TN
10  spam    spam   TP         20  ham     spam   FP

$$\text{misclassification rate} = \frac{\text{number incorrect predictions}}{\text{total predictions}} \tag{1}$$

$$\text{misclassification rate} = \frac{(2 + 3)}{(6 + 9 + 2 + 3)} = 0.25$$
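
The same figure can be checked directly against the sample test set. The sketch below copies the 20 target and predicted levels from the sample test set table and counts disagreements; the variable names are illustrative.

```python
# Target and predicted levels for test instances 1..20, copied from the
# sample test set table.
targets = [
    "spam", "spam", "ham", "spam", "ham", "spam", "ham", "spam", "spam", "spam",
    "ham", "spam", "ham", "ham", "ham", "ham", "ham", "spam", "ham", "ham",
]
predictions = [
    "ham", "ham", "ham", "spam", "ham", "spam", "ham", "spam", "spam", "spam",
    "ham", "ham", "ham", "ham", "ham", "ham", "spam", "spam", "ham", "spam",
]

# A prediction is incorrect whenever it differs from the target level.
incorrect = sum(t != p for t, p in zip(targets, predictions))
rate = incorrect / len(targets)

print(f"{incorrect} incorrect out of {len(targets)}: rate = {rate}")
```

The five incorrect predictions are instances 1, 2, 12, 17 and 20.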

For binary prediction problems there are 4 possible outcomes:
1 True Positive (TP)
2 True Negative (TN)
3 False Positive (FP)
4 False Negative (FN)
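
As a rough sketch of how these four outcomes are assigned, the function below labels a single prediction. It assumes 'spam' is the positive level, as in the sample test set; the function name `outcome` and its `positive` parameter are hypothetical.

```python
def outcome(target, prediction, positive="spam"):
    """Label one binary prediction as TP, TN, FP or FN.

    `positive` names the level treated as the positive class; 'spam' is
    assumed here to match the sample test set.
    """
    if prediction == positive:
        # The model said "positive": correct if the target agrees.
        return "TP" if target == positive else "FP"
    # The model said "negative": wrong if the target was actually positive.
    return "FN" if target == positive else "TN"


print(outcome("spam", "ham"))   # a spam email predicted as ham
```

The first letter says whether the prediction was right (T) or wrong (F); the second letter says what the model predicted (P or N).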

Table: The structure of a confusion matrix.
                    Prediction
                    positive  negative
Target  positive    TP        FN
        negative    FP        TN

Table: A confusion matrix for the set of predictions shown in Table 1.
                  Prediction
                  'spam'  'ham'
Target  'spam'    6       3
        'ham'     2       9
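
One way to tally these four cells from raw predictions is sketched below, again assuming 'spam' is the positive level. The data is copied from the sample test set; the helper name `confusion_counts` is illustrative.

```python
from collections import Counter

# (target, prediction) pairs for test instances 1..20 from the sample test set.
pairs = [
    ("spam", "ham"), ("spam", "ham"), ("ham", "ham"), ("spam", "spam"),
    ("ham", "ham"), ("spam", "spam"), ("ham", "ham"), ("spam", "spam"),
    ("spam", "spam"), ("spam", "spam"), ("ham", "ham"), ("spam", "ham"),
    ("ham", "ham"), ("ham", "ham"), ("ham", "ham"), ("ham", "ham"),
    ("ham", "spam"), ("spam", "spam"), ("ham", "ham"), ("ham", "spam"),
]


def confusion_counts(pairs, positive="spam"):
    """Count TP, FN, FP and TN over (target, prediction) pairs."""
    counts = Counter({"TP": 0, "FN": 0, "FP": 0, "TN": 0})
    for target, prediction in pairs:
        if target == positive:
            counts["TP" if prediction == positive else "FN"] += 1
        else:
            counts["FP" if prediction == positive else "TN"] += 1
    return dict(counts)


matrix = confusion_counts(pairs)
# Rows are target levels, columns are predicted levels, as in the table.
print(matrix["TP"], matrix["FN"])
print(matrix["FP"], matrix["TN"])
```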

$$\text{misclassification rate} = \frac{(FP + FN)}{(TP + TN + FP + FN)} \tag{2}$$

$$\text{misclassification rate} = \frac{(2 + 3)}{(6 + 9 + 2 + 3)} = 0.25$$

$$\text{classification accuracy} = \frac{(TP + TN)}{(TP + TN + FP + FN)} \tag{3}$$

$$\text{classification accuracy} = \frac{(6 + 9)}{(6 + 9 + 2 + 3)} = 0.75$$
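
Both measures can be computed from the four confusion matrix counts. The sketch below uses the counts from the worked example (TP = 6, TN = 9, FP = 2, FN = 3); the function names are illustrative. It also shows that the two measures are complements of each other.

```python
def misclassification_rate(tp, tn, fp, fn):
    """Fraction of predictions that were wrong: (FP + FN) / total."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (fp + fn) / total


def classification_accuracy(tp, tn, fp, fn):
    """Fraction of predictions that were right: (TP + TN) / total."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


# Counts from the worked example.
error = misclassification_rate(tp=6, tn=9, fp=2, fn=3)
accuracy = classification_accuracy(tp=6, tn=9, fp=2, fn=3)

print(error, accuracy)
```

Since every prediction is either correct or incorrect, the two values always sum to 1, so reporting one determines the other.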

Summary

1 Big Idea
2 Fundamentals
3 Standard Approach: Measuring Misclassification Rate on a Hold-out Test Set
4 Summary