1. Consider the following data set for a binary class problem:
Feature A   Feature B   Class Label
T           F           +
T           T           +
T           T           +
T           F           -
T           T           +
F           F           -
F           F           -
F           F           -
T           T           -
F           F           -
We wish to select the feature that best predicts the class label using the χ² method.
Write down the observed and expected contingency tables for feature A.
Calculate the χ²(A, Class) value.
Using the table below, conclude whether feature A is independent of the class label for p = 0.05.
Repeat the process for feature B and decide which feature could be best used for predicting the class label.
Observed table:
A          A=T    A=F    Total
Class=+    4      0      4
Class=-    2      4      6
Total      6      4      10
Expected table:
A          A=T    A=F    Total
Class=+    2.4    1.6    4
Class=-    3.6    2.4    6
Total      6      4      10
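The two tables for feature A can be checked with a short Python sketch. The tuple encoding of the data set and the helper name `contingency_tables` are illustrative assumptions, not part of the original solution. Each expected count is computed as (row total × column total) / N.

```python
# Data set from the question, encoded as (A, B, class) tuples.
# The encoding and the helper name are illustrative assumptions.
DATA = [
    ("T", "F", "+"), ("T", "T", "+"), ("T", "T", "+"), ("T", "F", "-"),
    ("T", "T", "+"), ("F", "F", "-"), ("F", "F", "-"), ("F", "F", "-"),
    ("T", "T", "-"), ("F", "F", "-"),
]


def contingency_tables(rows, feature_index):
    """Return (observed, expected) 2x2 tables for one feature.

    Table rows are Class=+ then Class=-; columns are feature=T then feature=F.
    """
    classes, values = ["+", "-"], ["T", "F"]
    observed = [[0, 0], [0, 0]]
    for row in rows:
        i = classes.index(row[2])
        j = values.index(row[feature_index])
        observed[i][j] += 1

    n = len(rows)
    row_totals = [sum(r) for r in observed]
    col_totals = [sum(c) for c in zip(*observed)]
    # Expected count under independence: row total * column total / N.
    expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]
    return observed, expected


observed_a, expected_a = contingency_tables(DATA, feature_index=0)
print(observed_a)  # [[4, 0], [2, 4]]
print(expected_a)  # [[2.4, 1.6], [3.6, 2.4]]
```

Calling `contingency_tables(DATA, feature_index=1)` gives the corresponding tables for feature B.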
\[
\chi^2(A, \text{Class}) = \frac{(4-2.4)^2}{2.4} + \frac{(0-1.6)^2}{1.6} + \frac{(2-3.6)^2}{3.6} + \frac{(4-2.4)^2}{2.4} = 4.44
\]
Degrees of freedom = (2 − 1) × (2 − 1) = 1
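Look up the critical value in the table for 1 degree of freedom at p = 0.05, which is 3.84. Since the calculated χ² value (4.44) is greater than this critical value, we reject the hypothesis of independence and conclude that A is not independent of Class at p = 0.05.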
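As a cross-check on the statistic and the table lookup, the sketch below recomputes χ²(A, Class) from the observed counts. It also derives a p-value, which the original solution does not use: this relies on the fact that a χ² variable with 1 degree of freedom is the square of a standard normal, so P(X > x) = erfc(√(x/2)). The function names are hypothetical, and the p-value helper is valid for 1 degree of freedom only.

```python
import math


def chi_square(observed):
    """Pearson chi-square statistic for a table given as a list of rows."""
    n = sum(sum(row) for row in observed)
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count
            stat += (o - e) ** 2 / e
    return stat


def p_value_1dof(stat):
    """Upper-tail probability for chi-square with 1 degree of freedom only."""
    return math.erfc(math.sqrt(stat / 2))


CRITICAL_005_1DOF = 3.84  # critical value quoted in the solution

stat_a = chi_square([[4, 0], [2, 4]])
print(round(stat_a, 2))              # 4.44
print(stat_a > CRITICAL_005_1DOF)    # True: reject independence
print(p_value_1dof(stat_a) < 0.05)   # True: same decision via the p-value
```

Comparing the statistic against 3.84 and comparing the p-value against 0.05 are two views of the same test, so they should always agree for a 2 × 2 table.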
For feature B:
Observed table:
B          B=T    B=F    Total
Class=+    3      1      4
Class=-    1      5      6
Total      4      6      10
Expected table:
B          B=T    B=F    Total
Class=+    1.6    2.4    4
Class=-    2.4    3.6    6
Total      4      6      10
\[
\chi^2(B, \text{Class}) = \frac{(3-1.6)^2}{1.6} + \frac{(1-2.4)^2}{2.4} + \frac{(1-2.4)^2}{2.4} + \frac{(5-3.6)^2}{3.6} = 3.40
\]
Degrees of freedom = (2 − 1) × (2 − 1) = 1
Look up the critical value in the table (3.84). Since the calculated χ² value (3.40) is less than the critical value, we cannot reject the hypothesis of independence, so we conclude that B is independent of Class at p = 0.05.
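Feature A best predicts the class label: it has the larger χ² value (4.44 versus 3.40) and is the only feature for which independence is rejected at p = 0.05.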
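To make the comparison concrete, both features can be scored in one pass. This is a sketch under an assumed (A, B, class) tuple encoding of the data set, with a hypothetical helper name. Instead of summing (O − E)²/E over the four cells, it uses the 2 × 2 shortcut χ² = N(ad − bc)² / ((a+b)(c+d)(a+c)(b+d)), which is algebraically equivalent for a 2 × 2 table.

```python
from collections import Counter

# Data set from the question, encoded as (A, B, class); encoding is assumed.
DATA = [
    ("T", "F", "+"), ("T", "T", "+"), ("T", "T", "+"), ("T", "F", "-"),
    ("T", "T", "+"), ("F", "F", "-"), ("F", "F", "-"), ("F", "F", "-"),
    ("T", "T", "-"), ("F", "F", "-"),
]
CRITICAL = 3.84  # 1 degree of freedom, p = 0.05


def chi_square_2x2(rows, feature_index):
    """Chi-square for a 2x2 table via the shortcut formula.

    Assumes every row and column total is non-zero.
    """
    counts = Counter((row[2], row[feature_index]) for row in rows)
    a, b = counts[("+", "T")], counts[("+", "F")]
    c, d = counts[("-", "T")], counts[("-", "F")]
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


scores = {name: chi_square_2x2(DATA, idx) for idx, name in enumerate(["A", "B"])}
best = max(scores, key=scores.get)

for name, score in scores.items():
    verdict = "not independent" if score > CRITICAL else "independent"
    print(f"{name}: chi2 = {score:.2f} ({verdict})")
print("best feature:", best)
```

Ranking by the χ² score alone and ranking by which features pass the significance test give the same answer here, since only feature A exceeds the critical value.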