DATA MINING AND MACHINE LEARNING (EBUS537)

Formative Assignment

Set by Prof Dongping SONG

Date of issue: 23rd October 2021

Date of submission: 19th November 2021 before 12 noon (online)

Contribution: 0%

Essay length: 1000 words (maximum)

Coursework:

Using Table 1 as the training dataset, apply the greedy strategy with the Gini impurity measure to build a fully grown decision tree. Where an attribute has more than two values, use a multiway split (do not use a binary split). Each leaf node should be assigned a single class label.
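As a rough illustration of the greedy step described above, the following Python sketch scores each candidate attribute by the weighted Gini impurity of its multiway split and picks the lowest. The function names (`gini`, `split_gini`, `best_attribute`) and the toy records are hypothetical and are not part of the assignment; the records are deliberately not taken from Table 1.

```python
from collections import Counter, defaultdict

def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_i^2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def split_gini(rows, attr, target):
    """Weighted Gini impurity of a multiway split on `attr` (one branch per value)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attr]].append(row[target])
    n = len(rows)
    return sum(len(group) / n * gini(group) for group in groups.values())

def best_attribute(rows, attrs, target):
    """Greedy choice: the attribute whose split has the lowest weighted Gini."""
    return min(attrs, key=lambda a: split_gini(rows, a, target))

# Hypothetical toy records (not Table 1), used only to exercise the functions.
toy = [
    {"colour": "red", "shape": "round", "label": "yes"},
    {"colour": "red", "shape": "square", "label": "yes"},
    {"colour": "blue", "shape": "round", "label": "no"},
    {"colour": "blue", "shape": "square", "label": "no"},
]
print(best_attribute(toy, ["colour", "shape"], "label"))  # prints "colour"
```

Building the full tree would repeat this choice on each branch's subset of records until every node is pure or no attributes remain; that recursion is left out of the sketch.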

Please provide sample calculations and explanations to demonstrate how the greedy strategy and the Gini impurity measure are applied.
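One possible shape for such a sample calculation is sketched below for the Gender attribute at the root. The notation is an assumption (it is not fixed by the assignment): p(i | t) is the fraction of records of class i at node t, n_k is the number of records in branch k, and n is the number of records at the parent. The counts come from Table 1, where M has 6 C0 and 4 C1 records and F has 4 C0 and 6 C1 records.

```latex
\begin{align*}
\mathrm{Gini}(t) &= 1 - \sum_{i} p(i \mid t)^2 \\
\mathrm{Gini}_{\text{split}} &= \sum_{k} \frac{n_k}{n}\,\mathrm{Gini}(k) \\[4pt]
\mathrm{Gini}(\text{M}) &= 1 - (6/10)^2 - (4/10)^2 = 0.48 \\
\mathrm{Gini}(\text{F}) &= 1 - (4/10)^2 - (6/10)^2 = 0.48 \\
\mathrm{Gini}_{\text{split}}(\text{Gender}) &= \tfrac{10}{20}(0.48) + \tfrac{10}{20}(0.48) = 0.48
\end{align*}
```

The same pattern would be repeated for the other attributes, and the attribute with the lowest weighted value chosen for the split.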

Please perform the following post-pruning activities: (i) prune any sub-tree whose leaf nodes all have the same class label; (ii) prune leaf nodes that contain fewer than 2 instances, as appropriate.
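Pruning rule (i) can be sketched in Python as a bottom-up pass over the tree. The representation is an assumption made for illustration: a leaf is a plain class-label string, and an internal node is a dict with hypothetical keys `attr` and `children`. The example tree is invented and is not the answer to the coursework. Rule (ii) is not covered, since "as appropriate" leaves the replacement policy to judgment.

```python
def prune_same_label(node):
    """Rule (i): collapse any sub-tree whose leaves all share one class label."""
    if isinstance(node, str):  # a leaf is represented by its class label
        return node
    # Prune the children first, so pure sub-trees below have already collapsed.
    pruned = {value: prune_same_label(child)
              for value, child in node["children"].items()}
    children = list(pruned.values())
    all_leaves = all(isinstance(child, str) for child in children)
    if all_leaves and len(set(children)) == 1:
        return children[0]  # replace the whole sub-tree by a single leaf
    return {"attr": node["attr"], "children": pruned}

# Hypothetical tree: the sub-tree under "a2" has only C1 leaves.
tree = {
    "attr": "A",
    "children": {
        "a1": "C0",
        "a2": {"attr": "B", "children": {"b1": "C1", "b2": "C1"}},
    },
}
print(prune_same_label(tree))
# {'attr': 'A', 'children': {'a1': 'C0', 'a2': 'C1'}}
```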

Table 1. Data set

Gender  Car Type  Shirt Size   Class
M       Family    Small        C0
M       Sports    Medium       C0
M       Sports    Medium       C0
M       Sports    Large        C0
M       Sports    Extra Large  C0
M       Sports    Extra Large  C0
F       Sports    Small        C0
F       Sports    Small        C0
F       Sports    Medium       C0
F       Luxury    Large        C0
M       Family    Large        C1
M       Family    Extra Large  C1
M       Family    Medium       C1
M       Luxury    Extra Large  C1
F       Luxury    Small        C1
F       Luxury    Small        C1
F       Luxury    Medium       C1
F       Luxury    Medium       C1
F       Luxury    Medium       C1
F       Luxury    Large        C1
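For convenience, the 20 records of Table 1 could be encoded as tuples and the class counts per attribute value tallied, which is the raw input to any Gini calculation. This is only a sketch: the names `ROWS`, `ATTRS`, and `class_counts` are hypothetical, and the rows are transcribed from the table in the order given.

```python
from collections import Counter

# Rows of Table 1 as (Gender, Car Type, Shirt Size, Class).
ROWS = [
    ("M", "Family", "Small", "C0"),
    ("M", "Sports", "Medium", "C0"),
    ("M", "Sports", "Medium", "C0"),
    ("M", "Sports", "Large", "C0"),
    ("M", "Sports", "Extra Large", "C0"),
    ("M", "Sports", "Extra Large", "C0"),
    ("F", "Sports", "Small", "C0"),
    ("F", "Sports", "Small", "C0"),
    ("F", "Sports", "Medium", "C0"),
    ("F", "Luxury", "Large", "C0"),
    ("M", "Family", "Large", "C1"),
    ("M", "Family", "Extra Large", "C1"),
    ("M", "Family", "Medium", "C1"),
    ("M", "Luxury", "Extra Large", "C1"),
    ("F", "Luxury", "Small", "C1"),
    ("F", "Luxury", "Small", "C1"),
    ("F", "Luxury", "Medium", "C1"),
    ("F", "Luxury", "Medium", "C1"),
    ("F", "Luxury", "Medium", "C1"),
    ("F", "Luxury", "Large", "C1"),
]
ATTRS = ("Gender", "Car Type", "Shirt Size")

def class_counts(rows, attr_index):
    """Count (attribute value, class) pairs for the column at `attr_index`."""
    return Counter((row[attr_index], row[3]) for row in rows)

# Print the class distribution for every value of every attribute.
for index, name in enumerate(ATTRS):
    print(name, dict(class_counts(ROWS, index)))
```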