CS6735 Programming Project

Conduct an experimental study on the following machine learning algorithms: (1) ID3; (2) AdaBoost on ID3; (3) Random Forest; (4) Naïve Bayes; (5) K-nearest neighbors (kNN).
Implement the five algorithms using Java or Python.
Evaluate your implementation on the datasets in data.zip (downloadable from the course website) using 10 repetitions of 5-fold cross-validation, and report the average accuracy and standard deviation. All datasets are from the UCI Machine Learning Repository. Detailed descriptions are available at the following link:

http://www.ics.uci.edu/~mlearn/MLRepository.html

For breast cancer data see:
http://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29

For car data see: http://archive.ics.uci.edu/ml/datasets/Car+Evaluation

For ecoli data see: http://archive.ics.uci.edu/ml/datasets/Ecoli

For letter recognition data see:
http://archive.ics.uci.edu/ml/datasets/Letter+Recognition

For mushroom data see: http://archive.ics.uci.edu/ml/datasets/Mushroom

Each data set has a target variable, the one your model predicts. The target variable for each data set is as follows:
Mushroom: first column (e, p)
Letter: first column (A, B, …)
Ecoli: last column (cp, im, …)
Car: last column (acc, unacc, …)
Breast-cancer: last column (2, 4)
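
As an illustration of the target positions listed above, the following is a minimal Python sketch that separates the label from the features. The dataset keys, the function names `load_rows` and `split_features_target`, and the comma delimiter are assumptions made for this example; the actual file names and formats in data.zip may differ and should be checked first.

```python
import csv

# Target column position per dataset, taken from the list above:
# 0 = first column, -1 = last column. The keys are hypothetical names.
TARGET_INDEX = {
    "mushroom": 0,
    "letter": 0,
    "ecoli": -1,
    "car": -1,
    "breast-cancer": -1,
}


def load_rows(fileobj, delimiter=","):
    """Read non-empty delimited rows from an open text file (format assumed)."""
    return [row for row in csv.reader(fileobj, delimiter=delimiter) if row]


def split_features_target(rows, dataset):
    """Separate each row into (features, label) using the target position."""
    idx = TARGET_INDEX[dataset]
    X, y = [], []
    for row in rows:
        row = list(row)         # copy so the caller's row is not modified
        y.append(row.pop(idx))  # remove the label; the rest are features
        X.append(row)
    return X, y
```

A call such as `X, y = split_features_target(load_rows(f), "car")` would then give the attribute values in `X` and the class labels in `y`, ready for any of the five learners.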

Compare and discuss your algorithms (implementations) based on your experimental results.
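
The evaluation protocol described above (10 repetitions of 5-fold cross-validation, reporting mean accuracy and standard deviation) could be sketched in Python roughly as follows. This is only one possible reading: the handout does not say whether the standard deviation is taken over all 50 fold accuracies or over the 10 per-repetition means, and this sketch assumes the former, using the sample standard deviation. The names `train_fn` and `predict_fn` are hypothetical hooks for whichever learner is being evaluated, and the data set is assumed to contain at least `k` examples.

```python
import random
import statistics


def k_fold_indices(n, k, rng):
    """Shuffle 0..n-1 and deal the indices into k nearly equal folds."""
    indices = list(range(n))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]


def repeated_cross_validation(X, y, train_fn, predict_fn,
                              repeats=10, k=5, seed=0):
    """Return (mean, std) of accuracy over repeats * k train/test splits."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(repeats):
        # A fresh shuffle for every repetition gives different folds.
        for test_idx in k_fold_indices(len(X), k, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(X)) if i not in held_out]
            model = train_fn([X[i] for i in train_idx],
                             [y[i] for i in train_idx])
            correct = sum(predict_fn(model, X[i]) == y[i] for i in test_idx)
            accuracies.append(correct / len(test_idx))
    return statistics.mean(accuracies), statistics.stdev(accuracies)
```

Because the learner is passed in as a pair of functions, the same harness can be reused for ID3, AdaBoost, Random Forest, Naïve Bayes, and kNN, which keeps the comparison between them on identical splits when the same `seed` is used.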

Submission:
Hand in a report of your experimental study via Desire2Learn, including:
Description of the learning algorithms you implement.
Description of the datasets you use (number of examples, number of attributes, number of classes, types of attributes, etc.).
Technical details of your implementation: pre-processing of data sets (discretization, etc.), parameter setting, etc.
Design of your programming implementation (data structures, overall program structure).
Report and analysis of your experimental results.
Submit your code via Desire2Learn.

Deadline:
Submit your report and source code via D2L no later than 11:59 pm on Thursday, April 15, 2021.
