"""
Execution.py is for evaluating your model on the datasets available to you. You can use
this program to test the accuracy of your perceptron by calling it in the following way:

import Execution
Execution.eval(o_train, p_train, o_val, p_val, o_test, p_test)

In the sample code, o_train is the observed training labels, p_train is the predicted
training labels, o_val is the observed validation labels, p_val is the predicted
validation labels, o_test is the observed test labels, and p_test is the predicted
test labels.
"""
import pandas as pd
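
# A minimal usage sketch with hypothetical label lists (the values below are made
# up purely for illustration, and labels are assumed to be in {-1, +1}):
#
#   import Execution
#   Execution.eval([1, -1, 1], [1, 1, 1],   # observed vs. predicted train labels
#                  [1, -1], [1, -1],        # observed vs. predicted validation labels
#                  [-1, 1], [-1, -1])       # observed vs. predicted test labels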
def split_dataset(all_data):
    """
    This function takes the whole dataset as input, and you will have to program how to
    split the dataset into training, validation, and test datasets. The requirements are:
    - The function must take only one parameter, which is all_data, a pandas DataFrame of
      the raw dataset.
    - It must return 3 outputs in the specified order: the train, validation, and test
      datasets.
    How you split the data is up to you; one possible approach is sketched below.
    """
    train_data = None
    validation_data = None
    test_data = None
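
    # A minimal sketch of one possible split (not a requirement): shuffle the rows,
    # then take a 70/15/15 train/validation/test split. The fractions and the
    # random_state below are assumptions for illustration, not part of the spec.
    # shuffled = all_data.sample(frac=1, random_state=0)
    # n_train = int(0.70 * len(shuffled))
    # n_val = int(0.85 * len(shuffled))
    # train_data = shuffled.iloc[:n_train]
    # validation_data = shuffled.iloc[n_train:n_val]
    # test_data = shuffled.iloc[n_val:]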
    return train_data, validation_data, test_data

def logistic_loss(orig_labels, pred_labels):
    """
    Please implement the logistic loss function. The input parameters, orig_labels and
    pred_labels, must not be changed. These parameters are lists of labels, and from
    these lists it is your job to calculate the total logistic loss over the
    predictions. One possible approach is sketched below.
    """
    # Update this loss variable to reflect the logistic loss.
    loss = 0
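
    # A minimal sketch, not the required solution: it assumes labels are in {-1, +1}
    # and that pred_labels holds real-valued scores (both are assumptions; adapt them
    # to your own label/score convention).
    # import math
    # loss = sum(math.log(1.0 + math.exp(-o * p))
    #            for o, p in zip(orig_labels, pred_labels))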
    # Don't remove this statement. This is the output format for printing the logistic loss results.
    print('***************\nLogistic Loss: ' + str(loss) + '\n***************')

def eval(o_train, p_train, o_val, p_val, o_test, p_test):
    """
    This function should not be changed at all.
    """
    print('\nTraining Accuracy Result!')
    accuracy(o_train, p_train)
    print('\nTraining Logistic Loss Result!')
    logistic_loss(o_train, p_train)
    print('\nValidation Accuracy Result!')
    accuracy(o_val, p_val)
    print('\nValidation Logistic Loss Result!')
    logistic_loss(o_val, p_val)
    print('\nTest Accuracy Result!')
    accuracy(o_test, p_test)
    print('\nTest Logistic Loss Result!')
    logistic_loss(o_test, p_test)

def accuracy(orig, pred):
    """
    This function should not be changed at all.
    """
    num = len(orig)
    if num != len(pred):
        print('Error!! The numbers of labels are not equal.')
        return
    match = 0
    for i in range(len(orig)):
        o_label = orig[i]
        p_label = pred[i]
        if o_label == p_label:
            match += 1
    print('***************\nAccuracy: ' + str(float(match) / num) + '\n***************')

if __name__ == '__main__':
    """
    The code below these comments must not be altered in any way. This code is used to
    evaluate the predicted labels of your perceptron against the ground-truth
    observations.
    """
    from Perceptron import Perceptron

    all_data = pd.read_csv('data.csv', index_col=0)
    train_data, validation_data, test_data = split_dataset(all_data)
    # Placeholder dataset: when we run your code, this will be an unseen test set
    # that your model is evaluated on.
    test_data_unseen = pd.read_csv('test_data.csv', index_col=0)

    p = Perceptron()
    p.train(train_data)
    predicted_train_labels = p.predict(train_data)
    predicted_val_labels = p.predict(validation_data)
    predicted_test_labels = p.predict(test_data)
    predicted_test_labels_unseen = p.predict(test_data_unseen)
    eval(train_data['Label'].tolist(), predicted_train_labels,
         validation_data['Label'].tolist(), predicted_val_labels,
         test_data['Label'].tolist(), predicted_test_labels)

    # Run the evaluation on the unseen test set.
    eval(train_data['Label'].tolist(), predicted_train_labels,
         validation_data['Label'].tolist(), predicted_val_labels,
         test_data_unseen['Label'].tolist(), predicted_test_labels_unseen)