Copy_of_CIS545_HW_5_Release

CIS 545 Homework 5: Deep Learning with MXNet
Due December 2nd, 10 PM EST
Welcome to CIS 545 Homework 5!

In this homework, we will learn more about the “new electricity” – Deep Learning (we didn’t coin this term, Andrew Ng did)! There are many cool frameworks for building deep learning models: PyTorch, TensorFlow, Theano, MXNet. Since you will be working with Big Data in this course, you need a framework that scales well. Almost all of these have multi-GPU support built in; MXNet provides the easiest abstractions to do this and works well with AWS as well as Colab. In this assignment, we will be building neural networks in MXNet to solve an interesting problem.

Deep learning or neural network architectures have been used to solve a multitude of problems in various different fields like vision, natural language processing, and radiology. So let’s take a “deep” dive into them!

Why deep learning?
It’s coooool
Everyone is talking about it these days. People like Siraj Raval can teach it in 5 mins (check out his YouTube channel for some comedy)
Deep learning unlocks the treasure trove of unstructured big data for those with the imagination to use it
Deep learning models have great representational power and are ‘universal approximators’

Deep Learning Applications:
Deep learning has significantly improved voice command systems (such as Siri and Alexa), as well as healthcare and image identification.

Deep learning has applications across numerous industries, which is why experts think that this technology is the future of almost everything. There are truly deep learning technologies such as Google’s very human-like talking AI, a new theory that cracks the ‘black box’ of deep learning, and various budding ideas like this one about why human forgetting might be the key to AI. Here are some cool applications of deep learning –

Here’s a neural network detecting anomalies in chest X-rays:

Most humans can’t tell that this is a case of pleural effusion (sounds like medical jargon to engineers like us), but this neural network model can detect it very well!

Mask R-CNNs in action, detecting objects on the road to aid a self-driving vehicle:

Pretty cool, right? We will be applying CNNs to solve a cool image classification problem.

Setup Skeleton

Penn Grader Setup
Make sure to initialize the grader with your 8 digit Penn ID.

In [ ]:

%%capture
!pip3 install penngrader

from penngrader.grader import *

VERY IMPORTANT : Enter your 8 digit Penn ID in the student id field below

PLEASE NOTE: There are some questions, for example making plots, that do not have test cases. All questions without an autograder attached will be manually graded.

In [ ]:

# PLEASE ENSURE YOUR PENN-ID IS ENTERED CORRECTLY. IF NOT, THE AUTOGRADER WON'T KNOW WHO
# TO ASSIGN POINTS TO IN OUR BACKEND
STUDENT_ID = 99999999  # YOUR PENN-ID GOES HERE AS AN INTEGER

In [ ]:

grader = PennGrader(homework_id = 'CIS545_Fall_2020_HW5', student_id = STUDENT_ID)

MXNet Installation
First, verify that you see a Tesla or other GPU listed here…

In [ ]:

!nvidia-smi

In [ ]:

%%capture
!pip3 install --upgrade mxnet-cu101 gluoncv

In [ ]:

import mxnet as mx
from gluoncv.utils import viz
from mxnet.gluon.data import DataLoader
from mxnet.gluon.data.vision import transforms
from mxnet import np
import shutil

In [ ]:

from mxnet import np, npx
from mxnet.gluon import nn
npx.set_np()

if npx.num_gpus() < 1:
    raise RuntimeError("No GPU is found, please restart your runtime!")

Section 1: Indoor Scene Recognition with MXNet
1.1 Lots of Data
The dataset we use comes from the indoor scene recognition problem. Indoor scene recognition is a challenging open problem in high-level vision. Most scene recognition models that work well for outdoor scenes perform poorly in the indoor domain. The main difficulty is that while some indoor scenes (e.g., corridors) can be well characterized by global spatial properties, others (e.g., bookstores) are better characterized by the objects they contain. More generally, to address the indoor scene recognition problem we need a model that can exploit local and global discriminative information.

The dataset contains 67 indoor categories. The number of images varies across categories, but there are at least 100 images per category. All images are in JPG format.

Download the dataset
Run the cell below to download the dataset from Google Drive.

In [ ]:

import os
from google.colab import drive

# Mount Google Drive
DRIVE_MOUNT = '/content/gdrive'
drive.mount(DRIVE_MOUNT)

# Create the folder to write data to
CIS545_FOLDER = os.path.join(DRIVE_MOUNT, 'My Drive', 'CIS545_2020')
HOMEWORK_FOLDER = os.path.join(CIS545_FOLDER, 'HW5')
os.makedirs(HOMEWORK_FOLDER, exist_ok=True)

Download the data into your Google Drive. You need to run this cell only once! We've included a check so it won't redundantly download the data.

In [ ]:

from google_drive_downloader import GoogleDriveDownloader as gdd

if not os.path.isfile("/content/gdrive/My Drive/CIS545_2020/HW5/data.zip"):
    gdd.download_file_from_google_drive(file_id='1A-dYo1ba1mjTrnH6xjzYrg_fO_GxHjvN',
                                        dest_path='/content/gdrive/My Drive/CIS545_2020/HW5/data.zip')

In [ ]:

!ls "/content/gdrive/My Drive/CIS545_2020/HW5"

In [ ]:

!unzip "/content/gdrive/My Drive/CIS545_2020/HW5/data.zip"

1.1.1 Filter out corrupt and nonexistent images (5 points)
There are a lot of images in the dataset that aren't valid JPEG images. We need to filter out the invalid images! Complete the check_corrupt function, which takes in a filename and returns a boolean indicating whether the file is corrupt (the loops in the cell skip every file for which it returns True).

Hint: the PIL library would be useful for this verification!

In [ ]:

base_path = '/content/indoorCVPR_09/Images'
train_path = os.path.join(base_path, 'train/')
test_path = os.path.join(base_path, 'test/')
os.makedirs(train_path, exist_ok=True)
os.makedirs(test_path, exist_ok=True)

train_file = open("/content/TrainImages.txt", "r")

from PIL import Image

def check_corrupt(filename):
    #TODO -- also fix the return result!
    return False

# Check all images in the train file for validity and move the valid ones to train_path
correct_file_count_train = 0
for file in train_file:
    image_path = os.path.join(base_path, file.rstrip('\n'))
    dest_folder = os.path.join(train_path, file.split('/')[0])
    os.makedirs(dest_folder, exist_ok=True)
    dest_path = os.path.join(train_path, file.rstrip('\n'))
    if os.path.getsize(image_path) == 0 or check_corrupt(image_path):
        continue
    correct_file_count_train += 1
    dest = shutil.move(image_path, dest_path)

# Repeat the same check for the test file, moving valid images to test_path
correct_file_count_test = 0
test_file = open("/content/TestImages.txt", "r")
for file in test_file:
    image_path = os.path.join(base_path, file.rstrip('\n'))
    dest_folder = os.path.join(test_path, file.split('/')[0])
    os.makedirs(dest_folder, exist_ok=True)
    dest_path = os.path.join(test_path, file.rstrip('\n'))
    if os.path.getsize(image_path) == 0 or check_corrupt(image_path):
        continue
    correct_file_count_test += 1
    dest = shutil.move(image_path, dest_path)

In [ ]:

grader.grade('check_file_cleaning', (correct_file_count_train, correct_file_count_test))

1.1.2 Build a dataset class (5 points)
Create train and test datasets for loading image files stored in a folder structure within train_path and test_path. Make sure that the image pixels are floats in the range [0,1] and not integers in the range [0,255] (Hint: transform parameter). Name your datasets train_dataset and test_dataset.

Read about how you can do it here.

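As a rough illustration of the two ideas in this step, the sketch below uses plain Python rather than the MXNet API. It assumes that a folder-based dataset assigns integer labels to class subfolders in sorted name order, and that scaling pixels to [0, 1] amounts to dividing by 255. The helper names labels_from_folders and scale_pixels and the three folder names are made up for the example.

```python
import os
import tempfile

def labels_from_folders(root):
    """Assign an integer label to each class subfolder, in sorted name order."""
    classes = sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )
    return {name: index for index, name in enumerate(classes)}

def scale_pixels(pixels):
    """Rescale integer pixel values in [0, 255] to floats in [0, 1]."""
    return [value / 255.0 for value in pixels]

# Build a throwaway folder tree with three hypothetical scene categories.
with tempfile.TemporaryDirectory() as root:
    for scene in ("corridor", "airport_inside", "bookstore"):
        os.makedirs(os.path.join(root, scene))
    label_map = labels_from_folders(root)

print(label_map)
print(scale_pixels([0, 51, 255]))
```

In the homework itself the dataset class and its transform argument do this work for you; the sketch only shows what the result should look like conceptually.
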
In [ ]:

# TODO: Create train_dataset and test_dataset

In [ ]:

sample_train_img, sample_train_label = train_dataset[5]
sample_test_img, sample_test_label = test_dataset[5]
sample_train_img = sample_train_img.asnumpy()
sample_train_label = sample_train_label
sample_test_img = sample_test_img.asnumpy()
sample_test_label = sample_test_label
grader.grade('check_datasets', [sample_train_img, sample_train_label, sample_test_img, sample_test_label])

1.1.3 Visualize images from the dataset
We have our training and testing datasets, but we humans don't really understand binary that well. So let's visualize our data by plotting some data points.

In [ ]:

# Visualize 10 images in the dataset with their labels
%matplotlib inline
import matplotlib.pyplot as plt

sample_idxs = [1, 3, 213, 224, 567, 779, 1052, 2000, 3000, 4444]

for sample_idx in sample_idxs:
    data, label = train_dataset[sample_idx]
    plt.imshow(data.asnumpy())
    plt.title(train_dataset.synsets[label])
    plt.show()

We can see that these images are all different sizes and some examples can be really hard to classify - like the airport inside class!

1.1.4 Class frequency distributions (5 points)
Create a frequency distribution of the classes in the training dataset. You should create a dictionary with the number of images belonging to each of the scene categories. The key for the dictionary should be the name of the scene and the value should be the frequency in the train dataset.

In [ ]:

def create_frequency_dict(train_dataset):
    scene_frequency_dict = {}
    #TODO: Create the scene frequency distribution
    return scene_frequency_dict

scene_frequency_dict = create_frequency_dict(train_dataset)
print(scene_frequency_dict)

In [ ]:

grader.grade('check_freq_dists', scene_frequency_dict)

Does the class distribution look uniform? If yes, we don't need to address class imbalance; if no, what should we do?

1.1.5 Create Dataloader objects (5 points)
Data loaders create data batches and perform transformations on the images.
Since the images are different sizes, we need to resize them to the same size. You should write a transformation to resize the image to 224 x 224. You will also need to add a transformation to convert the image to a tensor -- one of the building blocks of neural network operations. Tensors are like NumPy arrays with a gradient aspect.

TLDR, in this section you will need to:

Define a composition of transformations to first resize the image and then convert it to a tensor
Next, create train and test data loaders, applying the transformations to the train and test datasets respectively. You will also need to pass in the batch size and whether or not you want to shuffle the data. Set shuffle = True for the train set and False for the test set. Use a batch size of 32 for the training loader. Use a batch size of 1 for the test set.

Refer to the transformation documentation.

In [ ]:

from mxnet.gluon.data import DataLoader
# TODO: Define the transformation and the train and test loaders

In [ ]:

answer = None
for data, label in train_loader:
    answer = data.asnumpy().shape, label.asnumpy().shape
    break

answer2 = None
for data, label in test_loader:
    answer2 = data.asnumpy(), label.asnumpy()
    break

grader.grade('check_loaders', (answer, answer2))

Section 2: Let's build classifiers!
We have the data we need to train a scene classifier. We will start simple with a logistic regression classifier as a baseline for our performance before we move on to more complex neural networks.

2.1.1 Logical Logistic Regression - Baseline (25 points)
Let's first try solving this multiclass classification problem with a logistic regression classifier. We will define a logistic regression model in Apache MXNet, train it on our training set, and evaluate its performance on the test set.

Note: With MXNet, data (tensors) are typically in ndarrays. However, certain functions expect NumPy arrays.
Occasionally you'll get an error when you pass an array into a function, where it tells you to call as_np_ndarray() or as_nd_ndarray(). If you get that error, just follow the instructions and you should be fine.

Model Definition
We will define our first model in MXNet. In the MXNet/Gluon docs, read up about the gluon and autograd modules and how to use them to create layers in a neural network.

Our first model is a logistic regression model with the number of outputs equal to the number of classes. Complete the construct_net function with the logistic regression model definition.

In [ ]:

from mxnet import gluon, autograd, ndarray

def construct_net():
    net = None  # TODO: Initialize a gluon sequential model
    with net.name_scope():
        pass  # TODO: Add a gluon dense layer to the model
    return net

net = construct_net()

# Set the context to use the available GPUs, otherwise just use a CPU
ctx = mx.gpu() if mx.context.num_gpus() else mx.cpu()

Now we need to initialize the model weights and the context. Call the net initialize function with Xavier initialization (this sets your starting model weights; read more about it online if you are curious) and pass the context defined above as the ctx argument.

In [ ]:

net.initialize(mx.init.Xavier(), ctx=ctx)

Let's print the model summary.

In [ ]:

x = mx.sym.var('data')
sym = net(x.as_np_ndarray())
mx.viz.print_summary(sym)

In [ ]:

x = net(mx.nd.random.uniform(shape=(32, 3, 224, 224), ctx=ctx).as_np_ndarray())
x = x.asnumpy()
grader.grade('check_log_reg_model', (x.shape, str(net.collect_params())))

This is a multi-class classification problem, so we will use the categorical cross-entropy loss function. It is defined as:

$$L(y,\hat y)=-\sum_{j=0}^M\sum_{i=0}^{N}(y_{ij} \log(\hat y_{ij}))$$

Luckily, we don't have to write it ourselves; we will use the implementation within MXNet and Gluon.

Let's first define our criterion, i.e., the loss function we want to optimize. Read more about gluon loss functions here.

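To make the cross-entropy formula above concrete, here is a minimal sketch in plain Python. It is not the Gluon implementation, and the function names are illustrative. It assumes one-hot labels, in which case only the term for the true class survives in the inner sum, so the per-sample loss is minus the log of the predicted probability for that class.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities; subtracting the max avoids overflow."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, true_class):
    """Loss for one sample: minus the log-probability of the true class."""
    return -math.log(softmax(logits)[true_class])

def batch_cross_entropy(batch_logits, labels):
    """Sum the per-sample losses over a mini-batch."""
    return sum(cross_entropy(z, y) for z, y in zip(batch_logits, labels))

# A model that scores all 67 scene classes equally is maximally unsure.
uniform_loss = cross_entropy([0.0] * 67, true_class=3)
print(round(uniform_loss, 4))  # log(67), about 4.2047

# A confident, correct prediction gives a much smaller loss.
confident_loss = cross_entropy([10.0, 0.0, 0.0], true_class=0)
print(confident_loss < uniform_loss)  # True
```

A useful sanity check follows from this: with 67 classes, a freshly initialized classifier should start with a per-sample loss in the neighborhood of log(67), and training should push it down from there.
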
In [ ]:

# TODO: Define a gluon Softmax Cross Entropy object, name this 'criterion',
# the softmax indicates that the loss function does a softmax first to get the probabilities
# and then computes the Cross EntropyLoss
criterion = None #TODO

In [ ]:

grader.grade('check_criterion', str(criterion))

Next, we define a trainer object, which includes an optimizer that optimizes for our criterion and updates our weights. We need 3 parameters while defining a gluon trainer:

Trainable parameters in the model - net.collect_params() gives you all these parameters
An optimizer - 'sgd' or 'adam'. For this task, you'll have better luck with the Adam adaptive optimizer than with stochastic gradient descent aka 'sgd'.
Optimizer params - A dictionary with parameters for your optimizer. We only need to specify the learning rate parameter within this dictionary. The learning rate is a hyperparameter that you should tune. You should start with a small learning rate like 0.001.

Read more about the trainer object here.

In [ ]:

trainer = None # TODO

Train Model
Next, we need to iterate through our training data multiple times to optimize our weights. Each full pass over the training data is called an epoch. We will write a training loop now.

Here is the pseudocode for the training loop:

Define an accuracy metric to measure performance.
Repeat the following for a number of epochs
Iterate through the mini batches in the training dataloader
Each minibatch object will be a tuple (data, labels)
Each minibatch has a number of images and a number of labels (batch size number of images and labels in each minibatch). Thus each data object will be of the shape (BATCH_SIZE, 3, 224, 224). The 3 corresponds to the number of channels - RGB - and the labels array will be of size (BATCH_SIZE)
Since our model requires linear inputs and not multiple channels, flatten the images in the batch. The reshape function will help you to do this.
Send the data and label to the GPU. The as_in_ctx function with the ctx defined above will help you to do this.
Compute the model outputs for the flattened data using the net object defined above
Compute the loss function with the criterion we previously defined
Compute the accuracy using the metric object defined above.
Backpropagate through the computed loss value. This computes the gradients for each of the model parameters.
Use the trainer to perform an optimizer step - this updates the weights based on the computed gradients with respect to the loss function.

To see how your accuracy improves and the loss decreases, print out the accuracy and the loss after each epoch. You should also plot your training accuracies and training loss function vs epochs. The plot is worth 5 points!

Train your logistic regression model for 10 epochs.

In [ ]:

from mxnet import gluon, autograd, ndarray

#TODO: Define a training function which trains the passed network for the given number of epochs using the provided optimizer and criterion
# The function should return the final training loss and the final training accuracy
def train_network(net, train_loader, criterion, trainer, metric, epochs = 10):
    #TODO: Define your training loop here

    final_training_accuracy = None #TODO: Set this to final training accuracy
    final_training_loss = None #TODO: Set this to final training loss

    return final_training_loss, final_training_accuracy

epochs = 10
metric = None #TODO: Define an accuracy metric
lr_training_loss, lr_training_accuracy = train_network(net, train_loader, criterion, trainer, metric, epochs)
print("Logistic Regression - the training loss is ", str(lr_training_loss))
print("Logistic Regression - the training accuracy is ", str(lr_training_accuracy))

In [ ]:

# Grader Cell : 5 Points
grader.grade('check_lr_train', (lr_training_accuracy, lr_training_loss))

In [ ]:

# TODO: Plot your training accuracies and training loss function vs epochs!

Is your model learning? Is the loss decreasing?
Is it able to classify better after training?

Evaluate Model
Evaluate the model performance on the test set. Compute the cross entropy loss and accuracy on the test set.

Note: Please don't report false numbers for the accuracy, as we will be reviewing these manually, and if there is any manipulation of the accuracy computation, you will get a 0 for the entire section.

In [ ]:

def test_model(net, criterion, test_loader, metric):
    #TODO
    return testing_loss, testing_accuracy

# metric = #TODO: Define an accuracy metric
lr_testing_loss, lr_testing_accuracy = test_model(net, criterion, test_loader, metric)
print("Logistic Regression - the testing loss is ", str(lr_testing_loss))
print("Logistic Regression - the testing accuracy is ", str(lr_testing_accuracy))

In [ ]:

# Grader Cell : 8 Points
grader.grade('check_lr_test', (lr_testing_accuracy, lr_testing_loss))

Does the logistic regression fit the data well? Think about whether this is underfitting or overfitting. Do we need more representational power, or more regularization, to make it better?

2.1.2 Feedforward Neural Networks (25 points)
Since logistic regression isn't that great at fitting our classification problem, we need more representational power. We will now define a feedforward neural network.

Complete the construct_ff_net function below to define a feedforward neural network with at least 2 hidden layers. Note that the last layer must have the number of classes as the output size. You will also need to initialize the network and create a new trainer object with the parameters of the feedforward network. Use a ReLU activation function for the hidden layers.

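The shape of such a network can be sketched in plain Python without any framework. This is only an illustration of the forward pass, not a Gluon model: the layer sizes (a 12-value input and hidden layers of 8 and 6 units) are tiny made-up numbers, the weight initialization is an arbitrary uniform draw, and the helper names are hypothetical. ReLU follows every layer except the last, whose 67 outputs are the raw class scores.

```python
import random

def relu(values):
    """Element-wise ReLU: negative activations become zero."""
    return [max(0.0, v) for v in values]

def dense(values, weights, bias):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [
        sum(w * v for w, v in zip(row, values)) + b
        for row, b in zip(weights, bias)
    ]

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (an arbitrary choice for this sketch)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(values, layers):
    """Apply each layer in turn, with ReLU after all but the last layer."""
    for index, (weights, bias) in enumerate(layers):
        values = dense(values, weights, bias)
        if index < len(layers) - 1:
            values = relu(values)
    return values

rng = random.Random(0)
sizes = [12, 8, 6, 67]  # input, two hidden layers, 67 scene classes
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
scores = forward([rng.random() for _ in range(sizes[0])], layers)
print(len(scores))  # 67
```

In the real assignment the input would be the flattened 3 x 224 x 224 image and the hidden layers would be much wider; the structure of the computation stays the same.
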
In [ ]:

from mxnet import gluon, autograd, ndarray

def construct_ff_net():
    ff_net = None
    # TODO: Create a feedforward network, experiment with the number of hidden layers and
    # the sizes of the hidden layers
    return ff_net

ff_net = construct_ff_net()
#TODO: Initialize the network
trainer = None #TODO: Create a new trainer object for this network

Print the model summary for the fully connected network.

In [ ]:

x = mx.sym.var('data')
sym = ff_net(x)
mx.viz.print_summary(sym)

Now train this network using the train_network function defined above. Create plots for the training accuracy and training loss vs the number of epochs.

In [ ]:

#TODO: plot curves

In [ ]:

#TODO: Train the feedforward neural network on the training set using the train_network function
ffn_training_loss, ffn_training_accuracy = None, None # TODO

In [ ]:

grader.grade('check_ffn_train', (ffn_training_accuracy, ffn_training_loss))

Once again, evaluate the model performance on the test set. Compute the cross entropy loss and accuracy on the test set.

In [ ]:

# TODO: Compute performance on the test set using the test_model function created before
ffn_testing_loss, ffn_testing_accuracy = None, None # TODO

In [ ]:

grader.grade('check_ffn_test', (ffn_testing_accuracy, ffn_testing_loss))

Does the feedforward network do better than logistic regression? Play around with the network architecture to see how it affects the performance on both the train and test data.

2.1.3 Convoluted Convolutional Neural Networks (30 points)
So, what are CNNs? Convolutional Neural Networks are very similar to the Feedforward Neural Networks from the previous section: they are made up of neurons that have learnable weights and biases. Each neuron receives some inputs, performs a dot product and optionally follows it with a non-linearity. The whole network still expresses a single differentiable score function: from the raw image pixels on one end to class scores at the other. So what changes?
ConvNet architectures make the explicit assumption that the inputs are images, which allows us to encode certain properties into the architecture. These then make the forward function more efficient to implement and vastly reduce the number of parameters in the network. If you wanna know more about how CNNs function and see some cool visualizations, we would highly recommend this page.

We will define the architecture for the CNN we will be using. The components of CNNs are:

Convolutional Layers
Pooling Layers
Linear Layers
Activation Functions

Define a CNN model with MXNet and Gluon with a convolutional layer followed by an activation function and a max pool, for one or more layers; then flatten the output from the convolutional layers and pass it through one or more fully connected or 'dense' layers, with activation functions after all but the last layer. Note that the output shape from the last layer must be the same as the number of classes.

You can find some examples of this here: https://gluon.mxnet.io/chapter03_deep-neural-networks/mlp-gluon.html#Faster-modeling-with-gluon.nn.Sequential

In [ ]:

import mxnet.ndarray as F

def construct_conv_net():
    # TODO
    return None

Once again, we ask you to create a network, initialize it and create a trainer for it.

In [ ]:

#TODO : Initialize network, initialize the criterion and the trainer

In [ ]:

#Grader Cell - Worth 10 points
grader.grade('check_cnn_model', (str(cnn.collect_params())))

Write another function to train a convolutional neural network on the train data given the network, trainer and criterion. Train the CNN for 10-25 epochs. Plot the training loss and accuracy curves.

Note that there will be a slight difference from training a feedforward network, because here you will pass the image information in separate channels as the input rather than as a single flattened input. Note also that you may have to experiment with the number and widths of the layers until you get satisfactory performance.
To give a hint of what's possible, our classifier achieved an accuracy score of 0.94 on this part (i.e., with the training dataset).

In [ ]:

from mxnet import gluon, autograd, ndarray

#TODO: Define a training function which trains the passed network for the given number of epochs using the provided optimizer and criterion
# The function should return the final training loss and the final training accuracy
def train_cnn(net, train_loader, criterion, trainer, metric, epochs = 10):
    #TODO: Define your training loop here

    final_training_accuracy = None #TODO: Set this to final training accuracy
    final_training_loss = None #TODO: Set this to final training loss

    return final_training_loss, final_training_accuracy

# Pass the CNN (named cnn, as the grader cell for the model expects) rather than the logistic regression net
cnn_training_loss, cnn_training_accuracy = train_cnn(cnn, train_loader, criterion, trainer, metric, epochs)

In [ ]:

# TODO: don't forget to plot training loss + accuracy curves!

In [ ]:

grader.grade('check_cnn_train', (cnn_training_accuracy, cnn_training_loss))

Once again, evaluate the model performance on the test set. Compute the cross entropy loss and accuracy on the test set.

In [ ]:

# TODO: Compute performance on the test set, you may need to write a new function modifying test_model without the flattening aspect
cnn_testing_accuracy, cnn_testing_loss = None, None # TODO

In [ ]:

grader.grade('check_cnn_test', (cnn_testing_accuracy, cnn_testing_loss))

How does the CNN perform? Does it outperform the feedforward network? Print out the number of learned parameters for the CNN and for the FFN. Does the CNN have more parameters? Think about how this links to performance and why CNNs are so powerful.
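As a back-of-the-envelope aid for the parameter question, the sketch below counts the weights and biases in a single first layer of each kind. The layer sizes are assumptions chosen only for illustration (a dense layer with 128 units, and a convolution with 32 filters of size 3 x 3); your own networks will differ, and the helper names are hypothetical.

```python
def dense_params(n_inputs, n_outputs):
    """One weight per input-output pair, plus one bias per output."""
    return n_inputs * n_outputs + n_outputs

def conv2d_params(in_channels, out_channels, kernel_size):
    """Each filter holds in_channels * k * k weights, plus one bias per filter."""
    return out_channels * in_channels * kernel_size * kernel_size + out_channels

flattened_pixels = 3 * 224 * 224  # RGB image flattened for a dense layer
ffn_first_layer = dense_params(flattened_pixels, 128)
cnn_first_layer = conv2d_params(in_channels=3, out_channels=32, kernel_size=3)

print(flattened_pixels)  # 150528
print(ffn_first_layer)   # 19267712
print(cnn_first_layer)   # 896
```

The convolution reuses the same small filters at every image position, which is why its parameter count does not depend on the 224 x 224 input size, while the dense layer needs a separate weight for every pixel.
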