Lab Exercises | Week 1
COSC2779 – Deep Learning
1 Introduction
In this week's lab, we will explore the deep learning infrastructure that will be used during the course. The
main cloud-based platforms we will be using are:
• Google Colab
• (optional) Kaggle Notebooks GPU
The following text provides an introduction to setting up the above platforms for training and
testing deep neural network models. At the end of each section there are links to video tutorials.
It is highly recommended that you go through them to get a good feel for Google Colab.
2 Google Colab
The first platform we are going to explore is Google Colab. Colab (short for Colaboratory) is a product
offered by Google Research that allows machine learning researchers to work on projects in the browser.
Similar to Google Docs, it lets you share projects between many people (useful for group work), and
best of all, it gives free access to GPUs so you can train models quickly. On the negative side, Colab
sessions are limited to 12 hours (you can restart after 12 hours) and there is no guarantee that you
will get a GPU (although most of the time you will).
A nice introduction to Colab is at: https://youtu.be/inN8seMm7UI
In order to start, you need a Google account. You can use either your RMIT student email
(preferred) or your personal Gmail account.
Setting up a project
• Log in to your preferred Google account and open Google Drive.
• Create a folder with a suitable name for the project (e.g. COSC2779lab1) and navigate into that
folder.
• Right click → Google Colaboratory (if you can't see it, click More).
• This will open a notebook instance. You can rename the notebook.
• If you prefer to use a GPU: Runtime → Change Runtime Type → select GPU from the
dropdown box under Hardware accelerator. You can verify that a GPU is attached with the snippet below.
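To confirm the runtime actually has a GPU, you can ask TensorFlow for its visible devices. This is a minimal check, assuming TensorFlow 2.1 or later (the exact device string printed will vary):
import tensorflow as tf

# List GPU devices visible to TensorFlow; an empty list means no GPU
# is attached (or a CPU-only runtime is selected).
gpus = tf.config.list_physical_devices('GPU')
print(gpus if gpus else 'No GPU found - check the runtime type')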
Now you are ready to use Python on a GPU instance. Using Python on Colab is the same as in a regular
Anaconda Jupyter notebook.
Test and install TensorFlow
Let's first check whether TensorFlow is installed, and which version.
import tensorflow
print(tensorflow.__version__)
This should output a version number greater than 2.0. If not, you can install it using the following command.
Note that the "!" character allows you to run command-line commands from the notebook.
!pip install tensorflow
Colab now comes with TensorFlow 2.x pre-installed, so it is not recommended to reinstall it with pip.
The line above simply illustrates that you can install Python packages in the notebook using the
command line.
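If you do need a particular version of some package (for example, to match code written against a specific release), pip accepts version specifiers. The package and version below are purely illustrative:
!pip install -q "seaborn>=0.11"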
Uploading Files
Any DL project involves a significant amount of data, and the first step is to get it into the
compute instance. There are several methods to get data into a Colab instance; here we will discuss
the two most popular.
Uploading from GDrive
Assume our data is available in GDrive as a zip file. For this section, you can download the
"BostonHousingPrice.zip" file from Canvas and upload it to the folder created earlier (e.g. COSC2779lab1).
This zip file contains a .csv file with the Boston housing data: house prices together with several
factors associated with them. The task is to predict house prices from those factors.
First you need to mount your GDrive onto the instance. This can be done by running the following
code block in the notebook.
from google.colab import drive
drive.mount('/content/drive')
This will ask for an authentication code. Click on the link, which will prompt you to log in to your
Google account again, and copy the code back into the notebook. Now your GDrive is mounted. You can
check your files with:
!ls /content/drive/'My Drive'/COSC2779/COSC2779lab1/
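If that ls fails, the path is probably wrong. A small sketch for checking the path from Python (adjust the folder names to match your own Drive layout):
from pathlib import Path

# Path assumes the folder layout created earlier; adjust to your Drive.
p = Path('/content/drive/My Drive/COSC2779/COSC2779lab1')
print(list(p.iterdir()) if p.exists() else 'Path not found - check folder names')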
Working with data directly on GDrive can be too slow, especially with large data sets that are read
one batch at a time (typical for deep learning). Therefore, we will copy the data to the local disk of
the notebook instance using the following code.
!cp /content/drive/'My Drive'/COSC2779/COSC2779lab1/BostonHousingPrice.zip .
!unzip -q -o BostonHousingPrice.zip
!rm BostonHousingPrice.zip
!ls
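The same copy-and-extract step can also be done from Python rather than shell commands, which is handy if you want to wrap it in a reusable function. A minimal sketch, assuming the same paths as above:
import shutil
import zipfile

# Copy the archive from the mounted Drive to the local disk, then extract it.
# Paths assume the same folder layout as the shell commands above.
src = '/content/drive/My Drive/COSC2779/COSC2779lab1/BostonHousingPrice.zip'
shutil.copy(src, 'BostonHousingPrice.zip')
with zipfile.ZipFile('BostonHousingPrice.zip') as zf:
    zf.extractall('.')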
Now we can read the data using pandas and use it for our ML model development.
import pandas as pd
import numpy as np
data = pd.read_csv('./BostonHousingPrice/housing.data.csv', delimiter=r'\s+')
data.head(5)
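Before building any model, it is worth a quick sanity check on what was loaded. A minimal sketch; the column names, in particular the 'MEDV' target, are assumptions based on the classic Boston housing data and may differ in the Canvas file:
# Basic checks on the loaded frame: dimensions and missing values.
print(data.shape)
print(data.isna().sum())

# Separate features and target; 'MEDV' (median house value) is an
# assumed column name from the classic Boston housing data.
X = data.drop(columns=['MEDV'])
y = data['MEDV']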
Direct Upload from your computer
The next method to get your files into the notebook instance is to upload them directly from your
computer. This method is not recommended, as it requires you to upload the data every time you
reinitialize the notebook instance, which can be time-consuming.
First do a Runtime → Factory reset runtime. This will remove all the data you uploaded in the
previous section. Then you can upload the zip file, extract it and read the data frame using the
following code.
import pandas as pd
import numpy as np
# The following lines will upload data to the notebook instance
from google.colab import files
files.upload()
# The lines below are the same as in the previous section
!unzip -q -o BostonHousingPrice.zip
!rm BostonHousingPrice.zip
data = pd.read_csv('./BostonHousingPrice/housing.data.csv', delimiter=r'\s+')
data.head(5)
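The reverse direction also works: once you have results on the instance (say, a CSV of predictions) you can pull them back to your computer with files.download. A short sketch; the file name and contents here are illustrative:
from google.colab import files

# Write a result file on the instance, then trigger a browser download.
data.head(50).to_csv('sample.csv', index=False)
files.download('sample.csv')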
You are now set up to use Colab. For more information, please follow these tutorials:
• Video tutorial for using TensorFlow on Colab
• The "Working with Notebooks in Colab" section of https://colab.research.google.com/notebooks/intro.ipynb
3 Kaggle Kernels
Kaggle provides notebook editors with free access to NVIDIA Tesla P100 GPUs, which are
useful for training deep learning models. You can use a GPU for up to 30 hours per week, and
individual sessions can run for up to 9 hours.
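Once you enable the GPU accelerator in the notebook's settings, you can confirm it is attached with the same shell command that works on Colab:
# Shows the attached NVIDIA GPU, driver version and memory usage, if any.
!nvidia-smi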
More information on running Kaggle Kernels is at: Introduction to Kaggle Kernels