
3a: PyTorch

Week 3: Overview

This week, we will look at the basic structure and components of a typical PyTorch program, and run
some simple examples. We will also learn how to analyze the hidden unit dynamics of neural
networks.

Weekly learning outcomes

By the end of this module, you will be able to:

code simple PyTorch operations

analyze the geometry of hidden unit activations in neural networks

PyTorch

The following code fragments illustrate the typical structure of a PyTorch program, with further
details and various options for each component.

Typical Structure of a PyTorch Program

PYTHON


# create neural network according to model specification
net = MyModel().to(device)   # CPU or GPU

# prepare to load the training and test data
train_loader = torch.utils.data.DataLoader(…)
test_loader  = torch.utils.data.DataLoader(…)

# choose between SGD, Adam or another optimizer
optimizer = torch.optim.SGD(net.parameters(), …)

# enter the training loop
for epoch in range(1, epochs):
    train(params, net, device, train_loader, optimizer)
    # periodically evaluate the network on the test data
    if epoch % 10 == 0:
        test(params, net, device, test_loader)

Defining a Model

PYTHON


class MyModel(torch.nn.Module):

    def __init__(self):
        super(MyModel, self).__init__()
        # define structure of the network here

    def forward(self, input):
        # apply network and return output

Defining a Custom Model

This code defines a module for computing a function of the form (x, y) ↦ A·x·log(y) + B·y²

PYTHON


import torch
import torch.nn as nn

class MyModel(nn.Module):

    def __init__(self):
        super(MyModel, self).__init__()
        self.A = nn.Parameter(torch.randn((1), requires_grad=True))
        self.B = nn.Parameter(torch.randn((1), requires_grad=True))

    def forward(self, input):
        output = self.A * input[:,0] * torch.log(input[:,1]) \
               + self.B * input[:,1] * input[:,1]
        return output
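As a quick sanity check, the module above can be applied to a small batch of (x, y) pairs. This is a minimal sketch (the input values are invented for illustration, and the second column must be positive because of the logarithm):

PYTHON

import torch

net = MyModel()
# a batch of two (x, y) pairs; y must be positive
batch = torch.tensor([[1.0, 2.0],
                      [3.0, 0.5]])
output = net(batch)   # shape (2,); values depend on the randomly initialised A and B
print(output)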

Building a Net from Individual Components

PYTHON


import torch

class MyModel(torch.nn.Module):

    def __init__(self):
        super(MyModel, self).__init__()
        self.in_to_hid  = torch.nn.Linear(2, 2)
        self.hid_to_out = torch.nn.Linear(2, 1)

    def forward(self, input):
        hid_sum = self.in_to_hid(input)
        hidden  = torch.tanh(hid_sum)
        out_sum = self.hid_to_out(hidden)
        output  = torch.sigmoid(out_sum)
        return output

Defining a Sequential Network

PYTHON


import torch
import torch.nn as nn

class MyModel(torch.nn.Module):

    def __init__(self, num_input, num_hid, num_out):
        super(MyModel, self).__init__()
        self.main = nn.Sequential(
            nn.Linear(num_input, num_hid),
            nn.Tanh(),
            nn.Linear(num_hid, num_out),
            nn.Sigmoid()
        )

    def forward(self, input):
        output = self.main(input)
        return output

Sequential Components

Network Layers:

nn.Linear()

nn.Conv2d()

Intermediate Operators:

nn.Dropout()

nn.BatchNorm1d(), nn.BatchNorm2d()

Activation Functions:

nn.Tanh()

nn.Sigmoid()

nn.ReLU()
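As an illustration of how these pieces fit together, here is a small sketch of a network built from them inside nn.Sequential (the layer sizes and dropout rate are arbitrary choices for the example):

PYTHON

import torch.nn as nn

model = nn.Sequential(
    nn.Linear(20, 64),     # network layer
    nn.BatchNorm1d(64),    # intermediate operator
    nn.ReLU(),             # activation function
    nn.Dropout(p=0.5),     # intermediate operator
    nn.Linear(64, 1),
    nn.Sigmoid()
)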

Declaring Data Explicitly

PYTHON


import torch.utils.data

# input and target values for the XOR task
input = torch.Tensor([[0,0],[0,1],[1,0],[1,1]])
target = torch.Tensor([[0],[1],[1],[0]])

xdata = torch.utils.data.TensorDataset(input,target)
train_loader = torch.utils.data.DataLoader(xdata,batch_size=4)
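Iterating over this loader yields (input, target) batches. As a quick check (not part of the course code):

PYTHON

for data, target in train_loader:
    print(data.shape, target.shape)   # torch.Size([4, 2]) torch.Size([4, 1])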

Loading Data from a .csv File

PYTHON


import pandas as pd
import torch
import torch.utils.data

df = pd.read_csv("sonar.all-data.csv")
df = df.replace('R', 0)
df = df.replace('M', 1)
data = torch.tensor(df.values, dtype=torch.float32)
num_input = data.shape[1] - 1
input  = data[:, 0:num_input]
target = data[:, num_input:num_input+1]
dataset = torch.utils.data.TensorDataset(input, target)
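The resulting dataset can then be wrapped in a DataLoader in the usual way; a one-line sketch (the batch size here is an arbitrary choice):

PYTHON

train_loader = torch.utils.data.DataLoader(dataset, batch_size=16, shuffle=True)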

Custom Datasets

PYTHON


from torchvision.datasets import ImageFolder
# load images from a specified directory
dataset = ImageFolder(folder, transform)

import torchvision.datasets as dsets
# download popular image datasets remotely
mnistset = dsets.MNIST(…)
cifarset = dsets.CIFAR10(…)
celebset = dsets.CelebA(…)
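A typical call, shown here as a sketch (the root directory, transform and batch size are example choices, not requirements):

PYTHON

import torchvision.datasets as dsets
import torchvision.transforms as transforms
import torch.utils.data

mnistset = dsets.MNIST(root='./data', train=True, download=True,
                       transform=transforms.ToTensor())
train_loader = torch.utils.data.DataLoader(mnistset, batch_size=64, shuffle=True)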

Choosing an Optimizer

PYTHON


# SGD stands for "Stochastic Gradient Descent"
optimizer = torch.optim.SGD(net.parameters(),
                            lr=0.01, momentum=0.9,
                            weight_decay=0.0001)

# Adam = Adaptive Moment Estimation (good for deep networks)
optimizer = torch.optim.Adam(net.parameters(), eps=0.000001,
                             lr=0.01, betas=(0.5, 0.999),
                             weight_decay=0.0001)

Training

PYTHON


def train(args, net, device, train_loader, optimizer):

    for batch_idx, (data, target) in enumerate(train_loader):
        optimizer.zero_grad()   # zero the gradients
        output = net(data)      # apply network
        loss = …                # compute loss function
        loss.backward()         # backpropagate to compute gradients
        optimizer.step()        # update weights

Loss Functions

PYTHON


import torch
import torch.nn.functional as F

loss = torch.sum((output-target)*(output-target))   # sum squared error
loss = F.nll_loss(output, target)                   # negative log likelihood
loss = F.binary_cross_entropy(output, target)
loss = F.softmax(output, dim=1)
loss = F.log_softmax(output, dim=1)
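Note that softmax and log_softmax are output transformations rather than losses in themselves; log_softmax is normally paired with nll_loss, and the combination is equivalent to cross_entropy applied to the raw outputs. A minimal sketch (the tensors are invented for illustration):

PYTHON

import torch
import torch.nn.functional as F

logits = torch.randn(4, 3)            # raw outputs for 4 items and 3 classes
target = torch.tensor([0, 2, 1, 0])   # integer class labels

loss1 = F.nll_loss(F.log_softmax(logits, dim=1), target)
loss2 = F.cross_entropy(logits, target)   # same value as loss1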

Testing

PYTHON


def test(args, model, device, test_loader):
    with torch.no_grad():     # suppress updating of gradients
        model.eval()          # toggle batch norm, dropout
        test_loss = 0
        for data, target in test_loader:
            output = model(data)
            test_loss += …
        print(test_loss)
        model.train()         # toggle batch norm, dropout back again

Computational Graphs

PyTorch automatically builds a computational graph, enabling it to backpropagate derivatives.

Every parameter includes .data and .grad components, for example:

A.data

A.grad

optimizer.zero_grad() sets all .grad components to zero.

loss.backward() updates the .grad component of all Parameters by backpropagating gradients
through the computational graph.

optimizer.step() updates the .data components.
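As a small illustration of these steps, using the two-parameter MyModel defined earlier (the input, target and loss below are invented for the example):

PYTHON

import torch

net = MyModel()
input  = torch.tensor([[1.0, 2.0]])
target = torch.tensor([0.5])
optimizer = torch.optim.SGD(net.parameters(), lr=0.01)

optimizer.zero_grad()                        # clear any existing gradients
loss = torch.sum((net(input) - target)**2)
loss.backward()                              # A.grad and B.grad now hold the gradients
print(net.A.data, net.A.grad)
optimizer.step()                             # A.data and B.data are updated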

Controlling the Computational Graph

If we need to stop the gradients from being backpropagated through a certain variable (or expression)
A , we can exclude it from the computational graph by using:

A.detach()

By default, loss.backward() discards the computational graph after computing the gradients.

If needed, we can force it to keep the computational graph by calling it this way:

loss.backward(retain_graph=True)
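A minimal sketch of both mechanisms (the tensors are invented for illustration):

PYTHON

import torch

x = torch.tensor([2.0], requires_grad=True)
y = x * x
z = y.detach()                     # z is treated as a constant: no gradient flows back through it

loss = (y + z).sum()
loss.backward(retain_graph=True)   # keep the graph so backward can be called again
loss.backward()                    # this second call would fail without retain_graph=True above
print(x.grad)                      # gradients from both calls accumulate (here 4.0 + 4.0 = 8.0)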

Exercise: Running PyTorch

Question 1


Question 2

The following program solves the simplest possible machine learning task: find the value of A such that the function f(x) = Ax satisfies f(1) = 1.

PYTHON


import torch
import torch.utils.data
import numpy as np

lr = 1.9    # learning rate
mom = 0.0   # momentum

class MyModel(torch.nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.A = torch.nn.Parameter(torch.zeros((1), requires_grad=True))
    def forward(self, input):
        output = self.A * input
        return(output)
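The listing above is truncated on this page. Purely as a sketch of how such a program might continue (the single training example, the loss and the number of epochs below are assumptions for illustration, not the exact course code), the training loop could look like:

PYTHON

# one training example: input x = 1 with target f(1) = 1
input  = torch.Tensor([[1.0]])
target = torch.Tensor([[1.0]])

net = MyModel()
optimizer = torch.optim.SGD(net.parameters(), lr=lr, momentum=mom)

for epoch in range(1, 101):            # the number of epochs is an arbitrary choice here
    optimizer.zero_grad()
    output = net(input)
    loss = torch.sum((output - target)**2)
    loss.backward()
    optimizer.step()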

Change the learning rate lr to each of the following values by editing line 5 in the above code.

0.01, 0.1, 0.5, 1.0, 1.5, 1.9, 2.0, 2.1

Try running the code and describe what happens for each value of lr , in terms of the success and
speed of the algorithm.

Now keep the learning rate at 1.9 , but try each of the following values for momentum by changing
the value of mom on line 6.

0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9

For which value of momentum is the task solved in the fewest epochs?

What happens when the momentum is 1.0 ? What happens when it is 1.1 ?


Exercise: XOR with PyTorch

Question 1


Question 2


This program trains a two-layer neural network on the famous XOR task.

PYTHON


import torch
import torch.utils.data
import torch.nn.functional as F

lr = 0.1
mom = 0.0
init = 1.0

class MyModel(torch.nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        # define structure of the network here
        self.in_hid  = torch.nn.Linear(2, 2)
        self.hid_out = torch.nn.Linear(2, 1)

    def forward(self, input):
        # apply network and return output (remainder of the listing is not shown here)

Run the above code ten times. For how many runs does it reach the Global Minimum? For how many
runs does it reach a Local Minimum?

Keeping the learning rate fixed at 0.1, adjust the values of momentum (mom) on line 6 and initial weight size (init) on line 7 to see if you can find values for which the code converges relatively quickly to the Global Minimum on virtually every run.

Coding Exercise: Basic PyTorch Operations

Objective

The Tensor is a fundamental structure in PyTorch which is very similar to an array or matrix. Tensors
are used to encode the inputs and outputs of a model, as well as the model’s parameters. In this
exercise, you will learn how to implement basic tensor operations.
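For instance, a few of the basic operations covered in the tutorial (a small illustrative sketch, not the exercise itself):

PYTHON

import torch

a = torch.tensor([[1., 2.], [3., 4.]])
b = torch.ones(2, 2)

print(a + b)           # elementwise addition
print(a * b)           # elementwise multiplication
print(a @ b)           # matrix multiplication
print(a[:, 1])         # indexing: the second column
print(a.reshape(4))    # reshaping into a 1-D tensor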

Instructions

Before starting the exercise, please go through the tutorial about tensors from the PyTorch website.

https://pytorch.org/tutorials/beginner/blitz/tensor_tutorial.html#sphx-glr-beginner-blitz-tensor-tutorial-py

For some of the exercises, the torch.Tensor documentation should be very helpful.

https://pytorch.org/docs/stable/tensors.html

Week 3 Wednesday video