Lecture 31:
Convolutional networks (2) CS 189 (CDSS offering)
2022/04/13
Today’s lecture
• Today, we wrap up our discussion of convolutional networks (conv nets)
• We will see what the “convolution” operation actually looks like, in math, and we will work out its gradient
• Then, we will discuss some important papers from the last ~10 years that led us to the state-of-the-art conv nets we have today
• Next week, we will cover our final type of neural network: the transformer
Convolutions in math
Convolutions, the “forward” direction
Consider processing $a^{(l)}$ (shape $[H, W, C]$) into $z^{(l+1)}$ (shape $[H', W', C']$) using a convolution.
Our filter $W^{(l+1)}$ has shape $[K, K, C', C]$.
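$z^{(l+1)}[i, j, k] = \sum_{a=0}^{K-1} \sum_{b=0}^{K-1} \sum_{c=0}^{C-1} W^{(l+1)}[a, b, k, c] \, a^{(l)}[i + a, j + b, c]$
In matrix-vector notation:
$z^{(l+1)}[i, j] = \sum_{a=0}^{K-1} \sum_{b=0}^{K-1} W^{(l+1)}[a, b] \, a^{(l)}[i + a, j + b]$
where $W^{(l+1)}[a, b]$ is a $C' \times C$ matrix, $a^{(l)}[i + a, j + b]$ is a $C$-dim vector, and $z^{(l+1)}[i, j]$ is a $C'$-dim vector.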
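The forward pass on this slide can be spelled out as explicit loops. The following is a minimal sketch in plain Python, not an efficient implementation. It assumes stride 1 and no padding (so H' = H - K + 1 and W' = W - K + 1), which the slide does not specify. The name `conv2d_forward` and the nested-list layout are illustrative choices.

```python
def conv2d_forward(a, W):
    """Stride-1, no-padding convolution (assumptions, not stated on the slide).

    a: input activations as nested lists with shape [H][W][C]
    W: filter as nested lists with shape [K][K][C_out][C]
    Returns z with shape [H - K + 1][W - K + 1][C_out], where
    z[i][j][k] = sum over a, b, c of W[a][b][k][c] * a[i + a][j + b][c].
    """
    height, width, C = len(a), len(a[0]), len(a[0][0])
    K, C_out = len(W), len(W[0][0])
    h_out, w_out = height - K + 1, width - K + 1
    z = [[[0.0] * C_out for _ in range(w_out)] for _ in range(h_out)]
    for i in range(h_out):
        for j in range(w_out):
            for k in range(C_out):          # output channel
                for da in range(K):         # filter row offset
                    for db in range(K):     # filter column offset
                        for c in range(C):  # input channel
                            z[i][j][k] += W[da][db][k][c] * a[i + da][j + db][c]
    return z


# 3 x 3 input with one channel, 2 x 2 filter of ones with one output channel.
a_in = [[[1.0], [2.0], [3.0]],
        [[4.0], [5.0], [6.0]],
        [[7.0], [8.0], [9.0]]]
W_ones = [[[[1.0]], [[1.0]]],
          [[[1.0]], [[1.0]]]]
print(conv2d_forward(a_in, W_ones))  # [[[12.0], [16.0]], [[24.0], [28.0]]]
```

With K = 1 the inner sums reduce to a single matrix-vector product per pixel, which is the matrix-vector view of the same operation.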
Convolutions, the “backward” direction
For simplicity, let’s assume that $C' = C = 1$,
so $z^{(l+1)}[i, j] = \sum_{a=0}^{K-1} \sum_{b=0}^{K-1} W^{(l+1)}[a, b] \, a^{(l)}[i + a, j + b]$
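Gradient with respect to the filter, for $a = 0$ to $K - 1$ and $b = 0$ to $K - 1$:
$\frac{\partial z^{(l+1)}[i, j]}{\partial W^{(l+1)}[a, b]} = a^{(l)}[i + a, j + b]$
$\frac{\partial \ell}{\partial W^{(l+1)}[a, b]} = \sum_{i} \sum_{j} \frac{\partial \ell}{\partial z^{(l+1)}[i, j]} \, a^{(l)}[i + a, j + b]$
This is itself a convolution: the input $a^{(l)}$ convolved with the upstream gradient $\partial \ell / \partial z^{(l+1)}$.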
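To make the filter gradient concrete for the single-channel case: by the chain rule, dL/dW[a][b] is the sum over output positions (i, j) of dL/dz[i][j] * a[i + a][j + b], which is the input convolved with the upstream gradient. The sketch below assumes stride 1 and no padding. The names `conv2d` and `grad_filter`, and the example upstream gradient `dz`, are made up for illustration.

```python
def conv2d(a, w):
    """Stride-1, no-padding convolution of a 2-D input `a` with a 2-D filter `w`."""
    kh, kw = len(w), len(w[0])
    h_out, w_out = len(a) - kh + 1, len(a[0]) - kw + 1
    return [[sum(w[p][q] * a[i + p][j + q] for p in range(kh) for q in range(kw))
             for j in range(w_out)]
            for i in range(h_out)]


def grad_filter(a, dz):
    """dL/dW given the input `a` and the upstream gradient dz = dL/dz.

    dL/dW[p][q] = sum over i, j of dz[i][j] * a[i + p][j + q],
    which is exactly conv2d with dz playing the role of the filter.
    """
    return conv2d(a, dz)


a_in = [[1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0],
        [7.0, 8.0, 9.0]]
dz = [[1.0, 2.0],
      [3.0, 4.0]]  # hypothetical upstream gradient for a 2 x 2 output
dW = grad_filter(a_in, dz)
print(dW)  # [[37.0, 47.0], [67.0, 77.0]]
```

The result has the same K x K shape as the filter (here 2 x 2), since a 3 x 3 input convolved with a 2 x 2 "filter" leaves 2 x 2 positions.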
Convolutions, the “backward” direction
$z^{(l+1)}[i, j] = \sum_{a=0}^{K-1} \sum_{b=0}^{K-1} W^{(l+1)}[a, b] \, a^{(l)}[i + a, j + b]$
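Gradient with respect to the input:
$\frac{\partial z^{(l+1)}[i, j]}{\partial a^{(l)}[i + a, j + b]} = W^{(l+1)}[a, b]$
$\frac{\partial \ell}{\partial a^{(l)}[i', j']} = \sum_{a=0}^{K-1} \sum_{b=0}^{K-1} W^{(l+1)}[a, b] \, \frac{\partial \ell}{\partial z^{(l+1)}[i' - a, j' - b]}$
(terms whose indices fall outside the output are dropped)
This is also a convolution: the upstream gradient $\partial \ell / \partial z^{(l+1)}$ convolved with the flipped filter.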
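The input gradient for the single-channel case can be sketched the same way. Each output z[i][j] depends on a[i + a][j + b] with weight W[a][b], so every upstream gradient entry is scattered back onto the input positions it touched. The sketch assumes stride 1 and no padding. The name `grad_input` and the example values are illustrative.

```python
def grad_input(W, dz, height, width):
    """dL/da for a stride-1, no-padding, single-channel convolution.

    W: K x K filter, dz: upstream gradient dL/dz, (height, width): input shape.
    Accumulates da[i + p][j + q] += W[p][q] * dz[i][j], which is equivalent to
    da[i'][j'] = sum over p, q of W[p][q] * dz[i' - p][j' - q] with
    out-of-range terms dropped.
    """
    K = len(W)
    da = [[0.0] * width for _ in range(height)]
    for i in range(len(dz)):
        for j in range(len(dz[0])):
            for p in range(K):
                for q in range(K):
                    da[i + p][j + q] += W[p][q] * dz[i][j]
    return da


W_ex = [[1.0, 2.0],
        [3.0, 4.0]]
dz_ones = [[1.0, 1.0],
           [1.0, 1.0]]  # upstream gradient when the loss is the sum of all outputs
da = grad_input(W_ex, dz_ones, 3, 3)
print(da)  # [[1.0, 3.0, 2.0], [4.0, 10.0, 6.0], [3.0, 7.0, 4.0]]
```

The center pixel is covered by all four filter positions, so its gradient is 1 + 2 + 3 + 4 = 10, while each corner is covered by only one.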
The last decade in conv nets (sort of)
AlexNet Krizhevsky et al, 2012
• The model that started the past decade of deep learning hype
• Demonstrated the power of combining expressive models with lots of compute
• Widely known for being the first neural network to attain state-of-the-art results on the ImageNet large scale visual recognition challenge (ILSVRC)
ImageNet image classification
• ImageNet consists of images, typically resized and cropped to 224 × 224 × 3, evenly covering 1000 classes
• There are 1.2M training images and 50000 evaluation images
• ImageNet-22K is a larger version of ImageNet (roughly 10× larger) with 22000 classes, increasingly used these days due to expanding compute budgets
• It is common for computer vision applications to start from a network pretrained on ImageNet
Smaller image classification datasets MNIST, CIFAR-10, and CIFAR-100
• Working with ImageNet is not a good way to prototype
• MNIST, CIFAR-10, and CIFAR-100 are much smaller datasets that increase in difficulty in that order
• But they’re all much easier than ImageNet
• MNIST: 60000/10000 train/test, 10 classes, 28 × 28 × 1 grayscale images
• CIFAR-*: 50000/10000 train/test, * classes (10 or 100), 32 × 32 × 3 color (RGB) images
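One way to see why the smaller datasets are easier to prototype with is to compare the number of input values per image. This is a quick arithmetic sketch using the image shapes listed above. The dictionary keys and the helper name `values_per_image` are illustrative.

```python
# Image shapes (height, width, channels) as listed in the slides.
shapes = {
    "MNIST": (28, 28, 1),
    "CIFAR-10/100": (32, 32, 3),
    "ImageNet (224 x 224 crop)": (224, 224, 3),
}


def values_per_image(shape):
    """Number of scalar values in one image of the given shape."""
    height, width, channels = shape
    return height * width * channels


for name, shape in shapes.items():
    print(name, values_per_image(shape))
# MNIST 784
# CIFAR-10/100 3072
# ImageNet (224 x 224 crop) 150528
```

A single ImageNet crop therefore carries 192 times as many input values as an MNIST image, before even accounting for the larger number of training images.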
Skip connections in convolutional networks He et al, 2015
• Recall the general idea behind skip connections: $a^{(l)} = \sigma(z^{(l)}) + a^{(l-1)}$
• This idea was popularized by residual networks (ResNets), a convolutional
architecture that implemented the idea slightly differently (and in two ways)
• This allowed for better training of deeper networks, which are more performant
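The generic skip-connection formula, a(l) = sigma(z(l)) + a(l-1), can be sketched as follows. This is not the ResNet block itself, which the slides note is implemented slightly differently. It uses a fully connected layer instead of a convolution to keep the example short, and it assumes sigma is ReLU. The function names are illustrative.

```python
def relu(v):
    """Elementwise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]


def linear(weights, bias, v):
    """z = weights @ v + bias for a square weight matrix given as nested lists."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]


def residual_layer(weights, bias, a_prev):
    """a = relu(z) + a_prev, with z = weights @ a_prev + bias.

    The layer must preserve dimension so that the addition is well defined.
    """
    z = linear(weights, bias, a_prev)
    return [s + x for s, x in zip(relu(z), a_prev)]


a_prev = [2.0, 3.0]
print(residual_layer([[1.0, 0.0], [0.0, -1.0]], [0.0, 0.0], a_prev))  # [4.0, 3.0]
print(residual_layer([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0], a_prev))   # [2.0, 3.0]
```

The second call shows that with all-zero weights the layer passes its input through unchanged, so the previous activation always has a direct path to later layers.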
Depthwise (or grouped) convolutions E.g., Xie et al, 2016
• In depthwise (resp. grouped) convolutions, the filter and input are split by channels (resp. groups of channels), convolved separately, then concatenated
• We can increase the number of channels and maintain roughly the same computational complexity with this technique, and performance often improves
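The trade-off between channels and groups can be sketched with a weight count and a small depthwise convolution. This is a rough sketch assuming stride 1, no padding, and no bias terms. The names `conv_params` and `depthwise_conv2d` and the example sizes are illustrative, not taken from any of the cited papers.

```python
def conv_params(K, c_in, c_out, groups=1):
    """Number of filter weights in a K x K convolution (biases ignored).

    Each of the `groups` groups maps c_in / groups input channels to
    c_out / groups output channels.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    return K * K * (c_in // groups) * (c_out // groups) * groups


def depthwise_conv2d(a, W):
    """Depthwise convolution: channel c of the input is convolved only with
    its own K x K filter W[.][.][c]. a has shape [H][W][C], W has shape [K][K][C]."""
    height, width, C = len(a), len(a[0]), len(a[0][0])
    K = len(W)
    return [[[sum(W[p][q][c] * a[i + p][j + q][c] for p in range(K) for q in range(K))
              for c in range(C)]
             for j in range(width - K + 1)]
            for i in range(height - K + 1)]


print(conv_params(3, 64, 64))               # 36864  standard convolution
print(conv_params(3, 128, 128, groups=4))   # 36864  twice the channels, 4 groups
print(conv_params(3, 64, 64, groups=64))    # 576    depthwise (one group per channel)

a_in = [[[1.0, 10.0], [2.0, 20.0]],
        [[3.0, 30.0], [4.0, 40.0]]]
W_dw = [[[1.0, 2.0], [1.0, 2.0]],
        [[1.0, 2.0], [1.0, 2.0]]]
print(depthwise_conv2d(a_in, W_dw))  # [[[10.0, 200.0]]]
```

In this example, doubling the channel count while using 4 groups leaves the weight count unchanged. The multiply-add count per output position scales the same way.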
A recent state-of-the-art example Liu et al, 2022
• ConvNeXt is a recent state-of-the-art conv net that aggregates several methods to achieve improved performance
• Improved training techniques (including better optimization and lots of data augmentation) turn out to help significantly
• Using depthwise convolutions (and proportionally increasing the number of channels) also significantly improves accuracy
• Some other changes, such as replacing batch normalization (BN) with layer normalization (LN) and ReLU with GELU, provide smaller gains and appear to be less important
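For reference, the two swaps in the last bullet can be sketched on plain vectors. GELU is x * Phi(x), where Phi is the standard normal CDF. LN standardizes across the features of one example, while BN standardizes each feature across the batch. This is a simplified sketch: it omits the learnable scale and shift, uses only training-time batch statistics for BN, and ignores spatial dimensions. The function names are illustrative.

```python
import math


def relu(x):
    return max(0.0, x)


def gelu(x):
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def standardize(v, eps=1e-5):
    """Subtract the mean and divide by the standard deviation of the list v."""
    mean = sum(v) / len(v)
    var = sum((x - mean) ** 2 for x in v) / len(v)
    return [(x - mean) / math.sqrt(var + eps) for x in v]


def layer_norm(batch):
    """Standardize each example (row) across its own features."""
    return [standardize(row) for row in batch]


def batch_norm(batch):
    """Standardize each feature (column) across the examples in the batch."""
    columns = [standardize(list(col)) for col in zip(*batch)]
    return [list(row) for row in zip(*columns)]


print(relu(-1.0), gelu(-1.0))  # ReLU is exactly 0 here, GELU is slightly negative
print(relu(1.0), gelu(1.0))    # GELU(1) is about 0.8413

batch = [[1.0, 10.0], [3.0, 30.0]]
print(layer_norm(batch))  # each row has roughly zero mean and unit variance
print(batch_norm(batch))  # each column has roughly zero mean and unit variance
```

Unlike BN, the LN output for one example does not depend on the other examples in the batch.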
Beyond image classification: MS COCO
• MS COCO is a large scale dataset that defines other computer vision tasks
• Includes labels for object detection (localization), segmentation, keypoint detection, and captioning
• 118000 training images, 5000 validation images, 41000 test images, and 123000 unlabeled images for object detection
• Not examinable, just letting you know that there is more than just image classification