
Convolutional Neural Network
Which is the dog?

Classification


• Given a feature representation for images, how do we learn a model for distinguishing features from different classes?
[Figure: feature space with a decision boundary separating zebra examples from non-zebra examples]
Slide credit: L. Lazebnik

Classification
• Assign input vector to one of two or more classes
• Input space divided into decision regions separated by decision boundaries
Slide credit: L. Lazebnik

The machine learning framework
• Apply a prediction function to a feature representation of the image to get the desired output:
f([apple image]) = “apple”   f([tomato image]) = “tomato”   f([cow image]) = “cow”
Slide credit: L. Lazebnik

The machine learning framework
y = f(x), where y is the output, f is the prediction function, and x is the image / image feature
• Training: given a training set of labeled examples {(x1,y1), …, (xN,yN)}, estimate the prediction function f by minimizing the prediction error on the training set
• Testing: apply f to a never-before-seen test example x and output the predicted value y = f(x)
Slide credit: L. Lazebnik

The old-school way
[Diagram: Training Images + Training Labels -> Image Features -> Learned model (training); Test Image -> Image Features -> Learned model -> Prediction (testing)]
Slide credit: D. Hoiem and L. Lazebnik

Neural Networks
• Basic building block for composition is a perceptron (Rosenblatt c.1960)
• Linear classifier – vector of weights w and a ‘bias’ b
Example: w = (w1, w2, w3), b = 0.3
The weighted sum w · x + b is passed through an activation function to produce the binary output.
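A minimal sketch of such a perceptron in NumPy, assuming a simple step activation and made-up weight values:

import numpy as np

def perceptron(x, w, b):
    # Weighted sum plus bias, followed by a step activation -> binary output
    return 1 if np.dot(x, w) + b > 0 else 0

w = np.array([0.5, -0.2, 0.1])   # stands in for (w1, w2, w3); values are illustrative
b = 0.3
x = np.array([1.0, 0.0, 2.0])    # a 3-dimensional input
print(perceptron(x, w, b))       # prints 1 or 0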

Binary classifying an image
• Each pixel of the image would be an input.
• So, for a 28 x 28 image, we vectorize it.
• x = 1 x 784
• w is a vector of weights, one per pixel: 784 x 1
• b is a scalar bias per perceptron
• result = xw + b -> (1 x 784)(784 x 1) + b = (1 x 1) + b
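A quick shape check of this computation in NumPy, using random values purely for illustration:

import numpy as np

x = np.random.rand(1, 784)    # a 28 x 28 image, flattened (vectorized) to 1 x 784
w = np.random.rand(784, 1)    # one weight per pixel: 784 x 1
b = 0.3                       # scalar bias for the single perceptron

score = x @ w + b             # (1 x 784)(784 x 1) + b -> (1 x 1)
prediction = int(score > 0)   # threshold the score to get the binary output
print(score.shape, prediction)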

Neural Networks – multiclass
• Add more perceptrons: each one takes the same inputs (x1, x2, ...) and produces its own binary output.

Multi-class classifying an image
• Each pixel of the image would be an input.
• So, for a 28 x 28 image, we vectorize.
• x = 1 x 784
• W is a matrix of weights, one per pixel per perceptron
• W = 784 x 10 (10-class classification)
• b is a bias per perceptron (vector of biases): 1 x 10
• result = xW + b -> (1 x 784)(784 x 10) + b -> (1 x 10) + (1 x 10) = (1 x 10) output vector
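The same shape check for the 10-class case, again with random, illustrative values:

import numpy as np

x = np.random.rand(1, 784)     # flattened 28 x 28 image
W = np.random.rand(784, 10)    # one weight column per perceptron (class)
b = np.random.rand(1, 10)      # one bias per perceptron

scores = x @ W + b             # (1 x 784)(784 x 10) + (1 x 10) -> (1 x 10)
predicted_class = int(np.argmax(scores))   # class with the highest score
print(scores.shape, predicted_class)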

Composition
We attempt to represent complex functions as compositions of smaller functions: the outputs from one perceptron are fed into the inputs of another perceptron.

Layer 1 Layer 2
Sets of layers and the connections (weights) between them define the network architecture.

Hidden Layer 1, Hidden Layer 2
Layers that are in between the input and the output are called hidden layers, because we are going to learn their weights via an optimization process.
• We have formed chains of linear functions.
• We know that linear functions can be combined
• g = f(h(x))
Our composition of functions is really just a single linear function, which is why we insert a non-linear activation between layers.
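A small NumPy check of this point, assuming two stacked linear layers with random weights:

import numpy as np

x  = np.random.rand(1, 784)
W1 = np.random.rand(784, 100); b1 = np.random.rand(1, 100)
W2 = np.random.rand(100, 10);  b2 = np.random.rand(1, 10)

g = (x @ W1 + b1) @ W2 + b2        # two linear layers, chained

W = W1 @ W2                        # the equivalent single layer
b = b1 @ W2 + b2
print(np.allclose(g, x @ W + b))   # True: the chain is still one linear function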

Rectified Linear Unit
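The ReLU is the non-linearity inserted between layers: ReLU(z) = max(0, z), applied element-wise. A one-line sketch:

import numpy as np

def relu(z):
    # Pass positive values through unchanged, zero out negative values
    return np.maximum(0, z)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5])))   # [0.  0.  0.  1.5]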

Images as input to neural networks

Perceptron: every input (pixel) is connected to the output through its own weight, i.e., output = activation(w · x + b).

Filter = ‘local’ perceptron. Also called kernel.

Stride = 1: the filter shifts by one pixel between applications.

Stride = 3: the filter shifts by three pixels between applications, so the output map is smaller.
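A minimal sketch of this sliding-filter (convolution) operation with a configurable stride; the 6 x 6 input and 3 x 3 kernel sizes are illustrative:

import numpy as np

def conv2d(image, kernel, stride=1):
    # Slide the kernel over the image; each position is a 'local perceptron' (dot product)
    H, W = image.shape
    kH, kW = kernel.shape
    out_h = (H - kH) // stride + 1
    out_w = (W - kW) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i*stride:i*stride+kH, j*stride:j*stride+kW]
            out[i, j] = np.sum(patch * kernel)
    return out

image  = np.random.rand(6, 6)
kernel = np.random.rand(3, 3)
print(conv2d(image, kernel, stride=1).shape)   # (4, 4)
print(conv2d(image, kernel, stride=3).shape)   # (2, 2)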

[Figure: the two feature maps produced by applying Filter 1 and Filter 2 to the input]

LeCun’s MNIST CNN architecture (LeNet-5)
Y. LeCun, L. Bottou, Y. Bengio, P. Haffner, “Gradient-based learning applied to document recognition,” Proc. IEEE, 1998.

A Common Architecture: AlexNet
Figure from http://www.mdpi.com/2072-4292/7/11/14680/htm

AlexNet diagram
[Krizhevsky et al. 2012]
Input size: 227 x 227 x 3
Conv1: 11 x 11 x 3 filters, stride 4, 96 filters
Conv2: 5 x 5 x 48 filters, stride 1, 256 filters
Conv3: 3 x 3 x 256 filters, stride 1, 384 filters
Conv4: 3 x 3 x 192 filters, stride 1, 384 filters
Conv5: 3 x 3 x 192 filters, stride 1, 256 filters
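A quick sanity check of the first layer's spatial size using the standard convolution output formula (padding is assumed to be zero for this first layer):

def conv_output_size(in_size, kernel, stride, padding=0):
    # Output size of a convolution along one spatial dimension: (in - k + 2p) // s + 1
    return (in_size - kernel + 2 * padding) // stride + 1

print(conv_output_size(227, kernel=11, stride=4))   # 55 -> first feature map is 55 x 55 x 96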
