
COMP9444
Neural Networks and Deep Learning
1c. Perceptrons

Outline
• Neurons – Biological and Artificial
• Perceptron Learning
• Linear Separability
• Multi-Layer Networks

Textbook, Section 1.2

Structure of a Typical Neuron

[Figure: diagram of a typical neuron, showing the cell body (soma), dendrites, axon and synapses]

Biological Neurons

The brain is made up of neurons (nerve cells), which have
• a cell body (soma)
• dendrites (inputs)
• an axon (outputs)
• synapses (connections between cells)

Synapses can be excitatory or inhibitory, and may change over time.

When the inputs reach some threshold, an action potential (electrical pulse) is sent along the axon to the outputs.

Artificial Neural Networks

(Artificial) Neural Networks are made up of nodes, which have
• input edges, each with some weight
• output edges (with weights)
• an activation level (a function of the inputs)

Weights can be positive or negative, and may change over time (learning).
The input function is the weighted sum of the activation levels of the inputs.
The activation level is a non-linear transfer function g of this input:

    activation_i = g(s_i) = g( Σ_j w_ij x_j )

Some nodes are inputs (sensing), some are outputs (action).

McCulloch & Pitts Model of a Single Neuron

[Figure: a single neuron with inputs x1, x2, weights w1, w2 and bias w0 = −th, feeding a weighted sum Σ through the transfer function g]

    s = w1 x1 + w2 x2 − th
      = w1 x1 + w2 x2 + w0

x1, x2 are inputs
w1, w2 are synaptic weights
th is a threshold
w0 is a bias weight
g is the transfer function

Transfer function

Originally, a (discontinuous) step function was used for the transfer function:

    g(s) = 1, if s ≥ 0
           0, if s < 0

(Later, other transfer functions were introduced, which are continuous and smooth.)
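
As a minimal sketch (not part of the lecture), the McCulloch & Pitts neuron with a step transfer function can be written in a few lines of Python; the function names and the example weights below are illustrative:

    # Minimal sketch of a McCulloch & Pitts neuron with a step transfer function.
    # Function names and example weights are illustrative; only the formulas come from the slides above.

    def step(s):
        """Step transfer function g(s): 1 if s >= 0, else 0."""
        return 1 if s >= 0 else 0

    def neuron_output(x, w, w0):
        """Compute g(s) where s = sum_j w_j * x_j + w0 (bias in place of a threshold)."""
        s = sum(wj * xj for wj, xj in zip(w, x)) + w0
        return step(s)

    # e.g. two inputs, weights w1 = 0.5, w2 = 0.5, bias w0 = -0.75
    print(neuron_output([1, 1], [0.5, 0.5], -0.75))   # 1
    print(neuron_output([1, 0], [0.5, 0.5], -0.75))   # 0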

Linear Separability

Question: what kind of functions can a perceptron compute?

Answer: linearly separable functions

Examples of linearly separable functions:

    AND   w1 = w2 =  1.0,   w0 = −1.5
    OR    w1 = w2 =  1.0,   w0 = −0.5
    NOR   w1 = w2 = −1.0,   w0 =  0.5

Rosenblatt Perceptron

Q: How can we train it to learn a new function?

Perceptron Learning Rule

Adjust the weights as each input is presented.

Recall: s = w1 x1 + w2 x2 + w0

if g(s) = 0 but should be 1:
    wk ← wk + η xk
    w0 ← w0 + η
    so s ← s + η (1 + Σk xk²)

if g(s) = 1 but should be 0:
    wk ← wk − η xk
    w0 ← w0 − η
    so s ← s − η (1 + Σk xk²)

otherwise, weights are unchanged.

(η > 0 is called the learning rate)

Theorem: this will eventually learn to classify the data correctly, as long as they are linearly separable.
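
A sketch of this learning rule in Python (the function name and the choice of starting weights are illustrative, not from the lecture); trained on AND, which the slide above lists as linearly separable, it converges to a valid set of weights:

    # Sketch of the perceptron learning rule above; names and starting weights are illustrative.

    def step(s):
        return 1 if s >= 0 else 0

    def train_perceptron(data, eta=0.1, epochs=100):
        """data: list of ((x1, x2), target) pairs; returns ([w1, w2], w0)."""
        w, w0 = [0.0, 0.0], 0.0
        for _ in range(epochs):
            changed = False
            for x, target in data:
                out = step(w[0] * x[0] + w[1] * x[1] + w0)
                if out == 0 and target == 1:          # output too low: add eta * x
                    w = [wk + eta * xk for wk, xk in zip(w, x)]
                    w0 += eta
                    changed = True
                elif out == 1 and target == 0:        # output too high: subtract eta * x
                    w = [wk - eta * xk for wk, xk in zip(w, x)]
                    w0 -= eta
                    changed = True
            if not changed:                           # every example correctly classified
                break
        return w, w0

    # AND is linearly separable (cf. w1 = w2 = 1.0, w0 = -1.5 above), so this converges
    and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
    print(train_perceptron(and_data))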

Perceptron Learning Example

[Figure: a perceptron with inputs x1, x2, weights w1, w2 and bias w0, which outputs + when w1 x1 + w2 x2 + w0 > 0 and − otherwise]

learning rate η = 0.1

begin with random weights: w1 = 0.2, w2 = 0.0, w0 = −0.1

Training Step 1

Current decision boundary: 0.2 x1 + 0.0 x2 − 0.1 > 0
The 1st point (1,1) is classified as positive but should be negative, so

    w1 ← w1 − η x1 = 0.1
    w2 ← w2 − η x2 = −0.1
    w0 ← w0 − η = −0.2

Training Step 2

Current decision boundary: 0.1 x1 − 0.1 x2 − 0.2 > 0
The 2nd point (2,1) is classified as negative but should be positive, so

    w1 ← w1 + η x1 = 0.3
    w2 ← w2 + η x2 = 0.0
    w0 ← w0 + η = −0.1

Training Step 3

Current decision boundary: 0.3 x1 + 0.0 x2 − 0.1 > 0
The 3rd point (1.5, 0.5) is correctly classified, so no change.
The 4th point (2,2) is classified as positive but should be negative, so

    w1 ← w1 − η x1 = 0.1
    w2 ← w2 − η x2 = −0.2
    w0 ← w0 − η = −0.2

New decision boundary: 0.1 x1 − 0.2 x2 − 0.2 > 0
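
These updates can be checked with a short Python sketch; the class labels (positive for (2,1) and (1.5,0.5), negative for (1,1) and (2,2)) are inferred from the updates shown above:

    # Sketch reproducing the training steps above; the targets are inferred from those steps.
    eta = 0.1
    w1, w2, w0 = 0.2, 0.0, -0.1                  # initial weights from the example

    points = [(1.0, 1.0, 0), (2.0, 1.0, 1), (1.5, 0.5, 1), (2.0, 2.0, 0)]   # (x1, x2, target)

    for x1, x2, target in points:
        out = 1 if w1 * x1 + w2 * x2 + w0 >= 0 else 0
        if out == 0 and target == 1:             # raise the weights
            w1, w2, w0 = w1 + eta * x1, w2 + eta * x2, w0 + eta
        elif out == 1 and target == 0:           # lower the weights
            w1, w2, w0 = w1 - eta * x1, w2 - eta * x2, w0 - eta
        print(f"w1 = {w1:.1f}, w2 = {w2:.1f}, w0 = {w0:.1f}")
    # prints (0.1, -0.1, -0.2), (0.3, 0.0, -0.1), (0.3, 0.0, -0.1), (0.1, -0.2, -0.2)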

Final Outcome

Eventually, all the data will be correctly classified (provided it is linearly separable).

Limitations of Perceptrons

Problem: many useful functions are not linearly separable (e.g. XOR)

[Figure: (a) I1 and I2   (b) I1 or I2   (c) I1 xor I2 — AND and OR can be separated by a line in the (I1, I2) plane, but XOR cannot]

Possible solution:

x1 XOR x2 can be written as: (x1 AND x2) NOR (x1 NOR x2)

Recall that AND, OR and NOR can be implemented by perceptrons.

Multi-Layer Neural Networks

[Figure: a two-layer network computing XOR, built from AND and NOR perceptrons (weights ±1, biases −1.5 and +0.5)]

Problem: How can we train it to learn a new function? (credit assignment)
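
This decomposition can be checked directly, reusing the step-function perceptrons and the AND/NOR weights given earlier (a sketch only; the function names are illustrative):

    # Sketch: XOR built from the AND and NOR perceptrons given earlier in this section.

    def step(s):
        return 1 if s >= 0 else 0

    def AND(x1, x2):
        return step(1.0 * x1 + 1.0 * x2 - 1.5)    # w1 = w2 = 1.0, w0 = -1.5

    def NOR(x1, x2):
        return step(-1.0 * x1 - 1.0 * x2 + 0.5)   # w1 = w2 = -1.0, w0 = 0.5

    def XOR(x1, x2):
        # x1 XOR x2 = (x1 AND x2) NOR (x1 NOR x2)
        return NOR(AND(x1, x2), NOR(x1, x2))

    for x1 in (0, 1):
        for x2 in (0, 1):
            print(x1, x2, XOR(x1, x2))            # outputs 0, 1, 1, 0 on the four input pairs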

Historical Context

In 1969, Minsky and Papert published a book highlighting the limitations of Perceptrons, and lobbied various funding agencies to redirect funding away from neural network research, preferring instead logic-based methods such as expert systems.

It was known as far back as the 1960s that any given logical function could be implemented in a 2-layer neural network with step function activations. But the question of how to learn the weights of a multi-layer neural network based on training examples remained an open problem. The solution, which we describe in the next section, was found in 1976 by Paul Werbos, but did not become widely known until it was rediscovered in 1986 by Rumelhart, Hinton and Williams.