
Data Mining and Machine Learning
Introduction to Artificial Neural Networks
Peter Jančovič

Objectives
 Introduce Artificial Neural Networks (ANNs)
 Feed-forward ANNs – Multi-Layer Perceptrons (MLPs)
 Basic MLP calculations
 Geometric interpretation of MLPs

Artificial Neural Networks
 (Artificial) Neural Networks (NNs) offer another approach to data analysis
 Popularised in the 1980s, with a resurgence in the 2000s
 “Machine learning” (or, most recently, “AI”) is often used synonymously with the use of NNs
 Inspiration for the basic elements of a NN (the artificial neuron) comes from biology, but the analogy stops there
 ANNs are just a computational device for processing patterns – not “artificial brains”

Multi-Layer Perceptron – Feed-Forward Neural Network
[Figure: a feed-forward network in which an input layer (input units) feeds one or more hidden layers (hidden units), which feed an output layer (output units); each node is an artificial neuron]

A simple model of a neuron

A Simple Artificial Neuron
[Figure: a unit u receives inputs i1, i2, i3 from three units, through weights w1,u, w2,u and w3,u]
 Basic idea:
– if the input iu to unit u is big enough, the neuron ‘fires’
– otherwise nothing happens
 How do we calculate the input to u?

The Artificial Neuron (2)
[Figure: units 1, 2 and 3 have inputs i1, i2, i3 and outputs o1, o2, o3, feeding unit u through weights w1,u, w2,u and w3,u]
 Suppose the inputs to units 1, 2 and 3 are i1, i2 and i3, and that these are also the outputs o1, o2 and o3
 Then the input to u is:
$$i_u = o_1 w_{1,u} + o_2 w_{2,u} + o_3 w_{3,u}$$
 In general, for an artificial neuron u that receives input from N units, the input to unit u is:
$$i_u = \sum_{n=1}^{N} o_n w_{n,u}$$
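As a quick illustration, this weighted sum is a one-liner in plain Python (a minimal sketch; the function name and the numbers are just for illustration):

```python
def unit_input(outputs, weights):
    """Input to unit u: the weighted sum of the source units' outputs."""
    return sum(o * w for o, w in zip(outputs, weights))

# Unit u receives outputs o1, o2, o3 through weights w1,u, w2,u, w3,u
print(unit_input([0.5, -1.0, 2.0], [0.2, 0.4, 0.1]))  # 0.1 - 0.4 + 0.2 = -0.1
```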

The sigmoid activation function
[Figure: units 1, 2 and 3 with outputs o1, o2, o3 feed unit u through weights w1,u, w2,u and w3,u]
 The activation function defines the output of a neuron – whether the neuron should “fire”
 A typical activation function is the sigmoid function g:
$$g(x) = \frac{1}{1 + e^{-kx}}$$
 The output of u is then:
$$o_u = g(i_u) = g\left(\sum_{n=1}^{N} o_n w_{n,u}\right)$$

Activation functions
 Linear activation function (output equals input):
$$g(x) = x$$
 Sigmoid activation function:
$$g(x) = \frac{1}{1 + e^{-kx}}$$
 The sigmoid is a ‘soft’ threshold function
[Figure: plot of the sigmoid activation function]
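Both activation functions translate directly into code; a minimal sketch (the steepness parameter k defaults to 1, as in the calculations that follow):

```python
import math

def linear(x):
    """Linear activation: output equals input."""
    return x

def sigmoid(x, k=1.0):
    """Sigmoid activation: a 'soft' threshold around x = 0."""
    return 1.0 / (1.0 + math.exp(-k * x))

print(sigmoid(0))   # 0.5, exactly at the threshold
print(sigmoid(6))   # ~0.998, the unit 'fires'
print(sigmoid(-6))  # ~0.002, the unit stays off
```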

The ‘bias’
 As described, the neuron will ‘fire’ only if its input is greater than 0
 We can change the point at which it fires by introducing a bias
 This is an additional input unit whose input is fixed at 1
[Figure: unit u with inputs i1, i2, i3 and a bias unit fixed at 1, connected through weights w1,u, w2,u, w3,u and wb,u]

How the bias works…
 According to the sigmoid activation function, the artificial neuron u ‘fires’ if the input to u is greater than or equal to 0, i.e. if:
$$i_u = o_1 w_{1,u} + o_2 w_{2,u} + o_3 w_{3,u} + w_{b,u} \ge 0$$
 But this happens only if
$$i_1 w_{1,u} + i_2 w_{2,u} + i_3 w_{3,u} \ge -w_{b,u}$$
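In code, the bias is simply one more term in the weighted sum; a minimal sketch of the firing test (names and values are illustrative):

```python
def fires(inputs, weights, bias_weight):
    """The unit 'fires' when the biased weighted sum is >= 0,
    i.e. when the plain weighted sum reaches -bias_weight."""
    i_u = sum(i * w for i, w in zip(inputs, weights)) + 1 * bias_weight
    return i_u >= 0

print(fires([1.0, 0.5, 0.2], [0.3, 0.4, 0.1], bias_weight=-0.4))  # True: 0.52 >= 0.4
```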

Example (2D)
 Suppose u has a sigmoid activation function. Then, for these values of the weights, u will ‘fire’ if:
$$i_u = 3x + y - 2 \ge 0, \quad \text{i.e. } y \ge -3x + 2$$
[Figure: unit u with inputs x, y and a bias unit fixed at 1, with weights 3, 1 and -2]

Example (continued)
[Figure: the decision boundary y = -3x + 2 in the (x, y) plane, crossing the axes at y = 2 and x = 2/3; the point [2,2]T lies on the ‘firing’ side of the boundary, [-2,-2]T on the other side; inset: unit u with weights 3, 1, -2]
A single artificial neuron defines a linear decision boundary

Example (continued)
 Assume:
– Linear activation functions for units u1, u2 and u3
– Sigmoid activation function for u
 Case 1: input to u1 is 2 and input to u2 is 2, then:
– Input iu to u is 2 × 3 + 2 × 1 + 1 × (-2) = 6
– Hence output ou from u is g(6) = 0.998
 Case 2: input to u1 is -2 and input to u2 is -2, then:
– Input iu to u is -2 × 3 + -2 × 1 + 1 × (-2) = -10
– Hence output ou from u is g(-10) = 4.54 × 10⁻⁵ ≈ 0
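Both cases can be checked directly; a short sketch using the sigmoid defined earlier:

```python
import math

def g(x):
    return 1.0 / (1.0 + math.exp(-x))

def u(x, y):
    """The example unit: weights 3 and 1, bias weight -2."""
    return g(3 * x + 1 * y + 1 * (-2))

print(u(2, 2))    # g(6)   ~ 0.998: [2,2]T is on the 'firing' side of y = -3x + 2
print(u(-2, -2))  # g(-10) ~ 4.5e-05: [-2,-2]T is on the other side
```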

Example 2
[Figure: unit u with inputs x, y and a bias unit 1, with weights 2, -1 and -1; the boundary crosses the axes at x = 1/2 and y = -1]
If $i_u = 2x - y - 1 \ge 0$, then $y \le 2x - 1$

Combining 2 Artificial Neurons
[Figure: the two units side by side – u1 with inputs x, y, 1 and weights 3, 1, -2 (boundary y = -3x + 2, axis crossings y = 2, x = 2/3), and u2 with inputs x, y, 1 and weights 2, -1, -1 (boundary y = 2x - 1, axis crossings x = 1/2, y = -1)]

Combining neurons – artificial neural networks
[Figure: a network combining the two units – inputs x, y and a bias unit 1 feed u1 (weights 3, 1, -2) and u2 (weights 2, -1, -1); the outputs of u1 and u2, together with a bias unit, feed the output unit u with weights 20, -20 and -2]

Combining neurons (continued)
[Figure: the ‘firing region’ of the combined network – the region of the (x, y) plane above both boundary lines, where u1 fires and u2 does not]

Combining neurons (continued)
 Input to u1 is 3x + y - 2
 Input to u2 is 2x - y - 1
 When x = 3, y = 0:
– Input iu1 to u1 is 7, input iu2 to u2 is 5
– Output ou1 from u1 is 1, output ou2 from u2 is 0.993
– Input iu to u is 1 × 20 + 0.993 × (-20) - 2 = -1.88
– Output ou from u is g(-1.88) = 0.13

Outputs

   x      y      ou
   3      0     0.13
   0.5    2     1.00
   0.5   -2     0.00
  -1      0     0.06
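A short sketch reproduces this table from the network above (u1 with weights 3, 1, -2; u2 with weights 2, -1, -1; output unit u with weights 20, -20, -2):

```python
import math

def g(x):
    return 1.0 / (1.0 + math.exp(-x))

def network(x, y):
    o1 = g(3 * x + y - 2)            # u1: fires above the boundary y = -3x + 2
    o2 = g(2 * x - y - 1)            # u2: fires below the boundary y = 2x - 1
    return g(20 * o1 - 20 * o2 - 2)  # output unit u

for x, y in [(3, 0), (0.5, 2), (0.5, -2), (-1, 0)]:
    print(f"{x:5} {y:5} {network(x, y):5.2f}")  # 0.13, 1.00, 0.00, 0.06
```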

Single hidden layer Multi-Layer Perceptron (MLP)
[Figure: I units in the input layer, connected by an H × I weight matrix W1 to H units in the hidden layer, connected by an O × H weight matrix W2 to O units in the output layer]

Single hidden layer MLP
 Can characterize arbitrary convex regions
 Defines the region using linear decision boundaries

Two hidden layer MLP
[Figure: I units in the input layer, connected by an H1 × I weight matrix W1 to H1 units in the first hidden layer, by an H2 × H1 weight matrix W2 to H2 units in the second hidden layer, and by an O × H2 weight matrix W3 to O units in the output layer]

Two hidden layer MLP
 An MLP with two hidden layers can characterize arbitrary shapes
 First hidden layer characterises convex regions
 Second hidden layer combines these convex regions
 In theory, there is no advantage in having more than two hidden layers
 In practice, “deep” neural networks with multiple hidden layers give the best performance (e.g. in speech recognition)

Formal definition: MLP with a single hidden layer
 A single hidden layer MLP consists of:
1. A set of I input units, and for each input unit i an activation function gi (typically linear)
2. A set of H hidden units, and for each hidden unit h an activation function gh (typically sigmoid)
3. A set of O output units, and for each output unit o an activation function go
4. An H × I weight matrix W1, which maps the outputs of the input units to the inputs of the hidden units
5. An O × H weight matrix W2, which maps the outputs of the hidden units to the inputs of the output units
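This definition translates directly into a forward pass. A minimal NumPy sketch (function and variable names are illustrative, using the typical activation choices listed above):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mlp_forward(x, W1, W2):
    """Forward pass for a single hidden layer MLP with linear input units,
    sigmoid hidden units and linear output units."""
    o_input = x                       # linear input activation: outputs = inputs
    o_hidden = sigmoid(W1 @ o_input)  # W1 is H x I: input outputs -> hidden inputs
    return W2 @ o_hidden              # W2 is O x H: hidden outputs -> output inputs
```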

Example
 Inputs: i(i1) = 0.9, i(i2) = -0.5
 2 unit input layer, linear activation (I = 2)
 Single 3 unit hidden layer, sigmoid activation (H = 3)
 2 unit output layer, linear activation (O = 2)
 A 3 × 2 weight matrix W1 between input and hidden layer
 A 2 × 3 weight matrix W2 between hidden and output layer

Example (continued)
 Weight matrices:
$$W_1 = \begin{pmatrix} 2.6 & -1.7 \\ 0.2 & 1.0 \\ -4.0 & 2.5 \end{pmatrix}, \quad W_2 = \begin{pmatrix} 1.0 & -0.5 & 1.0 \\ 0.5 & 0.6 & 1.0 \end{pmatrix}$$
 Input:
$$\mathbf{i} = \begin{pmatrix} 0.9 \\ -0.5 \end{pmatrix}$$
 Output from the first layer (linear activation):
$$\mathbf{o} = \begin{pmatrix} 0.9 \\ -0.5 \end{pmatrix}$$

Example (continued)
 Inputs to the hidden layer:
$$i(h_1) = w^1_{11} o_1 + w^1_{12} o_2 = 2.6 \times 0.9 + (-1.7) \times (-0.5) = 2.34 + 0.85 = 3.19$$
$$i(h_2) = w^1_{21} o_1 + w^1_{22} o_2 = 0.2 \times 0.9 + 1.0 \times (-0.5) = 0.18 - 0.5 = -0.32$$
$$i(h_3) = w^1_{31} o_1 + w^1_{32} o_2 = (-4.0) \times 0.9 + 2.5 \times (-0.5) = -3.6 - 1.25 = -4.85$$
 In matrix notation:
$$\mathbf{i}(h) = W_1 \mathbf{o}$$
 Outputs from the hidden layer:
$$o(h_1) = \frac{1}{1 + e^{-3.19}} = 0.96, \quad o(h_2) = \frac{1}{1 + e^{0.32}} = 0.42, \quad o(h_3) = \frac{1}{1 + e^{4.85}} = 0.008$$

Example (continued)
 Inputs to the output layer:
$$i(o_1) = w^2_{11} o(h_1) + w^2_{12} o(h_2) + w^2_{13} o(h_3) = 1 \times 0.96 + (-0.5) \times 0.42 + 1 \times 0.008 = 0.96 - 0.21 + 0.008 = 0.758$$
$$i(o_2) = w^2_{21} o(h_1) + w^2_{22} o(h_2) + w^2_{23} o(h_3) = 0.5 \times 0.96 + 0.6 \times 0.42 + 1 \times 0.008 = 0.48 + 0.252 + 0.008 = 0.740$$
 In matrix notation:
$$\mathbf{i}(o) = W_2 \mathbf{o}(h)$$
 Linear output unit activation:
$$o(o_1) = 0.758, \quad o(o_2) = 0.740$$
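The whole worked example can be checked numerically with the forward-pass sketch from the formal definition, using W1 and W2 as reconstructed above:

```python
import numpy as np

W1 = np.array([[ 2.6, -1.7],
               [ 0.2,  1.0],
               [-4.0,  2.5]])
W2 = np.array([[ 1.0, -0.5,  1.0],
               [ 0.5,  0.6,  1.0]])
x = np.array([0.9, -0.5])

h = 1.0 / (1.0 + np.exp(-(W1 @ x)))  # hidden inputs [3.19, -0.32, -4.85]
print(h)                             # -> [0.96, 0.42, 0.008]
print(W2 @ h)                        # -> [0.758, 0.740]
```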

Summary
 Introduction to neural networks
 Definition of an ‘artificial neuron’
 Activation functions – linear and sigmoid
 Linear boundary defined by a single neuron
 Convex region defined by a single-hidden-layer MLP
 Two-hidden-layer MLPs
 Forward propagation in an MLP (calculation)