Introduction
The Perceptron Algorithm: A Geometrical Proof
Marc de Kamps
Institute for Artificial and Biological Intelligence, University of Leeds

The Perceptron: What is the definition?
We introduced the perceptron:
h = w1x1 + w2x2 + w3x3 − θ
if h > 0 then o = 1
if h ≤ 0 then o = 0
h is sometimes called the local field
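As a minimal sketch (not from the slides), the definition translates directly into code; the weight and input values below are illustrative:

    import numpy as np

    def perceptron_output(w, theta, x):
        """Perceptron output: o = 1 if the local field h > 0, else 0."""
        h = np.dot(w, x) - theta          # h = w1*x1 + w2*x2 + w3*x3 - theta
        return 1 if h > 0 else 0

    # Illustrative values only.
    w = np.array([0.5, -1.0, 2.0])
    x = np.array([1.0, 1.0, 1.0])
    print(perceptron_output(w, theta=0.3, x=x))   # -> 1, since h = 1.2 > 0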

The Perceptron: What is the definition?
Data set: linearly separable
Perceptron Algorithm: will always converge on weights, bias
Weights and bias will classify the data set correctly
Figure: a linearly separable data set

Perceptron Algorithm: Proof
We introduced the perceptron:
h = w1x1 + w2x2 + w3x3 − θ
if h > 0 then o = 1
if h ≤ 0 then o = 0
h is sometimes called the local field

Perceptron Algorithm: Get Rid of the Bias
Consider a 2D classification problem (weights w1, w2):
Find two weights and one bias.
But note:
w1x1 + w2x2 − θ
is equivalent to:
w1x1 + w2x2 + w3x3
if x3 ≡ 1 and w3 ≡ −θ

Perceptron Algorithm: Get Rid of the Bias
In the data set this amounts to adding an extra column, e.g.:
[Table: the original data points (x1, x2, classification) next to the augmented data points (x1, x2, x3 = 1, classification)]
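A small sketch of this augmentation (the data values are made up, not the ones from the slide's table):

    import numpy as np

    # Hypothetical 2D patterns; each row is (x1, x2).
    X = np.array([[0.0, 1.0],
                  [2.0, -1.0],
                  [1.0, 3.0]])

    # Add the extra column x3 = 1 to every pattern.
    X_aug = np.hstack([X, np.ones((len(X), 1))])

    # With w3 = -theta, the 3D dot product equals the original 2D local field.
    w, theta = np.array([3.0, 1.0]), 2.0
    w_aug = np.append(w, -theta)
    assert np.allclose(X @ w - theta, X_aug @ w_aug)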

Perceptron Algorithm: Get Rid of the Bias
Effectively: a new data set, 3D not 2D
the data points hover at z = 1
a plane through the origin separates them (hard to see in a figure, but easy to verify algebraically):
3x + y − z = 0, which at z = 1 gives y = −3x + 1
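A quick numerical check of this algebra (using the plane from the example above):

    import numpy as np

    w = np.array([3.0, 1.0, -1.0])      # the plane 3x + y - z = 0 through the origin
    xs = np.linspace(-2.0, 2.0, 5)
    # Points on the line y = -3x + 1, lifted to z = 1.
    pts = np.stack([xs, -3.0 * xs + 1.0, np.ones_like(xs)], axis=1)
    assert np.allclose(pts @ w, 0.0)    # every such point lies on the plane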
The Perceptron Algorithm: Linearly Classifiable Dataset

Assume that there are two classes of input patterns in an N-dimensional space: C1 and C2. Also assume that a set of weights w⃗∗ exists, such that for all patterns ⃗x ∈ C1:
w⃗∗ · ⃗x < 0
and for all patterns ⃗y ∈ C2:
w⃗∗ · ⃗y ≥ 0

The Perceptron Algorithm

1 Start with a random set of weights w⃗0.
2 Pick an arbitrary pattern ⃗x ∈ C1 ∪ C2.
3 If ⃗x ∈ C1 and ⃗x · w⃗ < 0, go to 2. Else w⃗ → w⃗ − λ⃗x. Go to 2.
4 If ⃗x ∈ C2 and ⃗x · w⃗ ≥ 0, go to 2. Else w⃗ → w⃗ + λ⃗x. Go to 2.

The Perceptron Theorem states that this algorithm will hit the 'Else' clauses only a finite number of times if and only if the data set is linearly classifiable. 0 < λ ≤ 1 is the learning rate.

Perceptron Algorithm: Leave Only One Class

Construct a single class C as follows:
Start with C = C2
Take a pattern ⃗x from C1
Replace it by −⃗x and add it to C
Remove the original pattern from C1
Repeat until C1 is empty

Simplified Problem
Given that we have a set of input vectors, and given the existence of a plane through the origin such that all input vectors of the class C are on one side of the plane, find such a plane. That plane is also the solution of our original classification problem of separating the two classes C1 and C2.

The Simplified Perceptron Algorithm

1 Start with a random set of weights w⃗0.
2 Pick an arbitrary pattern ⃗x ∈ C.
3 If ⃗x ∈ C and ⃗x · w⃗ ≥ 0, go to 2. Else w⃗ → w⃗ + λ⃗x. Go to 2.

The Perceptron Theorem states that this algorithm will hit the 'Else' clause only a finite number of times if and only if the data set is linearly classifiable. 0 < λ ≤ 1 is the learning rate.

Normalizing:
Normalizing does not alter the classification
It amounts to multiplying w⃗ · ⃗x by a positive constant
Patterns are put on the unit sphere
But they are still on the same side of the plane
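A runnable sketch of the whole pipeline so far: build a linearly classifiable toy set, fold C1 into a single class by sign-flipping, normalize, and run the simplified algorithm. The data generation, the "teacher" weights w_true, and picking the first misclassified pattern (rather than an arbitrary one) are illustrative assumptions, not part of the slides:

    import numpy as np

    rng = np.random.default_rng(0)

    # A linearly classifiable toy problem: a hidden plane through the origin.
    w_true = rng.normal(size=3)            # illustrative "teacher" weights
    X = rng.normal(size=(100, 3))
    C1 = X[X @ w_true < 0]                 # patterns with w_true . x < 0
    C2 = X[X @ w_true >= 0]                # patterns with w_true . x >= 0

    # Leave only one class: negate every C1 pattern and pool with C2.
    C = np.vstack([-C1, C2])
    # Normalize: put every pattern on the unit sphere.
    C = C / np.linalg.norm(C, axis=1, keepdims=True)

    # The Simplified Perceptron Algorithm.
    w = rng.normal(size=3)                 # step 1: random initial weights
    lam = 1.0                              # learning rate, 0 < lam <= 1
    updates = 0
    while (C @ w < 0).any():               # any pattern on the wrong side?
        w = w + lam * C[C @ w < 0][0]      # step 3: the 'Else' clause
        updates += 1

    print(f"converged after {updates} weight updates")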
Perceptron Theorem

We now have a single class C of patterns x⃗i with |x⃗i| = 1. We assume the existence of a set of weights w⃗∗ such that for each ⃗x ∈ C:
⃗x · w⃗∗ ≥ δ > 0
w⃗∗ is a solution to our classification problem that we do not know, but that we know exists. (Without loss of generality: |w⃗∗| = 1.)
We will look at what happens to
w⃗∗ · w⃗n / |w⃗n|
during the execution of the Perceptron Algorithm.

Perceptron Theorem
Strategy: assume we have had n misclassifications so far. The current set of weights is w⃗n, reflecting that the weights have been updated n times. What happens to:
w⃗∗ · w⃗n / |w⃗n|
We will show that each weight change increases the numerator by at least a fixed amount while the denominator grows more slowly, so the quantity cannot stay below 1 forever.
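To see this in action, a small experiment (illustrative, not from the slides) that records the quantity after every update; w_star stands in for the unknown-but-existing solution w⃗∗:

    import numpy as np

    rng = np.random.default_rng(1)
    w_star = rng.normal(size=3)
    w_star /= np.linalg.norm(w_star)             # |w*| = 1, as in the proof

    # Patterns flipped onto the positive side of w_star, then normalized.
    X = rng.normal(size=(200, 3))
    signs = np.where(X @ w_star >= 0, 1.0, -1.0)
    C = signs[:, None] * X
    C /= np.linalg.norm(C, axis=1, keepdims=True)

    w = rng.normal(size=3)
    track = []
    while (C @ w < 0).any():
        w = w + C[C @ w < 0][0]                  # one update per misclassification
        track.append(w_star @ w / np.linalg.norm(w))

    # The recorded values are lower-bounded by sqrt(n) * delta, so the loop must stop.
    print(track[-5:])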

Perceptron Theorem
Observe that:
w⃗∗ · w⃗n / |w⃗n|
is nothing but cos(w⃗∗, w⃗n), so ≤ 1.
Observe the numerator under a weight change:
w⃗∗ · w⃗n → w⃗∗ · w⃗n+1
Denominator:
w⃗n · w⃗n → w⃗n+1 · w⃗n+1

Numerator

For an input pattern ⃗x:
w⃗∗ · w⃗n+1 = w⃗∗ · w⃗n + w⃗∗ · ⃗x
We know that:
w⃗∗ · ⃗x ≥ δ > 0
So the numerator increases with every step by at least a fixed amount δ:
w⃗∗ · w⃗n+1 ≥ w⃗∗ · w⃗n + δ

Denominator
For an input pattern ⃗x:
w⃗n+1 · w⃗n+1 = |w⃗n|² + 2⃗x · w⃗n + ⃗x · ⃗x
Since w⃗n · ⃗x < 0 (Why?) and ⃗x · ⃗x = 1 for normalized patterns:
w⃗n+1 · w⃗n+1 < |w⃗n|² + 1

Conclusion

Since after the n-th weight update w⃗∗ · w⃗n ≥ nδ, and (taking w⃗0 = 0) |w⃗n|² < n, it follows that:
w⃗∗ · w⃗n / |w⃗n| > nδ / √n = √n δ
i.e. the quantity grows at least like √n δ as the updates accumulate. This means there can only be a finite number of steps, or cos(w⃗∗, w⃗n) > 1, which is a contradiction.
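Written out as one chain (a sketch, taking w⃗0 = 0 and λ = 1), the argument also yields an explicit bound on the number of updates n, which is implicit in the slides:

    \[
      \vec{w}^{*}\cdot\vec{w}_{n} \ge n\delta,
      \qquad
      |\vec{w}_{n}|^{2} < n
      \;\Longrightarrow\;
      1 \ge \cos(\vec{w}^{*},\vec{w}_{n})
        = \frac{\vec{w}^{*}\cdot\vec{w}_{n}}{|\vec{w}_{n}|}
        > \frac{n\delta}{\sqrt{n}} = \sqrt{n}\,\delta
      \;\Longrightarrow\;
      n < \frac{1}{\delta^{2}}.
    \]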