08_hard-margin-SVM
Qiuhong Ke
Hard-margin Support Vector Machines
COMP90051 Statistical Machine Learning
Copyright: University of Melbourne
Before we start…
About me
• 2015.02-2018.04: PhD at UWA
• 2018.05-2019.12: Post-doc at MPII
• From 2020.01: Lecturer at UniMelb
• Research: Action recognition and prediction using machine learning
• Contact: qiuhong. .com; qiuhong. .au
Outline
• Linear Classifier
• Deriving Hard-margin SVM Objective
• SVM Objective as Regularised Loss Function
Linear separability
Form
• Line
• Plane
• Higher-dimensional features: hyperplane
Formula
$y = f(\mathbf{x}) = \sum_{i=1}^{d} w_i x_i + b = \mathbf{w}^T\mathbf{x} + b$
• $\mathbf{w}$: $d$-dimensional weight vector (column)
• $\mathbf{x}$: $d$-dimensional feature vector (column)
• $b$: bias
• $T$: transpose
$\mathbf{w}^T\mathbf{x} = \|\mathbf{w}\|\,\|\mathbf{x}\|\cos\theta$
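As a quick illustration of the formula above, here is a minimal NumPy sketch (with made-up weights, bias and feature values) that evaluates $f(\mathbf{x}) = \mathbf{w}^T\mathbf{x} + b$:

```python
import numpy as np

# Made-up d = 3 example: weight vector, bias, and one feature vector.
w = np.array([0.5, -1.0, 2.0])   # d-dimensional weight vector
b = 0.1                          # bias
x = np.array([1.0, 3.0, 0.5])    # d-dimensional feature vector

score = w @ x + b                # same as sum_i w_i * x_i + b
print(score)                     # the classifier's decision is based on this score
```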
Recap of Perceptron
$y = +1$, if $f(\mathbf{x}) = \sum_{i=1}^{d} w_i x_i + b \ge 0$
$y = -1$, if $f(\mathbf{x}) = \sum_{i=1}^{d} w_i x_i + b < 0$

Example: classify angry birds
[Figure: two cartoon birds, one of each class]
Source: Papachan (CC BY), https://www.sketchport.com/drawing/4524248689278976/bomb-and-chuck

Use perceptron as the classifier
Perceptron: all boundaries are equally good, as long as they separate the classes perfectly.

One possible (bad) decision boundary
[Figure: a decision boundary $f(\mathbf{x}) = 0$ separating the regions $f(\mathbf{x}) < 0$ and $f(\mathbf{x}) > 0$]
Which prediction is more confident: the one for point A or the one for point B?
A slight change to the decision boundary or to the data points can change the prediction.
Data points closer to the decision boundary receive less confident predictions.
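A minimal numeric sketch of this point, using made-up weights and two made-up points (one close to the boundary, one far from it); the distance formula $|f(\mathbf{x})|/\|\mathbf{w}\|$ is derived later in this lecture:

```python
import numpy as np

# Hypothetical boundary and two hypothetical points.
w, b = np.array([2.0, -1.0]), 0.5
x_A = np.array([0.0, 0.6])     # close to the boundary
x_B = np.array([3.0, -2.0])    # far from the boundary

for name, x in [("A", x_A), ("B", x_B)]:
    score = w @ x + b
    dist = abs(score) / np.linalg.norm(w)   # Euclidean distance to the boundary
    print(name, round(score, 2), round(dist, 2))

# A's score is very close to 0, so a small change to the boundary (or to x_A)
# flips its predicted class; B's prediction is far more confident.
```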
Another possible (bad) decision boundary
[Figure: another decision boundary $f(\mathbf{x}) = 0$ with regions $f(\mathbf{x}) < 0$ and $f(\mathbf{x}) > 0$]
Good decision boundary
Goal of SVM
[Figure: a decision boundary with a large margin ☺]
Margin: 2× the minimum distance between the boundary and the data points
Small-margin decision boundary
[Figure: a decision boundary with a small margin]
[Figure: another decision boundary, also with a small margin]
SVM
[Figure: a linear classifier with a maximum-margin decision boundary ☺]
Goal: find the maximum-margin decision boundary (robust and confident prediction).
Hard-margin SVM: all points are perfectly classified and lie on or outside the margin (hard constraint).
Soft-margin SVM? Next lecture.
Margin and weights
SVM decision boundary: $f(\mathbf{x}) = \mathbf{w}^T\mathbf{x} + b = 0$
SVM goal: find the parameters of the maximum-margin decision boundary.
Margin: 2× the minimum distance between the boundary and the data points.
What is the relationship between the parameters and the margin?
Weight vector is orthogonal to the decision boundary
[Figure: weight vector $\mathbf{w}$ and two points $\mathbf{x}_1$, $\mathbf{x}_2$ on the decision boundary $f(\mathbf{x}) = \mathbf{w}^T\mathbf{x} + b = 0$]
For any two points $\mathbf{x}_1$ and $\mathbf{x}_2$ on the boundary:
$\mathbf{w}^T\mathbf{x}_1 + b = 0$
$\mathbf{w}^T\mathbf{x}_2 + b = 0$
Subtracting: $\mathbf{w}^T(\mathbf{x}_1 - \mathbf{x}_2) = 0$, i.e. $\|\mathbf{w}\|\,\|\mathbf{x}_1 - \mathbf{x}_2\|\cos\theta = 0$
So $\mathbf{w}$ is orthogonal to every direction lying in the decision boundary.
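A tiny sketch checking this numerically, with a made-up weight vector and bias and two points chosen to lie on the boundary:

```python
import numpy as np

# Made-up boundary f(x) = x_1 + 2*x_2 - 4 = 0.
w, b = np.array([1.0, 2.0]), -4.0

x1 = np.array([4.0, 0.0])   # on the boundary: 1*4 + 2*0 - 4 = 0
x2 = np.array([0.0, 2.0])   # on the boundary: 1*0 + 2*2 - 4 = 0

print(w @ (x1 - x2))        # 0.0: w is orthogonal to the direction x1 - x2
```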
Margin hyperplanes
[Figure: decision boundary $\mathbf{w}^T\mathbf{x} + b = 0$ between the margin hyperplanes $\mathbf{w}^T\mathbf{x} + b = +b_0$ (positive side) and $\mathbf{w}^T\mathbf{x} + b = -b_0$ (negative side), each a distance $\gamma^*$ from the boundary, so Margin $= 2\gamma^*$]
Margin: 2× the minimum distance between the boundary and the data points.
Positive margin hyperplane: $f(\mathbf{x}) = +b_0$
Negative margin hyperplane: $f(\mathbf{x}) = -b_0$
$(b_0 > 0)$
Distance for positive class
[Figure: a point $\mathbf{x}_+$ on the positive margin hyperplane $f(\mathbf{x}) = +b_0$ and its projection $\mathbf{x}'_+$ onto the decision boundary $f(\mathbf{x}) = 0$]
$\mathbf{w}^T\mathbf{x}_+ + b = b_0$
$\mathbf{w}^T\mathbf{x}'_+ + b = 0$
Subtracting: $\mathbf{w}^T(\mathbf{x}_+ - \mathbf{x}'_+) = b_0$, i.e. $\|\mathbf{w}\|\,\|\mathbf{x}_+ - \mathbf{x}'_+\|\cos\theta = b_0$
Since $\mathbf{x}_+ - \mathbf{x}'_+$ is parallel to $\mathbf{w}$, $\cos\theta = 1$, so the distance is
$\gamma^* = \|\mathbf{x}_+ - \mathbf{x}'_+\| = \dfrac{b_0}{\|\mathbf{w}\|}$
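The same result can be checked numerically. This sketch uses made-up values of $\mathbf{w}$, $b$ and $b_0$, constructs a point on the positive margin hyperplane, projects it back onto the boundary, and compares the distance with $b_0/\|\mathbf{w}\|$:

```python
import numpy as np

# Made-up parameters (so ||w|| = 5).
w, b, b0 = np.array([3.0, 4.0]), -6.0, 2.0

x0 = np.array([2.0, 0.0])              # on the boundary: w @ x0 + b = 0
x_plus = x0 + b0 * w / (w @ w)         # shifted along w so that w @ x_plus + b = b0

# Project x_plus back onto the boundary w^T x + b = 0.
x_proj = x_plus - (w @ x_plus + b) / (w @ w) * w

print(np.linalg.norm(x_plus - x_proj))     # 0.4
print(b0 / np.linalg.norm(w))              # 0.4 = gamma*
```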
Distance for negative class
[Figure: a point $\mathbf{x}_-$ on the negative margin hyperplane $f(\mathbf{x}) = -b_0$ and its projection $\mathbf{x}'_-$ onto the decision boundary $f(\mathbf{x}) = 0$]
$\mathbf{w}^T\mathbf{x}_- + b = -b_0$
$\mathbf{w}^T\mathbf{x}'_- + b = 0$
Subtracting: $\mathbf{w}^T(\mathbf{x}_- - \mathbf{x}'_-) = -b_0$, i.e. $\|\mathbf{w}\|\,\|\mathbf{x}_- - \mathbf{x}'_-\|\cos\theta = -b_0$
Again the distance is $\gamma^* = \|\mathbf{x}_- - \mathbf{x}'_-\| = \dfrac{b_0}{\|\mathbf{w}\|}$
Geometric Margin
[Figure: decision boundary $\mathbf{w}^T\mathbf{x} + b = 0$ and margin hyperplanes $\mathbf{w}^T\mathbf{x} + b = \pm b_0$, each a distance $\gamma^* = \dfrac{b_0}{\|\mathbf{w}\|}$ from the boundary]
$\gamma_i$: Euclidean distance from data point $i$ to the hyperplane.
If $y_i = +1$: $\gamma_i = \dfrac{\mathbf{w}^T\mathbf{x}_i + b}{\|\mathbf{w}\|}$
If $y_i = -1$: $\gamma_i = -\dfrac{\mathbf{w}^T\mathbf{x}_i + b}{\|\mathbf{w}\|}$
Combined: $\gamma_i = \dfrac{y_i(\mathbf{w}^T\mathbf{x}_i + b)}{\|\mathbf{w}\|}$
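A short sketch computing $\gamma_i$ for two labelled points. The boundary matches the one used in the example further below; the point coordinates are made up for illustration:

```python
import numpy as np

# Boundary f(x) = 3*x_1 + 2*x_2 - 12 = 0; point coordinates are made up.
w, b = np.array([3.0, 2.0]), -12.0

points = [(np.array([5.0, 4.0]), +1),   # a positive example
          (np.array([1.0, 1.0]), -1)]   # a negative example

for x_i, y_i in points:
    gamma_i = y_i * (w @ x_i + b) / np.linalg.norm(w)
    print(y_i, round(gamma_i, 3))       # positive gamma_i => correctly classified
```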
Functional margin
For data point $i$: $\hat{\gamma}_i = y_i(\mathbf{w}^T\mathbf{x}_i + b)$, where $f(\mathbf{x}) = \mathbf{w}^T\mathbf{x} + b$.
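A one-line computation contrasting the functional margin with the geometric margin, for the same hypothetical point as in the previous sketch:

```python
import numpy as np

# Same made-up boundary and point as in the previous sketch.
w, b = np.array([3.0, 2.0]), -12.0
x_i, y_i = np.array([5.0, 4.0]), +1

functional = y_i * (w @ x_i + b)              # functional margin
geometric = functional / np.linalg.norm(w)    # geometric margin
print(functional, round(geometric, 3))

# Doubling w and b doubles the functional margin but leaves the geometric
# margin unchanged -- the issue examined on the following slides.
```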
Example
Functional margin $\hat{\gamma}_i = y_i(\mathbf{w}^T\mathbf{x}_i + b)$ indicates the confidence and correctness of the prediction for different points given the same boundary.
[Figure: boundary $f(\mathbf{x}) = 3x_1 + 2x_2 - 12 = 0$ with points A, B, C, D, E; $\hat{\gamma}_A < \hat{\gamma}_B$, $\hat{\gamma}_C < \hat{\gamma}_D$, $\hat{\gamma}_E < 0$]

Problem of functional margin
Rescaling $\mathbf{w}$ and $b$ changes the functional margin value:
• For $f(\mathbf{x}) = 3x_1 + 2x_2 - 12 = 0$: $\hat{\gamma}_A = 4$, $\hat{\gamma}_B = 2$
• For $f(\mathbf{x}) = 6x_1 + 4x_2 - 24 = 0$ (the same boundary): $\hat{\gamma}_A = 8$, $\hat{\gamma}_B = 4$
How about the geometric margin?

Maximum margin objective
SVM objective: $\max_{\mathbf{w},\,b}$ margin of the linear classifier $f(\mathbf{x}) = \mathbf{w}^T\mathbf{x} + b$, where (Geometric) Margin $= 2\gamma^* = \dfrac{2b_0}{\|\mathbf{w}\|}$. Any problem?

Non-unique representation
The same boundary $f(\mathbf{x}) = \mathbf{w}^T\mathbf{x} + b = 0$ can also be written as $f(\mathbf{x}) = a\mathbf{w}^T\mathbf{x} + ab = 0$, e.g. $f(\mathbf{x}) = x_1 + 2x_2 - 4 = 0$ and $f(\mathbf{x}) = 2x_1 + 4x_2 - 8 = 0$.

Resolving ambiguity
Rescale so that the margin hyperplanes become $\mathbf{w}^T\mathbf{x} + b = \pm 1$: replace $\dfrac{\mathbf{w}}{b_0} \to \mathbf{w}$ and $\dfrac{b}{b_0} \to b$. The margin $2\gamma^* = \dfrac{2b_0}{\|\mathbf{w}\|}$ then becomes $2\gamma^* = \dfrac{2}{\|\mathbf{w}\|}$.

SVM: Constrained optimisation problem
$\max_{\mathbf{w}} \dfrac{2}{\|\mathbf{w}\|}$ subject to ($i = 1, \dots, n$ data points):
if $y^{(i)} = +1$: $f(\mathbf{x}^{(i)}) = \mathbf{w}^T\mathbf{x}^{(i)} + b \ge +1$
if $y^{(i)} = -1$: $f(\mathbf{x}^{(i)}) = \mathbf{w}^T\mathbf{x}^{(i)} + b \le -1$
[Figure: boundary $f(\mathbf{x}) = 0$ with margin hyperplanes $f(\mathbf{x}) = +1$ and $f(\mathbf{x}) = -1$, each a distance $\dfrac{1}{\|\mathbf{w}\|}$ from the boundary]
Equivalently:
$\min_{\mathbf{w}} \dfrac{\|\mathbf{w}\|^2}{2}$ s.t. $1 - y^{(i)}(\mathbf{w}^T\mathbf{x}^{(i)} + b) \le 0$, $i = 1, \dots, n$ (data points)
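To make the constrained problem concrete, here is a minimal sketch that solves it on a made-up, linearly separable toy set with SciPy's general-purpose SLSQP solver (a dedicated QP solver would normally be used; this is only an illustration of the objective and constraints):

```python
import numpy as np
from scipy.optimize import minimize

# Made-up, linearly separable 2D toy data.
X = np.array([[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [1.0, 0.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

def objective(theta):
    w = theta[:2]                       # theta packs (w_1, w_2, b)
    return 0.5 * (w @ w)                # min ||w||^2 / 2

def constraints(theta):
    w, b = theta[:2], theta[2]
    return y * (X @ w + b) - 1.0        # each entry >= 0  <=>  y_i (w^T x_i + b) >= 1

res = minimize(objective, x0=np.zeros(3), method="SLSQP",
               constraints=[{"type": "ineq", "fun": constraints}])
w, b = res.x[:2], res.x[2]
print("w =", w.round(3), "b =", round(b, 3),
      "margin =", round(2.0 / np.linalg.norm(w), 3))
```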
Previous ML classification algorithms…

Ridge regression vs Hard-margin SVM objective
Recall the ridge regression objective:
$\min_{\mathbf{w}} \left( \sum_{i=1}^{n} (y^{(i)} - \mathbf{w}^T\mathbf{x}^{(i)})^2 + \lambda \|\mathbf{w}\|_2^2 \right)$
(a data-dependent training error plus a data-independent regularisation term).
Hard-margin SVM objective:
$\min_{\mathbf{w}} \dfrac{\|\mathbf{w}\|^2}{2}$ s.t. $1 - y^{(i)}(\mathbf{w}^T\mathbf{x}^{(i)} + b) \le 0$, $i = 1, \dots, n$
It has the same regularised-loss form: $\dfrac{\|\mathbf{w}\|^2}{2}$ is the data-independent regularisation term, and a per-point loss gives the data-dependent training error:
• Hard-margin SVM loss: $l_\infty = \begin{cases} 0 & \text{if } 1 - y^{(i)}(\mathbf{w}^T\mathbf{x}^{(i)} + b) \le 0 \\ \infty & \text{otherwise} \end{cases}$
• Soft-margin SVM loss (hinge loss): $l_h = \begin{cases} 0 & \text{if } 1 - y^{(i)}(\mathbf{w}^T\mathbf{x}^{(i)} + b) \le 0 \\ 1 - y^{(i)}(\mathbf{w}^T\mathbf{x}^{(i)} + b) & \text{otherwise} \end{cases}$
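A minimal sketch of the two losses as functions of $y^{(i)}(\mathbf{w}^T\mathbf{x}^{(i)} + b)$, matching the case definitions above:

```python
import numpy as np

def hard_margin_loss(margin):
    # margin = y_i * (w^T x_i + b); zero loss iff the point is on or outside
    # the margin hyperplane (margin >= 1), infinite loss otherwise.
    return 0.0 if 1.0 - margin <= 0.0 else np.inf

def hinge_loss(margin):
    # Soft-margin SVM loss: zero if the constraint holds, else the violation.
    return max(0.0, 1.0 - margin)

for m in [2.0, 1.0, 0.3, -0.5]:
    print(m, hard_margin_loss(m), hinge_loss(m))
```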
Summary
• What is the goal of SVM?
• What is the objective of hard-margin SVM?
• How can the SVM objective be represented as a regularised loss function?