
1 Introduction
2 Linear SVMs: Linearly Separable Case
3 Linear SVMs: Non-separable Case
4 Nonlinear SVMs


5 Multi-class SVMs
6 Conclusion
Dr H.K. Lam (KCL), Support Vector Machines, 7CCSMPNN 2020-21

Introduction

Introduction
Support Vector Machines (SVMs):
Work on a similar concept to linear machines with margins.
Rely on preprocessing the data into a high-dimensional space using kernel functions.
Classify two classes, i.e., a binary classifier.
Compute the optimal weights by solving an optimisation problem rather than by iterative training.

Introduction
Support Vector Machines (SVMs) are based on linear discriminant functions.
Include margins to optimise the solution.
Can allow errors to occur in a controlled way
Comparable to Neural Networks
Allow non-linear mappings in higher dimensional feature space through use of Kernel functions
Advantages over NNs: model selection is simpler and SVMs are less susceptible to over-fitting.

Introduction
Two-Class Classification Problem:
Labelled training samples: S = {(x1, y1), (x2, y2), …, (xN, yN)},
xi = [xi1 xi2 … xid]T, yi ∈ {−1, 1}, i = 1, 2, …, N,
N denotes the number of training samples.
(Goal) Design a hyperplane f(x) = 0 which can correctly classify (all) the training samples.

Introduction
Figure 1: A diagram showing two linearly separable classes (Figure 3.7: an example of a linearly separable two-class problem with two possible linear classifiers).

Introduction
How to design the optimal classifier, i.e., the optimal hyperplane f(x)? It is only optimal if:
No errors, i.e., no mis-classification
Distance or margin between nearest support vectors and separating plane is maximal.
What is a support vector?
Can be achieved graphically on a small data set.

Introduction
Figure 2: A diagram showing two linear classifiers with two margins (Figure 3.8: the margin for direction 2 is larger than the margin for direction 1).

Linear SVMs: Linearly Separable Case

Linear SVMs
Deals with linearly separable 2-class classification problem.
Hyperplane: f(x) = wT x + w0 = 0, where w = [w1 w2 … wd]T.
Find w and w0 such that the margin is optimal:
wT x + w0 ≥ 1, ∀x ∈ class 1 ("+1")
wT x + w0 ≤ −1, ∀x ∈ class 2 ("−1")

Linear SVMs: Linearly Separable Case
Optimal margin:
Distance of a point from a hyperplane: z = |f(x)| / ∥w∥.
Example: f(x) = w1x1 +w2x2 +w0 = 2×1 +4×2 −6 = 0
Distance from the point (x1, x2) = (1, 3): z = |f(x)| / ∥w∥ = |2(1) + 4(3) − 6| / √(2² + 4²) = 8/√20 = 1.7889
∥ · ∥ is the l2 norm operator (also known as Euclidean norm).
Achieve a maximum margin (distance): find the largest margin z between the hyperplane and the support vectors. The margin is given by:
min_{xi: yi=−1} |f(xi)|/∥w∥ + min_{xi: yi=+1} |f(xi)|/∥w∥
= (1/∥w∥) ( min_{xi: yi=−1} |f(xi)| + min_{xi: yi=+1} |f(xi)| )
= 2/∥w∥
A small numerical check of the distance and margin is sketched below.
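The following minimal sketch (using numpy; the hyperplane and point are those from the example above) verifies the distance value and the 2/∥w∥ margin formula:

```python
import numpy as np

# Hyperplane f(x) = 2*x1 + 4*x2 - 6 = 0 from the example above.
w = np.array([2.0, 4.0])
w0 = -6.0
x = np.array([1.0, 3.0])

# Distance of the point from the hyperplane: z = |f(x)| / ||w||.
z = abs(w @ x + w0) / np.linalg.norm(w)
print(z)                          # 1.7889 (= 8 / sqrt(20))

# If this were a trained SVM, the margin between the two supporting
# hyperplanes w^T x + w0 = +/-1 would be 2 / ||w||.
print(2.0 / np.linalg.norm(w))    # 0.4472
```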

Linear SVMs: Linearly Separable Case
Constrained optimisation problem:
min_{w,w0} J(w) = (1/2)∥w∥²
subject to yi(wTxi +w0) ≥ 1, i = 1,2,…,N
Remark: Minimising "(1/2)∥w∥²" is equivalent to maximising the margin. The constraint "yi(wT xi + w0) ≥ 1" is to make sure all samples xi are correctly classified.
Method of Lagrange multipliers:
Primal problem: L(w, w0, λ) = (1/2)∥w∥² − ∑_{i=1}^{N} λi (yi(wT xi + w0) − 1)
where λ = [λ1 λ2 … λN]T, λi ≥ 0 for all i = 1, 2, …, N.
The above primal problem can be transformed to the following dual problem:
Dual problem: min_{w,w0} max_{λ≥0} { (1/2)∥w∥² − ∑_{i=1}^{N} λi (yi(wT xi + w0) − 1) }

Linear SVMs: Linearly Separable Case
∂L(w,w0,λ)/∂w = 0 ⇒ w = ∑_{i=1}^{N} λi yi xi   (1)
∂L(w,w0,λ)/∂w0 = 0 ⇒ ∑_{i=1}^{N} λi yi = 0   (2)
Example (see figure): six training samples, x1, x2, x3 with labels y1 = y2 = y3 = −1 and x4, x5, x6 with labels y4 = y5 = y6 = +1, with Lagrange multipliers λ1, …, λ6.
From (1), w = λ1y1x1 + λ2y2x2 + λ3y3x3 + λ4y4x4 + λ5y5x5 + λ6y6x6.
From (2), λ1y1 + λ2y2 + λ3y3 + λ4y4 + λ5y5 + λ6y6 = 0.

Linear SVMs: Linearly Separable Case
Putting (1) and (2) into L(w,w0,λ), we have
L(λ) = (1/2)∥w∥² − ∑_{i=1}^{N} λi (yi(wT xi + w0) − 1)
= (1/2) ∑_{i=1}^{N} ∑_{j=1}^{N} λiλj yiyj xiT xj − ∑_{i=1}^{N} λi yi wT xi − ∑_{i=1}^{N} λi yi w0 + ∑_{i=1}^{N} λi
= (1/2) ∑_{i=1}^{N} ∑_{j=1}^{N} λiλj yiyj xiT xj − ∑_{i=1}^{N} ∑_{j=1}^{N} λiλj yiyj xiT xj − ∑_{i=1}^{N} λi yi w0 + ∑_{i=1}^{N} λi
= ∑_{i=1}^{N} λi − (1/2) ∑_{i=1}^{N} ∑_{j=1}^{N} λiλj yiyj xiT xj.

Linear SVMs: Linearly Separable Case
The dual problem is reduced to:
max_{λ≥0}  ∑_{i=1}^{N} λi − (1/2) ∑_{i=1}^{N} ∑_{j=1}^{N} λiλj yiyj xiT xj
subject to ∑_{i=1}^{N} λi yi = 0,
λi ≥ 0, i = 1, 2, …, N     (3)

Linear SVMs: Linearly Separable Case
The solution λi to (3) can be found by using a quadratic programming solver (a sketch is given after this list).
It is a scalar function which does not depend explicitly on the dimension of the
input space.
The solution of the Lagrange multipliers λi may not be unique but the
hyperplane characterised by w and w0 is unique.
For those xi with λi ̸= 0, they are known as support vectors. As a result, Ns
w = ∑λiyixi, Ns ≤ N denotes the number of support vectors. (Note: xi i=1
here refers to a support vector not any xi in the training samples)
The support vectors lie on the two hyperplanes satisfying wT x + w0 = ±1
where x ∈ support vectors. (Why?)
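A minimal sketch of solving the dual problem (3) numerically; the toy data points are made up, and scipy's general-purpose SLSQP routine stands in for a dedicated quadratic programming solver:

```python
import numpy as np
from scipy.optimize import minimize

# Toy linearly separable data (hypothetical points for illustration).
X = np.array([[1.0, 1.0], [2.0, 2.5], [0.0, 2.0],       # class +1
              [-1.0, -1.0], [-2.0, -1.5], [0.0, -2.0]])  # class -1
y = np.array([1, 1, 1, -1, -1, -1], dtype=float)
N = len(y)

# H_ij = y_i y_j x_i^T x_j, the matrix appearing in the dual objective.
H = (y[:, None] * X) @ (y[:, None] * X).T

def neg_dual(lam):
    # Negative dual objective: we minimise -(sum_i lam_i - 0.5 lam^T H lam).
    return 0.5 * lam @ H @ lam - lam.sum()

constraints = ({'type': 'eq', 'fun': lambda lam: lam @ y},)  # sum_i lam_i y_i = 0
bounds = [(0.0, None)] * N                                   # lam_i >= 0

res = minimize(neg_dual, np.zeros(N), method='SLSQP',
               bounds=bounds, constraints=constraints)
lam = res.x

# Recover w from the support vectors (lam_i > 0) and w0 from y_i(w^T x_i + w0) = 1.
w = ((lam * y)[:, None] * X).sum(axis=0)
sv = lam > 1e-6
w0 = np.mean(y[sv] - X[sv] @ w)
print(np.round(lam, 4), w, w0)
```

For the non-separable (soft-margin) case discussed later, the only change needed in this sketch is the box constraint 0 ≤ λi ≤ C, i.e. bounds = [(0.0, C)] * N.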

Linear SVMs: Linearly Separable Case
Rearranging the terms in L(λ),
L(λ) = (1/2)∥w∥² − ∑_{i=1}^{N} λi (yi(wT xi + w0) − 1)
= (1/2) wT ∑_{i=1}^{N} λi yi xi + (1/2) ∑_{i=1}^{N} λi yi w0 − ∑_{i=1}^{N} λi (yi(wT xi + w0) − 1)
= (1/2) ∑_{i=1}^{N} λi − (1/2) ∑_{i=1}^{N} λi (yi(wT xi + w0) − 1)
To maximise L(λ) in λi, since λi ≥ 0 and yi(wT xi + w0) − 1 ≥ 0, one possibility is to have λi(yi(wT xi + w0) − 1) = 0 for all yi(wT xi + w0) − 1 ≥ 0, i = 1, 2, …, N.
This is supported by Karush-Kuhn-Tucker (KKT) conditions.
As yi(wTxi +w0)−1 ≥ 0, it suggests that some λi = 0 for those xi not being a support vector.
Support vector xi: yi(wTxi +w0)−1 = 0 ⇒ λi ̸= 0
Non-support vector xi: yi(wTxi +w0)−1 > 0 ⇒ λi = 0

Linear SVMs: Linearly Separable Case
Summary: The linear SVM classifier (linearly separable case) can be found by solving for w, w0 and λi from the following conditions:
∂L(w,w0,λ)/∂w = 0 ⇒ w = ∑_{i=1}^{N} λi yi xi
∂L(w,w0,λ)/∂w0 = 0 ⇒ ∑_{i=1}^{N} λi yi = 0
λi ≥ 0, λi(yi(wT xi + w0) − 1) = 0, i = 1, 2, …, N
Hard classifier: f(x) = sgn(wT x + w0), where sgn(z) = −1 if z < 0; +1 if z ≥ 0
Soft classifier: f(x) = h(wT x + w0), where h(z) = −1 if z < −1; z if −1 ≤ z ≤ 1; +1 if z > 1

Linear SVMs: Non-separable Case

Linear SVMs: Non-separable Case
Figure 3: An example of two non-separable classes (Figure 3.9: in the non-separable case, points fall inside the class separation band).

Linear SVMs: Non-separable Case
Three categories of input samples:
Samples fall outside the band and are correctly classified: yi(wT xi + w0) ≥ 1
Samples fall inside the band and are correctly classified: 0 ≤ yi(wT xi + w0) < 1
Samples are misclassified: yi(wT xi + w0) < 0
All three categories can be described as yi(wT xi + w0) ≥ 1 − ξi, where ξi is known as a slack variable.
First category: ξi = 0; second category: 0 < ξi ≤ 1; third category: ξi > 1.
Remark: ξi for sample xi in the second/third category is obtained as
ξi = 1−yi(wTxi +w0).

Linear SVMs: Non-separable Case
We want to maximise the margin and minimise the number of misclassified points (minimise the margin violations). We formulate the constrained optimisation problem as:
min_{w,w0,ξ} J(w, ξ) = (1/2)∥w∥² + C ∑_{i=1}^{N} ξi
subject to yi(wT xi + w0) ≥ 1 − ξi, i = 1, 2, …, N
ξi ≥ 0, i = 1, 2, …, N
where ξ = [ξ1 ξ2 … ξN]T and 0 ≤ C ≤ +∞ is a pre-set constant scalar, which controls the influence of the two competing terms.
This is known as the soft-margin method; the classifier is known as a soft-margin classifier (do not confuse with soft/hard classifiers).

Linear SVMs: Non-separable Case
Figure: an example with a small value of C and an example with a large value of C. A sketch illustrating the effect of C follows.
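A minimal sketch of the effect of C using a library soft-margin SVM (scikit-learn is assumed to be available; the two overlapping point clouds are made up):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Two overlapping Gaussian clouds (non-separable toy data).
X = np.vstack([rng.normal([-1, 0], 1.0, size=(50, 2)),
               rng.normal([+1, 0], 1.0, size=(50, 2))])
y = np.hstack([-np.ones(50), np.ones(50)])

for C in (0.01, 100.0):
    clf = SVC(kernel='linear', C=C).fit(X, y)
    w = clf.coef_[0]
    margin = 2.0 / np.linalg.norm(w)
    print(f"C={C}: margin={margin:.3f}, support vectors={clf.n_support_.sum()}")

# A small C tolerates many margin violations (wider margin, more support
# vectors); a large C penalises violations heavily (narrower margin).
```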

Linear SVMs: Non-separable Case
The constrained optimisation problem is formulated as the following primal problem using the method of Lagrange multipliers:
Primal problem: L(w, w0, ξ, λ, μ) = (1/2)∥w∥² + C ∑_{i=1}^{N} ξi − ∑_{i=1}^{N} μi ξi − ∑_{i=1}^{N} λi (yi(wT xi + w0) − 1 + ξi)
where μ = [μ1 μ2 … μN]T and λ = [λ1 λ2 … λN]T are Lagrange multipliers, μi ≥ 0, λi ≥ 0 for all i = 1, 2, …, N.
The above primal problem can be transformed to the following dual problem:
Dual problem: min_{w,w0,ξ} max_{λ≥0, μ≥0} L(w, w0, ξ, λ, μ)

Linear SVMs: Non-separable Case
∂L(w,w0,ξ,λ,μ)/∂w = 0 ⇒ w = ∑_{i=1}^{N} λi yi xi   (4)
∂L(w,w0,ξ,λ,μ)/∂w0 = 0 ⇒ ∑_{i=1}^{N} λi yi = 0   (5)
∂L(w,w0,ξ,λ,μ)/∂ξi = 0 ⇒ C − μi − λi = 0   (6)
Putting (4), (5) and (6) into L(w, w0, ξ, λ, μ), we have
L(λ, ξ) = (1/2)∥w∥² + C ∑_{i=1}^{N} ξi − ∑_{i=1}^{N} μi ξi − ∑_{i=1}^{N} λi (yi(wT xi + w0) − 1 + ξi)
= (1/2) ∑_{i=1}^{N} ∑_{j=1}^{N} λiλj yiyj xiT xj + C ∑_{i=1}^{N} ξi − ∑_{i=1}^{N} μi ξi − ∑_{i=1}^{N} λi yi wT xi − ∑_{i=1}^{N} λi yi w0 + ∑_{i=1}^{N} λi − ∑_{i=1}^{N} λi ξi
= (1/2) ∑_{i=1}^{N} ∑_{j=1}^{N} λiλj yiyj xiT xj + C ∑_{i=1}^{N} ξi − ∑_{i=1}^{N} μi ξi − ∑_{i=1}^{N} ∑_{j=1}^{N} λiλj yiyj xiT xj − ∑_{i=1}^{N} λi yi w0 + ∑_{i=1}^{N} λi − ∑_{i=1}^{N} (C − μi) ξi
= ∑_{i=1}^{N} λi − (1/2) ∑_{i=1}^{N} ∑_{j=1}^{N} λiλj yiyj xiT xj.

Linear SVMs: Non-separable Case
The dual problem is then reduced to:
max_{λ≥0}  ∑_{i=1}^{N} λi − (1/2) ∑_{i=1}^{N} ∑_{j=1}^{N} λiλj yiyj xiT xj
subject to ∑_{i=1}^{N} λi yi = 0,
0 ≤ λi ≤ C, i = 1, 2, …, N

Linear SVMs: Non-separable Case
The same remarks from the linearly separable case apply.
This dual problem is the same as that of the linearly separable case, except that an upper bound is imposed on λi.
μi = 0 for ξi ̸= 0; μi ̸= 0 for ξi = 0.
λi = 0 for those samples fall outside the band and are correctly classified, i.e., the samples with yi(wT xi + w0) > 1 (ξi = 0).
λi ̸= 0 for the following cases:
The samples with yi(wT xi + w0) = 1 (ξi = 0).
The samples with yi (wT xi + w0 ) = 1 − ξi , i.e., samples fall inside the band and are correctly classified (0 < ξi ≤ 1) or samples are misclassified (ξi > 1)

Linear SVMs: Non-separable Case
The samples with λi ̸= 0 contribute to the final solution w.
In the case of ξi = 0 with yi(wTxi +w0) > 1, μi = C and λi = 0.
In the case of ξi = 0 with yi(wT xi + w0) = 1, 0 ≤ μi < C and λi = C − μi > 0. In the case of ξi > 0 with yi(wT xi + w0) = 1 − ξi, μi = 0 and λi = C.

Linear SVMs: Non-separable Case
Summary: The linear SVM classifier (non-separable case) can be found by solving for w, w0, ξi, λi and μi from the following conditions:
∂L(w,w0,ξ,λ,μ)/∂w = 0 ⇒ w = ∑_{i=1}^{N} λi yi xi
∂L(w,w0,ξ,λ,μ)/∂w0 = 0 ⇒ ∑_{i=1}^{N} λi yi = 0
∂L(w,w0,ξ,λ,μ)/∂ξi = 0 ⇒ C − μi − λi = 0, i = 1, 2, …, N
μi ≥ 0, λi ≥ 0, i = 1, 2, …, N
μi ξi = 0, i = 1, 2, …, N
λi (yi(wT xi + w0) − 1 + ξi) = 0, i = 1, 2, …, N
Hard classifier: f(x) = sgn(wT x + w0); Soft classifier: f(x) = h(wT x + w0)

Nonlinear SVMs

Nonlinear SVMs
A linear classifier has the form f(x) = wT x + w0. Given training data (xi, yi), xi ∈ Rd and yi ∈ {−1, +1}, yi f(xi) > 0 for a correct classification.
Linearly separable case: a linear classifier of this form is sufficient.
Nonlinearly separable case: does there exist a mapping Φ(·) such that the mapped data become linearly separable?

Nonlinear SVMs
Feature mapping: zi = Φ(xi), i = 1, 2, …, N.
In the above analysis for linear SVMs, instead of using xi, we use zi.
The same analysis results (formulas) can be applied.
Linear SVMs: w = ∑_{i∈SVs} λi yi xi ⇒ hyperplane: wT x + w0 = 0
Nonlinear SVMs (in z space): w = ∑_{i∈SVs} λi yi zi ⇒ hyperplane: wT z + w0 = 0
Hyperplane (in x space): ∑_{i∈SVs} λi yi Φ(xi)T Φ(x) + w0 = 0 (recall z = Φ(x))
Hard classifier: f(x) = sgn( ∑_{i∈SVs} λi yi Φ(xi)T Φ(x) + w0 )
Soft classifier: f(x) = h( ∑_{i∈SVs} λi yi Φ(xi)T Φ(x) + w0 )

Nonlinear SVMs
Certain Kernels that satisfy Mercer’s Theorem allow mapping to high-dimensional feature space implicitly:
K(xi,x) = Φ(xi)TΦ(x)
Instead of finding the mapping function Φ(·), it is easier to find the kernel
function K(·).
Hard classifier: f(x) = sgn( ∑_{i∈SVs} λi yi K(xi, x) + w0 )
Soft classifier: f(x) = h( ∑_{i∈SVs} λi yi K(xi, x) + w0 )
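As a sketch of how the decision function ∑_{i∈SVs} λi yi K(xi, x) + w0 is evaluated in practice, the snippet below fits a library RBF-kernel SVM and recomputes its decision function from the stored support vectors (scikit-learn is assumed; in its convention dual_coef_ holds the products λi yi and intercept_ holds w0):

```python
import numpy as np
from sklearn.svm import SVC

# Toy XOR-like data (not linearly separable).
X = np.array([[0, 0], [1, 1], [0, 1], [1, 0]], dtype=float)
y = np.array([-1, -1, 1, 1])

gamma = 1.0                        # RBF parameter, gamma = 1 / (2 sigma^2)
clf = SVC(kernel='rbf', gamma=gamma, C=10.0).fit(X, y)

def rbf(a, b):
    # K(a, b) = exp(-gamma * ||a - b||^2)
    return np.exp(-gamma * np.sum((a - b) ** 2))

x_new = np.array([0.9, 0.1])
# f(x) = sum_i (lambda_i y_i) K(x_i, x) + w_0 over the support vectors.
f = sum(coef * rbf(sv, x_new)
        for coef, sv in zip(clf.dual_coef_[0], clf.support_vectors_)) \
    + clf.intercept_[0]
print(f, clf.decision_function([x_new])[0])   # the two values agree
print("predicted class:", np.sign(f))
```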

Nonlinear SVMs
Commonly used kernels:
Linear kernel (standard inner product): K(xi, x) = xiT x + c, where c is an optional constant.
Polynomial kernel of degree q: K(xi, x) = (α xiT x + c)^q, q > 0
Radial basis function (RBF) kernel (exponential kernel): K(xi, x) = exp( −∥xi − x∥² / (2σ²) )
Multi-quadratic kernel: K(xi, x) = √( ∥xi − x∥² + c )
Inverse multi-quadratic kernel: K(xi, x) = 1 / √( ∥xi − x∥² + c )
Power kernel: K(xi, x) = −∥xi − x∥^q
Log kernel: K(xi, x) = −log( ∥xi − x∥^q + 1 )
Sigmoid function (hyperbolic tangent): K(xi, x) = tanh( β xiT x + γ )
A kernel can be constructed from other kernels.
A linear combination of kernels: ∑k αk Kk(·), αk > 0, ∀k
Product of kernels: ∏k αk Kk(·) (and likewise K(·)^q)
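As an illustration, a few of the kernels above written out as plain Python functions (the parameter values are arbitrary defaults, not values from the slides):

```python
import numpy as np

def linear_kernel(xi, x, c=0.0):
    # K(xi, x) = xi^T x + c
    return xi @ x + c

def polynomial_kernel(xi, x, alpha=1.0, c=1.0, q=3):
    # K(xi, x) = (alpha * xi^T x + c)^q
    return (alpha * (xi @ x) + c) ** q

def rbf_kernel(xi, x, sigma=1.0):
    # K(xi, x) = exp(-||xi - x||^2 / (2 sigma^2))
    return np.exp(-np.sum((xi - x) ** 2) / (2.0 * sigma ** 2))

def combined_kernel(xi, x):
    # A positive linear combination of kernels is again a kernel.
    return 0.5 * polynomial_kernel(xi, x, q=2) + 2.0 * rbf_kernel(xi, x)
```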

Nonlinear SVMs
Why kernel functions?
Example: Consider the quadratic kernel (polynomial kernel of degree two, without offset term): K(z, x) = (zT x)² = z1² x1² + 2 z1 z2 x1 x2 + z2² x2².
It can be shown that the feature mapping function is:
Φ(x) = [x1², √2 x1 x2, x2²]T ⇒ K(z, x) = Φ(z)T Φ(x) = (zT x)²
The original feature space is mapped to a higher-dimensional feature space through a feature mapping function Φ(·).
Instead of computing Φ(·) and then Φ(·)T Φ(·) (a two-step computation), the kernel function K(·) can be computed in one step, saving computation, especially when the feature mapping is into a higher-dimensional space.
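The identity K(z, x) = Φ(z)T Φ(x) for this quadratic kernel can be checked numerically; a minimal sketch with arbitrary test vectors:

```python
import numpy as np

def phi(x):
    # Feature map for the quadratic kernel (d = 2): [x1^2, sqrt(2) x1 x2, x2^2].
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

z = np.array([1.0, 2.0])
x = np.array([3.0, -1.0])

print((z @ x) ** 2)          # K(z, x) = (z^T x)^2 = 1.0
print(phi(z) @ phi(x))       # same value via the explicit mapping
```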

Nonlinear SVMs
Figure 4: A diagram of the SVM classifier with kernel functions, computing ∑_{i∈SVs} λi yi K(xi, x) + w0 (Figure 4.22: the SVM architecture employing kernel functions).

Nonlinear SVMs
Figure 5: A nonlinearly separable classification example using a nonlinear SVM classifier (Figure 4.23: example of a nonlinear SVM classifier for the case of two nonlinearly separable classes; the RBF kernel was used).

Nonlinear SVMs
When RBF kernel function is used, the SVM architecture is the same as the RBF network structure. However, the number of hidden units and the centres are determined by the optimisation procedure.
When sigmoid kernel function is used, the SVM architecture is the same as a three-layer fully-connected feed-forward neural network structure. However, the number of hidden units is determined by the optimisation procedure.
There is no systematic method to determine the best Kernel function and its parameters, and the parameter C (hyper-parameters), which are usually chosen by trial and error.
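In practice, the trial-and-error search over the kernel, its parameters and C is usually organised as a cross-validated grid search. A minimal sketch with scikit-learn (the dataset and grid values are arbitrary illustrations):

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

param_grid = {
    'kernel': ['rbf', 'poly'],
    'C': [0.1, 1, 10, 100],
    'gamma': [0.01, 0.1, 1],
}
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```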

Applications
Speaker verification
Face detection
Hand-writing recognition
Biomedical:
Cancer diagnosis
Epilepsy diagnosis (EEG)
Cardiac arrhythmia (ECG)
Cardiovascular disease

Multi-class SVMs

Multi-class SVMs
The SVM classifier is a binary classifier, which can handle only two-class problems.
Multi-class SVM classifiers can be built by combining two-class SVM classifiers.
Multi-class classification problems:
Given a dataset: {x1,x2,…,xN} and each data point xi belongs to class
Ci ∈ {1,2,…,R}; i = 1, 2, …, N, design a classifier which can tell which class the data xi belongs to.

Multi-class SVMs
Approaches combining SVMs
One against one
One against all
Binary decision tree
Binary coded
These approaches combine a number of two-class SVMs (linear or nonlinear) for multi-class classification.

Multi-class SVMs
One-against-one approach:
Number of classifiers: R(R−1)/2 2-class classifiers are required for an R-class problem.
Learning: A 2-class SVM classifier is trained for each pair of classes, i.e., Ci versus Cj. For example, the 2-class hard SVM classifier sgn( w^(ij)T x + w0^(ij) ) is able to classify whether the input x belongs to class Ci (using label +1) or Cj (using label −1).
Decision: Choose class with majority votes.
For example, considering 3 classes, we need 3(3−1)/2 = 3 classifiers, i.e., 1 against 2, 2 against 3 and 3 against 1. If the output of the 1-against-2 classifier is class 1, the 2-against-3 classifier is class 2 or 3, and the 3-against-1 classifier is class 1, the majority vote is class 1, which is the final decision (see the voting sketch after this list).
Some regions cannot be classified.
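A minimal sketch of the one-against-one voting rule described above; the three pairwise classifiers are trained on made-up 3-class data, and any trained 2-class SVMs could be substituted:

```python
from collections import Counter
from itertools import combinations

import numpy as np
from sklearn.svm import SVC

# Toy 3-class data (made up for illustration).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 0.5, size=(30, 2)) for c in ([0, 0], [3, 0], [0, 3])])
y = np.repeat([1, 2, 3], 30)

# Train one 2-class SVM per pair of classes: R(R-1)/2 = 3 classifiers.
pairwise = {}
for ci, cj in combinations([1, 2, 3], 2):
    mask = np.isin(y, [ci, cj])
    pairwise[(ci, cj)] = SVC(kernel='linear').fit(X[mask], y[mask])

def predict_ovo(x):
    # Each pairwise classifier casts one vote; the majority wins.
    votes = [clf.predict([x])[0] for clf in pairwise.values()]
    return Counter(votes).most_common(1)[0][0]

print(predict_ovo(np.array([2.8, 0.2])))   # expected: class 2
```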

Multi-class SVMs
One-against-one approach:
Figure 6: One-against-one approach for 3 classes.

Multi-class SVMs
One-against-all approach (soft SVM classifiers):
Number of classifiers: R 2-class classifiers are required for an R-class problem.
Learning: A 2-class SVM classifier is trained for Ci versus all Cj, j ≠ i. For example, the 2-class SVM classifier w^(i)T x + w0^(i) is able to classify whether the input x belongs to class Ci (positive value) or the rest (negative value).
Decision: Choose the class with the majority of votes.
For example, considering 3 classes, we need R = 3 classifiers, i.e., 1 against 2&3, 2 against 1&3, and 3 against 1&2. If the output class of the 1-against-2&3 classifier is class "1", the 2-against-1&3 classifier is class "1" or "3" (not class "2"), and the 3-against-1&2 classifier is class "1" or "2" (not class "3"), the majority of votes goes to class "1" (3 votes for class "1"), which is the final decision (a voting sketch follows).
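A minimal sketch of the one-against-all voting rule as described on this slide (made-up 3-class data, as in the one-against-one sketch; note that many libraries instead take the class whose classifier gives the largest output value):

```python
import numpy as np
from sklearn.svm import SVC

# Toy 3-class data, as in the one-against-one sketch.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 0.5, size=(30, 2)) for c in ([0, 0], [3, 0], [0, 3])])
y = np.repeat([1, 2, 3], 30)
classes = [1, 2, 3]

# R = 3 classifiers: class c (label +1) against all the rest (label -1).
ova = {c: SVC(kernel='linear').fit(X, np.where(y == c, 1, -1)) for c in classes}

def predict_ova(x):
    # Voting rule from the slide: a positive output is a vote for class c,
    # a negative output is a vote for every class other than c.
    votes = {c: 0 for c in classes}
    for c, clf in ova.items():
        if clf.predict([x])[0] == 1:
            votes[c] += 1
        else:
            for other in classes:
                if other != c:
                    votes[other] += 1
    return max(votes, key=votes.get)

print(predict_ova(np.array([2.8, 0.2])))   # expected: class 2
```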
