Lecture 13:
Support-vector machines (2) CS 189 (CDSS offering)
2022/02/16
Today’s lecture
• Today, we continue our discussion of SVMs from last time
• Specifically, we will derive another version of the SVM that is used when the data are not linearly separable: the soft margin SVM
• This version introduces a hyperparameter C which trades off maximizing
margin (of correctly classified points) and correctly classifying points
• We will analyze what effect varying C has, and we will see how the constrained
optimization problem can be rewritten without constraints using a hinge loss
Recap: the hard margin SVM
• Recall our optimization problem from last time: $\arg\min_{w,b} \frac{1}{2}\|w\|^2$ s.t. $y_i(w^\top x_i + b) \ge 1 \;\; \forall i$
• This is known as the hard margin SVM — the margin is inviolable
• The solution to this QP is the max-margin linear classifier for our dataset, but this only exists if our data are linearly separable!
• What if they're not? (See the sketch below for how this shows up in the optimization problem.)
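As a concrete illustration, here is a minimal sketch (not from the lecture) of the hard margin QP using the cvxpy library; the library choice and the helper name `hard_margin_svm` are my own assumptions. When the data are not linearly separable, the constraints cannot all be satisfied, and the solver reports an infeasible problem instead of returning a classifier.

```python
# Hedged sketch, not the lecture's code: the hard margin SVM QP via cvxpy.
# X is assumed to be an (N, d) array of features, y an (N,) array of +/-1 labels.
import cvxpy as cp

def hard_margin_svm(X, y):
    N, d = X.shape
    w = cp.Variable(d)
    b = cp.Variable()
    # Objective: minimize (1/2) ||w||^2
    objective = cp.Minimize(0.5 * cp.sum_squares(w))
    # Constraints: y_i (w^T x_i + b) >= 1 for all i
    constraints = [cp.multiply(y, X @ w + b) >= 1]
    prob = cp.Problem(objective, constraints)
    prob.solve()
    # If the data are not linearly separable, no (w, b) satisfies the constraints,
    # so the solver reports an infeasible status rather than a solution.
    if prob.status not in ("optimal", "optimal_inaccurate"):
        return None
    return w.value, b.value
```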
Linearly inseparable data
Introducing: slack
• To account for data points that lie on the wrong side of the decision boundary, we will add a slack variable $\xi_i$ for each data point $(x_i, y_i)$
• We will change our constraints to be, for all $i$:
$y_i(w^\top x_i + b) \ge 1 - \xi_i$
• Therefore, we are allowing our data points to violate the margin
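To make the slack concrete, here is a tiny sketch (my own, not from the slides): for a fixed $(w, b)$, the smallest slack that satisfies the new constraint together with $\xi_i \ge 0$ is $\max(0,\, 1 - y_i(w^\top x_i + b))$, which is zero exactly for points that respect the margin.

```python
# Hedged illustration: minimal slack values for a fixed (w, b).
import numpy as np

def slack_values(X, y, w, b):
    # Functional margins y_i (w^T x_i + b); values below 1 violate the margin.
    margins = y * (X @ w + b)
    return np.maximum(0.0, 1.0 - margins)
```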
Violating the margin
The soft margin SVM
• We, of course, want to minimize how much our data points violate the margin (the total slack)
• Therefore, we change our optimization objective as well, to get:
$\arg\min_{w,b,\xi} \;\frac{1}{2}\|w\|^2 + C\sum_{i=1}^{N}\xi_i \quad \text{s.t.} \quad y_i(w^\top x_i + b) \ge 1 - \xi_i, \;\; \xi_i \ge 0 \;\; \forall i$
• Turns out, this is still a QP, so there are still good algorithms for solving it (see the sketch after this list)
• C is a hyperparameter, i.e., we have to set it ourselves (think for yourself why)
• How does varying C change the decision boundary we learn?
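A minimal sketch of this QP, again using cvxpy (the solver library and function name are my assumptions, not something specified in the lecture):

```python
# Hedged sketch of the soft margin SVM QP via cvxpy.
import cvxpy as cp

def soft_margin_svm(X, y, C=1.0):
    N, d = X.shape
    w = cp.Variable(d)
    b = cp.Variable()
    xi = cp.Variable(N)
    # Objective: (1/2) ||w||^2 + C * sum_i xi_i
    objective = cp.Minimize(0.5 * cp.sum_squares(w) + C * cp.sum(xi))
    # Constraints: y_i (w^T x_i + b) >= 1 - xi_i and xi_i >= 0 for all i
    constraints = [cp.multiply(y, X @ w + b) >= 1 - xi, xi >= 0]
    prob = cp.Problem(objective, constraints)
    prob.solve()
    return w.value, b.value, xi.value
```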
Choose from C ∈ {0.1, 0.15, 1, 100}
https://www.eecis.udel.edu/~shatkay/Course/papers/UOSVMAlliferisWithoutTears.pdf
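One way to explore this is to sweep over those C values with an off-the-shelf solver. The sketch below uses scikit-learn's `SVC` on made-up toy data; both choices are my own and not what the linked figure used.

```python
# Hedged sketch: fit a linear soft margin SVM for each C and compare the results.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Toy 2-D data with some class overlap (made up for illustration) so that C matters.
X = np.vstack([rng.normal([-1, -1], 1.0, size=(50, 2)),
               rng.normal([1, 1], 1.0, size=(50, 2))])
y = np.array([-1] * 50 + [1] * 50)

for C in [0.1, 0.15, 1, 100]:
    clf = SVC(kernel="linear", C=C).fit(X, y)
    w, b = clf.coef_[0], clf.intercept_[0]
    margin_width = 2.0 / np.linalg.norm(w)   # distance between the two margin lines
    violations = int((y * (X @ w + b) < 1).sum())
    print(f"C={C}: margin width ~ {margin_width:.3f}, "
          f"{violations} points inside or beyond the margin")
```

Roughly, small C tolerates many margin violations in exchange for a wide margin, while large C shrinks the margin to classify more points correctly.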
Rewriting the soft margin SVM problem
• Start from the constrained problem: $\arg\min_{w,b,\xi} \;\frac{1}{2}\|w\|^2 + C\sum_{i=1}^{N}\xi_i \;\;\text{s.t.}\;\; y_i(w^\top x_i + b) \ge 1 - \xi_i, \;\; \xi_i \ge 0 \;\; \forall i$
• Rewrite the constraints as $\xi_i \ge 1 - y_i(w^\top x_i + b)$ and $\xi_i \ge 0$, i.e., $\xi_i \ge \max(1 - y_i(w^\top x_i + b),\, 0)$ for all $i$
• Since we are minimizing over $\xi_i$, at the optimum $\xi_i = \max(1 - y_i(w^\top x_i + b),\, 0)$
• Substituting this in eliminates the constraints, giving an unconstrained optimization problem: $\arg\min_{w,b} \;\frac{1}{2}\|w\|^2 + C\sum_{i=1}^{N}\max(1 - y_i(w^\top x_i + b),\, 0)$
The unconstrained soft margin SVM problem
• $\arg\min_{w,b} \; C\sum_{i=1}^{N}\max(1 - y_i(w^\top x_i + b),\, 0) + \frac{1}{2}\|w\|^2$
• This version defines a loss function that has two components:
• A hinge loss which encourages accurate classification
• An $\ell_2$-regularization term!
• Modern methods for solving this optimization problem heavily resemble gradient
descent or stochastic gradient methods
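For illustration, here is a minimal full-batch subgradient descent sketch on this objective (my own simplification; real solvers use more careful step sizes and stochastic minibatches):

```python
# Hedged sketch: subgradient descent on
#   C * sum_i max(1 - y_i (w^T x_i + b), 0) + (1/2) ||w||^2
import numpy as np

def soft_margin_subgradient_descent(X, y, C=1.0, lr=1e-3, steps=1000):
    N, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(steps):
        margins = y * (X @ w + b)
        violated = margins < 1          # points with nonzero hinge loss
        # Subgradient of the hinge term is nonzero only on violated points;
        # the l2 regularizer contributes w to the gradient in w.
        grad_w = w - C * (y[violated][:, None] * X[violated]).sum(axis=0)
        grad_b = -C * y[violated].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```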
• We have modified our SVM formulation to include slack variables, which allow data points to lie on the wrong side of the margin
• C controls how heavily slack is penalized compared to maximizing the margin
• We can reinterpret the constrained optimization problem as an unconstrained problem that combines a hinge loss with $\ell_2$-regularization
• Modern methods basically solve this problem with gradient descent
• On Friday, we will see another SVM formulation: the dual optimization problem
• If you have time, review the method of Lagrange multipliers before then