Lecture 14:
Support-vector machines (3) CS 189 (CDSS offering)
2022/02/18
Copyright By PowCoder代写 加微信 powcoder
Today’s lecture
• Today, we will see a different, very math heavy formulation of soft margin SVMs
• Analyzing this dual formulation will seem mainly like a mathematical exercise
• We will briefly discuss the benefits of this formulation today, but more of this discussion will happen next week when we introduce kernelization
• If today’s lecture is hard to follow, first try reviewing the background math material (e.g., method of Lagrange multipliers), then try some of the math for yourself
• And use the annotated slides as a “check”
Setting up the Lagrangian
argmin1!w!2+C!N !i s.t. yi(w”xi+b)#1$!i, !i #0 %i w,b,! 2 i=1
let’s write out the Lagrangian function, &(w, b, !, “, #), for this problem Lwb5a B EllwIlitCEEi
liYiwxitb EEBi
EllwIlitEEIait eaiBii aiyiwtxitb 3
The Lagrangian function
• !(w, b, !, “, #) defines a competing objective between (w, b, !) and (“, #) — the former want to minimize, the latter want to maximize
• How do (w, b, !) minimize? By minimizing the objective
• But this doesn’t take into account the constraints we have
• How do (“, #) maximize? Note that they have to be non negative ( ” 0 )
• If the constraints are violated, they can go to #
• Thus, (w, b, !) are “forced” to satisfy the constraints 4
The primal and dual problems
!(w, b, !, “, #)
We can also consider the dual problem max min !(w, b, !, “, #)
The primal problem is defined as min max w,b,! “”0,#”0
• For SVMs, p$ = d$! This is referred to as strong duality
• This means we can choose to solve the primal or the dual problem
“”0,#”0 w,b,!
• Call the first value p$ and the second value d$ — in general, p$ ” d$
The primal and dual, as optimizations
The primal problem min max !(w, b, !, “, #) is the original problem we had: w,b,! “”0,#”0
argmin1%w%2+C!N !i s.t. yi(w&xi+b)”1′!i, !i “0 (i
• If you’re unsure, think about it as an exercise — the high level ideas from the rest of this lecture may help
Let’s figure out what the dual problem max min !(w, b, !, “, #) is “”0,#”0 w,b,!
w,b,! 2 i=1
• If you can see why, you are in a great place, mathematically
Solving the dual problem
We have max min !(w, b, !, “, #) — what does this mean intuitively? “”0,#”0 w,b,!
The inner optimization min !(w, b, !, “, #) is a function of (“, #) w,b,!
• This means that the minimizers (w$, b$, !$) are allowed to depend on (“, #)!
• Intuitively, (“, #) have to “go first”, and then (w, b, !) gets to “go second” after
the values for (“, #) have been set
• Let’s try to figure out how (w$, b$, !$) depend on (“, #) 7
Solving for (w$, b$, !$)
! ( w , b , ! , ” , # ) = 12 % w % 2 2 + ! Ni = 1 [ ” i + ( C ‘ ” i ‘ # i ) ! i ‘ ” i y i ( w & x i + b ) ]
wht Eit ai
Solving for (w$, b$, !$)
! ( w , b , ! , ” , # ) = 12 % w % 2 2 + ! Ni = 1 [ ” i + ( C ‘ ” i ‘ # i ) ! i ‘ ” i y i ( w & x i + b ) ]
we alsohave w
X and B must set these derivatives to zero
can drive L to
What do we have so far?
• We have found three different equalities from setting gradients equal to zero:
w$ = !N “$yixi (w$ is a weighted combination of the training points!) i=1 i
!N “$yi = 0 (the “$ have the same sum in the +1 and ‘1 classes) i=1 i i
• C ‘ “$ ‘ #$ = 0 (i (this relates “$ to #$, so we only have to find one!) ii ii
• The last equality also tells us that 0 ) “$ ) C — we will add this constraint i
Simplifying the dual problem
! ( w , b , ! , ” , # ) = 12 % w % 2 2 + ! Ni = 1 [ ” i + ( C ‘ ” i ‘ # i ) ! i ‘ ” i y i ( w & x i + b ) ]
I 1EYa any E EIxexiyelixixitatt
for an appropriate matrix Q 11
webe5 x Pt WATW I Xi
may L at1 I
The SVM dual optimization problem
• Final form: arg max !!1 ” 1 !!Q! s.t. !!y = 0 , 0 # !i # C $i !2N
• This is also a QP! After solving, we get w% = ! !%yixi i=1 i
• We previously had an optimization problem on the d-dimensional w, and now we have an optimization problem on the N-dimensional !
• If N > d, we likely want to solve the primal problem, and we use the dual problem when d > N — we will talk next week about why this might happen
Some intuition about ”
• The support vectors correspond to all indices i for which “i > 0 13
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com