10. Affine / Convex
 ● Affine set
   - C ⊆ ℝⁿ is affine ⇔ the line through any two distinct points of C lies entirely in C:
     θx₁ + (1 − θ)x₂ ∈ C for all x₁, x₂ ∈ C and all θ ∈ ℝ.
   - Examples: a line, a hyperplane.
 ● Convex set
   - C ⊆ ℝⁿ is convex ⇔ the line segment between any two points of C lies entirely in C:
     θx₁ + (1 − θ)x₂ ∈ C for all x₁, x₂ ∈ C and 0 ≤ θ ≤ 1.
   - Examples: a disk, a convex polygon.
 [Figure: a convex set, where the segment between two points stays inside, vs. a non-convex set, where it does not; a line vs. a line segment]
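As a quick numerical illustration (a sketch of my own, not from the slides; Python with NumPy assumed, and the helper name is hypothetical), the segment condition can be spot-checked by sampling points along the segment between two members of a set:

```python
import numpy as np

def segment_in_set(contains, x1, x2, n=101):
    """Sample points on the segment between x1 and x2 and test that each
    satisfies the membership predicate `contains` (the convex-set condition)."""
    return all(contains(t * x1 + (1 - t) * x2) for t in np.linspace(0.0, 1.0, n))

# A disk is convex: the segment between any two member points stays inside.
disk = lambda p: np.linalg.norm(p) <= 1.0
print(segment_in_set(disk, np.array([0.9, 0.0]), np.array([0.0, 0.9])))   # True

# An annulus is not convex: this segment crosses the hole in the middle.
ring = lambda p: 0.5 <= np.linalg.norm(p) <= 1.0
print(segment_in_set(ring, np.array([0.9, 0.0]), np.array([-0.9, 0.0])))  # False
```

Sampling can only refute convexity, not prove it, but it makes the segment condition concrete.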
11. Affine function
 ● Affine function
   - A function composed of a linear function and a constant (translation).
   - in 1-dim : f(x) = ax + b
   - in 2-dim : f(x, y) = ax + by + c
   - in 3-dim : f(x, y, z) = ax + by + cz + d
 ● Translation: a transformation consisting of a constant offset, with no rotation or distortion.
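A minimal sketch of the same idea in matrix form (my own illustration; names are arbitrary): an affine function is a linear map A followed by a constant offset b, and it preserves affine combinations.

```python
import numpy as np

def affine(A, b):
    """Return f(x) = A x + b: a linear map A followed by a translation b."""
    return lambda x: A @ x + b

# 2-dim example: f(x, y) = a*x + b*y + c  with a=1, b=2, c=3
f = affine(np.array([[1.0, 2.0]]), np.array([3.0]))
print(f(np.array([1.0, 1.0])))  # [6.]

# Affine functions satisfy f(θx1 + (1-θ)x2) = θ f(x1) + (1-θ) f(x2)
x1, x2, theta = np.array([1.0, 0.0]), np.array([0.0, 1.0]), 0.3
lhs = f(theta * x1 + (1 - theta) * x2)
rhs = theta * f(x1) + (1 - theta) * f(x2)
print(np.allclose(lhs, rhs))  # True
```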
15. Lagrange dual function / Lagrange dual problem
 ● Lagrange dual function g : ℝᵐ × ℝᵖ → ℝ
     g(Λ, ν) = inf_x L(x, Λ, ν) = inf_x [ f₀(x) + Σ_{i=1}^{m} λᵢ fᵢ(x) + Σ_{i=1}^{p} νᵢ hᵢ(x) ]
 ● Lagrange dual problem
     maximize   g(Λ, ν)
     subject to Λ ⪰ 0
   - g(Λ, ν) is concave and the constraints are convex, so this is a convex optimization problem.
   - The dual optimal value g(Λ*, ν*) is a lower bound on the primal optimal value p*:
     g(Λ*, ν*) = inf_x L(x, Λ*, ν*) ≤ L(x*, Λ*, ν*) ≤ f₀(x*) = p*
     (the last step holds because λᵢ* ≥ 0, fᵢ(x*) ≤ 0, and hᵢ(x*) = 0).
 ● How can the two optimal values be made to coincide?
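A worked toy instance may help here (my own example, not from the slides): minimize f₀(x) = x² subject to x ≥ 1, i.e. f₁(x) = 1 − x ≤ 0, with primal optimum p* = 1 at x* = 1. The Lagrangian is L(x, λ) = x² + λ(1 − x), minimized over x at x = λ/2, giving the dual function g(λ) = λ − λ²/4. The sketch below evaluates g on a grid and confirms weak duality, g(λ) ≤ p*:

```python
import numpy as np

# Primal: minimize f0(x) = x^2  subject to  f1(x) = 1 - x <= 0   (p* = 1 at x* = 1)
# Lagrangian: L(x, lam) = x^2 + lam * (1 - x)
# Dual function: g(lam) = inf_x L(x, lam), attained at x = lam/2, so g(lam) = lam - lam^2/4

def g(lam):
    x = lam / 2.0                      # unconstrained minimizer of L(., lam)
    return x**2 + lam * (1.0 - x)      # equals lam - lam^2 / 4

p_star = 1.0
lams = np.linspace(0.0, 4.0, 9)
for lam in lams:
    assert g(lam) <= p_star + 1e-12    # weak duality: g(lam) <= p* for every lam >= 0

best = max(lams, key=g)
print(best, g(best))                   # lam* = 2.0, g(lam*) = 1.0 -> zero duality gap
```

In this convex example the dual maximum equals p*, previewing the zero-duality-gap condition on the next slide.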
16. KKT condition and zero duality gap
 ● If
   - the fᵢ are convex (inequality constraints),
   - the hᵢ are affine (equality constraints),
   - and x*, Λ*, ν* satisfy the KKT conditions,
 ● Then
   - x* is primal optimal and (Λ*, ν*) is dual optimal with zero duality gap.
   - That is, the primal and dual optimal values coincide.
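Continuing the toy problem from the previous sketch (still my own example), the KKT conditions can be checked directly at the candidate point x* = 1, λ* = 2:

```python
# KKT check for: minimize x^2  s.t.  1 - x <= 0, at the candidate x* = 1, lam* = 2
x, lam = 1.0, 2.0
stationarity = abs(2*x - lam) < 1e-12          # d/dx [x^2 + lam*(1 - x)] = 2x - lam = 0
primal_feas  = (1.0 - x) <= 0.0                # f1(x*) <= 0
dual_feas    = lam >= 0.0                      # lam* >= 0
comp_slack   = abs(lam * (1.0 - x)) < 1e-12    # lam* * f1(x*) = 0
print(all([stationarity, primal_feas, dual_feas, comp_slack]))  # True -> zero duality gap
```

All four conditions hold, and indeed f₀(x*) = 1 = g(λ*), so the duality gap is zero, as the slide states.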
53. References
 ● http://cs229.stanford.edu/materials/smo.pdf
 ● http://cs229.stanford.edu/notes/cs229-notes3.pdf - Andrew Ng
 ● https://web.stanford.edu/~boyd/cvxbook/bv_cvxbook.pdf
 ● Fast Training of Support Vector Machines Using Sequential Minimal Optimization, in Advances in Kernel Methods - John Platt
 ● Machine Learning in Action - Peter Harrington
 ● Machine Learning: A Probabilistic Perspective - Kevin P. Murphy