10-MIN MATHEMATICS
A ten-minute (十分) mathematics series
Entropy / KL-Divergence
Entropy and Cross Entropy
$$H(X) = \mathbb{E}\left[\log \frac{1}{P_X(x)}\right] = \sum_{x \in \mathcal{X}} P_X(x) \log \frac{1}{P_X(x)} = -\sum_{x \in \mathcal{X}} P_X(x) \log P_X(x)$$

where $P_X(x)$ is the pmf of the discrete R.V. $X$.
1. Entropy

Two ways to read entropy:

[Information view] Entropy is the expected amount of information obtained when the value of the random variable is observed (a rare event carries much information, a common event little).

[Uncertainty view] Entropy measures how unpredictable the outcome is: when all outcomes are about equally likely (distribution A), the result is hard to predict and the entropy is high; when probability is concentrated on one outcome (distribution B), the result is easy to predict and the entropy is low.
The information content (self-information) of an outcome $x$ is

$$\log \frac{1}{P_X(x)}$$

the lower the probability of an outcome, the larger its information content; the higher the probability, the smaller.

Example: suppose $X$ takes three values with probabilities 0.5, 0.3, and 0.2, and compute the entropy of $X$ (base-10 logarithms):
$$H(X) = 0.5 \times \log(0.5^{-1}) + 0.3 \times \log(0.3^{-1}) + 0.2 \times \log(0.2^{-1})$$

$$\log(0.5^{-1}) = 0.3010, \qquad \log(0.3^{-1}) = 0.5229, \qquad \log(0.2^{-1}) = 0.6990$$

$$H(X) = 0.5 \times 0.3010 + 0.3 \times 0.5229 + 0.2 \times 0.6990 \approx 0.447$$
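As a quick check, here is a minimal Python sketch (an illustration, not from the original slides) that reproduces the number above using base-10 logarithms:

```python
import math

def entropy(probs, base=10):
    """Shannon entropy H(X) = -sum p log p; base 10 to match the slide's numbers."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

print(entropy([0.5, 0.3, 0.2]))  # ~0.447
```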
2. Cross Entropy

$$H(P, Q) = -\sum_{x \in \mathcal{X}} P_X(x) \log Q_X(x)$$

In the binary (per-outcome Bernoulli) form,

$$H(P, Q) = -\sum_{x} \left[ p(x) \log q(x) + \big(1 - p(x)\big) \log \big(1 - q(x)\big) \right]$$
EXAMPLE

Let $p$ be the true distribution and $q$ the distribution estimated by a model; expand the cross entropy over ten outcomes.
$$x \in \{x_1, x_2, \dots, x_{10}\}, \qquad 0 \le p(x_i) \le 1, \quad \sum_i p(x_i) = 1 \quad \text{(a finite discrete distribution)}$$

$$-\sum_x p(x) \log q(x) = -\big\{ p(x_1) \log q(x_1) + p(x_2) \log q(x_2) + \cdots + p(x_{10}) \log q(x_{10}) \big\}$$

In the binary form, averaged over the $N$ outcomes (here $N = 10$):

$$-\frac{1}{N} \sum_{i=1}^{N} \big\{ p(x_i) \log q(x_i) + \big(1 - p(x_i)\big) \log \big(1 - q(x_i)\big) \big\}$$
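A short sketch of both forms in Python, with made-up distributions p and q over ten outcomes (the specific numbers are illustrative assumptions, not from the slides):

```python
import math

p = [0.20, 0.15, 0.05, 0.10, 0.10, 0.05, 0.10, 0.05, 0.10, 0.10]  # true distribution, sums to 1
q = [0.10] * 10                                                   # model estimate: uniform

# Cross entropy H(P, Q) = -sum_x p(x) log q(x)
ce = -sum(pi * math.log(qi) for pi, qi in zip(p, q))

# Binary form, averaged over the N = 10 outcomes
N = len(p)
bce = -sum(pi * math.log(qi) + (1 - pi) * math.log(1 - qi)
           for pi, qi in zip(p, q)) / N

print(ce, bce)
```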
Kullback-Leibler Divergence and Entropy
1. Definition of the Kullback-Leibler (KL) Divergence

What KL divergence is:
$$D_{KL}(P \,\|\, Q) = \sum_{x} p(x) \log \frac{p(x)}{q(x)} = \left( -\sum_{x} p(x) \log q(x) \right) - \left( -\sum_{x} p(x) \log p(x) \right) = [\text{cross entropy of } Q \text{ w.r.t. } P] - [\text{entropy of } P]$$
<Discrete>

$$D_{KL}(P \,\|\, Q) = \sum_{x \in \mathcal{X}} P(x) \log \frac{P(x)}{Q(x)}$$

<Continuous>

$$D_{KL}(P \,\|\, Q) = \int_{-\infty}^{\infty} p(x) \log \frac{p(x)}{q(x)} \, dx$$
KL divergence quantifies how different two probability distributions are; it can be interpreted as the information lost, on average, when the approximate distribution is used in place of the true one.
2. KL Divergence and Entropy

As the expansion above shows, the KL divergence is the cross entropy of the approximating distribution ($Q$) with respect to the true distribution ($P$), minus the entropy of the true distribution ($P$):

$$D_{KL}(P \,\|\, Q) = H(P, Q) - H(P)$$
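A small numeric check of this identity (the distributions are chosen only for illustration):

```python
import math

def cross_entropy(p, q):
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def kl(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

print(kl(p, q))                                   # direct definition
print(cross_entropy(p, q) - cross_entropy(p, p))  # H(P, Q) - H(P): same value
```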
3. Properties of KL Divergence

1) It is always greater than or equal to zero:

$$D_{KL}(P \,\|\, Q) \ge 0$$
where $D_{KL}(P \,\|\, Q) = \sum_{x \in \mathcal{X}} P(x) \log \frac{P(x)}{Q(x)}$ and $P$, $Q$ are pmfs.

Using the inequality $\ln y \le y - 1$ for $y > 0$ with $y = q_i / p_i$:

$$D_{KL}(P \,\|\, Q) = \sum_i p_i \ln \frac{p_i}{q_i} = -\sum_i p_i \ln \frac{q_i}{p_i} \ge -\sum_i p_i \left( \frac{q_i}{p_i} - 1 \right) = -\sum_i q_i + \sum_i p_i = -1 + 1 = 0$$

$$\therefore\; D_{KL}(P \,\|\, Q) \ge 0$$
Proof that $\ln x \le x - 1$: $f(x) = \ln x$ is differentiable and continuous on $[x, 1]$ for $0 < x < 1$, so by the MVT there exists $c \in (x, 1)$ such that

$$f'(c) = \frac{f(1) - f(x)}{1 - x} = \frac{-\ln x}{1 - x}$$

Since $0 < x < c < 1$, $f'(c) = \frac{1}{c} \ge 1$, so $f(1) - f(x) \ge 1 - x$, i.e. $-\ln x \ge 1 - x$, hence $\ln x \le x - 1$. (For $x \ge 1$ the same argument on $[1, x]$ gives $f'(c) = \frac{1}{c} \le 1$ and the same conclusion.)
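As an empirical sanity check (an illustration only, not part of the proof), the non-negativity holds for random distributions:

```python
import math
import random

def kl(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

random.seed(0)
for _ in range(5):
    a = [random.random() for _ in range(4)]
    b = [random.random() for _ in range(4)]
    p = [x / sum(a) for x in a]  # normalize to a pmf
    q = [x / sum(b) for x in b]
    assert kl(p, q) >= 0         # non-negative, as proved above
print("all non-negative")
```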
2) 'The KL divergence equals 0' is equivalent to 'the two distributions P and Q are identical'.

$f(x) = -\log x$ is strictly convex

($f$ is strictly convex if, for all $x_1 \ne x_2 \in X$ and all $t \in (0, 1)$: $f(t x_1 + (1 - t) x_2) < t f(x_1) + (1 - t) f(x_2)$),

so by Jensen's inequality, equality in $D_{KL}(P \,\|\, Q) \ge 0$ holds only when $Q(x)/P(x)$ is constant, i.e. when the two pmfs coincide.

$$\therefore\; D_{KL}(P \,\|\, Q) = 0 \iff P = Q, \quad \text{i.e. } P(x) = Q(x) \;\; \forall x \in \mathcal{X}$$
3) In general, swapping the two distributions changes the value of the KL divergence; it is not symmetric (asymmetry):

$$\text{generally,} \quad D_{KL}(P \,\|\, Q) \ne D_{KL}(Q \,\|\, P)$$

(so it is not a distance metric).
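A tiny demonstration of the asymmetry (the distributions are illustrative assumptions):

```python
import math

def kl(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.9, 0.1]
q = [0.5, 0.5]

print(kl(p, q))  # ~0.368
print(kl(q, p))  # ~0.511 -- the two directions differ
```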