Learning how to explain neural networks: PatternNet and PatternAttribution
PJ Kindermans et al. 2017
Data Science & Business Analytics Lab, Korea University
Table of Contents
1. Problems with existing explanation methods
2. Analysis of existing methods
1) DeConvNet
2) Guided BackProp
3. Linear model structure
1) Deterministic distractor
2) Additive isotropic Gaussian noise
4. Approaches
5. Quality criterion for signal estimator
6. Learning to estimate the signal
1) Existing estimators
2) PatternNet & PatternAttribution
7. Experiments
0. Overview
Data consists of a Signal, the part that carries the meaningful information, and a Distractor, the rest.
In this view, explaining a prediction means identifying the signal component.
The problem is that the model weights are strongly shaped by the distractor,
so inspecting the weights alone gives misleading explanations.
Since the output y has no correlation with the distractor, correlations with y can be used to estimate the signal.
Given a sufficiently trained model's { weight, input, output } values, the signal can be estimated
with linear and non-linear methods, and that is what this paper does.
3.1 Linear model structure - Deterministic distractor
Observe how the signal and the distractor differ through a simple linear model.
Notation
w : filter or weight
x : data
y : condensed output
s : relevant signal — the part of the data that carries the information about the output
d : distracting component — the part that carries no information about the output
a_s : direction of the signal
a_d : direction of the distractor
x = s + d,  s = a_s y,  d = a_d ε
a_s = (1, 0)^T,  a_d = (1, 1)^T,  y ∈ [−1, 1],  ε ~ N(μ, σ²)
The data x is built from the signal s and the distractor d.
For the model to recover y, i.e. w^T x = y, the weight must be w = (1, −1)^T, so that
w^T a_s y = y,  w^T a_d ε = 0
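A minimal numpy sketch of this toy setup (the sample size and the noise parameters μ, σ are illustrative choices, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy example from the slide: x = a_s * y + a_d * eps
a_s = np.array([1.0, 0.0])   # signal direction
a_d = np.array([1.0, 1.0])   # distractor direction
y   = rng.uniform(-1, 1, size=1000)
eps = rng.normal(loc=0.5, scale=2.0, size=1000)   # eps ~ N(mu, sigma^2)

X = np.outer(y, a_s) + np.outer(eps, a_d)         # shape (N, 2)

# w = (1, -1)^T cancels the distractor and recovers y exactly ...
w = np.array([1.0, -1.0])
assert np.allclose(X @ w, y)

# ... yet w is not aligned with the signal direction a_s.
cos = w @ a_s / (np.linalg.norm(w) * np.linalg.norm(a_s))
print(cos)  # ~0.707, i.e. a 45-degree angle between w and a_s
```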
3.1 Linear model structure - Deterministic distractor
In other words, to satisfy the model the weight has to remove the distractor, so it must be orthogonal to the distractor direction;
that is, w does not align with the signal direction.
While staying orthogonal to the distractor, the weight must also satisfy the scaling condition on the signal.
The weight vector's role is to cancel the distractor:
from the weight vector alone we cannot tell which input pattern drives the output.
w^T a_s y = y,  w^T a_d ε = 0
w^T a_s = 1
Even if the signal direction stays the same, when the distractor direction changes,
the weight direction changes with it (see the sketch below).
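This dependence is easy to check numerically; a sketch (the helper `optimal_w` is mine) that solves w^T a_s = 1, w^T a_d = 0 for two different distractor directions with the same signal direction:

```python
import numpy as np

def optimal_w(a_s, a_d):
    # Solve w^T a_s = 1 and w^T a_d = 0: w is dictated by the distractor.
    A = np.stack([a_s, a_d])
    return np.linalg.solve(A, np.array([1.0, 0.0]))

print(optimal_w(np.array([1.0, 0.0]), np.array([1.0, 1.0])))  # [ 1.  -1. ]
print(optimal_w(np.array([1.0, 0.0]), np.array([1.0, 2.0])))  # [ 1.  -0.5]
```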
3.2 Linear model structure - No distractor, additive isotropic Gaussian noise
Now consider additive isotropic Gaussian noise instead of a structured distractor.
Zero mean: any non-zero noise mean would be absorbed by the bias term, so it is set to 0.
Because isotropic noise has no correlation structure, the weight vector cannot cancel it.
Adding Gaussian noise has the same effect as L2 regularization:
it shrinks the weight.
(Figure, "Gaussian pattern": the smallest weight vector satisfying the constraint, versus the vector pointing in the signal direction.)
< Gaussian noise & L2 regularization >
Model: y_n = β x_n + ε
Likelihood: ∏_{n=1}^{N} N(y_n | β x_n, σ²)
Posterior ∝ ∏_{n=1}^{N} N(y_n | β x_n, σ²) · N(β | 0, λ^{−1})
Taking the negative logarithm of the likelihood times the Gaussian prior:
∑_{n=1}^{N} (1 / 2σ²) (y_n − β x_n)² + (λ / 2) β² + const
i.e. exactly a squared loss with an L2 penalty on β.
(Figure: the constraint line w^T a_s = 1; regularization shrinks w to the smallest weight w̃ that still satisfies the constraint, which points along a_s.)
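A small sketch of this equivalence for a univariate model (the values of σ and λ are illustrative): the MAP solution under the Gaussian prior is exactly the ridge estimate, shrunk relative to ordinary least squares.

```python
import numpy as np

rng = np.random.default_rng(0)

# y_n = beta * x_n + eps,  eps ~ N(0, sigma^2),  prior beta ~ N(0, 1/lam)
beta_true, sigma, lam = 2.0, 1.0, 10.0
x = rng.normal(size=200)
y = beta_true * x + rng.normal(scale=sigma, size=200)

# Minimizing  sum_n (y_n - beta x_n)^2 / (2 sigma^2) + (lam / 2) beta^2
# has the closed-form ridge / MAP solution:
beta_map = (x @ y) / (x @ x + lam * sigma**2)
beta_ols = (x @ y) / (x @ x)
print(beta_ols, beta_map)   # the MAP estimate is shrunk toward zero
```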
4. Approaches
Functions
Methods that analyze how the output y responds as the input x is perturbed, e.g. gradients, saliency maps.
Differentiating y with respect to x shows how a change in the input changes the output.
But for a linear model, following the gradient just returns the weight:
y = w^T x,  ∂y/∂x = w

Signal
The component of the data that caused the neuron to activate,
observed by backpropagating from the output back to input space.
DeConvNet and Guided BackProp work this way, but there is no guarantee that what they show is the signal.
→ PatternNet

Attribution
How much did the signal contribute to the output in each dimension?
In a linear model, attribution is the element-wise product of the signal and the weight vector.
The Deep Taylor Decomposition paper decomposes an activation value into input contributions, which LRP calls relevance.
→ PatternAttribution
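For a linear neuron the three views fit in a few lines; a sketch reusing the toy example's w and pattern direction a (the input x is an arbitrary illustrative value):

```python
import numpy as np

w = np.array([1.0, -1.0])   # weights from the toy example
a = np.array([1.0, 0.0])    # signal direction (pattern)
x = np.array([3.0, 2.0])    # some input
y = w @ x

gradient    = w              # "function" view: dy/dx = w for a linear model
signal      = a * y          # "signal" view: the component that caused y
attribution = w * signal     # "attribution" view: w ⊙ s, sums to y

print(gradient, signal, attribution, attribution.sum() == y)
```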
5. Quality criterion for signal estimator
Derivation: why the signal is not identifiable from w alone.
w^T x = y
x = s + d  ⇒  w^T (s + d) = y  ⇒  w^T s + w^T d = y
w^T d = 0  ⇒  w^T s = y
Naively inverting, s = (w^T)^{−1} y, is not well defined for a vector w:
for any random vector u with w^T u ≠ 0,
s = u (w^T u)^{−1} y
also satisfies w^T s = y.
Quality measure
With S(x) = ŝ, d = x − S(x) and y = w^T x:
ρ(S) = 1 − max_v corr(w^T x, v^T (x − S(x)))
     = 1 − max_v (v^T cov[y, d]) / √(σ²_{v^T d} · σ²_y)
A good signal estimator leaves the residual d uncorrelated with y, driving the correlation to 0 (ρ(S) → 1).
Here w is the weight of the already-trained model.
Because correlation is scale-invariant, the criterion does not depend on the scale of d or y.
With S(x) fixed, the maximizing v is found by a least-squares regression of y on d:
train v so that v^T d predicts y.
Recovering s directly from w^T s = y alone remains an ill-posed problem and cannot be solved as is,
so a different approach is needed.
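A sketch of the criterion, assuming rows of X are samples and S maps the data matrix to estimated signals; the inner maximization over v is done by least-squares regression as described above (the function name and interface are mine, not the authors' reference code):

```python
import numpy as np

def rho(X, w, S):
    """Quality criterion rho(S) = 1 - max_v corr(w^T x, v^T (x - S(x)))."""
    y = X @ w                        # neuron outputs, shape (N,)
    D = X - S(X)                     # estimated distractor d, shape (N, p)
    yc = y - y.mean()
    Dc = D - D.mean(axis=0)
    # Best v: least-squares regression of y on d
    v, *_ = np.linalg.lstsq(Dc, yc, rcond=None)
    pred = Dc @ v
    denom = np.sqrt((pred @ pred) * (yc @ yc))
    corr = (pred @ yc) / denom if denom > 0 else 0.0
    return 1.0 - corr
```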
6.1 Existing signal estimator methods
The identity estimator
S_x(x) = x
Assumes the data contains no distractor, i.e. is pure signal:
if the data is an image, the signal is the image itself.
For a linear model the attribution becomes
r = w ⊙ x = w ⊙ s + w ⊙ d
(if a distractor exists, the attribution is contaminated by it).
Real data does contain a distractor; it cancels in the forward pass,
but not in the element-wise product of the backward pass,
so the visualizations contain a lot of noise (LRP).

The filter based estimator
S_w(x) = (w / (w^T w)) w^T x
Assumes the estimated signal lies in the weight direction,
e.g. DeConvNet, Guided BackProp,
using the normalized weight.
Computing the attribution for a linear model gives
r = ((w ⊙ w) / (w^T w)) y,
which depends only on the weights and fails to reconstruct the signal properly.
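Both baselines written as plug-ins for the `rho` sketch above (again an illustrative interface, not the paper's code):

```python
import numpy as np

def S_identity(X):
    # S_x(x) = x : the data is assumed to be pure signal
    return X

def make_S_filter(w):
    # S_w(x) = w (w^T x) / (w^T w) : signal assumed to lie along w
    def S(X):
        return np.outer(X @ w, w) / (w @ w)
    return S
```

Note that `S_identity` leaves a zero residual, so the correlation term in ρ vanishes by construction, while `make_S_filter(w)` generally scores below 1 whenever the distractor is not orthogonal to the weight direction.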
6.2 PatternNet & PatternAttribution
Before the estimators themselves: when is an estimator optimal under the criterion?
A signal estimator S is optimal when the correlation between y and d is zero
for every possible direction v.
For a linear model this reduces to a covariance condition:
cov[y, d] = 0
⇔ cov[y, x] − cov[y, S(x)] = 0
⇔ cov[y, x] = cov[y, S(x)]

Quality measure
ρ(S) = 1 − max_v corr(w^T x, v^T (x − S(x)))
     = 1 − max_v (v^T cov[y, d]) / √(σ²_{v^T d} · σ²_y)
6.2 PatternNet & PatternAttribution
The linear estimator
Assumes the neuron extracts a linear signal from the data x,
i.e. the signal is linearly related to the output y.
Since in the linear model the covariance of y and d is zero:
S_a(x) = a w^T x = a y
cov[x, y] = cov[S(x), y] = cov[a w^T x, y] = a cov[y, y]
⇒ a = cov[x, y] / σ²_y
If d and s happened to be orthogonal,
this would give the same result as filter-based methods such as DeConvNet.
It holds up well in the convolutional layers,
but in the parts where FC layers and ReLUs are connected
the correlations are harder to remove,
so the overall criterion comes out lower.
(Figure: bar chart comparing the criterion ρ on VGG16, layer by layer, for the random, S_w, S_a and S_{a+−} estimators.)
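A sketch of the closed-form linear pattern a = cov[x, y] / σ²_y (rows of X are samples; the function name is mine):

```python
import numpy as np

def linear_pattern(X, w):
    """Global linear signal estimator S_a: a = cov[x, y] / var(y)."""
    y = X @ w
    yc = y - y.mean()
    Xc = X - X.mean(axis=0)
    return (Xc.T @ yc) / (yc @ yc)   # per-dimension covariance over var(y)
```

In the toy example from section 3.1, the pattern recovered this way points along a_s = (1, 0)^T even though w = (1, −1)^T does not.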
6.2 PatternNet & PatternAttribution
The two-component (non-linear) estimator
Uses the same trick as the linear estimator,
but treats the data differently depending on the sign of y.
Even when the data is not generated this way,
a distractor is present in the negative regime as well;
because of the ReLU, only the positive domain is locally connected to the output,
and the two-component estimator compensates for this.
The covariance computation splits by sign as below,
computing each regime separately and recombining with the regime weights.
S_{a+−}(x) = a_+ w^T x   if w^T x > 0
             a_− w^T x   otherwise
x = s_+ + d_+   if y > 0
    s_− + d_−   otherwise
cov(x, y) = E[xy] − E[x] E[y]
cov(x, y) = π_+ (E_+[xy] − E_+[x] E[y]) + (1 − π_+) (E_−[xy] − E_−[x] E[y])
cov(s, y) = π_+ (E_+[sy] − E_+[s] E[y]) + (1 − π_+) (E_−[sy] − E_−[s] E[y])
Setting cov(x, y) = cov(s, y) and splitting by sign, the positive regime gives
a_+ = (E_+[xy] − E_+[x] E[y]) / (w^T E_+[xy] − w^T E_+[x] E[y])
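The positive-regime pattern from the final formula, as a sketch (E_+ is taken over samples with w^T x > 0 and E[y] over all samples, matching the expression above; the function name is mine):

```python
import numpy as np

def two_component_pattern(X, w):
    """a_+ of the two-component estimator for one linear neuron."""
    y = X @ w
    pos = y > 0
    Ey = y.mean()                                    # E[y]
    Ex_pos = X[pos].mean(axis=0)                     # E_+[x]
    Exy_pos = (X[pos] * y[pos, None]).mean(axis=0)   # E_+[xy]
    num = Exy_pos - Ex_pos * Ey
    den = w @ Exy_pos - (w @ Ex_pos) * Ey
    return num / den          # a_-: same computation on the y <= 0 regime
```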