Neural Module Network (NMN)
Paul Kim
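Abstract
Learning to Compose Neural Networks for Question Answering
VQA is a dataset of natural-language questions about images, paired with their answers.
The paper, by Jacob Andreas and colleagues, applies module networks to VQA.
It extends the earlier NMN in two ways:
1. The network layout comes from a Layout Predictor that can itself be learned.
2. The visual primitives are applied beyond images, so that reasoning over a knowledge base also becomes possible.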
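Neural Module Network
Characteristics
A key characteristic of the NMN (Neural Module Network) in this paper is that the network structure is not fixed in advance.
The model is a network assembled from several modules.
For VQA data, a separate network is built for each question.
The NMN composes that network dynamically, following the linguistic structure of the question.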
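NMN characteristics
Different kinds of modules are shown in different colors in the figure: attention modules (dog in the figure) in green, labeling modules in blue.
All modules in an NMN are independent and composable, so a different computation can be carried out for each problem instance.
Outside the NMN, the model uses a recurrent network (LSTM) to read the question, which matters for modeling common-sense knowledge and dataset biases.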
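Training Data Input
Every training example is a 3-tuple (w, x, y):
w : natural-language question
x : image
y : answer
The model consists of a collection of modules {m}, each with associated parameters theta (W in the figure on the right), and a network layout predictor P that maps strings to networks.
The model instantiates a network based on P(w), passes x in as input, and obtains a distribution over answers.
(e.g. for VQA the final output module is required to be a classifier)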
Modules
The goal is to find a small set of modules that can be assembled into every configuration the task needs. This amounts to identifying a minimal set of composable vision primitives.
Modules operate on three basic data types:
A. Images
B. Unnormalized attentions
C. Labels
Notation: TYPE[INSTANCE](ARG, ...)
A. TYPE : high-level module type (Attention, Re-Attention, ...)
B. INSTANCE : the particular instance of the model under consideration
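The TYPE[INSTANCE](ARG, ...) notation can be made concrete with a small data structure. The sketch below is only an illustration: the class name ModuleCall and its fields are hypothetical and do not come from the slides or the paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ModuleCall:
    """One node of a layout, written TYPE[INSTANCE](ARG, ...)."""
    type: str                      # high-level module type, e.g. "attend"
    instance: str                  # particular instance, e.g. "cat"
    args: List["ModuleCall"] = field(default_factory=list)

    def __str__(self) -> str:
        head = f"{self.type}[{self.instance}]"
        if not self.args:
            return head
        return head + "(" + ", ".join(str(a) for a in self.args) + ")"


layout = ModuleCall("classify", "color", [ModuleCall("attend", "cat")])
print(layout)  # classify[color](attend[cat])
```

Nesting ModuleCall objects mirrors the way module outputs feed into other modules as arguments.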
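Attention Module
The attend module attend[c] convolves every position in the input image with a weight vector (distinct for each c, e.g. cat, dog, ...) to produce a heatmap, i.e. an unnormalized attention.
In the figure's example, the heatmap is high in the regions of the image where the object is present and low everywhere else.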
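A minimal sketch of the attend computation, assuming the image has already been turned into an H x W x D feature map and that the convolution kernel is 1x1 (one dot product per position). The feature values and the weight vector w_dog are made up for illustration; in the model the weights are learned.

```python
from typing import List

FeatureMap = List[List[List[float]]]   # H x W x D image features
Attention = List[List[float]]          # H x W unnormalized heatmap


def attend(features: FeatureMap, weight: List[float], bias: float = 0.0) -> Attention:
    """Apply a 1x1 convolution: one dot product per spatial position."""
    return [
        [sum(f * w for f, w in zip(cell, weight)) + bias for cell in row]
        for row in features
    ]


# Toy 2x2 feature map with 2 channels (values are invented).
features = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [0.0, 0.0]],
]
w_dog = [2.0, -1.0]   # hypothetical weights for attend[dog]
heatmap = attend(features, w_dog)
print(heatmap)  # [[2.0, -1.0], [0.5, 0.0]]
```

The output is left unnormalized on purpose, since later modules consume raw heatmap values.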
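Re-Attention Module
The re-attend module is a multilayer perceptron with ReLUs that maps one attention to another attention.
As with the attend module, the weights of this mapping are distinct for each c.
Ex) re-attend[above] takes an attention and shifts the regions of greatest activation upward in the image.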
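The following sketch shows the shape of a re-attend computation: flatten the attention, apply a fully connected layer, a ReLU, and a second fully connected layer, then reshape. The weights here are set by hand (an identity matrix and a shift matrix) purely to imitate what re-attend[above] is expected to do; in the model they are learned, and the helper names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def matvec(m: Matrix, v: List[float]) -> List[float]:
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def relu(v: List[float]) -> List[float]:
    return [max(0.0, x) for x in v]


def re_attend(att: Matrix, w1: Matrix, w2: Matrix) -> Matrix:
    """Flatten the attention, apply FC -> ReLU -> FC, and reshape."""
    height, width = len(att), len(att[0])
    flat = [x for row in att for x in row]
    out = matvec(w2, relu(matvec(w1, flat)))
    return [out[r * width:(r + 1) * width] for r in range(height)]


def identity(n: int) -> Matrix:
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def shift_up(height: int, width: int) -> Matrix:
    """Matrix that moves each row of the flattened map up by one row."""
    n = height * width
    m = [[0.0] * n for _ in range(n)]
    for i in range(n - width):
        m[i][i + width] = 1.0
    return m


att = [[0.0, 0.0], [0.0, 3.0]]   # activation in the bottom-right cell
above = re_attend(att, identity(4), shift_up(2, 2))
print(above)  # [[0.0, 3.0], [0.0, 0.0]]
```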
Combination Module
The combine module merges two attentions into a single attention.
Ex) combine[and] is active in the output only where both inputs are active.
Ex) combine[except] is active where the first input is active and the second input is inactive.
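The two examples can be imitated with hand-written element-wise rules. This is an idealized stand-in, not the module itself: in the model, combine is a learned mapping that is only expected to end up behaving this way. The function names, and the convention that a value at or below zero counts as inactive, are assumptions.

```python
from typing import List

Attention = List[List[float]]


def combine_and(a: Attention, b: Attention) -> Attention:
    """Active only where both inputs are active (element-wise minimum)."""
    return [[min(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def combine_except(a: Attention, b: Attention) -> Attention:
    """Keep the activation of `a` only where `b` is inactive (<= 0)."""
    return [[x if y <= 0.0 else 0.0 for x, y in zip(ra, rb)]
            for ra, rb in zip(a, b)]


a = [[1.0, 2.0], [0.0, 4.0]]
b = [[3.0, 0.0], [5.0, 1.0]]
print(combine_and(a, b))     # [[1.0, 0.0], [0.0, 1.0]]
print(combine_except(a, b))  # [[0.0, 2.0], [0.0, 0.0]]
```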
Classification Module
The classify module takes an attention and the input image and maps them to a distribution over labels.
Ex) classify[color] returns a distribution over colors for the attended region.
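One plausible way to realize classify is to average the image features weighted by the attention, pass the result through a single linear layer, and apply a softmax. The sketch below follows that recipe under some assumptions: the attention is non-negative with a positive sum, and the feature map, weight rows, and label set are invented for illustration.

```python
import math
from typing import Dict, List

Attention = List[List[float]]
FeatureMap = List[List[List[float]]]


def softmax(scores: List[float]) -> List[float]:
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(att: Attention, features: FeatureMap,
             weights: List[List[float]], labels: List[str]) -> Dict[str, float]:
    """Attention-weighted feature average -> linear layer -> softmax."""
    total = sum(x for row in att for x in row)   # assumed to be > 0
    depth = len(features[0][0])
    pooled = [0.0] * depth
    for att_row, feat_row in zip(att, features):
        for a, cell in zip(att_row, feat_row):
            for d in range(depth):
                pooled[d] += (a / total) * cell[d]
    scores = [sum(w * p for w, p in zip(row, pooled)) for row in weights]
    return dict(zip(labels, softmax(scores)))


features = [[[1.0, 0.0], [0.0, 1.0]]]   # 1x2 map with 2 channels
att = [[0.0, 4.0]]                       # all attention on the right cell
weights = [[0.0, 2.0], [2.0, 0.0]]       # hypothetical rows: brown, green
probs = classify(att, features, weights, ["brown", "green"])
print(probs)
```

Because only the right-hand cell is attended, the pooled feature equals that cell's feature vector, and the label whose weight row matches it receives the higher probability.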
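Measurement Module
The measure module takes an attention alone and maps it to a distribution over labels.
Because the attentions passed between modules are unnormalized, it is suited to checking whether a particular object was detected or to counting sets of objects.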
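A minimal sketch of measure, assuming a single linear layer over the flattened attention followed by a softmax. The weights and biases are hand-picked so that a strong activation anywhere pushes the answer toward "yes"; they are illustrative values, not learned parameters.

```python
import math
from typing import Dict, List

Attention = List[List[float]]


def softmax(scores: List[float]) -> List[float]:
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def measure(att: Attention, weights: List[List[float]],
            biases: List[float], labels: List[str]) -> Dict[str, float]:
    """Map an attention alone to a distribution over labels."""
    flat = [x for row in att for x in row]
    scores = [sum(w * x for w, x in zip(row, flat)) + b
              for row, b in zip(weights, biases)]
    return dict(zip(labels, softmax(scores)))


weights = [[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0]]   # rows: yes, no
biases = [-1.0, 0.0]
found = measure([[0.0, 0.0], [0.0, 5.0]], weights, biases, ["yes", "no"])
empty = measure([[0.0, 0.0], [0.0, 0.0]], weights, biases, ["yes", "no"])
print(found, empty)
```

This works only because the attention is unnormalized: the total amount of activation carries the existence signal that a normalized attention would discard.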
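String to Network
Turning a natural-language question into an instantiated network takes two steps:
A. Map the natural-language question to a layout, which specifies both the set of modules used to answer the question and the connections between them.
B. Use the resulting layout to assemble the final prediction network.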
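Parsing
Each question is parsed with the Stanford Parser to obtain a universal dependency representation.
The parser also performs basic lemmatization, e.g. turning the plural kites into kite.
The dependencies are then filtered to those connected to the wh-word in the question.
: this gives a simple symbolic form expressing part of the sentence's meaning
ex)
What is standing in the field? -> what(stand)
What color is the truck? -> color(truck)
Is there a circle next to a square? -> is(circle, next-to(square))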
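The symbolic forms in the examples are nested head(argument, ...) expressions. The sketch below reads such a string into a tree so that it can be processed further. It does not perform the dependency parsing itself (that is the Stanford Parser's job), and the function name parse_form is hypothetical.

```python
from typing import List, Tuple

Tree = Tuple[str, List["Tree"]]


def parse_form(text: str) -> Tree:
    """Parse 'head(arg, ...)' into (head, [children]); leaves have no children."""
    pos = 0

    def parse_node() -> Tree:
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] not in "(),":
            pos += 1
        head = text[start:pos].strip()
        children: List[Tree] = []
        if pos < len(text) and text[pos] == "(":
            pos += 1                      # consume "("
            while True:
                children.append(parse_node())
                if text[pos] == ",":
                    pos += 1              # consume "," and read the next argument
                    continue
                pos += 1                  # consume ")"
                break
        return head, children

    return parse_node()


tree = parse_form("is(circle, next-to(square))")
print(tree)
# ('is', [('circle', []), ('next-to', [('square', [])])])
```

The parser assumes well-formed input with balanced parentheses; malformed strings are not handled in this sketch.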
Layout
Every leaf becomes an attend module, internal nodes become re-attend or combine modules, and the root node becomes a measure module for yes/no questions and a classify module for all other questions.
Parameter tying
Networks that share the same high-level structure but use different instances of the individual modules can be processed in the same batch.
ex. what color is the cat? -> classify[color](attend[cat]),
where is the truck? -> classify[where](attend[truck])
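The mapping rules can be sketched as a recursive function over a symbolic-form tree, written here as nested (head, children) tuples. Two details are assumptions rather than statements from the slides: a root is treated as a yes/no question when its head is "is", and a root with more than one argument has its arguments merged through combine[and].

```python
from typing import List, Tuple

Tree = Tuple[str, List["Tree"]]


def to_layout(tree: Tree, is_root: bool = True) -> str:
    """Turn a symbolic-form tree into a TYPE[INSTANCE](ARG, ...) string."""
    head, children = tree
    args = [to_layout(child, is_root=False) for child in children]
    if is_root:
        # Assumption: "is" marks a yes/no question.
        kind = "measure" if head == "is" else "classify"
        if len(args) > 1:
            # Assumption: several root arguments are merged with combine[and].
            args = ["combine[and](" + ", ".join(args) + ")"]
    elif not children:
        return f"attend[{head}]"
    else:
        kind = "re-attend" if len(args) == 1 else "combine"
    return f"{kind}[{head}](" + ", ".join(args) + ")"


print(to_layout(("color", [("cat", [])])))
# classify[color](attend[cat])
print(to_layout(("is", [("circle", []), ("next-to", [("square", [])])])))
# measure[is](combine[and](attend[circle], re-attend[next-to](attend[square])))
```

Two questions whose layouts differ only in the bracketed instances, such as classify[color](attend[cat]) and classify[where](attend[truck]), have the same shape and can therefore share a batch.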
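Answering natural language questions
LSTM question encoder
A. Using only the parser simplifies the question: grammatical cues that do not substantially change the meaning are discarded, even though they can affect the answer.
ex) "What is flying?" and "What are flying?" both convert to what(fly), but their answers should be kite and kites respectively.
=> The question encoder lets the model capture syntactic regularities in the data.
B. It also captures semantic regularities.
ex) For "what color is the bear?", answering brown is reasonable; guessing green is not.
=> The question encoder takes this kind of effect into account, i.e. it models semantic regularities.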
Final model
The final model combines the output of the Neural Module Network with the LSTM question encoder.
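Answering natural language questions
The question encoder is a standard single-layer LSTM with 1024 hidden units.
Like the root module of the NMN, the question modeling component predicts a distribution over answers.
The model's final prediction is a geometric average of these two probability distributions, dynamically reweighted using both text and image features.
The complete model, including both the NMN and the sequence modeling component, is trained jointly.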
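A weighted geometric average of two answer distributions can be sketched as follows. The mixing weight alpha is a fixed constant here, which is a simplification: the slides say the weighting is computed dynamically from text and image features, and that part is not modeled. The probability values are invented.

```python
from typing import Dict


def geometric_mix(p_nmn: Dict[str, float], p_lstm: Dict[str, float],
                  alpha: float = 0.5) -> Dict[str, float]:
    """Weighted geometric average of two distributions, renormalized."""
    raw = {k: (p_nmn[k] ** alpha) * (p_lstm[k] ** (1.0 - alpha)) for k in p_nmn}
    z = sum(raw.values())
    return {k: v / z for k, v in raw.items()}


p_nmn = {"brown": 0.6, "green": 0.4}     # from the NMN root module
p_lstm = {"brown": 0.9, "green": 0.1}    # from the question encoder
final = geometric_mix(p_nmn, p_lstm)
print(final)
```

Unlike an arithmetic average, the geometric average strongly penalizes an answer that either component considers unlikely.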
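Training
Optimizer
Because the network structure used to answer each question is dynamic, some weights are updated much more often than others. For this reason, algorithms with adaptive per-weight learning rates were found to perform better than plain SGD.
AdaDelta is therefore used.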
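For reference, the AdaDelta update keeps two running averages per weight, one of squared gradients and one of squared steps, and scales each step by their ratio. The sketch below is a plain-Python illustration of that rule with commonly used default values for rho and eps (assumed, not taken from the slides); it is not the training code used for the model.

```python
import math
from typing import List


class AdaDelta:
    """Per-weight adaptive steps from running averages of g^2 and step^2."""

    def __init__(self, n: int, rho: float = 0.95, eps: float = 1e-6) -> None:
        self.rho, self.eps = rho, eps
        self.avg_sq_grad = [0.0] * n
        self.avg_sq_step = [0.0] * n

    def step(self, params: List[float], grads: List[float]) -> List[float]:
        for i, g in enumerate(grads):
            self.avg_sq_grad[i] = self.rho * self.avg_sq_grad[i] + (1 - self.rho) * g * g
            delta = -(math.sqrt(self.avg_sq_step[i] + self.eps)
                      / math.sqrt(self.avg_sq_grad[i] + self.eps)) * g
            self.avg_sq_step[i] = self.rho * self.avg_sq_step[i] + (1 - self.rho) * delta * delta
            params[i] += delta
        return params


# Weight 0 receives a gradient every step; weight 1 belongs to a module
# that is never used in these steps, so its gradient is always zero.
params = [1.0, 1.0]
opt = AdaDelta(n=2)
for _ in range(50):
    grads = [2.0 * params[0], 0.0]   # gradient of params[0] ** 2
    opt.step(params, grads)
print(params)
```

Since each weight keeps its own statistics, a module that appears in only a few layouts is not forced to share a step size with modules that are updated on nearly every example.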
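Note
detect[cat] is not fixed, or even initialized, as a cat recognizer, and combine[and] is not fixed to compute the intersection of attentions; this is important to keep in mind.
The modules acquire these behaviors as a byproduct of end-to-end training.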
