Feudal networks for hierarchical reinforcement learning
1.
FeUdal Networks for Hierarchical RL
Alexander Sasha Vezhnevets, Simon Osindero, Tom Schaul, et al. (DeepMind)
Youngseok Yoon, MLG POSTECH
2.
Contents: Architecture, Learning, Experiments, Ablative analysis, Discussion
3.
Architecture (1)
4.
Architecture (2)
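$f^{percept}$: CNN (16 filters of 8x8, stride 4; 32 filters of 4x4, stride 2) + fully connected layer (256)
$f^{Mspace}$: fully connected layer
To encourage exploration in the transition policy, a randomly sampled goal is emitted with probability $\epsilon$.
$f^{Wrnn}$: standard LSTM, 256
$f^{Mrnn}$: dLSTM (dilated LSTM), 256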
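As a rough illustration of the dilated LSTM named on this slide, the sketch below assumes the usual dilation scheme: the recurrent state is split into r groups, and at step t only group t % r is updated while the others are held. The names `dilated_step`, `toy_cell`, and `run` are hypothetical, and the running-sum cell is only a stand-in for a real LSTM cell so that the update schedule is easy to follow.

```python
def dilated_step(cell, states, t, x):
    """Update only the state group t % r; all other groups are left unchanged."""
    r = len(states)
    idx = t % r
    out, new_state = cell(x, states[idx])
    new_states = list(states)
    new_states[idx] = new_state
    return out, new_states


def toy_cell(x, h):
    """Hypothetical stand-in for an LSTM cell: a running sum of its inputs."""
    h_new = h + x
    return h_new, h_new


def run(inputs, r):
    """Feed a sequence through r interleaved state groups."""
    states = [0.0] * r
    outputs = []
    for t, x in enumerate(inputs):
        out, states = dilated_step(toy_cell, states, t, x)
        outputs.append(out)
    return outputs, states
```

With r = 3, each group only sees every third input, so it effectively runs at a coarser time scale than a standard recurrent cell would.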
5.
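Learning (1)
Advantage actor-critic: $\nabla \pi_t = A_t \nabla_\theta \log \pi(a_t | x_t; \theta)$, with $A_t = R_t - V(x_t; \theta)$
The direction in state space (goal): $p(s_{t+c} | s_t, o_t) \propto e^{d_{\cos}(s_{t+c} - s_t, g_t)}$, where $o_t = \mu(s_t, \theta)$
Learning of the goal: $\nabla g_t = A^M_t \nabla_\theta d_{\cos}(s_{t+c} - s_t, g_t(\theta))$, with $A^M_t = R_t - V^M_t(x_t, \theta)$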
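The goal update on this slide can be made concrete with a small numeric sketch. It assumes the Manager's objective at step t is the advantage multiplied by the cosine similarity between the observed state change s_{t+c} - s_t and the goal g_t; the function names `d_cos` and `manager_objective`, and the small `eps` added for numerical safety, are assumptions made for illustration.

```python
import math


def d_cos(a, b, eps=1e-8):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + eps)


def manager_objective(s_t, s_tc, g_t, advantage):
    """Advantage-weighted alignment of the goal with the realised state change.

    s_t and s_tc are the latent states at steps t and t + c (hypothetical inputs).
    """
    delta = [b - a for a, b in zip(s_t, s_tc)]
    return advantage * d_cos(delta, g_t)
```

Under this reading, a positive advantage pushes the goal towards the direction the state actually moved in, and a negative advantage pushes it away.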
6.
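Learning (2)
Intrinsic reward for the Worker: $r^I_t = \frac{1}{c} \sum_{i=1}^{c} d_{\cos}(s_t - s_{t-i}, g_{t-i})$
Worker's policy gradient: $\nabla \pi_t = A^D_t \nabla_\theta \log \pi(a_t | x_t; \theta)$, with $A^D_t = R_t + \alpha R^I_t - V^D_t(x_t; \theta)$
Manager's transition policy gradient: $\nabla_\theta \pi^{TP}_t = \mathbb{E}[(R_t - V(s_t)) \nabla_\theta \log p(s_{t+c} | s_t, \mu(s_t, \theta))]$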
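The Worker's intrinsic reward can be sketched in a few lines. The example assumes the reward is the mean, over the last c steps, of the cosine similarity between the state change s_t - s_{t-i} and the goal g_{t-i} issued i steps earlier, and that the Worker's advantage is R_t + alpha * R^I_t - V^D_t. The function names, the list-of-lists layout of `states` and `goals`, and the `eps` term are hypothetical.

```python
import math


def d_cos(a, b, eps=1e-8):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + eps)


def intrinsic_reward(states, goals, t, c):
    """Mean cosine similarity between s_t - s_{t-i} and g_{t-i} for i = 1..c."""
    if t < c:
        raise ValueError("need at least c earlier steps")
    total = 0.0
    for i in range(1, c + 1):
        delta = [a - b for a, b in zip(states[t], states[t - i])]
        total += d_cos(delta, goals[t - i])
    return total / c


def worker_advantage(ext_return, int_return, value, alpha):
    """Worker advantage: extrinsic return plus alpha-weighted intrinsic return, minus the value estimate."""
    return ext_return + alpha * int_return - value
```

The weight alpha controls how strongly the Worker follows the Manager's goals relative to the environment reward.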
7.
Experiments (1)
8.
Experiments (2)
9.
Experiments (3)
10.
Experiments (4)
11.
Experiments (5)
12.
Experiments (7)
13.
Ablative analysis (1)
14.
Ablative analysis (2)
15.
Ablative analysis (3)
16.
Ablative analysis (4)
17.
Ablative analysis (5)
18.
Discussion
Previous action input [...]
Predicting the sub-goal from [...] + state [...]
Step c [...] (c = 1 [...]) => the size of c [...]
Manager: given input state s, predict over c steps [...]; c as a hyper-parameter vs. a model parameter [...]
Manager at 2c, c^2, etc.: Meta-Manager (Hyper-Manager) [...]
19.
Thank you!