RL Adventure
RAINBOW
???
INDEX
1. Environment
2. Before RAINBOW
DDQN(Double Deep Q-Learning)
Dueling DQN
Multi-Step TD(Temporal Difference)
PER(Prioritized Experience Replay)
Noisy Network
Categorical DQN(C51)
3. RAINBOW
4. RAINBOW - Code
1. EXPERIMENT ENVIRONMENT
OPENAI GYM
https://gym.openai.com
https://github.com/openai/gym
2. BEFORE RAINBOW : DOUBLE DQN
https://arxiv.org/abs/1509.06461
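The Double DQN target can be sketched as follows: the online network selects the next action and the target network evaluates it. This is a minimal illustration in plain Python, not the deck's own code; the function name `double_dqn_target` and its arguments are hypothetical.

```python
def double_dqn_target(reward, done, gamma, q_online_next, q_target_next):
    """Return the Double DQN TD target for a single transition.

    q_online_next / q_target_next are lists of action values for the
    next state, from the online and target networks respectively
    (hypothetical inputs for this sketch).
    """
    if done:
        # No bootstrapping past a terminal state.
        return reward
    # The online network chooses the greedy next action ...
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    # ... and the target network supplies the value of that action.
    return reward + gamma * q_target_next[best_action]
```

Decoupling selection from evaluation is what reduces the overestimation that a plain `max` over one network's values tends to produce.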
2. BEFORE RAINBOW : DUELING DQN
https://arxiv.org/abs/1511.06581
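As a rough sketch of the dueling aggregation described in the paper linked above, the state value and the mean-centred advantages are combined into Q-values. The function name `dueling_q_values` is an assumption made for illustration only.

```python
def dueling_q_values(value, advantages):
    """Combine a scalar state value V(s) with per-action advantages A(s, a).

    Uses the mean-subtracted form: Q(s, a) = V(s) + A(s, a) - mean(A(s, .)).
    """
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]
```

Subtracting the mean advantage keeps V and A identifiable, since otherwise a constant could be shifted freely between the two streams.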
2. BEFORE RAINBOW : MULTI-STEP LEARNING
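A multi-step (n-step) return can be sketched as the discounted sum of the next n rewards plus a discounted bootstrap value. The helper below is a hypothetical illustration; `bootstrap_value` stands for whatever value estimate is used at the state reached after n steps.

```python
def n_step_return(rewards, gamma, bootstrap_value):
    """Discounted n-step return.

    Computes r_0 + gamma * r_1 + ... + gamma^(n-1) * r_(n-1)
    + gamma^n * bootstrap_value, where n = len(rewards).
    """
    g = bootstrap_value
    # Fold from the last reward backwards so each step applies one discount.
    for r in reversed(rewards):
        g = r + gamma * g
    return g
```

For example, `n_step_return([1, 1, 1], 0.5, 4.0)` evaluates to `1 + 0.5 + 0.25 + 0.125 * 4 = 2.25`.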
2. BEFORE RAINBOW : PER
https://arxiv.org/abs/1511.05952
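The proportional variant of prioritized replay can be sketched in two steps: sampling probabilities from priorities, then importance-sampling weights to correct the bias. The function names and the list-based storage are assumptions for illustration; practical implementations usually use a sum-tree for efficient sampling.

```python
def per_probabilities(priorities, alpha):
    """Sampling probability P(i) = p_i^alpha / sum_k p_k^alpha."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def importance_weights(probs, beta):
    """Importance-sampling weights w_i = (N * P(i))^(-beta), scaled by max w."""
    n = len(probs)
    raw = [(n * p) ** (-beta) for p in probs]
    max_w = max(raw)
    # Normalising by the largest weight keeps updates from being scaled up.
    return [w / max_w for w in raw]
```

With `alpha = 0` sampling becomes uniform, and with `beta = 1` the weights fully compensate for the non-uniform sampling.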
2. BEFORE RAINBOW : NOISY NETWORK
https://arxiv.org/abs/1706.10295
2. BEFORE RAINBOW : CATEGORICAL DQN(C51)
https://arxiv.org/pdf/1707.06887.pdf
3. RAINBOW
RAINBOW =
DDQN(Double Deep Q-Learning)
+ Dueling DQN
+ Multi-Step TD(Temporal Difference)
+ PER(Prioritized Experience Replay)
+ Noisy Network
+ Categorical DQN(C51)
HYPERPARAMETERS
4. RAINBOW - CODE
PONG
NOISY LINEAR
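The slide's code is not preserved in this text, so the following is only a sketch of a noisy linear layer with factorised Gaussian noise, written in plain Python rather than a deep learning framework. The function names and the nested-list weight layout (`w_mu[j][i]` for output `j`, input `i`) are assumptions made for illustration.

```python
import math
import random


def scaled_noise(size, rng):
    """Draw Gaussian noise and apply f(x) = sign(x) * sqrt(|x|)."""
    return [math.copysign(math.sqrt(abs(x)), x)
            for x in (rng.gauss(0.0, 1.0) for _ in range(size))]


def noisy_linear(x, w_mu, w_sigma, b_mu, b_sigma, rng):
    """Forward pass: y = (w_mu + w_sigma * eps_w) x + (b_mu + b_sigma * eps_b)."""
    in_features, out_features = len(x), len(b_mu)
    eps_in = scaled_noise(in_features, rng)
    eps_out = scaled_noise(out_features, rng)
    out = []
    for j in range(out_features):
        # Bias noise uses the output noise vector directly.
        acc = b_mu[j] + b_sigma[j] * eps_out[j]
        for i in range(in_features):
            # Factorised weight noise: outer product of output and input noise.
            w = w_mu[j][i] + w_sigma[j][i] * eps_out[j] * eps_in[i]
            acc += w * x[i]
        out.append(acc)
    return out
```

When every sigma is zero the layer reduces to an ordinary linear layer, which is a convenient sanity check; in training, the sigma parameters are learned alongside the means.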
DUELING + NOISY + C51
PROJECTION STEP
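The original code for this slide is not preserved here. As a hedged sketch, the C51 projection for a single transition shifts each atom by the Bellman update, clips it to the support, and splits its probability between the two nearest atoms. The function name `project_distribution` and its argument layout are assumptions.

```python
import math


def project_distribution(next_probs, reward, done, gamma, v_min, v_max):
    """Project the shifted next-state distribution back onto the fixed support.

    next_probs[j] is the probability of atom z_j = v_min + j * delta_z
    for the chosen next action.
    """
    n_atoms = len(next_probs)
    delta_z = (v_max - v_min) / (n_atoms - 1)
    projected = [0.0] * n_atoms
    for j, p in enumerate(next_probs):
        z_j = v_min + j * delta_z
        # Bellman update of the atom, clipped to [v_min, v_max].
        tz = reward + (0.0 if done else gamma * z_j)
        tz = min(v_max, max(v_min, tz))
        # Fractional index on the support, clamped against rounding error.
        b = min(float(n_atoms - 1), max(0.0, (tz - v_min) / delta_z))
        lower, upper = math.floor(b), math.ceil(b)
        if lower == upper:
            # The shifted atom lands exactly on a support point.
            projected[lower] += p
        else:
            # Split the mass in proportion to the distance to each neighbour.
            projected[lower] += p * (upper - b)
            projected[upper] += p * (b - lower)
    return projected
```

The result is again a probability distribution over the same atoms, so it can serve directly as the target for the cross-entropy loss.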
CROSS-ENTROPY LOSS
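In the categorical setting the loss is the cross-entropy between the projected target distribution and the predicted distribution for the action taken. The helper below is an illustrative sketch; the name `categorical_cross_entropy` and the `eps` floor are assumptions, not taken from the slide.

```python
import math


def categorical_cross_entropy(target_probs, predicted_probs, eps=1e-8):
    """Return -sum_i m_i * log(p_i) for one transition.

    target_probs is the projected target distribution m; predicted_probs
    is the network's distribution p over the same atoms. Predictions are
    floored at eps to avoid log(0).
    """
    return -sum(m * math.log(max(p, eps))
                for m, p in zip(target_probs, predicted_probs))
```

Minimising this quantity is equivalent to minimising the KL divergence from the target to the prediction, because the target's entropy does not depend on the network parameters.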
TEST
Thank you

pyconkr 2018 RL_Adventure : Rainbow(value based Reinforcement Learning)