?????? ????
????? ???????
???
????? ?????
???? ???
???? ??
???? ??
??
1. ?? & ???? ??
2. ???? ??
3. ???? ??
????????? ????? ?????
???? : ?????? ???? ??
????????? ????? ?????
? ?? ????? ??????
? ?? ????? ????
? ????? ?? ??? ?? ???? ??
? ????? ??? ??? ??????
? ????? ??? ??? ??? ?? ??
? ??? ?? ????? ??? ? ? ??? ??.
??? ??? ????
???????
[Diagram: agent-environment loop (action, reward, observation)]
?????? ?? ?? ??
MDP
????? MDP ???? ..
? ?? ?? ???
? ?? ??? method
..? ???? ??? ????.
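$$q_\pi(s, a) = \mathbb{E}_\pi\left[ R_{t+1} + \gamma\, q_\pi(S_{t+1}, A_{t+1}) \mid S_t = s, A_t = a \right]$$
$q_\pi(s, a)$ : Q function
$R_{t+1}$ : reward
$\gamma$ : discount factor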
?? ?? ???
? S?? ???? a ??? ??? ? ?? ????
?? ???? ???
?? ???? ??
? MDP ??? ??? ?? ??? ??.
Monte-Carlo
? ????? ???? ???(???? ?? ??? ??).
? ??? ?? ?? -> ???? ? -> ??? ????
???? ????? ??
? 100?? ????? ??? ?? -> ? state?? ???
???? ??? ?? ????? ??
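$$
\begin{aligned}
V_n &= \frac{1}{n}\sum_{i=1}^{n} G_i \\
    &= \frac{1}{n}\left(G_n + \sum_{i=1}^{n-1} G_i\right) \\
    &= \frac{1}{n}\left(G_n + (n-1)\,V_{n-1}\right)
       \qquad \text{since } V_{n-1} = \frac{1}{n-1}\sum_{i=1}^{n-1} G_i \\
    &= \frac{1}{n}\,G_n + V_{n-1} - \frac{1}{n}\,V_{n-1} \\
    &= V_{n-1} + \frac{1}{n}\left(G_n - V_{n-1}\right)
\end{aligned}
$$
With a step size $\alpha$ in place of $\frac{1}{n}$:
$$V_n = V_{n-1} + \alpha\left(G_n - V_{n-1}\right)$$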
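The incremental update V_n = V_{n-1} + (1/n)(G_n - V_{n-1}) can be checked numerically with a minimal sketch. The function names and the sample returns below are illustrative assumptions, not taken from the slides.

```python
def incremental_mean(returns):
    """Average the returns one at a time with V_n = V_{n-1} + (1/n)(G_n - V_{n-1})."""
    v = 0.0
    for n, g in enumerate(returns, start=1):
        v += (g - v) / n  # step size 1/n reproduces the exact sample mean
    return v


def constant_step_update(v, g, alpha=0.1):
    """Same update with a fixed step size alpha instead of 1/n (hypothetical default)."""
    return v + alpha * (g - v)


sample_returns = [1.0, 0.0, 4.0, 3.0]  # made-up episode returns G_1..G_4
estimate = incremental_mean(sample_returns)
batch_mean = sum(sample_returns) / len(sample_returns)
print(estimate, batch_mean)
```

With step size 1/n the running estimate matches the batch mean. With a constant alpha it becomes an exponentially weighted average that favors recent returns.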
?? ???? ???
?? ??
???
?? ????
?? ???? ?
????
Monte-Carlo ??
? ????? ???? ????? ?
? ????? ??? ??? ? ???? ??? ???(ex.
??????)
Temporal Difference
? ?????? ???? Monte-Carlo ??? ?????
???
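$$G_t = R_{t+1} + \gamma\, G_{t+1}$$
$$R_{t+1} + \gamma\, V(S_{t+1})$$
$$V(S_t) \leftarrow V(S_t) + \alpha\left(R_{t+1} + \gamma\, V(S_{t+1}) - V(S_t)\right)$$
$$V(S_t) \leftarrow V(S_t) + \alpha\left(G_t - V(S_t)\right)$$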
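A minimal sketch of the one-step temporal-difference update V(S_t) <- V(S_t) + alpha (R_{t+1} + gamma V(S_{t+1}) - V(S_t)), next to the Monte-Carlo update that waits for the full return G_t. The state names, the value table, and the default alpha and gamma are assumptions chosen for illustration.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9, done=False):
    """Move V(state) toward the bootstrapped target R + gamma * V(next_state)."""
    bootstrap = 0.0 if done else gamma * values[next_state]
    target = reward + bootstrap
    values[state] += alpha * (target - values[state])
    return values[state]


def mc_update(values, state, episode_return, alpha=0.1):
    """Move V(state) toward the full return G_t, known only after the episode ends."""
    values[state] += alpha * (episode_return - values[state])
    return values[state]


values = {"A": 0.0, "B": 1.0}  # hypothetical two-state value table
td0_update(values, "A", reward=0.5, next_state="B")
print(values["A"])
```

The TD update can run after every step because its target uses the current estimate of the next state, whereas the Monte-Carlo update needs the episode to finish.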
Action? ??? ??? ???
?????
[Diagram: State, Q Value, action, reward, observation, Replay Buffer, Target Network]
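The replay buffer and target network named in the diagram can be sketched as below. This is a minimal illustration under assumptions: the class and function names, the transition tuple layout, and the dictionary-of-lists parameter format are hypothetical and not taken from the slides.

```python
import copy
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of transitions; the oldest entries are dropped first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks the temporal correlation
        # between consecutive transitions.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def sync_target(online_params):
    """Return a frozen copy of the online parameters to use as the target network."""
    return copy.deepcopy(online_params)


buffer = ReplayBuffer(capacity=3)
for step in range(5):
    buffer.push(state=step, action=0, reward=1.0, next_state=step + 1, done=False)
batch = buffer.sample(2)

online = {"w": [1.0, 2.0]}  # stand-in for network weights
target = sync_target(online)
```

Keeping the target parameters as a separate copy means later updates to the online parameters do not move the bootstrapped target until the next sync.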
Q ??? ??? ? ??
? Action? ????? Q??? ????? ??.
? ????? ?? ???
??? ?????
? Policy based Learning
? ?? action? ?? ????? ?? ??
? Continuous ? action ??? ? ???
Source: http://rail.eecs.berkeley.edu/deeprlcourse-fa17/f17docs/lecture_4_policy_gradient.pdf
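$$\theta^\star = \arg\max_\theta \; \underbrace{\mathbb{E}_{\tau \sim p_\theta(\tau)}\left[\sum_t r(s_t, a_t)\right]}_{J(\theta)}$$
$$\nabla_\theta J(\theta) = \mathbb{E}_{\tau \sim \pi_\theta(\tau)}\left[\nabla_\theta \log \pi_\theta(\tau)\, r(\tau)\right]$$
$$= \mathbb{E}_{\tau \sim \pi_\theta(\tau)}\left[\left(\sum_{t=1}^{T} \nabla_\theta \log \pi_\theta(a_t \mid s_t)\right)\left(\sum_{t=1}^{T} r(s_t, a_t)\right)\right]$$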
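For a single sampled trajectory, the policy-gradient expression (sum over t of grad log pi(a_t | s_t)) times (sum over t of r(s_t, a_t)) can be sketched as follows. The tabular softmax policy, the function names, and the toy trajectory are assumptions made for illustration and do not come from the slides.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of action logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def grad_log_softmax(logits, action):
    """Gradient of log pi(action) w.r.t. the logits: one_hot(action) - pi."""
    probs = softmax(logits)
    return [(1.0 if i == action else 0.0) - p for i, p in enumerate(probs)]


def policy_gradient_estimate(theta, trajectory):
    """Single-trajectory estimate: (sum_t grad log pi(a_t|s_t)) * (sum_t r_t).

    theta maps each state to its list of action logits;
    trajectory is a list of (state, action, reward) tuples.
    """
    total_return = sum(r for _, _, r in trajectory)
    grad = {s: [0.0] * len(logits) for s, logits in theta.items()}
    for state, action, _ in trajectory:
        step_grad = grad_log_softmax(theta[state], action)
        for i, g in enumerate(step_grad):
            grad[state][i] += g * total_return
    return grad


theta = {"s0": [0.0, 0.0]}          # uniform policy over two actions
trajectory = [("s0", 1, 2.0)]       # took action 1, received reward 2
gradient = policy_gradient_estimate(theta, trajectory)
```

Averaging this estimate over many sampled trajectories approximates the expectation. A positive return raises the logit of the action that was taken and lowers the others.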
A2C
? Stochastic Policy Gradient ??? ??
? Policy gradient ?? ??? ????? Critic ???? ?? ??
actor
critic
env
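$$\nabla_\theta J(\theta) = \mathbb{E}_{\tau \sim \pi_\theta(\tau)}\left[\left(\sum_{t=1}^{T} \nabla_\theta \log \pi_\theta(a_t \mid s_t)\right)\left(\sum_{t=1}^{T} r(s_t, a_t)\right)\right]$$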
? ?~? ? ?
?=1
?
?? log ? ? ? ? ?? ? ?(??)
A2C ????? ???
? Softmax? 64*64 ?? ? ? ?? ????? ??
? Stochastic Policy Gradient
? Action? attack ?, actor ??? ???
??
A2C ????? ???
? ?? ? ??????
? Action space? ?? ??(4096) C ?? ????? q?? ???
? ??? ?? ??
? ?? ? ??????
DDPG
? Deep Deterministic Policy Gradient
? A2C ????? Deterministic Policy Gradient ??
? ???? action ??? ?? ??? ???
? Target network ? replay buffer ??? ??
DDPG? ??!!
? ??? ???? action?? ??? ????!
? Action? ??? ??? ? ???
DDPG? ??!!
actor
critic
env
action
coordinate
Action? ?? ??
Coordinate ? ?? ??
DDPG ??
DDPG ??
DDPG ??
DDPG ???
? ??? ????? ?? ???? ???? ??
? DDPG?? ????? action ??? ???? ??(argmax)
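Stochastic Policy Gradient:
$$\mathbb{E}_{\tau \sim \pi_\theta(\tau)}\left[\left(\sum_{t=1}^{T} \nabla_\theta \log \pi_\theta(a_t \mid s_t)\right)\left(\sum_{t=1}^{T} r(s_t, a_t)\right)\right]$$
Deterministic Policy Gradient:
$$\mathbb{E}_{s \sim \rho^\beta}\left[\nabla_a Q(s, a \mid \theta^Q)\, \nabla_{\theta^\mu} \mu(s \mid \theta^\mu)\right]$$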
?? ?? ??
? ??? ?? ??? ?? ?? ??
? Action ??actor? ?? ?? actor 2?? ??? ?? Critic
??? ?? ??
- (DPG SPG ??? ??)
?????.
