2019 Presentation
Recurrent Neural Network (RNN)
A Basic Introduction to RNN, LSTM, and GRU
Donghyeon Kim
2019.04.30
2 / 62
Acknowledgement
• ??? ?? ????/??? ?? (by ??? ?)
• Dr. Sung Kim, HKUST / Naver Clova (http://hunkim.github.io/ml/)
• ??? ?????? ?? ??
• 2019 KAIST ??? ???? ???
• Ideafactory KAIST (https://github.com/heartcored98/Standalone-DeepLearning)
• ???? ?????? ?? ???!
3 / 62
??? ???
? ??? ??? (Sequential data) ??
? ?? ?? ??? ?? ?? (??, ?? ??, ?? ???, ?? ?)
? ?? ??, ??? ??, ?? ??? ??
4 / 62
Recurrent Neural Network (RNN)
? ??????? ??? RNN? ?????
7 / 62
Recurrent Neural Network (RNN)
? ??????? ??? RNN? ????? No!
? ??????? Classification/ Regression ??? ? ? Multi-layer
perceptron (MLP)? Convolutional Neural Network (CNN)? ???? ??
? ??? CNN? ??? ??? ?? ?? ??
? ???? ???? RNN?
? RNN: ??? ??? ?? (??? ???? ???´!)
? MLP? CNN? ?????? ?? ? ?? ?? ?? ??? ??????
?? ???? ????? ??? ?? ??
? RNN? ?? ??? ?? ??? ?? ?????? ?? ????
8 / 62
Recurrent Neural Network (RNN)
? RNN ?? ??
? ?(?)? ?? tanh(?)? ????.
? ?? ?? ???? ?? ?, ?, ?, ?, ?? ???? (weight ? bias ?? ??)
?(?) ?(?) ?(?)?
?
?
$h(t) = f\big(U x(t) + W h(t-1) + b\big)$
$y(t) = f\big(V h(t) + c\big)$
???? ??
* ?(?) : ?? ?? (??)
* ?(?-1) : ?? ??
* ?(?) : ?? ??
16/ 62
RNN: Graphical Description
Reference. 2019 KAIST ??? ???? ???
[Figure: RNN unrolled over time steps t, t+1, and t+2. At each step the inputs are combined by element-wise summation and passed through a non-linear activation.]
17/ 62
??? RNN ???
?? NN
?: ??? ??
???★??
?: ?? ??
??★?/??
?: ????
??★??
?: ??? ??
???★??
18/ 62
??? RNN ???: Many-to-One
??. ?? ?? (?? ? ??/ ??)
Reference. 2019 KAIST ??? ???? ???
20/ 62
??? RNN ???: Many-to-One
??. ??? ?? (??? ? ??)
Reference. 2019 KAIST ??? ???? ???
21/ 62
??? RNN ???: Many-to-Many
??. ??? ?? (??? ? ??)
Reference. 2019 KAIST ??? ???? ???
25/ 62
??? RNN ???: Many-to-Many
??. ?? ?? (?? ? ??)
[Figure: many-to-many RNN unrolled over time steps t to t+4, with a non-linear activation at each step and outputs at steps t+2, t+3, and t+4.]
Reference. 2019 KAIST ??? ???? ???
29/ 62
??? RNN ???: One-to-Many
??. ??? ?? (?? ? ??)
[Figure: one-to-many RNN. A single input at step t is followed by a non-linear activation at each step, with outputs at steps t, t+1, and t+2.]
Reference. 2019 KAIST ??? ???? ???
30/ 62
RNN: Mathematical Description
Reference. 2019 KAIST ??? ???? ???
[Figure: RNN cell with a non-linear activation and weight matrices U, W, and V.]
$h_t = f(U x_t + W h_{t-1})$
$y_t = f(V h_t)$
tanh
? ? = ?
Vanilla RNN
???? ????
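The vanilla RNN recurrence on this slide can be sketched in NumPy as follows. This is a minimal sketch, not the presenter's code: it assumes U maps the input, W maps the previous hidden state, V maps the hidden state to the output, and that f is tanh in both lines. The dimensions and the random initialization are hypothetical.

```python
import numpy as np

def rnn_step(x_t, h_prev, U, W, V):
    """One vanilla RNN step: h_t = tanh(U x_t + W h_{t-1}), y_t = tanh(V h_t)."""
    h_t = np.tanh(U @ x_t + W @ h_prev)
    y_t = np.tanh(V @ h_t)
    return h_t, y_t

def rnn_forward(xs, h0, U, W, V):
    """Apply the same cell (shared U, W, V) to every element of the sequence."""
    h = h0
    hs, ys = [], []
    for x_t in xs:
        h, y = rnn_step(x_t, h, U, W, V)
        hs.append(h)
        ys.append(y)
    return np.stack(hs), np.stack(ys)

# Hypothetical sizes, chosen only for illustration.
input_dim, hidden_dim, output_dim, seq_len = 3, 4, 2, 5
rng = np.random.default_rng(0)
U = rng.normal(scale=0.5, size=(hidden_dim, input_dim))
W = rng.normal(scale=0.5, size=(hidden_dim, hidden_dim))
V = rng.normal(scale=0.5, size=(output_dim, hidden_dim))
xs = rng.normal(size=(seq_len, input_dim))

hs, ys = rnn_forward(xs, np.zeros(hidden_dim), U, W, V)
print(hs.shape, ys.shape)  # (5, 4) (5, 2)
```

Keeping every `ys[t]` gives a many-to-many model; keeping only `ys[-1]` gives a many-to-one model. In practice the output activation is often something other than tanh (for example softmax for classification).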
31/ 62
Back-Propagation Through Time (BPTT)
[Figure: unrolled RNN over time steps t to t+4. The predicted outputs at steps t+2, t+3, and t+4 are compared with the true values.]
Reference. 2019 KAIST ??? ???? ???
32/ 62
Back-Propagation Through Time (BPTT)
[Figure: unrolled RNN over time steps t to t+4, with a loss term at each of the output steps t+2, t+3, and t+4.]
$\mathrm{Loss}(\theta) = \sum_t \mathrm{loss}(y_{\mathrm{true},t},\ y_{\mathrm{pred},t})$
Reference. 2019 KAIST ??? ???? ???
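BPTT sums the per-step losses and pushes the gradient backwards through every time step. The sketch below is one way to do that by hand. It is an illustration under assumptions, not the presenter's code: it uses a squared-error loss and a linear output y_t = V h_t to keep the derivative short, and all sizes are hypothetical. The result is checked against a finite-difference estimate.

```python
import numpy as np

def forward(xs, U, W, V):
    """Return hidden states (hs[0] is the initial zero state) and outputs."""
    h = np.zeros(W.shape[0])
    hs, ys = [h], []
    for x in xs:
        h = np.tanh(U @ x + W @ h)
        hs.append(h)
        ys.append(V @ h)  # linear output (assumption)
    return hs, ys

def total_loss(xs, targets, U, W, V):
    """Sum of per-step squared errors over the whole sequence."""
    _, ys = forward(xs, U, W, V)
    return sum(0.5 * np.sum((y - t) ** 2) for y, t in zip(ys, targets))

def bptt(xs, targets, U, W, V):
    """Gradients of total_loss with respect to U, W, V."""
    hs, ys = forward(xs, U, W, V)
    dU, dW, dV = np.zeros_like(U), np.zeros_like(W), np.zeros_like(V)
    dh_next = np.zeros(W.shape[0])          # gradient arriving from step t+1
    for t in reversed(range(len(xs))):
        dy = ys[t] - targets[t]             # d loss_t / d y_t
        dV += np.outer(dy, hs[t + 1])
        dh = V.T @ dy + dh_next             # from this step's output and from the future
        da = dh * (1.0 - hs[t + 1] ** 2)    # back through tanh
        dU += np.outer(da, xs[t])
        dW += np.outer(da, hs[t])
        dh_next = W.T @ da                  # pass back to step t-1
    return dU, dW, dV

rng = np.random.default_rng(0)
U = rng.normal(scale=0.5, size=(4, 3))
W = rng.normal(scale=0.5, size=(4, 4))
V = rng.normal(scale=0.5, size=(2, 4))
xs = rng.normal(size=(6, 3))
targets = rng.normal(size=(6, 2))

dU, dW, dV = bptt(xs, targets, U, W, V)

# Finite-difference check on a single entry of W.
eps = 1e-6
W_plus, W_minus = W.copy(), W.copy()
W_plus[0, 0] += eps
W_minus[0, 0] -= eps
numeric = (total_loss(xs, targets, U, W_plus, V)
           - total_loss(xs, targets, U, W_minus, V)) / (2 * eps)
grad_error = abs(numeric - dW[0, 0])
print(grad_error)
```

Because W is shared across time steps, its gradient is the sum of contributions from every step, which is why `dW` is accumulated inside the loop.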
33/ 62
Back-Propagation Through Time (BPTT)
34/ 62
Vanishing Gradient Problem
? Vanishing Gradient Problem in Neural Networks
35/ 62
Vanishing Gradient Problem
? Vanishing Gradient Problem in Neural Networks
Solution
36/ 62
Vanishing Gradient Problem
? Vanishing Gradient Problem in Recurrent Neural Networks (RNN)
37/ 62
Vanishing Gradient Problem
? Vanishing Gradient Problem in Recurrent Neural Networks (RNN)
Reference. 2019 GDG?? RNN?? (???)
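A small numerical experiment can make the vanishing gradient visible. The sketch below is an illustration under assumptions, not taken from the slides: it uses a tanh RNN whose recurrent matrix W is rescaled to spectral norm 0.9, adds the input directly to the pre-activation, and tracks the Jacobian of the hidden state at step t with respect to the initial hidden state.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden_dim, seq_len = 8, 50  # hypothetical sizes

# Recurrent weights rescaled so the largest singular value is 0.9 (assumption).
W = rng.normal(size=(hidden_dim, hidden_dim))
W *= 0.9 / np.linalg.norm(W, 2)

h = rng.normal(size=hidden_dim)
J = np.eye(hidden_dim)  # d h_0 / d h_0
norms = []
for t in range(seq_len):
    x = rng.normal(size=hidden_dim)      # input added directly (assumption)
    h = np.tanh(W @ h + x)
    # Chain rule: d h_t / d h_0 = diag(1 - h_t^2) W (d h_{t-1} / d h_0)
    J = np.diag(1.0 - h ** 2) @ W @ J
    norms.append(np.linalg.norm(J, 2))

print(norms[0], norms[9], norms[-1])
```

Each factor has norm at most 0.9 because the tanh derivative never exceeds 1, so the product shrinks at least geometrically. A gradient that must travel back 50 steps therefore arrives almost at zero.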
39/ 62
Vanishing Gradient Problem
? Vanishing Gradient Problem in Recurrent Neural Networks (RNN)
Solution?
40/ 62
Solution: RNN Variants
• Long Short-term Memory (LSTM)
[Sepp Hochreiter; Jürgen Schmidhuber (1997). "Long short-term memory". Neural Computation. 9 (8): 1735–1780]
• Gated Recurrent Unit (GRU)
[Cho, Kyunghyun et al. (2014). arXiv:1406.1078]
• RNN with Rectified Linear Unit (ReLU)
[Le, Q. V., Jaitly, N., & Hinton, G. E. (2015). arXiv:1504.00941]
41/ 62
RNN with ReLU
44/ 62
RNN with ReLU
1 1 1 1
1 1 1 1
?? ReLU ? ???
???? ??? = 1
? Weight ?? ?? ?? ???? (Exploding Gradient)
? (1) Weight ???? (2) ?? ?? Learning rate? ????? ?????
Le, Q. V., Jaitly, N., & Hinton, G. E. (2015). arXiv:1504.00941
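The role of weight initialization in a ReLU RNN can be illustrated with a toy example. Le et al. (2015) initialize the recurrent matrix to the identity. The sketch below is a simplified illustration under assumptions, not the paper's experiment: it drops the input and bias, so each step is just h = ReLU(W h), and compares an identity W with a slightly larger one.

```python
import numpy as np

def relu(a):
    return np.maximum(a, 0.0)

def run(W, h0, steps):
    """Repeat h = ReLU(W h) with no input and no bias (simplifying assumption)."""
    h = h0
    for _ in range(steps):
        h = relu(W @ h)
    return h

hidden_dim = 4
h0 = np.array([1.0, 2.0, 0.5, 3.0])  # hypothetical non-negative start state

h_identity = run(np.eye(hidden_dim), h0, steps=100)    # identity initialization
h_large = run(1.5 * np.eye(hidden_dim), h0, steps=100)  # weights slightly too large

print(h_identity)     # state is carried unchanged for 100 steps
print(h_large.max())  # grows like 1.5 ** 100
```

With the identity matrix the state, and likewise the gradient, passes through unchanged. Because ReLU does not squash its output, weights only a little larger than that grow without bound, which is why a small learning rate is paired with this initialization.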
46/ 62
Long Short Term Memory (LSTM)
? Science 2017 ???
? ? ???? ??
? ?? ??? ?????
?? ?? ?? ??? ??
? ?? ?? ? ?????
??
Hidden state in LSTM...?
? 2? ? ??? ?? ? ??
(??? ??? ??? ?)
???? ????? ??
Cell state in LSTM...?
47/ 62
Long Short Term Memory (LSTM)
? ?? ?? ?? ??
48/ 62
Long Short Term Memory (LSTM)
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
50/ 62
Long Short Term Memory (LSTM)
??? ?? ??? ?? ?? ?????
???? ? ???? ?? ??? forget ???? ??
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
51/ 62
Long Short Term Memory (LSTM)
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
??? ?? ??? ?? ?? ????? input ??? ??
??? ?? ??? ?? ?? ??? ??? ???? ?? ??
52/ 62
Long Short Term Memory (LSTM)
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
(1) ?? ????? forget ???? ?? ???? ?? ??
(2) ?? ???? ?? ?? input ???? ?? ??? ?? ??
? ??? (1)? (2)? ??? ?? ???? Cell state? ????
53/ 62
Long Short Term Memory (LSTM)
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
??? ?? ??? ?? ?? ????? output ??? ????
?? ????? output ???? ??? hidden state ????
54/ 62
Long Short Term Memory (LSTM)
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
?? ??? ?? ??? ???? ? ?? ???
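The LSTM walkthrough on the preceding slides (forget gate, input gate, cell state update, output gate, hidden state) can be sketched as a single step function. This is a minimal sketch that follows the equations in the referenced colah post. The parameter names (`W_f`, `b_f`, and so on), the sizes, and the random initialization are assumptions made for illustration.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step; p is a dict of weights and biases (hypothetical names)."""
    z = np.concatenate([h_prev, x_t])            # [h_{t-1}, x_t]
    f = sigmoid(p["W_f"] @ z + p["b_f"])         # forget gate: how much old cell state to keep
    i = sigmoid(p["W_i"] @ z + p["b_i"])         # input gate: how much new information to write
    c_tilde = np.tanh(p["W_c"] @ z + p["b_c"])   # candidate cell values
    c = f * c_prev + i * c_tilde                 # updated cell state
    o = sigmoid(p["W_o"] @ z + p["b_o"])         # output gate
    h = o * np.tanh(c)                           # new hidden state
    return h, c

input_dim, hidden_dim = 3, 4  # hypothetical sizes
rng = np.random.default_rng(0)
params = {}
for name in ("f", "i", "c", "o"):
    params["W_" + name] = rng.normal(scale=0.5, size=(hidden_dim, hidden_dim + input_dim))
    params["b_" + name] = np.zeros(hidden_dim)

h = np.zeros(hidden_dim)
c = np.zeros(hidden_dim)
for x_t in rng.normal(size=(5, input_dim)):
    h, c = lstm_step(x_t, h, c, params)

print(h.shape, c.shape)  # (4,) (4,)
```

The cell state `c` is updated only by element-wise scaling and addition, which gives the gradient a path across time steps that does not pass through a squashing non-linearity at every step.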
55/ 62
Gated Recurrent Unit (GRU)
GRU / Vanilla RNN
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
56/ 62
Gated Recurrent Unit (GRU)
? Update gate
?? ??
? ?? ??? ??? ??
?? ??? ?? ???
?? (??)
57/ 62
Gated Recurrent Unit (GRU)
? Reset gate
??? ??
? ?? ?? ???? ???
?? ?? ??? ????
? ??
58/ 62
Gated Recurrent Unit (GRU)
? Memory Candidate
?? ?? ??
? ?? ??? ???
??? ?? ?? ???
???(reset)? ?? ??
?? = ? : ?? ??? reset (???), ?? ??(??) ??? ?? ?? ?? ???
?? = ? : ?? ??? ????, ?? ??(??)? ???? ?? ?? ?? ???
59/ 62
Gated Recurrent Unit (GRU)
? Final Memory
?? ??
? ?? ??(update)? ??
??? ???
?? ?? ??? ??
?? = ? : ?? ?? ??? ?? ????
?? = ? : ?? ?? ?? ??? ?? ????
60/ 62
Gated Recurrent Unit (GRU)
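The four GRU pieces from the preceding slides (update gate, reset gate, memory candidate, final memory) fit into one short step function. This is a minimal sketch following the bias-free equations in the referenced colah post, where the final memory is (1 - z) * h_prev + z * h_candidate. Some papers swap the roles of z and 1 - z. The parameter names and sizes are assumptions.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, W_z, W_r, W_h):
    """One GRU step without biases (hypothetical parameter names)."""
    concat = np.concatenate([h_prev, x_t])
    z = sigmoid(W_z @ concat)                     # update gate
    r = sigmoid(W_r @ concat)                     # reset gate
    # Memory candidate: the reset gate scales how much of the old state is used.
    h_candidate = np.tanh(W_h @ np.concatenate([r * h_prev, x_t]))
    # Final memory: the update gate blends the old state with the candidate.
    return (1.0 - z) * h_prev + z * h_candidate

input_dim, hidden_dim = 3, 4  # hypothetical sizes
rng = np.random.default_rng(0)
shape = (hidden_dim, hidden_dim + input_dim)
W_z = rng.normal(scale=0.5, size=shape)
W_r = rng.normal(scale=0.5, size=shape)
W_h = rng.normal(scale=0.5, size=shape)

h = np.zeros(hidden_dim)
for x_t in rng.normal(size=(5, input_dim)):
    h = gru_step(x_t, h, W_z, W_r, W_h)

print(h.shape)  # (4,)
```

When the reset gate is near 0 the candidate ignores the previous state, and when the update gate is near 0 the previous state is copied forward unchanged. Unlike the LSTM, there is no separate cell state and only two gates.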
61/ 62
Summary
? ??? ???? ?? ??? ??? ???? ?? RNN ? ????
? RNN? ??? ??? ??? ????
? One-to-Many / Many-to-One / Many-to-Many ?
? RNN ??? Back-Propagation Through Time (BPTT) ?? ??
? ?? RNN? Vanishing Gradient Problem? ????
? RNN with ReLU (???? ?? ??)
? LSTM (???? + ????)
? GRU (LSTM? ??? ????? ????? ??)
Thank You
2019 Presentation
