26. Things we noticed during implementation
● Be careful with how the weights are initialized.
○ Naively initializing with a Gaussian of std 0.01 does not work.
○ Initialize with std = √(2/(k*k*c)) (k = kernel size, c = number of channels) [1]; see the sketch after this list.
● Do not add a bias term to the convolution layers.
● Do not assume that simply using Adam is good enough.
● For Global Average Pooling, see [2] and the second sketch below.
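A minimal sketch of the two points above (He-style initialization and bias-free convolutions), assuming PyTorch; the deck does not name a framework, and the helper name make_conv and the fan-in reading of c are our assumptions:

import math
import torch.nn as nn

def make_conv(in_ch, out_ch, k=3, stride=1):
    # bias=False: as noted above, no bias term is added in convolution layers.
    conv = nn.Conv2d(in_ch, out_ch, kernel_size=k, stride=stride,
                     padding=k // 2, bias=False)
    # std = sqrt(2/(k*k*c)) from [1]; c is taken here as the number of input channels (fan-in).
    std = math.sqrt(2.0 / (k * k * in_ch))
    nn.init.normal_(conv.weight, mean=0.0, std=std)
    return conv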
[1] He, Kaiming, et al. "Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification." Proceedings of the IEEE International Conference on Computer Vision. 2015.
[2] Lin, Min, Qiang Chen, and Shuicheng Yan. "Network in network." arXiv preprint arXiv:1312.4400 (2013).
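A minimal sketch of Global Average Pooling as described in [2], again assuming PyTorch; the tensor shapes and the 10-class head are illustrative only:

import torch
import torch.nn as nn

x = torch.randn(8, 64, 8, 8)            # (batch, channels, H, W) feature maps
pooled = nn.AdaptiveAvgPool2d(1)(x)     # average each feature map down to 1x1
pooled = pooled.flatten(1)              # shape (8, 64): one value per channel
logits = nn.Linear(64, 10)(pooled)      # small classifier head instead of large FC layers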
28. References
[1] He, Kaiming, et al. "Deep Residual Learning for Image Recognition." arXiv preprint arXiv:1512.03385 (2015).
The ResNet paper
[2] Ioffe, Sergey, and Christian Szegedy. "Batch normalization: Accelerating deep network training by reducing internal covariate shift." arXiv preprint arXiv:1502.03167 (2015).
The Batch Normalization paper
[3] He, Kaiming, et al. "Identity mappings in deep residual networks." arXiv preprint arXiv:1603.05027 (2016).
A study of the ResNet model architecture
[4] He, Kaiming, et al. "Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification." Proceedings of the IEEE International Conference on Computer Vision. 2015.
Describes the weight initialization used for ResNet
29. References
[5] Training and investigating Residual Nets,
Performance comparison across changes to the ResNet model and optimization method
[6] CS231n Convolutional Neural Networks for Visual Recognition,
Discussion of the effect of changing the learning rate
[7] Eldan, Ronen, and Ohad Shamir. "The Power of Depth for Feedforward Neural Networks." arXiv preprint arXiv:1512.03965 (2015).
Paper explaining that going deeper rather than wider yields higher performance
[8] Huang, Gao, et al. "Deep networks with stochastic depth." arXiv preprint arXiv:1603.09382 (2016).
Reduces training time by introducing Dropout-style random dropping of layers