17. Hyperparameter tuning
Underfitting region (model capacity is too low)
- More layers / nodes
Overfitting region
- Reduce model size
- Stronger regularization to improve generalization (dropout etc.)
- More training data
“Typical relationship between capacity and error”,
p. 115 of Goodfellow, I., Bengio, Y., and Courville, A.: Deep Learning, MIT Press, 2016
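The two regions above can be told apart by comparing training and validation error. The following is only a minimal sketch of that rule of thumb: the function name `diagnose` and the thresholds `target_error` and `max_gap` are hypothetical and chosen for illustration. They do not come from the book and would need to be set per task.

```python
def diagnose(train_error, val_error, target_error=0.05, max_gap=0.02):
    """Guess which region of the capacity/error curve a model is in.

    target_error and max_gap are illustrative thresholds (assumptions),
    not values taken from any reference.
    """
    # High training error: the model cannot even fit the training set.
    if train_error > target_error:
        return "underfitting", ["more layers / nodes"]
    # Low training error but a large gap to validation error.
    if val_error - train_error > max_gap:
        return "overfitting", [
            "reduce model size",
            "stronger regularization (dropout etc.)",
            "more training data",
        ]
    # Neither symptom is present under the chosen thresholds.
    return "ok", []


if __name__ == "__main__":
    for train_err, val_err in [(0.20, 0.22), (0.01, 0.15), (0.03, 0.04)]:
        region, actions = diagnose(train_err, val_err)
        print(train_err, val_err, region, actions)
```

In practice the two errors would come from learning curves recorded during training, and the thresholds would depend on the error level that is acceptable for the task.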
18. Try a range of learning rates
“If you have time to tune only one hyperparameter, tune the learning rate”,
p. 431 of Goodfellow, I., Bengio, Y., and Courville, A.: Deep Learning, MIT Press, 2016
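As a toy illustration of why the learning rate deserves a sweep, the sketch below runs plain gradient descent on the one-dimensional loss (w - 3)^2 for several learning rates on a roughly logarithmic grid. The loss function, the grid, the step count, and the names `final_loss` and `sweep` are all assumptions made for this example. A real model would replace `final_loss` with a short training run that reports validation loss.

```python
def final_loss(lr, steps=50, w0=0.0):
    """Run gradient descent on loss(w) = (w - 3)^2 and return the final loss."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)  # derivative of (w - 3)^2
        w -= lr * grad
    return (w - 3.0) ** 2


def sweep(learning_rates):
    """Evaluate every candidate learning rate and map it to its final loss."""
    return {lr: final_loss(lr) for lr in learning_rates}


# Hypothetical grid: too small, small, moderate, oscillating, diverging.
LEARNING_RATES = [0.001, 0.01, 0.1, 1.0, 1.5]
losses = sweep(LEARNING_RATES)
best_lr = min(losses, key=losses.get)

if __name__ == "__main__":
    for lr, loss in losses.items():
        print(f"lr={lr:<6} final loss={loss:.3e}")
    print("best learning rate:", best_lr)
```

On this toy problem the smallest rates barely move within 50 steps, the largest one diverges, and a moderate value reaches the lowest loss, which is the pattern a sweep is meant to expose.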