Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
I am training a deep residual network with 10 hidden layers with game data.

Does anyone have an idea why I am not seeing any overfitting here? Training and test loss are still decreasing after 100 epochs of training.

Loss curves: https://imgur.com/Tf3DIZL


1 Answer

Just a couple of pieces of advice:

  1. for deep learning, even a 90/10 or 95/5 train/test split is recommended (Andrew Ng)
  2. such a small gap between the curves suggests your learning_rate is not tuned; try increasing it (and probably the number of epochs as well, if you implement some kind of 'smart' lr reduction)
  3. it is also a reasonable sanity check to try to overfit a DNN on a small amount of data (10-100 rows) with an enormous number of iterations
  4. check for data leakage in the dataset: analyzing the weights inside each layer may help you with this
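Points 1 and 3 can be sketched in a few lines of numpy. This is a minimal illustration, not the asker's actual setup: the data arrays are synthetic stand-ins, and a plain linear model stands in for the 10-layer residual network. The split logic and the overfit sanity check carry over unchanged to any framework.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the game dataset (shapes are hypothetical).
X = rng.normal(size=(1000, 8))
y = X @ rng.normal(size=8) + 0.1 * rng.normal(size=1000)

# Point 1: a 95/5 split -- with plenty of rows, a small test set suffices.
idx = rng.permutation(len(X))
cut = int(0.95 * len(X))
X_tr, y_tr = X[idx[:cut]], y[idx[:cut]]
X_te, y_te = X[idx[cut:]], y[idx[cut:]]

# Point 3: overfit sanity check -- a model that cannot drive training loss
# close to zero on ~10 rows after many iterations likely has a bug
# (or the pipeline is leaking/garbling the targets).
Xs, ys = X_tr[:10], y_tr[:10]
w = np.zeros(X.shape[1])
for _ in range(20000):
    grad = Xs.T @ (Xs @ w - ys) / len(Xs)  # gradient of mean squared error
    w -= 0.1 * grad                        # fixed lr; point 2 says tune this
train_mse = float(np.mean((Xs @ w - ys) ** 2))
print(f"training MSE on 10 rows: {train_mse:.2e}")
```

If the training MSE on the tiny subset refuses to collapse toward zero, debug the model or data pipeline before worrying about generalization; if it collapses immediately on the full training set too, suspect leakage (point 4).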
