Training Models - Sage Wisdom

Training Neural Networks

The following is a summary of Andrej Karpathy's solid advice on training neural networks, with specific tips for each step. Read his post here: A Recipe for Training Neural Networks (by @karpathy)

  1. Understand the data. Become one with it. Be the ball.
  2. Set up an end-to-end pipeline, with some tips
    • fix a random seed
    • simplify
    • add sig digits to eval
    • verify loss at init
    • init "well"
    • human baseline
    • input-independent baseline - does random or zeroed-out input perform worse, as expected?
    • overfit one batch
    • verify decreasing training loss - push buttons you would expect to break things and see if they do. tap on the gauges when they look right, to see if it's real.
    • use backprop to chart dependencies - find bugs in arch by doing basic tests
    • generalize a special case - don't bite off more than you can chew. start special case, then generalize, reasonably.
  3. Overfit
  4. Regularize
    • moar data
    • creative data
    • dropout
    • weight decay
    • early stopping
    • decrease batch size
    • moar weights
  5. Tune your hyperparameters
  6. Squeeze out the juice w/ ensembles and lengthened training
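The "fix a random seed" tip from step 2 can be sketched in a few lines. This is a minimal stdlib-only version with a hypothetical `seed_everything` helper; in a real project you would extend it to every randomness source you actually use (e.g. `np.random.seed`, `torch.manual_seed`).

```python
import os
import random

def seed_everything(seed: int = 1337) -> None:
    # Hypothetical helper: fix every source of randomness you use,
    # so two runs of the pipeline produce identical results.
    random.seed(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)
    # In practice, also seed your framework here, e.g.:
    #   np.random.seed(seed); torch.manual_seed(seed)

seed_everything(1337)
a = [random.random() for _ in range(3)]
seed_everything(1337)
b = [random.random() for _ in range(3)]
print(a == b)  # → True
```

Determinism makes every later debugging step in the recipe meaningful: if a change moves the loss, it was the change, not the dice.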
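"Verify loss at init" has a concrete check behind it: a C-class classifier with unbiased, roughly uniform initial softmax outputs should start at a cross-entropy of about -ln(1/C). A small sketch of that sanity check:

```python
import math

def expected_init_loss(num_classes: int) -> float:
    # With uniform initial predictions, each correct class gets
    # probability 1/C, so cross-entropy is -ln(1/C) = ln(C).
    return math.log(num_classes)

# e.g. a 10-class problem like CIFAR-10 should start near 2.303;
# a much higher initial loss usually means a bad init or a bug.
print(round(expected_init_loss(10), 3))
```

If the measured loss at step 0 is far from this value, something (init scale, label handling, loss reduction) is off before training even begins.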
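The "overfit one batch" tip from step 2 is the single most useful smoke test: a model with enough capacity should drive the loss on one tiny fixed batch to (near) zero. A minimal numpy sketch of the idea, using plain linear regression with gradient descent instead of a real network:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))            # one tiny fixed batch
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true                          # realizable targets

w = np.zeros(3)
lr = 0.1
for _ in range(2000):
    pred = X @ w
    grad = 2.0 * X.T @ (pred - y) / len(y)   # d(MSE)/dw
    w -= lr * grad

final_loss = float(np.mean((X @ w - y) ** 2))
print(final_loss < 1e-6)  # loss on the single batch should collapse
```

If the loss refuses to reach near zero on one batch, there is a bug (wrong labels, broken gradient flow, bad loss reduction), and no amount of hyperparameter tuning will fix it.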
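"Use backprop to chart dependencies" means checking that outputs depend only on the inputs they should. Karpathy's version uses `.backward()` on a single output; the same dependency check can be done framework-free by perturbing one sample and confirming only its output moves. A sketch, assuming a per-sample model where rows must be processed independently:

```python
import numpy as np

def model(x):
    # Hypothetical per-sample model: each row should be processed
    # independently (no accidental mixing across the batch dimension).
    return np.tanh(x @ np.full((3, 2), 0.5))

x = np.zeros((4, 3))
base = model(x)

x_poked = x.copy()
x_poked[1] += 1.0                       # perturb only sample 1
delta = np.abs(model(x_poked) - base).sum(axis=1)

changed = [bool(d > 0) for d in delta]
print(changed)  # → [False, True, False, False]
```

A classic bug this catches is a mis-shaped reshape or a batch-norm placed where it leaks information across samples: poking sample 1 would then change other rows' outputs too.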

Go read that post. And this one, and the rest of his site, probably.

Yes, You Should Understand Backprop (by @karpathy)