Training Models - Sage Wisdom
Sage wisdom for training neural networks: general advice and specific tips.
Training Neural Networks
The following is solid advice from Andrej Karpathy on training neural networks, along with some specific tips for each step. Read his post here: A Recipe for Training Neural Networks (by @karpathy)
- Understand the data. Become one with it. Be the ball.
- Set up an end-to-end pipeline, with some tips:
  - fix a random seed
  - simplify
  - add significant digits to eval
  - verify loss at init
  - init "well"
  - human baseline
  - input-independent baseline - does random or no data perform worse, as expected?
  - overfit one batch
  - verify decreasing training loss - push buttons you would expect to break things and see if they do. Tap on the gauges when they look right, to see if it's real.
  - use backprop to chart dependencies - find bugs in the architecture by doing basic tests
  - generalize a special case - don't bite off more than you can chew. Start with a special case, then generalize, reasonably.
- Overfit
- Regularize
  - moar data
  - creative data
  - dropout
  - weight decay
  - early stopping
  - decrease batch size
  - moar weights
- Tune your hyperparameters
- Squeeze out the juice w/ ensembles and lengthened training
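
A few of the pipeline checks above (fix a random seed, verify loss at init, overfit one batch, input-independent baseline) can be sketched in plain Python. This is a minimal illustration, not code from Karpathy's post: the toy softmax classifier, the hand-made batch `XS`/`YS`, and the helper names `init_params`, `mean_loss`, and `train` are all assumptions made for the example. In a real project the same checks would run against your actual model and framework.

```python
import math
import random


def init_params(n_classes, n_features, seed):
    """Small random weights and zero biases, reproducible for a given seed."""
    rng = random.Random(seed)  # fixed seed: two runs give identical weights
    weights = [[rng.gauss(0.0, 0.01) for _ in range(n_features)]
               for _ in range(n_classes)]
    biases = [0.0] * n_classes
    return weights, biases


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, biases, x):
    logits = [sum(w * xj for w, xj in zip(row, x)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


def mean_loss(weights, biases, xs, ys):
    """Mean cross-entropy over the batch."""
    return sum(-math.log(predict(weights, biases, x)[y])
               for x, y in zip(xs, ys)) / len(xs)


def train(weights, biases, xs, ys, lr=0.5, steps=1000):
    """Plain full-batch gradient descent, updating the parameters in place."""
    n = len(xs)
    for _ in range(steps):
        grad_w = [[0.0] * len(xs[0]) for _ in weights]
        grad_b = [0.0] * len(biases)
        for x, y in zip(xs, ys):
            probs = predict(weights, biases, x)
            for c, p in enumerate(probs):
                err = (p - (1.0 if c == y else 0.0)) / n  # d(loss)/d(logit_c)
                grad_b[c] += err
                for j, xj in enumerate(x):
                    grad_w[c][j] += err * xj
        for c in range(len(weights)):
            biases[c] -= lr * grad_b[c]
            for j in range(len(weights[c])):
                weights[c][j] -= lr * grad_w[c][j]
    return mean_loss(weights, biases, xs, ys)


# One tiny, hand-made batch (hypothetical data): 4 examples, 4 features, 3 classes.
XS = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
      [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
YS = [0, 1, 2, 1]
N_CLASSES = 3

# Fix the random seed: the same seed should give the same starting point.
w, b = init_params(N_CLASSES, 4, seed=0)
same_seed_w, _ = init_params(N_CLASSES, 4, seed=0)
seeds_match = (w == same_seed_w)

# Verify loss at init: near-uniform predictions should cost about log(n_classes).
init_loss = mean_loss(w, b, XS, YS)
expected_init_loss = math.log(N_CLASSES)

# Overfit one batch: the loss should drop close to zero.
overfit_loss = train(w, b, XS, YS)

# Input-independent baseline: zeroed inputs should do clearly worse.
w0, b0 = init_params(N_CLASSES, 4, seed=0)
zero_xs = [[0.0] * 4 for _ in XS]
baseline_loss = train(w0, b0, zero_xs, YS)

# Print with extra significant digits so small differences stay visible.
print(f"seeds match   {seeds_match}")
print(f"init loss     {init_loss:.4f} (expected about {expected_init_loss:.4f})")
print(f"overfit loss  {overfit_loss:.4f}")
print(f"baseline loss {baseline_loss:.4f}")
```

If any of these come out wrong (the initial loss is far from log(n_classes), the single batch will not overfit, or the zeroed-input run does as well as the real one), that points to a bug in the pipeline rather than a modeling problem.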
Go read that post. And this one, and the rest of his site, probably.