Monday, March 2, 2015

Notes on implementing neural networks


Here are some notes on implementing deep neural networks.

Visualize as much as possible

No, I don't mean in the sense of simply imagining success, but in the sense of creating visual representations of your models and the training process.  When you write the code to do the training and run it, it can be hard to diagnose what's going on when it just prints out "ERROR RATE" at every iteration.  Even just graphing the error rate makes it easier to tell whether things are converging or diverging than reading the numbers in textual format.
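
As a rough sketch of what I mean, here is how you might collect and plot the error rate with matplotlib.  The train_one_iteration function is a hypothetical stand-in for whatever your real training step is:

    import random
    import matplotlib.pyplot as plt

    def train_one_iteration():
        # hypothetical stand-in for your real training step;
        # returns the error rate measured on this iteration
        return random.random()

    errors = [train_one_iteration() for _ in range(1000)]

    plt.plot(errors)            # a glance at this curve shows convergence
    plt.xlabel('iteration')     # or divergence far faster than scanning
    plt.ylabel('error rate')    # printed numbers
    plt.show()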

Of course this is much more useful if you are learning a visual task, but be creative, and use tools like t-SNE to visualize data that isn't directly visual.  Visualization can end up taking a lot of effort, sometimes more than the actual implementation, but it's worth it in the end.
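
As one possible sketch, scikit-learn ships a t-SNE implementation.  Here the random matrix is just a placeholder standing in for something like your network's hidden-layer activations:

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.manifold import TSNE

    # placeholder for non-visual data you want to inspect, e.g. the
    # hidden-layer activations of your network, one row per example
    X = np.random.randn(500, 128)

    embedded = TSNE(n_components=2).fit_transform(X)  # reduce to 2-D

    plt.scatter(embedded[:, 0], embedded[:, 1], s=5)
    plt.title('t-SNE embedding')
    plt.show()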

Use automatic differentiation

The common advice I see on training neural networks is to always check that your gradient code is correct using the finite-differences method.  I would go further and say that it is worthwhile to use automatic differentiation if possible.  You get precise answers as to what the gradient /should/ be, and when you write the optimized implementation of the gradient calculation, you can compare the results side by side.  If you don't care about speed, you can just run the whole algorithm using the automatically differentiated gradient and not worry about writing any extra code at all.
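
To make this concrete, here is a minimal sketch assuming the HIPS autograd package, with a toy squared-norm loss standing in for a real network's objective.  It compares the automatically differentiated gradient against a central finite-differences estimate, which is exactly the kind of side-by-side check described above:

    import autograd.numpy as np   # drop-in numpy wrapper that records operations
    from autograd import grad

    def loss(w):
        # toy objective; stands in for your network's real loss
        return np.sum(w ** 2)

    auto_grad = grad(loss)        # exact gradient via automatic differentiation

    def finite_diff_grad(f, w, eps=1e-5):
        # central finite differences, the usual gradient check
        g = np.zeros_like(w)
        for i in range(w.size):
            e = np.zeros_like(w)
            e.flat[i] = eps
            g.flat[i] = (f(w + e) - f(w - e)) / (2 * eps)
        return g

    w = np.random.randn(5)
    print(auto_grad(w))                # what the gradient should be
    print(finite_diff_grad(loss, w))   # numerical estimate; should agree closely
    # an optimized hand-written gradient can be compared the same way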