Reading about Neural Networks

Someone on my team at work sent around a link to a number of free (at the price of receiving a marketing email) ebooks on Syncfusion’s website. I looked over the EF and MVC books and thought they were really great, so I’d recommend those, but it was James McCaffrey’s book on Neural Networks that interested me the most.

I’d just finished reading Jeff Heaton’s book on the fundamental algorithms of artificial intelligence, and in comparison I found James’s book a much easier read as an introduction, though maybe that’s because the concepts were no longer new to me, and because the concrete code examples are in a language I’m familiar with.

James’s book encourages writing your own neural network code from scratch rather than using a library like Encog or Caffe, which sounds crazy, but I’ve messed around with a few libraries and their learning curves can be so steep that you end up feeling like you’re not making progress. Some of the libraries I tried couldn’t really handle even moderately sized volumes of data, which made the learning effort feel like a bit of a waste of time.
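To give a flavour of what writing one from scratch involves, here’s a minimal sketch of my own (not taken from the book): a tiny one-hidden-layer network learning XOR with hand-rolled backpropagation. The layer sizes, learning rate and iteration count are arbitrary choices for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Tiny 2-4-1 feed-forward network trained on XOR with plain
# gradient descent; all sizes and hyperparameters are arbitrary.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 4))   # input -> hidden weights
b1 = np.zeros((1, 4))
W2 = rng.normal(size=(4, 1))   # hidden -> output weights
b2 = np.zeros((1, 1))
lr = 0.5

for _ in range(10000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass: squared-error loss, chain rule written out by hand
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0, keepdims=True)

print(out.round(3))  # should be close to [[0], [1], [1], [0]]
```

Once you’ve written the forward and backward passes yourself like this, what the libraries are doing under the hood gets a lot less mysterious.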

On the other hand, libraries like Caffe or Torch7 offer the tantalising prospect of running your networks on the GPU without having to write any GPU code yourself (Caffe’s blob abstraction is a great solution to this).

It took me a bit of effort to get my Ubuntu laptop with its Optimus chipset to run Caffe in GPU mode, but when I did, training ran 6 times faster than in CPU mode.
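For what it’s worth, once Caffe itself is built and working, toggling GPU mode is only a couple of calls. This is a minimal sketch using the pycaffe Python bindings; solver.prototxt is a placeholder for your own solver definition.

```python
import caffe

caffe.set_device(0)    # select the first GPU
caffe.set_mode_gpu()   # swap for caffe.set_mode_cpu() to compare timings

# 'solver.prototxt' is a placeholder; point it at your own solver file.
solver = caffe.SGDSolver('solver.prototxt')
solver.solve()         # runs the training loop defined in the prototxt
```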

What I’m interested in now is learning how to choose an appropriate design for a network, e.g. the number of hidden layers and suitable activation functions for different machine learning problems; something like a “Design Patterns for Neural Networks”. I’ve started reading Jeff Heaton’s book on the Encog library to see whether it answers some of these practical questions.

Do you have any recommendations for decent books in this area?