I’ve finally got around to adding some actual learning to my machine learning libraries, in the form of a few new components.

The training.Complete function provides a way of training machine learning systems. It takes the thing to train, the data to train it against, the algorithm to use for training (e.g. Greedy Random), and a way to calculate the error (how far away from the ideal result we are). The last parameter is an array of ways to stop the training, such as when the context is cancelled, when a channel receives a value, or when the training error drops below a threshold. I’ll probably add some way of showing progress soon too, since at the moment it doesn’t provide any progress indication.

I added the most basic training method (Greedy Random), which just randomly selects new parameters to try and keeps the best set, and a Radial Basis Function (RBF) Network, which can be trained by:

  • Using the training algorithm (currently a choice of Greedy Random or nothing) to set its parameters
  • Running training data through it to get output values
  • Comparing the output values to the ideal result (by using a distance function) to get an error value
  • Comparing the error value returned by previous parameter sets with the latest error value to work out whether the new parameters are better than the old ones
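The steps above can be sketched as a short Go program. This is a minimal illustration of the greedy random idea, not the library’s actual code: it fits `y = m*x + c` to some points by repeatedly proposing random parameters, measuring the error against the ideal outputs, and keeping the best pair seen so far.

```go
package main

import (
	"fmt"
	"math/rand"
)

// sumSquaredError measures how far the line y = m*x + c is from the
// ideal outputs ys at the inputs xs.
func sumSquaredError(m, c float64, xs, ys []float64) float64 {
	var e float64
	for i, x := range xs {
		d := (m*x + c) - ys[i]
		e += d * d
	}
	return e
}

// greedyRandom proposes random parameters and keeps the best pair.
func greedyRandom(xs, ys []float64, iterations int) (m, c, err float64) {
	m, c = rand.Float64(), rand.Float64()
	err = sumSquaredError(m, c, xs, ys)
	for i := 0; i < iterations; i++ {
		// Propose entirely new random parameters in [-10, 10).
		pm := rand.Float64()*20 - 10
		pc := rand.Float64()*20 - 10
		// Keep the proposal only if it reduces the error.
		if e := sumSquaredError(pm, pc, xs, ys); e < err {
			m, c, err = pm, pc, e
		}
	}
	return m, c, err
}

func main() {
	xs := []float64{0, 1, 2, 3}
	ys := []float64{1, 3, 5, 7} // generated by y = 2x + 1
	m, c, err := greedyRandom(xs, ys, 100000)
	fmt.Printf("m=%.2f c=%.2f err=%.4f\n", m, c, err)
}
```

It’s crude, since every proposal ignores everything learned so far except the single best result, but it needs nothing from the model other than a way to set parameters and measure error.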

An RBF Network is very simple. A network is made up of nodes. Each node applies weights to the input vector, then measures the distance of the weighted input from the node’s centre position (randomly initialised here, though other initialisation options are available) using a Gaussian RBF. Each node then outputs another (weighted) vector of values. The output vectors from all of the nodes are summed at each output node to produce one or more output values.

If the objective of the network is classification, the RBF Network can have multiple outputs, one for each class. If the objective is regression, a single output node can be used.
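A forward pass through such a network can be sketched in a few lines of Go. The `node` struct and `forward` function below are illustrative assumptions rather than the library’s actual types (though the field names mirror the JSON output later in the post): each node weights the input, measures its Gaussian distance from the node’s centroid, and contributes a weighted activation to each output.

```go
package main

import (
	"fmt"
	"math"
)

// node holds one RBF node's parameters.
type node struct {
	inputWeights  []float64
	centroid      []float64
	width         float64
	outputWeights []float64
}

// activate weights the input, then returns the Gaussian of its squared
// distance from the node's centroid: 1 at the centroid, decaying with
// distance at a rate set by the width.
func (n node) activate(input []float64) float64 {
	var d2 float64
	for i := range input {
		diff := input[i]*n.inputWeights[i] - n.centroid[i]
		d2 += diff * diff
	}
	return math.Exp(-d2 / (2 * n.width * n.width))
}

// forward sums each node's weighted activation into the output vector.
func forward(nodes []node, input []float64, outputCount int) []float64 {
	outputs := make([]float64, outputCount)
	for _, n := range nodes {
		a := n.activate(input)
		for i := range outputs {
			outputs[i] += a * n.outputWeights[i]
		}
	}
	return outputs
}

func main() {
	nodes := []node{
		{inputWeights: []float64{1, 1}, centroid: []float64{0, 1}, width: 0.5, outputWeights: []float64{1}},
		{inputWeights: []float64{1, 1}, centroid: []float64{1, 0}, width: 0.5, outputWeights: []float64{1}},
	}
	fmt.Println(forward(nodes, []float64{0, 1}, 1))
}
```

For classification you’d size the output vector to the number of classes; for regression, as here, a single output is enough.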

The Gaussian function at the core of this idea takes parameters for the height, centre position and width of the curve, then returns a function which takes input values and returns a scalar result. A lot of books and blog posts show the Gaussian function in one dimension, which is helpful, but it’s hard to see how it relates to higher dimensions.

Here’s an example of the one dimension curve drawn out using Gonum Plot (gonum.org/v1/plot). The code to generate it is at https://github.com/a-h/ml/blob/master/rbf/example/main.go

rbf_1

It’s handy to test things out in OS X’s Grapher too.

grapher.png

To graph the multidimensional Gaussian, I combined the code from The Go Programming Language book which renders an SVG of a function with my really basic library for drawing raster images (github.com/a-h/raster), and created an animation of what happens when the width parameter is adjusted. There’s no antialiasing, so it looks a bit retro. (Finding a nice 3D graphing library is on my todo list.)

rbf_2

I think this makes it easier to see how the RBF Network works. I imagine that each RBF node makes a hill in the 3D landscape, and that if I pick a location on the map using X, Y coordinates, I can see which of the hills I’m on and how far up it I am. Changing the parameters changes where the peak of each hill is, and how steep it is.

To try out the algorithm, I trained a network against the XOR truth table. You can see the full program at https://github.com/a-h/ml/blob/master/rbf/xor/main.go

The output shows the time taken, the number of iterations and the final error, a JSON-serialized network (so you can recreate it without training, though I haven’t tested this yet), and then the results of running the training set through the network. (I’m not worried about overfitting here.)

Press Ctrl-C to shut down.
Time: 1.957635526s
Iterations: 477433
Output error: 0.09695109314376912
[{"InputWeights":[-2.0923689796439877,2.6660390177817224],"Centroid":[-1.794155980764959,-9.852308766380387],"Width":7.141438394360126,"OutputWeights":[0.4412620778186813]},{"InputWeights":[-5.4866501503468665,3.4876428933778865],"Centroid":[4.136444440777179,-6.660279651036265],"Width":3.8851331450578215,"OutputWeights":[-7.278650409955206]},{"InputWeights":[6.7379437378209275,9.541282839128307],"Centroid":[-3.6214098899222877,-0.03886272536273516],"Width":1.5951265378060455,"OutputWeights":[-0.5937881599749701]},{"InputWeights":[-2.0338208289921393,-3.3117242537733507],"Centroid":[2.124486351812136,4.79095478789362],"Width":2.1460248286997015,"OutputWeights":[-1.037878581403465]},{"InputWeights":[-7.0895974586209665,7.76744552775741],"Centroid":[-8.847182211548326,5.019890137568236],"Width":2.3044320083905507,"OutputWeights":[-3.1754478673406066]},{"Outputs":[1]}]
input: [0 0], expected: [0], actual: [0.11712440583840633]
input: [0 1], expected: [1], actual: [0.9542248563534865]
input: [1 0], expected: [1], actual: [0.8578150049517184]
input: [1 1], expected: [0], actual: [-0.08271982804187505]

I’ve also made the training program output the decision space in 3D, so I could see what was produced.

op

Well, that was fun! Maybe time to progress to some more effective algorithms.