The implementation itself is entirely decoupled from the interfaces. This gives us the luxury of implementing different techniques as appropriate, without breaking compatibility.
The fundamental data structure is an array of layers, each containing a set of neurons and the layer's output. The output is an array stored in each layer rather than per neuron, because storing it there simplifies feeding information forward through the neural network.
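As a minimal sketch of this arrangement (the class and field names here are hypothetical, not the module's actual identifiers), the array of layers might look like:

```python
class Layer:
    """One layer: an array of neurons' weights plus the layer's output array."""
    def __init__(self, n_inputs, n_neurons):
        # one weight per input for each neuron, plus a trailing bias weight
        self.weights = [[0.0] * (n_inputs + 1) for _ in range(n_neurons)]
        # the output is stored once per layer rather than per neuron; the
        # next layer reads it directly as its own input array
        self.output = [0.0] * n_neurons

# the fundamental data structure: an array of layers
# (here, 2 inputs -> 3 hidden neurons -> 1 output neuron)
network = [Layer(2, 3), Layer(3, 1)]
```

Because each layer's output array has exactly the shape the next layer expects as input, forward propagation reduces to handing one layer's output to the next.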
The training data is stored separately so that it is only used when required. It has the same structure as the data used for simulation. For each layer, the module stores the derivative of the activation function at the output, as well as the gradient of the error in each neuron. The layers also store the weight deltas, which remember the previous gradient descent step. This allows momentum to be applied to the steepest descent algorithm.
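A sketch of these per-layer training buffers, mirroring the simulation structure (again with hypothetical names), might be:

```python
class TrainingData:
    """Per-layer training buffers, kept separate from the simulation data
    so they are only needed when learning is actually performed."""
    def __init__(self, n_inputs, n_neurons):
        self.derivative = [0.0] * n_neurons   # f'(net) at each neuron's output
        self.gradient = [0.0] * n_neurons     # error gradient in each neuron
        # previous weight deltas, remembered so momentum can be added
        # to the steepest descent step (one delta per weight, bias included)
        self.delta = [[0.0] * (n_inputs + 1) for _ in range(n_neurons)]

# training buffers for a layer with 2 inputs and 3 neurons
td = TrainingData(2, 3)
```

Keeping these buffers in a parallel structure means a network used purely for simulation carries no training overhead.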
Simulation and Learning
The simulation is essentially a set of nested loops. The outer loop processes the layers, propagating the information from the inputs to the outputs. The middle loop handles each of the neurons in the layer, and the inner loop actually computes the net sum of the neuron. Each loop is located in its own function to simplify the code and prevent redundancy. The inner loop is inlined for efficiency.
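The three nested loops can be sketched as follows. This is an illustrative version with assumed names and a sigmoid activation, not the module's inlined implementation:

```python
import math

class Layer:
    """Hypothetical layer: one weight row per neuron (bias last), plus
    the layer's output array."""
    def __init__(self, weights):
        self.weights = weights
        self.output = [0.0] * len(weights)

def neuron_sum(weights, inputs):
    """Inner loop: the net sum of one neuron. The bias is the last weight."""
    total = weights[-1]
    for w, x in zip(weights[:-1], inputs):
        total += w * x
    return total

def simulate(network, inputs):
    """Outer loop over the layers; middle loop over each layer's neurons."""
    signal = inputs
    for layer in network:
        for i, weights in enumerate(layer.weights):
            # squash the net sum with a sigmoid activation (assumed here)
            layer.output[i] = 1.0 / (1.0 + math.exp(-neuron_sum(weights, signal)))
        # the layer's output array becomes the next layer's input
        signal = layer.output
    return signal

# two inputs -> two hidden neurons -> one output
net = [Layer([[0.5, -0.5, 0.0], [1.0, 1.0, -1.0]]),
       Layer([[1.0, -1.0, 0.0]])]
out = simulate(net, [1.0, 0.0])
```

Note how storing the output per layer makes the outer loop trivial: each iteration simply redirects `signal` at the layer just computed.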
The learning algorithms first perform a forward simulation to determine the outputs and the derivatives of the activation function. Then, the error is propagated backward to determine the gradients in each of the neurons. Any algorithm can then be applied to the gradients to adjust the weights. For batch processing, the gradients are summed together, and the sums are used by the other gradient optimization techniques.
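The forward-then-backward pattern, with momentum applied to the steepest descent step, can be sketched as below. This is a minimal sample-by-sample illustration with assumed names, a sigmoid activation, and a single hidden layer, not the module's own code:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# a tiny 2-2-1 network stored as weight rows (bias last), plus the
# remembered weight deltas that carry the momentum between steps
hidden = [[0.5, -0.5, 0.0], [1.0, 1.0, -1.0]]
output = [[1.0, -1.0, 0.0]]
delta_h = [[0.0] * 3 for _ in hidden]
delta_o = [[0.0] * 3 for _ in output]

def train_step(inputs, target, rate=0.5, alpha=0.9):
    """One forward simulation, then backward error propagation, then a
    steepest descent step with momentum. Returns the current output."""
    # forward pass, keeping each layer's output for the backward pass
    h = [sigmoid(w[-1] + w[0] * inputs[0] + w[1] * inputs[1]) for w in hidden]
    o = [sigmoid(w[-1] + w[0] * h[0] + w[1] * h[1]) for w in output]

    # gradient in the output neuron: error times the sigmoid derivative
    grad_o = [(target - o[0]) * o[0] * (1.0 - o[0])]
    # propagate backward: each hidden gradient is its derivative times the
    # sum of downstream gradients weighted by the connecting weights
    grad_h = [h[j] * (1.0 - h[j]) *
              sum(grad_o[k] * output[k][j] for k in range(len(output)))
              for j in range(len(hidden))]

    # steepest descent with momentum:
    # new delta = rate * gradient * input + alpha * previous delta
    for k, w in enumerate(output):
        ins = h + [1.0]
        for j in range(len(w)):
            delta_o[k][j] = rate * grad_o[k] * ins[j] + alpha * delta_o[k][j]
            w[j] += delta_o[k][j]
    for k, w in enumerate(hidden):
        ins = list(inputs) + [1.0]
        for j in range(len(w)):
            delta_h[k][j] = rate * grad_h[k] * ins[j] + alpha * delta_h[k][j]
            w[j] += delta_h[k][j]
    return o[0]

# repeated steps on one pattern drive the output toward the target
before = train_step([1.0, 0.0], 1.0)
for _ in range(100):
    after = train_step([1.0, 0.0], 1.0)
```

For batch processing, the per-sample gradients computed in the backward pass would instead be accumulated across the training set, and the weight update applied once to the summed gradients.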