B.6 Training a Network
This section discusses the extra components necessary to train an adaptive system using gradient descent. Don't worry if you don't understand the concepts yet; just focus on the mechanics. The textbook explains the concepts in detail.
Adaptive learning using gradient descent focuses on using the error between the system output and the desired system output to train the system. The learning algorithm adapts the weights of the system based on the error until the system produces the desired output. The Error Criteria family in NeuroSolutions computes different error measures that can be used to train the network.
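The idea of adapting a weight from the error can be sketched in a few lines of Python. This is an illustration only, not NeuroSolutions code: the function name `train_single_weight`, the one-weight model `output = w * x`, and the `step_size` and `epochs` parameters are all assumptions made for the example.

```python
def train_single_weight(samples, step_size=0.05, epochs=50):
    """Fit output = w * x by gradient descent on the squared error.

    Hypothetical helper for illustration; not part of NeuroSolutions.
    """
    w = 0.0
    for _ in range(epochs):
        for x, desired in samples:
            output = w * x
            error = desired - output
            # d(error**2)/dw = -2 * error * x, so step against the gradient.
            w += step_size * 2.0 * error * x
    return w


# Each pair is (input, desired output); the desired mapping is output = 2 * x.
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = train_single_weight(samples)
```

With these samples the weight settles close to 2, the value at which the system output matches the desired output for every sample.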
The Error Criteria palette
The Error Criteria components are typically connected to the output of the network. They have a Desired access point that requires a member of the Input family to provide it data (e.g. a File component or Function Generator). By far the most common criterion is the L2 or Mean Squared Error (MSE) criterion. It simply computes the difference between the system output and the desired signal and squares it.
| Name | Description |
|------|-------------|
| L1 Norm or Absolute Value | The absolute value of the error between the system output and the desired output |
| L2 Norm or Mean Squared Error | The squared difference between the system output and the desired output |
| Lp Norm | The difference between the system output and the desired output raised to the pth power |
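As a rough sketch of what these three criteria compute, the Python below evaluates each one per sample and averages over a small data set. The function names are hypothetical. Taking the absolute value before raising to the pth power is an assumption, and any scaling factors NeuroSolutions may apply internally are not covered here.

```python
def l1_error(output, desired):
    """Absolute value of the error."""
    return abs(desired - output)


def l2_error(output, desired):
    """Squared error."""
    return (desired - output) ** 2


def lp_error(output, desired, p):
    """Error magnitude raised to the pth power (absolute value is an assumption)."""
    return abs(desired - output) ** p


def mean_error(criterion, outputs, desireds, **kwargs):
    """Average a per-sample criterion over all samples."""
    total = sum(criterion(o, d, **kwargs) for o, d in zip(outputs, desireds))
    return total / len(outputs)


outputs = [0.5, 1.0, 2.0]
desireds = [1.0, 1.0, 0.0]

mean_l1 = mean_error(l1_error, outputs, desireds)        # (0.5 + 0 + 2) / 3
mean_l2 = mean_error(l2_error, outputs, desireds)        # (0.25 + 0 + 4) / 3
mean_l3 = mean_error(lp_error, outputs, desireds, p=3)   # (0.125 + 0 + 8) / 3
```

Note how the largest error (2.0) dominates more strongly as the power increases, which is the practical difference between the criteria.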
TUTORIAL EXAMPLE 13
This example studies the Error Criteria component. In this example, we will use the StepExemplar button on the Control toolbar to single-step the network so we can analyze the error computation.
B.6.1 Learning and Backpropagation
Now that we know how to compute the error, we can use the error to modify the weights of the system, allowing it to learn. The goal of the system is for the system output to be the same as the desired output, so we want to minimize the mean squared error. The method used to do this is called error backpropagation. Essentially, it is a three-step process. First, the input data is propagated forward through the network to compute the system output. Next, the error is computed and propagated backward (thus the name backpropagation) through the network. Finally, the backpropagated error is used to modify the weights.
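The three steps can be sketched for a deliberately tiny network: one input, one tanh hidden unit with weight `w1`, and one linear output with weight `w2`. Everything here (the network shape, the function names, the loss of half the squared error, and the step size) is an assumption chosen to keep the example short; it is not how NeuroSolutions implements its components.

```python
import math


def forward(x, w1, w2):
    """Step 1: propagate the input forward to get the system output."""
    hidden = math.tanh(w1 * x)
    output = w2 * hidden
    return hidden, output


def backward(x, hidden, output, desired, w2):
    """Step 2: compute the error and propagate it backward."""
    error = output - desired                 # d(0.5 * error**2) / d(output)
    grad_w2 = error * hidden
    hidden_error = error * w2 * (1.0 - hidden ** 2)   # back through tanh
    grad_w1 = hidden_error * x
    return grad_w1, grad_w2


def train(x, desired, w1, w2, step_size=0.1, steps=500):
    """Repeat forward, backward, and weight update (step 3)."""
    for _ in range(steps):
        hidden, output = forward(x, w1, w2)
        grad_w1, grad_w2 = backward(x, hidden, output, desired, w2)
        w1 -= step_size * grad_w1
        w2 -= step_size * grad_w2
    return w1, w2


w1, w2 = train(x=1.0, desired=0.8, w1=0.5, w2=0.5)
_, final_output = forward(1.0, w1, w2)
```

After training, `final_output` should lie close to the desired value of 0.8, because each update moves the weights in the direction that reduces the squared error.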
NeuroSolutions implements backpropagation of the error in a secondary "plane" that sits on top of the Axons and Synapses. This is called the backpropagation plane.
A network with a backpropagation plane
NeuroSolutions shows the backpropagation plane using smaller versions of the Axons and Synapses stacked on top of them. The backpropagation plane passes the errors backward from the Error Criteria component to the beginning of the network (and manipulates the errors along the way). NeuroSolutions adds a third plane that actually uses the errors in the backpropagation plane to change the weights in the network; this is where the learning actually happens.
This plane is called the gradient descent plane and sits on top of the backpropagation plane. A typical gradient descent component is the Momentum component. Notice in the figure below that only the components with weights use the gradient descent components.
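The text does not give the update rule used by the Momentum component, so the sketch below shows the common textbook form of gradient descent with momentum as an assumption. The function name `momentum_update` and the parameters `step_size` and `momentum` are hypothetical.

```python
def momentum_update(weight, gradient, velocity, step_size=0.1, momentum=0.9):
    """One gradient descent step with momentum (textbook form, assumed).

    The velocity keeps a decaying memory of past gradient steps, so the
    weight keeps moving in directions the gradient agrees on over time.
    """
    velocity = momentum * velocity - step_size * gradient
    return weight + velocity, velocity


# Minimize (w - 3)**2, whose gradient is 2 * (w - 3).
w, v = 0.0, 0.0
history = []
for _ in range(300):
    gradient = 2.0 * (w - 3.0)
    w, v = momentum_update(w, gradient, v)
    history.append(w)
```

In this toy problem the weight overshoots and oscillates around 3 before settling, which is typical behavior when the momentum term is large.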
Network with all three planes: the forward, the backpropagation, and the gradient descent planes
The last thing we need to discuss is the Back Static Controller. This controller sits on top of the Static Controller and controls the backpropagation and the gradient descent planes.
The details of all these components are not necessary at this point, so we will summarize what we have learned in an example.
TUTORIAL EXAMPLE 14
This example uses the Breadboard from the previous example, sets up the backpropagation plane and the gradient descent plane, and allows the system to learn.