Gradient Definition and Construction

The gradient is formally defined in terms of the partial derivatives of a function f(x, y). Suppose f has partial derivatives at the point (x0, y0). The gradient of f at (x0, y0) is defined by

$$\nabla f(x_0, y_0) = f_x(x_0, y_0)\, u_x + f_y(x_0, y_0)\, u_y$$

where ux, uy are the unit vectors along x and y, and fx, fy are the partial derivatives of f along the x and y directions, respectively, which are given by

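$$f_x(x_0, y_0) = \lim_{h \to 0} \frac{f(x_0 + h, y_0) - f(x_0, y_0)}{h}, \qquad f_y(x_0, y_0) = \lim_{h \to 0} \frac{f(x_0, y_0 + h) - f(x_0, y_0)}{h}$$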
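The partial derivatives can be approximated numerically by a forward difference quotient with a small step h, assuming the usual limit definition of a partial derivative. The following is only a minimal Python sketch: the test function f, the step size h, and the helper names partial_x, partial_y, and gradient are hypothetical and chosen for illustration.

```python
def partial_x(f, x0, y0, h=1e-6):
    # Forward difference quotient along x (finite-h stand-in for the limit).
    return (f(x0 + h, y0) - f(x0, y0)) / h


def partial_y(f, x0, y0, h=1e-6):
    # Forward difference quotient along y.
    return (f(x0, y0 + h) - f(x0, y0)) / h


def gradient(f, x0, y0, h=1e-6):
    # Components of the gradient along the unit vectors ux and uy.
    return (partial_x(f, x0, y0, h), partial_y(f, x0, y0, h))


def f(x, y):
    # Hypothetical example function, not taken from the text.
    return x**2 + 3 * x * y


# Analytically, fx = 2x + 3y and fy = 3x, so the gradient at (1, 2) is (8, 3).
gx, gy = gradient(f, 1.0, 2.0)
```

The numerical values differ from the analytic ones by an error on the order of h, which is why h should be small but not so small that rounding error dominates.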

The gradient is associated with the concept of a directional derivative of a function. Let us assume we have a direction given by the unit vector u = aux + buy (so that a^2 + b^2 = 1). The directional derivative of f at (x0, y0) along u is

$$D_u f(x_0, y_0) = \lim_{h \to 0} \frac{f(x_0 + a h, y_0 + b h) - f(x_0, y_0)}{h}$$

The directional derivative can thus be written in terms of the gradient:

$$D_u f(x_0, y_0) = \nabla f(x_0, y_0) \cdot u = a\, f_x(x_0, y_0) + b\, f_y(x_0, y_0)$$

where the operation is the dot product of two vectors (for v = cux + duy, v · u = ac + bd).
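As a rough numerical check, the directional derivative computed from a difference quotient along u can be compared with the dot product of the gradient and u. This is a sketch under stated assumptions: the function f, its analytic gradient grad_f, the point (1, 2), and the direction angle 0.3 rad are all hypothetical choices for illustration.

```python
import math


def f(x, y):
    # Hypothetical example function.
    return x**2 + 3 * x * y


def grad_f(x, y):
    # Analytic gradient of the example function: (fx, fy).
    return (2 * x + 3 * y, 3 * x)


def directional_derivative(f, x0, y0, a, b, h=1e-6):
    # Difference quotient along u = a*ux + b*uy (finite-h stand-in for the limit).
    return (f(x0 + a * h, y0 + b * h) - f(x0, y0)) / h


x0, y0 = 1.0, 2.0
a, b = math.cos(0.3), math.sin(0.3)  # unit direction, a^2 + b^2 = 1

numeric = directional_derivative(f, x0, y0, a, b)
gx, gy = grad_f(x0, y0)
dot = gx * a + gy * b  # gradient dotted with u
```

The two values, numeric and dot, should agree up to an error on the order of h.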

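For a unit vector u, the dot product satisfies $\nabla f \cdot u = |\nabla f| \cos\theta$, where θ is the angle between u and the gradient. This expression means that the maximum value of the directional derivative, as a function of the direction u, is the magnitude of the gradient, and that it occurs exactly when u coincides with the gradient direction (θ = 0).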
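The maximization over directions can be illustrated by sweeping unit vectors around the circle and recording the directional derivative for each. The sketch below assumes a hypothetical gradient vector (8, 3) and a grid of 3600 angles; both are illustrative choices, not values from the text.

```python
import math

gx, gy = 8.0, 3.0  # hypothetical gradient at some point (x0, y0)

# Unit directions u = (cos t, sin t) sampled every 0.1 degree.
n = 3600
angles = [2 * math.pi * k / n for k in range(n)]
values = [gx * math.cos(t) + gy * math.sin(t) for t in angles]

best_value = max(values)                      # largest directional derivative found
best_angle = angles[values.index(best_value)]  # direction where it occurs

grad_norm = math.hypot(gx, gy)   # magnitude of the gradient
grad_angle = math.atan2(gy, gx)  # direction of the gradient
```

Up to the resolution of the angle grid, best_value matches grad_norm and best_angle matches grad_angle.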

Moreover, this direction is easy to find. Consider the curve C in the x, y plane along which the function f has a constant value (this curve is called the level curve or the contour of f). At a point (x0, y0) on C, the rate of change of f in the direction of the unit vector u tangent to C must be zero (see the preceding definition), that is,

$$D_u f(x_0, y_0) = \nabla f(x_0, y_0) \cdot u = 0$$

But this implies that the gradient vector is perpendicular to the tangent vector u of the level curve at x0,y0. This explains the graphical construction outlined in the text.
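The perpendicularity can be checked numerically on a level curve that has a known parametrization. The sketch below assumes the hypothetical function f(x, y) = x^2 + 4y^2, whose level curves are ellipses x = c cos t, y = (c/2) sin t; the values of c and t are arbitrary illustrative choices.

```python
import math


def grad_f(x, y):
    # Analytic gradient of the hypothetical function f(x, y) = x**2 + 4*y**2.
    return (2 * x, 8 * y)


c, t = 2.0, 0.7  # level curve f = c**2 and a parameter value on it

# Point on the level curve.
x0, y0 = c * math.cos(t), 0.5 * c * math.sin(t)

# Tangent to the level curve: derivative of (x, y) with respect to t, normalized.
tx, ty = -c * math.sin(t), 0.5 * c * math.cos(t)
norm = math.hypot(tx, ty)
ux, uy = tx / norm, ty / norm

gx, gy = grad_f(x0, y0)
dot = gx * ux + gy * uy  # directional derivative along the tangent
```

Here dot is zero up to rounding error, consistent with the gradient being perpendicular to the tangent of the level curve.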
