

Introduction to Evaluation Version
This version of Neural and Adaptive Systems: Fundamentals Through Simulations contains the Preface, Chapter 1, and Appendix B (the NeuroSolutions Tutorial) and is for evaluation purposes only. No part of this text may be copied, reproduced, or otherwise duplicated without written permission from John Wiley & Sons and NeuroDimension, Inc.
NOTE: The HTML version of this chapter is for demonstration only and does not link to NeuroSolutions, which runs the simulations. Nevertheless, you can imagine the interaction between the text and the simulations.
Preface
This book presents neurocomputing from a different perspective. Throughout the discussion we blend the power of a software simulator with the theory of neurocomputing to constantly reinforce the synergism between the conceptual equations and their behavior in practical neural systems. In our presentation model, which we call an interactive electronic book (i-book) [Principe et al., 2000], the text co-exists with a functional simulator. This is quite different from the two common methods of integrating a text and computer software. We do not simply add computer-based examples to the end of the text; we use computer simulations early and often in the presentation. In addition, the i-book is not simply a laboratory manual where exercises are wrapped with text. We have reorganized the theory of neurocomputing into fine-grain conceptual modules, each of which is illustrated with a simple simulation to enhance conceptual understanding throughout the text. At the end of every chapter there is at least one full simulation, incrementally built and explained throughout the chapter, that can be used for a project. The simulation thus becomes an essential piece of information delivery, deeply affecting the organization of the textbook, and allowing visualization and reader interaction with the material.
Our intent is to present the concepts at a level that can be understood by senior-level undergraduate students in science and engineering. The link between the hypertext and the simulator creates an interactive learning environment especially appropriate for self-guided study of neural networks and adaptive systems. Our more practical, simulation-driven approach will be particularly beneficial to professionals in the fields of science, engineering, and economics.
We strive to fully cover the major topics in neurocomputing, from theory to applications. Because adaptation is the key to neurocomputing, we give it special attention, covering both the supervised and unsupervised paradigms. We treat neural networks as a nonlinear extension of linear adaptive systems, thus discussing both linear and nonlinear adaptive topologies in detail. In fact, we start with the well-known regression problem to introduce the idea of data fitting and many of the concepts of adaptation. Time, so important for engineering applications, is covered from the perspective of digital signal processing and is integrated with neural topologies. Our hope is to show that adaptive filters and neural networks, normally taught in two different disciplines, can be integrated under the common theme of neural and adaptive systems. Each chapter contains extensive simulation examples (more than 200 simulations overall) to illustrate the concepts, and to provide for further exploration with instructor or student supplied data. We emphasize the design and use of neural and adaptive systems. Readers can use the electronic book in other disciplines, or for senior projects, master theses, or even Ph.D. dissertations.
The textbook is presented in two formats: an electronic version on CD-ROM, consisting of a hypertext document linked with a software simulator, and the paper version duplicating the text material.
Electronic Version
The electronic version is a hypertext document in the Windows help format, so any PC with the Windows NT or Windows 95 (or higher) operating system will be able to install and run the interactive book. The hypertext is linked to NeuroSolutions, a neural network simulator developed by NeuroDimension that is included on the CD-ROM. The simulator is called from the hypertext by clicking on an icon. The simulation examples include fully functional neural networks with explanations of the network topology, and directions for the reader to use and study the relevant aspects of the simulation. A key point is that the simulator is open to experimentation: students can change parameters and topologies, and open new probes to answer "what-if" questions, which normally leads to a much deeper understanding of the concepts. Once the simulations are complete, control returns to the point in the text from which the simulator was called. Hence a seamless integration of the text with the simulator is achieved.
Because of this organization, it is very important that the reader (as well as the instructor) learns how to interact with NeuroSolutions. We include a NeuroSolutions tutorial as Appendix B, and we strongly encourage everyone to consult and run the different examples in the tutorial in order to learn how to create, configure, probe, and adequately control the simulator.
The text is geared toward the presentation of the fundamental concepts of adaptive systems and how they can be applied in practice. Derivation of equations is encapsulated in hot links (which we call know-more boxes) that can be called from the main text but can be skipped in a first reading pass. Important definitions, equations, and references are accessible in pop-up windows.
The text can be navigated in several different ways. The beginning of each chapter contains hyperlinks to its section headings, and a hyperlink to the preface, as well as the next chapter. At the end of each section there is a hyperlink to the next section, providing a sequential, paper-book style of organization. Hot links to the appendix are called from the text (return to the text with the Back button in the control bar). At the end of each chapter we present hyperlinks to all the examples of the chapter, and a concept map with hot-links to all the chapter section headings and to conceptually related chapters.
Using the Electronic Book
As mentioned above, the electronic text is in the Windows Help file format. For those not familiar with the Windows Help system, this section will briefly introduce the navigation through the book.
Navigation through help files is done by using the help file toolbar and through hyperlinks. The following is an image of the help file toolbar:
Pressing the [button image] button moves to the next topic in the sequence. Pressing the [button image] button moves to the previous topic in the sequence. If you click on a hyperlink (described below), press the "Back" button to return to the original position in the text. Pressing the [button image] button brings up the table of contents and keyword search panel shown below.
Selecting a chapter and clicking "Open" (or double-clicking the chapter) in the table of contents opens all the subsections of that chapter. Clicking "Display" on a subsection moves immediately to that section of the text. Clicking on the "Index" tab of the window allows you to search through the keywords of the text. Clicking on the "Find" tab allows you to search through the text of the book.
There are three types of hyperlinks:
· Pop-up windows that are used for definitions, references, and footnotes,
· Know More Boxes that contain more detailed information such as derivations, and
· Hyperlinks to related topics in other sections or chapters. Press the "back" button in the help file toolbar to return to the original position in the text.
Simply click on any of these hyperlinks to see the related information.
Paper Version
The paper version is a carbon copy of the electronic hypertext and it is composed of three fundamental elements: the text, the know-more boxes, and the NeuroSolutions examples. The know-more boxes are marked in the left margin with gray diamonds, and they contain further details about the topic (normally derivations). They correspond to the hot-links in the hypertext. They can be skipped in a first reading, but no thorough understanding of the material is possible without studying them. The NeuroSolutions examples are shaded boxes, and they contain the details to run the simulator.
We suggest that even when using the paper version of the textbook, the reader should execute the examples on the computer as soon as they appear in the text because they are intrinsically bound to the material, and enhance the understanding. For convenience, at the end of each chapter in the electronic version we provide a list of the examples with their corresponding hot-links as a quick reference for readers of the paper version. After executing the example, the reader can go back to the example listing by clicking on the Back button of the control bar.
Topics
The material is divided into 11 chapters, 3 appendices, and a glossary. Chapter 1 covers the concept of data fitting with the linear model, and at the same time, gradient-descent learning. We decided to start with the linear regression model because it is at least vaguely familiar to students in science, economics, and engineering. Linear regression exemplifies the central theme of the book, which is adapting parameters from data. The link between least squares and the search for the minimum of the mean square error is established in this chapter. Gradient-descent learning and the elegant LMS algorithm are also presented here. The style of computation and the properties of the iterated solution are covered for single and multiple regression. Newton's method is presented as well. At the end of the first chapter a project using data from the Internet is set up to help students apply what they learned to real-world data.
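As a rough illustration of the link between least squares and LMS described for Chapter 1, the following Python sketch fits a line to a few points twice: once with the closed-form least-squares solution and once with sample-by-sample LMS updates. The data, step size, number of passes, and function names are assumptions made for this illustration; they are not taken from the book or from NeuroSolutions.

```python
def least_squares_line(xs, ys):
    """Closed-form least-squares slope and intercept for y = w*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / sxx
    return w, mean_y - w * mean_x


def lms_line(xs, ys, step=0.05, epochs=500):
    """LMS: adjust w and b after every sample using the instantaneous error."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            err = y - (w * x + b)   # error for this single sample
            w += step * err * x     # move against the gradient of err**2
            b += step * err
    return w, b


# Made-up, noise-free data lying on the line y = 2x + 1.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [1.0, 2.0, 3.0, 4.0, 5.0]

w_ls, b_ls = least_squares_line(xs, ys)
w_lms, b_lms = lms_line(xs, ys)
print("least squares:", w_ls, b_ls)
print("LMS:          ", w_lms, b_lms)
```

With a sufficiently small step size the iterated LMS estimate should approach the least-squares solution; a step that is too large makes the iteration diverge, which is one of the properties worth exploring in a simulator.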
Chapter 2 presents the concepts of statistical pattern recognition. This chapter formulates classification as the placement of discriminant functions in pattern space to minimize the probability of the classification error. We stress the difference between data fitting and classification but show that the same basic methodology can be used to train both classifiers and regressors. The design of the Bayes classifier is introduced. We also present the concept of using high-dimensional spaces to simplify the placement of the discriminant function. The concepts of linear and quadratic classifiers, and their links to likelihoods, are also discussed.
Chapter 3 studies multilayer perceptrons (MLPs). We systematically treat the processing power of the topology and the adaptation of the parameters as two different aspects of the problem. We start very simply, with a single McCulloch-Pitts processing element in two-dimensional space to build up intuition. We define the perceptron and show how to adapt its parameters with gradient-descent learning by using the chain rule. We also present a large-margin learning rule (the Adatron algorithm) linked to the perceptron learning rule. We then introduce the need for more powerful topologies. The study of the one-hidden-layer MLP is first conducted with fixed parameters (in terms of discriminant functions), and then backpropagation is derived and applied. The two-hidden-layer MLP is also studied as a topology and adapted using backpropagation. The method of ordered derivatives, important for practical implementations, is covered, and its implications for constructing general-purpose simulators are explored. Since NeuroSolutions uses a data-flow implementation, it very clearly illustrates the concepts of backpropagation and ordered derivatives. Training adaptive subsystems embedded in larger fixed-parameter systems, so often forgotten but so relevant to solving practical problems, is also addressed with examples. The chapter finishes with the interpretation of an MLP as an a posteriori probability estimator.
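To give a flavor of the starting point of Chapter 3, here is a minimal Python sketch of a single threshold processing element in two-dimensional space, trained with the classical perceptron learning rule on the logical AND problem. The data set, step size, number of passes, and function names are assumptions chosen for this illustration and are not the book's own example.

```python
def threshold_pe(weights, bias, x):
    """McCulloch-Pitts style element: fires (1) when the weighted sum is positive."""
    s = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if s > 0 else 0


def train_perceptron(samples, targets, step=1.0, epochs=20):
    """Perceptron learning rule: change the weights only when the output is wrong."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, t in zip(samples, targets):
            err = t - threshold_pe(weights, bias, x)   # -1, 0, or +1
            weights = [w + step * err * xi for w, xi in zip(weights, x)]
            bias += step * err
    return weights, bias


# Logical AND: a linearly separable problem in two dimensions.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]

weights, bias = train_perceptron(samples, targets)
predictions = [threshold_pe(weights, bias, x) for x in samples]
print("weights:", weights, "bias:", bias)
print("predictions:", predictions)
```

Because AND is linearly separable, the rule settles on a separating line after a few passes. A problem such as XOR is not separable by a single element, which is the kind of limitation that motivates the more powerful topologies mentioned above.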
Chapter 4 delves into the details of how to use MLPs in real-world applications. The practical aspects of initialization, alternative search methods, topology size, and stopping criteria are all addressed. We start by pointing out the stochastic nature of training and its implications. We cover only first-order search methods in detail (gradient descent, momentum, adaptive step sizes, and noise), but we present the full framework. Cross-validation is presented as the preferred method of stopping the training, but other methods are also covered. For good generalization, we demonstrate the use of weight decay as a way to control the size of topologies. Norm selection is also addressed as an extra control of performance. We explain committees as a practical way of controlling the variance of the estimator. After covering all of these aspects, we present two practical applications of MLPs, and we set up a simulation again using the Internet.
Chapter 5 presents the unifying view that regression and classification are special cases of the more general problem of function approximation. Adaptive systems should ultimately be interpreted as parametric function approximators. This chapter is a bit more demanding in terms of mathematics, but the goal is to introduce the problem of function approximation and the use of MLPs and radial basis functions (RBFs) as universal function approximators. The statistical view of nonlinear regression is provided, and RBFs are also used as classifiers. We finish the chapter by providing a very brief introduction to support vector machines. Two projects, one with financial data and the other with housing prices, are constructed in the simulator.
Chapter 6 covers the principles of Hebbian learning. The estimation of correlation with local rules is stressed and shown to be a universal principle in learning systems. We present principal component analysis (PCA) as a robust and efficient data reduction methodology. The extensions of Hebbian learning to forced- and anti-Hebbian learning are also covered. Linear associative memories and their applications to both heteroassociation and autoassociation receive a special place in the presentation.
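The connection between Hebbian learning and PCA can be sketched with Oja's rule, a normalized Hebbian update that is one common way to extract the first principal component with a purely local rule. The preface does not say which rule the book uses, so the choice of Oja's rule, the data, the step size, and the names below are all assumptions for illustration.

```python
import math

# Made-up zero-mean 2-D data, stretched along the (1, 1) direction.
data = [(a + b, a - b) for a in (-2.0, -1.0, 1.0, 2.0) for b in (-0.3, 0.3)]


def oja_first_component(samples, step=0.01, epochs=300):
    """Oja's rule: Hebbian term y*x plus a decay term -y*y*w that bounds the weights."""
    w = [1.0, 0.0]                       # arbitrary starting direction
    for _ in range(epochs):
        for x in samples:
            y = w[0] * x[0] + w[1] * x[1]            # output of the linear element
            w = [wi + step * y * (xi - y * wi) for wi, xi in zip(w, x)]
    return w


w = oja_first_component(data)
norm = math.hypot(w[0], w[1])
print("weight vector:", w, "norm:", norm)
```

For this data set the weight vector should end up close to unit length and roughly aligned with the (1, 1) direction, the direction of largest variance. The update uses only the input, the output, and the current weight, which is what "local rule" means here.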
Chapter 7 is devoted to competition, the other major learning principle. We apply competitive networks to clustering, and develop the Kohonen self-organizing network from the point of view of a topology-preserving map. Modular networks are briefly reviewed, and applications are presented.
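A minimal sketch of competitive learning applied to clustering is given below: for each sample only the closest prototype (the winner) is moved toward that sample. The two clusters, the initial prototypes, the step size, and the function names are assumptions made for this illustration. The neighborhood function that turns competition into a Kohonen self-organizing map is deliberately left out.

```python
def nearest(prototypes, x):
    """Index of the prototype with the smallest squared distance to x."""
    dists = [sum((pi - xi) ** 2 for pi, xi in zip(p, x)) for p in prototypes]
    return dists.index(min(dists))


def competitive_learning(samples, prototypes, step=0.1, epochs=50):
    """Winner-take-all update: only the winning prototype moves toward the sample."""
    protos = [list(p) for p in prototypes]
    for _ in range(epochs):
        for x in samples:
            win = nearest(protos, x)
            protos[win] = [pi + step * (xi - pi) for pi, xi in zip(protos[win], x)]
    return protos


# Two made-up, well-separated clusters in two dimensions.
cluster_a = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (0.2, 0.2)]
cluster_b = [(3.0, 3.0), (3.2, 3.0), (3.0, 3.2), (3.2, 3.2)]

protos = competitive_learning(cluster_a + cluster_b, [(1.0, 1.0), (2.0, 2.0)])
print("prototypes:", protos)
```

Each prototype should settle inside one cluster. With a less favorable initialization a prototype may never win and therefore never move, a practical issue that simple winner-take-all schemes share.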
Chapter 8 covers the fundamentals of extracting (and quantifying) information in time. The relationship between time signals and vectors is stressed. We motivate filtering as projections in vector spaces. This chapter is a summary of the fundamental concepts of linear signal and systems analysis. The concepts of impulse response, convolution, frequency response, and transfer function as descriptors of system properties are emphasized. This chapter also covers Fourier transforms, and a simple filter design method.
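The role of the impulse response as a descriptor of a linear system can be sketched in a few lines of Python: the output of a linear time-invariant filter is the convolution of its input with its impulse response. The two-point averaging filter and the test signals below are assumptions chosen for illustration.

```python
def convolve(signal, impulse_response):
    """Direct-form discrete convolution of two finite sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h      # each input sample launches a scaled copy of h
    return out


h = [0.5, 0.5]                  # impulse response of a two-point moving average
impulse = [1.0, 0.0, 0.0]       # unit impulse
step = [1.0, 1.0, 1.0, 1.0]     # short step-like input

impulse_out = convolve(impulse, h)
step_out = convolve(step, h)
print("response to impulse:", impulse_out)   # [0.5, 0.5, 0.0, 0.0]
print("response to step:   ", step_out)      # [0.5, 1.0, 1.0, 1.0, 0.5]
```

Feeding in a unit impulse returns the impulse response itself, which is why that single sequence is enough to predict the response to any other input.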
The remaining chapters exploit combinations of previous topics. Chapter 9 brings together the concepts of regression and time processing in the form of adaptive filters. The idea of optimal filters is presented, and the LMS algorithm is utilized for adaptation. We present system identification from the point of view of function approximation. Hebbian learning extended to time produces eigenfilters and Karhunen-Loeve transforms. This chapter contains a showcase of examples to illustrate the exceptional applicability of the simple linear adaptive network. Over 20 practical examples are offered, including
· System identification
· Prediction
· Model-based spectral analysis
· Noise cancellation
· Interference cancellation
· Echo cancellation
· Inverse modeling
· Inverse adaptive controls
· Eigenfiltering
· PCA in time
· Subspace spectral analysis
· Blind source separation
Readers may modify all of these models to solve their own problems.
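As one concrete instance of the first item in the list above, the following Python sketch uses the LMS algorithm to identify an unknown FIR system from its input and output. The "unknown" taps, the random input, the step size, and the function names are all assumptions made for this illustration; this is not one of the book's NeuroSolutions examples.

```python
import random


def fir_filter(inputs, taps):
    """Run an FIR filter over the input using a tapped delay line."""
    delay_line = [0.0] * len(taps)
    out = []
    for x in inputs:
        delay_line = [x] + delay_line[:-1]           # shift in the new sample
        out.append(sum(h * xi for h, xi in zip(taps, delay_line)))
    return out


def lms_identify(inputs, desired, n_taps, step=0.1):
    """Adapt FIR weights with LMS so the filter output tracks the desired signal."""
    w = [0.0] * n_taps
    delay_line = [0.0] * n_taps
    for x, d in zip(inputs, desired):
        delay_line = [x] + delay_line[:-1]
        y = sum(wi * xi for wi, xi in zip(w, delay_line))
        err = d - y                                   # instantaneous error
        w = [wi + step * err * xi for wi, xi in zip(w, delay_line)]
    return w


rng = random.Random(0)                                # fixed seed for repeatability
unknown_taps = [0.5, -0.3, 0.2]                       # hypothetical system to identify
inputs = [rng.uniform(-1.0, 1.0) for _ in range(3000)]
desired = fir_filter(inputs, unknown_taps)            # noise-free system output

estimated = lms_identify(inputs, desired, n_taps=3)
print("estimated taps:", estimated)
```

In this noise-free setting the adapted weights should approach the taps of the unknown system. Adding measurement noise to `desired`, or giving the adaptive filter fewer taps than the system, are simple modifications for exploring how the estimate degrades.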
Chapter 10 brings together time and nonlinear processing elements to yield systems that can be interpreted either as extensions of the MLP to time processing, or as nonlinear extensions of adaptive filters. The topologies covered are called time-lagged-feedforward networks (TLFNs), and they are feedforward combinations of linear filters with nonlinear PEs. The adaptation can still be performed with static backpropagation. The design of short-term memories is covered in detail. Many problems can be solved using these intermediate topologies. We present examples of temporal pattern recognition, nonlinear system identification, and nonlinear prediction.
Chapter 11 teaches the training of recurrent topologies in time, and presents the general case of first-order distributed dynamic systems. We extend static backpropagation to the training of recurrent systems, for both static patterns (fixed point learning) and for trajectory learning. Real-time recurrent learning (RTRL) and backpropagation through time (BPTT) are presented and compared. We discuss applications to nonlinear system identification and control. This chapter covers the Hopfield model, and the concept of computational energy. The hierarchy of neural models is introduced with the Grossberg additive model. The last topic is an extension of first-order dynamics to systems that are locally stable, but globally chaotic, to illustrate some of the challenges ahead in the field of neurocomputing.
Appendix A provides a topical view of matrix computations and probabilities, and is a quick reference to the underlying concepts necessary to understand the text material. Appendix B is the NeuroSolutions tutorial, and it is a must for the student (and instructor) who is not familiar with the software package. It contains the basic concepts to construct networks, configure and probe components, and set up the simulations. A description of the data included on the CD-ROM is found in Appendix C. For quick reference, we also include a glossary with definitions of the terms used in the text.
The extensive coverage of topics makes the book impossible to cover in one semester, but permits different organizations of topics depending on the special goals of the instructor. Many topic sequences are possible. Chapters 1 through 5 form a short sequence that can be used as an introduction to neural networks covering regression, multilayer perceptrons, and radial basis functions. A second course could cover Chapters 6, 7, 9, 10, and 11, which include unsupervised learning rules, SOMs, and temporal neural networks. Chapters 1, 8, and 9 cover linear adaptive systems and filters and can be used separately or as an introduction to the nonlinear systems.
Classroom Experience and the Simulator
The i-book can be used in a normal classroom setting, improving the explanation of the material through the simulations with real data, and providing the students with a way to practice at home what has been demonstrated in class. However, the full potential of the i-book is achieved with some improvements to the conventional classroom format. We would like to share our experience of teaching an elective course for electrical engineering undergraduates at the University of Florida using this electronic book. For further information, please consult the paper [Principe et al., 2000]. The course is taught in a laboratory format where we use a computer projector and an electronic white board, and each student has access to a computer during every lecture. The white board allows the text to be interactive, and examples are executed by touching the board. We cover a concept for 15 minutes and then allow the students 15 minutes to run the simulator and explore the example. Then we conclude and move to the next topic. The text naturally provides conceptual boundaries, with each concept illustrated by an example. The function of the instructor in this environment is as a conductor: explaining the concepts, answering questions, and timing the experiments. Students are allowed two hours of self-study a day in the classroom. Office hours are conducted in the classroom. Projects involve searches for data on the Internet.
The simulator is one of the central tools of learning. We gradually introduce the most important parts of the simulator in Chapters 1 and 3, alongside the material in those chapters. However, we recommend that the instructor be familiar with the simulator before the course begins. The simulator's user interface is iconic, which makes learning easy. Icons are organized into families: five neural component families (Synapse, Axon, search, and supervised and unsupervised criteria) and simulation and visualization functions such as Probes, Controllers, Inputs, Transmitters, and Schedulers. Each icon has a corresponding Inspector that changes the component parameters. The mechanics are intuitive, with drag-and-drop functionality. Configuration is a bit more detailed, but the user interface mimics the design steps, so the choices are logical if the material is understood. Students can use other data supplied on the CD-ROM, and even modify the topologies given in the examples up to the limits imposed by the software developer.
The simulator is such an important part of the course presentation that we include a tutorial with the book. The instructor should master the simulator since it facilitates the answers to "what-if" questions that make learning more interesting and productive. Students should also learn the simulator so that they can run the examples independently, make modifications for better learning, work out the problems at the end of the chapters, and apply newly acquired knowledge to projects.
References:
Principe, J., N. Euliano, and C. Lefebvre, "Innovating Adaptive and Neural Systems Instruction with Interactive Electronic Books," Proceedings of the IEEE, special issue on engineering education, January 2000 (in press).