Prediction Risk and Architecture Selection for Neural Networks


Author: Moody, John

Source: From Statistics to Neural Networks: Theory and Pattern Recognition Applications, V. Cherkassky, J.H. Friedman and H. Wechsler (eds.), NATO ASI Series F, Springer-Verlag, 1994

Click to receive a postscript copy of the paper via FTP


Abstract

We describe two important sets of tools for neural network modeling: prediction risk estimation and network architecture selection. Prediction risk is defined as the expected performance of an estimator in predicting new observations. Estimated prediction risk can be used both for estimating the quality of model predictions and for model selection. Prediction risk estimation and model selection are especially important for problems with limited data. Techniques for estimating prediction risk include data resampling algorithms such as nonlinear cross--validation (NCV) and algebraic formulae such as the predicted squared error (PSE) and generalized prediction error (GPE). We show that exhaustive search over the space of network architectures is computationally infeasible even for networks of modest size. This motivates the use of heuristic strategies that dramatically reduce the search complexity. These strategies employ directed search algorithms, such as selecting the number of nodes via sequential network construction (SNC) and pruning inputs and weights via sensitivity based pruning (SBP) and optimal brain damage (OBD) respectively.


NeuroSolutions Home Page