
# comp.ai.neural-nets FAQ, Part 3 of 7: Generalization, Section: How can generalization error be estimated?


Top Document: comp.ai.neural-nets FAQ, Part 3 of 7: Generalization
Previous Document: How many hidden units should I use?
Next Document: What are cross-validation and bootstrapping?
```
There are many methods for estimating generalization error.

Single-sample statistics: AIC, SBC, MDL, FPE, Mallows' C_p, etc.
In linear models, statistical theory provides several simple estimators
of the generalization error under various sampling assumptions
(Darlington 1968; Efron and Tibshirani 1993; Miller 1990). These
estimators adjust the training error for the number of weights being
estimated, and in some cases for the noise variance if that is known.

These statistics can also be used as crude estimates of the
generalization error in nonlinear models when you have a "large" training
set. Correcting these statistics for nonlinearity requires substantially
more computation (Moody 1992), and the theory does not always hold for
neural networks due to violations of the regularity conditions.
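
As a minimal sketch in Python, assuming the common Gaussian-noise forms
of the criteria (additive constants that do not affect model comparison
are dropped): sse is the training sum of squared errors, n the number
of training cases, and p the number of estimated weights.

import math

def aic(sse, n, p):
    # Akaike's Information Criterion: n*ln(SSE/n) + 2*p
    return n * math.log(sse / n) + 2 * p

def sbc(sse, n, p):
    # Schwarz's Bayesian Criterion (BIC): n*ln(SSE/n) + p*ln(n);
    # it penalizes extra weights more heavily than AIC when ln(n) > 2
    return n * math.log(sse / n) + p * math.log(n)

# comparing two fitted networks on n = 100 training cases:
small = sbc(sse=120.0, n=100, p=5)    # 5 weights
large = sbc(sse=110.0, n=100, p=25)   # 25 weights, slightly lower SSE
# the lower criterion wins; here the 20 extra weights cost more than
# the modest SSE reduction buys, so the smaller net is preferred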

Among the simple generalization estimators that do not require the noise
variance to be known, Schwarz's Bayesian Criterion (known as both SBC and
BIC; Schwarz 1978; Judge et al. 1980; Raftery 1995) often works well for
NNs (Sarle 1995, 1999). AIC and FPE tend to overfit with NNs. Rissanen's
Minimum Description Length principle (MDL; Rissanen 1978, 1987, 1999) is
closely related to SBC. A special issue of the Computer Journal contains
several articles on MDL, which can be found online at
http://www3.oup.co.uk/computer_journal/hdb/Volume_42/Issue_04/
Several articles on SBC/BIC are available at the University of
Washington's web site at http://www.stat.washington.edu/tech.reports

For classification problems, the formulas are not as simple as for
regression with normal noise. See Efron (1986) regarding logistic
regression.

Split-sample or hold-out validation.
The most commonly used method for estimating generalization error in
neural networks is to reserve part of the data as a "test" set, which
must not be used in any way during training. The test set must be a
representative sample of the cases that you want to generalize to. After
training, run the network on the test set, and the error on the test set
provides an unbiased estimate of the generalization error, provided that
the test set was chosen randomly. The disadvantage of split-sample
validation is that it reduces the amount of data available for both
training and validation. See Weiss and Kulikowski (1991).
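
A minimal sketch of the split in Python (the toy data and the 25%
test fraction are arbitrary choices for illustration):

import random

def split_sample(data, test_fraction=0.25, seed=0):
    # randomly reserve a fraction of the cases as the hold-out test
    # set; the rest is the training set
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

data = [(x, 2.0 * x) for x in range(40)]   # toy (input, target) pairs
train, test = split_sample(data)
# train the network on `train` only; its average error on `test` is
# then the estimate of generalization error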

Cross-validation (e.g., leave one out).
Cross-validation is an improvement on split-sample validation that allows
you to use all of the data for training. The disadvantage of
cross-validation is that you have to retrain the net many times. See
"What are cross-validation and bootstrapping?".
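
The procedure can be sketched in Python as follows; `fit` and `error`
stand in for whatever training and error routines you use
(hypothetical placeholders, not part of any particular library):

def cross_validate(data, k, fit, error):
    # k-fold cross-validation: retrain k times, each time holding
    # out one fold for validation, and average the k validation
    # errors; with k == len(data) this is leave-one-out
    folds = [list(range(i, len(data), k)) for i in range(k)]
    errs = []
    for fold in folds:
        held_out = set(fold)
        train = [data[i] for i in range(len(data)) if i not in held_out]
        valid = [data[i] for i in fold]
        model = fit(train)
        errs.append(error(model, valid))
    return sum(errs) / k

# toy use: the "model" is just the mean of the training targets
targets = [1.0, 2.0, 3.0, 4.0]
mean_fit = lambda train: sum(train) / len(train)
sq_err = lambda m, valid: sum((y - m) ** 2 for y in valid) / len(valid)
loo = cross_validate(targets, k=4, fit=mean_fit, error=sq_err)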

Bootstrapping.
Bootstrapping is an improvement on cross-validation that often provides
better estimates of generalization error at the cost of even more
computing time. See "What are cross-validation and bootstrapping?".
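
One common variant, the leave-one-out bootstrap, can be sketched in
Python as follows (`fit` and `error` are again hypothetical
placeholders for your own training and error routines):

import random

def bootstrap_error(data, fit, error, n_boot=100, seed=0):
    # train on a resample of the data drawn with replacement, then
    # evaluate on the cases the resample missed; average over many
    # resamples
    rng = random.Random(seed)
    n = len(data)
    errs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        drawn = set(idx)
        missed = [i for i in range(n) if i not in drawn]
        if not missed:          # rare resample that drew every case
            continue
        model = fit([data[i] for i in idx])
        errs.append(error(model, [data[i] for i in missed]))
    return sum(errs) / len(errs)

# each resample omits about exp(-1), or 37%, of the cases on average,
# so every fit is tested on roughly a third of the data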

If you use any of the above methods to choose which of several different
networks to use for prediction purposes, the estimate of the generalization
error of the best network will be optimistic. For example, if you train
several networks using one data set, and use a second (validation set) data
set to decide which network is best, you must use a third (test set) data
set to obtain an unbiased estimate of the generalization error of the chosen
network. Hjorth (1994) explains how this principle extends to
cross-validation and bootstrapping.
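
The three-set protocol can be sketched in Python; the error figures
below are made up purely for illustration:

def select_and_assess(candidates, val_error, test_error):
    # pick the candidate with the lowest validation error, then
    # report its error on a third, untouched test set; the winner's
    # validation error is optimistically biased precisely because
    # the winner was chosen for having the lowest validation error
    best = min(candidates, key=val_error)
    return best, test_error(best)

# hypothetical error tables for three networks trained on set 1
val_err  = {"net_a": 0.31, "net_b": 0.24, "net_c": 0.29}  # set 2
test_err = {"net_a": 0.33, "net_b": 0.30, "net_c": 0.28}  # set 3
best, honest = select_and_assess(val_err, val_err.get, test_err.get)
# report honest (0.30), not the optimistic validation figure (0.24)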

References:

Darlington, R.B. (1968), "Multiple Regression in Psychological Research
and Practice," Psychological Bulletin, 69, 161-182.

Efron, B. (1986), "How biased is the apparent error rate of a prediction
rule?" J. of the American Statistical Association, 81, 461-470.

Efron, B. and Tibshirani, R.J. (1993), An Introduction to the Bootstrap,
London: Chapman & Hall.

Hjorth, J.S.U. (1994), Computer Intensive Statistical Methods:
Validation, Model Selection, and Bootstrap, London: Chapman & Hall.

Miller, A.J. (1990), Subset Selection in Regression, London: Chapman &
Hall.

Moody, J.E. (1992), "The Effective Number of Parameters: An Analysis of
Generalization and Regularization in Nonlinear Learning Systems", in
Moody, J.E., Hanson, S.J., and Lippmann, R.P., Advances in Neural
Information Processing Systems 4, 847-854.

Raftery, A.E. (1995), "Bayesian Model Selection in Social Research," in
Marsden, P.V. (ed.), Sociological Methodology 1995, Cambridge, MA:
Blackwell, ftp://ftp.stat.washington.edu/pub/tech.reports/ or
http://www.stat.washington.edu/tech.reports/bic.ps

Rissanen, J. (1978), "Modelling by shortest data description,"
Automatica, 14, 465-471.

Rissanen, J. (1987), "Stochastic complexity" (with discussion), J. of the
Royal Statistical Society, Series B, 49, 223-239.

Rissanen, J. (1999), "Hypothesis Selection and Testing by the MDL
Principle," Computer Journal, 42, 260-269,
http://www3.oup.co.uk/computer_journal/hdb/Volume_42/Issue_04/

Sarle, W.S. (1995), "Stopped Training and Other Remedies for
Overfitting," Proceedings of the 27th Symposium on the Interface of
Computing Science and Statistics, 352-360,
ftp://ftp.sas.com/pub/neural/inter95.ps.Z (this is a very large
compressed postscript file, 747K, 10 pages)

Sarle, W.S. (1999), "Donoho-Johnstone Benchmarks: Neural Net Results,"
ftp://ftp.sas.com/pub/neural/dojo/dojo.html

Weiss, S.M. & Kulikowski, C.A. (1991), Computer Systems That Learn,
Morgan Kaufmann.

```


Send corrections/additions to the FAQ Maintainer:
saswss@unx.sas.com (Warren Sarle)

Last Update March 27 2014 @ 02:11 PM