Top Document: comp.ai.neural-nets FAQ, Part 2 of 7: Learning
What is PNN?
PNN, or "Probabilistic Neural Network", is Donald Specht's term for kernel discriminant analysis. (Kernels are also called "Parzen windows".) You can think of it as a normalized RBF network in which there is a hidden unit centered at every training case. These RBF units are called "kernels" and are usually probability density functions such as the Gaussian. The hidden-to-output weights are usually 1 or 0: for each hidden unit, a weight of 1 is used for the connection going to the output for the class that the case belongs to, while all other connections are given weights of 0. Alternatively, you can adjust these weights for the prior probabilities of each class. So the only weights that need to be learned are the widths of the RBF units. These widths (often a single width is used) are called "smoothing parameters" or "bandwidths" and are usually chosen by cross-validation or by more esoteric methods that are not well known in the neural net literature; gradient descent is not used.

Specht's claim that a PNN trains 100,000 times faster than backprop is at best misleading. While kernel methods are not iterative in the same sense as backprop, they require that you estimate the kernel bandwidth, and this requires accessing the data many times. Furthermore, computing a single output value with kernel methods requires either accessing the entire training data or clever programming, and either way is much slower than computing an output with a feedforward net. And there are a variety of methods for training feedforward nets that are much faster than standard backprop. So depending on what you are doing and how you do it, PNN may be either faster or slower than a feedforward net.

PNN is a universal approximator for smooth class-conditional densities, so it should be able to solve any smooth classification problem given enough data. The main drawback of PNN is that, like kernel methods in general, it suffers badly from the curse of dimensionality.
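The construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Specht's code: it assumes an isotropic Gaussian kernel with a single shared width, uses leave-one-out cross-validation over a small hand-picked list of candidate widths, and every function name and the toy data are hypothetical. With the default priors (class frequencies in the training data), each class score is just the sum of the kernels of that class divided by the number of training cases, which corresponds to the 1-or-0 hidden-to-output weights.

```python
import math

def gaussian_kernel(x, center, width):
    """Isotropic Gaussian density centered at one training case."""
    d = len(x)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, center))
    norm = (2.0 * math.pi * width ** 2) ** (d / 2.0)
    return math.exp(-sq_dist / (2.0 * width ** 2)) / norm

def pnn_scores(x, train_x, train_y, width, priors=None, skip=None):
    """Return {class: prior * estimated class-conditional density at x}.

    Note that every training case is visited for a single output.
    `skip` is the index of a case to leave out (for cross-validation).
    """
    counts, sums = {}, {}
    for i, (xi, yi) in enumerate(zip(train_x, train_y)):
        if i == skip:
            continue
        counts[yi] = counts.get(yi, 0) + 1
        sums[yi] = sums.get(yi, 0.0) + gaussian_kernel(x, xi, width)
    n = sum(counts.values())
    scores = {}
    for c in counts:
        # Default prior: class frequency among the cases actually used.
        prior = priors[c] if priors is not None else counts[c] / n
        scores[c] = prior * sums[c] / counts[c]
    return scores

def pnn_classify(x, train_x, train_y, width, priors=None, skip=None):
    """Pick the class with the largest prior-weighted density."""
    scores = pnn_scores(x, train_x, train_y, width, priors, skip)
    return max(scores, key=scores.get)

def loo_error(train_x, train_y, width):
    """Leave-one-out misclassification rate for a given width."""
    errors = 0
    for i, (xi, yi) in enumerate(zip(train_x, train_y)):
        if pnn_classify(xi, train_x, train_y, width, skip=i) != yi:
            errors += 1
    return errors / len(train_x)

def choose_width(train_x, train_y, candidates):
    """Return the candidate width with the lowest leave-one-out error."""
    return min(candidates, key=lambda w: loo_error(train_x, train_y, w))

# Toy data (made up for illustration): two well-separated classes in 2-D.
train_x = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
           (2.0, 2.0), (2.1, 1.8), (1.9, 2.2)]
train_y = ["a", "a", "a", "b", "b", "b"]

candidates = [0.1, 0.5, 1.0]
width = choose_width(train_x, train_y, candidates)
label_near_a = pnn_classify((0.1, 0.1), train_x, train_y, width)
label_near_b = pnn_classify((2.0, 2.1), train_x, train_y, width)
```

Even in this small sketch, choosing the width makes a full pass over the data for every left-out case and every candidate width, and classifying one new case touches every training case. That is the cost the speed comparison with backprop has to account for.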
PNN cannot ignore irrelevant inputs without major modifications to the basic algorithm. So PNN is not likely to be the top choice if you have more than 5 or 6 nonredundant inputs. For modified algorithms that deal with irrelevant inputs, see Masters (1995) and Lowe (1995).

But if all your inputs are relevant, PNN has the very useful ability to tell you whether a test case is similar (i.e. has a high density) to any of the training data; if not, you are extrapolating and should view the output classification with skepticism. This ability is of limited use when you have irrelevant inputs, since the similarity is measured with respect to all of the inputs, not just the relevant ones.

References:

   Hand, D.J. (1982), Kernel Discriminant Analysis, Research Studies Press.

   Lowe, D.G. (1995), "Similarity metric learning for a variable-kernel classifier," Neural Computation, 7, 72-85, http://www.cs.ubc.ca/spider/lowe/pubs.html

   McLachlan, G.J. (1992), Discriminant Analysis and Statistical Pattern Recognition, Wiley.

   Masters, T. (1993), Practical Neural Network Recipes in C++, San Diego: Academic Press.

   Masters, T. (1995), Advanced Algorithms for Neural Networks: A C++ Sourcebook, NY: John Wiley and Sons, ISBN 0-471-10588-0.

   Michie, D., Spiegelhalter, D.J. and Taylor, C.C. (1994), Machine Learning, Neural and Statistical Classification, Ellis Horwood; this book is out of print but available online at http://www.amsta.leeds.ac.uk/~charles/statlog/

   Scott, D.W. (1992), Multivariate Density Estimation, Wiley.

   Specht, D.F. (1990), "Probabilistic neural networks," Neural Networks, 3, 110-118.
Send corrections/additions to the FAQ Maintainer:
saswss@unx.sas.com (Warren Sarle)
Last Update July 06 2008 @ 00:10 AM