comp.ai.neural-nets FAQ, Part 1 of 7: Introduction
Section - How many kinds of NNs exist?


There are many, many kinds of NNs by now. Nobody knows exactly how many. New
ones (or at least variations of old ones) are invented every week. Below is
a collection of some of the best-known methods; it does not claim to be
complete.

The two main kinds of learning algorithms are supervised and unsupervised. 

 o In supervised learning, the correct results (target values, desired
   outputs) are known and are given to the NN during training so that the NN
   can adjust its weights to try to match its outputs to the target values.
   After training, the NN is tested by giving it only input values, not
   target values, and seeing how close it comes to outputting the correct
   target values. 
 o In unsupervised learning, the NN is not provided with the correct results
   during training. Unsupervised NNs usually perform some kind of data
   compression, such as dimensionality reduction or clustering. See "What
   does unsupervised learning learn?" 
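
To make the contrast concrete, here is a minimal sketch in Python with NumPy
(not part of the original FAQ; the data, learning rate, and number of
prototypes are made up purely for illustration). The supervised update moves
the output toward a known target, while the unsupervised update has no
targets and simply clusters the inputs:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))          # 100 cases, 3 inputs each

    # Supervised: targets are known; weights are adjusted so the output
    # moves toward the target (the delta rule, as in Adaline).
    true_w = np.array([1.0, -2.0, 0.5])    # made-up relation for the demo
    targets = X @ true_w
    w = np.zeros(3)
    lr = 0.01
    for x, t in zip(X, targets):
        y = w @ x                          # network output for this case
        w += lr * (t - y) * x              # move output toward the target

    # Unsupervised: no targets; the nearest prototype vector is simply
    # moved a little toward each case (simple competitive clustering).
    prototypes = rng.normal(size=(2, 3))   # 2 cluster prototypes
    for x in X:
        winner = int(np.argmin(((prototypes - x) ** 2).sum(axis=1)))
        prototypes[winner] += lr * (x - prototypes[winner])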

The distinction between supervised and unsupervised methods is not always
clear-cut. An unsupervised method can learn a summary of a probability
distribution, and that summary can then be used to make predictions.
Furthermore, supervised methods come in two subvarieties:
auto-associative and hetero-associative. In auto-associative learning, the
target values are the same as the inputs, whereas in hetero-associative
learning, the targets are generally different from the inputs. Many
unsupervised methods are equivalent to auto-associative supervised methods.
For more details, see "What does unsupervised learning learn?" 
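
As a small illustration (again not from the FAQ; the arrays are invented),
the only difference between the two subvarieties is what is used as the
target:

    import numpy as np

    X = np.random.default_rng(1).normal(size=(50, 4))  # 50 cases, 4 inputs
    Y = X[:, :2] ** 2                                   # some other quantity

    hetero_pairs = list(zip(X, Y))  # hetero-associative: targets differ
    auto_pairs   = list(zip(X, X))  # auto-associative: targets = inputs
    # Training a supervised NN on auto_pairs (e.g., with a narrow hidden
    # layer) amounts to data compression, which is why many unsupervised
    # methods have auto-associative supervised equivalents.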

Two major kinds of network topology are feedforward and feedback. 

 o In a feedforward NN, the connections between units do not form cycles.
   Feedforward NNs usually produce a response to an input quickly. Most
   feedforward NNs can be trained using a wide variety of efficient
   conventional numerical methods (e.g. see "What are conjugate gradients,
   Levenberg-Marquardt, etc.?") in addition to algorithms invented by NN
   researchers. 
 o In a feedback or recurrent NN, there are cycles in the connections. In
   some feedback NNs, each time an input is presented, the NN must iterate
   for a potentially long time before it produces a response. Feedback NNs
   are usually more difficult to train than feedforward NNs. 

Some kinds of NNs (such as those with winner-take-all units) can be
implemented as either feedforward or feedback networks. 
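
The topological difference shows up directly in how a response is computed.
The following sketch (not from the FAQ; the weights are random and made up
purely for illustration) contrasts a single feedforward sweep with an
Elman-style recurrent update that must be iterated:

    import numpy as np

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    rng = np.random.default_rng(2)
    x = rng.normal(size=3)                 # one input pattern

    # Feedforward: no cycles, so one sweep through the layers gives the
    # response.
    W1, W2 = rng.normal(size=(5, 3)), rng.normal(size=(1, 5))
    y_feedforward = sigmoid(W2 @ sigmoid(W1 @ x))

    # Feedback (recurrent): the hidden state feeds back into itself, so the
    # network must be iterated, possibly for many steps, before it settles.
    Wx = rng.normal(size=(5, 3))
    Wh = rng.normal(size=(5, 5))
    Wo = rng.normal(size=(1, 5))
    h = np.zeros(5)
    for _ in range(20):                    # iterate the recurrent state
        h = sigmoid(Wx @ x + Wh @ h)
    y_feedback = sigmoid(Wo @ h)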

NNs also differ in the kinds of data they accept. Two major kinds of data
are categorical and quantitative. 

 o Categorical variables take only a finite (technically, countable) number
   of possible values, and there are usually several or more cases falling
   into each category. Categorical variables may have symbolic values (e.g.,
   "male" and "female", or "red", "green" and "blue") that must be encoded
   into numbers before being given to the network (see "How should
   categories be encoded?" and the encoding sketch after this list). Both
   supervised learning with categorical target values and unsupervised
   learning with categorical outputs are called "classification." 
 o Quantitative variables are numerical measurements of some attribute, such
   as length in meters. The measurements must be made in such a way that at
   least some arithmetic relations among the measurements reflect analogous
   relations among the attributes of the objects that are measured. For more
   information on measurement theory, see the Measurement Theory FAQ at 
   ftp://ftp.sas.com/pub/neural/measurement.html. Supervised learning with
   quantitative target values is called "regression." 
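
Here is the encoding sketch referred to above (not from the FAQ; 1-of-C
coding is only one of several schemes discussed under "How should categories
be encoded?"):

    # 1-of-C ("dummy") coding of a symbolic categorical variable.
    colors = ["red", "green", "blue", "green", "red"]
    categories = sorted(set(colors))       # ['blue', 'green', 'red']
    encoded = [[1.0 if c == cat else 0.0 for cat in categories]
               for c in colors]
    # "red" -> [0, 0, 1], "green" -> [0, 1, 0], "blue" -> [1, 0, 0]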

Some variables can be treated as either categorical or quantitative, such as
number of children or any binary variable. Most regression algorithms can
also be used for supervised classification by encoding categorical target
values as 0/1 binary variables and using those binary variables as target
values for the regression algorithm. The outputs of the network are then
estimates of the posterior probabilities of class membership when any of the
most common training methods are used. 
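
For example, the following sketch (not from the FAQ; the data and the use of
plain linear least squares instead of a network are assumptions made only
for illustration) classifies two classes by regressing on 0/1 targets:

    import numpy as np

    rng = np.random.default_rng(3)
    X = rng.normal(size=(200, 2))
    labels = (X[:, 0] + X[:, 1] > 0).astype(int)  # two made-up classes

    targets = labels.astype(float)                # class coded as 0/1
    Xb = np.hstack([X, np.ones((200, 1))])        # add a bias column

    # Treat classification as regression: ordinary least squares on the
    # 0/1 targets.  The outputs roughly estimate P(class = 1 | x).
    w, *_ = np.linalg.lstsq(Xb, targets, rcond=None)
    outputs = np.clip(Xb @ w, 0.0, 1.0)
    predicted = (outputs > 0.5).astype(int)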

Here are some well-known kinds of NNs: 

1. Supervised 

   1. Feedforward 

       o Linear 
          o Hebbian - Hebb (1949), Fausett (1994) 
          o Perceptron - Rosenblatt (1958), Minsky and Papert (1969/1988),
            Fausett (1994) 
          o Adaline - Widrow and Hoff (1960), Fausett (1994) 
          o Higher Order - Bishop (1995) 
          o Functional Link - Pao (1989) 
       o MLP: Multilayer perceptron - Bishop (1995), Reed and Marks (1999),
         Fausett (1994) 
          o Backprop - Rumelhart, Hinton, and Williams (1986) 
          o Cascade Correlation - Fahlman and Lebiere (1990), Fausett (1994)
          o Quickprop - Fahlman (1989) 
          o RPROP - Riedmiller and Braun (1993) 
       o RBF networks - Bishop (1995), Moody and Darken (1989), Orr (1996) 
          o OLS: Orthogonal Least Squares - Chen, Cowan and Grant (1991) 
       o CMAC: Cerebellar Model Articulation Controller - Albus (1975),
         Brown and Harris (1994) 
       o Classification only 
          o LVQ: Learning Vector Quantization - Kohonen (1988), Fausett
            (1994) 
          o PNN: Probabilistic Neural Network - Specht (1990), Masters
            (1993), Hand (1982), Fausett (1994) 
       o Regression only 
           o GRNN: Generalized Regression Neural Network - Specht (1991),
             Nadaraya (1964), Watson (1964) 

   2. Feedback - Hertz, Krogh, and Palmer (1991), Medsker and Jain (2000)

       o BAM: Bidirectional Associative Memory - Kosko (1992), Fausett
         (1994) 
        o Boltzmann Machine - Ackley et al. (1985), Fausett (1994) 
       o Recurrent time series 
          o Backpropagation through time - Werbos (1990) 
          o Elman - Elman (1990) 
          o FIR: Finite Impulse Response - Wan (1990) 
          o Jordan - Jordan (1986) 
          o Real-time recurrent network - Williams and Zipser (1989) 
          o Recurrent backpropagation - Pineda (1989), Fausett (1994) 
          o TDNN: Time Delay NN - Lang, Waibel and Hinton (1990) 

   3. Competitive 

       o ARTMAP - Carpenter, Grossberg and Reynolds (1991) 
       o Fuzzy ARTMAP - Carpenter, Grossberg, Markuzon, Reynolds and Rosen
         (1992), Kasuba (1993) 
       o Gaussian ARTMAP - Williamson (1995) 
       o Counterpropagation - Hecht-Nielsen (1987; 1988; 1990), Fausett
         (1994) 
        o Neocognitron - Fukushima, Miyake, and Ito (1983), Fukushima
         (1988), Fausett (1994) 

2. Unsupervised - Hertz, Krogh, and Palmer (1991) 

   1. Competitive 

       o Vector Quantization 
          o Grossberg - Grossberg (1976) 
          o Kohonen - Kohonen (1984) 
          o Conscience - Desieno (1988) 
       o Self-Organizing Map 
          o Kohonen - Kohonen (1995), Fausett (1994) 
           o GTM: Generative Topographic Mapping - Bishop, Svensén and
             Williams (1997) 
          o Local Linear - Mulier and Cherkassky (1995) 
       o Adaptive resonance theory 
          o ART 1 - Carpenter and Grossberg (1987a), Moore (1988), Fausett
            (1994) 
          o ART 2 - Carpenter and Grossberg (1987b), Fausett (1994) 
          o ART 2-A - Carpenter, Grossberg and Rosen (1991a) 
          o ART 3 - Carpenter and Grossberg (1990) 
          o Fuzzy ART - Carpenter, Grossberg and Rosen (1991b) 
       o DCL: Differential Competitive Learning - Kosko (1992) 

   2. Dimension Reduction - Diamantaras and Kung (1996) 

       o Hebbian - Hebb (1949), Fausett (1994) 
       o Oja - Oja (1989) 
       o Sanger - Sanger (1989) 
       o Differential Hebbian - Kosko (1992) 

   3. Autoassociation 

       o Linear autoassociator - Anderson et al. (1977), Fausett (1994) 
       o BSB: Brain State in a Box - Anderson et al. (1977), Fausett (1994) 
       o Hopfield - Hopfield (1982), Fausett (1994) 

3. Nonlearning 

   1. Hopfield - Hertz, Krogh, and Palmer (1991) 
   2. various networks for optimization - Cichocki and Unbehauen (1993) 

References: 

   Ackley, D.H., Hinton, G.E., and Sejnowski, T.J. (1985), "A learning
   algorithm for Boltzmann machines," Cognitive Science, 9, 147-169. 

   Albus, J.S. (1975), "New Approach to Manipulator Control: The Cerebellar
   Model Articulation Controller (CMAC)," Transactions of the ASME Journal
   of Dynamic Systems, Measurement, and Control, September 1975, 220-27. 

   Anderson, J.A., and Rosenfeld, E., eds. (1988), Neurocomputing:
   Foundations of Research, Cambridge, MA: The MIT Press. 

   Anderson, J.A., Silverstein, J.W., Ritz, S.A., and Jones, R.S. (1977)
   "Distinctive features, categorical perception, and probability learning:
   Some applications of a neural model," Psychological Review, 84, 413-451.
   Reprinted in Anderson and Rosenfeld (1988). 

   Bishop, C.M. (1995), Neural Networks for Pattern Recognition, Oxford:
   Oxford University Press. 

   Bishop, C.M., Svensén, M., and Williams, C.K.I. (1997), "GTM: A principled
   alternative to the self-organizing map," in Mozer, M.C., Jordan, M.I.,
   and Petsche, T., (eds.) Advances in Neural Information Processing
   Systems 9, Cambridge, MA: The MIT Press, pp. 354-360. Also see 
   http://www.ncrg.aston.ac.uk/GTM/ 

   Brown, M., and Harris, C. (1994), Neurofuzzy Adaptive Modelling and
   Control, NY: Prentice Hall. 

   Carpenter, G.A., Grossberg, S. (1987a), "A massively parallel
   architecture for a self-organizing neural pattern recognition machine,"
   Computer Vision, Graphics, and Image Processing, 37, 54-115. 

   Carpenter, G.A., Grossberg, S. (1987b), "ART 2: Self-organization of
   stable category recognition codes for analog input patterns," Applied
   Optics, 26, 4919-4930. 

   Carpenter, G.A., Grossberg, S. (1990), "ART 3: Hierarchical search using
   chemical transmitters in self-organizing pattern recognition
   architectures," Neural Networks, 3, 129-152. 

   Carpenter, G.A., Grossberg, S., Markuzon, N., Reynolds, J.H., and Rosen,
   D.B. (1992), "Fuzzy ARTMAP: A neural network architecture for incremental
   supervised learning of analog multidimensional maps," IEEE Transactions
   on Neural Networks, 3, 698-713. 

   Carpenter, G.A., Grossberg, S., Reynolds, J.H. (1991), "ARTMAP:
   Supervised real-time learning and classification of nonstationary data by
   a self-organizing neural network," Neural Networks, 4, 565-588. 

   Carpenter, G.A., Grossberg, S., Rosen, D.B. (1991a), "ART 2-A: An
   adaptive resonance algorithm for rapid category learning and
   recognition," Neural Networks, 4, 493-504. 

   Carpenter, G.A., Grossberg, S., Rosen, D.B. (1991b), "Fuzzy ART: Fast
   stable learning and categorization of analog patterns by an adaptive
   resonance system," Neural Networks, 4, 759-771. 

   Chen, S., Cowan, C.F.N., and Grant, P.M. (1991), "Orthogonal least
   squares learning for radial basis function networks," IEEE Transactions
   on Neural Networks, 2, 302-309. 

   Cichocki, A. and Unbehauen, R. (1993). Neural Networks for Optimization
   and Signal Processing. NY: John Wiley & Sons, ISBN 0-471-93010-5. 

   Desieno, D. (1988), "Adding a conscience to competitive learning," Proc.
   Int. Conf. on Neural Networks, I, 117-124, IEEE Press. 

   Diamantaras, K.I., and Kung, S.Y. (1996) Principal Component Neural
   Networks: Theory and Applications, NY: Wiley. 

   Elman, J.L. (1990), "Finding structure in time," Cognitive Science, 14,
   179-211. 

   Fahlman, S.E. (1989), "Faster-Learning Variations on Back-Propagation: An
   Empirical Study", in Touretzky, D., Hinton, G, and Sejnowski, T., eds., 
   Proceedings of the 1988 Connectionist Models Summer School, Morgan
   Kaufmann, 38-51. 

   Fahlman, S.E., and Lebiere, C. (1990), "The Cascade-Correlation Learning
   Architecture", in Touretzky, D. S. (ed.), Advances in Neural Information
   Processing Systems 2, Los Altos, CA: Morgan Kaufmann Publishers, pp.
   524-532. 

   Fausett, L. (1994), Fundamentals of Neural Networks, Englewood Cliffs,
   NJ: Prentice Hall. 

   Fukushima, K., Miyake, S., and Ito, T. (1983), "Neocognitron: A neural
   network model for a mechanism of visual pattern recognition," IEEE
   Transactions on Systems, Man, and Cybernetics, 13, 826-834. 

   Fukushima, K. (1988), "Neocognitron: A hierarchical neural network
   capable of visual pattern recognition," Neural Networks, 1, 119-130. 

   Grossberg, S. (1976), "Adaptive pattern classification and universal
   recoding: I. Parallel development and coding of neural feature
   detectors," Biological Cybernetics, 23, 121-134. 

   Hand, D.J. (1982) Kernel Discriminant Analysis, Research Studies Press. 

   Hebb, D.O. (1949), The Organization of Behavior, NY: John Wiley & Sons. 

   Hecht-Nielsen, R. (1987), "Counterpropagation networks," Applied Optics,
   26, 4979-4984. 

   Hecht-Nielsen, R. (1988), "Applications of counterpropagation networks,"
   Neural Networks, 1, 131-139. 

   Hecht-Nielsen, R. (1990), Neurocomputing, Reading, MA: Addison-Wesley. 

   Hertz, J., Krogh, A., and Palmer, R. (1991), Introduction to the Theory of
   Neural Computation, Redwood City, CA: Addison-Wesley. 

   Hopfield, J.J. (1982), "Neural networks and physical systems with
   emergent collective computational abilities," Proceedings of the National
   Academy of Sciences, 79, 2554-2558. Reprinted in Anderson and Rosenfeld
   (1988). 

   Jordan, M. I. (1986), "Attractor dynamics and parallelism in a
   connectionist sequential machine," In Proceedings of the Eighth Annual
   conference of the Cognitive Science Society, pages 531-546. Lawrence
   Erlbaum. 

   Kasuba, T. (1993), "Simplified Fuzzy ARTMAP," AI Expert, 8, 18-25. 

   Kohonen, T. (1984), Self-Organization and Associative Memory, Berlin:
   Springer. 

   Kohonen, T. (1988), "Learning Vector Quantization," Neural Networks, 1
   (suppl 1), 303. 

   Kohonen, T. (1995/1997), Self-Organizing Maps, Berlin: Springer-Verlag.
   First edition was 1995, second edition 1997. See 
   http://www.cis.hut.fi/nnrc/new_book.html for information on the second
   edition. 

   Kosko, B. (1992), Neural Networks and Fuzzy Systems, Englewood Cliffs,
   N.J.: Prentice-Hall. 

   Lang, K. J., Waibel, A. H., and Hinton, G. (1990), "A time-delay neural
   network architecture for isolated word recognition," Neural Networks, 3,
   23-44. 

   Masters, T. (1993). Practical Neural Network Recipes in C++, San Diego:
   Academic Press. 

   Masters, T. (1995) Advanced Algorithms for Neural Networks: A C++
   Sourcebook, NY: John Wiley and Sons, ISBN 0-471-10588-0 

   Medsker, L.R., and Jain, L.C., eds. (2000), Recurrent Neural Networks:
   Design and Applications, Boca Raton, FL: CRC Press, ISBN 0-8493-7181-3. 

   Minsky, M.L., and Papert, S.A. (1969/1988), Perceptrons, Cambridge, MA:
   The MIT Press (first edition, 1969; expanded edition, 1988). 

   Moody, J. and Darken, C.J. (1989), "Fast learning in networks of
   locally-tuned processing units," Neural Computation, 1, 281-294. 

   Moore, B. (1988), "ART 1 and Pattern Clustering," in Touretzky, D.,
   Hinton, G. and Sejnowski, T., eds., Proceedings of the 1988
   Connectionist Models Summer School, 174-185, San Mateo, CA: Morgan
   Kaufmann. 

   Mulier, F. and Cherkassky, V. (1995), "Self-Organization as an Iterative
   Kernel Smoothing Process," Neural Computation, 7, 1165-1177. 

   Nadaraya, E.A. (1964) "On estimating regression", Theory Probab. Applic.
   10, 186-90. 

   Oja, E. (1989), "Neural networks, principal components, and subspaces,"
   International Journal of Neural Systems, 1, 61-68. 

   Orr, M.J.L. (1996), "Introduction to radial basis function networks," 
   http://www.anc.ed.ac.uk/~mjo/papers/intro.ps or 
   http://www.anc.ed.ac.uk/~mjo/papers/intro.ps.gz 

   Pao, Y. H. (1989), Adaptive Pattern Recognition and Neural Networks,
   Reading, MA: Addison-Wesley Publishing Company, ISBN 0-201-12584-6. 

   Pineda, F.J. (1989), "Recurrent back-propagation and the dynamical
   approach to neural computation," Neural Computation, 1, 161-172. 

   Reed, R.D., and Marks, R.J., II (1999), Neural Smithing: Supervised
   Learning in Feedforward Artificial Neural Networks, Cambridge, MA: The
   MIT Press, ISBN 0-262-18190-8.

   Riedmiller, M. and Braun, H. (1993), "A Direct Adaptive Method for Faster
   Backpropagation Learning: The RPROP Algorithm", Proceedings of the IEEE
   International Conference on Neural Networks 1993, San Francisco: IEEE. 

   Rosenblatt, F. (1958), "The perceptron: A probabilistic model for
   information storage and organization in the brain," Psychological Review,
   65, 386-408. 

   Rumelhart, D.E., Hinton, G.E., and Williams, R.J. (1986), "Learning
   internal representations by error propagation", in Rumelhart, D.E. and
   McClelland, J. L., eds. (1986), Parallel Distributed Processing:
   Explorations in the Microstructure of Cognition, Volume 1, 318-362,
   Cambridge, MA: The MIT Press. 

   Sanger, T.D. (1989), "Optimal unsupervised learning in a single-layer
   linear feedforward neural network," Neural Networks, 2, 459-473. 

   Specht, D.F. (1990) "Probabilistic neural networks," Neural Networks, 3,
   110-118. 

   Specht, D.F. (1991) "A Generalized Regression Neural Network", IEEE
   Transactions on Neural Networks, 2, Nov. 1991, 568-576. 

   Wan, E.A. (1990), "Temporal backpropagation: An efficient algorithm for
   finite impulse response neural networks," in Proceedings of the 1990
   Connectionist Models Summer School, Touretzky, D.S., Elman, J.L.,
   Sejnowski, T.J., and Hinton, G.E., eds., San Mateo, CA: Morgan Kaufmann,
   pp. 131-140. 

   Watson, G.S. (1964) "Smooth regression analysis", Sankhyā, Series A,
   26, 359-72. 

   Werbos, P.J. (1990), "Backpropagation through time: What it is and how to
   do it," Proceedings of the IEEE, 78, 1550-1560. 

   Widrow, B., and Hoff, M.E., Jr., (1960), "Adaptive switching circuits,"
   IRE WESCON Convention Record. part 4, pp. 96-104. Reprinted in Anderson
   and Rosenfeld (1988). 

   Williams, R.J., and Zipser, D., (1989), "A learning algorithm for
   continually running fully recurrent neural networks," Neural Computation,
   1, 270-280. 

   Williamson, J.R. (1995), "Gaussian ARTMAP: A neural network for fast
   incremental learning of noisy multidimensional maps," Technical Report
   CAS/CNS-95-003, Boston University, Center for Adaptive Systems and
   Department of Cognitive and Neural Systems. 

Send corrections/additions to the FAQ Maintainer:
saswss@unx.sas.com (Warren Sarle)




