Answer by Janne Sinkkonen.

The curse of dimensionality (Bellman 1961) refers to the exponential growth
of hypervolume as a function of dimensionality. In the field of NNs, the
curse of dimensionality expresses itself in two related problems:

1. Many NNs can be thought of as mappings from an input space to an output
   space. Thus, loosely speaking, an NN needs to somehow "monitor", cover,
   or represent every part of its input space in order to know how that
   part of the space should be mapped. Covering the input space takes
   resources, and, in the most general case, the amount of resources needed
   is proportional to the hypervolume of the input space. The exact
   formulation of "resources" and "part of the input space" depends on the
   type of the network and should probably be based on the concepts of
   information theory and differential geometry.

   As an example, think of vector quantization (VQ). In VQ, a set of units
   competitively learns to represent an input space (this is like Kohonen's
   Self-Organizing Map, but without a topographic ordering of the units).
   Imagine a VQ trying to share its units (resources) more or less equally
   over a hyperspherical input space. One could argue that the average
   distance from a random point of the space to the nearest network unit
   measures the goodness of the representation: the shorter the distance,
   the better the representation of the data in the sphere. It is
   intuitively clear (and can be experimentally verified; see the numerical
   sketch after the references) that the total number of units required to
   keep the average distance constant increases exponentially with the
   dimensionality of the sphere (if the radius of the sphere is fixed).

   The curse of dimensionality causes networks with lots of irrelevant
   inputs to behave relatively badly: the dimension of the input space is
   high, and the network uses almost all its resources to represent
   irrelevant portions of the space. Unsupervised learning algorithms are
   typically prone to this problem, as are conventional RBFs. A partial
   remedy is to preprocess the input in the right way, for example by
   scaling the components according to their "importance".

2. Even if we have a network algorithm which is able to focus on important
   portions of the input space, the higher the dimensionality of the input
   space, the more data may be needed to find out what is important and
   what is not.

A priori information can help with the curse of dimensionality. Careful
feature selection and scaling of the inputs fundamentally affect the
severity of the problem, as does the selection of the neural network model.
For classification purposes, only the borders of the classes are important
to represent accurately.

References:

   Bellman, R. (1961), Adaptive Control Processes: A Guided Tour,
   Princeton: Princeton University Press.

   Bishop, C.M. (1995), Neural Networks for Pattern Recognition,
   Oxford: Oxford University Press, section 1.4.

   Scott, D.W. (1992), Multivariate Density Estimation, NY: Wiley.
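The exponential growth in the vector quantization example above is easy to
check numerically. Below is a minimal Monte Carlo sketch in Python with
NumPy: it scatters a fixed number of code-book units at random in the unit
hypersphere and estimates the average distance from random data points to
the nearest unit, for several input dimensionalities. Random placement
stands in for actual competitive learning, and the unit counts and sample
sizes are arbitrary illustrative choices, so treat the output only as a
qualitative demonstration.

   # Monte Carlo sketch of the curse of dimensionality in vector
   # quantization: place n_units "units" at random in the unit hypersphere
   # and measure the average distance from random data points to the
   # nearest unit.  To keep that average distance roughly constant, the
   # number of units must grow rapidly with the dimension.

   import numpy as np

   rng = np.random.default_rng(0)

   def sample_ball(n_points, dim):
       """Draw points uniformly from the unit hypersphere in `dim` dimensions."""
       x = rng.standard_normal((n_points, dim))
       x /= np.linalg.norm(x, axis=1, keepdims=True)   # random directions
       r = rng.random(n_points) ** (1.0 / dim)         # radii for uniform density
       return x * r[:, None]

   def mean_nearest_unit_distance(n_units, dim, n_test=1000):
       """Average distance from random test points to the nearest of n_units units."""
       units = sample_ball(n_units, dim)
       test = sample_ball(n_test, dim)
       # Full pairwise distance matrix (n_test x n_units); fine at these sizes.
       d = np.linalg.norm(test[:, None, :] - units[None, :, :], axis=2)
       return d.min(axis=1).mean()

   for dim in (1, 2, 4, 8):
       for n_units in (4, 16, 64, 256, 1024):
           print("dim=%d  units=%4d  mean distance to nearest unit = %.3f"
                 % (dim, n_units, mean_nearest_unit_distance(n_units, dim)))

Running this should show that in one dimension a handful of units already
gives a small average distance, while in eight dimensions even a thousand
units leaves the average distance much larger; holding the distance
constant requires a number of units that grows roughly exponentially with
the dimension.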
PDP++ is a neural-network simulation system written in C++, developed as an advanced version of the original PDP software from McClelland and Rumelhart's "Explorations in Parallel Distributed Processing Handbook" (1987). The software is designed for both novice users and researchers, providing flexibility and power in cognitive neuroscience studies. Featured in Randall C. O'Reilly and Yuko Munakata's "Computational Explorations in Cognitive Neuroscience" (2000), PDP++ supports a wide range of algorithms. These include feedforward and recurrent error backpropagation, with continuous and real-time models such as Almeida-Pineda. It also incorporates constraint satisfaction algorithms like Boltzmann Machines, Hopfield networks, and mean-field networks, as well as self-organizing learning algorithms, including Self-organizing Maps (SOM) and Hebbian learning. Additionally, it supports mixtures-of-experts models and the Leabra algorithm, which combines error-driven and Hebbian learning with k-Winners-Take-All inhibitory competition. PDP++ is a comprehensive tool for exploring neural network models in cognitive neuroscience.
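As a very reduced illustration of the last two ingredients mentioned above
(Hebbian learning combined with k-Winners-Take-All inhibitory competition),
here is a generic sketch in Python with NumPy. It is not the PDP++ or
Leabra implementation; the layer sizes, learning rate, input sparsity, and
the particular Hebbian rule (competitive learning with decay) are all
arbitrary assumptions chosen only to show the structure of such an update.

   # Generic Hebbian learning with k-winners-take-all (kWTA) competition.
   # NOT the PDP++/Leabra algorithm, only a minimal sketch of the two
   # ingredients: competition selects the k most strongly driven units,
   # and only those winners strengthen their weights toward the input.

   import numpy as np

   rng = np.random.default_rng(1)

   n_inputs, n_hidden, k = 16, 8, 2      # sizes and number of winners (assumed)
   W = 0.1 * rng.random((n_hidden, n_inputs))
   lrate = 0.05

   def kwta(net_input, k):
       """Keep the k most strongly driven units active (1), silence the rest (0)."""
       act = np.zeros_like(net_input)
       act[np.argsort(net_input)[-k:]] = 1.0
       return act

   for step in range(1000):
       x = (rng.random(n_inputs) < 0.2).astype(float)   # sparse random input
       y = kwta(W @ x, k)                               # inhibitory competition
       # Hebbian update with decay: winners move their weight vectors toward x.
       W += lrate * y[:, None] * (x[None, :] - W)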