comp.ai.neural-nets FAQ, Part 2 of 7: Learning
Section - What is the curse of dimensionality?

Answer by Janne Sinkkonen.

The curse of dimensionality (Bellman 1961) refers to the exponential growth
of hypervolume as a function of dimensionality. In the field of NNs, the
curse of dimensionality expresses itself in two related problems: 

1. Many NNs can be thought of as mappings from an input space to an output
   space. Thus, loosely speaking, an NN needs to somehow "monitor", cover or
   represent every part of its input space in order to know how that part of
   the space should be mapped. Covering the input space takes resources,
   and, in the most general case, the amount of resources needed is
   proportional to the hypervolume of the input space. The exact formulation
   of "resources" and "part of the input space" depends on the type of the
   network and should probably be based on the concepts of information
   theory and differential geometry. 

   As an example, think of vector quantization (VQ). In VQ, a set of units
   competitively learns to represent an input space (this is like Kohonen's
   Self-Organizing Map but without topography for the units). Imagine a VQ
   trying to share its units (resources) more or less equally over a
   hyperspherical input space. One could argue that the average distance
   from a random point of the space to the nearest network unit measures the
   goodness of the representation: the shorter the distance, the better the
   representation of the data in the sphere. It is intuitively clear (and
   can be experimentally verified; see the sketch after this list) that the
   total number of units required to keep the average distance constant
   increases exponentially with the dimensionality of the sphere (if the
   radius of the sphere is fixed). 

   The curse of dimensionality causes networks with lots of irrelevant
   inputs to behave relatively badly: the dimension of the input space is
   high, and the network uses almost all its resources to represent
   irrelevant portions of the space. 

   Unsupervised learning algorithms, as well as conventional RBF networks,
   are typically prone to this problem. A partial remedy is to preprocess
   the input in the right way, for example by scaling the components
   according to their "importance". 

2. Even if we have a network algorithm which is able to focus on important
   portions of the input space, the higher the dimensionality of the input
   space, the more data may be needed to find out what is important and what
   is not. 
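
The following Python/NumPy sketch (not part of the original FAQ) gives a
rough Monte Carlo illustration of the VQ example in point 1. The number of
units (100), the number of test points, and the use of a randomly placed
codebook instead of a competitively trained one are all assumptions made
for brevity; a trained codebook would give somewhat shorter distances, but
the growth with dimensionality is the same. With the number of units held
fixed, the average distance from a random point of the unit ball to its
nearest unit grows with the dimension, so keeping that distance constant
requires many more units as the dimension rises.

   # Rough Monte Carlo illustration (hypothetical parameters): average
   # distance from random points in the unit ball to the nearest of a
   # fixed number of randomly placed codebook units, for several dimensions.
   import numpy as np

   rng = np.random.default_rng(0)

   def sample_ball(n, dim, radius=1.0):
       """Draw n points uniformly from a dim-dimensional ball."""
       x = rng.standard_normal((n, dim))
       x /= np.linalg.norm(x, axis=1, keepdims=True)   # uniform on the sphere
       r = radius * rng.random(n) ** (1.0 / dim)       # radial correction
       return x * r[:, None]

   def mean_nearest_unit_distance(dim, n_units=100, n_points=2000):
       """Mean distance from a random point to the nearest of n_units units."""
       units = sample_ball(n_units, dim)
       points = sample_ball(n_points, dim)
       d = np.linalg.norm(points[:, None, :] - units[None, :, :], axis=2)
       return d.min(axis=1).mean()

   for dim in (1, 2, 5, 10, 20):
       print(dim, round(mean_nearest_unit_distance(dim), 3))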

A priori information can help with the curse of dimensionality. Careful
feature selection and scaling of the inputs fundamentally affect the
severity of the problem, as does the choice of the neural network model.
For classification purposes, only the borders of the classes are important
to represent accurately. 
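
As a concrete, purely illustrative example of the scaling remedy mentioned
above, the snippet below standardizes each input component and then
multiplies it by an assumed importance weight, so that irrelevant
components contribute little to the distances a VQ- or RBF-style network
sees. The weights here are hypothetical; in practice they would come from
a priori knowledge or some feature-relevance estimate, which the FAQ does
not prescribe.

   # Hypothetical preprocessing sketch: standardize the inputs, then scale
   # each component by an assumed "importance" weight so that irrelevant
   # dimensions contribute little to distances in the input space.
   import numpy as np

   def scale_by_importance(X, importance):
       """Standardize the columns of X, then weight them by importance."""
       X = np.asarray(X, dtype=float)
       mean = X.mean(axis=0)
       std = X.std(axis=0)
       std[std == 0.0] = 1.0                 # guard against constant inputs
       return (X - mean) / std * np.asarray(importance, dtype=float)

   # Example: 5 inputs, of which only the first two are assumed relevant.
   rng = np.random.default_rng(1)
   X = rng.standard_normal((200, 5))
   X_scaled = scale_by_importance(X, [1.0, 1.0, 0.1, 0.1, 0.1])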

References: 

   Bellman, R. (1961), Adaptive Control Processes: A Guided Tour, Princeton
   University Press. 

   Bishop, C.M. (1995), Neural Networks for Pattern Recognition, Oxford:
   Oxford University Press, section 1.4. 

   Scott, D.W. (1992), Multivariate Density Estimation, NY: Wiley. 


Send corrections/additions to the FAQ Maintainer:
saswss@unx.sas.com (Warren Sarle)