Search the FAQ Archives

3 - A - B - C - D - E - F - G - H - I - J - K - L - M
N - O - P - Q - R - S - T - U - V - W - X - Y - Z
faqs.org - Internet FAQ Archives

comp.ai.neural-nets FAQ, Part 2 of 7: Learning
Section - What is the curse of dimensionality?

( Part1 - Part2 - Part3 - Part4 - Part5 - Part6 - Part7 - Single Page )
[ Usenet FAQs | Web FAQs | Documents | RFC Index | Property taxes ]


Top Document: comp.ai.neural-nets FAQ, Part 2 of 7: Learning
Previous Document: What is a softmax activation function?
Next Document: How do MLPs compare with RBFs?
See reader questions & answers on this topic! - Help others by sharing your knowledge

Answer by Janne Sinkkonen.

Curse of dimensionality (Bellman 1961) refers to the exponential growth of
hypervolume as a function of dimensionality. In the field of NNs, curse of
dimensionality expresses itself in two related problems: 

1. Many NNs can be thought of mappings from an input space to an output
   space. Thus, loosely speaking, an NN needs to somehow "monitor", cover or
   represent every part of its input space in order to know how that part of
   the space should be mapped. Covering the input space takes resources,
   and, in the most general case, the amount of resources needed is
   proportional to the hypervolume of the input space. The exact formulation
   of "resources" and "part of the input space" depends on the type of the
   network and should probably be based on the concepts of information
   theory and differential geometry. 

   As an example, think of a vector quantization (VQ). In VQ, a set of units
   competitively learns to represents an input space (this is like Kohonen's
   Self-Organizing Map but without topography for the units). Imagine a VQ
   trying to share its units (resources) more or less equally over
   hyperspherical input space. One could argue that the average distance
   from a random point of the space to the nearest network unit measures the
   goodness of the representation: the shorter the distance, the better is
   the represention of the data in the sphere. It is intuitively clear (and
   can be experimentally verified) that the total number of units required
   to keep the average distance constant increases exponentially with the
   dimensionality of the sphere (if the radius of the sphere is fixed). 

   The curse of dimensionality causes networks with lots of irrelevant
   inputs to be behave relatively badly: the dimension of the input space is
   high, and the network uses almost all its resources to represent
   irrelevant portions of the space. 

   Unsupervised learning algorithms are typically prone to this problem - as
   well as conventional RBFs. A partial remedy is to preprocess the input in
   the right way, for example by scaling the components according to their
   "importance". 

2. Even if we have a network algorithm which is able to focus on important
   portions of the input space, the higher the dimensionality of the input
   space, the more data may be needed to find out what is important and what
   is not. 

A priori information can help with the curse of dimensionality. Careful
feature selection and scaling of the inputs fundamentally affects the
severity of the problem, as well as the selection of the neural network
model. For classification purposes, only the borders of the classes are
important to represent accurately. 

References: 

   Bellman, R. (1961), Adaptive Control Processes: A Guided Tour, Princeton
   University Press. 

   Bishop, C.M. (1995), Neural Networks for Pattern Recognition, Oxford:
   Oxford University Press, section 1.4. 

   Scott, D.W. (1992), Multivariate Density Estimation, NY: Wiley. 

User Contributions:

Comment about this article, ask questions, or add new information about this topic:

CAPTCHA




Top Document: comp.ai.neural-nets FAQ, Part 2 of 7: Learning
Previous Document: What is a softmax activation function?
Next Document: How do MLPs compare with RBFs?

Part1 - Part2 - Part3 - Part4 - Part5 - Part6 - Part7 - Single Page

[ Usenet FAQs | Web FAQs | Documents | RFC Index ]

Send corrections/additions to the FAQ Maintainer:
saswss@unx.sas.com (Warren Sarle)





Last Update March 27 2014 @ 02:11 PM