## Search the FAQ Archives

3 - A - B - C - D - E - F - G - H - I - J - K - L - M
N - O - P - Q - R - S - T - U - V - W - X - Y - Z

# comp.ai.neural-nets FAQ, Part 2 of 7: LearningSection - What is the curse of dimensionality?

( Part1 - Part2 - Part3 - Part4 - Part5 - Part6 - Part7 - Single Page )
[ Usenet FAQs | Web FAQs | Documents | RFC Index | Cities ]

Top Document: comp.ai.neural-nets FAQ, Part 2 of 7: Learning
Previous Document: What is a softmax activation function?
Next Document: How do MLPs compare with RBFs?
```

Curse of dimensionality (Bellman 1961) refers to the exponential growth of
hypervolume as a function of dimensionality. In the field of NNs, curse of
dimensionality expresses itself in two related problems:

1. Many NNs can be thought of mappings from an input space to an output
space. Thus, loosely speaking, an NN needs to somehow "monitor", cover or
represent every part of its input space in order to know how that part of
the space should be mapped. Covering the input space takes resources,
and, in the most general case, the amount of resources needed is
proportional to the hypervolume of the input space. The exact formulation
of "resources" and "part of the input space" depends on the type of the
network and should probably be based on the concepts of information
theory and differential geometry.

As an example, think of a vector quantization (VQ). In VQ, a set of units
competitively learns to represents an input space (this is like Kohonen's
Self-Organizing Map but without topography for the units). Imagine a VQ
trying to share its units (resources) more or less equally over
hyperspherical input space. One could argue that the average distance
from a random point of the space to the nearest network unit measures the
goodness of the representation: the shorter the distance, the better is
the represention of the data in the sphere. It is intuitively clear (and
can be experimentally verified) that the total number of units required
to keep the average distance constant increases exponentially with the
dimensionality of the sphere (if the radius of the sphere is fixed).

The curse of dimensionality causes networks with lots of irrelevant
inputs to be behave relatively badly: the dimension of the input space is
high, and the network uses almost all its resources to represent
irrelevant portions of the space.

Unsupervised learning algorithms are typically prone to this problem - as
well as conventional RBFs. A partial remedy is to preprocess the input in
the right way, for example by scaling the components according to their
"importance".

2. Even if we have a network algorithm which is able to focus on important
portions of the input space, the higher the dimensionality of the input
space, the more data may be needed to find out what is important and what
is not.

A priori information can help with the curse of dimensionality. Careful
feature selection and scaling of the inputs fundamentally affects the
severity of the problem, as well as the selection of the neural network
model. For classification purposes, only the borders of the classes are
important to represent accurately.

References:

Bellman, R. (1961), Adaptive Control Processes: A Guided Tour, Princeton
University Press.

Bishop, C.M. (1995), Neural Networks for Pattern Recognition, Oxford:
Oxford University Press, section 1.4.

Scott, D.W. (1992), Multivariate Density Estimation, NY: Wiley.

```

## User Contributions:

Top Document: comp.ai.neural-nets FAQ, Part 2 of 7: Learning
Previous Document: What is a softmax activation function?
Next Document: How do MLPs compare with RBFs?

Part1 - Part2 - Part3 - Part4 - Part5 - Part6 - Part7 - Single Page

[ Usenet FAQs | Web FAQs | Documents | RFC Index ]

Send corrections/additions to the FAQ Maintainer:
saswss@unx.sas.com (Warren Sarle)

Last Update March 27 2014 @ 02:11 PM