
Abstract

Scientists working with large volumes of high-dimensional data, such as global climate patterns, stellar spectra, or human gene distributions, regularly confront the problem of dimensionality reduction: finding meaningful low-dimensional structures hidden in their high-dimensional observations. The human brain confronts the same problem in everyday perception, extracting from its high-dimensional sensory inputs—30,000 auditory nerve fibers or 10^6 optic nerve fibers—a manageably small number of perceptually relevant features. Here we describe an approach to solving dimensionality reduction problems that uses easily measured local metric information to learn the underlying global geometry of a data set. Unlike classical techniques such as principal component analysis (PCA) and multidimensional scaling (MDS), our approach is capable of discovering the nonlinear degrees of freedom that underlie complex natural observations, such as human handwriting or images of a face under different viewing conditions. In contrast to previous algorithms for nonlinear dimensionality reduction, ours efficiently computes a globally optimal solution, and, for an important class of data manifolds, is guaranteed to converge asymptotically to the true structure.
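For readers who want the three steps of the algorithm in one place—build a neighborhood graph, estimate geodesic distances by shortest paths, then apply classical MDS—here is a minimal Python sketch following the structure described in notes 15 to 17 below. It is illustrative only: all function and variable names are ours, and it omits the refinements discussed in the notes.

```python
# Minimal Isomap sketch: neighborhood graph -> shortest paths -> classical MDS.
# Illustrative only; function and variable names are ours, not the authors'.
import numpy as np
from scipy.spatial.distance import cdist
from scipy.sparse.csgraph import shortest_path

def isomap(X, n_neighbors=7, n_components=2):
    N = X.shape[0]
    D = cdist(X, X)                          # Euclidean input-space distances
    # Step 1: connect each point to its K nearest neighbors (K-Isomap).
    W = np.full((N, N), np.inf)
    nn = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]
    for i in range(N):
        W[i, nn[i]] = D[i, nn[i]]
    W = np.minimum(W, W.T)                   # symmetrize the graph
    # Step 2: graph distances d_G approximate the geodesic distances d_M.
    DG = shortest_path(W, method='D')        # Dijkstra; see note 16
    # Step 3: classical MDS on the graph distances (tau operator, note 17).
    S = DG ** 2
    H = np.eye(N) - np.ones((N, N)) / N      # centering matrix
    T = -H @ S @ H / 2
    vals, vecs = np.linalg.eigh(T)           # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:n_components]
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0))
```

If some points are disconnected from the rest of the neighborhood graph, DG contains infinities and the centering step fails; note 19 below, and the sketch that follows it, handle that case.
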
REFERENCES AND NOTES

1. Young M. P., Yamane S., Science 256, 1327 (1992).
2. Shepard R. N., Science 210, 390 (1980).
3. Turk M., Pentland A., J. Cogn. Neurosci. 3, 71 (1991).
4. Murase H., Nayar S. K., Int. J. Comp. Vision 14, 5 (1995).
5. McClurkin J. W., Optican L. M., Richmond B. J., Gawne T. J., Science 253, 675 (1991).
6. Elman J. L., Zipser D., J. Acoust. Soc. Am. 83, 1615 (1988).
7. Klein W., Plomp R., Pols L. C. W., J. Acoust. Soc. Am. 48, 999 (1970).
8. Bizzi E., Mussa-Ivaldi F. A., Giszter S., Science 253, 287 (1991).
9. Sanger T. D., Adv. Neural Info. Proc. Syst. 7, 1023 (1995).
10. Hurrell J. W., Science 269, 676 (1995).
11. Bailer-Jones C. A. L., Irwin M., von Hippel T., Mon. Not. R. Astron. Soc. 298, 361 (1997).
12. Menozzi P., Piazza A., Cavalli-Sforza L., Science 201, 786 (1978).
13. Mardia K. V., Kent J. T., Bibby J. M., Multivariate Analysis (Academic Press, London, 1979).
14. Monahan A. H., J. Clim., in press.
15. The scale-invariant K parameter is typically easier to set than ε, but may yield misleading results when the local dimensionality varies across the data set. When available, additional constraints such as the temporal ordering of observations may also help to determine neighbors. In earlier work (36) we explored a more complex method (37), which required an order of magnitude more data and did not support the theoretical performance guarantees we provide here for ε- and K-Isomap.
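
To make the two neighborhood rules concrete, here is a hedged sketch of both (names ours; assumes a NumPy array X of data points, one per row):

```python
# Two ways to build the Isomap neighborhood graph (illustrative sketch).
import numpy as np
from scipy.spatial.distance import cdist

def eps_neighbors(X, eps):
    """epsilon-Isomap: connect points closer than eps in the input space."""
    D = cdist(X, X)
    return (D < eps) & ~np.eye(len(X), dtype=bool)

def knn_neighbors(X, K):
    """K-Isomap: connect each point to its K nearest neighbors."""
    D = cdist(X, X)
    nn = np.argsort(D, axis=1)[:, 1:K + 1]   # skip self at column 0
    A = np.zeros_like(D, dtype=bool)
    A[np.repeat(np.arange(len(X)), K), nn.ravel()] = True
    return A | A.T                           # symmetrize
```

The K rule adapts to the local sampling density, which is what makes it scale invariant; the ε rule fixes an absolute distance scale in the input space.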
16. This procedure, known as Floyd's algorithm, requires $O(N^3)$ operations. More efficient algorithms exploiting the sparse structure of the neighborhood graph can be found in (38).
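
For concreteness, a direct transcription of Floyd's algorithm (a sketch; for large N, Dijkstra's algorithm on the sparse graph, e.g. scipy.sparse.csgraph.shortest_path with method='D', is preferable):

```python
# Floyd's algorithm: all-pairs shortest paths in O(N^3) time (note 16).
import numpy as np

def floyd(W):
    """W: N x N edge weights, np.inf where no edge, zeros on the diagonal."""
    D = W.copy()
    for k in range(len(D)):
        # Allow paths that pass through intermediate vertex k.
        D = np.minimum(D, D[:, [k]] + D[[k], :])
    return D
```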
17. The operator $\tau$ is defined by $\tau(D) = -HSH/2$, where $S$ is the matrix of squared distances $\{S_{ij} = D_{ij}^2\}$, and $H$ is the "centering matrix" $\{H_{ij} = \delta_{ij} - 1/N\}$ (13).
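
In code, the operator is a few lines of linear algebra (a minimal sketch using the definitions above; names ours):

```python
# The tau operator of note 17, mapping distances to inner products
# so that classical MDS can recover coordinates.
import numpy as np

def tau(D):
    N = len(D)
    S = D ** 2                             # S_ij = D_ij^2
    H = np.eye(N) - np.ones((N, N)) / N    # H_ij = delta_ij - 1/N
    return -H @ S @ H / 2
```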
18. Our proof works by showing that for a sufficiently high density ($\alpha$) of data points, we can always choose a neighborhood size ($\epsilon$ or $K$) large enough that the graph will (with high probability) have a path not much longer than the true geodesic, but small enough to prevent edges that "short circuit" the true geometry of the manifold. More precisely, given arbitrarily small values of $\lambda_1$, $\lambda_2$, and $\mu$, we can guarantee that with probability at least $1 - \mu$, estimates of the form
$$(1 - \lambda_1)\,d_M(i,j) \le d_G(i,j) \le (1 + \lambda_2)\,d_M(i,j)$$
will hold uniformly over all pairs of data points $i, j$. For $\epsilon$-Isomap, we require
$$\epsilon \le (2/\pi)\,r_0\sqrt{24\lambda_1}, \qquad \epsilon < s_0,$$
$$\alpha > \left[\log\left(V / \mu\eta_d(\lambda_2\epsilon/16)^d\right)\right] / \eta_d(\lambda_2\epsilon/8)^d,$$
where $r_0$ is the minimal radius of curvature of the manifold $M$ as embedded in the input space $X$, $s_0$ is the minimal branch separation of $M$ in $X$, $V$ is the ($d$-dimensional) volume of $M$, and (ignoring boundary effects) $\eta_d$ is the volume of the unit ball in Euclidean $d$-space. For $K$-Isomap, we let $\epsilon$ be as above and fix the ratio $(K+1)/\alpha = \eta_d(\epsilon/2)^d/2$. We then require
$$e^{-(K+1)/4} \le \mu\eta_d(\epsilon/4)^d / 4V,$$
$$(e/4)^{(K+1)/2} \le \mu\eta_d(\epsilon/8)^d / 16V,$$
$$\alpha > \left[4\log\left(8V / \mu\eta_d(\lambda_2\epsilon/32\pi)^d\right)\right] / \eta_d(\lambda_2\epsilon/16\pi)^d.$$
The exact content of these conditions—but not their general form—depends on the particular technical assumptions we adopt. For details and extensions to nonuniform densities, intrinsic curvature, and boundary effects, see http://isomap.stanford.edu.
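
To give a feel for the $\epsilon$-Isomap density condition, the following sketch evaluates the lower bound on $\alpha$ for hypothetical manifold parameters (all numbers illustrative, not taken from the paper):

```python
# Evaluate the epsilon-Isomap density bound from note 18 (illustrative).
import math

def unit_ball_volume(d):
    """eta_d: volume of the unit ball in Euclidean d-space."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

def min_density(V, d, eps, lam2, mu):
    """Lower bound on the sampling density alpha from note 18."""
    eta = unit_ball_volume(d)
    return math.log(V / (mu * eta * (lam2 * eps / 16) ** d)) / (
        eta * (lam2 * eps / 8) ** d)

# Hypothetical example: a 2D manifold of unit area, eps = 0.1,
# 10% distortion tolerance, 99% success probability.
print(min_density(V=1.0, d=2, eps=0.1, lam2=0.1, mu=0.01))
```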
19. In practice, for finite data sets, $d_G(i,j)$ may fail to approximate $d_M(i,j)$ for a small fraction of points that are disconnected from the giant component of the neighborhood graph $G$. These outliers are easily detected as having infinite graph distances from the majority of other points and can be deleted from further analysis.
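
A hedged sketch of that cleanup step (assumes DG is the matrix of graph distances, with np.inf between disconnected points):

```python
# Remove points disconnected from the giant component (note 19):
# they sit at infinite graph distance from most other points.
import numpy as np

def drop_disconnected(DG):
    """Return the distance submatrix for the largest connected component."""
    counts = np.isfinite(DG).sum(axis=1)   # size of each point's component
    keep = counts == counts.max()          # points in the giant component
    return DG[np.ix_(keep, keep)], keep
```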
20. The Isomap embedding of the hand images is available at Science Online at www.sciencemag.org/cgi/content/full/290/5500/2319/DC1. For additional material and computer code, see http://isomap.stanford.edu.
21. Basri R., Roth D., Jacobs D., Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (1998), pp. 414-420.
22. Bregler C., Omohundro S. M., Adv. Neural Info. Proc. Syst. 7, 973 (1995).
23. Hinton G. E., Revow M., Dayan P., Adv. Neural Info. Proc. Syst. 7, 1015 (1995).
24. Durbin R., Willshaw D., Nature 326, 689 (1987).
25. Kohonen T., Self-Organisation and Associative Memory (Springer-Verlag, Berlin, ed. 2, 1988), pp. 119-157.
26. Hastie T., Stuetzle W., J. Am. Stat. Assoc. 84, 502 (1989).
27. Kramer M. A., AIChE J. 37, 233 (1991).
28. DeMers D., Cottrell G., Adv. Neural Info. Proc. Syst. 5, 580 (1993).
29. Hecht-Nielsen R., Science 269, 1860 (1995).
30. Bishop C. M., Svensén M., Williams C. K. I., Neural Comp. 10, 215 (1998).
31. Comon P., Signal Proc. 36, 287 (1994).
32. Bell A. J., Sejnowski T. J., Neural Comp. 7, 1129 (1995).
33. Shepard R. N., Judd S. A., Science 191, 952 (1976).
34. Shiffrar M., Freyd J. J., Psychol. Science 1, 257 (1990).
35. Shepard R. N., Psychon. Bull. Rev. 1, 2 (1994).
36. Tenenbaum J. B., Adv. Neural Info. Proc. Syst. 10, 682 (1998).
37. Martinetz T., Schulten K., Neural Netw. 7, 507 (1994).
38. Kumar V., Grama A., Gupta A., Karypis G., Introduction to Parallel Computing: Design and Analysis of Algorithms (Benjamin/Cummings, Redwood City, CA, 1994), pp. 257-297.
39. Beymer D., Poggio T., Science 272, 1905 (1996).
41. Simard P. Y., LeCun Y., Denker J., Adv. Neural Info. Proc. Syst. 5, 50 (1993).
42. In order to evaluate the fits of PCA, MDS, and Isomap on comparable grounds, we use the residual variance $1 - R^2(M, D_Y)$. $D_Y$ is the matrix of Euclidean distances in the low-dimensional embedding recovered by each algorithm. $M$ is each algorithm's best estimate of the intrinsic manifold distances: for Isomap, this is the graph distance matrix $D_G$; for PCA and MDS, it is the Euclidean input-space distance matrix $D_X$ (except with the handwritten "2"s, where MDS uses the tangent distance). $R$ is the standard linear correlation coefficient, taken over all entries of $M$ and $D_Y$.
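
As a sketch, this residual variance is a few lines of code (names ours):

```python
# Residual variance 1 - R^2(M, D_Y) used to compare PCA, MDS, and Isomap.
import numpy as np

def residual_variance(M, DY):
    """M: estimated manifold distances; DY: distances in the embedding."""
    r = np.corrcoef(M.ravel(), DY.ravel())[0, 1]   # linear correlation R
    return 1.0 - r ** 2
```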
43. In each sequence shown, the three intermediate images are those closest to the points 1/4, 1/2, and 3/4 of the way between the given endpoints. We can also synthesize an explicit mapping from input space $X$ to the low-dimensional embedding $Y$, or vice versa, using the coordinates of corresponding points $\{x_i, y_i\}$ in both spaces provided by Isomap together with standard supervised learning techniques (39).
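
A sketch of that selection scheme (names ours): pick the data points whose embedding coordinates lie closest to evenly spaced points on the straight line between two endpoints in $Y$.

```python
# Select the images closest to fractions t of the way between two
# endpoints in the low-dimensional embedding Y (note 43; sketch).
import numpy as np

def path_samples(Y, i, j, fractions=(0.25, 0.5, 0.75)):
    picks = []
    for t in fractions:
        target = (1 - t) * Y[i] + t * Y[j]   # point on the segment in Y
        picks.append(int(np.argmin(np.linalg.norm(Y - target, axis=1))))
    return picks
```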
44. Supported by the Mitsubishi Electric Research Laboratories, the Schlumberger Foundation, the NSF (DBS-9021648), and the DARPA Human ID program. We thank Y. LeCun for making available the MNIST database and S. Roweis and L. Saul for sharing related unpublished work. For many helpful discussions, we thank G. Carlsson, H. Farid, W. Freeman, T. Griffiths, R. Lehrer, S. Mahajan, D. Reich, W. Richards, J. M. Tenenbaum, Y. Weiss, and especially M. Bernstein.

Published In

Science
Volume 290, Issue 5500, 22 December 2000
Pages: 2319-2323

History

Received: 10 August 2000
Accepted: 21 November 2000


Authors

Affiliations

Joshua B. Tenenbaum*
Department of Psychology, Stanford University, Stanford, CA 94305, USA.
Vin de Silva
Department of Mathematics, Stanford University, Stanford, CA 94305, USA.
John C. Langford
Department of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15217, USA.

Notes

* To whom correspondence should be addressed. E-mail: [email protected]
