
Multidimensional scaling by deterministic annealing

Energy Minimization Methods in Computer Vision and Pattern Recognition (EMMCVPR 1997)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1223)



Multidimensional scaling addresses the problem of how proximity data can be faithfully visualized as points in a low-dimensional Euclidean space. The quality of a data embedding is measured by a cost function called stress, which compares proximity values with the Euclidean distances of the respective points. We present a novel deterministic annealing algorithm to efficiently determine embedding coordinates for this continuous optimization problem. Experimental results demonstrate the superiority of the optimization technique compared to conventional gradient descent methods. Furthermore, we propose a transformation of dissimilarities to reduce the mismatch between a high-dimensional data space and a low-dimensional embedding space.
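
To make the notion of stress concrete, the following is a minimal sketch in plain Python. It assumes an unnormalized, unweighted sum of squared differences between given dissimilarities and embedded Euclidean distances; the abstract does not state the exact stress variant or weighting used in the paper, so the function names (`raw_stress`, `gradient_step`) and the learning rate are illustrative assumptions. The update shown is one step of the conventional gradient descent baseline that the abstract compares against, not the deterministic annealing algorithm itself.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points given as coordinate lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def raw_stress(dissim, points):
    """Assumed stress: sum over pairs i<j of (d_ij - D_ij)^2.

    dissim[i][j] is the given dissimilarity D_ij; d_ij is the distance
    between points[i] and points[j] in the embedding.
    """
    n = len(points)
    return sum(
        (euclidean(points[i], points[j]) - dissim[i][j]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )


def gradient_step(dissim, points, lr=0.05):
    """One plain gradient-descent step on raw_stress (baseline only).

    The gradient with respect to point i is
    sum over j != i of 2 * (d_ij - D_ij) * (x_i - x_j) / d_ij.
    The step size lr is an arbitrary illustrative choice.
    """
    n, dim = len(points), len(points[0])
    updated = []
    for i in range(n):
        grad = [0.0] * dim
        for j in range(n):
            if i == j:
                continue
            d = euclidean(points[i], points[j])
            if d == 0.0:
                continue  # gradient is undefined for coincident points
            coeff = 2.0 * (d - dissim[i][j]) / d
            for k in range(dim):
                grad[k] += coeff * (points[i][k] - points[j][k])
        updated.append([points[i][k] - lr * grad[k] for k in range(dim)])
    return updated


# Usage: dissimilarities of a 3-4-5 right triangle, embedded in the plane.
D = [[0.0, 3.0, 4.0], [3.0, 0.0, 5.0], [4.0, 5.0, 0.0]]
start = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
after_one_step = gradient_step(D, start)
print(raw_stress(D, start), raw_stress(D, after_one_step))
```

A configuration that reproduces all dissimilarities exactly has zero stress under this definition. Plain gradient descent of this kind can get stuck in poor local minima of the stress function, which is the weakness that annealing-based optimization is meant to address.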






Editor information

Marcello Pelillo, Edwin R. Hancock


Copyright information

© 1997 Springer-Verlag Berlin Heidelberg


Cite this paper

Klock, H., Buhmann, J.M. (1997). Multidimensional scaling by deterministic annealing. In: Pelillo, M., Hancock, E.R. (eds) Energy Minimization Methods in Computer Vision and Pattern Recognition. EMMCVPR 1997. Lecture Notes in Computer Science, vol 1223. Springer, Berlin, Heidelberg.


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-62909-2

  • Online ISBN: 978-3-540-69042-9

  • eBook Packages: Springer Book Archive
