Abstract
In this chapter, the embedding of a set of data into a vector space is studied when an unconditional pairwise dissimilarity w between the data is given. The vector space is endowed with a suitable pseudo-Euclidean structure, and the data embedding is built by extending classical kernel principal component analysis. This embedding is unique up to an isomorphism, and it is injective if and only if w separates the data. The construction takes advantage of the axes corresponding to negative eigenvalues to develop pseudo-Euclidean scatterplot matrix representations. This new visual tool is applied to compare various dissimilarities between hidden Markov models built from people's faces.
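To make the idea of a pseudo-Euclidean embedding concrete, the following is a minimal sketch of the standard textbook construction (double centering of the squared dissimilarities followed by an eigendecomposition). It is not necessarily the exact procedure developed in the chapter. The function names `pseudo_euclidean_embedding` and `pseudo_squared_distances`, the tolerance `tol`, and the small example matrix are assumptions introduced here for illustration only.

```python
import numpy as np


def pseudo_euclidean_embedding(D, tol=1e-10):
    """Embed n items given a symmetric dissimilarity matrix D with zero diagonal.

    Returns (X, signs). Row i of X holds the coordinates of item i, and
    signs[k] is +1 or -1 for axis k, so that
    D[i, j]**2 is approximately sum_k signs[k] * (X[i, k] - X[j, k])**2.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering projector
    G = -0.5 * J @ (D ** 2) @ J              # Gram-like matrix, possibly indefinite
    eigvals, eigvecs = np.linalg.eigh(G)     # G is symmetric, so eigh applies
    # Discard numerically zero eigenvalues (assumed relative threshold).
    keep = np.abs(eigvals) > tol * max(1.0, np.abs(eigvals).max())
    eigvals, eigvecs = eigvals[keep], eigvecs[:, keep]
    # Order axes by decreasing magnitude, whatever their sign.
    order = np.argsort(-np.abs(eigvals))
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    X = eigvecs * np.sqrt(np.abs(eigvals))   # scale each axis by sqrt(|lambda|)
    signs = np.sign(eigvals)                 # -1 marks a "negative" axis
    return X, signs


def pseudo_squared_distances(X, signs):
    """Signed squared distances between all pairs of rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.einsum("ijk,k->ij", diff ** 2, signs)


# Hypothetical example: this dissimilarity violates the triangle inequality
# (5 > 1 + 1), so no Euclidean embedding exists and a negative axis appears.
D = np.array([[0.0, 1.0, 5.0],
              [1.0, 0.0, 1.0],
              [5.0, 1.0, 0.0]])
X, signs = pseudo_euclidean_embedding(D)
reconstructed = pseudo_squared_distances(X, signs)
```

In this sketch, `reconstructed` matches `D ** 2` up to rounding, and `signs` contains at least one `-1`. Plotting pairs of columns of `X`, with the negative axes marked, gives the kind of scatterplot matrix the abstract refers to.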
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Aupetit, S., Monmarché, N., Liardet, P., Slimane, M. (2009). Dissimilarity Analysis and Application to Visual Comparisons. In: Hassanien, AE., Abraham, A., Vasilakos, A.V., Pedrycz, W. (eds) Foundations of Computational Intelligence Volume 1. Studies in Computational Intelligence, vol 201. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01082-8_13
DOI: https://doi.org/10.1007/978-3-642-01082-8_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01081-1
Online ISBN: 978-3-642-01082-8
eBook Packages: Engineering, Engineering (R0)