Abstract
We review the information-geometric framework for statistical pattern recognition. First, we explain the role of statistical similarity measures and distances in fundamental statistical pattern recognition problems. We then concisely review the main statistical distances and report a novel, versatile family of divergences. Depending on their intrinsic complexity, statistical patterns are learned as atomic parametric distributions, semi-parametric finite mixtures, or non-parametric kernel density estimators. These statistical patterns are interpreted and handled geometrically on statistical manifolds as single points, weighted sparse point sets, or unweighted dense point sets, respectively. We explain the construction of the two prominent families of statistical manifolds: the Rao Riemannian manifolds with geodesic metric distances, and the Amari-Chentsov manifolds with dual asymmetric non-metric divergences. For the latter manifolds, when the atomic distributions belong to the same exponential family (including the ubiquitous Gaussian and multinomial families), we obtain dually flat exponential family manifolds, which play a crucial role in many applications. We compare the advantages and disadvantages of these two approaches from the algorithmic point of view. Finally, we conclude with further perspectives on how “geometric thinking” may spur novel pattern modeling and processing paradigms.
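As a small illustration of the "dual asymmetric non-metric divergences" on exponential family manifolds, the sketch below checks numerically that the Kullback-Leibler divergence between two univariate Gaussians is asymmetric, and that it coincides with a Bregman divergence on the natural parameters taken in swapped order. This is a minimal sketch, not the chapter's own code: the function names, the choice of the univariate Gaussian, and the test parameters are all assumptions made here for illustration.

```python
import math

def kl_gaussian(m1, s1, m2, s2):
    """Closed-form KL(N(m1, s1^2) || N(m2, s2^2)) for univariate Gaussians."""
    return math.log(s2 / s1) + (s1**2 + (m1 - m2)**2) / (2 * s2**2) - 0.5

def natural_params(m, s):
    """Natural parameters (mu/sigma^2, -1/(2 sigma^2)) of N(m, s^2)."""
    return (m / s**2, -1.0 / (2 * s**2))

def log_normalizer(theta):
    """Log-normalizer F(theta) of the univariate Gaussian family."""
    t1, t2 = theta
    return -t1**2 / (4 * t2) + 0.5 * math.log(-math.pi / t2)

def grad_log_normalizer(theta):
    """Gradient of F: the expectation parameters (E[x], E[x^2])."""
    t1, t2 = theta
    return (-t1 / (2 * t2), t1**2 / (4 * t2**2) - 1.0 / (2 * t2))

def bregman(theta_a, theta_b):
    """B_F(theta_a : theta_b) = F(a) - F(b) - <a - b, grad F(b)>."""
    g = grad_log_normalizer(theta_b)
    inner = sum((a - b) * gi for a, b, gi in zip(theta_a, theta_b, g))
    return log_normalizer(theta_a) - log_normalizer(theta_b) - inner

# Hypothetical example distributions, given as (mean, standard deviation).
p = (0.0, 1.0)
q = (1.0, 2.0)

kl_pq = kl_gaussian(*p, *q)
kl_qp = kl_gaussian(*q, *p)
# KL(p || q) equals the Bregman divergence with the arguments swapped.
breg = bregman(natural_params(*q), natural_params(*p))

print("KL(p||q) =", kl_pq)
print("KL(q||p) =", kl_qp)
print("B_F(theta_q : theta_p) =", breg)
```

The two KL values differ, which is why such divergences are not metric distances, while the agreement between KL(p||q) and the Bregman divergence reflects the dually flat structure mentioned for exponential families.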
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nielsen, F. (2013). Pattern Learning and Recognition on Statistical Manifolds: An Information-Geometric Review. In: Hancock, E., Pelillo, M. (eds) Similarity-Based Pattern Recognition. SIMBAD 2013. Lecture Notes in Computer Science, vol 7953. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39140-8_1
DOI: https://doi.org/10.1007/978-3-642-39140-8_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39139-2
Online ISBN: 978-3-642-39140-8
eBook Packages: Computer Science, Computer Science (R0)