Geometrically local isotropic independence and numerical analysis of the Mahalanobis metric in vector space

https://doi.org/10.1016/j.patrec.2009.07.018

Abstract

The Mahalanobis metric was proposed by extending the Mahalanobis distance to provide a probabilistic distance for a non-normal distribution. The Mahalanobis metric equation is a nonlinear second-order differential equation derived from the geometrically local isotropic independence equation, which was proposed to define normal distributions in a manifold. We explain the equations and show experimental results of calculating the Mahalanobis metric by the Newton–Raphson method. To investigate the effect of error in a probability density function on the solution, we add error to an original probability density function and calculate the Mahalanobis metric. This paper is an extended version of “Numerical analysis of Mahalanobis metric in vector space” (Track 2 IBM Best Student Paper Award at ICPR’08; Son, J., Inoue, N., Yamashita, Y., 2008. In: Proc. 19th Int. Conf. on Pattern Recognition (CD–ROM)).

Introduction

The Mahalanobis distance (Mahalanobis, 1937), which is the distance normalized by the variance–covariance matrix of a distribution, is widely used for pattern recognition and data analysis. Compared to the Euclidean distance, it often improves recognition accuracy.

For example, if two vectors differ in a direction in which the variance is small, the difference is small in terms of the Euclidean distance but may be large from the viewpoint of probability (Fig. 1a).
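The situation of Fig. 1a can be reproduced numerically. The following sketch (with an illustrative covariance matrix, not data from the paper) shows two points at the same Euclidean distance from the mean whose Mahalanobis distances differ by a factor of four:

```python
import numpy as np

# Illustrative anisotropic covariance: large variance along x1, small along x2.
Sigma = np.array([[4.0, 0.0],
                  [0.0, 0.25]])
Sigma_inv = np.linalg.inv(Sigma)
mu = np.zeros(2)

def mahalanobis(x, mu, Sigma_inv):
    """Distance normalized by the variance-covariance matrix."""
    d = x - mu
    return float(np.sqrt(d @ Sigma_inv @ d))

a = np.array([1.0, 0.0])  # offset along the high-variance direction
b = np.array([0.0, 1.0])  # same Euclidean length, low-variance direction

print(np.linalg.norm(a - mu), np.linalg.norm(b - mu))  # both 1.0
print(mahalanobis(a, mu, Sigma_inv))  # 0.5  (probabilistically close)
print(mahalanobis(b, mu, Sigma_inv))  # 2.0  (probabilistically far)
```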

When the distribution of patterns is normal, normalization by the variance–covariance matrix is reasonable. However, consider a probability density function (p.d.f.) that is not normal, such as the set of samples shown in Fig. 1b. For such a distribution it is natural to evaluate the distance not along a straight line but along a curve. The Mahalanobis metric was proposed to realize such a distance.

In the n-dimensional Euclidean space, the normal distribution with mean μ and variance–covariance matrix Σ is expressed by the p.d.f.
p(x) = \frac{1}{\sqrt{(2\pi)^n \det(\Sigma)}} \exp\bigl(-\langle \Sigma^{-1}(x-\mu),\, x-\mu \rangle / 2\bigr),
where \langle\cdot,\cdot\rangle denotes the inner product and Σ is assumed to be regular. Let I be the identity matrix. When x is mapped to y by the linear transformation y = \Sigma^{-1/2} x, p is transformed to the p.d.f. of a normal distribution whose variance–covariance matrix is I. The inner product that expresses the Mahalanobis distance is given as the pull back of the inner product, \langle y, y \rangle = \langle \Sigma^{-1} x, x \rangle (Fig. 2).
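The pull back can be verified numerically. The sketch below (with an illustrative covariance matrix, not from the paper) builds the whitening map y = Σ^{-1/2}x from an eigendecomposition and checks that ⟨y, y⟩ = ⟨Σ^{-1}x, x⟩:

```python
import numpy as np

# Illustrative covariance; any symmetric positive definite Sigma works here.
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])

# Symmetric inverse square root of Sigma via its eigendecomposition.
w, V = np.linalg.eigh(Sigma)
Sigma_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T

# Whitening: the covariance of y = Sigma^{-1/2} x is the identity.
assert np.allclose(Sigma_inv_sqrt @ Sigma @ Sigma_inv_sqrt, np.eye(2))

# Pull back of the inner product: <y, y> = <Sigma^{-1} x, x>.
rng = np.random.default_rng(0)
x = rng.standard_normal(2)
y = Sigma_inv_sqrt @ x
assert np.isclose(y @ y, x @ np.linalg.inv(Sigma) @ x)
```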

The concept of geometrically local isotropic independence (GLII) was proposed to define a normal distribution in a manifold (Yamashita et al., 2006). It provides a normal distribution whose variance–covariance matrix is aI (a > 0) in the Euclidean space, the von Mises–Fisher distribution on a hyper-spherical surface (Mardia, 1972, Mardia and Jupp, 1999), and a distribution proportional to e^{-\kappa\cosh\theta} in the Lobachevsky space (Alekseevskij et al., 1993). The Mahalanobis metric equation was defined by extending the linear transformation to a diffeomorphism and using the GLII equation (Yamashita et al., 2006, Son et al., 2008). It is remarkable that this differential equation depends neither on the coordinate system nor on the metric originally defined in the space. Furthermore, the diffeomorphism disappears completely from the equation.

In one-dimensional statistical analysis, approximate transformation to a normal distribution, such as the Box–Cox transformation, is often used. It has also been shown that classification accuracy is improved by transforming a gamma distribution of a component of feature vectors to a normal distribution (Shia et al., 2002). On the other hand, the Mahalanobis distance (whitening data by the variance–covariance matrix) is a common technique in pattern recognition. By measuring distance with the Mahalanobis metric, we can do both simultaneously for multi-dimensional data. Therefore, the Mahalanobis metric should improve the accuracy of statistical analysis and pattern recognition.
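As a hedged one-dimensional illustration of the normalizing transformations mentioned above (the function names and constants here are ours, not the paper's), the Box–Cox transform with λ = 1/3 roughly symmetrizes a gamma-distributed sample:

```python
import numpy as np

def boxcox(x, lam):
    """Box-Cox transform: (x**lam - 1)/lam for lam != 0, log(x) for lam == 0."""
    x = np.asarray(x, dtype=float)
    return np.log(x) if lam == 0.0 else (x ** lam - 1.0) / lam

def skewness(v):
    """Sample skewness; roughly 0 for a normal sample."""
    return float(((v - v.mean()) ** 3).mean() / v.std() ** 3)

rng = np.random.default_rng(1)
x = rng.gamma(shape=2.0, scale=1.0, size=5000)  # strongly right-skewed sample
y = boxcox(x, 1.0 / 3.0)  # the cube root approximately normalizes gamma data

# The transformed sample is far closer to symmetric than the original.
assert abs(skewness(y)) < 0.3 < abs(skewness(x))
```

The Mahalanobis metric performs this kind of normalization jointly with whitening for multi-dimensional data, rather than component by component.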

We explain the GLII and Mahalanobis metric equations and conduct experiments in which the Mahalanobis metric equation is solved by the Newton–Raphson method. The experiments show that the equation can be solved efficiently even though its nonlinearity is very high. When a p.d.f. is estimated from sample data in a real application, it inevitably contains errors. To address this, we added an error to an original p.d.f. and solved the Mahalanobis metric equation to investigate the effect of the error on the solution.

In Section 2, the GLII is explained. In Section 3, the Mahalanobis metric equation is shown; we also give a solution of the GLII equation on a hyper-spherical surface, for which a proof is provided for the first time. In Section 4, a numerical analysis method for the Mahalanobis metric based on the Newton–Raphson method is proposed. In Section 5, experimental results are shown.


Characterizations of normal distribution in Euclidean space

Normal distributions can be characterized by the equality between the sample mean and the maximum likelihood estimator (C.F. Gauss), by isotropic independence (Maxwell), by the entropy, or by limits (Fry, 1965, Feller, 1968, Maistrov, 1974, Tien and Lienhard, 1985).

In the first characterization, Gauss showed that if a p.d.f. is given as p(x − μ), where μ is a location parameter, samples are extracted independently, and the maximum likelihood estimator of μ is always given by the sample mean of x, then the p.d.f. is a normal distribution.

Mahalanobis metric equation

For a p.d.f. p in a manifold M, suppose that a diffeomorphism T from M to a manifold M′ transforms p to a solution of the GLII equation in M′. Fig. 3 illustrates this transformation, where x^μ and y^μ are coordinate systems in M and M′, respectively.

Then, the Mahalanobis metric \tilde{g}_{\mu\nu} in M is defined by the pull back of the metric in M′ and is given by the Mahalanobis metric equation:
\tilde{\nabla}_\mu \tilde{\nabla}_\nu \Bigl( \log p - \tfrac{1}{2}\log\tilde{g} \Bigr) - f\,\tilde{g}_{\mu\nu} = 0,
where \tilde{g} = \det(\tilde{g}_{\mu\nu}) and \tilde{\nabla}_\mu is the covariant derivative derived from \tilde{g}_{\mu\nu}. A scalar f is

Mahalanobis metric by the Newton–Raphson method

We provide an algorithm to solve the Mahalanobis metric equation by the Newton–Raphson method. Here, we discuss the case in which the dimension of the space is two. For brevity, \tilde{g}_{\mu\nu}, \tilde{\nabla}, etc. are denoted without the tilde. We expand the Mahalanobis metric Eq. (13) and let its left-hand side be F_{\mu\nu}. Then, we have
F_{11} = \partial_1\partial_1\log p - \Gamma^{1}_{11}\partial_1\log p - \Gamma^{2}_{11}\partial_2\log p + \Gamma^{1}_{11}(\Gamma^{1}_{11}+\Gamma^{2}_{12}) + \Gamma^{2}_{11}(\Gamma^{1}_{21}+\Gamma^{2}_{22}) - \partial_1(\Gamma^{1}_{11}+\Gamma^{2}_{12}) + g_{11},
F_{12} = \partial_1\partial_2\log p - \Gamma^{1}_{12}\partial_1\log p - \Gamma^{2}_{12}\partial_2\log p + \Gamma^{1}_{12}(\Gamma^{1}_{11}+\Gamma^{2}_{12}) + \Gamma^{2}_{12}(\Gamma^{1}_{21}+\Gamma^{2}_{22}) - \partial_2(\Gamma^{1}_{11}+\Gamma^{2}_{12}) + g_{12},
F_{22} = \partial_2\partial_2\log p - \Gamma^{1}_{22}\partial_1\log p - \Gamma^{2}_{22}\partial_2\log p + \Gamma^{1}_{22}(\Gamma^{1}_{11}+\Gamma^{2}_{12}) + \Gamma^{2}_{22}(\Gamma^{1}_{21}+\Gamma^{2}_{22}) - \partial_2(\Gamma^{1}_{21}+\Gamma^{2}_{22}) + g_{22}.
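The iteration itself is the standard Newton–Raphson method for a nonlinear system F(u) = 0. The following sketch shows the generic scheme with a forward-difference Jacobian on a toy two-dimensional system; it does not reproduce the paper's discretization of F_{μν} from Eq. (13):

```python
import numpy as np

def newton_raphson(F, u0, tol=1e-10, max_iter=50, eps=1e-7):
    """Solve F(u) = 0 by Newton-Raphson with a forward-difference Jacobian."""
    u = np.asarray(u0, dtype=float)
    for _ in range(max_iter):
        r = F(u)
        if np.linalg.norm(r) < tol:
            break
        # J[i, j] approximates dF_i/du_j by a forward difference.
        J = np.empty((len(r), len(u)))
        for j in range(len(u)):
            du = np.zeros_like(u)
            du[j] = eps
            J[:, j] = (F(u + du) - r) / eps
        u = u - np.linalg.solve(J, r)
    return u

# Toy system: the unit circle intersected with the parabola y = x^2.
F = lambda u: np.array([u[0] ** 2 + u[1] ** 2 - 1.0, u[1] - u[0] ** 2])
root = newton_raphson(F, [1.0, 1.0])
assert np.allclose(F(root), 0.0, atol=1e-8)
```

In the paper's setting the unknown u collects the discretized metric components, and F evaluates the residuals F_{11}, F_{12}, F_{22} at the grid points.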

Experimental results

Since the nonlinearity of the Mahalanobis equation is very high, we have to examine whether or not the equation can be solved from a given p.d.f. In this experiment, we show that we can solve it efficiently by the Newton–Raphson method.

First, we define a coordinate transform as
y^1 = a_1 x^1,\qquad y^2 = a_2 x^2 + b\,h(x^1)\,h(c\,x^2),
with constants a_1, a_2, b, and c, where
h(t) = \begin{cases} e^{4\left(1-\frac{1}{1-t^2}\right)} & (-1 < t < 1)\\ 0 & (\text{else}). \end{cases}

Function (22) is an infinite-times continuously differentiable function with a compact support. It is sufficient if the second derivatives are
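As a reference sketch, h of Function (22) and the coordinate transform can be implemented directly (the default constants in `transform` below are illustrative, not the paper's values):

```python
import numpy as np

def h(t):
    """C-infinity bump function: exp(4(1 - 1/(1 - t^2))) on (-1, 1), else 0."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    out = np.zeros_like(t)
    inside = np.abs(t) < 1.0
    out[inside] = np.exp(4.0 * (1.0 - 1.0 / (1.0 - t[inside] ** 2)))
    return out if out.size > 1 else float(out[0])

def transform(x1, x2, a1=1.0, a2=1.0, b=0.2, c=1.0):
    """y1 = a1*x1, y2 = a2*x2 + b*h(x1)*h(c*x2); constants are illustrative."""
    return a1 * x1, a2 * x2 + b * h(x1) * h(c * x2)

# h peaks at 1 at the center of its support and vanishes for |t| >= 1, so the
# transform deviates from a pure scaling only on a compact region.
assert h(0.0) == 1.0 and h(1.0) == 0.0 and h(-2.0) == 0.0
assert transform(2.0, 0.0) == (2.0, 0.0)
```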

Conclusion

We explained the GLII and the Mahalanobis metric equations and showed experimental results of solving the Mahalanobis equation by the Newton–Raphson method. The results show that the Mahalanobis metric can be obtained even when error is added to a p.d.f.

For future work, we have to develop methods to obtain the Mahalanobis metric from samples and in higher-dimensional spaces. Furthermore, extension of the GLII equation to time-series data is necessary.

References (12)

  • D. Alekseevskij et al., Spaces of Constant Curvature (1993)
  • W. Feller (1968)
  • T. Fry, Probability and its Engineering Uses (1965)
  • P. Mahalanobis, Normalization of statistical variates and the use of rectangular co-ordinates in the theory of sampling distributions, Sankhyā (1937)
  • L. Maistrov, Probability Theory, a Historical Sketch (1974)
  • K. Mardia, Statistics of Directional Data (1972)
