Short communication
Efficient estimation of the mode of continuous multivariate data

https://doi.org/10.1016/j.csda.2013.01.018

Abstract

Mode estimation is an important task, because it has applications to data from a wide variety of sources. Many mode estimates have been proposed, most of them based on nonparametric density estimates. However, although such estimates perform excellently with large sample sizes, they perform unsatisfactorily with practical (i.e., small to moderate) sample sizes. Recently, Bickel (2003) proposed an efficient method to estimate the mode of continuous univariate data, and showed that its performance is excellent with small to moderate sample sizes. In this paper, we extend Bickel’s method to continuous multivariate data by using the multivariate Box–Cox transform. The excellent performance of the proposed method at practical sample sizes is demonstrated by simulation examples and two real examples from the fields of climatology and image recognition.

Introduction

Mode estimation is an important task, because it has applications to data from a wide variety of sources. Its applications include, for example, 1-D (one-dimensional; similarly 2-D, etc.) measurements of cellular DNA content (Marchette and Wegman, 1997), 1-D time estimates in molecular clock studies (Hedges and Shah, 2003), 2-D predictions of the greatest air pollution (Sager, 1979), seismic tremor locations (Ooi, 2002), 3-D color histograms (Comaniciu, 2003), 5-D color image segmentation (Comaniciu and Meer, 2002), 13-D flow cytometry measurements (Jing et al., 2012), 16-D image recognition (Griffin and Lillholm, 2003), 66-D gene expression time-courses (Liu and Muller, 2003), and 365-D daily rainfall rates (Hall and Heckman, 2002). However, it is difficult to estimate the mode(s) precisely when the underlying distribution is continuous but otherwise unknown.

Many mode estimates have been proposed for univariate data, with most based on nonparametric kernel density estimates, i.e., the population modes are estimated by those of the density estimates (see, e.g., Parzen, 1962; Eddy, 1980; Silverman, 1981; Vieu, 1996; Minnotte et al., 1998, among many others). Although the NKDM (nonparametric kernel density mode) performs excellently with large sample sizes, its performance at small to moderate ones is not satisfactory. Recently, Bickel (2003) proposed a parametric mode, using the univariate Box–Cox transform, which performs very well at small to moderate sample sizes. For a review of methods for univariate data, the reader is referred to Bickel and Frühwirth (2006) and the references cited therein.
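As a rough illustration of the parametric idea, the following Python sketch assumes that the Box–Cox-transformed data are exactly normal, picks the power λ by a grid search over the normal profile likelihood, and returns the mode of the density this implies for the original variable. The function names, the grid, and the likelihood-based choice of λ are assumptions made for this example. They are not taken from Bickel (2003), whose estimator may select λ differently.

```python
import math
import random
import statistics


def box_cox(x, lam):
    """Univariate Box-Cox transform of a positive number x."""
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def profile_loglik(data, lam):
    """Normal profile log-likelihood of lam (additive constants dropped)."""
    y = [box_cox(x, lam) for x in data]
    var = statistics.pvariance(y)
    log_jacobian = (lam - 1.0) * sum(math.log(x) for x in data)
    return -0.5 * len(data) * math.log(var) + log_jacobian


def parametric_mode(data, grid=None):
    """Mode of the density implied by 'Box-Cox(data) is normal'.

    Assumption of this sketch: lam is chosen on a fixed grid by maximum
    likelihood; mu and var are the sample mean and variance of the
    transformed data.
    """
    if grid is None:
        grid = [k / 20.0 for k in range(-40, 41)]  # lam in [-2, 2], step 0.05
    lam = max(grid, key=lambda l: profile_loglik(data, l))
    y = [box_cox(x, lam) for x in data]
    mu, var = statistics.fmean(y), statistics.pvariance(y)
    if lam == 0:
        return math.exp(mu - var)  # mode of a log-normal density
    # The density of x is proportional to
    #   exp(-(box_cox(x, lam) - mu)**2 / (2 * var)) * x**(lam - 1).
    # Setting the derivative of its logarithm to zero gives a quadratic in
    # u = x**lam; the larger root corresponds to the interior mode.
    a = 1.0 + lam * mu
    disc = a * a + 4.0 * var * lam * (lam - 1.0)
    if disc < 0:
        raise ValueError("the fitted density has no interior mode")
    u = 0.5 * (a + math.sqrt(disc))
    return u ** (1.0 / lam)


if __name__ == "__main__":
    random.seed(0)
    sample = [random.lognormvariate(1.0, 0.5) for _ in range(200)]
    print(parametric_mode(sample))  # the true mode is exp(1 - 0.5**2)
```

The grid search keeps the sketch free of external dependencies; a numerical optimizer could replace it without changing the rest of the calculation.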

Although many practical applications involve multivariate data, much less work has been done on mode estimation for such data. This may be because multivariate cases are more complex than univariate ones. The selection of the bandwidth, i.e., the scale parameter, is crucial to the performance of the NKDM. Gasser et al. (1998), Abraham et al. (2003) and Klemelä (2005) proposed multivariate NKDMs with a fixed bandwidth, and Jing et al. (2012) proposed a similar one using polynomial histograms instead of kernels. Griffin and Lillholm (2003) and Griffin and Lillholm (submitted for publication) proposed multivariate NKDMs using the PSST (pessimistic scale space tracking) and MSMS (multiscale mean shift) methods, respectively. Both methods are variants of the bandwidth reduction method, where the bandwidth varies from large to small and the path of the mode varies continuously, with the tracking process continuing until some stopping rule is satisfied. For an approach that differs from the preceding nonparametric kernel and polynomial histogram approaches, the reader is referred to Sager (1979).
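To make the fixed-bandwidth and bandwidth-reduction ideas concrete, the following Python sketch locates a mode of a Gaussian kernel density estimate by mean shift iterations, and then tracks that mode while the bandwidth decreases. It is a generic illustration only and implements neither PSST nor MSMS: the function names, the single bandwidth shared by all coordinates, and the stopping rule (stop at the last bandwidth in a user-supplied list) are assumptions of this example.

```python
import math
import random


def mean_shift_mode(points, h, start, tol=1e-6, max_iter=500):
    """Climb to a local mode of a Gaussian kernel density estimate.

    points: list of d-dimensional tuples; h: fixed bandwidth, shared by all
    coordinates (an assumption of this sketch); start: initial guess.
    """
    x = list(start)
    d = len(x)
    for _ in range(max_iter):
        # Gaussian kernel weight of every observation relative to x.
        weights = [
            math.exp(-0.5 * sum((p[j] - x[j]) ** 2 for j in range(d)) / h ** 2)
            for p in points
        ]
        total = sum(weights)
        if total == 0.0:
            raise ValueError("start is too far from the data for this bandwidth")
        # Mean shift update: move x to the kernel-weighted mean of the data.
        new_x = [
            sum(w * p[j] for w, p in zip(weights, points)) / total
            for j in range(d)
        ]
        shift = math.dist(new_x, x)
        x = new_x
        if shift < tol:
            break
    return x


def bandwidth_reduction_mode(points, bandwidths, start):
    """Track the mode while the bandwidth goes from large to small.

    Hypothetical stopping rule: stop at the last bandwidth in the list.
    """
    x = list(start)
    for h in sorted(bandwidths, reverse=True):
        x = mean_shift_mode(points, h, x)
    return x


if __name__ == "__main__":
    random.seed(1)
    data = [(random.gauss(2.0, 1.0), random.gauss(-1.0, 1.0)) for _ in range(300)]
    centre = [sum(p[j] for p in data) / len(data) for j in range(2)]
    print(mean_shift_mode(data, 0.7, centre))
    print(bandwidth_reduction_mode(data, [2.0, 1.5, 1.0, 0.7], centre))
```

Starting each smaller bandwidth from the mode found at the previous one is what lets the path of the mode vary continuously; the result still depends strongly on the final bandwidth.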

In this paper, we propose a parametric mode estimate for continuous multivariate data using the joint Box–Cox transform. Our estimate is the multivariate extension of the one by Bickel (2003). Section 2 gives the details and theoretical properties of the proposed estimate. Section 3 contains simulation studies. Sections 3.1 and 3.2 deal with simulated examples for i.i.d. data from unimodal and multimodal densities, respectively. These clearly show that for unimodal densities our estimate is superior to the other estimates included in this study when the sample size n is small to moderate. Further, our estimate is more reliable and less sensitive to multimodalities than other estimates in estimating the major mode when n is relatively small and/or the amount of data clustering around the minor mode(s) is substantially less than that around the major one. Sections 3.3 and 3.4 explore the possibility of extending our mode estimate to dependent data and i.i.d. functional data, respectively, and very encouraging results are obtained when n is relatively small. In Section 3.5, we illustrate the performance of our estimate with two real examples from the fields of climatology and image recognition. Finally, the Appendix contains the proofs.

Section snippets

The proposed method: using the joint Box–Cox transform

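Suppose that we observe $n$ i.i.d. positive $d$-variate observations $X_1, \ldots, X_n$ with a common but unknown unimodal pdf $f(x)$, where $x = (x_1, \ldots, x_d)$. For each $\lambda = (\lambda_1, \ldots, \lambda_d) \in \mathbb{R}^d$, we define $x^{[\lambda]} = (x_1^{[\lambda_1]}, \ldots, x_d^{[\lambda_d]})$, where for $1 \le j \le d$,

$$x_j^{[\lambda]} = \begin{cases} (x_j^{\lambda} - 1)/\lambda, & \text{if } \lambda \ne 0, \\ \log x_j, & \text{if } \lambda = 0. \end{cases} \tag{1}$$

In general, for any vectors $a = (a_1, \ldots, a_d)$ and $b = (b_1, \ldots, b_d)$, we define $a^b = (a_1^{b_1}, \ldots, a_d^{b_d})$, $g(a) = (g(a_1), \ldots, g(a_d))$ for any function $g: \mathbb{R}^1 \to \mathbb{R}^1$, and use $a \le b$ to mean $a_j \le b_j$ for all $j$ (similarly, $a < b$). We require $x > 0$ in the transform (1), but it can be generalized by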
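The coordinate-wise transform (1) and its inverse can be sketched in Python as follows. This is only an illustration: the function names `box_cox_vec` and `inv_box_cox_vec` are hypothetical, and the sketch covers the positive-data case x > 0 only.

```python
import math


def box_cox_vec(x, lam):
    """Apply the Box-Cox transform (1) coordinate by coordinate.

    x: sequence of d positive numbers; lam: sequence of d powers.
    """
    if len(x) != len(lam):
        raise ValueError("x and lam must have the same length")
    out = []
    for xj, lj in zip(x, lam):
        if xj <= 0:
            raise ValueError("transform (1) requires x > 0")
        out.append(math.log(xj) if lj == 0 else (xj ** lj - 1.0) / lj)
    return out


def inv_box_cox_vec(y, lam):
    """Invert box_cox_vec coordinate by coordinate."""
    if len(y) != len(lam):
        raise ValueError("y and lam must have the same length")
    out = []
    for yj, lj in zip(y, lam):
        if lj == 0:
            out.append(math.exp(yj))
        else:
            base = 1.0 + lj * yj
            if base <= 0:
                raise ValueError("y is outside the range of the transform")
            out.append(base ** (1.0 / lj))
    return out


if __name__ == "__main__":
    x = [1.0, math.e, 4.0]
    lam = [2.0, 0.0, 0.5]
    y = box_cox_vec(x, lam)
    print(y)
    print(inv_box_cox_vec(y, lam))
```

Each coordinate has its own power, so a strongly skewed coordinate can be given a power near zero while a roughly symmetric one is left almost unchanged with a power near one.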

Numerical results

We carried out numerical studies to compare the performance of our mode estimate $\hat{m}_T$ (see (5)) with the NKDM estimates $\hat{m}_F$ (with the fixed optimal bandwidth selector of Wu and Tsai, 2004), $\hat{m}_M$ (the MSMS bandwidth reduction method of Griffin and Lillholm, submitted for publication) and $\hat{m}_P$ (the PSST bandwidth reduction method of Griffin and Lillholm, 2003), and the second-order polynomial histogram estimate $\hat{m}_S$ (the SOPHE of Jing et al., 2012). The Gaussian product kernel is used throughout. In

Acknowledgments

We are grateful to two referees for valuable suggestions which improved the quality and presentation of the paper. In particular, the referees' suggestions to include the study of sensitivity to multimodalities and the studies for dependent and functional data led us to add Sections 3.2, 3.3 and 3.4 to Section 3, which significantly improved this work. The climatology data itself is not

References (39)

  • G.E.P. Box et al.

    Bayesian Inference in Statistical Analysis

    (1992)
  • D. Comaniciu

    An algorithm for data-driven bandwidth selection

    IEEE Transactions on Pattern Analysis and Machine Intelligence

    (2003)
  • D. Comaniciu et al.

    Mean shift: a robust approach toward feature space analysis

    IEEE Transactions on Pattern Analysis and Machine Intelligence

    (2002)
  • S. Corti et al.

    Signature of recent climate change in frequencies of natural atmospheric circulation regimes

    Nature

    (1999)
  • P. Doukhan
  • W.F. Eddy

    Optimum kernel estimators of the mode

    The Annals of Statistics

    (1980)
  • W. Feller

    An Introduction to Probability Theory and Its Applications

    (1968)
  • F. Ferraty et al.

    Nonparametric Functional Data Analysis: Theory and Practice

    (2006)
  • T. Gasser et al.

    Nonparametric estimation of the mode of a distribution of random curves

    Journal of the Royal Statistical Society. Series B

    (1998)

Cited by (7)

    • A fast mode estimator in multidimensional space

      2020, Statistics and Probability Letters
      Citation excerpt:

      Bickel (2003) used the power transform to fit the empirical distribution on the real line to a normal distribution. Hsu and Wu (2013) extended that approach to multidimensional case. Empirical probability density function mode.

    • Minimum volume peeling: A robust nonparametric estimator of the multivariate mode

      2016, Computational Statistics and Data Analysis
      Citation excerpt:

      Unfortunately, at that time no efficient algorithms were at hand to determine a minimum volume set. Recently, Hsu and Wu (2013) provided an indirect multivariate mode estimator by extending the univariate estimator of Bickel (2003) to the multivariate setting. This estimator works well for multivariate distributions that can be normalized through a Box–Cox transformation and, in this respect, it is a parametric procedure.

    • A class of nonparametric mode estimators

      2022, Communications in Statistics: Simulation and Computation
    • Dynamic Vector Mode Regression

      2020, Journal of Business and Economic Statistics