Efficient estimation of the mode of continuous multivariate data

doi:10.1016/j.csda.2013.01.018

Computational Statistics & Data Analysis

Volume 63, July 2013, Pages 148-159

https://doi.org/10.1016/j.csda.2013.01.018

Abstract

Mode estimation is an important task because it has applications to data from a wide variety of sources. Many mode estimates have been proposed, most of them based on nonparametric density estimates. However, although mode estimates obtained by such methods perform excellently with large sample sizes, they perform unsatisfactorily with practical (i.e., small to moderate) sample sizes. Recently, Bickel (2003) proposed an efficient method to estimate the mode of continuous univariate data, and showed that its performance is excellent with small to moderate sample sizes. In this paper, we extend Bickel’s method to continuous multivariate data by using the multivariate Box–Cox transform. The excellent performance of the proposed method at practical sample sizes is demonstrated by simulation examples and two real examples from the fields of climatology and image recognition.

Introduction

Mode estimation is an important task, because it has applications to data from a wide variety of sources. Its applications, for example, include 1-D (one-dimensional, similarly, 2-D, etc.) measurements of cellular DNA content (Marchette and Wegman, 1997), 1-D time estimates in molecular clock studies (Hedges and Shah, 2003), 2-D predictions of the greatest air-pollution (Sager, 1979), seismic tremor locations (Ooi, 2002), 3-D color histograms (Comaniciu, 2003), 5-D color image segmentation (Comaniciu and Meer, 2002), 13-D flow cytometry measurements (Jing et al., 2012), 16-D image recognition (Griffin and Lillholm, 2003), 66-D gene expression time-courses (Liu and Muller, 2003), and 365-D daily rainfall rates (Hall and Heckman, 2002). However, it is difficult to estimate the mode(s) precisely when the underlying distribution is continuous but otherwise unknown.

Many mode estimates have been proposed for univariate data, with most based on nonparametric kernel density estimates, i.e., the population modes are estimated by those of the density estimates (see, e.g., Parzen, 1962, Eddy, 1980, Silverman, 1981, Vieu, 1996 and Minnotte et al., 1998, among many others). Although the NKDM (nonparametric kernel density mode) performs excellently with large sample sizes, its performance at small to moderate ones is not satisfactory. Recently, Bickel (2003) proposed a parametric mode, using the univariate Box–Cox transform, which performs very well at small to moderate sample sizes. For a review of methods for univariate data, the reader is referred to Bickel and Frühwirth (2006) and the references cited therein.

Although many practical applications involve multivariate data, much less work has been done on mode estimation for such data. This may be because multivariate cases are more complex than univariate ones. The selection of the bandwidth, i.e., the scale parameter, is crucial to the performance of the NKDM. Gasser et al. (1998), Abraham et al. (2003) and Klemelä (2005) proposed multivariate NKDMs with a fixed bandwidth, and Jing et al. (2012) proposed a similar one using polynomial histograms instead of kernels. Griffin and Lillholm (2003, submitted for publication) proposed multivariate NKDMs using the PSST (pessimistic scale space tracking) and MSMS (multiscale mean shift) methods. Both methods are variants of the bandwidth reduction method, where the bandwidth varies from large to small and the path of the mode varies continuously, with the tracking process continuing until some stopping rule is satisfied. For an approach which is different from the preceding nonparametric kernel or polynomial histogram approaches, the reader is referred to Sager (1979).
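To make the fixed-bandwidth idea concrete, the following is a minimal sketch of a multivariate kernel density mode with a Gaussian product kernel. It is not the estimator of any of the papers cited above: the function name `kde_mode` is hypothetical, the default bandwidth uses Scott's rule of thumb as a stand-in assumption for a proper bandwidth selector, and the search for the maximizer is restricted to the sample points for simplicity.

```python
import numpy as np


def kde_mode(data, bandwidth=None):
    """Return the sample point with the highest fixed-bandwidth KDE value.

    data      : array of shape (n, d)
    bandwidth : scalar or array of shape (d,); one bandwidth per coordinate
    """
    data = np.asarray(data, dtype=float)
    n, d = data.shape
    if bandwidth is None:
        # Assumption: Scott's rule of thumb, used only as a simple default.
        bandwidth = data.std(axis=0, ddof=1) * n ** (-1.0 / (d + 4))
    h = np.broadcast_to(np.asarray(bandwidth, dtype=float), (d,))

    # Standardized pairwise differences, shape (n, n, d).
    z = (data[:, None, :] - data[None, :, :]) / h
    # Log of the Gaussian product kernel, up to an additive constant.
    log_kernel = -0.5 * np.sum(z ** 2, axis=2)
    # Proportional to the density estimate at each sample point; the
    # normalizing constant is the same everywhere and does not affect argmax.
    density = np.exp(log_kernel).sum(axis=1)
    return data[np.argmax(density)]
```

The pairwise computation needs memory of order $n^{2} d$, which is acceptable for the small to moderate sample sizes discussed here but not for large $n$.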

In this paper, we propose a parametric mode estimate for continuous multivariate data using the joint Box–Cox transform. Our estimate is the multivariate extension of the one by Bickel (2003). Section 2 gives the details and theoretical properties of the proposed estimate. Section 3 contains simulation studies. Sections 3.1 and 3.2 deal with simulated examples for i.i.d. data from unimodal and multimodal densities, respectively. These clearly show that for unimodal densities our estimate is superior to the other estimates included in this study when the sample size $n$ is small to moderate. Further, our estimate is more reliable and less sensitive to multimodalities than the other estimates in estimating the major mode when $n$ is relatively small and/or the amount of data clustering around the minor mode(s) is substantially less than that around the major one. Sections 3.3 and 3.4 explore the possibility of extending our mode estimate to dependent data and i.i.d. functional data, respectively, and very encouraging results are obtained when $n$ is relatively small. In Section 3.5, we illustrate the performance of our estimate with two real examples from the fields of climatology and image recognition. Finally, the Appendix contains the proofs.

Section snippets

The proposed method: using the joint Box–Cox transform

Suppose that we observe $n$ i.i.d. positive $d$-variate observations $X_{1}, \dots, X_{n}$ with a common but unknown unimodal pdf $f(x)$, where $x = (x_{1}, \dots, x_{d})$. For each $\lambda = (\lambda_{1}, \dots, \lambda_{d}) \in \mathbb{R}^{d}$, we define $x^{[\lambda]} = (x_{1}^{[\lambda_{1}]}, \dots, x_{d}^{[\lambda_{d}]})$, where for $1 \leq j \leq d$, $x_{j}^{[\lambda_{j}]} = \begin{cases} (x_{j}^{\lambda_{j}} - 1) / \lambda_{j}, & \text{if } \lambda_{j} \neq 0, \\ \log x_{j}, & \text{if } \lambda_{j} = 0. \end{cases}$ In general, for any vectors $a = (a_{1}, \dots, a_{d})$ and $b = (b_{1}, \dots, b_{d})$, we define $a^{b} = (a_{1}^{b_{1}}, \dots, a_{d}^{b_{d}})$ and $g(a) = (g(a_{1}), \dots, g(a_{d}))$ for any function $g : \mathbb{R}^{1} \to \mathbb{R}^{1}$, and use $a \leq b$ to mean $a_{j} \leq b_{j}$ for all $j$ (similarly, $a < b$). We require $x > 0$ in the transform (1), but it can be generalized by
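The componentwise transform defined in this snippet can be sketched in a few lines of NumPy. This is only an illustration of the definition of $x^{[λ]}$ for positive data; the function name `box_cox` and its array conventions are assumptions made here, and the sketch covers neither the estimation of $λ$ nor the mode estimate itself.

```python
import numpy as np


def box_cox(x, lam):
    """Componentwise Box-Cox transform of positive d-variate data.

    x   : array of shape (n, d) or (d,), all entries > 0
    lam : scalar or array of shape (d,), one power parameter per coordinate
    """
    x = np.asarray(x, dtype=float)
    # One lambda per coordinate (a scalar is repeated across coordinates).
    lam = np.broadcast_to(np.asarray(lam, dtype=float), x.shape[-1:])
    if np.any(x <= 0):
        raise ValueError("the transform requires x > 0")

    out = np.empty_like(x)
    zero = lam == 0
    # lambda_j = 0: logarithm of the j-th coordinate.
    out[..., zero] = np.log(x[..., zero])
    # lambda_j != 0: (x_j ** lambda_j - 1) / lambda_j.
    nonzero = ~zero
    out[..., nonzero] = (x[..., nonzero] ** lam[nonzero] - 1.0) / lam[nonzero]
    return out
```

For example, `box_cox([[1.0, np.e]], [2.0, 0.0])` applies the power branch to the first coordinate and the logarithm to the second.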

Numerical results

We carried out numerical studies to compare the performance of our mode estimate ${\hat{m}}_{T}$ (see (5)) with the NKDM estimates ${\hat{m}}_{F}$ (with the fixed optimal bandwidth selector of Wu and Tsai, 2004), ${\hat{m}}_{M}$ (the MSMS bandwidth reduction method of Griffin and Lillholm, submitted for publication) and ${\hat{m}}_{P}$ (the PSST bandwidth reduction method of Griffin and Lillholm, 2003), and the second order polynomial histogram estimate ${\hat{m}}_{S}$ (the SOPHE of Jing et al., 2012). The Gaussian product kernel is used throughout. In

Acknowledgments

We are grateful to two referees for valuable suggestions which improved the quality and presentation of the paper. In particular, the referees' suggestions to include the study of sensitivity to multimodalities and the studies for dependent and functional data led us to add Sections 3.2, 3.3 and 3.4 to Section 3, which significantly improved this work. The climatology data itself is not

References (39)

D.R. Bickel et al., 2006. On a fast, robust estimator of the mode: comparisons to other robust estimators with applications. Computational Statistics and Data Analysis.
P. Burman et al., 2009. Multivariate mode hunting: data analytic tools with measures of significance. Journal of Multivariate Analysis.
M.-Y. Cheng et al., 1998. On mode testing and empirical approximations to distributions. Statistics and Probability Letters.
A. Mokkadem, 1988. Mixing properties of ARMA processes. Stochastic Processes and their Applications.
J.W. Silverstein et al., 1995. On the empirical distribution of eigenvalues of a class of large dimensional random matrices. Journal of Multivariate Analysis.
P. Vieu, 1996. A note on density mode estimation. Statistics and Probability Letters.
C. Abraham et al., 2003. Simple estimation of the mode of a multivariate density. The Canadian Journal of Statistics.
Alimoglu, F., Alpaydin, E., 1996. Methods of combining multiple classifiers based on different representations for...
D.F. Andrews et al., 1971. Transformations of multivariate data. Biometrics.
D.R. Bickel, 2003. Robust and efficient estimation of the mode of continuous data: the mode as a viable measure of central tendency. Journal of Statistical Computation and Simulation.
G.E.P. Box et al., 1992. Bayesian Inference in Statistical Analysis.
D. Comaniciu, 2003. An algorithm for data-driven bandwidth selection. IEEE Transactions on Pattern Analysis and Machine Intelligence.
D. Comaniciu et al., 2002. Mean shift: a robust approach toward feature space analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence.
S. Corti et al., 1999. Signature of recent climate change in frequencies of natural atmospheric circulation regimes. Nature.
P. Doukhan
W.F. Eddy, 1980. Optimum kernel estimators of the mode. The Annals of Statistics.
W. Feller, 1968. An Introduction to Probability Theory and Its Applications.
F. Ferraty et al., 2006. Nonparametric Functional Data Analysis: Theory and Practice.
T. Gasser et al., 1998. Nonparametric estimation of the mode of a distribution of random curves. Journal of the Royal Statistical Society. Series B.

Cited by (7)

A fast mode estimator in multidimensional space
2020, Statistics and Probability Letters
Citation excerpt: Bickel (2003) used the power transform to fit the empirical distribution on the real line to a normal distribution. Hsu and Wu (2013) extended that approach to multidimensional case. Empirical probability density function mode.

Minimum volume peeling: A robust nonparametric estimator of the multivariate mode
2016, Computational Statistics and Data Analysis
Citation excerpt: Unfortunately, at that time no efficient algorithms were at hand to determine a minimum volume set. Recently, Hsu and Wu (2013) provided an indirect multivariate mode estimator by extending the univariate estimator of Bickel (2003) to the multivariate setting. This estimator works well for multivariate distributions that can be normalized through a Box–Cox transformation and, in this respect, it is a parametric procedure.
Abstract: Among the measures of a distribution’s location, the mode is probably the least often used, although it has some appealing properties. Estimators for the mode of univariate distributions are widely available. However, few contributions can be found for the multivariate case. A consistent direct multivariate mode estimation procedure, called minimum volume peeling, can be outlined as follows. The approach iteratively selects nested subsamples with a decreasing fraction of sample points, looking for the minimum volume subsample at each step. The mode is then estimated by calculating the mean of all points in the final set. The robustness of the method is investigated by analyzing its finite sample breakdown point and algorithms to determine minimum volume sets are discussed. Simulation results confirm that using minimum volume peeling leads to efficient mode estimates both in uncontaminated as well as contaminated situations.

Copula Approximate Bayesian Computation Using Distribution Random Forests
2024, arXiv

A class of nonparametric mode estimators
2022, Communications in Statistics: Simulation and Computation

Dynamic Vector Mode Regression
2020, Journal of Business and Economic Statistics

An Online Sample-Based Method for Mode Estimation Using ODE Analysis of Stochastic Approximation Algorithms
2019, IEEE Control Systems Letters

Short communication
Efficient estimation of the mode of continuous multivariate data