Summary
This article is concerned with computing approximate p-values for the maximum of the absolute difference between kernel density estimates. The approximations treat the process of local extrema of the difference as a nonhomogeneous Poisson process and estimate the corresponding local intensity function, which determines the rate of local extrema above a given threshold. A key idea of this article is to estimate the intensity function more accurately by using saddlepoint approximations to the joint density of the difference between the kernel density estimates and its first and second derivatives. The saddlepoint approximations are compared with Gaussian approximations. In simulations, the saddlepoint approximations show consistently better agreement between the empirical p-value and the predetermined value across various bandwidths of the kernel density estimates.
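The quantities named in the summary can be illustrated with a minimal sketch. It assumes two samples `x` and `y`, a Gaussian kernel, a common bandwidth `h`, and an evaluation grid, none of which are specified in the summary. The function `permutation_p_value` is a brute-force Monte Carlo reference and is not the approximation studied in the article. The function `poisson_p_value` only shows the generic Poisson step, where the probability of at least one exceedance is one minus the exponential of minus the integrated intensity; the `intensity` argument is a hypothetical placeholder for an intensity estimate, whether saddlepoint or Gaussian.

```python
import math
import random


def gaussian_kde(sample, h):
    """Return a function t -> Gaussian-kernel density estimate with bandwidth h."""
    norm = 1.0 / (len(sample) * h * math.sqrt(2.0 * math.pi))

    def estimate(t):
        return norm * sum(math.exp(-0.5 * ((t - x) / h) ** 2) for x in sample)

    return estimate


def max_abs_difference(x, y, h, grid):
    """Maximum over the grid of |f_hat(t) - g_hat(t)| (the test statistic)."""
    f_hat = gaussian_kde(x, h)
    g_hat = gaussian_kde(y, h)
    return max(abs(f_hat(t) - g_hat(t)) for t in grid)


def permutation_p_value(x, y, h, grid, n_perm=200, seed=0):
    """Monte Carlo permutation p-value: a brute-force reference only."""
    rng = random.Random(seed)
    observed = max_abs_difference(x, y, h, grid)
    pooled = list(x) + list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign pooled observations to the two samples
        stat = max_abs_difference(pooled[:len(x)], pooled[len(x):], h, grid)
        if stat >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)


def poisson_p_value(intensity, grid):
    """1 - exp(-integral of intensity over the grid), using the trapezoidal rule.

    `intensity` is a hypothetical callable t -> rate of local extrema
    above the observed threshold at location t.
    """
    total = 0.0
    for a, b in zip(grid[:-1], grid[1:]):
        total += 0.5 * (intensity(a) + intensity(b)) * (b - a)
    return 1.0 - math.exp(-total)
```

With small samples, the permutation value gives a slow but assumption-light benchmark against which an intensity-based p-value from `poisson_p_value` could be compared.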




Acknowledgement
This work was supported in part by grant R03-2002-000-00034-0 from the Korea Science and Engineering Foundation (KOSEF) in 2002–2004. This paper was derived from the author’s Ph.D. dissertation at Columbia University, completed under the supervision of Dr. Daniel Rabinowitz.
Appendix
A-1.
Since \(u_1, u_2, \ldots, u_{n+m}\) all have the same distribution under the null hypothesis, with mean
In a similar way, \(E\Delta''(t) = 0\) can be proved.
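The zero-mean property in A-1 can be checked numerically in a conditional form. The sketch below makes several assumptions that are not stated in the text: \(\Delta(t)\) is the difference of two Gaussian-kernel density estimates built from \(n\) and \(m\) observations, and the equally likely realizations are the assignments of the pooled values to the two samples. It enumerates every assignment and averages \(\Delta(t)\) or \(\Delta''(t)\); all function names are hypothetical.

```python
import itertools
import math


def kernel(t, u, h):
    """Gaussian kernel contribution of observation u at location t."""
    z = (t - u) / h
    return math.exp(-0.5 * z * z) / (h * math.sqrt(2.0 * math.pi))


def kernel_second_derivative(t, u, h):
    """Second derivative in t of the Gaussian kernel contribution."""
    z = (t - u) / h
    return (z * z - 1.0) * math.exp(-0.5 * z * z) / (h ** 3 * math.sqrt(2.0 * math.pi))


def delta(values, first_group, n, m):
    """Mean contribution in the first sample minus mean in the second."""
    in_first = sum(values[i] for i in first_group)
    in_second = sum(values) - in_first
    return in_first / n - in_second / m


def permutation_mean(pooled, n, t, h, contribution):
    """Average of delta over all equally likely splits of the pooled data."""
    values = [contribution(t, u, h) for u in pooled]
    m = len(pooled) - n
    splits = itertools.combinations(range(len(pooled)), n)
    deltas = [delta(values, group, n, m) for group in splits]
    return sum(deltas) / len(deltas)
```

Calling `permutation_mean(pooled, n, t, h, kernel)` or passing `kernel_second_derivative` instead should return a value that is zero up to floating-point rounding, for any pooled data and any `t`.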
A-2.
Since all realizations of u are equally likely,
Since
, after a little more arithmetic manipulation, using the relation
\(\operatorname{var} \Delta \left( t \right)\) is given by
The C code, including the modules for Appendices A-1 and A-2, is available upon request.
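As a rough numerical companion to A-2, the following Python sketch computes the variance of \(\Delta(t)\) over all equally likely splits of the pooled data and compares it with the standard finite-population result for sampling without replacement. It is not the author's C code. It assumes that \(\Delta(t)\) is a difference of sample means of per-observation kernel contributions, and the closed form shown is the textbook sampling result, which need not be written in the same form as the expression in the article.

```python
import itertools
import math


def kernel(t, u, h):
    """Gaussian kernel contribution of observation u at location t."""
    z = (t - u) / h
    return math.exp(-0.5 * z * z) / (h * math.sqrt(2.0 * math.pi))


def exact_permutation_variance(values, n):
    """Variance of delta over all splits of `values` into groups of n and N - n."""
    big_n = len(values)
    m = big_n - n
    total = sum(values)
    deltas = []
    for group in itertools.combinations(range(big_n), n):
        s = sum(values[i] for i in group)
        deltas.append(s / n - (total - s) / m)
    mean = sum(deltas) / len(deltas)
    return sum((d - mean) ** 2 for d in deltas) / len(deltas)


def closed_form_variance(values, n):
    """Finite-population formula: N * sum((a_i - a_bar)^2) / (n * m * (N - 1))."""
    big_n = len(values)
    m = big_n - n
    mean = sum(values) / big_n
    sum_sq = sum((v - mean) ** 2 for v in values)
    return big_n * sum_sq / (n * m * (big_n - 1))


def kernel_values(pooled, t, h):
    """Per-observation kernel contributions a_i = K_h(t - u_i)."""
    return [kernel(t, u, h) for u in pooled]
```

For a small pooled sample, `exact_permutation_variance(kernel_values(pooled, t, h), n)` and `closed_form_variance(kernel_values(pooled, t, h), n)` should agree up to rounding, which gives a quick sanity check on any implementation of the variance term.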
Cite this article
Lim, HJ. Saddlepoint approximations to P-values for comparison of density estimates. Computational Statistics 20, 31–50 (2005). https://doi.org/10.1007/BF02736121