Summary
Many distribution-free statistics have the drawback that computing exact p-values under the null hypothesis is computationally intensive. When the sample sizes are small or the number of ties is large, approximations are often unsatisfactory. Moreover, tables of exact critical values are not available for conditional rank statistics (ties, censoring), for rank statistics with arbitrary regression constants, or for permutation test statistics. In those cases, it is important to have a fast algorithm for computing exact p-values. We present a new algorithm and apply it to a large class of distribution-free one-sample, two-sample and serial statistics. The algorithm is based on splitting the probability generating function of the test statistic into two parts. We compare the speed of this “split-up algorithm” to that of existing procedures and conclude that the new algorithm is faster in many cases.
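The summary states only the core idea, splitting the probability generating function into two parts. The following Python sketch illustrates that idea for one simple case and should not be read as the paper's implementation. It assumes a one-sample signed-rank-type statistic T, the sum of the scores a_i that receive a positive sign, with each sign independent and equally likely under the null hypothesis, so that the generating function is the product of the factors (1/2 + 1/2 x^{a_i}). The function names are hypothetical. The two partial products are expanded separately and the tail probability P(T >= t) is assembled from them, so the full product is never multiplied out.

```python
from bisect import bisect_left
from fractions import Fraction
from itertools import product


def pgf_coefficients(scores):
    """Expand prod_i (1/2 + 1/2 * x**a_i) into {exponent: probability}."""
    dist = {0: Fraction(1)}
    for a in scores:
        new = {}
        for value, prob in dist.items():
            # Each score is either left out (adds 0) or included (adds a),
            # with probability 1/2 each under the assumed null hypothesis.
            for step in (0, a):
                new[value + step] = new.get(value + step, Fraction(0)) + prob / 2
        dist = new
    return dist


def split_tail_probability(scores, t):
    """Exact P(T >= t), computed from two half-size generating functions."""
    half = len(scores) // 2
    first = pgf_coefficients(scores[:half])
    second = pgf_coefficients(scores[half:])

    # upper_tail[k] = P(T2 >= support[k]); the extra last entry is 0.
    support = sorted(second)
    upper_tail = [Fraction(0)] * (len(support) + 1)
    for k in range(len(support) - 1, -1, -1):
        upper_tail[k] = upper_tail[k + 1] + second[support[k]]

    # P(T >= t) = sum_j P(T1 = j) * P(T2 >= t - j).
    total = Fraction(0)
    for value, prob in first.items():
        k = bisect_left(support, t - value)
        total += prob * upper_tail[k]
    return total


def brute_force_tail_probability(scores, t):
    """Reference value by full enumeration of all 2**n sign patterns."""
    hits = sum(
        1
        for signs in product((0, 1), repeat=len(scores))
        if sum(s * a for s, a in zip(signs, scores)) >= t
    )
    return Fraction(hits, 2 ** len(scores))


scores = [1, 2, 3, 4, 5, 6]
print(split_tail_probability(scores, 15))  # prints 7/32
```

Exact rational arithmetic via `Fraction` keeps the result free of rounding error, and scores given as `Fraction` values (for example midranks under ties) work unchanged. The brute-force function is included only as a cross-check for small samples.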
Acknowledgements
I would like to thank Alessandro Di Bucchianico and the referees of this journal for their useful comments on earlier versions of this paper.
Cite this article
van de Wiel, M. The split-up algorithm: a fast symbolic method for computing p-values of distribution-free statistics. Computational Statistics 16, 519–538 (2001). https://doi.org/10.1007/s180-001-8328-6
DOI: https://doi.org/10.1007/s180-001-8328-6