Abstract
The main objective of this paper is to discuss selected computational aspects of robust estimation in the linear model, with emphasis on R-estimators. We focus on numerical algorithms and computational efficiency rather than on statistical properties. In addition, we formulate some algorithmic properties that a “good” method for computing R-estimators is expected to satisfy and show how to satisfy them using the currently available algorithms. We illustrate both good and bad properties of the existing algorithms and propose two-stage methods to mitigate the effect of the bad ones. Finally, we motivate a challenge for new approaches based on interior-point methods in optimization.






Notes
Usually, \(\varphi :[0,1]\rightarrow \mathbb {R}\) is assumed to be a non-decreasing, square-integrable, and bounded function on (0, 1), standardized so that \(\int _0^1\,\varphi (t)\,dt=0\) and \(\int _0^1\,\varphi ^2(t)\,dt=1\). Moreover, some authors require that \(\varphi (t)\) is odd, an assumption that is not necessary in general but is needed, for example, when estimating the intercept parameter based on signed rank scores. Another common assumption is that the scores generated by \(\varphi (t)\) sum up to zero as required in Jaeckel (1972) to ensure the translation equivariance of the resulting estimator.
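As a concrete illustration of these conditions (our own example, not part of the original text), the Wilcoxon score function \(\varphi (t)=\sqrt{12}\,(t-1/2)\) is non-decreasing, bounded, odd about \(t=1/2\), and standardized as required. A minimal numerical check in Python with NumPy:

```python
import numpy as np

def phi(t):
    """Wilcoxon score function: non-decreasing, bounded, odd about t = 1/2."""
    return np.sqrt(12.0) * (t - 0.5)

# Midpoint-rule check of the standardization conditions
#   int_0^1 phi(t) dt = 0   and   int_0^1 phi(t)^2 dt = 1.
n = 100_000
t = (np.arange(n) + 0.5) / n           # midpoints of a uniform grid on (0, 1)
mean_integral = phi(t).mean()          # approximates int_0^1 phi(t) dt
second_moment = (phi(t) ** 2).mean()   # approximates int_0^1 phi(t)^2 dt

print(abs(mean_integral) < 1e-8, abs(second_moment - 1.0) < 1e-6)  # True True
```

The midpoint rule is exact for the first integral by symmetry and accurate to \(O(n^{-2})\) for the second, so both conditions are confirmed numerically.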
Jaeckel in a seminal paper (Jaeckel 1972) interpreted (8) as a measure of dispersion of residuals and advocated its use instead of the residual variance on which the classical least squares method is based. Parallel to that, Jurečková (1969) and Koul (1971) suggested other versions of R-estimates. It is shown in Jurečková and Sen (1996), that all the three approaches are asymptotically equivalent in probability and have the same asymptotic properties.
Let \(o_i(\varvec{x})\) denote the value of the \(i^{th}\) smallest coordinate in \(\varvec{x}=\big (x_1,\ldots ,x_n\big )^{\top }\). Putting \(x_{(i)}=o_i(\varvec{x})\), we get \(x_{(1)}\le x_{(2)}\le \ldots \le x_{(n)}\). For \(\varvec{x}\) with pairwise distinct coordinates, let \(q_i(\varvec{x})\) denote the number of coordinates \(x_j\le x_i\), that is, the rank of \(x_i\) in \(x_{(1)}\le x_{(2)}\le \ldots \le x_{(n)}\). If \(\varvec{Z}=\big (Z_1,\ldots ,Z_n\big )^{\top }\) is a random vector, the statistic \(Z_{(i)}=o_i(\varvec{Z})\) is called the \(i^{th}\) order statistic and the statistic \(R_i=q_{i}(\varvec{Z})\) is called the rank of \(Z_i\). It is evident that, under our assumptions, \(\varvec{R}\) is a random permutation of \((1,\ldots ,n)^{\top }\). Recall that the ranks are “well defined” only if the probability of coincidence of any pair of coordinates equals 0. In the case of ties, the corresponding mathematical theory is much more complicated; for details on dealing with ties, see, e.g., the monograph by Hájek and Šidák (1967) and the follow-up literature. For our purposes, the vector of ranks will always be a permutation of \(\{1,\ldots ,n\}\), and in the case of ties the ranks will be attributed according to the “first-in first-used” rule.
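The “first-in first-used” tie rule corresponds exactly to breaking ties by a stable sort. A minimal Python sketch (the function names and the Wilcoxon-type scores are our own illustrative choices, not prescribed by the paper) computes such ranks and evaluates a Jaeckel-type dispersion of residuals:

```python
import numpy as np

def ranks_first_in_first_used(x):
    """Ranks of x; ties are broken by original position (stable sort)."""
    order = np.argsort(x, kind="stable")    # indices of x in sorted order
    r = np.empty(len(x), dtype=int)
    r[order] = np.arange(1, len(x) + 1)     # rank 1 = smallest coordinate
    return r

def jaeckel_dispersion(beta, X, y, scores):
    """Jaeckel-type dispersion sum_i a(R_i) * r_i of the residuals
    r = y - X beta, with a(i) given by the array scores (scores[i-1] = a(i))."""
    resid = y - X @ beta
    R = ranks_first_in_first_used(resid)
    return float(np.sum(scores[R - 1] * resid))

# Tie handling: the two values 3.0 keep their original order.
print(ranks_first_in_first_used(np.array([3.0, 1.0, 3.0, 2.0])))  # [3 1 4 2]

# With non-decreasing scores summing to zero, the dispersion is nonnegative
# for any beta (Chebyshev's sum inequality); a quick check on random data.
rng = np.random.default_rng(0)
n, p = 8, 2
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)
i = np.arange(1, n + 1)
scores = np.sqrt(12.0) * (i / (n + 1) - 0.5)  # Wilcoxon-type scores
print(jaeckel_dispersion(np.zeros(p), X, y, scores) >= 0.0)  # True
```

Stable sorting makes the rank vector a permutation of \(\{1,\ldots ,n\}\) even in the presence of ties, which is exactly the convention adopted above.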
Note that if the gradient of \(\mathcal {D}_{\varphi }\big (\varvec{\beta }\big )\) exists, it is equivalent to the regression rank test statistic \(S(\varvec{\beta })\) of Jurečková (1969); while if the gradient of \(\mathcal {D}_{\varphi }\big (\varvec{\beta }\big )\) does not exist, then \(S(\varvec{\beta })\) exists but is multivalued.
It is worth noting that information about R add-on packages that provide newer, faster, and/or more efficient algorithms covering robustification of statistical methods is regularly updated in Maechler (2020).
Observe that the line search may produce a point \(\varvec{\beta }_s\) at which Jaeckel’s dispersion \(\mathcal {D}_{\varphi }\) is nonsmooth. In such a case it can be useful to add a small random perturbation to \(\varvec{\beta }_s\).
Osborne’s method can be started only from a vertex of the arrangement (10). Switching to that method therefore requires an intermediate step, similar to what is known as a “crossover” in optimization. To find Osborne’s starting vertex, having \(\varvec{\beta }_s\) in hand from Stage 1, we compute the permutation \(Q\) such that \(r_{Q_1}(\varvec{\beta }_s) \le r_{Q_2}(\varvec{\beta }_s) \le \cdots \le r_{Q_n}(\varvec{\beta }_s)\) and solve the linear programming problem \(\min _{\varvec{\beta } \in \mathbb {R}^p}\left\{ \sum _{i=1}^n a(i) (y_{Q_{i}} - \varvec{x}_{Q_i}^{\top }\varvec{\beta })\ |\ y_{Q_{1}} - \varvec{x}_{Q_1}^{\top }\varvec{\beta } \le y_{Q_{2}} - \varvec{x}_{Q_2}^{\top }\varvec{\beta } \le \cdots \le y_{Q_{n}} - \varvec{x}_{Q_n}^{\top }\varvec{\beta }\right\} \).
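A minimal sketch of this crossover step in Python with SciPy, on synthetic data with Wilcoxon-type scores (all names and data are our own illustrative assumptions): since the ordering constraints are transitive, only the \(n-1\) adjacent inequalities need to be passed to the LP solver.

```python
import numpy as np
from scipy.optimize import linprog

def crossover_lp(beta_s, X, y, scores):
    """From beta_s, fix the residual ordering Q and solve
         min_beta  sum_i a(i) (y_{Q_i} - x_{Q_i}' beta)
         s.t.      y_{Q_1} - x_{Q_1}' beta <= ... <= y_{Q_n} - x_{Q_n}' beta.
    Only adjacent inequalities are needed (the ordering is transitive)."""
    n, p = X.shape
    Q = np.argsort(y - X @ beta_s, kind="stable")  # r_{Q_1} <= ... <= r_{Q_n}
    Xq, yq = X[Q], y[Q]

    # Objective sum_i a(i)(y_{Q_i} - x_{Q_i}'beta) = const - (sum_i a(i) x_{Q_i})'beta.
    c = -(scores[:, None] * Xq).sum(axis=0)
    const = float(scores @ yq)

    # Adjacent constraints: (x_{Q_{i+1}} - x_{Q_i})' beta <= y_{Q_{i+1}} - y_{Q_i}.
    A_ub = Xq[1:] - Xq[:-1]
    b_ub = yq[1:] - yq[:-1]

    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * p, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x, const + float(c @ res.x)

# Hypothetical small example.
rng = np.random.default_rng(1)
n, p = 12, 2
X = rng.standard_normal((n, p))
y = X @ np.array([1.0, -2.0]) + 0.1 * rng.standard_normal(n)
i = np.arange(1, n + 1)
scores = np.sqrt(12.0) * (i / (n + 1) - 0.5)   # Wilcoxon-type scores

beta_v, value = crossover_lp(np.zeros(p), X, y, scores)
Q = np.argsort(y, kind="stable")               # ordering fixed by beta_s = 0
print(np.all(np.diff((y - X @ beta_v)[Q]) >= -1e-9))  # ordering preserved
print(value >= -1e-6)                          # dispersion value stays nonnegative
```

The LP is feasible (\(\varvec{\beta }_s\) itself satisfies the constraints by construction) and bounded below, since on the feasible polyhedron the objective coincides with the nonnegative dispersion; its optimum is attained at a vertex of the arrangement, which is the starting point Osborne’s method requires.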
References
Agresti A (2018) Statistical methods for the social sciences, 5th edn. Pearson Education Limited, London
Antoch J, Ekblom H (1995) Recursive robust regression—computational aspects and comparison. Comput Stat Data Anal 19:115–128
Antoch J, Ekblom H (2003) Selected algorithms for robust regression estimators. In: Dutter R et al (eds) Developments in robust statistics. Physica-Verlag, Heidelberg, pp 32–48
Antoch J, Janssen P (1989) Nonparametric regression \(M\)-quantiles. Stat Probab Lett 8:355–362
Bertsekas D (2009) Convex optimization theory. Athena Scientific, Nashua
Bertsekas D (2015) Convex optimization algorithms. Athena Scientific, Nashua
Bilgic Y, Susmann H (2013) rlme: an \(R\) package for rank-based estimation and prediction in random effects nested models. R J 5:71–79
Björck Å (1996) Numerical methods for least squares problems. SIAM, Philadelphia
Björck Å (2015) Numerical methods in matrix computation. Springer, Heidelberg
Cassart D, Hallin M, Paindaveine D (2010) On the estimation of cross-information quantities in rank-based inference. In: Antoch J, Hušková M, Sen P (eds) Nonparametrics and robustness in modern statistical inference and time series analysis: a Festschrift in honor of Prof. Jana Jurečková, Vol. 7 of IMS Collections, Institute of Mathematical Statistics, pp 35–45
Černý M et al (2022) A class of optimization problems motivated by rank estimators in robust regression. Optimization 71:2241–2271
Černý M, Hladík M, Rada M (2021) Walks on hyperplane arrangements and optimization of piecewise linear functions. Technical report. arXiv:1912.12750
Černý M, Rada M, Sokol O (2020) Rank-estimators for robust regression: approximate algorithms, exact algorithms and two-stage methods. In: Huynh V et al (eds) Integrated uncertainty in knowledge modelling and decision making, volume 12482 of Lecture Notes in Computer Science. Springer, New York, pp 163–173. https://doi.org/10.1007/978-3-030-62509-2_14
Cheng K, Hettmansperger T (1983) Weighted least squares rank regression. Commun Stat A Theory Methods 12:1069–1086
Crimin K, Abebe A, McKean J (2008) Robust general linear models and graphics via a user interface (Web RGLM). J Mod Appl Stat Methods 7:318–330
Draper D (1988) Rank-based robust analysis of linear models. I. Exposition and review (with discussion). Stat Sci 3:239–271
Dutta S, Datta S (2018) Rank-based inference for covariate and group effects in clustered data in presence of informative intra-cluster group size. Stat Med 37:4807–4822
Dutter R, Huber P (1981) Numerical methods for the nonlinear robust regression problem. J Stat Comput Simul 13:79–113
Gao F, Han L (2012) Implementing the Nelder–Mead simplex algorithm with adaptive parameters. Comput Optim Appl 51:259–277
George K, Osborne M (1990) The efficient computation of linear rank statistics. J Stat Comput Simul 35:227–237
Golub G, van Loan C (1996) Matrix computations, 3rd edn. Johns Hopkins University Press, Baltimore
Grötschel M, Lovász L, Schrijver A (1993) Geometric algorithms and combinatorial optimization. Springer, Heidelberg
Hájek J, Šidák Z (1967) The theory of rank tests. Academia, Prague
Hallin M et al (2013) One-step \(R\)-estimation in linear models with stable errors. J Econom 172:195–204
Hallin M, Mehta C (2015) \(R\)-estimation for asymmetric independent component analysis. J Am Stat Assoc 110:218–232
Hallin M, La Vecchia D (2017) \(R\)-estimation in semiparametric dynamic location-scale models. J Econom 196:233–247
Hallin M, Oja H, Paindaveine D (2006) Semiparametrically efficient rank-based inference for shape II. Optimal \(R\)-estimation of shape. Ann Stat 34:2757–2789
Hallin M, Paindaveine D, Verdebout T (2014) Efficient \(R\)-estimation of principal and common principal components. J Am Stat Assoc 109:1071–1083
Hardy G (1988) Inequalities, 2nd edn. Cambridge University Press, Cambridge
Hettmansperger T (1984) Statistical inference based on ranks. Wiley, New York
Hettmansperger T, McKean J (2011) Statistical inference based on ranks, 2nd edn. CRC Press, Boca Raton
Hladík M, Černý M, Antoch J (2020) EIV regression with bounded errors in data: total least squares with Chebyshev norm. Stat Pap 61:279–301
Jaeckel L (1972) Estimating regression coefficients by minimizing the dispersion of the residuals. Ann Math Stat 43:1449–1458
Jurečková J (1969) Asymptotic linearity of a rank statistic in regression parameter. Ann Math Stat 40:1889–1900
Jurečková J (2016) Averaged extreme regression quantiles. Extremes 19:41–49
Jurečková J et al (2016) Behavior of \(R\)-estimators under measurement errors. Bernoulli 22:1093–1112
Jurečková J, Sen P (1996) Robust statistical procedures. Wiley, New York
Jurečková J, Sen P, Picek J (2012) Methodology in robust and nonparametric statistics. Chapman and Hall/CRC, Boca Raton
Jurečková J, Picek J, Schindler M (2019) Robust statistical methods with R. Chapman and Hall/CRC, Boca Raton
Kloke J, Mckean J (2012) RFIT: rank-based estimation for linear models. R J 4:57–64
Kloke J, Mckean J, Mushfiqur R (2009) Cluster correlated errors. J Am Stat Assoc 104:384–390
Koenker R (2005) Quantile regression. Oxford University Press, Oxford
Koller M (2016) robustlmm: an \(R\)-package for robust estimation of linear mixed-effects models. J Stat Softw 75:1–24
Koul H (1971) Asymptotic behavior of a class of confidence regions in multiple linear regression. Ann Math Stat 42:42–57
Koul H, Sievers G, McKean J (1987) An estimator of the scale parameter for the rank analysis of linear models under general score functions. Scand J Stat 14:131–141
Le Cam L (1986) Asymptotic methods in statistical decision theory. Springer, New York
Letellier T et al (2000) Statistical analysis of mitochondrial pathologies in childhood: identification of deficiencies using principal component analysis. Lab Investig 80:1019–1030
Louhichi S, Miura R, Volný D (2017) On the asymptotic normality of the \(R\)-estimators of the slope parameters of simple linear regression models with associated errors. Statistics 51:167–187
Madsen K, Nielsen H (1990) Finite algorithms for robust linear regression. BIT 30:682–699
Maechler M (2020) CRAN task view: robust statistical methods. Technical report. https://cran.r-project.org/web/views/Robust.html/
Marazzi A (1992) Algorithms, routines and S-functions for robust statistics. Wadsworth, Pacific Grove
Mathworks (2023) Robustfit. Technical report. https://www.mathworks.com/help/stats/robustfit.html/
McKean J, Hettmansperger T (1978) A robust analysis of the general linear model based on one-step \(R\)-estimates. Biometrika 65:571–579
McKean J, Sievers G (1989) Rank scores suitable for the analysis of linear models under asymmetric error distributions. Technometrics 31:207–218
Nelder J, Mead R (1965) A simplex method for function minimization. Comput J 7:308–313
O’Leary D (1990) Robust regression computation using iteratively reweighted least squares. SIAM J Matrix Anal Appl 11:466–480
Osborne M (1982) A finite algorithm for the rank regression problem. J Appl Probab 19:241–252
Osborne M (1985) Finite algorithms in optimization and data analysis. Wiley, New York
Osborne M (2001) Simplicial algorithms for minimizing polyhedral functions. Cambridge University Press, Cambridge
Press W et al (2007) Numerical recipes: the art of scientific computing, 3rd edn. Cambridge University Press, New York
Rao C (ed) (1993) Computational statistics. Handbook of statistics, vol 9. North-Holland, Amsterdam
Rockafellar R (1970) Convex analysis. Princeton University Press, Princeton
Roos C, Terlaky T, Vial J (2005) Interior point methods for linear optimization. Springer, New York
Saleh A, Picek J, Kalina J (2012) \(R\)-estimation of the parameters of a multiple regression model with measurement errors. Metrika 75:311–328
Schlossmacher E (1973) An iterative technique for absolute deviations curve fitting. J Am Stat Assoc 68:857–859
Sen P, Singer J, Pedroso de Lima A (2010) From finite sample to asymptotic methods in statistics. Cambridge University Press, New York
Shanno D, Rocke D (1986) Numerical methods for robust regression: linear models. SIAM J Sci Stat Comput 7:86–97
Sievers J, Abebe A (2004) Rank estimation of regression coefficients using iterated reweighted least squares. J Stat Comput Simul 74:821–831
Singer S, Singer S (2004) Efficient implementation of the Nelder–Mead search algorithm. Appl Numer Anal Comput Math 1:524–534
van Eeden C, Kraft C (1972) Linearized rank estimates and signed-rank estimates for the general linear hypothesis. Ann Math Stat 43:42–57
Wolke R (1992) Iteratively reweighted least squares. Algorithms, convergence and numerical comparison. BIT 32:506–524
Wolke R, Schwetlick H (1988) Iteratively reweighted least squares: algorithms, convergence analysis, and numerical comparisons. SIAM J Sci Stat Comput 9:907–921
Yu C, Yao W (2017) Robust linear regression: a review and comparison. Commun Stat Simul Comput 46:6261–6282
Acknowledgements
We thank M. Hallin, M. Rada, and other colleagues for thorough discussions and interesting remarks. We also received valuable input from an associate editor. The work of J. Antoch and M. Černý was supported by the Czech Science Foundation under Grant 22-19353S. The work of R. Miura was supported by JSPS KAKENHI Grant Number 17H02508.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Antoch, J., Černý, M. & Miura, R. R-estimation in linear models: algorithms, complexity, challenges. Comput Stat 40, 405–439 (2025). https://doi.org/10.1007/s00180-024-01495-0