A robust quantile regression for bounded variables based on the Kumaraswamy Rectangular distribution

  • Original Paper
Statistics and Computing

Abstract

Quantile regression (QR) models offer an attractive alternative to ordinary regression models for the response mean. Besides allowing a more complete characterization of the response distribution, including its lower and upper tails, QR models are less sensitive to outlying observations than mean regression models. However, the estimates can still be affected by outlying observations. In this context, a robust parametric quantile regression model for bounded responses is developed, based on a new distribution, the Kumaraswamy Rectangular (KR) distribution. The KR distribution has a finite mixture structure similar to that of the Beta Rectangular distribution and, as a result, has heavier tails than the Kumaraswamy distribution. We show that the corresponding KR quantile regression model is more robust and flexible than the usual Kumaraswamy one. Bayesian inference, including parameter estimation, model fit assessment, model comparison, and influence analysis, is developed through a hybrid MCMC approach. Since the quantile of the KR distribution is not analytically tractable, we model the conditional quantile through a suitable data augmentation scheme. To link both quantiles in terms of a regression structure, we propose a two-step Bayesian estimation algorithm that numerically approximates the posterior distributions of the regression parameters for the KR quantile. The algorithm combines a Markov Chain Monte Carlo algorithm with the Ordinary Least Squares approach. Our proposal proved robust against outlying observations in the response while keeping the estimation process simple and adding little computational complexity. A simulation study showed the effectiveness of the estimation method, and two further simulation studies showed the benefits of the proposed model in terms of robustness and flexibility. To illustrate the adequacy of our approach in the presence of outlying observations, we analyzed two data sets of socio-economic indicators from Brazil and compared the fitted KR model with alternative models.
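To illustrate why the KR quantile has to be obtained numerically, the following minimal sketch assumes the KR distribution is a two-component mixture of a Uniform(0, 1) and a Kumaraswamy distribution, by analogy with the Beta Rectangular construction mentioned above. The mixing weight `theta`, the shape parameters `a` and `b`, and all function names are illustrative assumptions, not the paper's parameterization. The Kumaraswamy quantile has a closed form, while the mixture quantile is found here by plain bisection on the mixture CDF.

```python
def kuma_cdf(y, a, b):
    """Kumaraswamy CDF on (0, 1): F(y) = 1 - (1 - y**a)**b."""
    return 1.0 - (1.0 - y ** a) ** b


def kuma_quantile(q, a, b):
    """Closed-form Kumaraswamy quantile, obtained by inverting kuma_cdf."""
    return (1.0 - (1.0 - q) ** (1.0 / b)) ** (1.0 / a)


def kr_cdf(y, a, b, theta):
    """Assumed KR CDF: Uniform(0, 1) with weight theta,
    Kumaraswamy(a, b) with weight 1 - theta (hypothetical parameterization)."""
    return theta * y + (1.0 - theta) * kuma_cdf(y, a, b)


def kr_quantile(q, a, b, theta, tol=1e-12):
    """Numerical KR quantile: bisection on the monotone mixture CDF,
    since the mixture CDF has no closed-form inverse in general."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kr_cdf(mid, a, b, theta) < q:
            lo = mid  # quantile lies to the right of mid
        else:
            hi = mid  # quantile lies at or to the left of mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # With theta = 0 the mixture reduces to the Kumaraswamy distribution.
    print(kr_quantile(0.5, a=2.0, b=3.0, theta=0.0), kuma_quantile(0.5, 2.0, 3.0))
    # With theta > 0 the uniform component shifts the quantile.
    print(kr_quantile(0.5, a=2.0, b=3.0, theta=0.4))
```

Under this assumed mixture form, setting `theta = 0` recovers the Kumaraswamy quantile and setting `theta = 1` gives the uniform quantile `q`, which offers a quick sanity check for the numerical inversion.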

Acknowledgements

We would like to express our sincere gratitude to the referees and the Associate Editor for their diligent efforts, insightful comments, and valuable suggestions, which significantly enhanced the quality of this manuscript.

Funding

This study was partially financed by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – Brasil (CAPES) – Project Number 001. The authors are also thankful to the Conselho Nacional de Desenvolvimento Científico e Tecnológico, Grant Number 308058/2022-4, for a research scholarship granted to the second author, as well as to the Fundação de Amparo à Pesquisa do Estado de São Paulo, Grant Number 2020/16713-0, for additional financial support, also granted to the second author.

Author information

Corresponding author

Correspondence to Matheus Castro.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 85 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Castro, M., Azevedo, C. & Nobre, J. A robust quantile regression for bounded variables based on the Kumaraswamy Rectangular distribution. Stat Comput 34, 74 (2024). https://doi.org/10.1007/s11222-024-10381-0

  • DOI: https://doi.org/10.1007/s11222-024-10381-0
