Abstract
Variable selection is crucial for investigating relationships between variables in regression analysis. However, data are sometimes collected imprecisely and cannot be described by random variables, so classical variable selection methods are invalid. Characterizing such imprecise observations as uncertain variables, this paper presents the uncertain lasso estimate and the de-biased uncertain lasso estimate to select variables and estimate unknown parameters, respectively. Moreover, a cross-validation procedure for choosing the tuning parameter is suggested. Finally, numerical examples are documented to illustrate our methods in detail.
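The uncertain lasso minimizes an expected-distance objective built from uncertainty distributions, which is beyond a short snippet. As a rough illustration of how the \(\ell_1\) penalty performs variable selection, the following sketch runs a classical coordinate-descent lasso on the midpoints of hypothetical interval-valued responses; the function name `lasso_cd` and all data are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator, the building block of lasso updates."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=300):
    """Coordinate descent for (1/(2n))*||y - X b||^2 + lam*||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            # partial residual with feature j's contribution removed
            r = y - X @ b + X[:, j] * b[j]
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, lam) / col_sq[j]
    return b

# Hypothetical imprecise observations: each response is an interval
# [y - 0.1, y + 0.1]; here we simply fit on the interval midpoints.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
y_exact = X @ np.array([2.0, 0.0, -1.0])  # true model is sparse
y_lo, y_hi = y_exact - 0.1, y_exact + 0.1
b_hat = lasso_cd(X, (y_lo + y_hi) / 2.0, lam=0.1)
print(b_hat)  # the second coefficient is shrunk to (near) zero
```

The penalty drives the coefficient of the irrelevant second variable toward exactly zero while only slightly shrinking the active coefficients, which is the selection behavior the paper's uncertain lasso transfers to the imprecise-observation setting.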
References
Akaike H (1973) Information theory and an extension of the maximum likelihood principle. In: Petrov, Csaki (eds) Proceedings of 2nd international symposium on information theory. Akademia Kiado, Budapest, pp 267–281
Bisserier A, Boukezzoula R, Galichet S (2009) An interval approach for fuzzy linear regression with imprecise data. In: Proceedings of the joint 2009 international fuzzy systems association world congress and 2009 European society of fuzzy logic and technology conference, Lisbon, Portugal, July 20–24
Cattaneo M, Wiencierz A (2012) Likelihood-based imprecise regression. Int J Approx Reason 53:1137–1154. https://doi.org/10.1016/j.ijar.2012.06.010
Fang L, Liu S, Huang Z (2020) Uncertain Johnson-Schumacher growth model with imprecise observations and \(k\)-fold cross-validation test. Soft Comput 24:2715–2720. https://doi.org/10.1007/s00500-019-04090-4
Ferraro M, Coppi R, Gonzalez G, Colubi A (2010) A linear regression model for imprecise response. Int J Approx Reason 51:759–770. https://doi.org/10.1016/j.ijar.2010.04.003
Geisser S (1974) A predictive approach to the random effect model. Biometrika 61:101–107. https://doi.org/10.2307/2334290
Geisser S (1975) The predictive sample reuse method with applications. J Am Stat Assoc 70:320–328. https://doi.org/10.1080/01621459.1975.10479865
Lio W, Liu B (2018) Residual and confidence interval for uncertain regression model with imprecise observations. J Intell Fuzzy Syst 35:2573–2583. https://doi.org/10.3233/JIFS-18353
Lio W, Liu B (2020) Uncertain maximum likelihood estimation with application to uncertain regression analysis. Soft Comput 24:9351–9360. https://doi.org/10.1007/s00500-020-04951-3
Liu B (2007) Uncertainty theory, 2nd edn. Springer, Berlin
Liu B (2009) Some research problems in uncertainty theory. J Uncertain Syst 3:3–10
Liu B (2010) Uncertainty theory: a branch of mathematics for modeling human uncertainty. Springer, Berlin
Liu B (2012) Why is there a need for uncertainty theory? J Uncertain Syst 6:3–10
Liu B (2015) Uncertainty theory, 4th edn. Springer, Berlin
Liu Z, Jia L (2020) Cross-validation for the uncertain Chapman–Richards growth model with imprecise observations. Int J Uncertain Fuzziness Knowl Based Syst 28:769–783. https://doi.org/10.1142/S0218488520500336
Liu Z, Yang Y (2020) Least absolute deviations estimation for uncertain regression with imprecise observations. Fuzzy Optim Decis Mak 19:33–52. https://doi.org/10.1007/s10700-019-09312-w
Prade H, Serrurier M (2010) Why imprecise regression: a discussion. In: Borgelt C et al (eds) Combining soft computing and statistical methods in data analysis. Advances in intelligent and soft computing, vol 77. Springer, Berlin
Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6:461–464. https://doi.org/10.1214/aos/1176344136
Stone M (1974) Cross-validatory choice and assessment of statistical predictions. J R Stat Soc Ser B Stat Methodol 36:111–147. https://doi.org/10.1111/j.2517-6161.1974.tb00994.x
Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc Ser B Stat Methodol 58:267–288. https://doi.org/10.1111/j.2517-6161.1996.tb02080.x
Wang X, Gao Z, Guo H (2012) Delphi method for estimating uncertainty distributions. Inf Int Interdiscip J 15:449–460
Wang X, Peng Z (2014) Method of moments for estimating uncertainty distributions. J Uncertain Anal Appl. https://doi.org/10.1186/2195-5468-2-5
Yao K, Liu B (2018) Uncertain regression analysis: an approach for imprecise observations. Soft Comput 22:5579–5582. https://doi.org/10.1007/s00500-017-2521-y
Ye T, Liu Y (2020) Multivariate uncertain regression model with imprecise observations. J Ambient Intell Human Comput 11:4941–4950. https://doi.org/10.1007/s12652-020-01763-z
Zhang C, Liu Z, Liu J (2020) Least absolute deviations for uncertain multivariate regression model. Int J Gen Syst 49:449–465. https://doi.org/10.1080/03081079.2020.1748615
Acknowledgements
This work was supported by National Natural Science Foundation of China (No. 62073009) and the Program for Young Excellent Talents in UIBE (No. 18YQ06).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This paper does not contain any studies with human participants or animals performed by any of the authors.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
Proof of Corollary 1
According to Definitions 1 and 2, the uncertain lasso estimate and de-biased uncertain lasso estimate for linear regression model (11) solve minimization problems (12) and (13), respectively. Since the regression function \(\beta _{0}+\beta _{1}\tilde{x}_{1i}+\cdots +\beta _{p}\tilde{x}_{pi}\) is increasing with respect to \(\tilde{x}_{ji}\) when \(\beta _{j} > 0\) and decreasing with respect to \(\tilde{x}_{ji}\) when \(\beta _{j} \le 0\) for each i \((i=1,2,\ldots , n)\), the corollary follows from Theorems 1 and 2 immediately. \(\square \)
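The monotonicity step can be sketched explicitly (writing \(\Phi _{ji}^{-1}\) for the inverse uncertainty distribution of \(\tilde{x}_{ji}\), a notational assumption since the main text is not reproduced here): by the operational law of uncertain variables, the inverse uncertainty distribution of \(\beta _{0}+\sum _{j}\beta _{j}\tilde{x}_{ji}\) at level \(\alpha \) is

\[
\beta _{0}+\sum _{j:\,\beta _{j}>0}\beta _{j}\,\Phi _{ji}^{-1}(\alpha )+\sum _{j:\,\beta _{j}\le 0}\beta _{j}\,\Phi _{ji}^{-1}(1-\alpha ),
\]

so the sign of each \(\beta _{j}\) determines whether \(\Phi _{ji}^{-1}\) is evaluated at \(\alpha \) or at \(1-\alpha \) when the minimization problems are written in terms of inverse distributions.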
Proof of Corollary 2
According to Definitions 1 and 2, the uncertain lasso estimate and de-biased uncertain lasso estimate for the regression model (14) solve minimization problems (15) and (16), respectively. It follows from the operational law of uncertain variables (p. 55 of Liu 2015) that the inverse uncertainty distributions of the uncertain variables \(\ln \tilde{y}_{i}\) are \(\ln F_{i}^{-1}(\alpha )\), \(i=1,2,\ldots ,n\), respectively. Since the function \(\beta _{0}+\beta _{1}\tilde{x}_{1i}+\cdots +\beta _{p}\tilde{x}_{pi}\) is increasing with respect to \(\tilde{x}_{ji}\) when \(\beta _{j} > 0\) and decreasing with respect to \(\tilde{x}_{ji}\) when \(\beta _{j} \le 0\) for each i \((i=1,2,\ldots , n)\), the corollary follows from Theorems 1 and 2 immediately. \(\square \)
Cite this article
Liu, Z., Yang, X. Variable selection in uncertain regression analysis with imprecise observations. Soft Comput 25, 13377–13387 (2021). https://doi.org/10.1007/s00500-021-06129-x