Abstract
To tackle long time series with high-dimensional covariates and dependent non-Gaussian errors, we adopt a divide-and-conquer strategy and develop a class of sparse and debiased Lasso estimators. To alleviate serial correlation in long time series data, we sequentially split the series into several subseries and apply a generalized penalized least squares (GLS) method to a linear regression model in each subseries, allowing for stationary covariates and AR(q) error processes. To enable accurate statistical inference, we further propose a sparse and debiased estimator and investigate its asymptotic properties. By constructing a pseudo-response variable through a squared-loss transformation, the proposed GLS method is extended to a unified M-estimation framework that includes Huber and quantile regression models, which reduces the computational burden. Extensive simulations validate the theoretical properties and demonstrate that the proposed estimators outperform several existing methods. The estimators are applied to the Beijing Air Quality data and the NIFTY 50 Index data to illustrate their validity and feasibility.
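The abstract describes the pipeline only at a high level, so the following minimal Python sketch is an illustration of the divide-and-conquer GLS-Lasso idea rather than the authors' implementation. It makes several simplifying assumptions: AR(1) errors instead of general AR(q), a single pilot-Lasso/prewhitening pass per subseries, and plain averaging of subseries estimates in place of the paper's sparse debiasing step; all function names and tuning values are hypothetical.

# Minimal sketch (not the authors' implementation): divide-and-conquer
# Lasso with a GLS-style prewhitening step for AR(1) errors.
# Simplifications: AR order q = 1, contiguous splits, simple averaging
# of subseries estimates; the paper's debiasing step is omitted.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

def ar1_prewhiten(y, X, rho):
    # Quasi-difference transform: z_t = y_t - rho * y_{t-1}, same for X.
    return y[1:] - rho * y[:-1], X[1:] - rho * X[:-1]

def gls_lasso_subseries(y, X, lam):
    # Pilot Lasso -> AR(1) coefficient from residuals -> refit on
    # prewhitened data (one pass of the iterable GLS idea).
    pilot = Lasso(alpha=lam).fit(X, y)
    e = y - pilot.predict(X)
    rho = np.dot(e[1:], e[:-1]) / np.dot(e[:-1], e[:-1])  # lag-1 autocorrelation
    y_t, X_t = ar1_prewhiten(y, X, rho)
    return Lasso(alpha=lam).fit(X_t, y_t).coef_

def dac_gls_lasso(y, X, K=5, lam=0.1):
    # Split the series into K contiguous subseries and average the
    # subseries GLS-Lasso estimates.
    coefs = [gls_lasso_subseries(y_k, X_k, lam)
             for y_k, X_k in zip(np.array_split(y, K), np.array_split(X, K))]
    return np.mean(coefs, axis=0)

# Toy example: n = 2000, p = 50, sparse signal, AR(1) errors.
n, p = 2000, 50
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [1.5, -2.0, 1.0]
eps = np.zeros(n)
for t in range(1, n):
    eps[t] = 0.6 * eps[t - 1] + rng.standard_normal()
y = X @ beta + eps

print(np.round(dac_gls_lasso(y, X)[:5], 2))

On this toy example the averaged estimate recovers the three nonzero coefficients up to the usual Lasso shrinkage; in the paper it is the debiasing step, omitted here, that removes this bias and underpins valid confidence intervals.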




Data availability
No datasets were generated or analysed during the current study.
Acknowledgements
The authors are grateful to the Editor, an Associate Editor and two anonymous referees for their insightful comments and suggestions on this article, which have led to significant improvements. Our research was supported by the National Natural Science Foundation of China (12271272, 12201316). All authors contributed to this work equally.
Author information
Contributions
Jin Liu and Wei Ma performed the simulations and prepared the figures and tables. Lei Wang and Heng Lian wrote the main manuscript text. All authors reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
Supplementary Material
The Supplementary Material contains some useful lemmas, proofs of Theorems 1–3 and Corollary 1, discussions of the assumptions, and additional simulation results. (PDF 794 KB)
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Liu, J., Ma, W., Wang, L. et al. Sparse and debiased Lasso estimation and statistical inference for long time series via divide-and-conquer. Stat Comput 35, 72 (2025). https://doi.org/10.1007/s11222-025-10602-0