Sparse and debiased Lasso estimation and statistical inference for long time series via divide-and-conquer

  • Original Paper
  • Published in Statistics and Computing

Abstract

To tackle long time series with high-dimensional covariates and dependent non-Gaussian errors, we adopt the divide-and-conquer strategy and develop a class of sparse and debiased Lasso estimators. To alleviate the serial correlation in long time series data, we sequentially split the long time series into several subseries and apply a generalized penalized least squares (GLS) method to the linear regression model in each subseries, allowing stationary covariates and AR(q) error processes. To enable accurate statistical inference, we further propose a sparse and debiased estimator and investigate its asymptotic properties. By constructing a pseudo-response variable via a squared-loss transformation, the proposed GLS method is extended to a unified M-estimation framework, including Huber and quantile regression models, which reduces the computational burden. Extensive simulations validate the theoretical properties and demonstrate that the proposed estimators outperform some existing methods. The proposed estimators are applied to Beijing Air Quality Data and NIFTY 50 Index Data to illustrate their validity and feasibility.
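The pipeline sketched in the abstract — sequential splitting into subseries, a GLS Lasso fit with AR prewhitening on each, a debiasing correction, averaging across subseries, and a final thresholding step that restores sparsity — can be illustrated as follows. This is a minimal sketch, not the authors' implementation: the coordinate-descent solver, the AR(1) (rather than general AR(q)) prewhitening, the ridge-regularized inverse used as a precision-matrix proxy in the debiasing step, and all tuning constants (`lam`, `tau`, the ridge level) are simplifying assumptions made here for illustration.

```python
import numpy as np

def lasso_cd(X, y, lam, n_sweeps=100):
    """Coordinate-descent Lasso for (1/(2n))||y - Xb||^2 + lam*||b||_1."""
    n, p = X.shape
    beta, r = np.zeros(p), y.copy()
    sq = (X ** 2).sum(axis=0) / n                    # per-column curvature
    for _ in range(n_sweeps):
        for j in range(p):
            z = X[:, j] @ r / n + sq[j] * beta[j]
            bj = np.sign(z) * max(abs(z) - lam, 0.0) / sq[j]  # soft-threshold
            r += X[:, j] * (beta[j] - bj)            # keep residual in sync
            beta[j] = bj
    return beta

def fit_subseries(X, y, lam):
    """Pilot Lasso -> AR(1) prewhitening (GLS step) -> one-step debiasing."""
    n, p = X.shape
    beta = lasso_cd(X, y, lam)                       # pilot fit
    res = y - X @ beta
    rho = (res[1:] @ res[:-1]) / (res[:-1] @ res[:-1])   # AR(1) coefficient
    Xt, yt = X[1:] - rho * X[:-1], y[1:] - rho * y[:-1]  # prewhitened data
    beta = lasso_cd(Xt, yt, lam)                     # GLS Lasso refit
    Sigma = Xt.T @ Xt / len(yt)
    Theta = np.linalg.inv(Sigma + 0.05 * np.eye(p))  # regularized precision proxy
    return beta + Theta @ Xt.T @ (yt - Xt @ beta) / len(yt)  # debias correction

def dac_sparse_debiased(X, y, K=4, lam=0.1, tau=0.1):
    """Average debiased estimates over K sequential subseries, then threshold."""
    blocks = np.array_split(np.arange(len(y)), K)    # sequential split, not random
    est = np.mean([fit_subseries(X[b], y[b], lam) for b in blocks], axis=0)
    return np.where(np.abs(est) > tau, est, 0.0)     # restore sparsity

# Synthetic check: sparse signal with AR(1) errors
rng = np.random.default_rng(0)
n, p = 2000, 50
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]
X = rng.standard_normal((n, p))
eps = np.zeros(n)
for t in range(1, n):                                # AR(1) error process
    eps[t] = 0.5 * eps[t - 1] + rng.standard_normal()
y = X @ beta_true + eps
beta_hat = dac_sparse_debiased(X, y)
```

Averaging the K debiased subseries estimates recovers full-sample accuracy while each fit stays cheap, and the final hard threshold `tau` re-imposes sparsity — the "sparse and debiased" construction the abstract refers to.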


Data availability

No datasets were generated or analysed during the current study.


Acknowledgements

The authors are grateful to the Editor, an Associate Editor and two anonymous referees for their insightful comments and suggestions on this article, which have led to significant improvements. Our research was supported by the National Natural Science Foundation of China (12271272, 12201316). All authors contributed to this work equally.

Author information

Authors and Affiliations

Authors

Contributions

Jin Liu and Wei Ma performed the simulations and prepared the figures and tables. Lei Wang and Heng Lian wrote the main manuscript text. All authors reviewed the manuscript.

Corresponding author

Correspondence to Lei Wang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary Material

The Supplementary Material contains some useful lemmas, proofs of Theorems 1-3 and Corollary 1, a discussion of the assumptions, and additional simulation results. (pdf 794KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Liu, J., Ma, W., Wang, L. et al. Sparse and debiased Lasso estimation and statistical inference for long time series via divide-and-conquer. Stat Comput 35, 72 (2025). https://doi.org/10.1007/s11222-025-10602-0
