New Techniques to Perform Cross-Validation for Time Series Models

  • Brief Report
  • Operations Research Forum

Abstract

Validating time series models has always been challenging. The presence of autocorrelation in the data makes conventional cross-validation techniques, such as k-fold cross-validation, difficult to apply to time series models. In this paper, two weighted k-fold time series split cross-validation techniques are proposed for this purpose. The proposed techniques were validated using the opening price data of a cryptocurrency. Mean squared error (MSE), mean absolute error (MAE) and mean absolute percentage error (MAPE) were the metrics selected to validate the proposed techniques. Both techniques were found to give robust results; however, the exponentially weighted k-fold time series split cross-validation (EWKCV) technique was seen to perform better than the generally weighted k-fold time series split cross-validation (GWKCV) technique. The results of the proposed techniques, along with those of a simple train-test split for the time series models, are seen to give better results.
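
The abstract does not state the weighting formulas, so the following is only a minimal sketch of what an exponentially weighted time series split could look like, not the authors' implementation. It assumes expanding-window folds, fold weights proportional to (1 - alpha)^(k - i) so that the most recent fold counts most, and a naive last-value forecaster standing in for the time series model. All function names and the `alpha` parameter are hypothetical. The sample values are the first twelve opening prices listed in the Annexure.

```python
def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def time_series_folds(n_obs, k):
    """Expanding-window splits: fold i trains on the first i blocks and
    tests on the block that follows, so no future data leaks into training."""
    block = n_obs // (k + 1)
    if block < 1:
        raise ValueError("too few observations for the requested number of folds")
    folds = []
    for i in range(1, k + 1):
        train_end = i * block
        test_end = n_obs if i == k else train_end + block
        folds.append((list(range(train_end)), list(range(train_end, test_end))))
    return folds


def exponential_weights(k, alpha=0.5):
    """Assumed scheme: weight of fold i is proportional to (1 - alpha)^(k - i),
    normalised to sum to 1, so later folds receive larger weights."""
    raw = [(1 - alpha) ** (k - i) for i in range(1, k + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def naive_forecast(train, horizon):
    """Placeholder model: repeat the last observed value."""
    return [train[-1]] * horizon


def weighted_cv_score(series, k, metric, weights):
    """Weighted average of the per-fold error under the given metric."""
    scores = []
    for train_idx, test_idx in time_series_folds(len(series), k):
        train = [series[i] for i in train_idx]
        test = [series[i] for i in test_idx]
        scores.append(metric(test, naive_forecast(train, len(test))))
    return sum(w * s for w, s in zip(weights, scores))


# First twelve opening prices from the Annexure.
open_prices = [465.864, 456.86, 424.103, 394.673, 408.085, 399.1,
               402.092, 435.751, 423.156, 411.429, 403.556, 399.471]

k = 3
weights = exponential_weights(k, alpha=0.5)
for name, metric in (("MSE", mse), ("MAE", mae), ("MAPE", mape)):
    print(name, weighted_cv_score(open_prices, k, metric, weights))
```

Passing equal weights (`[1 / k] * k`) to `weighted_cv_score` gives the unweighted time series split average, which makes it easy to see how much the weighting shifts the estimate toward recent folds.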

Data Availability

The data analysed in this paper were taken from the Kaggle website (https://www.kaggle.com/datasets/varpit94/bitcoin-data-updated-till-26jun2021). The first forty of the 2832 observations are shown in the Annexure of this paper for quick reference.

Code Availability

Submitted.

Funding

None.

Author information

Contributions

Vamsikrishna A carried out the analysis and wrote the draft. Gijo EV guided the work and carried out multiple detailed reviews of the manuscript to bring it to its final form.

Corresponding author

Correspondence to A. Vamsikrishna.

Ethics declarations

Ethics Approval

Not applicable.

Consent to Participate

Not applicable.

Consent for Publication

Not applicable.

Competing Interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Annexure: First Forty Observations of 2832 Observations

Date        Open     High     Low      Close    Adj Close  Volume
17-09-2014  465.864  468.174  452.422  457.334  457.334    21,056,800
18-09-2014  456.86   456.86   413.104  424.44   424.44     34,483,200
19-09-2014  424.103  427.835  384.532  394.796  394.796    37,919,700
20-09-2014  394.673  423.296  389.883  408.904  408.904    36,863,600
21-09-2014  408.085  412.426  393.181  398.821  398.821    26,580,100
22-09-2014  399.1    406.916  397.13   402.152  402.152    24,127,600
23-09-2014  402.092  441.557  396.197  435.791  435.791    45,099,500
24-09-2014  435.751  436.112  421.132  423.205  423.205    30,627,700
25-09-2014  423.156  423.52   409.468  411.574  411.574    26,814,400
26-09-2014  411.429  414.938  400.009  404.425  404.425    21,460,800
27-09-2014  403.556  406.623  397.372  399.52   399.52     15,029,300
28-09-2014  399.471  401.017  374.332  377.181  377.181    23,613,300
29-09-2014  376.928  385.211  372.24   375.467  375.467    32,497,700
30-09-2014  376.088  390.977  373.443  386.944  386.944    34,707,300
01-10-2014  387.427  391.379  380.78   383.615  383.615    26,229,400
02-10-2014  383.988  385.497  372.946  375.072  375.072    21,777,700
03-10-2014  375.181  377.695  357.859  359.512  359.512    30,901,200
04-10-2014  359.892  364.487  325.886  328.866  328.866    47,236,500
05-10-2014  328.916  341.801  289.296  320.51   320.51     83,308,096
06-10-2014  320.389  345.134  302.56   330.079  330.079    79,011,800
07-10-2014  330.584  339.247  320.482  336.187  336.187    49,199,900
08-10-2014  336.116  354.364  327.188  352.94   352.94     54,736,300
09-10-2014  352.748  382.726  347.687  365.026  365.026    83,641,104
10-10-2014  364.687  375.067  352.963  361.562  361.562    43,665,700
11-10-2014  361.362  367.191  355.951  362.299  362.299    13,345,200
12-10-2014  362.606  379.433  356.144  378.549  378.549    17,552,800
13-10-2014  377.921  397.226  368.897  390.414  390.414    35,221,400
14-10-2014  391.692  411.698  391.324  400.87   400.87     38,491,500
15-10-2014  400.955  402.227  388.766  394.773  394.773    25,267,100
16-10-2014  394.518  398.807  373.07   382.556  382.556    26,990,000
17-10-2014  382.756  385.478  375.389  383.758  383.758    13,600,700
18-10-2014  383.976  395.158  378.971  391.442  391.442    11,416,800
19-10-2014  391.254  393.939  386.457  389.546  389.546    5,914,570
20-10-2014  389.231  390.084  378.252  382.845  382.845    16,419,000
21-10-2014  382.421  392.646  380.834  386.475  386.475    14,188,900
22-10-2014  386.118  388.576  382.249  383.158  383.158    11,641,300
23-10-2014  382.962  385.048  356.447  358.417  358.417    26,456,900
24-10-2014  358.591  364.345  353.305  358.345  358.345    15,585,700
25-10-2014  358.611  359.861  342.877  347.271  347.271    18,127,500

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Vamsikrishna, A., Gijo, E.V. New Techniques to Perform Cross-Validation for Time Series Models. Oper. Res. Forum 5, 51 (2024). https://doi.org/10.1007/s43069-024-00334-8

  • DOI: https://doi.org/10.1007/s43069-024-00334-8
