A cost-sensitive active learning algorithm: toward imbalanced time series forecasting

Zhang, Jing; Dai, Qun

doi:10.1007/s00521-021-06837-3

Original Article
Published: 23 January 2022

Volume 34, pages 6953–6972, (2022)
Neural Computing and Applications

Abstract

Many effective techniques for time series forecasting (TSF) have been proposed in recent years. These techniques depend on having the necessary and sufficient data samples, which are the key to training a good predictor. We therefore design an active learning (AL) framework based on support vector regression (SVR) for TSF, with the goal of choosing the most valuable samples and reducing the complexity of the training set. To evaluate sample quality comprehensively, multiple essential criteria, such as informativeness, representativeness and diversity, are considered in a procedure of two consecutive clustering-based stages. In addition, time series data are often imbalanced: a range of values may be seriously under-represented yet extremely important to the user, so it is unreasonable to assign the same prediction cost to every sample. To address this imbalance problem, we propose a multiple-criteria cost-sensitive active learning algorithm built on a weighted SVR architecture, abbreviated as MAW-SVR and designed specifically for imbalanced TSF. Under the cost-sensitive scheme, each sample is given a penalty weight that can be updated dynamically during the AL procedure. Experimental comparisons between MAW-SVR and six other AL algorithms on a total of thirty time series datasets verify the effectiveness of the proposed algorithm.
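The cost-sensitive idea in the abstract, where each sample carries its own penalty weight, can be illustrated with a small sketch. This is not the MAW-SVR weighting scheme, which the abstract does not specify. The rarity-based rule below, the function names `rarity_weights` and `weighted_eps_insensitive_loss`, and the parameters `n_bins` and `eps` are all assumptions made for illustration. The sketch only shows how per-sample weights change an epsilon-insensitive loss of the kind used in SVR, so that errors on under-represented target values cost more.

```python
from collections import Counter


def rarity_weights(targets, n_bins=5):
    """Give each sample a penalty weight inversely proportional to how many
    samples share its target-value bin (hypothetical rule, not MAW-SVR's)."""
    lo, hi = min(targets), max(targets)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all targets are equal
    bins = [min(int((y - lo) / width), n_bins - 1) for y in targets]
    counts = Counter(bins)
    raw = [1.0 / counts[b] for b in bins]
    mean = sum(raw) / len(raw)
    return [w / mean for w in raw]  # normalise so the weights average to 1


def weighted_eps_insensitive_loss(y_true, y_pred, weights, eps=0.1):
    """Weighted epsilon-insensitive loss: errors inside the eps-tube cost
    nothing; larger errors are scaled by the sample's penalty weight."""
    return sum(
        w * max(0.0, abs(t - p) - eps)
        for t, p, w in zip(y_true, y_pred, weights)
    )


# Toy series: five common values near 1.0 and one rare, large value.
targets = [1.0, 1.1, 0.9, 1.0, 1.2, 5.0]
predictions = [1.0] * len(targets)  # a predictor that ignores the rare value

weights = rarity_weights(targets)
uniform = [1.0] * len(targets)

print("weights:", weights)
print("uniform loss: ", weighted_eps_insensitive_loss(targets, predictions, uniform))
print("weighted loss:", weighted_eps_insensitive_loss(targets, predictions, weights))
```

With uniform weights, a predictor that misses the rare value is penalised only once among six samples; with rarity-based weights the same miss dominates the loss. In an active learning loop, such weights would be recomputed as newly labelled samples are added to the training set.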

Acknowledgements

This work is supported by the National Key R&D Program of China (Grant Nos. 2018YFC2001600, 2018YFC2001602) and the National Natural Science Foundation of China (Grant No. 61473150).

Author information

Authors and Affiliations

College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, 211106, China
Jing Zhang & Qun Dai

Corresponding author

Correspondence to Qun Dai.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Zhang, J., Dai, Q. A cost-sensitive active learning algorithm: toward imbalanced time series forecasting. Neural Comput & Applic 34, 6953–6972 (2022). https://doi.org/10.1007/s00521-021-06837-3

Received: 28 November 2020
Accepted: 12 December 2021
Published: 23 January 2022
Issue Date: May 2022
DOI: https://doi.org/10.1007/s00521-021-06837-3
