Abstract
Imbalanced online sequential data arise in many practical engineering applications. Traditional machine learning methods often struggle to improve classification accuracy on such data. To achieve fast and efficient classification, a new online sequential extreme learning machine algorithm with a sparse-weighting strategy is proposed; it aims to increase the accuracy on the minority class while keeping the accuracy loss on the majority class as small as possible. The main idea is to integrate a new sparse-weighting strategy into the existing data-based strategy for the sequential data imbalance problem. In the offline stage, a two-phase balancing strategy is introduced to obtain a valuable virtual sample set. In the online stage, a dynamic weighting strategy assigns a weight to each sequential sample according to the change in sensitivity and specificity, in order to maintain the optimal network structure. Experimental results on two kinds of imbalanced datasets, UCI datasets and a real-world air pollutant forecasting dataset, show that the proposed method achieves higher prediction accuracy and better numerical stability than ELM, OS-ELM, meta-cognitive OS-ELM and weighted OS-ELM.
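To make the online stage more concrete, the following is a minimal sketch of a per-sample weighted OS-ELM update, written as weighted recursive least squares on a fixed random hidden layer. It is not the authors' implementation: the class name `WeightedOSELM`, the helper `dynamic_weight`, the `ridge` initialisation, the single-output ±1 target encoding, and in particular the rule that maps sensitivity and specificity to a weight are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
import random


class WeightedOSELM:
    """Single-output ELM trained one sample at a time by weighted RLS.

    Illustrative sketch only; names and defaults are hypothetical.
    """

    def __init__(self, n_inputs, n_hidden, ridge=1e-2, seed=0):
        rng = random.Random(seed)
        # Random hidden layer that is never retrained (the ELM idea).
        self.A = [[rng.gauss(0, 1) for _ in range(n_inputs)]
                  for _ in range(n_hidden)]
        self.b = [rng.gauss(0, 1) for _ in range(n_hidden)]
        # P starts as I / ridge, the inverse of a ridge-only Gram matrix
        # (assumed initialisation, not taken from the paper).
        self.P = [[(1.0 / ridge if i == j else 0.0) for j in range(n_hidden)]
                  for i in range(n_hidden)]
        self.beta = [0.0] * n_hidden

    def hidden(self, x):
        """Sigmoid hidden-layer output for one input vector."""
        return [1.0 / (1.0 + math.exp(-(sum(a * v for a, v in zip(row, x)) + b)))
                for row, b in zip(self.A, self.b)]

    def decision(self, x):
        return sum(h * b for h, b in zip(self.hidden(x), self.beta))

    def predict(self, x):
        return 1 if self.decision(x) >= 0 else -1

    def partial_fit(self, x, t, w=1.0):
        """Update on one sample (x, t) with weight w > 0."""
        h = self.hidden(x)
        Ph = [sum(p * hj for p, hj in zip(row, h)) for row in self.P]
        s = 1.0 / w + sum(hi * v for hi, v in zip(h, Ph))
        err = t - sum(hi * bi for hi, bi in zip(h, self.beta))
        # Gain vector is Ph / s; P is symmetric, so the rank-one
        # correction is the outer product of Ph with itself over s.
        self.beta = [bi + v / s * err for bi, v in zip(self.beta, Ph)]
        self.P = [[pij - Ph[i] * Ph[j] / s for j, pij in enumerate(row)]
                  for i, row in enumerate(self.P)]


def dynamic_weight(label, y_true, y_pred, minority=1):
    """Hypothetical weighting rule, NOT the rule from the paper.

    Up-weights the class whose recall currently lags behind the other's,
    using sensitivity and specificity measured on recent predictions.
    """
    pos = [p for t, p in zip(y_true, y_pred) if t == minority]
    neg = [p for t, p in zip(y_true, y_pred) if t != minority]
    sens = sum(p == minority for p in pos) / len(pos) if pos else 1.0
    spec = sum(p != minority for p in neg) / len(neg) if neg else 1.0
    if label == minority:
        return 1.0 + max(0.0, spec - sens)
    return 1.0 + max(0.0, sens - spec)
```

In use, one would keep a window of recent labels and predictions, call `dynamic_weight` for each arriving sample, and pass the result as `w` to `partial_fit`. Processing one sample at a time keeps the update free of matrix inversions; a chunk-wise variant would need to invert a small matrix per chunk instead.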
References
Yang Y, Han DQ (2016) A new distance-based total uncertainty measure in the theory of belief functions. Knowl-Based Syst 94:114–123
Mirza B, Lin Z, Toh KA (2013) Weighted online sequential extreme learning machine for class imbalance learning. Neural Process Lett 38:465–486
Chawla NV, Bowyer KW, Hall LO (2002) SMOTE: synthetic minority over-sampling technique. J Artif Intell Res 16:321–357
Yang Z, Qiao L, Peng X (2007) Research on datamining method for imbalanced dataset based on improved SMOTE. Acta Electron Sin 12:22–26
Liu Y, Liu S, Liu T, Wang Z (2014) New oversampling algorithm DB_SMOTE. Comput Eng Appl 50:92–95
Wang X, Aamir R, Fu A (2015) Fuzziness based sample categorization for classifier performance improvement. J Intell Fuzzy Syst 29:1185–1196
Krawczyk B, Woźniak M, Schaefer G (2014) Cost-sensitive decision tree ensembles for effective imbalanced classification. Appl Soft Comput J 14:554–562
Cervantes J, García-Lamont F, López A, Rodriguez L, Castilla JSR, Trueba A (2015) PSO-based method for SVM classification on skewed data-sets. Lecture notes in computer science, vol 9227. Springer International Publishing, pp 79–86. doi:10.1007/978-3-319-22053-6_9
Wang X, Xing H, Li Y, Li Y (2015) A study on relationship between generalization abilities and fuzziness of base classifiers in ensemble learning. IEEE Trans Fuzzy Syst 23(5):1638–1654
Wang X (2015) Uncertainty in learning from big data-editorial. J Intell Fuzzy Syst 28(5):2329–2330
Huang G, Zhou H, Ding X (2012) Extreme learning machine for regression and multiclass classification. IEEE Trans Syst Man Cybern Part B Cybern 42:513–529
Feng G, Huang GB, Lin Q (2009) Error minimized extreme learning machine with growth of hidden nodes and incremental learning. IEEE Trans Neural Netw 20:1352–1357
Miche Y, Sorjamaa A, Bas P (2010) OP-ELM: optimally pruned extreme learning machine. IEEE Trans Neural Netw 21:158–162
Lu S, Wang X, Zhang G, Zhou X (2015) Effective algorithms of the Moore–Penrose inverse matrices for extreme learning machine. Intell Data Anal 19(4):743–760
Wang X, Shao Q, Qing M, Zhai J (2013) Architecture selection for networks trained with extreme learning machine using localized generalization error model. Neurocomputing 102:3–9
Liang N, Huang GB (2006) A fast and accurate online sequential learning algorithm for feedforward networks. IEEE Trans Neural Netw 17:1411–1423
Mirza B, Lin Z, Liu N (2015) Ensemble of subset online sequential extreme learning machine for class imbalance and concept drift. Neurocomputing 149:316–329
Chiu CC (2013) Online sequential prediction of minority class of suspended particulate matters by meta-cognitive OS-ELM. Master's thesis, University of Macau
Yen SJ, Lee YS (2009) Cluster-based under-sampling approaches for imbalanced data distributions. Expert Syst Appl 36:5718–5727
Zhu W, Miao J, Qing L (2014) Robust regression with extreme support vectors. Pattern Recognit Lett 45:205–210
Zhu W, Miao J, Qing L (2014) Extreme support vector regression. In: Sun F, Toh K-A, Romay MG, Mao K (eds) Extreme learning machines 2013: algorithms and applications. Springer International Publishing, pp 25–34
Leng Q, Qi H, Miao J, Zhu W, Su G (2015) One-class classification with extreme learning machine. Math Probl Eng 2015:1–11
Yang L, Zhang R (2012) Online sequential ELM algorithm and its improvement. J Northwest Univ (Nat Sci Edn) 42:885–896
Zhang X (2005) Matrix analysis and application. Tsinghua University Press, Beijing
Vong CM, Ip WF, Wong PK, Chiu CC (2014) Predicting minority class for suspended particulate matters level by extreme learning machine. Neurocomputing 128:136–144
SMG E-publication Download Page (2013) http://www.smg.gov.mo/www/ccaa/pdf/e_pdf_download.php
Newman DJ, Hettich S, Blake CL, Merz CJ UCI Repository of machine learning databases. http://www.ics.uci.edu/mlearn/MLRepository.html. Irvine, CA: University of California, Department of Information and Computer Science
Acknowledgments
We thank C.M. Vong, an author of [25], for useful discussions and guidance. This work was supported by the National Natural Science Foundation of China (No. U1204609), the Postdoctoral Science Foundation of China (No. 2014M550508), the funding scheme of University Science & Technology Innovation in Henan Province (No. 15HASTIT022), the funding scheme of University Young Core Instructor in Henan Province (No. 2014GGJS-046) and the foundation of Henan Normal University for Excellent Young Teachers (No. 14YQ007).
Cite this article
Mao, W., Wang, J. & Xue, Z. An ELM-based model with sparse-weighting strategy for sequential data imbalance problem. Int. J. Mach. Learn. & Cyber. 8, 1333–1345 (2017). https://doi.org/10.1007/s13042-016-0509-z
DOI: https://doi.org/10.1007/s13042-016-0509-z