
Online Extreme Learning Machine with Hybrid Sampling Strategy for Sequential Imbalanced Data

  • Published in: Cognitive Computation

Abstract

In real applications of cognitive computation, data with imbalanced classes often arrive sequentially. In this situation, many current machine learning algorithms, e.g., the support vector machine, achieve weak classification performance, especially on the minority class. To address this problem, a new hybrid sampling online extreme learning machine (ELM) for sequential imbalanced data is proposed in this paper. The key idea is to keep the majority and minority classes balanced while preserving the sequential distribution characteristics of the original data. The method consists of two stages. At the offline stage, we introduce the principal curve to build confidence regions of the minority and majority classes, respectively. Based on these two confidence zones, over-sampling of the minority class and under-sampling of the majority class are conducted to generate new synthetic samples, and the initial ELM model is then established. At the online stage, we first select the most valuable synthetic majority-class samples according to sample importance. Afterwards, a new online fast leave-one-out cross-validation (LOO CV) algorithm utilizing Cholesky decomposition is proposed to determine whether to update the ELM network weights online. We also prove theoretically that the proposed method has an upper bound on information loss. Experimental results on seven UCI datasets and one real-world air pollutant forecasting dataset show that, compared with ELM, OS-ELM, meta-cognitive OS-ELM, and OS-ELM with the SMOTE strategy, the proposed method simultaneously improves the classification performance of the minority and majority classes in terms of accuracy, G-mean value, and ROC curve. In conclusion, the proposed hybrid sampling online ELM can be effectively applied to the sequential imbalanced data problem with better generalization performance and numerical stability.
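For orientation, the sketch below illustrates the generic building blocks named in the abstract: a regularized ELM fit at the offline stage, an OS-ELM-style recursive update at the online stage, and a PRESS-type fast leave-one-out error computed through a Cholesky factorization. This is a minimal sketch under simplifying assumptions (sigmoid hidden layer, ridge parameter lam, illustrative function names), not the authors' implementation, which additionally performs the principal-curve-based hybrid sampling and sample-importance selection described above.

```python
# Minimal sketch (assumptions: sigmoid hidden layer, ridge parameter lam,
# illustrative function names). Not the authors' exact algorithm.
import numpy as np

def hidden(X, W, b):
    """Random-feature hidden layer: H = sigmoid(X W + b)."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def elm_init(H0, T0, lam=1e-2):
    """Offline stage: ridge ELM solution beta = (H'H + lam*I)^-1 H'T."""
    L = H0.shape[1]
    P = np.linalg.inv(H0.T @ H0 + lam * np.eye(L))  # kept for the online update
    beta = P @ (H0.T @ T0)
    return P, beta

def oselm_update(P, beta, Hk, Tk):
    """Online stage: OS-ELM recursive least-squares update for a new data chunk."""
    S = np.linalg.inv(np.eye(Hk.shape[0]) + Hk @ P @ Hk.T)
    P = P - P @ Hk.T @ S @ Hk @ P
    beta = beta + P @ Hk.T @ (Tk - Hk @ beta)
    return P, beta

def loo_press_error(H, T, lam=1e-2):
    """Fast LOO estimate via the PRESS statistic, using a Cholesky factor of H'H + lam*I."""
    T = T.reshape(len(T), -1)                          # targets as an N x m matrix
    L = H.shape[1]
    C = np.linalg.cholesky(H.T @ H + lam * np.eye(L))
    A = np.linalg.solve(C.T, np.linalg.solve(C, H.T))  # A = (H'H + lam*I)^-1 H'
    hat_diag = np.einsum('ij,ji->i', H, A)             # diagonal of the hat matrix
    residual = T - H @ (A @ T)                         # training residuals
    return np.mean((residual / (1.0 - hat_diag)[:, None]) ** 2)
```

A typical call sequence would be elm_init on the initial (resampled) chunk, followed by oselm_update for each incoming chunk, with loo_press_error used to judge whether the updated weights should be retained.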


References

  1. Liu H, Sun F, Gao D, et al. Structured output-associated dictionary learning for haptic understanding. IEEE Trans Syst Man Cybern Syst. 2017;47(7):1564–74.

  2. Deng W, Zheng Q, Wang Z. Cross-person activity recognition using reduced kernel extreme learning machine. Neural Netw. 2014;53:1–7.

  3. Liu H, Sun F, Fang B, et al. Robotic room-level localization using multiple sets of sonar measurements. IEEE Trans Instrum Meas. 2017;66(1):2–13.

  4. Liu H, Yu Y, Sun F, et al. Robotic room-level localization using multiple sets of sonar measurements. IEEE Trans Autom Sci Eng. 2017;14(2):996–1008.

  5. Xu R, Chen T, Xia Y, et al. Word embedding composition for data imbalances in sentiment and emotion classification. Cognitive Computation. 2015;7:226–40.

  6. Xiong S, Meng F, Liu B, et al. A kernel clustering-based possibilistic fuzzy extreme learning machine for class imbalance learning. Cognitive Computation. 2015;7(1):74–85.

  7. Ou W, Yuan D, Li D, et al. Patch-based visual tracking with online representative sample selection. J Electron Imaging. 2017;26(3):33006.

  8. Batista GEAPA, Prati RC, Monard MC. A study of the behavior of several methods for balancing machine learning training data. SIGKDD Explor Newsl. 2004;6:20–9.

  9. Chawla NV, Bowyer KW, Hall LO, et al. SMOTE: synthetic minority over-sampling technique. Journal of Artificial Intelligence Research. 2002;16(1):321–57.

  10. Yang Z, Qiao L, Peng X. Research on data mining method for imbalanced dataset based on improved SMOTE. Acta Electronica Sinica. 2007;12(A):22–6.

  11. Zeng Z, Wu Q, Liao B, et al. A classification method for imbalance data set based on kernel SMOTE. Acta Electronica Sinica. 2009;37(11):2489–95.

  12. Jeatrakul P, Wong KW, Fung CC. Classification of imbalanced data by combining the complementary neural network and SMOTE algorithm. Neural Information Processing. 2010;6444:152–9.

  13. Zhai Y, Ma N, Ruan D. An effective over-sampling method for imbalanced data sets classification. Chin J Electron. 2011;20(3):489–94.

  14. Ducange P, Lazzerini B, Marcelloni F. Multi-objective genetic fuzzy classifiers for imbalanced and cost-sensitive datasets. Soft Comput. 2010;14:713–28.

  15. Wu G, Chang EY. KBA: kernel boundary alignment considering imbalanced data distribution. IEEE Trans Knowl Data Eng. 2005;17:786–95.

  16. Estabrooks A, Jo T, Japkowicz N. A multiple resampling method for learning from imbalanced datasets. Comput Intell. 2004;20:18–36.

  17. Huang GB. What are extreme learning machines? Filling the gap between Frank Rosenblatt's dream and John von Neumann's puzzle. Cognitive Computation. 2015;7:263–78.

  18. Huang GB, Zhu QY, Siew CK. Extreme learning machine: a new learning scheme of feedforward neural networks. Proceedings of the International Joint Conference on Neural Networks (IJCNN 2004). 2004;2:985–90.

  19. Huang GB, Zhou H, Ding X, Zhang R. Extreme learning machine for regression and multiclass classification. IEEE Trans Syst Man Cybern B Cybern. 2012;42(2):513–29.

  20. Zong W, Huang GB, Chen Y. Weighted extreme learning machine for imbalanced learning. Neurocomputing. 2013;101(3):229–42.

  21. Liang NY, Huang GB, Saratchandran P. A fast and accurate online sequential learning algorithm for feedforward networks. IEEE Trans Neural Netw. 2006;17:1411–23.

  22. Vong CM, Ip WF, Wong PK, Chiu CC. Predicting minority class for suspended particulate matters level by extreme learning machine. Neurocomputing. 2014;128:136–44.

  23. Mirza B, Lin Z, Toh KA. Weighted online sequential extreme learning machine for class imbalance learning. Neural Process Lett. 2013;38:465–86.

  24. Wang S, Minku LL, Yao X. Resampling-based ensemble methods for online class imbalance learning. IEEE Trans Knowl Data Eng. 2015;27(5):1356–68.

  25. Mirza B, Lin Z, Liu N. Ensemble of subset online sequential extreme learning machine for class imbalance and concept drift. Neurocomputing. 2015;149(A):316–29.

  26. Zhang Y, Liu B, Cai J, Zhang S. Ensemble weighted extreme learning machine for imbalanced data classification based on differential evolution. Neural Comput Applic. 2016:1–9.

  27. Yuan P, Ma H, Fu H. Hotspot-entropy based data forwarding in opportunistic social networks. Pervasive and Mobile Computing. 2015;16(A):136–54.

  28. Mao W, Wang J, He L, et al. Online sequential prediction of imbalance data with two-stage hybrid strategy by extreme learning machine. Neurocomputing. 2017;261:94–105.

  29. Liu H, Qin J, Sun F, et al. Extreme kernel sparse learning for tactile object recognition. IEEE Transactions on Cybernetics. 2017 (in press).

  30. Cao J, Zhao T, Wang J, et al. Excavation equipment classification based on improved MFCC features and ELM. Neurocomputing. 2017 (in press).

  31. Huang Z, Yu Y, Gu J, et al. An efficient method for traffic sign recognition based on extreme learning machine. IEEE Transactions on Cybernetics. 2016;47(4):920–33.

  32. Lan Y, Soh YC, Huang GB. Two-stage extreme learning machine for regression. Neurocomputing. 2010;73(16–18):3028–38.

  33. Feng G, Huang GB, Lin Q, Gay R. Error minimized extreme learning machine with growth of hidden nodes and incremental learning. IEEE Transactions on Neural Networks. 2009;20(8):1352–7.

  34. Mao W, Tian M, Cao X, Xu J. Model selection of extreme learning machine based on multi-objective optimization. Neural Comput Applic. 2013;22(3–4):521–9.

  35. Cao J, Zhang K, Luo M, et al. Extreme learning machine and adaptive sparse representation for image classification. Neural Netw. 2016;81:91–102.

  36. Heeswijk M, Miche Y. Binary/ternary extreme learning machines. Neurocomputing. 2015;149:187–97.

  37. Liu X, Li P, Gao C. Fast leave-one-out cross-validation algorithm for extreme learning machine. Journal of Shanghai Jiaotong University. 2011;45(8):6–11.

  38. Hastie T, Stuetzle W. Principal curves and surfaces. Technical Report 11, Department of Statistics, Stanford University; 1984.

  39. Hermann T, Meinicke P, Ritter H. Principal curve sonification. Proceedings of the International Conference on Auditory Display. Atlanta; 2000. p. 81–6.

  40. Kégl B, Krzyzak A, Linder T, Zeger K. Learning and design of principal curves. IEEE Trans Pattern Anal Mach Intell. 2000;22(3):281–97.

  41. Zhang J, Wang J. An overview of principal curves. Chinese Journal of Computers. 2003;26(2):129–46.

  42. Zhang X, Wang L. Incremental regularized extreme learning machine based on Cholesky factorization and its application to time series prediction. Acta Phys Sin. 2011;11:7–12.

  43. Vong CM, Ip WF, Chiu CC, Wong PK. Imbalanced learning for air pollution by meta-cognitive online sequential extreme learning machine. Cognitive Computation. 2015;7:381–91.

  44. Yang Z, Qiao L, Peng X. Research on data mining method for imbalanced dataset based on improved SMOTE. Acta Electronica Sinica. 2007;12A(35):22–6.

  45. Newman DJ, Hettich S, Blake CL, et al. UCI Repository of machine learning databases [http://www.ics.uci.edu/mlearn/MLRepository.html]. Irvine: University of California, Department of Information and Computer Science.


Acknowledgements

We wish to thank C.M. Vong, the author of [43], and Dr. Yuan, the author of [27], for helpful discussions and instruction.

Author information


Corresponding author

Correspondence to Wentao Mao.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Informed Consent

Informed consent was obtained from all individual participants included in the study.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Funding

This work was supported by the National Natural Science Foundation of China (Nos. 61572399 and U1204609), the China Postdoctoral Science Foundation Specific Support (No. 2016T90944), the funding scheme of the University Science and Technology Innovation in Henan Province (No. 15HASTIT022), the funding scheme of University Young Core Instructor in Henan Province (No. 2014GGJS-046), the Foundation of Henan Normal University for Excellent Young Teachers (No. 14YQ007), the Major Science and Technology Foundation in Guangdong Province of China (No. 2015B010104002), and the Key Scientific Research Foundation of Henan Provincial University (Nos. 16A520015 and 15A520078).


About this article


Cite this article

Mao, W., Jiang, M., Wang, J. et al. Online Extreme Learning Machine with Hybrid Sampling Strategy for Sequential Imbalanced Data. Cogn Comput 9, 780–800 (2017). https://doi.org/10.1007/s12559-017-9504-2
