An effective cost-sensitive sparse online learning framework for imbalanced streaming data classification and its application to online anomaly detection

  • Regular Paper
Knowledge and Information Systems

Abstract

Class imbalance is one of the most challenging problems in streaming data mining because it degrades the predictive capability of online models. Most existing online learning approaches lack an effective mechanism for handling high-dimensional streaming data with skewed class distributions, which results in deteriorated model performance and limited interpretability. In this paper, we develop a cost-sensitive regularized dual averaging (CSRDA) method to tackle this problem. Our proposed method substantially extends the influential regularized dual averaging method by formulating a new convex optimization function, in which four \(\ell_1\)-norm regularized cost-sensitive objective functions are each directly optimized. We then theoretically analyze CSRDA’s regret bounds and the bounds of its primal variables, demonstrating that CSRDA and its variants achieve theoretical convergence in terms of balanced cost and sparsity when handling severely imbalanced, high-dimensional streaming data. To validate the proposed methods, we conduct extensive experiments on six benchmark streaming datasets with varied imbalance ratios and on three online anomaly detection tasks. The experimental results demonstrate that, compared to baseline methods, CSRDA and its variants not only improve classification performance but also capture sparse features more effectively, and hence potentially offer better model interpretability.
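To make the idea in the abstract concrete, the following is a minimal sketch of \(\ell_1\)-regularized dual averaging combined with a class-weighted hinge loss. It is an illustration under stated assumptions, not the CSRDA algorithm itself: the abstract does not specify the four objective functions, so the weighted hinge loss, the function name `cost_sensitive_rda`, and the parameters `lam`, `gamma`, `cost_pos`, and `cost_neg` are all hypothetical choices. The closed-form weight update follows the standard \(\ell_1\)-RDA soft-thresholding rule with a \(\gamma\sqrt{t}\) step schedule.

```python
import numpy as np

def cost_sensitive_rda(X, y, lam=0.01, gamma=1.0, cost_pos=0.9, cost_neg=0.1):
    """One pass of l1-regularized dual averaging with a class-weighted hinge loss.

    X: (n, d) array of streaming examples, processed in order.
    y: (n,) array of labels in {-1, +1}.
    cost_pos / cost_neg: assumed misclassification costs for each class
    (illustrative; not taken from the paper).
    """
    n, d = X.shape
    w = np.zeros(d)       # primal weights
    g_bar = np.zeros(d)   # running average of subgradients (dual average)
    for t in range(1, n + 1):
        x_t, y_t = X[t - 1], y[t - 1]
        cost = cost_pos if y_t > 0 else cost_neg
        # Subgradient of cost * max(0, 1 - y * <w, x>) with respect to w.
        if y_t * (w @ x_t) < 1.0:
            g = -cost * y_t * x_t
        else:
            g = np.zeros(d)
        # Incremental update of the average subgradient.
        g_bar += (g - g_bar) / t
        # Soft-threshold the average: coordinates with |g_bar| <= lam become 0.
        shrunk = np.sign(g_bar) * np.maximum(np.abs(g_bar) - lam, 0.0)
        w = -(np.sqrt(t) / gamma) * shrunk
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(2000, 50))
    # Imbalanced labels: positives occur only when feature 0 is large.
    y = np.where(X[:, 0] > 1.3, 1, -1)
    w = cost_sensitive_rda(X, y, lam=0.02)
    print("nonzero weights:", int(np.count_nonzero(w)), "of", w.size)
```

Because the threshold is applied to the averaged subgradient rather than to each individual step, coordinates whose average stays within `lam` remain exactly zero, which is the mechanism by which dual averaging methods of this kind produce sparse models. Raising `cost_pos` relative to `cost_neg` makes errors on the minority class contribute more to that average.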

Acknowledgements

This publication was made possible by funding from the DOD ARO Grant #W911NF-20-1-0249.

Author information

Corresponding author

Correspondence to Kun Zhang.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chen, Z., Sheng, V., Edwards, A. et al. An effective cost-sensitive sparse online learning framework for imbalanced streaming data classification and its application to online anomaly detection. Knowl Inf Syst 65, 59–87 (2023). https://doi.org/10.1007/s10115-022-01745-x

  • DOI: https://doi.org/10.1007/s10115-022-01745-x