Abstract
The objective of this paper is to improve ensemble learning for imbalanced problems. A multi-matrices approach and a nearest-entropy measure are introduced into the base-classifier model in order to exploit the spatial information of the data and the geometric relations between instances. Our method uses a variety of matrix shapes to mine latent information in the data, and it constructs a regularization term that measures the neighboring relationships among instances with entropy to enhance the stability of the decision boundary. Since different matrix shapes carry distinct spatial information, the original vector-oriented data are reorganized into multiple matrix shapes to expose that information. The nearest entropy measures the local certainty of each instance, so that stable instances can be selected for training through the new regularization term. To assess the benefit of introducing the multi-matrices and the entropy, several ensemble learning methods with a similar ensemble strategy, together with variants of linear classification models, are compared in experiments on 55 binary classification datasets from the KEEL benchmark.
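As a rough illustration of the two ingredients the abstract describes (not the authors' implementation), the sketch below reorganizes a feature vector into several matrix shapes and scores the local certainty of an instance with a Shannon entropy over its k nearest neighbours. The function names, the Euclidean k-NN, and the choice of shapes are assumptions made for illustration only:

```python
import numpy as np
from collections import Counter

def matrix_views(x, shapes):
    # Reorganize a feature vector into several matrix shapes; each
    # (rows, cols) pair with rows * cols == len(x) yields a distinct
    # spatial view of the same data.
    return [x.reshape(r, c) for r, c in shapes if r * c == x.size]

def nearest_entropy(X, y, i, k=5):
    # Shannon entropy of the class distribution among the k nearest
    # neighbours of instance i; low entropy marks a locally "stable"
    # instance whose neighbourhood is dominated by one class.
    dist = np.linalg.norm(X - X[i], axis=1)
    nn = np.argsort(dist)[1:k + 1]        # skip the instance itself
    counts = np.array(list(Counter(y[nn]).values()), dtype=float)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 12))             # 20 instances, 12 features each
y = np.array([0] * 10 + [1] * 10)         # binary labels

views = matrix_views(X[0], [(3, 4), (4, 3), (2, 6)])
entropy0 = nearest_entropy(X, y, 0)
```

In this reading, a base classifier would be trained on each matrix view, and instances with low `nearest_entropy` would be favoured by the entropy-based regularization term.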
Acknowledgements
This work was supported by the Natural Science Foundation of China under Grant No. 61672227, the "Shuguang Program" of the Shanghai Education Development Foundation and the Shanghai Municipal Education Commission, the National Science Foundation of China for Distinguished Young Scholars under Grant No. 61725301, the National Key R&D Program of China under Grant No. 2018YFC0910500, and the Natural Science Foundation of China under Grant No. 61806078.
Ethics declarations
Conflict of interest
The authors declare that there are no conflicts of interest between this manuscript and other published works.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Wang, Z., Chen, Z., Zhu, Y. et al. Multi-matrices entropy discriminant ensemble learning for imbalanced problem. Neural Comput & Applic 32, 8245–8264 (2020). https://doi.org/10.1007/s00521-019-04306-6