Abstract
Many existing works have studied learning on imbalanced data; however, handling high-dimensional imbalanced data remains very challenging. A key difficulty is that most learning models are biased toward the majority class, and their performance deteriorates in the presence of underrepresented data and severe class-distribution skew. One solution is to synthesize minority data to balance the class distribution, but this may increase overlap among classes, especially in the high-dimensional setting. To alleviate these challenges, in this paper we present a novel Rectified Encoder Network (REN) for high-dimensional imbalanced learning tasks. Our main contributions are: (1) to deal with high dimensionality, REN encodes high-dimensional imbalanced data into low-dimensional latent codes as a latent representation; (2) to obtain a discriminative representation, we introduce a Rectifier that matches the latent codes with our proposed Predefined Codes, disentangling the overlap among classes; (3) during rectification, we can efficiently identify and generate informative samples in the Predefined Latent Distribution to keep the class distribution balanced, so that minority classes are not neglected. Experimental results on several high-dimensional and image imbalanced data sets indicate that REN obtains good representation codes for classification, and visualizations illustrate why REN performs better in high-dimensional imbalanced learning.
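To make the pipeline described above concrete, here is a minimal toy sketch of the general idea: encode high-dimensional data into a low-dimensional latent space, pull each class's codes toward a class-specific target, and synthesize minority samples in the latent space to balance the classes. This is an illustrative assumption-laden stand-in, not the authors' implementation: PCA substitutes for the learned encoder, and the hand-picked class centres and the single rectification step only loosely mimic REN's Rectifier and Predefined Codes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy high-dimensional imbalanced data: 200 majority vs. 20 minority samples.
X_maj = rng.normal(0.0, 1.0, size=(200, 50))
X_min = rng.normal(2.0, 1.0, size=(20, 50))
X = np.vstack([X_maj, X_min])
y = np.array([0] * 200 + [1] * 20)

# Step 1: encode into a low-dimensional latent space.
# (PCA via SVD stands in for the learned encoder.)
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:2].T  # 2-D latent codes

# Step 2: shift each class's codes toward a predefined class centre,
# loosely mimicking rectification toward "Predefined Codes".
targets = {0: np.array([-3.0, 0.0]), 1: np.array([3.0, 0.0])}
alpha = 0.5  # rectification strength (illustrative)
for c, t in targets.items():
    Z[y == c] += alpha * (t - Z[y == c].mean(axis=0))

# Step 3: synthesize minority codes around the minority centre so the
# latent class distribution becomes balanced.
n_new = int((y == 0).sum() - (y == 1).sum())
centre = Z[y == 1].mean(axis=0)
spread = Z[y == 1].std(axis=0)
Z_new = rng.normal(centre, spread, size=(n_new, 2))

Z_bal = np.vstack([Z, Z_new])
y_bal = np.concatenate([y, np.ones(n_new, dtype=int)])
print((y_bal == 0).sum(), (y_bal == 1).sum())  # equal class counts
```

Balancing in the latent space rather than the input space is the point of the sketch: with 2 latent dimensions instead of 50 input dimensions, synthesized minority samples are less likely to land in overlapping regions.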
Acknowledgment
This work was supported by the National Key R&D Program of China (Grant No. 2017YFC0804003), the Program for Guangdong Introducing Innovative and Entrepreneurial Teams (Grant No. 2017ZT07X386), Shenzhen Peacock Plan (Grant No. KQTD2016112514355531), the Science and Technology Innovation Committee Foundation of Shenzhen (Grant Nos. ZDSYS201703031748284, JCYJ20180504165652917), the Program for University Key Laboratory of Guangdong Province (Grant No. 2017KSYS008), the ARC Future Fellowship ARC LP150100671, DP180100106, and National Natural Science Foundation of China (Grant Nos. 61603338, 61866010, 61703370).
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Zheng, T., Chen, W.J., Tsang, I., Yao, X. (2019). Rectified Encoder Network for High-Dimensional Imbalanced Learning. In: Nayak, A., Sharma, A. (eds) PRICAI 2019: Trends in Artificial Intelligence. PRICAI 2019. Lecture Notes in Computer Science, vol 11671. Springer, Cham. https://doi.org/10.1007/978-3-030-29911-8_53
DOI: https://doi.org/10.1007/978-3-030-29911-8_53
Print ISBN: 978-3-030-29910-1
Online ISBN: 978-3-030-29911-8