Rectified Encoder Network for High-Dimensional Imbalanced Learning

  • Conference paper
PRICAI 2019: Trends in Artificial Intelligence (PRICAI 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11671)

Abstract

Learning from imbalanced data has been studied extensively, but handling high-dimensional imbalanced data remains very challenging. One key challenge is that most learning models are biased towards the majority class, so their performance deteriorates in the presence of underrepresented data and severe class distribution skews. One solution is to synthesize minority data to balance the class distribution, but this may increase class overlap, especially in the high-dimensional setting. To alleviate these challenges, we present a novel Rectified Encoder Network (REN) for high-dimensional imbalanced learning tasks. The main contributions are: (1) To deal with high dimensionality, REN encodes high-dimensional imbalanced data into low-dimensional latent codes as a latent representation. (2) To obtain a discriminative representation, we introduce a Rectifier that matches the latent codes with our proposed Predefined Codes, which disentangles the overlap among classes. (3) During rectification, in the Predefined Latent Distribution, we can efficiently identify and generate informative samples to maintain the balance of the class distribution, so that the minority classes are not neglected. Experimental results on several high-dimensional and image imbalanced data sets indicate that REN obtains good representation codes for classification, and visualizations show why REN performs better in high-dimensional imbalanced learning.
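The abstract does not specify the encoder architecture, the Rectifier's loss, or the sampling rule, so the following is only a minimal sketch of contributions (2) and (3) under stated assumptions, not the authors' implementation. It assumes the Predefined Codes are scaled one-hot vectors (one per class), that rectification can be measured as the mean squared distance between a latent code and its class's predefined code, and that balancing is done by drawing Gaussian samples around the predefined codes of minority classes. The function names `predefined_codes`, `rectification_loss`, and `balance_in_latent_space` are hypothetical.

```python
import random

def predefined_codes(num_classes, scale=5.0):
    # Assumption: one well-separated target code per class (scaled one-hot).
    return [[scale if j == k else 0.0 for j in range(num_classes)]
            for k in range(num_classes)]

def rectification_loss(latents, labels, codes):
    # Assumption: mean squared distance between each latent code and the
    # predefined code of its class; smaller means better matched.
    total = 0.0
    for z, y in zip(latents, labels):
        total += sum((zi - ci) ** 2 for zi, ci in zip(z, codes[y]))
    return total / len(latents)

def balance_in_latent_space(latents, labels, codes, noise=0.1, seed=0):
    # Assumption: generate extra latent codes for each minority class by
    # sampling Gaussian noise around that class's predefined code until
    # every class has as many codes as the largest class.
    rng = random.Random(seed)
    counts = {k: labels.count(k) for k in range(len(codes))}
    target = max(counts.values())
    new_latents, new_labels = list(latents), list(labels)
    for k, n in counts.items():
        for _ in range(target - n):
            new_latents.append([c + rng.gauss(0.0, noise) for c in codes[k]])
            new_labels.append(k)
    return new_latents, new_labels

# Toy latent codes, standing in for the output of an encoder (not shown).
codes = predefined_codes(2)
latents = [[4.8, 0.1], [5.2, -0.2], [5.0, 0.0], [0.3, 4.9]]
labels = [0, 0, 0, 1]  # class 1 is the minority

loss = rectification_loss(latents, labels, codes)
balanced_latents, balanced_labels = balance_in_latent_space(latents, labels, codes)
```

In this toy setting the sampling happens in the low-dimensional latent space rather than in the input space, which is the design idea the abstract contrasts with synthesizing minority data directly in high dimensions.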

Notes

  1. http://archive.ics.uci.edu/ml/index.php

Acknowledgment

This work was supported by the National Key R&D Program of China (Grant No. 2017YFC0804003), the Program for Guangdong Introducing Innovative and Entrepreneurial Teams (Grant No. 2017ZT07X386), Shenzhen Peacock Plan (Grant No. KQTD2016112514355531), the Science and Technology Innovation Committee Foundation of Shenzhen (Grant Nos. ZDSYS201703031748284, JCYJ20180504165652917), the Program for University Key Laboratory of Guangdong Province (Grant No. 2017KSYS008), the ARC Future Fellowship ARC LP150100671, DP180100106, and National Natural Science Foundation of China (Grant Nos. 61603338, 61866010, 61703370).

Author information

Corresponding author

Correspondence to Xin Yao.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Zheng, T., Chen, WJ., Tsang, I., Yao, X. (2019). Rectified Encoder Network for High-Dimensional Imbalanced Learning. In: Nayak, A., Sharma, A. (eds) PRICAI 2019: Trends in Artificial Intelligence. PRICAI 2019. Lecture Notes in Computer Science, vol 11671. Springer, Cham. https://doi.org/10.1007/978-3-030-29911-8_53

  • DOI: https://doi.org/10.1007/978-3-030-29911-8_53

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-29910-1

  • Online ISBN: 978-3-030-29911-8

  • eBook Packages: Computer Science, Computer Science (R0)
