DOS-GAN: A Distributed Over-Sampling Method Based on Generative Adversarial Networks for Distributed Class-Imbalance Learning

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2020)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12454)

Abstract

Class-imbalance learning is an active research topic in machine learning. In practical distributed learning applications, data arrives continuously, which often produces class-imbalance situations. The imbalance problem in the distributed scenario has a particular character: the imbalanced states of different nodes may be complementary. Exploiting this complementarity through oversampling is a valuable way to correct the imbalance; however, the data-island constraint prevents nodes from sharing raw data. To this end, we propose DOS-GAN, in which multiple nodes take turns using their local data of the same class to train a global GAN model, and then use the GAN's generator to oversample that class without exchanging any original data. Extensive experiments confirm that DOS-GAN outperforms combinations of traditional methods and achieves classification accuracy close to that of data aggregation.
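The round-robin idea in the abstract can be sketched in a toy form: nodes holding private shards of the same minority class take turns updating a shared generative model, so only model parameters travel between nodes, and the jointly trained "generator" is then used to synthesize minority samples. The sketch below is purely illustrative and assumes a trivial Gaussian generator in place of a real GAN; the names `local_update` and `oversample` are hypothetical, not from the paper.

```python
import numpy as np

def local_update(params, local_data, lr=0.5):
    """One node nudges the shared model toward its local class statistics.

    Stand-in for a local GAN training step: raw samples never leave the node;
    only the updated parameters are passed on.
    """
    mean, std = params
    mean = mean + lr * (local_data.mean(axis=0) - mean)
    std = std + lr * (local_data.std(axis=0) - std)
    return mean, std

def oversample(params, n, rng):
    """Use the jointly trained 'generator' to synthesize minority samples."""
    mean, std = params
    return rng.normal(mean, std, size=(n, mean.shape[0]))

rng = np.random.default_rng(0)
# Three nodes, each with a small, private shard of the same minority class.
shards = [rng.normal(5.0, 1.0, size=(20, 2)) for _ in range(3)]

params = (np.zeros(2), np.ones(2))
for _ in range(10):          # global rounds
    for shard in shards:     # nodes take turns; only params are exchanged
        params = local_update(params, shard)

synthetic = oversample(params, 100, rng)
print(synthetic.shape)       # prints (100, 2)
```

The synthetic samples can then be pooled with each node's real data before training a classifier, which is the role the GAN generator plays in DOS-GAN.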


Notes

  1. http://yann.lecun.com/exdb/mnist.
  2. https://archive.ics.uci.edu/ml/datasets/kdd+cup+1999+data.
  3. https://archive.ics.uci.edu/ml/datasets/dataset+for+sensorless+drive+diagnosis.
  4. https://archive.ics.uci.edu/ml/datasets/Statlog+(Landsat+Satellite).
  5. https://archive.ics.uci.edu/ml/datasets/human+activity+recognition+using+smartphones.
  6. https://cs.stanford.edu/~acoates/stl10/.


Acknowledgment

The authors would like to thank the anonymous reviewers for their valuable comments.

Author information


Correspondence to Hongtao Guan.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Guan, H., Ma, X., Shen, S. (2020). DOS-GAN: A Distributed Over-Sampling Method Based on Generative Adversarial Networks for Distributed Class-Imbalance Learning. In: Qiu, M. (ed.) Algorithms and Architectures for Parallel Processing. ICA3PP 2020. Lecture Notes in Computer Science, vol. 12454. Springer, Cham. https://doi.org/10.1007/978-3-030-60248-2_42
