Abstract
To address the problem that imbalanced advertising-user data in social networks leads to unsatisfactory prediction results, we propose a prediction model for advertising users that combines K-Means, the synthetic minority oversampling technique (SMOTE), and ensemble learning. Using real user data provided by Scholat, we analyzed the data and extracted key features from it to draw a portrait of advertising users. Our algorithm first clusters the minority class, then processes the continuous and discrete features of each sample separately with an improved SMOTE to synthesize new minority samples, and finally constructs an integrated classifier using ensemble learning. This method effectively avoids the blurring of the boundary between the positive and negative classes caused by SMOTE, and it overcomes SMOTE's inability to process discrete features. Meanwhile, ensemble learning enables the classifier to produce more reasonable results and reduces the overall error. The experimental results show that our method improves the quality of the generated minority-class samples and significantly improves the prediction performance for advertising users.
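The oversampling stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the improved SMOTE treats discrete features, so the sketch assumes a majority vote over same-cluster neighbours (similar in spirit to SMOTE-NC), while continuous features are interpolated as in standard SMOTE. The function names `kmeans` and `k_smote` and all parameters are hypothetical, and the ensemble classifier is omitted.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two continuous feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Move each center to the mean of its members (unchanged if empty).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def k_smote(cont, disc, k=2, n_new=10, n_neighbors=3, seed=0):
    """Synthesize n_new minority samples from clustered minority data.

    cont: continuous feature vectors of the minority samples
    disc: discrete feature tuples of the same samples, in the same order
    Assumption: discrete values are chosen by majority vote among the
    seed sample and its nearest same-cluster neighbours.
    """
    rng = random.Random(seed)
    labels = kmeans(cont, k, seed=seed)
    clusters = [[i for i, lab in enumerate(labels) if lab == c]
                for c in range(k)]
    # A cluster needs at least two members to interpolate between them.
    clusters = [c for c in clusters if len(c) >= 2]
    if not clusters:
        raise ValueError("no cluster has enough samples to oversample")
    synthetic = []
    for _ in range(n_new):
        members = rng.choice(clusters)
        i = rng.choice(members)
        # Nearest neighbours of sample i, restricted to its own cluster.
        neighbours = sorted((j for j in members if j != i),
                            key=lambda j: sq_dist(cont[i], cont[j]))[:n_neighbors]
        j = rng.choice(neighbours)
        gap = rng.random()
        # Continuous features: a random point on the segment from i to j.
        new_cont = [a + gap * (b - a) for a, b in zip(cont[i], cont[j])]
        # Discrete features: most common value among i and its neighbours.
        pool = [disc[i]] + [disc[n] for n in neighbours]
        new_disc = tuple(Counter(col).most_common(1)[0][0]
                         for col in zip(*pool))
        synthetic.append((new_cont, new_disc))
    return synthetic
```

Because interpolation happens only between samples in the same cluster, synthetic points stay inside regions already occupied by the minority class, which is the intuition behind avoiding blurred class boundaries. The resampled data would then be passed to whichever ensemble classifier is chosen.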
Acknowledgements
We thank the anonymous reviewers for their insightful comments. This work was supported by the National Natural Science Foundation of China under grant numbers U1811263 and 6177221.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Qiu, Z., Zhou, Z., Long, Y., Ji, C., Li, J., Tang, Y. (2022). Detection of Advertising Users Based on K-SMOTE and Ensemble Learning. In: Zu, Q., Tang, Y., Mladenovic, V., Naseer, A., Wan, J. (eds) Human Centered Computing. HCC 2021. Lecture Notes in Computer Science, vol 13795. Springer, Cham. https://doi.org/10.1007/978-3-031-23741-6_12
DOI: https://doi.org/10.1007/978-3-031-23741-6_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-23740-9
Online ISBN: 978-3-031-23741-6
eBook Packages: Computer Science, Computer Science (R0)