PUED: A Social Spammer Detection Method Based on PU Learning and Ensemble Learning

Song, Yuqi; Gao, Min; Yu, Junliang; Li, Wentao; Yu, Lulan; Xiao, Xinyu

doi:10.1007/978-3-030-00916-8_14

Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 252)

Included in the following conference series:

International Conference on Collaborative Computing: Networking, Applications and Worksharing

Abstract

People in social networks tend to share information with one another, so those who use a social network frequently are more likely to be influenced by the interests and opinions of others. Spammers exploit this characteristic, spreading spam through the network for profit and seriously disturbing normal users. Many notable studies have addressed social spammer detection, and their methods fall into three types: unsupervised, supervised, and semi-supervised. Although supervised and semi-supervised methods achieve superior detection accuracy, they usually suffer from imbalanced data, because in real situations labeled normal users far outnumber labeled spammers. To address this problem, we propose a novel method that relies only on normal users to detect spammers accurately. The method has two steps: the first uses a voting classifier to pick out reliable spammers from the unlabeled samples, and the second trains a random forest detector on the normal users and the reliable spammers. We conduct experiments on two real-world social datasets and show that our method outperforms other supervised methods.
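The two steps described in the abstract can be illustrated with a minimal Python sketch using scikit-learn. This is not the authors' implementation: the abstract does not specify the base learners, the selection rule, or any parameters, so the function name `pu_two_step`, the three base classifiers, the trick of provisionally labeling all unlabeled users as spammers, and the `spam_threshold` value are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier


def pu_two_step(X_normal, X_unlabeled, spam_threshold=0.9, seed=0):
    """Hypothetical two-step PU sketch; labeled normal users are class 0."""
    # Step 1 (assumed procedure): provisionally label every unlabeled user
    # as a spammer (class 1) and fit a soft-voting ensemble on both groups.
    X = np.vstack([X_normal, X_unlabeled])
    y = np.concatenate([
        np.zeros(len(X_normal), dtype=int),
        np.ones(len(X_unlabeled), dtype=int),
    ])
    voter = VotingClassifier(
        estimators=[
            ("lr", LogisticRegression(max_iter=1000)),
            ("nb", GaussianNB()),
            ("dt", DecisionTreeClassifier(max_depth=3, random_state=seed)),
        ],
        voting="soft",
    )
    voter.fit(X, y)

    # Unlabeled users that the ensemble scores as strongly spam-like are
    # kept as "reliable spammers"; the threshold is an illustrative choice.
    spam_score = voter.predict_proba(X_unlabeled)[:, 1]
    reliable = X_unlabeled[spam_score >= spam_threshold]
    if len(reliable) == 0:
        raise ValueError("no reliable spammers found; lower spam_threshold")

    # Step 2: train a random forest on normal users vs. reliable spammers.
    X2 = np.vstack([X_normal, reliable])
    y2 = np.concatenate([
        np.zeros(len(X_normal), dtype=int),
        np.ones(len(reliable), dtype=int),
    ])
    detector = RandomForestClassifier(n_estimators=100, random_state=seed)
    detector.fit(X2, y2)
    return detector, reliable
```

The returned `detector` can then label new users with `detector.predict(features)`, where 1 means spammer. In practice the feature matrices would hold per-user features extracted from the social network; which features to use is outside the scope of this sketch.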

Acknowledgments

The work is supported by the Basic and Advanced Research Projects in Chongqing under Grant No. cstc2015jcyjA40049, the National Key Basic Research Program of China (973) under Grant No. 2013CB328903, the Guangxi Science and Technology Major Project under Grant No. GKAA17129002, and the Graduate Scientific Research and Innovation Foundation of Chongqing, China under Grant No. CYS17035.

Author information

Authors and Affiliations

Key Laboratory of Dependable Service Computing in Cyber Physical Society (Chongqing University), Ministry of Education, Chongqing, China
Yuqi Song, Min Gao, Junliang Yu, Lulan Yu & Xinyu Xiao
School of Software Engineering, Chongqing University, Chongqing, China
Yuqi Song, Min Gao, Junliang Yu, Lulan Yu & Xinyu Xiao
Centre for Artificial Intelligence, School of Software, Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, Australia
Wentao Li

Corresponding author

Correspondence to Min Gao.

Editor information

Editors and Affiliations

Edinburgh Napier University, Edinburgh, UK
Imed Romdhani
Guangdong University of Petrochemical Technology, Maoming, China
Lei Shu
Osaka University, Osaka, Japan
Hara Takahiro
China University of Geosciences, Beijing, China
Zhangbing Zhou
University of Nebraska–Lincoln, Lincoln, USA
Timothy Gordon
School of Computer Science, China University of Geosciences, Wuhan, Hubei, China
Deze Zeng

Cite this paper

Song, Y., Gao, M., Yu, J., Li, W., Yu, L., Xiao, X. (2018). PUED: A Social Spammer Detection Method Based on PU Learning and Ensemble Learning. In: Romdhani, I., Shu, L., Takahiro, H., Zhou, Z., Gordon, T., Zeng, D. (eds) Collaborative Computing: Networking, Applications and Worksharing. CollaborateCom 2017. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 252. Springer, Cham. https://doi.org/10.1007/978-3-030-00916-8_14

DOI: https://doi.org/10.1007/978-3-030-00916-8_14
Published: 26 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00915-1
Online ISBN: 978-3-030-00916-8
eBook Packages: Computer Science, Computer Science (R0)
