A Semi-Supervised Framework for Social Spammer Detection

Li, Zhaoxing; Zhang, Xianchao; Shen, Hua; Liang, Wenxin; He, Zengyou

doi:10.1007/978-3-319-18032-8_14

Zhaoxing Li¹⁰,
Xianchao Zhang¹⁰,
Hua Shen¹⁰,
Wenxin Liang¹⁰ &
…
Zengyou He¹⁰

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 9078))

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

4331 Accesses
14 Citations

Abstract

Spammers create large number of compromised or fake accounts to disseminate harmful information in social networks like Twitter. Identifying social spammers has become a challenging problem. Most of existing algorithms for social spammer detection are based on supervised learning, which needs a large amount of labeled data for training. However, labeling sufficient training set costs too much resources, which makes supervised learning impractical for social spammer detection. In this paper, we propose a semi-supervised framework for social spammer detection(SSSD), which combines the supervised classification model with a ranking scheme on the social graph. First, we train an original classifier with a small number of labeled data. Second, we propose a ranking model to propagate trust and distrust on the social graph. Third, we select confident users that are judged by the classifier and ranking scores as new training data and retrain the classifier. We repeat the all steps above until the classifier cannot be refined any more. Experimental results show that our framework can effectively detect social spammers in the condition of lacking sufficient labeled data.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

References

Amleshwaram, A.A., Reddy, N., Yadav, S., Gu, G., Yang, C.: Cats: characterizing automation of twitter spammers. In: 2013 Fifth International Conference on Communication Systems and Networks (COMSNETS), pp. 1–10. IEEE (2013)
Google Scholar
Benevenuto, F., Magno, G., Rodrigues, T., Almeida, V.: Detecting spammers on twitter. In: Collaboration, Electronic messaging, Anti-Abuse and Spam Conference (CEAS) (2010)
Google Scholar
Benevenuto, F., Rodrigues, T., Almeida, V., Almeida, J., Gonçalves, M.: Detecting spammers and content promoters in online video social networks. In: Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval(SIGIR), pp. 620–627. ACM (2009)
Google Scholar
Gao, H., Hu, J., Wilson, C., Li, Z., Chen, Y., Zhao, B.: Detecting and characterizing social spam campaigns. In: Proceedings of the 10th Annual Conference on Internet Measurement(IMC), pp. 35–47. ACM (2010)
Google Scholar
Ghosh, S., Viswanath, B., Kooti, F., Sharma, N.K., Korlam, G., Benevenuto, F., Ganguly, N., Gummadi, K.P.: Understanding and combating link farming in the twitter social network. In: Proceedings of the 21st International Conference on World Wide Web, pp. 61–70. ACM (2012)
Google Scholar
Gyöngyi, Z., Garcia-Molina, H., Pedersen, J.: Combating web spam with trustrank. In: Proceedings of the Thirtieth International Conference on Very Large Data Bases-vol. 30, pp. 576–587. VLDB Endowment (2004)
Google Scholar
Heymann, P., Koutrika, G., Garcia-Molina, H.: Fighting spam on social web sites: A survey of approaches and future challenges. IEEE Internet Computing 11(6), 36–45 (2007)
Article Google Scholar
Hu, X., Tang, J., Liu, H.: Online social spammer detection. In: Twenty-Eighth AAAI Conference on Artificial Intelligence (2014)
Google Scholar
Hu, X., Tang, J., Zhang, Y., Liu, H.: Social spammer detection in microblogging. In: Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence, pp. 2633–2639. AAAI Press (2013)
Google Scholar
Krishnan, V., Raj, R.: Web spam detection with anti-trust rank. AIRWeb. 6, 37–40 (2006)
Google Scholar
Lee, K., Caverlee, J., Webb, S.: Uncovering social spammers: social honeypots+ machine learning. In: Proceeding of the 33rd International ACM (SIGIR) Conference on Research and Development in Information Retrieval, pp. 435–442. ACM (2010)
Google Scholar
Li, R., Wang, S., Deng, H., Wang, R., Chang, K.C.C.: Towards social user profiling: unified and discriminative influence model for inferring home locations. In: KDD, pp. 1023–1031 (2012)
Google Scholar
Prasetyo, P.K., Lo, D., Achananuparp, P., Tian, Y., Lim, E.P.: Automatic classification of software related microblogs. In: 2012 28th IEEE International Conference on Software Maintenance (ICSM), pp. 596–599. IEEE (2012)
Google Scholar
Tan, E., Guo, L., Chen, S., Zhang, X., Zhao, Y.: Unik: unsupervised social network spam detection. In: Proceedings of the 22nd ACM International Conference on Conference on Information & Knowledge Management, pp. 479–488. ACM (2013)
Google Scholar
Wang, A.: Don’t follow me: Spam detection in twitter. In: Proceedings of the 2010 International Conference on Security and Cryptography (SECRYPT), pp. 1–10. IEEE (2010)
Google Scholar
Wu, B., Goel, V., Davison, B.D.: Propagating trust and distrust to demote web spam. MTW 190 (2006)
Google Scholar
Yang, C., Harkreader, R., Gu, G.: Empirical evaluation and new design for fighting evolving twitter spammers. IEEE Transactions on Information Forensics and Security 8(8), 1280–1293 (2013)
Article Google Scholar
Yang, C., Harkreader, R., Zhang, J., Shin, S., Gu, G.: Analyzing spammers’ social networks for fun and profit: a case study of cyber criminal ecosystem on twitter. In: Proceedings of the 21st International Conference on World Wide Web, pp. 71–80. ACM (2012)
Google Scholar
Zhang, X., Zhu, S., Liang, W.: Detecting spam and promoting campaigns in the twitter social network. In: Proceedings of the 2012 IEEE 12th International Conference on Data Mining, pp. 1194–1199. IEEE Computer Society (2012)
Google Scholar
Zhou, Z.H., Li, M.: Tri-training: Exploiting unlabeled data using three classifiers. IEEE Transactions on Knowledge and Data Engineering 17(11), 1529–1541 (2005)
Article Google Scholar
Zhu, Y., Wang, X., Zhong, E., Liu, N.N., Li, H., Yang, Q.: Discovering spammers in social networks. In: AAAI (2012)
Google Scholar

Download references

Author information

Authors and Affiliations

School of Software Technology, Dalian University of Technology, 116621, Dalian, China
Zhaoxing Li, Xianchao Zhang, Hua Shen, Wenxin Liang & Zengyou He

Authors

Zhaoxing Li
View author publications
You can also search for this author in PubMed Google Scholar
Xianchao Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Hua Shen
View author publications
You can also search for this author in PubMed Google Scholar
Wenxin Liang
View author publications
You can also search for this author in PubMed Google Scholar
Zengyou He
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Xianchao Zhang .

Editor information

Editors and Affiliations

Ho Chi Minh City University of Technology, Ho Chi Minh City, Vietnam
Tru Cao
Singapore Management University, Singapore, Singapore
Ee-Peng Lim
Nanjing University, Nanjing, China
Zhi-Hua Zhou
Japan Advanced Institute of Science and Technology, Nomi City, Japan
Tu-Bao Ho
The University of Hong Kong, Hong Kong, Hong Kong SAR
David Cheung
Osaka University, Osaka, Japan
Hiroshi Motoda

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Li, Z., Zhang, X., Shen, H., Liang, W., He, Z. (2015). A Semi-Supervised Framework for Social Spammer Detection. In: Cao, T., Lim, EP., Zhou, ZH., Ho, TB., Cheung, D., Motoda, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2015. Lecture Notes in Computer Science(), vol 9078. Springer, Cham. https://doi.org/10.1007/978-3-319-18032-8_14

Download citation

DOI: https://doi.org/10.1007/978-3-319-18032-8_14
Published: 09 May 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18031-1
Online ISBN: 978-3-319-18032-8
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics