Behavioral Analysis of Users for Spammer Detection in a Multiplex Social Network

Pourhabibi, Tahereh; Boo, Yee Ling; Ong, Kok-Leong; Kam, Booi; Zhang, Xiuzhen

doi:10.1007/978-981-13-6661-1_18

Part of the book series: Communications in Computer and Information Science (CCIS, volume 996)

Included in the following conference series:

Australasian Conference on Data Mining

Abstract

A growing number of social networking websites now have millions of users, creating fertile ground for “spammers” to abuse these websites for their own gain by constantly exposing other users to malicious communications. The variety of interactions afforded by these social networks has resulted in a multiplex network of interactions. In these networks, malicious users evade detection by frequently changing the nature of their activities, which makes it challenging to analyse users’ interactions and capture anomalous behaviours. In this paper, we aimed to detect spammers in a large time-evolving multiplex social network, Tagged.com. For this purpose, we used four sets of features: (i) a set of light-weight behavioural features to capture the structural behaviour of users in their neighbourhood network; (ii) a set of bursty features and (iii) a set of sequence-based features, both capturing the temporal behaviour of users; and (iv) a set of profile-based features used as side information. In addition, we employed an unsupervised Laplacian Score based approach for feature selection and dimensionality reduction. The experimental results showed an accuracy of over 88% in spammer detection, with a lower empirical time complexity for feature extraction. Implementing the behavioural and bursty features in a relational database management system makes the proposed approach more practical, since most real-world networks store their data in relational databases.
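
The unsupervised Laplacian Score feature selection named in the abstract can be illustrated with a minimal sketch. It follows the general Laplacian Score formulation of He, Cai and Niyogi (a k-nearest-neighbour graph with heat-kernel weights, where a lower score marks a feature that better preserves local structure); it is not the authors' implementation. The function names `laplacian_scores` and `select_features`, the parameters `k` and `t`, and the choice of Euclidean distance are assumptions made for illustration, and the sketch expects at least two samples.

```python
import numpy as np


def laplacian_scores(X, k=5, t=1.0):
    """Return one Laplacian Score per column of X (lower = better).

    Hypothetical helper: k (neighbours) and t (heat-kernel width)
    are illustrative defaults, not values taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    k = min(k, n - 1)

    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(d2, np.inf)  # a sample is not its own neighbour

    # k-nearest-neighbour graph with heat-kernel weights, symmetrised.
    nn = np.argsort(d2, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    cols = nn.ravel()
    S = np.zeros((n, n))
    S[rows, cols] = np.exp(-d2[rows, cols] / t)
    S = np.maximum(S, S.T)

    d = S.sum(axis=1)    # node degrees (the diagonal of D)
    L = np.diag(d) - S   # graph Laplacian L = D - S

    # Remove the degree-weighted mean from every feature.
    F = X - (d @ X) / d.sum()

    num = np.sum(F * (L @ F), axis=0)           # f^T L f per feature
    den = np.sum(F * (d[:, None] * F), axis=0)  # f^T D f per feature
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(den > 0, num / den, np.inf)


def select_features(X, n_keep, **kwargs):
    """Indices of the n_keep features with the smallest scores."""
    return np.argsort(laplacian_scores(X, **kwargs))[:n_keep]
```

Because the score needs no class labels, a selection step of this kind can run before any spammer/non-spammer classifier is trained; the dense n-by-n weight matrix used here is only practical for small samples, and a sparse neighbour graph would be needed at the scale of a full social network.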

Author information

Authors and Affiliations

RMIT University, Melbourne, Australia
Tahereh Pourhabibi, Yee Ling Boo, Booi Kam & Xiuzhen Zhang
La Trobe University, Melbourne, Australia
Kok-Leong Ong

Corresponding author

Correspondence to Tahereh Pourhabibi.

Editor information

Editors and Affiliations

School of Computing and Mathematics, Charles Sturt University, Albury, NSW, Australia
Rafiqul Islam
University of Auckland, Auckland, New Zealand
Yun Sing Koh
CSIRO Scientific Computing, Canberra, Australia
Yanchang Zhao
Data Science and Engineering, Australian Taxation Office, Canberra, Australia
Graco Warwick
Department of Information Technology, University of Wollongong, Wollongong, NSW, Australia
David Stirling
School of Computing and Mathematics, Charles Sturt University, Wagga Wagga, Australia
Chang-Tsun Li
School of Computing and Mathematics, Charles Sturt University, Bathurst, Australia
Zahidul Islam

About this paper

Cite this paper

Pourhabibi, T., Boo, Y.L., Ong, KL., Kam, B., Zhang, X. (2019). Behavioral Analysis of Users for Spammer Detection in a Multiplex Social Network. In: Islam, R., et al. Data Mining. AusDM 2018. Communications in Computer and Information Science, vol 996. Springer, Singapore. https://doi.org/10.1007/978-981-13-6661-1_18

DOI: https://doi.org/10.1007/978-981-13-6661-1_18
Published: 16 February 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-6660-4
Online ISBN: 978-981-13-6661-1
eBook Packages: Computer Science, Computer Science (R0)