Behavioral Analysis of Users for Spammer Detection in a Multiplex Social Network

  • Conference paper
Data Mining (AusDM 2018)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 996)

Abstract

A growing number of social networking websites now have millions of users, creating fertile ground for “spammers” to exploit these websites for their own gain by constantly exposing other users to malicious communications. The variety of interactions afforded by these social networks has resulted in a multiplex network of interactions. In these networks, malicious users evade detection by frequently changing the nature of their activities, which makes it challenging to analyse users’ interactions and capture anomalous behaviours. In this paper, we aimed to detect spammers in Tagged.com, a large time-evolving multiplex social network. For this purpose, we used four sets of features: (i) a set of light-weight behavioural features to capture the structural behaviour of users in their neighbourhood network; (ii) a set of bursty features and (iii) a set of sequence-based features to capture the temporal behaviour of users; and (iv) a set of profile-based features used as side information. We also employed an unsupervised Laplacian Score based approach for feature selection and dimensionality reduction of the feature space. The experimental results showed an accuracy of over 88% in spammer detection with a lower empirical time complexity for feature extraction. Implementing the behavioural and bursty features in a relational data management system makes the proposed approach more practical, since most real-world networks store their data in relational databases.
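
The Laplacian Score step mentioned in the abstract can be illustrated with a minimal sketch. It follows the generic formulation of the score (a nearest-neighbour graph with heat-kernel weights, then a per-feature ratio of graph smoothness to degree-weighted variance), not the authors' implementation, which this page does not show. The function name `laplacian_scores`, the parameters `k` and `t`, the symmetrisation rule, the handling of constant features, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def laplacian_scores(X, k=5, t=1.0):
    """Return one Laplacian Score per column of X (lower = locality better preserved)."""
    X = np.asarray(X, dtype=float)
    n, m = X.shape

    # Pairwise squared Euclidean distances between samples (rows).
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(d2, np.inf)  # a sample is never its own neighbour

    # Symmetric k-nearest-neighbour graph: i and j are linked if either
    # one is among the other's k nearest samples (an assumed convention).
    nn = np.argsort(d2, axis=1)[:, :k]
    A = np.zeros((n, n), dtype=bool)
    A[np.repeat(np.arange(n), nn.shape[1]), nn.ravel()] = True
    A = A | A.T

    # Heat-kernel edge weights, degree vector d, graph Laplacian L = D - S.
    S = np.where(A, np.exp(-d2 / t), 0.0)
    d = S.sum(axis=1)
    L = np.diag(d) - S

    scores = np.empty(m)
    for r in range(m):
        f = X[:, r]
        if f.max() == f.min():
            # Assumption: a constant feature carries no information, rank it last.
            scores[r] = np.inf
            continue
        f_c = f - (f @ d) / d.sum()            # remove the degree-weighted mean
        scores[r] = (f_c @ L @ f_c) / (f_c @ (d * f_c))
    return scores

# Hypothetical data: column 0 separates two groups, column 1 does not.
i = np.arange(40)
X_demo = np.column_stack([10.0 * (i // 20) + 0.01 * (i % 20),
                          ((7 * i) % 20) * 0.5])
ranking = np.argsort(laplacian_scores(X_demo))  # most locality-preserving first
```

Features are ranked by ascending score and only the leading ones are kept, which is how an unsupervised score of this kind reduces dimensionality without labels. The kernel width `t` should be on the scale of the squared distances in the data; otherwise the edge weights can underflow to zero.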

Author information

Corresponding author

Correspondence to Tahereh Pourhabibi.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Pourhabibi, T., Boo, Y.L., Ong, KL., Kam, B., Zhang, X. (2019). Behavioral Analysis of Users for Spammer Detection in a Multiplex Social Network. In: Islam, R., et al. Data Mining. AusDM 2018. Communications in Computer and Information Science, vol 996. Springer, Singapore. https://doi.org/10.1007/978-981-13-6661-1_18

  • DOI: https://doi.org/10.1007/978-981-13-6661-1_18

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-6660-4

  • Online ISBN: 978-981-13-6661-1

  • eBook Packages: Computer Science, Computer Science (R0)
