DOI: 10.1145/2492517.2492567
research-article

Community-based features for identifying spammers in online social networks

Published: 25 August 2013

ABSTRACT

The popularity of Online Social Networks (OSNs) brings with it the challenge of dealing with undesirable users and their malicious activities. The most common form of malicious activity on OSNs is spamming, in which a bot (fake user) disseminates content, malware, viruses, and the like to legitimate users of the network. Common motives include phishing, scams, and viral marketing, none of which the recipients intend to receive. Techniques for identifying spammers (spamming accounts) in OSNs are therefore highly desirable. Exploiting the tendency of legitimate users to form communities, this paper presents a community-based framework for identifying spammers in OSNs. The framework uses community-based features of OSN users to learn classification models that identify spamming accounts. Preliminary experiments on a real-world dataset with simulated spammers suggest that the proposed approach is promising and that community-based node features of OSN users can improve the performance of classifying spammers and legitimate users.
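The abstract does not list the features or the classifiers used, so the following is only a minimal sketch of the general idea under stated assumptions. The three features (degree, share of neighbors inside the node's own community, number of other communities reached), the toy graph, the community labels, and the one-feature threshold rule standing in for a learned classification model are all hypothetical illustrations, not the paper's method.

```python
from collections import defaultdict


def community_features(edges, community_of):
    """Return {node: (degree, in_community_ratio, foreign_communities)}.

    The feature definitions are illustrative assumptions, not the paper's.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    features = {}
    for node, nbrs in neighbors.items():
        own = community_of[node]
        inside = sum(1 for n in nbrs if community_of[n] == own)
        foreign = {community_of[n] for n in nbrs if community_of[n] != own}
        features[node] = (len(nbrs), inside / len(nbrs), len(foreign))
    return features


def fit_stump(rows, labels, feature_index):
    """Learn a threshold rule: predict spam when the feature is <= threshold."""
    best_threshold, best_correct = None, -1
    for threshold in sorted({row[feature_index] for row in rows}):
        correct = sum(
            (row[feature_index] <= threshold) == label
            for row, label in zip(rows, labels)
        )
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


# Hypothetical toy network: two tight communities joined by one bridge,
# plus a simulated spammer "s" that links into both of them.
group_a = ["a1", "a2", "a3", "a4"]
group_b = ["b1", "b2", "b3", "b4"]
edges = [(u, v) for g in (group_a, group_b) for i, u in enumerate(g) for v in g[i + 1:]]
edges += [("a1", "b1")]
edges += [("s", "a2"), ("s", "a3"), ("s", "b2"), ("s", "b3")]

# Assumed output of some community detection step (not computed here).
community_of = {n: "A" for n in group_a}
community_of.update({n: "B" for n in group_b})
community_of["s"] = "A"

features = community_features(edges, community_of)
nodes = sorted(features)
rows = [features[n] for n in nodes]
labels = [n == "s" for n in nodes]  # True marks the simulated spammer

# Train on feature 1 (share of neighbors inside the node's own community).
threshold = fit_stump(rows, labels, feature_index=1)
flagged = [n for n in nodes if features[n][1] <= threshold]
```

In this toy setup the spammer spreads its links across communities, so its in-community ratio is lower than that of any legitimate node and the learned threshold isolates it. Real networks would need an actual community detection step and a stronger classifier.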


Published in

ASONAM '13: Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
August 2013
1558 pages
ISBN: 9781450322409
DOI: 10.1145/2492517

Copyright © 2013 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 116 of 549 submissions, 21%
