Anomaly Detection in Microblogging via Co-Clustering

  • Regular Papers
Journal of Computer Science and Technology

Abstract

Traditional anomaly detection in microblogging mostly focuses on individual anomalous users or messages. Because anomalous users employ increasingly sophisticated means, such detection performs poorly. In this paper, we propose a framework for anomaly detection based on a bipartite graph and co-clustering. A bipartite graph between users and messages is built to model the homogeneous and heterogeneous interactions. The proposed co-clustering algorithm, based on nonnegative matrix tri-factorization, detects anomalous users and messages simultaneously. The homogeneous relations modeled by the bipartite graph are used as constraints to improve the accuracy of the co-clustering algorithm. Experimental results on a Sina Weibo dataset show that the proposed scheme detects both individual and group anomalies with high accuracy.
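To make the co-clustering step concrete, the following is a minimal sketch of plain nonnegative matrix tri-factorization with standard multiplicative updates, applied to a user-message interaction matrix. It is not the algorithm proposed in the paper: the constraints derived from homogeneous relations are omitted, and the function name `nmtf_cocluster`, its parameters, and the argmax-based cluster assignment are assumptions made for illustration only.

```python
import numpy as np


def nmtf_cocluster(X, k, l, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix X (users x messages) as F @ S @ G.T.

    F (users x k) and G (messages x l) act as soft cluster indicators;
    S (k x l) links user clusters to message clusters.
    Hypothetical helper: unconstrained NMTF, no side-information terms.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    # Strictly positive random initialisation keeps the updates well defined.
    F = rng.random((n, k)) + eps
    S = rng.random((k, l)) + eps
    G = rng.random((m, l)) + eps

    losses = []
    for _ in range(n_iter):
        losses.append(np.linalg.norm(X - F @ S @ G.T) ** 2)
        # Multiplicative updates for the squared Frobenius reconstruction error;
        # each factor stays nonnegative because it is only rescaled.
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
    losses.append(np.linalg.norm(X - F @ S @ G.T) ** 2)

    # Hard assignment: each row goes to the cluster with the largest weight.
    user_labels = F.argmax(axis=1)
    message_labels = G.argmax(axis=1)
    return F, S, G, user_labels, message_labels, losses
```

As a usage example, a toy matrix with two user groups that each interact with a disjoint set of messages can be built with `X = np.zeros((6, 8)); X[:3, :4] = 1; X[3:, 4:] = 1` and passed as `nmtf_cocluster(X, k=2, l=2)`. In a setting like the one the abstract describes, one would then inspect which user and message clusters are anomalous; how that decision is made is not specified here.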

Author information

Corresponding author

Correspondence to Guo-Wei Shen.

Additional information

This work was supported by the National Natural Science Foundation of China under Grant No. 61170242, the National High Technology Research and Development 863 Program of China under Grant No. 2012AA012802, and the Fundamental Research Funds for the Central Universities of China under Grant No. HEUCF100605.

A preliminary version of the paper was published in the Proceedings of SMP 2014.

About this article

Cite this article

Yang, W., Shen, GW., Wang, W. et al. Anomaly Detection in Microblogging via Co-Clustering. J. Comput. Sci. Technol. 30, 1097–1108 (2015). https://doi.org/10.1007/s11390-015-1585-3

  • DOI: https://doi.org/10.1007/s11390-015-1585-3
