Anomaly Detection in Microblogging via Co-Clustering

Yang, Wu; Shen, Guo-Wei; Wang, Wei; Gong, Liang-Yi; Yu, Miao; Dong, Guo-Zhong

doi:10.1007/s11390-015-1585-3

Regular Papers
Published: 14 September 2015

Volume 30, pages 1097–1108, (2015)

Journal of Computer Science and Technology

Abstract

Traditional anomaly detection in microblogging mostly focuses on individual anomalous users or messages. Because anomalous users employ increasingly sophisticated means, such detection performs poorly. In this paper, we propose a framework for anomaly detection based on a bipartite graph and co-clustering. A bipartite graph between users and messages is built to model the homogeneous and heterogeneous interactions. The proposed co-clustering algorithm, based on nonnegative matrix tri-factorization, detects anomalous users and messages simultaneously. The homogeneous relations modeled by the bipartite graph are used as constraints to improve the accuracy of the co-clustering algorithm. Experimental results on a Sina Weibo dataset show that the proposed scheme detects both individual and group anomalies with high accuracy.
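
To make the co-clustering idea concrete, the following is a minimal sketch of plain nonnegative matrix tri-factorization on a toy user-message matrix. It is not the algorithm proposed in the paper: it uses the standard unconstrained multiplicative updates for X ≈ F S Gᵀ and leaves out the homogeneous-relation constraints the abstract describes. The function name `nmtf`, its parameters, and the 6×8 toy matrix are assumptions made for illustration only.

```python
import numpy as np

def nmtf(X, k, l, n_iter=200, eps=1e-9, seed=0):
    """Generic NMTF sketch: X (users x messages) ~ F @ S @ G.T.

    F (n x k) holds user-cluster weights, G (m x l) holds
    message-cluster weights, and S (k x l) links the two sets of
    clusters. All factors stay nonnegative because the updates only
    multiply by nonnegative ratios.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    F = rng.random((n, k)) + eps
    S = rng.random((k, l)) + eps
    G = rng.random((m, l)) + eps
    errors = []
    for _ in range(n_iter):
        # Multiplicative updates for the Frobenius-norm objective.
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
        errors.append(np.linalg.norm(X - F @ S @ G.T))
    return F, S, G, errors

# Hypothetical toy data: 6 users, 8 messages, two interaction blocks.
X = np.zeros((6, 8))
X[:3, :4] = 1.0
X[3:, 4:] = 1.0

F, S, G, errors = nmtf(X, k=2, l=2)

# Hard assignments: each user and message goes to its largest factor.
user_clusters = F.argmax(axis=1)
message_clusters = G.argmax(axis=1)
print(user_clusters, message_clusters, round(errors[-1], 4))
```

Because F and G are estimated from the same matrix, users and messages are grouped in one pass, which is the property the abstract relies on for detecting anomalous users and messages simultaneously. How the resulting clusters are labeled as anomalous, and how the constraints enter the objective, is specific to the paper and is not reproduced here.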

Author information

Authors and Affiliations

Information Security Research Center, Harbin Engineering University, Harbin, 150001, China
Wu Yang, Guo-Wei Shen, Wei Wang, Liang-Yi Gong, Miao Yu & Guo-Zhong Dong

Corresponding author

Correspondence to Guo-Wei Shen.

Additional information

This work was supported by the National Natural Science Foundation of China under Grant No. 61170242, the National High Technology Research and Development 863 Program of China under Grant No. 2012AA012802, and the Fundamental Research Funds for the Central Universities of China under Grant No. HEUCF100605.

A preliminary version of the paper was published in the Proceedings of SMP 2014.

About this article

Cite this article

Yang, W., Shen, GW., Wang, W. et al. Anomaly Detection in Microblogging via Co-Clustering. J. Comput. Sci. Technol. 30, 1097–1108 (2015). https://doi.org/10.1007/s11390-015-1585-3

Received: 17 November 2014
Revised: 12 July 2015
Published: 14 September 2015
Issue Date: September 2015
DOI: https://doi.org/10.1007/s11390-015-1585-3
