Abstract
Detecting evolution-based anomalies has emerged as an active research topic in many domains, such as social and information networks, bioinformatics, and diverse security applications. However, the majority of research has focused on detecting anomalies using the evolutionary behavior of objects in a network. Real-world networks are ubiquitous and heterogeneous in nature, and in these networks multiple types of objects co-evolve together with their attributes. Understanding the anomalous co-evolution of multi-typed objects in a heterogeneous information network (HIN) requires a technique that can capture such abnormal co-evolution. For example, detecting co-evolution-based anomalies in the heterogeneous bibliographic information network (HBIN) can depict the object-oriented semantics better than scrutinizing the co-author or citation network alone. In this paper, we introduce the novel notion of a co-evolutionary anomaly in the HBIN, detect anomalies using co-evolution pattern mining (CPM), and study how multi-typed objects influence each other in being declared anomalous, following a special type of HIN called a star network. The influence of three pre-defined attributes, namely paper-count, co-author, and venue, on target objects is measured to detect co-evolutionary anomalies in the HBIN. Anomaly scores are calculated for each of the 510 target objects, and the individual influence of each attribute is measured for the two top target objects in case studies. We observe that venue has the most influence on the target objects discussed as case studies; for the remaining anomalies in the list, however, the most influential attribute may differ from venue. The CABIN algorithm provides a way to find the most influential attributes in co-evolutionary anomaly detection. Experiments on a bibliographic dataset validate the effectiveness of the model and the dominance of the algorithm.
The proposed technique can be applied to various HINs, such as Facebook, Twitter, and Delicious, to detect co-evolutionary anomalies.
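To make the idea of per-attribute influence on an anomaly score more concrete, the following is a minimal illustrative sketch. It is not the CABIN algorithm or CPM as defined in the paper. It assumes a toy star network in which each target object (an author) carries the three attributes named in the abstract, observed in two snapshots, and it substitutes a simple standardized-deviation score for the paper's actual scoring. The function names, the data layout, and the toy values are all hypothetical.

```python
from statistics import mean, pstdev

# Attribute types attached to each target object in the toy star network.
ATTRIBUTES = ("paper_count", "co_author", "venue")


def attribute_changes(history):
    """Change of each attribute between the first and the last snapshot."""
    first, last = history[0], history[-1]
    return {a: last[a] - first[a] for a in ATTRIBUTES}


def anomaly_scores(histories):
    """Score each target by how far its attribute changes deviate from the
    population (a simplified stand-in, not the paper's scoring function)."""
    changes = {t: attribute_changes(h) for t, h in histories.items()}

    # Population mean and standard deviation of the change, per attribute.
    stats = {}
    for a in ATTRIBUTES:
        values = [c[a] for c in changes.values()]
        stats[a] = (mean(values), pstdev(values))

    result = {}
    for target, change in changes.items():
        influence = {}
        for a in ATTRIBUTES:
            mu, sigma = stats[a]
            # Absolute standardized deviation; 0 when the attribute never varies.
            influence[a] = abs(change[a] - mu) / sigma if sigma > 0 else 0.0
        result[target] = {
            "score": sum(influence.values()),
            "influence": influence,
            "top": max(influence, key=influence.get),
        }
    return result


# Hypothetical toy data: two snapshots per author (values are invented).
histories = {
    "author_A": [{"paper_count": 2, "co_author": 3, "venue": 1},
                 {"paper_count": 3, "co_author": 4, "venue": 2}],
    "author_B": [{"paper_count": 2, "co_author": 3, "venue": 1},
                 {"paper_count": 3, "co_author": 4, "venue": 2}],
    "author_C": [{"paper_count": 2, "co_author": 3, "venue": 1},
                 {"paper_count": 3, "co_author": 4, "venue": 2}],
    "author_D": [{"paper_count": 2, "co_author": 3, "venue": 1},
                 {"paper_count": 3, "co_author": 4, "venue": 6}],
}

scores = anomaly_scores(histories)
most_anomalous = max(scores, key=lambda t: scores[t]["score"])
print(most_anomalous, scores[most_anomalous]["top"])
```

In this toy data only author_D's venue behavior departs from the rest, so venue is the attribute that dominates that author's score. With other data a different attribute could dominate, which mirrors the observation above that the most influential attribute need not be venue for every anomaly.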
Acknowledgements
The authors would like to thank Manish Gupta, Jing Gao, and Zhengxing Chen for providing the code. The authors also thank the reviewers for their detailed reviews and comments.
Cite this article
Hayat, M.K., Daud, A. Anomaly detection in heterogeneous bibliographic information networks using co-evolution pattern mining. Scientometrics 113, 149–175 (2017). https://doi.org/10.1007/s11192-017-2467-y
DOI: https://doi.org/10.1007/s11192-017-2467-y