Anomaly detection in heterogeneous bibliographic information networks using co-evolution pattern mining

Scientometrics

Abstract

Detecting evolution-based anomalies has emerged as an important research topic in many domains, such as social and information networks, bioinformatics, and diverse security applications. However, the majority of research has focused on detecting anomalies using the evolutionary behavior of objects in a network. Real-world networks are ubiquitous and heterogeneous in nature, and in these networks multiple types of objects co-evolve together with their attributes. To understand the anomalous co-evolution of multi-typed objects in a heterogeneous information network (HIN), we need an effective technique that can capture it. For example, detecting co-evolution-based anomalies in a heterogeneous bibliographic information network (HBIN) can depict object-oriented semantics better than scrutinizing the co-author or citation network alone. In this paper, we introduce the novel notion of a co-evolutionary anomaly in the HBIN, detect anomalies using co-evolution pattern mining (CPM), and study how multi-typed objects influence each other in being declared anomalous, following a special type of HIN called a star network. The influence of three pre-defined attributes, namely paper-count, co-author, and venue, on target objects is measured to detect co-evolutionary anomalies in the HBIN. Anomaly scores are calculated for each of the 510 target objects, and the individual influence of each attribute is measured for the two top target objects in case studies. Venue is observed to have the most influence on the target objects discussed in the case studies; for the remaining anomalies in the list, however, the most influential attribute may differ. Indeed, the CABIN algorithm provides a way to find the most influential attributes in co-evolutionary anomaly detection. Experiments on a bibliographic dataset validate the effectiveness of the model and the dominance of the algorithm. The proposed technique can be applied to various HINs, such as Facebook, Twitter, and Delicious, to detect co-evolutionary anomalies.
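
The idea of scoring target objects by how unusually their attributes change, and then asking which attribute contributes most, can be illustrated with a small sketch. This is not the CABIN algorithm or the CPM procedure, neither of which is specified in this abstract. It is a simplified stand-in that assumes each target object (an author) carries three numeric attributes per time snapshot (hypothetical names `paper_count`, `co_author`, `venue`, the latter two taken as counts of distinct co-authors and venues) and measures influence as a standardized deviation from the population's change. The data are invented for illustration.

```python
from statistics import mean, pstdev

# Hypothetical attribute names mirroring the three attributes in the abstract.
ATTRIBUTES = ("paper_count", "co_author", "venue")


def attribute_changes(before, after):
    """Change in each attribute for each target between two snapshots."""
    return {
        target: {a: after[target][a] - before[target][a] for a in ATTRIBUTES}
        for target in before
    }


def anomaly_scores(before, after):
    """Return (scores, influence).

    influence[target][attribute] is the absolute standardized deviation of
    the target's change from the population's change (0.0 when all targets
    change identically). scores[target] is the sum over attributes.
    This scoring rule is an assumption made for the sketch.
    """
    changes = attribute_changes(before, after)
    influence = {target: {} for target in changes}
    for a in ATTRIBUTES:
        values = [changes[t][a] for t in changes]
        mu, sigma = mean(values), pstdev(values)
        for t in changes:
            influence[t][a] = abs(changes[t][a] - mu) / sigma if sigma > 0 else 0.0
    scores = {t: sum(influence[t].values()) for t in influence}
    return scores, influence


def most_influential(influence, target):
    """Attribute contributing most to the target's anomaly score."""
    return max(influence[target], key=influence[target].get)


# Toy snapshots: authors A, B, C evolve alike; D abruptly spreads over venues.
before = {t: {"paper_count": 5, "co_author": 3, "venue": 2} for t in "ABCD"}
after = {t: {"paper_count": 6, "co_author": 4, "venue": 2} for t in "ABC"}
after["D"] = {"paper_count": 6, "co_author": 4, "venue": 8}

scores, influence = anomaly_scores(before, after)
top_target = max(scores, key=scores.get)
top_attribute = most_influential(influence, top_target)
```

In this toy data only the venue attribute varies across authors, so the ranking and the most influential attribute are driven by venue alone. A real co-evolution analysis would work on pattern sequences over many snapshots rather than a single difference.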

Notes

  1. http://arnetminer.org/ArnetMinerNetwork.

  2. http://en.wikipedia.org/wiki/Computer_science.

  3. http://www.acm.org/about/class/2012/.

Acknowledgements

The authors would like to thank Manish Gupta, Jing Gao, and Zhengxing Chen for providing the code. The authors also thank the reviewers for their detailed reviews and comments.

Author information

Corresponding author

Correspondence to Malik Khizar Hayat.

About this article

Cite this article

Hayat, M.K., Daud, A. Anomaly detection in heterogeneous bibliographic information networks using co-evolution pattern mining. Scientometrics 113, 149–175 (2017). https://doi.org/10.1007/s11192-017-2467-y

  • DOI: https://doi.org/10.1007/s11192-017-2467-y
