HeteroUI: A Framework Based on Heterogeneous Information Network Embedding for User Identification in Enterprise Networks

Li, Meng; Cai, Lijun; Yu, Aimin; Yu, Haibo; Meng, Dan

doi:10.1007/978-3-030-41579-2_10

Meng Li^12,13,
Lijun Cai¹²,
Aimin Yu¹²,
Haibo Yu¹² &
…
Dan Meng¹²

Part of the book series: Lecture Notes in Computer Science ((LNSC,volume 11999))

Included in the following conference series:

International Conference on Information and Communications Security

2539 Accesses

Abstract

User identification process is an important security guard towards discovering insider threat and preventing unauthorized access in enterprise networks. However, most existing user identification approaches based on behavior analysis fail to capture latent correlations between multi-domain behavior records due to the lack of a panoramic view or the disability of dealing with heterogeneous data. In light of this, this paper presents HeteroUI, a framework based on heterogeneous information network embedding for user identification in enterprise networks. In our model, multi-domain heterogeneous behavior records are first transformed into a heterogeneous information network, then the embeddings of entities will be trained iteratively according to a joint objective combining with local and global components for more accurate user identification. Experimental results on the CERT insider threat dataset r4.2 demonstrate that HeteroUI exhibits excellent performance in discovering user identities with the mean average precision reaching over 98%. Besides, HeteroUI has a certain contribution to inferring potential insiders in a multi-user and multi-domain environment.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

1.
https://resources.sei.cmu.edu/library/asset-view.cfm?assetid=508099.
2.
A query means a set of new behavior records for a certain PC to be inspected.

References

Shashanka, M., Shen, M.Y., Wang, J.: User and entity behavior analytics for enterprise security. In: 2016 IEEE International Conference on Big Data (Big Data), pp. 1867–1874. IEEE (2016)
Google Scholar
Shi, C., Li, Y., Zhang, J., Sun, Y., Philip, S.Y.: A survey of heterogeneous information network analysis. IEEE Trans. Knowl. Data Eng. 29(1), 17–37 (2016)
Article Google Scholar
Sun, Y., Han, J., Yan, X., Yu, P.S., Wu, T.: PathSim: meta path-based top-K similarity search in heterogeneous information networks. Proc. VLDB Endowment 4(11), 992–1003 (2011)
Article Google Scholar
Tuor, A., Kaplan, S., Hutchinson, B., Nichols, N., Robinson, S.: Deep learning for unsupervised insider threat detection in structured cybersecurity data streams. In: Workshops at the Thirty-First AAAI Conference on Artificial Intelligence (2017)
Google Scholar
Pei, K., et al.: HERCULE: attack story reconstruction via community discovery on correlated log graph. In: ACSAC, pp. 583–595 (2016)
Google Scholar
Wang, J., Cai, L., Yu, A., Zhu, M., Meng, D.: TempatMDS: a masquerade detection system based on temporal and spatial analysis of file access records. In: 2018 17th IEEE International Conference on Trust, Security and Privacy in Computing and Communications/12th IEEE International Conference on Big Data Science and Engineering (TrustCom/BigDataSE), pp. 360–371. IEEE (2018)
Google Scholar
Chen, T., Sun, Y.: Task-guided and path-augmented heterogeneous network embedding for author identification. In: Proceedings of the Tenth ACM International Conference on Web Search and Data Mining, pp. 295–304. ACM (2017)
Google Scholar
Du, M., Li, F., Zheng, G., Srikumar, V.: Deeplog: Anomaly detection and diagnosis from system logs through deep learning. In: Proceedings of the 2017 ACM SIGSAC, CCS, pp. 1285–1298 (2017)
Google Scholar
Bottou, L.: Large-scale machine learning with stochastic gradient descent. In: Proceedings of COMPSTAT 2010, pp. 177–186. Physica-Verlag HD (2010)
Google Scholar
Bordes, A., Usunier, N., Garcia-Duran, A., Weston, J., Yakhnenko, O.: Translating embeddings for modeling multi-relational data. In: Advances in Neural Information Processing Systems, pp. 2787–2795 (2013)
Google Scholar
Tang, J., Qu, M., Mei, Q.: Pte: predictive text embedding through large-scale heterogeneous text networks. In: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1165–1174. ACM (2015)
Google Scholar
Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, pp. 3111–3119 (2013)
Google Scholar
Bhattacharjee, S.D., Yuan, J., Jiaqi, Z., Tan, Y.P.: Context-aware graph-based analysis for detecting anomalous activities. In: 2017 IEEE International Conference on Multimedia and Expo (ICME), pp. 1021–1026. IEEE (2017)
Google Scholar
Le, D.C., Zincir-Heywood, A.N.: Machine learning based insider threat modelling and detection. In: 2019 IFIP/IEEE Symposium on Integrated Network and Service Management (IM), pp. 1–6. IEEE (2019)
Google Scholar
Erhan, D., Bengio, Y., Courville, A., Manzagol, P.A., Vincent, P., Bengio, S.: Why does unsupervised pre-training help deep learning? J. Mach. Learn. Res. 11(Feb), 625–660 (2010)
MathSciNet MATH Google Scholar
Dittman, D. J., Khoshgoftaar, T. M., Napolitano, A.: The effect of data sampling when using random forest on imbalanced bioinformatics data. In: 2015 IEEE International Conference on Information Reuse and Integration, pp. 457–463. IEEE (2015)
Google Scholar

Download references

Acknowledgments

This research was supported by the National Key R&D Program of China (No. 2016YFB0801001). We thank our shepherd Shujun Li for his valuable feedback.

Author information

Authors and Affiliations

Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
Meng Li, Lijun Cai, Aimin Yu, Haibo Yu & Dan Meng
School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
Meng Li

Authors

Meng Li
View author publications
You can also search for this author in PubMed Google Scholar
Lijun Cai
View author publications
You can also search for this author in PubMed Google Scholar
Aimin Yu
View author publications
You can also search for this author in PubMed Google Scholar
Haibo Yu
View author publications
You can also search for this author in PubMed Google Scholar
Dan Meng
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding authors

Correspondence to Lijun Cai or Aimin Yu .

Editor information

Editors and Affiliations

Singapore University of Technology and Design, Singapore, Singapore
Jianying Zhou
The Hong Kong Polytechnic University, Kowloon, Hong Kong
Xiapu Luo
Peking University, Beijing, China
Qingni Shen
Institute of Information Engineering, Beijing, China
Zhen Xu

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Li, M., Cai, L., Yu, A., Yu, H., Meng, D. (2020). HeteroUI: A Framework Based on Heterogeneous Information Network Embedding for User Identification in Enterprise Networks. In: Zhou, J., Luo, X., Shen, Q., Xu, Z. (eds) Information and Communications Security. ICICS 2019. Lecture Notes in Computer Science(), vol 11999. Springer, Cham. https://doi.org/10.1007/978-3-030-41579-2_10

Download citation

DOI: https://doi.org/10.1007/978-3-030-41579-2_10
Published: 18 February 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-41578-5
Online ISBN: 978-3-030-41579-2
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics