HeteroUI: A Framework Based on Heterogeneous Information Network Embedding for User Identification in Enterprise Networks

  • Conference paper
Information and Communications Security (ICICS 2019)

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 11999)

Abstract

User identification is an important safeguard for discovering insider threats and preventing unauthorized access in enterprise networks. However, most existing user identification approaches based on behavior analysis fail to capture latent correlations between multi-domain behavior records, either because they lack a panoramic view or because they cannot handle heterogeneous data. In light of this, this paper presents HeteroUI, a framework based on heterogeneous information network embedding for user identification in enterprise networks. In our model, multi-domain heterogeneous behavior records are first transformed into a heterogeneous information network; the embeddings of entities are then trained iteratively according to a joint objective that combines local and global components for more accurate user identification. Experimental results on the CERT insider threat dataset r4.2 demonstrate that HeteroUI performs well in discovering user identities, with the mean average precision exceeding 98%. In addition, HeteroUI contributes to inferring potential insiders in a multi-user, multi-domain environment.
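The first step described in the abstract, turning multi-domain behavior records into a heterogeneous information network, can be illustrated with a minimal Python sketch. The record layout `(domain, user, pc, obj)`, the node types, the edge-type names, and the function `build_hin` are all assumptions made for illustration; this page does not state the network schema the authors actually use.

```python
from collections import defaultdict
from typing import NamedTuple


class Node(NamedTuple):
    kind: str  # node type, e.g. "user", "pc", "file" (illustrative types only)
    name: str


def build_hin(records):
    """Build a typed, weighted adjacency map from behavior records.

    Each record is a hypothetical (domain, user, pc, obj) tuple, where obj
    is the object touched in that domain (a file, a URL, ...) or None.
    Returns hin[edge_type][src][dst] = number of co-occurrences.
    """
    hin = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for domain, user, pc, obj in records:
        u, p = Node("user", user), Node("pc", pc)
        hin["user-pc"][u][p] += 1
        if obj is not None:
            o = Node(domain, obj)
            # Link the object to both the PC and the user who acted on it.
            hin[f"pc-{domain}"][p][o] += 1
            hin[f"user-{domain}"][u][o] += 1
    return hin


# Toy records from three behavior domains (made-up data).
records = [
    ("logon", "alice", "PC-1", None),
    ("file", "alice", "PC-1", "report.doc"),
    ("http", "alice", "PC-1", "example.com"),
    ("file", "bob", "PC-2", "report.doc"),
]
hin = build_hin(records)
edge_types = sorted(hin.keys())
```

Keeping one adjacency map per edge type preserves the heterogeneity of the data: an embedding method can then treat each relation separately instead of collapsing everything into a single homogeneous graph.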

Notes

  1. https://resources.sei.cmu.edu/library/asset-view.cfm?assetid=508099.

  2. A query means a set of new behavior records for a certain PC to be inspected.
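With a query defined this way, the mean average precision quoted in the abstract can be read as the average, over all queries, of the average precision of the ranked list of candidate users returned for each query. The sketch below uses the standard definition of the metric; the function names and the toy rankings are illustrative assumptions, and the paper's exact evaluation protocol is not given on this page.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked candidate list.

    ranked   -- candidate users, best match first
    relevant -- set of users who truly produced the query's records
    """
    hits, total = 0, 0.0
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in relevant:
            hits += 1
            total += hits / rank  # precision at this cut-off
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(queries):
    """Mean of the per-query average precision values."""
    if not queries:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in queries) / len(queries)


# Two toy queries (made-up data): the true user is ranked first, then second.
queries = [
    (["alice", "bob", "carol"], {"alice"}),  # average precision 1.0
    (["bob", "alice", "carol"], {"alice"}),  # average precision 0.5
]
map_score = mean_average_precision(queries)
print(map_score)  # 0.75
```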

Acknowledgments

This research was supported by the National Key R&D Program of China (No. 2016YFB0801001). We thank our shepherd Shujun Li for his valuable feedback.

Author information

Corresponding authors

Correspondence to Lijun Cai or Aimin Yu.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Li, M., Cai, L., Yu, A., Yu, H., Meng, D. (2020). HeteroUI: A Framework Based on Heterogeneous Information Network Embedding for User Identification in Enterprise Networks. In: Zhou, J., Luo, X., Shen, Q., Xu, Z. (eds) Information and Communications Security. ICICS 2019. Lecture Notes in Computer Science, vol 11999. Springer, Cham. https://doi.org/10.1007/978-3-030-41579-2_10

  • DOI: https://doi.org/10.1007/978-3-030-41579-2_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-41578-5

  • Online ISBN: 978-3-030-41579-2

  • eBook Packages: Computer Science, Computer Science (R0)