Topic Reconstruction: A Novel Method Based on LDA Oriented to Intrusion Detection

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2019)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11944)

Abstract

Traditional intrusion detection methods struggle to distinguish different types of intrusion that are highly similar. At the feature extraction stage, these methods characterize each attribute with a single value and mine the relationships among attributes. This granularity of feature extraction is too coarse to separate intrusions whose network flow characteristics are similar. To address the problem, we establish an intrusion detection model based on Latent Dirichlet Allocation (ID-LDA) and propose a novel topic reconstruction method to extract distinctive features. We mine the value distribution of each attribute and the associations among multiple attributes to extract more implicit semantic features, which are more useful for identifying slight differences between kinds of intrusions. However, current LDA models have difficulty determining the optimal number of topics, and recent methods ignore the selection among multiple topics. These problems make it hard to generate a good Document-Topic Distribution (DTD) and lower detection accuracy. We therefore propose a topic overlap degree and a dispersion degree to quantitatively assess the quality of the DTD, and use them to determine the optimal topic number and select the best topics. Experiments on the public NSL-KDD dataset verify the validity of ID-LDA, and the results outperform many state-of-the-art intrusion detection methods in terms of accuracy.
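The abstract names a topic overlap degree and a dispersion degree but does not define them, so the following is only a rough sketch of how such scores could be computed, not the authors' formulas. It assumes overlap is measured as one minus the mean pairwise Jensen-Shannon divergence between topic-word distributions, and dispersion as the mean normalized entropy of each document's topic mixture. The function names, the equal weighting in `select_topic_number`, and the input layout are all hypothetical.

```python
import math
from itertools import combinations


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits, bounded in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def topic_overlap(topic_word):
    """Hypothetical overlap score: 1 - mean pairwise JS divergence between topics.

    topic_word is a list of at least two probability vectors over the vocabulary.
    1.0 means all topics are identical, 0.0 means they share no words.
    """
    pairs = list(combinations(topic_word, 2))
    return 1.0 - sum(js_divergence(p, q) for p, q in pairs) / len(pairs)


def dtd_dispersion(doc_topic):
    """Hypothetical dispersion score: mean normalized entropy of the rows of the DTD.

    doc_topic is a list of per-document topic mixtures with at least two topics.
    0.0 means every document uses a single topic, 1.0 means every mixture is uniform.
    """
    k = len(doc_topic[0])

    def entropy(row):
        return -sum(x * math.log2(x) for x in row if x > 0)

    return sum(entropy(row) for row in doc_topic) / (len(doc_topic) * math.log2(k))


def select_topic_number(candidates):
    """Pick the topic number whose fitted model minimizes overlap + dispersion.

    candidates maps k -> (topic_word, doc_topic) from an LDA model fitted
    elsewhere. The equal weighting of the two scores is an arbitrary assumption.
    """
    return min(
        candidates,
        key=lambda k: topic_overlap(candidates[k][0]) + dtd_dispersion(candidates[k][1]),
    )
```

Under these assumptions, a model whose topics are well separated and whose documents each concentrate on few topics scores lowest, which is one plausible reading of a "good" DTD. The paper itself should be consulted for the actual definitions.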


Acknowledgment

This work is supported by the National Natural Science Foundation of China (U1636208, F020605, No. 61902013).

Corresponding author

Correspondence to Tianbo Wang.

Copyright information

© 2020 Springer Nature Switzerland AG

Cite this paper

Lei, S., Xia, C., Wang, T., Wang, S. (2020). Topic Reconstruction: A Novel Method Based on LDA Oriented to Intrusion Detection. In: Wen, S., Zomaya, A., Yang, L. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2019. Lecture Notes in Computer Science, vol 11944. Springer, Cham. https://doi.org/10.1007/978-3-030-38991-8_38
