Toward Generating a Large Scale Intrusion Detection Dataset and Intruders Behavioral Profiling Using Network and Transportation Layers Traffic Flow Analyzer (NTLFlowLyzer)

Journal of Network and Systems Management

Abstract

In today’s digital landscape, network security and intrusion detection systems (IDSs) are crucial because of our growing dependence on interconnected systems and data exchange. An IDS continuously monitors network traffic to detect threats and protect the security and integrity of modern digital infrastructure. However, IDSs face several challenges, including low accuracy, high false positive rates, the absence of an effective behavioral profiling model, and the need for better visualization capabilities. This paper introduces a pattern extraction and profiling system that addresses limitations in characterizing diverse network activities. We introduce a novel attribute selection algorithm, a new approach for characterizing network activities, and a novel concept of local and global profiling built around a “super feature”. Our approach, which comprises Attribute Extraction, Relation Extraction, and Entity Extraction, forms a foundation for precise activity characterization and accurate profiling. By emphasizing sub-behaviors through Local and Global profiling, we mitigate the high false positive rates common in previous methods. The approach culminates in a neural network architecture that weights the sub-profiles and captures the influence of the global profile to form comprehensive activity profiles. We implement and validate the approach by developing a new network traffic analyzer, NTLFlowLyzer, which extracts more than 300 features, and by introducing the updated benchmark dataset BCCC-CSE-CIC-IDS2018. The experimental results show that the proposed Local and Global profiling is effective in profiling different malicious activities.
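To make the idea of a flow analyzer concrete, the following is a minimal sketch of flow-level feature extraction in Python. It is not NTLFlowLyzer's implementation: the packet dictionary layout, the function names `flow_key` and `extract_flow_features`, and the five features computed here are assumptions chosen for illustration, whereas the real tool extracts more than 300 features. The sketch groups packets into bidirectional flows by their 5-tuple and derives a few simple statistics per flow.

```python
from collections import defaultdict
from statistics import mean


def flow_key(pkt):
    """Build a direction-independent 5-tuple so both directions share one flow.

    The packet field names used here are hypothetical.
    """
    a = (pkt["src_ip"], pkt["src_port"])
    b = (pkt["dst_ip"], pkt["dst_port"])
    return (min(a, b), max(a, b), pkt["proto"])


def extract_flow_features(packets):
    """Group packets into flows and compute a few illustrative statistics."""
    flows = defaultdict(list)
    for pkt in packets:
        flows[flow_key(pkt)].append(pkt)

    features = {}
    for key, pkts in flows.items():
        pkts.sort(key=lambda p: p["ts"])  # order packets by timestamp
        times = [p["ts"] for p in pkts]
        sizes = [p["size"] for p in pkts]
        # Inter-arrival times between consecutive packets of the flow.
        gaps = [t2 - t1 for t1, t2 in zip(times, times[1:])]
        features[key] = {
            "packet_count": len(pkts),
            "total_bytes": sum(sizes),
            "duration": times[-1] - times[0],
            "mean_packet_size": mean(sizes),
            "mean_iat": mean(gaps) if gaps else 0.0,
        }
    return features


# Toy input: three TCP packets of one conversation and one UDP packet.
packets = [
    {"ts": 0.0, "src_ip": "10.0.0.1", "src_port": 5000,
     "dst_ip": "10.0.0.2", "dst_port": 80, "proto": "TCP", "size": 60},
    {"ts": 0.5, "src_ip": "10.0.0.2", "src_port": 80,
     "dst_ip": "10.0.0.1", "dst_port": 5000, "proto": "TCP", "size": 1500},
    {"ts": 1.5, "src_ip": "10.0.0.1", "src_port": 5000,
     "dst_ip": "10.0.0.2", "dst_port": 80, "proto": "TCP", "size": 40},
    {"ts": 0.2, "src_ip": "10.0.0.3", "src_port": 53000,
     "dst_ip": "10.0.0.2", "dst_port": 53, "proto": "UDP", "size": 80},
]

flow_features = extract_flow_features(packets)
for key, feats in flow_features.items():
    print(key, feats)
```

Sorting the two endpoints before building the key is one simple way to merge both directions of a conversation into a single flow. A production analyzer would also need flow timeouts and TCP state handling, which this sketch omits.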

Data Availability

After publication of this paper, the updated intrusion detection dataset, BCCC-CSE-CIC-IDS2018, will be publicly available on our website [8]. The implementation code for NTLFlowLyzer will be accessible in the GitHub repository [10].

References

  1. Abdulganiyu, O.H., Ait Tchakoucht, T., Saheed, Y.K.: A systematic literature review for network intrusion detection system (ids). Int. J. Inf. Secur. 1–38 (2023)

  2. de Neira, A.B., Kantarci, B., Nogueira, M.: Distributed denial of service attack prediction: challenges, open issues and opportunities. Comput. Netw. 222, 109553 (2023)

  3. Alashhab, Z.R., Anbar, M., Singh, M.M., Hasbullah, I.H., Jain, P., Al-Amiedy, T.A.: Distributed denial of service attacks against cloud computing environment: survey, issues, challenges and coherent taxonomy. Appl. Sci. 12(23), 12441 (2022)

  4. Markevych, M., Dawson, M.: A review of enhancing intrusion detection systems for cybersecurity using artificial intelligence (ai). Int. Confer. Knowl.-Based Org. 29, 30–37 (2023)

  5. Aloqaily, M., Kanhere, S., Bellavista, P., Nogueira, M.: Special issue on cybersecurity management in the era of ai. J. Netw. Syst. Manage. 30(3), 39 (2022)

  6. Shafi, M., Lashkari, A.H., Rodriguez, V., Nevo, R.: Toward generating a new cloud-based distributed denial of service (DDoS) dataset and cloud intrusion traffic characterization. Information 15(4), 195 (2024)

  7. Thakkar, A., Lohiya, R.: A review of the advancement in intrusion detection datasets. Proced. Comput. Sci. 167, 636–645 (2020)

  8. BCCC-CSE-CIC-IDS2018. BCCC updated intrusion detection dataset (2018) (bccc-cse-cic-ids2018). https://www.yorku.ca/research/bccc/ucs-technical/cybersecurity-datasets-cds/

  9. Lashkari, A.H., Draper-Gil, G., Mamun, M.S.I., Ghorbani, A.A., et al.: Characterization of tor traffic using time based features. In: ICISSP, pp. 253–262 (2017)

  10. BCCC: Network and transport layer flow analyzer (ntlflowlyzer). Behaviour-Centric Cybersecurity Center (BCCC). https://github.com/ahlashkari/NTLFlowLyzer

  11. Tang, B., Wang, J., Yu, Z., Chen, B., Ge, W., Yu, J., Lu, T.: Advanced persistent threat intelligent profiling technique: a survey. Comput. Electr. Eng. 103, 108261 (2022)

  12. Shafi, M., Lashkari, A.H., Roudsari, A.H.: Ntlflowlyzer: towards generating an intrusion detection dataset and intruders behavior profiling through network and transport layers traffic analysis and pattern extraction. Comput. Secur. 148, 104160 (2025)

  13. Masdari, M., Khezri, H.: A survey and taxonomy of the fuzzy signature-based intrusion detection systems. Appl. Soft Comput. 92, 106301 (2020)

  14. Ayyagari, M.R., Kesswani, N., Kumar, M., Kumar, K.: Intrusion detection techniques in network environment: a systematic review. Wirel. Netw. 27(2), 1269–1285 (2021)

  15. Khraisat, A., Gondal, I., Vamplew, P., Kamruzzaman, J.: Survey of intrusion detection systems: techniques, datasets and challenges. Cybersecurity 2(1), 1–22 (2019)

  16. Hajj, S., El Sibai, R., Bou Abdo, J., Demerjian, J., Makhoul, A., Guyeux, C.: Anomaly-based intrusion detection systems: the requirements, methods, measurements, and datasets. Trans. Emerging Telecommun. Technol. 32(4), e4240 (2021)

  17. Kocher, G., Kumar, G.: Machine learning and deep learning methods for intrusion detection systems: recent developments and challenges. Soft. Comput. 25(15), 9731–9763 (2021)

  18. Devi, M., Nandal, P., Sehrawat, H.: A novel rule-based intrusion detection framework for secure wireless sensor networks (2023)

  19. Einy, S., Oz, C., Navaei, Y.D.: The anomaly-and signature-based ids for network security using hybrid inference systems. Math. Probl. Eng. 2021, 1–10 (2021)

  20. Varzaneh, Z.A., Kuchaki Rafsanjani, M.: Intrusion detection system using a new fuzzy rule-based classification system based on genetic algorithm. Intell. Dec. Technol. 15(2), 231–237 (2021)

  21. Asad, H., Gashi, I.: Dynamical analysis of diversity in rule-based open source network intrusion detection systems. Empir. Softw. Eng. 27, 1–30 (2022)

  22. Díaz-Verdejo, J., Muñoz-Calle, J., Estepa Alonso, A., Estepa Alonso, R., Madinabeitia, G.: On the detection capabilities of signature-based intrusion detection systems in the context of web attacks. Appl. Sci. 12(2), 852 (2022)

  23. Liao, H.-J., Lin, C.-H.R., Lin, Y.-C., Tung, K.-Y.: Intrusion detection system: a comprehensive review. J. Netw. Comput. Appl. 36(1), 16–24 (2013)

  24. Mushtaq, E., Zameer, A., Umer, M., Abbasi, A.A.: A two-stage intrusion detection system with auto-encoder and lstms. Appl. Soft Comput. 121, 108768 (2022)

  25. Tama, B.A., Comuzzi, M., Rhee, K.-H.: Tse-ids: a two-stage classifier ensemble for intelligent anomaly-based intrusion detection system. IEEE Access 7, 94497–94507 (2019)

  26. Dina, A.S., Manivannan, D.: Intrusion detection based on machine learning techniques in computer networks. Internet Things 16, 100462 (2021)

  27. Zavrak, S., İskefiyeli, M.: Anomaly-based intrusion detection from network flow features using variational autoencoder. IEEE Access 8, 108346–108358 (2020)

  28. Salo, F., Injadat, M., Nassif, A.B., Shami, A., Essex, A.: Data mining techniques in intrusion detection systems: a systematic literature review. IEEE Access 6, 56046–56058 (2018)

  29. Guibene, K., Messai, N., Ayaida, M., Khoukhi, L.: A data mining-based intrusion detection system for cyber physical power systems. In: Proceedings of the 18th ACM International Symposium on QoS and Security for Wireless and Mobile Networks, pp. 55–62 (2022)

  30. Mohan, L., Jain, S., Suyal, P., Kumar, A.: Data mining classification techniques for intrusion detection system. In: 2020 12th International Conference on Computational Intelligence and Communication Networks (CICN), pp. 351–355. IEEE (2020)

  31. Monzer, M.-H., Beydoun, K., Ghaith, A., Flaus, J.-M.: Model-based ids design for icss. Reliab. Eng. Syst. Saf. 225, 108571 (2022)

  32. Sonchack, J., Aviv, A.J., Smith, J.M.: Cross-domain collaboration for improved ids rule set selection. J. Inf. Secur. Appl. 24, 25–40 (2015)

  33. Sagala, A.: Automatic snort ids rule generation based on honeypot log. In: 2015 7th International Conference on Information Technology and Electrical Engineering (ICITEE), pp. 576–580. IEEE (2015)

  34. Tomandl, A., Fuchs, K.-P., Federrath, H.: Rest-net: a dynamic rule-based ids for vanets. In: 2014 7th IFIP Wireless and Mobile Networking Conference (WMNC), pp. 1–8. IEEE (2014)

  35. Afzal, Z., Lindskog, S.: Ids rule management made easy. In: 2016 8th International Conference on Electronics, Computers and Artificial Intelligence (ECAI), pp. 1–8. IEEE (2016)

  36. AlYousef, M.Y., Abdelmajeed, N.T.: Dynamically detecting security threats and updating a signature-based intrusion detection system’s database. Proced. Comput. Sci. 159, 1507–1516 (2019)

  37. Li, W., Tug, S., Meng, W., Wang, Y.: Designing collaborative blockchained signature-based intrusion detection in iot environments. Futur. Gener. Comput. Syst. 96, 481–489 (2019)

  38. Wang, Y., Meng, W., Li, W., Li, J., Liu, W.-X., Xiang, Y.: A fog-based privacy-preserving approach for distributed signature-based intrusion detection. J. Parallel Distrib. Comput. 122, 26–35 (2018)

  39. Zhang, C., Jia, D., Wang, L., Wang, W., Liu, F., Yang, A.: Comparative research on network intrusion detection methods based on machine learning. Comput. Secur. 121, 102861 (2022)

  40. Ahmad, Z., Shahid Khan, A., Wai Shiang, C., Abdullah, J., Ahmad, F.: Network intrusion detection system: a systematic study of machine learning and deep learning approaches. Trans. Emerging Telecommun. Technol. 32(1), e4150 (2021)

  41. Kim, T., Pak, W.: Real-time network intrusion detection using deferred decision and hybrid classifier. Futur. Gener. Comput. Syst. 132, 51–66 (2022)

  42. Qiu, W., Ma, Y., Chen, X., Yu, H., Chen, L.: Hybrid intrusion detection system based on Dempster–Shafer evidence theory. Comput. Secur. 117, 102709 (2022)

  43. Herrera-Semenets, V., Hernández-León, R., van den Berg, J.: A fast instance reduction algorithm for intrusion detection scenarios. Comput. Electr. Eng. 101, 107963 (2022)

  44. Baldini, G., Amerini, I.: Online distributed denial of service (DDOS) intrusion detection based on adaptive sliding window and morphological fractal dimension. Comput. Netw. 210, 108923 (2022)

  45. Asif, M., Abbas, S., Khan, M., Fatima, A., Khan, M.A., Lee, S.-W.: Mapreduce based intelligent model for intrusion detection using machine learning technique. J. King Saud Univer.-Comput. Inf. Sci. (2021)

  46. Hou, J., Liu, F., Lu, H., Tan, Z., Zhuang, X., Tian, Z.: A novel flow-vector generation approach for malicious traffic detection. J. Parallel Distrib. Comput. 169, 72–86 (2022)

  47. Rabbani, M., Wang, Y.L., Khoshkangini, R., Jelodar, H., Zhao, R., Hu, P.: A hybrid machine learning approach for malicious behaviour detection and recognition in cloud computing. J. Netw. Comput. Appl. 151, 102507 (2020)

  48. Herrmann, D., Banse, C., Federrath, H.: Behavior-based tracking: exploiting characteristic patterns in DNS traffic. Comput. Secur. 39, 17–33 (2013)

  49. Imran, M., Haider, N., Shoaib, M., Razzak, I., et al.: An intelligent and efficient network intrusion detection system using deep learning. Comput. Electr. Eng. 99, 107764 (2022)

  50. Liu, Q., Wang, D., Jia, Y., Luo, S., Wang, C.: A multi-task based deep learning approach for intrusion detection. Knowl.-Based Syst. 238, 107852 (2022)

  51. Ravi, V., Chaganti, R., Alazab, M.: Recurrent deep learning-based feature fusion ensemble meta-classifier approach for intelligent network intrusion detection system. Comput. Electr. Eng. 102, 108156 (2022)

  52. Li, B., Wang, Y., Xu, K., Cheng, L., Qin, Z.: Dfaid: density-aware and feature-deviated active intrusion detection over network traffic streams. Comput. Secur. 118, 102719 (2022)

  53. Shiravi, A., Shiravi, H., Tavallaee, M., Ghorbani, A.A.: Toward developing a systematic approach to generate benchmark datasets for intrusion detection. Comput. Secur. 31(3), 357–374 (2012)

  54. Garcia, S., Grill, M., Stiborek, J., Zunino, A.: An empirical comparison of botnet detection methods. Comput. Secur. 45, 100–123 (2014)

  55. Moustafa, N., Slay, J.: Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In: 2015 Military Communications and Information Systems Conference (MilCIS), pp. 1–6. IEEE (2015)

  56. Lawrence, H., Ezeobi, U., Tauil, O., Nosal, J., Redwood, O., Zhuang, Y., Bloom, G.: Cupid: a labeled dataset with pentesting for evaluation of network intrusion detection. J. Syst. Arch. 129, 102621 (2022)

  57. Benesty, J., Chen, J., Huang, Y., Cohen, I.: Pearson correlation coefficient. Noise Reduct. Speech Process. 1–4 (2009)

  58. Awerbuch, B.: A new distributed depth-first-search algorithm. Inf. Process. Lett. 20(3), 147–150 (1985)

  59. Chen, Y.-C.: A tutorial on kernel density estimation and recent advances. Biostat. Epidemiol. 1(1), 161–187 (2017)

  60. Heer, J.: Fast & accurate gaussian kernel density estimation. In: 2021 IEEE Visualization Conference (VIS), pp. 11–15. IEEE (2021)

  61. Sheikhpour, R., Sarram, M.A., Sheikhpour, R.: Particle swarm optimization for bandwidth determination and feature selection of kernel density estimation based classifiers in diagnosis of breast cancer. Appl. Soft Comput. 40, 113–131 (2016)

  62. He, Y.-L., Ye, X., Huang, D.-F., Huang, J.Z., Zhai, J.-H.: Novel kernel density estimator based on ensemble unbiased cross-validation. Inf. Sci. 581, 327–344 (2021)

  63. Kotsiantis, S., Kanellopoulos, D.: Association rules mining: a recent overview. GESTS Int. Trans. Comput. Sci. Eng. 32(1), 71–82 (2006)

  64. Zeng, Y., Yin, S., Liu, J., Zhang, M.: Research of improved fp-growth algorithm in association rules mining. Sci. Program. 2015, 6–6 (2015)

  65. Kennedy, J., Eberhart, R.: Particle swarm optimization. In: Proceedings of ICNN’95-International Conference on Neural Networks, vol. 4, pp. 1942–1948. IEEE (1995)

  66. Poli, R., Kennedy, J., Blackwell, T.: Particle swarm optimization: an overview. Swarm Intell. 1, 33–57 (2007)

  67. Zhang, Z.: Improved adam optimizer for deep neural networks. In: 2018 IEEE/ACM 26th International Symposium on Quality of Service (IWQoS), pp. 1–2. IEEE (2018)

  68. Wei, Z., Wang, J., Zhao, Z., Shi, K.: Toward data efficient anomaly detection in heterogeneous edge-cloud environments using clustered federated learning. Futur. Gener. Comput. Syst. 164, 107559 (2025)

  69. Dilworth, R., Gudla, C.: Harnessing pu learning for enhanced cloud-based ddos detection: a comparative analysis. arXiv preprint arXiv:2410.18380 (2024)

  70. Shafi, M., Lashkari, A.H., Mohanty, H.: Unveiling malicious dns behavior profiling and generating benchmark dataset through application layer traffic analysis. Comput. Electr. Eng. 118, 109436 (2024)

  71. Zou, F., Ren, Y., Zhu, J., Tang, J.: Detecting data leakage in dns traffic based on time series anomaly detection. In: 2021 IEEE 23rd Int Conf on High Performance Computing & Communications; 7th Int Conf on Data Science & Systems; 19th Int Conf on Smart City; 7th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys), pp. 503–510. IEEE (2021)

  72. Lison, P., Mavroeidis, V.: Neural reputation models learned from passive dns data. In: 2017 IEEE International Conference on Big Data (Big Data), pp. 3662–3671. IEEE (2017)

  73. Göcs, L., Johanyák, Z.C.: Identifying relevant features of cse-cic-ids2018 dataset for the development of an intrusion detection system. arXiv preprint arXiv:2307.11544 (2023)

  74. Sarhan, M., Layeghy, S., Portmann, M.: Feature analysis for machine learning-based iot intrusion detection. arXiv preprint arXiv:2108.12732 (2021)

  75. Cambiaso, E., Papaleo, G., Chiola, G., Aiello, M.: Slow dos attacks: definition and categorisation. Int. J. Trust Manag. Comput. Commun. 1(3–4), 300–319 (2013)

  76. Cambiaso, E., Papaleo, G., Aiello, M.: Taxonomy of slow dos attacks to web applications. In: Recent Trends in Computer Networks and Distributed Systems Security: International Conference, SNDS 2012, Trivandrum, India, October 11–12, 2012. Proceedings 1, pp. 195–204. Springer (2012)

  77. Lashkari, A.H., Gil, G.D., Mamun, M.S.I., Ghorbani, A.A.: Characterization of tor traffic using time based features. In: Proceeding of the 3rd International Conference on Information System Security and Privacy, SCITEPRESS (2017)

Acknowledgements

The authors acknowledge the grant from Canada Research Chair—Tier II (#CRC-2021-00340) and the Natural Sciences and Engineering Research Council of Canada—NSERC (#RGPIN-2020-04701)—to Arash Habibi Lashkari.

Funding

The authors acknowledge the grant from Canada Research Chair—Tier II (#CRC-2021-00340) and the Natural Sciences and Engineering Research Council of Canada—NSERC (#RGPIN-2020-04701)—to Arash Habibi Lashkari.

Author information

Contributions

MohammadMoein Shafi: Designed and implemented the model and all related code; contributed to conceptualization; wrote and prepared the main manuscript text. Arash Habibi Lashkari: Contributed to supervision, conceptualization, review and editing of the manuscript, and securing funding. Arousha Haghighian Roudsari: Provided advice and suggestions as a collaborator, and reviewed and edited the manuscript.

Corresponding author

Correspondence to MohammadMoein Shafi.

Ethics declarations

Conflict of interest

The authors have no relevant conflict of interest to disclose concerning the content of this paper.

Ethical Approval

This article does not involve any research studies with human participants or animals conducted by any of the authors.

Consent for Publication

Permitted.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Shafi, M., Lashkari, A.H. & Roudsari, A.H. Toward Generating a Large Scale Intrusion Detection Dataset and Intruders Behavioral Profiling Using Network and Transportation Layers Traffic Flow Analyzer (NTLFlowLyzer). J Netw Syst Manage 33, 44 (2025). https://doi.org/10.1007/s10922-025-09917-0

  • DOI: https://doi.org/10.1007/s10922-025-09917-0