Abstract
Network flow records consist of categorical and numerical features that provide context data and summary statistics computed from the raw packets exchanged between pairs of nodes in a network. Flow records labeled by human experts are typically used in high-speed networks to design and evaluate intrusion detection systems. Despite the ever-increasing body of literature on flow-based intrusion detection, no contribution has investigated how accurately flow records render the traffic class of the original aggregation of packets.
This paper proposes a collaborative filtering approach to compute sanitized labels for a given set of flow records. Sanitized labels are compared with the labels assigned by human experts. Experiments are conducted with CICIDS2017, an intrusion detection dataset that provides raw packets and labeled flow records obtained under benign operations and attack conditions. Results indicate that around 3.61% of flow records might fail to render benign aggregations of packets; surprisingly, the percentage of flow records that fail to render aggregations of packets pertaining to attacks ranges from 5.39% to 27.18%, depending on the type of attack. These findings point either to the need to improve the collected features or to potential imperfections in how the flow records are computed.
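The idea of comparing sanitized labels with expert labels can be illustrated with a minimal sketch. The abstract does not specify the collaborative filtering procedure, so the code below assumes a simple neighbor-based scheme: each flow record receives the majority label of its k nearest records, and the per-class disagreement rate is then computed. The helper names (`sanitize_labels`, `disagreement_rate`), the Euclidean metric, the value of k, and the toy data are all assumptions made for illustration; none of them comes from the paper or from CICIDS2017.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sanitize_labels(features, labels, k=3):
    """Return one neighbor-voted label per record (hypothetical helper).

    Each record receives the majority label among its k nearest
    neighbors, excluding the record itself.
    """
    sanitized = []
    for i, record in enumerate(features):
        # Distances to every other record; ascending sort puts the closest first.
        neighbors = sorted(
            (euclidean(record, other), j)
            for j, other in enumerate(features)
            if j != i
        )[:k]
        votes = Counter(labels[j] for _, j in neighbors)
        sanitized.append(votes.most_common(1)[0][0])
    return sanitized


def disagreement_rate(labels, sanitized, target):
    """Fraction of records labeled `target` whose sanitized label differs."""
    idx = [i for i, lab in enumerate(labels) if lab == target]
    if not idx:
        return 0.0
    return sum(sanitized[i] != labels[i] for i in idx) / len(idx)


# Toy data, made up for illustration: two well-separated clusters, plus one
# record labeled ATTACK that sits inside the benign cluster.
features = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
    (0.05, 0.05),
]
labels = ["BENIGN"] * 4 + ["ATTACK"] * 4

sanitized = sanitize_labels(features, labels, k=3)
print(sanitized)
print("BENIGN disagreement:", disagreement_rate(labels, sanitized, "BENIGN"))
print("ATTACK disagreement:", disagreement_rate(labels, sanitized, "ATTACK"))
```

In this toy setting the last record is relabeled as benign by its neighbors, so one of the four records labeled as attack disagrees with its sanitized label. The percentages reported in the abstract are disagreement rates of this kind, although the actual procedure may differ from the sketch.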
Notes
- 10. If a distance metric is adopted, the computed distances need to be sorted in ascending order; on the other hand, sorting needs to be in descending order in case of similarity metrics.
- 13. For a small number of flow records the protocol field is unspecified.
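The sort-order rule stated in note 10 can be sketched as follows. The function `rank_neighbors`, the two metrics, and the toy vectors are hypothetical and serve only to show that distances are ranked in ascending order while similarities are ranked in descending order.

```python
import math


def rank_neighbors(scores, higher_is_closer):
    """Return candidate indices ordered from closest to farthest.

    scores[i] is the score between a query and candidate i. A distance
    metric (higher_is_closer=False) is sorted in ascending order; a
    similarity metric (higher_is_closer=True) is sorted in descending order.
    """
    return sorted(range(len(scores)), key=lambda i: scores[i],
                  reverse=higher_is_closer)


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors, made up for illustration.
query = (1.0, 0.0)
candidates = [(0.0, 1.0), (1.0, 0.1), (-1.0, 0.0)]

distances = [euclidean_distance(query, c) for c in candidates]
similarities = [cosine_similarity(query, c) for c in candidates]

by_distance = rank_neighbors(distances, higher_is_closer=False)
by_similarity = rank_neighbors(similarities, higher_is_closer=True)
print(by_distance, by_similarity)
```

Here both rankings place the second candidate first. Sorting the similarities in ascending order instead would put the least similar candidate at the top, which is the mistake the note warns against.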
Acknowledgment
Andrea Del Vecchio gratefully acknowledges support by the “Orio Carlini” 2020 GARR Consortium Fellowship.
Copyright information
© 2022 IFIP International Federation for Information Processing
About this paper
Cite this paper
Catillo, M., Vecchio, A.D., Pecchia, A., Villano, U. (2022). On the Quality of Network Flow Records for IDS Evaluation: A Collaborative Filtering Approach. In: Clark, D., Menendez, H., Cavalli, A.R. (eds) Testing Software and Systems. ICTSS 2021. Lecture Notes in Computer Science, vol 13045. Springer, Cham. https://doi.org/10.1007/978-3-031-04673-5_16
DOI: https://doi.org/10.1007/978-3-031-04673-5_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-04672-8
Online ISBN: 978-3-031-04673-5
eBook Packages: Computer Science, Computer Science (R0)