Abstract
In the dynamic realm of modern technology, the rapid growth of Internet of Things (IoT) devices introduces new challenges for network security and reliability. The heterogeneous nature of IoT environments complicates the task of network operators and security experts, who must face increasingly sophisticated threats. Additionally, relying only on network traffic to detect user actions is problematic: the complexity of IoT environments and the variability of user actions make the distinction between legitimate activities and threats difficult to track. Recently, machine learning techniques have emerged as a way to identify threats in networked systems. Although such techniques are very powerful, they rely on reliable datasets that collect examples of both licit and malicious traffic. However, available datasets are often limited in the number of examples collected and in the documentation of how the traffic was monitored; moreover, labelling is not always reliable. Accordingly, this paper describes the development of a procedure to generate datasets using a dedicated test bed that captures user actions associated with smart-home IoT devices. In contrast to most datasets in the literature, the proposed procedure aims to offer a way to easily collect and label continuously produced data, generating datasets enriched with detailed descriptions of each device involved in traffic generation. We believe that this paper offers a first step towards the systematic production of datasets that are more suitable for the efficient use of machine learning techniques.
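As an illustration only, the idea of labelling continuously captured traffic with user actions can be sketched as a time-window match between a log of performed actions and the captured packets. This is a minimal sketch under stated assumptions, not the procedure developed in the paper: the `ActionWindow` and `Packet` records, the device-to-IP mapping, the `device:action` label format, and the `background` label are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ActionWindow:
    """Hypothetical log entry: a user action performed on a device."""
    device: str   # device identifier (assumed naming)
    action: str   # user-action label (assumed naming)
    start: float  # window start, seconds since epoch
    end: float    # window end, seconds since epoch


@dataclass(frozen=True)
class Packet:
    """Hypothetical, simplified view of one captured packet."""
    timestamp: float
    src: str
    dst: str
    length: int


def label_packet(
    packet: Packet,
    windows: List[ActionWindow],
    device_ips: Dict[str, str],
) -> Optional[str]:
    """Return 'device:action' for the first matching window, else None."""
    for window in windows:
        ip = device_ips.get(window.device)
        if ip is None:
            continue  # device has no known address, so it cannot match
        in_window = window.start <= packet.timestamp <= window.end
        involves_device = ip in (packet.src, packet.dst)
        if in_window and involves_device:
            return f"{window.device}:{window.action}"
    return None


def label_capture(
    packets: List[Packet],
    windows: List[ActionWindow],
    device_ips: Dict[str, str],
) -> List[Tuple[Packet, str]]:
    """Pair every packet with a label; unmatched traffic is 'background'."""
    return [
        (p, label_packet(p, windows, device_ips) or "background")
        for p in packets
    ]
```

In this sketch a packet receives an action label only if it both falls inside the action's time window and involves the address of the device on which the action was performed; everything else is treated as background traffic.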
Acknowledgment
This work was partially supported by the UPSIDE project (B63D23000820004) and the project DEFEDGE (E53D23016380001) under the PRIN program funded by the Italian MUR.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Rak, M., Granata, D., Esposito, A., Ferretti, A. (2024). Navigating IoT Complexity: Developing Datasets for Smart-Home Device Interactions. In: Barolli, L. (eds) Complex, Intelligent and Software Intensive Systems. CISIS 2024. Lecture Notes on Data Engineering and Communications Technologies, vol 87. Springer, Cham. https://doi.org/10.1007/978-3-031-70011-8_41
DOI: https://doi.org/10.1007/978-3-031-70011-8_41
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70010-1
Online ISBN: 978-3-031-70011-8
eBook Packages: Intelligent Technologies and Robotics (R0)