Abstract
The cloud-to-edge continuum paradigm has permeated various application domains, including critical urban safety systems. In these contexts, anomalies can compromise public safety, for example by disrupting the communication between smart city infrastructure and vehicles that is meant to prevent accidents at pedestrian crossings. Because these environments are heterogeneous and large in scale, manual recovery from anomalies is not feasible. Machine Learning techniques have emerged as an alternative, supporting a zero-touch approach that enables self-organising and self-healing solutions for anomaly prediction, detection, and mitigation. This paper proposes an Artificial Intelligence-driven, self-organising approach to anomaly management in the cloud-to-edge continuum that integrates both reactive and proactive mechanisms. We evaluate several Machine Learning models, including Random Forest Classifiers, Neural Networks, and Convolutional Neural Networks, for predicting node performance anomalies. Simulation results obtained with the COSCO framework show the effectiveness of our method: it achieves an F1 score of 73% for multiclass classification, which predicts different levels of anomaly severity, and 87% for binary classification, which distinguishes between normal and abnormal states.
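To make the two reported metrics concrete, the following minimal sketch computes an F1 score for a multiclass severity task and for the same predictions collapsed to a normal/abnormal task. It is an illustration only, not the evaluation code of the paper: the label encoding (0 = normal, 1 = minor, 2 = severe), the toy label vectors, the use of macro averaging, and the helper names `f1_per_class`, `macro_f1`, and `to_binary` are all assumptions introduced here.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1} from per-class true/false positives and false negatives."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def to_binary(labels, normal=0):
    """Collapse severity levels into 0 (normal) and 1 (abnormal)."""
    return [0 if label == normal else 1 for label in labels]


# Hypothetical severity labels: 0 = normal, 1 = minor, 2 = severe.
y_true = [0, 0, 1, 1, 2, 2, 0, 2]
y_pred = [0, 0, 1, 2, 2, 1, 0, 2]

multiclass_score = macro_f1(y_true, y_pred)
# F1 of the "abnormal" class after collapsing the severity levels.
binary_score = f1_per_class(to_binary(y_true), to_binary(y_pred))[1]

print(f"multiclass macro F1: {multiclass_score:.3f}")
print(f"binary F1 (abnormal): {binary_score:.3f}")
```

In this toy data the only errors are confusions between the two anomalous severity levels, so they lower the multiclass score but disappear once the labels are collapsed. This is one plausible reason why a binary score can exceed a multiclass score on the same predictions.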
Notes
- 1. The node’s maximum capacity in Million Instructions Per Second (MIPS). It was removed because, under local normalisation, it would always equal 1.
Acknowledgments
This work is funded by the FCT - Foundation for Science and Technology, I.P./MCTES through national funds (PIDDAC), within the scope of CISUC R&D Unit - UIDB/00326/2020 or project code UIDP/00326/2020.
Content produced within the scope of the Agenda “NEXUS - Pacto de Inovação - Transição Verde e Digital para Transportes, Logística e Mobilidade”, financed by the Portuguese Recovery and Resilience Plan (PRR), with no. C645112083-00000059 (investment project no. 53).
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Faria, B., Abreu, D.P., Velasquez, K., Curado, M. (2025). Self-organising Approach to Anomaly Mitigation in the Cloud-to-Edge Continuum. In: Comuzzi, M., Grigori, D., Sellami, M., Zhou, Z. (eds) Cooperative Information Systems. CoopIS 2024. Lecture Notes in Computer Science, vol 15506. Springer, Cham. https://doi.org/10.1007/978-3-031-81375-7_15
DOI: https://doi.org/10.1007/978-3-031-81375-7_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-81374-0
Online ISBN: 978-3-031-81375-7
eBook Packages: Computer Science, Computer Science (R0)