Self-organising Approach to Anomaly Mitigation in the Cloud-to-Edge Continuum

Faria, Bruno; Abreu, David Perez; Velasquez, Karima; Curado, Marília

doi:10.1007/978-3-031-81375-7_15

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15506)

Included in the following conference series:

International Conference on Cooperative Information Systems

Abstract

The cloud-to-edge continuum paradigm has permeated various application domains, including critical urban-city safety systems. In these contexts, anomalies can compromise public safety, for example, by disrupting the communication between smart city infrastructure and vehicles, which aims to prevent accidents at pedestrian crossings. Given these environments’ heterogeneous and large-scale nature, manual recovery from anomalies is not feasible. Machine Learning techniques have emerged as an alternative, supporting a zero-touch approach that enables self-organising and self-healing solutions for anomaly prediction, detection, and mitigation. This paper proposes an Artificial Intelligence-driven, self-organising approach for anomaly management in the cloud-to-edge continuum, integrating both reactive and proactive mechanisms. We evaluate different Machine Learning models, including Random Forest Classifiers, Neural Networks, and Convolutional Neural Networks, to predict node performance anomalies. The simulation results obtained using the COSCO framework showcase the effectiveness of our method. It achieves an F1 score of 73% for multiclass classification, predicting different levels of anomaly severity, and 87% for binary classification, distinguishing between normal and abnormal states.
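The two F1 figures above come from two views of the same prediction task: one keeps every anomaly severity level as its own class, the other collapses all severity levels into a single "abnormal" class. The sketch below illustrates that relationship on toy data. It is not the authors' evaluation code: the label encoding (0 for normal, 1 and 2 for increasing severity), the use of macro averaging for the multiclass score, and the sample labels are all assumptions made for illustration.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1}, scoring each label one-vs-rest."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def to_binary(labels):
    """Collapse every severity level above 0 into one 'abnormal' class."""
    return [0 if label == 0 else 1 for label in labels]


# Hypothetical labels: 0 = normal, 1 = mild anomaly, 2 = severe anomaly.
y_true = [0, 0, 1, 2, 2, 1, 0, 2]
y_pred = [0, 1, 1, 2, 1, 1, 0, 2]

multiclass_f1 = macro_f1(y_true, y_pred)
# Binary F1 is reported for the 'abnormal' class (label 1 after collapsing).
binary_f1 = f1_per_class(to_binary(y_true), to_binary(y_pred))[1]

print(f"multiclass macro-F1: {multiclass_f1:.3f}")
print(f"binary F1 (abnormal): {binary_f1:.3f}")
```

In this toy data, the prediction that confuses a severe anomaly with a mild one counts as an error in the multiclass view but as a correct "abnormal" prediction in the binary view, so the binary score is higher.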

Notes

1. Node’s Million Instructions Per Second (MIPS) maximum capacity. It was removed because, under local normalisation, it would always equal 1.
2. https://github.com/brunofaria1322/COSCO
3. http://gwa.ewi.tudelft.nl/datasets/gwa-t-12-bitbrains
4. https://manpages.ubuntu.com/manpages/xenial/man1/stress-ng.1.html
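Note 1 can be illustrated with a small sketch. It assumes that "local normalisation" means scaling each node's values by that node's own maximum capacity, which is one reading of the note rather than a confirmed description of the authors' pipeline. The feature names (`mips_capacity`, `mips_used`) and the values are hypothetical.

```python
def normalise_locally(node):
    """Divide every MIPS-valued feature by the node's own maximum capacity."""
    capacity = node["mips_capacity"]
    return {name: value / capacity for name, value in node.items()}


# Two hypothetical nodes with very different capacities.
nodes = [
    {"mips_capacity": 4000.0, "mips_used": 1000.0},
    {"mips_capacity": 16000.0, "mips_used": 12000.0},
]

normalised = [normalise_locally(node) for node in nodes]

for row in normalised:
    print(row)
```

Under this assumption the capacity column becomes 1 for every node, so it carries no information for a classifier, while the usage column remains informative.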

Acknowledgments

This work is funded by the FCT - Foundation for Science and Technology, I.P./MCTES through national funds (PIDDAC), within the scope of CISUC R&D Unit - UIDB/00326/2020 or project code UIDP/00326/2020.

Content produced within the scope of the Agenda “NEXUS - Pacto de Inovação - Transição Verde e Digital para Transportes, Logística e Mobilidade”, financed by the Portuguese Recovery and Resilience Plan (PRR), with no. C645112083-00000059 (investment project no. 53).

Author information

Authors and Affiliations

Laboratory for Informatics and Systems, Pedro Nunes Institute, Coimbra, Portugal
Bruno Faria & David Perez Abreu
CISUC, Department of Informatics Engineering, University of Coimbra, Coimbra, Portugal
Bruno Faria, David Perez Abreu, Karima Velasquez & Marília Curado

Corresponding author

Correspondence to Bruno Faria.

Editor information

Editors and Affiliations

Ulsan National Institute of Science and Technology, Ulsan, Korea (Republic of)
Marco Comuzzi
Paris Dauphine - PSL University, Paris, France
Daniela Grigori
Institut Polytechnique de Paris, Paris, France
Mohamed Sellami
University of Geosciences, Beijing, Beijing, China
Zhangbing Zhou

Ethics declarations

Disclosure of Interests

The authors have no competing interests to declare that are relevant to the content of this article.

About this paper

Cite this paper

Faria, B., Abreu, D.P., Velasquez, K., Curado, M. (2025). Self-organising Approach to Anomaly Mitigation in the Cloud-to-Edge Continuum. In: Comuzzi, M., Grigori, D., Sellami, M., Zhou, Z. (eds) Cooperative Information Systems. CoopIS 2024. Lecture Notes in Computer Science, vol 15506. Springer, Cham. https://doi.org/10.1007/978-3-031-81375-7_15

DOI: https://doi.org/10.1007/978-3-031-81375-7_15
Published: 14 February 2025
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-81374-0
Online ISBN: 978-3-031-81375-7
eBook Packages: Computer Science, Computer Science (R0)
