
A reinforcement learning-based approach for online optimal control of self-adaptive real-time systems

  • Original Article
  • Published in: Neural Computing and Applications

Abstract

This paper deals with self-adaptive real-time embedded systems (RTES). A self-adaptive system can operate in different modes, and each mode encodes a set of real-time tasks. To be executed, each task is allocated to a processor (placement) and assigned a priority (scheduling) while respecting timing constraints. An adaptation scenario switches the system between modes by adding or removing tasks and updating task parameters; the resulting task set must still meet its deadlines after adaptation. For such systems, anticipating all operational modes at design time is usually impossible, and online reinforcement learning is increasingly used to cope with such design-time uncertainty. To tackle this problem, we formalize the placement and scheduling problems in self-adaptive RTES as a Markov decision process and propose related algorithms based on Q-learning. We then introduce an approach that integrates the proposed algorithms to assist designers in the development of self-adaptive RTES. At the design level, the RL Placement and RL Scheduler modules process predictable adaptation scenarios; they generate placement and scheduling models for an application while maximizing system extensibility and ensuring real-time feasibility. At the execution level, the RL Adapter processes online adaptations: its agent rejects an adaptation scenario when feasibility concerns are raised and otherwise generates a new feasible placement and scheduling. We apply the proposed approach to a healthcare robot case study and simulate it to show its applicability. Performance evaluations demonstrate the effectiveness of the proposed approach compared with related works.
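To make the Q-learning formulation concrete, the sketch below shows how a tabular agent might learn task-to-processor placement: the state captures the next task to place together with the current (discretized) processor utilizations, the action selects a processor, and the reward penalizes placements that violate a per-processor rate-monotonic utilization bound. This is a minimal illustration under assumed task parameters and reward shaping, not the authors' implementation; all class and function names are hypothetical.

```python
import random
from collections import defaultdict

def rm_bound(n):
    """Liu & Layland utilization bound for rate-monotonic scheduling of n tasks."""
    return n * (2 ** (1.0 / n) - 1) if n > 0 else 1.0

class PlacementQLearner:
    """Illustrative tabular Q-learning agent for task placement (assumed design)."""

    def __init__(self, tasks, n_procs, alpha=0.1, gamma=0.9, epsilon=0.2):
        self.tasks = tasks            # list of (wcet, period) pairs
        self.n_procs = n_procs
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)   # Q[(state, action)] -> value

    def _state(self, idx, loads):
        # State: index of the next task to place + coarsely discretized loads.
        return (idx, tuple(round(u, 1) for u in loads))

    def _choose(self, state):
        # Epsilon-greedy action selection over the available processors.
        if random.random() < self.epsilon:
            return random.randrange(self.n_procs)
        return max(range(self.n_procs), key=lambda a: self.q[(state, a)])

    def episode(self):
        loads = [0.0] * self.n_procs   # utilization per processor
        counts = [0] * self.n_procs    # number of tasks per processor
        total = 0.0
        for idx, (wcet, period) in enumerate(self.tasks):
            state = self._state(idx, loads)
            action = self._choose(state)
            loads[action] += wcet / period
            counts[action] += 1
            # Reward shaping (assumption): penalize placements that break the
            # per-processor RM feasibility bound, mildly reward balanced load.
            if loads[action] > rm_bound(counts[action]):
                reward = -10.0
            else:
                reward = 1.0 - loads[action]
            next_state = self._state(idx + 1, loads)
            best_next = max(self.q[(next_state, a)] for a in range(self.n_procs))
            self.q[(state, action)] += self.alpha * (
                reward + self.gamma * best_next - self.q[(state, action)])
            total += reward
        return total

# Example: three tasks (wcet, period) placed on two processors.
if __name__ == "__main__":
    learner = PlacementQLearner([(1, 4), (2, 6), (3, 12)], n_procs=2)
    for _ in range(2000):
        learner.episode()
```

A similar agent could, in principle, handle priority assignment (scheduling) by letting the action pick a priority level instead of a processor, with the reward derived from a response-time feasibility test.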





Data availability

The datasets generated and/or analysed during the current study are available from the corresponding author upon reasonable request.


Author information

Correspondence to Bakhta Haouari or Rania Mzid.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Haouari, B., Mzid, R. & Mosbahi, O. A reinforcement learning-based approach for online optimal control of self-adaptive real-time systems. Neural Comput & Applic 35, 20375–20401 (2023). https://doi.org/10.1007/s00521-023-08778-5
