FAA+RTS: Designing Fault-Aware Adaptive Real-Time Systems —From Specification to Execution—

  • Conference paper
Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS 2024)

Abstract

Large-scale cyber-physical systems, such as those for subway transportation or air traffic control, are becoming increasingly complex and often need to operate without human intervention. At the same time, these systems are subject to strict requirements on timing behavior and fault tolerance. Consequently, the detection and mitigation of both hard and soft errors are of high importance in the already complex system design process. The main challenge towards fault-aware real-time systems is the overall system design, in which the sheer size of the state space and the system's complexity exceed the capacity of today's development tools. In this paper, we present a new holistic methodology, called FAA+RTS, for designing fault-aware adaptive real-time systems. We cover the entire path from system specification using a coordination language, via design-space exploration and task scheduling, to the adaptive fault-aware runtime environment. Mitigating both hard and soft errors means addressing competing requirements: improving soft-error tolerance (through redundant execution) may accelerate the aging of silicon, thus expediting hard-error failures. FAA+RTS is a novel solution in that it integrates previously isolated methods for dealing with multiple constraints into a single framework, presenting the application designer with a single overview of all possible trade-offs. This integration ensures that all aspects of system design, from specification to execution, are cohesively addressed, resulting in a robust and reliable system. We exemplify FAA+RTS using an industrial-sized autonomous subway transportation system as a use case.

Author information

Corresponding author

Correspondence to Lukas Miedema.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Miedema, L., Sapra, D., Novobilsky, P., Altmeyer, S., Grelck, C., Pimentel, A.D. (2025). FAA+RTS: Designing Fault-Aware Adaptive Real-Time Systems —From Specification to Execution—. In: Carro, L., Regazzoni, F., Pilato, C. (eds) Embedded Computer Systems: Architectures, Modeling, and Simulation. SAMOS 2024. Lecture Notes in Computer Science, vol 15226. Springer, Cham. https://doi.org/10.1007/978-3-031-78377-7_1

  • DOI: https://doi.org/10.1007/978-3-031-78377-7_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-78376-0

  • Online ISBN: 978-3-031-78377-7

  • eBook Packages: Computer Science, Computer Science (R0)
