The Predictable Execution Model in Practice: Compiling Real Applications for COTS Hardware

Published: 29 July 2021

Abstract

Adoption of multi- and many-core processors in real-time systems has so far been slowed down, if not totally barred, due to the difficulty in providing analytical real-time guarantees on worst-case execution times. The Predictable Execution Model (PREM) has been proposed to solve this problem, but its practical support requires significant code refactoring, a task better suited for a compilation tool chain than human programmers. Implementing a PREM compiler presents significant challenges in conforming to PREM requirements, such as guaranteed upper bounds on memory footprint and the generation of efficient schedulable non-preemptive regions. This article presents a comprehensive description of how a PREM compiler can be implemented, based on several years of experience from the community. We provide accumulated insights on how to best balance conformance to real-time requirements and performance, and present novel techniques that extend the applicability from simple benchmark suites to real-world applications. We show that code transformed by the PREM compiler enables timing-predictable execution on modern commercial off-the-shelf hardware, providing novel insights on how PREM can protect 99.4% of memory accesses on caches with a random replacement policy at only 16% performance loss on benchmarks from the PolyBench benchmark suite. Finally, we show that the requirements imposed on the programming model are well-aligned with current coding guidelines for timing-critical software, promoting easy adoption.
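
As a rough illustration of the kind of refactoring the abstract refers to, the C sketch below splits a simple loop into PREM-style intervals, each consisting of a memory phase that stages a bounded tile of data followed by a compute phase that touches only the staged data. It is a hand-written sketch under stated assumptions, not output of the compiler described in the article: the function name `prem_scale`, the tile size `TILE`, and the buffer `local` are hypothetical, and `memcpy` merely stands in for whatever prefetch or scratchpad transfer a real platform would use.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical tile size: assumed small enough that one tile fits in the
   core-local memory (cache or scratchpad) that the interval may use. */
#define TILE 256

/* Scale n floats in place, processed as a sequence of PREM-style intervals. */
void prem_scale(float *data, size_t n, float factor)
{
    /* Stand-in for the cache- or scratchpad-resident working set. */
    static float local[TILE];

    for (size_t base = 0; base < n; base += TILE) {
        size_t len = (n - base < TILE) ? (n - base) : TILE;

        /* Memory phase: the interval's reads from main memory happen here,
           so its footprint is bounded by TILE elements. */
        memcpy(local, &data[base], len * sizeof(float));

        /* Compute phase: operates on the local buffer only. */
        for (size_t i = 0; i < len; i++) {
            local[i] *= factor;
        }

        /* Write-back phase: results return to main memory. */
        memcpy(&data[base], local, len * sizeof(float));
    }
}
```

In this sketch the footprint bound is fixed at compile time by `TILE`, which is the property a scheduler would rely on when granting each interval exclusive access to main memory during its memory phases.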

        • Published in

          ACM Transactions on Embedded Computing Systems, Volume 20, Issue 5
          September 2021
          342 pages
          ISSN: 1539-9087
          EISSN: 1558-3465
          DOI: 10.1145/3468851
          • Editor: Tulika Mitra

          Copyright © 2021 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 29 July 2021
          • Accepted: 1 May 2021
          • Received: 1 November 2020
          • Revised: 1 March 2020

          Qualifiers

          • research-article
          • Research
          • Refereed