The Predictable Execution Model in Practice: Compiling Real Applications for COTS Hardware

Authors:
Björn Forsberg, ETH Zürich, Switzerland
Marco Solieri, University of Modena and Reggio Emilia, Italy
Marko Bertogna, University of Modena and Reggio Emilia, Italy
Luca Benini, ETH Zürich, Switzerland and University of Bologna, Italy
Andrea Marongiu, University of Modena and Reggio Emilia, Italy

ACM Transactions on Embedded Computing Systems, Volume 20, Issue 5, Article No. 47, pp. 1–25. https://doi.org/10.1145/3465370

Published: 29 July 2021

Abstract

Adoption of multi- and many-core processors in real-time systems has so far been slowed, if not barred outright, by the difficulty of providing analytical real-time guarantees on worst-case execution times. The Predictable Execution Model (PREM) has been proposed to solve this problem, but supporting it in practice requires significant code refactoring, a task better suited to a compilation toolchain than to human programmers. Implementing a PREM compiler that conforms to PREM requirements presents significant challenges, such as guaranteeing upper bounds on memory footprint and generating efficient, schedulable non-preemptive regions. This article presents a comprehensive description of how a PREM compiler can be implemented, based on several years of experience from the community. We provide accumulated insights on how best to balance conformance to real-time requirements against performance, and present novel techniques that extend the applicability from simple benchmark suites to real-world applications. We show that code transformed by the PREM compiler enables timing-predictable execution on modern commercial off-the-shelf hardware, providing novel insights on how PREM can protect 99.4% of memory accesses on caches with a random replacement policy at only 16% performance loss on benchmarks from the PolyBench suite. Finally, we show that the requirements imposed on the programming model are well aligned with current coding guidelines for timing-critical software, promoting easy adoption.
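
The kind of refactoring the abstract refers to can be pictured with a minimal C sketch. It assumes the usual PREM structure from the original PREM proposal, in which each interval has a memory phase that loads its data followed by a compute phase that touches only local data; the abstract itself does not spell this out. The function name `prem_sum_squares`, the `TILE_ELEMS` budget, and the `local` buffer are hypothetical, and the buffer only stands in for the cache or scratchpad that a real PREM compiler would target. This is an illustration, not the authors' implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical footprint budget: the maximum number of elements a single
 * PREM interval may touch. A real compiler would derive this bound from
 * the size of the target cache or scratchpad. */
#define TILE_ELEMS 256

/* Sum of squares over data[0..n), restructured into PREM-style intervals. */
long prem_sum_squares(const int *data, size_t n)
{
    int local[TILE_ELEMS];  /* stand-in for the core-local memory */
    long acc = 0;

    assert(data != NULL || n == 0);

    for (size_t base = 0; base < n; base += TILE_ELEMS) {
        size_t remaining = n - base;
        size_t len = remaining < TILE_ELEMS ? remaining : TILE_ELEMS;

        /* Memory phase: the only accesses to shared memory in this
         * interval. In a PREM system this phase would be scheduled so
         * that it does not overlap with other cores' memory phases. */
        memcpy(local, data + base, len * sizeof local[0]);

        /* Compute phase: operates on local data only, so its timing is
         * not affected by memory traffic from other cores. */
        for (size_t i = 0; i < len; i++)
            acc += (long)local[i] * local[i];
    }
    return acc;
}
```

Each loop iteration is one interval whose memory footprint is bounded by `TILE_ELEMS * sizeof(int)` bytes regardless of `n`, which is the property a scheduler needs in order to treat the interval as a non-preemptive region with a known cost.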

Index Terms

1. Computer systems organization
  1. Embedded and cyber-physical systems
  2. Real-time systems
2. Software and its engineering
  1. Software notations and tools
    1. Compilers

Published in
ACM Transactions on Embedded Computing Systems Volume 20, Issue 5
September 2021
342 pages
ISSN:1539-9087
EISSN:1558-3465
DOI:10.1145/3468851
Editor:
Tulika Mitra
National University of Singapore, Singapore
Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States

Journal Family
ACM Journals for the Design of Smart and Connected Systems
Publication History
- Published: 29 July 2021
- Accepted: 1 May 2021
- Received: 1 November 2020
- Revised: 1 March 2020
Author Tags
Predictable execution models
commercial off-the-shelf systems
freedom from interference
memory interference
multi-core systems
Qualifiers
- research-article
- Research
- Refereed

Article Metrics
- Total Citations: 0
- Total Downloads: 219
- Downloads (Last 12 months): 38
- Downloads (Last 6 weeks): 3
