Cache Interference-aware Task Partitioning for Non-preemptive Real-time Multi-core Systems

Published: 28 May 2022

Abstract

Shared caches in multi-core processors make it difficult to guarantee the real-time properties of embedded software, because co-running tasks interact and contend for the shared cache. Prior work has studied the schedulability analysis of global scheduling for real-time multi-core systems with shared caches. This article considers another common scheduling paradigm: partitioned scheduling in the presence of shared cache interference. To this end, we propose CITTA, a cache interference-aware task partitioning algorithm. We first analyze the shared cache interference between two programs for set-associative instruction and data caches. We then construct an integer programming formulation to calculate the upper bound on the cache interference exhibited by a task, which CITTA requires. We conduct a schedulability analysis of CITTA and formally prove its correctness. A set of experiments evaluates the schedulability performance of CITTA against global EDF scheduling and other greedy partitioning approaches, such as First-fit and Worst-fit, over randomly generated task sets and realistic workloads in embedded systems. Our empirical evaluations show that CITTA outperforms global EDF scheduling and the greedy partitioning approaches in terms of task sets deemed schedulable.
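For readers unfamiliar with the greedy baselines named in the abstract, the following is a minimal sketch of First-fit and Worst-fit partitioning. It is not CITTA and does not reproduce the article's analysis: the function name `partition`, the use of per-task utilizations, and the admission test (total utilization on a core at most 1) are simplifying assumptions made for illustration, and the sketch ignores both cache interference and non-preemptive blocking.

```python
from typing import List, Optional


def partition(utilizations: List[float], num_cores: int,
              worst_fit: bool = False) -> Optional[List[List[int]]]:
    """Greedily assign task indices to cores; return None if a task fits nowhere.

    Assumption (not from the article): a core accepts a task as long as the
    sum of utilizations on that core stays at or below 1.
    """
    load = [0.0] * num_cores
    cores: List[List[int]] = [[] for _ in range(num_cores)]
    for task, u in enumerate(utilizations):
        # Cores that still pass the simplified admission test.
        feasible = [c for c in range(num_cores) if load[c] + u <= 1.0 + 1e-9]
        if not feasible:
            return None
        if worst_fit:
            # Worst-fit: pick the least-loaded feasible core.
            target = min(feasible, key=lambda c: load[c])
        else:
            # First-fit: pick the lowest-indexed feasible core.
            target = feasible[0]
        load[target] += u
        cores[target].append(task)
    return cores


if __name__ == "__main__":
    tasks = [0.6, 0.5, 0.4, 0.3]
    print(partition(tasks, 2))                  # First-fit assignment
    print(partition(tasks, 2, worst_fit=True))  # Worst-fit assignment
```

The two heuristics differ only in how they choose among feasible cores: First-fit tends to pack tasks onto few cores, while Worst-fit spreads the load. A cache interference-aware partitioner would replace the simplified admission test with one that accounts for the interference bound of each task.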

    Published In

    ACM Transactions on Embedded Computing Systems, Volume 21, Issue 3
    May 2022
    365 pages
    ISSN: 1539-9087
    EISSN: 1558-3465
    DOI: 10.1145/3530307
    Editor: Tulika Mitra

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 28 May 2022
    Online AM: 26 January 2022
    Accepted: 01 September 2021
    Revised: 01 August 2021
    Received: 01 December 2020
    Published in TECS Volume 21, Issue 3

    Author Tags

    1. Shared caches
    2. partitioned scheduling
    3. schedulability analysis
    4. real-time systems

    Qualifiers

    • Research-article
    • Refereed
