Holistic Resource Allocation Under Federated Scheduling for Parallel Real-time Tasks

Published: 14 January 2022

Abstract

With the trend toward hardware and workload consolidation in embedded systems and the rapid development of edge computing, there has been increasing interest in supporting parallel real-time tasks to better utilize multi-core platforms while meeting stringent real-time constraints. For parallel real-time tasks, the federated scheduling paradigm, which assigns each parallel task a set of dedicated cores, achieves good theoretical bounds by ensuring exclusive use of processing resources to reduce interference. However, because cores share the last-level cache and memory bandwidth, in practice tasks may still interfere with each other despite executing on dedicated cores. Such interference from concurrent accesses can be even more severe on embedded platforms or edge servers, where computing power and cache/memory space are limited. To tackle this issue, we present a holistic resource allocation framework for parallel real-time tasks under federated scheduling. Under the proposed framework, each parallel task is assigned dedicated cache and memory bandwidth resources in addition to dedicated cores. Further, we propose a holistic resource allocation algorithm that balances the allocation among the different resources to achieve good schedulability. Additionally, we provide a full implementation of our framework by extending the federated scheduling system with Intel’s Cache Allocation Technology and MemGuard. Finally, we demonstrate the practicality of the proposed framework via extensive numerical evaluations and empirical experiments using real benchmark programs.
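
As background for the dedicated-core assignment mentioned in the abstract, the following is a minimal sketch of the core-count rule from the classic federated scheduling analysis for implicit-deadline DAG tasks: a task with total work C, critical-path length L, and deadline D receives ceil((C - L) / (D - L)) cores. It is not the allocation algorithm proposed in this article, and it does not model cache or memory bandwidth. The names `DagTask`, `dedicated_cores`, and `fits_on` are hypothetical, and giving every low-utilization task a whole core is a simplifying assumption (in federated scheduling such tasks normally share the remaining cores).

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class DagTask:
    """Hypothetical parallel task model (illustration only)."""
    work: float      # C: total worst-case execution time over all nodes
    span: float      # L: critical-path length of the DAG
    deadline: float  # D: implicit deadline (equal to the period)


def dedicated_cores(task: DagTask) -> int:
    """Cores given to one task by the classic rule ceil((C - L) / (D - L)).

    Assumption: a task whose work fits within its deadline is still given
    one whole core here, which is pessimistic but keeps the sketch short.
    """
    if not 0 < task.span < task.deadline:
        raise ValueError("span must be positive and smaller than the deadline")
    if task.work < task.span:
        raise ValueError("work cannot be smaller than the span")
    needed = math.ceil((task.work - task.span) / (task.deadline - task.span))
    return max(1, needed)


def fits_on(tasks: list[DagTask], total_cores: int) -> bool:
    """True if the summed dedicated-core demand does not exceed the platform."""
    return sum(dedicated_cores(t) for t in tasks) <= total_cores


if __name__ == "__main__":
    tasks = [
        DagTask(work=20, span=4, deadline=8),
        DagTask(work=10, span=4, deadline=8),
        DagTask(work=6, span=2, deadline=10),
    ]
    for t in tasks:
        print(t, "->", dedicated_cores(t), "core(s)")
    print("fits on 8 cores:", fits_on(tasks, 8))
```

Under these assumptions the check considers cores only. A holistic allocation of the kind the abstract describes would also have to account for how each task's work and span change with its cache and memory bandwidth share, which this sketch deliberately leaves out.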

Cited By

  • (2024) LAG-based schedulability analysis for preemptive global EDF scheduling with dynamic cache allocation. Journal of Systems Architecture: the EUROMICRO Journal 147:C. DOI: 10.1016/j.sysarc.2023.103045. Online publication date: 17-Apr-2024
  • (2023) LAG-Based Analysis for Preemptive Global Scheduling with Dynamic Cache Allocation. 2023 IEEE 29th International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA), 107-116. DOI: 10.1109/RTCSA58653.2023.00022. Online publication date: 30-Aug-2023

Published In

ACM Transactions on Embedded Computing Systems, Volume 21, Issue 1
January 2022
288 pages
ISSN:1539-9087
EISSN:1558-3465
DOI:10.1145/3505211

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 14 January 2022
Accepted: 01 September 2021
Revised: 01 August 2021
Received: 01 February 2021
Published in TECS Volume 21, Issue 1

Author Tags

  1. Parallel real-time systems
  2. federated scheduling
  3. resource partitioning

Qualifiers

  • Research-article
  • Refereed

Funding Sources

  • National Science Foundation (USA)
  • National Natural Science Foundation of China

Article Metrics

  • Downloads (Last 12 months): 103
  • Downloads (Last 6 weeks): 9
Reflects downloads up to 08 Feb 2025
