ABSTRACT
Memory latencies vary in non-uniform memory access (NUMA) systems, so execution times may become unpredictable in a multicore real-time system. This results in overly conservative scheduling with low utilization due to loose bounds on the worst-case execution time (WCET) of tasks. This work contributes a controller/node-aware memory coloring (CAMC) allocator inside the Linux kernel for the entire address space to reduce access conflicts and latencies by isolating tasks from one another. CAMC improves timing predictability and performance over Linux's buddy allocator and prior coloring methods. It provides core isolation with respect to banks and memory controllers for real-time systems. To our knowledge, this work is the first to consider multiple memory controllers in real-time systems, combine them with bank coloring, and assess the performance of this combination on a NUMA architecture.
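The idea of coloring by controller/node and bank can be illustrated with a small user-space model. The following is a minimal sketch, not the CAMC kernel implementation: the address bit positions (`NODE_SHIFT`, `BANK_SHIFT`), the class `ColoredAllocator`, and its methods are hypothetical names chosen for illustration, since the abstract does not give the actual address mapping or allocator interface. A color is taken to be a (node, bank) pair derived from a physical address, and each task may only receive page frames whose color it owns exclusively.

```python
PAGE_SHIFT = 12  # 4 KiB pages

# Hypothetical address layout, for illustration only. Real bit positions are
# platform-specific and are not stated in the abstract.
NODE_SHIFT, NODE_BITS = 32, 1  # assumed: one bit selects the controller/node
BANK_SHIFT, BANK_BITS = 13, 3  # assumed: three bits select the DRAM bank


def bits(addr, shift, width):
    """Extract `width` bits of `addr` starting at bit `shift`."""
    return (addr >> shift) & ((1 << width) - 1)


def color_of(phys_addr):
    """Return the (node, bank) color of a physical address."""
    return (bits(phys_addr, NODE_SHIFT, NODE_BITS),
            bits(phys_addr, BANK_SHIFT, BANK_BITS))


class ColoredAllocator:
    """Toy page-frame allocator with one free list per color."""

    def __init__(self, frames):
        self.free = {}   # color -> list of free page-frame numbers
        self.owned = {}  # task -> set of colors reserved for that task
        for pfn in frames:
            self.free.setdefault(color_of(pfn << PAGE_SHIFT), []).append(pfn)

    def assign(self, task, colors):
        """Reserve colors for a task; colors may not be shared across tasks."""
        colors = set(colors)
        for other, taken in self.owned.items():
            if other != task and taken & colors:
                raise ValueError("color already owned by another task")
        self.owned[task] = colors

    def alloc_page(self, task):
        """Return a free frame from one of the task's own colors."""
        for color in sorted(self.owned.get(task, ())):
            frames = self.free.get(color)
            if frames:
                return frames.pop()
        raise MemoryError("no free frame in the task's colors")
```

Under these assumptions, two tasks assigned disjoint color sets never receive frames on the same bank of the same controller, which is the isolation property the abstract describes.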
Index Terms
- Controller-aware memory coloring for multicore real-time systems