ACDC: Small, Predictable and High-Performance Data Cache

Published: 17 February 2015

Abstract

In multitasking real-time systems, both the worst-case execution time (WCET) of each task and the effects of inter-task interference in the worst-case scenario must be calculated. This is especially complex in the presence of data caches. In this article, we propose a small (256-byte) instruction-driven data cache that effectively exploits locality. It works by preselecting a subset of memory instructions that are granted data cache replacement permission; the selection is based on data reuse theory. Because each selected memory instruction replaces only its own data cache line, pollution is prevented and task performance becomes independent of the size of the associated data structures. We have modeled several memory configurations using the Lock-MS WCET analysis method. Our results show that, on average, our data cache effectively services 88% of the program data in the tested benchmarks. These results double the worst-case performance of our tested multitasking experiments. In addition, in the worst case, they reach between 75% and 89% of the ideal case of always hitting in the instruction and data caches. We also show that partitioning our proposed hardware provides only marginal benefits in worst-case performance, so partitioning is discouraged. Finally, we study the viability of our proposal on the MiBench application suite by characterizing its data reuse, achieving hit ratios beyond 90% in most programs.
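
The replacement policy described in the abstract can be illustrated with a toy model. The sketch below is not the authors' hardware design or their Lock-MS analysis; it is a minimal Python simulation under stated assumptions: a 16-byte line size (the abstract gives only the 256-byte total), a fully associative lookup, and hypothetical names (`InstructionDrivenCache`, `grant`, `access`, `walk_array`). It shows only the two properties the abstract claims: a selected instruction refills its own line, and instructions without replacement permission cannot evict anything.

```python
from dataclasses import dataclass, field

LINE_SIZE = 16                  # bytes per line (assumption, not from the abstract)
NUM_LINES = 256 // LINE_SIZE    # 256-byte data cache, as stated in the abstract


@dataclass
class InstructionDrivenCache:
    """Toy model: each cache line is owned by one selected memory instruction."""
    # Maps the PC of a selected instruction to the line address it holds
    # (None until the instruction first refills its line).
    owned: dict = field(default_factory=dict)
    hits: int = 0
    misses: int = 0

    def grant(self, pc):
        """Give the instruction at `pc` replacement permission (one line each)."""
        if pc not in self.owned and len(self.owned) >= NUM_LINES:
            raise ValueError("no free cache line to assign")
        self.owned.setdefault(pc, None)

    def access(self, pc, addr):
        """Simulate one data access; return True on a hit."""
        line = addr // LINE_SIZE
        # Assumed fully associative lookup: any instruction may hit in any line.
        if line in self.owned.values():
            self.hits += 1
            return True
        self.misses += 1
        # Only a selected instruction refills, and only its own line.
        if pc in self.owned:
            self.owned[pc] = line
        return False


def walk_array(cache, pc, base, n_bytes, stride=4):
    """One load instruction traversing an array sequentially."""
    for offset in range(0, n_bytes, stride):
        cache.access(pc, base + offset)
```

In this model, a selected load that walks a line-aligned array with a 4-byte stride misses once per 16-byte line and hits on the other three accesses, whether the array is 1 KiB or 4 KiB. That is the sense in which performance does not depend on the size of the data structure. An access from an instruction that was never passed to `grant` counts as a miss but leaves every owned line untouched, which models the absence of pollution.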

          Published in

          ACM Transactions on Embedded Computing Systems, Volume 14, Issue 2
          March 2015
          472 pages
          ISSN: 1539-9087
          EISSN: 1558-3465
          DOI: 10.1145/2737797

          Copyright © 2015 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 17 February 2015
          • Accepted: 1 December 2013
          • Revised: 1 September 2013
          • Received: 1 June 2012
          Qualifiers

          • research-article
          • Research
          • Refereed