Elsevier

Microprocessors and Microsystems

Volume 40, February 2016, Pages 27-44
Energy-efficient synonym data detection and consistency for virtual cache

https://doi.org/10.1016/j.micpro.2015.11.004

Abstract

Cache memory consumes a large proportion of the energy used by a processor, and the translation lookaside buffer (TLB) accounts for 20–50% of the energy consumption of the on-chip cache. To reduce the energy consumed by TLB accesses, a virtual cache can be accessed directly with the virtual addresses issued by the processor. However, a virtual cache may suffer from the synonym problem. In this paper, we propose low-cost synonym detection hardware and a synonym data coherence mechanism. Together, these reduce the energy consumed by TLB lookups and maintain synonym data consistency in the virtual cache. The proposed synonym detection hardware saves energy by reducing the number of blocks that must be looked up in the virtual cache. In addition, the proposed synonym data coherence mechanism reduces the number of invalidated blocks in the virtual cache, which preserves cache locality. Simulation results show that our proposed energy-aware virtual cache consumes 51%, 27%, and 20% less energy than the traditional physical cache, the traditional virtual cache, and the synonym lookaside buffer (SLB), respectively. In addition, our proposed design shows almost the same static energy consumption as SLB, and reduces static energy consumption by about 20% compared with the traditional physical cache and virtual cache.

Introduction

In a computer system, memory consumes a significant proportion of the energy used while executing applications. Previous studies [1], [2], [3] have found that the cache memory accounts for 45–50% of the overall energy consumption of a processor. Therefore, designing a memory system that reduces energy consumption is an important research topic.

In a memory system, virtual memory allows the physical memory space to be used efficiently. However, when a processor issues a virtual address to access a physical cache or main memory, the virtual address has to be translated to a physical address by a translation lookaside buffer (TLB). Previous studies [4], [5], [6], [7] found that TLBs account for about 20–50% of the energy consumption of the on-chip cache. Fig. 1 illustrates the dynamic energy consumption of the TLB with the SPLASH-2 benchmark suite using different cache sizes. The TLB typically consumes 25–45% of the dynamic energy consumption of the on-chip cache. Therefore, reducing the number of TLB accesses is an efficient way to reduce energy consumption in the memory system.

Previous studies [8], [9] suggested that a virtual cache architecture can efficiently reduce the number of TLB accesses, and thus the energy consumed by the TLB. Because the virtual address issued by a processor is used directly to fetch the desired instruction or data from the virtual cache, no address translation is needed on a cache hit.

However, using the virtual cache architecture may give rise to the synonym problem [8], [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [19]. The synonym problem occurs when a data item is mapped by multiple different virtual addresses, so that multiple copies of the data item may exist simultaneously in a virtual cache. In this paper, we call these multiple copies of a data item synonym data items. Synonyms arise, for example, when multiple threads share memory space, or when multiple applications use shared data or the same library. The data inconsistency problem arises when data are written to one of the synonym data items in the virtual cache (a read-only cache does not suffer from this problem). The processor may then obtain stale data from another copy in the virtual cache. Thus, to use the virtual cache architecture to reduce energy consumption, we must solve the synonym problem.
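The stale read described above can be reproduced in a few lines. The following is a minimal sketch and not the paper's simulator: it assumes a direct-mapped, write-back cache that is both indexed and tagged by virtual address, and the page size, block size, set count, page table contents, and all function names are hypothetical choices made for illustration.

```python
PAGE_SIZE = 4096   # assumed page size in bytes
BLOCK_SIZE = 64    # assumed cache block size in bytes
NUM_SETS = 256     # assumed direct-mapped cache (16 KB in total)

# Hypothetical page table: two virtual pages map to the same physical page.
page_table = {0x10: 0x7, 0x25: 0x7}

memory = {}        # physical block number -> value
cache = {}         # set index -> {"tag", "pblock", "value", "dirty"}


def translate(vaddr):
    """Virtual to physical address translation (models a TLB lookup)."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    return page_table[vpn] * PAGE_SIZE + offset


def locate(vaddr):
    """Split a virtual address into (set index, virtual tag)."""
    block = vaddr // BLOCK_SIZE
    return block % NUM_SETS, block // NUM_SETS


def fill(index, tag, pblock, value, dirty):
    """Install a block, writing back the evicted block if it is dirty."""
    old = cache.get(index)
    if old is not None and old["dirty"]:
        memory[old["pblock"]] = old["value"]
    cache[index] = {"tag": tag, "pblock": pblock, "value": value, "dirty": dirty}


def read(vaddr):
    index, tag = locate(vaddr)
    entry = cache.get(index)
    if entry is not None and entry["tag"] == tag:
        return entry["value"]                 # hit: no translation needed
    pblock = translate(vaddr) // BLOCK_SIZE   # miss: translate and fetch
    value = memory.get(pblock, 0)
    fill(index, tag, pblock, value, False)
    return value


def write(vaddr, value):
    index, tag = locate(vaddr)
    entry = cache.get(index)
    if entry is not None and entry["tag"] == tag:
        entry["value"], entry["dirty"] = value, True   # memory not updated yet
    else:
        fill(index, tag, translate(vaddr) // BLOCK_SIZE, value, True)


# Two virtual addresses that name the same physical location.
va_a = 0x10 * PAGE_SIZE + 0x40
va_b = 0x25 * PAGE_SIZE + 0x40

read(va_a)             # both synonyms are now cached, in different sets
read(va_b)
write(va_a, 42)        # update one copy only
stale = read(va_b)     # hits the other copy, which still holds the old value
```

In this sketch the two synonyms land in different sets because the assumed cache is larger than a page, so nothing in the normal lookup path notices that the second copy is out of date.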

To solve the synonym problem, the synonym data items in the virtual cache have to be kept consistent. In general, there is no information indicating which blocks contain synonym data items or whether they have been modified. Therefore, all virtual cache blocks that may contain synonym data items have to be invalidated when a write or a read miss to the virtual cache occurs. However, invalidating these blocks destroys cache locality, which lowers the hit rate of the virtual cache and increases execution time. For this reason, previous studies [4], [10], [11], [13] have used additional hardware to check whether multiple synonym data items exist when a miss occurs in the virtual cache. However, activating additional hardware for synonym detection increases dynamic energy consumption and execution time. Therefore, a solution to the synonym problem that decreases energy consumption and execution time must consider both the added hardware cost of detecting synonym data items and the loss of cache locality in the virtual cache.
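To give a sense of how many blocks "may contain the synonym data items", the sketch below enumerates the candidate sets for one address. It relies on the general property that synonyms share their page-offset bits, so only the index bits above the page offset can differ. The function name, the default block and page sizes, and the power-of-two assumption are illustrative and are not taken from the paper.

```python
def synonym_candidate_sets(vaddr, cache_bytes, ways, block_bytes=64, page_bytes=4096):
    """Return the set indices in which a synonym of vaddr's block could reside.

    Assumes a virtually indexed cache whose sizes are all powers of two.
    """
    num_sets = cache_bytes // (ways * block_bytes)
    sets_per_page = page_bytes // block_bytes
    # Part of the set index that is fixed by the page offset.
    in_page_index = (vaddr % page_bytes) // block_bytes
    if num_sets <= sets_per_page:
        # The whole index comes from the page offset: exactly one candidate set.
        return [in_page_index % num_sets]
    # Otherwise every value of the index bits above the page offset is possible.
    return [in_page_index + k * sets_per_page
            for k in range(num_sets // sets_per_page)]


# Assumed example: 32 KB, 2-way cache with 64-byte blocks and 4 KB pages.
candidates = synonym_candidate_sets(0x10040, cache_bytes=32 * 1024, ways=2)
blocks_to_check = len(candidates) * 2   # every way of every candidate set
```

Under these assumed parameters, four sets (eight blocks) would have to be searched or invalidated for a single access, and the number grows with the size of each cache way.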

In this paper, to save energy and reduce the number of invalidated blocks, we propose low-cost synonym detection hardware and a synonym data consistency mechanism for the virtual cache architecture. Instead of using complicated synonym detection hardware, we use a shared bit for each virtual cache block to indicate the existence of synonym data items. If the shared bit of a block is 0, the block has no synonym data block in the virtual cache. If the shared bit is 1, the block has one or more synonym data blocks in the virtual cache. To maintain synonym data consistency, a synonym table located behind the virtual cache records the addresses of the data items fetched into the virtual cache. These techniques enable the proposed mechanism to maintain synonym data consistency while reducing the number of invalidated blocks in the virtual cache and preserving cache locality.
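One possible reading of the shared bit and the synonym table can be sketched as follows. This is an illustrative interpretation only: the paper's actual hardware organization and protocol are not given in this excerpt. The sketch assumes a fully associative cache without eviction, a synonym table keyed by physical block number, and invalidation of the other copies on a write to a shared block. The class name, method names, and parameters are all hypothetical.

```python
from collections import defaultdict


class SharedBitCache:
    """Toy virtual cache with a per-block shared bit (assumed design)."""

    def __init__(self, page_table, page_size=4096, block_size=64):
        self.page_table = page_table
        self.page_size = page_size
        self.block_size = block_size
        self.blocks = {}                        # virtual block -> {"value", "shared"}
        self.synonym_table = defaultdict(set)   # physical block -> cached virtual blocks
        self.memory = {}                        # physical block -> value
        self.tlb_lookups = 0

    def _translate(self, vblock):
        """Return the physical block number; counts one TLB lookup."""
        self.tlb_lookups += 1
        vpn, offset = divmod(vblock * self.block_size, self.page_size)
        return (self.page_table[vpn] * self.page_size + offset) // self.block_size

    def _fill(self, vblock):
        """Handle a miss: consult the synonym table and set shared bits."""
        pblock = self._translate(vblock)
        synonyms = self.synonym_table[pblock]
        if synonyms:
            # A cached copy may be newer than memory, so take its value.
            value = self.blocks[next(iter(synonyms))]["value"]
        else:
            value = self.memory.get(pblock, 0)
        for other in synonyms:
            self.blocks[other]["shared"] = 1
        self.blocks[vblock] = {"value": value, "shared": 1 if synonyms else 0}
        synonyms.add(vblock)
        return self.blocks[vblock]

    def read(self, vaddr):
        vblock = vaddr // self.block_size
        block = self.blocks.get(vblock) or self._fill(vblock)
        return block["value"]

    def write(self, vaddr, value):
        vblock = vaddr // self.block_size
        block = self.blocks.get(vblock) or self._fill(vblock)
        block["value"] = value
        if block["shared"]:
            # Only the recorded synonym copies are invalidated.
            pblock = self._translate(vblock)
            for other in self.synonym_table[pblock] - {vblock}:
                del self.blocks[other]
            self.synonym_table[pblock] = {vblock}
            block["shared"] = 0


# Hypothetical mapping: pages 0x10 and 0x25 are synonyms, page 0x30 is private.
demo = SharedBitCache({0x10: 0x7, 0x25: 0x7, 0x30: 0x9})
va_a, va_b, va_c = 0x10 * 4096 + 0x40, 0x25 * 4096 + 0x40, 0x30 * 4096
demo.read(va_a)
demo.read(va_b)                 # second synonym: both shared bits become 1
demo.write(va_a, 42)            # shared write: only the va_b copy is invalidated
fresh = demo.read(va_b)         # refetched through the synonym table
demo.write(va_c, 7)             # private block: shared bit stays 0
lookups_before = demo.tlb_lookups
demo.write(va_c, 8)             # hit on an unshared block: no translation at all
```

The design choice this illustrates is that a hit on a block whose shared bit is 0 needs neither a TLB lookup nor any synonym search, and a write to a shared block touches only the copies recorded in the table rather than every candidate set.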

This paper is organized as follows. In Section 2, we review some related studies. Section 3 discusses and analyzes the synonym problem in the virtual cache, and introduces our low-cost synonym detection hardware and synonym data consistency mechanism. In Section 4, the simulation results are presented, and Section 5 concludes this paper.

Related work

In previous research on reducing TLB energy consumption, two related issues were introduced: the reduction of TLB energy consumption in the physical cache, and the avoidance of the synonym problem in the virtual cache. These issues are discussed in turn.

Proposed design

In this section, we propose low-cost synonym detection hardware and a synonym data consistency mechanism that reduce energy consumption. To make the synonym data items in a virtual cache consistent, we first analyze situations in which multiple synonym data items appear in different sets of the virtual cache. We then propose the synonym detection hardware and the synonym data consistency mechanism, which reduce energy consumption and the number of invalidated synonym data blocks.

Simulation results

In this section, we describe the simulation environment and present the simulation results. The Simics [21] full system simulator with a Solaris installation executed the SPLASH-2 benchmark suite [22] to obtain trace files. A trace-driven simulator was developed to analyze the trace files and evaluate our proposed virtual cache design. We also simulated and compared related cache architecture designs, including the traditional physical cache, a traditional virtual cache, and SLB [11], in terms

Conclusion

In this paper, we have proposed low-cost synonym detection hardware and a synonym data consistency mechanism for a virtual cache architecture. Our approach saves energy by reducing the number of TLB accesses, and maintains synonym data consistency by reducing the number of invalidated blocks in the virtual cache. Instead of using complicated synonym detection hardware, we simply added a shared bit for each virtual cache block to determine whether two or more synonym data items exist. In

Acknowledgments

This research was supported by the Ministry of Science and Technology (MOST 102-2221-E-035-030-MY2).

An Hsia received B.S. and M.S. degrees from Feng Chia University, Taiwan, in 2008 and 2010, respectively. Currently, he is a Ph.D. candidate in the Department of Information Engineering and Computer Science at Feng Chia University, Taiwan. His research interests include computer architecture, parallel processing, and embedded systems.

References (24)

  • Biju K. Raveendran et al.

    Predictive placement scheme in set-associative cache for energy efficient embedded systems

  • D.C. Suresh et al.

    Power Efficient Instruction Caches for Embedded Systems

    (2005)
  • Kugan Vivekanandarajah et al.

    Dynamic filter cache for low power instruction memory hierarchy

  • M. Ekman et al.

    TLB and snoop energy-reduction using virtual caches in low-power chip-multiprocessors

  • J. Montanaro et al.

    A 160 MHz, 32b 0.5 W CMOS RISC microprocessor

    IEEE J. Solid-State Circuits

    (1996)
  • I. Kadayif et al.

    Generating physical addresses directly for saving instruction TLB energy

  • Toni Juan et al.

    Reducing TLB power requirements

  • Michel Cekleov et al.

    Virtual-address caches. Part 1: problems and solutions in uniprocessors

    IEEE Micro

    (1997)
  • Michel Cekleov et al.

    Virtual-address caches. Part 2: multiprocessor issues

    IEEE Micro

    (1997)
  • Jesung Kim et al.

    U-cache: a cost-effective solution to synonym problem

  • Xiaogang Qiu et al.

    The synonym lookaside buffer: a solution to the synonym problem in virtual caches

    IEEE Trans. Comput.

    (2008)
  • James R. Goodman

    Coherency for multiprocessor virtual address caches

Ching-Wen Chen received an M.S. degree from the Department of Computer Science at the National Tsing-Hua University, Taiwan, in 1995. He obtained his Ph.D. in Computer Science and Information Engineering from the National Chiao-Tung University, Taiwan, in 2002. He was an Assistant Professor at Chaoyang University of Technology (2002–2005) and Feng Chia University (2005–2007), Taiwan, and an Associate Professor at Feng Chia University (2007–2013), Taiwan. Currently, he is a Professor in the Department of Information Engineering and Computer Science at Feng Chia University, Taiwan. His research interests include computer architecture, parallel processing, embedded systems, mobile computing, and wireless sensor networks.

Tzong-Jye Liu received his Ph.D. degree in 1999 from the Department of Computer Science, National Tsing Hua University, Taiwan. After receiving his Ph.D., he worked for several years in the computer industry in Taiwan. He was an Assistant Professor at Feng Chia University (2004–2008), Taiwan. Dr. Liu is currently an Associate Professor in the Department of Information Engineering and Computer Science, Feng Chia University, Taiwan. His research interests include operating systems, distributed computing, and network security.
