
Reuse analysis of indirectly indexed arrays

Authors:
Javed Absar
IMEC, Katholieke Universiteit Leuven, and STMicroelectronics, Leuven, Belgium

Francky Catthoor
IMEC, Katholieke Universiteit Leuven, Leuven, Belgium

ACM Transactions on Design Automation of Electronic Systems, Volume 11, Issue 2, pp. 282–305
https://doi.org/10.1145/1142155.1142157

Published: 01 April 2006

Abstract

We propose techniques for identifying and exploiting spatial and temporal reuse for indirectly indexed arrays. Indirectly indexed arrays are arrays that are typically accessed inside multilevel loop nests and whose index expressions include not only loop iterators and constants but also other arrays. Existing techniques for improving locality are quite sophisticated for directly indexed arrays, but they are inadequate for handling indirectly indexed arrays. In this article we therefore extend the existing framework and techniques for directly indexed arrays to indirectly indexed arrays. The concepts of reuse subspace, dependence vector, self-reuse, and group reuse are extended and applied in this new context. In addition, scratch-pad memory has recently become an attractive alternative to the data cache, especially in the embedded multimedia community. This is because embedded systems are very sensitive to area and energy, and a scratch-pad occupies less area and consumes less energy per access than a cache of the same capacity. Several techniques have been proposed in the past for the efficient exploitation of the scratch-pad for directly indexed arrays. We extend these techniques by presenting a method for scratch-pad mapping of indirectly indexed arrays. This enables the scratch-pad to be used in a larger context than was possible before.
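To make the distinction between direct and indirect indexing concrete, the following is a minimal illustrative sketch and is not taken from the article. It contrasts a directly indexed access `A[i + j]` with an indirectly indexed access `A[index_array[i] + j]` and counts how many accesses revisit an element that was already touched, which is a crude stand-in for temporal reuse. The function names, the sample index array, and the choice of Python are assumptions made for illustration only; the article's own analysis is based on reuse subspaces and dependence vectors, which this sketch does not implement.

```python
def direct_accesses(n, m):
    """Element indices touched by A[i + j] in a two-level loop nest.

    The index depends only on loop iterators, so the access pattern
    is known without looking at any data.
    """
    return [i + j for i in range(n) for j in range(m)]


def indirect_accesses(index_array, m):
    """Element indices touched by A[index_array[i] + j].

    The index expression contains another array, so the pattern
    depends on the contents of index_array (hypothetical sample data).
    """
    return [index_array[i] + j for i in range(len(index_array)) for j in range(m)]


def temporal_reuse(accesses):
    """Number of accesses that revisit an element touched earlier."""
    return len(accesses) - len(set(accesses))


if __name__ == "__main__":
    sample_index_array = [4, 0, 4, 1]  # assumed values, for illustration only
    print("direct reuse:  ", temporal_reuse(direct_accesses(4, 3)))
    print("indirect reuse:", temporal_reuse(indirect_accesses(sample_index_array, 3)))
```

In the direct case the amount of reuse follows from the loop bounds alone, whereas in the indirect case it changes whenever the contents of the index array change, which is why the techniques developed for directly indexed arrays do not carry over unmodified.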


Index Terms

Reuse analysis of indirectly indexed arrays
1. Computing methodologies
  1. Modeling and simulation
    1. Model development and analysis
      1. Modeling methodologies
2. Hardware
  1. Hardware validation


Published in

ACM Transactions on Design Automation of Electronic Systems Volume 11, Issue 2
April 2006
283 pages
ISSN:1084-4309
EISSN:1557-7309
DOI:10.1145/1142155

Copyright © 2006 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.
Publisher
Association for Computing Machinery
New York, NY, United States

Journal Family
ACM Journals for the Design of Smart and Connected Systems

Author Tags
Scratch-pad
data reuse
indirectly indexed arrays
irregular access
reuse vector
Qualifiers
- article