Reuse analysis of indirectly indexed arrays

Published: 01 April 2006

Abstract

We propose techniques for identifying and exploiting spatial and temporal reuse in indirectly indexed arrays. Indirectly indexed arrays are arrays that are typically accessed inside multilevel loop nests and whose index expressions include not only loop iterators and constants but also other arrays. Existing techniques for improving locality are quite sophisticated for directly indexed arrays but are inadequate for indirectly indexed arrays. In this article we therefore extend the existing framework and techniques for directly indexed arrays to indirectly indexed arrays. The concepts of reuse subspace, dependence vector, self reuse, and group reuse are extended and applied in this new context. In addition, scratch-pad memory has recently become an attractive alternative to the data cache, especially in the embedded multimedia community. This is because embedded systems are very sensitive to area and energy, and a scratch-pad is smaller in area and consumes less energy per access than a cache of the same capacity. Several techniques have been proposed for efficiently exploiting the scratch-pad for directly indexed arrays. We extend these techniques by presenting a method for mapping indirectly indexed arrays to the scratch-pad. This enables the scratch-pad to be used in a larger context than was previously possible.
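To make the notion of an indirectly indexed array concrete, the following minimal sketch enumerates the elements touched by a two-level loop nest whose subscript is `B[i] + j`, and counts how many of those accesses revisit an element already touched. The loop shape, the names `indirect_accesses` and `reuse_summary`, and the sample index array are assumptions chosen for illustration. The sketch counts reuse from a run-time trace; it is not the article's compile-time analysis.

```python
from collections import Counter


def indirect_accesses(index_array, inner_trip_count):
    """Return the subscripts of A touched by the hypothetical loop nest

        for i in range(len(index_array)):
            for j in range(inner_trip_count):
                use(A[index_array[i] + j])

    The subscript depends on the iterators i and j and on the
    contents of another array, which makes the access indirect.
    """
    return [index_array[i] + j
            for i in range(len(index_array))
            for j in range(inner_trip_count)]


def reuse_summary(accesses):
    """Count total accesses, distinct elements, and repeated accesses."""
    counts = Counter(accesses)
    total = len(accesses)
    distinct = len(counts)
    return {"total": total, "distinct": distinct, "reused": total - distinct}


# Illustrative index array: rows 0 and 2 (and rows 1 and 3) select the
# same window of A, and neighbouring windows overlap.
B = [0, 2, 0, 2]
trace = indirect_accesses(B, 4)
summary = reuse_summary(trace)
print(summary)  # {'total': 16, 'distinct': 6, 'reused': 10}
```

In this example 16 accesses touch only 6 distinct elements, so a small local buffer holding those elements could serve the 10 repeated accesses. Whether such a buffer is worthwhile on a given scratch-pad depends on copy costs and capacity, which this sketch does not model.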

      • Published in

        ACM Transactions on Design Automation of Electronic Systems  Volume 11, Issue 2
        April 2006
        283 pages
        ISSN:1084-4309
        EISSN:1557-7309
        DOI:10.1145/1142155
        Copyright © 2006 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States
