A Polynomial-Time Algorithm for Memory Space Reduction

International Journal of Parallel Programming

Abstract

Reducing the memory space a program requires is important to many applications. For data-intensive applications, it may help avoid out-of-core execution. For high-performance computing, it may improve the cache hit rate and, with it, performance. For embedded systems, it can reduce the memory requirement, memory latency, and energy consumption. This paper investigates program transformations that a compiler can use to reduce the memory space required for storing program data. In particular, the paper uses integer programming to model the problem of combining loop shifting, loop fusion, and array contraction to minimize the data memory required to execute a collection of multi-level loop nests. The integer programming problem is then reduced to an equivalent network flow problem, which can be solved in polynomial time.
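As a rough illustration of the three transformations named in the abstract, the sketch below applies them by hand to a toy pair of loops. It is not the paper's algorithm, which chooses the shifts through an integer program; the function names, the arrays `x`, `a`, `y`, and the loop bodies are all hypothetical. The second loop reads `a[i + 1]`, so the two loops cannot be fused directly. Shifting the second loop by one iteration makes fusion legal, and the temporary array `a` can then be contracted to two scalars.

```python
def original(x):
    """Two separate loops communicating through a full temporary array."""
    n = len(x)
    a = [0] * n                      # temporary array: n elements
    for i in range(n):               # loop 1: produce a
        a[i] = 2 * x[i]
    y = [0] * max(n - 1, 0)
    for i in range(n - 1):           # loop 2: consume a[i] and a[i + 1]
        y[i] = a[i] + a[i + 1]
    return y


def shifted_fused_contracted(x):
    """Same computation after shifting loop 2 by one iteration,
    fusing both loops, and contracting `a` to two scalars."""
    n = len(x)
    y = [0] * max(n - 1, 0)
    prev = 0                         # stands in for a[i - 1]
    for i in range(n):
        cur = 2 * x[i]               # loop 1 body at iteration i
        if i >= 1:
            y[i - 1] = prev + cur    # loop 2 body at iteration i - 1 (shifted)
        prev = cur
    return y


print(original([1, 2, 3, 4]))
print(shifted_fused_contracted([1, 2, 3, 4]))
```

In this toy case the temporary storage drops from `n` elements to two scalars while the output stays the same. The shift amount is picked by inspection here; for a collection of multi-level loop nests, choosing the shifts that minimize total memory is the optimization problem the paper formulates.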


Author information

Correspondence to Yonghong Song.

Cite this article

Song, Y., Wang, C. & Li, Z. A Polynomial-Time Algorithm for Memory Space Reduction. Int J Parallel Prog 33, 1–33 (2005). https://doi.org/10.1007/s10766-004-1459-8

  • DOI: https://doi.org/10.1007/s10766-004-1459-8
