
Evaluation of Neural and Genetic Algorithms for Synthesizing Parallel Storage Schemes

Published in: International Journal of Parallel Programming

Abstract

Exploiting compile-time knowledge to improve memory bandwidth can produce noticeable improvements at runtime.(1, 2) Allocating data structures(1) to separate memories whenever the data may be accessed in parallel yields improvements in memory access time of 13 to 40%. We are concerned with synthesizing compiler storage schemes that minimize array access conflicts in parallel memories for a set of compiler-predicted data access patterns. Such access patterns can easily be found for many synchronous dataflow computations, such as multimedia compression/decompression, DSP, vision, and robotics algorithms. A storage scheme is a mapping from array addresses to memory modules. Finding a conflict-free storage scheme for a set of data patterns is NP-complete; the problem is reducible to weighted graph coloring. We investigate optimizing the storage scheme using constructive heuristics, neural methods, and genetic algorithms, and present the implementation details of each approach. Using realistic data patterns, simulation shows that memory utilization of 80% or higher can be achieved for 20 data patterns over up to 256 parallel memories, i.e., a scalable parallel memory. The neural approach was comparatively fast at producing reasonably good solutions even for large problem sizes, and the convergence of the proposed neural algorithm appears only slightly dependent on problem size. Genetic algorithms are recommended for advanced compiler optimization, especially for large problem sizes and for applications that are compiled once and run many times over different data sets. The solutions presented are also useful for other optimization problems.
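To make the reduction concrete, the sketch below (our own illustration, not the paper's algorithm; the function name and the greedy tie-breaking are assumptions) treats each data pattern as a clique in a conflict graph over array addresses and assigns addresses to memory modules by greedy coloring, so that addresses accessed in the same parallel pattern land in distinct modules where possible.

```python
# Illustrative sketch: map array addresses to M parallel memory modules.
# A "data pattern" is a set of addresses fetched in one parallel access;
# two addresses conflict if they appear in the same pattern and are
# mapped to the same module. Modules play the role of colors in a
# greedy graph-coloring pass over the conflict graph.

from collections import defaultdict

def greedy_storage_scheme(patterns, num_modules):
    """Assign each address a module, avoiding intra-pattern conflicts."""
    # Conflict graph: an edge between every pair of addresses that
    # occur together in some pattern (each pattern forms a clique).
    neighbors = defaultdict(set)
    for pattern in patterns:
        for a in pattern:
            neighbors[a].update(b for b in pattern if b != a)

    scheme = {}
    for addr in sorted(neighbors):  # deterministic order for the sketch
        used = {scheme[n] for n in neighbors[addr] if n in scheme}
        free = [m for m in range(num_modules) if m not in used]
        # Take the lowest-numbered free module; if every module is
        # already used by a neighbor, accept a conflict on module 0.
        scheme[addr] = free[0] if free else 0
    return scheme

# A row pattern and a column pattern of a 4x4 array in row-major order:
patterns = [{0, 1, 2, 3}, {0, 4, 8, 12}]
scheme = greedy_storage_scheme(patterns, 4)
# With 4 modules, each pattern's addresses receive pairwise distinct
# modules, so both parallel accesses are conflict-free.
```

A simple interleaved scheme (address mod M) would serve the row pattern but put the entire column pattern in one module; building the scheme from the predicted patterns avoids that. The weighted variant studied in the paper would additionally prioritize frequently executed patterns when conflicts are unavoidable.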


REFERENCES

  1. M. Saghir, P. Chow, and C. Lee, Exploiting Dual Data-Memory Banks in Digital Signal Processors, Int'l. Conf. ASPLOS, pp. 234–243 (1996).

  2. E. Bugnion, J. Anderson, T. Mowry, M. Rosenblum, and M. Lam, Compiler Directed Page Coloring for Multiprocessors, Int'l. Conf. ASPLOS, pp. 244–255 (1996).

  3. K. Hwang and F. A. Briggs, Computer Architecture and Parallel Processing, McGraw-Hill Publishing (1987).

  4. H. G. Cragon, Memory Systems and Pipelined Processors, Jones and Bartlett Publishing (1996).

  5. A. Seznec and J. Lenfant, Interleaved Parallel Schemes, IEEE Trans. Parallel and Distributed Syst., 5(12):1329–1334 (December 1994).

  6. D. Lawrie, Access and Alignment of Data in an Array Processor, IEEE Trans. Computers, C-24(12):1145–1155 (December 1975).

  7. T. Cheung and J. E. Smith, A Simulation Study of the CRAY X-MP Memory System, IEEE Trans. Computers, C-35(7):613–622 (July 1986).

  8. P. Budnik and D. Kuck, The Organization and Use of Parallel Memories, IEEE Trans. Computers, C-20(12):1566–1569 (December 1971).

  9. D. T. Harper, III, Block, Multistride Vector, and FFT Accesses in Parallel Memory Systems, IEEE Trans. Parallel and Distributed Systems, 2(1):43–51 (January 1991).

  10. D. T. Harper, III, Increased Memory Performance During Vector Accesses Through the Use of Linear Address Transformations, IEEE Trans. Computers, 41(2):227–230 (February 1992).

  11. G. S. Sohi, High-bandwidth Interleaved Memories for Vector Processors-A Simulation Study, IEEE Trans. Computers, 42(1):34–44 (January 1993).

  12. J. M. Frailong, W. Jalby, and J. Lenfant, XOR-Schemes: A Flexible Data Organization in Parallel Memories, Proc. Int'l. Conf. Parallel Processing, pp. 276–283 (1985).

  13. A. Norton and E. Melton, A Class of Boolean Linear Transformations for Conflict-Free Power-of-Two Stride Access, Proc. Int'l. Conf. Parallel Processing, pp. 247–254 (1987).

  14. A. Deb, Multiskewing-A Novel Technique for Optimal Parallel Memory Access, IEEE Trans. Parallel and Distributed Systems, 7(6):595–604 (June 1996).

  15. S. McFarling, Program Optimization for Instruction Caches, Third Int'l. Conf. Architectural Support for Progr. Lang. Oper. Syst., pp. 183–191 (1989).

  16. P. P. Chang and W. W. Hwu, Achieving Very High Cache Performance with an Optimized Compiler, Proc. 16th Ann. Int'l. Symp. on Computer Architecture, pp. 242–251 (1989).

  17. R. Gupta and M. L. Soffa, Compile-Time Techniques for Improving Scalar Access Performance in Parallel Memories, IEEE Trans. Parallel and Distributed Systems, 2(2):138–148 (April 1991).

  18. P. Briggs and J. Feo, The Tera Programming Workshop, Int'l. Conf. Parallel Architectures and Compilation Techniques, Paris, France (October 1998).

  19. T. Sterling, A Hybrid Technology Multithreaded Computer Architecture for Petaflops Computing, MS 159–79, J.P.L., California Institute of Technology (January 1997).

  20. C. E. Kozyrakis and D. A. Patterson, A New Direction for Computer Architecture Research, IEEE Computer (November 1998).

  21. M. C. Pease, The Indirect Binary-Cube Microprocessor Array, IEEE Trans. on Computers, C-26(5):458–473 (May 1977).

  22. K. Y. Lee, On the Rearrangeability of a (2 log N-1) Stage Permutation Network, IEEE Trans. Computers, Vol. 34 (May 1985).

  23. M. Al-Mouhamed and S. Seiden, Minimization of Memory and Network Contention for Accessing Arbitrary Data Patterns in SIMD Systems, IEEE Trans. on Computers, 45(6):757–762 (June 1996).

  24. A. Blum, New Approximation Algorithms for Graph-Coloring, J. ACM, 41(3):470–516 (1994).

  25. X. Zhou, S.-I. Nakano, and T. Nishizeki, Edge-Coloring Partial k-Trees, J. Algorithms, 21(3):598–617 (November 1996).

  26. S. Y. Kung, Digital Neural Networks, Prentice Hall (1993).

  27. J. J. Hopfield, Neural Networks and Physical Systems with Emergent Collective Computational Abilities, Proc. Nat'l. Acad. Sci. U.S.A., 79:2554–2558 (1982).

  28. A. K. Jain, J. Mao, and K. M. Mohiuddin, Artificial Neural Networks: A Tutorial, IEEE Computer, 29(3):31–44 (March 1996).

  29. J. L. Filho, P. C. Treleaven, and C. Alippi, Genetic-Algorithm Programming Environments, IEEE Computer, 27(6):27–43 (June 1994).

  30. J. J. Grefenstette, Genesis: A System for Using Genetic Search Procedures, Proc. Conf. Intelligent Systems and Machines, pp. 161–165 (1984).

  31. D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, Reading, Massachusetts (1989).

  32. R. M. Haralick and L. G. Shapiro, Computer and Robot Vision, Addison-Wesley (1992).

  33. M. Quinn, Designing Efficient Algorithms for Parallel Computers, McGraw-Hill Inter., Second Edition (1988).

  34. M. Al-Mouhamed and S. Seiden, A Heuristic Storage for Minimizing Access Time of Arbitrary Data Patterns, IEEE Trans. Parallel and Distributed Systems, 8(4):441–447 (April 1997).

Cite this article

Al-Mouhamed, M., Abu-Haimed, H. Evaluation of Neural and Genetic Algorithms for Synthesizing Parallel Storage Schemes. International Journal of Parallel Programming 29, 365–399 (2001). https://doi.org/10.1023/A:1011146518909
