
Evaluation of Neural and Genetic Algorithms for Synthesizing Parallel Storage Schemes

Published in: International Journal of Parallel Programming

Abstract

Exploiting compile-time knowledge to improve memory bandwidth can produce noticeable improvements at runtime.(1, 2) Allocating data structures(1) to separate memories whenever the data may be accessed in parallel yields improvements in memory access time of 13 to 40%. We are concerned with synthesizing compiler storage schemes that minimize array access conflicts in parallel memories for a set of compiler-predicted data access patterns. Such access patterns can easily be found for many synchronous dataflow computations, such as multimedia compression/decompression, DSP, vision, and robotics algorithms. A storage scheme is a mapping from array addresses to memory modules. Finding a conflict-free storage scheme for a set of data patterns is NP-complete; the problem is reducible to weighted graph coloring. We investigate optimizing the storage scheme using constructive heuristics, neural methods, and genetic algorithms, and present the implementation details of each approach. Using realistic data patterns, simulation shows that memory utilization of 80% or higher can be achieved for 20 data patterns over up to 256 parallel memories, i.e., a scalable parallel memory. The neural approach was comparatively fast at producing reasonably good solutions even for large problem sizes, and the convergence of the proposed neural algorithm appears only slightly dependent on problem size. Genetic algorithms are recommended for advanced compiler optimization, especially for large problem sizes and for applications that are compiled once and run many times over different data sets. The solutions presented are also useful for other optimization problems.
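To make the reduction concrete, the sketch below (our own illustration, not the paper's algorithm; the function name and the greedy tie-breaking are assumptions) treats each data pattern as a clique in a conflict graph over array addresses and assigns addresses to memory modules by greedy coloring, so that addresses accessed in the same parallel pattern land in distinct modules where possible.

```python
# Illustrative sketch: map array addresses to M parallel memory modules.
# A "data pattern" is a set of addresses fetched in one parallel access;
# two addresses conflict if they appear in the same pattern and are
# mapped to the same module. Modules play the role of colors in a
# greedy graph-coloring pass over the conflict graph.

from collections import defaultdict

def greedy_storage_scheme(patterns, num_modules):
    """Assign each address a module, avoiding intra-pattern conflicts."""
    # Conflict graph: an edge between every pair of addresses that
    # occur together in some pattern (each pattern forms a clique).
    neighbors = defaultdict(set)
    for pattern in patterns:
        for a in pattern:
            neighbors[a].update(b for b in pattern if b != a)

    scheme = {}
    for addr in sorted(neighbors):  # deterministic order for the sketch
        used = {scheme[n] for n in neighbors[addr] if n in scheme}
        free = [m for m in range(num_modules) if m not in used]
        # Take the lowest-numbered free module; if every module is
        # already used by a neighbor, accept a conflict on module 0.
        scheme[addr] = free[0] if free else 0
    return scheme

# A row pattern and a column pattern of a 4x4 array in row-major order:
patterns = [{0, 1, 2, 3}, {0, 4, 8, 12}]
scheme = greedy_storage_scheme(patterns, 4)
# With 4 modules, each pattern's addresses receive pairwise distinct
# modules, so both parallel accesses are conflict-free.
```

A simple interleaved scheme (address mod M) would serve the row pattern but put the entire column pattern in one module; building the scheme from the predicted patterns avoids that. The weighted variant studied in the paper would additionally prioritize frequently executed patterns when conflicts are unavoidable.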


REFERENCES

  1. M. Saghir, P. Chow, and C. Lee, Exploiting Dual Data-Memory Banks in Digital Signal Processors, Int'l. Conf. ASPLOS, pp. 234–243 (1996).

  2. E. Bugnion, J. Anderson, T. Mowry, M. Rosenblum, and M. Lam, Compiler Directed Page Coloring for Multiprocessors, Int'l. Conf. ASPLOS, pp. 244–255 (1996).

  3. K. Hwang and F. A. Briggs, Computer Architecture and Parallel Processing, McGraw-Hill Publishing (1987).

  4. H. G. Cragon, Memory Systems and Pipelined Processors, Jones and Bartlett Publishing (1996).

  5. A. Seznec and J. Lenfant, Interleaved Parallel Schemes, IEEE Trans. Parallel and Distributed Syst., 5(12):1329–1334 (December 1994).

  6. D. Lawrie, Access and Alignment of Data in an Array Processor, IEEE Trans. Computers, C-24(12):1145–1155 (December 1975).

  7. T. Cheung and J. E. Smith, A Simulation Study of the CRAY X-MP Memory System, IEEE Trans. Computers, C-35(7):613–622 (July 1986).

  8. P. Budnik and D. Kuck, The Organization and Use of Parallel Memories, IEEE Trans. Computers, C-20(12):1566–1569 (December 1971).

  9. D. T. Harper, III, Block, Multistride Vector, and FFT Accesses in Parallel Memory Systems, IEEE Trans. Parallel and Distributed Systems, 2(1):43–51 (January 1991).

  10. D. T. Harper, III, Increased Memory Performance During Vector Accesses Through the Use of Linear Address Transformations, IEEE Trans. Computers, 41(2):227–230 (February 1992).

  11. G. S. Sohi, High-bandwidth Interleaved Memories for Vector Processors-A Simulation Study, IEEE Trans. Computers, 42(1):34–44 (January 1993).

  12. J. M. Frailong, W. Jalby, and J. Lenfant, XOR-Schemes: A Flexible Data Organization in Parallel Memories, Proc. Int'l. Conf. Parallel Processing, pp. 276–283 (1985).

  13. A. Norton and E. Melton, A Class of Boolean Linear Transformations for Conflict-Free Power-of-Two Stride Access, Proc. Int'l. Conf. Parallel Processing, pp. 247–254 (1987).

  14. A. Deb, Multiskewing-A Novel Technique for Optimal Parallel Memory Access, IEEE Trans. Parallel and Distributed Systems, 7(6):595–604 (June 1996).

  15. S. McFarling, Program Optimization for Instruction Caches, Third Int'l. Conf. Architectural Support for Progr. Lang. Oper. Syst., pp. 183–191 (1989).

  16. P. P. Chang and W. W. Hwu, Achieving Very High Cache Performance with an Optimized Compiler, Proc. 16th Ann. Int'l. Symp. on Computer Architecture, pp. 242–251 (1989).

  17. R. Gupta and M. L. Soffa, Compile-Time Techniques for Improving Scalar Access Performance in Parallel Memories, IEEE Trans. Parallel and Distributed Systems, 2(2):138–148 (April 1991).

  18. P. Briggs and J. Feo, The Tera Programming Workshop, Int'l. Conf. Parallel Architectures and Compilation Techniques, Paris, France (October 1998).

  19. T. Sterling, A Hybrid Technology Multithreaded Computer Architecture for Petaflops Computing, MS 159–79, J.P.L., California Institute of Technology (January 1997).

  20. C. E. Kozyrakis and D. A. Patterson, A New Direction for Computer Architecture Research, IEEE Computer (November 1998).

  21. M. C. Pease, The Indirect Binary-Cube Microprocessor Array, IEEE Trans. on Computers, C-26(5):458–473 (May 1977).

  22. K. Y. Lee, On the Rearrangeability of a (2 log N-1) Stage Permutation Network, IEEE Trans. Computers, Vol. 34 (May 1985).

  23. M. Al-Mouhamed and S. Seiden, Minimization of Memory and Network Contention for Accessing Arbitrary Data Patterns in SIMD Systems, IEEE Trans. on Computers, 45(6):757–762 (June 1996).

  24. A. Blum, New Approximation Algorithms for Graph-Coloring, J. ACM, 41(3):470–516 (1994).

  25. X. Zhou, S.-I. Nakano, and T. Nishizeki, Edge-Coloring Partial k-Trees, J. Algorithms, 21(3):598–617 (November 1996).

  26. S. Y. Kung, Digital Neural Networks, Prentice Hall (1993).

  27. J. J. Hopfield, Neural Networks and Physical Systems with Emergent Collective Computational Abilities, Proc. Nat'l. Acad. Sci. U.S.A., 79:2554–2558 (1982).

  28. A. K. Jain, J. Mao, and K. M. Mohiuddin, Artificial Neural Networks: A Tutorial, IEEE Computer, 29(3):31–44 (March 1996).

  29. J. L. Filho, P. C. Treleaven, and C. Alippi, Genetic-Algorithm Programming Environments, IEEE Computer, 27(6):27–43 (June 1994).

  30. J. J. Grefenstette, Genesis: A System for Using Genetic Search Procedures, Proc. Conf. Intelligent Systems and Machines, pp. 161–165 (1984).

  31. D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, Reading, Massachusetts (1989).

  32. R. M. Haralick and L. G. Shapiro, Computer and Robot Vision, Addison-Wesley (1992).

  33. M. Quinn, Designing Efficient Algorithms for Parallel Computers, McGraw-Hill Inter., Second Edition (1988).

  34. M. Al-Mouhamed and S. Seiden, A Heuristic Storage for Minimizing Access Time of Arbitrary Data Patterns, IEEE Trans. Parallel and Distributed Systems, 8(4):441–447 (April 1997).

Cite this article

Al-Mouhamed, M., Abu-Haimed, H. Evaluation of Neural and Genetic Algorithms for Synthesizing Parallel Storage Schemes. International Journal of Parallel Programming 29, 365–399 (2001). https://doi.org/10.1023/A:1011146518909
