A Compositional Framework for Developing Parallel Programs on Two-Dimensional Arrays

International Journal of Parallel Programming

Computations on two-dimensional arrays such as matrices and images are among the most fundamental and ubiquitous tasks in computational science and its many application areas, but developing efficient parallel programs on two-dimensional arrays is known to be hard. In this paper, we propose a compositional framework that helps users, even those with little knowledge of parallel machines, systematically develop parallel programs on dense two-dimensional arrays that are both correct and efficient. The key feature of our framework is a novel use of the abide-tree representation of two-dimensional arrays. This representation not only inherits the advantages of tree representations of matrices, in which recursive blocked algorithms can be defined to achieve better performance, but also supports transformational development of parallel programs and architecture-independent implementation, owing to its solid theoretical foundation: the theory of constructive algorithmics.
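To make the abide-tree idea concrete, the following is a minimal sketch in Python, not the paper's own notation or library. It assumes the usual constructive-algorithmics presentation in which a two-dimensional array is either a single element, one block placed above another, or one block placed beside another. All names (`Singleton`, `Above`, `Beside`, `from_rows`, `map_tree`, `reduce_tree`, `to_rows`) are hypothetical and chosen only for illustration. The two operators passed to `reduce_tree` are assumed to be associative and to satisfy the abide property, (a ⊕ b) ⊗ (c ⊕ d) = (a ⊗ c) ⊕ (b ⊗ d), so that the result does not depend on how the array was split into blocks.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, List, Union


@dataclass(frozen=True)
class Singleton:
    value: Any          # a single array element


@dataclass(frozen=True)
class Above:
    top: "Tree"         # upper block
    bottom: "Tree"      # lower block (same width as top)


@dataclass(frozen=True)
class Beside:
    left: "Tree"        # left block
    right: "Tree"       # right block (same height as left)


Tree = Union[Singleton, Above, Beside]


def from_rows(rows: List[List[Any]]) -> Tree:
    """Build a tree from a non-empty rectangular list of rows by recursive halving."""
    height, width = len(rows), len(rows[0])
    if height == 1 and width == 1:
        return Singleton(rows[0][0])
    if height >= width:
        mid = height // 2
        return Above(from_rows(rows[:mid]), from_rows(rows[mid:]))
    mid = width // 2
    return Beside(from_rows([r[:mid] for r in rows]),
                  from_rows([r[mid:] for r in rows]))


def map_tree(f: Callable[[Any], Any], tree: Tree) -> Tree:
    """Apply f to every element, keeping the block structure."""
    if isinstance(tree, Singleton):
        return Singleton(f(tree.value))
    if isinstance(tree, Above):
        return Above(map_tree(f, tree.top), map_tree(f, tree.bottom))
    return Beside(map_tree(f, tree.left), map_tree(f, tree.right))


def reduce_tree(above_op: Callable[[Any, Any], Any],
                beside_op: Callable[[Any, Any], Any],
                tree: Tree) -> Any:
    """Collapse the tree, combining vertically with above_op and horizontally with beside_op.

    The two recursive calls in each branch are independent, which is what
    makes this shape of computation a candidate for parallel evaluation.
    """
    if isinstance(tree, Singleton):
        return tree.value
    if isinstance(tree, Above):
        return above_op(reduce_tree(above_op, beside_op, tree.top),
                        reduce_tree(above_op, beside_op, tree.bottom))
    return beside_op(reduce_tree(above_op, beside_op, tree.left),
                     reduce_tree(above_op, beside_op, tree.right))


def to_rows(tree: Tree) -> List[List[Any]]:
    """Flatten a tree back to a list of rows, written as a map followed by a reduce."""
    boxed = map_tree(lambda v: [[v]], tree)
    return reduce_tree(
        lambda top, bottom: top + bottom,                          # stack rows
        lambda left, right: [l + r for l, r in zip(left, right)],  # join rows side by side
        boxed,
    )


matrix = from_rows([[1, 2, 3], [4, 5, 6]])
total = reduce_tree(operator.add, operator.add, matrix)  # sum of all elements: 21
largest = reduce_tree(max, max, matrix)                  # maximum element: 6
```

This sketch runs sequentially. It only illustrates why the representation suits divide-and-conquer evaluation: a computation expressed as a map followed by a reduce can be split at any `Above` or `Beside` node without changing its result.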

Author information

Corresponding author

Correspondence to Kento Emoto.

About this article

Cite this article

Emoto, K., Hu, Z., Kakehi, K. et al. A Compositional Framework for Developing Parallel Programs on Two-Dimensional Arrays. Int J Parallel Prog 35, 615–658 (2007). https://doi.org/10.1007/s10766-007-0043-4

  • DOI: https://doi.org/10.1007/s10766-007-0043-4
