skip to main content
10.1145/2502323.2502327acmconferencesArticle/Chapter ViewAbstractPublication PagesicfpConference Proceedingsconference-collections
research-article

Towards a functional run-time for dense NLA domain

Authors Info & Claims
Published:23 September 2013Publication History

ABSTRACT

We investigate the use of functional programming to develop a numerical linear algebra run-time; i.e. a framework where the solvers can be adapted easily to different contexts and task parallelism can be attained (semi-) automatically. We follow a bottom up strategy, where the first step is the design and implementation of a framework layer, composed by a functional version of BLAS (Basic Linear Algebra Subprograms) routines. The framework allows the manipulation of arbitrary representations for matrices and vectors and it is also possible to write and combine multiple implementations of BLAS operations based on different algorithms and parallelism strategies. Using this framework, we implement a functional version of Cholesky factorization, which serves as a proof of concept to evaluate the flexibility and performance of our approach.

References

  1. SMP Superscalar (SMPSs) User's Manual, Version 2.4. Barcelona Supercomputing Center, Barcelona, 2011.Google ScholarGoogle Scholar
  2. E. Agullo, J. Demmel, J. Dongarra, B. Hadri, J. Kurzak, J. Langou, H. Ltaief, P. Luszczek, and S. Tomov. Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects. Journal of Physics: Conference Series, 180, 2009. 10.1088/1742-6596/180/1/012037.Google ScholarGoogle Scholar
  3. E. Anderson, Z. Bai, J. Demmel, J. E. Dongarra, J. DuCroz, A. Greenbaum, S. Hammarling, A. E. McKenney, S. Ostrouchov, and D. Sorensen. phLAPACK Users' Guide, Third Edition. SIAM, Philadelphia, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. R. M. Badia, J. R. Herrero, J. Labarta, J. M. Pérez, E. S. Quintana-Ortí, and G. Quintana-Ortí. Parallelizing dense and banded linear algebra libraries using smpss. Concurrency and Computation: Practice and Experience, 21 (18): 2438--2456, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. P. Bientinesi, B. Gunter, and R. A. v. d. Geijn. Families of algorithms related to the inversion of a symmetric positive definite matrix. ACM Trans. Math. Softw., 35 (1): 3:1--3:22, July 2008. ISSN 0098--3500. 10.1145/1377603.1377606. URL http://doi.acm.org/10.1145/1377603.1377606. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. J. Bilmes, K. Asanović, C. whye Chin, and J. Demmel. Optimizing matrix multiply using PHiPAC: a Portable, High-Performance, ANSI C coding methodology. In Proceedings of International Conference on Supercomputing, Vienna, Austria, July 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. L. S. Blackford, J. Demmel, J. Dongarra, I. Duff, S. Hammarling, G. Henry, M. Heroux, L. Kaufman, A. Lumsdaine, A. Petitet, R. Pozo, K. Remington, and R. C. Whaley. An updated set of basic linear algebra subprograms (blas). ACM Transactions on Mathematical Software, 28: 135--151, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. M. M. Chakravarty, G. Keller, S. Lee, T. L. McDonell, and V. Grover. Accelerating haskell array codes with multicore gpus. In Proceedings of the sixth workshop on Declarative aspects of multicore programming, DAMP '11, pages 3--14. ACM, 2011. ISBN 978--1--4503-0486--3. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. M. M. T. Chakravarty, R. Leshchinskiy, S. Peyton-Jones, G. Keller, and S. Marlow. Data parallel Haskell: a status report. In DAMP '07: Proceedings of the 2007 workshop on Declarative aspects of multicore programming, pages 10--18. ACM, 2007. ISBN 978-1-59593-690-5. 10.1145/1248648.1248652. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. E. Chan, F. G. V. Zee, P. Bientinesi, E. S. Quintana-Ortí, G. Quintana-Ortí, and R. A. van de Geijn. Supermatrix: a multithreaded runtime scheduling system for algorithms-by-blocks. In S. Chatterjee and M. L. Scott, editors, PPOPP, pages 123--132. ACM, 2008. ISBN 978-1-59593-795-7. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. J. Choi, J. J. Dongarra, R. Pozo, and D. W. Walker. ScaLAPACK: A scalable linear algebra library for distributed memory concurrent computers. In Proceedings of the Fourth Symposium on the Frontiers of Massively Parallel Computation, pages 120--127. IEEE Comput. Soc. Press, 1992.Google ScholarGoogle ScholarCross RefCross Ref
  12. A. Foltzer, A. Kulkarni, R. Swords, S. Sasidharan, E. Jiang, and R. Newton. A meta-scheduler for the par-monad: composable scheduling for the heterogeneous cloud. SIGPLAN Not., 47 (9): 235--246, Sept. 2012. ISSN 0362-1340. 10.1145/2398856.2364562. URL http://doi.acm.org/10.1145/2398856.2364562. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. A. V. Gerbessiotis. Algorithmic and Practical Considerations for Dense Matrix Computations on the BSP Model. PRG-TR 32, Oxford University Computing Laboratory, 1997. URL http://web.njit.edu/alexg/pubs/papers/PRG3297.ps.Google ScholarGoogle Scholar
  14. G. Golub and C. V. Loan. Matrix Computations. The Johns Hopkins University Press, Baltimore, 3rd edition, 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. R. Hinze. Fun with phantom types. In J. Gibbons and O. de Moor, editors, The Fun of Programming, Cornerstones of Computing, pages 245--262. Palgrave Macmillan, 2003.Google ScholarGoogle Scholar
  16. G. Keller, M. M. Chakravarty, R. Leshchinskiy, S. Peyton Jones, and B. Lippmeier. Regular, shape-polymorphic, parallel arrays in haskell. In Proceedings of the 15th Intl. Conf. on Funct. Progr., ICFP '10, pages 261--272. ACM, 2010. ISBN 978-1-60558-794-3. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. O. Kiselyov, S. P. Jones, and C. chieh Shan. Fun with type functions. In A. W. Roscoe, C. B. Jones, and K. Wood, editors, Reflections on the work of C. A. R. Hoare. Springer, 2010.Google ScholarGoogle Scholar
  18. S. Marlow, S. L. Peyton-Jones, and S. Singh. Runtime support for multicore Haskell. In ICFP 2009, Intl. Conf. on Functional Programming, pages 65--78. ACM, 2009. http://doi.acm.org/10.1145/1596550.1596563. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. S. Marlow, P. Maier, H.-W. Loidl, M. K. Aswad, and P. W. Trinder. Seq no more: Better strategies for parallel Haskell. In Haskell Symposium 2010, Baltimore, MD, USA, Sept. 2010. ACM Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. S. Marlow, R. Newton, and S. Peyton-Jones. A monad for deterministic parallelism. In Haskell '11: Proceedings of the Fourth Symposium on Haskell. ACM, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. S. Peyton Jones. Harnessing the multicores: Nested data parallelism in haskell. In Proceedings of the 6th Asian Symposium on Programming Languages and Systems, APLAS '08, pages 138--138, Berlin, Heidelberg, 2008. Springer-Verlag. ISBN 978-3-540-89329-5. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. S. Peyton-Jones et al. The Haskell 98 language and libraries: The revised report. Journal of Functional Programming, 13 (1): 0-255, Jan 2003.Google ScholarGoogle Scholar
  23. E. Quintana-Ortí;, G. Quintana-Ortí;, X. Sun, and R. van de Geijn. A note on parallel matrix inversion. SIAM Journal on Scientific Computing, 22 (5): 1762--1771, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. C. Whaley, A. Petitet, and J. J. Dongarra. Automated Empirical Optimization of Software and the ATLAS Project. In PARALLEL COMPUTING, volume 27, 2000.Google ScholarGoogle Scholar

Index Terms

  1. Towards a functional run-time for dense NLA domain

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in
        • Published in

          cover image ACM Conferences
          FHPC '13: Proceedings of the 2nd ACM SIGPLAN workshop on Functional high-performance computing
          September 2013
          104 pages
          ISBN:9781450323819
          DOI:10.1145/2502323

          Copyright © 2013 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 23 September 2013

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • research-article

          Acceptance Rates

          FHPC '13 Paper Acceptance Rate8of14submissions,57%Overall Acceptance Rate18of25submissions,72%

          Upcoming Conference

          ICFP '24

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader