DOI: 10.1145/2364527.2364564
research-article

Work efficient higher-order vectorisation

Published: 09 September 2012

ABSTRACT

Existing approaches to higher-order vectorisation, also known as flattening nested data parallelism, do not preserve the asymptotic work complexity of the source program. Straightforward examples, such as sparse matrix-vector multiplication, can suffer a severe blow-up in both time and space, which limits the practicality of this method. We discuss why this problem arises, identify the mis-handling of index space transforms as the root cause, and present a solution using a refined representation of nested arrays. We have implemented this solution in Data Parallel Haskell (DPH) and present benchmarks showing that realistic programs, which used to suffer the blow-up, now have the correct asymptotic work complexity. In some cases, the asymptotic complexity of the vectorised program is even better than the original.
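
As a rough illustration of the kind of blow-up the abstract describes, the sketch below writes sparse matrix-vector multiplication three ways using plain Haskell lists. It is not DPH code and does not reproduce the paper's refined array representation. All names (`SparseMatrix`, `Segmented`, `smvmNested`, `smvmNaive`, `smvmShared`) are hypothetical, and the assumption that the extra work comes from physically copying the dense vector once per row is a simplification made for this example. Lists stand in for arrays, so `!!` is not constant time here.

```haskell
-- A sparse matrix as a nested list: each row holds (column index, value) pairs.
type SparseRow    = [(Int, Double)]
type SparseMatrix = [SparseRow]

-- Nested formulation: one multiplication per non-zero element.
smvmNested :: SparseMatrix -> [Double] -> [Double]
smvmNested m v = [ sum [ x * (v !! i) | (i, x) <- row ] | row <- m ]

-- A flat stand-in for a nested array: segment lengths plus all elements.
data Segmented a = Segmented { segLens :: [Int], segData :: [a] }

flatten :: [[a]] -> Segmented a
flatten xss = Segmented (map length xss) (concat xss)

-- Segmented sum: one result per segment, empty segments give 0.
sumSegs :: [Int] -> [Double] -> [Double]
sumSegs []     _  = []
sumSegs (n:ns) xs = let (seg, rest) = splitAt n xs
                    in  sum seg : sumSegs ns rest

-- Naive flat version (assumed for illustration): the dense vector is
-- copied once per row, so building vCopies alone costs rows * length v,
-- regardless of how few non-zero elements the matrix has.
smvmNaive :: SparseMatrix -> [Double] -> [Double]
smvmNaive m v = sumSegs lens products
  where
    Segmented lens elems = flatten m
    vLen     = length v
    vCopies  = concat (replicate (length m) v)
    -- Row number of each flat element, used to pick the matching copy of v.
    rowIds   = concat (zipWith replicate lens [0 ..])
    products = [ x * (vCopies !! (r * vLen + i))
               | (r, (i, x)) <- zip rowIds elems ]

-- Flat version that shares the single vector instead of copying it.
smvmShared :: SparseMatrix -> [Double] -> [Double]
smvmShared m v = sumSegs lens [ x * (v !! i) | (i, x) <- elems ]
  where
    Segmented lens elems = flatten m
```

All three functions compute the same result. The difference is only in how much data is materialised: `smvmNaive` builds a structure whose size depends on the number of rows times the vector length, while `smvmShared` touches one vector element per non-zero matrix element.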


        • Published in

          ICFP '12: Proceedings of the 17th ACM SIGPLAN international conference on Functional programming
          September 2012
          392 pages
          ISBN: 9781450310543
          DOI: 10.1145/2364527
          • ACM SIGPLAN Notices, Volume 47, Issue 9
            ICFP '12
            September 2012
            368 pages
            ISSN: 0362-1340
            EISSN: 1558-1160
            DOI: 10.1145/2398856

          Copyright © 2012 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          ICFP '12 Paper Acceptance Rate: 32 of 88 submissions, 36%
          Overall Acceptance Rate: 333 of 1,064 submissions, 31%
