DOI: 10.1145/3412932.3412946
research-article
Open Access

Shapes and flattening

Published: 15 July 2021

ABSTRACT

Nesl is a first-order functional language with an apply-to-each construct and other parallel primitives that enable the expression of irregular nested data-parallel (NDP) algorithms. To compile Nesl, Blelloch and others developed a global flattening transformation that maps irregular NDP code into regular flat data-parallel (FDP) code suitable for execution on SIMD or SIMT architectures, such as GPUs.
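As a rough illustration of what flattening does to data, the sketch below represents an irregular nested sequence as a flat data vector plus a vector of segment lengths, and replaces a per-row sum with a single segmented sum over the flat vector. The function names `flatten` and `segmented_sum` are hypothetical and chosen for this example; this is a sequential Python model of the idea, not the transformation used by any Nesl compiler.

```python
from typing import List, Tuple


def flatten(nested: List[List[float]]) -> Tuple[List[int], List[float]]:
    """Represent an irregular nested sequence as (segment lengths, flat data)."""
    segdes = [len(row) for row in nested]          # one length per inner sequence
    data = [x for row in nested for x in row]      # all elements, in order
    return segdes, data


def segmented_sum(segdes: List[int], data: List[float]) -> List[float]:
    """Flat counterpart of summing each inner sequence of the nested input."""
    out = []
    start = 0
    for n in segdes:
        # Each segment is a contiguous slice of the flat data vector.
        out.append(sum(data[start:start + n]))
        start += n
    return out


nested = [[1.0, 2.0, 3.0], [], [4.0, 5.0]]
segdes, data = flatten(nested)
row_sums = segmented_sum(segdes, data)
```

The inner sequences may have different lengths, including zero, yet the flat form is just two regular vectors. On a SIMD or SIMT target, the loop in `segmented_sum` would be replaced by a parallel segmented reduction.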

While flattening solves the problem of mapping irregular parallelism into a regular model, it requires significant additional optimizations to produce performant code. Nessie is a compiler for Nesl that generates CUDA code for running on Nvidia GPUs. The Nessie compiler relies on a fairly complicated shape analysis that is performed on the FDP code produced by the flattening transformation. Shape analysis plays a key rôle in the compiler as it is the enabler of fusion optimizations, smart kernel scheduling, and other optimizations.
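To suggest why shape information can enable fusion, the toy model below applies two elementwise operations to a flattened sequence. Because an elementwise apply-to-each leaves the segment lengths untouched, knowing that both stages share one shape justifies collapsing them into a single pass. The helper `lifted_map` is a hypothetical name, and this sketch is an assumption-laden illustration, not Nessie's shape analysis or fusion pass.

```python
from typing import Callable, List, Tuple

Flat = Tuple[List[int], List[float]]  # (segment lengths, flat data)


def lifted_map(f: Callable[[float], float], segdes: List[int], data: List[float]) -> Flat:
    """Elementwise apply-to-each on a flattened sequence.

    Only the data vector changes; the segment lengths (the shape) pass through.
    """
    return segdes, [f(x) for x in data]


segdes = [3, 0, 2]
data = [1.0, 2.0, 3.0, 4.0, 5.0]

# Two separate passes over the data, as unfused code would run them.
s1, d1 = lifted_map(lambda x: x * 2.0, segdes, data)
s2, d2 = lifted_map(lambda x: x + 1.0, s1, d1)

# Same shape at both stages, so the two passes can be fused into one.
s_fused, d_fused = lifted_map(lambda x: x * 2.0 + 1.0, segdes, data)
```

In this model the fused version produces the same data and shape while traversing the flat vector once instead of twice, which on a GPU would correspond to launching one kernel rather than two.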

In this paper, we present a new approach to the shape analysis problem for Nesl that is simpler to implement and provides better-quality shape information. The key idea is to analyze the NDP representation of the program and then preserve shape information through the flattening transformation.


Published in

    IFL '19: Proceedings of the 31st Symposium on Implementation and Application of Functional Languages
    September 2019
    177 pages
    ISBN: 9781450375627
    DOI: 10.1145/3412932

    Copyright © 2019 Owner/Author

    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Acceptance Rates

    Overall Acceptance Rate: 19 of 36 submissions, 53%