Compiler-controlled multithreading for lenient parallel languages

  • Conference paper
Functional Programming Languages and Computer Architecture (FPCA 1991)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 523)

Abstract

Tolerance to communication latency and inexpensive synchronization are critical for general-purpose computing on large multiprocessors. Fast dynamic scheduling is required for powerful non-strict parallel languages. However, machines that support rapid switching between multiple execution threads remain a design challenge. This paper explores how multithreaded execution can be addressed as a compilation problem, to achieve switching rates approaching what hardware mechanisms might provide.

Compiler-controlled multithreading is examined through compilation of a lenient parallel language, Id90, for a threaded abstract machine, TAM. A key feature of TAM is that synchronization is explicit and occurs only at the start of a thread, so that a simple cost model can be applied. A scheduling hierarchy allows the compiler to schedule logically related threads closely together in time and to use registers across threads. Remote communication is via message sends and split-phase memory accesses. Messages and memory replies are received by compiler-generated message handlers which rapidly integrate these events with thread scheduling.
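
The synchronization model described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the TAM implementation: the names `Frame`, `post`, `run`, and the `(slot, value, thread)` reply tuples are hypothetical, and a plain queue stands in for the network. It shows a thread that becomes runnable only once its entry count reaches zero, and a handler that stores a split-phase reply and then signals the waiting thread.

```python
from collections import deque


class Frame:
    """Hypothetical activation frame: entry counters, data slots, ready queue."""

    def __init__(self, entry_counts):
        self.counts = dict(entry_counts)  # thread name -> signals still needed
        self.slots = {}                   # values delivered by reply handlers
        self.ready = deque()              # threads whose entry count hit zero

    def post(self, thread):
        """Signal a thread; it is enabled when its entry count reaches zero."""
        self.counts[thread] -= 1
        if self.counts[thread] == 0:
            self.ready.append(thread)


def run(frame, code, replies):
    """Run enabled threads; when none is ready, handle one pending reply."""
    trace = []
    while frame.ready or replies:
        if not frame.ready:
            # Stand-in for a message handler: store the value, post the thread.
            slot, value, thread = replies.popleft()
            frame.slots[slot] = value
            frame.post(thread)
            continue
        name = frame.ready.popleft()
        trace.append(name)
        code[name](frame, replies)  # a thread runs to completion, no waiting
    return trace


def start(frame, replies):
    # Issue two split-phase reads; the replies arrive later (assumed values).
    replies.append(("a", 3, "add"))
    replies.append(("b", 4, "add"))


def add(frame, replies):
    # Safe to read both slots: synchronization happened before the thread began.
    frame.slots["sum"] = frame.slots["a"] + frame.slots["b"]


def demo():
    frame = Frame({"start": 1, "add": 2})
    frame.post("start")
    trace = run(frame, {"start": start, "add": add}, deque())
    return trace, frame.slots["sum"]
```

In this sketch `add` has an entry count of 2, so it is scheduled only after both replies have been handled. Because all waiting happens before a thread starts, the body of each thread runs without any further synchronization checks, which is what makes a simple cost model plausible.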

To compile Id90 for TAM, we employ a new parallel intermediate form, dual-graphs, with distinct control and data arcs. This provides a clean framework for partitioning the program into threads, scheduling threads, and managing registers under asynchronous execution. The compilation process is described and preliminary measurements of its effectiveness are discussed. Dynamic execution measurements are obtained via a second compilation step, which translates TAM into native code for existing machines with instrumentation incorporated. These measurements show that the cost of compiler-controlled multithreading is within a small factor of the cost of control flow in sequential languages.

Editor information

John Hughes

Copyright information

© 1991 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Schauser, K.E., Culler, D.E., von Eicken, T. (1991). Compiler-controlled multithreading for lenient parallel languages. In: Hughes, J. (ed.) Functional Programming Languages and Computer Architecture. FPCA 1991. Lecture Notes in Computer Science, vol 523. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3540543961_4

  • DOI: https://doi.org/10.1007/3540543961_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-54396-1

  • Online ISBN: 978-3-540-47599-6

  • eBook Packages: Springer Book Archive
