Loop storage optimization for dataflow machines

  • IX. Compilers for DataFlow Machines
  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 589)

Abstract

In scientific computation, loops are frequently used to compute large quantities of data organized in arrays. On a dataflow machine, the main challenge is to exploit fine-grain parallelism as fully as possible to speed up loop execution without incurring more storage overhead than necessary.

The main contributions of this paper include:

  • The minimum storage requirement to support the maximum computation rate is analyzed, and a storage minimization scheme called limited balancing is introduced. The basic intuition is that, since the maximum computation rate is dominated by the critical cycles in the loop, we should not allocate storage beyond a limit bounded by the ratios of the critical cycles. In other words, all cycles should be balanced to have the same balancing ratio.

  • The limited balancing problem is formulated as an integer linear programming problem, and an efficient solution is presented: the problem is reduced to a network flow problem called the “minimum circulation flow” problem. A polynomial-time algorithm is thereby established for the solution of the linear relaxation of the limited balancing problem.
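
The role of critical cycles in the first contribution can be illustrated with a small sketch. It is not the paper's formulation: it assumes every node fires in one time unit, takes a cycle's ratio to be its token count divided by its node count (the usual bound for timed marked graphs), and enumerates cycles by brute force, which is only practical for tiny graphs. The names `cycle_ratios`, `max_rate`, and `example_edges` are hypothetical.

```python
from fractions import Fraction


def cycle_ratios(num_nodes, edges):
    """Return (ratio, nodes) for every simple cycle of a small graph.

    edges: list of (src, dst, tokens). Assumption: each node takes one
    time unit to fire, so a cycle's length equals its node count and its
    ratio is tokens / length.
    """
    out = {u: [] for u in range(num_nodes)}
    for src, dst, tokens in edges:
        out[src].append((dst, tokens))
    results = []

    def dfs(start, node, path, tokens):
        for nxt, t in out[node]:
            if nxt == start:
                # Closed a cycle back to the start node.
                results.append((Fraction(tokens + t, len(path)), list(path)))
            elif nxt > start and nxt not in path:
                # Only visit higher-numbered nodes so that each cycle is
                # reported once, from its smallest node.
                path.append(nxt)
                dfs(start, nxt, path, tokens + t)
                path.pop()

    for s in range(num_nodes):
        dfs(s, s, [s], 0)
    return results


def max_rate(num_nodes, edges):
    """The cycle with the smallest ratio (a critical cycle) bounds the rate."""
    return min(cycle_ratios(num_nodes, edges), key=lambda rc: rc[0])


# Hypothetical loop body: 0 -> 1 -> 2, with two loop-carried dependences
# (one token each) from node 2 back to nodes 0 and 1.
example_edges = [(0, 1, 0), (1, 2, 0), (2, 0, 1), (2, 1, 1)]
rate, critical_cycle = max_rate(3, example_edges)
print(rate, critical_cycle)
```

In this example the cycle through nodes 0, 1, 2 holds one token over three nodes, while the cycle through nodes 1, 2 holds one token over two nodes, so the longer cycle is the critical one and limits the loop to one iteration every three time units.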

Our formal framework is developed under a FIFO dataflow model in which each arc in the dataflow graph is a FIFO queue of a certain size. We establish the maximum computation rate of a loop under the earliest firing schedule, and show that the maximum computation rate is dominated by the critical cycles of the dataflow graph. We discuss how our results may be applied to both the static dataflow model and the dynamic dataflow model.
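
To make the FIFO model and the earliest firing schedule concrete, the following is a minimal simulation sketch under stated assumptions, not the paper's algorithm. It assumes unit firing time, that an actor fires as soon as every input queue holds a token and every output queue has a free slot, and that a slot freed in one step becomes visible to the producer in the next step. The graph, the function names `steady_rate` and `loop_graph`, and the warm-up and window lengths are all hypothetical choices for illustration.

```python
from fractions import Fraction


def steady_rate(arcs, num_actors, probe, warmup=12, window=12):
    """Firings per time step of actor `probe` under earliest firing.

    arcs: list of (src, dst, capacity, initial_tokens); each arc is a
    bounded FIFO queue. Assumption: every firing takes one time step.
    """
    tokens = [init for (_, _, _, init) in arcs]
    fired = 0
    for step in range(warmup + window):
        # Enabledness is decided from the state at the start of the step.
        enabled = []
        for actor in range(num_actors):
            ins = [i for i, a in enumerate(arcs) if a[1] == actor]
            outs = [i for i, a in enumerate(arcs) if a[0] == actor]
            has_input = all(tokens[i] >= 1 for i in ins)
            has_space = all(tokens[i] < arcs[i][2] for i in outs)
            if has_input and has_space:
                enabled.append(actor)
        # All enabled actors fire together: consume one token per input
        # arc and produce one token per output arc.
        for actor in enabled:
            for i, (src, dst, _, _) in enumerate(arcs):
                if dst == actor:
                    tokens[i] -= 1
                if src == actor:
                    tokens[i] += 1
        if step >= warmup and probe in enabled:
            fired += 1
    return Fraction(fired, window)


def loop_graph(skip_capacity):
    """Hypothetical loop: chain 0->1->2->3, loop-carried cycle 1->2->4->1
    (one initial token), and a short arc 0->3 whose queue size varies."""
    return [
        (0, 1, 1, 0),
        (1, 2, 1, 0),
        (2, 3, 1, 0),
        (2, 4, 1, 0),
        (4, 1, 1, 1),  # loop-carried dependence
        (0, 3, skip_capacity, 0),
    ]


rates = {c: steady_rate(loop_graph(c), 5, probe=3) for c in (1, 2, 3)}
for capacity, rate in rates.items():
    print(capacity, rate)
```

With a queue of size 1 on the short arc, actor 0 is throttled and the loop runs at 1/4. A queue of size 2 is enough to reach 1/3, the rate set by the three-actor cycle, and a queue of size 3 gives no further gain. This is the kind of limit that a storage minimization scheme can exploit.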

Editor information

Utpal Banerjee, David Gelernter, Alex Nicolau, David Padua

Copyright information

© 1992 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gao, G., Ning, Q. (1992). Loop storage optimization for dataflow machines. In: Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1991. Lecture Notes in Computer Science, vol 589. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0038676

  • DOI: https://doi.org/10.1007/BFb0038676

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-55422-6

  • Online ISBN: 978-3-540-47063-2

  • eBook Packages: Springer Book Archive
