Loop storage optimization for dataflow machines

Gao, G.; Ning, Q.

doi:10.1007/BFb0038676

IX. Compilers for DataFlow Machines
Conference paper
First Online: 01 January 2005

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 589)

Abstract

In scientific computation, loops are frequently used to compute large quantities of data organized in arrays. On a dataflow machine, the main challenge is to exploit fine-grain parallelism as fully as possible to speed up loop execution without incurring more storage overhead than necessary.

The main contributions of this paper include:

The minimum storage requirement to support the maximum computation rate is analyzed, and a storage minimization scheme called limited balancing is introduced. The basic intuition is that, since the maximum computation rate is dominated by the critical cycles in the loop, we should not allocate extra storage beyond a certain limit bounded by the ratios of the critical cycles. In other words, all cycles should be balanced to have the same balancing ratio.
The limited balancing problem is formulated as an integer linear programming problem. An efficient solution of the problem is presented: it reduces the problem to a network flow problem called the “minimum circulation flow” problem. Therefore, a polynomial-time algorithm is established for the solution of the linear relaxation of the limited balancing problem.
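
The notion of a critical cycle can be illustrated with a small sketch. It is not taken from the paper: it assumes the classical marked-graph view, in which every node takes one time unit and a cycle holding t tokens over k nodes cannot sustain more than t/k firings per time unit, so the cycle with the smallest ratio is critical. The graph, the helper names `simple_cycles` and `computation_rate`, and the unit firing time are all hypothetical choices made for illustration.

```python
from fractions import Fraction

def simple_cycles(edges):
    """Enumerate the simple cycles of a small directed graph.

    edges: list of (src, dst, tokens). Each cycle is returned once, as a
    list of edge indices starting from its smallest node.
    """
    nodes = sorted({u for u, _, _ in edges} | {v for _, v, _ in edges})
    out = {n: [] for n in nodes}
    for i, (u, _, _) in enumerate(edges):
        out[u].append(i)
    cycles = []

    def dfs(start, node, path, visited):
        for i in out[node]:
            nxt = edges[i][1]
            if nxt == start:
                cycles.append(path + [i])
            elif nxt > start and nxt not in visited:
                dfs(start, nxt, path + [i], visited | {nxt})

    for s in nodes:
        dfs(s, s, [], {s})
    return cycles

def computation_rate(edges):
    """Return (rate, critical_cycle): the minimum tokens/length over all cycles."""
    best_ratio, best_cycle = None, None
    for cycle in simple_cycles(edges):
        tokens = sum(edges[i][2] for i in cycle)
        ratio = Fraction(tokens, len(cycle))  # unit-time nodes: length = edge count
        if best_ratio is None or ratio < best_ratio:
            best_ratio, best_cycle = ratio, cycle
    return best_ratio, best_cycle

# Hypothetical loop: cycle 0->1->2->0 holds 1 token, cycle 0->1->0 holds 1 token.
EDGES = [(0, 1, 0), (1, 2, 0), (2, 0, 1), (1, 0, 1)]
rate, cycle = computation_rate(EDGES)
print(rate, [EDGES[i][:2] for i in cycle])  # 1/3 [(0, 1), (1, 2), (2, 0)]
```

Here the three-node cycle (ratio 1/3) is critical, while the two-node cycle (ratio 1/2) has slack. Exhaustive cycle enumeration only suits tiny graphs and is used here purely for readability.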

Our formal framework is developed under a FIFO dataflow model where each arc in the dataflow graph is a FIFO queue of a certain size. We establish the maximum computation rate of a loop under the earliest firing schedule, and show that the maximum computation rate is dominated by the critical cycles of the dataflow graph. We discuss how our results may be applied to both the static dataflow model and the dynamic dataflow model.
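
To make the FIFO model and the storage question concrete, the following toy sketch searches for the smallest queue sizes that still sustain a given rate. It is an illustration under stated assumptions, not the paper's algorithm: a queue of capacity b holding d initial tokens is modelled as a forward edge with d tokens plus a backward edge with b - d free slots, every node takes one time unit, and the search is brute force rather than the network-flow method the abstract describes. The graph `ARCS` and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def sustains_rate(num_nodes, edges, rate):
    """True if every cycle satisfies tokens/length >= rate.

    edges: list of (src, dst, tokens). This is Bellman-Ford negative-cycle
    detection on edge weights tokens - rate.
    """
    dist = [Fraction(0)] * num_nodes  # as if a virtual source reached every node
    for _ in range(num_nodes):
        changed = False
        for u, v, tokens in edges:
            if dist[u] + tokens - rate < dist[v]:
                dist[v] = dist[u] + tokens - rate
                changed = True
        if not changed:
            return True   # distances converged: no cycle is too slow
    return False          # still relaxing after n passes: a negative cycle exists

def fifo_graph(arcs, capacity):
    """Expand FIFO arcs (src, dst, initial_tokens) into data and free-slot edges."""
    edges = []
    for (u, v, d), b in zip(arcs, capacity):
        edges.append((u, v, d))      # data tokens travel forward
        edges.append((v, u, b - d))  # free slots travel backward
    return edges

# Hypothetical loop body: chain 0->1->2->3->4->5, shortcut 0->5, and a
# loop-carried arc 2->0 with one token closing the critical cycle 0->1->2->0.
ARCS = [(0, 1, 0), (1, 2, 0), (2, 3, 0), (3, 4, 0), (4, 5, 0), (0, 5, 0), (2, 0, 1)]
NUM_NODES = 6

def min_storage(target_rate, max_cap=3):
    """Brute-force the cheapest capacities (1..max_cap each) sustaining target_rate."""
    best = None
    for caps in product(range(1, max_cap + 1), repeat=len(ARCS)):
        if best is not None and sum(caps) >= sum(best):
            continue
        if sustains_rate(NUM_NODES, fifo_graph(ARCS, caps), target_rate):
            best = caps
    return best

best = min_storage(Fraction(1, 3))
print(best, sum(best))  # (1, 1, 1, 1, 1, 2, 1) 8
```

In this example the critical cycle limits the loop to a rate of 1/3, so only the shortcut arc needs a second slot: with capacity 1, the cycle formed by the five-arc chain and the shortcut's free-slot edge would hold one token over six nodes. Under the same model, balancing that fork-join for a rate of 1/2 would call for capacity 3 on the shortcut, storage that the critical cycle makes useless.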

Author information

Authors and Affiliations

McGill University, Canada
G. Gao & Q. Ning

Editor information

Utpal Banerjee, David Gelernter, Alex Nicolau, David Padua

Cite this paper

Gao, G., Ning, Q. (1992). Loop storage optimization for dataflow machines. In: Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1991. Lecture Notes in Computer Science, vol 589. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0038676

DOI: https://doi.org/10.1007/BFb0038676
Published: 10 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-55422-6
Online ISBN: 978-3-540-47063-2
eBook Packages: Springer Book Archive
