Abstract
Many important algorithms in signal and image processing can be described by uniform recurrences. A common method for synthesizing systolic arrays from uniform recurrences is based on space-time transformations, each of which consists of two linear mappings: an allocation function and a timing function. In this paper, we address the problem of finding space-time transformations that are time-optimal or at least nearly time-optimal. For a given allocation function, a continuous relaxation of this problem is studied by passing from linear to quasi-linear timing functions. A parametrized linear programming formulation is provided for finding quasi-linear timing functions. The solution of each such linear program, however, depends on the basis chosen to represent the null space of the allocation function. Therefore, a branching approach is proposed for finding quasi-linear timing functions that are optimal or at least have low latency. Several large test examples demonstrate that branching into hundreds or even thousands of linear subproblems can be carried out with reasonable effort and often leads to an optimal linear timing function.
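To make the notion of a linear timing function concrete, the following Python sketch searches by brute force for a low-latency integer timing vector on a small example. It is not the branching parametric linear programming method of the paper; it only illustrates the constraints involved. The example data are assumptions chosen for illustration: a cubic index domain {0, ..., n-1}^3, the three unit dependence vectors of a matrix-multiplication-like recurrence, and an allocation function whose null space is spanned by a single direction `u`. All function names are hypothetical.

```python
from itertools import product


def dot(a, b):
    """Inner product of two integer vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def latency(t, n):
    """Number of time steps used by the timing function z -> t.z on the
    cubic index domain {0, ..., n-1}^dim (assumed domain shape)."""
    # On a box, max(t.z) - min(t.z) equals sum(|t_i|) * (n - 1).
    return sum(abs(c) for c in t) * (n - 1) + 1


def is_valid(t, deps, u):
    """Check a candidate timing vector t.

    Causality: t.d >= 1 for every dependence vector d.
    No conflicts: t.u != 0, where u spans the (assumed one-dimensional)
    null space of the allocation function, so that two index points
    mapped to the same processor never share a time step.
    """
    return all(dot(t, d) >= 1 for d in deps) and dot(t, u) != 0


def best_linear_timing(deps, u, n, bound=2):
    """Exhaustively search integer vectors with entries in [-bound, bound]
    and return a valid one of minimum latency (ties broken lexicographically),
    or None if no valid vector exists within the bound."""
    candidates = (
        t
        for t in product(range(-bound, bound + 1), repeat=len(u))
        if is_valid(t, deps, u)
    )
    return min(candidates, key=lambda t: (latency(t, n), t), default=None)


if __name__ == "__main__":
    # Assumed example: unit dependences, projection along (1, 1, 1).
    deps = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
    u = (1, 1, 1)
    t = best_linear_timing(deps, u, n=4)
    print(t, latency(t, 4))
```

Exhaustive enumeration of this kind grows exponentially with the dimension and the search bound, which is one reason a continuous relaxation solved by linear programming, as studied in the paper, is attractive for larger problems.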
Cite this article
Zimmermann, KH., Achtziger, W. Finding Space-Time Transformations for Uniform Recurrences via Branching Parametric Linear Programming. The Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology 15, 259–274 (1997). https://doi.org/10.1023/A:1007963228049