Abstract
In this paper, we describe an approach for optimizing dedicated co-processors that are implemented either in hardware (ASIC) or configware (FPGA). Such co-processors are typically part of a heterogeneous hardware/software system, and each one is a massively parallel system consisting of an array of processing elements (PEs). When deciding whether to map a computationally intensive task into hardware, existing approaches optimize either for performance or for cost, treating the other objective as a secondary goal. Our approach instead considers multiple objectives simultaneously: (a) For a given specification, we explore space-time mappings that lead to different degrees of parallelism and cost, and hence to different optimal hardware solutions. (b) We show that the hardware cost can be determined efficiently in terms of the chosen space-time mapping by using state-of-the-art techniques in polyhedral theory. (c) We introduce ideas to drastically reduce the dimension and size of the search space of mapping candidates. (d) We show the feasibility of our approach on two realistic examples.
Supported in part by the German Science Foundation (DFG) Project SFB 376 “Massively Parallel Computation”.
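To make the idea of exploring space-time mappings under several objectives concrete, the following is a minimal sketch and not the authors' method. It assumes a two-dimensional rectangular iteration space, a single hypothetical data dependence, and small hand-picked candidate sets for the projection vector u (which iterations share a PE) and the schedule vector lam (when each iteration runs). The PE count stands in for hardware cost and is obtained by brute-force enumeration rather than by the polyhedral counting techniques the abstract refers to. The names explore and pareto_front are illustrative only.

```python
from itertools import product


def explore(n1, n2, deps, projections, lam_range):
    """Enumerate feasible (projection, schedule) pairs with their objectives."""
    points = list(product(range(n1), range(n2)))
    designs = []
    for u in projections:
        # For a primitive 2-D projection vector u, iterations that differ by a
        # multiple of u share a PE; the normal of u yields a PE index.
        normal = (-u[1], u[0])
        cost = len({normal[0] * i + normal[1] * j for i, j in points})
        for lam in product(lam_range, repeat=2):
            # Causality: every dependence must take at least one time step.
            if any(lam[0] * d[0] + lam[1] * d[1] < 1 for d in deps):
                continue
            # Conflict freedom: iterations on one PE must not share a time step.
            if lam[0] * u[0] + lam[1] * u[1] == 0:
                continue
            times = [lam[0] * i + lam[1] * j for i, j in points]
            latency = max(times) - min(times) + 1
            designs.append((cost, latency, u, lam))
    return designs


def pareto_front(designs):
    """Keep the designs that no other design strictly dominates."""
    def dominates(a, b):
        return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])
    return [d for d in designs if not any(dominates(e, d) for e in designs)]


# Hypothetical 4 x 3 iteration space with one dependence along the second axis.
designs = explore(
    4, 3,
    deps=[(0, 1)],
    projections=[(1, 0), (0, 1), (1, 1), (1, -1)],
    lam_range=range(-1, 2),
)
for cost, latency, u, lam in sorted(pareto_front(designs)):
    print(f"PEs={cost} latency={latency} u={u} lam={lam}")
```

In this toy instance, the projection with the fewest PEs rules out the fastest schedule, so the front contains both a cheaper, slower design and a costlier, faster one. This is the kind of trade-off that a single-objective search would hide.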
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hannig, F., Teich, J. (2001). Design Space Exploration for Massively Parallel Processor Arrays. In: Malyshkin, V. (eds) Parallel Computing Technologies. PaCT 2001. Lecture Notes in Computer Science, vol 2127. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44743-1_5
DOI: https://doi.org/10.1007/3-540-44743-1_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42522-9
Online ISBN: 978-3-540-44743-6
eBook Packages: Springer Book Archive