Abstract
We explore the area efficiency of a class of stream-based dataflow architectures as a function of grain-size, for a given set of applications. We believe grain-size is a key parameter in balancing the flexibility and efficiency of this class of architectures. We apply a clustering approach to a well-defined set of applications to derive sets of processing elements of varying grain-sizes, and compare the resulting architectures with respect to their silicon area. For a set of twenty-one industrially relevant video algorithms, we determined architectures with various grain-sizes. The results indicate that moving from fine-grain to coarser-grain processing elements improves silicon area by a factor of two.
Cite this article
Lieverse, P., Deprettere, E., Kienhuis, A. et al. A Clustering Approach to Explore Grain-Sizes in the Definition of Processing Elements in Dataflow Architectures. The Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology 22, 9–20 (1999). https://doi.org/10.1023/A:1008113601237
DOI: https://doi.org/10.1023/A:1008113601237