Abstract
The architecture of a low-cost, hosted MIMD parallel processing system built from parallel processor chips interconnected by a hierarchy of crossbars is described. The parallel processing system attaches to the system bus of the host and uses the host's operating system and programming environment. Each parallel processor chip contains 64 processors. The processors are architecturally simple and structured so that data streams can be processed efficiently using dataflow semantics. A static dataflow model of computation is assumed for programming the chip. Arithmetic, logical, multiply, conditional branch, and select instructions are supported. Each processor has a 16-bit data path and a microcontroller. The processors in a chip are clustered to reduce data communication latency: eight processors are grouped into a cluster, and there are eight clusters in a chip. Segmented and switched buses are used for intra-cluster and inter-cluster communication within a chip. One global bus supplies instructions to the processors during program setup and communicates the status of the processors during program execution. Two further global buses transfer data between the external memory or I/O devices and the data memories of the processors. Each chip has two ports with 16 bits of data, 16 bits of processor address, and control signals for connecting to other chips through a hierarchical crossbar interconnection network. The interconnection network is based on a 16 × 16 crossbar chip with 32 ports (16 paths), capable of connecting either 16 processor chips, or 15 processor chips and a second-level crossbar chip. With two levels of crossbar chips it is possible to connect 225 parallel processor chips and achieve one teraop in a shoebox-sized system. The target applications for the parallel processing system are image processing, machine vision, video compression and decompression, and 3-D imaging.
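The sizing figures in the abstract can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the function names are hypothetical, and two points are assumptions rather than statements from the abstract. First, the 225-chip figure is read as 15 first-level crossbars, each serving 15 processor chips. Second, "one teraop" is taken as 10^12 operations per second spread evenly across all processors, which yields a per-processor rate that the abstract does not itself state.

```python
# Figures taken from the abstract.
PROCESSORS_PER_CLUSTER = 8
CLUSTERS_PER_CHIP = 8
CROSSBAR_PATHS = 16          # 16 x 16 crossbar chip: 32 ports, 16 paths

# Assumption: "one teraop" means 1e12 operations per second.
TERAOP = 1e12


def processors_per_chip() -> int:
    """Eight clusters of eight processors."""
    return PROCESSORS_PER_CLUSTER * CLUSTERS_PER_CHIP


def chips_one_level() -> int:
    """A single crossbar chip uses all 16 paths for processor chips."""
    return CROSSBAR_PATHS


def chips_two_levels() -> int:
    """Assumed reading of the 225-chip figure: each first-level crossbar
    gives up one path as an uplink (15 processor chips each), and the
    second-level crossbar joins 15 such first-level crossbars."""
    chips_per_first_level = CROSSBAR_PATHS - 1
    first_level_crossbars = CROSSBAR_PATHS - 1
    return chips_per_first_level * first_level_crossbars


def total_processors(chips: int) -> int:
    return chips * processors_per_chip()


def ops_per_processor(chips: int, system_ops: float = TERAOP) -> float:
    """Derived estimate: system throughput divided evenly over processors."""
    return system_ops / total_processors(chips)


if __name__ == "__main__":
    chips = chips_two_levels()
    print("processors per chip:", processors_per_chip())
    print("chips with two crossbar levels:", chips)
    print("processors in the full system:", total_processors(chips))
    print("implied Mops per processor: %.1f" % (ops_per_processor(chips) / 1e6))
```

Under these assumptions the full system holds 14,400 processors, so a one-teraop aggregate would correspond to roughly 69 million operations per second per processor. That per-processor number is a back-of-the-envelope inference, not a figure reported in the abstract.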
Copyright information
© 1995 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Srini, V.P. (1995). DFS-superMPx: Low-cost parallel processing system for machine vision and image processing. In: Malyshkin, V. (eds) Parallel Computing Technologies. PaCT 1995. Lecture Notes in Computer Science, vol 964. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60222-4_125
DOI: https://doi.org/10.1007/3-540-60222-4_125
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60222-4
Online ISBN: 978-3-540-44754-2
eBook Packages: Springer Book Archive