DFS-superMPx: Low-cost parallel processing system for machine vision and image processing

  • Conference paper
Parallel Computing Technologies (PaCT 1995)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 964)

Abstract

This paper describes the architecture of a low-cost, hosted MIMD parallel processing system built from parallel processor chips interconnected by a hierarchy of crossbars. The system attaches to the system bus of the host and uses the host's operating system and programming environment. Each parallel processor chip contains 64 processors. The processors are architecturally simple and structured so that data streams can be processed efficiently using dataflow semantics; a static dataflow model of computation is assumed for programming the chip. Arithmetic, logical, multiply, conditional branch, and select instructions are supported. Each processor has a 16-bit data path and a microcontroller. To reduce data communication latency, the processors in a chip are clustered: eight processors form a cluster, and each chip has eight clusters. Segmented and switched buses are used for intra-cluster and inter-cluster communication within a chip. One global bus supplies instructions to the processors during program setup and communicates processor status during program execution. Two global buses transfer data between external memory or I/O devices and the data memories of the processors. Each chip has two ports with 16 bits of data, 16 bits of processor address, and control signals for connecting to other chips through a hierarchical crossbar interconnection network. The interconnection network is based on a 16 × 16 crossbar chip with 32 ports (16 paths) that can connect either 16 processor chips, or 15 processor chips and a second-level crossbar chip. With two levels of crossbar chips, 225 parallel processor chips can be connected to achieve one teraop in a shoebox-sized system. The target applications are image processing, machine vision, video compression and decompression, and 3-D imaging.
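The counts quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: all function and constant names are hypothetical, and the reading that the 225-chip figure comes from 15 first-level crossbars with 15 processor chips each (15 × 15) is an assumption that matches the number but is not spelled out in the abstract. The per-processor rate it prints is simply one teraop divided by the processor count, not a figure reported by the paper.

```python
# Hypothetical names; the numeric constants are taken from the abstract.
PROCESSORS_PER_CLUSTER = 8
CLUSTERS_PER_CHIP = 8
CROSSBAR_PATHS = 16          # a 16 x 16 crossbar chip offers 16 paths
TARGET_OPS_PER_SECOND = 1e12  # "one teraop"


def processors_per_chip() -> int:
    """Eight clusters of eight processors each."""
    return PROCESSORS_PER_CLUSTER * CLUSTERS_PER_CHIP


def one_level_chips() -> int:
    """A single crossbar connects up to 16 processor chips."""
    return CROSSBAR_PATHS


def two_level_chips() -> int:
    """Assumption: each first-level crossbar gives up one path as an
    uplink (leaving 15 chips), and 15 such crossbars are attached to
    the second level, which reproduces the abstract's 225 chips."""
    chips_per_first_level = CROSSBAR_PATHS - 1
    first_level_crossbars = CROSSBAR_PATHS - 1
    return chips_per_first_level * first_level_crossbars


total_processors = two_level_chips() * processors_per_chip()
# Average rate each processor would need to sustain for the system
# to reach the target; derived by division only.
required_rate = TARGET_OPS_PER_SECOND / total_processors

print(f"processors per chip:   {processors_per_chip()}")
print(f"chips with one level:  {one_level_chips()}")
print(f"chips with two levels: {two_level_chips()}")
print(f"total processors:      {total_processors}")
print(f"required Mops/processor: {required_rate / 1e6:.1f}")
```

Under these assumptions the two-level system holds 14,400 processors, so the one-teraop figure corresponds to an average of roughly 69 million operations per second per processor.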

Editor information

Victor Malyshkin

Copyright information

© 1995 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Srini, V.P. (1995). DFS-superMPx: Low-cost parallel processing system for machine vision and image processing. In: Malyshkin, V. (eds) Parallel Computing Technologies. PaCT 1995. Lecture Notes in Computer Science, vol 964. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60222-4_125

  • DOI: https://doi.org/10.1007/3-540-60222-4_125

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-60222-4

  • Online ISBN: 978-3-540-44754-2

  • eBook Packages: Springer Book Archive
