Abstract
The memory subsystem for computer vision and image processing applications must sustain high memory bandwidth to keep processors busy. This paper advocates the use of stream descriptors, a mechanism that allows programmers to indicate data movement explicitly. Stream descriptors enable the compiler to organize memory transfers more efficiently by matching data movement to the capabilities of the underlying hardware. Stream descriptors are used in this paper on an image sensor interface to describe the deterministic movements of objects in segmented image regions. The paper shows how stream descriptors reduce the bandwidth requirements for a set of computer vision applications.
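To make the idea of indicating data movement explicitly more concrete, the following is a minimal sketch of what a stream descriptor for a rectangular image region might look like. The abstract does not define the descriptor's fields, so the names used here (`start`, `stride`, `span`, `skip`, `count`) and the helper `roi_descriptor` are illustrative assumptions, not the paper's definition. The sketch only shows how a compact description can stand in for a deterministic access pattern over a row-major image.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class StreamDescriptor:
    """Hypothetical descriptor; field names are assumptions for illustration."""
    start: int   # index of the first element in the stream
    stride: int  # step between consecutive elements inside one span
    span: int    # number of elements read before a skip is applied
    skip: int    # step from the last element of a span to the next span
    count: int   # total number of elements in the stream

    def addresses(self) -> Iterator[int]:
        """Yield the element indices the descriptor describes, in order."""
        addr = self.start
        for i in range(self.count):
            yield addr
            # At the end of a span jump by `skip`, otherwise step by `stride`.
            if (i + 1) % self.span == 0:
                addr += self.skip
            else:
                addr += self.stride


def roi_descriptor(image_width: int, x: int, y: int,
                   roi_width: int, roi_height: int) -> StreamDescriptor:
    """Describe a rectangular region of a row-major image as one stream."""
    return StreamDescriptor(
        start=y * image_width + x,
        stride=1,
        span=roi_width,
        skip=image_width - roi_width + 1,
        count=roi_width * roi_height,
    )


if __name__ == "__main__":
    # A 3x2 region at column 2, row 1 of an 8-pixel-wide image.
    desc = roi_descriptor(image_width=8, x=2, y=1, roi_width=3, roi_height=2)
    print(list(desc.addresses()))  # [10, 11, 12, 18, 19, 20]
```

Because the whole access pattern is known from five integers, a compiler or memory controller could in principle fetch only the six pixels of the region rather than two full image rows, which is the kind of bandwidth saving the abstract refers to.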
Memory bandwidth optimization through stream descriptors
MEDEA '05: Proceedings of the 2005 workshop on MEmory performance: DEaling with Applications, systems and architecture