Streaming Data Movement for Real-Time Image Analysis

López-Lagunas, Abelardo; Chai, Sek

doi:10.1007/s11265-008-0336-x

Published: 22 January 2009

Volume 62, pages 29–42, (2011)
Journal of Signal Processing Systems


Abstract

High-performance portable systems for real-time video and image analysis continue to demand high processing power and memory bandwidth. In embedded systems such as digital still cameras, camcorders, and camera phones, the expected performance must be delivered while meeting size, weight, and power constraints. A well-designed system should include analyses of its memory subsystem as well as its computation platform. This paper focuses on a streaming memory subsystem that leverages deterministic memory access patterns. We formalize the notion of stream descriptors as a means to define these stream access patterns and to improve memory access efficiency by discovering locality between different data streams. Data movement analyses for real-time image analysis applications are performed, showing favorable bandwidth savings using stream descriptors.
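
As a rough illustration of the idea in the abstract, the sketch below models a stream descriptor as a compact record of a deterministic access pattern and measures how many addresses two streams share. It is only a sketch under stated assumptions: the field names (`start`, `stride`, `span`, `skip`, `count`), the `shared_fraction` helper, and the 8-pixel-wide example image are hypothetical and are not taken from the paper's formal definition.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class StreamDescriptor:
    """Hypothetical descriptor of a regular memory access pattern."""
    start: int   # address of the first element
    stride: int  # address step between consecutive elements
    span: int    # number of elements read before a skip is applied
    skip: int    # extra address offset added after each span
    count: int   # total number of elements in the stream

    def addresses(self) -> Iterator[int]:
        """Yield the element addresses in access order."""
        addr = self.start
        for i in range(self.count):
            yield addr
            addr += self.stride
            if (i + 1) % self.span == 0:
                addr += self.skip


def shared_fraction(a: StreamDescriptor, b: StreamDescriptor) -> float:
    """Fraction of the combined accesses that hit an address both streams touch."""
    sa, sb = set(a.addresses()), set(b.addresses())
    total = len(sa) + len(sb)
    return (total - len(sa | sb)) / total


# Assumed example: an 8-pixel-wide image, two 4x3 windows offset by one row.
window_a = StreamDescriptor(start=0, stride=1, span=4, skip=4, count=12)
window_b = StreamDescriptor(start=8, stride=1, span=4, skip=4, count=12)
print(list(window_a.addresses()))
print(shared_fraction(window_a, window_b))
```

In this toy case the two windows overlap on two of their three rows, so a memory subsystem that recognizes the shared addresses from the descriptors alone could, in principle, fetch those rows once instead of twice.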


Notes

Patents are pending that claim aspects of items and methods described in this paper. A full description is beyond the scope of this paper.


Acknowledgment

The authors acknowledge previous contributions by the RSVP™ design team at Motorola Labs.

Author information

Authors and Affiliations

Instituto Tecnológico y de Estudios Superiores de Monterrey Campus Toluca, Toluca, México
Abelardo López-Lagunas
Motorola Labs, Schaumburg, IL, USA
Sek Chai

Corresponding author

Correspondence to Sek Chai.

Additional information

RSVP™ is a trademark of Motorola Inc. Other product names are the property of their respective owners.

About this article

Cite this article

López-Lagunas, A., Chai, S. Streaming Data Movement for Real-Time Image Analysis. J Sign Process Syst 62, 29–42 (2011). https://doi.org/10.1007/s11265-008-0336-x

Received: 23 April 2008
Revised: 07 December 2008
Accepted: 23 December 2008
Published: 22 January 2009
Issue Date: January 2011
DOI: https://doi.org/10.1007/s11265-008-0336-x
