
Streaming Data Movement for Real-Time Image Analysis

Journal of Signal Processing Systems

Abstract

High-performance portable systems for real-time video and image analysis continue to demand high processing power and memory bandwidth. In embedded systems such as digital still cameras, camcorders, and camera phones, the expected performance must be delivered within size, weight, and power constraints. A well-designed system should therefore include an analysis of its memory subsystem as well as of its computation platform. This paper focuses on a streaming memory subsystem that leverages deterministic memory access patterns. We formalize the notion of stream descriptors as a means to define these stream access patterns and to improve memory access efficiency by discovering locality between different data streams. Data movement for a real-time image analysis application is evaluated, showing favorable bandwidth savings from the use of stream descriptors.
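As an illustration of the idea, a stream descriptor can be pictured as a compact record from which a deterministic address sequence is generated. The sketch below is not the authors' definition: the field names (`start`, `stride`, `span`, `skip`, `count`), the `StreamDescriptor` class, and the `shared_fraction` overlap measure are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class StreamDescriptor:
    """Hypothetical descriptor of a strided access pattern (names are assumptions)."""
    start: int   # address of the first element
    stride: int  # address step between consecutive elements within a run
    span: int    # number of elements in a run before a skip is taken
    skip: int    # address step from the last element of a run to the next run
    count: int   # total number of elements in the stream

    def __post_init__(self) -> None:
        if self.span < 1 or self.count < 1:
            raise ValueError("span and count must be at least 1")

    def addresses(self) -> Iterator[int]:
        """Yield the address of every element, in stream order."""
        addr = self.start
        for i in range(self.count):
            yield addr
            # At the end of a run jump by skip, otherwise step by stride.
            addr += self.skip if (i + 1) % self.span == 0 else self.stride


def shared_fraction(a: StreamDescriptor, b: StreamDescriptor) -> float:
    """Fraction of a's addresses that b also touches (a crude locality measure)."""
    a_addrs = set(a.addresses())
    b_addrs = set(b.addresses())
    return len(a_addrs & b_addrs) / len(a_addrs)


# Two 4x2 windows in an 8-pixel-wide image, the second shifted down one row.
upper = StreamDescriptor(start=0, stride=1, span=4, skip=5, count=8)
lower = StreamDescriptor(start=8, stride=1, span=4, skip=5, count=8)
print(list(upper.addresses()))        # [0, 1, 2, 3, 8, 9, 10, 11]
print(shared_fraction(upper, lower))  # 0.5
```

In this toy case half of the first stream's addresses are also read by the second, which is the kind of overlap between streams that a memory subsystem could exploit to avoid fetching the same data twice.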




Notes

  1. Patents are pending that claim aspects of items and methods described in this paper. A full description is beyond the scope of this paper.

Acknowledgment

The authors acknowledge previous contributions by the RSVP™ design team at Motorola Labs.

Author information

Corresponding author

Correspondence to Sek Chai.

Additional information

RSVP™ is a trademark of Motorola Inc. Other product names are the property of their respective owners.

About this article

Cite this article

López-Lagunas, A., Chai, S. Streaming Data Movement for Real-Time Image Analysis. J Sign Process Syst 62, 29–42 (2011). https://doi.org/10.1007/s11265-008-0336-x

  • DOI: https://doi.org/10.1007/s11265-008-0336-x
