An extended analysis of memory hierarchies for efficient implementations of image processing applications

  • Special Issue Paper
Journal of Real-Time Image Processing

Abstract

Through the continued miniaturization of electronic devices, embedded smart cameras are steadily becoming more important. Reducing the camera size widens the spectrum of applications. In industrial applications, smart cameras are used for tasks ranging from quality monitoring and position tracking to the calibration of production machines. In non-professional applications, a distinct boom in action cameras combined with fused sensor information can be observed. However, all of these applications share a common bottleneck: the memory architecture. Most image processing applications are memory-bound tasks, so the time spent transferring data decisively affects the application’s entire processing time. Different memory access patterns require different memory configurations and hierarchies. A poor match between the image processing application and the memory architecture degrades the performance of the image processing system, which can lead to longer processing times and higher energy consumption. This work introduces new methods for classifying image processing applications by their memory access patterns in order to map them onto memory architectures. Our work combines a simulation framework, the heterogeneous memory simulator, with an analytical framework, the memory analyzer, to find bottlenecks inside the image processing application and to help find a suitable, application-specific memory configuration in terms of processing time and energy consumption.
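To make the abstract's point about access patterns concrete, the following is a minimal sketch of a toy direct-mapped cache model. It is not the heterogeneous memory simulator or the memory analyzer described in the paper. The function names, the cache geometry (64 lines of 64 bytes) and the 256 × 256 one-byte-per-pixel image are all assumptions chosen for illustration.

```python
def count_misses(addresses, num_lines=64, line_bytes=64):
    """Count misses of a toy direct-mapped cache for a byte-address trace.

    num_lines and line_bytes are illustrative assumptions, not values
    taken from the paper.
    """
    tags = [None] * num_lines           # block currently held by each cache line
    misses = 0
    for addr in addresses:
        block = addr // line_bytes      # memory block that contains this byte
        index = block % num_lines       # cache line this block maps to
        if tags[index] != block:        # line holds another block (or nothing)
            tags[index] = block
            misses += 1
    return misses


def row_major(width, height):
    """Addresses of a row-by-row traversal (image stored row-major)."""
    return [y * width + x for y in range(height) for x in range(width)]


def column_major(width, height):
    """Addresses of a column-by-column traversal of the same image."""
    return [y * width + x for x in range(width) for y in range(height)]


# Hypothetical 8-bit grayscale image, one byte per pixel.
WIDTH, HEIGHT = 256, 256

row_misses = count_misses(row_major(WIDTH, HEIGHT))
col_misses = count_misses(column_major(WIDTH, HEIGHT))

print("row-major misses:   ", row_misses)
print("column-major misses:", col_misses)
```

Under these assumed parameters, the row-wise traversal misses once per 64-byte line (1024 misses), while the column-wise traversal misses on every one of the 65536 accesses, because pixels in the same column that are 16 rows apart map to the same cache line and keep evicting each other. The same pixels are read in both cases, so the difference comes from the match between access pattern and memory configuration alone.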

Acknowledgements

This work is supported by the Bavarian Research Foundation (BFS) as part of their research project “FORMUS3IC”.

Author information

Corresponding author

Correspondence to Christian Hartmann.

About this article

Cite this article

Hartmann, C., Fey, D. An extended analysis of memory hierarchies for efficient implementations of image processing applications. J Real-Time Image Proc 14, 713–728 (2018). https://doi.org/10.1007/s11554-017-0723-2

  • DOI: https://doi.org/10.1007/s11554-017-0723-2
