A novel global methodology to analyze the embeddability of real-time image processing algorithms

  • Special Issue Paper
  • Journal of Real-Time Image Processing

Abstract

Advanced driver assistance system (ADAS) applications increasingly rely on cameras and image processing algorithms. To embed these algorithms and execute them in real time, semiconductor companies offer heterogeneous systems-on-chip (SoCs). Embedding algorithms on this type of hardware is not trivial: one must determine how to partition the computational load across the different processing units. In addition, it is difficult to predict whether a given algorithm can run on a given heterogeneous SoC while meeting real-time constraints. We propose a novel global methodology to assist with embedding image processing algorithms on heterogeneous SoCs while meeting real-time constraints (using a soft real-time analysis). Our approach provides several heuristics that predict delays and execution times, and it is based on a set of multi-level test vectors that extract key features of heterogeneous architectures.
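To make the embeddability question concrete, the sketch below checks one candidate mapping of a sequential pipeline against a frame deadline. It is an illustration only, not the heuristics of the article: the kernel names, processing units, timing values, the lower/upper-bound representation of each prediction, and the fixed transfer delay are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Predicted time in milliseconds, as a [lo, hi] bound (assumed representation)."""
    lo: float
    hi: float

    def __add__(self, other: "Interval") -> "Interval":
        # Bounds of a sum are the sums of the bounds.
        return Interval(self.lo + other.lo, self.hi + other.hi)


def predict_frame_time(mapping, kernel_times, transfer_delay):
    """Add up kernel times, plus one transfer delay whenever the unit changes."""
    total = Interval(0.0, 0.0)
    previous_unit = None
    for kernel, unit in mapping:
        if previous_unit is not None and unit != previous_unit:
            total = total + transfer_delay
        total = total + kernel_times[(kernel, unit)]
        previous_unit = unit
    return total


def embeddability(total, deadline_ms):
    """Classify a predicted frame time against a deadline."""
    if total.hi <= deadline_ms:
        return "meets deadline"
    if total.lo > deadline_ms:
        return "misses deadline"
    return "uncertain"  # the deadline falls inside the predicted bounds


# Hypothetical predictions for three kernels on three processing units.
kernel_times = {
    ("gradient", "GPU"): Interval(4.0, 6.0),
    ("histogram", "CPU"): Interval(10.0, 14.0),
    ("classify", "DSP"): Interval(8.0, 12.0),
}
transfer_delay = Interval(0.5, 1.0)  # assumed cost of moving data between units
mapping = [("gradient", "GPU"), ("histogram", "CPU"), ("classify", "DSP")]

total = predict_frame_time(mapping, kernel_times, transfer_delay)
print(total, embeddability(total, deadline_ms=40.0))
```

With these made-up numbers the predicted frame time is bounded by 23 ms and 34 ms, so a 40 ms deadline (25 frames per second) is met, while a 33.3 ms deadline would be reported as uncertain. A soft real-time analysis would then have to decide whether that uncertainty is acceptable.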

Corresponding author

Correspondence to Romain Saussard.

About this article

Cite this article

Saussard, R., Bouzid, B., Vasiliu, M. et al. A novel global methodology to analyze the embeddability of real-time image processing algorithms. J Real-Time Image Proc 14, 565–583 (2018). https://doi.org/10.1007/s11554-017-0686-3

  • DOI: https://doi.org/10.1007/s11554-017-0686-3
