Accelerating OpenVX Application Kernels Using Halide Scheduling

Journal of Signal Processing Systems

Abstract

In this study, we investigate how to use Halide, a domain-specific language, to accelerate and optimize OpenVX graphs. Halide is a new high-level language for image processing pipelines. It lets developers separate a program into an algorithm and a schedule, which makes programs easier to write and tune. Halide has also proven to be an effective system for authoring high-performance image processing code. We present a prototype that uses Halide to optimize OpenVX image processing modules. Because OpenVX lacks scheduling primitives while Halide provides them, we integrate Halide into OpenVX graphs. This approach improves developer productivity and achieves relatively high performance.
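The separation of algorithm and schedule mentioned in the abstract can be illustrated with a minimal sketch. It is written in plain Python and does not use the Halide or OpenVX APIs; the names `blur_algorithm`, `schedule_row_major`, `schedule_tiled`, and `realize` are hypothetical and chosen only for this illustration. The algorithm states what each output pixel is, and the schedule decides only the order in which pixels are computed, so swapping schedules should not change the result.

```python
def blur_algorithm(img, x, y):
    """Algorithm: the value of one output pixel (3-tap horizontal mean, edges clamped)."""
    w = len(img[0])
    left = img[y][max(x - 1, 0)]
    right = img[y][min(x + 1, w - 1)]
    return (left + img[y][x] + right) // 3


def schedule_row_major(w, h):
    """Schedule A: visit pixels row by row."""
    for y in range(h):
        for x in range(w):
            yield x, y


def schedule_tiled(w, h, tile=2):
    """Schedule B: visit pixels tile by tile (tile size is an illustrative choice)."""
    for ty in range(0, h, tile):
        for tx in range(0, w, tile):
            for y in range(ty, min(ty + tile, h)):
                for x in range(tx, min(tx + tile, w)):
                    yield x, y


def realize(img, schedule):
    """Evaluate the algorithm over the whole image in the order the schedule gives."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for x, y in schedule(w, h):
        out[y][x] = blur_algorithm(img, x, y)
    return out


# A small synthetic 5x4 image, used only for this demonstration.
image = [[(3 * x + 5 * y) % 17 for x in range(5)] for y in range(4)]
row_major = realize(image, schedule_row_major)
tiled = realize(image, schedule_tiled)
print(row_major == tiled)
```

In Halide itself the schedule additionally controls properties such as vectorization, parallelism, and where intermediate stages are stored; this sketch covers only traversal order.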

Data Availability

The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.

Author information

Corresponding author

Correspondence to Shih-Wei Liao.

Ethics declarations

Conflicts of Interest

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhao, BY., Peng, MY., Wang, XY. et al. Accelerating OpenVX Application Kernels Using Halide Scheduling. J Sign Process Syst 95, 623–642 (2023). https://doi.org/10.1007/s11265-023-01851-1

  • DOI: https://doi.org/10.1007/s11265-023-01851-1
