
Fast FPGA prototyping for real-time image processing with very high-level synthesis

  • Original Research Paper
Journal of Real-Time Image Processing

Abstract

Programming at a high abstraction level can facilitate the development of digital signal processing systems. Over the last 20 years, high-level synthesis (HLS) has made significant progress. This technique greatly benefits the R&D productivity of Field Programmable Gate Array (FPGA) development and improves product maintainability by automating the C-to-RTL (register transfer level) conversion. However, because of their high complexity and computational intensity, image processing algorithms usually call for a higher abstraction environment than C-synthesis, which current HLS tools do not provide. This paper presents a very high-level synthesis method that allows fast prototyping and verification of FPGA-based image processing designs in the MATLAB environment. We build a heterogeneous development flow from currently available tool kits to verify the proposed approach and evaluate it on two real-life applications. Experimental results demonstrate that the approach can effectively reduce development complexity by automatically synthesizing the algorithm behavior from the user level down to the register transfer level, and that it exploits the advantages of FPGAs relative to other devices.





Acknowledgements

The authors would like to thank the China Scholarship Council and the Conseil Régional de Bourgogne Franche-Comté for their funding of our studies.

Author information


Corresponding author

Correspondence to Yanjing Bi.


About this article


Cite this article

Li, C., Bi, Y., Marzani, F. et al. Fast FPGA prototyping for real-time image processing with very high-level synthesis. J Real-Time Image Proc 16, 1795–1812 (2019). https://doi.org/10.1007/s11554-017-0688-1


  • DOI: https://doi.org/10.1007/s11554-017-0688-1
