A system-level FPGA design methodology for video applications with weakly-programmable hardware components

  • Original Research Paper
Journal of Real-Time Image Processing

Abstract

High-performance video applications with real-time requirements play an important role in diverse application fields and are often executed by advanced parallel processors or GPUs. For embedded scenarios with strict energy constraints, such as automotive image processing, FPGAs represent a feasible, power-efficient computing platform. Unfortunately, their hardware-driven design concept results in long development cycles and impedes their acceptance in industrial practice. Additionally, neither a verification of the FPGA design's correctness nor its performance figures are available until a very late development stage, which is critical during design space exploration and integration into complex embedded systems. Weakly-programmable architectures, which support design-time and run-time reuse via flexible hardware components, represent a promising and efficient FPGA development approach. However, they currently lack suitable design and verification methodologies for real-time scenarios. Therefore, this paper proposes a system-level FPGA development concept for video applications with weakly-programmable hardware components. It combines rapid software prototyping with component-based FPGA design, advanced formal real-time analysis, and code generation techniques. The presented approach enables early verification of the application's correctness, including exact performance figures. It provides a software-level verification of weakly-programmable hardware components and an automated assembly of the final hardware design. The developed tools and their usability are demonstrated by a binarization and a dense block matching application, which represents a basic preprocessing step in automotive image processing for driver assistance systems. When compared to a hand-optimized variant, the generated hardware design achieves comparable performance and chip area figures without requiring significant hardware integration effort.

Author information

Corresponding author

Correspondence to Henning Sahlbach.

About this article

Cite this article

Sahlbach, H., Thiele, D. & Ernst, R. A system-level FPGA design methodology for video applications with weakly-programmable hardware components. J Real-Time Image Proc 13, 291–309 (2017). https://doi.org/10.1007/s11554-014-0403-4

  • DOI: https://doi.org/10.1007/s11554-014-0403-4
