Abstract
High-performance video applications with real-time requirements play an important role in diverse application fields and are often executed on advanced parallel processors or GPUs. For embedded scenarios with strict energy constraints, such as automotive image processing, FPGAs represent a feasible, power-efficient computing platform. Unfortunately, their hardware-driven design concept results in long development cycles and impedes their acceptance in industrial practice. Additionally, neither a verification of the FPGA design’s correctness nor its performance figures are available until a very late development stage, which is critical during design space exploration and integration into complex embedded systems. Weakly-programmable architectures, which support design-time and run-time reuse via flexible hardware components, represent a promising and efficient FPGA development approach. However, they currently lack suitable design and verification methodologies for real-time scenarios. Therefore, this paper proposes a system-level FPGA development concept for video applications with weakly-programmable hardware components. It combines rapid software prototyping with component-based FPGA design and advanced formal real-time analysis and code generation techniques. The presented approach enables early verification of the application’s correctness, including exact performance figures. It provides software-level verification of weakly-programmable hardware components and automated assembly of the final hardware design. The developed tools and their usability are demonstrated with a binarization application and a dense block matching application, the latter of which represents a basic preprocessing step in automotive image processing for driver assistance systems. When compared to a hand-optimized variant, the generated hardware design achieves comparable performance and chip area figures without requiring significant hardware integration effort.
Cite this article
Sahlbach, H., Thiele, D. & Ernst, R. A system-level FPGA design methodology for video applications with weakly-programmable hardware components. J Real-Time Image Proc 13, 291–309 (2017). https://doi.org/10.1007/s11554-014-0403-4
DOI: https://doi.org/10.1007/s11554-014-0403-4