Abstract
This work presents a compact and efficient row buffer (RB) architecture for field-programmable gate arrays (FPGAs). The design packs multiple RBs into the full capacity of a single Xilinx Block RAM (BRAM), in contrast to the conventional approach, which dedicates a whole BRAM to each RB and uses only part of its capacity. Configuring the BRAM with different port aspect ratios and accessing its data through an efficient pattern generator circuit allows the design to buffer image data pixel by pixel and retrieve multiple pixels per clock in a predefined pattern, thereby achieving the functionality of multiple RBs. The design uses the smallest primitive, BRAM18, so that it can be scaled in small steps to any larger kernel and image size, providing the most economical solution. The proposed architecture keeps the bandwidth requirement at 1 pixel/clock with an ideal efficiency of 1 clock/pixel, saves up to 87.5% of the BRAMs compared with conventional RBs, and at the same time sustains high frame rates (\(1920\times 1080\) @ 217 fps) to support real-time image processing. It is therefore feasible to replace conventional high-cost RBs with the proposed RBs on the latest FPGA devices, especially for high-performance yet area-constrained neighbourhood image processing applications.
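The headline figures in the abstract can be checked with simple arithmetic. The following is a minimal sketch, not the authors' method: it assumes blanking intervals are ignored when converting the frame rate to a pixel clock, and it assumes (hypothetically, for illustration) that eight RBs share one BRAM where the conventional design would use eight BRAMs. The function names are made up for this example.

```python
import math


def required_pixel_clock_hz(width, height, fps, pixels_per_clock=1):
    """Clock rate needed to stream width x height frames at fps.

    Assumption: blanking intervals are ignored, so every clock
    carries exactly `pixels_per_clock` pixels.
    """
    return width * height * fps / pixels_per_clock


def bram_saving(num_row_buffers, rbs_per_bram):
    """Fraction of BRAMs saved versus one BRAM per row buffer.

    Assumption: the conventional design spends one BRAM per RB,
    while the packed design fits `rbs_per_bram` RBs in each BRAM.
    """
    conventional = num_row_buffers
    packed = math.ceil(num_row_buffers / rbs_per_bram)
    return 1 - packed / conventional


clock_hz = required_pixel_clock_hz(1920, 1080, 217)
print(f"{clock_hz / 1e6:.1f} MHz")   # prints "450.0 MHz"
print(f"{bram_saving(8, 8):.1%}")    # prints "87.5%"
```

Under these assumptions, 1 pixel/clock at \(1920\times 1080\) @ 217 fps corresponds to a clock of roughly 450 MHz, and packing eight RBs into one BRAM reproduces the quoted 87.5% saving.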

Cite this article
Kazmi, M., Aziz, A. & Akhtar, P. An efficient and compact row buffer architecture on FPGA for real-time neighbourhood image processing. J Real-Time Image Proc 16, 1845–1858 (2019). https://doi.org/10.1007/s11554-017-0690-7
DOI: https://doi.org/10.1007/s11554-017-0690-7