Hardware Implementation of Reconfigurable 1D Convolution

Rao, Lei; Zhang, Bin; Zhao, Jizhong

doi:10.1007/s11265-015-0969-5

Published: 17 January 2015

Volume 82, pages 1–16, (2016)

Journal of Signal Processing Systems

Lei Rao¹,
Bin Zhang¹ &
Jizhong Zhao¹

Abstract

Convolution is used extensively in image processing and computer vision, including image enhancement, smoothing, and structure extraction. However, the convolution operation typically requires a significant amount of computing resources. This study implements a novel one-dimensional (1D) convolution processor with a reconfigurable architecture. The processor combines a line buffer, controller units, and a reconfigurable and separable convolution module. The reconfigurable architecture and the separable convolution approach improve the flexibility and performance of the processor. The reconfigurable and separable convolution array, which is the main component of the processor, can execute convolution operations with different kernels simultaneously, up to a maximum kernel size of 24 × 24. Experimental results show that the maximum frame rate of the processor is approximately 194 frames per second (fps), which exceeds the real-time requirement. Synthesis results in SMIC 0.18 μm CMOS technology show that the processor occupies 13.39 mm² at a 204 MHz system clock and consumes 419 mW at the maximum kernel size with a 120 MHz system clock. Verification experiments on field-programmable gate arrays (FPGAs) demonstrate that the processor is suitable for real-time image processing applications, even for high-resolution images.
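
The separable approach mentioned in the abstract can be illustrated in software. The following is a minimal sketch, not the authors' hardware design: it assumes a kernel that factors into the outer product of a column vector and a row vector, and checks that two 1D passes give the same result as a direct 2D convolution. All function names (conv1d_valid, conv2d_separable, conv2d_direct, outer, multiplies_per_pixel) are hypothetical and chosen for illustration only.

```python
def conv1d_valid(signal, kernel):
    """1D convolution over positions where the kernel fully overlaps the signal."""
    k = len(kernel)
    flipped = kernel[::-1]  # true convolution flips the kernel
    return [
        sum(signal[i + j] * flipped[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def conv2d_separable(image, col_kernel, row_kernel):
    """2D convolution as two 1D passes (assumes kernel = outer(col_kernel, row_kernel))."""
    # Pass 1: filter every row with the horizontal kernel.
    row_pass = [conv1d_valid(row, row_kernel) for row in image]
    # Pass 2: transpose, filter every column with the vertical kernel, transpose back.
    columns = [list(c) for c in zip(*row_pass)]
    col_pass = [conv1d_valid(c, col_kernel) for c in columns]
    return [list(r) for r in zip(*col_pass)]


def conv2d_direct(image, kernel2d):
    """Reference 2D convolution with a full kernel, for comparison only."""
    kh, kw = len(kernel2d), len(kernel2d[0])
    flipped = [row[::-1] for row in kernel2d[::-1]]
    out = []
    for y in range(len(image) - kh + 1):
        out_row = []
        for x in range(len(image[0]) - kw + 1):
            out_row.append(
                sum(image[y + i][x + j] * flipped[i][j]
                    for i in range(kh) for j in range(kw))
            )
        out.append(out_row)
    return out


def outer(col_kernel, row_kernel):
    """Build the 2D kernel that the two 1D kernels represent."""
    return [[c * r for r in row_kernel] for c in col_kernel]


def multiplies_per_pixel(k):
    """Multiplications per output pixel for a k x k kernel, direct vs. separable."""
    return {"direct": k * k, "separable": 2 * k}


if __name__ == "__main__":
    image = [
        [1, 2, 3, 4, 5],
        [6, 7, 8, 9, 10],
        [11, 12, 13, 14, 15],
        [16, 17, 18, 19, 20],
    ]
    col_kernel = [1, 2, 1]   # illustrative smoothing kernel
    row_kernel = [1, 0, -1]  # illustrative difference kernel
    same = conv2d_separable(image, col_kernel, row_kernel) == conv2d_direct(
        image, outer(col_kernel, row_kernel)
    )
    print("separable matches direct:", same)
    print("multiplies per pixel at k=24:", multiplies_per_pixel(24))
```

For a k × k kernel, the direct form needs k² multiplications per output pixel and the separable form needs 2k, so at the 24 × 24 maximum stated in the abstract the counts are 576 and 48. This saving applies only to kernels that are separable in the sense assumed above.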

Acknowledgements

This work is supported by the China Postdoctoral Science Foundation (2014M550492), the National Natural Science Foundation of China (61231018), and the Natural Science Basic Research Plan in Shaanxi Province of China (2013JQ8025).

Author information

Authors and Affiliations

Xi’an Jiaotong University, Xi’an, Shaanxi, 710049, China
Lei Rao, Bin Zhang & Jizhong Zhao

Authors

Lei Rao
Bin Zhang
Jizhong Zhao

Corresponding author

Correspondence to Bin Zhang.

About this article

Cite this article

Rao, L., Zhang, B. & Zhao, J. Hardware Implementation of Reconfigurable 1D Convolution. J Sign Process Syst 82, 1–16 (2016). https://doi.org/10.1007/s11265-015-0969-5

Received: 09 August 2014
Revised: 31 December 2014
Accepted: 06 January 2015
Published: 17 January 2015
Issue Date: January 2016
DOI: https://doi.org/10.1007/s11265-015-0969-5
