Abstract
The Gaussian convolution is a fundamental operation in many data analysis tasks and scientific fields. For example, it is a central step in data assimilation and machine learning, and it is frequently used in image and signal processing. Gaussian recursive filters are a class of methods designed to approximate Gaussian convolutions quickly. In De Luca et al. (2019 15th international conference on signal-image technology and internet-based systems (SITIS), pp 641–648, 2019), we presented a parallel implementation of the K-iterated first-order Gaussian recursive filter, which proved to be very efficient and accurate. Here, we provide a new GPU-parallel implementation based on the third-order recursive filter. This filter guarantees higher accuracy and a lower computational cost than the first-order one. To manage memory accesses efficiently and to achieve better performance, our algorithm exploits the CUDA capabilities available in the GPU environment. Performance and accuracy results from tests and experiments are provided.
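To make the idea of a third-order Gaussian recursive filter concrete, the following is a minimal sequential sketch in Python. It is not the GPU-parallel algorithm of this article: it assumes the coefficient formulas of Young and Van Vliet (1995, listed in the references), zero boundary conditions, and a single forward and backward sweep. The function names `yvv_coefficients` and `gaussian_rf3` are hypothetical and chosen only for illustration.

```python
import math


def yvv_coefficients(sigma):
    """Return (B, a1, a2, a3) using the Young-Van Vliet (1995) formulas (assumed here)."""
    if sigma < 0.5:
        raise ValueError("sigma must be at least 0.5")
    # Map the standard deviation sigma to the scaling parameter q.
    if sigma >= 2.5:
        q = 0.98711 * sigma - 0.96330
    else:
        q = 3.97156 - 4.14554 * math.sqrt(1.0 - 0.26891 * sigma)
    b0 = 1.57825 + 2.44413 * q + 1.4281 * q**2 + 0.422205 * q**3
    b1 = 2.44413 * q + 2.85619 * q**2 + 1.26661 * q**3
    b2 = -(1.4281 * q**2 + 1.26661 * q**3)
    b3 = 0.422205 * q**3
    a1, a2, a3 = b1 / b0, b2 / b0, b3 / b0
    # B normalizes each sweep to unit gain for constant input.
    B = 1.0 - (a1 + a2 + a3)
    return B, a1, a2, a3


def gaussian_rf3(signal, sigma):
    """Approximate a 1D Gaussian convolution with a forward and a backward sweep.

    Samples outside the signal are treated as zero (a simplifying assumption).
    """
    B, a1, a2, a3 = yvv_coefficients(sigma)
    n = len(signal)

    # Forward (causal) sweep: each output depends on the three previous outputs.
    w = [0.0] * n
    for i in range(n):
        acc = B * signal[i]
        if i >= 1:
            acc += a1 * w[i - 1]
        if i >= 2:
            acc += a2 * w[i - 2]
        if i >= 3:
            acc += a3 * w[i - 3]
        w[i] = acc

    # Backward (anti-causal) sweep over the forward result.
    out = [0.0] * n
    for i in range(n - 1, -1, -1):
        acc = B * w[i]
        if i + 1 < n:
            acc += a1 * out[i + 1]
        if i + 2 < n:
            acc += a2 * out[i + 2]
        if i + 3 < n:
            acc += a3 * out[i + 3]
        out[i] = acc
    return out


if __name__ == "__main__":
    impulse = [0.0] * 201
    impulse[100] = 1.0
    response = gaussian_rf3(impulse, 3.0)
    print(sum(response), response[100])
```

The cost per sample is constant regardless of sigma, which is the main appeal of recursive filters over direct convolution. Each sweep is inherently sequential along the signal, so a parallel version has to distribute the work differently, for example across independent rows or chunks of the input.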
References
Aprovitola A, Gallo L. Edge and junction detection improvement using the Canny algorithm with a fourth order accurate derivative filter. In: 2014 tenth international conference on signal-image technology and internet-based systems, IEEE. 2014. pp. 104–11.
Chaurasia G, Ragan-Kelley J, Paris S, Drettakis G, Durand F. Compiling high performance recursive filters. In: Proceedings of the 7th conference on high-performance graphics. 2015. pp. 85–94.
Cuomo S, De Michele P, Galletti A, Marcellino L. A GPU parallel implementation of the local principal component analysis overcomplete method for DW image denoising. In: 2016 IEEE symposium on computers and communication (ISCC), IEEE. 2016. pp. 26–31.
Cuomo S, Farina R, Galletti A, Marcellino L. A K-iterated scheme for the first-order Gaussian recursive filter with boundary conditions. In: 2015 federated conference on computer science and information systems (FedCSIS), IEEE. 2015. pp. 641–47.
Cuomo S, Galletti A, Giunta G, Marcellino L. Numerical effects of the Gaussian recursive filters in solving linear systems in the 3Dvar case study. Numer Math Theory Methods Appl. 2017;10:3.
Cuomo S, Galletti A, Marcellino L. A GPU algorithm in a distributed computing system for 3D MRI denoising. In: 2015 10th international conference on P2P, parallel, grid, cloud and internet computing (3PGCIC), IEEE. 2015. pp. 557–62.
De Luca P, Fiscale S, Landolfi L, Di Mauro A. Distributed genomic compression in MapReduce paradigm. In: International conference on internet and distributed computing systems. New York: Springer; 2019. pp. 369–78.
De Luca P, Galletti A, Giunta G, Marcellino L. Accelerated Gaussian convolution in a data assimilation scenario. In: International conference on computational science. New York: Springer; 2020. pp. 199–211.
De Luca P, Galletti A, Marcellino L. A Gaussian recursive filter parallel implementation with overlapping. In: 2019 15th international conference on signal-image technology & internet-based systems (SITIS), IEEE. 2019. pp. 641–48.
Gonzalez RC, Woods RE. Digital image processing. 2002.
Gutiérrez PD, Lastra M, Benítez JM, Herrera F. SMOTE-GPU: big data preprocessing on commodity hardware for imbalanced classification. Prog Artif Intell. 2017;6(4):347–54.
Hewer G, Martin R, Zeh J. Robust preprocessing for Kalman filtering of glint noise. IEEE Trans Aerosp Electron Syst. 1987;1:120–8.
Liu H, Ong YS, Shen X, Cai J. When Gaussian process meets big data: A review of scalable GPs. IEEE Trans Neural Netw Learn Syst. 2020;31(11):4405–23.
László E, Giles MB, Appleyard J, Szolgay P. Methods to utilize SIMT and SIMD instruction level parallelism in tridiagonal solvers. In: 2014 14th international workshop on cellular nanoscale networks and their applications (CNNA). 2014. pp. 1–2.
Marcellino L, Montella R, Kosta S, Galletti A, Di Luccio D, Santopietro V, Ruggieri M, Lapegna M, D’Amore L, Laccetti G. Using GPGPU accelerated interpolation algorithms for marine bathymetry processing with on-premises and cloud based computational resources. Lecture notes in computer science. 2018;10778:14–24.
NVIDIA. 2020. https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html
Steinkraus D, Buck I, Simard P. Using GPUs for machine learning algorithms. In: Eighth international conference on document analysis and recognition (ICDAR’05), IEEE. 2005. pp. 1115–20.
Triggs B, Sdika M. Boundary conditions for Young-van Vliet recursive filtering. IEEE Trans Signal Process. 2006;54(6):2365–7.
Yip H-M, Ahmad I, Pong T-C. An efficient parallel algorithm for computing the Gaussian convolution of multi-dimensional image data. J Supercomput. 1999;14(3):233–55.
Young IT, Van Vliet LJ. Recursive implementation of the Gaussian filter. Signal Process. 1995;44(2):139–51.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
This article is part of the topical collection “Advances on Signal Image Technology and Internet-based Systems” guest edited by Albert Dipanda, Luigi Gallo and Kokou Yetongnon.
About this article
Cite this article
De Luca, P., Galletti, A. & Marcellino, L. GPU-CUDA Implementation of the Third Order Gaussian Recursive Filter. SN COMPUT. SCI. 3, 78 (2022). https://doi.org/10.1007/s42979-021-00960-7