GPU Optimization of Convolution for Large 3-D Real Images

Karas, Pavel; Svoboda, David; Zemčík, Pavel

doi:10.1007/978-3-642-33140-4_6

Pavel Karas, David Svoboda & Pavel Zemčík

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7517)

Abstract

In this paper, we propose a method for computing the convolution of large 3-D images represented as real signals. The convolution is performed in the frequency domain using the convolution theorem. Owing to the properties of real signals, the algorithm can be optimized so that both the time and the memory consumption are halved compared to complex signals of the same size. The convolution is decomposed in the frequency domain using the decimation-in-frequency (DIF) algorithm. The algorithm is accelerated on graphics hardware by means of the CUDA parallel computing model, achieving up to a 10× speedup with a single GPU over an optimized implementation on a quad-core CPU.
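The two ingredients named in the abstract, the convolution theorem and the decimation-in-frequency split, can be illustrated with a minimal 1-D sketch in plain Python. This is not the authors' implementation: it runs on the CPU, works on 1-D signals rather than 3-D images, and omits the real-signal optimization and the CUDA acceleration. All function names (`fft_dif`, `ifft_dif`, `circular_convolve_fft`, `circular_convolve_direct`) are hypothetical and chosen for illustration only.

```python
import cmath


def fft_dif(x):
    """Radix-2 decimation-in-frequency FFT (illustrative sketch).

    Assumes len(x) is a power of two.
    """
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    # DIF butterfly: combine the two halves of the input first ...
    top = [x[i] + x[i + half] for i in range(half)]
    bottom = [(x[i] - x[i + half]) * cmath.exp(-2j * cmath.pi * i / n)
              for i in range(half)]
    # ... then transform each half-size problem independently.
    even = fft_dif(top)     # gives X[0], X[2], X[4], ...
    odd = fft_dif(bottom)   # gives X[1], X[3], X[5], ...
    out = [0j] * n
    out[0::2] = even
    out[1::2] = odd
    return out


def ifft_dif(spectrum):
    """Inverse FFT via the conjugation identity ifft(X) = conj(fft(conj(X))) / N."""
    n = len(spectrum)
    conj = fft_dif([v.conjugate() for v in spectrum])
    return [v.conjugate() / n for v in conj]


def circular_convolve_fft(f, g):
    """Circular convolution via the convolution theorem: F * G pointwise."""
    F = fft_dif([complex(v) for v in f])
    G = fft_dif([complex(v) for v in g])
    product = [a * b for a, b in zip(F, G)]
    # Inputs are real, so the imaginary parts are only rounding noise.
    return [v.real for v in ifft_dif(product)]


def circular_convolve_direct(f, g):
    """Reference O(N^2) circular convolution, used only for comparison."""
    n = len(f)
    return [sum(f[k] * g[(i - k) % n] for k in range(n)) for i in range(n)]


if __name__ == "__main__":
    signal = [1, 2, 3, 4, 0, 0, 0, 0]   # zero-padded to avoid wrap-around
    kernel = [1, 1, 0, 0, 0, 0, 0, 0]
    print(circular_convolve_fft(signal, kernel))
    print(circular_convolve_direct(signal, kernel))
```

In this sketch the two half-size transforms inside `fft_dif` do not depend on each other after the butterfly step, which is the property that makes a DIF split attractive for breaking a large problem into independently processed parts.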

Author information

Authors and Affiliations

Centre for Biomedical Image Analysis, Faculty of Informatics, Masaryk University, Botanická 68a, Brno, Czech Republic
Pavel Karas & David Svoboda
Dept. of Computer Graphics and Multimedia, Faculty of Information Technology, Brno University of Technology, Božetěchova 2, Brno, Czech Republic
Pavel Zemčík

Editor information

Editors and Affiliations

DGA, 7-9 rue des mathurins, 92 221, Bagneux, France
Jacques Blanc-Talon
Telecommunications and Information processing (TELIN), Ghent University, St.-Pietersnieuwstraat 41, 9000, Ghent, Belgium
Wilfried Philips
CSIRO ICT Centre, Epping, Po Box 76, 1710, Sydney, NSW, Australia
Dan Popescu
University of Antwerp, Universiteitsplein 1, Building N, 2610, Wilrijk, Belgium
Paul Scheunders
Faculty of Information Technology, Brno University of Technology, 61266, Brno, Czech Republic
Pavel Zemčík

About this paper

Cite this paper

Karas, P., Svoboda, D., Zemčík, P. (2012). GPU Optimization of Convolution for Large 3-D Real Images. In: Blanc-Talon, J., Philips, W., Popescu, D., Scheunders, P., Zemčík, P. (eds) Advanced Concepts for Intelligent Vision Systems. ACIVS 2012. Lecture Notes in Computer Science, vol 7517. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33140-4_6

DOI: https://doi.org/10.1007/978-3-642-33140-4_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33139-8
Online ISBN: 978-3-642-33140-4
eBook Packages: Computer Science, Computer Science (R0)
