GPU Optimization of Convolution for Large 3-D Real Images

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7517)

Abstract

In this paper, we propose a method for computing the convolution of large 3-D images that is tailored to real signals. The convolution is performed in the frequency domain using the convolution theorem. Because the signals are real, the algorithm can be optimized so that both the computation time and the memory consumption are halved compared with complex signals of the same size. The convolution is decomposed in the frequency domain using the decimation-in-frequency (DIF) algorithm. The algorithm is accelerated on graphics hardware by means of the CUDA parallel computing model, achieving up to a 10× speedup with a single GPU over an optimized implementation on a quad-core CPU.
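
The ideas named in the abstract can be illustrated with a minimal 1-D sketch in plain Python. It is not the authors' 3-D CUDA implementation. All function names are hypothetical, signal lengths are assumed to be powers of two, and the convolution is circular (no zero padding). The sketch uses a recursive radix-2 DIF FFT and keeps only bins 0..N/2 of each real signal's spectrum, relying on the Hermitian symmetry X[N-k] = conj(X[k]). For clarity it still computes the full transform and then discards the redundant half, so it shows the storage saving rather than the time saving.

```python
import cmath


def fft_dif(x, inverse=False):
    """Recursive radix-2 decimation-in-frequency FFT (unnormalized).

    Assumes len(x) is a power of two.
    """
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    sign = 1 if inverse else -1
    # DIF butterfly: sums feed the even output bins,
    # twiddled differences feed the odd output bins.
    sums = [x[k] + x[k + half] for k in range(half)]
    diffs = [
        (x[k] - x[k + half]) * cmath.exp(sign * 2j * cmath.pi * k / n)
        for k in range(half)
    ]
    out = [0j] * n
    out[0::2] = fft_dif(sums, inverse)
    out[1::2] = fft_dif(diffs, inverse)
    return out


def half_spectrum(x):
    """Spectrum of a real signal, truncated to the N/2 + 1 non-redundant bins."""
    return fft_dif([complex(v) for v in x])[: len(x) // 2 + 1]


def full_from_half(h, n):
    """Rebuild all N bins from the stored half via X[N-k] = conj(X[k])."""
    return h + [h[n - k].conjugate() for k in range(n // 2 + 1, n)]


def circular_convolve_real(f, g):
    """Circular convolution of two real signals via the convolution theorem."""
    n = len(f)
    # The pointwise product is taken on roughly half the bins only.
    product = [a * b for a, b in zip(half_spectrum(f), half_spectrum(g))]
    result = fft_dif(full_from_half(product, n), inverse=True)
    return [v.real / n for v in result]


def circular_convolve_direct(f, g):
    """Reference O(N^2) circular convolution used to check the FFT path."""
    n = len(f)
    return [sum(f[j] * g[(i - j) % n] for j in range(n)) for i in range(n)]
```

Calling `circular_convolve_real(f, g)` on two equal-length real lists should agree with `circular_convolve_direct(f, g)` up to floating-point rounding. A production implementation would compute the half spectrum directly, for example with a real-to-complex transform from an FFT library, rather than truncating a full complex transform.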

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Karas, P., Svoboda, D., Zemčík, P. (2012). GPU Optimization of Convolution for Large 3-D Real Images. In: Blanc-Talon, J., Philips, W., Popescu, D., Scheunders, P., Zemčík, P. (eds) Advanced Concepts for Intelligent Vision Systems. ACIVS 2012. Lecture Notes in Computer Science, vol 7517. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33140-4_6

  • DOI: https://doi.org/10.1007/978-3-642-33140-4_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-33139-8

  • Online ISBN: 978-3-642-33140-4

  • eBook Packages: Computer Science, Computer Science (R0)
