The 2D wavelet transform on emerging architectures: GPUs and multicores

  • Special Issue
Journal of Real-Time Image Processing

Abstract

Thanks to their computational power, today’s GPUs are increasingly harnessed to assist CPUs in high-performance computing. In addition, a growing number of state-of-the-art supercomputers include commodity GPUs to deliver unprecedented levels of performance in terms of raw GFLOPS and GFLOPS/cost. In this work, we present a GPU implementation of an image processing application of growing popularity: the 2D fast wavelet transform (2D-FWT). Based on a pair of Quadrature Mirror Filters, a complete set of application-specific optimizations is developed from a CUDA perspective to achieve outstanding speedup factors over a highly optimized version of the 2D-FWT run on the CPU. An alternative approach based on the Lifting Scheme is also described in Franco et al. (Acceleration of the 2D wavelet transform for CUDA-enabled devices, 2010). We then investigate hardware improvements such as multicores on the CPU side, and exploit them through thread-level parallelism using the OpenMP API and pthreads. Overall, the GPU exhibits better scalability and parallel performance on large-scale images, making it a solid alternative for computing the 2D-FWT compared with thread-level methods run on emerging multicore architectures.
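To make the filter-bank formulation concrete, the following is a minimal sequential sketch of one decomposition level of a 2D-FWT built from a Quadrature Mirror Filter pair. It is not the authors' CUDA or OpenMP code. Several choices are assumptions made only for illustration: the Haar filter pair (the abstract does not state which filters are used), periodic boundary extension, and the hypothetical function names `fwt_1d` and `fwt_2d`.

```python
import math

# Haar low-pass analysis filter, chosen only for brevity (an assumption,
# not necessarily the filter pair used in the paper).
H = [1 / math.sqrt(2), 1 / math.sqrt(2)]
# The high-pass filter is the quadrature mirror of the low-pass one:
# g[n] = (-1)^n * h[L - 1 - n]
G = [(-1) ** n * H[len(H) - 1 - n] for n in range(len(H))]


def fwt_1d(signal, h=H, g=G):
    """One analysis level on a 1D signal: filter with h and g, then
    downsample by 2. Periodic extension handles the boundary."""
    n = len(signal)
    if n % 2 != 0:
        raise ValueError("signal length must be even")
    half = n // 2
    low = [sum(h[k] * signal[(2 * i + k) % n] for k in range(len(h)))
           for i in range(half)]
    high = [sum(g[k] * signal[(2 * i + k) % n] for k in range(len(g)))
            for i in range(half)]
    # Approximation coefficients first, detail coefficients second.
    return low + high


def fwt_2d(image, h=H, g=G):
    """One level of the separable 2D transform: a horizontal pass over
    every row, followed by a vertical pass over every column."""
    rows = [fwt_1d(row, h, g) for row in image]
    cols = [fwt_1d(list(col), h, g) for col in zip(*rows)]
    # Transpose back so the approximation subband sits in the top-left
    # quadrant and the three detail subbands fill the other quadrants.
    return [list(r) for r in zip(*cols)]
```

Each row in the horizontal pass, and each column in the vertical pass, is processed independently of the others. That independence is the kind of data parallelism that GPU threads or CPU threads can exploit, although how the work is actually partitioned in the paper is not described in the abstract. As a sanity check, a constant 4×4 image of ones transforms to a 2×2 approximation quadrant of value 2 with all detail coefficients equal to zero, up to rounding.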

References

  1. Owens, J.D., Luebke, D., Govindaraju, N., Harris, M., Kruger, J., Lefohn, A.E., Purcell, T.J.: A survey of general-purpose computation on graphics hardware. J. Comput. Graph. Forum 26, 21–51 (2007)

  2. Mallat, S.: A theory for multiresolution signal decomposition: the wavelet representation. IEEE Trans. Pattern Anal. Mach. Intell. 11(7), 674–693 (1989)

  3. Bernabé, G., González, J., García, J.M., Duato, J.: A new lossy 3-D wavelet transform for high-quality compression of medical video. In: IEEE EMBS International Conference on Information Technology Applications in Biomedicine (2000)

  4. Daubechies, I.: Ten lectures on wavelets. Soc. Ind. Appl. Math. (1992)

  5. Tenllado, C., Setoain, J., Prieto, M., Piñuel, L., Tirado, F.: Parallel implementation of the 2D discrete wavelet transform on graphics processing units: filter bank versus lifting. IEEE Trans. Parallel Distrib. Syst. 19(2), 299–310 (2008)

  6. Meerwald, P., Norcen, R., Uhl, A.: Cache issues with JPEG2000 wavelet lifting. In: VCIP, vol. 4671, pp. 626–634 (2002)

  7. Tao, J., Shahbahrami, A., Juurlink, B., Buchty, R., Karl, W., Vassiliadis, S.: Optimizing cache performance of the discrete wavelet transform using a visualization tool. In: 9th IEEE International Symposium on Multimedia, pp. 153–160 (2007)

  8. Shahbahrami, A., Juurlink, B., Vassiliadis, S.: Improving the memory behavior of vertical filtering in the discrete wavelet transform. In: Conference on Computing Frontiers. ACM, pp. 253–260 (2006)

  9. Kirk, D., Hwu, W.: Programming massively parallel processors: a hands-on approach. Morgan Kaufmann, Menlo Park. ISBN: 978-0-12-381472-2 (2010)

  10. Intel C++ Compiler Options (Document Number: 307776-002US) (2007)

  11. GNU Compiler Collection (GCC). http://gcc.gnu.org (2010)

  12. OpenMP: The OpenMP API. http://www.openmp.org (2010)

  13. Moreland, K., Angel, E.: The FFT on a GPU. In: SIGGRAPH Eurographics 6th Workshop on Computer Graphics Hardware, San Diego (California, US), 26–27 July, pp. 112–119 (2003)

  14. NVIDIA Corporation: NVIDIA CUDA CUFFT Library, Version 1.1 (2007)

  15. Govindaraju, N., Lloyd, B., Dotsenko, Y., Smith, B., Manferdelli, J.: High performance discrete Fourier transforms on graphics processors. In: Proceedings Supercomputing 2008, Austin, TX (USA) (2008)

  16. Nukada, A., Ogata, Y., Endo, T., Matsuoka, S.: Bandwidth intensive 3D FFT kernel for GPUs using CUDA. In: Proceedings Supercomputing 2008, Austin, TX (USA) (2008)

  17. Wong, T.T., Leung, C.S., Heng, P.A., Wang, J.: Discrete wavelet transform on consumer-level graphics hardware. IEEE Trans. Multimedia 9(3), 668–673 (2007)

  18. Franco, J., Bernabé, G., Fernández, J., Acacio, M.E., Ujaldón, M.: Acceleration of the 2D wavelet transform for CUDA-enabled devices. In: 10th PARA’2010: State of the Art in Scientific and Parallel Computing. Minisymposium on GPU Computing. Reykjavik (Iceland), June (2010)

  19. Franco, J., Bernabé, G., Fernández, J., Ujaldón, M.: Parallel 3D wavelet transform on multicore CPUs and manycore GPUs. In: 10th International Conference on Computational Science. 2nd Workshop on Emerging Parallel Architectures. Amsterdam (The Netherlands), May (2010)

  20. Sumanaweera, T., Liu, D.: Medical image reconstruction with the FFT. In: Matt Pharr (ed.) GPU Gems 2, pp. 765–784. Addison-Wesley, Reading (2005)

Author information

Corresponding author

Correspondence to Manuel Ujaldón.

Additional information

This work has been supported by the Spanish MEC and EU FEDER funds under grants “Consolider Ingenio-2010 CSD2006-00046” and “TIN2006-15516-C04-03”.

About this article

Cite this article

Franco, J., Bernabé, G., Fernández, J. et al. The 2D wavelet transform on emerging architectures: GPUs and multicores. J Real-Time Image Proc 7, 145–152 (2012). https://doi.org/10.1007/s11554-011-0224-7

  • DOI: https://doi.org/10.1007/s11554-011-0224-7
