Skip to main content
Log in

Algorithms and architectures for 2D discrete wavelet transform

  • Published:
The Journal of Supercomputing Aims and scope Submit manuscript

Abstract

The 2D Discrete Wavelet Transform (DWT) is an important function in many multimedia applications, such as JPEG2000 and MPEG-4 standards, digital watermarking, and content-based multimedia information retrieval systems. The 2D DWT is computationally intensive than other functions, for instance, in the JPEG2000 standard. Therefore, different architectures have been proposed to process 2D DWT. The goal of this paper is to review and to evaluate different algorithms and different kinds of architectures such as application-specific integrated circuits, field programmable gate array, digital signal processors, graphics processing units, and General-Purpose Processors (GPPs) that are used to process 2D DWT. In addition, we implement the 2D DWT using different algorithms on GPPs enhanced with multimedia extensions. The experimental results show that the largest speedup of the vectorized 2D DWT over the scalar implementation is about 2.8 for first level decomposition. Furthermore, the characteristics of the 2D DWT and disadvantages of the existing architectures such as GPPs enhanced with SIMD instructions are discussed.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14

Similar content being viewed by others

References

  1. Stollnitz EJ, Derose TD, Salesin DH (1996) Wavelets for computer graphics: theory and applications. Morgan Kaufmann, San Mateo

    Google Scholar 

  2. Yusof Y, Khalifa OO (2007) Digital watermarking for digital images using wavelet transform. In: Proc IEEE int conf on telecommunications, May

    Google Scholar 

  3. Hsieh MS, Tseng DC, Huang YH (2006) Hiding digital watermarks using multiresolution wavelet transform. IEEE Trans Ind Electron 48(5):875–882

    Article  Google Scholar 

  4. Liang KC, Kuo CJ (1997) Progressive image indexing and retrieval based on embedded wavelet coding. In: Proc int conf on image processing, October, pp 572–575

    Chapter  Google Scholar 

  5. Chang T, Kuo CCJ (1993) Texture analysis and classification with tree-structured wavelet transform. IEEE Trans Image Process 2(4):429–441

    Article  Google Scholar 

  6. Biswas PK, Chatterji BN (2007) Texture image retrieval using rotated wavelet. Pattern Recognit Lett 28:1240–1249

    Article  Google Scholar 

  7. Wang JZ, Wiederhold G, Firschein O, Wei SX (1997) Content-based image indexing and searching using Daubechies’ wavelets. Int J Digit Libr, 311–328

  8. Mandal MK, Aboulnasr T, Panchanathan S (1996) Image indexing using moments and wavelets. IEEE Trans Consum Electron 3(42):557–565

    Article  Google Scholar 

  9. Dettori L, Semler L (2007) A comparison of wavelet, ridgelet, and curvelet-based texture classification algorithms in computed tomography. Comput Biol Med 37(4):486–498

    Article  Google Scholar 

  10. Jin Y, Angelini E, Laine A (2005) Wavelets in medical image processing: denoising, segmentation, and registration. In: Handbook of biomedical image analysis volume I: segmentation models part A. Springer, New York, pp 305–358

    Google Scholar 

  11. Djebouri D, Djebbari A, Djebbouri M (2005) A new robust GPS satellite signal acquisition using lifting wavelet. Telecommun Radio Eng 65(2–6)

    Google Scholar 

  12. Shahbahrami A, Juurlink B, Vassiliadis S (2008) Implementing the 2D wavelet transform on SIMD-enhanced general-purpose processors. IEEE Trans Multimed 10(1):43–51

    Article  Google Scholar 

  13. Andreopoulos Y, Masselos K, Schelkens P, Lafruit G, Cornelis J (2002) Cache misses and energy dissipation results for JPEG-2000 filtering. In: Proc 14th IEEE int conf on digital signal processing, pp 201–209

    Google Scholar 

  14. Andreopoulos Y, Schelkens P, Cornelis J (2001) Analysis of wavelet transform implementations for image and texture coding applications in programmable platforms. In: Proc IEEE signal processing systems, pp 273–284

    Google Scholar 

  15. Andreopoulos Y, Schelkens P, Lafruit G, Masselos K, Cornelis J (2003) High-level cache modeling for 2-D discrete wavelet transform implementations. J VLSI Signal Process 34:209–226

    Article  MATH  Google Scholar 

  16. Andreopoulos Y, Zervas ND, Lafruit G, Schelkens P, Stouraitis T, Goutis CE, Cornelis J (2001) A local wavelet transform implementation versus an optimal row-column algorithm for the 2D multilevel decomposition. In: Proc IEEE int conf on image processing, vol 3, pp 330–333

    Google Scholar 

  17. Chrysafis C, Ortega A (2000) Line-based, reduced memory, wavelet image compression. IEEE Trans Image Process 9(3):378–389

    Article  MathSciNet  MATH  Google Scholar 

  18. Shahbahrami A, Juurlink B, Vassiliadis S (2005) Efficient vectorization of the FIR filter. In: Proc 16th annual workshop on circuits, systems and signal processing (ProRISC2005), November, pp 432–437

    Google Scholar 

  19. Kuo SM, Gan WS (2005) Digital signal processors architectures, implementations, and applications. Prentice Hall, New York

    Google Scholar 

  20. Trenas MA, Lopez J, Zapata EL, Arguello F (1998) A memory system supporting the efficient SIMD computation of the two dimensional DWT. In: Proc IEEE int conf on acoustics speech and signal processing, May, vol 3, pp 1521–1524

    Google Scholar 

  21. Cohen A, Daubechies I, Eauveau JCF (1992) Biorthogonal bases of compactly supported wavelets. Commun Pure Appl Math 45(5):485–560

    Article  MATH  Google Scholar 

  22. Daubechies I, Sweldens W (1998) Factoring wavelet transforms into lifting steps. J Fourier Anal Appl 4(3):247–269

    Article  MathSciNet  MATH  Google Scholar 

  23. Ferretti M, Rizzo D (2001) A parallel architecture for the 2D discrete wavelet transform with integer lifting scheme. J VLSI Signal Process 28:165–185

    Article  MATH  Google Scholar 

  24. Shahbahrami A (2008) Avoiding conversion and rearrangement overhead in SIMD architectures. PhD thesis, Delft University of Technology, September

  25. Shahbahrami A, Juurlink B, Vassiliadis S (2005) Performance comparison of SIMD implementations of the discrete wavelet transform. In: Proc 16th IEEE int conf on application-specific systems architectures and processors (ASAP), July

    Google Scholar 

  26. Andra K, Chakrabarti C, Acharya T (2002) A VLSI architecture for lifting-based forward and inverse wavelet transform. IEEE Trans Signal Process 50(4):966–977

    Article  Google Scholar 

  27. Chen CY, Yang ZL, Wang TC, Chen LG (2001) A programmable parallel VLSI architecture for 2D discrete wavelet transform. J VLSI Signal Process 28:151–163

    Article  MATH  Google Scholar 

  28. Xiong CY, Hou JH, Tian JW, Liu J (2007) Efficient array architectures for multi-dimensional lifting-based discrete wavelet transforms. Signal Process 87(5):1089–1099

    Article  MATH  Google Scholar 

  29. Weeks M, Bayoumi MA (2002) Three-dimensional discrete wavelet transform architectures. IEEE Trans Signal Process 50(8):2050–2063

    Article  Google Scholar 

  30. Liao H, Mandal MK, Cockburn BF (2004) Efficient architectures for 1D and 2D lifting-based wavelet transforms. IEEE Trans Signal Process 52(5):1315–1326

    Article  MathSciNet  Google Scholar 

  31. Tseng PC, Huang CT, Chen LG (2003) Reconfigurable discrete wavelet transform architecture for advanced multimedia systems. In: Proc IEEE workshop on signal processing systems, August

    Google Scholar 

  32. Tseng PC, Huang CT, Chen LG (2005) Reconfigurable discrete wavelet transform processor for heterogeneous reconfigurable multimedia systems. J VLSI Signal Process Syst 41(1):35–47

    Article  Google Scholar 

  33. Hyun E, Sima M, McGuire M (2006) Reconfigurable implementation of wavelet transform on an FPGA-augmented NIOS processor. In: Proc IEEE Canadian conf on electrical and computer engineering, May

    Google Scholar 

  34. Schelkens P, Decroos F, Cornelis J, Lafruit G, Catthoor F (1999) Implementation of an integer wavelet transform on a parallel TI TMS320C40 platform. In: Proc IEEE workshop on signal processing systems, October

    Google Scholar 

  35. Cho JK, Hwang MC, Kim JS, Choi BD, Ko SJ (2004) Fast implementation of wavelet lifting for JPEG2000 on a fixed-point DSP. In: Proc int conf on circuits, systems, computers, and communications, July

    Google Scholar 

  36. Chaver D, Prieto M, Pinuel L, Tirado F (2002) Parallel wavelet transform for large scale image processing. In: Proc IEEE int symp on parallel and distributed processing, April, pp 4–9

    Google Scholar 

  37. Chaver D, Tenllado C, Pinuel L, Prieto M, Tirado F (2002) 2-D wavelet transform enhancement on general-purpose microprocessors: memory hierarchy and SIMD parallelism exploitation. In: Proc int conf on the high performance computing, December

    Google Scholar 

  38. Tenllado C, Setoain J, Prieto M, Pinuel L, Tirado F (2008) Parallel implementation of the 2D discrete wavelet transform on graphics processing units: filter bank versus lifting. IEEE Trans Parallel Distrib Syst 19(3):299–310

    Article  Google Scholar 

  39. Gnavi S, Penna B, Grangetto M, Magli E, Olmo G (2002) Wavelet kernels on a DSP: a comparison between lifting and filter banks for image coding. EURASIP J Appl Signal Process 2002(1):981–989

    MATH  Google Scholar 

  40. Vishwanath M, Owens RM, Irwin MJ (1992) Discrete wavelet transforms in VLSI. In: Proc int conf on application specific array processors, August

    Google Scholar 

  41. Denk TC, Parhi KK (1997) VLSI architectures for lattice structure based orthonormal discrete wavelet transform. IEEE Trans Circuits Syst II, Analog Digit Signal Process 44(2):129–132

    Article  Google Scholar 

  42. Martina M, Masera G, Piccinini G, Zamboni M (2000) A VLSI architecture for IWT (integer wavelet transform). In: Proc 43rd IEEE midwest symposium on circuits and systems, vol 3, pp 1174–1177

    Google Scholar 

  43. Limqueco JC, Bayoumi MA (1998) A VLSI architecture for separable 2D discrete wavelet transform. J VLSI Signal Process Syst 18(2):125–140

    Article  Google Scholar 

  44. Weeks M, Bayoumi M (2003) Discrete wavelet transform: architectures, design and performance issues. J VLSI Signal Process 35:155–178

    Article  MATH  Google Scholar 

  45. Tenllado C, Lario R, Prieto M, Tirado F (2004) The 2D discrete wavelet transform on programmable graphics hardware. In: Proc 4th IASTED int conf on visualization, imaging, and image processing

    Google Scholar 

  46. Hopf M, Ertl T (2000) Hardware-accelerated wavelet transformations. In: Proc IEEE TVCG symp on visualization, May

    Google Scholar 

  47. Wang J, Wong T, Heng P, Leung C (2004) Discrete wavelet transform on GPU. In: Proc ACM workshop general-purpose computing on graphics processors

    Google Scholar 

  48. Mirsky E, DeHon A (1996) MATRIX: a reconfigurable computing architecture with configurable instruction distribution and deployable resources. In: Proc IEEE symp on FPGAs for custom computing machines, April, pp 157–166

    Chapter  Google Scholar 

  49. Ebeling C, Cronquist D, Franklin P, Fisher C (1996) RaPiD a configurable computing architecture for compute-intensive applications. Technical report TR-96-11-03, University of Washington Department of Computer Science and Engineering, November

  50. Lee RB (1996) Subword parallelism with MAX-2. IEEE MICRO 16(4):51–59

    Article  Google Scholar 

  51. Peleg A, Wiljie S, Weiser U (1997) Intel MMX for multimedia PCs. Commun ACM 40(1):24–38

    Article  Google Scholar 

  52. Peleg A, Weiser U (1996) MMX technology extension to the intel architecture. IEEE MICRO 16(4):42–50

    Article  Google Scholar 

  53. Tremblay M, O’Connor JM, Narayanan V, He L (1996) VIS speeds new media processing. IEEE MICRO 16(4):10–20

    Article  Google Scholar 

  54. Bannon P, Saito Y (1997) The alpha 21164PC microprocessor. In: IEEE proc compcon 97, February, pp 20–27

    Google Scholar 

  55. Gwennap L (1996) Digital, MIPS add multimedia extensions. Microprocess Rep 10(15):24–28

    Google Scholar 

  56. Jennings MD, Conte TM (1998) Subword extensions for video processing on mobile systems. IEEE Concurr 6(3):13–16

    Article  Google Scholar 

  57. Lee RB (1997) Multimedia extensions for general-purpose processors. In: Proc IEEE workshop on signal processing systems, November, pp 9–23

    Google Scholar 

  58. Advanced Micro Devices Inc (2000) 3DNow technology manual

  59. Raman SK, Pentkovski V, Keshava J (2000) Implementing streaming SIMD extensions on the Pentium 3 processor. IEEE MICRO 20(4):47–57

    Article  Google Scholar 

  60. Thakkar S, Huff T (1999) The internet streaming SIMD extensions. Intel Technol J, 1–8

  61. Semiconductor F (2002) AltiVec technology programming environments manual

  62. Diefendorff K, Dubey PK, Hochsprung R, Scales H (2000) AltiVec extension to powerPC accelerates media processing. IEEE MICRO 20(2):85–95

    Article  Google Scholar 

  63. Flachs B, Asano S, Dhong SH, Hofstee HP, Kim GGR, Le T, Liu P, Leenstra J, Oh JLBMHJ, Mueller SM, Takahashi O, Watanabe AHY, Yano N, Brokenshire DA, Peyravian M, Vandung T, Iwata E (2006) The microarchitecture of the synergistic processor for a cell processor. IEEE J Solid-State Circuits 41:63–70

    Article  Google Scholar 

  64. Hofstee HP (2005) Power efficient processor architecture and the cell processor. In: Proc 11th IEEE int symp on high-performance computer architecture, February, pp 258–262

    Chapter  Google Scholar 

  65. IBM (2007) Synergistic processor unit instruction set architecture, January, version 1.2

  66. Asokan R, Nazareth S (2001) Processor architectures for multimedia. In: Proc 14th annual workshop on architecture and system design, November, pp 589–594

    Google Scholar 

  67. Hen HY (2002) Programmable digital signal processors: architecture, programming, and applications. Dekker, New York

    Google Scholar 

  68. Slingerland NT, Smith AJ (2000) Multimedia instruction sets for general purpose microprocessors: a survey. Technical Report UCB//CSD-00-1124, University of California, December

  69. Ferrand F (2003) Optimization and code parallelization for processors with multimedia SIMD instructions. Master’s thesis, ENST Bretagne

  70. Meerwald P, Norcen R, Uhl A (2002) Cache issues with JPEG2000 wavelet lifting. In: Proc of visual communications and image processing, January

    Google Scholar 

  71. Chatterjee S, Brooks CD (2002) Cache efficient wavelet lifting in JPEG 2000. In: Proc IEEE int conf on multimedia, pp 797–800

    Chapter  Google Scholar 

  72. Komi H, Ortega A (2001) Analysis of cache efficiency in 2D wavelet transform. In: Proc IEEE int conf on multimedia and expo, pp 465–468

    Google Scholar 

  73. Tao J, Shahbahrami A, Juurlink B, Buchty R, Karl W, Vassiliadis S (2007) Optimizing cache performance of the discrete wavelet transform using a visualization tool. In: Proc 9th IEEE int symp on multimedia, December

    Google Scholar 

  74. Shahbahrami A, Juurlink B, Vassiliadis S (2006) Improving the memory behavior of vertical filtering in the discrete wavelet transform. In: Proc 3rd ACM int conf on computing frontiers, May, pp 253–260

    Google Scholar 

  75. Chaver D, Tenllado C, Pinuel L, Prieto M, Tirado F (2003) Vectorization of the 2D wavelet lifting transform using SIMD extensions. In: Proc 17th IEEE int symp on parallel and distributed image processing and multimedia

    Google Scholar 

  76. Kutil R (2006) A single-loop approach to SIMD parallelization of 2D wavelet lifting. In: Proc 14th Euromicro int conf on parallel, distributed, and network-based processing, pp 413–420

    Google Scholar 

  77. Bernabe G, Garcia JM, Gonzales J (2003) Reducing 3D wavelet transform execution time through the streaming SIMD extensions. In: Proc 11th Euromicro conf on parallel distributed and network based processing, February

    Google Scholar 

  78. Shahbahrami A, Juurlink B (2007) A comparison of two SIMD implementations of the 2D discrete wavelet transform. In: Proc 18th annual workshop on circuits, systems and signal processing (ProRISC2007), November

    Google Scholar 

  79. Shahbahrami A, Juurlink B, Vassiliadis S (2006) Performance impact of misaligned accesses in SIMD extensions. In: Proc 17th annual workshop on circuits, systems and signal processing (ProRISC2006), November, pp 334–342

    Google Scholar 

  80. Shahbahrami A (2011) Improving the performance of 2D discrete wavelet transform using data-level parallelism. In: Proc IEEE int conf on high performance computing and simulation, July, pp 362–368

    Chapter  Google Scholar 

  81. Shahbahrami A, Juurlink B (2009) SIMD architectural enhancements to improve the performance of the 2D discrete wavelet transform. In: Proc 12th EUROMICRO conf on digital system design, August, pp 497–504

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Asadollah Shahbahrami.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Shahbahrami, A. Algorithms and architectures for 2D discrete wavelet transform. J Supercomput 62, 1045–1064 (2012). https://doi.org/10.1007/s11227-012-0790-x

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11227-012-0790-x

Keywords

Navigation