A hierarchical pipelining architecture and FPGA implementation for lifting-based 2-D DWT

Zhang, Chunhui; Long, Yun; Kurdahi, Fadi

doi:10.1007/s11554-007-0057-6


Special Issue
Published: 21 November 2007

Volume 2, pages 281–291 (2007)

Journal of Real-Time Image Processing

Chunhui Zhang¹,
Yun Long¹ &
Fadi Kurdahi¹


Abstract

Numerous VLSI architectures for the 2-D discrete wavelet transform (DWT) have been proposed. While most of these designs achieve good performance through parallel processing, few address thoroughly how to sustain such high-throughput computing, which is crucial in real-time applications. Although affordable data transfer bandwidth has increased tremendously over the past decade, stream-intensive applications continue to put pressure on data communication, and the 2-D DWT is one such case. In this paper, we expose the performance gap between the computing core and the entire system, quantifying it with two metrics: peak performance and mean-time performance. To narrow the gap without degrading either metric, we introduce a software-pipelined, lifting-based computing kernel that removes data dependences and thereby raises peak performance, and we apply loop fusion and a hierarchical pipelining method to enhance data locality and boost mean-time performance. The architecture has been implemented on a Xilinx Virtex-II FPGA, taking advantage of the Virtex-II's embedded multipliers and block RAMs. We use the Daubechies (9, 7) and LeGall (5, 3) filters (the default lossy and lossless filters in JPEG2000) for illustration, although the method is general and applies to other DWT filters. The post-place-and-route operating frequency for the Daubechies (9, 7) filter is 138 MHz. Notably, the mean-time performance, parameterized by image size and decomposition level, comes close to the peak performance.


Notes

Most designs process a limited number of lines (N_l) simultaneously under area and power constraints, even though the lines are independent. With a small number of filter taps, N_f, the extra storage needed by convolution is only N_l × N_f, far smaller than the whole image (typical sizes are in the hundreds or even thousands).
As the default lossy filter in JPEG2000, the Daubechies (9, 7) filter is widely used, so more related work is available for comparison. Furthermore, the Daubechies filter is more complicated than the LeGall filter. We therefore consider it more representative for evaluating a related design.


Author information

Authors and Affiliations

Department of Electrical Engineering and Computer Science ET508, zotcode 2625, University of California, Irvine, CA, 92697, USA
Chunhui Zhang, Yun Long & Fadi Kurdahi


Corresponding author

Correspondence to Chunhui Zhang.

Appendix

Software-pipelined Daubechies (9, 7) 1-D inverse filtering for n: $\lceil\frac{i_0}{2} \rceil - 2 \leq n < \lceil \frac{i_1} {2}\rceil + 1$

$$ \begin{aligned} {\rm Tmp}_1(2n) &= K\cdot Y_{\rm ext}(2n) \\ {\rm Tmp}_1(2n+1) &= (1/K)\cdot Y_{\rm ext}(2n+1) \\ {\rm Tmp}_0(2n-2) &= {\rm Tmp}_1(2n-2) - \delta \times \left[{\rm Tmp}_1(2n-3) + {\rm Tmp}_1(2n-1)\right] \\ {\rm Tmp}_0(2n-5) &= {\rm Tmp}_1(2n-5) - \gamma \times \left[{\rm Tmp}_0(2n-6) + {\rm Tmp}_0(2n-4)\right] \\ X(2n-8) &= {\rm Tmp}_0(2n-8) - \beta \times \left[{\rm Tmp}_0(2n-9) + {\rm Tmp}_0(2n-7)\right] \\ X(2n-11) &= {\rm Tmp}_0(2n-11) - \alpha \times \left[X(2n-12) + X(2n-10)\right] \\ \end{aligned} $$
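The single-loop schedule for the inverse Daubechies (9, 7) filter can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' hardware design: the lifting constants are the commonly quoted JPEG2000 irreversible 9/7 values, $Y_{\rm ext}$ is taken to be the whole-sample symmetric extension of the interleaved coefficients with $i_0 = 0$, the $\gamma$ step reads even samples already processed by the $\delta$ step, and the loop bounds and the `ready` guard (which skips a statement until its operands exist) are choices of this sketch. All function names are hypothetical. A plain four-pass version is included only as a cross-check.

```python
# Commonly quoted JPEG2000 irreversible 9/7 lifting constants (assumed here).
ALPHA, BETA = -1.586134342, -0.052980118
GAMMA, DELTA = 0.882911075, 0.443506852
K = 1.230174105


def reflect(i, n):
    """Map any index onto [0, n) by whole-sample symmetric extension."""
    period = 2 * (n - 1)
    i %= period
    return i if i < n else period - i


def dwt97_inverse_pipelined(y):
    """One fused loop; every statement reads results of earlier iterations."""
    n_samples = len(y)
    assert n_samples >= 2, "sketch assumes at least two samples"
    ext = lambda i: y[reflect(i, n_samples)]
    t1, t0, x = {}, {}, {}  # sparse stores, so out-of-range indices are allowed
    ready = lambda store, *idx: all(i in store for i in idx)
    # Assumed bounds: start two iterations early, run extra ones to drain.
    for n in range(-2, n_samples // 2 + 6):
        e = 2 * n
        t1[e] = K * ext(e)                  # scale even sample
        t1[e + 1] = (1.0 / K) * ext(e + 1)  # scale odd sample
        if ready(t1, e - 3, e - 2, e - 1):  # delta step, lag 2
            t0[e - 2] = t1[e - 2] - DELTA * (t1[e - 3] + t1[e - 1])
        if ready(t0, e - 6, e - 4):         # gamma step, lag 5
            t0[e - 5] = t1[e - 5] - GAMMA * (t0[e - 6] + t0[e - 4])
        if ready(t0, e - 9, e - 8, e - 7):  # beta step, lag 8
            x[e - 8] = t0[e - 8] - BETA * (t0[e - 9] + t0[e - 7])
        if ready(x, e - 12, e - 10):        # alpha step, lag 11
            x[e - 11] = t0[e - 11] - ALPHA * (x[e - 12] + x[e - 10])
    return [x[i] for i in range(n_samples)]


def dwt97_inverse_multipass(y):
    """Cross-check: scaling followed by four separate lifting passes."""
    n_samples = len(y)
    lo, hi = -4, n_samples + 3  # margin consumed by the four passes
    v = {i: (K if i % 2 == 0 else 1.0 / K) * y[reflect(i, n_samples)]
         for i in range(lo, hi + 1)}
    steps = ((DELTA, 0), (GAMMA, 1), (BETA, 0), (ALPHA, 1))
    for k, (coef, parity) in enumerate(steps, start=1):
        for i in range(lo + k, hi - k + 1):
            if i % 2 == parity:
                v[i] -= coef * (v[i - 1] + v[i + 1])
    return [v[i] for i in range(n_samples)]
```

Because each statement in the loop body touches only values written in earlier iterations, the six statements of one iteration are mutually independent, which is the property that lets a hardware kernel evaluate them in parallel.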

Software-pipelined LeGall (5, 3) 1-D forward filtering for n: $\lceil \frac{i_0}{2}\rceil - 2 \leq n < \lceil \frac{i_1} {2}\rceil + 1$

$$ \begin{aligned} Y(2n+1)&= X_{\rm ext}(2n+1) - \left \lfloor {\frac{X_{\rm ext}(2n)+X_{\rm ext}(2n+2)} {2}} \right \rfloor \\ Y(2n-2)&= X_{\rm ext}(2n-2) + \left \lfloor {\frac{Y(2n-3)+Y(2n-1)+2} {4}}\right \rfloor \\ \end{aligned} $$
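The forward LeGall (5, 3) schedule can be sketched in Python as a single fused loop. This is a minimal illustration under stated assumptions, not the authors' implementation: it follows the JPEG2000 reversible 5/3 lifting steps (the predict step subtracts, the update step adds), takes $X_{\rm ext}$ to be the whole-sample symmetric extension of the input with $i_0 = 0$ and $i_1$ equal to the signal length, and skips the update statement during the first iterations, before any even output is due. The function names are hypothetical, and the two-pass version is included only as a cross-check.

```python
def reflect(i, n):
    """Map any index onto [0, n) by whole-sample symmetric extension."""
    period = 2 * (n - 1)
    i %= period
    return i if i < n else period - i


def legall53_forward_pipelined(x):
    """One fused loop; the update step lags the predict step by one iteration."""
    n_samples = len(x)
    assert n_samples >= 2, "sketch assumes at least two samples"
    ext = lambda i: x[reflect(i, n_samples)]
    y = {}  # sparse store, so indices outside [0, n_samples) are allowed
    # Bounds follow the stated range with i0 = 0 and i1 = len(x).
    for n in range(-2, (n_samples + 1) // 2 + 1):
        # Predict: high-pass (odd) coefficient from the extended input.
        y[2 * n + 1] = ext(2 * n + 1) - (ext(2 * n) + ext(2 * n + 2)) // 2
        # Update: low-pass (even) coefficient, using odd coefficients
        # written in the two previous iterations (pipeline prologue skipped).
        if 2 * n - 2 >= 0:
            y[2 * n - 2] = ext(2 * n - 2) + (y[2 * n - 3] + y[2 * n - 1] + 2) // 4
    return [y[i] for i in range(n_samples)]


def legall53_forward_two_pass(x):
    """Cross-check: all predict steps first, then all update steps."""
    n_samples = len(x)
    ext = lambda i: x[reflect(i, n_samples)]
    d = {i: ext(i) - (ext(i - 1) + ext(i + 1)) // 2
         for i in range(-1, n_samples + 1, 2)}
    out = list(x)
    for i in range(0, n_samples, 2):
        out[i] = x[i] + (d[i - 1] + d[i + 1] + 2) // 4
    for i in range(1, n_samples, 2):
        out[i] = d[i]
    return out
```

Python's `//` rounds toward negative infinity, so it matches the floor brackets in the equations for negative operands as well. On the ramp `[1, 2, 3, 4, 5, 6, 7, 8]` both functions return `[1, 0, 3, 0, 5, 0, 7, 1]`: the interior high-pass coefficients vanish and only the right boundary produces a nonzero detail.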

Software-pipelined LeGall (5, 3) 1-D inverse filtering for n: $\lceil \frac{i_0}{2}\rceil - 2 \leq n < \lceil \frac{i_1} {2}\rceil + 1$

$$ \begin{aligned} X(2n)&= Y_{\rm ext}(2n) - \left \lfloor {\frac{Y_{\rm ext}(2n-1)+Y_{\rm ext}(2n+1)+2} {4}} \right \rfloor \\ X(2n-3)&= Y_{\rm ext}(2n-3) + \left \lfloor {\frac{X(2n-4)+X(2n-2)} {2}} \right \rfloor \\ \end{aligned} $$
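The inverse LeGall (5, 3) schedule admits the same kind of Python sketch. It is a minimal illustration under stated assumptions, not the authors' implementation: it follows the JPEG2000 reversible 5/3 inverse lifting steps (the even step subtracts the update term, the odd step adds the prediction back), takes $Y_{\rm ext}$ to be the whole-sample symmetric extension of the interleaved coefficients with $i_0 = 0$, and uses loop bounds chosen by this sketch so that the odd statement, which lags by three samples, can drain at the end. The function names are hypothetical.

```python
def reflect(i, n):
    """Map any index onto [0, n) by whole-sample symmetric extension."""
    period = 2 * (n - 1)
    i %= period
    return i if i < n else period - i


def legall53_inverse_pipelined(y):
    """One fused loop; the odd step lags the even step by three samples."""
    n_samples = len(y)
    assert n_samples >= 2, "sketch assumes at least two samples"
    ext = lambda i: y[reflect(i, n_samples)]
    x = {}  # sparse store, so indices outside [0, n_samples) are allowed
    # Assumed bounds: extra iterations at the end drain the pipeline.
    for n in range(0, n_samples // 2 + 2):
        # Undo the update step: even sample from extended coefficients.
        x[2 * n] = ext(2 * n) - (ext(2 * n - 1) + ext(2 * n + 1) + 2) // 4
        # Undo the predict step: odd sample, using even samples
        # reconstructed in the two previous iterations.
        if 1 <= 2 * n - 3 < n_samples:
            x[2 * n - 3] = ext(2 * n - 3) + (x[2 * n - 4] + x[2 * n - 2]) // 2
    return [x[i] for i in range(n_samples)]
```

For example, `legall53_inverse_pipelined([1, 0, 3, 0, 5, 0, 7, 1])` returns `[1, 2, 3, 4, 5, 6, 7, 8]`. Since both steps use integer floor division, the reconstruction is exact, which is why this filter serves as the lossless option.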


About this article

Cite this article

Zhang, C., Long, Y. & Kurdahi, F. A hierarchical pipelining architecture and FPGA implementation for lifting-based 2-D DWT. J Real-Time Image Proc 2, 281–291 (2007). https://doi.org/10.1007/s11554-007-0057-6


Received: 09 May 2007
Accepted: 05 November 2007
Published: 21 November 2007
Issue Date: December 2007
DOI: https://doi.org/10.1007/s11554-007-0057-6
