Elsevier

Integration

Volume 38, Issue 3, January 2005, Pages 341-352

An efficient architecture for lifting-based two-dimensional discrete wavelet transforms

https://doi.org/10.1016/j.vlsi.2004.07.010

Abstract

An architecture for the lifting-based two-dimensional discrete wavelet transform is presented. The architecture has regular data flow and low control complexity, and achieves 100% hardware utilization. It is easily adapted to arbitrary image sizes, multiple levels of transform, and different numbers of lifting steps. Symmetric extension of the image to be transformed is handled in a way that does not require additional computations or clock cycles. The proposed architecture achieves higher throughput and uses 30% fewer lines of embedded memory than architectures based on convolutional filter banks. The architecture has been implemented on an Altera APEX20KE field programmable gate array for three differently quantized versions of the biorthogonal 9/7 filter set used for JPEG2000 lossy compression. Our best implementation of a one-level two-dimensional discrete wavelet transform achieves a throughput of 66.8 megapixels per second using 7726 logic elements.

Introduction

Since the emergence of the JPEG2000 image compression standard [1], considerable attention has been paid to the development of efficient system architectures for the two-dimensional discrete wavelet transform (2-D DWT). The DWT has traditionally been implemented using convolutional filter banks [2], [3]. However, the lifting scheme [4] promises to reduce the number of operations involved in computing a DWT to almost one-half of those needed with a convolutional approach [5]. In addition, the lifting scheme is amenable to in-place computation [5], so that the DWT can be implemented in low-memory systems.
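As a rough illustration of in-place lifting, the sketch below computes a one-level 1-D DWT with four lifting steps for the biorthogonal 9/7 wavelet and then inverts it. The coefficient values are the commonly quoted floating-point constants for this factorization, and the scaling convention (divide the approximation by K, multiply the detail by K) is an assumption; neither is taken from this article. The function names are hypothetical, and signals are assumed to have at least two samples.

```python
# Commonly quoted lifting constants for the biorthogonal 9/7 wavelet
# (assumed values; the article's quantized coefficients are not shown here).
ALPHA = -1.586134342
BETA = -0.05298011854
GAMMA = 0.8829110762
DELTA = 0.4435068522
K = 1.230174105


def _lift(x, start, coeff):
    """Add coeff * (left + right neighbour) to every second sample, in place.

    A neighbour that falls outside the signal is mirrored about the edge
    sample (whole-sample symmetric extension).
    """
    n = len(x)
    for i in range(start, n, 2):
        left = x[i - 1] if i - 1 >= 0 else x[i + 1]
        right = x[i + 1] if i + 1 < n else x[i - 1]
        x[i] += coeff * (left + right)


def forward_97(signal):
    """One-level 1-D DWT; returns (approximation, detail) as lists."""
    x = list(signal)          # copy once, then overwrite samples in place
    _lift(x, 1, ALPHA)        # predict 1: odd samples from even neighbours
    _lift(x, 0, BETA)         # update 1: even samples from odd neighbours
    _lift(x, 1, GAMMA)        # predict 2
    _lift(x, 0, DELTA)        # update 2
    approx = [v / K for v in x[0::2]]   # scaling convention is an assumption
    detail = [v * K for v in x[1::2]]
    return approx, detail


def inverse_97(approx, detail):
    """Undo forward_97 by running the lifting steps backwards."""
    x = [0.0] * (len(approx) + len(detail))
    x[0::2] = [v * K for v in approx]
    x[1::2] = [v / K for v in detail]
    _lift(x, 0, -DELTA)
    _lift(x, 1, -GAMMA)
    _lift(x, 0, -BETA)
    _lift(x, 1, -ALPHA)
    return x
```

Each step overwrites only the samples it computes and reads only samples the step leaves untouched, which is why no second buffer is needed and why the inverse is just the same steps with negated coefficients in reverse order.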

Some work has been done on lifting-based 2-D DWT architectures. The architecture in [6], based on the recursive pyramid algorithm (RPA) of [7], is viable only for wavelet filters with no more than two lifting steps. Ref. [2] shows that using the RPA for the 2-D DWT results in inefficient hardware utilization and complicated control circuitry. Other work [8] is based on a lifting structure, but reduces embedded memory size at the expense of additional computations. Two additional lifting-based 2-D DWT architectures [9], [10] reduce embedded memory size by breaking the image into blocks, but extra computations are required at the block boundaries to avoid visual artifacts in the reconstructed images.

We propose a new 2-D DWT architecture for a lifting-based implementation of the biorthogonal 9/7 wavelet. The architecture has 100% hardware utilization, regular data flow, and low control complexity. It is easily scaled to accommodate additional lifting steps and multiple levels. The lifting scheme is exploited to do symmetric extension of images without additional computations or clock cycles. Our architecture has modest embedded memory requirements without resorting to splitting the image into blocks. The basic architecture was described in detail in [11]; here, we extend that work by demonstrating the implementation of our design on a field programmable gate array (FPGA).
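One way to see why a lifting structure can absorb symmetric extension is to compare a predict step applied to an explicitly padded row with the same step applied to the unpadded row, where the single real neighbour at the edge is simply counted twice. The sketch below is only an illustration of that general idea, not the circuit described in [11]; the function names are hypothetical, the coefficient is the commonly quoted first 9/7 predict constant, and an even-length row is assumed.

```python
ALPHA = -1.586134342  # first 9/7 predict coefficient (commonly quoted value)


def predict_with_explicit_extension(x):
    """Pad the row by mirroring, then apply the predict step at odd positions."""
    # Whole-sample symmetric extension: mirror about the first and last sample.
    # The left pad matters only for update steps; it is unused by this step.
    ext = [x[1]] + list(x) + [x[-2]]
    out = []
    for i in range(1, len(x), 2):   # odd positions of the original row
        j = i + 1                   # same sample in the padded row
        out.append(ext[j] + ALPHA * (ext[j - 1] + ext[j + 1]))
    return out


def predict_with_folded_boundary(x):
    """No padding: at the right edge the one real neighbour is counted twice."""
    n = len(x)
    out = []
    for i in range(1, n, 2):
        if i + 1 < n:
            out.append(x[i] + ALPHA * (x[i - 1] + x[i + 1]))
        else:
            out.append(x[i] + 2 * ALPHA * x[i - 1])
    return out
```

Both functions return the same values, so the extended samples never need to be generated or stored; only the operand selection at the boundary changes.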

Section snippets

Background and motivation

In implementing the 2-D DWT, it is difficult to perform efficient column processing after row processing; the row processor produces coefficients in a row-wise order, while traditional column processing requires those coefficients in a column-wise order. Three basic architectures, surveyed in [12], address this difficulty in different ways. A typical level-by-level architecture uses a single processing module that first processes the rows, and then the columns. Intermediate values between row
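The ordering mismatch between row and column processing can be illustrated with a minimal software model of a level-by-level transform. This is a sketch under stated assumptions: a Haar lifting step stands in for the 9/7 filter to keep the example short, image dimensions are assumed even, and all function and subband names are hypothetical.

```python
def dwt1d(x):
    """Stand-in 1-D lifting transform (Haar), used only to keep this short."""
    even, odd = x[0::2], x[1::2]
    d = [o - e for e, o in zip(even, odd)]        # predict
    a = [e + di / 2 for e, di in zip(even, d)]    # update
    return a, d


def transpose(m):
    """Turn a list of rows into a list of columns (and back)."""
    return [list(col) for col in zip(*m)]


def dwt2d_level_by_level(image):
    """Process every row, buffer the results, then process every column."""
    low_rows, high_rows = [], []
    for row in image:                 # row pass consumes data in row order
        a, d = dwt1d(row)
        low_rows.append(a)
        high_rows.append(d)

    def column_pass(block):
        # Column filtering needs column-wise access, so in this simple model
        # the whole intermediate block is buffered before the pass starts.
        lo, hi = [], []
        for col in transpose(block):
            a, d = dwt1d(col)
            lo.append(a)
            hi.append(d)
        return transpose(lo), transpose(hi)

    ll, lh = column_pass(low_rows)    # subband naming here is a convention
    hl, hh = column_pass(high_rows)
    return ll, lh, hl, hh
```

The intermediate buffers `low_rows` and `high_rows` hold a full image's worth of coefficients in this model, which is the storage cost that line-based designs try to reduce.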

Our 2-D DWT architecture

The block diagram of the proposed 2-D DWT architecture is shown in Fig. 1. The FPGA implements a row processor (RP), a column processor (CP), and an embedded memory (MEM) used to buffer results between the two. The image, which is stored in external memory, is read to the FPGA in row-by-row order. The row processor performs horizontal filtering on the rows and writes the approximation, a, and detail, d, coefficients to the local memory. Once a sufficient number of rows has been processed, the

Characteristics of the architecture

We now discuss the general hardware characteristics of our architecture. Full derivations of these results are given in [11]. Table 1 shows expressions for the number of external memory accesses, the number of embedded memory accesses, and the number of clock cycles needed to compute an L-level 2-D DWT of an N by N image. In the table, ls is the number of lifting steps (four for the biorthogonal 9/7 wavelets) and Ls is the latency of the system, which depends on ls, N, the specific quantized

Results of FPGA-based implementation

The proposed architecture was implemented on an Altera APEX20KE FPGA. Implementation requires that the filter set coefficients first be quantized from their ideal floating-point values to fixed-point values. We show results for the three quantized versions of the 9/7 biorthogonal wavelet filter set shown in Table 2. The first system is a quantized version of a lifting structure for the original filters as specified in the JPEG2000 standard; we call this a “lifting, irrational” structure, because the
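The quantization step mentioned above can be sketched as rounding each lifting coefficient to a fixed number of fractional bits and then multiplying with integer arithmetic. The word length of 8 fractional bits, the coefficient values, and the function names are all assumptions chosen for illustration; the quantized filter sets of Table 2 are not reproduced in this excerpt.

```python
# Commonly quoted floating-point 9/7 lifting coefficients (assumed values).
COEFFS = {
    "alpha": -1.586134342,
    "beta": -0.05298011854,
    "gamma": 0.8829110762,
    "delta": 0.4435068522,
}


def quantize(value, frac_bits):
    """Round to the nearest multiple of 2**-frac_bits; return the integer code."""
    return round(value * (1 << frac_bits))


def fixed_mul(sample, code, frac_bits):
    """Multiply an integer sample by a quantized coefficient, then rescale.

    The right shift is an arithmetic shift, so the result is floored.
    """
    return (sample * code) >> frac_bits


FRAC_BITS = 8  # hypothetical word length, not taken from the article
CODES = {name: quantize(v, FRAC_BITS) for name, v in COEFFS.items()}
```

With round-to-nearest, each quantized coefficient differs from its ideal value by at most half of one least significant bit, so adding fractional bits trades hardware cost against coefficient accuracy.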

Conclusions

An efficient architecture implementing a lifting-based two-dimensional discrete wavelet transform is presented. Use of the lifting structure allows implementation of symmetric extension without additional computations, giving lifting-based architectures a significant advantage over convolutional filter bank-based architectures in terms of throughput. The lifting-based architecture also requires significantly less embedded memory than a similar convolutional filter bank-based architecture. The

S. Barua has a B.S. in electrical and electronics engineering from the Bangladesh Institute of Technology, and a Masters in electronics from the University of Electro-communications in Japan. He will complete a Masters in electrical engineering at the University of Akron in Summer 2004.

References (16)

  • ITU T.800: JPEG2000 image coding system Part 1, ITU Standard, July...
  • P.-C. Wu et al.

    An efficient architecture for two-dimensional discrete wavelet transform

    IEEE Trans. Circuits Systems Video Technol.

    (2001)
  • M. Vishwanath et al.

    VLSI architecture for discrete wavelet transform

    IEEE Trans. Circuits Systems

    (1995)
  • W. Sweldens

    The lifting scheme: a new philosophy in biorthogonal wavelet construction

    Proc. SPIE

    (1995)
  • I. Daubechies et al.

    Factoring wavelet transforms into lifting steps

    J. Fourier Anal. Appl.

    (1998)
  • M. Ferretti et al.

    A parallel architecture for the 2-D discrete wavelet transform with integer lifting scheme

    J. VLSI Signal Process.

    (2001)
  • M. Vishwanath

    The recursive pyramid algorithm for the discrete wavelet transform

    IEEE Trans. Signal Process.

    (1994)
  • W.-H. Chang et al.

    A line-based memory efficient and programmable architecture for 2D DWT using lifting scheme

J.E. Carletta is an assistant professor of electrical and computer engineering at the University of Akron. She received her Ph.D. in computer engineering from Case Western Reserve University. Her research involves high-performance hardware for digital signal processing applications. Her work is funded by two NSF Information Technology Research awards.

K.A. Kotteri received his B.S. in electronics engineering from the University of Mumbai in 1997, and a Masters in electrical engineering from Virginia Tech in 2004. He is now a software development engineer for Microsoft in Redmond, Washington. His research interests are in the areas of digital signal processing, image processing and communications.

A.E. Bell is an associate professor in the department of electrical and computer engineering at Virginia Tech. She received her Ph.D. in electrical engineering from the University of Michigan. Bell conducts research in wavelet image compression, embedded systems, and bioinformatics. She is the recipient of a 1999 NSF CAREER award and a 2002 NSF Information Technology Research award; she has also received two awards for teaching excellence.

This material is based upon work supported by the National Science Foundation under Grants 0218672 and 0217894.