Elsevier

Real-Time Imaging

Volume 10, Issue 5, October 2004, Pages 285-295
An embedded wavelet image coder with parallel encoding and sequential decoding of bit-planes

https://doi.org/10.1016/j.rti.2004.08.006

Abstract

Wavelet-based coders are widely used in image compression. Many popular embedded wavelet coders are based on a data structure known as the zerotree. However, there is another category of embedded wavelet coders that are fast and efficient without employing zerotrees. These coders are based on three key concepts: (1) wavelet coefficient reordering, (2) bit-plane partitioning, and (3) encoding of bit-planes with efficient variants of run-length coding. In this paper, we propose a novel method to construct a bit-plane encoder that can be used in this category of non-zerotree coders. Instead of encoding the bit-planes progressively, one after another, the bit-plane encoding process can be finished in one pass when multiple bit-plane encoders are activated concurrently. With the proposed method, the traditional partitioned-block-based parallel processing strategy gains another dimension of processing flexibility: the depth of the bit-planes. The bit-plane encoder inherently targets parallel processing architectures. The final output bitstream can be made compatible with that of the original sequential coder if compatibility is preferred over speed and memory efficiency.

Introduction

Wavelet transform-based image coding schemes are widely used in the scientific and industrial communities. In many application scenarios, such as image database indexing and retrieval, parallel processing capability is desirable because of the huge volume of data and the demand for real-time processing.

Although parallel wavelet image compression is an interesting research topic, most work so far has addressed parallel wavelet transform algorithms, which form only the first stage of a complete image coder. Less work has been reported on parallel processing of the quantization and entropy coding stages. In the authors’ opinion, this results from the difficulty of parallelizing zerotree-based coding methods, since many wavelet image coding methods proposed to date are based on the zerotree data structure.

The first zerotree-based embedded coding method proposed was EZW [1], in which clusters of insignificant (small-magnitude) coefficients are encoded as single symbols. These clusters of coefficients are called zerotrees because of the quadtree-like arrangement of coefficients, which exploits the residual cross-scale statistical dependency in a wavelet pyramid. Later, an enhanced zerotree-based technique, SPIHT [2], was proposed; it uses a different pattern of parent–child relationships and achieves good compression performance even without an explicit entropy encoding stage.

Parallelization of the EZW algorithm was first reported in [3], where two approaches were proposed. The first is a straightforward parallelization which ensures that each processing element (PE) contains only entire zerotrees of wavelet coefficients. The second reserves one PE for collecting the symbols that have to be encoded; this PE reorders the symbols and encodes them, which results in higher complexity. The first approach can be generalized as a local processing principle, under which an image (a decomposed wavelet pyramid) is partitioned into pixel (coefficient) blocks and each block is compressed locally by a PE. This one-block-per-PE approach is also suggested in [4] for parallel execution in the JPEG2000 image coding system.

The parallelization of the SPIHT algorithm is more difficult because of the list structures it uses. The lists in SPIHT cause variable, data-dependent memory requirements. Managing memory while adding and removing list nodes is complicated and undesirable in memory-constrained environments. Worse still, managing a list distributed across multiple PEs is even more difficult. Some techniques have recently been proposed to remove the list structures from zerotree-based coders, making them more suitable for hardware implementation and parallel processing [5], [6], [7].

Because of the popularity of zerotree-based techniques, virtually no effort has been dedicated to parallelizing other equally efficient techniques, which could potentially be parallelized more easily and efficiently. One category of such algorithms is based on bit-plane coding [8] and run-length coding of binary sequences [9], [10], [11], [13]. The common features of these methods are: (1) the wavelet pyramid is mapped into a one-dimensional array and thereafter the coefficients within it are linearly indexed, (2) the wavelet pyramid is treated as multiple bit-planes and bit-planes are encoded following the order of decreasing significance, and (3) two-dimensional bit-planes are converted to one-dimensional binary sequences by linear reordering before the binary sequences are encoded by efficient run-length coders.
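As a rough illustration of features (1) to (3), the sketch below slices a short, already linearized coefficient array into bit-planes and run-length codes one plane. It is a simplified stand-in, not any of the cited coders: the coefficient values and the helper names `bit_plane` and `zero_run_lengths` are assumptions made for this example, and the raw magnitude bits are coded directly, without the separate handling of significance and refinement information that a practical coder would need.

```python
def bit_plane(coeffs, b):
    """Return bit b of every coefficient magnitude as a list of 0/1 values."""
    return [(abs(c) >> b) & 1 for c in coeffs]


def zero_run_lengths(bits):
    """Run-length code a binary sequence.

    Returns (runs, tail): runs[i] is the number of zeros preceding the
    i-th 1 bit, and tail is the number of zeros after the last 1 bit.
    """
    runs, run = [], 0
    for bit in bits:
        if bit:
            runs.append(run)
            run = 0
        else:
            run += 1
    return runs, run


# Hypothetical coefficients, assumed to be linearly indexed already.
coeffs = [13, -6, 0, 3, 0, 0, -1, 2]

# Index of the most significant non-empty bit-plane.
top = max(abs(c) for c in coeffs).bit_length() - 1

# Bit-planes in order of decreasing significance.
planes = {b: bit_plane(coeffs, b) for b in range(top, -1, -1)}
plane1_code = zero_run_lengths(planes[1])
```

The higher planes are sparse (plane 3 holds a single 1 bit here), which is why run-length coding of the reordered binary sequences can be efficient.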

This paper demonstrates that the category of embedded coding methods mentioned above can easily and efficiently be mapped onto a multiple instruction multiple data (MIMD) architecture in the form of parallel bit-plane encoding. In other words, in addition to being segmented into blocks, wavelet pyramids can be sliced in another dimension (depth of bit-planes) into bit-planes as shown in Fig. 1. Furthermore, a bit-plane can be encoded by a PE without any communication with other PEs as proposed in this paper.
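The idea of giving each bit-plane to its own PE can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: `encode_plane` is a placeholder run-length coder, `encode_all_planes` and the `workers` parameter are hypothetical names, the input is assumed to be non-empty, and a thread pool stands in for the PEs. In CPython the threads mainly show the structure, since pure-Python work is not executed in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def encode_plane(coeffs, b):
    """Encode bit-plane b using nothing but the coefficients themselves.

    Returns (b, runs, tail), where runs holds the zero-run length before
    each 1 bit and tail is the trailing zero run. No state is shared with
    the encoders of other planes.
    """
    runs, run = [], 0
    for c in coeffs:
        if (abs(c) >> b) & 1:
            runs.append(run)
            run = 0
        else:
            run += 1
    return b, runs, run


def encode_all_planes(coeffs, workers=4):
    """Run one independent encoding task per bit-plane."""
    top = max(abs(c) for c in coeffs).bit_length() - 1
    planes = range(top, -1, -1)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order: most significant plane first.
        return list(pool.map(lambda b: encode_plane(coeffs, b), planes))


coeffs = [13, -6, 0, 3, 0, 0, -1, 2]  # hypothetical linearized coefficients
parallel = encode_all_planes(coeffs)
sequential = [encode_plane(coeffs, b) for b in (3, 2, 1, 0)]
```

Because each task reads only the shared, read-only coefficient array, the concurrent result matches the plane-by-plane sequential result without any synchronization between tasks.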

In the next section, we will describe our proposed parallel encoding and sequential decoding (PESD) method of parallelization. We use a modified wavelet difference reduction (MWDR) algorithm [12], which does not employ arithmetic coding, as our example although the method we propose can readily be applied to other data-independent schemes such as progressive wavelet coding (PWC) [11]. The method proposed in Section 2 is extended to cover bitstream compatibility in Section 3. The decoding procedure is discussed in Section 4. Experimental results are presented in Section 5. The paper ends with concluding remarks in Section 6.

Section snippets

Parallelization through data-independent bit-plane encoding

The success of wavelets in the area of still image compression has been attributed to the utilization of certain types of data dependency. For example, zerotrees in the EZW and the SPIHT, and other context models in [4], [10] all utilize the residual data dependency among wavelet coefficients. One disadvantage of exploiting data dependency is that tracking the intermediate state information of the coder becomes more difficult than it does in the MWDR and the PWC. In the following sections, we

Bitstream compatibility in parallel bit-plane encoding

In the last section, we explained that an intermediate list like the ICS is not necessary: along the path of the fixed scan order, we know exactly which coefficients should be skipped solely by testing their magnitudes. Note that this magnitude test is applied to all coefficients, whereas in the MWDR only coefficients in the ICS undergo the test. However, the extra computational cost of the magnitude test is just one logic instruction, which is justified compared to the memory
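One plausible form of such a magnitude test is sketched below. It assumes that a coefficient counts as already significant at bit-plane b when its magnitude is at least 2^(b+1), which a single shift can check. The function names `classify` and `significance_runs`, and the convention that skipped coefficients do not count toward run lengths, are assumptions of this example rather than details taken from the MWDR.

```python
def classify(c, b):
    """Classify coefficient c at bit-plane b from its magnitude alone."""
    m = abs(c)
    if m >> (b + 1):
        # Some bit above plane b is set: significant since a higher plane.
        return "already significant"
    if (m >> b) & 1:
        return "newly significant"
    return "insignificant"


def significance_runs(coeffs, b):
    """Zero-run lengths between newly significant coefficients in plane b.

    Coefficients that are already significant are skipped on the fly, so
    no list of insignificant coefficients has to be stored or updated.
    """
    runs, run = [], 0
    for c in coeffs:
        m = abs(c)
        if m >> (b + 1):
            continue  # skipped by the magnitude test
        if (m >> b) & 1:
            runs.append(run)
            run = 0
        else:
            run += 1
    return runs


coeffs = [13, -6, 0, 3, 0, 0, -1, 2]  # hypothetical linearized coefficients
labels = [classify(c, 2) for c in coeffs[:4]]
runs_plane1 = significance_runs(coeffs, 1)
```

The test depends only on the coefficient and the plane index, so an encoder for plane b can apply it without knowing what the encoders of other planes have produced.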

Sequential bit-plane decoding

Contrary to the encoding process, which can be truly parallelized, the decoding process is inherently sequential because information becomes available to the receiver incrementally as bit-planes are progressively decoded. However, with proper algorithm analysis, we can still potentially optimize the execution of the decoding process. The pseudo-code for the bit-plane decoder is shown in Listing 3.

It is assumed that all bit-planes higher than b have already been decoded error-free. When we decode
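To make the dependency on previously decoded planes concrete, here is a small round-trip sketch. It is not the paper's Listing 3: the stream layout (a list of `(run, is_negative)` pairs plus a list of refinement bits per plane) and the names `encode_plane` and `decode_planes` are assumptions chosen for illustration. The encoder needs only magnitude tests, while the decoder must consult the magnitudes reconstructed from higher planes before it can interpret plane b.

```python
def encode_plane(coeffs, b):
    """Encode plane b independently: (significance runs, refinement bits)."""
    runs, refine, run = [], [], 0
    for c in coeffs:
        m = abs(c)
        if m >> (b + 1):               # significant since a higher plane
            refine.append((m >> b) & 1)
        elif (m >> b) & 1:             # becomes significant in this plane
            runs.append((run, c < 0))
            run = 0
        else:
            run += 1
    return runs, refine


def decode_planes(streams, n, top):
    """Decode planes top, top-1, ... strictly in that order."""
    mags, neg = [0] * n, [False] * n
    for i, (runs, refine) in enumerate(streams):
        b = top - i
        run_it, ref_it = iter(runs), iter(refine)
        pending, skipped = next(run_it, None), 0
        for k in range(n):
            if mags[k]:
                # Known only because all higher planes were decoded first.
                mags[k] |= next(ref_it) << b
            elif pending is not None and skipped == pending[0]:
                mags[k], neg[k] = 1 << b, pending[1]
                pending, skipped = next(run_it, None), 0
            else:
                skipped += 1
    return [-m if s else m for m, s in zip(mags, neg)]


coeffs = [13, -6, 0, 3, 0, 0, -1, 2]  # hypothetical linearized coefficients
top = max(abs(c) for c in coeffs).bit_length() - 1
streams = [encode_plane(coeffs, b) for b in range(top, -1, -1)]
decoded = decode_planes(streams, len(coeffs), top)
coarse = decode_planes(streams[:2], len(coeffs), top)
```

Decoding all planes recovers the input, while decoding only a prefix of the planes gives a coarser approximation, as expected of an embedded bitstream.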

Experimental results

To verify the correctness of the proposed algorithm, two versions of the algorithm have been implemented. The first, PESD-A, is a non-threaded version. The second, PESD-B, is a threaded version which uses the Pthread library for the Linux operating system and can take advantage of the symmetric multi-processor (SMP) kernel of Linux. To demonstrate the simplicity of the proposed algorithm, the run time of the non-threaded PESD-A is compared with that of the QccPack [14] implementation of the

Conclusions

In this paper, we proposed a wavelet image coding algorithm that is based on parallel execution of multiple bit-plane encoding subroutines. In addition to the conventional parallelization strategy of segmenting the wavelet pyramid into multiple code-blocks, we show that using our proposed algorithm, each code-block can be further segmented into multiple bit-planes that can be encoded simultaneously. The output bitstreams of the bit-plane encoders are buffered, assembled, and finally truncated
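The buffering, assembly, and truncation step can be pictured with the short sketch below. It assumes, for illustration only, that each bit-plane encoder leaves its output in a separate byte buffer keyed by plane index and that truncation means cutting the assembled stream at a byte budget. The name `assemble` and the `budget` parameter are hypothetical.

```python
def assemble(buffers, budget):
    """Concatenate per-plane buffers, most significant plane first,
    then cut the stream at `budget` bytes."""
    stream = b"".join(buffers[b] for b in sorted(buffers, reverse=True))
    return stream[:budget]


# Hypothetical outputs of three bit-plane encoders, finished in any order.
buffers = {0: b"cc", 2: b"a", 1: b"bb"}
stream = assemble(buffers, budget=4)
```

Ordering the buffers by decreasing significance before truncating keeps the most important planes in the surviving prefix, whatever order the encoders finished in.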

References (15)

  • R. Kutil. Approaches to zerotree image and video coding on MIMD architectures. Parallel Computing (2002).
  • J.M. Shapiro. Embedded image coding using zerotrees of wavelet coefficients. IEEE Transactions on Acoustics, Speech and Signal Processing (1993).
  • A. Said et al. A new, fast, and efficient image codec based on set partitioning in hierarchical trees. IEEE Transactions on Circuits and Systems for Video Technology (1996).
  • C.D. Creusere. Image coding using parallel implementations of the embedded zerotree wavelet algorithm. Proceedings of the SPIE digital video compression algorithms and technologies (1996).
  • Taubman D, Ordentlich E, Weinberger M, Seroussi G. Embedded block coding in JPEG2000. HPL-2001-35 (External), HP,...
  • Lin W-K, Burgess N. Listless zerotree coding for color images. Proceedings of the 32nd Asilomar conference on signals,...
  • Wheeler FW, Pearlman WA. SPIHT image compression without lists. Proceedings of the IEEE ICASSP, 4, 2000. p....
There are more references available in the full text version of this article.