
Real-Time Imaging

Volume 11, Issue 1, February 2005, Pages 45-58

An error resilient coding scheme for H.263 video transmission based on data embedding

https://doi.org/10.1016/j.rti.2005.01.003

Abstract

For entropy-coded H.263 video frames, a transmission error in a codeword will not only affect the underlying codeword but may also affect subsequent codewords, resulting in severe degradation of the received video frames. In this study, an error resilient coding scheme for real-time H.263 video transmission is proposed. The objective of the proposed scheme is to recover high-quality H.263 video frames from the corresponding corrupted video frames.

At the encoder, for an H.263 intra-coded I frame, the important data (the codebook index) for each macroblock (MB) are extracted and embedded into another MB within the I frame by the proposed data embedding scheme for I frames. For an H.263 inter-coded P frame, the important data (the coding mode and motion vector information) for each group of blocks (GOB) are extracted and embedded into the next frame by the proposed MB-interleaving GOB-based data embedding scheme. At the decoder, after all the corrupted MBs within an H.263 video frame are detected and located, if the important data for a corrupted MB can be extracted correctly, the extracted data guide the employed error concealment scheme in concealing the corrupted MB; otherwise, the error concealment scheme alone is used to conceal the corrupted MB. Compared with some recent error resilient approaches, this study develops the important data selection mechanism for different types of MBs, the detailed data embedding mechanism, and the decoder-side error detection and concealment scheme into an integrated error resilient coding scheme. Based on the simulation results obtained in this study, the proposed scheme outperforms the four existing approaches used for comparison and can recover high-quality H.263 video frames from the corresponding corrupted video frames in real time at video packet loss rates of up to 30%.

Introduction

To reduce transmission bit rate or storage capacity, many image/video compression techniques and standards have been developed for various applications, such as videophones, videoconferencing, interactive TV, multimedia databases, and the World Wide Web [1], [2], [3]. Transmitting real-time compressed images/video over noisy channels, such as the Internet or wireless networks, has become a challenging problem [1], [2], [3]. Transmission of real-time video usually has bandwidth, delay, and loss requirements. First, to achieve acceptable presentation quality, real-time video transmission usually has a minimum bandwidth requirement; however, several current transmission channels, such as the Internet, cannot provide sufficient bandwidth reservation to meet such a requirement. Second, real-time video requires a bounded end-to-end delay, i.e., each video packet must be received at the decoder in time to be decoded and displayed. Third, loss of packets (transmission errors) can severely degrade the perceived visual quality. In practice, several current transmission channels, such as the Internet, cannot provide any quality of service guarantee to meet these three requirements. Hence, to achieve robust real-time video transmission, two kinds of issues, namely, congestion control and error control, should be addressed. The main purpose of congestion control is to prevent packet loss; however, packet loss is usually unavoidable. Hence, error control is required to maximize video presentation quality in the presence of packet loss (transmission errors) [1].

For entropy-coded H.263 video frames [4], [5], [6], a transmission error in a codeword will not only affect the underlying codeword but may also affect subsequent codewords, resulting in severe degradation of the received video frames. To cope with the synchronization problem, each of the top two layers of the H.263 hierarchical structure [4], [5], [6], namely, the picture and the group of blocks (GOB), begins with a fixed-length start code (a synchronization codeword). After receiving a start code, the decoder resynchronizes regardless of any preceding slippage. However, a transmission error may still affect the underlying codeword and its subsequent codewords within the corrupted GOB. Moreover, because of the use of motion-compensated interframe coding, the effect of a transmission error may propagate to subsequent video frames, as illustrated in the example shown in Fig. 1. In this study, an error resilient coding scheme for real-time H.263 video transmission is proposed.
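To make the resynchronization behavior concrete, the following minimal Python sketch scans a bit-level representation of the bitstream for the next GOB start code and resumes decoding from there. The string-of-bits representation and the surrounding logic are illustrative assumptions rather than the decoder used in this study; the 17-bit pattern (sixteen zero bits followed by a one) follows the H.263 GOB start code.

```python
# Minimal sketch of start-code resynchronization (illustrative, not the
# decoder used in this study). The bitstream is represented here as a
# string of '0'/'1' characters purely for readability.

GBSC = "0" * 16 + "1"  # H.263 GOB start code: sixteen zero bits then a one


def resynchronize(bits: str, error_pos: int) -> int:
    """Return the bit position of the next start code at or after error_pos,
    or len(bits) if none is found (i.e., the rest of the frame is skipped)."""
    idx = bits.find(GBSC, error_pos)
    return idx if idx >= 0 else len(bits)


# Usage: once an error is detected at bit position p, the data up to the
# next start code are discarded and decoding resumes from that position.
# next_gob_pos = resynchronize(bits, p)
```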

In general, error resilient approaches include three categories [1], [2], [3], namely: (1) the error resilient encoding approach [7], [8], [9], [10], [11], [12], [13], [14], (2) the error concealment approach [15], [16], [17], [18], [19], [20], [21], and (3) the encoder–decoder interactive error control approach [22]. The error resilient encoding approach can be further divided into four categories, namely, (1) the robust entropy coding approach [7], [8], (2) the error resilient prediction approach [9], [10], (3) the layered coding with unequal error protection approach [11], [12], and (4) the multiple description coding approach [13], [14].

The robust entropy coding approach [7], [8] copes with the synchronization problem, whereas the error resilient prediction approach [9], [10] copes with the temporal error propagation problem. The layered coding with unequal error protection approach [11], [12] divides an image/video bitstream into a base layer and one or several enhancement layer(s) with unequal degrees of error protection. The multiple description coding approach [13], [14] divides an image/video bitstream into several sub-bitstreams, known as descriptions. Any single description can provide a basic quality, and more descriptions together will provide an improved quality. The error concealment approach [15], [16], [17], [18], [19], [20], [21] conceals the corrupted (lost) information due to transmission errors in a received video bitstream at the decoder by using (1) spatial (spectral) [15], [16], (2) temporal [17], [18], or (3) hybrid (spatial and temporal) [19], [20], [21] image/video information. On the other hand, if a feedback channel can be set up from the decoder to the encoder [22], the decoder can inform the encoder about which parts of the transmitted information are corrupted, and then the encoder can adjust its encoding operations accordingly to suppress or eliminate the effect of transmission errors. If the automatic repeat request function is supported, the corrupted (lost) packets can be retransmitted with some delay constraint.
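As an illustration of the temporal error concealment idea mentioned above, the sketch below replaces a corrupted 16x16 luminance MB with the motion-compensated MB from the previous decoded frame; the function name, the zero-motion default, and the boundary clipping are assumptions made for this sketch, not the concealment scheme employed in this study.

```python
import numpy as np

MB = 16  # macroblock size in luminance samples


def conceal_mb_temporal(prev_frame: np.ndarray, cur_frame: np.ndarray,
                        mb_row: int, mb_col: int, mv=(0, 0)) -> None:
    """Conceal the corrupted MB at (mb_row, mb_col) of cur_frame by copying the
    motion-compensated MB from prev_frame. mv is a recovered or estimated
    motion vector in integer pixels; (0, 0) copies the co-located MB."""
    y, x = mb_row * MB, mb_col * MB
    dy, dx = mv
    h, w = prev_frame.shape
    # Clip the reference block so it stays inside the previous frame.
    ys = int(np.clip(y + dy, 0, h - MB))
    xs = int(np.clip(x + dx, 0, w - MB))
    cur_frame[y:y + MB, x:x + MB] = prev_frame[ys:ys + MB, xs:xs + MB]
```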

Using the information from the spatially and/or temporally neighboring blocks of a corrupted block to conceal that block may introduce some problems. First, the information from the neighboring blocks may not be available (it may also be corrupted). Second, the video contents of a corrupted block and its spatially and/or temporally neighboring blocks may be very different. In these cases, the concealed results of the foregoing error concealment approaches are usually not good enough. Recently, several error resilient coding approaches based on data embedding have been proposed [23], [24], [25], [26], [27], [28], [29], in which some important data useful for error concealment at the decoder are embedded into the video frames when these frames are encoded. The embedded data should be “almost” invisible and should not noticeably degrade the quality of the video frames, much like a digital watermark [30]. At the decoder, if some corrupted blocks are detected and located, the important (embedded) data for the corrupted blocks are extracted and used to facilitate error concealment. Song and Liu [26] proposed a data embedding scheme for error-prone channels, in which some redundant information used to protect the motion vectors and coding modes of MBs in one frame is embedded into the motion vectors of the next frame. Based on the assumption that the frame following a corrupted frame is correctly received, the decoder can exactly recover the motion vectors of the corrupted GOBs in the corrupted frame. Yilmaz and Alatan [27] proposed an error resilient video transmission codec utilizing imperceptible embedded information for error detection, resynchronization, and reconstruction. For an intra-coded frame, both a spatial error recovery technique embedding the edge orientation information of each block and a resynchronization technique embedding the bit-length of each block are proposed to recover the corrupted blocks. For an inter-coded frame, the embedded motion vector information is used to recover the corrupted blocks. In some recent error resilient approaches [26], [27], [28], the case where an MB (or block) and its important data (embedded into another MB or block) are corrupted simultaneously is not well handled; in this case, the embedded data for the corrupted MB (or block) are not available and an error concealment scheme is required. In this study, the important data selection mechanism for different types of MBs, the detailed data embedding mechanism, and the decoder-side error detection and concealment scheme are developed together into an integrated error resilient coding scheme, and the above-mentioned case is handled explicitly.
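The detailed data embedding mechanism of the proposed scheme is described in Section 3; as a generic illustration of how a few bits can be embedded almost invisibly, the sketch below forces the parity of a selected quantized DCT coefficient of an 8x8 block to carry one bit. The coefficient position and the parity rule are assumptions for this sketch and do not reproduce the embedding mechanism proposed in this study.

```python
import numpy as np


def embed_bit(coeffs: np.ndarray, bit: int, pos: int = 63) -> None:
    """Illustrative parity embedding: adjust the quantized coefficient at
    zig-zag position `pos` of an 8x8 block so that its parity equals `bit`."""
    c = int(coeffs.flat[pos])
    if (c & 1) != bit:
        # Change the coefficient by +/-1, moving its magnitude toward zero
        # when possible so that the extra distortion stays small.
        coeffs.flat[pos] = c + 1 if c <= 0 else c - 1


def extract_bit(coeffs: np.ndarray, pos: int = 63) -> int:
    """Recover the embedded bit from the coefficient parity."""
    return int(coeffs.flat[pos]) & 1
```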

In this study, at the encoder, for an H.263 intra-coded I frame, the important data (the codebook index) for each MB are extracted and embedded into another MB within the I frame by the proposed data embedding scheme for I frames. For an H.263 inter-coded P frame, the important data (the coding mode and motion vector information) for each GOB are extracted and embedded into the next frame by the proposed MB-interleaving GOB-based data embedding scheme. At the decoder, after all the corrupted MBs within an H.263 video frame are detected and located, if the important data for a corrupted MB can be extracted correctly, the extracted data guide the employed error concealment scheme in concealing the corrupted MB; otherwise, the error concealment scheme alone is used to conceal the corrupted MB.
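The decoder-side flow just described can be summarized as follows; the helper callables in this sketch (detection, extraction, and concealment) are placeholders standing in for the components developed in this study and are named here purely for illustration.

```python
def decode_frame_with_resilience(frame, detect_corrupted_mbs,
                                 extract_important_data, conceal_mb):
    """Sketch of the decoder-side flow with hypothetical helper callables:
    locate the corrupted MBs, try to extract each MB's embedded important
    data, and conceal with that data when it is available, falling back to
    plain error concealment otherwise."""
    for mb in detect_corrupted_mbs(frame):
        data = extract_important_data(frame, mb)  # None if not recoverable
        if data is not None:
            conceal_mb(frame, mb, important_data=data)  # data-guided concealment
        else:
            conceal_mb(frame, mb, important_data=None)  # plain error concealment
    return frame
```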

This paper is organized as follows. The employed error detection and concealment scheme for H.263 video transmission is given in Section 2. The proposed error resilient coding scheme for real-time H.263 video transmission based on data embedding is addressed in Section 3. Simulation results are included in Section 4, followed by concluding remarks.

Section snippets

Error detection for H.263 video transmission

In this study, error detection for an H.263 GOB (or an equivalent video packet) is performed by checking a set of error-checking conditions, without adding extra redundant bits to the transmitted bitstream. The set of error-checking conditions is derived from the constraints imposed on the H.263 video bitstream syntax and is listed as follows (an illustrative sketch of this kind of syntax checking is given after the list):

(1) An invalid codeword for a Huffman code, DCT coefficient, motion vector code, MCBPC code, MCBPC intra-code, or CBPY code is found.

(2) The total number of
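A minimal sketch of this kind of syntax checking follows; the specific checks and limits shown (codeword-table lookup, at most 64 DCT coefficients per 8x8 block, 11 MBs per QCIF GOB) are illustrative assumptions standing in for the full set of conditions, and the dictionary-based GOB representation is likewise assumed.

```python
def gob_has_syntax_error(decoded_gob: dict, vlc_tables: dict,
                         mbs_per_gob: int = 11) -> bool:
    """Illustrative syntax checks for one decoded GOB (not the full condition
    set). decoded_gob is assumed to expose the decoded codewords, the number
    of DCT coefficients per 8x8 block, and the number of decoded MBs."""
    # An invalid (non-decodable) codeword was encountered.
    for table_name, codeword in decoded_gob.get("codewords", []):
        if codeword not in vlc_tables.get(table_name, set()):
            return True
    # More than 64 DCT coefficients were decoded for an 8x8 block.
    if any(n > 64 for n in decoded_gob.get("coeffs_per_block", [])):
        return True
    # The number of decoded MBs differs from the expected MBs per GOB
    # (11 MBs per GOB for a QCIF frame).
    if decoded_gob.get("num_mbs", mbs_per_gob) != mbs_per_gob:
        return True
    return False
```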

Proposed error resilient coding scheme for real-time H.263 video transmission based on data embedding

As stated before, in this study an error resilient coding scheme for real-time H.263 video transmission based on data embedding is proposed. Within the proposed scheme, the following issues will be addressed: (1) what kind of important data for the MBs within a video frame should be extracted and embedded, (2) where should the important data be embedded, (3) how to embed the important data into the corresponding “masking” MBs or GOBs, and (4) how to extract and use the important data to

Simulation results

Four QCIF test video sequences, “Carphone,” “Coastguard,” “Foreman,” and “Salesman,” with different video packet loss rates (VPLR) are used to evaluate the performance of the proposed scheme. Here a video packet is equivalent to one complete GOB, and all the test video sequences are coded at a frame rate of 10 fps with bit rates of 16, 48, and 64 kbps by the TMN-11 rate control scheme [5].
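As an illustration of this simulation setting, the sketch below randomly drops GOB-sized video packets at a given VPLR; representing each packet as an opaque object and marking lost packets as None are assumptions made for this sketch.

```python
import random


def drop_video_packets(gob_packets, vplr, seed=0):
    """Randomly mark GOB packets as lost at the given video packet loss rate
    (e.g., vplr=0.3 for a 30% VPLR). Lost packets are replaced by None."""
    rng = random.Random(seed)  # fixed seed so a simulation run is repeatable
    return [None if rng.random() < vplr else pkt for pkt in gob_packets]
```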

The peak signal to noise ratio (PSNR) is employed in this study as the objective performance measure for the three
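For reference, the PSNR measure used as the objective criterion can be computed as in the following sketch, assuming 8-bit luminance frames; this is the standard definition rather than anything specific to the proposed scheme.

```python
import numpy as np


def psnr(original: np.ndarray, concealed: np.ndarray) -> float:
    """Peak signal to noise ratio (dB) between an original and a concealed
    8-bit frame: PSNR = 10 * log10(255^2 / MSE)."""
    diff = original.astype(np.float64) - concealed.astype(np.float64)
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(255.0 ** 2 / mse)
```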

Concluding remarks

Based on the simulation results obtained in this study, several observations can be made. (1) Based on the simulation results shown in Table 1, Table 2, Table 3, Table 4, Table 5 and Fig. 4, Fig. 5, Fig. 6, Fig. 7, Fig. 8, Fig. 9, Fig. 10, the concealed results of the proposed scheme are better than those of the four existing approaches used for comparison. (2) Based on the simulation results shown in Table 1, Table 2, Table 3, Table 4, Table 5 and Fig. 4, Fig. 5, Fig. 6, Fig. 7, Fig. 8, Fig. 9


References (32)

  • S.W. Lin et al., An error resilient coding scheme for H.26L video transmission based on data embedding, Journal of Visual Communication and Image Representation (2004).
  • D. Wu et al., Transporting real-time video over the Internet: challenges and approaches, Proceedings of the IEEE (2000).
  • Y. Wang et al., Error control and concealment for video communication: a review, Proceedings of the IEEE (1998).
  • Y. Wang et al., Error resilient video coding techniques, IEEE Signal Processing Magazine (2000).
  • ITU-T. Recommendation H.263: video coding for low bit rate communication....
  • ITU-T/SG16 Video Coding Experts Group. Video codec test model near-term, version 11 (TMN11). Document Q15-G16,...
  • S. Wenger et al., Error resilience support in H.263+, IEEE Transactions on Circuits and Systems for Video Technology (1998).
  • D.W. Redmill et al., The EREC: an error-resilient technique for coding variable-length blocks of data, IEEE Transactions on Image Processing (1996).
  • H.S. Jung et al., A hierarchical synchronization technique based on the EREC for robust transmission of H.263 bit stream, IEEE Transactions on Circuits and Systems for Video Technology (2000).
  • G. Cote et al., Optimal mode selection and synchronization for robust video communications over error-prone networks, IEEE Journal on Selected Areas in Communications (2000).
  • P. Frossard et al., AMISP: a complete content-based MPEG-2 error-resilient scheme, IEEE Transactions on Circuits and Systems for Video Technology (2001).
  • M. Gallant et al., Rate-distortion optimized layered coding with unequal error protection for robust Internet video, IEEE Transactions on Circuits and Systems for Video Technology (2001).
  • J. Vass et al., Scalable, error-resilient, and high-performance video communications in mobile wireless environments, IEEE Transactions on Circuits and Systems for Video Technology (2001).
  • Y. Wang et al., Error-resilient video coding using multiple description motion compensation, IEEE Transactions on Circuits and Systems for Video Technology (2002).
  • Y. Wang et al., An improvement to multiple description transform coding, IEEE Transactions on Signal Processing (2002).
  • S. Shirani et al., Reconstruction of baseline JPEG coded images in error prone environments, IEEE Transactions on Image Processing (2000).

Li-Wei Kang was born in Taipei, Taiwan, Republic of China on December 26, 1974. He received the B.S. and M.S. degrees in computer science and information engineering in 1997 and 1999, respectively, all from National Chung Cheng University, Chiayi, Taiwan. Since September 1999, he has been working toward his Ph.D. degree in computer science and information engineering at National Chung Cheng University, Chiayi, Taiwan. His current research interests include image/video processing, image/video communication, and pattern recognition.

Jin-Jang Leou was born in Chiayi, Taiwan, Republic of China on October 25, 1956. He received the B.S. degree in communication engineering in 1979, the M.S. degree in communication engineering in 1981, and the Ph.D. degree in electronics in 1989, all from National Chiao Tung University, Hsinchu, Taiwan.

From 1981 to 1983, he served in the Chinese Army as a Communication Officer. From 1983 to 1984, he was at National Chiao Tung University as a lecturer. Since August 1989, he has been on the faculty of the Department of Computer Science and Information Engineering at National Chung Cheng University, Chiayi, Taiwan. His current research interests include image/video processing, image/video communication, pattern recognition, and computer vision.

This work was supported in part by the National Science Council and the Ministry of Economic Affairs, Republic of China, under Grants NSC 92-2213-E-194-038 and 93-EC-17-A-02-S1-032.
