Flash scene video coding using weighted prediction

https://doi.org/10.1016/j.jvcir.2011.11.004

Abstract

A novel algorithm for coding flash scenes is proposed. In principle, flash scenes can be detected by analyzing the histogram differences between frames. The proposed algorithm then applies an adaptive coding order technique that increases video coding efficiency by taking the characteristics of flash scenes into account. The adaptive coding order also improves the accuracy of the derived motion vectors used to determine the weighting parameter sets. Experimental results show that a significant improvement in coding performance, in terms of both bitrate and PSNR, is achieved in comparison with conventional weighted prediction algorithms.

Highlights

► We propose a novel algorithm for coding flash scenes.
► We detect flash scenes by analyzing the histogram difference between frames.
► An adaptive coding order technique increases the efficiency of video coding.
► A motion vector derivation technique improves the accuracy of the weighted parameter sets.
► Significant coding gain is achieved over conventional weighted prediction schemes.

Introduction

Changes in illumination caused by a flash being fired during a press conference, a sports match, a news interview, etc., result in huge intensity differences between frames, which can easily mislead motion estimation and compensation in video coding. Weighted prediction (WP) is one of the new tools in H.264 for coding scenes with brightness or illumination variations [1], [2], [3], [4], [5], [6]. With WP, a reference frame f_ref is scaled and shifted by a WP parameter set consisting of a multiplicative weighting factor W and an additive offset O, and the sum of absolute differences (SAD) between the current frame f_c and the reference frame f_ref is defined as

SAD = |f_c - (W × f_ref + O)|

With appropriate W and O, the SAD becomes smaller, so motion estimation (ME) can find a well-matching block in the reference frame, resulting in better coding efficiency. However, WP in H.264 is a frame-based approach that can only efficiently code scenes with global brightness variations, such as fade-in and fade-out [1], [2], [3], [4], [5], [6], [7], [8], but not scenes with local brightness variations, including flashlight (FL) scenes. Firing of flashes in a scene can cause a non-uniform intensity change distributed over the entire picture, so it is very difficult to find an accurate frame-based model that estimates the change of intensity within the picture. Macroblock (MB)-based approaches [9], [10], in which different MBs in the same frame can use different W and O, were proposed to solve the problem of local brightness variations. Unfortunately, these approaches increase computational complexity, since ME must be performed using all possible sets of W and O [9]. The MB-based concept was extended to multi-view video to solve illumination and focus mismatches across views. In [11], a two-pass search algorithm was proposed.
In the first pass, it uses a mean-removed search to compute W and O for each MB candidate in the search window and find the disparity vectors. Based on the disparity vectors, depth levels are determined and new filtered reference frames are generated for a second mean-removed search that finds the best match for each MB. This two-pass algorithm incurs high computational complexity and is only well suited to multi-view video coding. In [12], a scheme based on the human vision system was designed to code FL scenes by interpolating and inpainting non-FL frames as FL frames; however, the objective quality of the FL frames drops considerably. In this paper, an MB-based scheme is specifically designed for coding FL frames.
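As a concrete illustration of the WP matching cost above (a minimal sketch, not code from the paper), the following computes the weighted SAD for a block. The toy values assume the current block was brightened by a flash, so a gain of W = 2 compensates for the mismatch:

```python
import numpy as np

def weighted_sad(cur_block, ref_block, w, o):
    """SAD between a current block and a weighted, offset reference
    block: sum over pixels of |f_c - (W * f_ref + O)|."""
    pred = w * ref_block.astype(np.float64) + o
    return float(np.abs(cur_block.astype(np.float64) - pred).sum())

# Toy example: the reference is a dimmed copy of the current block,
# as when a flash brightens the current frame.
cur = np.full((16, 16), 200.0)
ref = np.full((16, 16), 100.0)

sad_plain = weighted_sad(cur, ref, w=1.0, o=0.0)  # no weighting -> 25600.0
sad_wp = weighted_sad(cur, ref, w=2.0, o=0.0)     # W compensates -> 0.0
```

With the weighted reference, ME would match this block perfectly despite the intensity jump, which is exactly the gain WP offers.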

Section snippets

Proposed coding scheme for flash scenes

The salient characteristic of the flashlight effect is an abrupt luminance change across frames of the same scene within a very short period of time, caused by the sudden appearance of an illumination source. Normally, we assume that a flash scene cannot last more than 0.15–0.2 seconds [13], [14]. In other words, the number of FL frames should be smaller than 5 if the video frame rate is 25 fps. Notice that the FL frames have a much stronger intensity than the previous and the later non-FL
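The histogram-difference detection combined with the short-duration constraint can be sketched as follows. This is an illustrative reconstruction, not the paper's exact detector; the bin count, threshold, and `max_run` values are assumptions:

```python
import numpy as np

def hist_diff(frame_a, frame_b, bins=64):
    """Sum of absolute bin-wise differences between luminance histograms."""
    ha, _ = np.histogram(frame_a, bins=bins, range=(0, 256))
    hb, _ = np.histogram(frame_b, bins=bins, range=(0, 256))
    return int(np.abs(ha - hb).sum())

def detect_flash_frames(frames, threshold, max_run=4):
    """Mark a frame as FL when its histogram differs strongly from the
    most recent non-FL frame. A run of more than max_run changed frames
    (e.g. > 0.2 s at 25 fps) is treated as a scene change, not a flash."""
    n = len(frames)
    flags = [False] * n
    anchor = 0                  # index of the most recent non-FL frame
    i = 1
    while i < n:
        j = i
        while j < n and hist_diff(frames[anchor], frames[j]) > threshold:
            j += 1
        run = j - i
        if run == 0:            # no abrupt change: frame i is non-FL
            anchor = i
            i += 1
        elif run <= max_run:    # short burst of changed frames: flash
            for k in range(i, j):
                flags[k] = True
            i = j               # frame j resembles the anchor again
        else:                   # sustained change: scene cut, not a flash
            anchor = i
            i += 1
    return flags
```

For example, with constant 50-valued frames and two 200-valued frames in the middle, only the two bright frames are flagged, while a permanent brightness change is classified as a scene cut and left unmarked.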

Experimental results

Experiments have been conducted on two 720p video sequences with flash scenes to evaluate the overall efficiency of various weighted prediction algorithms. The standard sequence “Crew” shows NASA crew members leaving a building under flashlight, while “Ballseq” is a self-recorded flashlight sequence of two balls rolling from left to right, as shown in Fig. 3. The test sequences were all encoded at 25 frames/s. In addition, three standard sequences, “Shields”, “Exit” and “Sunflower”, with synthetic

Conclusions

In this work, we have proposed an adaptive coding order technique for coding video with flash scenes, which identifies FL and non-FL frames according to histogram differences and assigns an appropriate coding type to each frame accordingly. Motion vector derivation is then adopted, instead of using the co-located block, in the determination of the WP parameter sets. Experimental results show that the proposed scheme with the adaptive coding order and motion vector derivation techniques achieves
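For illustration, one common way to determine a WP parameter set for a pair of matched blocks (e.g. blocks aligned by a derived motion vector) is a least-squares fit of cur ≈ W × ref + O. This is a generic estimator sketched under that assumption, not the paper's exact derivation:

```python
import numpy as np

def estimate_wp_params(cur_block, ref_block):
    """Least-squares estimate of (W, O) minimizing
    sum (cur - (W * ref + O))^2 over a block:
    W = cov(ref, cur) / var(ref), O = mean(cur) - W * mean(ref)."""
    ref = ref_block.astype(np.float64).ravel()
    cur = cur_block.astype(np.float64).ravel()
    var = ref.var()
    if var == 0.0:
        w = 1.0  # flat reference: fall back to offset-only compensation
    else:
        w = ((ref - ref.mean()) * (cur - cur.mean())).mean() / var
    o = cur.mean() - w * ref.mean()
    return w, o

# Sanity check: a block brightened by gain 1.5 and offset 10
# should recover W ≈ 1.5 and O ≈ 10.
ref = np.arange(256.0).reshape(16, 16)
cur = 1.5 * ref + 10.0
w, o = estimate_wp_params(cur, ref)
```

The better the motion vector aligns the two blocks, the closer this fit tracks the true illumination change, which is why improved vector accuracy feeds directly into better WP parameter sets.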

Acknowledgments

The work described in this paper is partially supported by the Centre for Signal Processing and a grant from the Internal Competitive Research Grant, Department of EIE, PolyU (PolyU G-YJ27).

References (15)

  • ITU-T Recommendation H.264, Advanced Video Coding for Generic Audiovisual Services, Joint Video Team (JVT) of ISO/IEC...
  • J. Boyce, Weighted prediction in the H.264/MPEG AVC video coding standard, in: IEEE Proceedings of International...
  • H. Kato, Y. Nakajima, Weighted factor determination algorithm for H.264/MPEG-4 AVC weighted prediction, in: IEEE 6th...
  • H. Aoki, Y. Miyamoto, An H.264 weighted prediction parameter estimation method for fade effects in video scenes, in:...
  • R. Zhang, G. Cote, Accurate parameter estimation and efficient fade detection for weighted prediction in H.264 video...
  • Joint Video Team (JVT) Reference Software Joint Model (JM) version 15.1 Available from:...
  • S.H. Tsang, Y.L. Chan, W.C. Siu, New weighted prediction architecture for coding scenes with various fading effects...
