DCT based reversible data embedding for MPEG-4 video using HVS characteristics

https://doi.org/10.1016/j.istr.2013.01.002

Abstract

Recently, many data embedding schemes that use quantized DCT coefficients have been proposed to achieve robustness. However, most of these schemes fail to strike a trade-off between embedding capacity and visual quality. Achieving higher embedding capacity while maintaining visual quality has become a challenging task. Most DCT based data embedding schemes introduce visual distortions because they do not consider HVS characteristics while embedding. The widely used visual quality measure PSNR is not sufficient to assess the quality of distorted image/video content. However, the HVS based visual quality metrics PSNR-HVS and PSNR-HVS-M are well suited when the data is embedded in the frequency domain using DCT. We propose two reversible data embedding schemes which embed the data during the MPEG-4 compression of video. The first scheme achieves good visual quality in terms of HVS based metrics, which can be useful for high fidelity watermarking applications, and the second scheme achieves higher embedding capacity while maintaining better visual quality, which can be useful for steganographic applications.

Introduction

Nowadays, data embedding into multimedia data such as image, audio, video, etc. is an emerging field due to its many applications in watermarking and steganography (Cox et al., 2008). A data embedding scheme alters the cover content (image, audio, video, etc.) in order to embed the data. The data to be embedded can be secret information, the identity of the content owner, information about the cover content, etc., depending on the application for which the embedding scheme is designed. With the digital representation of image, video, etc., the huge amount of associated data demands the use of compression standards such as JPEG, MPEG, JVT, H.264, etc. for efficient storage and transmission (Salomon, 2007; Furht, 1995). Because multimedia data is now almost always handled in compressed form, data embedding in the compressed domain has recently become an active area of research. Though data embedding can be performed in both the uncompressed and compressed domains, embedding data in the compressed domain in association with the compression technique is demanding and has its own advantages (Lin, 2005).

The DCT (Discrete Cosine Transform) is the most widely used transform for mapping multimedia data to the frequency domain in compression standards such as JPEG, MPEG, JVT, ITU's H.261 and H.263, etc. Embedding the data into the quantized DCT coefficients is the most common practice (Lin and Chen, 2000; Hsu and Wu, 1998). The motivation behind this practice is to achieve robustness, real time implementation of data embedding in association with the compression technique, bit rate control, etc.
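To make the notion of "quantized DCT coefficients" concrete, the following is a minimal sketch of an orthonormal 8 × 8 DCT followed by quantization. It is only an illustration: the function names `dct_2d` and `quantize` are hypothetical, and the single uniform step size is an assumption made for brevity (real MPEG-4 encoders use quantization matrices and a quantizer parameter). It is not the embedding scheme proposed in this article.

```python
import math

N = 8  # block size used by JPEG/MPEG style codecs


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of lists."""
    def alpha(k):
        # Normalization factor that makes the transform orthonormal.
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out


def quantize(coeffs, step=16):
    """Uniform quantization (assumption: one step size for all coefficients)."""
    return [[round(c / step) for c in row] for row in coeffs]


# A perfectly flat block has only a DC component after the transform.
flat_block = [[128] * N for _ in range(N)]
quantized = quantize(dct_2d(flat_block))
```

For a smooth block such as `flat_block`, almost all quantized coefficients are zero, which is why any change made to them by an embedding scheme can become visible in smooth regions.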

In general, data embedding can be done in an irreversible or a reversible manner, in both the uncompressed and compressed domains. In irreversible data embedding, the cover content cannot be restored to its original form. However, some applications such as military communication, medical imaging, remote sensing, fine arts, multimedia archive management, etc. require the restoration of the original content after the extraction of the embedded data. The schemes which restore the cover content after extraction of the embedded data are referred to as reversible data embedding schemes (Tian, 2003; Fridrich et al., 2002; Celik et al., 2002; Jessica et al., 2001; Fridrich and Du, 2002; Zeng et al., 2011; Gujjunoori and Amberker, 2012a).

Generally, to achieve reversibility the cover content is subjected to many modifications or transformations while embedding the data. Apart from the regular modifications due to embedding the bits, the process of achieving reversibility results in additional modifications to the original content. These modifications result in visual degradation of the cover content. While meeting the requirement of reversibility, one needs to strike a trade-off between visual quality and embedding capacity (Cox et al., 2008). Hence, achieving a better embedding capacity is a challenge when the data needs to be embedded in a reversible manner. Some works make use of the characteristics of the Human Visual System (HVS) to achieve a better embedding capacity (Awrangjeb and Kankanhalli, 2004; Jung et al., 2011). Other works make use of characteristics of the DCT. For instance, the middle frequency coefficients of a DCT block are widely used for embedding (Lin et al., 2008; Lin and Shiu, 2009; Chang et al., 2007). Though the schemes which use middle frequency coefficients for embedding achieve better visual quality in terms of PSNR, MSE, etc. than other schemes in the frequency domain, they still exhibit visual artifacts such as white noise like distortions, blocking artifacts, etc. These artifacts become more pronounced in reversible data embedding schemes, as these involve more modifications to the cover content (Chang et al., 2007).

In most of the works in the literature on applications of watermarking or steganography, objective measures such as PSNR, MSE, etc. are widely used to assess the visual quality of the degraded image/video (Wang et al., 2004; Chae and Manjunath, 1999; Ma et al., 2010). None of these works considers measures that closely emulate HVS characteristics while embedding the data. However, a few works consider measures such as CPSNR (Yang et al., 2007), MSSIM or SSIM (Wang et al., 2004; Noore, 2004), etc., which partially take the HVS into consideration while embedding the data. In particular, when the data is to be embedded into the frequency domain using DCT, visual quality measures which better reflect HVS characteristics need to be identified. The PSNR is not a good measure for evaluating the visual quality of the degraded image/video (Wang et al., 2004; Avcibas et al., 2002), and this also holds when the data is embedded into the frequency domain using DCT. For instance, PSNR does not capture HVS characteristics when additive noise is introduced by embedding in the quantized DCT blocks of smoother regions. Sometimes a lower PSNR may correspond to better visual quality (Segall et al., 2001; Abboud, 2005). Hence, data embedding schemes which perform efficiently in terms of embedding capacity and visual quality need to be investigated further.

In this article, we identify suitable visual quality measures such as PSNR-HVS (Egiazarian et al., 2006) and PSNR-HVS-M (Ponomarenko et al., 2007) for the case when the data is embedded into the frequency domain using DCT, and we propose two reversible data embedding schemes. These measures are also used in the scheme of Muzzarelli et al. (2010) to assess the visual quality of the degraded image. Their embedding scheme is based on histogram modification rather than on DCT. We compare the two proposed schemes with the scheme of Chang et al. (2007). Our first scheme, referred to as HVSVIS, is aimed at improving the visual quality while achieving a capacity close to that obtained by the scheme of Chang et al. This scheme is useful for high fidelity watermarking applications. Our second scheme, referred to as HVSCAP, is aimed at improving the embedding capacity while maintaining good visual quality. Therefore, this scheme can be used for steganographic applications which require high capacity. We also prove the reversibility of the proposed schemes. These schemes embed the data into the quantized DCT coefficients of I-frames during the process of compressing the digital video using the MPEG-4 compression technique. Further, robustness is achieved by embedding data in the compressed domain.

The article is organized as follows. Section 2 briefly reviews MPEG-4 compression and presents in detail our proposed schemes for embedding the data during the process of MPEG-4 compression. Results and discussion are given in Section 3. We conclude the article in Section 4.

Section snippets

Proposed schemes

We propose two data embedding methods aimed at different applications. Our first method, referred to as HVSVIS, is aimed at improving the visual quality while achieving a capacity close to that obtained by the scheme of Chang et al. This method is useful for high fidelity watermarking applications. Our second method, referred to as HVSCAP, is aimed at improving the embedding capacity while maintaining good visual quality. Therefore, this method can be used for the steganographic applications

Results and discussion

We use various QCIF formatted videos in our experiment, including MissAm, Akiyo, CarPhone, SalesMan, etc. Some of the test videos are shown in Fig. 5. The frame size of all these test videos is 176 × 144 pixels. We compress these test videos by the standard MPEG-4 encoder. The widely used measurement for evaluating the visual quality of a stego-video (watermarked video) is PSNR (Peak Signal to Noise Ratio). The PSNR for each YUV channel of a frame is given by the following equation: PSNR = 10 log10
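As a reference point for the PSNR measure mentioned above, here is a minimal sketch of the standard textbook definition, PSNR = 10 log10(peak² / MSE), applied to one channel of a frame. The function names `mse` and `psnr`, the flattened-list input format, and the peak value of 255 (8-bit samples) are assumptions for illustration; the exact form used in the article is not recoverable from the text above.

```python
import math


def mse(original, distorted):
    """Mean squared error between two equally sized, flattened channels."""
    if len(original) != len(distorted) or not original:
        raise ValueError("channels must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)


def psnr(original, distorted, peak=255):
    """PSNR in dB for one channel; peak=255 assumes 8-bit samples."""
    err = mse(original, distorted)
    if err == 0:
        return math.inf  # identical channels: no distortion
    return 10 * math.log10(peak ** 2 / err)


# Usage: compare the Y channel of an original and a stego frame (toy data).
y_original = [100, 100, 100, 100]
y_stego = [101, 99, 100, 100]
y_psnr = psnr(y_original, y_stego)
```

Because this measure depends only on the pixel-wise error, two distortions with the same MSE get the same PSNR regardless of where in the frame they occur, which is the limitation that motivates HVS based metrics.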

Conclusions

The visual quality of an image/video cannot be captured by PSNR alone when the data is embedded in the frequency domain using DCT. Hence, the combination of the HVS based measures PSNR-HVS and PSNR-HVS-M can provide a good measure of visual quality. Improving the HVS based measures while maintaining an acceptable range of PSNR could be a better approach to overall visual quality assessment when the data is to be embedded in the frequency domain using DCT. The first method proposed for reversible data

References (32)

  • C.C. Chang et al. Reversible hiding in DCT-based compressed images. Information Sciences (2007)
  • B. Furht. A survey of multimedia compression techniques and standards. Part I: JPEG standard. Real-time Imaging (1995)
  • I. Abboud. Reducing of blocking effect in image and video coding by three modes of adaptive filtering or interpolating. Damascus University Journal (2005)
  • I. Avcibas et al. Statistical evaluation of image quality measures. Journal of Electronic Imaging (2002)
  • M. Awrangjeb et al. Lossless watermarking considering the human visual system
  • Celik M, Sharma G, Tekalp A, Saber E. Reversible data hiding. In: Proceedings of international conference on image...
  • Chae J, Manjunath B. Data hiding in video. In: Proceedings of the 6th IEEE international conference on image processing...
  • I. Cox et al. Digital watermarking and steganography (2008)
  • Egiazarian K, Astola J, Ponomarenko N, Lukin V, Battisti F, Carli M. Two new full-reference quality metrics based on...
  • Fridrich J, Du R. Lossless authentication of MPEG-2 video. In: ICIP. 2002. p....
  • Fridrich J, Goljan M, Du R. Lossless data hiding for all image formats. SPIE, Electronic imaging 2002, security and...
  • Gujjunoori S, Amberker BB. A DCT based reversible data hiding scheme for MPEG-4 video. In: Proceedings of international...
  • Gujjunoori S, Amberker BB. A DCT based reversible data embedding scheme for MPEG-4 video using HVS characteristics. In:...
  • C.T. Hsu et al. DCT-based watermarking for video. IEEE Transactions on Consumer Electronics (1998)
  • M. Iwata et al. Digital steganography utilizing features of JPEG images. IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer (2004)
  • Jessica F, Miroslav G, Du R. Invertible authentication watermark for JPEG images. In: Proceedings of international...