DCT based reversible data embedding for MPEG-4 video using HVS characteristics
Introduction
Nowadays, data embedding into multimedia content such as images, audio, and video is gaining importance because of its wide range of applications in watermarking and steganography (Cox et al., 2008). A data embedding scheme alters the cover content to embed the data. The data to be embedded can be secret information, the identity of the content owner, information about the cover content, etc., depending on the application for which the embedding scheme is designed. Because digital images and video involve huge amounts of data, compression standards such as JPEG, MPEG, JVT, and H.264 are needed for efficient storage and transmission (Salomon, 2007; Furht, 1995). Since multimedia data is so closely tied to compression techniques, data embedding in the compressed domain has recently become an active area of research. Although data embedding can be performed in both the uncompressed and compressed domains, embedding in the compressed domain, in association with the compression technique, is demanding and has its own advantages (Lin, 2005).
The DCT (Discrete Cosine Transform) is the most widely used transform for mapping multimedia data to the frequency domain in compression standards such as JPEG, MPEG, JVT, and ITU's H.261 and H.263. Embedding the data into the quantized DCT coefficients is the most common practice (Lin and Chen, 2000; Hsu and Wu, 1998). The motivation behind this practice is to achieve robustness, real-time data embedding in association with the compression technique, bit rate control, etc.
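To make the notion of "quantized DCT coefficients" concrete, the following minimal Python sketch computes the orthonormal 8 × 8 two-dimensional DCT of a block and quantizes the result. The function names `dct2` and `quantize` are illustrative, and the single uniform step size is a simplifying assumption; it is not the quantizer of any particular standard, nor the embedding procedure of the proposed schemes.

```python
import math

N = 8  # block size used by DCT-based codecs such as JPEG and MPEG


def dct2(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of rows."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantize(coeffs, step=16):
    """Uniform quantization with one step size (an assumption for illustration)."""
    return [[int(round(c / step)) for c in row] for row in coeffs]


# A perfectly flat block has energy only in the DC coefficient.
flat = [[128] * N for _ in range(N)]
q = quantize(dct2(flat))
```

For the flat block above, only the top-left (DC) entry of `q` is non-zero. Smooth regions of a frame behave similarly, which is why changes to their quantized coefficients tend to be more noticeable.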
In general, data embedding can be done in an irreversible or a reversible manner, in both the uncompressed and compressed domains. In irreversible data embedding, the cover content cannot be restored to its original form. However, some applications, such as military communication, medical imaging, remote sensing, fine arts, and multimedia archive management, require the restoration of the original content after the extraction of the embedded data. Schemes that restore the cover content after extraction of the embedded data are referred to as reversible data embedding schemes (Tian, 2003; Fridrich et al., 2002; Celik et al., 2002; Jessica et al., 2001; Fridrich and Du, 2002; Zeng et al., 2011; Gujjunoori and Amberker, 2012a).
Generally, to achieve reversibility, the cover content is subjected to many modifications or transformations while embedding the data. Apart from the regular modifications caused by embedding the bits, the process of achieving reversibility results in additional modifications to the original content. These modifications cause visual degradation of the cover content. While maintaining the requirement of reversibility, one needs to strike a trade-off between visual quality and embedding capacity (Cox et al., 2008). Hence, achieving a better embedding capacity is a challenge when the data needs to be embedded in a reversible manner. Some works make use of the characteristics of the Human Visual System (HVS) to achieve a better embedding capacity (Awrangjeb and Kankanhalli, 2004; Jung et al., 2011). Other works make use of the characteristics of the DCT. For instance, the middle frequency coefficients of a DCT block are widely used for embedding (Lin et al., 2008; Lin and Shiu, 2009; Chang et al., 2007). Although the schemes that embed data in the middle frequency coefficients achieve better visual quality in terms of PSNR, MSE, etc. than other schemes in the frequency domain, they retain visual artifacts such as white-noise-like distortions and blocking artifacts. These artifacts become more pronounced in reversible data embedding schemes, which involve more modifications to the cover content (Chang et al., 2007).
In most of the works in the literature on applications of watermarking or steganography, objective measures such as PSNR and MSE are widely used to assess the visual quality of the degraded image/video (Wang et al., 2004; Chae and Manjunath, 1999; Ma et al., 2010). Most of these works do not consider measures that better emulate HVS characteristics while embedding the data. Only a few works consider measures such as CPSNR (Yang et al., 2007) and MSSIM or SSIM (Wang et al., 2004; Noore, 2004), which partially take the HVS into consideration while embedding the data. In particular, when the data is to be embedded into the frequency domain using DCT, visual quality measures that better reflect HVS characteristics need to be identified. PSNR is not a good measure for evaluating the visual quality of a degraded image/video (Wang et al., 2004; Avcibas et al., 2002), and this also holds when the data is embedded into the frequency domain using DCT. For instance, PSNR does not capture HVS characteristics when additive noise is introduced by embedding in the quantized DCT blocks of smoother regions. Sometimes a lower PSNR may correspond to better visual quality (Segall et al., 2001; Abboud, 2005). Hence, data embedding schemes that perform efficiently in terms of both embedding capacity and visual quality need to be investigated further.
In this article, we identify suitable visual quality measures, such as PSNR-HVS (Egiazarian et al., 2006) and PSNR-HVS-M (Ponomarenko et al., 2007), for the case where the data is embedded into the frequency domain using DCT, and we propose two reversible data embedding schemes. These measures are also used in the scheme of Muzzarelli et al. (2010) to assess the visual quality of the degraded image; their embedding scheme, however, is based on histogram modification rather than on DCT. We compare the two proposed schemes with the scheme of Chang et al. (2007). Our first scheme, referred to as HVSVIS, aims to improve the visual quality while achieving a capacity close to that obtained by the Chang et al. scheme. This scheme is useful for high fidelity watermarking applications. Our second scheme, referred to as HVSCAP, aims to improve the embedding capacity while maintaining good visual quality. This scheme can therefore be used for steganographic applications that require high capacity. We also prove the reversibility of the proposed schemes. These schemes embed the data into the quantized DCT coefficients of I-frames during the process of compressing the digital video using the MPEG-4 compression technique. Further, robustness is achieved by embedding the data in the compressed domain.
The article is organized as follows. Section 2 briefly reviews MPEG-4 compression and presents in detail our proposed schemes for embedding the data during the process of MPEG-4 compression. Results and discussion are given in Section 3. We conclude the article in Section 4.
Proposed schemes
We propose two data embedding methods aimed at different applications. Our first method, referred to as HVSVIS, aims to improve the visual quality while achieving a capacity close to that obtained by the Chang et al. scheme. This method is useful for high fidelity watermarking applications. Our second method, referred to as HVSCAP, aims to improve the embedding capacity while maintaining good visual quality. Therefore, this method can be used for the steganographic applications
Results and discussion
We use various QCIF videos in our experiments, including MissAm, Akiyo, CarPhone, SalesMan, etc. Some of the test videos are shown in Fig. 5. The frame size of all these test videos is 176 × 144 pixels. We compress these test videos with the standard MPEG-4 encoder. The most widely used measure for evaluating the visual quality of a stego-video (watermarked video) is PSNR (Peak Signal-to-Noise Ratio). The PSNR for each YUV channel of a frame is given by the following equation
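The equation itself is not reproduced in this excerpt. As a hedged illustration, the sketch below implements the standard PSNR definition for a single channel, PSNR = 10 log10(peak² / MSE), assuming 8-bit samples (peak value 255). The function name `psnr` and the list-of-rows frame representation are illustrative choices, not taken from the article.

```python
import math


def psnr(original, distorted, peak=255):
    """PSNR in dB between two equally sized channels given as lists of rows."""
    if len(original) != len(distorted) or any(
        len(a) != len(b) for a, b in zip(original, distorted)
    ):
        raise ValueError("channels must have the same dimensions")

    count = sum(len(row) for row in original)
    squared_error = sum(
        (a - b) ** 2
        for row_o, row_d in zip(original, distorted)
        for a, b in zip(row_o, row_d)
    )
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical channels: PSNR is unbounded
    return 10 * math.log10(peak ** 2 / mse)


# Toy 2 x 2 channel in which every sample is off by one (MSE = 1).
cover = [[100, 100], [100, 100]]
stego = [[101, 101], [101, 101]]
value = psnr(cover, stego)
```

With a mean squared error of 1, `value` is about 48.13 dB. In practice the same function would be applied separately to the Y, U, and V channels of each frame.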
Conclusions
The visual quality of an image/video cannot be captured by PSNR alone when the data is embedded in the frequency domain using DCT. Hence, the combination of the HVS based measures PSNR-HVS and PSNR-HVS-M can provide a good measure of visual quality. Improving the HVS based measures while keeping PSNR within an acceptable range could be a better approach to overall visual quality assessment when the data is to be embedded in the frequency domain using DCT. The first method proposed for reversible data
References (32)
- et al. Reversible hiding in DCT-based compressed images. Information Sciences (2007)
- A survey of multimedia compression techniques and standards. Part I: JPEG standard. Real-time Imaging (1995)
- Reducing of blocking effect in image and video coding by three modes of adaptive filtering or interpolating. Damascus University Journal (2005)
- et al. Statistical evaluation of image quality measures. Journal of Electronic Imaging (2002)
- et al. Lossless watermarking considering the human visual system
- Celik M, Sharma G, Tekalp A, Saber E. Reversible data hiding. In: Proceedings of international conference on image...
- Chae J, Manjunath B. Data hiding in video. In: Proceedings of the 6th IEEE international conference on image processing...
- et al. Digital watermarking and steganography (2008)
- Egiazarian K, Astola J, Ponomarenko N, Lukin V, Battisti F, Carli M. Two new full-reference quality metrics based on...
- Fridrich J, Du R. Lossless authentication of MPEG-2 video. In: ICIP. 2002. p....