Source–channel coding-based watermarking for self-embedding of JPEG images

https://doi.org/10.1016/j.image.2017.12.010

Highlights

  • A watermarking-based self-recovery method applicable to JPEG images.

  • A source–channel coding model is adopted for self-recovery of JPEG-compressed images.

  • Tolerates tampering (content replacement) of up to 17.2% of the image size.

  • Reasonably resistant to noise addition, JPEG compression and recompression.

Abstract

Watermarking-based self-embedding schemes that combat malicious tampering, in which a portion of the image is replaced with fake content, are mostly developed for uncompressed images and are thus vulnerable to image processing attacks such as noise addition and image compression. This drawback makes them impractical, especially in the popular JPEG domain. By modeling the tampering as a source–channel coding problem, this paper proposes a self-embedding scheme for digital images in the JPEG domain. The watermark is a compressed version of the original image, protected against tampering by a suitable channel code. Source coding is applied to recover the lost content, while channel coding helps the watermarked image tolerate tampering. Since the proposed method is designed around the JPEG model, it can be efficiently integrated into the JPEG standard. Experimental results show that the proposed method is robust against attacks such as noise addition and recompression, which existing uncompressed-domain self-embedding schemes are not. Moreover, compared with the major existing JPEG-domain self-embedding schemes, the proposed method achieves better robustness and image quality.

Introduction

Image authentication has been one of the main applications of digital image watermarking since its emergence. Recently, watermarking has been exploited to further protect digital images in “self-recovery” or “self-embedding” schemes, where the embedded watermark not only helps to detect the tampering but also facilitates recovery of as much of the lost content as possible. Normally in such schemes, the watermark consists of an image representation accompanied by check bits. The check bits help the receiver locate the tampered area, and the representation information in the surviving image area helps to recover the lost content where possible. Several approaches have been applied to design the self-recovery watermark. In the pioneering work of Fridrich, both the quantized discrete cosine transform (DCT) coefficients and a low-bit-depth version of the original image are used as candidates for reference generation [1]. Many other innovative image representations have been proposed for self-recovery systems, such as [[2], [3], [4]]. It should be noted that these schemes are designed for self-recovery of uncompressed images and are not applicable to the compressed domain.

In self-recovery schemes, tampering means replacing a portion of the digital image with fake content. Therefore, to perform self-recovery, the tampered regions must first be detected. This is normally done by means of hash (check) bits generated from the most significant bits (MSB) of each block and embedded into the least significant bits (LSB) of that block. However, even a limited image-wide attack such as noise addition affects the LSB of the image and invalidates the authentication, and hence the whole self-recovery process. JPEG compression, arguably the most important image processing modification, affects the LSB information too. Therefore, the majority of existing algorithms cannot withstand JPEG compression, let alone recompression. Aside from JPEG compression, they are vulnerable even in the uncompressed image domain, as the attacker can accompany the tampering (content replacement) with image-wide limited-power additive noise to break the self-recovery. We assume the malicious attacker performs only limited-power attacks, since the aim is to destroy the watermark, not the normal appearance of the image. Although the majority of self-recovery research is in the uncompressed domain, some algorithms have been proposed that exhibit robustness in, or applicability to, the JPEG domain [[5], [6], [7], [8], [9], [10]]. In [11], a significant self-recovery scheme for the JPEG domain, reference bit generation based on DCT coefficients, fountain coding and hashing are applied to generate self-embedding images in the JPEG domain. The scheme in [11] is designed for the scenario depicted in Fig. 1, from which it can be inferred that a combination of malicious tampering and recompression to a higher JPEG level is anticipated.

In this paper, our proposed solution for self-recovery is based on modeling the tampering as a source–channel coding problem. Such a model was first introduced in the method proposed for uncompressed images in [[12], [13]], where the watermark consists of three parts: source coding bits, channel coding parity bits and check bits. The role of the check bits is the same as discussed above. The source coding bits are derived by applying an image compression algorithm to the entire original image. A channel coding algorithm then appends channel coding parity bits to the source coding bits. Tampering acts as an erasure channel and destroys some of the channel-coded bits. As long as the tampering ratio is below the level tolerable by the applied channel code, channel decoding succeeds, and the resulting decoded compressed image is used to replace the content of the tampered region, which is detected thanks to the check bits. Although the results reported in [12] are impressive, it is clear that, because the watermark is placed in the LSB of the image, the method cannot withstand even the simplest image modifications such as noise addition and image compression.
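The erasure-channel view above can be illustrated with a small calculation. The sketch below is not taken from [12]; it assumes an idealized erasure code (decoding succeeds whenever the number of surviving coded bits is at least the number of source bits) and assumes the watermark bits are spread uniformly over the image, so that the tampered fraction of the image equals the erased fraction of the codeword. The function names and the bit counts in the usage note are hypothetical.

```python
def max_tolerable_tampering(n_source_bits, n_parity_bits):
    """Largest erased fraction an idealized erasure code can tolerate.

    Assumption: an ideal code recovers the source bits as long as the
    surviving coded bits are at least as many as the source bits.
    """
    total = n_source_bits + n_parity_bits
    return n_parity_bits / total


def can_recover(n_source_bits, n_parity_bits, tampering_ratio):
    """True if the idealized code still decodes after tampering.

    Assumption: watermark bits are spread uniformly over the image, so
    tampering_ratio is also the fraction of coded bits that is erased.
    """
    total = n_source_bits + n_parity_bits
    surviving = total * (1.0 - tampering_ratio)
    return surviving >= n_source_bits
```

For instance, with 1000 source bits and 1000 parity bits (illustrative numbers only), this idealized model tolerates erasure of up to half of the coded bits. Practical codes fall short of this bound.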

A similar idea is applied in this paper to generate self-embedding images in the JPEG domain. To this end, we have to address the complications of the compressed domain, among which the embedding capacity is the most challenging. Since the LSB information is lost in JPEG compression, the watermark must be embedded into the robust parts of the host image, which offers much less capacity than LSB replacement. Thus, the size of the watermark is limited. To achieve the best performance, our proposed method uses this limited capacity as efficiently as possible through several novel measures, the most significant of which are eliminating the check bits and applying error correction codes. In the absence of check bits to authenticate the blocks of the received image, a rough yet efficient approximation of the tampered region is obtained by comparing the received and recovered versions of the image.

Although promising applications of channel erasure codes in self-recovery have been reported recently [[4], [11], [12]], few schemes use error correction coding. In [14], Reed–Solomon (RS)-based parity bytes are generated from image rows and columns separately. Since this information replaces the LSB, it cannot withstand JPEG compression. RS error correction is also applied in [10] for recovery. In that method, content recovery is possible as long as the size of the tampered area does not exceed 3% of the entire image. The authors of [7] and [8] used BCH error correction coding to protect the image representation extracted from the integer wavelet transform coefficients.

In the proposed scheme, the image is first source-coded using an image compression technique. The source-coded bits then undergo a channel coding process to form the watermark that consists of source coding bits and channel coding parity bits. This bit-stream is then embedded into the JPEG image with a subtle modification to the quantization step of the JPEG algorithm. In this way, the embedding and recovery algorithms can be efficiently integrated into the JPEG compression standard.
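The text above says only that the bit-stream is embedded through a subtle modification of the JPEG quantization step; the exact rule is not given here. As an illustration of how a bit can be carried by a quantized DCT coefficient, the sketch below uses a generic parity-based quantization rule (a QIM-style technique swapped in for illustration, not confirmed to be the authors' rule). The function names are hypothetical.

```python
import math


def quantize_with_bit(coeff, step, bit):
    """Quantize a DCT coefficient so that the parity of its level equals `bit`.

    Illustrative parity rule only; not the embedding rule of the paper.
    """
    scaled = coeff / step
    level = math.floor(scaled + 0.5)  # ordinary round-to-nearest quantization
    if level % 2 != bit:
        # Move to the neighbouring level on the side where the coefficient lies,
        # which keeps the added distortion below one quantization step.
        level += 1 if scaled >= level else -1
    return level


def extract_bit(dequantized_coeff, step):
    """Read the embedded bit back from a dequantized coefficient."""
    return math.floor(dequantized_coeff / step + 0.5) % 2
```

With a step of 10, the coefficient 37.0 is quantized to level 4 when carrying bit 0 and to level 3 when carrying bit 1. The sketch says nothing about robustness to recompression with a different quantization step, which the actual scheme has to handle.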

At the receiver, the tampered watermark bit-stream is recovered at the output of the inverse quantization phase. Channel decoding and then source decoding are applied to the watermark bit-stream, resulting in the compressed image. The compressed image is compared to the received image to detect the tampered area. If the tampering rate exceeds the level tolerable by the applied channel code, the compressed image will consist of random patterns only. Otherwise, a meaningful recovered compressed image indicates that channel decoding succeeded. In this case, the compressed image replaces the received one in the tampered area detected in the comparison phase to deliver the recovered image. Experimental results demonstrate the robustness of the proposed method against attacks such as image recompression and noise addition accompanied by image tampering (content replacement), which is the shortcoming of existing uncompressed-domain self-embedding schemes.
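The comparison and replacement steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' detector: the block size of 8, the mean-absolute-difference criterion and the threshold of 20 gray levels are all hypothetical choices, and images are represented as plain lists of rows of gray values.

```python
def detect_tampered_blocks(received, recovered, block=8, threshold=20.0):
    """Return the set of (block_row, block_col) indices judged as tampered.

    Assumption: a block is flagged when the mean absolute difference between
    the received image and the decoded compressed image exceeds `threshold`.
    """
    height, width = len(received), len(received[0])
    tampered = set()
    for top in range(0, height, block):
        for left in range(0, width, block):
            total, count = 0, 0
            for y in range(top, min(top + block, height)):
                for x in range(left, min(left + block, width)):
                    total += abs(received[y][x] - recovered[y][x])
                    count += 1
            if total / count > threshold:
                tampered.add((top // block, left // block))
    return tampered


def replace_tampered(received, recovered, tampered, block=8):
    """Copy the decoded compressed content into the flagged blocks only."""
    height, width = len(received), len(received[0])
    output = [row[:] for row in received]
    for block_row, block_col in tampered:
        for y in range(block_row * block, min((block_row + 1) * block, height)):
            for x in range(block_col * block, min((block_col + 1) * block, width)):
                output[y][x] = recovered[y][x]
    return output
```

A threshold is needed because the decoded image is a low-rate compressed version and differs slightly from the received image even in untampered blocks.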

The rest of this paper is organized as follows. Section 2 discusses the challenges and requirements of compressed-domain self-recovery. The proposed method is introduced in general terms in Section 3, while its components are discussed in more detail in Section 4. An example of system design and parameter selection is given in Section 5. Experimental results are presented in Section 6, and finally, Section 7 concludes the paper.

Section snippets

Requirements of JPEG domain self-recovery

Our work is based on the source–channel coding model applied in [12]. However, [12] was designed for uncompressed images, where the task is much easier than in the JPEG domain, in which certain limitations impose complexities that demand a more careful design. We now briefly review JPEG compression and then discuss the limitations of self-recovery in the JPEG domain.

JPEG encoding and decoding are shown in Fig. 2. The image is decomposed into 8 × 8 blocks that undergo DCT after

The proposed embedding and recovery scheme

The block diagrams of watermark embedding and self-recovery are given in Fig. 4. Comparing Fig. 4 with Fig. 2 reveals that the proposed method can be integrated into the JPEG standard. The watermark embedding procedure can be summarized in the steps below:

  1. Source coding of the original image is performed at a rate of ns bits per pixel (bpp). The total number of source coding bits derived from the entire image is Ns = NP × ns, where NP is the number of image pixels.

  2. Error correction channel coding
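The bit budget in step 1, Ns = NP × ns, can be checked with a short calculation. The sketch below is only an illustration: the function name is hypothetical, and rounding the product down to a whole number of bits is an assumption, since the text does not say how a fractional result is handled.

```python
import math


def source_bit_budget(width, height, ns_bpp):
    """Return (NP, Ns) for an image of the given size and source rate.

    NP is the number of pixels and Ns = NP * ns the number of source coding
    bits. Assumption: a fractional bit count is rounded down.
    """
    n_pixels = width * height
    n_source_bits = math.floor(n_pixels * ns_bpp)
    return n_pixels, n_source_bits
```

For a 512 × 512 image at ns = 1 bpp, this gives NP = 262,144 pixels and Ns = 262,144 bits. Lower rates scale Ns proportionally.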

Source coding: SPIHT compression

Similar to [12], the set partitioning in hierarchical trees (SPIHT) algorithm [16] is adopted for source coding. As mentioned in Section 2, the watermark capacity, and thus the SPIHT bit-rate, is much less than one bpp. Therefore, although a compression rate of one bpp (ns=1) was applied in [12], here we apply much lower rates. For example, in the application discussed in Section 5, SPIHT is adopted at a compression rate of ns=0.1236 bpp, yielding a compressed bit-stream of length bs=32,

Parameter selection and algorithm design

The JPEG quality factor (QF) applied for embedding is denoted by QE, while QR denotes the one applied by the attacker for recompression. Assume that the 8-bit 512 × 512 gray-scale Lena image, compressed with an embedding QF of QE=75, is to be protected against possible tampering. As mentioned, the majority of DCT coefficients are zero, suggesting nw=0.25 as a proper choice, that is, 25% of the DCT coefficients of every block are used for embedding. It was mentioned in

Experimental results

The performance of the proposed method is examined through experiments on 10,000 images of the BOWS2 database [15]. The following subsections analyze the quality of the compressed and watermarked images and the performance against tampering, noise and recompression, respectively. The relation between conflicting parameters is inspected next. Finally, a comparison with the most significant works in JPEG-domain self-recovery is presented. Keep in mind that for all experiments, the tolerable BER is 0.086. Thus, if the

Conclusion

A self-recovery solution that generates tamper-proof images robust against tampering, noise addition and recompression in the JPEG domain was proposed in this paper. Since compression reduces image redundancy, the capacity available for watermark embedding, and thus the tolerable tampering rate or the quality of the recovered content, is lower than for methods designed for uncompressed-domain images. The SPIHT and LDPC coding algorithms were used as source and channel coding, respectively.
