Information Sciences

Volume 239, 1 August 2013, Pages 253-265

An efficient expanding block algorithm for image copy-move forgery detection

https://doi.org/10.1016/j.ins.2013.03.028

Abstract

Image forgery is becoming more prevalent in our daily lives due to advances in computers and image-editing software. As forgers develop more sophisticated forgeries, researchers must keep pace by designing more advanced ways of detecting them. Copy-move forgery is one type of image forgery in which one region of an image is copied to another region in an attempt to cover a potentially important feature. This paper presents an efficient expanding block algorithm for detecting copy-move forgery and identifying the duplicated regions in an image. Experimental results show that the new method is effective in identifying the size and shape of the duplicated region. Furthermore, it can detect copy-move forgeries in which the copied region has been made slightly lighter or darker, subjected to JPEG compression, or blurred with a Gaussian filter in an attempt to throw off detection algorithms.

Introduction

Image authentication techniques can be classified into two categories: active methods and passive methods. Active methods such as watermarking [10], [21] or illegal image copy detection [8], [13], [17] depend on prior information about the image. However, in many situations, prior information regarding an image is not available and passive, or blind, methods must be used to authenticate the image. In this paper, we focus on one type of passive image forgery detection known as copy-move forgery detection. A copy-move forgery hides a region of the image by covering it with a copy of another region of the same image. Fig. 1 shows an example of a copy-move forgery and the identified forged and copied regions using the cameraman image. When detecting this type of forgery, we are interested not only in whether the image contains a forgery but also in the location and shape of the forged region. However, identifying the forged region can be complicated by the fact that the forged region may be similar to, but not an exact copy of, another region. This can happen through image transformations, such as scaling or JPEG compression, or because the forger put extra effort into hiding the forgery, for example by blurring or slightly rotating the forged region.

Several algorithms to detect copy-move forgery are based on the sliding block method presented by Fridrich et al. [6]. The main idea is that, rather than trying to identify the entire forged region at once, the image is divided into small overlapping blocks. The blocks are compared against each other to determine which ones match. The regions of the image covered by the matching blocks are the copied and forged regions.
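As an illustration of the block-division step, the following Python sketch (not taken from any of the cited papers) extracts every overlapping B × B block of a grayscale image together with its upper-left coordinates; the block size B and the NumPy-based layout are assumptions of the sketch.

    import numpy as np

    def overlapping_blocks(image, B=16):
        """Return all overlapping B x B blocks of a 2-D grayscale image.

        Every pixel position that can anchor a full block yields one block,
        so an H x W image produces (H - B + 1) * (W - B + 1) blocks.
        """
        H, W = image.shape
        blocks, positions = [], []
        for r in range(H - B + 1):
            for c in range(W - B + 1):
                blocks.append(image[r:r + B, c:c + B])
                positions.append((r, c))   # upper-left corner of the block
        return np.array(blocks), positions

For a 512 × 512 image and B = 16, this produces 497 × 497 = 247,009 blocks, which is the quantity Nb referred to below.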

In general, copy-move forgery detection consists of feature extraction, comparison, and a copy decision based on the similarity information [5]. In the feature extraction step, important features are extracted from each block and used to compare the blocks indirectly. A good feature extraction algorithm should extract similar features for two blocks that are approximately the same. Since images can be altered in various ways (scaled, compressed, blurred, etc.), feature extraction methods should be able to capture the important features while ignoring subtle noise. The features are placed into a matrix called the feature matrix.
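To make the feature matrix concrete, the following sketch (illustrative only) applies an arbitrary feature-extraction function to each block and stacks the resulting vectors into one matrix with a row per block; the simple mean/variance feature shown is a placeholder, not the feature used by any particular method discussed here.

    import numpy as np

    def build_feature_matrix(blocks, feature_fn):
        """Apply feature_fn to each B x B block and stack the results.

        blocks     : array of shape (Nb, B, B)
        feature_fn : maps one block to a 1-D feature vector
        Returns the (Nb, d) feature matrix, one row per block.
        """
        return np.vstack([feature_fn(b) for b in blocks])

    def simple_feature(block):
        # Placeholder feature: mean and variance of the gray values (illustration only).
        return np.array([block.mean(), block.var()])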

Popescu and Farid [19] developed a method based on principal component analysis (PCA). PCA is an important image classification tool used in several algorithms such as facial recognition (eigenfaces) [23]. This method constructs a B²-dimensional basis (for blocks with B rows and B columns) from the covariance matrix of the block pixels. Each block can be represented as a vector in the space spanned by the constructed basis [24]. It is assumed that the significant features of a block are contained in the first few dimensions, while noise and other insignificant features lie in the last few dimensions. The significant block features are therefore extracted by projecting the block pixels onto the first few dimensions of the constructed basis. Li et al. [12] decomposed the image using the discrete wavelet transform (DWT) and computed the singular value decomposition (which is related to PCA) of the overlapping blocks of the low-frequency component. A related method by Zimba and Xingming [25] uses a combination of DWT and PCA. Shih and Yuan [20] proposed a simple method using two features, the mean and variance of the gray values within the block, and showed performance comparable to more advanced methods.
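The following sketch illustrates PCA-based block features in the spirit of Popescu and Farid [19]; the number of retained components and other details are assumptions of the sketch, not the parameters used in [19].

    import numpy as np

    def pca_block_features(blocks, n_components=8):
        """Project flattened blocks onto their leading principal components.

        blocks : array of shape (Nb, B, B); each block is flattened to length B*B.
        The significant content of a block is assumed to lie in the first few
        principal directions, while noise falls into the trailing ones.
        """
        X = blocks.reshape(len(blocks), -1).astype(float)
        X -= X.mean(axis=0)                           # center the flattened blocks
        cov = np.cov(X, rowvar=False)                 # (B*B) x (B*B) covariance matrix
        eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
        basis = eigvecs[:, ::-1][:, :n_components]    # leading components first
        return X @ basis                              # (Nb, n_components) feature matrix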

Fridrich et al. [6] considered a method based on the discrete cosine transform (DCT); DCT has also been used in more recent methods [3], [9]. DCT is frequently used in the compression of multimedia, such as images (JPEG) and music (MP3). With DCT, as with PCA, the most significant features are assumed to be captured in a few coefficients [11]. Luo et al. [14] extracted features from color images. Mahdian and Saic [15] extracted features based on blur invariants, which allows copy-move forgery to be identified under blur degradation, additive Gaussian noise, and contrast changes. Amerini et al. [1] developed a SIFT-based method for copy-move attack detection and transformation recovery. Pan and Lyu [18] estimated the transform between matched SIFT keypoints and searched all pixels within the duplicated regions after discounting the estimated transforms.
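As a rough illustration of DCT-based features along the lines of Fridrich et al. [6], the sketch below keeps only the low-frequency corner of a block's 2-D DCT; the quantization and coefficient selection in [6] differ, so the keep parameter here is purely an assumption.

    import numpy as np
    from scipy.fft import dctn

    def dct_block_features(block, keep=4):
        """Return the low-frequency 2-D DCT coefficients of one B x B block.

        The DCT concentrates most of a block's energy in its upper-left
        (low-frequency) corner, so a keep x keep corner gives a compact,
        noise-tolerant description of the block.
        """
        coeffs = dctn(block.astype(float), norm='ortho')
        return coeffs[:keep, :keep].ravel()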

Once features have been extracted from each block, they must be compared with one another. The issue is how to quantify the similarity between blocks. The most obvious approach is an exhaustive search in which every block is compared against every other block. However, an exhaustive search is slow, requiring Nb(Nb − 1)/2 comparisons, where Nb is the total number of blocks and may be quite large.
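The cost of the naive strategy is easy to see in a brute-force sketch such as the one below (illustrative only), which performs exactly Nb(Nb − 1)/2 distance computations; with Nb around 250,000 blocks that is on the order of 3 × 10^10 comparisons.

    import numpy as np

    def exhaustive_matches(features, threshold=1.0):
        """Compare every pair of feature vectors: Nb(Nb - 1)/2 comparisons.

        features : (Nb, d) feature matrix, one row per block
        Returns the index pairs (i, j) whose feature vectors are within
        `threshold` of each other in Euclidean distance.
        """
        matches = []
        Nb = len(features)
        for i in range(Nb):
            for j in range(i + 1, Nb):
                if np.linalg.norm(features[i] - features[j]) < threshold:
                    matches.append((i, j))
        return matches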

Fridrich et al. [6] considered blocks a match if their features match exactly. Popescu and Farid [19] used a nearest-neighbor approach: the feature matrix is lexicographically sorted so that blocks with similar features end up close to each other in the sorted feature matrix. Specifically, blocks within Nn rows of another block are considered a match. For example, if Nn = 1, then only the blocks one row away in the sorted feature matrix are considered a match. Singh and Raman [22] used the Graphics Processing Unit (GPU) to perform a radix sort of the feature vectors and showed that this can dramatically improve performance. Bayram et al. [2] computed hashes of the vectors in the feature matrix and concluded that two features match exactly if their hashes match. This technique avoids the need to sort the feature matrix.
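A small sketch of the lexicographic-sort strategy described above (the neighborhood size Nn and the distance threshold are placeholders, not the values used in [19]): the rows of the feature matrix are sorted lexicographically, and each row is compared only with the Nn rows that follow it in the sorted order.

    import numpy as np

    def lexsort_matches(features, Nn=1, threshold=1.0):
        """Match blocks via neighbors in a lexicographically sorted feature matrix.

        features : (Nb, d) feature matrix, one row per block
        Nn       : number of following rows in the sorted matrix to compare against
        Returns pairs of original block indices with similar features.
        """
        # np.lexsort treats the last key as primary, so pass the columns in
        # reverse to obtain an ordinary lexicographic ordering of the rows.
        order = np.lexsort(features.T[::-1])
        sorted_feats = features[order]
        matches = []
        for k in range(len(sorted_feats)):
            for step in range(1, Nn + 1):
                if k + step >= len(sorted_feats):
                    break
                if np.linalg.norm(sorted_feats[k] - sorted_feats[k + step]) < threshold:
                    matches.append((order[k], order[k + step]))  # original block indices
        return matches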

Unfortunately, deciding which blocks are duplicated is not as simple as it may seem. Two blocks are often flagged as a match because they are near each other in the image or simply by coincidence. Popescu and Farid [19] used a shift vector to determine whether two blocks are duplicated. Let the upper-left coordinates of the block corresponding to row i of the sorted feature matrix be denoted by (xi, yi). For two matching blocks, they computed the offset (sx, sy) as:

(sx, sy) = (xi − xj, yi − yj)   if xi < xj,
           (xj − xi, yj − yi)   if xi > xj,
           (0, |yi − yj|)       if xi = xj.

Let C(sx, sy) be the number of times that two blocks have the same offset (sx, sy). If a region is duplicated, then every block in that region is likely to have the same offset. If C(sx, sy) is larger than some threshold, say Ns, then it is assumed that the offset is due to a duplicated region. Furthermore, let Nd be a user-defined parameter such that two blocks are not considered a match if the distance between the ith and jth blocks is smaller than Nd.
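The shift-vector test can be sketched as follows (the thresholds Ns and Nd are placeholders; the offset normalization follows the equation above): matched pairs that are too close together are discarded, the remaining offsets are counted, and any offset occurring more than Ns times is reported as evidence of a duplicated region.

    from collections import Counter

    def duplicated_offsets(matches, positions, Ns=100, Nd=16):
        """Count offsets (sx, sy) of matched block pairs and keep frequent ones.

        matches   : list of (i, j) index pairs of matching blocks
        positions : list of (x, y) upper-left coordinates, one per block
        Ns        : minimum count for an offset to indicate a duplicated region
        Nd        : minimum allowed distance between two matched blocks
        """
        counts = Counter()
        for i, j in matches:
            xi, yi = positions[i]
            xj, yj = positions[j]
            # Discard matches between blocks that are too close to each other.
            if (xi - xj) ** 2 + (yi - yj) ** 2 < Nd ** 2:
                continue
            # Normalize so that the pairs (i, j) and (j, i) give the same offset.
            if xi < xj:
                shift = (xi - xj, yi - yj)
            elif xi > xj:
                shift = (xj - xi, yj - yi)
            else:
                shift = (0, abs(yi - yj))
            counts[shift] += 1
        return [s for s, c in counts.items() if c > Ns]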

Section snippets

The proposed expanding block algorithm

Unlike the aforementioned methods, our proposed method primarily uses direct block comparison rather than indirect comparisons based on block features. The proposed method first divides an image into Nb small overlapping blocks just like in the sliding block method. However, our approach to comparing the blocks is different. Many of the blocks are obviously different and do not need to be compared against each other. A dominant feature is computed for each block. Specifically, we use the
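Since the snippet above is truncated, the following sketch only illustrates the general grouping idea it describes; in particular, it assumes the dominant feature is a simple per-block statistic (the average gray value), which is an assumption of this sketch rather than a statement of the paper's exact choice.

    import numpy as np

    def group_blocks_by_dominant_feature(blocks, n_buckets=64):
        """Bucket blocks by a dominant scalar feature (assumed: mean gray value).

        Only blocks whose dominant features fall into the same or neighboring
        buckets need to be compared directly, which prunes the vast majority
        of obviously different block pairs before any pixel-wise comparison.
        """
        dominant = np.array([b.mean() for b in blocks])          # one scalar per block
        lo, hi = dominant.min(), dominant.max()
        width = (hi - lo) / n_buckets or 1.0                     # guard against a flat image
        bucket_ids = np.minimum((dominant - lo) // width, n_buckets - 1).astype(int)
        buckets = [[] for _ in range(n_buckets)]
        for idx, b in enumerate(bucket_ids):
            buckets[b].append(idx)                               # store block indices
        return buckets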

Experimental results

This section is divided into four subsections. The first investigates the effect of changing different parameters in the expanding block algorithm. The second and third subsections compare the algorithm with other existing algorithms. The last subsection evaluates the performance of the modified expanding block algorithm. All measurements are performed on a Lenovo laptop with a 2.1 GHz Intel Pentium processor and 8 GB of RAM.

Conclusions

It has been shown that the expanding block algorithm is an effective method for identifying image copy-move forgery. It is particularly good at identifying the location and shape of the forged regions. It has also been shown that the expanding block algorithm can catch specific types of forgeries, such as those subjected to JPEG compression or Gaussian blurring, or those in which the duplicated region is made lighter or darker. The advantage of the expanding block algorithm is that it can handle block

References (25)

  • J.-H. Hsiao et al., A new approach to image copy detection based on extended feature sets, IEEE Trans. Image Process. (2007)
  • A. Khana et al., Intelligent reversible watermarking and authentication: hiding depth map information for 3D cameras, Inform. Sci. (2012)