A Novel Copy-Move Forgery Detection Algorithm via Feature Label Matching and Hierarchical Segmentation Filtering

doi:10.1016/j.ipm.2021.102783

Information Processing & Management

Volume 59, Issue 1, January 2022, 102783

https://doi.org/10.1016/j.ipm.2021.102783

Highlights

• The improved SIFT structure extracts as many interesting and effective keypoints as possible in the homogeneous region(s), which enables CMFD in homogeneous regions that are simultaneously under large-scaling attacks.
• The proposed FLM algorithm uses a newly designed feature label and overlapping label cluster (OLC) to significantly reduce computation. As a result, matching efficiency is significantly improved.
• The proposed HSF algorithm can effectively filter out suspicious outliers. Furthermore, the fusion of the three hierarchical filtering segmentations can indicate forgery regions precisely.

Abstract

Recently, some copy-move forgeries have made use of homogeneous region(s) in an image with large-scaling attack(s) to highlight or cover the target objects, which is easy to implement but difficult to detect. Unfortunately, existing Copy-Move Forgery Detection (CMFD) methods fail to detect such forgeries because they cannot extract a sufficient number of effective keypoints in the homogeneous region(s), leading to inaccurate and inefficient detection. In this work, a new CMFD scheme is proposed. 1) An improved SIFT structure with inherent scaling invariance is designed to enhance the capability of extracting effective keypoints in the homogeneous region. 2) Extracting a massive number of keypoints in the homogeneous region incurs a heavy computational burden in feature matching (note that this is a common issue in all CMFD methods). For this reason, a new Feature Label Matching (FLM) method is proposed to break down the massive keypoints into different small label groups, each of which contains only a small number of keypoints, for significantly improved matching effectiveness and efficiency. 3) Identifying true keypoints for matching is critical for performance. The Hierarchical Segmentation Filtering (HSF) algorithm is proposed to filter out suspicious outliers based on statistics of the coarse-to-fine hierarchical segmentations. 4) Finally, the fusion of the coarse-to-fine hierarchical segmentation maps fills the forgery regions precisely. In our experiments, the proposed scheme achieves excellent detection performance under various attacks, especially for homogeneous region(s) under large-scaling attack(s). Extensive experimental results demonstrate that the proposed scheme achieves the best F₁ scores and the lowest computational cost in addressing geometrical attacks on the IMD dataset (a comprehensive dataset) and the CMH datasets (most forgery samples under geometric attacks). Compared to existing state-of-the-art methods, the proposed scheme raises F₁ scores by at least 20% and 25% under scaling factors of 50% and 200%, respectively, in the large-scaling sub-datasets of IMD.

Introduction

Copy-move image forgery is one of the popular tampering techniques in digital image forgery, in which copied region(s) of an image are pasted onto the same image to hide or cover the regions of concern [Korus, 2017]. Existing copy-move forgery detection (CMFD) methods are generally divided into classic methods and deep neural network (DNN) models.

Classic CMFD methods are generally categorized into block-based methods [Fridrich et al., 2003, Zhong et al., 2016, Emam, Han, & Niu, 2016, Cozzolino et al., 2015, Pun and Chung, 2018, Ryu et al., 2013, Bi and Pun, 2017, Bi and Pun, 2018] or keypoint-based methods [Zhong and Pun, 2019, Shivakumar and Baboo, 2011, Pun et al., 2015, Li and Zhou, 2019, Silva et al., 2015, Tsai and Leou, 2021], both of which follow four procedures: feature extraction, feature matching, filtering, and post-processing. In the literature, most attacks on CMFD (e.g., JPEG compression, rotation, and noise addition) have been well addressed, except for the homogeneous region(s) with the large-scaling attack(s), in which a scaling factor of at least ±20% is applied to the image region(s) to highlight or cover the target objects in the same image. This kind of attack is easy to implement but difficult to detect because of the homogeneity of the image regions. As an illustration, existing methods in the literature achieve F₁ scores of only 0.75 and 0.65 under scaling attacks of 120% and 80%, respectively, on the popular benchmark dataset IMD [Christlein et al., 2012]. Under severe large-scaling of 50% and 200%, all the block-based and keypoint-based methods achieve an F₁ score of at most 0.37. Even worse, if the detection methods are evaluated only on forgery images with suspicious homogeneous regions, the F₁ score deteriorates further under such scaling attacks, as shown in the experimental comparison in Fig. 11-(a3).
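The F₁ scores quoted above combine precision and recall into a single number. The sketch below computes F₁ at the pixel level from two binary masks. Whether the paper reports pixel-level or image-level F₁ is not stated in this excerpt, so the pixel-level choice, the function name `f1_score`, and the nested-list mask format are all assumptions made for illustration.

```python
def f1_score(predicted, ground_truth):
    """Pixel-level F1 between two binary masks given as nested lists of 0/1.

    Assumption: the masks have the same shape; 1 marks a forged pixel.
    """
    tp = fp = fn = 0
    for pred_row, true_row in zip(predicted, ground_truth):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1          # forged pixel correctly detected
            elif p and not t:
                fp += 1          # authentic pixel wrongly flagged
            elif t and not p:
                fn += 1          # forged pixel missed
    if tp == 0:
        return 0.0               # no correct detections: F1 is defined as 0 here
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical 2x3 masks: one hit, one false alarm, one miss.
detected = [[1, 1, 0], [0, 0, 0]]
truth = [[1, 0, 0], [1, 0, 0]]
print(f1_score(detected, truth))
```

With one true positive, one false positive, and one false negative, precision and recall are both 0.5, so the printed F₁ is 0.5.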

DNN models, e.g., Convolutional Neural Networks (CNN) [Rao and Ni, 2016], end-to-end DNN [Wu et al., 2018], and BusterNet [Wu et al., 2018], have been presented to address copy-move forgery detection. These methods extract deep-level features using CNNs, e.g., VGG16 [Simonyan and Zisserman, 2014] and Inception [Szegedy et al., 2016]. The features of 512 or even deeper layers can represent more inherent information. However, it is impossible to pre-train DNN models on all forgery objects [Rao and Ni, 2016, Wu et al., 2018, Wu et al., 2018, Liu et al., 2018]. For this reason, DNN models cannot address the numerous untrained forgery types encountered in the test stage, owing to the lack of prior training information.

Currently, some DNN methods (e.g., Deep-learning [Rao and Ni, 2016], BusterNet [Wu et al., 2018], semantic reinforcement network [Chen et al., 2021], Multi-Branch (MB) CNNs [Barni et al., 2021], and AR-Net [Zhu et al., 2020]) have been successfully applied to CMFD. For instance, Barni et al. (2021) developed a multi-branch CNN architecture based on BusterNet to learn a set of features that reveal the presence of pasted forgeries and boundary inconsistencies in the copy-move regions. However, this DNN scheme only addresses the foreground and fails to process homogeneous or smooth objects and backgrounds. This detection defect also exists in the other CMFD DNNs. To date, DNN models for copy-move forgery detection are still in their infancy. Therefore, the main trend is still toward the classic methods.

Derivative applications also use techniques related to image copy-move forgery, e.g., the detection of copied program code in 5G-IoT systems. Ullah et al. [Ullah et al., 2021] investigate only whether clone attacks exist in modified versions of program code, based on differences in logic flow from the previous code version. They propose a hybrid approach that uses Control Flow Graph features extracted from the source code to view the logic flow of the code. These features are then given to a purpose-designed RNN model for the efficient classification of code clones. This scheme does not belong to CMFD, but its RNN network is a good reference for CMFD in future work. In addition, Zza et al. [Zza et al., 2021] proposed a novel residual visualization scheme over the residual image to reveal the correlation between the original and test images. However, this scheme is only devoted to identifying whether the images are original copies or similar images. It does not address CMFD because it cannot find any forgery traces in a detected image.

Generally, existing CMFD methods suffer from the following issues:

Inherent defects of feature extraction: Block-based detection methods cannot handle large-scaling attacks because, with a single fixed block size, the features extracted from the copied region differ from those extracted from its scaled counterpart. Moreover, because the scaling factor is unknown, it is difficult to apply suitable multi-scale blocks to extract block features effectively. Furthermore, the use of multi-scale blocks for feature extraction may result in prohibitive computation costs. On the other hand, keypoint-based methods can handle large-scaling attacks to a certain extent. Keypoint-based methods generally search the whole image and extract the extrema in local high-entropy regions as the candidate keypoints. However, because of their inherent structures, most keypoint-based methods that use local descriptor algorithms cannot detect forgeries in the homogeneous region(s). Therefore, existing block-based and keypoint-based methods cannot provide effective feature extraction for the homogeneous region(s) that are simultaneously under the large-scaling attack(s).

Inefficient feature matching: Block-based methods need to search and match every pixel in an image with multiple block features, so an expensive computational cost is always incurred. In keypoint-based methods, every keypoint is matched on high-dimensional features for accurate matching (e.g., SIFT uses 128-dimensional descriptors, and SURF uses 64 or 128 dimensions). Besides, keypoint-based methods may extract a considerable number of candidate keypoints for more accurate detection, which further degrades matching efficiency, especially in homogeneous regions or for large images.

Ineffective and inefficient segmentation filtering: Keypoint-affine and segmentation-keypoint-filtering algorithms are the main approaches to segmentation filtering. Generally, keypoint-affine algorithms [Ramu and Babu, 2017] require iterative matching analysis of a huge number of keypoints to filter out the outliers, resulting in low matching efficiency. Besides, the keypoint-affine algorithm cannot handle multiple clusters and fails to detect multiple forgery regions. Existing segmentation-keypoint-filtering algorithms [Pun et al., 2015] also require dozens of iterations for effective snippet segmentations. Furthermore, these algorithms can only locate the coarse segmentation matches and fail to indicate the forged pixels.
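The feature-extraction issue listed first can be made concrete with a toy extremum detector. The sketch below finds strict local extrema in a small 2-D response map and optionally discards those whose magnitude is below a contrast threshold. It is a deliberately simplified illustration and not the SIFT pipeline or the paper's improved SIFT: it works on a single scale, compares against 8 spatial neighbours only, and the function name `local_extrema`, the response values, and the threshold 0.03 are hypothetical.

```python
def local_extrema(response, contrast_threshold=0.0):
    """Return (row, col) of strict local extrema of a 2-D response map.

    Simplification (assumption): single scale, 8 spatial neighbours only.
    An extremum is kept only if |response| >= contrast_threshold, so a
    threshold of 0 keeps every extremum, including low-contrast ones.
    """
    rows, cols = len(response), len(response[0])
    keypoints = []
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            centre = response[r][c]
            neighbours = [response[r + dr][c + dc]
                          for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                          if (dr, dc) != (0, 0)]
            is_max = all(centre > n for n in neighbours)
            is_min = all(centre < n for n in neighbours)
            if (is_max or is_min) and abs(centre) >= contrast_threshold:
                keypoints.append((r, c))
    return keypoints


# Hypothetical response map: a strong peak (textured area) at column 1
# and a weak peak (homogeneous area) at column 5.
response_map = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.0, 0.0, 0.0, 0.01, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]
print(local_extrema(response_map, contrast_threshold=0.03))  # strong peak only
print(local_extrema(response_map, contrast_threshold=0.0))   # both peaks
```

With the threshold in place, only the strong peak at (1, 1) survives; with the threshold set to zero, the weak peak at (1, 5) is kept as well. This is the trade-off the text describes: low-contrast extrema are the only candidates a homogeneous region offers, but keeping them greatly increases the number of keypoints to match.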

In view of the above issues, a new CMFD scheme, shown in Fig. 1, is proposed that includes i) an improved SIFT [Li et al., 2019], ii) a Feature Label Matching (FLM) algorithm, and iii) Hierarchical Segmentation Filtering (HSF). The new scheme aims to achieve high accuracy and high efficiency under various attacks, especially for large-scaling in the homogeneous region(s). In our improved SIFT, the structure is modified by removing the contrast threshold so that all effective keypoints are extracted in both rough and homogeneous regions (detailed in Section II.A). However, matching with high-dimensional features becomes computationally prohibitive with this massive number of effective keypoints. Therefore, a new Feature Label Matching (FLM) algorithm is proposed to reduce the computational cost of keypoint matching. (Remark: FLM can also be applied to resolve the same matching problem in other existing CMFD methods.) In FLM, the feature label is newly introduced to represent the corresponding feature of a keypoint. Subsequently, all keypoints are assigned to feature label groups according to their feature label summation. The massive keypoints are thus broken down into different label groups, each of which contains only a small number of keypoints. The successive feature label groups are then concatenated into a new overlapping label cluster (OLC). As a result, FLM only requires matching a small number of keypoints in small label groups, which significantly improves matching efficiency and effectiveness (detailed in Section II.B). Effectiveness and efficiency can be improved further by removing the outlier keypoints from all candidate keypoints. For this purpose, the image is adaptively divided into three coarse-to-fine segmentations by the proposed HSF. Then, the metrics are applied to filter out the outlier pairs in every segmentation. The inlier (true keypoint) pairs in the three hierarchical segmentations accurately indicate the corresponding forgery regions (detailed in Section II.C). Finally, the three hierarchical forgery segmentations are fused to fill the forgery regions (detailed in Section II.D).
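The grouping idea behind FLM can be sketched in a few lines. The excerpt does not define the feature label or the cluster construction, so everything concrete below is an assumption: the label is taken to be the descriptor sum quantised into bins of a hypothetical width `bin_width`, an overlapping cluster is formed from each group plus its immediate successor, and two keypoints match when the Euclidean distance between their descriptors is at most a hypothetical `max_distance`. The usual CMFD check that matched keypoints are spatially far apart is omitted for brevity.

```python
import math
from collections import defaultdict


def feature_label(descriptor, bin_width):
    """Hypothetical label: the descriptor sum quantised into bins."""
    return int(sum(descriptor) // bin_width)


def build_overlapping_clusters(descriptors, bin_width):
    """Group keypoint indices by label, then join each group with the next one."""
    groups = defaultdict(list)
    for index, descriptor in enumerate(descriptors):
        groups[feature_label(descriptor, bin_width)].append(index)
    clusters = []
    for label in sorted(groups):
        # Overlap with the successive group so near-boundary pairs are not lost.
        clusters.append(groups[label] + groups.get(label + 1, []))
    return clusters


def match_within_clusters(descriptors, bin_width, max_distance):
    """Compare descriptors only inside each overlapping cluster."""
    matches = set()
    for cluster in build_overlapping_clusters(descriptors, bin_width):
        for a in range(len(cluster)):
            for b in range(a + 1, len(cluster)):
                i, j = cluster[a], cluster[b]
                if math.dist(descriptors[i], descriptors[j]) <= max_distance:
                    matches.add((min(i, j), max(i, j)))
    return sorted(matches)


# Hypothetical 2-D descriptors (real SIFT descriptors have 128 dimensions).
descriptors = [[1.0, 2.0], [1.25, 2.5], [1.5, 2.75], [9.0, 9.0]]
print(build_overlapping_clusters(descriptors, bin_width=1.0))
print(match_within_clusters(descriptors, bin_width=1.0, max_distance=0.6))
```

In this toy input, keypoints 1 and 2 fall into adjacent label groups, and the overlap is what allows them to be compared. The distant keypoint 3 is never compared with the others, which is where the saving over exhaustive matching would come from.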

As a summary, the contributions of the proposed method are given as follows:

i) The improved SIFT structure extracts as many interesting and effective keypoints as possible in the homogeneous region(s), enabling CMFD in homogeneous regions that are simultaneously under large-scaling attacks.
ii) The proposed FLM algorithm uses a newly designed feature label and OLC to significantly reduce computation. As a result, matching efficiency is improved dramatically.
iii) The proposed HSF algorithm can effectively filter out suspicious outliers. Furthermore, the fusion of the three hierarchical filtering segmentations can indicate forgery regions precisely.
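The coarse-to-fine filtering idea in contribution iii) can be illustrated with a rough sketch. The excerpt does not specify how the three segmentations are produced or which metrics are applied, so the code below substitutes regular grid cells for the adaptive segmentation and uses one simple statistic: a matched pair is kept at a level only if at least `min_pairs` matches connect the same two cells. The cell sizes, the threshold, and all function names are hypothetical.

```python
from collections import Counter


def cell_of(point, cell_size):
    """Grid cell containing a point (assumption: grid cells stand in for segments)."""
    x, y = point
    return (int(x // cell_size), int(y // cell_size))


def filter_level(pairs, cell_size, min_pairs):
    """Keep pairs whose two cells are connected by at least `min_pairs` matches."""
    def key(pair):
        a, b = cell_of(pair[0], cell_size), cell_of(pair[1], cell_size)
        return (min(a, b), max(a, b))    # order-independent cell pair

    support = Counter(key(p) for p in pairs)
    return [p for p in pairs if support[key(p)] >= min_pairs]


def hierarchical_filter(pairs, cell_sizes=(64, 32, 16), min_pairs=2):
    """Apply the same filter on three coarse-to-fine grids in sequence."""
    for cell_size in cell_sizes:
        pairs = filter_level(pairs, cell_size, min_pairs)
    return pairs


# Hypothetical matches: three consistent pairs plus one isolated outlier.
matched_pairs = [
    ((10, 10), (200, 200)),
    ((12, 11), (202, 201)),
    ((14, 13), (204, 203)),
    ((5, 5), (300, 40)),
]
print(hierarchical_filter(matched_pairs))
```

The three pairs that map one small area onto another support each other at every level and survive, while the isolated pair has no support at the coarsest level and is dropped.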

Notably, contributions ii) and iii) can also be integrated into other CMFD methods. The rest of the paper is organized as follows. Section 2 presents the related work and the proposed method. Sections 3 and 4 present the experimental results and the conclusion, respectively.

Section snippets

The Proposed Scheme

This section describes three main elements of the proposed CMFD scheme: improved SIFT for effective feature extraction, Feature Label Matching (FLM), and Hierarchical Segmentation Filtering (HSF).

Experiments and Analysis

In this section, a series of experiments on the benchmark datasets is carried out to evaluate the effectiveness and efficiency of the proposed method. In Section III-A, we present the test dataset and evaluation metrics. Section III-B presents the experimental results and analysis on the image manipulation dataset (IMD) [Christlein et al., 2012]. Section III-C presents the experimental results and analysis on the CMH [Silva et al., 2015] and GRIP [Cozzolino et al., 2015] datasets. Section

Conclusion

This paper proposes a new CMFD scheme consisting of i) an improved structure of SIFT, ii) a novel Feature Label Matching (FLM), and iii) a novel Hierarchical Segmentation Filtering (HSF). The improved structure of SIFT enhances the capability of extracting effective keypoints in the homogeneous region. FLM breaks down the massive keypoints into different small label groups for significantly improved matching effectiveness and efficiency. The HSF algorithm is proposed to filter out suspicious

CRediT authorship contribution statement

Yanfen Gan: Conceptualization, Methodology, Validation, Writing – original draft. Junliu Zhong: Writing – original draft. Chiman Vong: Writing – review & editing.

Acknowledgment

This work was supported in part by the Guangdong Basic and Applied Basic Research Foundation under Grant No. 2020A151501783 (2020A1515010700), the 2021 Innovation Team of Scientific Research Platform of Universities in Guangdong Province under Grant 2021KCXTD053, and the Young Creative Talent Projects of Universities in Guangdong Province (Natural Science) under Grant 2021KQNCX164.

Yanfen Gan received the B.S. and M.S. degrees from the Guangdong University of Technology, Guangzhou, China, in 2004 and 2007, respectively. She is currently an Associate Professor with the Department of Information Science and Technology, South China Business College, Guangdong University of Foreign Studies. She is currently pursuing the Ph.D. degree with the Department of Computer and Information Science, University of Macau, Macau, China. Her current research interests include deep learning

References (40)

X. Bi et al. Fast reflective offset-guided searching method for copy-move forgery detection. Information Sciences (2017)
X. Bi et al. Fast copy-move forgery detection using local bidirectional coherency error refinement. Pattern Recognition (2018)
P. Korus. Digital image integrity–a survey of protection and verification techniques. Digital Signal Processing (2017)
Z. Li et al. Efficient parallel optimizations of a high-performance SIFT on GPUs (2019)
C.-M. Pun et al. A Two-stage Localization for Copy-Move Forgery Detection. Information Sciences (2018)
E. Silva et al. Going deeper into copy-move forgery detection: Exploring image telltales via multi-scale analysis and voting processes. Journal of Visual Communication and Image Representation (2015)
J. Zhong et al. Radon odd radial harmonic Fourier moments in detecting cloned forgery image. Chaos, Solitons & Fractals (2016)
Y. Zhu et al. AR-Net: Adaptive Attention and Residual Refinement Network for Copy-Move Forgery Detection. IEEE Transactions on Industrial Informatics (2020)
R. Achanta et al. SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Transactions on Pattern Analysis and Machine Intelligence (2012)
A. Alahi et al. Freak: Fast retina keypoint
I. Amerini et al. A sift-based forensic method for copy–move attack detection and transformation recovery. IEEE Transactions on Information Forensics and Security (2011)
M. Barni et al. Copy Move Source-Target Disambiguation through Multi-Branch CNNs. IEEE Transactions on Information Forensics and Security (2021)
R.A. Brown. Building a balanced k-d tree in O(kn log n) time. Computer Science (2014)
H. Chen et al. Hybrid features and semantic reinforcement network for image forgery detection. Multimedia Systems (2021)
M.-M. Cheng et al. Global contrast based salient region detection. IEEE Transactions on Pattern Analysis and Machine Intelligence (2014)
V. Christlein et al. An evaluation of popular copy-move forgery detection approaches. IEEE Transactions on Information Forensics and Security (2012)
D. Cozzolino et al. Efficient dense-field copy–move forgery detection. IEEE Transactions on Information Forensics and Security (2015)
M. Emam et al. PCET based copy-move forgery detection in images under geometric transforms. Multimedia Tools and Applications (2016)
A. Ferreira et al. Behavior Knowledge Space-Based Fusion for Copy–Move Forgery Detection. IEEE Transactions on Image Processing (2016)
A.J. Fridrich et al. Detection of copy-move forgery in digital images


Junliu Zhong received the M.Sc. degree in detection technology and automation device from Sun Yat-sen University, China, in 2006, and the Ph.D. degree in computer science from University of Macau, China, in 2020. He is currently an Associate Professor with the Department of Information and Communication Engineering, Guangzhou Maritime University, China. His current research interests include digital image/video processing, deep learning, and information security.

Chiman Vong received the M.S. and Ph.D. degrees in software engineering from the University of Macau, Macau, China, in 2000 and 2005, respectively. He is currently an Associate Professor with the Department of Computer and Information Science, University of Macau. His current research interests include machine learning methods and intelligent systems.

^†: Yan-Fen Gan and Jun-Liu Zhong contributed equally to this work.
