Information Sciences

Volume 572, September 2021, Pages 29-42
Feature pyramid network for diffusion-based image inpainting detection

https://doi.org/10.1016/j.ins.2021.04.042

Abstract

Inpainting is a technique that can be employed to tamper with the content of images. In this paper, we propose a novel forensic analysis method for diffusion-based image inpainting based on a feature pyramid network (FPN). In our method, an improved u-shaped net is migrated into the FPN for multi-scale inpainting feature extraction. In addition, a stagewise weighted cross-entropy loss function is designed to take advantage of both the general loss and the weighted loss, improving the prediction rate for inpainted regions of all sizes. The experimental results demonstrate that the proposed method outperforms several state-of-the-art methods, especially when the size of the inpainted region is small.

Introduction

Image inpainting is one of the most important image processing tools for repairing damaged or degraded regions in an image. Some renowned image manipulation tools, such as Photoshop and GIMP, have adopted image inpainting techniques as image processing methods. Image inpainting methods can be divided into three classes, i.e., diffusion-based techniques [1], [2], [3], patch-based techniques [4], [5], [6] and deep learning-based techniques [7], [8], [9], [10], [11]. Diffusion-based methods mainly focus on small-region inpainting without leaving any perceptible artifacts. Patch-based methods are used to remove relatively large objects, which may lead to obvious inconsistencies in image context. Deep learning-based methods have developed rapidly in recent years [12], [13], [14]; they can inpaint regions of various sizes and obtain good results with few artifacts. In [12], the authors proposed a de-occlusion distillation framework for face completion and masked face recognition. Ge et al. [13] proposed a method based on identity-diversity inpainting to facilitate occluded face recognition. Inpainting was initially designed to reconstruct complete images from damaged or degraded ones. However, some individuals exploit inpainting technology for malicious purposes. For instance, an object or a person can be removed from an image to falsify evidence in court, and inpainted images can be used to fabricate scenes for false news stories. As a consequence, there is an urgent need to address the security issues caused by image inpainting. Detecting whether an image has been inpainted and locating the inpainted regions are of great importance for image forensics.

In the past decades, image forensics has received much attention from researchers [15], [16], [17], [18], [19], [20], and a variety of inpainting detection methods have been proposed. The authors of [21], [22], [23], [24] took advantage of similar blocks within the queried image to detect patch-based image inpainting; block pairs with high matching degrees were considered inpainted pairs. Zhu et al. [25] proposed the first deep learning method for patch-based inpainting forensics in 2018. They applied a convolutional neural network with an encoder-decoder architecture to detect patch-based inpainting in images of size 256×256. In addition, there is a growing body of literature that concentrates on deep learning-based inpainting forensics [26], [17]. Wang et al. [26] took advantage of Faster R-CNN for the forensics of AI inpainting, using it to capture the inconsistent features between the tampered region and the untouched region. Recently, Li et al. [17] presented a deep learning-based method to detect the pixels processed by AI inpainting. A high-pass prefiltering module is added before the fully convolutional network to enhance the inpainting traces, and the learned feature maps of the image residuals are more distinguishable than those of the original images.
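To make this prefiltering idea concrete, the following minimal PyTorch sketch applies a fixed high-pass kernel to each colour channel before the network. The class name and the choice of a 3×3 Laplacian kernel are assumptions made purely for illustration; the actual filter bank used in [17] is not reproduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HighPassPrefilter(nn.Module):
    """Fixed (non-trainable) high-pass filter applied depthwise to each
    colour channel. A 3x3 Laplacian kernel is assumed for illustration;
    the filter bank used in [17] may differ."""

    def __init__(self):
        super().__init__()
        lap = torch.tensor([[0., 1., 0.],
                            [1., -4., 1.],
                            [0., 1., 0.]])
        # One copy of the kernel per RGB channel (depthwise, groups=3).
        self.register_buffer("weight", lap.view(1, 1, 3, 3).repeat(3, 1, 1, 1))

    def forward(self, x):  # x: (N, 3, H, W)
        return F.conv2d(x, self.weight, padding=1, groups=3)

# The resulting residual map, rather than the original image, is what the
# fully convolutional network then learns from.
```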

However, little attention has been paid to diffusion-based inpainting forensics. To the best of our knowledge, there is only one work to date focusing on the forensics of diffusion-based inpainting [27]. To detect diffusion-based inpainted regions, the authors of [27] trained a classifier on features extracted by calculating the local variance of the image Laplacian perpendicular to the direction of the image gradient. Finally, effective post-processing operations, such as the exclusion of abnormally exposed regions and morphological filtering, were used to refine the initial results. Even though the method in [27] can detect diffusion-based inpainting, several shortcomings remain. First, the conventional method can achieve relatively good results for large inpainted regions, but for small inpainted regions, the localization results are far from satisfactory. Second, not only does the conventional method require post-processing steps to refine the localization map, but the trained classifier also requires a threshold to be set, which does not always work reliably. Furthermore, Li’s method is degraded by certain post-operations, such as image scaling and JPEG compression. In this paper, we focus only on diffusion-based inpainting detection and introduce a network based on the feature pyramid network (FPN) to solve the above-mentioned problems of the existing methods.
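For reference, the hand-crafted cue used in [27] can be approximated by the following simplified sketch, which computes the local variance of the image Laplacian over a small window. The perpendicular-to-gradient weighting and the post-processing steps of [27] are omitted, and the window size is an assumption.

```python
import numpy as np
from scipy.ndimage import laplace, uniform_filter

def laplacian_local_variance(gray, win=5):
    """Simplified feature in the spirit of [27]: local variance of the
    image Laplacian in a win x win window. The directional weighting and
    the post-processing of the original method are not reproduced."""
    lap = laplace(gray.astype(np.float64))
    mean = uniform_filter(lap, size=win)
    mean_sq = uniform_filter(lap * lap, size=win)
    return mean_sq - mean ** 2  # per-pixel local variance of the Laplacian
```

Low variance of the Laplacian is the kind of smoothness cue that diffusion-based inpainting tends to leave behind, which is why such a feature can separate inpainted pixels from untouched ones.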

Deep learning has developed explosively in recent years and has been widely applied in many image processing tasks, such as image classification, image reconstruction, and image segmentation. Motivated by image segmentation, in this paper, an end-to-end deep learning method based on the FPN is introduced to detect and locate diffusion-based inpainted regions in images. Fig. 1 shows an overview of the proposed framework.

The main contributions of this paper can be summarized as follows:

1. An end-to-end deep learning model focused on the detection of diffusion-based inpainting is introduced. In the model, an improved u-shaped net (U-Net) is migrated to an FPN for multiscale inpainting feature extraction. To combine the information of features at different scales, the extracted features are strengthened by a convolution layer and then fused by a concatenation layer for classification. Our ablation study validates the effectiveness of feature fusion (an illustrative sketch of such a fusion head is given after this list).

2. A stagewise weighted cross-entropy loss function is designed. The advantages of both the general loss and the weighted loss are integrated to optimize the training process. In addition, the detection results are refined by this loss function, especially when the inpainting size is small (a schematic sketch of the loss also follows the list).

3. The experimental results show that our proposed method outperforms several state-of-the-art methods. The robustness of the proposed method against several image post-processing manipulations is also evaluated.
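As an illustration of the feature fusion in the first contribution, the following PyTorch sketch strengthens each pyramid level with a convolution layer, upsamples the results to a common resolution, and concatenates them for pixel-wise classification. The class name, channel sizes, and number of levels are assumptions for illustration, not the paper's exact settings.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionHead(nn.Module):
    """Illustrative fusion head: each pyramid level is strengthened by a
    3x3 convolution, upsampled to the finest resolution, concatenated,
    and classified pixel-wise. Channel sizes are assumed."""

    def __init__(self, in_channels=(256, 256, 256, 256), mid=64, n_classes=2):
        super().__init__()
        self.strengthen = nn.ModuleList(
            nn.Conv2d(c, mid, kernel_size=3, padding=1) for c in in_channels)
        self.classifier = nn.Conv2d(mid * len(in_channels), n_classes, kernel_size=1)

    def forward(self, feats):          # feats: list of (N, C_i, H_i, W_i), coarse to fine
        target = feats[-1].shape[-2:]  # finest spatial size
        fused = [F.interpolate(conv(f), size=target, mode="bilinear",
                               align_corners=False)
                 for conv, f in zip(self.strengthen, feats)]
        return self.classifier(torch.cat(fused, dim=1))
```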
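Similarly, the stagewise weighted cross-entropy loss of the second contribution can be sketched as switching between the general (unweighted) loss and a class-weighted loss that up-weights inpainted pixels during training. The switch point and the weight value below are placeholders, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def stagewise_weighted_ce(logits, target, epoch, switch_epoch=10, pos_weight=5.0):
    """Illustrative stagewise loss: plain cross-entropy in early epochs,
    then cross-entropy that up-weights inpainted pixels (class 1) so that
    small tampered regions are not dominated by the background. The
    schedule and the weight are assumptions."""
    if epoch < switch_epoch:
        return F.cross_entropy(logits, target)          # general loss stage
    weight = torch.tensor([1.0, pos_weight], device=logits.device)
    return F.cross_entropy(logits, target, weight=weight)  # weighted stage
```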

Section snippets

Related work

In this section, several deep learning-based methods of image segmentation are first introduced, since the image forgery detection problem is similar to image segmentation to some degree. Then, the feature pyramid network is described briefly.

Proposed method

In this section, we introduce an end-to-end deep learning-based FPN composed of an improved U-Net to detect and locate image inpainting forgery.

Experimental setup

Our network is trained on Ubuntu 16.04 with a Tesla V100 GPU (32 GB) and 128 GB of system memory.

Experimental results

In this section, we evaluate the trained model on diffusion-based inpainting detection. Then, the robustness of the detection against several post-processing operations is analyzed.
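As a hint of how such a robustness test can be set up, the following sketch generates JPEG-compressed and rescaled variants of a test image with Pillow before they are fed to the detector. The particular quality factors and scaling factors used in the actual experiments are not assumed here.

```python
from io import BytesIO
from PIL import Image

def jpeg_compress(img: Image.Image, quality: int) -> Image.Image:
    """Re-encode an image at the given JPEG quality factor."""
    buf = BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")

def rescale(img: Image.Image, factor: float) -> Image.Image:
    """Scale an image by a constant factor with bilinear interpolation."""
    w, h = img.size
    return img.resize((int(w * factor), int(h * factor)), Image.BILINEAR)

# Example (values are placeholders): evaluate the trained model on
# jpeg_compress(test_img, 75) and rescale(test_img, 0.8).
```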

Conclusion

In this paper, a forensic method based on a feature pyramid network has been proposed for the detection of diffusion-based image inpainting. A feature fusion strategy and a stagewise weighted cross-entropy loss function are combined to improve the performance of localizing the inpainted regions of different sizes. Extensive experimental results have verified that the proposed method outperforms several state-of-the-art methods in terms of both F1-score and IoU. Our future work will be devoted
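For completeness, the pixel-level F1-score and IoU used to score the localization maps can be computed as in the following sketch; the small smoothing constant is an implementation convenience, not part of the paper.

```python
import numpy as np

def f1_and_iou(pred_mask: np.ndarray, gt_mask: np.ndarray, eps: float = 1e-8):
    """Pixel-level F1-score and IoU for binary localization maps,
    where 1 marks pixels predicted/labelled as inpainted."""
    pred, gt = pred_mask.astype(bool), gt_mask.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)
    iou = tp / (tp + fp + fn + eps)
    return f1, iou
```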

CRediT authorship contribution statement

Yulan Zhang: Writing - original draft, Methodology, Software. Feng Ding: Writing - review & editing, Software. Sam Kwong: Writing - review & editing, Supervision. Guopu Zhu: Resources, Writing - review & editing, Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors thank the anonymous reviewers for their valuable comments. This work was supported in part by the National Natural Science Foundation of China under Grant 61872350, Grant 61572489, and Grant 61672443, in part by Hong Kong GRF-RGC General Research Fund under Grant 9042816 (CityU 11209819) and Grant 9042958 (CityU 11203820), in part by the Tip-top Scientific and Technical Innovative Youth Talents of Guangdong Special Support Program under Grant 2019TQ05X696, and in part by the Basic

References (41)

  • C. Barnes et al., PatchMatch: A randomized correspondence algorithm for structural image editing, ACM Trans. Graph. (2009)
  • T. Ružić et al., Context-aware patch-based image inpainting using Markov random field modeling, IEEE Trans. Image Process. (2014)
  • D. Pathak et al., Context encoders: Feature learning by inpainting
  • C. Yang et al., High-resolution image inpainting using multi-scale neural patch synthesis
  • J. Yu et al., Generative image inpainting with contextual attention
  • G. Liu et al., Image inpainting for irregular holes using partial convolutions
  • D. Ulyanov et al., Deep image prior
  • C. Li et al., Look through masks: Towards masked face recognition with de-occlusion distillation
  • S. Ge et al., Occluded face recognition in the wild by identity-diversity inpainting, IEEE Trans. Circ. Syst. Video Technol. (2020)
  • Y.-G. Shin et al., PEPSI++: Fast and lightweight network for image inpainting, IEEE Trans. Neural Netw. Learn. Syst. (2021)