Abstract
Advanced image editing tools now produce increasingly realistic tampered images. Intelligent retouching technology in particular keeps lowering the skill needed to tamper with an image, so forgeries can easily evade image forensic systems and image authenticity becomes harder to verify. Progress in image forensics has been held back by the lack of high-quality splicing datasets. In this paper, we present SMI20K, the first benchmark dataset for image splicing performed with intelligent tampering techniques, which contains a total of 20,000 spliced images. Because the images are produced by combining Seamless Cloning with image similarity search, their manipulation traces are better hidden, making the tampered targets difficult to distinguish with the naked eye. SMI20K thus poses a new challenge to the field of image forensics. Furthermore, we propose a novel Multi-scale Attention Context-aware Network (MAC-Net) to address this challenge. Specifically, we propose a Multi-scale Multi-level Attention Module (MMAM) that not only resolves feature inconsistencies across scales but also automatically adjusts the coefficients of the original input features to preserve detail. The fused features are then fed to the proposed Multi-Branch Global Context Module (MGCM), whose three branches enrich the contextual information while preserving the detailed features of the target through automatic coefficient adjustment. Extensive experiments on three public datasets and the proposed dataset show that the proposed model outperforms other state-of-the-art (SOTA) models in image forgery localization.
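The abstract names Seamless Cloning as one ingredient of the dataset but does not describe the procedure. As a rough illustration of the general idea only, and not of the authors' pipeline, the sketch below implements textbook Poisson blending on a single-channel image: inside the pasted region it keeps the gradients of the source while taking boundary values from the target, which is why the seam is hard to see. The function name `seamless_clone`, the list-of-lists image representation, and the fixed iteration count are assumptions made for this example.

```python
def seamless_clone(source, target, mask, iterations=500):
    """Illustrative Poisson blending of `source` into `target` (hypothetical helper).

    source, target: equally sized 2-D lists of floats (one channel).
    mask: 2-D list of booleans, True where source content is pasted.
          The mask must not touch the image border.
    iterations: number of Gauss-Seidel sweeps (an assumed, fixed budget).
    """
    height, width = len(target), len(target[0])
    inside = [(y, x) for y in range(height) for x in range(width) if mask[y][x]]
    for y, x in inside:
        if y in (0, height - 1) or x in (0, width - 1):
            raise ValueError("mask must not touch the image border")

    # Start from a naive paste; pixels outside the mask are never modified,
    # so they act as the boundary condition taken from the target.
    result = [row[:] for row in target]
    for y, x in inside:
        result[y][x] = source[y][x]

    neighbours = ((-1, 0), (1, 0), (0, -1), (0, 1))
    for _ in range(iterations):
        for y, x in inside:
            total = 0.0
            for dy, dx in neighbours:
                ny, nx = y + dy, x + dx
                # Neighbour estimate plus the source gradient toward this pixel.
                total += result[ny][nx] + (source[y][x] - source[ny][nx])
            # Discrete Poisson equation: each pixel is the mean of these terms.
            result[y][x] = total / 4.0
    return result
```

One property makes the effect easy to check: if the source differs from the target only by a constant brightness offset, the blended result reproduces the target, because the offset carries no gradient information. A naive copy-paste would instead leave a visible step at the region boundary.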
Data Availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
About this article
Cite this article
Ren, R., Niu, S., Jin, J. et al. Multi-scale attention context-aware network for detection and localization of image splicing. Appl Intell 53, 18219–18238 (2023). https://doi.org/10.1007/s10489-022-04421-3
DOI: https://doi.org/10.1007/s10489-022-04421-3