Weakly Supervised Object Localization with Background Suppression Erasing for Art Authentication and Copyright Protection

  • Research Article
Machine Intelligence Research

Abstract

The problem of art forgery and infringement is becoming increasingly prominent, since diverse self-media content featuring all kinds of art pieces is released on the Internet every day. For art paintings, object detection and localization provide an efficient and effective means of art authentication and copyright protection. However, training a precise detector requires large amounts of expensive pixel-level annotations. To alleviate this, we propose a novel weakly supervised object localization (WSOL) method with background suppression erasing (BSE), which recognizes objects using inexpensive image-level labels. First, integrated adversarial erasing (IAE) for a vanilla convolutional neural network (CNN) drops out the most discriminative region by leveraging high-level semantic information. Second, a background suppression module (BSM) limits the activation area of the IAE to the object region through a self-guidance mechanism. Finally, in the inference phase, we use the refined importance map (RIM) of middle features to obtain class-agnostic localization results. Extensive experiments are conducted on paintings, CUB-200-2011 and ILSVRC to validate the effectiveness of our BSE.
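
The generic idea of erasing the most discriminative region and then localizing from an importance map can be sketched roughly as follows. This is only an illustrative toy under stated assumptions, not the authors' IAE, BSM, or RIM implementation: the function names, the channel-average importance map, and both thresholds are hypothetical choices made for the example.

```python
import numpy as np


def importance_map(features):
    """Average a (C, H, W) feature tensor over channels and min-max normalize to [0, 1]."""
    m = features.mean(axis=0)
    span = m.max() - m.min()
    if span == 0:
        return np.zeros_like(m)
    return (m - m.min()) / span


def erase_most_discriminative(features, erase_threshold=0.7):
    """Zero out spatial positions whose normalized importance exceeds the threshold.

    The threshold value is an assumption made for this sketch.
    """
    keep = importance_map(features) <= erase_threshold  # (H, W) boolean mask
    return features * keep[None, :, :]


def bounding_box(score_map, box_threshold=0.5):
    """Return (row_min, col_min, row_max, col_max) of positions above the threshold, or None."""
    rows, cols = np.where(score_map > box_threshold)
    if rows.size == 0:
        return None
    return int(rows.min()), int(cols.min()), int(rows.max()), int(cols.max())


# Toy 2-channel, 4x4 feature tensor with one activated 2x2 patch.
features = np.zeros((2, 4, 4))
features[:, 1:3, 1:3] = 1.0
features[:, 1, 1] = 2.0  # the single most discriminative position

# Erasing removes only the peak, so a classifier trained on the erased
# features would have to rely on the rest of the object.
erased = erase_most_discriminative(features)

# A lower threshold on the importance map recovers the whole 2x2 patch.
box = bounding_box(importance_map(features), box_threshold=0.4)
```

In this toy case the erasing step removes only the peak position, while thresholding the importance map at a lower value yields a box that covers the full activated patch.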

Acknowledgements

This work was supported in part by Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, China (No. 2022B1212010011).

Author information

Corresponding authors

Correspondence to Ying Gao or Wing W. Y. Ng.

Ethics declarations

The authors declare that they have no conflicts of interest related to this work.

Additional information

Colored figures are available in the online version at https://link.springer.com/journal/11633

Chaojie Wu received the B.Eng. degree in electrical engineering from South China University of Technology, China, in 2020. He is currently a master's student at South China University of Technology, China.

His research interest is computer vision.

Mingyang Li received the B.Sc. degree in computer science from South China University of Technology, China, in 2021. He is currently a master's student at South China University of Technology, China.

His research interest is object detection.

Ying Gao received the B.Sc. and M.Sc. degrees in computer science from Central South University of China, China, in 1997 and 2000, respectively, and the Ph.D. degree in computer science from South China University of Technology, China, in 2006. She is currently a professor with the School of Computer Science and Engineering, South China University of Technology, China. She has published more than 30 papers in international journals and conferences.

Her research interests include computer vision, software architecture and network security.

Xinyan Xie received the B.Sc. degree in computer science from South China University of Technology, China, in 2019. He is currently a master's student at South China University of Technology, China.

His research interests include computer vision, weakly supervised learning and medical imaging analysis.

Wing W. Y. Ng received the B.Sc. and Ph.D. degrees in computer science, neural networks and learning, cybernetics from Hong Kong Polytechnic University, China, in 2001 and 2006, respectively. He is currently a professor with the School of Computer Science and Engineering, South China University of Technology, China, and an Associate Editor of International Journal of Machine Learning and Cybernetics. He is the Principal Investigator of four China National Natural Science Foundation projects and a Program for New Century Excellent Talents in University from China Ministry of Education. He served on the Board of Governors of the IEEE Systems, Man and Cybernetics Society in 2011.

His research interests include neural networks, deep learning, smart grid, smart health care, smart manufacturing and non-stationary information retrieval.

Ahmad Musyafa received the B.Sc. degree in computer science from Universitas Pamulang, Indonesia, and the M.Sc. degree in machine learning from STMIK Eresha, Indonesia, in 2012 and 2014, respectively. He is currently a Ph.D. candidate in the School of Computer Science and Engineering, South China University of Technology, China.

His research interests include deep learning, neural machine translation and computer vision.

About this article

Cite this article

Wu, C., Li, M., Gao, Y. et al. Weakly Supervised Object Localization with Background Suppression Erasing for Art Authentication and Copyright Protection. Mach. Intell. Res. 21, 89–103 (2024). https://doi.org/10.1007/s11633-023-1455-3

  • DOI: https://doi.org/10.1007/s11633-023-1455-3
