Abstract
Images can be censored by masking the region(s) of interest with a solid color or pattern. When a censored image is used for classification or matching, the mask itself may impact the results. Recent work in image inpainting and data augmentation provides two different approaches for dealing with censored images. In this paper, we perform an extensive evaluation of these methods to understand whether the impact of censoring can be mitigated for image classification and retrieval. Results indicate that modern learning-based inpainting approaches outperform augmentation strategies, and that metrics typically used to evaluate inpainting performance (e.g., reconstruction accuracy) do not necessarily correspond to improved classification or retrieval, especially in the case of person-shaped masked regions.
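The censoring operation described above, masking a region of interest with a solid color, can be sketched as follows. This is a minimal illustration, not code from the paper; the `censor` helper and its arguments are assumptions for the example, using a NumPy image array and a boolean mask:

```python
import numpy as np

def censor(image, mask, fill=(0, 0, 0)):
    """Return a copy of `image` with masked pixels set to a solid fill color.

    image: H x W x 3 uint8 array
    mask:  H x W boolean array, True where pixels should be censored
    fill:  RGB color used for the censored region
    """
    out = image.copy()
    out[mask] = fill  # boolean indexing broadcasts the fill color
    return out

# Example: censor a 4x4 region of a uniform 8x8 synthetic image.
img = np.full((8, 8, 3), 200, dtype=np.uint8)
m = np.zeros((8, 8), dtype=bool)
m[2:6, 2:6] = True
censored = censor(img, m)
```

An inpainting method would then attempt to plausibly fill the masked region from surrounding context, whereas an augmentation strategy would instead train the downstream classifier to tolerate such masks.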
Notes
Due to library conflicts with the GPU version of GLCIC, we used the CPU version in testing.
Communicated by Daniel Scharstein.
Cite this article
Black, S., Keshavarz, S. & Souvenir, R. Evaluation of Inpainting and Augmentation for Censored Image Queries. Int J Comput Vis 129, 977–997 (2021). https://doi.org/10.1007/s11263-020-01403-1