Abstract
Although the conventional matting formulation can separate foreground from background under fractional occupancy, which can arise from highly transparent objects, complex foregrounds (e.g., nets or trees), and objects containing very fine details (e.g., hair), no previous work has attempted to reason about the underlying causes of matting arising from general foreground semantics. We show how to obtain better alpha mattes by incorporating semantic classification of matting regions into our framework. Specifically, we consider and learn 20 classes of general matting patterns, and propose to extend the conventional trimap to a semantic trimap. The proposed semantic trimap can be obtained automatically through patch structure analysis within trimap regions. Meanwhile, we learn a multi-class discriminator to regularize the alpha prediction at the semantic level, and content-sensitive weights to balance the different regularization losses. Experiments on multiple benchmarks show that our method benefits from such general alpha semantics, outperforming other methods and achieving state-of-the-art performance. We further explore the effectiveness of our method on specific semantics by specializing it to human matting and transparent object matting. Experimental results on specific semantics demonstrate that alpha matte semantic information can boost performance not only for general matting but also for class-specific matting. Finally, we contribute a large-scale Semantic Image Matting Dataset constructed with careful consideration of data balancing across the different semantic classes. Code and dataset are available at https://github.com/nowsyn/SIM.
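To make the loss-balancing idea in the abstract concrete, the following is a minimal sketch (not the authors' implementation; the function name, tensor shapes, and the trade-off weight are assumptions) of how per-pixel content-sensitive weights could modulate an alpha regression loss while a multi-class semantic term regularizes the prediction:

```python
import torch
import torch.nn.functional as F

def semantic_matting_loss(alpha_pred, alpha_gt, class_logits, class_gt,
                          weights, sem_weight=0.1):
    """Hypothetical loss sketch: content-sensitive per-pixel weights balance
    the alpha regression term against a multi-class semantic regularizer.

    alpha_pred, alpha_gt: (B, 1, H, W) alpha mattes in [0, 1]
    class_logits:         (B, C, H, W) logits over C matting classes (C = 20)
    class_gt:             (B, H, W) integer class labels
    weights:              (B, 1, H, W) content-sensitive weights (assumed given)
    sem_weight:           assumed trade-off coefficient, not from the paper
    """
    # Weighted L1 regression on the alpha matte.
    alpha_loss = (weights * (alpha_pred - alpha_gt).abs()).mean()
    # Multi-class regularization at the semantic level (a plain cross-entropy
    # stand-in for the paper's learned discriminator).
    sem_loss = F.cross_entropy(class_logits, class_gt)
    return alpha_loss + sem_weight * sem_loss

# Toy usage with random tensors.
B, C, H, W = 2, 20, 64, 64
loss = semantic_matting_loss(
    torch.rand(B, 1, H, W), torch.rand(B, 1, H, W),
    torch.randn(B, C, H, W), torch.randint(0, C, (B, H, W)),
    torch.rand(B, 1, H, W),
)
```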
Availability of data and materials
The dataset proposed in this work is available at https://github.com/nowsyn/SIM.
Code availability
Not applicable.
Funding
This work was supported by Kuaishou Technology and the Research Grants Council of the Hong Kong SAR under grant no. 16201420.
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Yanan Sun and supervised by Chi-Keung Tang and Yu-Wing Tai.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethics approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Additional information
Communicated by Ming-Hsuan Yang.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Sun, Y., Tang, CK. & Tai, YW. Semantic Image Matting: General and Specific Semantics. Int J Comput Vis 132, 710–730 (2024). https://doi.org/10.1007/s11263-023-01907-6