EfficientMatting: Bilateral Matting Network for Real-Time Human Matting

Luo, Rongsheng; Wei, Rukai; Zhang, Huaxin; Tian, Ming; Gao, Changxin; Sang, Nong

doi:10.1007/978-981-97-8858-3_9

Rongsheng Luo¹⁵,
Rukai Wei¹⁶,
Huaxin Zhang¹⁵,
Ming Tian¹⁵,
Changxin Gao¹⁵ &
…
Nong Sang¹⁵

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 15042))

Included in the following conference series:

Chinese Conference on Pattern Recognition and Computer Vision (PRCV)

366 Accesses

Abstract

Recent human matting methods typically suffer from two drawbacks: 1) high computation overhead caused by multiple stages, and 2) limited practical application due to the need for auxiliary guidance (e.g., trimap, mask, or background). To address these issues, we propose EfficientMatting, a real-time human matting method using only a single image as input. Specifically, EfficientMatting incorporates a bilateral network composed of two complementary branches: a transformer-based context information branch and a CNN-based spatial information branch. Furthermore, we introduce three novel techniques to enhance model performance while maintaining high inference efficiency. Firstly, we design a Semantic Guided Fusion Module (SGFM), which empowers the model to dynamically acquire valuable features with the assistance of context information. Secondly, we design a lightweight Detail Preservation Module (DPM) to achieve detail preservation and mitigate image artifacts during the upsampling process. Thirdly, we introduce the Supervised-Enhanced Training Strategy (SETS) to explicitly provide supervision on hidden features. Extensive experiments on P3M-10k, Human-2K, and PPM-100 datasets show that EfficientMatting outperforms state-of-the-art real-time human matting methods in terms of both model performance and inference speed.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 69.99; Price excludes VAT (USA)

Softcover Book: USD 89.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Dual-context aggregation for universal image matting

Article 15 November 2023

Robust Human Matting via Semantic Guidance

Deep image matting with cross-layer contextual information propagation

Article 20 February 2024

References

Cai, S., Zhang, X., Fan, H., Huang, H., Liu, J., Liu, J., Liu, J., Wang, J., Sun, J.: Disentangled image matting. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 8819–8828 (2019)
Google Scholar
Chen, Q., Li, D., Tang, C.K.: KNN matting. IEEE Trans. Pattern Anal. Mach. Intell. 35(9), 2175–2188 (2013)
Article Google Scholar
Chen, Q., Ge, T., Xu, Y., Zhang, Z., Yang, X., Gai, K.: Semantic human matting. In: Proceedings of the 26th ACM International Conference on Multimedia, pp. 618–626 (2018)
Google Scholar
Chuang, Y.Y., Curless, B., Salesin, D.H., Szeliski, R.: A bayesian approach to digital matting. In: Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2001. vol. 2, pp. II–II. IEEE (2001)
Google Scholar
Ding, X., Zhang, X., Ma, N., Han, J., Ding, G., Sun, J.: RepVGG: making VGG-style convnets great again. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13733–13742 (2021)
Google Scholar
Hong, J., Zuo, J., Han, C., Zheng, R., Tian, M., Gao, C., Sang, N.: Spatial cascaded clustering and weighted memory for unsupervised person re-identification (2024). arXiv:2403.00261
Hong, Y., Pan, H., Sun, W., Jia, Y.: Deep dual-resolution networks for real-time and accurate semantic segmentation of road scenes (2021). arXiv:2101.06085
Hou, Q., Liu, F.: Context-aware image matting for simultaneous foreground and alpha estimation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4130–4139 (2019)
Google Scholar
, Karacan, L., Erdem, A., Erdem, E.: Image matting with KL-divergence based sparse sampling. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 424–432 (2015)
Google Scholar
Ke, Z., Sun, J., Li, K., Yan, Q., Lau, R.W.: MODNet: real-time trimap-free portrait matting via objective decomposition. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, pp. 1140–1147 (2022)
Google Scholar
Lee, P., Wu, Y.: Nonlocal matting. In: CVPR 2011, pp. 2193–2200. IEEE (2011)
Google Scholar
Levin, A., Lischinski, D., Weiss, Y.: A closed-form solution to natural image matting. IEEE Trans. Pattern Anal. Mach. Intell. 30(2), 228–242 (2007)
Article MATH Google Scholar
Li, J., Ma, S., Zhang, J., Tao, D.: Privacy-preserving portrait matting. In: Proceedings of the 29th ACM International Conference on Multimedia, pp. 3501–3509 (2021)
Google Scholar
Li, J., Zhang, J., Maybank, S.J., Tao, D.: Bridging composite and real: towards end-to-end deep image matting. Int. J. Comput. Vision 130(2), 246–266 (2022)
Article MATH Google Scholar
Li, J., Zhang, J., Tao, D.: Deep automatic natural image matting (2021). arXiv:2107.07235
Li, Y., Lu, H.: Natural image matting via guided contextual attention. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 11450–11457 (2020)
Google Scholar
Lin, S., Ryabtsev, A., Sengupta, S., Curless, B.L., Seitz, S.M., Kemelmacher-Shlizerman, I.: Real-time high-resolution background matting. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8762–8771 (2021)
Google Scholar
Liu, X., Peng, H., Zheng, N., Yang, Y., Hu, H., Yuan, Y.: Efficientvit: Memory efficient vision transformer with cascaded group attention. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 14420–14430 (2023)
Google Scholar
Liu, Y., Xie, J., Shi, X., Qiao, Y., Huang, Y., Tang, Y., Yang, X.: Tripartite information mining and integration for image matting. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 7555–7564 (2021)
Google Scholar
Lu, H., Dai, Y., Shen, C., Xu, S.: Indices matter: Learning to index for deep image matting. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3266–3275 (2019)
Google Scholar
Luo, R., Wei, R., Gao, C., Sang, N.: Frequency information matters for image matting. In: Asian Conference on Pattern Recognition, pp. 81–94. Springer, Berlin (2023)
Google Scholar
Park, G., Son, S., Yoo, J., Kim, S., Kwak, N.: Matteformer: transformer-based image matting via prior-tokens. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11696–11706 (2022)
Google Scholar
Qiao, Y., Liu, Y., Yang, X., Zhou, D., Xu, M., Zhang, Q., Wei, X.: Attention-guided hierarchical structure aggregation for image matting. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13676–13685 (2020)
Google Scholar
Rhemann, C., Rother, C., Wang, J., Gelautz, M., Kohli, P., Rott, P.: A perceptually motivated online benchmark for image matting. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1826–1833. IEEE (2009)
Google Scholar
Sengupta, S., Jayaram, V., Curless, B., Seitz, S.M., Kemelmacher-Shlizerman, I.: Background matting: the world is your green screen. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2291–2300 (2020)
Google Scholar
Shahrian, E., Rajan, D., Price, B., Cohen, S.: Improving image matting using comprehensive sampling sets. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 636–643 (2013)
Google Scholar
Wang, J., Cohen, M.F.: Optimized color sampling for robust matting. In: 2007 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8. IEEE (2007)
Google Scholar
Wei, R., Liu, Y., Song, J., Cui, H., Xie, Y., Zhou, K.: Chain: Exploring global-local spatio-temporal information for improved self-supervised video hashing. In: Proceedings of the 31st ACM International Conference on Multimedia, pp. 1677–1688 (2023)
Google Scholar
Xu, N., Price, B., Cohen, S., Huang, T.: Deep image matting. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2970–2979 (2017)
Google Scholar
Xu, Z., Shang, H., Yang, S., Xu, R., Yan, Y., Li, Y., Huang, J., Yang, H.C., Zhou, J.: Hierarchical painter: Chinese landscape painting restoration with fine-grained styles. Vis. Intell. 1(1), 19 (2023)
Article MATH Google Scholar
Yao, J., Wang, X., Yang, S., Wang, B.: Vitmatte: boosting image matting with pre-trained plain vision transformers. Inf. Fusion 103, 102091 (2024)
Article MATH Google Scholar
Yao, J., Wang, X., Ye, L., Liu, W.: Matte anything: interactive natural image matting with segment anything models (2023). arXiv:2306.04121
Yu, Q., Zhang, J., Zhang, H., Wang, Y., Lin, Z., Xu, N., Bai, Y., Yuille, A.: Mask guided matting via progressive refinement network. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1154–1163 (2021)
Google Scholar
Zhang, H., Wang, X., Xu, X., Qing, Z., Gao, C., Sang, N.: Hr-pro: Point-supervised temporal action localization via hierarchical reliability propagation (2023). arXiv:2308.12608
Zhang, Y., Gong, L., Fan, L., Ren, P., Huang, Q., Bao, H., Xu, W.: A late fusion cnn for digital matting. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7469–7478 (2019)
Google Scholar
Zhou, Y., Lu, R., Xue, F., Gao, Y.: Occlusion relationship reasoning with a feature separation and interaction network. Vis. Intell. 1(1), 23 (2023)
Article MATH Google Scholar

Download references

Author information

Authors and Affiliations

Key Laboratory of Image Processing and Intelligent Control, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, China
Rongsheng Luo, Huaxin Zhang, Ming Tian, Changxin Gao & Nong Sang
Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, China
Rukai Wei

Authors

Rongsheng Luo
View author publications
You can also search for this author in PubMed Google Scholar
Rukai Wei
View author publications
You can also search for this author in PubMed Google Scholar
Huaxin Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Ming Tian
View author publications
You can also search for this author in PubMed Google Scholar
Changxin Gao
View author publications
You can also search for this author in PubMed Google Scholar
Nong Sang
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Nong Sang .

Editor information

Editors and Affiliations

Peking University, Beijing, China
Zhouchen Lin
Nankai University, Tianjin, China
Ming-Ming Cheng
Chinese Academy of Sciences, Beijing, China
Ran He
Xinjiang University, Ürümqi, Xinjiang, China
Kurban Ubul
Xinjiang University, Ürümqi, China
Wushouer Silamu
Peking University, Beijing, China
Hongbin Zha
Tsinghua University, Beijing, China
Jie Zhou
Chinese Academy of Sciences, Beijing, China
Cheng-Lin Liu

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Luo, R., Wei, R., Zhang, H., Tian, M., Gao, C., Sang, N. (2025). EfficientMatting: Bilateral Matting Network for Real-Time Human Matting. In: Lin, Z., et al. Pattern Recognition and Computer Vision. PRCV 2024. Lecture Notes in Computer Science, vol 15042. Springer, Singapore. https://doi.org/10.1007/978-981-97-8858-3_9

Download citation

DOI: https://doi.org/10.1007/978-981-97-8858-3_9
Published: 03 November 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-8857-6
Online ISBN: 978-981-97-8858-3
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

EfficientMatting: Bilateral Matting Network for Real-Time Human Matting