Abstract
Deep learning methods have significantly advanced the performance of image matting. However, dataset biases can mislead matting models into biased behavior. In this paper, we identify two typical biases in existing matting models, namely contrast bias and transparency bias, and trace their origins to matting datasets. To address these biases, we model the image matting task from the perspective of causal inference and identify their root causes: the confounders. To mitigate the effects of these confounders, we apply causal intervention through backdoor adjustment and introduce a novel model-agnostic confounder-intervened (COIN) matting framework. Extensive experiments across various matting methods and datasets demonstrate that our COIN framework significantly diminishes these biases, thereby enhancing the performance of existing matting models.
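For context, the backdoor adjustment mentioned above is the standard causal-inference identity popularized by Pearl: rather than conditioning on the input alone, the prediction is stratified over the confounder and averaged under its prior, which severs the spurious path the confounder opens. A minimal statement of the identity follows; the notation (X for the input image, Y for the alpha matte, Z for a dataset confounder) is illustrative and not taken from the paper.

```latex
% Standard backdoor adjustment (Pearl). Illustrative notation, not the paper's:
% X = input image, Y = predicted alpha matte, Z = dataset confounder.
\begin{equation}
  P\bigl(Y \mid \mathrm{do}(X)\bigr)
    = \sum_{z} P\bigl(Y \mid X,\, Z = z\bigr)\, P(Z = z)
\end{equation}
```

In practice, the sum runs over a discretized set of confounder values (plausibly contrast or transparency levels, given the biases named above), so a deconfounded model averages its conditional predictions weighted by how often each confounder value occurs in the data.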
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant No. 62076162), the Shanghai Municipal Science and Technology Major Project, China (Grant No. 2021SHZDZX0102), and the Postdoctoral Fellowship Program of CPSF (Grant No. GZC20241225).
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Liao, Z. et al. (2025). COIN-Matting: Confounder Intervention for Image Matting. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15077. Springer, Cham. https://doi.org/10.1007/978-3-031-72655-2_22
DOI: https://doi.org/10.1007/978-3-031-72655-2_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72654-5
Online ISBN: 978-3-031-72655-2
eBook Packages: Computer Science, Computer Science (R0)