
Unbiased Manifold Augmentation for Coarse Class Subdivision

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13685)

Abstract

Coarse Class Subdivision (CCS) is important for many practical applications in which a training set originally annotated for a coarse class (e.g. bird) must further support recognition of its sub-classes (e.g. swan, crow) with only very few fine-grained labeled samples. From the perspective of causal representation learning, these sub-classes inherit the same determinative factors as the coarse class; they differ only in the values those factors take. Therefore, to support the challenging CCS task at minimum fine-grained labeling cost, an ideal data augmentation method should generate abundant variants by manipulating sub-class samples at the granularity of these generating factors. Traditional data augmentation methods fall far short of this goal: they operate in highly coupled image or feature spaces and can therefore only simulate global geometric or photometric transformations. Leveraging recent progress on factor-disentangled generators, Unbiased Manifold Augmentation (UMA) is proposed for CCS. With a controllable StyleGAN pre-trained on the coarse class, an approximately unbiased augmentation is conducted on the factor-disentangled manifolds of each sub-class, revealing the unbiased mutual information between the target sub-class and its determinative factors. Extensive experiments show that in the small-data regime (less than 1% of the fine-grained samples commonly used), UMA achieves a 10.37% average improvement over existing data augmentation methods. On challenging tasks with severe bias, accuracy improves by up to 16.79%. We release our code at https://github.com/leo-gb/UMA.
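To make the factor-level augmentation idea concrete, the sketch below illustrates one plausible reading of the pipeline: invert a few labeled sub-class images into the latent space of a StyleGAN-style generator pre-trained on the coarse class, perturb the latents along disentangled factor directions (e.g., GANSpace-style directions), and decode the results as label-preserving variants. All names here (FactorAugmenter, generator, encoder, factor_dirs) are illustrative assumptions, not the authors' released API; see the linked repository for the actual implementation.

# Illustrative sketch only: factor-level manifold augmentation in the spirit
# of UMA. The generator/encoder interfaces and factor_dirs are assumptions
# made for exposition, not the authors' released code.
import torch

class FactorAugmenter:
    """Augments few-shot sub-class samples by perturbing them along
    disentangled factor directions of a coarse-class generator."""

    def __init__(self, generator, encoder, factor_dirs, sigma=1.0):
        self.generator = generator      # callable: latents (B, D) -> images
        self.encoder = encoder          # callable: images -> latents (B, D), i.e. GAN inversion
        self.factor_dirs = factor_dirs  # tensor (K, D): disentangled factor directions
        self.sigma = sigma              # perturbation scale per factor

    @torch.no_grad()
    def augment(self, images, n_variants=8):
        w = self.encoder(images)  # invert the few labeled samples onto the manifold
        variants = []
        for _ in range(n_variants):
            # Zero-mean coefficients spread each factor's values symmetrically,
            # approximating unbiased coverage of the determinative factors
            # rather than echoing the bias of the few observed samples.
            coeffs = self.sigma * torch.randn(
                w.size(0), self.factor_dirs.size(0), device=w.device)
            w_aug = w + coeffs @ self.factor_dirs    # move along factor manifolds
            variants.append(self.generator(w_aug))   # decode label-preserving variants
        return torch.cat(variants, dim=0)            # each variant inherits its source label

Under this reading, the sub-class classifier is then trained on the original few-shot samples together with their decoded variants, so it sees variation at the level of generating factors rather than only global geometric or photometric jitter.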



Author information


Corresponding author

Correspondence to Ke Gao.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Yan, B., Gao, K., Gao, B., Wang, L., Yang, J., Li, X. (2022). Unbiased Manifold Augmentation for Coarse Class Subdivision. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13685. Springer, Cham. https://doi.org/10.1007/978-3-031-19806-9_28


  • DOI: https://doi.org/10.1007/978-3-031-19806-9_28


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19805-2

  • Online ISBN: 978-3-031-19806-9

  • eBook Packages: Computer Science (R0)
